From: Niklas Holsti
Newsgroups: comp.lang.ada
Subject: Re: Verified compilers?
Date: Sat, 10 Mar 2012 00:13:19 +0200
Organization: Tidorum Ltd
Message-ID: <9rvdjvFfa8U1@mid.individual.net>

On 12-03-08 22:40 , Dmitry A. Kazakov wrote:
> On Thu, 08 Mar 2012 20:02:25 +0200, Niklas Holsti wrote:
>
>> On 12-03-07 22:17 , Dmitry A. Kazakov wrote:
>>> On Wed, 07 Mar 2012 19:49:04 +0200, Niklas Holsti wrote:
>>>
>>>> On 12-03-07 11:12 , Dmitry A. Kazakov wrote:
>>>>
>>>>> I was always wondering why people bother to study grammars, LR,
>>>>> LL parsers and other useless stuff.
>>>> Grammars and the notion of parsing (derivations) rigorously define
>>>> what a given string of characters means, in some formally defined
>>>> language.
>>>
>>> No. A grammar defines some formal language: a set of strings.
>>
>> Ok, I was being too brief. You are right that a grammar defines a
>> language. But notice that I was also speaking of parsing and
>> derivations, not just grammars.
>>
>> If you want to define the meanings of the sentences of a language in
>> a rigorous way, you have basically two options:
>>
>> 1. Define the meaning operationally, by giving a standard
>> interpreter, a black box, for example a Turing machine, that accepts
>> the language, and define that the actions of that machine, while
>> reading a sentence, are the meaning of that sentence. (From your
>> point of view, the meaning could be the "translation" that appears
>> on the output tape.)
>>
>> 2. Define the meaning mathematically in some way. Note that by
>> "mathematically" I don't necessarily mean formally proven math, but
>> mathematically precise definitions with the usual amount of
>> hand-waving and shortcuts.
>
> 1 and 2 are the same to me. In all cases it is an input for some
> machine. An inference system based on Peano axioms or whatever, is no
> less a machine than the Turing one. The only difference is in the
> computational power of such machines.

I see a big difference in the structure of the machine. In 1, it can
be arbitrarily complex and opaque. And remember the halting problem...
In 2 (using a parse tree as said later), it has a very specific
structure, clearly related to the structure of the language, and much
more declarative than in 1.

>> As far as I know, all methods -- formal or informal -- for defining
>> the meaning of sentences in formal languages are based on how the
>> sentence is derived, or equivalently how it is parsed.
>
> No. 1234 has the same meaning for an Ada parser, a human reader and
> an OCR system.
1234 has no meaning, except as a literal string, until you give it
some other meaning, for example as the integer one thousand two
hundred and thirty-four.

>> How else could you explain why "2*3+4*5" means 26, and not 70, nor
>> something else?
>
> Explain? It is a subject of proof. There is more than one way to do
> it.

Operator precedence, perhaps? See earlier post...

> BTW the "meaning" you are advocating for looks rather something like
> 23*45*+.

More like: the expression is a SUM of (a TERM that is the PRODUCT of
(the number 2) and (the number 3)) and (a TERM that is the PRODUCT of
(the number 4) and (the number 5)), together with a rule for
evaluating such expression trees.

>> In practice, the role of the parse/derivation tree appears in the
>> fact that parser/compiler generators let you attach "semantic
>> actions" to grammar productions, and often help you to consider
>> larger parts of the derivation by passing information around the
>> parse/derivation tree in well-defined ways. Attribute grammars are
>> one example.
>
> Parsing tree is just a less ambiguous representation of the program.

"Just"? It is the first step in the assignment of meaning to a
sentence. The second step is a computation defined inductively on the
parse tree.

> You don't even need to generate it in order to evaluate 2*3+4*5
> using a finite state machine.

You cannot do it with a finite state machine (assuming you want to
allow literals of any size and expressions of any length). You need a
stack of numbers, too.

The generation of parsing machines from grammars is one of the main
achievements of formal language and parsing theory. Why sweat to make
the machine manually, when you can do it automatically?

> And you certainly do not need it when using your head

How do you know what your head is using? :-)

> or calculator.

Depends on the calculator. If it does not have '(' and ')', or a
stack, you may have to follow a parse tree, maybe even write down some
intermediate results.
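To make the point concrete: the tree-shaped evaluation described above
can be sketched as a recursive-descent evaluator whose call structure
mirrors the parse tree. This is my own illustration (in Python just
for brevity, not anything Ada-specific); the call stack plays exactly
the role of the "stack of numbers" that a finite state machine lacks.

```python
# Grammar (precedence is encoded in the productions themselves):
#   Expr   ::= Term  { "+" Term }
#   Term   ::= Factor { "*" Factor }
#   Factor ::= number
def evaluate(text: str) -> int:
    # A crude tokenizer: split around the two operators.
    tokens = text.replace("+", " + ").replace("*", " * ").split()
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def factor() -> int:          # Factor ::= number
        return int(take())

    def term() -> int:            # Term ::= Factor { "*" Factor }
        value = factor()
        while peek() == "*":
            take()
            value *= factor()
        return value

    def expr() -> int:            # Expr ::= Term { "+" Term }
        value = term()
        while peek() == "+":
            take()
            value += term()
        return value

    return expr()

print(evaluate("2*3+4*5"))   # 26, not 70: the grammar fixes the grouping
```

Because multiplication lives one production deeper than addition, the
tree for 2*3+4*5 groups the products first, and 26 falls out with no
separate "precedence" mechanism at all.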
>>> The grammar defines only syntax.
>>
>> No, it also defines the structure, through the parse/derivation
>> tree.
>
> I was talking about formal languages.

Me too. Maybe we understand the term differently?

> Do you want to say that the trace of productions applied during
> parsing is the program semantics?

That trace (or rather the parse tree) is the structure on which the
semantics is defined, by structurally inductive definitions.

> It is amusing to read how LRM Annex P tries to describe operator
> precedence by syntactical means.

What do you mean, "tries"? It works, and as a formalism I like it
better than the operator precedence definition. I might agree that an
intuitive idea of operator precedence helps us human programmers, with
our limited and unreliable working memory.

> This is a remnant of a delusion shared by some that formal languages
> could describe, if not natural languages, but at least machine
> languages. I thought nobody seriously believes in this now.

Natural languages, ok, since they have no precise definition to begin
with, so can only be approximated by formal languages. But I am amazed
by your position regarding machine languages. Perhaps we have a
different notion of what "describing" means.

> Care to carve a grammar where
>
>    declare
>       X : Integer := 1.0 + 2;
>
> would be a syntax error?

Easy, the grammar that defines only the empty string :-)

But I guess you are thinking about types, and how to make the mixing
of integers and floating point illegal. It is trivial to write a
grammar that makes it a syntax error to mix integer literals and
floating point literals in the same expression, but you have to
duplicate the definition of expressions and have separate metasymbols
Integer_Expression and FP_Expression, and all the way down to
literals. If you are thinking about identifiers with types, use an
attribute grammar that checks types by means of a symbol table, and
there you are.
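The duplication trick can be shown concretely. Below is my own small
sketch (Python, with a deliberately flattened expression grammar and
names of my choosing, not from any Ada tool): the same productions are
written once over integer literals and once over floating-point
literals, so "1.0 + 2" fits neither copy of the grammar and is
rejected at parse time.

```python
import re

# Tokens: FP literal, integer literal, or an operator.
TOKENS = re.compile(r"\d+\.\d+|\d+|[+*]")

def parse(text: str, kind: str) -> None:
    """One copy of the expression grammar, specialised to one literal
    kind: 'Integer' accepts only 123-style literals, 'Float' only
    1.0-style.  (Precedence is flattened here to keep the sketch short.)"""
    toks = TOKENS.findall(text)
    pos = 0

    def literal():
        nonlocal pos
        tok = toks[pos]
        is_fp = "." in tok
        if (kind == "Float") != is_fp:
            raise SyntaxError(f"{tok} is not a {kind} literal")
        pos += 1

    # Expr ::= literal { ("+" | "*") literal }
    literal()
    while pos < len(toks) and toks[pos] in "+*":
        pos += 1
        literal()
    if pos != len(toks):
        raise SyntaxError("trailing input")

def expression_kind(text: str) -> str:
    """Try the Integer_Expression grammar, then the FP_Expression one."""
    for kind in ("Integer", "Float"):
        try:
            parse(text, kind)
            return kind
        except SyntaxError:
            continue
    raise SyntaxError("mixed integer and FP literals: " + text)
```

Here expression_kind("2*3+4*5") returns "Integer" and
expression_kind("1.0 + 2.5") returns "Float", while "1.0 + 2" raises
SyntaxError from both copies of the grammar, i.e. it is rejected by
purely syntactic means.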
Perhaps you don't want to call that a "syntax error", but at least it
is a "parse-time error".

>>> Furthermore the merits of formal grammar definitions are debatable,
>>> because the complexity of grammar descriptions usually makes it
>>> impossible to judge if a given string belongs to the formal
>>> language or not.
>>
>> Entirely wrong. Can you give an example of this impossibility?
>
> Are you claiming that if I gave you a random set of production rules
> and a random string you could tell me [in 30 seconds time] if that
> string belongs to the language?

What have my personal abilities to do with anything? The research in
formal languages and parsing has led to tools. Take your grammar, feed
it into a parser generator, and in a few seconds it will tell you if
the grammar is acceptable (as it usually is, or can be made to be). A
few seconds more, and you will have a parser, which will use a few
more seconds to tell you if the string is in the language.

Or did you have some other meaning for "usually"? Perhaps you want to
use informal methods, and ask Jill, Joe, and John if the sentence is
legal? Perhaps with majority voting?

>>> Maybe, but they are totally irrelevant for the grammars of real
>>> programming languages.
>>
>> No, they are relevant. For example, you want limited look-ahead to
>> make parsing fast, and for most "little languages" for which you
>> implement a parser manually, you want the language to be suitable
>> for recursive descent parsing.
>
> Really? Comp.compilers is busy with silly questions how do I modify
> the grammar [a list of productions follows] to parse [a description]
> by parser X. How is this activity different from programming?

How is logic programming different from functional programming
different from event-driven programming different from ... It is all
about describing things with sufficient precision for automated
processing.
From this point of view, grammars are an application-specific
programming language, for programming parsers. Nothing wrong with
that. And it was made possible by the research into formal languages
and parsing.

> People willingly choose a bad programming language [of productions]
> for what reason? Why not do the parser straight in Ada? I do.

There is much disagreement about which languages are "good", which
"bad". By all means write your parsers in Ada; I do too, although I
let OpenToken do the lexical analysis. But I have simple languages,
with simple semantics. For more complex cases, and especially for any
attribute evaluation that requires several passes over the parse tree,
attribute grammars are hard to beat for convenience.

If you want to describe what language your hand-written parser
accepts, and what its processing means, I think you have to present
the grammar and describe the meaning of the sentences. You can show
the grammar in BNF, and describe the meaning informally, but I would
be very surprised if you could define the meaning without referring to
the BNF, that is, without using some structural induction method
(informally, of course).

>>> On one hand these classes do not capture real differences between
>>> syntaxes of programming languages.
>>
>> What sort of "real differences" do you mean?
>
> Like between C++ and Ada.

Sorry, not clear yet. I'm sure you don't mean just that one uses {}
and the other begin-end, but what do you mean?

>> I don't want to become too personal, but have you had a Horrible
>> Experience with some poor formal-languages
>
> No. Actually I enjoyed the subject for its simplicity compared to
> "hard" mathematics, and for its elegance too. I have a weakness for
> all sorts of formalisms, which is typical of the Russian mathematical
> school. My major was mathematics.

That sounds like the Dmitry I've met before on c.l.a. (My major was
math, too.)
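Coming back to the attribute-grammar point above: the "structural
induction on the parse tree" idea is easy to sketch. In the toy Python
fragment below (all names are my own illustration, not any real tool's
API), each tree node synthesises a "type" attribute, and a symbol
table supplies the attribute for identifiers, which is exactly the
symbol-table-based type check mentioned earlier.

```python
# Assumed declarations, playing the role of the symbol table.
symbol_table = {"X": "Integer", "Y": "Float"}

def type_of(node):
    """Synthesised 'type' attribute, defined by structural induction.
    Nodes are tuples: ('lit', value) | ('id', name) | ('+', left, right)."""
    tag = node[0]
    if tag == "lit":
        return "Float" if isinstance(node[1], float) else "Integer"
    if tag == "id":
        return symbol_table[node[1]]           # attribute from the table
    if tag == "+":
        lt, rt = type_of(node[1]), type_of(node[2])
        if lt != rt:                           # the type rule for "+"
            raise TypeError(f"cannot mix {lt} and {rt}")
        return lt
    raise ValueError(f"unknown node {tag!r}")

# X : Integer := 1.0 + 2;  -- the initialisation expression is rejected:
# type_of(('+', ('lit', 1.0), ('lit', 2))) raises TypeError.
```

The attribute evaluation follows the shape of the parse tree node by
node; nothing about it depends on how that tree was produced, which is
why the same scheme works whether the parser is generated or written
by hand.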
> I didn't want to become a programmer, it just happened to me during
> 5th or 6th semester. That time I was exposed to the classical Aho
> and Ullman.

That is the book we used. I don't remember the edition.

> I was saved by David Gries. He is a really great man, who deserves
> all possible fame and respect. His book on compiler construction
> opened my eyes.

I don't remember reading that book. I'll look out for it.

A stimulating discussion, thanks Dmitry. But I guess, no conversions
of beliefs :-)

-- 
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi . @ .