From: Niklas Holsti
Newsgroups: comp.lang.ada
Subject: Re: Verified compilers?
Date: Sat, 10 Mar 2012 00:13:19 +0200
Organization: Tidorum Ltd
Message-ID: <9rvdjvFfa8U1@mid.individual.net>

On 12-03-08 22:40 , Dmitry A. Kazakov wrote:
> On Thu, 08 Mar 2012 20:02:25 +0200, Niklas Holsti wrote:
>
>> On 12-03-07 22:17 , Dmitry A. Kazakov wrote:
>>> On Wed, 07 Mar 2012 19:49:04 +0200, Niklas Holsti wrote:
>>>
>>>> On 12-03-07 11:12 , Dmitry A. Kazakov wrote:
>>>>
>>>>> I was always wondering why people bother to study grammars, LR,
>>>>> LL parsers and other useless stuff.
>>>> Grammars and the notion of parsing (derivations) rigorously define
>>>> what a given string of characters means, in some formally defined
>>>> language.
>>>
>>> No. A grammar defines some formal language: a set of strings.
>>
>> Ok, I was being too brief. You are right that a grammar defines a
>> language. But notice that I was also speaking of parsing and
>> derivations, not just grammars.
>>
>> If you want to define the meanings of the sentences of a language in
>> a rigorous way, you have basically two options:
>>
>> 1. Define the meaning operationally, by giving a standard
>> interpreter, a black box, for example a Turing machine, that accepts
>> the language, and define that the actions of that machine, while
>> reading a sentence, are the meaning of that sentence. (From your
>> point of view, the meaning could be the "translation" that appears
>> on the output tape.)
>>
>> 2. Define the meaning mathematically in some way. Note that by
>> "mathematically" I don't necessarily mean formally proven math, but
>> mathematically precise definitions with the usual amount of
>> hand-waving and shortcuts.
>
> 1 and 2 are the same to me. In all cases it is an input for some
> machine. An inference system based on Peano axioms or whatever, is no
> less a machine than the Turing one. The only difference is in the
> computational power of such machines.

I see a big difference in the structure of the machine. In 1, it can
be arbitrarily complex and opaque. And remember the halting problem...
In 2 (using a parse tree as said later), it has a very specific
structure, clearly related to the structure of the language, and much
more declarative than in 1.

>> As far as I know, all methods -- formal or informal -- for defining
>> the meaning of sentences in formal languages are based on how the
>> sentence is derived, or equivalently how it is parsed.
>
> No. 1234 has the same meaning for an Ada parser, a human reader and
> an OCR system.
1234 has no meaning, except as a literal string, until you give it
some other meaning, for example as the integer one thousand two
hundred and thirty-four.

>> How else could you explain why "2*3+4*5" means 26, and not 70, nor
>> something else?
>
> Explain? It is a subject of proof. There is more than one way to do
> it.

Operator precedence, perhaps? See earlier post...

> BTW the "meaning" you are advocating for looks rather something like
> 23*45*+.

More like: the expression is a SUM of (a TERM that is the PRODUCT of
(the number 2) and (the number 3)) and (a TERM that is the PRODUCT of
(the number 4) and (the number 5)), together with a rule for
evaluating such expression trees.

>> In practice, the role of the parse/derivation tree appears in the
>> fact that parser/compiler generators let you attach "semantic
>> actions" to grammar productions, and often help you to consider
>> larger parts of the derivation by passing information around the
>> parse/derivation tree in well-defined ways. Attribute grammars are
>> one example.
>
> Parsing tree is just a less ambiguous representation of the program.

"Just"? It is the first step in the assignment of meaning to a
sentence. The second step is a computation defined inductively on the
parse tree.

> You don't even need to generate it in order to evaluate 2*3+4*5
> using a finite state machine.

You cannot do it with a finite state machine (assuming you want to
allow literals of any size and expressions of any length). You need a
stack of numbers, too.

The generation of parsing machines from grammars is one of the main
achievements of formal language and parsing theory. Why sweat to make
the machine manually, when you can do it automatically?

> And you certainly do not need it when using your head

How do you know what your head is using? :-)

> or calculator.

Depends on the calculator. If it does not have '(' and ')', or a
stack, you may have to follow a parse tree, maybe even write down some
intermediate results.
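To make the point concrete: the tree-shaped evaluation described above
can be sketched as a recursive-descent evaluator whose call structure
mirrors the parse tree. This is my own illustration (in Python just
for brevity, not anything Ada-specific); the call stack plays exactly
the role of the "stack of numbers" that a finite state machine lacks.

```python
# Grammar (precedence is encoded in the productions themselves):
#   Expr   ::= Term  { "+" Term }
#   Term   ::= Factor { "*" Factor }
#   Factor ::= number
def evaluate(text: str) -> int:
    # A crude tokenizer: split around the two operators.
    tokens = text.replace("+", " + ").replace("*", " * ").split()
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def factor() -> int:          # Factor ::= number
        return int(take())

    def term() -> int:            # Term ::= Factor { "*" Factor }
        value = factor()
        while peek() == "*":
            take()
            value *= factor()
        return value

    def expr() -> int:            # Expr ::= Term { "+" Term }
        value = term()
        while peek() == "+":
            take()
            value += term()
        return value

    return expr()

print(evaluate("2*3+4*5"))   # 26, not 70: the grammar fixes the grouping
```

Because multiplication lives one production deeper than addition, the
tree for 2*3+4*5 groups the products first, and 26 falls out with no
separate "precedence" mechanism at all.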
>>> The grammar defines only syntax.
>>
>> No, it also defines the structure, through the parse/derivation
>> tree.
>
> I was talking about formal languages.

Me too. Maybe we understand the term differently?

> Do you want to say that the trace of productions applied during
> parsing is the program semantics?

That trace (or rather the parse tree) is the structure on which the
semantics is defined, by structurally inductive definitions.

> It is amusing to read how LRM Annex P tries to describe operator
> precedence by syntactical means.

What do you mean, "tries"? It works, and as a formalism I like it
better than the operator precedence definition. I might agree that an
intuitive idea of operator precedence helps us human programmers, with
our limited and unreliable working memory.

> This is a remnant of a delusion shared by some that formal languages
> could describe, if not natural languages, but at least machine
> languages. I thought nobody seriously believes in this now.

Natural languages, ok, since they have no precise definition to begin
with, so can only be approximated by formal languages. But I am amazed
by your position regarding machine languages. Perhaps we have a
different notion of what "describing" means.

> Care to carve a grammar where
>
>    declare
>       X : Integer := 1.0 + 2;
>
> would be a syntax error?

Easy, the grammar that defines only the empty string :-)

But I guess you are thinking about types, and how to make the mixing
of integers and floating point illegal. It is trivial to write a
grammar that makes it a syntax error to mix integer literals and
floating point literals in the same expression, but you have to
duplicate the definition of expressions and have separate metasymbols
Integer_Expression and FP_Expression, and all the way down to
literals. If you are thinking about identifiers with types, use an
attribute grammar that checks types by means of a symbol table, and
there you are.
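The duplication trick can be shown concretely. Below is my own small
sketch (Python, with a deliberately flattened expression grammar and
names of my choosing, not from any Ada tool): the same productions are
written once over integer literals and once over floating-point
literals, so "1.0 + 2" fits neither copy of the grammar and is
rejected at parse time.

```python
import re

# Tokens: FP literal, integer literal, or an operator.
TOKENS = re.compile(r"\d+\.\d+|\d+|[+*]")

def parse(text: str, kind: str) -> None:
    """One copy of the expression grammar, specialised to one literal
    kind: 'Integer' accepts only 123-style literals, 'Float' only
    1.0-style.  (Precedence is flattened here to keep the sketch short.)"""
    toks = TOKENS.findall(text)
    pos = 0

    def literal():
        nonlocal pos
        tok = toks[pos]
        is_fp = "." in tok
        if (kind == "Float") != is_fp:
            raise SyntaxError(f"{tok} is not a {kind} literal")
        pos += 1

    # Expr ::= literal { ("+" | "*") literal }
    literal()
    while pos < len(toks) and toks[pos] in "+*":
        pos += 1
        literal()
    if pos != len(toks):
        raise SyntaxError("trailing input")

def expression_kind(text: str) -> str:
    """Try the Integer_Expression grammar, then the FP_Expression one."""
    for kind in ("Integer", "Float"):
        try:
            parse(text, kind)
            return kind
        except SyntaxError:
            continue
    raise SyntaxError("mixed integer and FP literals: " + text)
```

Here expression_kind("2*3+4*5") returns "Integer" and
expression_kind("1.0 + 2.5") returns "Float", while "1.0 + 2" raises
SyntaxError from both copies of the grammar, i.e. it is rejected by
purely syntactic means.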
Perhaps you don't want to call that a "syntax error", but at least it
is a "parse-time error".

>>> Furthermore the merits of formal grammar definitions are debatable,
>>> because the complexity of grammar descriptions usually makes it
>>> impossible to judge if a given string belongs to the formal
>>> language or not.
>>
>> Entirely wrong. Can you give an example of this impossibility?
>
> Are you claiming that if I gave you a random set of production rules
> and a random string you could tell me [in 30 seconds time] if that
> string belongs to the language?

What have my personal abilities to do with anything? The research in
formal languages and parsing has led to tools. Take your grammar, feed
it into a parser generator, and in a few seconds it will tell you if
the grammar is acceptable (as it usually is, or can be made to be). A
few seconds more, and you will have a parser, which will use a few
more seconds to tell you if the string is in the language.

Or did you have some other meaning for "usually"? Perhaps you want to
use informal methods, and ask Jill, Joe, and John if the sentence is
legal? Perhaps with majority voting?

>>> Maybe, but they are totally irrelevant for the grammars of real
>>> programming languages.
>>
>> No, they are relevant. For example, you want limited look-ahead to
>> make parsing fast, and for most "little languages" for which you
>> implement a parser manually, you want the language to be suitable
>> for recursive descent parsing.
>
> Really? Comp.compilers is busy with silly questions how do I modify
> the grammar [a list of productions follows] to parse [a description]
> by parser X. How is this activity different from programming?

How is logic programming different from functional programming
different from event-driven programming different from ... It is all
about describing things with sufficient precision for automated
processing.
From this point of view, grammars are an application-specific
programming language, for programming parsers. Nothing wrong with
that. And it was made possible by the research into formal languages
and parsing.

> People willingly choose a bad programming language [of productions]
> for what reason? Why not do the parser straight in Ada? I do.

There is much disagreement about which languages are "good", which
"bad". By all means write your parsers in Ada; I do too, although I
let OpenToken do the lexical analysis. But I have simple languages,
with simple semantics. For more complex cases, and especially for any
attribute evaluation that requires several passes over the parse tree,
attribute grammars are hard to beat for convenience.

If you want to describe what language your hand-written parser
accepts, and what its processing means, I think you have to present
the grammar and describe the meaning of the sentences. You can show
the grammar in BNF, and describe the meaning informally, but I would
be very surprised if you could define the meaning without referring to
the BNF, that is, without using some structural induction method
(informally, of course).

>>> On one hand these classes do not capture real differences between
>>> syntaxes of programming languages.
>>
>> What sort of "real differences" do you mean?
>
> Like between C++ and Ada.

Sorry, not clear yet. I'm sure you don't mean just that one uses {}
and the other begin-end, but what do you mean?

>> I don't want to become too personal, but have you had a Horrible
>> Experience with some poor formal-languages
>
> No. Actually I enjoyed the subject for its simplicity compared to
> "hard" mathematics, and for its elegance too. I have a weakness for
> all sorts of formalisms, which is typical of the Russian mathematical
> school. My major was mathematics.

That sounds like the Dmitry I've met before on c.l.a. (My major was
math, too.)
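Coming back to the attribute-grammar point above: the "structural
induction on the parse tree" idea is easy to sketch. In the toy Python
fragment below (all names are my own illustration, not any real tool's
API), each tree node synthesises a "type" attribute, and a symbol
table supplies the attribute for identifiers, which is exactly the
symbol-table-based type check mentioned earlier.

```python
# Assumed declarations, playing the role of the symbol table.
symbol_table = {"X": "Integer", "Y": "Float"}

def type_of(node):
    """Synthesised 'type' attribute, defined by structural induction.
    Nodes are tuples: ('lit', value) | ('id', name) | ('+', left, right)."""
    tag = node[0]
    if tag == "lit":
        return "Float" if isinstance(node[1], float) else "Integer"
    if tag == "id":
        return symbol_table[node[1]]           # attribute from the table
    if tag == "+":
        lt, rt = type_of(node[1]), type_of(node[2])
        if lt != rt:                           # the type rule for "+"
            raise TypeError(f"cannot mix {lt} and {rt}")
        return lt
    raise ValueError(f"unknown node {tag!r}")

# X : Integer := 1.0 + 2;  -- the initialisation expression is rejected:
# type_of(('+', ('lit', 1.0), ('lit', 2))) raises TypeError.
```

The attribute evaluation follows the shape of the parse tree node by
node; nothing about it depends on how that tree was produced, which is
why the same scheme works whether the parser is generated or written
by hand.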
> I didn't want to become a programmer, it just happened to me during
> 5th or 6th semester. That time I was exposed to the classical Aho
> and Ullman.

That is the book we used. I don't remember the edition.

> I was saved by David Gries. He is a really great man, who deserves
> all possible fame and respect. His book on compiler construction
> opened my eyes.

I don't remember reading that book. I'll look out for it.

A stimulating discussion, thanks Dmitry. But I guess, no conversions
of beliefs :-)

-- 
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi . @ .