* Verified compilers?
@ 2012-02-21 15:42 Yannick Duchêne (Hibou57)
2012-02-24 1:41 ` Shark8
2012-03-06 1:27 ` Randy Brukardt
0 siblings, 2 replies; 63+ messages in thread
From: Yannick Duchêne (Hibou57) @ 2012-02-21 15:42 UTC (permalink / raw)
Here is a quote I’m curious to read comments about (maybe from Randy?)
> According to Andrew Appel, "Part of Leroy’s achievement is that he
> makes it look like it's not so difficult after all: now that he's
> found the right architecture for building verified compilers,
> everyone will know how to do it."
Quote source: http://en.wikipedia.org/wiki/Compcert
What do you think of this sentence, “now that he's found the right
architecture for building verified compilers, everyone will know how to do
it?”
--
“Syntactic sugar causes cancer of the semi-colons.” [1]
“Structured Programming supports the law of the excluded muddle.” [1]
[1]: Epigrams on Programming — Alan J. Perlis, Yale University
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Verified compilers? 2012-02-21 15:42 Verified compilers? Yannick Duchêne (Hibou57) @ 2012-02-24 1:41 ` Shark8 2012-02-24 8:52 ` Georg Bauhaus 2012-02-24 17:36 ` Peter C. Chapin 2012-03-06 1:27 ` Randy Brukardt 1 sibling, 2 replies; 63+ messages in thread From: Shark8 @ 2012-02-24 1:41 UTC (permalink / raw) On Feb 21, 9:42 am, Yannick Duchêne (Hibou57) <yannick_duch...@yahoo.fr> wrote: > Here is a quote, I’m curious to read comments about (may be from Randy ?) > > > According to Andrew Appel, "Part of Leroy’s achievement is that he > > makes it look like it's not so difficult after all: now that he's > > found the right architecture for building verified compilers, > > everyone will know how to do it." > > Quote source:http://en.wikipedia.org/wiki/Compcert > > What do you think of this sentence, “now that he's found the right > architecture for building verified compilers, everyone will know how to do > it?” > > -- > “Syntactic sugar causes cancer of the semi-colons.” [1] > “Structured Programming supports the law of the excluded muddle.” [1] > [1]: Epigrams on Programming — Alan J. — P. Yale University I think the sentence shows a bit of solipsistic thinking on the part of the writer's view of languages; as everyone on this thread ought to know Ada's had verified compilers since its first implementations. (I am using 'verified' and 'certified' rather interchangeably here, if that's not the case though I'd like to hear what exactly the difference is.) ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Verified compilers? 2012-02-24 1:41 ` Shark8 @ 2012-02-24 8:52 ` Georg Bauhaus 2012-02-24 17:36 ` Peter C. Chapin 1 sibling, 0 replies; 63+ messages in thread From: Georg Bauhaus @ 2012-02-24 8:52 UTC (permalink / raw) On 24.02.12 02:41, Shark8 wrote: > I think the sentence shows a bit of solipsistic thinking on the part > of the writer's view of languages; as everyone on this thread ought to > know Ada's had verified compilers since its first implementations. (I > am using 'verified' and 'certified' rather interchangeably here, if > that's not the case though I'd like to hear what exactly the > difference is.) http://compcert.inria.fr/doc/ "Compcert is a compiler that generates PowerPC, ARM and x86 assembly code from Compcert C, a large subset of the C programming language. The particularity of this compiler is that it is written mostly within the specification language of the Coq proof assistant, and its correctness --- the fact that the generated assembly code is semantically equivalent to its source program --- was entirely proved within the Coq proof assistant." The part saying "the generated assembly code is semantically equivalent to its source program" is the good bit. A result very different from a compiler's output passing some test suite. However, formal verification does not address the qualities of the semantics, unless it does change the semantics of (the subset of) C: Whether or not a project will profit, then, from the semantics of C's int, be it in C source text or in assembly code, is unrelated to the formal verification of the translation, I should think. The qualities of C's semantics would now be verified to be inherent in C, though, and will be truthfully reflected by the assembly code! ;-) ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Verified compilers? 2012-02-24 1:41 ` Shark8 2012-02-24 8:52 ` Georg Bauhaus @ 2012-02-24 17:36 ` Peter C. Chapin 1 sibling, 0 replies; 63+ messages in thread From: Peter C. Chapin @ 2012-02-24 17:36 UTC (permalink / raw) On 2012-02-23 20:41, Shark8 wrote: > I think the sentence shows a bit of solipsistic thinking on the part > of the writer's view of languages; as everyone on this thread ought to > know Ada's had verified compilers since its first implementations. (I > am using 'verified' and 'certified' rather interchangeably here, if > that's not the case though I'd like to hear what exactly the > difference is.) To me "verified" suggests formal proof. When I hear "verified compiler" I assume we are talking about a compiler that has been proven (mathematically) to implement the source language correctly. On the other hand a "certified compiler" is one, to my mind, that has passed some kind of official certification suite. The ideas are independent but both are useful. Peter ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Verified compilers? 2012-02-21 15:42 Verified compilers? Yannick Duchêne (Hibou57) 2012-02-24 1:41 ` Shark8 @ 2012-03-06 1:27 ` Randy Brukardt 2012-03-06 17:24 ` Shark8 2012-03-07 12:42 ` Stuart 1 sibling, 2 replies; 63+ messages in thread From: Randy Brukardt @ 2012-03-06 1:27 UTC (permalink / raw) "Yannick Duchêne (Hibou57)" <yannick_duchene@yahoo.fr> wrote in message news:op.v91cleyiule2fv@douda-yannick... >Here is a quote, I'm curious to read comments about (may be from Randy ?) > >> According to Andrew Appel, "Part of Leroy's achievement is that he >> makes it look like it's not so difficult after all: now that he's >> found the right architecture for building verified compilers, >> everyone will know how to do it." >Quote source: http://en.wikipedia.org/wiki/Compcert > >What do you think of this sentence, "now that he's found the right >architecture for building verified compilers, everyone will know how to do >it?" Why would anyone want to do it? This is *not* a problem area in compilers (at least in my experience). When we introduced an intermediate code to separate semantics from optimization from code generation, one of the unexpected benefits was that code generation bugs dropped to almost none. The intermediate code is far simpler than Ada (or any other high-level language), and it is thus easy to translate correctly. Bugs come from incorrectly converting semantics (which could not be "verified" unless a formal description of the source language exists, and that has proven too hard to do for any real programming language - "a subset of C" not counting as such a language!). And most of all from optimizations; perhaps an optimizer could be formally verified but I suspect doing so would take 3 times as much work as fixing the bugs as they're found.
In any case, no one is going to run out and totally rebuild their compiler because someone has found a supposedly "better" architecture for it. There were many such better architectures when I started out (supposedly everything was going to be table-driven and automatically generated), and they're all long gone. No reason to assume that anything is different here. Randy. ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Verified compilers? 2012-03-06 1:27 ` Randy Brukardt @ 2012-03-06 17:24 ` Shark8 2012-03-06 17:43 ` Dmitry A. Kazakov 2012-03-07 1:00 ` Randy Brukardt 2012-03-07 12:42 ` Stuart 1 sibling, 2 replies; 63+ messages in thread From: Shark8 @ 2012-03-06 17:24 UTC (permalink / raw) On Monday, March 5, 2012 7:27:25 PM UTC-6, Randy Brukardt wrote: > > In any case, no one is going to run out and totally rebuild their compiler > because someone has found a supposedly "better" architecture for it. There > were many such better architectures when I started out (supposedly everything > was going to be table-driven and automatically generated), and they're all > long gone. No reason to assume that anything is different here. > > Randy. I seem to recall someone on this forum doing a table-driven approach, though it might have only been WRT tokenization... ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Verified compilers? 2012-03-06 17:24 ` Shark8 @ 2012-03-06 17:43 ` Dmitry A. Kazakov 2012-03-06 19:03 ` Shark8 2012-03-07 5:33 ` Yannick Duchêne (Hibou57) 2012-03-07 1:00 ` Randy Brukardt 1 sibling, 2 replies; 63+ messages in thread From: Dmitry A. Kazakov @ 2012-03-06 17:43 UTC (permalink / raw) On Tue, 6 Mar 2012 09:24:04 -0800 (PST), Shark8 wrote: > I seem to recall someone on this forum doing a table-driven approach, I do: http://www.dmitry-kazakov.de/ada/components.htm#Parsers_etc It works fine for me [used in compilers for many ugly domain-specific languages]. -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Verified compilers? 2012-03-06 17:43 ` Dmitry A. Kazakov @ 2012-03-06 19:03 ` Shark8 2012-03-07 5:33 ` Yannick Duchêne (Hibou57) 1 sibling, 0 replies; 63+ messages in thread From: Shark8 @ 2012-03-06 19:03 UTC (permalink / raw) Cc: mailbox On Tuesday, March 6, 2012 11:43:21 AM UTC-6, Dmitry A. Kazakov wrote: > On Tue, 6 Mar 2012 09:24:04 -0800 (PST), Shark8 wrote: > > > I seem to recall someone on this forum doing a table-driven approach, > > I do: > > http://www.dmitry-kazakov.de/ada/components.htm#Parsers_etc > > works fine to me [used in compilers from many ugly domain-specific > languages.] > > -- > Regards, > Dmitry A. Kazakov > http://www.dmitry-kazakov.de I'll have to check it out. ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Verified compilers? 2012-03-06 17:43 ` Dmitry A. Kazakov 2012-03-06 19:03 ` Shark8 @ 2012-03-07 5:33 ` Yannick Duchêne (Hibou57) 2012-03-07 9:12 ` Dmitry A. Kazakov 1 sibling, 1 reply; 63+ messages in thread From: Yannick Duchêne (Hibou57) @ 2012-03-07 5:33 UTC (permalink / raw) Le Tue, 06 Mar 2012 18:43:21 +0100, Dmitry A. Kazakov <mailbox@dmitry-kazakov.de> a écrit: > On Tue, 6 Mar 2012 09:24:04 -0800 (PST), Shark8 wrote: > >> I seem to recall someone on this forum doing a table-driven approach, > > I do: > > http://www.dmitry-kazakov.de/ada/components.htm#Parsers_etc I did not investigate, so the question may be stupid: why two separate stacks? I did a basic expression parser too, a long time ago, and it had a single stack. -- “Syntactic sugar causes cancer of the semi-colons.” [1] “Structured Programming supports the law of the excluded muddle.” [1] [1]: Epigrams on Programming — Alan J. Perlis, Yale University ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Verified compilers? 2012-03-07 5:33 ` Yannick Duchêne (Hibou57) @ 2012-03-07 9:12 ` Dmitry A. Kazakov 2012-03-07 17:49 ` Niklas Holsti 0 siblings, 1 reply; 63+ messages in thread From: Dmitry A. Kazakov @ 2012-03-07 9:12 UTC (permalink / raw) On Wed, 07 Mar 2012 06:33:54 +0100, Yannick Duchêne (Hibou57) wrote: > Le Tue, 06 Mar 2012 18:43:21 +0100, Dmitry A. Kazakov > <mailbox@dmitry-kazakov.de> a écrit: > >> On Tue, 6 Mar 2012 09:24:04 -0800 (PST), Shark8 wrote: >> >>> I seem to recall someone on this forum doing a table-driven approach, >> >> I do: >> >> http://www.dmitry-kazakov.de/ada/components.htm#Parsers_etc > I did not investigate, so the question may be stupid: why two separate > stacks? Because you need to store arguments, some of which appear before the operators that bound them. That is the argument's stack. The operations too may be reordered according to the association priorities. That is the operation's stack. Consider parsing: A + B * C + D A+B*C+D ---> AS: A OS: +B*C+D ---> AS: A OS: + B*C+D ---> AS: A; B OS: + *C+D ---> AS: A; B OS: +; * C+D ---> AS: A; B; C OS: +; * +D ---> pop * AS: A; B*C OS: + +D ---> pop + AS: A+(B*C) OS: + D ---> AS: A+(B*C); D OS: + ---> pop + AS: (A+(B*C))+D OS: > I did a basic expression parser too some long time ago, and it had > a single stack. Well, a Turing machine has only one tape, so yes, you could do that. The question is why. P.S. This method has nothing to do with being table-driven. Table-driven is the lexer, which recognizes operators using tables. So the engine is the same for all expressions. Tables are also used for the recursive descent parser put around the infix expression parser. At each level there is a table of expected keywords. The input is matched against the table and the parser does something. Some leaves are expressions; for them the method described above is used. The infix expression is the only really difficult part here. The rest is just trivial for almost any normal language.
I was always wondering why people bother to study grammars, LR, LL parsers and other useless stuff. This is a great example of making a "science" out of nothing and for nothing. -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 63+ messages in thread
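Dmitry's two-stack trace above translates almost directly into code. Below is a minimal sketch of the method (illustrative only, not the Simple Components implementation; Python is used purely for brevity, and only left-associative binary operators are handled):

```python
PREC = {"+": 1, "*": 2}   # higher number binds tighter

def parse_infix(tokens):
    args = []  # argument stack (AS in the trace above)
    ops = []   # operation stack (OS in the trace above)

    def reduce_top():
        # pop one operation and its two arguments, push the combined result
        op = ops.pop()
        right = args.pop()
        left = args.pop()
        args.append(f"({left}{op}{right})")

    for tok in tokens:
        if tok in PREC:
            # an incoming operator of lower or equal priority
            # forces reduction of what is already stacked
            while ops and PREC[ops[-1]] >= PREC[tok]:
                reduce_top()
            ops.append(tok)
        else:
            args.append(tok)  # operands go straight to the argument stack
    while ops:
        reduce_top()
    return args[0]

print(parse_infix(list("A+B*C+D")))  # ((A+(B*C))+D)
```

The final result matches the last line of the trace: the argument stack holds (A+(B*C))+D and the operation stack is empty.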
* Re: Verified compilers? 2012-03-07 9:12 ` Dmitry A. Kazakov @ 2012-03-07 17:49 ` Niklas Holsti 2012-03-07 20:17 ` Dmitry A. Kazakov 0 siblings, 1 reply; 63+ messages in thread From: Niklas Holsti @ 2012-03-07 17:49 UTC (permalink / raw) On 12-03-07 11:12 , Dmitry A. Kazakov wrote: > I was always wondering why people bother to study grammars, LR, LL parsers > and other useless stuff. Grammars and the notion of parsing (derivations) rigorously define what a given string of characters means, in some formally defined language. That is fundamental to all programming and most computing. As far as I can see, the only alternative would be to define the meaning of a language by defining a specific interpreter as the standard interpreter, and defining the meaning of a sentence as whatever this interpreter does, when interpreting this sentence. > This is a great example of making a "science" out > of nothing and for nothing. The various types of grammars and the corresponding parsing procedures help us understand which parsing methods work for which kinds of grammars. This is useful even when one writes one's parsers by hand. But I agree that the differences between the LR(k) languages, for various values of k, are mostly irrelevant in practice. -- Niklas Holsti Tidorum Ltd niklas holsti tidorum fi . @ . ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Verified compilers? 2012-03-07 17:49 ` Niklas Holsti @ 2012-03-07 20:17 ` Dmitry A. Kazakov 2012-03-07 23:28 ` Usefulness of Formal Notions in Programming (was: Verified compilers?) Georg Bauhaus ` (2 more replies) 0 siblings, 3 replies; 63+ messages in thread From: Dmitry A. Kazakov @ 2012-03-07 20:17 UTC (permalink / raw) On Wed, 07 Mar 2012 19:49:04 +0200, Niklas Holsti wrote: > On 12-03-07 11:12 , Dmitry A. Kazakov wrote: > >> I was always wondering why people bother to study grammars, LR, LL parsers >> and other useless stuff. >> >> Grammars and the notion of parsing (derivations) rigorously define what >> a given string of characters means, in some formally defined language. No. A grammar defines some formal language: a set of strings. The meaning (program language semantics) is defined informally or else by another language into which these strings are translated. The grammar defines only syntax. Even that is defined incompletely for most cases, e.g. X : constant := 10**(10**(10**10)). Though nobody sheds a tear about it. An attempt to render all errors as syntax errors would yield nothing but unreadable diagnostics. Semantic error messages are much easier to understand. So syntax error messages are few and trivial, e.g. "end if expected." If I needed to recognize the infamous {ab, aabb, aaabbb, ...}, why should I bother about the grammar? I would count a's and b's and compare the two numbers. That is all. Furthermore, the merits of formal grammar definitions are debatable, because the complexity of grammar descriptions usually makes it impossible to judge whether a given string belongs to the formal language or not. Normally BNF is all one needs to define syntax. > As far as I can see, the only alternative would be to define the meaning > of a language by defining a specific interpreter as the standard > interpreter, and defining the meaning of a sentence as whatever this > interpreter does, when interpreting this sentence.
Rather, translation is the only way to formally define the semantics. >> This is a great example of making a "science" out >> of nothing and for nothing. > > The various types of grammars and the corresponding parsing procedures > help us understand which parsing methods work for which kinds of > grammars. Maybe, but they are totally irrelevant for the grammars of real programming languages. On one hand, these classes do not capture real differences between syntaxes of programming languages. On the other hand, the classes they do describe have no other practical interest than torturing students of CS classes. -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 63+ messages in thread
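Dmitry's counting argument for the textbook language {ab, aabb, aaabbb, ...} is easy to make concrete. A sketch of a recognizer that counts instead of parsing (Python used purely for illustration):

```python
def is_anbn(s):
    # accept exactly the strings a^n b^n with n >= 1:
    # the length must be even, and the string must be
    # the first half a's followed by the same number of b's
    half, rem = divmod(len(s), 2)
    return rem == 0 and half >= 1 and s == "a" * half + "b" * half

print(is_anbn("aaabbb"))  # True
print(is_anbn("abab"))    # False
```

No grammar, no parser generator: two counts and a comparison, exactly as the post suggests. (The language is of course context-free, not regular, which is why it is the standard classroom example.)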
* Re: Usefulness of Formal Notions in Programming (was: Verified compilers?) 2012-03-07 20:17 ` Dmitry A. Kazakov @ 2012-03-07 23:28 ` Georg Bauhaus 2012-03-08 9:24 ` Usefulness of Formal Notions in Programming Dmitry A. Kazakov 2012-03-08 0:42 ` Verified compilers? Randy Brukardt 2012-03-08 18:02 ` Niklas Holsti 2 siblings, 1 reply; 63+ messages in thread From: Georg Bauhaus @ 2012-03-07 23:28 UTC (permalink / raw) On 07.03.12 21:17, Dmitry A. Kazakov wrote: >>> This is a great example of making a "science" out >>> of nothing and for nothing. Nothing seems inherent in knowing grammars that prevents good use of this knowledge. It has uses in general system design, in programming, and in talking about programs, and these activities need not have anything to do with string processing. Granted, for the knowledge to be useful, the programmer need not have specialized in grammars. Two examples: 1. Suppose someone knows what a regular language is. Let a regular language be taught together with finite automata. Knowing some of the latter can instruct the design of a system whose structure (processes, their interactions, ...) is easier to understand because there is, in fact, a state machine that describes it. Knowing that general regular expressions, of the CS kind, are made by obeying simple rules will help writing REs of any variety, and correctly at that. (I'll claim this.) 2. Suppose someone remembers some things about brackets and CFGs. A system might have a nested structure. Things in the system first begin and then end by first taking, and then releasing, resources, say. Later, depending on an instance of I/O at some point, the program will choose from alternative nested structures. When explaining the program's structure to someone knowing a scheme such as a CFG, the analogy paves the way towards an understanding of the structure of the system.
The useful relation between grammars and programming in these cases is this: If a person knows one simplified, abstract notion well (grammar), then he or she will more easily understand a different notion, provided there is an analogy, or correspondence. I'd give a similar argument for writing programs. It seems somewhat like knowing math. If you know it, and know how to put math to good use, it helps understanding programs, and writing programs. >> The various types of grammars and the corresponding parsing procedures >> help us understand which parsing methods work for which kinds of >> grammars. > > Maybe, but they are totally irrelevant for the grammars of real programming > languages. On one hand these classes do not capture real differences > between syntaxes of programming languages. On the other hand, the classes > they do describe have no other practical interest than torturing students > of CS classes. Counting, as you suggest for ab, aabb, aaabbb, ... turns out to be a delusive solution. It is slippery, at least for use in programming languages' syntaxes. For example, Lisp interpreters would be counting brackets that match. ((())). When they have found N left brackets, they'd need to find N right brackets, too. Or, really, at least N right brackets, a relation to which the makers of Lisp had submitted. For, if a programmer had forgotten to write some right bracket in the middle of a program, the remaining lists would pile up; the remainder of the program's text would be added to the most recently opened list. The dreadful consequence is that the parser will report nothing but a "syntax error at end of file". An error not easy to spot, and, therefore, not easy to fix. You would hope to be helped by a Lisp system or a pretty printing editor. Thus, placement and counting of embedded ((()))s will require extreme care on the part of the programmer. But! It is possible to amend the grammar, and it has been done, see below.
This change cannot, however, be performed if we just focus on counting, I think. Consequently, reflecting the fact that programmers do not emit from productions, but make mistakes, Lisp allows any number of right brackets to be written in certain places to help with () matching. For example, when closing a DEFUN list. A less ()y solution is found in some programming languages. Their grammars use indexed pairs of brackets, such as occur in "begin" ... "end" or "[«L»:] loop" ... "end loop [«L»]" or "package «L» is" ... "end [«L»]". I don't know whether or not a van Wijngaarden grammar can produce the last sentences. The Ada LRM uses natural language statements such as "the name at the end of the right bracket must match the index at the corresponding left bracket." In any case, I'll bet knowing grammar at all will help one produce(!) program designs that are easier to understand if a known scheme is used, as outlined above. The word marked (!) in the last sentence points out the recursive nature of the subject. ;-) ^ permalink raw reply [flat|nested] 63+ messages in thread
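Georg's contrast between anonymous and indexed brackets can be sketched: with plain counting, a forgotten closer surfaces only as "syntax error at end of file", while labelled brackets ("end loop L") let a checker pinpoint the first mismatch. A hypothetical token-level illustration (the token shape is invented, not any particular Lisp or Ada implementation):

```python
def check_labelled(tokens):
    """tokens are pairs like ("begin", "L") / ("end", "L");
    returns a description of the first error, or None if balanced."""
    stack = []
    for i, (kind, label) in enumerate(tokens):
        if kind == "begin":
            stack.append(label)
        elif not stack:
            return f"token {i}: 'end {label}' without matching 'begin'"
        else:
            expected = stack.pop()
            if label != expected:
                # the label mismatch localizes the error immediately,
                # instead of deferring it to end of input
                return f"token {i}: 'end {label}' where 'end {expected}' expected"
    if stack:
        return f"end of input: '{stack[-1]}' left open"
    return None

ok = [("begin", "P"), ("begin", "L"), ("end", "L"), ("end", "P")]
bad = [("begin", "P"), ("begin", "L"), ("end", "P")]  # forgot "end L"
print(check_labelled(ok))   # None
print(check_labelled(bad))  # token 2: 'end P' where 'end L' expected
```

With anonymous brackets the `bad` input would look balanced until the input runs out; the labels turn the same mistake into an error at the exact token where the structure went wrong.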
* Re: Usefulness of Formal Notions in Programming 2012-03-07 23:28 ` Usefulness of Formal Notions in Programming (was: Verified compilers?) Georg Bauhaus @ 2012-03-08 9:24 ` Dmitry A. Kazakov 2012-03-08 10:30 ` Nasser M. Abbasi 0 siblings, 1 reply; 63+ messages in thread From: Dmitry A. Kazakov @ 2012-03-08 9:24 UTC (permalink / raw) [I never said that formalism is bad; on the contrary, programming is too informal. But that by no means should justify the kind of CS we have] On Thu, 08 Mar 2012 00:28:37 +0100, Georg Bauhaus wrote: > On 07.03.12 21:17, Dmitry A. Kazakov wrote: > >>>> This is a great example of making a "science" out >>>> of nothing and for nothing. > [...] > 1. Suppose someone knows what a regular language is. I don't buy REs. For a layman, wild-carded patterns are easier to use and understand, while covering 90% of all cases. For professional use (10%) patterns are a bad idea anyway. I think we discussed that in c.l.a. > If a person knows one simplified, abstract notion well (grammar), > then he or she will more easily understand a different notion, provided > there is an analogy, or correspondence. The only problem is that formal grammars are not simple. They are hard, and the difficulties grow exponentially the closer you come to the real application domain. The formalism is just too weak. This is the problem of all poor solutions. Be it formal grammars, patterns, domain-specific languages, scripting languages etc. You get a simple job done simply and quickly. That makes you think - that is. But when you face a real task you are lost. > It seems somewhat like knowing math. If you know it, and know how to > put math to good use, it helps understanding programs, and writing > programs, That is true, except that good math has a purpose and beauty. Formal grammars (poor math) are more like learning how to divide and square Roman numerals. That too would train you mentally and thus help with writing programs.
But a sane man would convert to the decimal representation, do the job, and convert back. [... how bad Lisp is ...] That Lisp is bad does not imply that formal grammars are any good. -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Usefulness of Formal Notions in Programming 2012-03-08 9:24 ` Usefulness of Formal Notions in Programming Dmitry A. Kazakov @ 2012-03-08 10:30 ` Nasser M. Abbasi 2012-03-08 12:37 ` Dmitry A. Kazakov 0 siblings, 1 reply; 63+ messages in thread From: Nasser M. Abbasi @ 2012-03-08 10:30 UTC (permalink / raw) On 3/8/2012 3:24 AM, Dmitry A. Kazakov wrote: > > This is the problem of all poor solutions. Be it formal grammars, patterns, > domain-specific languages, scripting languages etc. You get a simple job > done simply and quickly. That makes you think - that is. But when you face > a real task you are lost. > True. But most software apps these days are simple, throw-away apps anyway. You can buy one for 59 cents only :) So people pick the easiest language to make these apps with in the fastest way. Maybe what we need is an AdaScript language? A very simplified version of Ada with a built-in GUI and an interpreter instead of a compiler, then Ada will become popular again! --Nasser ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Usefulness of Formal Notions in Programming 2012-03-08 10:30 ` Nasser M. Abbasi @ 2012-03-08 12:37 ` Dmitry A. Kazakov 0 siblings, 0 replies; 63+ messages in thread From: Dmitry A. Kazakov @ 2012-03-08 12:37 UTC (permalink / raw) On Thu, 08 Mar 2012 04:30:42 -0600, Nasser M. Abbasi wrote: > On 3/8/2012 3:24 AM, Dmitry A. Kazakov wrote: > >> This is the problem of all poor solutions. Be it formal grammars, patterns, >> domain-specific languages, scripting languages etc. You get a simple job >> done simply and quickly. That makes you think - that is. But when you face >> a real task you are lost. > > True. But most software apps these days are simple and throw away apps > anyway. Really? Throw-away apps in your phone, car, electricity/water/gas meter, pacemaker, cash machine, cable decoder, washing machine... > You can buy one for 59 cents only :) I never did. It is the finest kind of business to sell things nobody needs. > May be what we need is an AdaScript language? No, I see quite different things to be done to improve Ada. -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Verified compilers? 2012-03-07 20:17 ` Dmitry A. Kazakov 2012-03-07 23:28 ` Usefulness of Formal Notions in Programming (was: Verified compilers?) Georg Bauhaus @ 2012-03-08 0:42 ` Randy Brukardt 2012-03-08 9:25 ` Dmitry A. Kazakov 2012-03-08 18:02 ` Niklas Holsti 2 siblings, 1 reply; 63+ messages in thread From: Randy Brukardt @ 2012-03-08 0:42 UTC (permalink / raw) "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> wrote in message news:1psd0g0womgxi.1sle7ol12x3d5.dlg@40tude.net... ... > Furthermore the merits of formal grammar definitions are debatable, > because > the complexity of grammar descriptions usually makes it impossible to judge > if a given string belongs to the formal language or not. Normally BNF is > all one needs to define syntax. This makes no sense at all. BNF is the classic formal definition of a grammar; indeed, I know of no other. Everybody has variations on it (the Ada BNF uses repeat operations not found in many BNFs; the annotated BNF used by our grammar generator is slightly different), but they're all forms of BNF. Perhaps you are talking about grammar generator processable definitions -- that's a different beast altogether and forms just a tiny portion of "formal grammar definitions". As far as LR parsing is concerned, that's the most general practical parsing scheme. There are many legitimate BNFs that cannot be parsed using recursive descent. And writing recursive descent code (and error handling) is many times harder than adding a couple of lines to a file and launching a batch file: Aspect_Specification ::= WITH Aspect_Item {, Aspect_Item} Aspect_Item ::= IDENTIFIER => Expression [although I admit this is a small portion of the total work; it took me about 4 weeks to add the entire Ada 2005 syntax to Janus/Ada, but probably only about an hour or two on the actual grammar.
The rest of the time was spent adding "unimplemented feature" error handlers to the compiler, adjusting existing parts of the compiler to be aware of the changes (new grammar features changing the stack position of old ones), updating the pretty printer to be aware of the new features, and the like.] Randy. ^ permalink raw reply [flat|nested] 63+ messages in thread
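For comparison with the two grammar lines Randy adds to his generator input, here is a rough sketch of what hand-written recursive descent for those same two productions could look like. Python is used for illustration; the token handling and all names are invented, not Janus/Ada's, and the Expression branch is stubbed out:

```python
class Tokens:
    # trivial token stream; a real compiler's lexer stands in here
    def __init__(self, toks):
        self.toks, self.pos = list(toks), 0
    def peek(self):
        return self.toks[self.pos] if self.pos < len(self.toks) else None
    def take(self, expected=None):
        tok = self.peek()
        if expected is not None and tok != expected:
            raise SyntaxError(f"{expected} expected, got {tok}")
        self.pos += 1
        return tok

def parse_aspect_item(ts):
    # Aspect_Item ::= IDENTIFIER => Expression
    name = ts.take()
    ts.take("=>")
    expr = ts.take()  # stand-in: a real parser would recurse into Expression
    return (name, expr)

def parse_aspect_specification(ts):
    # Aspect_Specification ::= WITH Aspect_Item {, Aspect_Item}
    ts.take("WITH")
    items = [parse_aspect_item(ts)]
    while ts.peek() == ",":
        ts.take(",")
        items.append(parse_aspect_item(ts))
    return items

spec = parse_aspect_specification(
    Tokens(["WITH", "Pre", "=>", "X > 0", ",", "Inline", "=>", "True"]))
print(spec)  # [('Pre', 'X > 0'), ('Inline', 'True')]
```

Even in this toy form, each production becomes a procedure plus its own error handling, which is the extra effort Randy contrasts with the two lines fed to a grammar generator.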
* Re: Verified compilers? 2012-03-08 0:42 ` Verified compilers? Randy Brukardt @ 2012-03-08 9:25 ` Dmitry A. Kazakov 2012-03-08 18:10 ` Niklas Holsti 0 siblings, 1 reply; 63+ messages in thread From: Dmitry A. Kazakov @ 2012-03-08 9:25 UTC (permalink / raw) On Wed, 7 Mar 2012 18:42:23 -0600, Randy Brukardt wrote: > As far as LR parsing is concerned, that's the most general practical parsing > scheme. There are many legitimate BNFs that cannot be parsed using recursive > descent. There are also ambiguous BNFs, which means that BNF is not so formal as it may appear. The point is that I don't care about that or about restricting it to a certain class, etc., because there is no value in such activity at all. BNF is good for *informal* description of the language syntax. -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Verified compilers? 2012-03-08 9:25 ` Dmitry A. Kazakov @ 2012-03-08 18:10 ` Niklas Holsti 2012-03-08 20:41 ` Dmitry A. Kazakov 0 siblings, 1 reply; 63+ messages in thread From: Niklas Holsti @ 2012-03-08 18:10 UTC (permalink / raw) On 12-03-08 11:25 , Dmitry A. Kazakov wrote: > On Wed, 7 Mar 2012 18:42:23 -0600, Randy Brukardt wrote: > >> As far as LR parsing is concerned, that's the most general practical parsing >> scheme. There are many legitmate BNFs that cannot be parsed using recursive >> descent. > > There are also ambiguous BNFs, which means that BNF is not so formal as it > may appear. Ambiguity has nothing to do with formality. If you think of a grammar only as defining a language (a set of strings), ambiguity is irrelevant; it only means that some sentence can be derived in more than one way from the start symbol, but that has no effect on the language, as a set of strings. Ambiguity is only harmful when you want to use the grammar to parse a sentence into a parse/derivation tree, and then want to use the tree for something, for example to evaluate the meaning of the sentence. If you define (as one always does) the meaning of a sentence as a mathematical function of the sentence's parse tree, and if (because of ambiguity) the sentence has several parse trees, the meaning also becomes ambiguous -- unless you can prove that the "meaning function" has the same value for all the parse trees for the same sentence. > The point is that I don't care about that or about restricting > it to a certain class etc. Because there is no value in such activity at > all. BNF is good for *informal* description of the language syntax. There is nothing informal about BNF, if it is used correctly. -- Niklas Holsti Tidorum Ltd niklas holsti tidorum fi . @ . ^ permalink raw reply [flat|nested] 63+ messages in thread
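Niklas's point about ambiguity can be illustrated with a deliberately ambiguous grammar such as Expr ::= Expr '+' Expr | Expr '*' Expr | Num: the one sentence "2*3+4*5" then has several parse trees, and a "meaning function" defined over trees gives different values for different trees. A small sketch, with parse trees written as nested tuples (illustrative only):

```python
def meaning(tree):
    # the meaning of a leaf is its value; the meaning of a node
    # combines the meanings of its subtrees
    if isinstance(tree, int):
        return tree
    op, left, right = tree
    l, r = meaning(left), meaning(right)
    return l + r if op == "+" else l * r

# Two parse trees the ambiguous grammar admits for the single
# sentence "2*3+4*5" (read the leaves left to right to recover it):
t1 = ("+", ("*", 2, 3), ("*", 4, 5))   # (2*3)+(4*5)
t2 = ("*", ("*", 2, ("+", 3, 4)), 5)   # (2*(3+4))*5

print(meaning(t1))  # 26
print(meaning(t2))  # 70
```

The language (the set of strings) is unaffected by the ambiguity, but the meaning function is not well-defined on sentences until the ambiguity is resolved, which is exactly the distinction the post draws.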
* Re: Verified compilers? 2012-03-08 18:10 ` Niklas Holsti @ 2012-03-08 20:41 ` Dmitry A. Kazakov 0 siblings, 0 replies; 63+ messages in thread From: Dmitry A. Kazakov @ 2012-03-08 20:41 UTC (permalink / raw) On Thu, 08 Mar 2012 20:10:21 +0200, Niklas Holsti wrote: > On 12-03-08 11:25 , Dmitry A. Kazakov wrote: >> The point is that I don't care about that or about restricting >> it to a certain class etc. Because there is no value in such activity at >> all. BNF is good for *informal* description of the language syntax. > > There is nothing informal about BNF, if it is used correctly. Same can be said about English or other natural language. -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Verified compilers? 2012-03-07 20:17 ` Dmitry A. Kazakov 2012-03-07 23:28 ` Usefulness of Formal Notions in Programming (was: Verified compilers?) Georg Bauhaus 2012-03-08 0:42 ` Verified compilers? Randy Brukardt @ 2012-03-08 18:02 ` Niklas Holsti 2012-03-08 20:40 ` Dmitry A. Kazakov 2 siblings, 1 reply; 63+ messages in thread From: Niklas Holsti @ 2012-03-08 18:02 UTC (permalink / raw) On 12-03-07 22:17 , Dmitry A. Kazakov wrote: > On Wed, 07 Mar 2012 19:49:04 +0200, Niklas Holsti wrote: > >> On 12-03-07 11:12 , Dmitry A. Kazakov wrote: >> >>> I was always wondering why people bother to study grammars, LR, LL parsers >>> and other useless stuff. >> >> Grammars and the notion of parsing (derivations) rigorously define what >> a given string of characters means, in some formally defined language. > > No. A grammar defines some formal language: a set of strings. Ok, I was being too brief. You are right that a grammar defines a language. But notice that I was also speaking of parsing and derivations, not just grammars. If you want to define the meanings of the sentences of a language in a rigorous way, you have basically two options: 1. Define the meaning operationally, by giving a standard interpreter, a black box, for example a Turing machine, that accepts the language, and define that the actions of that machine, while reading a sentence, are the meaning of that sentence. (From your point of view, the meaning could be the "translation" that appears on the output tape.) 2. Define the meaning mathematically in some way. Note that by "mathematically" I don't necessarily mean formally proven math, but mathematically precise definitions with the usual amount of hand-waving and shortcuts. The first "meaning ex machina" option is very undesirable, IMO, so we are left with the second option. Since most languages have an infinite number of sentences, of unbounded length, the only practical way to define the meaning of a sentence is by some form of induction. 
Induction based on the length of the string works only for some simple languages, such as the language of integer literals -- you can define the meaning (value) of a single digit, and also how the meaning changes when one more digit is added to a digit string. But this does not work for most languages we use. The only alternative left is structural induction, but where is the structure of a string of characters? The structure is in the parse tree (derivation tree) defined by the grammar for this sentence. As far as I know, all methods -- formal or informal -- for defining the meaning of sentences in formal languages are based on how the sentence is derived, or equivalently how it is parsed. How else could you explain why "2*3+4*5" means 26, and not 70, nor something else? Rules like "do multiplication before addition" are fuzzy and only lead to "meaning ex machina"-like definitions. In practice, the role of the parse/derivation tree appears in the fact that parser/compiler generators let you attach "semantic actions" to grammar productions, and often help you to consider larger parts of the derivation by passing information around the parse/derivation tree in well-defined ways. Attribute grammars are one example. > The meaning (program language semantics) is defined informally There are various levels of "meaning" and semantics, for example static semantics and dynamic semantics. When one uses a parser/compiler generator to implement a programming language, the static semantics is often defined formally in the grammar and associated rules. The dynamic semantics is usually not, or is defined by a translation to some other form (intermediate or machine code), but even here the translation is driven and defined by the parse/derivation tree. > The grammar defines only syntax. No, it also defines the structure, through the parse/derivation tree.
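The structural-induction idea can be written down directly. A minimal sketch (Python, illustrative; single-digit operands only, not anything posted in the thread): take the usual unambiguous grammar Expr -> Term { '+' Term }, Term -> Factor { '*' Factor }, give each production a "meaning function", and the unique derivation forces 26.

```python
def evaluate(s):
    """Meaning by structural induction on the derivation of s under
    Expr -> Term { '+' Term } ; Term -> Factor { '*' Factor } ; Factor -> digit."""
    pos = 0
    def peek():
        return s[pos] if pos < len(s) else None
    def factor():                  # Factor -> digit; meaning = digit value
        nonlocal pos
        d = int(s[pos]); pos += 1
        return d
    def term():                    # Term -> Factor { '*' Factor }
        nonlocal pos
        v = factor()
        while peek() == '*':
            pos += 1
            v *= factor()          # meaning = product of the factors
        return v
    def expr():                    # Expr -> Term { '+' Term }
        nonlocal pos
        v = term()
        while peek() == '+':
            pos += 1
            v += term()            # meaning = sum of the terms
        return v
    return expr()

print(evaluate("2*3+4*5"))         # 26, never 70
```

Each parsing function corresponds to one metasymbol, and its return value is the meaning of the subtree it derived; "multiplication before addition" is not a fuzzy rule here but a consequence of Term sitting below Expr in the grammar.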
> Furthermore the merits of formal grammar definitions are debatable, because > the complexity of grammar descriptions usually makes it impossible to judge > if a given string belongs to the formal language or not. Entirely wrong. Can you give an example of this impossibility? > Normally BNF is all one needs to define syntax. As was already replied, BNF (in its many forms) is a formal grammar description formalism. > Maybe, but they are totally irrelevant for the grammars of real programming > languages. No, they are relevant. For example, you want limited look-ahead to make parsing fast, and for most "little languages" for which you implement a parser manually, you want the language to be suitable for recursive descent parsing. > On one hand these classes do not capture real differences > between syntaxes of programming languages. What sort of "real differences" do you mean? > On the other hand, the classes > they do describe have no other practical interest than torturing students > of CS classes. I don't want to become too personal, but have you had a Horrible Experience with some poor formal-languages class that has left a permanent scar in your mind? My wife had something like that; she was among the first students of computer science in our university, and the teachers at that time were also beginners. I think most of the students were mystified by the course. -- Niklas Holsti Tidorum Ltd niklas holsti tidorum fi . @ . ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Verified compilers? 2012-03-08 18:02 ` Niklas Holsti @ 2012-03-08 20:40 ` Dmitry A. Kazakov 2012-03-09 0:44 ` Georg Bauhaus 2012-03-09 22:13 ` Niklas Holsti 0 siblings, 2 replies; 63+ messages in thread From: Dmitry A. Kazakov @ 2012-03-08 20:40 UTC (permalink / raw) On Thu, 08 Mar 2012 20:02:25 +0200, Niklas Holsti wrote: > On 12-03-07 22:17 , Dmitry A. Kazakov wrote: >> On Wed, 07 Mar 2012 19:49:04 +0200, Niklas Holsti wrote: >> >>> On 12-03-07 11:12 , Dmitry A. Kazakov wrote: >>> >>>> I was always wondering why people bother to study grammars, LR, LL parsers >>>> and other useless stuff. >>> >>> Grammars and the notion of parsing (derivations) rigorously define what >>> a given string of characters means, in some formally defined language. >> >> No. A grammar defines some formal language: a set of strings. > > Ok, I was being too brief. You are right that a grammar defines a > language. But notice that I was also speaking of parsing and > derivations, not just grammars. > > If you want to define the meanings of the sentences of a language in a > rigorous way, you have basically two options: > > 1. Define the meaning operationally, by giving a standard interpreter, a > black box, for example a Turing machine, that accepts the language, and > define that the actions of that machine, while reading a sentence, are > the meaning of that sentence. (From your point of view, the meaning > could be the "translation" that appears on the output tape.) > > 2. Define the meaning mathematically in some way. Note that by > "mathematically" I don't necessarily mean formally proven math, but > mathematically precise definitions with the usual amount of hand-waving > and shortcuts. 1 and 2 are same to me. In all cases it is an input for some machine. An inference system based on Peano axioms or whatever, is no less a machine than the Turing one. The only difference is in the computational power of such machines. 
> As far as I know, all methods -- formal or informal -- for defining the > meaning of sentences in formal languages are based on how the sentence > is derived, or equivalently how it is parsed. No. 1234 has the same meaning for Ada parser, human reader and an OCR system. > How else could you explain why "2*3+4*5" means 26, and not 70, nor > something else? Explain? It is a subject of proof. There are more than one way to do it. BTW the "meaning" you are advocating for looks rather something like 23*45*+. > In practice, the role of the parse/derivation tree appears in the fact > that parser/compiler generators let you attach "semantic actions" to > grammar productions, and often help you to consider larger parts of the > derivation by passing information around the parse/derivation tree in > well-defined ways. Attribute grammars are one example. Parsing tree is just a less ambiguous representation of the program. You don't even need to generate it in order to evaluate 2*3+4*5 using a finite state machine. And you certainly do not need it when using your head or calculator. >> The grammar defines only syntax. > > No, it also defines the structure, through the parse/derivation tree. I was talking about formal languages. Do you want to say that the trace of productions applied during parsing is the program semantics? It is amusing to read how LRM Annex P tries to describe operator precedence by syntactical means. This is a remnant of a delusion shared by some that formal languages could describe, if not natural languages, but at least machine languages. I thought nobody seriously believes in this now. Care to carve a grammar where declare X : Integer := 1.0 + 2; would be a syntax error? I took my journey in the opposite direction. If you look at the grammar: http://www.dmitry-kazakov.de/ada/components.htm#12.2 It has just one production rule. 
>> Furthermore the merits of formal grammar definitions are debatable, because >> the complexity of grammar descriptions usually makes it impossible to judge >> if a given string belongs to the formal language or not. > > Entirely wrong. Can you give an example of this impossibility? Are you claiming that if I gave you a random set of production rules and a random string you could tell me [in 30 seconds time] if that string belongs to the language? >> Maybe, but they are totally irrelevant for the grammars of real programming >> languages. > > No, they are relevant. For example, you want limited look-ahead to make > parsing fast, and for most "little languages" for which you implement a > parser manually, you want the language to be suitable for recursive > descent parsing. Really? Comp.compilers is busy with silly questions about how to modify the grammar [a list of productions follows] to parse [a description] by parser X. How is this activity different from programming? People willingly choose a bad programming language [of productions] for what reason? Why not write the parser straight in Ada? I do. >> On one hand these classes do not capture real differences >> between syntaxes of programming languages. > > What sort of "real differences" do you mean? Like between C++ and Ada. >> On the other hand, the classes >> they do describe have no other practical interest than torturing students >> of CS classes. > > I don't want to become too personal, but have you had a Horrible > Experience with some poor formal-languages class that has left a > permanent scar in your mind? My wife had something like that; she was > among the first students of computer science in our university, and the > teachers at that time were also beginners. I think most of the students > were mystified by the course. No. Actually I enjoyed the subject for its simplicity compared to "hard" mathematics, and for its elegance too.
I have a weakness for all sorts of formalisms, which is typical of the Russian mathematical school. My major was mathematics. I didn't want to become a programmer, it just happened to me during the 5th or 6th semester. That time I was exposed to the classical Aho and Ullman. "Ahoullmann" we called it, which in Russian has a frivolous connotation, though with a note of awe and respect. This fully applied to me, I could not believe my eyes. Is it really so bad? How could anybody do anything out of this horror? I was saved by David Gries. He is a really great man, who deserves all possible fame and respect. His book on compiler construction opened my eyes. Another great book that shaped my attitude to formal languages was the Griswold, Poage, and Polonsky introduction to SNOBOL4. It helped me to understand what patterns actually are, and ultimately, why all of them, even the ones of SNOBOL, are bad. -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Verified compilers? 2012-03-08 20:40 ` Dmitry A. Kazakov @ 2012-03-09 0:44 ` Georg Bauhaus 2012-03-09 22:13 ` Niklas Holsti 1 sibling, 0 replies; 63+ messages in thread From: Georg Bauhaus @ 2012-03-09 0:44 UTC (permalink / raw) On 08.03.12 21:40, Dmitry A. Kazakov wrote: > On Thu, 08 Mar 2012 20:02:25 +0200, Niklas Holsti wrote: >> How else could you explain why "2*3+4*5" means 26, and not 70, nor >> something else? > > Explain? It is a subject of proof. There are more than one way to do it. > BTW the "meaning" you are advocating for looks rather something like > 23*45*+.

No! It must mean "at least zero twos followed by a three or more of these, with maybe fours coming next, and finally a five". + is clearly a macro. And, for proof, in which school did they ever teach that a star denotes multiplication? A Kleene star is a Kleene star.

No! "2*3+4*5" must have a syntax error. What it really means is this:

Python>>> "2"*3+"4"*5
'22244444'

What? No! Dumb animal, this Python. The mathematical interpretation is obviously right:

Perl-DB<1> print "2"*3+"4"*5
26

Excuse me, no! A Programming Language was there first, and it says,

 * 2*3+4*5
Inf

Because, obviously,

 * 2*1027
Inf

Uh, no, not any more. Are you guys still using that old APL dialect? Look at J for the real meaning:

   2*3+4*5
46

Because

   3+4*5
23

Nah! Ever since 1969, SNOBOL-4 clearly shows that

 2*3+4*5
 2*3+4*5
 ^
stdin:1: *** Illegal character in element

Because a string needs quotes around it, for one thing!

Right, that's been known by Lispers for a long time:

* 2*3+4*5
Error in KERNEL::UNBOUND-SYMBOL-ERROR-HANDLER: the variable 2*3+4*5 is unbound.
  [Condition of type UNBOUND-VARIABLE]

Really, you can't remove the quotes from a string and expect it to have meaning? Look, all it means when output is just itself!
SNOBOL4 (Version 3.11, May 19, 1975)
  Bell Telephone Laboratories, Incorporated

 OUTPUT = "2*3+4*5"
END
No errors detected in source program

2*3+4*5

Normal termination at level 0

--
> LISP
* 2 + 3 * 4 + 5
2
*
2
*
3
*
3
*
4
*
4
*
5
* (QUIT)

^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Verified compilers? 2012-03-08 20:40 ` Dmitry A. Kazakov 2012-03-09 0:44 ` Georg Bauhaus @ 2012-03-09 22:13 ` Niklas Holsti 2012-03-10 10:36 ` Dmitry A. Kazakov 2012-03-10 13:46 ` Brian Drummond 1 sibling, 2 replies; 63+ messages in thread From: Niklas Holsti @ 2012-03-09 22:13 UTC (permalink / raw) On 12-03-08 22:40 , Dmitry A. Kazakov wrote: > On Thu, 08 Mar 2012 20:02:25 +0200, Niklas Holsti wrote: > >> On 12-03-07 22:17 , Dmitry A. Kazakov wrote: >>> On Wed, 07 Mar 2012 19:49:04 +0200, Niklas Holsti wrote: >>> >>>> On 12-03-07 11:12 , Dmitry A. Kazakov wrote: >>>> >>>>> I was always wondering why people bother to study grammars, LR, LL parsers >>>>> and other useless stuff. >>>> >>>> Grammars and the notion of parsing (derivations) rigorously define what >>>> a given string of characters means, in some formally defined language. >>> >>> No. A grammar defines some formal language: a set of strings. >> >> Ok, I was being too brief. You are right that a grammar defines a >> language. But notice that I was also speaking of parsing and >> derivations, not just grammars. >> >> If you want to define the meanings of the sentences of a language in a >> rigorous way, you have basically two options: >> >> 1. Define the meaning operationally, by giving a standard interpreter, a >> black box, for example a Turing machine, that accepts the language, and >> define that the actions of that machine, while reading a sentence, are >> the meaning of that sentence. (From your point of view, the meaning >> could be the "translation" that appears on the output tape.) >> >> 2. Define the meaning mathematically in some way. Note that by >> "mathematically" I don't necessarily mean formally proven math, but >> mathematically precise definitions with the usual amount of hand-waving >> and shortcuts. > > 1 and 2 are same to me. In all cases it is an input for some machine. An > inference system based on Peano axioms or whatever, is no less a machine > than the Turing one. 
> The only difference is in the computational power of > such machines. I see a big difference in the structure of the machine. In 1, it can be arbitrarily complex and opaque. And remember the halting problem... In 2 (using a parse tree as said later), it has a very specific structure, clearly related to the structure of the language, and much more declarative than in 1. >> As far as I know, all methods -- formal or informal -- for defining the >> meaning of sentences in formal languages are based on how the sentence >> is derived, or equivalently how it is parsed. > > No. 1234 has the same meaning for Ada parser, human reader and an OCR > system. 1234 has no meaning, except as a literal string, until you give it some other meaning, for example as the integer one thousand two hundred and thirty-four. >> How else could you explain why "2*3+4*5" means 26, and not 70, nor >> something else? > > Explain? It is a subject of proof. There are more than one way to do it. Operator precedence, perhaps? See earlier post... > BTW the "meaning" you are advocating for looks rather something like > 23*45*+. More like: the expression is a SUM of (a TERM that is the PRODUCT of (the number 2) and (the number 3)) and (a TERM that is the PRODUCT of (the number 4) and (the number 5)), together with a rule for evaluating such expression trees. >> In practice, the role of the parse/derivation tree appears in the fact >> that parser/compiler generators let you attach "semantic actions" to >> grammar productions, and often help you to consider larger parts of the >> derivation by passing information around the parse/derivation tree in >> well-defined ways. Attribute grammars are one example. > > Parsing tree is just a less ambiguous representation of the program. "Just"? It is the first step in the assignment of meaning to a sentence. The second step is a computation defined inductively on the parse tree.
> You don't even need to generate it in order to evaluate 2*3+4*5 > using a finite state machine. You cannot do it with a finite state machine (assuming you want to allow literals of any size and expressions of any length). You need a stack of numbers, too. The generation of parsing machines from grammars is one of the main achievements of formal language and parsing theory. Why sweat to make the machine manually, when you can do it automatically? > And you certainly do not need it when using your head How do you know what your head is using? :-) > or calculator. Depends on the calculator. If it does not have '(' and ')', or a stack, you may have to follow a parse tree, maybe even write down some intermediate results. >>> The grammar defines only syntax. >> >> No, it also defines the structure, through the parse/derivation tree. > > I was talking about formal languages. Me too. Maybe we understand the term differently? > Do you want to say that the trace of > productions applied during parsing is the program semantics? That trace (or rather the parse tree) is the structure on which the semantics is defined, by structurally inductive definitions. > It is amusing to read how LRM Annex P tries to describe operator precedence > by syntactical means. What do you mean, "tries"? It works, and as a formalism I like it better than the operator precedence definition. I might agree that an intuitive idea of operator precedence helps us human programmers, with our limited and unreliable working memory. > This is a remnant of a delusion shared by some that > formal languages could describe, if not natural languages, but at least > machine languages. I thought nobody seriously believes in this now. Natural languages, ok, since they have no precise definition to begin with, so can only be approximated by formal languages. But I am amazed by your position regarding machine languages. Perhaps we have a different notion of what "describing" means. 
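For what it's worth, Dmitry's 23*45*+ is postfix (Reverse Polish) form, and evaluating it illustrates Niklas's point above that a finite control alone does not suffice: you need a stack of numbers as well, i.e. a pushdown machine. A sketch (Python, illustrative; single-digit operands, names invented for this example):

```python
def eval_postfix(code):
    """Evaluate a postfix string such as '23*45*+' with a finite
    control plus a stack of numbers (a pushdown machine)."""
    stack = []
    for tok in code:
        if tok.isdigit():
            stack.append(int(tok))       # operand: push its value
        else:                            # operator: pop two, push result
            b, a = stack.pop(), stack.pop()
            stack.append(a + b if tok == '+' else a * b)
    return stack.pop()

print(eval_postfix("23*45*+"))           # 26
```

Note that the stack is exactly where the "structure" lives: the postfix string is a linearization of the same tree, and the stack discipline replays it. Unbounded literals or nesting is what pushes the machine beyond finite state.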
> Care to carve a grammar where > > declare > X : Integer := 1.0 + 2; > > would be a syntax error? Easy, the grammar that defines only the empty string :-) But I guess you are thinking about types, and how to make the mixing of integers and floating point illegal. It is trivial to write a grammar that makes it a syntax error to mix integer literals and floating point literals in the same expression, but you have to duplicate the definition of expressions and have separate metasymbols Integer_Expression and FP_Expression, and all the way down to literals. If you are thinking about identifiers with types, use an attribute grammar that checks types by means of a symbol table, and there you are. Perhaps you don't want to call that a "syntax error", but at least it is a "parse-time error". >>> Furthermore the merits of formal grammar definitions are debatable, because >>> the complexity of grammar descriptions usually makes it impossible to judge >>> if a given string belongs to the formal language or not. >> >> Entirely wrong. Can you give an example of this impossibility? > > Are you claiming that if I gave you a random set of production rules and a > random string you could tell me [in 30 seconds time] if that string belongs > to the language? What have my personal abilities to do with anything? The research in formal languages and parsing has led to tools. Take your grammar, feed it into a parser generator, and in a few seconds it will tell you if the grammar is acceptable (as it usually is, or can be made to be). A few seconds more, and you will have a parser, which will use a few more seconds to tell you if the string is in the language. Or did you have some other meaning for "usually"? Perhaps you want to use informal methods, and ask Jill, Joe, and John if the sentence is legal? Perhaps with majority voting? >>> Maybe, but they are totally irrelevant for the grammars of real programming >>> languages. >> >> No, they are relevant. 
>> For example, you want limited look-ahead to make >> parsing fast, and for most "little languages" for which you implement a >> parser manually, you want the language to be suitable for recursive >> descent parsing. > > Really? Comp.compilers is busy with silly questions about how to modify the > grammar [a list of productions follows] to parse [a description] by parser > X. How is this activity different from programming? How is logic programming different from functional programming different from event-driven programming different from ... It is all about describing things with sufficient precision for automated processing. From this point of view, grammars are an application-specific programming language, for programming parsers. Nothing wrong with that. And it was made possible by the research into formal languages and parsing. > People willingly choose a bad programming language [of productions] > for what reason? Why not write the parser straight in Ada? I do. There is much disagreement about which languages are "good", which "bad". By all means write your parsers in Ada; I do too, although I let OpenToken do the lexical analysis. But I have simple languages, with simple semantics. For more complex cases, and especially for any attribute evaluation that requires several passes over the parse tree, attribute grammars are hard to beat for convenience. If you want to describe what language your hand-written parser accepts, and what its processing means, I think you have to present the grammar and describe the meaning of the sentences. You can show the grammar in BNF, and describe the meaning informally, but I would be very surprised if you could define the meaning without referring to the BNF, that is, without using some structural induction method (informally, of course). >>> On one hand these classes do not capture real differences >>> between syntaxes of programming languages. >> >> What sort of "real differences" do you mean? >> >> Like between C++ and Ada. 
Sorry, not clear yet. I'm sure you don't mean just that one uses {} and the other begin-end, but what do you mean? >> I don't want to become too personal, but have you had a Horrible >> Experience with some poor formal-languages > No. Actually I enjoyed the subject for its simplicity compared to "hard" > mathematics, and for its elegance too. I have a weakness to all sorts of > formalisms, which is typical to Russian mathematical school. My major was > mathematics. That sounds like the Dmitry I've met before on c.l.a. (My major was math, too.) > I didn't want to become a programmer, it just happened to me during 5th or > 6th semester. That time I was exposed to the classical Aho and Ullmann. That is the book we used. I don't remember the edition. > I was saved by David Gries. He is a really great man, who deserves all > possible fame and respect. His book on compiler construction opened my > eyes. I don't remember reading that book. I'll look out for it. A stimulating discussion, thanks Dmitry. But I guess, no conversions of beliefs :-) -- Niklas Holsti Tidorum Ltd niklas holsti tidorum fi . @ . ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Verified compilers? 2012-03-09 22:13 ` Niklas Holsti @ 2012-03-10 10:36 ` Dmitry A. Kazakov 2012-03-10 20:35 ` Niklas Holsti 2012-03-10 13:46 ` Brian Drummond 1 sibling, 1 reply; 63+ messages in thread From: Dmitry A. Kazakov @ 2012-03-10 10:36 UTC (permalink / raw) On Sat, 10 Mar 2012 00:13:19 +0200, Niklas Holsti wrote: > On 12-03-08 22:40 , Dmitry A. Kazakov wrote: >> On Thu, 08 Mar 2012 20:02:25 +0200, Niklas Holsti wrote: >> >>> On 12-03-07 22:17 , Dmitry A. Kazakov wrote: >>>> On Wed, 07 Mar 2012 19:49:04 +0200, Niklas Holsti wrote: >>>> >>>>> On 12-03-07 11:12 , Dmitry A. Kazakov wrote: >>>>> >>>>>> I was always wondering why people bother to study grammars, LR, LL parsers >>>>>> and other useless stuff. >>>>> >>>>> Grammars and the notion of parsing (derivations) rigorously define what >>>>> a given string of characters means, in some formally defined language. >>>> >>>> No. A grammar defines some formal language: a set of strings. >>> >>> Ok, I was being too brief. You are right that a grammar defines a >>> language. But notice that I was also speaking of parsing and >>> derivations, not just grammars. >>> >>> If you want to define the meanings of the sentences of a language in a >>> rigorous way, you have basically two options: >>> >>> 1. Define the meaning operationally, by giving a standard interpreter, a >>> black box, for example a Turing machine, that accepts the language, and >>> define that the actions of that machine, while reading a sentence, are >>> the meaning of that sentence. (From your point of view, the meaning >>> could be the "translation" that appears on the output tape.) >>> >>> 2. Define the meaning mathematically in some way. Note that by >>> "mathematically" I don't necessarily mean formally proven math, but >>> mathematically precise definitions with the usual amount of hand-waving >>> and shortcuts. >> >> 1 and 2 are same to me. In all cases it is an input for some machine. 
>> An inference system based on Peano axioms or whatever, is no less a machine >> than the Turing one. The only difference is in the computational power of >> such machines. > > I see a big difference in the structure of the machine. Which is more or less equivalent to its power. >>> As far as I know, all methods -- formal or informal -- for defining the >>> meaning of sentences in formal languages are based on how the sentence >>> is derived, or equivalently how it is parsed. >> >> No. 1234 has the same meaning for Ada parser, human reader and an OCR >> system. > > 1234 has no meaning, except as a literal string, until you give it some > other meaning, for example as the integer one thousand two hundred and > thirty four. 1234 is an example of a language (of decimal literal). The point is that the meaning of the sentence in that language does not depend on how you would parse it. An OCR system could simply compare pictures. What the human brain does, nobody knows. An Ada program would yield a machine number. Where is a tree? Which tree?

      +
     / \
    4   +
       / \
      *   ...
     / \
    3   10

This one? There is a countable infinity of such trees. A German would say that this is twelve hundred, four and thirty (12,4,3). A way different structure then? The meaning has nothing to do with parsing. In fact parsing has to be validated against the meaning. >>> How else could you explain why "2*3+4*5" means 26, and not 70, nor >>> something else? >> >> Explain? It is a subject of proof. There are more than one way to do it. I didn't mean variations of operator precedence generating different results. I meant that 2*3+4*5 = 26 is a statement to be proven. It is either true or false (or undecidable). 26 is the value of the formula. >> BTW the "meaning" you are advocating for looks rather something like >> 23*45*+. 
> > More like: the expression is a SUM of (a TERM that is the PRODUCT of > (the number 2) and (the number 3)) and (a TERM that is the PRODUCT of > (the number 4) and (the number 5)), together with a rule for evaluating > such expression trees. The above is alike, the difference is that it uses Reverse Polish notation instead of your tree. Why is your "meaning" better? >>> In practice, the role of the parse/derivation tree appears in the fact >>> that parser/compiler generators let you attach "semantic actions" to >>> grammar productions, and often help you to consider larger parts of the >>> derivation by passing information around the parse/derivation tree in >>> well-defined ways. Attribute grammars are one example. >> >> Parsing tree is just a less ambiguous representation of the program. > > "Just"? It is the first step in the assignment of meaning to a sentence. It could be a step of evaluating the sentence, validating it etc. But the meaning of the sentence is already assigned by the sentence itself (provided the sentence is legal). > The second step is a computation defined inductively on the parse tree. The second step is semantic analysis, if we consider compilers. The AST gets a lot of kicking at this stage: transformed, manipulated, pruned, etc. >> You don't even need to generate it in order to evaluate 2*3+4*5 >> using a finite state machine. > > You cannot do it with a finite state machine (assuming you want to allow > literals of any size and expressions of any length). You need a stack of > numbers, too. First, a stack is not a tree. Second, regarding finiteness, that is of course true, just because an infinite language cannot be mapped (injectively) onto a finite number of states. But this has nothing to do with the "structure." Once you limit the language you can parse it by brute force, no trees, no stacks, just strcmp. > The generation of parsing machines from grammars is one of the main > achievements of formal language and parsing theory. 
> Why sweat to make > the machine manually, when you can do it automatically? Because:

1. You need not generate anything. All infix expressions can be parsed by the same engine controlled by a few tables.

2. The efforts are immense. I already gave an example of Ada grammar capturing operator precedence, association constraints (e.g. 2**+6), left to right, right to left etc. It is a huge amount of work. And you also need to verify if the mess of productions you have written indeed means that "unary + is illegal as power." Because productions are NOT the meaning, they are just a program.

3. You need tools, you need integration of the generated code into your project. You need to keep track of changes of the grammar program, of the generated program, of the rest.

>> And you certainly do not need it when using your head > > How do you know what your head is using? :-) Last time I did it, there were no trees inside! (:-)) >> Do you want to say that the trace of >> productions applied during parsing is the program semantics? > > That trace (or rather the parse tree) is the structure on which the > semantics is defined, by structurally inductive definitions. We have to disagree here. To me AST is just a representation, e.g. another language. Together with the operations defined on the ASTs, it is a machine into which language the parser might translate the input. There exist other machines, and there is nothing special about this particular one. >> This is a remnant of a delusion shared by some that >> formal languages could describe, if not natural languages, but at least >> machine languages. I thought nobody seriously believes in this now. > > Natural languages, ok, since they have no precise definition to begin > with, so can only be approximated by formal languages. But I am amazed > by your position regarding machine languages. Perhaps we have a > different notion of what "describing" means. 
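Point 1 ("the same engine controlled by a few tables") is essentially what is known as precedence climbing. A sketch (Python, illustrative; the two-entry precedence table and single-digit operands are stand-ins for a real operator table, not anything posted in the thread):

```python
# One generic engine; the only language-specific part is the table.
PREC = {'+': 1, '*': 2}            # higher number binds tighter

def parse(tokens, pos=0, min_prec=1):
    """Precedence-climbing evaluation of an infix token list.
    Returns (value, next position)."""
    value = int(tokens[pos]); pos += 1
    while pos < len(tokens) and PREC.get(tokens[pos], 0) >= min_prec:
        op = tokens[pos]
        # climb: right operand may only use tighter-binding operators
        rhs, pos = parse(tokens, pos + 1, PREC[op] + 1)  # left-assoc.
        value = value + rhs if op == '+' else value * rhs
    return value, pos

print(parse(list("2*3+4*5"))[0])   # 26
```

Changing precedence or adding operators means editing the table, not regenerating a parser, which is the attraction of the table-driven approach; the counterargument upthread is that the grammar-generated parser documents itself.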
> >> Care to carve a grammar where >> >> declare >> X : Integer := 1.0 + 2; >> >> would be a syntax error? > > Easy, the grammar that defines only the empty string :-) > > But I guess you are thinking about types, and how to make the mixing of > integers and floating point illegal. It is trivial to write a grammar > that makes it a syntax error to mix integer literals and floating point > literals in the same expression, but you have to duplicate the > definition of expressions and have separate metasymbols > Integer_Expression and FP_Expression, and all the way down to literals. ... and for all types, user-defined included. And also - "a return required in the function". And also - all alternatives of a case do not intersect and cover all choices ... Just for the start. (Grammar is not the semantics) > If you are thinking about identifiers with types, use an attribute > grammar that checks types by means of a symbol table, and there you are. > Perhaps you don't want to call that a "syntax error", but at least it is > a "parse-time error". That is yet another nail in the coffin. Grammars are not only incapable of describing language semantics beyond classroom exercises. Even if they could, that would not yet make a compiler. A compiler should also explain why it rejects an illegal program. Formally speaking, the language being parsed consists of all possible strings. Nothing is a syntax error. If the input is illegal in language 1 (e.g. Ada), then in language 2 it is legal and yields "closing bracket is missing for the opening bracket found at xx:yy." >>>> Furthermore the merits of formal grammar definitions are debatable, because >>>> the complexity of grammar descriptions usually make it impossible to judge >>>> if a given string do belong to the formal language or not. >>> >>> Entirely wrong. Can you give an example of this impossibility? 
>> Are you claiming that if I gave you a random set of production rules and a >> random string you could tell me [in 30 seconds time] if that string belongs >> to the language? > > What have my personal abilities to do with anything? If nobody can understand grammar descriptions, they cannot serve as a meaning-carrier. > Perhaps you want to use informal methods, and ask Jill, Joe, and John if > the sentence is legal? Perhaps with majority voting? Absolutely. To me returning a limited object from a function is rubbish, but the majority voted it meaningful. Anyway, there is a rock bottom for any formalism, beyond which everything becomes informal. > From this point of view, grammars are an application-specific > programming language, for programming parsers. Yes. > Nothing wrong with that. What is wrong is that they serve that purpose rather poorly. > And it was made possible by the research into formal languages and parsing. These are not worth the paper they were printed on. But that is my personal opinion. CS in general has turned out to be quite fruitless. Maybe because the subject is too hard for our contemporary mathematics. Maybe because humans are simply stupid, and we will never advance beyond our limits. Who knows. > If you want to describe what language your hand-written parser accepts, > and what its processing means, I think you have to present the grammar > and describe the meaning of the sentences. Or, I can present the tables with the same effect. > You can show the grammar in > BNF, and describe the meaning informally, but I would be very surprised > if you could define the meaning without referring to the BNF, that is, > without using some structural induction method (informally, of course). BTW, mathematical notation existed long before BNF. >>>> On one hand these classes do not capture real differences >>>> between syntaxes of programming languages. >>> >>> What sort of "real differences" do you mean? >> >> Like between C++ and Ada. > > Sorry, not clear yet. 
I'm sure you don't mean just that one uses {} and > the other begin-end, but what do you mean? I mean that a human reader instantly recognizes that the language is "like C++" or "like Ada". This means that somewhere in the brain there is a picture, a class of syntaxes that we, as humans, are capable of operating with. On the contrary, the classes of formal languages are difficult for us to recognize. If anybody were presented with a sample from this or that class, he would not be able to figure out what it was. It is like the FFT of an image. Can you tell if the picture shows a cat or a girl by looking at its spectrum? Ergo, spectral analysis is poor for scene recognition. (The difference is, though, that spectral analysis is useful for other things, while grammars are useful for nothing... (:-)) -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 63+ messages in thread
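The claim above that 2*3+4*5 can be evaluated "by the same engine controlled by a few tables", with a stack of numbers but no tree ever materialized, can be sketched concretely. This is an illustrative sketch in Python (not Ada, purely for brevity) of one such engine, in the shunting-yard style; it is not either poster's actual code.

```python
# Evaluating "2*3+4*5" with no parse tree: a fixed engine driven by a
# small precedence table, plus a value stack and an operator stack.

PREC = {'+': 1, '*': 2}  # a few table entries drive the whole engine

def evaluate(text):
    values, ops = [], []          # value stack and operator stack

    def reduce_top():
        op = ops.pop()
        b, a = values.pop(), values.pop()
        values.append(a + b if op == '+' else a * b)

    for tok in text:
        if tok.isdigit():
            values.append(int(tok))
        else:
            # reduce while the stacked operator binds at least as tightly
            while ops and PREC[ops[-1]] >= PREC[tok]:
                reduce_top()
            ops.append(tok)
    while ops:
        reduce_top()
    return values[0]

print(evaluate("2*3+4*5"))  # 26
```

Note that, as Holsti says, the stack is essential: a pure finite state machine over the digits alone could not hold the pending operands of arbitrarily long expressions.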
* Re: Verified compilers? 2012-03-10 10:36 ` Dmitry A. Kazakov @ 2012-03-10 20:35 ` Niklas Holsti 2012-03-11 9:47 ` Dmitry A. Kazakov 2012-03-11 10:55 ` Georg Bauhaus 0 siblings, 2 replies; 63+ messages in thread From: Niklas Holsti @ 2012-03-10 20:35 UTC (permalink / raw) On 12-03-10 12:36 , Dmitry A. Kazakov wrote: > On Sat, 10 Mar 2012 00:13:19 +0200, Niklas Holsti wrote: > >> On 12-03-08 22:40 , Dmitry A. Kazakov wrote: >>> On Thu, 08 Mar 2012 20:02:25 +0200, Niklas Holsti wrote: >>> >>>> On 12-03-07 22:17 , Dmitry A. Kazakov wrote: >>>>> On Wed, 07 Mar 2012 19:49:04 +0200, Niklas Holsti wrote: >>>>> >>>>>> On 12-03-07 11:12 , Dmitry A. Kazakov wrote: >>>>>> >>>>>>> I was always wondering why people bother to study grammars, LR, LL parsers >>>>>>> and other useless stuff. >>>>>> >>>>>> Grammars and the notion of parsing (derivations) rigorously define what >>>>>> a given string of characters means, in some formally defined language. >>>>> >>>>> No. A grammar defines some formal language: a set of strings. >>>> >>>> Ok, I was being too brief. You are right that a grammar defines a >>>> language. But notice that I was also speaking of parsing and >>>> derivations, not just grammars. >>>> >>>> If you want to define the meanings of the sentences of a language in a >>>> rigorous way, you have basically two options: >>>> >>>> 1. Define the meaning operationally, by giving a standard interpreter, a >>>> black box, for example a Turing machine, that accepts the language, and >>>> define that the actions of that machine, while reading a sentence, are >>>> the meaning of that sentence. (From your point of view, the meaning >>>> could be the "translation" that appears on the output tape.) >>>> >>>> 2. Define the meaning mathematically in some way. Note that by >>>> "mathematically" I don't necessarily mean formally proven math, but >>>> mathematically precise definitions with the usual amount of hand-waving >>>> and shortcuts. >>> >>> 1 and 2 are same to me. 
In all cases it is an input for some machine. An >>> inference system based on Peano axioms or whatever, is no less a machine >>> than the Turing one. The only difference is in the computational power of >>> such machines. >> >> I see a big difference in the structure of the machine. > > Which is more or less equivalent to its power. Van Wijngaarden grammars are Turing complete (which also causes problems, of course). One should use the simplest formalism that is powerful enough for one's needs. You probably don't consider grammars and parsing tools simple, but that is a matter of opinion and personal taste. >>>> As far as I know, all methods -- formal or informal -- for defining the >>>> meaning of sentences in formal languages are based on how the sentence >>>> is derived, or equivalently how it is parsed. >>> >>> No. 1234 has the same meaning for Ada parser, human reader and an OCR >>> system. >> >> 1234 has no meaning, except as a literal string, until you give it some >> other meaning, for example as the integer one thousand two hundred and >> thirty four. > > 1234 is an example of a language (of decimal literal). Or an example of octal literals, or hexadecimal literals, or telephone numbers. The meaning is not intrinsic in the string; it must be defined by other means. > The point is that > the meaning of the sentence in that language does depend on how you would > parse it. Is there a "not" missing in that sentence? Otherwise it seems to agree with me that the meaning depends on the parsing, roughly speaking. > An OCR system could simply compare pictures. What human brain > does, nobody knows. An Ada program would yield a machine number. Where is a > tree? Which tree? It is not intrinsic in the string. The grammar, and the corresponding parse trees, are tools for defining the strings in the language, and the meaning of each string. > > + > / \ > 4 + > / \ > * > / \ > 3 10 > > This one? There is a countable infinity of such trees. 
A German would say > that this is twelve hundred, four and thirty (12,4,3). A way different > structure then? Yes, the grammar of (decimal) literals can be written in several ways, and all of these grammars can be attributed to define the same meaning (numerical value) of the string (of course they can also be attributed in different ways, to define different meanings). So what? I never claimed that the grammar is unique, just that grammars and parse trees are IMO the unique known method to define the meanings of complex sentences. > The meaning has nothing to do with parsing. Ok, if you can define the meaning in some other way, and not by parsing. How do you define it? > In fact parsing has to be validated against the meaning. Why? If you define the meaning without using parsing, why is that validation necessary? >>>> How else could you explain why "2*3+4*5" means 26, and not 70, nor >>>> something else? >>> >>> Explain? It is a subject of proof. There are more than one way to do it. >> >> Operator precedence, perhaps? See earlier post... > > I didn't mean variations of operator precedence generating different > results. I meant that 2*3+4*5 = 26 is a statement to be proven. It is > either true or false (or undecidable). 26 is the value of the formula. The truth of the formula depends on the meaning assigned to the expressions in the formula. Before one can prove anything, one must define the proof rules, including the rules for modifying sentences (expressions) while preserving their meaning. The latter rules are derived from the rules that define the meaning of an expression based on the structure (parse tree) of the expression, and the properties of the operators in the expression. >>> BTW the "meaning" you are advocating for looks rather something like >>> 23*45*+. 
>> More like: the expression is a SUM of (a TERM that is the PRODUCT of >> (the number 2) and (the number 3)) and (a TERM that is the PRODUCT of >> (the number 4) and (the number 5)), together with a rule for evaluating >> such expression trees. > > The above is alike, the difference is that it uses the Polish Reverse > instead of your tree. Why is your "meaning" better? My intent was to show a tree. I was too lazy to use ASCII art, so I used parentheses instead, to make the groupings unambiguous. Please imagine a drawing of the tree instead. Note that the tree is not yet the meaning, but the meaning of the tree (and thus the meaning of the sentence) is defined by structural inductive rules over the tree. >>>> In practice, the role of the parse/derivation tree appears in the fact >>>> that parser/compiler generators let you attach "semantic actions" to >>>> grammar productions, and often help you to consider larger parts of the >>>> derivation by passing information around the parse/derivation tree in >>>> well-defined ways. Attribute grammars are one example. >>> >>> Parsing tree is just a less ambiguous representation of the program. >> >> "Just"? It is the first step in the assignment of meaning to a sentence. > > It could be a step of evaluating the sentence, validating it etc. But the > meaning of the sentence is already assigned by the sentence itself > (provided the sentence is legal). No, meanings are not inherent in sentences. There must be a definition of the meaning of an arbitrary sentence: a function "meaning" from the set of sentences to the set of meanings. For use in computation, this function must be defined in a practically computable way. It is not enough that we have an intuitive or automatic understanding of the sentence. >> The second step is a computation defined inductively on the parse tree. > > The second step is semantic analysis, if we consider compilers. 
AST gets a > lot of kicking at this stage: transformed, manipulated, pruned etc. Right. The AST represents the meaning of the program, and the compiler applies meaning-preserving transformations to the AST. This requires a definition of the meaning of an AST (for compilers: the computation represented by the AST). The only practical way to define the meaning of an AST is by inductive rules using the structure of the AST. >>> You don't even need to generate it in order to evaluate 2*3+4*5 >>> using a finite state machine. >> >> You cannot do it with a finite state machine (assuming you want to allow >> literals of any size and expressions of any length). You need a stack of >> numbers, too. > > First, stack is not tree. True, but it is the simplest structure that lets finite state machines handle this kind of computation. I was humoring you by avoiding trees. > Second, regarding finiteness, that is of course > true, just because an infinite language cannot be mapped (injectively) onto > a finite number of states. But this has nothing to do with the "structure." > Once you limit the language you can parse it by brute force, no trees, no > stacks, just strcmp. Eh? I don't understand what you are talking about. Finite languages? Not worth discussing, IMO. >> The generation of parsing machines from grammars is one of the main >> achievements of formal language and parsing theory. Why sweat to make >> the machine manually, when you can do it automatically? > > Because: > > 1. You need not to generate anything. All infix expressions can be parsed > by the same engine controlled by a few tables. Infix expressions are a simple case. > 2. The efforts are immense. I already gave an example of Ada grammar > capturing operator precedence, association constraints (e.g. 2**+6), left > to right, right to left etc. It is a huge work. Hardly huge. There are only 6 levels: expression, relation, simple_expression, term, factor, primary. 
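The six levels Holsti lists map one function per grammar level in a recursive-descent parser, with precedence falling out of which level references which. A trimmed sketch (Python for brevity, only three of the six levels and single-digit literals; an illustration of the structure, not Ada's actual grammar):

```python
# One function per grammar level; each level calls the next tighter
# one, so "*" binds more strongly than "+" without any explicit
# precedence table.

def parse(text):
    toks, pos = list(text), [0]

    def peek():
        return toks[pos[0]] if pos[0] < len(toks) else None

    def primary():                  # primary ::= digit
        pos[0] += 1
        return int(toks[pos[0] - 1])

    def term():                     # term ::= primary {"*" primary}
        value = primary()
        while peek() == '*':
            pos[0] += 1
            value *= primary()
        return value

    def simple_expression():        # simple_expression ::= term {"+" term}
        value = term()
        while peek() == '+':
            pos[0] += 1
            value += term()
        return value

    return simple_expression()
```

Here the run-time call structure of the functions is exactly the parse tree, which is why such parsers need not build one explicitly.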
> And you also need to verify > if the mess of productions you have written indeed means that "unary + is > illegal as power." Because productions is NOT the meaning, they are just a > program. And have you created tables for parsing Ada? How long did it take you, and are you sure they are correct? > 3. You need tools, you need integration of the generated code into your > project. You need to keep track of changes of the grammar program, of the > generated program, of the rest. True. But many SW projects already use one or more such tools: preprocessors, GUI code generators, DB interface generators, ... >>> Care to carve a grammar where >>> >>> declare >>> X : Integer := 1.0 + 2; >>> >>> would be a syntax error? >> >> Easy, the grammar that defines only the empty string :-) >> >> But I guess you are thinking about types, and how to make the mixing of >> integers and floating point illegal. It is trivial to write a grammar >> that makes it a syntax error to mix integer literals and floating point >> literals in the same expression, but you have to duplicate the >> definition of expressions and have separate metasymbols >> Integer_Expression and FP_Expression, and all the way down to literals. > > ... and for all types, user-defined included. And also - "a return required > in the function". And also - all alternatives of a case do not intersect > and covering all choices ... Just for the start. The practical way is to use attribute grammars: >> If you are thinking about identifiers with types, use an attribute >> grammar that checks types by means of a symbol table, and there you are. >> Perhaps you don't want to call that a "syntax error", but at least it is >> a "parse-time error". > > That is yet another nail in the coffin. Grammars are not only incapable to > describe language semantics beyond classroom exercises. I don't agree. But so be it. > Even if they could, > that would not yet make a compiler. 
A compiler should also explain why it > rejects illegal program. Here we are entering usability and friendliness, which are quality-of-implementation issues. There has certainly been work in the formal-languages and parsing theory communities on error reporting and automatic recovery from parsing errors. And I know some manually written parsers that are not very friendly with error reports... > Formally speaking, the language being parsed consists of all possible > strings. Nothing is a syntax error. If the input is illegal in the language > 1 (e.g. Ada), then in the language 2 it is legal and yields "closing > bracket is missing for the opening bracket found at xx:yy." I don't understand your point. Are you being ironical? >>>>> Furthermore the merits of formal grammar definitions are debatable, because >>>>> the complexity of grammar descriptions usually make it impossible to judge >>>>> if a given string do belong to the formal language or not. ... > If nobody can understand grammar descriptions, they cannot serve as a > meaning-carrier. To be sure, grammars can be complex and hard to grasp. But your alternatives seem worse, IMO. >> If you want to describe what language your hand-written parser accepts, >> and what its processing means, I think you have to present the grammar >> and describe the meaning of the sentences. > > Or, I can present the tables with the same effect. I'll bet any users would prefer the grammar to the tables. >>>>> On one hand these classes do not capture real differences >>>>> between syntaxes of programming languages. >>>> >>>> What sort of "real differences" do you mean? >>> >>> Like between C++ and Ada. >> >> Sorry, not clear yet. I'm sure you don't mean just that one uses {} and >> the other begin-end, but what do you mean? > > I mean that a human reader instantly recognizes that the language is "like > C++" or "like Ada". Sorry, not clear yet. 
Is the human reader looking at some C++ code and some Ada code, or at a C++ grammar and an Ada grammar? Or at a manually written C++ parser and a manually written Ada parser? Or at parsing tables for C++, and parsing tables for Ada? > This means that somewhere in the brain there is a > picture, a class of syntaxes, we, as humans, are capable to operate. I suspect that if you show some non-programmers examples of C++ and Ada code, they will find both equally incomprehensible. > On the > contrary the classes of formal languages are difficult to us to recognize. I agree, they are technical distinctions. > If anybody were presented by a sample from this or that class, he would not > be able to figure out what it were. It is like the FFT of an image. Can you > tell if the picture shows a cat or a girl by looking at its spectrum? Ergo, > spectral analysis is poor for scene recognition. It is a poor way for humans to recognize scenes, sure. But it can be useful in machine vision and recognition applications. > (The difference is though > spectral analysis is useful for other things, while grammars are useful for > nothing... (:-)) Going back to the differences between C++ and Ada, one systematic difference that comes to mind is the "overloading" of the same terminal symbols in C++, especially the closing braces, '}'. Writing too few or too many closing braces can lead to confusing error messages. In contrast, Ada uses different terminal symbol combinations (end if, end loop, end case, end <identifier>) which usually lets the compiler give more informative error messages. I don't think this corresponds to a difference in the formal language classes of C++ and Ada, but I think the difference could be observed by some kind of "grammar metrics" (or "language metrics", if you prefer) that would give Ada a better score than C++, on this metric. But such grammar metrics have not been developed AFAIK. -- Niklas Holsti Tidorum Ltd niklas holsti tidorum fi . @ . 
^ permalink raw reply [flat|nested] 63+ messages in thread
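Holsti's point about distinct closers can be made concrete: with Ada-style `end if`/`end loop` a checker can blame the first wrong closer on the construct that is actually open, information a bare `}` does not carry. A toy sketch (Python for brevity; the token spellings are hypothetical, not a real lexer):

```python
# A minimal matcher over pre-lexed tokens: openers 'if'/'loop',
# closers 'end if'/'end loop'. Because each closer names its opener,
# a mismatch can be reported against the opening position.

def check(tokens):
    stack = []  # (opener, position) pairs
    for i, tok in enumerate(tokens):
        if tok in ('if', 'loop'):
            stack.append((tok, i))
        elif tok.startswith('end'):
            if not stack:
                return "token %d: '%s' closes nothing" % (i, tok)
            opener, where = stack.pop()
            expected = 'end ' + opener
            if tok != expected:
                return ("token %d: found '%s', expected '%s' "
                        "for '%s' opened at token %d"
                        % (i, tok, expected, opener, where))
    if stack:
        opener, where = stack[-1]
        return "'%s' opened at token %d is never closed" % (opener, where)
    return "ok"
```

With anonymous `}` closers every frame on the stack looks alike, so the best such a checker could say is "too few closing braces", detected only at end of input.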
* Re: Verified compilers? 2012-03-10 20:35 ` Niklas Holsti @ 2012-03-11 9:47 ` Dmitry A. Kazakov 2012-03-11 22:22 ` Niklas Holsti 2012-03-11 10:55 ` Georg Bauhaus 1 sibling, 1 reply; 63+ messages in thread From: Dmitry A. Kazakov @ 2012-03-11 9:47 UTC (permalink / raw) On Sat, 10 Mar 2012 22:35:09 +0200, Niklas Holsti wrote: > On 12-03-10 12:36 , Dmitry A. Kazakov wrote: >> >> 1234 is an example of a language (of decimal literal). > > Or an example of octal literals, or hexadecimal literals, or telephone > numbers. The meaning is not intrinsic in the string; it must be defined > by other means. Which only supports my point. >> The point is that >> the meaning of the sentence in that language does depend on how you would >> parse it. > > Is there a "not" missing in that sentence? Yes, sorry. >> An OCR system could simply compare pictures. What human brain >> does, nobody knows. An Ada program would yield a machine number. Where is a >> tree? Which tree? > > It is not intrinsic in the string. Yes, meaning is a mapping of which string is the argument. The way this mapping is computed (if computable) is irrelevant. > The grammar, and the corresponding > parse trees, are tools for defining the strings in the language, and the > meaning of each string. No, only the former. Example: <numeral> ::= <decimal-digit>[<numeral>] This does not define the meaning of 1234. It may hint the human reader that <numeral> has something to do with numbers. But numbers and their properties are defined elsewhere (in mathematics). The mapping: <numeral> corresponds to the number from the set X identified by the means Y must be stated additionally to the grammar. >> + >> / \ >> 4 + >> / \ >> * >> / \ >> 3 10 >> >> This one? There is a countable infinity of such trees. A German would say >> that this is twelve hundred, four and thirty (12,4,3). A way different >> structure then? 
> > Yes, the grammar of (decimal) literals can be written in several ways, > and all of these grammars can be attributed to define the same meaning > (numerical value) of the string (of course they can also be attributed > in different ways, to define different meanings). So what? I never > claimed that the grammar is unique, just that grammars and parse trees > are IMO the unique known method to define the meanings of complex sentences. 1. If not unique, then you cannot claim that the structure of a particular tree is inherent to the language. It is not. 2. If same grammar may be used with different meanings, then the grammar does not define any of them. >> The meaning has nothing to do with parsing. > > Ok, if you can define the meaning in some other way, and not by parsing. > How do you define it? http://en.wikipedia.org/wiki/Positional_notation does not even use the word "grammar" when describes positional notation. (Of course parsing is involved, e.g. your browser will parse the page in order to render it. But that defines neither the meaning of this page nor the meaning of numerals.) Maybe our disagreement lies in that you think that all definitions have to be constructive? They should not. Anyway I can always skip parsing and define meaning by brute force: "1" means 1 "2" means 2 "3" means 3 ... >> In fact parsing has to be validated against the meaning. > > Why? If you define the meaning without using parsing, why is that > validation necessary? Because the meaning exists independently on the parsing, thus I must show that the result of translation corresponds to the meaning. "1234" must be translated into 1234, because that is the meaning of "1234". Not otherwise. >>>>> How else could you explain why "2*3+4*5" means 26, and not 70, nor >>>>> something else? >>>> >>>> Explain? It is a subject of proof. There are more than one way to do it. >>> >>> Operator precedence, perhaps? See earlier post... 
>> >> I didn't mean variations of operator precedence generating different >> results. I meant that 2*3+4*5 = 26 is a statement to be proven. It is >> either true or false (or undecidable). 26 is the value of the formula. > > The truth of the formula depends on the meaning assigned to the > expressions in the formula. Before one can prove anything, one must > define the proof rules, including the rules for modifying sentences > (expressions) while preserving their meaning. Yes. > The latter rules are > derived from the rules that define the meaning of an expression based on > the structure (parse tree) of the expression, and the properties of the > operators in the expression. No, it is based on algebraic properties of numbers. > Note that the tree is not yet the meaning, but the meaning of the tree > (and thus the meaning of the sentence) is defined by structural > inductive rules over the tree. Sure, tree is just another representation. The meanings of the string and of its tree is supposed to be same. Neither is the definition of the meaning, they both just have it in the given context. Moreover, this meaning will be preserved during following stages of compilation. The object code will have it, the executable code will have it too. Keeping this meaning intact is what makes the compiler correct. >>>>> In practice, the role of the parse/derivation tree appears in the fact >>>>> that parser/compiler generators let you attach "semantic actions" to >>>>> grammar productions, and often help you to consider larger parts of the >>>>> derivation by passing information around the parse/derivation tree in >>>>> well-defined ways. Attribute grammars are one example. >>>> >>>> Parsing tree is just a less ambiguous representation of the program. >>> >>> "Just"? It is the first step in the assignment of meaning to a sentence. >> >> It could be a step of evaluating the sentence, validating it etc. 
But the >> meaning of the sentence is already assigned by the sentence itself >> (provided the sentence is legal). > > No, meanings are not inherent in sentences. There must be a definition > of the meaning of an arbitrary sentence: a function "meaning" from the > sent of sentences to the set of meanings. This is precisely same. "A sentence in the language X" reads as follows: a set of strings + a domain set + a mapping from the former to the latter. Without that it is not a sentence. > For use in computation, this > function must be defined in a practically computable way. No, because the domain set in most cases is incomputable or ill-defined. >>>> You don't even need to generate it in order to evaluate 2*3+4*5 >>>> using a finite state machine. >>> >>> You cannot do it with a finite state machine (assuming you want to allow >>> literals of any size and expressions of any length). You need a stack of >>> numbers, too. >> >> First, stack is not tree. > > True, but it is the simplest structure that lets finite state machines > handle this kind of computation. That presumes a certain class of machines. There could be others, computing the result without stack, without any memory, for that matter, just per transitions. OT. I wondered what would happen if the hardware development took other path. I mean 20 years ago, computing was cheaper than volatile storage in relation to what we have now. Consider it shifting back. It is possible when molecular or optical, or whatever architectures evolve. This might require approaches quite different from ones presently in use. This is true for a shift in the opposite direction as well. >> Second, regarding finiteness, that is of course >> true, just because an infinite language cannot be mapped (injectively) onto >> a finite number of states. But this has nothing to do with the "structure." >> Once you limit the language you can parse it by brute force, no trees, no >> stacks, just strcmp. > > Eh? 
I don't understand what you are talking about. Finite languages? Not > worth discussing, IMO. All computers are finite and all programming languages are effectively finite both in the terms of the source code length and the number of states compiler programs may have. >>> The generation of parsing machines from grammars is one of the main >>> achievements of formal language and parsing theory. Why sweat to make >>> the machine manually, when you can do it automatically? >> >> Because: >> >> 1. You need not to generate anything. All infix expressions can be parsed >> by the same engine controlled by a few tables. > > Infix expressions are a simple case. And the rest is just trivial recursive descent. >> 2. The efforts are immense. I already gave an example of Ada grammar >> capturing operator precedence, association constraints (e.g. 2**+6), left >> to right, right to left etc. It is a huge work. > > Hardly huge. There are only 6 levels: expression, relation, > simple_expression, term, factor, primary. For a grammar, it is, because it is hard to reason about productions. >> And you also need to verify >> if the mess of productions you have written indeed means that "unary + is >> illegal as power." Because productions is NOT the meaning, they are just a >> program. > > And have you created tables for parsing Ada? How long did it take you, > and are you sure they are correct? I knew it, because a table has only symbol and its left and right precedence levels. These are directly described in LRM. Association constraints are verified by a single function, which takes left and right operators. When left is ** and right is unary +, that is an error. And note how simple it is to generate a meaningful error message in this case. >> 3. You need tools, you need integration of the generated code into your >> project. You need to keep track of changes of the grammar program, of the >> generated program, of the rest. > > True. 
But many SW projects already use one or more such tools: > preprocessors, GUI code generators, DB interface generators, ... Ah, that sort of argument: people already keep on dying in car accidents. Why should we bother fastening seat belts? (:-)) >> Formally speaking, the language being parsed consists of all possible >> strings. Nothing is a syntax error. If the input is illegal in the language >> 1 (e.g. Ada), then in the language 2 it is legal and yields "closing >> bracket is missing for the opening bracket found at xx:yy." > > I don't understand your point. Are you being ironical? Not at all. The point is that 7#FF# needs to be parsed although illegal. You have to make your parser tolerant to a level typical of a human reader. This necessarily means that the grammar being parsed is not the original grammar, but one relaxed to accept the malformed literal above as well as the illegal formula 2**+6, for which in the original grammar there is no production. >>> If you want to describe what language your hand-written parser accepts, >>> and what its processing means, I think you have to present the grammar >>> and describe the meaning of the sentences. >> >> Or, I can present the tables with the same effect. > > I'll bet any users would prefer the grammar to the tables. Maybe. They are indoctrinated in YACC etc. as they are in C and Java. Users do prefer Java, but we both agree that Java is inferior to Ada. I have to write compilers from domain-specific languages frequently under very tight time constraints. Nobody pays for a compiler these days. Customers just expect the job done as a part of a larger project. They aren't even aware that this requires writing a compiler. Simplicity is one advantage of the table-driven approach. Another is reuse. You can reuse not only the parser itself but 80% of the code around it from project to project. That is because it is Ada and because recursive descent parsers ideally map onto object/procedural decomposition. 
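The table-driven engine described above, a fixed loop plus a table giving each operator a left and a right precedence level, can be sketched briefly. This is a Python illustration of the general precedence-climbing idea (the table values and operator set are illustrative, not those of Kazakov's actual parser):

```python
# One engine, a few tables. Each operator carries a left and a right
# precedence level; right-associative "**" is encoded simply by giving
# it a right level below its left one. Changing the language means
# changing the table, never the engine.

import re

TABLE = {           # operator: (left level, right level)
    '+':  (1, 2),
    '*':  (3, 4),
    '**': (8, 7),   # right < left => right-associative
}

def evaluate(text):
    toks = re.findall(r'\d+|\*\*|[+*]', text)
    pos = [0]

    def expr(min_level):
        value = int(toks[pos[0]])
        pos[0] += 1
        while pos[0] < len(toks):
            op = toks[pos[0]]
            left, right = TABLE[op]
            if left < min_level:
                break               # the caller owns this operator
            pos[0] += 1
            rhs = expr(right)
            if op == '+':
                value += rhs
            elif op == '*':
                value *= rhs
            else:
                value **= rhs
        return value

    return expr(0)
```

An association constraint like "unary + is illegal as power" would be a one-line check at the point where the engine sees `**` followed by a sign, which is also where a precise error message is cheapest to produce.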
>>>>>> On one hand these classes do not capture real differences >>>>>> between syntaxes of programming languages. >>>>> >>>>> What sort of "real differences" do you mean? >>>> >>>> Like between C++ and Ada. >>> >>> Sorry, not clear yet. I'm sure you don't mean just that one uses {} and >>> the other begin-end, but what do you mean? >> >> I mean that a human reader instantly recognizes that the language is "like >> C++" or "like Ada". > > Sorry, not clear yet. Is the human reader looking at some C++ code and > some Ada code, or at a C++ grammar and an Ada grammar? At the code sample, the only recognizable thing about a written grammar is - dear God, why should it happen to me! >> This means that somewhere in the brain there is a >> picture, a class of syntaxes, we, as humans, are capable to operate. > > I suspect that if you show some non-programmers examples of C++ and Ada > code, they will find both equally incomprehensible. Nothing in Plato's sense, the brain above is one of a trained programmer who was exposed to procedural languages. >> If anybody were presented by a sample from this or that class, he would not >> be able to figure out what it were. It is like the FFT of an image. Can you >> tell if the picture shows a cat or a girl by looking at its spectrum? Ergo, >> spectral analysis is poor for scene recognition. > > It is a poor way for humans to recognize scenes, sure. But it can be > useful in machine vision and recognition applications. Yes, but scenes are defined in terms of human perception, so you have to play by the rules. Compilers are similar in that respect because a human programmer is always involved. >> (The difference is though >> spectral analysis is useful for other things, while grammars are useful for >> nothing... (:-)) > > Going back to the differences between C++ and Ada, one systematic > difference that comes to mind is the "overloading" of the same terminal > symbols in C++, especially the closing braces, '}'. 
Writing too few or > too many closing braces can lead to confusing error messages. In > contrast, Ada uses different terminal symbol combinations (end if, end > loop, end case, end <identifier>) which usually lets the compiler give > more informative error messages. > > I don't think this corresponds to a difference in the formal language > classes of C++ and Ada, but I think the difference could be observed by > some kind of "grammar metrics" (or "language metrics", if you prefer) > that would give Ada a better score than C++, on this metric. But such > grammar metrics have not been developed AFAIK. The problem is that we cannot rationalize this. We just don't know how the human brain works. Formal languages failed to reflect the human perception of a language, which is necessary for building a compiler. Other failures, alleged or not, we have already discussed. -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 63+ messages in thread
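Dmitry's point earlier in this message — that the grammar actually parsed is a relaxed one, so that a malformed literal like 7#FF# is accepted syntactically and diagnosed afterwards — can be illustrated with a toy sketch. Python is used for brevity, and the scanner below is an invented example, not taken from any real compiler:

```python
# Toy sketch of a "relaxed grammar" scanner for Ada-like based literals.
# The relaxed pattern accepts base#digits# for any hex digits; the check
# that each digit is legal for the base happens afterwards, so 7#FF#
# yields a diagnostic instead of a parse failure.
import re

BASED = re.compile(r"(\d+)#([0-9A-Fa-f]+)#")

def scan_based_literal(text):
    """Return (value, errors); value is None when diagnostics were issued."""
    m = BASED.fullmatch(text)
    if not m:
        return None, ["not a based literal at all"]
    base = int(m.group(1))
    errors = []
    if not 2 <= base <= 16:
        errors.append(f"base {base} not in 2 .. 16")
    value = 0
    for ch in m.group(2):
        digit = int(ch, 16)
        if digit >= base:
            errors.append(f"digit '{ch}' is illegal in base {base}")
        value = value * base + digit
    return (None if errors else value), errors

print(scan_based_literal("16#FF#"))  # legal: (255, [])
print(scan_based_literal("7#FF#"))   # accepted by the relaxed grammar,
                                     # rejected by the semantic check
```

The language accepted by the regular expression is deliberately larger than the language of legal literals; that is exactly what makes the human-friendly error message possible.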
* Re: Verified compilers? 2012-03-11 9:47 ` Dmitry A. Kazakov @ 2012-03-11 22:22 ` Niklas Holsti 2012-03-12 5:12 ` Niklas Holsti 2012-03-12 9:43 ` Dmitry A. Kazakov 0 siblings, 2 replies; 63+ messages in thread From: Niklas Holsti @ 2012-03-11 22:22 UTC (permalink / raw) On 12-03-11 11:47 , Dmitry A. Kazakov wrote: > On Sat, 10 Mar 2012 22:35:09 +0200, Niklas Holsti wrote: > >> On 12-03-10 12:36 , Dmitry A. Kazakov wrote: >>> >>> 1234 is an example of a language (of decimal literal). >> >> Or an example of octal literals, or hexadecimal literals, or telephone >> numbers. The meaning is not intrinsic in the string; it must be defined >> by other means. > > Which only supports my point. What point is that? You have lost me. Or I you. > Yes, meaning is a mapping of which string is the argument. The way this > mapping is computed (if computable) is irrelevant. All language-processing programs, compilers and so on, compute with the meanings of strings. First, the meaning must be strictly defined; second, it must be practically computable. (Of course this means that there are important restrictions on the kinds of meanings that language-processing programs can handle -- they must often be "weakened" meanings. For example, during compilation, the meaning of a program is a "computation", not the end result of that computation.) >> The grammar, and the corresponding >> parse trees, are tools for defining the strings in the language, and the >> meaning of each string. > > No, only the former. Example: > > <numeral> ::=<decimal-digit>[<numeral>] > > This does not define the meaning of 1234. It may hint the human reader that > <numeral> has something to do with numbers. But numbers and their > properties are defined elsewhere (in mathematics). The mapping:<numeral> > corresponds to the number from the set X identified by the means Y must be > stated additionally to the grammar. A plain grammar does not define meaning, agreed. 
The meaning is defined by rules attached to the grammar productions. In this example, the rule uses numbers and their mathematical properties; for your example production, the rule would say that the meaning -- the value -- of the grammar symbol on the left (<numeral>) is defined as the value of the <decimal-digit>, times 10, plus the value of the <numeral> on the right, or just as the value of the <decimal-digit> if there is no <numeral> on the right. Attribute grammars (which I include in the general term "grammars", but perhaps you don't) are one formalization of such rules. > >>> + >>> / \ >>> 4 + >>> / \ >>> * >>> / \ >>> 3 10 >>> >>> This one? There is a countable infinity of such trees. A German would say >>> that this is twelve hundred, four and thirty (12,4,3). A way different >>> structure then? >> >> Yes, the grammar of (decimal) literals can be written in several ways, >> and all of these grammars can be attributed to define the same meaning >> (numerical value) of the string (of course they can also be attributed >> in different ways, to define different meanings). So what? I never >> claimed that the grammar is unique, just that grammars and parse trees >> are IMO the unique known method to define the meanings of complex sentences. > > 1. If not unique, then you cannot claim that the structure of a particular > tree is inherent to the language. It is not. I've never claimed that -- see the quote to which you are replying! The parse tree depends on the grammar and the sentence, as I'm sure you know. > 2. If same grammar may be used with different meanings, then the grammar > does not define any of them. See above. The grammar is the scaffold on which the meaning is defined, by attribute computations or in other ways. >>> The meaning has nothing to do with parsing. >> >> Ok, if you can define the meaning in some other way, and not by parsing. >> How do you define it?
> > http://en.wikipedia.org/wiki/Positional_notation > > does not even use the word "grammar" when it describes positional notation. As I said in an earlier post, literal numbers are one of the few languages for which meaning can be defined by induction on the length of the string, which is the same as the positional notation definition. But defining meaning by positional notation does not work so well for the numeric literals in Ada. You would have to exclude underscores from the position numbering, and also handle the optional base specification. And it is just ridiculous to consider something like positional notation for defining the meaning of an Ada program. What role does the number 137 play in the meaning of the 137th token in the program? So I still wonder how you can define meaning without parsing, for languages more complex than numeric literals. > (Of course parsing is involved, e.g. your browser will parse the page in > order to render it. But that defines neither the meaning of this page nor > the meaning of numerals.) Maybe our disagreement lies in that you think > that all definitions have to be constructive? They should not. If you want to compute with meanings, they must be computable, which usually means they must be constructive. Counterexample? > Anyway I can > always skip parsing and define meaning by brute force: > > "1" means 1 > "2" means 2 > "3" means 3 If you include "0" and continue up to "9", you have the definition of the meaning of the <decimal-digit> symbol in your example production above. Just add the inductive rule for the production that defines <numeral>, and you get a complete, finite definition of the meaning of all possible <numeral>s. > ... > >>> In fact parsing has to be validated against the meaning. >> >> Why? If you define the meaning without using parsing, why is that >> validation necessary?
> > Because the meaning exists independently of the parsing, thus I must show > that the result of translation corresponds to the meaning. "1234" must be > translated into 1234, because that is the meaning of "1234". Not otherwise. Mere parsing does not give a translation (unless you consider the parse tree as a translation, which it is, of course). If you are producing the translation during parsing, or by a traversal of the parse tree, validation is easier if you define the meaning, too, as a function of the parse tree. Look, I agree, of course, that the natural number 1234 exists, without referring to the string "1234" or any grammar. That is mathematics. But in computing, we are dealing with myriads of different languages, with different domains of meaning, and we need computational definitions of the meaning of sentences, at least for the aspects of "meaning" that are involved in the specification, implementation, and validation of our programs. >>>>>> How else could you explain why "2*3+4*5" means 26, and not 70, nor >>>>>> something else? >>>>> >>>>> Explain? It is a subject of proof. There is more than one way to do it. >>>> >>>> Operator precedence, perhaps? See earlier post... >>> >>> I didn't mean variations of operator precedence generating different >>> results. I meant that 2*3+4*5 = 26 is a statement to be proven. It is >>> either true or false (or undecidable). 26 is the value of the formula. >> >> The truth of the formula depends on the meaning assigned to the >> expressions in the formula. Before one can prove anything, one must >> define the proof rules, including the rules for modifying sentences >> (expressions) while preserving their meaning. > > Yes. > >> The latter rules are >> derived from the rules that define the meaning of an expression based on >> the structure (parse tree) of the expression, and the properties of the >> operators in the expression. > > No, it is based on algebraic properties of numbers.
Which is what I said -- algebra is about expressions and operators. I feel that you are not really getting any meaning from what I write, and vice versa, a bit. > All computers are finite and all programming languages are effectively > finite both in the terms of the source code length and the number of states > compiler programs may have. This argument is one of the last recourses when you have no good answer. Finity, finity, all is finity. -- Niklas Holsti Tidorum Ltd niklas holsti tidorum fi . @ . ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Verified compilers? 2012-03-11 22:22 ` Niklas Holsti @ 2012-03-12 5:12 ` Niklas Holsti 2012-03-12 9:43 ` Dmitry A. Kazakov 1 sibling, 0 replies; 63+ messages in thread From: Niklas Holsti @ 2012-03-12 5:12 UTC (permalink / raw) A small correction: On 12-03-12 00:22 , Niklas Holsti wrote: > On 12-03-11 11:47 , Dmitry A. Kazakov wrote: >> On Sat, 10 Mar 2012 22:35:09 +0200, Niklas Holsti wrote: ... >>> The grammar, and the corresponding >>> parse trees, are tools for defining the strings in the language, and the >>> meaning of each string. >> >> No, only the former. Example: >> >> <numeral> ::= <decimal-digit>[<numeral>] >> >> This does not define the meaning of 1234. [...] > > A plain grammar does not define meaning, agreed. The meaning is defined > by rules attached to the grammar productions. In this example, the rule > uses numbers and their mathematical properties; for your example > production, the rule would say that the meaning -- the value -- of the > grammar symbol on the left (<numeral>) is defined as the value of the > <decimal-digit>, times 10, plus the value of the <numeral> on the right, ^^ I should have said, 10 to the power (length of the <numeral> string on the right). > or just as the value of the <decimal-digit> if there is no <numeral> on > the right. -- Niklas Holsti Tidorum Ltd niklas holsti tidorum fi . @ . ^ permalink raw reply [flat|nested] 63+ messages in thread
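With the correction applied, the attribute rule for <numeral> ::= <decimal-digit>[<numeral>] can be checked mechanically. The following is an illustrative sketch (Python rather than Ada, and the names are invented for the example): the brute-force table gives each digit its meaning, and the inductive rule composes the meaning of the whole numeral exactly as stated.

```python
# Sketch of the corrected attribute rule for
#   <numeral> ::= <decimal-digit> [<numeral>]
# value(<numeral>) = value(digit) * 10**len(rest) + value(rest)
# together with the "brute force" table: "0" means 0, ..., "9" means 9.

DIGIT_MEANING = {str(d): d for d in range(10)}

def numeral_value(s):
    """Compute the meaning (value) of a numeral by induction on the
    production, exactly as the corrected attribute rule prescribes."""
    digit, rest = s[0], s[1:]
    if rest == "":
        return DIGIT_MEANING[digit]
    return DIGIT_MEANING[digit] * 10 ** len(rest) + numeral_value(rest)

print(numeral_value("1234"))  # 1234
```

The table plus one inductive rule is indeed a complete, finite definition of the meaning of every possible <numeral>, as argued above.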
* Re: Verified compilers? 2012-03-11 22:22 ` Niklas Holsti 2012-03-12 5:12 ` Niklas Holsti @ 2012-03-12 9:43 ` Dmitry A. Kazakov 2012-03-14 8:36 ` Niklas Holsti 1 sibling, 1 reply; 63+ messages in thread From: Dmitry A. Kazakov @ 2012-03-12 9:43 UTC (permalink / raw) On Mon, 12 Mar 2012 00:22:06 +0200, Niklas Holsti wrote: > On 12-03-11 11:47 , Dmitry A. Kazakov wrote: >> On Sat, 10 Mar 2012 22:35:09 +0200, Niklas Holsti wrote: >> >>> On 12-03-10 12:36 , Dmitry A. Kazakov wrote: >>>> >>>> 1234 is an example of a language (of decimal literal). >>> >>> Or an example of octal literals, or hexadecimal literals, or telephone >>> numbers. The meaning is not intrinsic in the string; it must be defined >>> by other means. >> >> Which only supports my point. > > What point is that? You have lost me. Or I you. The point is that "1234" is meaningless. Whatever way you parse that string, it is still nothing but a string. >> Yes, meaning is a mapping of which string is the argument. The way this >> mapping is computed (if computable) is irrelevant. > > All language-processing programs, compilers and so on, compute with the > meanings of strings. No. They are computing transformations invariant to the meaning. Consider the source code of a program. You can modify it without knowing what it does, while keeping it doing the same thing. You can translate a C program into Ada without understanding it. A compiler does just same. There is no dwarf sitting in the box who understands Ada. There is nobody in. > First, the meaning must be strictly defined; > second, it must be practically computable. No. Programs deal with incomputable meanings most of the time. Real numbers, integer numbers, clock, user input are all incomputable. >>> The grammar, and the corresponding >>> parse trees, are tools for defining the strings in the language, and the >>> meaning of each string. >> >> No, only the former. 
Example: >> >> <numeral> ::= <decimal-digit>[<numeral>] >> >> This does not define the meaning of 1234. It may hint to the human reader that >> <numeral> has something to do with numbers. But numbers and their >> properties are defined elsewhere (in mathematics). The mapping: <numeral> >> corresponds to the number from the set X identified by the means Y must be >> stated additionally to the grammar. > > A plain grammar does not define meaning, agreed. The meaning is defined > by rules attached to the grammar productions. No. The meanings are attached to the strings. But what you said is already sufficient to ditch grammars, since you said that it is rather the rules which carry the "meaning." Therefore I can take whatever algorithm and attach "meaning" to its transitions/states. From this point of view your grammars are just an algorithm among others. Why specifically grammars? Why should the way of programming using grammars be any better than one using tables, or whatever? >> 1. If not unique, then you cannot claim that the structure of a particular >> tree is inherent to the language. It is not. > > I've never claimed that -- see the quote to which you are replying! The > parse tree depends on the grammar and the sentence, as I'm sure you know. Then I don't understand your point about trees defining the meaning. >> 2. If same grammar may be used with different meanings, then the grammar >> does not define any of them. > > See above. The grammar is the scaffold on which the meaning is defined, > by attribute computations or in other ways. The same can be said about a piece of paper and pencil, files, keyboards, displays etc. I don't understand your argument. > But defining meaning by positional notation does not work so well for > the numeric literals in Ada. You would have to exclude underscores from > the position numbering, and also handle the optional base specification. And what is the problem with excluding them?
I heard of no one who had any difficulty in understanding what Ada numeric literals mean. The problem you describe has nothing to do with defining their meaning. It has to do with your way of programming the parser. You have much worse problems with Ada, e.g. <label-1> : loop ... end loop <label-2>; where <label-1> must be equivalent to <label-2>. It is no problem to any human reader to understand what this means. It is no problem to a recursive descent parser. All Ada compilers handle it effortlessly. It is only a problem for the *language* you have chosen. Take another one, end of story. > What role does the number > 137 play in the meaning of the 137th token in the program? Depends on what that program does. The meaning of a program is specific to the domain, none of the compiler's business. > So I still > wonder how you can define meaning without parsing, for languages more > complex than numeric literals. Here it is: "acceptable to the customer." Troublesome as it is, but there is no other way. > If you want to compute with meanings, they must be computable, which > usually means they must be constructive. Counterexample? Pi [ Meanings are not computed, they are assigned to the computational states, absolutely voluntarily, of course ] > Look, I agree, of course, that the natural number 1234 exists, without > referring to the string "1234" or any grammar. That is mathematics. But > in computing, we are dealing with myriads of different languages, with > different domains of meaning, and we need computational definitions of > the meaning of sentences, at least for the aspects of "meaning" that are > involved in the specification, implementation, and validation of our > programs. I still do not see how this may support grammars, or the point that when an Ada compiler sees "Pi" it must compute Pi, or that the ARM's syntax annex defines Pi? > I feel that you are not really getting any meaning from what I write, > and vice versa, a bit. Seems so.
>> All computers are finite and all programming languages are effectively >> finite both in the terms of the source code length and the number of states >> compiler programs may have. > > This argument is one of the last recourses when you have no good answer. Because this is the killer argument. No computer can parse an infinite language or compute the incomputable. If you want to handle such stuff [and you must all the time!] you have to map it onto some finite/computable framework and deal within the latter. You certainly lose something while doing this. And it is again incomputable to reason about what was lost. -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Verified compilers? 2012-03-12 9:43 ` Dmitry A. Kazakov @ 2012-03-14 8:36 ` Niklas Holsti 2012-03-14 9:24 ` Georg Bauhaus 2012-03-14 10:14 ` Verified compilers? Dmitry A. Kazakov 0 siblings, 2 replies; 63+ messages in thread From: Niklas Holsti @ 2012-03-14 8:36 UTC (permalink / raw) On 12-03-12 11:43 , Dmitry A. Kazakov wrote: > On Mon, 12 Mar 2012 00:22:06 +0200, Niklas Holsti wrote: > >> On 12-03-11 11:47 , Dmitry A. Kazakov wrote: >>> On Sat, 10 Mar 2012 22:35:09 +0200, Niklas Holsti wrote: >>> >>>> On 12-03-10 12:36 , Dmitry A. Kazakov wrote: >>>>> >>>>> 1234 is an example of a language (of decimal literal). >>>> >>>> Or an example of octal literals, or hexadecimal literals, or telephone >>>> numbers. The meaning is not intrinsic in the string; it must be defined >>>> by other means. >>> >>> Which only supports my point. >> >> What point is that? You have lost me. Or I you. > > The point is that "1234" is meaningless. Whatever way you parse that > string, it is still nothing but a string. A string is a string, yes. We are talking about ways to define the meaning of strings, in various languages. >> All language-processing programs, compilers and so on, compute with the >> meanings of strings. > > No. They are computing transformations invariant to the meaning. > > Consider the source code of a program. You can modify it without knowing > what it does, while keeping it doing the same thing. You must be able to prove that the two forms of the program, P and P', have the same meaning. Either you have a definition of the meaning of the program, from which you show that meaning(P) = meaning(P'), or a set of modification (rewriting) rules R that by definition or assumption do not change the meaning, and then you show that P can be transformed to P' by applying these rules. IMO grammars (including attribute grammars) are a very practical way to define either the meaning or the rewriting rules. > You can translate a C > program into Ada without understanding it.
A compiler does just the same. There > is no dwarf sitting in the box who understands Ada. There is nobody in. >> First, the meaning must be strictly defined; >> second, it must be practically computable. > > No. Programs deal with incomputable meanings most of the time. Real > numbers, integer numbers, clock, user input are all incomputable. Nah. Anything a program can do is, by definition, computable. A legal program source text always has a computable meaning, with one exception: non-termination, which is why the meaning of a program is defined as a possibly non-terminating computation, not as the end result. But this is a debate about the "meaning of meaning". The original question was whether the research in formal languages and parsing is good and useful (my view) or useless (your view). I don't see any hope for agreement. >> A plain grammar does not define meaning, agreed. The meaning is defined >> by rules attached to the grammar productions. > > No. The meanings are attached to the strings. > > But what you said is already sufficient to ditch grammars, since you said > that it is rather the rules which carry the "meaning." Therefore I can take > whatever algorithm and attach "meaning" to its transitions/states. From > this point of view your grammars are just an algorithm among others. Why > specifically grammars? Why should the way of programming using grammars be > any better than one using tables, or whatever? Why do all language definitions currently use grammars? Is it just bad influence from academic screwballs, inertia, and delusion? Grammars are declarative, attribute grammars almost so, at least much more so than "parse tables". >> But defining meaning by positional notation does not work so well for >> the numeric literals in Ada. You would have to exclude underscores from >> the position numbering, and also handle the optional base specification. > > And what is the problem with excluding them?
You have to give rules for it, and these rules make the "positional notation" definition of the meaning of a numeric literal string much more complex. It is easy to do it informally, by saying something like "ignore underscores when you count positions", but doing it rigorously (mathematically) is more complex. > I heard of no one who had any > difficulty in understanding what Ada numeric literals mean. I am not talking about intuitive understanding, but about the rigorous definition of the meaning (which is not really rigorous in the Ada LRM, of course, since numeric literals are considered single tokens with values described by English sentences). > You have much worse problems with Ada, e.g. > > <label-1> : loop ... end loop <label-2>; > > where <label-1> must be equivalent to <label-2>. It is no problem to any > human reader to understand what this means. It is no problem to a > recursive descent parser. All Ada compilers handle it effortlessly. It is > only a problem for the *language* you have chosen. What "problem"? A context-free grammar cannot enforce the two labels to be equivalent, true. An attribute grammar can, and a Van Wijngaarden grammar can. A yacc-generated parser can. >> What role does the number >> 137 play in the meaning of the 137th token in the program? > > Depends on what that program does. The meaning of a program is specific to > the domain, none of the compiler's business. Now you are using "meaning" with a different meaning, as in "what is the program used for". Out of scope. >>> All computers are finite and all programming languages are effectively >>> finite both in the terms of the source code length and the number of states >>> compiler programs may have. >> >> This argument is one of the last recourses when you have no good answer. > > Because this is the killer argument. For what? For denying the usefulness of grammars? Would you really define the Ada language by a "manual" that lists all possible Ada programs?
I guess the 1983 version would allow programs up to a certain size, the 1995 version would allow longer and therefore more programs, and so on... > No computer can parse infinite language > or compute incomputable. If you want to handle such stuff [and you must all > the time!] you have to map it onto some finite/computable framework and > deal within the latter. Exactly. But since our present computers can deal with very large numbers and programs indeed, the mathematical and technical methods we use for handling programming languages and programs cannot depend on this finiteness, but must use methods that apply to numbers and programs of unbounded size, but still of constrained form. Grammars are good for that, IMO. -- Niklas Holsti Tidorum Ltd niklas holsti tidorum fi . @ . ^ permalink raw reply [flat|nested] 63+ messages in thread
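The loop-label example discussed above is exactly such a context condition: a context-free grammar cannot require the two labels to match, but a rule evaluated over the parse tree can. A minimal editorial sketch follows, in Python rather than Ada, with an invented tree representation (no real front end looks like this):

```python
# Minimal sketch of a context condition checked on a parse tree:
# the context-free skeleton accepts any pair of labels around a loop,
# and a rule attached to the loop node enforces that they are equal.
def check_loop_labels(node, errors):
    """node: dict with 'kind', optional 'label'/'end_label', and 'body'."""
    if node["kind"] == "loop" and node.get("end_label") != node.get("label"):
        errors.append(f"end loop {node.get('end_label')} does not match "
                      f"loop label {node.get('label')}")
    for child in node.get("body", []):
        check_loop_labels(child, errors)
    return errors

ok  = {"kind": "loop", "label": "Outer", "end_label": "Outer", "body": []}
bad = {"kind": "loop", "label": "Outer", "end_label": "Inner", "body": []}
print(check_loop_labels(ok, []))   # []
print(check_loop_labels(bad, []))  # one diagnostic
```

This is the same division of labour as in attribute grammars: the grammar defines the shape of the tree, and the attached rules define (and check) properties the shape alone cannot express.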
* Re: Verified compilers? 2012-03-14 8:36 ` Niklas Holsti @ 2012-03-14 9:24 ` Georg Bauhaus 2012-03-14 11:14 ` REAL (was: Verified compilers?) stefan-lucks 2012-03-14 10:14 ` Verified compilers? Dmitry A. Kazakov 1 sibling, 1 reply; 63+ messages in thread From: Georg Bauhaus @ 2012-03-14 9:24 UTC (permalink / raw) On 14.03.12 09:36, Niklas Holsti wrote: > On 12-03-12 11:43 , Dmitry A. Kazakov wrote: >> No computer can parse infinite language >> or compute incomputable. If you want to handle such stuff [and you must all >> the time!] you have to map it onto some finite/computable framework and >> deal within the latter. > > Exactly. But since our present computers can deal with very large numbers and programs indeed, the mathematical and technical methods we use for handling programming languages and programs cannot depend on this finiteness, but must use methods that apply to numbers and programs of unbounded size, but still of constrained form. Grammars are good for that, IMO. Side note: WRT finiteness, language designers have made one particular mistake again and again, following a convention, by telling programmers about some type called REAL. The set of supposed real numbers that programmers have heard about before, in math lessons, is not finite---though no one has ever seen it, giving rise to the question of a possible misunderstanding of infinity. The values of reals include numbers that are not rational, not a ratio, or fraction, that is. What the language designers usually mean, though, is some finite Fraction_with_Split_Shifting thing, having infinitesimally little to do with the reals. Consequence: an infinite supply of surprises, caused by the name REAL. ^ permalink raw reply [flat|nested] 63+ messages in thread
* REAL (was: Verified compilers?) 2012-03-14 9:24 ` Georg Bauhaus @ 2012-03-14 11:14 ` stefan-lucks 2012-03-14 12:59 ` REAL Dmitry A. Kazakov 0 siblings, 1 reply; 63+ messages in thread From: stefan-lucks @ 2012-03-14 11:14 UTC (permalink / raw) On Wed, 14 Mar 2012, Georg Bauhaus wrote: > Side note: WRT finiteness, language designers have made one > particular mistake again and again, following a convention, > by telling programmers about some type called REAL. The set > of supposed real numbers that programmers have heard about > before, in math lessons, is not finite [...] > > What the language designers usually mean, though, is some finite > Fraction_with_Split_Shifting thing, having infinitesimally > little to do with the reals. Consequence: an infinite supply > of surprises, caused by the name REAL. In general, floating point numbers are meant to approximate the real numbers (in the mathematical sense). Unfortunately, most programmers have been educated just to apply floating point arithmetic in their programs, expecting the result to be close enough to the "real result". Hardly anyone cares enough to analyze how the error grows from operation to operation. Hardly anyone even bothers to define how close the result should be -- which depends on the application at hand, of course. And, to be honest, that approach works well most of the time. Which is precisely the reason why wrong results (= approximations not close enough to the real result) come as *surprises* ... -- ---- Stefan.Lucks (at) uni-weimar.de, University of Weimar, Germany ---- <http://www.uni-weimar.de/cms/medien/mediensicherheit/home.html> ------ I love the taste of Cryptanalysis in the morning! ------ ^ permalink raw reply [flat|nested] 63+ messages in thread
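The growth of error from operation to operation is easy to demonstrate with the classic example: 0.1 has no exact binary floating point representation, so even a short chain of additions drifts away from the exact result (Python shown for brevity; any IEEE 754 double behaves the same):

```python
# Classic demonstration: 0.1 is not exactly representable in binary
# floating point, so the representation error accumulates with each addition.
total = 0.0
for _ in range(10):
    total += 0.1

print(total)             # 0.9999999999999999, not 1.0
print(total == 1.0)      # False
print(abs(total - 1.0))  # the accumulated error, about 1.1e-16
```

Nothing here tracks how large that error may become; that analysis is exactly what is usually skipped, as the post says.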
* Re: REAL 2012-03-14 11:14 ` REAL (was: Verified compilers?) stefan-lucks @ 2012-03-14 12:59 ` Dmitry A. Kazakov 2012-03-14 13:30 ` REAL Georg Bauhaus 0 siblings, 1 reply; 63+ messages in thread From: Dmitry A. Kazakov @ 2012-03-14 12:59 UTC (permalink / raw) On Wed, 14 Mar 2012 12:14:05 +0100, stefan-lucks@see-the.signature wrote: > Hardly anyone cares enough to analyze how the error grows from operation > to operation. Hardly anyone even bothers to define how close the result > should be -- which depends on the application at hand, of course. It is largely the hardware that is to blame. When the machine rounds, it is more difficult to keep track of precision, compared to a truncating machine (in either direction). Of course, interval arithmetic should have replaced IEEE long ago. But nobody cares. -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: REAL 2012-03-14 12:59 ` REAL Dmitry A. Kazakov @ 2012-03-14 13:30 ` Georg Bauhaus 2012-03-14 13:51 ` REAL Dmitry A. Kazakov ` (2 more replies) 0 siblings, 3 replies; 63+ messages in thread From: Georg Bauhaus @ 2012-03-14 13:30 UTC (permalink / raw) On 14.03.12 13:59, Dmitry A. Kazakov wrote: > On Wed, 14 Mar 2012 12:14:05 +0100, stefan-lucks@see-the.signature wrote: > >> Hardly anyone cares enough to analyze how the error grows from operation >> to operation. Hardly anyone even bothers to define how close the result >> should be -- which depends on the application at hand, of course. > > It is largely the hardware that is to blame. When the machine rounds, it is more > difficult to keep track of precision, compared to a truncating machine (in > either direction). > > Of course, interval arithmetic should have replaced IEEE long ago. But > nobody cares. Programmers do care about floating point effects---when they strike: but being able to explain possible causes of FPT "effects" is an asset to human capital. So it is better to keep the current types and effects and acquire the asset. Turns hazards into sources of achievement. "So, you handled that effect? Good job!" If interval arithmetic is a superior alternative, then interval arithmetic popular in some way, but also make sure that programmers can substitute assets! Seems to be one way of getting technical changes adopted. Maybe write some nifty shader. ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: REAL 2012-03-14 13:30 ` REAL Georg Bauhaus @ 2012-03-14 13:51 ` Dmitry A. Kazakov 2012-03-14 20:37 ` REAL Brian Drummond 0 siblings, 1 reply; 63+ messages in thread From: Dmitry A. Kazakov @ 2012-03-14 13:51 UTC (permalink / raw) On Wed, 14 Mar 2012 14:30:17 +0100, Georg Bauhaus wrote: > So it is better to keep the current > types and effects and acquire the asset. Turns hazards into > sources of achievement. When you compute anything using conventional floating point arithmetic, the result of this computation is within the bounds of the equivalent interval computation, provided the CPU has no bugs in its ALU. This is a mathematical fact. Interval arithmetic does not compute anything differently. It just keeps track of the error bounds. The disadvantage compared to normal arithmetic is that you need twice as much memory (if the implementation is straightforward). This was an issue when intervals were actively studied (60s-70s). It is no problem now. Compared to analytical methods of error estimation, intervals give a pessimistic one. A manual analysis based on additional knowledge (especially if variables are dependent) may significantly improve the estimation. It is worth remembering that floating point arithmetic does no error estimation whatsoever. In fact it is guaranteed to be inaccurate, which is the asset you wrote about. -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 63+ messages in thread
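The bookkeeping Dmitry describes — carrying a pair of bounds instead of a single number, at twice the storage — can be sketched in a few lines. This editorial toy (Python, invented class) omits the outward rounding of the bounds that a serious implementation would need:

```python
# Minimal interval arithmetic sketch: each value carries its error bounds.
# A real implementation would additionally round lo down and hi up on
# every operation (outward rounding); that detail is omitted here.
class Interval:
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi          # twice the storage of one float

    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __sub__(self, other):
        return Interval(self.lo - other.hi, self.hi - other.lo)

    def __mul__(self, other):
        products = [self.lo * other.lo, self.lo * other.hi,
                    self.hi * other.lo, self.hi * other.hi]
        return Interval(min(products), max(products))

x = Interval(1.0, 2.0)
y = Interval(3.0, 4.0)
s = x + y
print(s.lo, s.hi)   # 4.0 6.0 -- the true sum is guaranteed to lie inside
```

The point of the thread holds for this toy as well: the midpoint computation is the same as ordinary arithmetic; only the guaranteed enclosure is new.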
* Re: REAL 2012-03-14 13:51 ` REAL Dmitry A. Kazakov @ 2012-03-14 20:37 ` Brian Drummond 2012-03-14 21:52 ` REAL Dmitry A. Kazakov 0 siblings, 1 reply; 63+ messages in thread From: Brian Drummond @ 2012-03-14 20:37 UTC (permalink / raw) On Wed, 14 Mar 2012 14:51:59 +0100, Dmitry A. Kazakov wrote: > Interval arithmetic does not compute anything differently. It just keeps > track of the error bounds. > > The disadvantage compared to normal arithmetic is (paraphrasing: a factor of two in hardware may not be very important, but...) > Compared to analytical methods of error estimation, intervals give a > pessimistic one. Indeed. As has been pointed out frequently on comp.arch by (I think) Nick Maclaren, any sufficiently complex calculation in interval arithmetic will return the interval (-inf, +inf). > A manual analysis based on additional knowledge > (especially if variables are dependent) may significantly improve the > estimation. But surely the same analysis can be performed on float arithmetic - admittedly this time, to widen the estimated range. - Brian ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: REAL
  2012-03-14 20:37 ` REAL Brian Drummond
@ 2012-03-14 21:52   ` Dmitry A. Kazakov
  0 siblings, 0 replies; 63+ messages in thread
From: Dmitry A. Kazakov @ 2012-03-14 21:52 UTC (permalink / raw)

On Wed, 14 Mar 2012 20:37:28 +0000 (UTC), Brian Drummond wrote:

> On Wed, 14 Mar 2012 14:51:59 +0100, Dmitry A. Kazakov wrote:
>
>> Compared to analytical methods of error estimation, intervals give a
>> pessimistic one.
>
> Indeed. As has been pointed out frequently on comp.arch by (I think) Nick
> Maclaren, any sufficiently complex calculation in interval arithmetic
> will return the interval (-inf, +inf).

That depends on the nature of the computations. A complex, numerically
unstable direct computation that evaluates garbage is no better. Normally
computations are iterative, actually narrowing the interval rather than
widening it.

>> A manual analysis based on additional knowledge
>> (especially if variables are dependent) may substantially improve the
>> estimation.
>
> But surely the same analysis can be performed on float arithmetic -
> admittedly this time, to widen the estimated range.

Not necessarily. Example: x - x. When replaced by 0, this has no effect on
the accuracy of the floating-point computation (x - x is exactly 0 there
anyway), while it narrows the interval [a, b] - [a, b] = [a - b, b - a]
down to [0, 0].

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de

^ permalink raw reply	[flat|nested] 63+ messages in thread
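The dependency effect behind that exchange is easy to reproduce with naive
interval arithmetic (a Python sketch; the `iadd`/`isub` helpers and the
`(lo, hi)` tuple representation are invented for illustration):

```python
# Naive interval subtraction treats its two operands as independent,
# so evaluating x - x interval-wise is pessimistic; the algebraically
# rewritten form 0 is exact.

def iadd(a, b):
    return (a[0] + b[0], a[1] + b[1])

def isub(a, b):
    # [a_lo, a_hi] - [b_lo, b_hi] = [a_lo - b_hi, a_hi - b_lo]
    return (a[0] - b[1], a[1] - b[0])

x = (1.0, 2.0)

naive = isub(x, x)        # the two x's are treated as independent
rewritten = (0.0, 0.0)    # x - x simplified before evaluation

print(naive)              # (-1.0, 1.0): width 2, though the true value is 0
print(rewritten)          # (0.0, 0.0)
```

This is the kind of manual rewriting, based on knowing that the operands
are the same variable, that narrows the interval estimate without changing
the floating-point result.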
* Re: REAL 2012-03-14 13:30 ` REAL Georg Bauhaus 2012-03-14 13:51 ` REAL Dmitry A. Kazakov @ 2012-03-14 13:52 ` georg bauhaus 2012-03-14 17:42 ` REAL Jeffrey Carter 2 siblings, 0 replies; 63+ messages in thread From: georg bauhaus @ 2012-03-14 13:52 UTC (permalink / raw) Georg Bauhaus <rm.dash-bauhaus@futureapps.de> wrote: > If interval arithmetic is a superior alternative, then ...make... > interval arithmetic popular in some way, ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: REAL 2012-03-14 13:30 ` REAL Georg Bauhaus 2012-03-14 13:51 ` REAL Dmitry A. Kazakov 2012-03-14 13:52 ` REAL georg bauhaus @ 2012-03-14 17:42 ` Jeffrey Carter 2 siblings, 0 replies; 63+ messages in thread From: Jeffrey Carter @ 2012-03-14 17:42 UTC (permalink / raw) On 03/14/2012 06:30 AM, Georg Bauhaus wrote: > > Programmers do care about floating point effects---when they > strike: but being able to explain possible causes of FPT "effects" > is an asset to human capital. So it is better to keep the current > types and effects and acquire the asset. Turns hazards into > sources of achievement. "So, you handled that effect? Good job!" This is also part of the reason languages like C and its offspring are popular: the feeling of satisfaction at accomplishing something despite the complications imposed by the language. With a language like Ada, you write it and it works. Where's the fun in that? -- Jeff Carter "Mr. President, we must not allow a mine-shaft gap!" Dr. Strangelove 33 --- Posted via news://freenews.netfront.net/ - Complaints to news@netfront.net --- ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Verified compilers?
  2012-03-14 8:36 ` Niklas Holsti
  2012-03-14 9:24   ` Georg Bauhaus
@ 2012-03-14 10:14   ` Dmitry A. Kazakov
  2012-03-14 20:13     ` Niklas Holsti
  1 sibling, 1 reply; 63+ messages in thread
From: Dmitry A. Kazakov @ 2012-03-14 10:14 UTC (permalink / raw)

On Wed, 14 Mar 2012 10:36:51 +0200, Niklas Holsti wrote:

> On 12-03-12 11:43 , Dmitry A. Kazakov wrote:
>> On Mon, 12 Mar 2012 00:22:06 +0200, Niklas Holsti wrote:
>>
>>> All language-processing programs, compilers and so on, compute with the
>>> meanings of strings.
>>
>> No. They are computing transformations invariant to the meaning.
>>
>> Consider the source code of a program. You can modify it without knowing
>> what it does, while keeping it doing the same thing.
>
> You must be able to prove that the two forms of the program, P and P',
> have the same meaning.

That can be done without knowing the meaning. It is enough to show that
the meaning (whatever it may be) is preserved. This is possible by showing
that P and P' traverse the same (or equivalent, in some defined sense)
states.

>> You can translate a C
>> program into Ada without understanding it. A compiler does just the
>> same. There is no dwarf sitting in the box who understands Ada. There is
>> nobody inside.
>>
>>> First, the meaning must be strictly defined;
>>> second, it must be practically computable.
>>
>> No. Programs deal with incomputable meanings most of the time. Real
>> numbers, integer numbers, clock, user input are all incomputable.
>
> Nah. Anything a program can do, is by definition computable.

Irrelevant. The program operates on incomputable meanings, that is the
point. It can do so without computing these meanings, which is why
computers are so useful while being so limited.

A meaning is always incomputable by definition. This is where the process
of computing actually stops, when it reaches a state meaning something to
the program's user.
> A legal program source text always has a computable meaning,

Wrong, it is undecidable to determine what a program computes. And what is
the problem with incomputable meanings? Even wrong meanings? Consider this
program:

   Put_Line ("I have just computed that the Earth is flat!");

> But this is a debate about the "meaning of meaning". The original
> question was whether the research in formal languages and parsing is
> good and useful (my view) or useless (your view).

Well, that debate was around your argument that grammars might have some
supposed usefulness as a carrier of meaning. This does not work either.

>>> A plain grammar does not define meaning, agreed. The meaning is defined
>>> by rules attached to the grammar productions.
>>
>> No. The meanings are attached to the strings.
>>
>> But what you said is already sufficient to ditch grammars, since you
>> said that it is rather the rules which carry the "meaning." Therefore I
>> can take whatever algorithm and attach "meaning" to its
>> transitions/states. From this point of view your grammars are just an
>> algorithm among others. Why specifically grammars? Why should the way of
>> programming using grammars be any better than one using tables, or
>> whatever?
>
> Why do all language definitions currently use grammars?

Where have you seen any? The ARM is not written in BNF. The semi-formal
definitions of lexical elements it contains (nobody has ever cared to
write a full BNF for Unicode characters) do not define the semantics of
Ada. To prove your point you would have to rewrite the whole ARM in BNF.

> Grammars are declarative, attribute grammars almost so, at least much
> more so than "parse tables".

Huh, you should have argued that they are *less* declarative. Being
declarative is not an advantage, and a grammar as a set of rules may look
more imperative than a bunch of tables.

>>> But defining meaning by positional notation does not work so well for
>>> the numeric literals in Ada.
>>> You would have to exclude underscores from
>>> the position numbering, and also handle the optional base
>>> specification.
>>
>> And what is the problem to exclude them?
>
> You have to give rules for it, and these rules make the "positional
> notation" definition of the meaning of a numeric literal string much
> more complex. It is easy to do it informally, by saying something like
> "ignore underscores when you count positions", but doing it rigorously
> (mathematically) is more complex.

That is easy. Remove all underscores first. (Don't tell me that an act of
removing underscores cannot be formalized or automated.)

>> I heard of no one who had any
>> difficulty in understanding what Ada numeric literals mean.
>
> I am not talking about intuitive understanding, but about the rigorous
> definition of the meaning (which is not really rigorous in the Ada LRM,
> of course, since numeric literals are considered single tokens with
> values described by English sentences).

That meaning simply does not exist. There are language lawyers thinking
about various language aspects. I don't even know if Ada could be
formalized to a more or less decent level of rigor. In any case, grammars
would be only an obstacle in this mammoth task...

>> You have much worse problems with Ada, e.g.
>>
>>    <label-1> : loop ... end loop <label-2>;
>>
>> where <label-1> must be equivalent to <label-2>. It is no problem for
>> any human reader to understand what this means. It is no problem for a
>> recursive descent parser. All Ada compilers handle it effortlessly. It
>> is only a problem for the *language* you chose.
>
> What "problem"?

The problem is with the grammar: it does not define this constraint. What
about the constraint that two objects with equivalent names cannot be
declared in the same scope? Show me the production rule.

> An attribute grammar can, and a Van Wijngaarden
> grammar can.

Great, why does the ARM not define it that way? Either it does not work or
it is impracticable.
Any outcome would support my point.

> A yacc-generated parser can.

We were talking about definitions of meanings, not about compiler
programs. If grammars are so great at defining the meaning of loop labels
in Ada, where are these definitions?

>>> What role does the number
>>> 137 play in the meaning of the 137th token in the program?
>>
>> Depends on what that program does. The meaning of a program is specific
>> to the domain, none of the compiler's business.
>
> Now you are using "meaning" with a different meaning, as in "what is the
> program used for". Out of scope.

? That is the only meaning of programs I know.

>>>> All computers are finite and all programming languages are effectively
>>>> finite both in the terms of the source code length and the number of
>>>> states compiler programs may have.
>>>
>>> This argument is one of the last recourses when you have no good
>>> answer.
>>
>> Because this is the killer argument.
>
> For what? For denying the usefulness of grammars?

For denying any arguments that appeal to infinite languages.

> Would you really
> define the Ada language by a "manual" that lists all possible Ada
> programs?

And that would not be enough. For each legal Ada program the definition
must provide its equivalent in some machine language and some run-time
environment.

>> No computer can parse an infinite language
>> or compute the incomputable. If you want to handle such stuff [and you
>> must, all the time!] you have to map it onto some finite/computable
>> framework and deal within the latter.
>
> Exactly. But since our present computers can deal with very large
> numbers and programs indeed, the mathematical and technical methods we
> use for handling programming languages and programs cannot depend on
> this finiteness, but must use methods that apply to numbers and programs
> of unbounded size, but still of constrained form.

Sorry, but it is either finite or not. Whatever methods you are talking
about are *sufficiently* finite.
You cannot compute the incomputable, no matter what.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de

^ permalink raw reply	[flat|nested] 63+ messages in thread
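The two-step reading of Ada integer literals discussed in this exchange
(strip underscores, then evaluate positionally in the given base) can be
sketched as follows. `ada_integer_literal` is a hypothetical helper
covering only integer literals, not Ada's full literal grammar (no
exponents, no real literals, no digit-range checking):

```python
# Hypothetical two-step reader for (integer) Ada numeric literals:
# 1. remove underscores, 2. apply plain positional notation in the
# given base.  Based literals have the form base#digits#.

def ada_integer_literal(text):
    text = text.replace("_", "")           # "remove all underscores first"
    if "#" in text:                        # based literal: base#digits#
        base_part, digits, trailer = text.split("#")
        assert trailer == "", "based literal must end with #"
        base = int(base_part)
    else:
        base, digits = 10, text
    value = 0
    for ch in digits:                      # positional evaluation
        value = value * base + int(ch, 36) # letters A-F serve base 16
    return value

print(ada_integer_literal("1_000_000"))    # 1000000
print(ada_integer_literal("16#FF#"))       # 255
print(ada_integer_literal("2#1010_1010#")) # 170
```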
* Re: Verified compilers? 2012-03-14 10:14 ` Verified compilers? Dmitry A. Kazakov @ 2012-03-14 20:13 ` Niklas Holsti 0 siblings, 0 replies; 63+ messages in thread From: Niklas Holsti @ 2012-03-14 20:13 UTC (permalink / raw) On 12-03-14 12:14 , Dmitry A. Kazakov wrote: > A meaning is always incomputable by definition. This is where the process > of computing actually stops, when it reaches a state meaning something to > the program's user. If you want to discuss only this kind of "meaning", I quit. It is not computing, but psychology. -- Niklas Holsti Tidorum Ltd niklas holsti tidorum fi . @ . ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Verified compilers? 2012-03-10 20:35 ` Niklas Holsti 2012-03-11 9:47 ` Dmitry A. Kazakov @ 2012-03-11 10:55 ` Georg Bauhaus 1 sibling, 0 replies; 63+ messages in thread From: Georg Bauhaus @ 2012-03-11 10:55 UTC (permalink / raw) On 10.03.12 21:35, Niklas Holsti wrote: > One should use the simplest formalism that is powerful enough for one's needs. You probably don't consider grammars and parsing tools simple, but that is a matter of opinion and personal taste. Simplicity, if I may throw in a remark for later discussion, is frequently said to be an important property of a formalism. Opinion or taste may not adequately reflect the importance. More adequacy can be achieved through observation. Simplicity can be defined to be not just a matter of opinion, or of taste. It is a tuple, then, having two components at least: (simplicity felt, simplicity measured). The components correspond, roughly, to subjective and objective. Nailing these components down proceeds as follows. 1. Subjective simplicity: Subjects can declare the simplicity of a thing T on a subjective scale when stating their opinion. We can observe them saying, "It's simple!" This is the first, the subjective component of the tuple. (Validity of the data might be spoiled by affection or loyalty, etc., but this can be checked.) 2. Objective simplicity: To learn about simplicity of a thing T, observe the degree to which subjects solve a given problem P, employing thing T. Have two groups of subjects solve P, one employing T and the other employing T'. Then, a thing T is objectively simpler than a different thing T' if the T-group is more effective at solving P. "More effective" can refer to time needed, and to steps done. (As usual, the groups would have to be comparable. E.g., subjects in either group have the same amount of experience using T and T', respectively.) An example of subjective simplicity is when someone highly skilled in game theory says, "Rubik's cube is simple to solve!". 
Not everyone agrees. Objectively, Rubik's cube seems not so simple. ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Verified compilers? 2012-03-09 22:13 ` Niklas Holsti 2012-03-10 10:36 ` Dmitry A. Kazakov @ 2012-03-10 13:46 ` Brian Drummond 1 sibling, 0 replies; 63+ messages in thread From: Brian Drummond @ 2012-03-10 13:46 UTC (permalink / raw) On Sat, 10 Mar 2012 00:13:19 +0200, Niklas Holsti wrote: > On 12-03-08 22:40 , Dmitry A. Kazakov wrote: >> I didn't want to become a programmer, it just happened to me during 5th >> or 6th semester. That time I was exposed to the classical Aho and >> Ullmann. > > That is the book we used. I don't remember the edition. Heh - I remember poring over it in a bookstore - several times - and deciding I didn't like it. I can't remember exactly why, but it felt like formalism for formalism's sake, rather than as a tool to get anything done. >> I was saved by David Gries. He is a really great man, who deserves all >> possible fame and respect. His book on compiler construction opened my >> eyes. I thought Gries was better, but my own favourite was "Understanding and Writing Compilers: A Do-it-Yourself Guide" by Richard Bornat. Quite hard to find nowadays (though I think I found a PDF of it). It does cover formal grammars - perhaps less fully, but clearly and from a practical aspect (i.e. assuming you would rather implement a parser than prove its grammar). But it covers other aspects - admittedly for older languages, but since it covers ALGOL (stack and heap), and SIMULA (closures) - all very much as a tool to get something done. - Brian ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Verified compilers?
  2012-03-06 17:24 ` Shark8
  2012-03-06 17:43   ` Dmitry A. Kazakov
@ 2012-03-07  1:00   ` Randy Brukardt
  1 sibling, 0 replies; 63+ messages in thread
From: Randy Brukardt @ 2012-03-07 1:00 UTC (permalink / raw)

"Shark8" <onewingedshark@gmail.com> wrote in message
news:9207716.776.1331054644462.JavaMail.geo-discussion-forums@ynaz38...
> On Monday, March 5, 2012 7:27:25 PM UTC-6, Randy Brukardt wrote:
>>
>> In any case, no one is going to run out and totally rebuild their
>> compiler because someone has found a supposedly "better" architecture
>> for it. There were many such better architectures when I started out
>> (supposedly everything was going to be table-driven and automatically
>> generated), and they're all long gone. No reason to assume that
>> anything is different here.
>
> I seem to recall someone on this forum doing a table-driven approach,
> though it might have only been WRT tokenization...

The table-driven approaches I was thinking of were involved with semantics
and code generation. Table-driven parsers and tokenizers are common
(Janus/Ada uses a table-driven parser, for instance.)

And to be fair, we did use a tiny corner of a table-driven approach in our
code generator (there is a pattern-matching table for generating improved
code for some sequences). We've moved away from that in recent updates (we
used to handle all numeric literals using the table; nowadays they are
handled as special stack contents, a more general approach allowing more
use of indexed addressing modes and instructions with embedded constants).

Randy.

^ permalink raw reply	[flat|nested] 63+ messages in thread
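A pattern-matching table of the general kind Brukardt mentions can be
sketched as a toy table-driven peephole pass. The opcodes and patterns
below are invented for illustration; they are not Janus/Ada's actual
intermediate code:

```python
# Toy table-driven peephole pass: patterns over short opcode windows map
# to cheaper replacement sequences.  Instructions are (opcode, operand...)
# tuples; all names here are made up for the example.

PATTERNS = [
    # PUSH x immediately followed by POP x cancels out.
    (("PUSH", "POP"), lambda a, b: [] if a[1] == b[1] else None),
    (("MUL_LIT_1",), lambda a: []),   # x * 1 -> x
    (("ADD_LIT_0",), lambda a: []),   # x + 0 -> x
]

def peephole(code):
    """One left-to-right pass replacing matched windows."""
    out, i = [], 0
    while i < len(code):
        for pat, rewrite in PATTERNS:
            window = code[i:i + len(pat)]
            if tuple(op[0] for op in window) == pat:
                repl = rewrite(*window)
                if repl is not None:        # table entry may decline
                    out.extend(repl)
                    i += len(pat)
                    break
        else:                               # no pattern fired
            out.append(code[i])
            i += 1
    return out

print(peephole([("PUSH", "R1"), ("POP", "R1"), ("ADD", "R2")]))
# [('ADD', 'R2')]
```

Extending such an optimizer is a matter of adding table rows rather than
code, which is the appeal of the approach; the drawbacks Brukardt notes
(e.g. literals handled better as stack contents) come from what a flat
pattern table cannot see.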
* Re: Verified compilers? 2012-03-06 1:27 ` Randy Brukardt 2012-03-06 17:24 ` Shark8 @ 2012-03-07 12:42 ` Stuart 2012-03-08 1:06 ` Randy Brukardt 1 sibling, 1 reply; 63+ messages in thread From: Stuart @ 2012-03-07 12:42 UTC (permalink / raw) On Tuesday, 6 March 2012 01:27:25 UTC, Randy Brukardt wrote: > "Yannick Duchêne (Hibou57)" <...> wrote in message > news:op.v91cleyiule2fv@douda-yannick ... > >What do you think of this sentence, "now that he's found the right > >architecture for building verified compilers, everyone will know how to do > >it?" > > Why would anyone want to do it? This is *not* a problem area in compilers > (at least in my experience). When we introduced an intermediate code to > separate semantics from optimization from code generation, one of the > unexpected benefits was that code generation bugs dropped to almost none. I admit I have not read the report yet - so I am unaware of the scope of the 'formal verification'; and Randy's post does point out other areas where errors creep into the compilation process. That said, there is still an issue in safety-critical applications of constructing a robust and compelling safety case, particularly covering the compilation process. This would seem to be something that would aid in such cases. You note 'code generation bugs dropped to almost none' - but then say:- > Bugs come from incorrectly converting semantics (which could not be > "verified" unless a formal description of the source language exists, and > that has proven too hard to do for any real programming language - "a subset > of C" not counting as such a language!). And most of all from optimizations; > perhaps an optimizer could be formally verified but I suspect doing so would > take 3 times as much work as fixing the bugs as they're found. 
The remark about optimization is interesting as [certainly up to a few
years back] the number of target code generation faults seemed evenly
balanced between those that were caused during optimization and those that
were fixed by optimization (i.e. the fault goes away when the compiler is
allowed to apply optimization strategies).

Whichever stage of the compilation process is at fault, [in my experience]
there are usually a number of target code faults associated with fairly
mature compilers, and these cases have to be managed by the safety case
and development processes. Getting people with experience at both the
high-level source and low-level target code to review compilation issues
is becoming increasingly difficult. A technique that provides alternatives
for verifying the compilation process is attractive.

^ permalink raw reply	[flat|nested] 63+ messages in thread
* Re: Verified compilers? 2012-03-07 12:42 ` Stuart @ 2012-03-08 1:06 ` Randy Brukardt 2012-03-08 9:04 ` Jacob Sparre Andersen 0 siblings, 1 reply; 63+ messages in thread From: Randy Brukardt @ 2012-03-08 1:06 UTC (permalink / raw) "Stuart" <google_news@palin-26.freeserve.co.uk> wrote in message news:15276255.3865.1331124143041.JavaMail.geo-discussion-forums@vbze11... On Tuesday, 6 March 2012 01:27:25 UTC, Randy Brukardt wrote: ... > I admit I have not read the report yet - so I am unaware of the scope of > the 'formal verification'; > and Randy's post does point out other areas where errors creep into the > compilation process. > That said, there is still an issue in safety-critical applications of > constructing a robust and > compelling safety case, particularly covering the compilation process. > This would seem to be > something that would aid in such cases. I thought about that some after posting my reply. I probably was a bit too flippant. The real problem is that the only languages that can practically be verified are extremely simple, and I suspect it usually would be a bad idea to code large systems in such languages. (All of the advantages of a language like Ada -- or even C++ -- would be thrown away. After all, that's my complaint about SPARK, and this would be many times worse.) ... > Whichever stage of the compilation process is at fault, [in my experience] > there are usually > a number of target code faults associated with fairly mature compilers and > these cases have > to be managed by the safety case and development processes. True enough. But I don't think that it would be all that practical to verify an entire compiler for a real programming language. Moreover, I doubt that it would take any special architecture to do so, just a reasonable partitioning of concerns. 
It surely would be "easy" to verify the translation of our intermediate code to machine code (particularly if small simplifications were made in both; a formal definition would be needed for the intermediate code, but that would not be hard to do - it's already defined by Ada source code and carefully documented in English). It probably would be practical to verify the intermediate code transformations of an optimizer (not ours, however). Verifying the translation of raw syntax to intermediate code would be harder, and would mainly depend on a formal definition of the source language (which as noted, would be impossible to provide for any real language; part of the problem being that no human could understand it well enough to know if it is right -- so would verifying that it is translated properly really provide any benefit??). > Getting people with the experience at both the high-level source and > low-level target code > to review compilation issues is becoming increasingly difficult. A > technique that provides > alternatives in verifying the compilation process are attractive. Understood, but as noted I'm dubious that this would really help. You could verify *some* language, but whether it really was the same language that you wanted would be impossible to tell. I suppose if you could hook up a SPARK-like verifier to it, you could see if your program did what you want based on the formal description -- whether that actually matched the expected semantics or not -- but that would require describing the intended outputs of your program in the language of the formal description, and I'm dubious that you could understand *that* well enough to tell if it really did what you want. And if you succeeded in doing that, you no longer need the program itself -- it's just an artifact, you might as well directly translate the formal specification into code -- at which point you have to start the entire cycle again. 
(Can you tell that I think trying to verify everything is a waste of time,
no matter what it is??)

Anyway, it's certainly true that some programming problems need
extraordinary measures. And I suppose for those people, this sort of thing
will appear to be a new silver bullet. But of course there is no silver
bullet for programming - it's hard, it is going to remain hard, and
worrying too much about the compiler when the source program is hardly
verified at all makes little sense.

Randy.

^ permalink raw reply	[flat|nested] 63+ messages in thread
* Re: Verified compilers? 2012-03-08 1:06 ` Randy Brukardt @ 2012-03-08 9:04 ` Jacob Sparre Andersen 2012-03-08 9:37 ` Dmitry A. Kazakov ` (3 more replies) 0 siblings, 4 replies; 63+ messages in thread From: Jacob Sparre Andersen @ 2012-03-08 9:04 UTC (permalink / raw) Randy Brukardt wrote: > [...] But I don't think that it would be all that practical to verify > an entire compiler for a real programming language. [...] It surely > would be "easy" to verify the translation of our intermediate code to > machine code [...] This makes me think that introducing an extra layer of intermediate language/code might make it easier to convince yourself that each transformation from one layer to another is correct (maybe even formally verify it). It would of course on the other hand introduce an extra - possibly incorrect - transformation in the compilation system. Can we somehow estimate the likelihood of errors in the complete compilation system (produced software), depending on the number of transformations involved and the procedures used to assure the correctness of each transformation? (Well, I know the math; multiplying the likelihoods of each transformation being correct gives you the likelihood that all the transformations are correct.) But how do we estimate the likelihoods of each transformation being correct? Can we do better than 0.999 ** Source_Line_Count? (Using the rule of thumb that there on average is an error for every thousand lines of code.) Oh. And who verifies that the silicon is correct? Greetings, Jacob -- "Be who you are and say what you feel, because those who mind don't matter and those who matter don't mind." -- Dr. Seuss ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Verified compilers?
  2012-03-08 9:04 ` Jacob Sparre Andersen
@ 2012-03-08 9:37   ` Dmitry A. Kazakov
  2012-03-08 11:23     ` Simon Wright
  2012-03-08 10:23   ` Brian Drummond
    ` (2 subsequent siblings)
  3 siblings, 1 reply; 63+ messages in thread
From: Dmitry A. Kazakov @ 2012-03-08 9:37 UTC (permalink / raw)

On Thu, 08 Mar 2012 10:04:04 +0100, Jacob Sparre Andersen wrote:

> (Well, I know the math; multiplying
> the likelihoods of each transformation being correct gives you the
> likelihood that all the transformations are correct.)

Only if these are independent (assuming a probabilistic model).

> But how do we estimate the likelihoods of each transformation being
> correct?

By making a model and then running an experiment. The problem is that
neither a model exists, nor are experiments possible.

> Oh. And who verifies that the silicon is correct?

Well, that is much simpler. Since silicon is a physical device,
statistical models work pretty well there.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de

^ permalink raw reply	[flat|nested] 63+ messages in thread
* Re: Verified compilers? 2012-03-08 9:37 ` Dmitry A. Kazakov @ 2012-03-08 11:23 ` Simon Wright 2012-03-08 12:27 ` Dmitry A. Kazakov 0 siblings, 1 reply; 63+ messages in thread From: Simon Wright @ 2012-03-08 11:23 UTC (permalink / raw) "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes: >> Oh. And who verifies that the silicon is correct? > > Well, that is much simpler. Since silicon is a physical device > statistical models work pretty well there. You can probably tell if the silicon has become broken. It's harder to tell whether the silicon has a design error. ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Verified compilers? 2012-03-08 11:23 ` Simon Wright @ 2012-03-08 12:27 ` Dmitry A. Kazakov 0 siblings, 0 replies; 63+ messages in thread From: Dmitry A. Kazakov @ 2012-03-08 12:27 UTC (permalink / raw) On Thu, 08 Mar 2012 11:23:53 +0000, Simon Wright wrote: > "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes: > >>> Oh. And who verifies that the silicon is correct? >> >> Well, that is much simpler. Since silicon is a physical device >> statistical models work pretty well there. > > You can probably tell if the silicon has become broken. It's harder to > tell whether the silicon has a design error. Yes, but that is no different from software errors. What I meant was hardware faults. -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Verified compilers? 2012-03-08 9:04 ` Jacob Sparre Andersen 2012-03-08 9:37 ` Dmitry A. Kazakov @ 2012-03-08 10:23 ` Brian Drummond 2012-03-08 23:38 ` Bill Findlay 2012-03-08 15:23 ` Peter C. Chapin 2012-03-09 2:04 ` Randy Brukardt 3 siblings, 1 reply; 63+ messages in thread From: Brian Drummond @ 2012-03-08 10:23 UTC (permalink / raw) On Thu, 08 Mar 2012 10:04:04 +0100, Jacob Sparre Andersen wrote: > > Oh. And who verifies that the silicon is correct? VIPER. en.wikipedia.org/wiki/VIPER_microprocessor www.cl.cam.ac.uk/techreports/UCAM-CL-TR-104.pdf - Brian ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Verified compilers? 2012-03-08 10:23 ` Brian Drummond @ 2012-03-08 23:38 ` Bill Findlay 2012-03-09 13:56 ` Brian Drummond 0 siblings, 1 reply; 63+ messages in thread From: Bill Findlay @ 2012-03-08 23:38 UTC (permalink / raw) On 08/03/2012 10:23, in article jja1b4$8q1$1@dont-email.me, "Brian Drummond" <brian@shapes.demon.co.uk> wrote: > On Thu, 08 Mar 2012 10:04:04 +0100, Jacob Sparre Andersen wrote: > >> >> Oh. And who verifies that the silicon is correct? > > VIPER. > > en.wikipedia.org/wiki/VIPER_microprocessor > www.cl.cam.ac.uk/techreports/UCAM-CL-TR-104.pdf Was the 'proof' of VIPER not later shown to have missed serious faults? (BTW nice to see that someone else remembers the Rekursiv.) -- Bill Findlay with blueyonder.co.uk; use surname & forename; ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Verified compilers? 2012-03-08 23:38 ` Bill Findlay @ 2012-03-09 13:56 ` Brian Drummond 2012-03-09 14:43 ` Shark8 ` (3 more replies) 0 siblings, 4 replies; 63+ messages in thread From: Brian Drummond @ 2012-03-09 13:56 UTC (permalink / raw) On Thu, 08 Mar 2012 23:38:39 +0000, Bill Findlay wrote: > On 08/03/2012 10:23, in article jja1b4$8q1$1@dont-email.me, "Brian > Drummond" <brian@shapes.demon.co.uk> wrote: > >> On Thu, 08 Mar 2012 10:04:04 +0100, Jacob Sparre Andersen wrote: >> >> >>> Oh. And who verifies that the silicon is correct? >> >> VIPER. >> >> en.wikipedia.org/wiki/VIPER_microprocessor >> www.cl.cam.ac.uk/techreports/UCAM-CL-TR-104.pdf > > Was the 'proof' of VIPER not later shown to have missed serious faults? Quem vedisti pastores? That must be the "controversy" hinted at on Viper's Wikipedia page. Unfortunately, apart from the Dick Pountain article in Byte, I don't know anything about the VIPER, so any further info would be appreciated. It would be interesting to see verified hardware : maybe even VIPER revisited with newer proof tools; and to see if something more like a real CPU - say an ARM1 or 2 (let's not be too ambitious!) - would be verifiable these days. While it amuses me to see SPADE references so close to the surface in SPARK, it does also make me question how much progress has been made in formal verification in the last quarter century. Gut feel says, more than AI and flying cars, but not much. Which does not negate its value. Rather, I think it points to the fact that there is much neglected ground, that could prove fertile. One area, directly germane to the question above, is that there is a lot of commonality between the SPARK subset of Ada, and the synthesisable subset of VHDL, probably because both share the need for static elaboration of resources. 
I would like to see this exploited in several ways, whether as SystemAda (surely a better alternative to SystemC) for SW/HW co-verification or ESL design, or as SPARK-like proof tools applied to VHDL. VHDL and Ada seem to be following separate but convergent evolution paths (e.g. VHDL has conditional and case expressions, Ada will. Ada has inheritance, VHDL doesn't ... yet) and formal proof would appear to be equally applicable to suitable subsets of both. > (BTW nice to see that someone else remembers the Rekursiv.) (well I know a little more about THAT!) There was a lot of interesting hardware at the time. I was intrigued by the similarity between Transputer and Lilith instruction formats, which might be worth re-examining where code density (or making it fit in cache) is important. - Brian ^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Verified compilers?
  2012-03-09 13:56 ` Brian Drummond
@ 2012-03-09 14:43 ` Shark8
  2012-03-09 21:51 ` Brian Drummond
  2012-03-09 15:49 ` Bill Findlay
  ` (2 subsequent siblings)
  3 siblings, 1 reply; 63+ messages in thread
From: Shark8 @ 2012-03-09 14:43 UTC (permalink / raw)

On Friday, March 9, 2012 7:56:34 AM UTC-6, Brian Drummond wrote:
>
> There was a lot of interesting hardware at the time. I was intrigued by
> the similarity between Transputer and Lilith instruction formats, which
> might be worth re-examining where code density (or making it fit in
> cache) is important.
>
> - Brian

Which others of those interesting technologies (or the idea behind them)
do you suppose was a missed opportunity?

I remember reading about the Lisp-Machines that came out, and some of the
development tools they had. The "step into / modify [executing] code"
debugging seems worlds ahead of anything I've seen on the *nix front
(admittedly only a cursory glance) and substantially ahead of the windows
debuggers. (MS's .NET debugging w/ VS doesn't allow for on-the-fly
modification; Borland's Delphi debugger allowed you to modify variables /
registers on-the-fly, IIRC.)

^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Verified compilers?
  2012-03-09 14:43 ` Shark8
@ 2012-03-09 21:51 ` Brian Drummond
  0 siblings, 0 replies; 63+ messages in thread
From: Brian Drummond @ 2012-03-09 21:51 UTC (permalink / raw)

On Fri, 09 Mar 2012 06:43:03 -0800, Shark8 wrote:

> On Friday, March 9, 2012 7:56:34 AM UTC-6, Brian Drummond wrote:
>>
>> There was a lot of interesting hardware at the time. I was intrigued by
>> the similarity between Transputer and Lilith instruction formats, which
>> might be worth re-examining where code density (or making it fit in
>> cache) is important.
>>
>> - Brian
>
> Which others of those interesting technologies (or the idea behind them)
> do you suppose was a missed opportunity?

Well, the Linn Rekursiv of course!

Realistically - I don't know. At the time I postulated that RISC (as it
was then) was a flash in the pan, taking advantage of short term
opportunities (limited transistor count - wasting some on complex decoders
or features really could hurt performance) and I still do. I knew that
longer term, some form of CISC would win out - but I imagined a cleaner
design with hindsight to inform it, not an 8086 with layers upon layers of
add-ons! And like others here, I regretted that the 80286 segment debacle
got in the way of exploiting segments (or Rekursiv objects) to provide so
much of the protection we need.

(Aside: I remember a campfire conversation 20 years ago in Wyoming where -
quite unexpectedly - I met a guy from ARM, when I argued that RISC would
always have a disadvantage from fixed length instructions: you want
something like Huffman encoding so the commonest symbols are shortest,
like push/pop and other 1-byte insts on x86. Fixed instructions play to
early RISC's limited transistor count; longer term, a few thousand gates
for decoding aren't even worth counting.)

So I think the biggest lost opportunity was that the Hennessy/Patterson
orthodoxy became so dominant. They're not wrong: you can optimise what you
have by measuring any feature's benefit, and cost. However...

One trouble is, their measure of benefit tends to be performance (because
it is easy to measure) and not safety, ease of use, security, bug count,
ease of debugging (all of which are harder). Another is, you can't measure
a design you haven't built yet. ("Built" can include simulation for this
purpose.) And you can't build a new design if you can't prove it will be
better. (And even if you build it, all the benchmarks are in C, which is
too low level to exploit more sophisticated hardware.) And you can't prove
it will be better if you can't measure its performance...

So I think it has funneled design to some local peak on the landscape, but
made it impossible to jump across the valleys to other mountain ranges.

In language of the time, the big problem (or "problem") addressed by Lisp
machines and the Rekursiv - and the Transputer in its own way - was
"bridging the semantic gap" between high level languages and the primitive
machine operations. Alas, we have eliminated the semantic gap ... by
programming in C.

So the Transputer for example provided means for parallel programming -
directly supported in hardware, by simple interconnections between
multiple chips. (Oddly enough, the fact that a single T414 in 1985 ran at
20MHz when other CPUs struggled to 10MHz seems to have been lost.) I
believe it was also the first CPU to divorce the core clock (20MHz) from
the motherboard clock (5MHz) so that faster chips need only change the
clock multiplier. But I digress. Coming back on topic, if the orthodoxy
had embraced that, we would certainly be a lot further ahead now that
transistor budgets are screaming for multicore - and the market would need
a language that really understood multiprocessing...

I'll leave the Rekursiv for another time. It demonstrated a lot of
unorthodox ideas, many of which have potential value. But it was a
prototype, developed by a tiny team working fast, focused on the new ideas
rather than ultimate performance. (So a lot of basic ideas, like
pipelining or caches, were ignored - unless you could implement them in
microcode.)

Two things it taught me: (1) you can't beat hardware support when you need
performance. (2) If the hardware support isn't what you wanted, you lose
that advantage. Putting those together in 1990, and revisiting the FPGA,
was quite an eye-opener. Previously, I thought that an ASIC that forgot
what it was when you switched the power off was a REALLY DUMB idea...

> I remember reading about the Lisp-Machines that came out, and some of
> the development tools they had. The "step into / modify [executing]
> code" debugging seems worlds ahead of anything I've seen on the *nix
> front (admittedly only a cursory glance) and substantially ahead of the
> windows debuggers. (MS's .NET debugging w/ VS doesn't allow for
> on-the-fly modification; Borland's Delphi debugger allowed you to modify
> variables / registers on-the-fly, IIRC.)

Today you would have to sandbox such a debugger, in such a way that you
could guarantee its execution of modified code wasn't exploited by bad
people.

- Brian

^ permalink raw reply [flat|nested] 63+ messages in thread
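Brian's Huffman point is easy to quantify. The sketch below builds a Huffman code over an opcode frequency mix; the frequencies are invented for illustration (loosely in the spirit of "push/pop are commonest"), not measured from any real instruction stream, and the result is compared against the fixed-width opcode field the same eight operations would need:

```python
import heapq

def huffman_lengths(freqs):
    """Standard greedy Huffman construction; returns
    {symbol: code_length_in_bits} for the given frequencies."""
    # Heap entries: (subtree weight, unique tiebreak, {symbol: depth}).
    heap = [(w, i, {sym: 0}) for i, (sym, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every leaf one level deeper.
        merged = {s: d + 1 for s, d in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, count, merged))
        count += 1
    return heap[0][2]

# Made-up opcode mix for a hypothetical stack machine.
freqs = {"push": 0.30, "pop": 0.20, "add": 0.15, "load": 0.15,
         "store": 0.10, "branch": 0.05, "call": 0.03, "other": 0.02}
lengths = huffman_lengths(freqs)
avg_bits = sum(freqs[s] * lengths[s] for s in freqs)
# With these weights the average works out to 2.65 bits per opcode,
# against the 3 bits a fixed-width field for 8 operations costs.
```

The gap widens as the distribution gets more skewed, which is exactly the effect 1-byte push/pop encodings exploit on x86.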
* Re: Verified compilers?
  2012-03-09 13:56 ` Brian Drummond
  2012-03-09 14:43 ` Shark8
@ 2012-03-09 15:49 ` Bill Findlay
  2012-03-09 20:34 ` Brian Drummond
  2012-03-09 19:40 ` Jeffrey Carter
  2012-03-09 23:59 ` phil.clayton
  3 siblings, 1 reply; 63+ messages in thread
From: Bill Findlay @ 2012-03-09 15:49 UTC (permalink / raw)

On 09/03/2012 13:56, in article jjd26i$k12$1@dont-email.me, "Brian
Drummond" <brian@shapes.demon.co.uk> wrote:

> On Thu, 08 Mar 2012 23:38:39 +0000, Bill Findlay wrote:
>
>> On 08/03/2012 10:23, in article jja1b4$8q1$1@dont-email.me, "Brian
>> Drummond" <brian@shapes.demon.co.uk> wrote:
>>
>>> On Thu, 08 Mar 2012 10:04:04 +0100, Jacob Sparre Andersen wrote:
>>>
>>>> Oh. And who verifies that the silicon is correct?
>>>
>>> VIPER.
>>>
>>> en.wikipedia.org/wiki/VIPER_microprocessor
>>> www.cl.cam.ac.uk/techreports/UCAM-CL-TR-104.pdf
>>
>> Was the 'proof' of VIPER not later shown to have missed serious faults?
>
> Quem vedisti pastores?
>
> That must be the "controversy" hinted at on Viper's Wikipedia page.
> Unfortunately, apart from the Dick Pountain article in Byte, I don't
> know anything about the VIPER, so any further info would be appreciated.

Sorry - I'm as vague on this as Wikipedia. I dimly remember some such
thing being discovered years after the 'proof' was ballyhooed.

> One area, directly germane to the question above, is that there is a lot
> of commonality between the SPARK subset of Ada, and the synthesisable
> subset of VHDL, probably because both share the need for static
> elaboration of resources. I would like to see this exploited in several
> ways, whether as SystemAda (surely a better alternative to SystemC) for
> SW/HW co-verification or ESL design, or as SPARK-like proof tools
> applied to VHDL.

Interesting point.

--
Bill Findlay
with blueyonder.co.uk;
use  surname & forename;

^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Verified compilers?
  2012-03-09 15:49 ` Bill Findlay
@ 2012-03-09 20:34 ` Brian Drummond
  0 siblings, 0 replies; 63+ messages in thread
From: Brian Drummond @ 2012-03-09 20:34 UTC (permalink / raw)

On Fri, 09 Mar 2012 15:49:53 +0000, Bill Findlay wrote:

> On 09/03/2012 13:56, in article jjd26i$k12$1@dont-email.me, "Brian
> Drummond" <brian@shapes.demon.co.uk> wrote:
>
>> On Thu, 08 Mar 2012 23:38:39 +0000, Bill Findlay wrote:
>>
>>> On 08/03/2012 10:23, in article jja1b4$8q1$1@dont-email.me, "Brian
>>> Drummond" <brian@shapes.demon.co.uk> wrote:
>>>
>>>> On Thu, 08 Mar 2012 10:04:04 +0100, Jacob Sparre Andersen wrote:
>>>>
>>>>> Oh. And who verifies that the silicon is correct?
...
>> One area, directly germane to the question above, is that there is a
>> lot of commonality between the SPARK subset of Ada, and the
>> synthesisable subset of VHDL ...
>
> Interesting point.

... but I don't know where to take it.

I believe ghdl (VHDL compiler) is largely written in Ada, so there are
some sources to play with. Between that and the SPARK-GPL sources, there
must be a starting point for some investigation.

But finding time - maybe when I retire! Or funding...

- Brian

^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Verified compilers?
  2012-03-09 13:56 ` Brian Drummond
  2012-03-09 14:43 ` Shark8
  2012-03-09 15:49 ` Bill Findlay
@ 2012-03-09 19:40 ` Jeffrey Carter
  2012-03-09 20:39 ` Brian Drummond
  2012-03-09 23:59 ` phil.clayton
  3 siblings, 1 reply; 63+ messages in thread
From: Jeffrey Carter @ 2012-03-09 19:40 UTC (permalink / raw)

On 03/09/2012 06:56 AM, Brian Drummond wrote:
>
> While it amuses me to see SPADE references so close to the surface in
> SPARK, it does also make me question how much progress has been made in
> formal verification in the last quarter century. Gut feel says, more
> than AI and flying cars, but not much.

Terrafugia is actively working to certify a "roadable aircraft":

http://www.terrafugia.com/

--
Jeff Carter
"Sons of a silly person."
Monty Python & the Holy Grail
02

^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Verified compilers?
  2012-03-09 19:40 ` Jeffrey Carter
@ 2012-03-09 20:39 ` Brian Drummond
  0 siblings, 0 replies; 63+ messages in thread
From: Brian Drummond @ 2012-03-09 20:39 UTC (permalink / raw)

On Fri, 09 Mar 2012 12:40:50 -0700, Jeffrey Carter wrote:

> On 03/09/2012 06:56 AM, Brian Drummond wrote:
>>
>> While it amuses me to see SPADE references so close to the surface in
>> SPARK, it does also make me question how much progress has been made in
>> formal verification in the last quarter century. Gut feel says, more
>> than AI and flying cars, but not much.
>
> Terrafugia is actively working to certify a "roadable aircraft":
>
> http://www.terrafugia.com/

Heh, I suppose I asked for that. But I'm not sure if that proves, or
disproves, my point!

- Brian

^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Verified compilers?
  2012-03-09 13:56 ` Brian Drummond
  ` (2 preceding siblings ...)
  2012-03-09 19:40 ` Jeffrey Carter
@ 2012-03-09 23:59 ` phil.clayton
  3 siblings, 0 replies; 63+ messages in thread
From: phil.clayton @ 2012-03-09 23:59 UTC (permalink / raw)

On Friday, March 9, 2012 1:56:34 PM UTC, Brian Drummond wrote:
> On Thu, 08 Mar 2012 23:38:39 +0000, Bill Findlay wrote:
>
>> On 08/03/2012 10:23, in article jja1b4$8q1$1@dont-email.me, "Brian
>> Drummond" wrote:
>>
>>> On Thu, 08 Mar 2012 10:04:04 +0100, Jacob Sparre Andersen wrote:
>>>
>>>> Oh. And who verifies that the silicon is correct?
>>>
>>> VIPER.
>>>
>>> en.wikipedia.org/wiki/VIPER_microprocessor
>>> www.cl.cam.ac.uk/techreports/UCAM-CL-TR-104.pdf
>>
>> Was the 'proof' of VIPER not later shown to have missed serious faults?
>
> Quem vedisti pastores?
>
> That must be the "controversy" hinted at on Viper's Wikipedia page.
> Unfortunately, apart from the Dick Pountain article in Byte, I don't
> know anything about the VIPER, so any further info would be appreciated.

See Donald MacKenzie's excellent book 'Knowing Machines', which makes
extensive reference to the VIPER case. (It is a fairly unique example.)
The short chapter covering the VIPER fiasco can (except for the first
page) be read here:

http://books.google.co.uk/books?id=c5YaHkcP6DEC&pg=PA160

(Don't let that stop you buying the book though!)

Ultimately the problem seems to be that expectations were set too high:
the formal proofs did not say as much about the actual hardware as people
may have been led to expect, and the commercial prospects were not as
great as suggested.

I used to work in the same team as one of the VIPER developers, though
many years after VIPER, and it is always interesting to hear parts of the
story. One favourite is an early meeting with the lawyers. They sat down
and asked "So, what exactly is boolean algebra?"

> It would be interesting to see verified hardware: maybe even VIPER
> revisited with newer proof tools; and to see if something more like a
> real CPU - say an ARM1 or 2 (let's not be too ambitious!) - would be
> verifiable these days.

I assume you mean a complete CPU verified from design all the way to the
gate level. That would be interesting! The hardware you are using right
now almost certainly has some aspect of formal verification already; for
example, Intel have been doing this for years. There is plenty of
information out there discussing the issues, e.g.

http://www.cl.cam.ac.uk/~jrh13/slides/types-04sep99/slides1.pdf

> While it amuses me to see SPADE references so close to the surface in
> SPARK, it does also make me question how much progress has been made in
> formal verification in the last quarter century. Gut feel says, more
> than AI and flying cars, but not much.

I generally agree. Certainly, much larger systems are now being verified
but, when considering the actual advance, one should compensate for
simply making use of greater computational resources. Formal verification
is fundamentally difficult, so it is difficult to advance. There are a
lot of people out there trying.

> Which does not negate its value. Rather, I think it points to the fact
> that there is much neglected ground, that could prove fertile.

I very much think so too!

Phil

^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Verified compilers?
  2012-03-08  9:04 ` Jacob Sparre Andersen
  2012-03-08  9:37 ` Dmitry A. Kazakov
  2012-03-08 10:23 ` Brian Drummond
@ 2012-03-08 15:23 ` Peter C. Chapin
  2012-03-09  2:04 ` Randy Brukardt
  3 siblings, 0 replies; 63+ messages in thread
From: Peter C. Chapin @ 2012-03-08 15:23 UTC (permalink / raw)

On 2012-03-08 04:04, Jacob Sparre Andersen wrote:

> Oh. And who verifies that the silicon is correct?

I have the impression that the hardware people routinely use model
checking techniques to verify designs. I don't know much about it,
however.

Peter

^ permalink raw reply [flat|nested] 63+ messages in thread
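The essence of the model checking Peter mentions fits in a few lines: exhaustively enumerate every interleaving of a small design and search for a state that violates the property. Industrial hardware model checkers are symbolic (BDD- or SAT-based) rather than explicit-state like this toy sketch, but the idea is the same. Here the "design" is Peterson's two-process mutual-exclusion protocol, plus a deliberately broken variant that the checker catches:

```python
from collections import deque

def check(initial, successors, is_bad):
    """Exhaustive breadth-first search of the reachable state space.
    Returns a reachable bad state, or None if the property holds."""
    seen, queue = {initial}, deque([initial])
    while queue:
        s = queue.popleft()
        if is_bad(s):
            return s
        for t in successors(s):
            if t not in seen:
                seen.add(t)
                queue.append(t)
    return None

def peterson(state):
    """Peterson's protocol. State: (pc0, pc1, flag0, flag1, turn);
    pc 3 means 'in the critical section'."""
    out = []
    for i in (0, 1):
        j = 1 - i
        pc, flag, turn = [state[0], state[1]], [state[2], state[3]], state[4]
        if pc[i] == 0:                       # raise own flag
            flag[i], pc[i] = True, 1
        elif pc[i] == 1:                     # politely yield the turn
            turn, pc[i] = j, 2
        elif pc[i] == 2:                     # spin until allowed in
            if flag[j] and turn != i:
                continue                     # still blocked
            pc[i] = 3
        else:                                # leave critical section
            flag[i], pc[i] = False, 0
        out.append((pc[0], pc[1], flag[0], flag[1], turn))
    return out

def naive(state):
    """Broken variant: test the other flag BEFORE setting your own.
    State: (pc0, pc1, flag0, flag1); pc 2 is the critical section."""
    out = []
    for i in (0, 1):
        j = 1 - i
        pc, flag = [state[0], state[1]], [state[2], state[3]]
        if pc[i] == 0:
            if flag[j]:
                continue                     # other is (apparently) in
            pc[i] = 1
        elif pc[i] == 1:                     # set own flag, enter
            flag[i], pc[i] = True, 2
        else:                                # leave
            flag[i], pc[i] = False, 0
        out.append((pc[0], pc[1], flag[0], flag[1]))
    return out

# Peterson is safe: no interleaving puts both processes at pc = 3.
assert check((0, 0, False, False, 0), peterson,
             lambda s: s[0] == 3 and s[1] == 3) is None
# The naive protocol races: both read False, both enter.
assert check((0, 0, False, False), naive,
             lambda s: s[0] == 2 and s[1] == 2) is not None
```

The attraction for silicon is that the state space, unlike a program's, is finite by construction, so the search can in principle be exhaustive.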
* Re: Verified compilers?
  2012-03-08  9:04 ` Jacob Sparre Andersen
  ` (2 preceding siblings ...)
  2012-03-08 15:23 ` Peter C. Chapin
@ 2012-03-09  2:04 ` Randy Brukardt
  3 siblings, 0 replies; 63+ messages in thread
From: Randy Brukardt @ 2012-03-09 2:04 UTC (permalink / raw)

"Jacob Sparre Andersen" <sparre@nbi.dk> wrote in message
news:87d38nv4zf.fsf@adaheads.sparre-andersen.dk...

> Randy Brukardt wrote:
>
>> [...] But I don't think that it would be all that practical to verify
>> an entire compiler for a real programming language. [...] It surely
>> would be "easy" to verify the translation of our intermediate code to
>> machine code [...]
>
> This makes me think that introducing an extra layer of intermediate
> language/code might make it easier to convince yourself that each
> transformation from one layer to another is correct (maybe even
> formally verify it).

Right. Which I suspect is all that the OP's reference has really
"discovered"; another discovery of the obvious. (And I suppose a patent
on it, too.)

Randy.

^ permalink raw reply [flat|nested] 63+ messages in thread
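Jacob's point about extra intermediate layers can be made concrete with a toy pipeline. Each layer below has its own executable semantics, so each translation can be checked on its own; here the check is only by testing (per-run "translation validation"), but the same factoring is what makes per-pass formal proof tractable in a project like CompCert. The mini-languages are invented for illustration:

```python
# Layer 1: expression AST, e.g. ("add", ("num", 2), ("num", 3)).
def eval_ast(e):
    """Reference semantics for the source layer."""
    if e[0] == "num":
        return e[1]
    lhs, rhs = eval_ast(e[1]), eval_ast(e[2])
    return lhs + rhs if e[0] == "add" else lhs * rhs

# Layer 2: flat stack code -- the extra intermediate layer.
def compile_ast(e):
    """One small, obviously-structured translation: AST -> stack ops."""
    if e[0] == "num":
        return [("push", e[1])]
    return compile_ast(e[1]) + compile_ast(e[2]) + [(e[0],)]

def run_stack(code):
    """Independent semantics for the target layer."""
    stack = []
    for ins in code:
        if ins[0] == "push":
            stack.append(ins[1])
        else:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b if ins[0] == "add" else a * b)
    return stack.pop()

# Because both layers can be run, the translation between them can be
# validated in isolation, without trusting any other compiler pass.
expr = ("mul", ("add", ("num", 2), ("num", 3)), ("num", 4))
assert eval_ast(expr) == run_stack(compile_ast(expr)) == 20
```

With a third layer (say, stack code to register code) the same pattern repeats: each hop is small enough to convince yourself of separately, which is exactly the appeal of adding the layer.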
end of thread, other threads:[~2012-03-14 21:52 UTC | newest]

Thread overview: 63+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-02-21 15:42 Verified compilers? Yannick Duchêne (Hibou57)
2012-02-24  1:41 ` Shark8
2012-02-24  8:52 ` Georg Bauhaus
2012-02-24 17:36 ` Peter C. Chapin
2012-03-06  1:27 ` Randy Brukardt
2012-03-06 17:24 ` Shark8
2012-03-06 17:43 ` Dmitry A. Kazakov
2012-03-06 19:03 ` Shark8
2012-03-07  5:33 ` Yannick Duchêne (Hibou57)
2012-03-07  9:12 ` Dmitry A. Kazakov
2012-03-07 17:49 ` Niklas Holsti
2012-03-07 20:17 ` Dmitry A. Kazakov
2012-03-07 23:28 ` Usefulness of Formal Notions in Programming (was: Verified compilers?) Georg Bauhaus
2012-03-08  9:24 ` Usefulness of Formal Notions in Programming Dmitry A. Kazakov
2012-03-08 10:30 ` Nasser M. Abbasi
2012-03-08 12:37 ` Dmitry A. Kazakov
2012-03-08  0:42 ` Verified compilers? Randy Brukardt
2012-03-08  9:25 ` Dmitry A. Kazakov
2012-03-08 18:10 ` Niklas Holsti
2012-03-08 20:41 ` Dmitry A. Kazakov
2012-03-08 18:02 ` Niklas Holsti
2012-03-08 20:40 ` Dmitry A. Kazakov
2012-03-09  0:44 ` Georg Bauhaus
2012-03-09 22:13 ` Niklas Holsti
2012-03-10 10:36 ` Dmitry A. Kazakov
2012-03-10 20:35 ` Niklas Holsti
2012-03-11  9:47 ` Dmitry A. Kazakov
2012-03-11 22:22 ` Niklas Holsti
2012-03-12  5:12 ` Niklas Holsti
2012-03-12  9:43 ` Dmitry A. Kazakov
2012-03-14  8:36 ` Niklas Holsti
2012-03-14  9:24 ` Georg Bauhaus
2012-03-14 11:14 ` REAL (was: Verified compilers?) stefan-lucks
2012-03-14 12:59 ` REAL Dmitry A. Kazakov
2012-03-14 13:30 ` REAL Georg Bauhaus
2012-03-14 13:51 ` REAL Dmitry A. Kazakov
2012-03-14 20:37 ` REAL Brian Drummond
2012-03-14 21:52 ` REAL Dmitry A. Kazakov
2012-03-14 13:52 ` REAL georg bauhaus
2012-03-14 17:42 ` REAL Jeffrey Carter
2012-03-14 10:14 ` Verified compilers? Dmitry A. Kazakov
2012-03-14 20:13 ` Niklas Holsti
2012-03-11 10:55 ` Georg Bauhaus
2012-03-10 13:46 ` Brian Drummond
2012-03-07  1:00 ` Randy Brukardt
2012-03-07 12:42 ` Stuart
2012-03-08  1:06 ` Randy Brukardt
2012-03-08  9:04 ` Jacob Sparre Andersen
2012-03-08  9:37 ` Dmitry A. Kazakov
2012-03-08 11:23 ` Simon Wright
2012-03-08 12:27 ` Dmitry A. Kazakov
2012-03-08 10:23 ` Brian Drummond
2012-03-08 23:38 ` Bill Findlay
2012-03-09 13:56 ` Brian Drummond
2012-03-09 14:43 ` Shark8
2012-03-09 21:51 ` Brian Drummond
2012-03-09 15:49 ` Bill Findlay
2012-03-09 20:34 ` Brian Drummond
2012-03-09 19:40 ` Jeffrey Carter
2012-03-09 20:39 ` Brian Drummond
2012-03-09 23:59 ` phil.clayton
2012-03-08 15:23 ` Peter C. Chapin
2012-03-09  2:04 ` Randy Brukardt
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox