From: "Robert I. Eachus"
Subject: Re: Ayacc/Aflex "entropy" (P2Ada)
Date: 1999/11/01
Message-ID: <381E1912.865D682@mitre.org>#1/1
References: <3813716C.52655126@Maths.UniNe.CH> <7v2400$e02$1@nnrp1.deja.com> <7v30jd$3i6$1@nnrp1.deja.com> <7v3u0f$nn6$1@nnrp1.deja.com> <3816331A.99C596D2@mitre.org> <7v5ns8$2h1$1@nnrp1.deja.com> <3819CBDA.E801A064@mitre.org> <7vhib3$8o1$1@nnrp1.deja.com>
Organization: The MITRE Corporation
NNTP-Posting-Date: 1 Nov 1999 22:44:15 GMT
Newsgroups: comp.lang.ada

Robert Dewar wrote:

> Remember to check that the set of declarations (where I is
> in the above) does not include a pragma Import for Foo, which
> renders this example semantically legal (have a look in the
> GNAT code for the rather complex details in handling this
> case correctly :-)

Actually, that's why I put in a following declaration but no ellipsis.

> > Putting in a special case rule to recognize that the semicolon
> > should be an "is" helps a lot, but you also want to look
> > further to deliver the right error message.

> If Robert Eachus is saying that it is easy to add rules to
> a typical table driver parser to handle this case, all I can
> say is (a) I never saw it done and (b) I think it would be
> tricky, and (c) the only thing that would convince me is an
> actual working example.

Somewhere I have a source listing for LALR for Multics, and a paper showing how it is done. But it really isn't so hard. What you want is to have productions which are only used in error situations. Once errors start spewing out, sometimes the "right" correction is arbitrarily far back. But since I am doing LR not LL parsing, it is possible to add a production such as:

    ::= ; [] ;

(There is a standard production:

    ::= begin end

This simplifies other error correction...)

Now what happens is the compiler notices that it is in "panic mode": lots of errors and no obvious way to continue the parse. Instead of just trying to match whatever is on the stack to tokens in the forward direction, you basically flip a switch and then try to restart from the (lexical) beginning of each production in the parse stack with these additional rules switched in.

The Ada/SIL compiler grammar had six such productions, including the one above, and between them they reduced the number of error messages produced by the ACVC B-tests by seventy something percent. Things like = or : or even : = for assignment were detected and fixed by a different process. These rules were only used for these "arbitrary lookahead" errors, where the error could not be detected until many tokens had been read and processed.

> The trouble in this kind of error detection and recovery is
> very much that the devil is in the engineering details.

Amen!

> For example, people have suggested for years the idea of using
> indentation to help error recovery, but I have not seen this
> systematically implemented till GNAT, and it is really quite
> tricky (have a look at par-endh.adb in the GNAT sources for
> example!)

Yes, that is pretty hairy because you also have to infer what the user's indentation style is, and you can't reject anything for bad indentation (absent -gnatg ;-) but you need to go arbitrarily far back once you do hit the error to find the right fix.

-- 
Robert I. Eachus

with Standard_Disclaimer;
use Standard_Disclaimer;
function Message (Text: in Clever_Ideas) return Better_Ideas is...
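
A minimal sketch of the panic-mode idea described in this post, under loud assumptions: it is written in Python, it uses a small hand-written recognizer rather than an LALR table, and the toy grammar, the function names and the message text are all hypothetical. It is not the Ada/SIL grammar and does not try to reconstruct the productions quoted above. It only illustrates the two ingredients: an error-only rule that is ignored during normal parsing, and a retry from the lexical start of each open construct once the parser is stuck.

```python
KEYWORDS = {"procedure", "is", "begin", "end", "null"}

def ident(toks, i):
    """True if toks[i] exists and is a non-keyword identifier."""
    return i < len(toks) and toks[i].isidentifier() and toks[i] not in KEYWORDS

def expect(toks, i, *words):
    """Index just past the literal words at position i, or None."""
    j = i + len(words)
    return j if toks[i:j] == list(words) else None

def parse_spec(toks, i):
    """spec ::= 'procedure' IDENT"""
    j = expect(toks, i, "procedure")
    return j + 1 if j is not None and ident(toks, j) else None

def parse_object_decl(toks, i):
    """object_decl ::= IDENT ':' IDENT ';'"""
    if ident(toks, i) and toks[i + 1:i + 2] == [":"] and ident(toks, i + 2):
        return expect(toks, i + 3, ";")
    return None

def parse_block(toks, i):
    """block ::= object_decl* 'begin' ('null' ';')* 'end' ';'"""
    while (j := parse_object_decl(toks, i)) is not None:
        i = j
    i = expect(toks, i, "begin")
    while i is not None and (j := expect(toks, i, "null", ";")) is not None:
        i = j
    return expect(toks, i, "end", ";") if i is not None else None

def parse_item(toks, i, panic=False):
    """Parse one item at i; return (next_index, message_or_None) or None."""
    j = parse_spec(toks, i)
    if j is not None:
        k = expect(toks, j, "is")
        if k is not None:                      # normal body: spec 'is' block
            end = parse_block(toks, k)
            return (end, None) if end is not None else None
        k = expect(toks, j, ";")
        if k is None:
            return None
        if panic:                              # error-only rule: spec ';' block
            end = parse_block(toks, k)
            if end is not None:
                return end, f"token {j}: ';' should be 'is'"
        return k, None                         # plain subprogram declaration
    j = parse_object_decl(toks, i)
    return (j, None) if j is not None else None

def parse(toks):
    """Return the list of error messages for a token list."""
    starts, i, messages = [], 0, []            # starts: where each open item began
    while i < len(toks):
        result = parse_item(toks, i)
        if result is None:
            # Stuck: retry from each recorded start, most recent first,
            # with the error-only rule switched in.
            for s in reversed(starts):
                retry = parse_item(toks, s, panic=True)
                if retry is not None and retry[0] > i:
                    starts = [p for p in starts if p < s]
                    result, i = retry, s
                    break
            else:
                messages.append(f"token {i}: syntax error")
                return messages
        if result[1] is not None:
            messages.append(result[1])
        starts.append(i)
        i = result[0]
    return messages

print(parse("procedure Foo ; I : Integer ; begin null ; end ;".split()))
```

In this toy, `procedure Foo ;` followed by `I : Integer ;` is accepted as two legal declarations, so nothing looks wrong until `begin` shows up five tokens later. Only then does the retry from the start of the `procedure` item succeed with the error-only rule, yielding a single message that points back at the offending semicolon.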