From mboxrd@z Thu Jan  1 00:00:00 1970
X-Google-Language: ENGLISH,ASCII-7-bit
X-Google-Thread: 103376,1be1b347b5b5ad43
X-Google-Attributes: gid103376,public
From: "Robert I. Eachus" <eachus@mitre.org>
Subject: Re: Ayacc/Aflex "entropy" (P2Ada)
Date: 1999/11/01
Message-ID: <381E1912.865D682@mitre.org>#1/1
X-Deja-AN: 543270141
Content-Transfer-Encoding: 7bit
References: <3813716C.52655126@Maths.UniNe.CH> <7v2400$e02$1@nnrp1.deja.com>
 <7v30jd$3i6$1@nnrp1.deja.com> <7v3u0f$nn6$1@nnrp1.deja.com>
 <3816331A.99C596D2@mitre.org> <7v5ns8$2h1$1@nnrp1.deja.com>
 <3819CBDA.E801A064@mitre.org> <7vhib3$8o1$1@nnrp1.deja.com>
X-Accept-Language: en
Content-Type: text/plain; charset=us-ascii
X-Complaints-To: usenet@news.mitre.org
X-Trace: top.mitre.org 941496255 20126 129.83.41.77 (1 Nov 1999 22:44:15 GMT)
Organization: The MITRE Corporation
Mime-Version: 1.0
NNTP-Posting-Date: 1 Nov 1999 22:44:15 GMT
Newsgroups: comp.lang.ada
Date: 1999-11-01T22:44:15+00:00
List-Id: <comp.lang.ada>

Robert Dewar wrote:
  
> Remember to check that the set of declarations (where I is
> in the above) does not include a pragma Import for Foo, which
> renders this example semantically legal (have a look in the
> GNAT code for the rather complex details in handling this
> case correctly :-)

  Actually, that's why I put in a following declaration but no ellipsis.
 
> > Putting in a special case rule to recognize that the semicolon
> > should be an "is" helps a lot, but you also want to look
> > further to deliver the right error message.
 
> If Robert Eachus is saying that it is easy to add rules to
> a typical table driver parser to handle this case, all I can
> say is (a) I never saw it done and (b) I think it would be
> tricky, and (c) the only thing that would convince me is an
> actual working example.

   Somewhere I have a source listing for LALR for Multics, and a paper
showing how it is done.  But it really isn't so hard.  What you want is
to have productions which are only used in error situations.  Once
errors start spewing out, sometimes the "right" correction is
arbitrarily far back.  But since I am doing LR not LL parsing, it is
possible to add a production such as:

   <subprogram body> ::= <subprogram declaration> ; <begin block> [<name>] ;

    (There is a standard production:

   <begin block> ::= begin <sequence of statements> end

    This simplifies other error correction...) 

  Now what happens is that the compiler notices it is in "panic mode":
lots of errors and no obvious way to continue the parse.  Instead of
just trying to match whatever is on the stack to tokens in the forward
direction, you basically flip a switch and then try to restart from the
(lexical) beginning of each production in the parse stack with these
additional rules switched in.
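
  A minimal sketch of the restart idea, in Python.  It is a toy matcher
over pre-classified symbols, not an LALR table-driven parser, and it is
not taken from the Ada/SIL or Multics sources.  The symbol names (SPEC,
DECL, BEGIN_BLOCK, NAME), the "*" and "?" suffixes, and every function
name are assumptions made for illustration.  The stack records where each
accepted unit began, so that panic mode can back up to those positions and
retry with the error-only production switched in.

```python
# Toy grammar over pre-classified symbols. All names here are hypothetical.
# "X*" means zero or more X, "X?" means an optional X.
NORMAL = [
    ("subprogram_body",
     ["SPEC", "is", "DECL*", "BEGIN_BLOCK", "NAME?", ";"], None),
    ("subprogram_declaration", ["SPEC", ";"], None),
    ("object_declaration", ["DECL"], None),
]

# Production that is switched in only during panic mode.
SEMICOLON_FOR_IS = '";" after the specification should be "is"'
ERROR_ONLY = [
    ("subprogram_body",
     ["SPEC", ";", "DECL*", "BEGIN_BLOCK", "NAME?", ";"], SEMICOLON_FOR_IS),
]


def match(rhs, tokens, pos):
    """Return the position just past rhs if it matches at pos, else None."""
    for sym in rhs:
        if sym.endswith("*"):
            while pos < len(tokens) and tokens[pos] == sym[:-1]:
                pos += 1
        elif sym.endswith("?"):
            if pos < len(tokens) and tokens[pos] == sym[:-1]:
                pos += 1
        elif pos < len(tokens) and tokens[pos] == sym:
            pos += 1
        else:
            return None
    return pos


def first_match(productions, tokens, pos):
    """Return (name, end, message) for the first production matching at pos."""
    for name, rhs, message in productions:
        end = match(rhs, tokens, pos)
        if end is not None:
            return name, end, message
    return None


def recover(stack, tokens, pos):
    """Panic mode: retry from the start of each stacked unit, newest first,
    with the error-only productions switched in. Only a match that gets
    past the stuck position counts as a recovery."""
    for depth in range(len(stack) - 1, -1, -1):
        start = stack[depth][1]
        hit = first_match(ERROR_ONLY + NORMAL, tokens, start)
        if hit is not None and hit[1] > pos:
            return depth, start, hit
    return None


def parse(tokens):
    """Return (names of accepted units, list of (position, message))."""
    stack, errors, pos = [], [], 0
    while pos < len(tokens):
        hit = first_match(NORMAL, tokens, pos)
        if hit is None:
            rescue = recover(stack, tokens, pos)
            if rescue is None:
                # No error production helps: report and skip one token.
                errors.append((pos, "unexpected " + tokens[pos]))
                pos += 1
                continue
            depth, pos, hit = rescue
            del stack[depth:]  # discard the units being re-parsed
        name, end, message = hit
        if message is not None:
            errors.append((pos, message))
        stack.append((name, pos))
        pos = end
    return [name for name, _ in stack], errors
```

  With the input SPEC ; DECL BEGIN_BLOCK NAME ; the normal productions
accept a subprogram declaration and an object declaration and then get
stuck at BEGIN_BLOCK.  Backing up to the start of the declaration with the
error production enabled yields one subprogram body and a single message,
rather than a cascade of follow-on errors.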

  The Ada/SIL compiler grammar had six such productions, including the
one above, and between them they reduced the number of error messages
produced by the ACVC B-tests by seventy-something percent.  Things like
= or : or even : = for assignment were detected and fixed by a different
process.  These rules were only used for these "arbitrary lookahead"
errors, where the error could not be detected until many tokens had been
read and processed. 

> The trouble in this kind of error detection and recovery is
> very much that the devil is in the engineering details.

    Amen!
 
> For example, people have suggested for years the idea of using
> indentation to help error recovery, but I have not seen this
> systematically implemented till GNAT, and it is really quite
> tricky (have a look at par-endh.adb in the GNAT sources for
> example!)
 
   Yes, that is pretty hairy because you also have to infer what the
user's indentation style is, and you can't reject anything for bad
indentation (absent -gnatg ;-) but you need to go arbitrarily far back
once you do hit the error to find the right fix.
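
   A rough sketch of the column-matching part only, in Python.  This is
not GNAT's algorithm from par-endh.adb; the function name and the
(keyword, column) representation of open scopes are assumptions made for
illustration.  It does not try to infer the user's indentation style: it
trusts exact column matches and otherwise falls back to the innermost
scope, so badly indented but legal code is never rejected.

```python
def scope_closed_by_end(open_scopes, end_column):
    """Guess which open scope an `end` token closes.

    open_scopes is a list of (keyword, column) pairs, outermost first.
    Returns (keyword of the scope closed, keywords of inner scopes that
    appear to be missing their own `end`, innermost first).
    """
    if not open_scopes:
        return None, []
    # Search from the innermost scope outward for a matching column.
    for depth in range(len(open_scopes) - 1, -1, -1):
        keyword, column = open_scopes[depth]
        if column == end_column:
            inner = open_scopes[depth + 1:]
            missing = [kw for kw, _ in reversed(inner)]
            return keyword, missing
    # Indentation gives no hint: assume the innermost scope is closed.
    return open_scopes[-1][0], []
```

   For scopes opened by "procedure" at column 0, "loop" at column 3 and
"if" at column 6, an `end` at column 3 is taken to close the loop, and
the "if" is reported as the construct that probably lost its `end`.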
 
-- 

                                        Robert I. Eachus

with Standard_Disclaimer;
use  Standard_Disclaimer;
function Message (Text: in Clever_Ideas) return Better_Ideas is...