From: Mark A Biggar
Subject: Re: Lexical Conundrum
Date: 1998/02/23
Message-ID: <34F1BC2B.529629AF@lmco.com>
References: <01bd3d80$101287c0$LocalHost@xhv46.dial.pipex.com>
Organization: Lockheed Martin Western Development Labs
Newsgroups: comp.lang.ada

Robert A Duff wrote:
>
> In article <01bd3d80$101287c0$LocalHost@xhv46.dial.pipex.com>,
> Nick Roberts wrote:
> >But, if you look closely at line 6, you will see the sequence
> >
> >    or'a'in
> >
> >in the middle of an expression.
> >
> >Now, from chapter 2 of the RM, one might get the impression that this could
> >be parsed as five lexical elements (three identifiers and two apostrophes).
>
> I think 2.2(7) makes it clear that the above is three lexical elements,
> not five.

Not really, as section 2.2(7) only talks about cases that require white
space between tokens to be legal. The real point in this example is that
"or" is a keyword, not an identifier, so the following "'" must be the
start of a character literal, not the start of an attribute. If you look
at the LM grammar, you will find that there are NO cases where an
attribute follows a keyword and also NO cases where a character literal
follows an identifier, so you can always determine the meaning of "'"
just by remembering the classification of the previous token.
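
To make the "remember the previous token" rule concrete, here is a minimal sketch in Python (not Ada, and not taken from any real compiler). The function name tokenize, the token kind labels, and the tiny RESERVED subset are all assumptions made for illustration. It follows the rule exactly as stated above and leaves out contexts the rule does not mention, such as an apostrophe after a closing parenthesis.

```python
# Hypothetical, heavily simplified lexer: only words, apostrophes and
# single-character delimiters are recognised. Numbers, strings and
# comments are out of scope for this sketch.

# Small illustrative subset of the Ada reserved words (assumption).
RESERVED = {"and", "or", "xor", "not", "in", "if", "then", "else"}


def tokenize(text):
    """Return a list of (kind, lexeme) pairs for the given source text."""
    tokens = []
    prev_kind = None  # classification of the previously emitted token
    i = 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            i += 1
            continue
        if ch.isalpha():
            # Scan a word, then classify it as keyword or identifier.
            j = i
            while j < len(text) and (text[j].isalnum() or text[j] == "_"):
                j += 1
            word = text[i:j]
            kind = "keyword" if word.lower() in RESERVED else "identifier"
            tokens.append((kind, word))
            prev_kind = kind
            i = j
            continue
        if ch == "'":
            # The rule from the post: after an identifier the apostrophe
            # is an attribute tick; otherwise it may open a character
            # literal of the form 'x'.
            closes = i + 2 < len(text) and text[i + 2] == "'"
            if prev_kind != "identifier" and closes:
                tokens.append(("char_literal", text[i:i + 3]))
                prev_kind = "char_literal"
                i += 3
            else:
                tokens.append(("tick", "'"))
                prev_kind = "tick"
                i += 1
            continue
        # Anything else is treated as a one-character delimiter.
        tokens.append(("delimiter", ch))
        prev_kind = "delimiter"
        i += 1
    return tokens


if __name__ == "__main__":
    print(tokenize("or'a'in"))
    print(tokenize("X'First"))
```

With this sketch the sequence or'a'in comes out as a keyword, a character literal and a keyword, while X'First comes out as an identifier, a tick and an identifier. The only state carried between tokens is prev_kind, so no help from a parser is needed.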
This whole problem is one of the reasons why the Ada95 LM makes such a
big deal of the fact that keyword tokens are NOT a subset of identifier
tokens, which was a problem with the Ada83 LM. Note that a lexical
analyser can fully determine this on its own without help from the
associated parser. So any compiler that doesn't tokenize the above as
the 3 tokens "or", "'a'" and "in" is in violation of the LM.

--
Mark Biggar
mark.a.biggar@lmco.com