From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level: 
X-Spam-Status: No, score=-1.3 required=5.0 tests=BAYES_00,INVALID_MSGID
	autolearn=no autolearn_force=no version=3.4.4
X-Google-Language: ENGLISH,ASCII-7-bit
X-Google-Thread: 103376,3b05f12bd7a2a871
X-Google-Attributes: gid103376,public
From: Mark A Biggar <mark.a.biggar@lmco.com>
Subject: Re: Lexical Conundrum
Date: 1998/02/23
Message-ID: <34F1BC2B.529629AF@lmco.com>#1/1
X-Deja-AN: 327904402
Content-Transfer-Encoding: 7bit
References: <01bd3d80$101287c0$LocalHost@xhv46.dial.pipex.com>
 <EotBMK.MnK@world.std.com>
Content-Type: text/plain; charset=us-ascii
Organization: Lockheed Martin Western Development Labs
Mime-Version: 1.0
Newsgroups: comp.lang.ada
Date: 1998-02-23T00:00:00+00:00
List-Id: <comp.lang.ada>


Robert A Duff wrote:
> 
> In article <01bd3d80$101287c0$LocalHost@xhv46.dial.pipex.com>,
> Nick Roberts <Nick.Roberts@dial.pipex.com> wrote:
> >But, if you look closely at line 6, you will see the sequence
> >
> >   or'a'in
> >
> >in the middle of an expression.
> >
> >Now, from chapter 2 of the RM, one might get the impression that this could
> >be parsed as five lexical elements (three identifiers and two apostrophes).
> 
> I think 2.2(7) makes it clear that the above is three lexical elements,
> not five.

Not really, as section 2.2(7) only talks about cases that require white space
between tokens to be legal.  The real point in this example is that "or" is
a keyword, not an identifier, so the following "'" must be the start of
a character literal, not the start of an attribute.  If you look at the 
LM grammar, you will find that there are NO cases where an attribute follows
a keyword and also NO cases where a character literal follows an identifier,
so you can always determine the meaning of "'" just by remembering the
classification of the previous token.  This whole problem is one of the
reasons why the Ada95 LM makes such a big deal of the fact
that keyword tokens are NOT a subset of identifier tokens, which was a
problem with the Ada83 LM.  Note that a lexical analyser can fully determine
this on its own without help from the associated parser.  So any compiler
that doesn't tokenize the above as the 3 tokens "or", "'a'" and "in" is
in violation of the LM.
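
As a rough illustration of that rule, here is a minimal sketch in Python
(not Ada, and not taken from any real compiler).  The names tokenize,
RESERVED and IDENT are made up for this example, the reserved word list is
deliberately cut short, and everything other than identifiers, keywords and
apostrophes is lumped together as a "delimiter".  It only shows that
remembering the class of the previous token is enough to classify "'".

```python
import re

# Hypothetical, abbreviated reserved word list; a real lexer needs the full set.
RESERVED = {"and", "or", "xor", "not", "in", "is", "if", "then", "else", "when"}

# Letter, then letters/digits with optional single underscores in between.
IDENT = re.compile(r"[A-Za-z](?:_?[A-Za-z0-9])*")

def tokenize(text):
    """Split text into (kind, lexeme) pairs using only the previous token's kind."""
    tokens = []
    prev_kind = None
    i = 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            i += 1
            continue
        m = IDENT.match(text, i)
        if m:
            word = m.group()
            # Keywords are classified separately from identifiers.
            kind = "keyword" if word.lower() in RESERVED else "identifier"
            lexeme = word
        elif (ch == "'" and prev_kind != "identifier"
              and i + 2 < len(text) and text[i + 2] == "'"):
            # Not preceded by an identifier: treat 'x' as a character literal.
            kind = "char_literal"
            lexeme = text[i:i + 3]
        elif ch == "'":
            # Preceded by an identifier: treat it as an attribute tick.
            kind = "tick"
            lexeme = ch
        else:
            # Simplification for the sketch: any other character is one token.
            kind = "delimiter"
            lexeme = ch
        tokens.append((kind, lexeme))
        prev_kind = kind
        i += len(lexeme)
    return tokens

print(tokenize("or'a'in"))
# [('keyword', 'or'), ('char_literal', "'a'"), ('keyword', 'in')]
```

With an identifier in front instead, tokenize("x'first") gives an
identifier, a tick and another identifier, so the same apostrophe is
read as an attribute mark with no help from a parser.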

--
Mark Biggar
mark.a.biggar@lmco.com