From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level: 
X-Spam-Status: No, score=-1.3 required=5.0 tests=BAYES_00,INVALID_MSGID
	autolearn=no autolearn_force=no version=3.4.4
X-Google-Language: ENGLISH,ASCII-7-bit
X-Google-Thread: 103376,3b05f12bd7a2a871
X-Google-Attributes: gid103376,public
From: dewar@merv.cs.nyu.edu (Robert Dewar)
Subject: Re: Lexical Conundrum
Date: 1998/02/22
Message-ID: <dewar.888188165@merv>#1/1
X-Deja-AN: 327636063
References: <6cpkda$otl$1@plug.news.pipex.net>
X-Complaints-To: usenet@news.nyu.edu
X-Trace: news.nyu.edu 888188317 16735 (None) 128.122.140.58
Organization: New York University
Newsgroups: comp.lang.ada
Date: 1998-02-22T00:00:00+00:00
List-Id: <comp.lang.ada>


Nick Roberts said

  I have run the following program
  
     1  with Ada.Text_IO; use Ada.Text_IO;
     2  procedure Test_1 is
     3     subtype UC is Character range'A'..'Z';
     4  begin
     5     Put_Line("Start of Test_1");
     6     if 'a'='a' or'a'in UC then
     7        Put_Line("True branch executed");
     8     end if;
     9     Put_Line("End of Test_1");
    10  end;
  
  through GNAT 3.10, and it compiles and runs fine.  I haven't tried, but I
  imagine it would probably work on any Ada compiler, in all likelihood.
  But, if you look closely at line 6, you will see the sequence
  
     or'a'in
  
  in the middle of an expression.
  Now, from chapter 2 of the RM, one might get the impression that this could
  be parsed as five lexical elements (three identifiers and two apostrophes).
  Of course, if a compiler were to parse it that way, the result would be a
  syntax error.
  
  For comparison, I have put
  
     range'A'..'Z';
  
  into line 3.  I don't think there is an ambiguity in parsing this, although
  it is very close to the previous example.  Similarly, if the space were to
  be removed from just before the example in line 6, the ambiguity would be
  resolved there also.
  
  One immediate conclusion, I think, is that the introduction of a one-letter
  attribute (in an implementation of the language) could cause difficulties!
  Of course, it's hard to imagine a motive for such an attribute, in practice.
  
  Question: would a compiler be in contravention of the RM by rejecting the
  above program (with a syntax error in line 6)?  I admit that such a
  compiler may be considered to be poorly designed!

You entirely misunderstand the RM. It is not a recipe for constructing
a compiler; it is a set of rules about what programs are valid and what
they mean. The rules make it clear that

  or'a'in ..

is valid, where three tokens are involved

  or  'a'   in

so this is valid. End of story. If the compiler parses this as five tokens

  or  ' a ' in

it will get confused, and will be wrong! End of story.

Furthermore, introducing a one character attribute does not in any way change
things. An ambiguity at the language level would be a problem in the language
design.

An apparent ambiguity at the lexical level is merely a problem for the
compiler writer. Not a very difficult one. It is in practice easy to tell
whether a quote is the start of a character literal or an attribute character.

I recommend reading the GNAT sources to get more knowledge in this area. Here
is a quote from the comments of the Scn package:

  --  Here is where we make the test to distinguish the cases. Treat
  --  as apostrophe if previous token is an identifier, right paren
  --  or the reserved word "all" (latter case as in A.all'Address)
  --  Also treat it as apostrophe after a literal (wrong anyway, but
  --  that's probably the better choice).
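
As a rough illustration of the test that comment describes, here is a small
sketch of a scanner that classifies each quote by looking at the previous
token. It is not the GNAT scanner. The names Tick_Demo, Token_Kind,
Is_Apostrophe, Classify and Scan are hypothetical, the reserved-word list is
deliberately incomplete, numeric and string literals are not scanned at all,
and "a quote two characters further on" is my own simplified assumption for
recognizing a character literal.

```ada
with Ada.Text_IO;             use Ada.Text_IO;
with Ada.Characters.Handling; use Ada.Characters.Handling;

procedure Tick_Demo is

   --  Hypothetical token kinds for this sketch; these are not GNAT's names.
   type Token_Kind is
     (None, Identifier, Reserved_All, Reserved_Other,
      Right_Paren, Literal, Other);

   --  The test described in the Scn comment: a quote is an apostrophe
   --  when it follows an identifier, a right paren, "all" or a literal.
   function Is_Apostrophe (Prev : Token_Kind) return Boolean is
   begin
      return Prev = Identifier
        or else Prev = Right_Paren
        or else Prev = Reserved_All
        or else Prev = Literal;
   end Is_Apostrophe;

   --  Deliberately incomplete reserved-word list, enough for the demo.
   function Classify (Word : String) return Token_Kind is
      W : constant String := To_Lower (Word);
   begin
      if W = "all" then
         return Reserved_All;
      elsif W = "or" or else W = "in" or else W = "range" then
         return Reserved_Other;
      else
         return Identifier;
      end if;
   end Classify;

   procedure Scan (Source : String) is
      Prev  : Token_Kind := None;
      P     : Natural    := Source'First;
      Start : Natural;
   begin
      Put_Line (Source);
      while P <= Source'Last loop
         Start := P;
         if Source (P) = ' ' then
            --  Blanks separate tokens and do not change Prev.
            P := P + 1;
         elsif Is_Letter (Source (P)) then
            while P <= Source'Last
              and then (Is_Alphanumeric (Source (P))
                        or else Source (P) = '_')
            loop
               P := P + 1;
            end loop;
            Prev := Classify (Source (Start .. P - 1));
            Put_Line ("   word:              " & Source (Start .. P - 1));
         elsif Source (P) = '''
           and then not Is_Apostrophe (Prev)
           and then P + 2 <= Source'Last
           and then Source (P + 2) = '''
         then
            --  Quote in a position where an apostrophe is not expected,
            --  with a closing quote two characters on: character literal.
            P    := P + 3;
            Prev := Literal;
            Put_Line ("   character literal: " & Source (Start .. P - 1));
         else
            --  Apostrophe (attribute tick) or any other single delimiter.
            P := P + 1;
            if Source (Start) = ')' then
               Prev := Right_Paren;
            else
               Prev := Other;
            end if;
            Put_Line ("   delimiter:         " & Source (Start));
         end if;
      end loop;
   end Scan;

begin
   Scan ("or'a'in");
   Scan ("Character'('a')");
   Scan ("A.all'Address");
end Tick_Demo;
```

With GNAT this can be saved as tick_demo.adb and built with gnatmake. The
first call splits or'a'in into three tokens, since the quote follows a
reserved word other than "all". The second call shows why the previous token
matters: the quote after the identifier Character is taken as an apostrophe,
so '(' is not misread as a character literal, and 'a' is then scanned as one.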

The RM is completely silent on this issue, since there is no issue from a
language point of view!