From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level: 
X-Spam-Status: No, score=-1.3 required=5.0 tests=BAYES_00,INVALID_MSGID
	autolearn=no autolearn_force=no version=3.4.4
X-Google-Language: ENGLISH,ASCII-7-bit
X-Google-Thread: 103376,47bd5b7b3b898723
X-Google-Attributes: gid103376,public
From: dewar@cs.nyu.edu (Robert Dewar)
Subject: Re: Text_IO and Ada source (was: Form feed comment for pragma Page)
Date: 1996/06/20
Message-ID: <dewar.835286961@schonberg>#1/1
X-Deja-AN: 161767957
references: <4p04vi$3ui$1@mhafn.production.compuserve.com>
 <dewar.833858501@schonberg> <evans-0406961641500001@ppp5.pgh.net>
 <dewar.833946907@schonberg> <evans-0606960931350001@ppp7.pgh.net>
 <dewar.834075532@schonberg> <4pkp5k$13gs@watnews1.watson.ibm.com>
  <dewar.834585066@schonberg> <4q6c5s$12nr@watnews1.watson.ibm.com>
organization: Courant Institute of Mathematical Sciences
newsgroups: comp.lang.ada
Date: 1996-06-20T00:00:00+00:00
List-Id: <comp.lang.ada>


Norm Cohen says:

"What RM95 2.2(2) requires is that, whatever mechanisms the implementation
uses to represent a format effector other than a tab, the implementation
must treat the logical occurrence of such a character as the end of a
line, i.e., such a character cannot be construed as occurring in the
MIDDLE of a comment."

Nothing in the RM says that "the mechanism the implementation uses to
represent a format effector" has anything to do with any characters in
the source text (in Norm's message here, he is using "character" in the
logical sense). For example, the following is peculiar, but quite
legitimate:

A form feed format effector character could be the character 16#0C#
provided that it occurs outside a comment, but inside a comment
16#0C# could be regarded as not being a format effector. The occurrence
of a form feed terminating a comment (which must be permissible) could
be represented by some entirely different mechanism.
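A rough sketch of such a peculiar scheme, in Python rather than Ada. The
RM leaves the "entirely different mechanism" open, so the 16#1B# marker
used here is purely an invented stand-in, as are all the names. To stay
short, the sketch also assumes no string literal contains "--".

```python
FF = "\x0c"                    # format effector only outside comments
COMMENT_FF = "\x1b"            # invented marker: a form feed ending a comment
OTHER_TERMINATORS = "\n\r\x0b" # LF, CR, VT end a line everywhere

def decode(source: str) -> list:
    """Split physical text into logical lines under the peculiar scheme."""
    lines, current, in_comment = [], [], False
    for i, ch in enumerate(source):
        if in_comment:
            if ch in OTHER_TERMINATORS or ch == COMMENT_FF:
                # The comment, and with it the line, ends here.
                lines.append("".join(current))
                current, in_comment = [], False
            else:
                # Inside a comment 16#0C# is ordinary comment text.
                current.append(ch)
        elif ch in OTHER_TERMINATORS or ch == FF:
            lines.append("".join(current))
            current = []
        else:
            if source.startswith("--", i):
                in_comment = True
            current.append(ch)
    if current:
        lines.append("".join(current))
    return lines
```

With this decoder, a 16#0C# outside a comment ends the line, while the
same code inside a comment is kept as part of the comment text.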

There is nothing in the RM to say that the method used for source
representation must be a simple one-to-one translation of some kind,
though of course in many systems this will indeed be the case.

Note that in particular, it is perfectly fine to represent all format
effectors that terminate a line in an identical manner, since they
all have identical lexical/syntactic/semantic effects.

For example, it is perfectly fine on the IBM mainframe to represent
CR, LF, VT, or FF or any sequence of such characters as the physical
end of record, without distinguishing between these cases.
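As an illustration only, here is how a record-oriented representation
might be derived from a stream in which the four characters happen to be
ASCII-coded (an assumption; the RM does not require it). The function
name is hypothetical, and Python lists stand in for physical records.

```python
import re

def to_records(source: str) -> list:
    """Treat any run of CR, LF, VT or FF as a single end of record."""
    records = re.split("[\r\n\x0b\x0c]+", source)
    # A trailing terminator leaves one empty string after the last record.
    if records and records[-1] == "":
        records.pop()
    return records
```

Collapsing a run of terminators into one boundary drops empty lines,
which should not change the sequence of lexical elements.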

In a Unix-based system, it would be peculiar, but perfectly permissible,
to represent CR, LF, VT and FF the same way (here and in the previous
paragraph I am using these abbreviations for the logical Ada characters,
not the corresponding ASCII representations).

For example, you can take the Ada source and translate CR, LF, VT and FF
all to LF, and that is a legitimate source representation, since every
possible Ada program is representable and has correct semantics.
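A minimal sketch of that translation, in Python, assuming the incoming
text uses the ASCII codes for the four characters (the RM does not
require this). The names are made up for illustration.

```python
# Map each line-terminating format effector to LF.
# LF (16#0A#) is not listed because it already maps to itself.
LINE_TERMINATORS = {
    0x0D: "\n",  # CR
    0x0B: "\n",  # VT
    0x0C: "\n",  # FF
}

def normalize_terminators(source: str) -> str:
    """Replace CR, VT and FF with LF, leaving everything else unchanged."""
    return source.translate(LINE_TERMINATORS)
```

Note that a CR LF pair becomes two LFs, that is, one extra empty line,
which should not change any lexical element.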

You can take this much further: for example, every comment can be
represented by a single space in the text that is input to your
compiler.
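A sketch of such a preprocessing step, in Python, under some stated
assumptions: the input is valid Ada using the standard " string
delimiter, and the function name is hypothetical. String literals and
the character literal '"' are copied through so that a "--" inside a
string is not mistaken for a comment.

```python
LINE_ENDS = "\n\r\x0b\x0c"  # LF, CR, VT, FF all end a comment

def blank_comments(source: str) -> str:
    """Replace each Ada comment with a single space."""
    out = []
    i, n = 0, len(source)
    while i < n:
        if source.startswith("'\"'", i):
            # The character literal '"' must not open a string.
            out.append(source[i:i + 3])
            i += 3
        elif source[i] == '"':
            # Copy a string literal whole; a doubled "" inside a string
            # is simply seen as two adjacent literals.
            j = source.find('"', i + 1)
            j = n - 1 if j == -1 else j
            out.append(source[i:j + 1])
            i = j + 1
        elif source.startswith("--", i):
            # Skip to the end of the line, keeping the terminator itself.
            j = i
            while j < n and source[j] not in LINE_ENDS:
                j += 1
            out.append(" ")
            i = j
        else:
            out.append(source[i])
            i += 1
    return "".join(out)
```

The line terminator after each comment is preserved, so line structure
is unchanged and only the comment text is lost.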

Of course, good taste and usability dictate against choosing source
representations that are undesirable or unworkable, but the RM itself
dictates *very* little in this area!