From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level: 
X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00 autolearn=ham
	autolearn_force=no version=3.4.4
X-Google-Thread: a07f3367d7,36b2e8ae79f7dadd
X-Google-Attributes: gida07f3367d7,public,usenet
X-Google-NewGroupId: yes
X-Google-Language: ENGLISH,ASCII
Path: 
 g2news2.google.com!news1.google.com!npeer03.iad.highwinds-media.com!news.highwinds-media.com!feed-me.highwinds-media.com!post02.iad.highwinds-media.com!news.flashnewsgroups.com-b7.4zTQh5tI3A!not-for-mail
Newsgroups: comp.lang.ada
Subject: Re: OpenToken version 3.1 preview
References: <ueiscnz8i.fsf@stephe-leake.org>
 <aa18196d-e374-436d-b6c3-28426d9ea58e@a26g2000yqn.googlegroups.com>
 <uab2wb2h6.fsf@stephe-leake.org>
 <17b17a5b-6b54-4486-8494-650827a58dad@c1g2000yqi.googlegroups.com>
From: Stephen Leake <stephen_leake@stephe-leake.org>
Date: Fri, 24 Jul 2009 06:54:42 -0400
Message-ID: <uab2u5p1p.fsf@stephe-leake.org>
User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.2 (windows-nt)
Cancel-Lock: sha1:o2bs9Nx9j2au1gvqg+IczSHXqSM=
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: 8bit
X-Complaints-To: abuse@flashnewsgroups.com
Organization: FlashNewsgroups.com
X-Trace: e6ea34a6992dee197caa711710
Xref: g2news2.google.com comp.lang.ada:7314
Date: 2009-07-24T06:54:42-04:00
List-Id: <comp.lang.ada>

AdaMagica <christoph.grein@eurocopter.com> writes:

> On Jul 23, 3:41 am, Stephen Leake <stephen_le...@stephe-leake.org>
> wrote:
>> AdaMagica <christoph.gr...@eurocopter.com> writes:
>> > There is a problem with Bracketed_Comment. If it extends over more
>> > than one line, the token is correctly recognized, but the lexeme
>> > fails.
>>
>> The line feed characters are dropped from the lexeme, on Windows.
>
> Also on Linux.
>
>> I don't suppose you have an idea of how to fix it?
>
> You guessed right - I haven't. I briefly browsed the code, but found
> no simple solution.
>
>> It will be interesting to figure out how to make that test portable
>> between Windows and Gnu/Linux. The easiest way to identify which line
>> ending to use that I know of is to look at
>> GNAT.Directory_Operations.Dir_Separator; it's '\' for CR LF, '/' for
>> LF. Don't know how to deal with Mac!
>
> There are other OSs where an end of line is not a character in the
> stream. Can OpenToken handle these?

The current file Text_Feeder uses Ada.Text_IO, so it should do "the
right thing" for any OS.

> We could do a Get_Line and insert a LF irrespective of what the OS
> uses. 

That's what the text feeder does now. Actually, it inserts
EOL_Character (see below).
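
For anyone following along, the idea looks roughly like this. It is
only a sketch, not the actual OpenToken Text_Feeder code: the names
Read_All and EOL_Marker are made up for illustration, and I've
assumed LF as the marker where OpenToken uses its own EOL_Character.

```ada
with Ada.Text_IO;
with Ada.Strings.Unbounded;
with Ada.Characters.Latin_1;

--  Hypothetical sketch, not the OpenToken Text_Feeder: read a file line
--  by line and join the lines with one fixed end-of-line marker,
--  whatever terminator the OS uses on disk.
function Read_All (Name : String) return String is
   use Ada.Strings.Unbounded;

   --  Assumption: LF is the in-buffer marker; OpenToken uses its own
   --  EOL_Character constant for this purpose.
   EOL_Marker : constant Character := Ada.Characters.Latin_1.LF;

   File   : Ada.Text_IO.File_Type;
   Buffer : Unbounded_String;
begin
   Ada.Text_IO.Open (File, Ada.Text_IO.In_File, Name);
   while not Ada.Text_IO.End_Of_File (File) loop
      --  Get_Line strips the OS line terminator; put our marker back.
      Append (Buffer, Ada.Text_IO.Get_Line (File));
      Append (Buffer, EOL_Marker);
   end loop;
   Ada.Text_IO.Close (File);
   return To_String (Buffer);
end Read_All;
```

Since Ada.Text_IO hides the OS terminator, the buffer should look the
same on Windows and GNU/Linux, which is what a portable test needs.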

So the end-of-line character must be getting dropped somewhere after
the feeder; I'll have to look harder.

> If then a lexeme was output that comprises several lines (currently
> only Bracketed_Comment I think), the output routine would have to
> translate this back to the OS's New_Line (this has of course to be
> documented in the recognizer).

Right.
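
Something along these lines would probably do for the output side.
Again just a sketch under assumptions: Put_Lexeme and EOL_Marker are
hypothetical names, not part of OpenToken, and LF stands in for
whatever marker the lexer buffer really uses.

```ada
with Ada.Text_IO;
with Ada.Characters.Latin_1;

--  Hypothetical sketch: write a multi-line lexeme to standard output,
--  turning each in-buffer end-of-line marker back into whatever
--  New_Line produces on this OS.
procedure Put_Lexeme (Lexeme : String) is
   --  Assumption: LF is the in-buffer marker.
   EOL_Marker : constant Character := Ada.Characters.Latin_1.LF;
begin
   for I in Lexeme'Range loop
      if Lexeme (I) = EOL_Marker then
         Ada.Text_IO.New_Line;
      else
         Ada.Text_IO.Put (Lexeme (I));
      end if;
   end loop;
end Put_Lexeme;
```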

> There is a declaration EOL_Character in package OpenToken.

Which has a comment to change it for your OS; not very friendly, as
it's a constant!

It's used in OpenToken.Recognizer.Character_Set.Standard_Whitespace,
OpenToken.Recognizer.Line_Comment.Analyze, and
OpenToken.Recognizer.String.Analyze.

I'll change the comment to "we use this regardless of OS, since we
need a standard way of representing an end of line in a string
buffer".

-- 
-- Stephe