Newsgroups: comp.lang.ada
Subject: Re: OpenToken version 3.1 preview
References: <17b17a5b-6b54-4486-8494-650827a58dad@c1g2000yqi.googlegroups.com>
From: Stephen Leake
Date: Fri, 24 Jul 2009 06:54:42 -0400

AdaMagica writes:

> On Jul 23, 3:41 am, Stephen Leake wrote:
>> AdaMagica writes:
>> > There is a problem with Bracketed_Comment. If it extends over more
>> > than one line, the token is correctly recognized, but the lexeme
>> > fails.
>>
>> The line feed characters are dropped from the lexeme, on Windows.
>
> Also on Linux.
>
>> I don't suppose you have an idea of how to fix it?
>
> You guessed right - I haven't. I shortly browsed the code, but found
> no simple solution.
>
>> It will be interesting to figure out how to make that test portable
>> between Windows and Gnu/Linux. The easiest way to identify which line
>> ending to use that I know of is to look at
>> GNAT.Directory_Operations.Dir_Separator; it's '\' for CR LF, '/' for
>> LF. Don't know how to deal with Mac!
>
> There are other OSs where an end of line is not a character in the
> stream. Can OpenToken handle these?

The current file Text_Feeder uses Ada.Text_IO, so it should do "the
right thing" for any OS.

> We could do a Get_Line and insert a LF irrespective of what the OS
> uses.

That's what the text feeder does now. Actually, it inserts
EOL_Character (see below). So the LF must be dropped after that; I'll
have to look harder.

> If then a lexeme was output that comprises several lines (currently
> only Bracketed_Comment I think), the output routine would have to
> translate this back to the OS's New_Line (this has of course to be
> documented in the recognizer).

Right.

> There is a declaration EOL_Character in package OpenToken.

Which has a comment to change it for your OS; not very friendly, as
it's a constant!

It's used in
OpenToken.Recognizer.Character_Set.Standard_Whitespace,
OpenToken.Recognizer.Line_Comment.Analyze,
OpenToken.Recognizer.String.Analyze.

I'll change the comment to "we use this regardless of OS, since we
need a standard way of representing an end of line in a string
buffer".

--
-- Stephe
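
For illustration, here is a minimal sketch of the scheme discussed
above: read with Get_Line so the OS line terminator never reaches the
buffer, append one fixed end-of-line marker instead, and translate the
marker back to New_Line on output. This is not the OpenToken code.
EOL_Character here is a local stand-in (assumed to be LF; the value in
package OpenToken may differ), and EOL_Sketch and Put_Lexeme are
hypothetical names. It uses the Ada 2005 function form of Get_Line.

```ada
with Ada.Text_IO;
with Ada.Strings.Unbounded;
with Ada.Characters.Latin_1;

procedure EOL_Sketch is
   use Ada.Strings.Unbounded;

   --  Assumption: local stand-in for OpenToken.EOL_Character; the
   --  value actually used by the library may differ.
   EOL_Character : constant Character := Ada.Characters.Latin_1.LF;

   Buffer : Unbounded_String;

   --  Hypothetical output routine: write a lexeme, turning each
   --  internal EOL_Character back into whatever line terminator
   --  Ada.Text_IO uses on this OS.
   procedure Put_Lexeme (Lexeme : in String) is
   begin
      for I in Lexeme'Range loop
         if Lexeme (I) = EOL_Character then
            Ada.Text_IO.New_Line;
         else
            Ada.Text_IO.Put (Lexeme (I));
         end if;
      end loop;
   end Put_Lexeme;

begin
   --  Feed: Get_Line strips the OS line terminator, so the buffer
   --  only ever sees the standard marker appended here.
   while not Ada.Text_IO.End_Of_File loop
      Append (Buffer, Ada.Text_IO.Get_Line);
      Append (Buffer, EOL_Character);
   end loop;

   Put_Lexeme (To_String (Buffer));
end EOL_Sketch;
```

With this arrangement a multi-line lexeme compares equal on Windows
and GNU/Linux, since the buffer contents no longer depend on the OS
line ending; only the output routine needs to know about New_Line.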