From: "Dmitry A. Kazakov"
Reply-To: mailbox@dmitry-kazakov.de
Organization: cbb software GmbH
Subject: Re: OpenToken version 3.1 preview
Newsgroups: comp.lang.ada
References: <17b17a5b-6b54-4486-8494-650827a58dad@c1g2000yqi.googlegroups.com>
Date: Thu, 23 Jul 2009 10:00:10 +0200

On Wed, 22 Jul 2009 22:09:27 -0700 (PDT), AdaMagica wrote:

> On Jul 23, 3:41 am, Stephen Leake wrote:
>> AdaMagica writes:
>>> There is a problem with Bracketed_Comment. If it extends over more
>>> than one line, the token is correctly recognized, but the lexeme
>>> fails.
>>
>> The line feed characters are dropped from the lexeme, on Windows.
>
> Also on Linux.
>
>> I don't suppose you have an idea of how to fix it?
>
> You guessed right - I haven't. I shortly browsed the code, but found
> no simple solution.
>
>> It will be interesting to figure out how to make that test portable
>> between Windows and Gnu/Linux. The easiest way to identify which line
>> ending to use that I know of is to look at
>> GNAT.Directory_Operations.Dir_Separator; it's '\' for CR LF, '/' for
>> LF. Don't know how to deal with Mac!
>
> There are other OSs where an end of line is not a character in the
> stream. Can OpenToken handle these?
> We could do a Get_Line and insert a LF irrespective of what the OS
> uses. If then a lexeme was output that comprises several lines
> (currently only Bracketed_Comment I think), the output routine would
> have to translate this back to the OS's New_Line (this has of course
> to be documented in the recognizer).

You could do what I did in the Simple Components for Ada parser. I
decoupled the sources from the parser itself. A source is an abstract
object that provides basic operations such as "get next line" and
"forward to the next line". The obvious advantage is that you need not
care about LF or CR in the parser, and you can use files, streams,
strings, GUI text buffers, etc. as a source for the same parser.

My 2 cents.

--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de
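
For illustration, the decoupling described above might look roughly like
the sketch below. It is written in Python for brevity and is not the
Simple Components for Ada API or OpenToken code: the names Source,
get_line, next_line, StringSource and bracketed_comment are all
hypothetical, and joining lines with LF follows the suggestion quoted
above rather than anything either library is known to do.

```python
from abc import ABC, abstractmethod


class Source(ABC):
    """Hypothetical abstract source: the parser sees lines, never LF or CR."""

    @abstractmethod
    def get_line(self) -> str:
        """Return the current line without its terminator."""

    @abstractmethod
    def next_line(self) -> bool:
        """Advance to the next line; return False at the end of the source."""


class StringSource(Source):
    """Source backed by an in-memory string with any line-ending convention."""

    def __init__(self, text: str) -> None:
        # splitlines() accepts LF, CR LF and CR (among others), so the
        # terminators are consumed here and never reach the parser.
        self._lines = text.splitlines() or [""]
        self._index = 0

    def get_line(self) -> str:
        return self._lines[self._index]

    def next_line(self) -> bool:
        if self._index + 1 >= len(self._lines):
            return False
        self._index += 1
        return True


def bracketed_comment(source: Source, opener: str = "/*", closer: str = "*/") -> str:
    """Collect a comment that may span several lines, joining lines with LF."""
    line = source.get_line()
    rest = line[line.index(opener):]   # ValueError if the opener is absent
    search_from = len(opener)          # do not match the closer inside the opener
    parts = []
    while True:
        end = rest.find(closer, search_from)
        if end >= 0:
            parts.append(rest[: end + len(closer)])
            return "\n".join(parts)
        parts.append(rest)
        if not source.next_line():
            raise ValueError("unterminated comment")
        rest = source.get_line()
        search_from = 0


if __name__ == "__main__":
    for text in ("a /* one\r\ntwo */ b", "a /* one\ntwo */ b", "a /* one\rtwo */ b"):
        print(repr(bracketed_comment(StringSource(text))))
```

The recognizer only ever calls get_line and next_line, so a file, stream
or GUI text buffer could be substituted by writing another Source
subclass. An output routine that needs the platform convention could
replace the LF in the returned lexeme with the OS line terminator when
writing.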