From: Stephen Leake
Newsgroups: comp.lang.ada
Subject: Re: OpenToken version 3.1 preview
Date: Fri, 24 Jul 2009 21:18:50 -0400
References: <17b17a5b-6b54-4486-8494-650827a58dad@c1g2000yqi.googlegroups.com>

Stephen Leake writes:

> AdaMagica writes:
>
>> On Jul 23, 3:41 am, Stephen Leake
>> wrote:
>>> AdaMagica writes:
>>> > There is a problem with Bracketed_Comment. If it extends over more
>>> > than one line, the token is correctly recognized, but the lexeme
>>> > fails.
>>>
>>> The line feed characters are dropped from the lexeme, on Windows.
>>

Here is the explanation of this symptom.

Text_Feeder uses Ada.Text_IO.Get_Line, so it never sees the "CR LF" on
DOS, nor the "LF" on Linux. It does insert an EOL_Character = CR for
each line break. That's why it appears to be dropping the LF.
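
The Get_Line behaviour described above can be seen in isolation with a
small stand-alone program. This is only a sketch, not OpenToken's
Text_Feeder code: the procedure name, the file name and the EOL_Marker
constant are made up for the illustration, and EOL_Marker is assumed to
play the role of OpenToken.EOL_Character (CR).

```ada
with Ada.Characters.Latin_1;
with Ada.Strings.Unbounded; use Ada.Strings.Unbounded;
with Ada.Text_IO;           use Ada.Text_IO;

procedure Get_Line_Demo is
   --  Stand-in for OpenToken.EOL_Character, assumed to be CR as
   --  stated in the explanation above.
   EOL_Marker : constant Character := Ada.Characters.Latin_1.CR;

   --  Hypothetical scratch file name, used only by this sketch.
   File_Name : constant String := "get_line_demo.txt";

   File   : File_Type;
   Buffer : Unbounded_String;
begin
   --  Write two lines; Put_Line appends the OS-specific terminator.
   Create (File, Out_File, File_Name);
   Put_Line (File, "first");
   Put_Line (File, "second");
   Close (File);

   --  Read them back. Get_Line returns only the characters of each
   --  line, so no line terminator from the file reaches Buffer; the
   --  only line-break character in Buffer is the marker appended here.
   Open (File, In_File, File_Name);
   while not End_Of_File (File) loop
      Append (Buffer, Get_Line (File));
      Append (Buffer, EOL_Marker);
   end loop;
   Delete (File);  --  closes the file and removes it

   Put_Line ("Buffer length:" & Natural'Image (Length (Buffer)));
end Get_Line_Demo;
```

The reported length counts the text of each line plus one marker per
line, whatever terminator the operating system wrote to the file.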

So for a file created like this:

   Text1 : constant String := "/* A comment that starts here";
   Text2 : constant String := " and keeps going";
   Text3 : constant String := " and finally ends here *.*..";

   Create (File, Out_File, File_Name);
   Put_Line (File, Text1);
   Put_Line (File, Text2);
   Put_Line (File, Text3);
   Close (File);

the expected lexeme is:

   Expected_Lexeme : constant String :=
     Text1 & OpenToken.EOL_Character &
     Text2 & OpenToken.EOL_Character &
     Text3;

I've added a test that demonstrates this, and a comment to
opentoken-recognizer-bracketed_comment.ads to document it.

If the purpose of the lexer is to just recognize comments and skip
them, this is fine.

If the purpose of the lexer is to be able to later reconstruct the
code, the reconstruction routine will need a way to turn EOL_Character
back into OS-specific newlines; using Ada.Text_IO.Put_Line will do
that.

-- 
-- Stephe