From: Stephen Leake
Newsgroups: comp.lang.ada
Subject: Re: OpenToken: Parsing Ada (subset)?
Date: Wed, 17 Jun 2015 12:38:38 -0500
Message-ID: <85pp4u8cbl.fsf@stephe-leake.org>

"G.B." writes:

> On 16.06.15 15:24, Dmitry A. Kazakov wrote:
>>> It does not enforce all the lexical rules for numbers; it allows
>>> repeated, leading, and trailing underscores; it doesn't enforce pairs of
>>> '#'.
>> That is exactly the point. It does not parse literal right and you have to
>> reparse the matched chunk of text once again. What was the gain? Why
>> wouldn't do it right in single step?
>
> (I believe the use case here permits simplifications,
> meaning that REs are not being used for meticulous, final
> parsing of Ada.)

Yes, my examples are drawn from a parser for the indentation/navigation
engine in Emacs Ada mode, which is actually required to be looser about
Ada syntax rules than an Ada compiler.

Adding an "enforce strict rules" requirement could change the
lexer/parser design, especially if you place an emphasis on
good/helpful/useful error messages.

In addition, OpenToken (and FastToken) are intended for writing
grammar-based parsers quickly, not for getting the best possible
performance, or meeting other project-specific requirements. So the
design prioritizes minimizing the amount of new code that must be
written for a new language.

> But '_' seems missing from "[-+0-9a-fA-F.]+".

Oops; good catch. I would have found that in an ada-mode test, but I'm
not actually planning on using the Aflex lexer in Emacs.

> (And obsolete Ada syntax, i.e. substitutes for '#'. Which makes a CFG
> parser more desirable if '#' or ':' should have matching occurrences.
> ;-)

Emacs ada-mode explicitly ignores such things (at least until someone
asks for it).

--
-- Stephe
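
For illustration only, the loose number pattern quoted in this message can be sketched in Python with the missing '_' added. This is not the Aflex rule itself (the full rule is not shown in this message); it is just the quoted character class transcribed to Python's `re` syntax. The names `QUOTED_CLASS`, `LOOSE_NUMBER` and `is_loose_number` are hypothetical.

```python
import re

# Character class exactly as quoted in the thread (no underscore).
QUOTED_CLASS = re.compile(r"[-+0-9a-fA-F.]+")

# Same class with '_' added, per the "good catch" above.
# Assumption: this covers only the quoted fragment, not the complete
# lexer rule, and it deliberately stays loose about Ada's lexical rules.
LOOSE_NUMBER = re.compile(r"[-+0-9a-fA-F._]+")


def is_loose_number(text: str) -> bool:
    """Return True if the whole string matches the loose pattern."""
    return LOOSE_NUMBER.fullmatch(text) is not None


if __name__ == "__main__":
    # "__1__" is not a legal Ada literal; the loose pattern accepts it anyway.
    for sample in ["1_000_000", "1.5E-3", "__1__", "16#FF#"]:
        print(sample, is_loose_number(sample))
```

As in the discussion, the pattern accepts repeated, leading, and trailing underscores; a stricter consumer would have to re-check whatever text it matches.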