From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level: 
X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00
	autolearn=unavailable autolearn_force=no version=3.4.4
Path: 
 eternal-september.org!reader01.eternal-september.org!reader02.eternal-september.org!news.eternal-september.org!mx02.eternal-september.org!feeder.eternal-september.org!news.glorb.com!peer02.iad.highwinds-media.com!news.highwinds-media.com!feed-me.highwinds-media.com!post02.iad.highwinds-media.com!news.flashnewsgroups.com-b7.4zTQh5tI3A!not-for-mail
From: Stephen Leake <stephen_leake@stephe-leake.org>
Newsgroups: comp.lang.ada
Subject: Re: OpenToken: Parsing Ada (subset)?
References: <878uc3r2y6.fsf@adaheads.sparre-andersen.dk>
 	<85twupvjxo.fsf@stephe-leake.org>
 	<81ceb070-16fe-4578-a09a-eb11a2bbb664@googlegroups.com>
 	<162zj7c2l0ykp$.1rxias18vby83.dlg@40tude.net>
 	<856172bk80.fsf@stephe-leake.org>
 	<1ljiyuuchbxvp.wrtbilkw3rdb.dlg@40tude.net>
 	<85pp4vakmy.fsf@stephe-leake.org>
 	<1a08qrccls0bi$.16y7q3hosklae.dlg@40tude.net>
 	<mlpasc$r4m$1@dont-email.me>
Date: Wed, 17 Jun 2015 12:38:38 -0500
Message-ID: <85pp4u8cbl.fsf@stephe-leake.org>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.4 (windows-nt)
Cancel-Lock: sha1:ysPYorD9M3I7cm9i8NuZy3RMZzI=
MIME-Version: 1.0
Content-Type: text/plain
X-Complaints-To: abuse@flashnewsgroups.com
Organization: FlashNewsgroups.com
X-Trace: c372d5581b0a0e97f808402144
X-Received-Bytes: 2672
X-Received-Body-CRC: 766344139
Xref: news.eternal-september.org comp.lang.ada:26358
Date: 2015-06-17T12:38:38-05:00
List-Id: <comp.lang.ada>

"G.B." <bauhaus@futureapps.invalid> writes:

> On 16.06.15 15:24, Dmitry A. Kazakov wrote:
>>> It does not enforce all the lexical rules for numbers; it allows
>>> repeated, leading, and trailing underscores; it doesn't enforce pairs of
>>> '#'.
>> That is exactly the point. It does not parse literal right and you have to
>> reparse the matched chunk of text once again. What was the gain? Why
>> wouldn't do it right in single step?
>
> (I believe the use case here permits simplifications,
> meaning that REs are not being used for meticulous, final
> parsing of Ada.)

Yes, my examples are drawn from a parser for the indentation/navigation
engine in Emacs Ada mode, which is actually required to be looser about
Ada syntax rules than an Ada compiler.

Adding an "enforce strict rules" requirement could change the
lexer/parser design, especially if you place an emphasis on
good/helpful/useful error messages.
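
To illustrate what "strict" would mean for numbers, here is a rough
Python sketch of the numeric literal syntax from the Ada reference
manual (section 2.4). It is not code from OpenToken, FastToken, or
ada-mode; all names are made up for the example. It checks only the
shape of the literal: it does not check that the digits are legal for
the base, and it ignores the obsolescent ':' replacement for '#'.

```python
import re

# numeral ::= digit {[underline] digit}
# Single underscores only, never leading or trailing.
NUMERAL = r"[0-9](?:_?[0-9])*"

# exponent ::= E [+] numeral | E - numeral
EXPONENT = r"[eE][+-]?" + NUMERAL

# decimal_literal ::= numeral [. numeral] [exponent]
DECIMAL = NUMERAL + r"(?:\." + NUMERAL + r")?(?:" + EXPONENT + r")?"

# based_numeral ::= extended_digit {[underline] extended_digit}
BASED_NUMERAL = r"[0-9a-fA-F](?:_?[0-9a-fA-F])*"

# based_literal ::= base # based_numeral [. based_numeral] # [exponent]
# Both '#' are required, so an unpaired '#' is rejected.
BASED = (NUMERAL + "#" + BASED_NUMERAL
         + r"(?:\." + BASED_NUMERAL + r")?#(?:" + EXPONENT + r")?")

STRICT_LITERAL = re.compile("(?:" + BASED + "|" + DECIMAL + ")")


def is_strict_literal(text):
    """Return True if text has the shape of an Ada numeric literal."""
    return STRICT_LITERAL.fullmatch(text) is not None
```

Even this says nothing about error messages; reporting *why* "1__0" or
"16#FF" is rejected would need more than a single pattern.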

In addition, OpenToken (and FastToken) are intended for writing
grammar-based parsers quickly, not for getting the best possible
performance, or meeting other project-specific requirements. So the
design prioritizes minimizing the amount of new code that must be written
for a new language.

> But '_' seems missing from "[-+0-9a-fA-F.]+". 

Oops; good catch. I would have found that in an ada-mode test,
but I'm not actually planning on using the Aflex lexer in Emacs.
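
For the record, a small Python sketch of the effect of the missing
'_'. It uses only the character class quoted above, not the complete
Aflex rule, and the helper names are invented for the example.

```python
import re

# The character class as quoted, and the same class with '_' added.
WITHOUT_UNDERSCORE = re.compile(r"[-+0-9a-fA-F.]+")
WITH_UNDERSCORE = re.compile(r"[-+0-9a-fA-F._]+")


def first_token(pattern, text):
    """Return the text matched at the start of the input, or ''."""
    m = pattern.match(text)
    return m.group(0) if m else ""


def matches_whole(pattern, text):
    """Return True if the pattern consumes the entire text."""
    return pattern.fullmatch(text) is not None
```

Without '_', first_token stops after the "1" in "1_000_000"; with it,
the whole literal is one token. The corrected class is still loose: it
also accepts "_1__0_", which is fine for indentation but not for a
compiler.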

> (And obsolete Ada syntax, i.e. substitutes for '#'. Which makes a CFG
> parser more desirable if '#' or ':' should have matching occurrences.
> ;-)

Emacs ada-mode explicitly ignores such things (at least until someone
asks for them).

-- 
-- Stephe