From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level: 
X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00 autolearn=ham
	autolearn_force=no version=3.4.4
X-Google-Thread: 103376,13b4e394fcd91d4
X-Google-NewGroupId: yes
X-Google-Attributes: gida07f3367d7,domainid0,public,usenet
X-Google-Language: ENGLISH,ASCII-7-bit
Received: by 10.68.73.229 with SMTP id o5mr12419559pbv.7.1327971403534;
        Mon, 30 Jan 2012 16:56:43 -0800 (PST)
MIME-Version: 1.0
From: "Randy Brukardt" <randy@rrsoftware.com>
Newsgroups: comp.lang.ada
Subject: Re: OpenToken: Handling the empty word token
Date: Mon, 30 Jan 2012 18:56:36 -0600
Organization: Jacob Sparre Andersen Research & Innovation
Message-ID: <jg7e89$27p$1@munin.nbi.dk>
References: 
 <62121d9d-f208-4e78-a109-749742da14a6@h12g2000yqg.googlegroups.com>
 	<1jvlv7i0tn14u.b5d2cwsqhl2h$.dlg@40tude.net> 	<jfvqqu$83e$1@munin.nbi.dk>
 <82ehuibdwt.fsf@stephe-leake.org>
NNTP-Posting-Host: static-69-95-181-76.mad.choiceone.net
X-Trace: munin.nbi.dk 1327971401 2297 69.95.181.76 (31 Jan 2012 00:56:41 GMT)
X-Complaints-To: news@jacob-sparre.dk
NNTP-Posting-Date: Tue, 31 Jan 2012 00:56:41 +0000 (UTC)
X-Priority: 3
X-MSMail-Priority: Normal
X-Newsreader: Microsoft Outlook Express 6.00.2900.5931
X-RFC2646: Format=Flowed; Original
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.6157
Path: 
 lh20ni242994pbb.0!nntp.google.com!news1.google.com!news4.google.com!proxad.net!feeder1-2.proxad.net!feed.ac-versailles.fr!news.ecp.fr!news.jacob-sparre.dk!pnx.dk!jacob-sparre.dk!ada-dk.org!.POSTED!not-for-mail
Date: 2012-01-30T18:56:36-06:00
List-Id: <comp.lang.ada>

"Stephen Leake" <stephen_leake@stephe-leake.org> wrote in message 
news:82ehuibdwt.fsf@stephe-leake.org...
> "Randy Brukardt" <randy@rrsoftware.com> writes:
...
>> I'd be surprised if OpenToken didn't have something similar;
>
> Not quite. Because OpenToken uses Ada types to build the grammar, we
> need an explicit Epsilon token (full code below):

I see. It appears that OpenToken uses an LL-derived parsing scheme. (It 
appears to be a version of recursive descent, which is an LL scheme.] And 
it's important to note that LL parsing cannot include epsilon tokens and 
have problems with left-recursion. LR-derived parsing does not have either 
of these limitations. And note that there is no real workaround to these 
limitations (short of rewriting your grammar); they're inherent in the 
parsing technique.

[Note: I learned all about parsing back in the early 1980's, so (1) I may be 
remembering something wrong, or (2) the state of the art may have changed 
some.]

I would suggest that the OP use a LR-based grammar generator. That won't be 
quite so "pretty" as using "Ada types", but it will work well with almost 
any original grammar that you can throw at it. (I believe that modern parser 
generators use LR(1) tables rather than the LALR(1) tables used in the past; 
LR(1) tables are a lot bigger to generate and we simply didn't have enough 
memory to use that back in the day.)

Hacking an LL-based scheme (such as OpenToken's) to work is pretty much 
doomed to failure. You might get it to work for limited circumstances, but I 
don't think it would work in any general sort of way.

                                         Randy.