From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level: 
X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00 autolearn=ham
	autolearn_force=no version=3.4.4
X-Google-Thread: 103376,13b4e394fcd91d4
X-Google-NewGroupId: yes
X-Google-Attributes: gida07f3367d7,domainid0,public,usenet
X-Google-Language: ENGLISH,ASCII-7-bit
Received: by 10.68.213.68 with SMTP id nq4mr13759096pbc.2.1328012212047;
        Tue, 31 Jan 2012 04:16:52 -0800 (PST)
Path: 
 lh20ni244729pbb.0!nntp.google.com!news1.google.com!npeer03.iad.highwinds-media.com!news.highwinds-media.com!feed-me.highwinds-media.com!post02.iad.highwinds-media.com!news.flashnewsgroups.com-b7.4zTQh5tI3A!not-for-mail
From: Stephen Leake <stephen_leake@stephe-leake.org>
Newsgroups: comp.lang.ada
Subject: Re: OpenToken: Handling the empty word token
References: 
 <62121d9d-f208-4e78-a109-749742da14a6@h12g2000yqg.googlegroups.com>
 	<1jvlv7i0tn14u.b5d2cwsqhl2h$.dlg@40tude.net>
 	<jfvqqu$83e$1@munin.nbi.dk> <82ehuibdwt.fsf@stephe-leake.org>
 	<jg7e89$27p$1@munin.nbi.dk>
Date: Tue, 31 Jan 2012 07:16:46 -0500
Message-ID: <82obtkgj81.fsf@stephe-leake.org>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.2 (windows-nt)
Cancel-Lock: sha1:yv4qQ6iZh0u9i0Cw8/66Oh+s3H4=
MIME-Version: 1.0
X-Complaints-To: abuse@flashnewsgroups.com
Organization: FlashNewsgroups.com
X-Trace: d52b74f27dbb3e029e66109126
Content-Type: text/plain; charset=us-ascii
Date: 2012-01-31T07:16:46-05:00
List-Id: <comp.lang.ada>

"Randy Brukardt" <randy@rrsoftware.com> writes:

> "Stephen Leake" <stephen_leake@stephe-leake.org> wrote in message 
> news:82ehuibdwt.fsf@stephe-leake.org...
>> "Randy Brukardt" <randy@rrsoftware.com> writes:
> ...
>>> I'd be surprised if OpenToken didn't have something similar;
>>
>> Not quite. Because OpenToken uses Ada types to build the grammar, we
>> need an explicit Epsilon token (full code below):
>
> I see. It appears that OpenToken uses an LL-derived parsing scheme. (It 
> appears to be a version of recursive descent, which is an LL scheme.] And 
> it's important to note that LL parsing cannot include epsilon tokens and 
> have problems with left-recursion. LR-derived parsing does not have either 
> of these limitations. And note that there is no real workaround to these 
> limitations (short of rewriting your grammar); they're inherent in the 
> parsing technique.

I'm also on somewhat shakey ground on parser theory, but I did read (and
understand :) "the red dragon book"
(http://en.wikipedia.org/wiki/Principles_of_Compiler_Design), when I
took over OpenToken and fixed some major bugs.

The parser is an LALR parser.

But I think the implementation is not complete. There are no examples of
grammars containing epsilon in the OpenToken tests. I don't remember
what the dragon book has to say about epsilon handling.

So the best approach here is to go back and read the theory about how to
handle epsilon, add some OpenToken tests that use epsilon, and see what
needs to be done.

More than I have the energy for at the moment, but I'm happy to provide
insight into the current code.

-- 
-- Stephe