From mboxrd@z Thu Jan 1 00:00:00 1970 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00 autolearn=ham autolearn_force=no version=3.4.4 X-Google-Thread: 103376,13b4e394fcd91d4 X-Google-NewGroupId: yes X-Google-Attributes: gida07f3367d7,domainid0,public,usenet X-Google-Language: ENGLISH,ASCII-7-bit Received: by 10.68.213.68 with SMTP id nq4mr13759096pbc.2.1328012212047; Tue, 31 Jan 2012 04:16:52 -0800 (PST) Path: lh20ni244729pbb.0!nntp.google.com!news1.google.com!npeer03.iad.highwinds-media.com!news.highwinds-media.com!feed-me.highwinds-media.com!post02.iad.highwinds-media.com!news.flashnewsgroups.com-b7.4zTQh5tI3A!not-for-mail From: Stephen Leake Newsgroups: comp.lang.ada Subject: Re: OpenToken: Handling the empty word token References: <62121d9d-f208-4e78-a109-749742da14a6@h12g2000yqg.googlegroups.com> <1jvlv7i0tn14u.b5d2cwsqhl2h$.dlg@40tude.net> <82ehuibdwt.fsf@stephe-leake.org> Date: Tue, 31 Jan 2012 07:16:46 -0500 Message-ID: <82obtkgj81.fsf@stephe-leake.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.2 (windows-nt) Cancel-Lock: sha1:yv4qQ6iZh0u9i0Cw8/66Oh+s3H4= MIME-Version: 1.0 X-Complaints-To: abuse@flashnewsgroups.com Organization: FlashNewsgroups.com X-Trace: d52b74f27dbb3e029e66109126 Content-Type: text/plain; charset=us-ascii Date: 2012-01-31T07:16:46-05:00 List-Id: "Randy Brukardt" writes: > "Stephen Leake" wrote in message > news:82ehuibdwt.fsf@stephe-leake.org... >> "Randy Brukardt" writes: > ... >>> I'd be surprised if OpenToken didn't have something similar; >> >> Not quite. Because OpenToken uses Ada types to build the grammar, we >> need an explicit Epsilon token (full code below): > > I see. It appears that OpenToken uses an LL-derived parsing scheme. (It > appears to be a version of recursive descent, which is an LL scheme.] And > it's important to note that LL parsing cannot include epsilon tokens and > have problems with left-recursion. LR-derived parsing does not have either > of these limitations. And note that there is no real workaround to these > limitations (short of rewriting your grammar); they're inherent in the > parsing technique. I'm also on somewhat shakey ground on parser theory, but I did read (and understand :) "the red dragon book" (http://en.wikipedia.org/wiki/Principles_of_Compiler_Design), when I took over OpenToken and fixed some major bugs. The parser is an LALR parser. But I think the implementation is not complete. There are no examples of grammars containing epsilon in the OpenToken tests. I don't remember what the dragon book has to say about epsilon handling. So the best approach here is to go back and read the theory about how to handle epsilon, add some OpenToken tests that use epsilon, and see what needs to be done. More than I have the energy for at the moment, but I'm happy to provide insight into the current code. -- -- Stephe