From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level: 
X-Spam-Status: No, score=0.4 required=5.0 tests=BAYES_00,FORGED_MUA_MOZILLA
	autolearn=no autolearn_force=no version=3.4.4
X-Google-Thread: 103376,4cb1f8d1c17d39a8
X-Google-NewGroupId: yes
X-Google-Attributes: gida07f3367d7,domainid0,public,usenet
X-Google-Language: ENGLISH,ASCII-7-bit
Received: by 10.43.48.202 with SMTP id ux10mr20413019icb.6.1320419052847;
        Fri, 04 Nov 2011 08:04:12 -0700 (PDT)
Path: 
 p6ni70572pbn.0!nntp.google.com!news1.google.com!goblin1!goblin.stu.neva.ru!newsfeed.straub-nv.de!uucp.gnuu.de!newsfeed.arcor.de!newsspool2.arcor-online.net!news.arcor.de.POSTED!not-for-mail
Date: Fri, 04 Nov 2011 16:03:28 +0100
From: Georg Bauhaus <rm.dash-bauhaus@futureapps.de>
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6;
 rv:7.0.1) Gecko/20110929 Thunderbird/7.0.1
MIME-Version: 1.0
Newsgroups: comp.lang.ada
Subject: Re: Ada 'hello world' for Android; success!
References: <8239efcjuw.fsf@stephe-leake.org>
 	<98ca5430-aa52-4e39-b789-70d0dd6adb46@d33g2000prb.googlegroups.com>
 	<824nyrq5p6.fsf@stephe-leake.org>
 	<4eac1ca1$0$7625$9b4e6d93@newsspool1.arcor-online.net>
 	<82mxciogt0.fsf@stephe-leake.org>
 	<4eafbc25$0$6575$9b4e6d93@newsspool3.arcor-online.net>
 	<82mxcemscg.fsf@stephe-leake.org>
 	<4eb1e241$0$7627$9b4e6d93@newsspool1.arcor-online.net>
 	<8239e5mo9q.fsf@stephe-leake.org>
 	<4eb31240$0$6549$9b4e6d93@newsspool4.arcor-online.net>
 <82ty6kkrnc.fsf@stephe-leake.org>
In-Reply-To: <82ty6kkrnc.fsf@stephe-leake.org>
Message-ID: <4eb3fec1$0$6558$9b4e6d93@newsspool4.arcor-online.net>
Organization: Arcor
NNTP-Posting-Date: 04 Nov 2011 16:03:29 CET
NNTP-Posting-Host: 4bdd5c1f.newsspool4.arcor-online.net
X-Trace: 
 DXC=mGF4X32Z?VC=8m7nZkdN^@4IUK<Cl32<A4Fo<]lROoRA8kF<OcfhCOKPH:a3g3^SkAnc\616M64>JLh>_cHTX3jMI6SD2Q4dN:N
X-Complaints-To: usenet-abuse@arcor.de
Xref: news1.google.com comp.lang.ada:18829
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Date: 2011-11-04T16:03:29+01:00
List-Id: <comp.lang.ada>

On 04.11.11 13:18, Stephen Leake wrote:

> In current OpenToken, we have 'procedure Parse' instead of 'procedure
> Produce_Subprograms'; the rest is the same. I gather
> 'Produce_Subprograms' generates Ada code, that must then be compiled.

Yes.

> I'm not clear what the benefit of this is over the current OpenToken
> approach.

Meant only to be a possible alternative, its implementation is trivial,
and it automatically reuses mechanisms such as a stack that
a traditional computer features as a built-in. :-)

The given method is exhaustive, if that's needed, it's there.
If there are no loops in the grammar itself (such as left recursion),
then the length of input determines the number of recursive
calls for this method, too, so use of stack space has a known
maximum.

>From a programmer's perspective, the method shown will pass tokens
to handlers. Writers of SAX style content handlers will recognize
the scheme and find it familiar, I think.

Following are some ideas that I would find fairly easy to add
to the resulting parsers, to make them more useful, more efficient,
or even more powerful.
Are they equally easy to add to, for example, bottom up parsers?

1. A simple extension, using functions as subprograms, allows more
control: Start another rule (call another subprogram) only if the
previous one (a function) has returned False. IOW, parsing does
not need to be exhaustive.  (A variation can use non-local
transfer of control, i.e. exception handlers added to subprograms
where desired.)
<hack>Let r be the return value, then f(r) is the next rule.
Assumption: this choice is not to be determined by the grammer
alone if modeling it accordingly would be awkward.</hack>

2. Or keep some state information along the way that can be
queried for (pseudo-)context, making the grammar less static,
and react to input:  Only if rule R (m) has hit token T (j)
before, assume that the parser should now choose rule R (k).
The conditionals would be added to the generated code,
thus specializing the parsing process. OK, a.k.a hacking.
But see "Fence", next.

3. Or introduce pseudo-symbols such as "Fence" that have no
meaning in the language but will have an effect on backtracking:
backwards motion stops at a "Fence".
  (Another possible pseudo-symbol is "User-Input".)

4. One can use (or misuse?) the mechanism to perform lexical
scanning. A first run produces tokens from a sub-grammar describing
the language in terms of lexemes only.

The basic mechanism never changes, and the number of topics
from which the parsers' logic can be understood is, I think,
pleasantly small.