From mboxrd@z Thu Jan 1 00:00:00 1970 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me X-Spam-Level: X-Spam-Status: No, score=0.4 required=5.0 tests=BAYES_00,FORGED_MUA_MOZILLA autolearn=no autolearn_force=no version=3.4.4 X-Google-Thread: 103376,4cb1f8d1c17d39a8 X-Google-NewGroupId: yes X-Google-Attributes: gida07f3367d7,domainid0,public,usenet X-Google-Language: ENGLISH,ASCII-7-bit Received: by 10.43.48.202 with SMTP id ux10mr20413019icb.6.1320419052847; Fri, 04 Nov 2011 08:04:12 -0700 (PDT) Path: p6ni70572pbn.0!nntp.google.com!news1.google.com!goblin1!goblin.stu.neva.ru!newsfeed.straub-nv.de!uucp.gnuu.de!newsfeed.arcor.de!newsspool2.arcor-online.net!news.arcor.de.POSTED!not-for-mail Date: Fri, 04 Nov 2011 16:03:28 +0100 From: Georg Bauhaus User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:7.0.1) Gecko/20110929 Thunderbird/7.0.1 MIME-Version: 1.0 Newsgroups: comp.lang.ada Subject: Re: Ada 'hello world' for Android; success! References: <8239efcjuw.fsf@stephe-leake.org> <98ca5430-aa52-4e39-b789-70d0dd6adb46@d33g2000prb.googlegroups.com> <824nyrq5p6.fsf@stephe-leake.org> <4eac1ca1$0$7625$9b4e6d93@newsspool1.arcor-online.net> <82mxciogt0.fsf@stephe-leake.org> <4eafbc25$0$6575$9b4e6d93@newsspool3.arcor-online.net> <82mxcemscg.fsf@stephe-leake.org> <4eb1e241$0$7627$9b4e6d93@newsspool1.arcor-online.net> <8239e5mo9q.fsf@stephe-leake.org> <4eb31240$0$6549$9b4e6d93@newsspool4.arcor-online.net> <82ty6kkrnc.fsf@stephe-leake.org> In-Reply-To: <82ty6kkrnc.fsf@stephe-leake.org> Message-ID: <4eb3fec1$0$6558$9b4e6d93@newsspool4.arcor-online.net> Organization: Arcor NNTP-Posting-Date: 04 Nov 2011 16:03:29 CET NNTP-Posting-Host: 4bdd5c1f.newsspool4.arcor-online.net X-Trace: DXC=mGF4X32Z?VC=8m7nZkdN^@4IUKJLh>_cHTX3jMI6SD2Q4dN:N X-Complaints-To: usenet-abuse@arcor.de Xref: news1.google.com comp.lang.ada:18829 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Date: 2011-11-04T16:03:29+01:00 List-Id: On 04.11.11 13:18, Stephen Leake wrote: > In current OpenToken, we have 'procedure Parse' instead of 'procedure > Produce_Subprograms'; the rest is the same. I gather > 'Produce_Subprograms' generates Ada code, that must then be compiled. Yes. > I'm not clear what the benefit of this is over the current OpenToken > approach. Meant only to be a possible alternative, its implementation is trivial, and it automatically reuses mechanisms such as a stack that a traditional computer features as a built-in. :-) The given method is exhaustive, if that's needed, it's there. If there are no loops in the grammar itself (such as left recursion), then the length of input determines the number of recursive calls for this method, too, so use of stack space has a known maximum. >From a programmer's perspective, the method shown will pass tokens to handlers. Writers of SAX style content handlers will recognize the scheme and find it familiar, I think. Following are some ideas that I would find fairly easy to add to the resulting parsers, to make them more useful, more efficient, or even more powerful. Are they equally easy to add to, for example, bottom up parsers? 1. A simple extension, using functions as subprograms, allows more control: Start another rule (call another subprogram) only if the previous one (a function) has returned False. IOW, parsing does not need to be exhaustive. (A variation can use non-local transfer of control, i.e. exception handlers added to subprograms where desired.) Let r be the return value, then f(r) is the next rule. Assumption: this choice is not to be determined by the grammer alone if modeling it accordingly would be awkward. 2. Or keep some state information along the way that can be queried for (pseudo-)context, making the grammar less static, and react to input: Only if rule R (m) has hit token T (j) before, assume that the parser should now choose rule R (k). The conditionals would be added to the generated code, thus specializing the parsing process. OK, a.k.a hacking. But see "Fence", next. 3. Or introduce pseudo-symbols such as "Fence" that have no meaning in the language but will have an effect on backtracking: backwards motion stops at a "Fence". (Another possible pseudo-symbol is "User-Input".) 4. One can use (or misuse?) the mechanism to perform lexical scanning. A first run produces tokens from a sub-grammar describing the language in terms of lexemes only. The basic mechanism never changes, and the number of topics from which the parsers' logic can be understood is, I think, pleasantly small.