comp.lang.ada
From: Stephen Leake <stephen_leake@stephe-leake.org>
Subject: Re: disambiguating 'begin'
Date: Thu, 04 Oct 2012 04:23:35 -0400
Message-ID: <858vbmsl14.fsf@stephe-leake.org>
In-Reply-To: 5b0a709d-1abc-4b86-a9fe-320c228c1d18@googlegroups.com

gautier_niouzes@hotmail.com writes:

> Le mardi 2 octobre 2012 12:48:41 UTC+2, Stephen Leake a écrit :
>
>> The problem is that "begin" is used in two ways: as the _start_ of a
>> block, and as the _divider_ between declarations and statements in a
>> block
>
> "begin" is always the start of a block's statements, and sometimes the
> start of the block itself.

Yes, and that "sometimes" is the problem.

> Is there kind of a grammar
> with SMIE ?

Yes, it's BNF, but a "very dumb" kind of BNF. The core parser only
knows about operator precedence; all the other information in the BNF
is discarded.

So the grammar fragment for block statements looks like this:

(identifier ":" "declare-label" declarations "begin-divide" statements "end-other")
("declare-open" declarations "begin-divide" statements "end-other")
("begin-open" statements "end-other")

Note that there are two variants of "declare", and two of "begin" (and
the rest of the grammar has other variants of "end"). That's because
each variant must have different precedence for this to work properly.

That means the lexer must distinguish between the variants. For
"declare", that's not hard; look for a preceding ":" token.

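As a rough illustration of that one-token look-back (a Python sketch,
not the actual Emacs Lisp lexer; the function name and the
pre-tokenized list are assumptions made for the example):

```python
def classify_declare(tokens, i):
    """Return the refined token name for the "declare" at index i.

    tokens is a hypothetical, already-split list of Ada tokens.
    """
    if tokens[i] != "declare":
        raise ValueError("token at index i is not 'declare'")
    # A labeled block looks like:  Label : declare ...
    # so the token just before "declare" is ":".
    if i > 0 and tokens[i - 1] == ":":
        return "declare-label"
    return "declare-open"
```
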
However, for "begin", there is no simple way to distinguish between
them; you have to scan all the way back to the start of the file.

I figured out a way to deal with this, though. I can deliberately start
a forward parse at the beginning of the file. Then, when the parser gets
to a "begin", I can examine the parser stack: it will have either a
keyword that must precede "begin-divide", or something else. That tells
me which variant it is.
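
To make the idea concrete, here is a much-simplified Python sketch of
"scan forward, look at the stack when you reach a begin". It is not
the SMIE parser and not the real code: the token set is a toy one, it
assumes "is" only appears after a subprogram spec, and it ignores
"end if", "end loop", records, and so on. All names are hypothetical.

```python
# Toy assumption: these keywords open a declarative part, so a "begin"
# found directly inside one of them is the divider.
DECLARATIVE_OPENERS = {"is", "declare"}

def classify_begins(tokens):
    """Map the index of each "begin" to "begin-divide" or "begin-open"."""
    stack = []    # currently open constructs, innermost last
    result = {}
    for i, tok in enumerate(tokens):
        if tok in DECLARATIVE_OPENERS:
            stack.append(tok)
        elif tok == "begin":
            if stack and stack[-1] in DECLARATIVE_OPENERS:
                # Declarations precede this "begin": it is the divider.
                result[i] = "begin-divide"
                stack[-1] = "begin"   # now in the statement part
            else:
                # Nothing declarative is open: it starts a block.
                result[i] = "begin-open"
                stack.append("begin")
        elif tok == "end":
            if stack:
                stack.pop()           # close the innermost construct
    return result
```

For the token stream of

    procedure P is X : Integer ; begin begin null ; end ;
    declare Y : Integer ; begin null ; end ; end P ;

this labels the three "begin" tokens, in order, as begin-divide,
begin-open, begin-divide.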

Of course, doing that full file scan every time you hit a "begin" is
painfully slow (I implemented it that way at first, just to see). So I
added a caching mechanism: once I've classified a "begin", the result
is remembered until text before it in the buffer is edited.
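
The shape of such a cache, again as a hedged Python sketch rather
than the real implementation (the class and method names are made up
for the example, and positions are plain integer buffer offsets):

```python
class BeginCache:
    """Remember the variant of each classified "begin" by position."""

    def __init__(self):
        self._known = {}   # buffer position -> "begin-divide" / "begin-open"

    def lookup(self, pos):
        """Return the cached variant at pos, or None if not cached."""
        return self._known.get(pos)

    def remember(self, pos, variant):
        self._known[pos] = variant

    def invalidate_from(self, edit_pos):
        """Forget every "begin" at or after edit_pos.

        An edit can change how anything after it parses, so only
        entries strictly before the edit stay valid.
        """
        self._known = {p: v for p, v in self._known.items() if p < edit_pos}
```

A lookup that returns None means the forward parse has to be run
again for that "begin"; entries before the edit are reused as-is.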

-- 
-- Stephe


