From: Stephen Leake
Newsgroups: comp.lang.ada
Subject: disambiguating 'begin'
Date: Tue, 02 Oct 2012 06:48:38 -0400
Message-ID: <85obkl2lq1.fsf@stephe-leake.org>

I've hit a major snag in the new Emacs Ada mode indentation engine. I'm
posting here in hopes of sympathy and good ideas :).

I'm using Emacs SMIE (Simple Minded Indentation Engine), which provides
facilities for implementing an operator precedence grammar
(http://en.wikipedia.org/wiki/Operator-precedence_parser - gotta love
Wikipedia! There is a more thorough description at
http://dickgrune.com/Books/PTAPG_2nd_Edition/, or in the dragon book,
section 4.6).

The main rationale for this kind of parser is that it works equally well
backwards as forwards, as long as tokens are unique, or can be made so
by only looking at local text. That's useful for an indentation engine;
you can figure out the indentation by looking back in the text a short
way.

However, it turns out "begin" in Ada cannot be made unique in this
sense!

The problem is that "begin" is used in two ways: as the _start_ of a
block, and as the _divider_ between declarations and statements in a
block:

   function F1 is
   begin -- divider
      begin -- block start
      end;
   end;

In the operator precedence grammar, these two uses of begin must be
unique; they must be given separate keyword names.

However, as far as I can see, the only way to figure out which role
"begin" is playing is to parse from the start of the compilation unit.

Consider a package body:

   package body Pack_1 is
      function F1 is
      begin -- divider
         begin -- block start
         end;
         begin -- block start
         end;
      end;
      begin -- divider
      end;

Here I've deliberately got the indentation wrong at the end, to
emphasize the ambiguity (that's how my latest indentation code indents
this :( ).

If we just look back a few keywords from each "begin", we can't tell
which role it is playing. In particular, the package "begin" just looks
like it follows a bunch of statements/declarations (SMIE can't tell the
difference between a statement and a declaration). We must go all the
way back to "package".

When all the tokens are properly disambiguated, SMIE can traverse
correctly from package "begin" to "package". But we can't do that while
disambiguating "begin"; that's circular (been there, done that :).

The current Emacs Ada mode does this in a totally ad-hoc way, and I'm
pretty sure that introducing if-expressions will break it (they do break
something in the current indentation engine).

I believe the indentation engine in GPS always starts at the beginning
of the edit buffer, and scans forward to the current editing point,
keeping track of things. (I found the scanner code in the GPS source,
but not the code that calls it, so I'm not certain what string is passed
in).

Emacs has a facility for doing that (semantic), so I can give that a
try.
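
To make the "scan forward from the start, keeping track of things" idea
concrete, here is a minimal sketch in Python. It is not the GPS scanner,
not SMIE, and not semantic; the function name classify_begins, the
"decl"/"stmt" stack labels, and the toy lexer are all made up for
illustration. It assumes a small Ada subset (bodies, declare blocks,
if/loop/case) and ignores things like accept statements, character
literals and generic formal subprograms.

```python
import re

# Toy lexer (an assumption, not a real Ada lexer): comments, string
# literals, identifiers/keywords, and the punctuation ( ) ;
TOKEN = re.compile(r'--[^\n]*|"(?:[^"\n]|"")*"|[A-Za-z]\w*|[();]')

# "end if", "end loop", ... close constructs this sketch never pushes.
NON_BLOCK_END = {"if", "loop", "case", "select", "record", "return"}
# "is new", "is abstract", ... do not open a declarative part.
NO_BODY_AFTER_IS = {"new", "abstract", "separate", "null", "("}
# Keywords after which an "is" may open a declarative part.
UNIT_STARTERS = {"package", "function", "procedure", "task", "protected", "entry"}


def classify_begins(source):
    """Return (line, role) for each "begin"; role is "divider" or "block-start"."""
    tokens = []
    for m in TOKEN.finditer(source):
        text = m.group().lower()
        if text.startswith("--") or text.startswith('"'):
            continue  # comments and strings carry no block structure
        tokens.append((text, source.count("\n", 0, m.start()) + 1))

    stack = []       # "decl" = inside declarations, "stmt" = inside statements
    pending = False  # saw a unit starter and are waiting for its "is"
    depth = 0        # paren nesting, so ";" inside parameter lists is ignored
    roles = []
    i = 0
    while i < len(tokens):
        tok, line = tokens[i]
        nxt = tokens[i + 1][0] if i + 1 < len(tokens) else ""
        if tok == "(":
            depth += 1
        elif tok == ")":
            depth -= 1
        elif tok == ";" and depth == 0:
            pending = False  # it was only a spec, e.g. "procedure P;"
        elif tok in UNIT_STARTERS:
            pending = True
        elif tok == "is" and pending:
            pending = False
            if nxt not in NO_BODY_AFTER_IS:
                stack.append("decl")
        elif tok == "declare":
            stack.append("decl")
        elif tok == "begin":
            if stack and stack[-1] == "decl":
                stack[-1] = "stmt"  # declarations end, statements start
                roles.append((line, "divider"))
            else:
                stack.append("stmt")  # a block with no declare part
                roles.append((line, "block-start"))
        elif tok == "end":
            if nxt in NON_BLOCK_END:
                i += 1  # skip the keyword following "end"
            elif stack:
                stack.pop()
        i += 1
    return roles


if __name__ == "__main__":
    example = """\
package body Pack_1 is
   function F1 is
   begin -- divider
      begin -- block start
      end;
      begin -- block start
      end;
   end;
begin -- divider
end;
"""
    for line, role in classify_begins(example):
        print(line, role)
```

Run on the Pack_1 example, this reports the "begin"s on lines 3 and 9 as
dividers and those on lines 4 and 6 as block starts. The point of the
sketch is only that the role falls out of the stack state, which is
known when scanning forward from "package" but not when looking back a
few tokens from the "begin" itself.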

But switching to semantic is a lot more work, it's apparently not
intended to be used this way (it typically runs in the background), and
I was making such good progress with SMIE!

Any ideas?

(I did briefly consider requesting the ARG to make these two uses
separate keywords; I'm desperate :).

-- 
-- Stephe