comp.lang.ada
* disambiguating 'begin'
From: Stephen Leake @ 2012-10-02 10:48 UTC (permalink / raw)


I've hit a major snag in the new Emacs Ada mode indentation engine. I'm
posting here in hopes of sympathy and good ideas :).

I'm using Emacs SMIE (Simple Minded Indentation Engine), which provides
facilities for implementing an operator precedence grammar
(http://en.wikipedia.org/wiki/Operator-precedence_parser - gotta love
Wikipedia! There is a more thorough description at
http://dickgrune.com/Books/PTAPG_2nd_Edition/, or in the dragon book,
section 4.6).

The main rationale for this kind of parser is that it works equally well
backwards as forwards, as long as tokens are unique, or can be made so
by only looking at local text. That's useful for an indentation engine;
you can figure out the indentation by looking back in the text a short
way.
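
To illustrate the "looking back a short way" idea, here is a minimal
sketch in Python (not Emacs Lisp, and not SMIE's actual algorithm; it
only counts nesting rather than using precedence tables). The function
name and the token names "end-if" and "end-loop" are hypothetical. The
point is that when every closing token names its opener uniquely, a
backward scan finds the enclosing construct without reading the rest of
the buffer.

```python
def matching_opener(tokens, close_index, pairs):
    """Scan backward from tokens[close_index] to its matching opener.

    pairs maps each closing token to its opening token.
    Returns the opener's index, or None if the tokens are unbalanced.
    """
    openers = set(pairs.values())
    expected = []  # openers we still need to find, innermost last
    for i in range(close_index, -1, -1):
        tok = tokens[i]
        if tok in pairs:
            # A closer: remember which opener must match it.
            expected.append(pairs[tok])
        elif tok in openers:
            # An opener: it must match the innermost pending closer.
            if not expected or expected.pop() != tok:
                return None
            if not expected:
                return i
    return None


# Hypothetical, already-unique token names.
pairs = {"end-if": "if", "end-loop": "loop"}
tokens = ["if", "then", "loop", "end-loop", "end-if"]
print(matching_opener(tokens, 4, pairs))  # 0
```

This only works because "end-if" can never close a "loop". A token that
plays two structural roles under one name breaks that assumption.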

However, it turns out "begin" in Ada cannot be made unique in this
sense!

The problem is that "begin" is used in two ways: as the _start_ of a
block, and as the _divider_ between declarations and statements in a
block:

function F1 is
  <declarations>
begin -- divider
  <statements>
  begin -- block start
    <statements>
  end;
end;

In the operator precedence grammar, these two uses of begin must be
unique; they must be given separate keyword names.

However, as far as I can see, the only way to figure out which role
"begin" is playing is to parse from the start of the compilation unit.
Consider a package body:

package body Pack_1 is

  <declarations>

  function F1 is
    <declarations>
  begin -- divider
    <statements>
    begin -- block start
      <statements>
    end;
    <statements>
    begin -- block start
      <statements>
    end;
  end;

  begin -- divider
    <statements>
  end;

Here I've deliberately got the indentation wrong at the end, to
emphasize the ambiguity (that's how my latest indentation code indents
this :( ).

If we just look back a few keywords from each "begin", we can't tell
which role it is playing. In particular, the package "begin" just looks
like it follows a bunch of statements/declarations (SMIE can't tell the
difference between a statement and a declaration). We must go all the
way back to "package".
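
For comparison, a forward scan from the start of the compilation unit
can classify each "begin" with a small stack. The following is a rough
sketch in Python under loud assumptions: it only understands the
keywords used in the examples above ("is", "begin", "end"), it treats
every "is" as opening a declarative region, and it would be wrong for
"type ... is", "case ... is", "end if", "end loop", "declare" and so on.
The function name and role labels are made up for illustration.

```python
import re


def classify_begins(source):
    """Label each 'begin' as 'divider' or 'block-start', scanning forward."""
    # Drop Ada comments so words inside them are not taken for keywords.
    text = re.sub(r"--[^\n]*", "", source)
    stack = []  # one entry per open construct: "decl" or "stmts"
    roles = []
    for word in re.findall(r"[A-Za-z_][A-Za-z0-9_]*", text.lower()):
        if word == "is":
            # Assumption: every "is" opens a declarative region.
            stack.append("decl")
        elif word == "begin":
            if stack and stack[-1] == "decl":
                # Ends the pending declarative region: a divider.
                stack[-1] = "stmts"
                roles.append("divider")
            else:
                # No declarations pending: a nested block starts here.
                stack.append("stmts")
                roles.append("block-start")
        elif word == "end":
            if stack:
                stack.pop()
    return roles


sample = """
package body Pack_1 is
  <declarations>
  function F1 is
    <declarations>
  begin -- divider
    <statements>
    begin -- block start
      <statements>
    end;
    <statements>
    begin -- block start
      <statements>
    end;
  end;
begin -- divider
  <statements>
end;
"""
print(classify_begins(sample))
# ['divider', 'block-start', 'block-start', 'divider']
```

The classification of the last "begin" depends on the stack entry pushed
by the "is" on the first line, which is exactly the long-range
dependency that a short backward look cannot see.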

When all the tokens are properly disambiguated, SMIE can traverse
correctly from package "begin" to "package". But we can't do that while
disambiguating "begin"; that's circular (been there, done that :).

The current Emacs Ada mode does this in a totally ad-hoc way, and I'm
pretty sure that introducing if-expressions will break it (they do break
something in the current indentation engine).

I believe the indentation engine in GPS always starts at the beginning
of the edit buffer, and scans forward to the current editing point,
keeping track of things. (I found the scanner code in the GPS source,
but not the code that calls it, so I'm not certain what string is passed
in).

Emacs has a facility for doing that (Semantic), so I can give that a
try. But it's a lot more work, it's apparently not intended to be used
this way (it typically runs in the background), and I was making such
good progress with SMIE!

Any ideas?

(I did briefly consider requesting the ARG to make these two uses
separate keywords; I'm desperate :).

--
-- Stephe


