comp.lang.ada
From: Stephen Leake <stephen_leake@stephe-leake.org>
Subject: LALR parser question
Date: Sun, 28 Apr 2013 08:37:33 -0500
Message-ID: <85sj2aydwi.fsf@stephe-leake.org>

As part of Emacs Ada mode 5.0, I'm building a generalized LALR grammar
for Ada (see
http://stephe-leake.org/emacs/ada-mode/emacs-ada-mode.html#ada-mode-5.0 ).

I'm having trouble with empty declarations. For example (using Bison
syntax for the grammar), a simplified subset of the Ada package_body syntax is:

package_body
  : IS declaration_list BEGIN SEMICOLON
  ;

declaration_list
  : declaration
  | declaration_list declaration
  ;

declaration
  : object_declaration
  | subprogram_declaration
  ;; ...
  ;

This does not allow an empty declaration_list. But Ada does, so the
question is how to add that to the grammar.

There are three choices:

1) Add an empty declaration choice to declaration_list:

declaration_list
  : ;; empty list
  | declaration
  | declaration_list declaration
  ;

The second choice is now redundant: since declaration_list can be empty,
a single declaration is already covered by the third choice. So this
reduces to:

declaration_list
  : ;; empty list
  | declaration_list declaration
  ;
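As a sanity check on that redundancy claim, here is a minimal Python
sketch (not part of OpenToken) that enumerates every token string of
bounded length each form of declaration_list can derive and compares the
two sets. The terminals OBJECT and SUBPROGRAM are hypothetical stand-ins
for complete object and subprogram declarations, and the helper name
`language` is made up for illustration.

```python
from itertools import product

def language(grammar, start, max_len):
    """Return every terminal string of at most max_len tokens derivable from start."""
    # sets[nt] holds the token tuples known so far to be derivable from nt.
    sets = {nt: set() for nt in grammar}
    changed = True
    while changed:  # iterate to a fixed point; the bounded length guarantees termination
        changed = False
        for nt, alternatives in grammar.items():
            for rhs in alternatives:
                # A terminal contributes itself; a nonterminal contributes its known strings.
                options = [tuple(sets[s]) if s in grammar else ((s,),) for s in rhs]
                for combo in product(*options):
                    sentence = tuple(tok for part in combo for tok in part)
                    if len(sentence) <= max_len and sentence not in sets[nt]:
                        sets[nt].add(sentence)
                        changed = True
    return sets[start]

# Hypothetical stand-in: each declaration is a single terminal.
DECLARATION = [["OBJECT"], ["SUBPROGRAM"]]

three_choices = {
    "declaration_list": [[], ["declaration"], ["declaration_list", "declaration"]],
    "declaration": DECLARATION,
}
two_choices = {
    "declaration_list": [[], ["declaration_list", "declaration"]],
    "declaration": DECLARATION,
}

same = language(three_choices, "declaration_list", 3) == language(two_choices, "declaration_list", 3)
```

Agreement up to a bounded length is only evidence, not a proof, but it
matches the argument above: one declaration is the empty list followed by
a declaration.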

2) Add an empty declaration choice to declaration:

declaration
  : ;; empty declaration
  | object_declaration
  | subprogram_declaration
  ;; ...
  ;

3) Add another choice in package_body that leaves out declaration_list:

package_body
  : PACKAGE name IS declaration_list BEGIN statement_list END SEMICOLON
  | PACKAGE name IS BEGIN statement_list END SEMICOLON
  ;

OpenToken cannot handle choice 1; every occurrence of declaration_list
appears to be empty, giving errors at parse time. For example, this
input:

is begin;

gives a syntax error:
shift_conflict_bug.input:2:4: Syntax error; expecting 'EOF_ID' or
'IS_ID'; found BEGIN_ID 'begin'

It's not clear to me whether this is expected because of the way LALR
works, or whether this is a bug somewhere in OpenToken (either the
grammar generator or the parser); any clues?
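For comparison, the following Python sketch drives a hand-built LALR(1)
table for choice 1 over the simplified package_body grammar. The table
was written by hand for this illustration; it is not output from
OpenToken or Bison, and OBJECT is a hypothetical stand-in terminal for a
complete declaration. In state 1 (after IS) the empty declaration_list is
reduced on lookahead BEGIN, so the token sequence for `is begin;` is
accepted, which suggests the failure is not inherent to LALR(1).

```python
# Rule number -> (left-hand side, number of right-hand-side symbols).
RULES = {
    1: ("package_body", 4),      # IS declaration_list BEGIN SEMICOLON
    2: ("declaration_list", 0),  # empty list
    3: ("declaration_list", 2),  # declaration_list declaration
    4: ("declaration", 1),       # OBJECT (hypothetical stand-in terminal)
}

# Hand-built parse table (an assumption for illustration only).
ACTION = {
    (0, "IS"): ("shift", 1),
    (1, "BEGIN"): ("reduce", 2), (1, "OBJECT"): ("reduce", 2),
    (2, "EOF"): ("accept", None),
    (3, "BEGIN"): ("shift", 4), (3, "OBJECT"): ("shift", 5),
    (4, "SEMICOLON"): ("shift", 7),
    (5, "BEGIN"): ("reduce", 4), (5, "OBJECT"): ("reduce", 4),
    (6, "BEGIN"): ("reduce", 3), (6, "OBJECT"): ("reduce", 3),
    (7, "EOF"): ("reduce", 1),
}
GOTO = {
    (0, "package_body"): 2,
    (1, "declaration_list"): 3,
    (3, "declaration"): 6,
}

def parse(tokens):
    """Return the rule numbers reduced, in order, or raise SyntaxError."""
    stack = [0]                      # stack of parser states
    reductions = []
    tokens = list(tokens) + ["EOF"]
    pos = 0
    while True:
        state, lookahead = stack[-1], tokens[pos]
        if (state, lookahead) not in ACTION:
            raise SyntaxError(f"unexpected {lookahead} in state {state}")
        kind, arg = ACTION[(state, lookahead)]
        if kind == "shift":
            stack.append(arg)
            pos += 1
        elif kind == "reduce":
            lhs, length = RULES[arg]
            if length:               # an empty rule pops nothing
                del stack[-length:]
            stack.append(GOTO[(stack[-1], lhs)])
            reductions.append(arg)
        else:                        # accept
            return reductions

empty_body = parse(["IS", "BEGIN", "SEMICOLON"])
two_decls = parse(["IS", "OBJECT", "OBJECT", "BEGIN", "SEMICOLON"])
```

No table cell holds more than one action, so this small grammar has no
conflicts under choice 1.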

Choice 2 leads to a shift/reduce conflict in the production for
package_body; I believe extending the OpenToken parser to a generalized
parser would allow it to handle this option. However, the OpenToken
grammar generator currently reports a bug, instead of reporting the
conflict.

Choice 3 works with the current OpenToken, but of course it is very
tedious; every occurrence of declaration_list must be handled in the
same way.
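The tedium of choice 3 can in principle be pushed onto a preprocessing
step. The sketch below, in Python, takes a grammar written in the choice
1 style and mechanically duplicates every production that uses a nullable
nonterminal, once with the symbol and once without. It assumes the set of
nullable nonterminals is supplied by hand and that the start symbol is
not itself nullable; `expand_nullable` is a hypothetical helper, not an
OpenToken feature.

```python
from itertools import product

def expand_nullable(grammar, nullable):
    """Return a grammar with no empty productions.

    Every production that mentions a symbol in `nullable` is emitted both
    with and without that symbol, which is choice 3 applied mechanically.
    """
    result = {}
    for lhs, alternatives in grammar.items():
        expanded = []
        for rhs in alternatives:
            # Per symbol: keep it, or (if nullable) also allow dropping it.
            choices = [[(sym,), ()] if sym in nullable else [(sym,)] for sym in rhs]
            for combo in product(*choices):
                new_rhs = [sym for part in combo for sym in part]
                # Drop empty alternatives, useless A -> A loops, and duplicates.
                if new_rhs and new_rhs != [lhs] and new_rhs not in expanded:
                    expanded.append(new_rhs)
        result[lhs] = expanded
    return result

# Grammar written in the choice 1 style.
grammar = {
    "package_body": [
        ["PACKAGE", "name", "IS", "declaration_list", "BEGIN",
         "statement_list", "END", "SEMICOLON"],
    ],
    "declaration_list": [
        [],
        ["declaration_list", "declaration"],
    ],
}

expanded = expand_nullable(grammar, nullable={"declaration_list"})
for lhs, alternatives in expanded.items():
    for rhs in alternatives:
        print(lhs, ":", " ".join(rhs))
```

For this input the result has the two package_body productions of choice
3, and declaration_list returns to its original non-empty form.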

Any other ways to handle this problem?

-- 
-- Stephe


