From: Stephen Leake <stephen_leake@stephe-leake.org>
Subject: LALR parser question
Date: Sun, 28 Apr 2013 08:37:33 -0500
Message-ID: <85sj2aydwi.fsf@stephe-leake.org>
As part of Emacs Ada mode 5.0, I'm building a generalized LALR grammar
for Ada (see
http://stephe-leake.org/emacs/ada-mode/emacs-ada-mode.html#ada-mode-5.0 ).
I'm having trouble with empty declarations. For example (using Bison
syntax for the grammar), a simplified subset of the Ada package_body syntax is:
package_body
: IS declaration_list BEGIN SEMICOLON
;
declaration_list
: declaration
| declaration_list declaration
;
declaration
: object_declaration
| subprogram_declaration
;; ...
;
This does not allow an empty declaration_list. But Ada does, so the
question is how to add that to the grammar.
There are three choices:
1) Add an empty choice to declaration_list:
declaration_list
: ;; empty list
| declaration
| declaration_list declaration
;
This is now redundant; since declaration_list can be empty, the second
choice is not needed:
declaration_list
: ;; empty list
| declaration_list declaration
;
2) Add an empty declaration choice to declaration:
declaration
: ;; empty declaration
| object_declaration
| subprogram_declaration
;; ...
;
3) Add another choice in package_body that leaves out declaration_list:
package_body
: PACKAGE name IS declaration_list BEGIN statement_list END SEMICOLON
| PACKAGE name IS BEGIN statement_list END SEMICOLON
;
OpenToken cannot handle choice 1; every occurrence of declaration_list
appears to be empty, giving syntax errors at parse time. For example,
this input:
is begin;
gives a syntax error:
shift_conflict_bug.input:2:4: Syntax error; expecting 'EOF_ID' or
'IS_ID'; found BEGIN_ID 'begin'
I'm not clear if this is expected because of the way LALR works, or if
this is a bug somewhere in OpenToken (either the grammar generator or
the parser); any clues?
Choice 2 leads to a shift/reduce conflict in the production for
package_body; I believe extending the OpenToken parser to a generalized
parser would allow it to handle this option. However, the OpenToken
grammar generator currently reports a bug, instead of reporting the
conflict.
Choice 3 works with the current OpenToken, but of course it is very
tedious; every occurrence of declaration_list must be handled in the
same way.
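To compare the three choices independently of OpenToken, here is a
minimal Python sketch that builds the LR(0) item sets for the
simplified grammars and lists the states where a reduce competes with a
terminal shift or with another reduce. It is not OpenToken code; the
function name lr0_conflicts and the dictionary encoding of the grammars
are my own assumptions, choice 3 is cut down to the same simplified
package_body as the first example, and object_declaration and
subprogram_declaration are treated as terminals because they are not
expanded above. It checks LR(0) only, which is stricter than LALR(1):
a grammar with no LR(0) conflicts is also LALR(1), but an LR(0)
conflict might still be resolved by LALR lookahead.

```python
def lr0_conflicts(grammar, start):
    """Return the sorted LR(0) conflicts of `grammar` (hypothetical helper).

    `grammar` maps each nonterminal to a list of right-hand sides (tuples of
    symbols); the empty tuple is an empty production. Any symbol that is not
    a key of `grammar` is treated as a terminal.
    """
    # Augment the grammar so that production 0 is $accept -> start.
    prods = [("$accept", (start,))]
    for lhs, alternatives in grammar.items():
        for rhs in alternatives:
            prods.append((lhs, tuple(rhs)))
    nonterminals = set(grammar)

    def closure(items):
        # An item is (production index, dot position).
        result = set(items)
        work = list(result)
        while work:
            prod, dot = work.pop()
            rhs = prods[prod][1]
            if dot < len(rhs) and rhs[dot] in nonterminals:
                for index, (lhs, _) in enumerate(prods):
                    if lhs == rhs[dot] and (index, 0) not in result:
                        result.add((index, 0))
                        work.append((index, 0))
        return frozenset(result)

    def goto(state, symbol):
        moved = [(prod, dot + 1) for prod, dot in state
                 if dot < len(prods[prod][1]) and prods[prod][1][dot] == symbol]
        return closure(moved)

    states = [closure([(0, 0)])]
    seen = set(states)
    conflicts = []
    index = 0
    while index < len(states):
        state = states[index]
        index += 1
        # Items with the dot at the end want to reduce.
        reduces = [prod for prod, dot in state if dot == len(prods[prod][1])]
        next_symbols = {prods[prod][1][dot] for prod, dot in state
                        if dot < len(prods[prod][1])}
        shifts = tuple(sorted(next_symbols - nonterminals))
        if reduces and shifts:
            conflicts.append(("shift/reduce", shifts))
        if len(reduces) > 1:
            conflicts.append(("reduce/reduce", shifts))
        for symbol in next_symbols:
            target = goto(state, symbol)
            if target not in seen:
                seen.add(target)
                states.append(target)
    return sorted(conflicts)


DECLARATION = [("object_declaration",), ("subprogram_declaration",)]

# Choice 1: the empty alternative is in declaration_list.
CHOICE_1 = {
    "package_body": [("IS", "declaration_list", "BEGIN", "SEMICOLON")],
    "declaration_list": [(), ("declaration_list", "declaration")],
    "declaration": DECLARATION,
}

# Choice 2: the empty alternative is in declaration.
CHOICE_2 = {
    "package_body": [("IS", "declaration_list", "BEGIN", "SEMICOLON")],
    "declaration_list": [("declaration",), ("declaration_list", "declaration")],
    "declaration": [()] + DECLARATION,
}

# Choice 3: package_body gets a second alternative without declaration_list.
CHOICE_3 = {
    "package_body": [("IS", "declaration_list", "BEGIN", "SEMICOLON"),
                     ("IS", "BEGIN", "SEMICOLON")],
    "declaration_list": [("declaration",), ("declaration_list", "declaration")],
    "declaration": DECLARATION,
}

if __name__ == "__main__":
    for name, grammar in [("choice 1", CHOICE_1), ("choice 2", CHOICE_2),
                          ("choice 3", CHOICE_3)]:
        print(name, lr0_conflicts(grammar, "package_body"))
```

If the sketch is right, choices 1 and 3 produce no LR(0) conflicts and
choice 2 produces shift/reduce conflicts in the states after IS and
after declaration_list. That would suggest the simplified choice 1
grammar is within reach of a plain LALR parser, but it says nothing
about how OpenToken builds or uses its tables.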
Any other ways to handle this problem?
--
-- Stephe