comp.lang.ada
 help / color / mirror / Atom feed
From: Stephen Leake <stephen_leake@stephe-leake.org>
Subject: working towards OpenToken version 4
Date: Mon, 27 Jul 2009 19:56:15 -0400
Date: 2009-07-27T19:56:15-04:00	[thread overview]
Message-ID: <uhbwxfzog.fsf@stephe-leake.org> (raw)

I've been thinking about ways to improve the recursive descent support
in OpenToken.

One thing that could use work is the syntax for grammar specification.

Consider example 5.10 from the dragon book; see
OpenToken/Examples/ASU_Example_5_10/asu_example_5_10.ads

The grammar, with production actions, is given as:

   --  L -> E         print (L.val)
   --  E -> E + T     E.val := E1.val + T.val
   --  E -> T
   --  T -> T * F     T.val := T1.val * F.val
   --  T -> F
   --  F -> ( E )     F.val := E.val
   --  F -> digit

For the OpenToken LR parser generator, the code that expresses this is:

   Grammar : constant Production_List.Instance :=
     L <= E & EOF                      + Simple_Integer.Print_Value'Access and
     E <= E & Plus & T                 + Simple_Integer.Add_Integers       and
     E <= T                                                                and
     T <= T & Times & F                + Simple_Integer.Multiply_Integers  and
     T <= F                                                                and
     F <= Left_Paren & E & Right_Paren + Simple_Integer.Synthesize_Second  and
     F <= Int_Literal;

A very nice match.

However, for the recursive descent case, things are not so nice.
First, the grammar must be rewritten. That's expected; it's left
recursive, which leads to infinite recursion. Ted Dennison rewrote it
this way, using {} to indicate possible repetition (I had to figure
this out, it's not in the code):

   --  L -> E           print (L.val)
   --  E -> T {+ T}     E.val := T1.val + T2.val ...
   --  T -> F {* F}     T.val := F1.val * F2.val ...
   --  F -> ( E )       F.val := E.val
   --  F -> digit

The code for this is (see
OpenToken/Examples/ASU_Example_5_10/asu_example_5_10_rd-run.adb):

   L.all := E & EOF;
   E.all := Operation_List.Class (Add_Operation_Instance'(Get (T, Plus)));
   T.all := Operation_List.Class (Multiply_Operation_Instance'(Get (F, Times)));
   F.all := Int_Literal or Expression_Sequence_Handle'
       (new Expression_Sequence_Instance'(Left_Paren & E & Right_Paren));

not so nice. 

First, the actions are nowhere to be seen; they are given by
overriding declarations elsewhere. We can fix that by making them
runtime procedure pointers, and using the "+" operator, as for the LR
parser generator. That will significantly reduce the number of generic
instantiations required by this example, while introducing negligible
runtime overhead.

Add_Operation_Instance is a type that implements {+ T}. We don't have
{} as an operator in Ada (never thought I'd be wishing we did !). It
takes two tokens to express that; Plus, and T. So I'm thinking we
could use the "**" operator. That leaves us with:

   L.all := E & EOF;
   E.all := T ** Plus + Add_Integers'Access;
   T.all := F ** Times + Multiply_Integers'Access;
   F.all := Int_Literal or 
       New_Instance (Left_Paren & E & Right_Paren) + Copy_integer'Access;

Can't get rid of that New_Instance call; I've tried :).

Opinions? Is this worth it? (besides being fun :)

I guess "*" would work equally well, and might be a closer match to
the intended semantics. It's not used elsewere in OpenToken as a
grammar operator. But it is used in Extended Backus-Naur Form to
indicate a precise number of repetitions, and we might add that to
OpenToken, as well.

I've gotten backtracking in recursive descent actually working; it was
pretty broken before.

None of this will be in 3.1; I want to get that release out quickly.
But I got inspired :).

Ludovic: the ada-france mtn server is down. Can you fix?

-- 
-- Stephe



             reply	other threads:[~2009-07-27 23:56 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-07-27 23:56 Stephen Leake [this message]
2009-07-28  9:29 ` working towards OpenToken version 4 Ludovic Brenta
2009-07-28 13:12   ` Dmitry A. Kazakov
2009-07-30  1:39   ` Stephen Leake
replies disabled

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox