From: Ludovic Brenta <ludovic@ludovic-brenta.org>
Subject: Re: working towards OpenToken version 4
Date: Tue, 28 Jul 2009 02:29:35 -0700 (PDT)
Date: 2009-07-28T02:29:35-07:00 [thread overview]
Message-ID: <2df50994-4bed-4396-aafb-4efc31dd02b6@c1g2000yqi.googlegroups.com> (raw)
In-Reply-To: uhbwxfzog.fsf@stephe-leake.org
Stephen Leake wrote on comp.lang.ada:
> I've been thinking about ways to improve the recursive descent support
> in OpenToken.
Great. I like recursive descent parsers :)
> One thing that could use work is the syntax for grammar specification.
>
> Consider example 5.10 from the dragon book; see
> OpenToken/Examples/ASU_Example_5_10/asu_example_5_10.ads
>
> The grammar, with production actions, is given as:
>
> -- L -> E print (L.val)
> -- E -> E + T E.val := E1.val + T.val
> -- E -> T
> -- T -> T * F T.val := T1.val * F.val
> -- T -> F
> -- F -> ( E ) F.val := E.val
> -- F -> digit
>
> For the OpenToken LR parser generator, the code that expresses this is:
>
> Grammar : constant Production_List.Instance :=
> L <= E & EOF + Simple_Integer.Print_Value'Access and
> E <= E & Plus & T + Simple_Integer.Add_Integers and
> E <= T and
> T <= T & Times & F + Simple_Integer.Multiply_Integers and
> T <= F and
> F <= Left_Paren & E & Right_Paren + Simple_Integer.Synthesize_Second and
> F <= Int_Literal;
>
> A very nice match.
>
> However, for the recursive descent case, things are not so nice.
> First, the grammar must be rewritten. That's expected; it's left
> recursive, which leads to infinite recursion. Ted Dennison rewrote it
> this way, using {} to indicate possible repetition (I had to figure
> this out, it's not in the code):
>
> -- L -> E print (L.val)
> -- E -> T {+ T} E.val := T1.val + T2.val ...
> -- T -> F {* F} T.val := F1.val * F2.val ...
> -- F -> ( E ) F.val := E.val
> -- F -> digit
>
> The code for this is (see
> OpenToken/Examples/ASU_Example_5_10/asu_example_5_10_rd-run.adb):
>
> L.all := E & EOF;
> E.all := Operation_List.Class (Add_Operation_Instance'(Get (T, Plus)));
> T.all := Operation_List.Class (Multiply_Operation_Instance'(Get (F, Times)));
> F.all := Int_Literal or Expression_Sequence_Handle'
> (new Expression_Sequence_Instance'(Left_Paren & E & Right_Paren));
>
> not so nice.
>
> First, the actions are nowhere to be seen; they are given by
> overriding declarations elsewhere. We can fix that by making them
> runtime procedure pointers, and using the "+" operator, as for the LR
> parser generator. That will significantly reduce the number of generic
> instantiations required by this example, while introducing negligible
> runtime overhead.
>
> Add_Operation_Instance is a type that implements {+ T}. We don't have
> {} as an operator in Ada (never thought I'd be wishing we did !). It
> takes two tokens to express that; Plus, and T. So I'm thinking we
> could use the "**" operator. That leaves us with:
>
> L.all := E & EOF;
> E.all := T ** Plus + Add_Integers'Access;
> T.all := F ** Times + Multiply_Integers'Access;
> F.all := Int_Literal or
> New_Instance (Left_Paren & E & Right_Paren) + Copy_integer'Access;
>
> Can't get rid of that New_Instance call; I've tried :).
>
> Opinions? Is this worth it? (besides being fun :)
Yes, it is worth it IMHO, but I'm not sure about "**". Consider:
E -> T {+ U}
(where the '+' might actually be some other, asymmetric, operator).
In this case your use of "**" cannot work. I think such a production
should be represented with "or", e.g.
E.all := T or (T & Plus & U + Add_Integers'Access);
> I guess "*" would work equally well, and might be a closer match to
> the intended semantics. It's not used elsewere in OpenToken as a
> grammar operator. But it is used in Extended Backus-Naur Form to
> indicate a precise number of repetitions, and we might add that to
> OpenToken, as well.
I like "**" better because it is asymmetric whereas "*" is symmetric.
> I've gotten backtracking in recursive descent actually working; it was
> pretty broken before.
>
> None of this will be in 3.1; I want to get that release out quickly.
> But I got inspired :).
How about including the fix for backtracking in 3.1? I would think
that such an important fix should get high priority.
> Ludovic: the ada-france mtn server is down. Can you fix?
Fixed.
--
Ludovic Brenta.
next prev parent reply other threads:[~2009-07-28 9:29 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-07-27 23:56 working towards OpenToken version 4 Stephen Leake
2009-07-28 9:29 ` Ludovic Brenta [this message]
2009-07-28 13:12 ` Dmitry A. Kazakov
2009-07-30 1:39 ` Stephen Leake
replies disabled
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox