From mboxrd@z Thu Jan 1 00:00:00 1970 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00 autolearn=ham autolearn_force=no version=3.4.4 X-Google-Thread: a07f3367d7,f46e95f7ed68fc6b X-Google-Attributes: gida07f3367d7,public,usenet X-Google-NewGroupId: yes X-Google-Language: ENGLISH,ASCII Path: g2news2.google.com!postnews.google.com!c1g2000yqi.googlegroups.com!not-for-mail From: Ludovic Brenta Newsgroups: comp.lang.ada Subject: Re: working towards OpenToken version 4 Date: Tue, 28 Jul 2009 02:29:35 -0700 (PDT) Organization: http://groups.google.com Message-ID: <2df50994-4bed-4396-aafb-4efc31dd02b6@c1g2000yqi.googlegroups.com> References: NNTP-Posting-Host: 153.98.68.197 Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable X-Trace: posting.google.com 1248773375 1950 127.0.0.1 (28 Jul 2009 09:29:35 GMT) X-Complaints-To: groups-abuse@google.com NNTP-Posting-Date: Tue, 28 Jul 2009 09:29:35 +0000 (UTC) Complaints-To: groups-abuse@google.com Injection-Info: c1g2000yqi.googlegroups.com; posting-host=153.98.68.197; posting-account=pcLQNgkAAAD9TrXkhkIgiY6-MDtJjIlC User-Agent: G2/1.0 X-HTTP-UserAgent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.0.6) Gecko/2009012111 Red Hat/3.0.6-1.el5 Firefox/3.0.6,gzip(gfe),gzip(gfe) Xref: g2news2.google.com comp.lang.ada:7372 Date: 2009-07-28T02:29:35-07:00 List-Id: Stephen Leake wrote on comp.lang.ada: > I've been thinking about ways to improve the recursive descent support > in OpenToken. Great. I like recursive descent parsers :) > One thing that could use work is the syntax for grammar specification. > > Consider example 5.10 from the dragon book; see > OpenToken/Examples/ASU_Example_5_10/asu_example_5_10.ads > > The grammar, with production actions, is given as: > > =A0 =A0-- =A0L -> E =A0 =A0 =A0 =A0 print (L.val) > =A0 =A0-- =A0E -> E + T =A0 =A0 E.val :=3D E1.val + T.val > =A0 =A0-- =A0E -> T > =A0 =A0-- =A0T -> T * F =A0 =A0 T.val :=3D T1.val * F.val > =A0 =A0-- =A0T -> F > =A0 =A0-- =A0F -> ( E ) =A0 =A0 F.val :=3D E.val > =A0 =A0-- =A0F -> digit > > For the OpenToken LR parser generator, the code that expresses this is: > > =A0 =A0Grammar : constant Production_List.Instance :=3D > =A0 =A0 =A0L <=3D E & EOF =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0+ Si= mple_Integer.Print_Value'Access and > =A0 =A0 =A0E <=3D E & Plus & T =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 + Simple_I= nteger.Add_Integers =A0 =A0 =A0 and > =A0 =A0 =A0E <=3D T =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 = =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0= and > =A0 =A0 =A0T <=3D T & Times & F =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0+ Simple_I= nteger.Multiply_Integers =A0and > =A0 =A0 =A0T <=3D F =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 = =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0= and > =A0 =A0 =A0F <=3D Left_Paren & E & Right_Paren + Simple_Integer.Synthesiz= e_Second =A0and > =A0 =A0 =A0F <=3D Int_Literal; > > A very nice match. > > However, for the recursive descent case, things are not so nice. > First, the grammar must be rewritten. That's expected; it's left > recursive, which leads to infinite recursion. Ted Dennison rewrote it > this way, using {} to indicate possible repetition (I had to figure > this out, it's not in the code): > > =A0 =A0-- =A0L -> E =A0 =A0 =A0 =A0 =A0 print (L.val) > =A0 =A0-- =A0E -> T {+ T} =A0 =A0 E.val :=3D T1.val + T2.val ... > =A0 =A0-- =A0T -> F {* F} =A0 =A0 T.val :=3D F1.val * F2.val ... > =A0 =A0-- =A0F -> ( E ) =A0 =A0 =A0 F.val :=3D E.val > =A0 =A0-- =A0F -> digit > > The code for this is (see > OpenToken/Examples/ASU_Example_5_10/asu_example_5_10_rd-run.adb): > > =A0 =A0L.all :=3D E & EOF; > =A0 =A0E.all :=3D Operation_List.Class (Add_Operation_Instance'(Get (T, P= lus))); > =A0 =A0T.all :=3D Operation_List.Class (Multiply_Operation_Instance'(Get = (F, Times))); > =A0 =A0F.all :=3D Int_Literal or Expression_Sequence_Handle' > =A0 =A0 =A0 =A0(new Expression_Sequence_Instance'(Left_Paren & E & Right_= Paren)); > > not so nice. > > First, the actions are nowhere to be seen; they are given by > overriding declarations elsewhere. We can fix that by making them > runtime procedure pointers, and using the "+" operator, as for the LR > parser generator. That will significantly reduce the number of generic > instantiations required by this example, while introducing negligible > runtime overhead. > > Add_Operation_Instance is a type that implements {+ T}. We don't have > {} as an operator in Ada (never thought I'd be wishing we did !). It > takes two tokens to express that; Plus, and T. So I'm thinking we > could use the "**" operator. That leaves us with: > > =A0 =A0L.all :=3D E & EOF; > =A0 =A0E.all :=3D T ** Plus + Add_Integers'Access; > =A0 =A0T.all :=3D F ** Times + Multiply_Integers'Access; > =A0 =A0F.all :=3D Int_Literal or > =A0 =A0 =A0 =A0New_Instance (Left_Paren & E & Right_Paren) + Copy_integer= 'Access; > > Can't get rid of that New_Instance call; I've tried :). > > Opinions? Is this worth it? (besides being fun :) Yes, it is worth it IMHO, but I'm not sure about "**". Consider: E -> T {+ U} (where the '+' might actually be some other, asymmetric, operator). In this case your use of "**" cannot work. I think such a production should be represented with "or", e.g. E.all :=3D T or (T & Plus & U + Add_Integers'Access); > I guess "*" would work equally well, and might be a closer match to > the intended semantics. It's not used elsewere in OpenToken as a > grammar operator. But it is used in Extended Backus-Naur Form to > indicate a precise number of repetitions, and we might add that to > OpenToken, as well. I like "**" better because it is asymmetric whereas "*" is symmetric. > I've gotten backtracking in recursive descent actually working; it was > pretty broken before. > > None of this will be in 3.1; I want to get that release out quickly. > But I got inspired :). How about including the fix for backtracking in 3.1? I would think that such an important fix should get high priority. > Ludovic: the ada-france mtn server is down. Can you fix? Fixed. -- Ludovic Brenta.