From mboxrd@z Thu Jan 1 00:00:00 1970 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00 autolearn=unavailable autolearn_force=no version=3.4.4 Path: eternal-september.org!reader01.eternal-september.org!reader02.eternal-september.org!feeder.eternal-september.org!news.albasani.net!fu-berlin.de!uni-berlin.de!individual.net!not-for-mail From: Niklas Holsti Newsgroups: comp.lang.ada Subject: Re: Interesting article on ARG work Date: Tue, 10 Apr 2018 22:23:21 +0300 Organization: Tidorum Ltd Message-ID: References: <1b44444f-c1b3-414e-84fb-8798961487c3@googlegroups.com> <62ee0aac-49da-4925-b9aa-a16695b3fc45@googlegroups.com> <9879872e-c18a-4667-afe5-41ce0f54559f@googlegroups.com> <80db2d05-744f-4201-ba1b-4436f8040491@googlegroups.com> <59f9ab6d-d6ba-45ff-a6f0-c5699983d9e8@googlegroups.com> <1a390e22-f49f-4028-8e58-ca4d0f51e4b6@googlegroups.com> <195739aa-2f32-4cba-80ac-e60591e94712@googlegroups.com> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit X-Trace: individual.net GbdR7j1t9HvHyN9Lc1PztQO8YvQKuku8cA3Uc65Pf4Jod0/7uR Cancel-Lock: sha1:+VQCylG4zEOygBVDXBBnHcfKzJw= User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:45.0) Gecko/20100101 Thunderbird/45.8.0 In-Reply-To: <195739aa-2f32-4cba-80ac-e60591e94712@googlegroups.com> Xref: reader02.eternal-september.org comp.lang.ada:51431 Date: 2018-04-10T22:23:21+03:00 List-Id: On 18-04-10 01:09 , Dan'l Miller wrote: > On Monday, April 9, 2018 at 4:17:42 PM UTC-5, Niklas Holsti wrote: >> If there are 2 alternative AST subtrees at each choice point, the >> total number of possible complete ASTs (which is what the later >> stages of an ordinary compiler would have to process) is >> exponential in the number of choice points (that is, "@" >> instances) > > Trivially exponential: at most 2¹ = 2 alternative AST subtrees as > some points, because there is only one choice point: an > @-code-is-present on the compiler's command-line versus an > @-code-is-elided on the compiler's command line. I call a "choice point" a point in the input sentence where the parser has to fork into alternatives. For a GLR parser, this means every _occurrence_ of an "@" in the input sentence. You are ignoring my explicitly stated assumption (which follows if one is using _just_ a GLR parser) that the choices of alternative at the various choice points are uncorrelated. This makes the number of combinations of choices, which is the number of alternative _entire_ ASTs (in the commonly accepted meaning of AST as _one_ parse of the input sentence) exponential in the number of choice points. The paper to which you referred avoids that problem by associating choice points with conditions depending on preprocessor variables, which is a natural thing to do, but which also means passing beyond the abilities of _just_ GLR parsing. > Each @-prefixed line is not isomorphic to yet another > differently-named macro in the C preprocessor (or M4, p4, etc > preprocessors). In plain GLR parsing it is isomorphic, because a GLR parser does not create any correlations between the non-deterministic choices at different points in the sentence. That would be a context-dependent grammar. > For p choicepoints on > the compiler command-line, yes you are trivially correct: > “exponential” 2ᵖ but where p is not only constant, but merely p=1, > hence GLR's O(pn) asymptotic rate of growth is O(1n) = O(n) and > likewise for the at-most twice-as-big AST. That is true for the "Fork-Merge LR" (FMLR) parsing in the paper you referenced, but that is not GLR. > Not @ instances (as counted by lines of code) but @ degrees of > freedom on the compiler command-line. You are not counting the > correct things. We are certainly counting different things. You said that GLR parsing could be used to parse both "@ enabled" and "@ disabled" sentences at the same time. I maintain that _just_ a GLR parser is not suitable, because the uncorrelated choice of alternative parses at each choice point (each "@") does not match the meaning of the "@" symbol (enabled everywhere, or disabled everywhere). The FMLR method, with alternatives correlated by logical conditions, works, at least if one fully believes the claims in the paper to which you referred (I am perhaps not quite convinced that the paper's "ifdef-hoisting" will work and remain relatively local in all cases). -- Niklas Holsti Tidorum Ltd niklas holsti tidorum fi . @ .