From: "Dmitry A. Kazakov"
Newsgroups: comp.lang.ada
Subject: Re: [Slightly OT] How to process lightweight text markup languages?
Date: Mon, 19 Jan 2015 14:21:15 +0100
Organization: cbb software GmbH

On Mon, 19 Jan 2015 12:09:40 +0100, G.B. wrote:

> On 18.01.15 21:21, Dmitry A. Kazakov wrote:
>> This is a pretty straightforward and simple technique.
>
> The trouble is with expectations:
>
> Input:
>
> ((){)([()[[]])]
>
> Typical parsers will respond with such useless results
> as "error at EOF". Not something that a (close to)
> natural language processor can afford, I think.

Not with the technique I described. In your example, the operator stack
will contain:

   ( at pos. 2   <--- stack top
   ( at pos. 1

when } tries to unwind it by popping the last unmatched (. Since } does
not match (, you can easily generate

   "the closing curly bracket at pos. 3 does not match the opening
    round bracket at pos. 2"

Your experience probably comes from grammar-generated parsers. The
straightforward technique is so much better for all practical purposes,
and especially for generating error messages.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de
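
The stack-based check described above can be sketched roughly as follows. This is a minimal illustration in Python, not the poster's code: the names check_brackets, OPENING and CLOSING are hypothetical, and the recovery policy (on a mismatch, discard the popped opener and carry on) is an assumption made here so that more than one error can be reported.

```python
# Illustrative sketch only; all names are hypothetical.
OPENING = {"(": "round", "[": "square", "{": "curly"}
CLOSING = {")": "(", "]": "[", "}": "{"}


def check_brackets(text):
    """Return a list of error messages; positions are 1-based."""
    errors = []
    stack = []  # (bracket, position) of each unmatched opener; top is last
    for pos, ch in enumerate(text, start=1):
        if ch in OPENING:
            stack.append((ch, pos))
        elif ch in CLOSING:
            kind = OPENING[CLOSING[ch]]
            if not stack:
                errors.append(
                    f"the closing {kind} bracket at pos. {pos} "
                    "has no opening bracket"
                )
                continue
            # Pop the last unmatched opener and compare it with the closer.
            open_ch, open_pos = stack.pop()
            if open_ch != CLOSING[ch]:
                # Assumed recovery: the popped opener is simply dropped.
                errors.append(
                    f"the closing {kind} bracket at pos. {pos} "
                    f"does not match the opening {OPENING[open_ch]} "
                    f"bracket at pos. {open_pos}"
                )
    # Whatever is left on the stack was never closed.
    for open_ch, open_pos in stack:
        errors.append(
            f"the opening {OPENING[open_ch]} bracket at pos. {open_pos} "
            "is never closed"
        )
    return errors


if __name__ == "__main__":
    for message in check_brackets("((}"):
        print(message)
```

Because every stack entry carries its own position, each message can name both the offending closer and the opener it collided with. For the input "((}" the first message is the one quoted above. For the input as quoted in this thread, this sketch reports its first mismatch at pos. 5, where a closing round bracket meets the opening curly bracket at pos. 4.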