From: "G.B."
Newsgroups: comp.lang.ada
Subject: Re: [Slightly OT] How to process lightweight text markup languages?
Date: Mon, 19 Jan 2015 12:09:40 +0100

On 18.01.15 21:21, Dmitry A. Kazakov wrote:

> This is a pretty
> straightforward and simple technique.

The trouble is with expectations:

Input: ((){)([()[[]])]

Typical parsers will respond with such useless results as "error at EOF". Not something that a (close to) natural language processor can afford, I think.

What is needed, maybe, is a way to judiciously transcend the simplicity of a dumb, straightforward stack with REJECT at the end. The processor should, after all, output text that is maximally useful, because this justifies the effort in the first place. What syntactical criteria are there, if any, that could be the input to finding these maxima?
Is context dependence required? A very simple example is EOL, if there is one: one corrupted line of output is better than all remaining lines corrupted. Can "judicious" simply mean special-casing, and be done with it? Would it be possible to describe a fixed point so that the translation functions would close in on this best result?
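
To make the EOL idea concrete, here is a rough sketch in Python of a bracket checker that does not stop at the first problem. It is only one possible heuristic, not a claim about what the "best" recovery is. The names check_brackets, PAIRS and OPENERS are made up for illustration, and the sketch assumes that brackets never span lines, so every EOL can serve as a synchronization point. On a mismatched closer it looks for the nearest enclosing opener of the right kind, reports whatever was left open above it, and carries on.

```python
# Hypothetical sketch: stack-based bracket checking with simple error recovery.
# Assumption (not from the post): brackets never span lines, so EOL resets state.
PAIRS = {")": "(", "]": "[", "}": "{"}
OPENERS = set(PAIRS.values())


def check_brackets(text):
    """Return a list of (line, column, message) diagnostics, all 1-based."""
    diagnostics = []
    for line_no, line in enumerate(text.split("\n"), start=1):
        stack = []  # openers still pending on this line, as (char, column)
        for col, ch in enumerate(line, start=1):
            if ch in OPENERS:
                stack.append((ch, col))
            elif ch in PAIRS:
                want = PAIRS[ch]
                # Find the nearest enclosing opener that this closer could match.
                depth = next(
                    (i for i in range(len(stack) - 1, -1, -1)
                     if stack[i][0] == want),
                    None,
                )
                if depth is None:
                    # Nothing to match: report the closer and skip it.
                    diagnostics.append((line_no, col, f"stray '{ch}'"))
                    continue
                # Openers above the match were never closed: report and drop them.
                for open_ch, open_col in stack[depth + 1:]:
                    diagnostics.append(
                        (line_no, open_col, f"unclosed '{open_ch}'"))
                del stack[depth:]
        # EOL is the synchronization point: damage does not leak into later lines.
        for open_ch, open_col in stack:
            diagnostics.append((line_no, open_col, f"unclosed '{open_ch}'"))
    return diagnostics


if __name__ == "__main__":
    for diag in check_brackets("((){)([()[[]])]"):
        print(diag)
```

For the input above this reports an unclosed '{' at column 4, an unclosed '[' at column 7 and a stray ']' at column 15. A different heuristic could just as well blame the ')' at column 14 and let the final ']' close the '[', which is exactly the question of which criteria pick the "maximally useful" reading.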