From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level: 
X-Spam-Status: No, score=-0.3 required=5.0 tests=BAYES_00,
	REPLYTO_WITHOUT_TO_CC autolearn=no autolearn_force=no version=3.4.4
Path: 
 border2.nntp.dca1.giganews.com!nntp.giganews.com!usenet.blueworldhosting.com!feeder01.blueworldhosting.com!feeder.erje.net!eu.feeder.erje.net!news2.arglkargh.de!news.mixmin.net!aioe.org!.POSTED!not-for-mail
From: "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de>
Newsgroups: comp.lang.ada
Subject: Re: [Slightly OT] How to process lightweight text markup languages?
Date: Sun, 18 Jan 2015 21:21:18 +0100
Organization: cbb software GmbH
Message-ID: <ynm6coktfevl.1esu61g1n9477.dlg@40tude.net>
References: <slrnmbntdm.19vl.lithiumcat@nat.rebma.instinctive.eu>
Reply-To: mailbox@dmitry-kazakov.de
NNTP-Posting-Host: 0MSBVPcE8EdvhPFyEbPM4g.user.speranza.aioe.org
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
X-Complaints-To: abuse@aioe.org
User-Agent: 40tude_Dialog/2.0.15.1
X-Notice: Filtered by postfilter v. 0.8.2
Xref: number.nntp.giganews.com comp.lang.ada:191907
Date: 2015-01-18T21:21:18+01:00
List-Id: <comp.lang.ada>

On Sun, 18 Jan 2015 18:04:08 +0000 (UTC), Natasha Kerensikova wrote:

[...] 

> My latest attempts involve keeping the online architecture with separate
> input and output types and streams, and keeping a stack of currently
> opened constructs, with a dynamically dispatching ending test on each
> character for each construct on the stack. It feels horribly
> inefficient and complicated.

Nothing complicated and most efficient in the sense that depending on the
formal language classification you could not be able eliminate the stack
(memory) in some or other form. You could think of it in terms of possible
states of the parser. If the number of states is infinite you must have
stack or else the parser itself must be infinite. Simple example is parsing
the language of balanced brackets: ()(()).
 
> Or maybe giving up the online capacity, making the assumption that any
> reasonable input can fit in memory, and build a serious
> semantically-loaded token sequence, and build a tree from it.

This is equivalent to having stacks.

It seems that the language of yours has brackets and prefix operators.
Nothing complicated, really. You push operands (terms) onto one stack.
Operators and brackets are pushed onto another stack. Precedence rules wind
up the second stack while popping operands from the first. This is a pretty
straightforward and simple technique.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de