From mboxrd@z Thu Jan  1 00:00:00 1970
From: Natasha Kerensikova <lithiumcat@gmail.com>
Newsgroups: comp.lang.ada
Subject: Re: Starter project: getopt_long in Ada
Date: Mon, 28 Nov 2011 08:21:51 +0000 (UTC)
Organization: A noiseless patient Spider
Message-ID: <slrnjd6h4l.1lme.lithiumcat@sigil.instinctive.eu>
References: <slrnjcushg.vl6.lithiumcat@sigil.instinctive.eu>
 <4ecfc4c4$0$6579$9b4e6d93@newsspool3.arcor-online.net>
 <op.v5lityfgule2fv@douda-yannick>
 <slrnjd4bar.vl6.lithiumcat@sigil.instinctive.eu>
 <op.v5l1t4w1ule2fv@douda-yannick>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Date: 2011-11-28T08:21:51+00:00
List-Id: <comp.lang.ada>

On 2011-11-27, Yannick Duchêne <yannick_duchene@yahoo.fr> wrote:
> On Sun, 27 Nov 2011 13:30:28 +0100, Natasha Kerensikova  
><lithiumcat@gmail.com> wrote:
>> It's just that I don't like at all having a tagged type for only one
>> dispatching operation (and no obvious need for internal state).
> If you think Specification, you should not say “I” here.

I'm sorry, but I don't understand what you mean there.

>> Of course that's only when considering related operations. In another
>> project that I will publish soon (still needs a bit of polishing), I use
>> two access-to-subprograms, but they are really meant to be completely
>> different sources (one creates tokens from input while the other outputs
>> the token, and the whole point of the separation is to have different
>> input-analysis and output-generation that can be plugged together). So in
>> that case, I count them as two independent single-operation cases.
> Your Markdown processor ?

More or less yes. It's actually a generic (lightweight) markup
processor, and Markdown is only one of the input-to-token callback sets.
I'm not sure exactly what kind of expressive power it has, but it's at
least suitable for Creole and Textile too.

The rough design is based around an array created by the client for the
library engine, whose elements are a record containing a lexer callback
(the input-to-token part), a renderer callback (token-to-output), a
hint that lets the engine avoid calling every entry all the time, and
a priority that breaks ties.

The engine calls the input-to-token part when adequate, which updates
the current position and returns a Token'Class object, which is then fed
to the corresponding token-to-output callback.
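
To make that description concrete, here is a minimal sketch of the
array-of-records design. It is not the library's actual code: every
identifier is made up for illustration, and the Trigger character is only
my assumption of what the "do not call every entry" hint could look like.
It needs Ada 2012 for the "in out" parameter of a function.

```ada
with Ada.Text_IO;

procedure Engine_Sketch is

   --  Root of the token hierarchy (hypothetical name).
   type Token is tagged null record;

   --  Input-to-token: reads Input from Position and advances Position.
   type Lexer_Callback is access function
     (Input : String; Position : in out Positive) return Token'Class;

   --  Token-to-output.
   type Renderer_Callback is access procedure (Item : Token'Class);

   --  One element of the array built by the client.
   type Rule is record
      Lexer    : Lexer_Callback;
      Renderer : Renderer_Callback;
      Trigger  : Character;  --  assumed hint: only try on this character
      Priority : Natural;    --  breaks ties between matching rules
   end record;

   type Rule_Array is array (Positive range <>) of Rule;

   type Star_Token is new Token with null record;

   function Lex_Star
     (Input : String; Position : in out Positive) return Token'Class
   is
      Result : Star_Token;
   begin
      Position := Position + 1;  --  pretend one character was consumed
      return Result;
   end Lex_Star;

   procedure Render_Any (Item : Token'Class) is
   begin
      Ada.Text_IO.Put_Line ("rendered one token");
   end Render_Any;

   procedure Run (Rules : Rule_Array; Input : String) is
      Position : Positive := Input'First;
      Best     : Natural;  --  index of the selected rule, 0 when none
   begin
      while Position <= Input'Last loop
         Best := 0;
         for I in Rules'Range loop
            if Rules (I).Trigger = Input (Position)
              and then (Best = 0
                        or else Rules (I).Priority > Rules (Best).Priority)
            then
               Best := I;
            end if;
         end loop;

         if Best = 0 then
            Position := Position + 1;  --  no rule applies: skip one character
         else
            declare
               Item : constant Token'Class :=
                 Rules (Best).Lexer (Input, Position);
            begin
               Rules (Best).Renderer (Item);
            end;
         end if;
      end loop;
   end Run;

begin
   Run ((1 => (Lexer    => Lex_Star'Access,
               Renderer => Render_Any'Access,
               Trigger  => '*',
               Priority => 1)),
        "a*b*");
end Engine_Sketch;
```

The engine only ever sees Token'Class, which is what keeps the two
callback families independent of each other.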

One thing I don't like much in this design is that the token type must
match: for example when parsing a link, the input-to-token will create a
Link_Token that contains the linked URI, a title and the link text. The
token-to-output will need all that information, so I'm casting the
Token'Class back into a Link_Token. But what if the client mismatched
the callbacks and it got a String_Token instead? That raises a run-time
exception, while the information is already there at compile-time.
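
A small self-contained illustration of that failure mode, again with
hypothetical names (Has_Title stands in for the real link fields): the
conversion from Token'Class to Link_Token carries a tag check, so a
mismatched pair compiles fine and only fails when it is run.

```ada
with Ada.Text_IO;

procedure Mismatch_Sketch is

   type Token is tagged null record;

   type Link_Token is new Token with record
      Has_Title : Boolean := False;  --  placeholder for URI, title, text
   end record;

   type String_Token is new Token with null record;

   procedure Render_Link (Item : Token'Class) is
      --  Conversion to a specific type: the tag of Item is checked here,
      --  and Constraint_Error is raised when it is not a Link_Token.
      Link : constant Link_Token := Link_Token (Item);
   begin
      Ada.Text_IO.Put_Line
        ("link, titled: " & Boolean'Image (Link.Has_Title));
   end Render_Link;

   Wrong : String_Token;

begin
   --  Accepted by the compiler, since any Token'Class value is allowed.
   Render_Link (Wrong);
exception
   when Constraint_Error =>
      Ada.Text_IO.Put_Line ("mismatch detected only at run time");
end Mismatch_Sketch;
```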

Well in theory an input-to-token callback could create different token
types (e.g. when the link title is optional, instead of putting an
empty string in a Link_Token, have a Titled_Link_Token and a
titleless Link_Token). But I would gladly force each input-to-token
callback to create a single token type if that can be statically checked
against the expected token-to-output type.
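
One possible way to get that static check, offered as a sketch under
assumptions rather than as what the library does: pair the two callbacks
through a generic whose formal parameters share a single token type, so
that a mismatched pair is rejected at instantiation. The names
Lex_And_Render, Lex_Link and so on are invented for the example, and it
again needs Ada 2012 for the "in out" function parameter.

```ada
with Ada.Text_IO;

procedure Pairing_Sketch is

   type Token is tagged null record;

   type Link_Token is new Token with record
      Has_Title : Boolean := False;  --  placeholder for the real fields
   end record;

   type String_Token is new Token with null record;

   --  Both callbacks must agree on Specific_Token, or the
   --  instantiation does not compile.
   generic
      type Specific_Token is new Token with private;
      with function Lex
        (Input : String; Position : in out Positive) return Specific_Token;
      with procedure Render (Item : Specific_Token);
   procedure Lex_And_Render (Input : String; Position : in out Positive);

   procedure Lex_And_Render (Input : String; Position : in out Positive) is
   begin
      Render (Lex (Input, Position));  --  no conversion, no tag check
   end Lex_And_Render;

   function Lex_Link
     (Input : String; Position : in out Positive) return Link_Token is
   begin
      Position := Position + 1;  --  pretend one character was consumed
      return (Has_Title => Input'Length > 1);
   end Lex_Link;

   procedure Render_Link (Item : Link_Token) is
   begin
      Ada.Text_IO.Put_Line
        ("link, titled: " & Boolean'Image (Item.Has_Title));
   end Render_Link;

   procedure Render_String (Item : String_Token) is
   begin
      Ada.Text_IO.Put_Line ("plain string");
   end Render_String;

   procedure Link_Step is
     new Lex_And_Render (Link_Token, Lex_Link, Render_Link);

   --  Rejected at compile time, because Render_String does not
   --  take a Link_Token:
   --  procedure Bad_Step is
   --    new Lex_And_Render (Link_Token, Lex_Link, Render_String);

   Position : Positive := 1;

begin
   Link_Step ("[x](y)", Position);
end Pairing_Sketch;
```

The cost is that the engine would then store one access-to-procedure per
instance instead of two separate callbacks, so the token never crosses
the engine as a Token'Class value.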

Anyway, back to the original point, the library will also provide as
examples standard Markdown, various Markdown extensions, Creole and
Textile input-to-token callback sets, and some output-to-token callback
sets as well (I can't think of anything other than HTML and XHTML right
now). The whole token layer is meant to make them independent, so it
would be counterproductive to make one class for each combination only
to remove two accesses-to-subprogram.

>> My wild guess is that tagged types would need two dereferences while
>> access to subprogram only one,
> With an access to a subprogram, there is a reference to an address and an  
> address dereference; with a tagged type, there is a reference to an  
> instance and a selector (is that the good word? I'm not sure). I see two  
> for both.

Maybe I'm too tainted with C++ implementation details, but what I
imagined was the access-to-subprogram containing the address of the
subprogram, so that's one dereference, while a tagged
type instance would contain a reference to a dispatch table (shared by
all instances) which itself contains a reference to the actual code.

It feels like a waste of resources to have a one-entry dispatch table,
but I'm not sure dealing with it optimally is really worth the extra
complexity in compilers.
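
For reference, the two variants being compared look like this at the
source level. The names are invented, and the comments on the number of
indirections describe a common implementation strategy that I am
assuming, not something the language mandates.

```ada
with Ada.Text_IO;

procedure Dispatch_Sketch is

   --  Variant 1: a bare access-to-subprogram.
   type Callback is access procedure (Name : String);

   procedure Print (Name : String) is
   begin
      Ada.Text_IO.Put_Line ("callback: " & Name);
   end Print;

   --  Variant 2: a tagged type with a single dispatching operation.
   package Handlers is
      type Handler is abstract tagged null record;
      procedure Handle (Self : Handler; Name : String) is abstract;

      type Printer is new Handler with null record;
      overriding procedure Handle (Self : Printer; Name : String);
   end Handlers;

   package body Handlers is
      procedure Handle (Self : Printer; Name : String) is
      begin
         Ada.Text_IO.Put_Line ("handler: " & Name);
      end Handle;
   end Handlers;

   procedure Call_Both (C : Callback; H : Handlers.Handler'Class) is
   begin
      --  Assumed cost: one indirect call through the address held in C.
      C ("access-to-subprogram");

      --  Assumed cost: read the tag stored in H, fetch the slot from
      --  the dispatch table it designates, then make the indirect call.
      H.Handle ("tagged type");
   end Call_Both;

   P : Handlers.Printer;

begin
   Call_Both (Print'Access, P);
end Dispatch_Sketch;
```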


Natasha