From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level: 
X-Spam-Status: No, score=-0.9 required=5.0 tests=BAYES_00,FORGED_GMAIL_RCVD,
	FREEMAIL_FROM autolearn=no autolearn_force=no version=3.4.4
X-Google-Thread: 103376,24d7acf9b853aac8
X-Google-NewGroupId: yes
X-Google-Attributes: gida07f3367d7,domainid0,public,usenet
X-Google-Language: ENGLISH,ASCII
Path: 
 g2news1.google.com!postnews.google.com!s9g2000yqd.googlegroups.com!not-for-mail
From: Natacha Kerensikova <lithiumcat@gmail.com>
Newsgroups: comp.lang.ada
Subject: Re: S-expression I/O in Ada
Date: Mon, 9 Aug 2010 02:55:03 -0700 (PDT)
Organization: http://groups.google.com
Message-ID: <be8d2cc5-c4b9-4e8b-908f-1c565198f11e@s9g2000yqd.googlegroups.com>
References: <547afa6b-731e-475f-a7f2-eaefefb25861@k8g2000prh.googlegroups.com>
	<1qk2k63kzh7yv$.3jgc403xcqdw$.dlg@40tude.net>
 <8ae8e899-9eef-4c8c-982e-bfdfc10072f1@h17g2000pri.googlegroups.com>
	<258zlxrv4fn6.1vszho1rtmf48$.dlg@40tude.net>
 <984db477-973c-4a66-9bf6-e5348c9b95f2@n19g2000prf.googlegroups.com>
	<osztlhozsld6.cnzz5m4w13ts.dlg@40tude.net>
 <bd4cba52-3e2c-4160-89bd-1f460271bcf9@5g2000yqz.googlegroups.com>
	<46866b8yq8nn$.151lqiwa0y2k6.dlg@40tude.net>
 <13b07f2c-2f35-43e0-83c5-1b572c65d323@y11g2000yqm.googlegroups.com>
	<13tpf7ya3evig$.h05p3x08059s$.dlg@40tude.net>
 <ee8ec904-60c1-405d-925f-89eba34a44d0@l20g2000yqm.googlegroups.com>
	<1lhdkikeh2sif.bd3pon3knbv8.dlg@40tude.net>
 <7027f0c6-d909-428c-ab8d-6ba1bd7ff4b2@x21g2000yqa.googlegroups.com>
	<1424bzz54867w.soj1iq72wkwl$.dlg@40tude.net>
NNTP-Posting-Host: 95.152.65.220
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
X-Trace: posting.google.com 1281347703 32567 127.0.0.1 (9 Aug 2010 09:55:03
 GMT)
X-Complaints-To: groups-abuse@google.com
NNTP-Posting-Date: Mon, 9 Aug 2010 09:55:03 +0000 (UTC)
Complaints-To: groups-abuse@google.com
Injection-Info: s9g2000yqd.googlegroups.com; posting-host=95.152.65.220;
	posting-account=aMKgaAoAAAAoW4eaAiNFNP4PjiOifrN6
User-Agent: G2/1.0
X-HTTP-UserAgent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.9.2.3)
	Gecko/20100524 Firefox/3.6.3,gzip(gfe)
Xref: g2news1.google.com comp.lang.ada:12977
Date: 2010-08-09T02:55:03-07:00
List-Id: <comp.lang.ada>

On Aug 8, 5:15 pm, "Dmitry A. Kazakov" <mail...@dmitry-kazakov.de>
wrote:
> On Sun, 8 Aug 2010 06:49:09 -0700 (PDT), Natacha Kerensikova wrote:
> > On Aug 8, 3:01 pm, "Dmitry A. Kazakov" <mail...@dmitry-kazakov.de>
> >> No I do. But you have defined it as a text file. A streamed text file
> >> is a sequence of Character items.
>
> > Actually, I didn't. I only defined it as a bunch of byte sequences
> > organized in a certain way.
>
> I see. This confuses things even more. Why should I represent anything as a
> byte sequence? It already is, and in 90% cases I just don't care how the
> compiler does that. Why to convert byte sequences into other sequences and
> then into a text file. It just does not make sense to me. Any conversion
> must be accompanied by moving the thing from one medium to another.
> Otherwise it is wasting.

Representing something as a byte sequence is serialization (at least
according to my, perhaps wrong, definition of serialization). Actually
no byte sequence is converted into another sequence and then into a
text file. The only conversion is from the in-memory representation
(which happens to be a byte sequence too, though maybe not contiguous,
or context-dependent, or whatever; that's beside the point) into a
serialized byte sequence.

S-expressions are not a format on top of or below that; they are a
format *beside* that, at the same level. Objects are serialized into
byte sequences forming S-expression atoms, and relations between
objects/atoms are serialized by the S-expression format. This is how
one gets the canonical representation of an S-expression.
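
To make that concrete, here is a minimal sketch of the canonical
encoding as I understand it from Rivest's draft: each atom is written
as its length in decimal, a colon, then the raw bytes, and a list is
just parentheses around its elements, with no whitespace. It is in
Python rather than Ada, only for illustration. The name to_canonical
is made up, and representing atoms as bytes and lists as Python lists
is my own assumption, not anything taken from an existing library.

```python
def to_canonical(node):
    """Encode a nested structure as a canonical S-expression.

    Assumption for this sketch: an atom is a `bytes` value (an already
    serialized object) and a list is a Python `list` of atoms or lists.
    """
    if isinstance(node, bytes):
        # Atom: decimal length, a colon, then the bytes verbatim.
        return str(len(node)).encode("ascii") + b":" + node
    if isinstance(node, list):
        # List: children encoded back to back between parentheses.
        return b"(" + b"".join(to_canonical(child) for child in node) + b")"
    raise TypeError("expected bytes or list, got " + type(node).__name__)


if __name__ == "__main__":
    expr = [b"tcp-connect", [b"host", b"foo.example"], [b"port", b"80"]]
    print(to_canonical(expr).decode("ascii"))
```

With this sketch, (tcp-connect (host foo.example) (port 80)) comes out
as (11:tcp-connect(4:host11:foo.example)(4:port2:80)). The atoms stay
opaque byte sequences; only the relations between them are expressed
by the S-expression layer.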

Now depending on the situation one might want additional constraints on
the representation, for example human-readability or being text-based,
and the S-expression standard defines non-canonical representations
for such situations.

> > I know very well these differences, except octet vs character,
> > especially considering Ada's definition of a Character. Or is it only
> > that Character is an enumeration while octet is a modular integer?
>
> The difference is that Character represents code points and octet does
> atomic arrays of 8 bits.

Considering Ada's Character also spans 8 bits (256-element
enumeration), both are equivalent, right? The only difference is the
intent and the meaning of values, right? (unlike byte vs octet, where
quantitative differences might exist on some platforms).

> > This leads to a question I had in mind since quite early in the
> > thread, should I really use an array of Storage_Element, while S-
> > expression standard considers only sequences of octets?
>
> That depends on what are you going to do. Storage_Element is a
> machine-dependent addressable memory unit. Octet is a machine independent
> presentation layer unit, a thing of 256 independent states. Yes
> incidentally Character has 256 code points.

Actually I've started to wonder whether Stream_Element might be even
more appropriate: an S-expression atom is the serialization of an
object, and I guess objects that know how to serialize themselves do
so using the Stream subsystem, so maybe I could more easily leverage
existing serialization code if I use Stream_Element_Array for atoms.
But then I don't know whether it's possible to have objects hand over
a Stream_Element_Array representing themselves, and I don't know
either how to deal with cases where Stream_Element is not an octet.

> >> Once you matched "tcp-connect", you know all the types of the following
> >> components.
>
> > Unfortunately, you know "80" is a 16-bit integer only after having
> > matched "port".
>
> Nope, we certainly know that each TCP connection needs a port. There is
> nothing to resolve since the notation is not reverse. Parse it top down, it
> is simple, it is safe, it allows excellent diagnostics, it works.

Consider:
(tcp-connect (host foo.example) (port 80))
and:
(tcp-connect (port 80) (host foo.example))

Both of these are semantically equivalent, but to know which of the
tail atoms is a 16-bit integer and which is a string, you first have
to match the "port" and "host" head atoms.
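
Here is a minimal sketch of what I mean, in Python only for
illustration, and assuming the S-expression has already been read into
nested lists of strings. The names parse_tcp_connect, parse_port and
CONVERTERS are hypothetical ones I made up for this post. The point is
only that the type of each tail atom is looked up from its head atom,
so the order of the sub-lists does not matter.

```python
def parse_port(text):
    """Interpret an atom as a 16-bit unsigned integer."""
    port = int(text)
    if not 0 <= port <= 0xFFFF:
        raise ValueError("port out of 16-bit range: " + text)
    return port


# Hypothetical table: the head atom selects how the tail atom is interpreted.
CONVERTERS = {"host": str, "port": parse_port}


def parse_tcp_connect(expr):
    """Turn ["tcp-connect", ["host", ...], ["port", ...]] into a dict."""
    head, *fields = expr
    if head != "tcp-connect":
        raise ValueError("unexpected head atom: " + head)
    result = {}
    for name, value in fields:
        if name not in CONVERTERS:
            raise ValueError("unknown field: " + name)
        # Only after matching the head atom do we know the tail atom's type.
        result[name] = CONVERTERS[name](value)
    return result


if __name__ == "__main__":
    a = parse_tcp_connect(["tcp-connect", ["host", "foo.example"], ["port", "80"]])
    b = parse_tcp_connect(["tcp-connect", ["port", "80"], ["host", "foo.example"]])
    print(a == b)
```

Both orderings give the same result here, but in neither case is "80"
known to be an integer before "port" has been matched.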

Or am I misunderstanding your point?

> >>> This is not always the
> >>> case, for example it might be necessary to build an associative array
> >>> from a list of list before being able to know the type of non-head
> >>> atoms,
>
> >> What for? Even if such cases might be invented, I see no reason to do that.
> >> It is difficult to parse, it is difficult to read. So why to mess with?
>
> > For example, you might have a sub-S-expression describing a seldom
> > used object that is expensive to build, wouldn't you want to be sure
> > you actually need it before building it?
>
> See above, if you parse top down, you know if you need that object before
> begin. Then having a bracketed structure, it is trivial to skip the
> object's description without construction. Just count brackets.

Well, in that example I was considering something outside the
S-expression selecting which object to use. For example, a database
containing thousands of templates or whatever, and a user selection
picking only one of them.
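
Roughly, I have in mind something like the following sketch (again
Python, only for illustration). TemplateStore and its build callback
are hypothetical names, and keeping each description as an unparsed
string is my assumption about how such a database might be held in
memory. Nothing expensive is built until a template is actually
selected.

```python
class TemplateStore:
    """Keep template descriptions unparsed; build an object only on demand."""

    def __init__(self, descriptions, build):
        # name -> raw, still unparsed sub-S-expression
        self._descriptions = dict(descriptions)
        # callable doing the expensive construction from a raw description
        self._build = build
        # name -> already constructed object
        self._built = {}

    def get(self, name):
        """Return the object for `name`, constructing it on first use."""
        if name not in self._built:
            self._built[name] = self._build(self._descriptions[name])
        return self._built[name]


if __name__ == "__main__":
    store = TemplateStore(
        {"a": "(template a)", "b": "(template b)"},
        build=lambda raw: raw.upper(),  # stand-in for an expensive constructor
    )
    print(store.get("b"))
```

Which template gets built is decided by the caller of get, that is, by
something outside the S-expression itself.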


Thanks for your patience with me,
Natacha