From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level: 
X-Spam-Status: No, score=-0.9 required=5.0 tests=BAYES_00,FORGED_GMAIL_RCVD,
	FREEMAIL_FROM autolearn=no autolearn_force=no version=3.4.4
X-Google-Thread: 103376,24d7acf9b853aac8
X-Google-NewGroupId: yes
X-Google-Attributes: gida07f3367d7,domainid0,public,usenet
X-Google-Language: ENGLISH,ASCII
Path: 
 g2news1.google.com!postnews.google.com!v15g2000yqe.googlegroups.com!not-for-mail
From: Natacha Kerensikova <lithiumcat@gmail.com>
Newsgroups: comp.lang.ada
Subject: Re: S-expression I/O in Ada
Date: Tue, 10 Aug 2010 01:56:22 -0700 (PDT)
Organization: http://groups.google.com
Message-ID: 
 <9db37b80-acbb-4c9f-a646-34f108f52985@v15g2000yqe.googlegroups.com>
References: <547afa6b-731e-475f-a7f2-eaefefb25861@k8g2000prh.googlegroups.com>
	<8ae8e899-9eef-4c8c-982e-bfdfc10072f1@h17g2000pri.googlegroups.com>
	<258zlxrv4fn6.1vszho1rtmf48$.dlg@40tude.net>
 <984db477-973c-4a66-9bf6-e5348c9b95f2@n19g2000prf.googlegroups.com>
	<osztlhozsld6.cnzz5m4w13ts.dlg@40tude.net>
 <bd4cba52-3e2c-4160-89bd-1f460271bcf9@5g2000yqz.googlegroups.com>
	<46866b8yq8nn$.151lqiwa0y2k6.dlg@40tude.net>
 <13b07f2c-2f35-43e0-83c5-1b572c65d323@y11g2000yqm.googlegroups.com>
	<13tpf7ya3evig$.h05p3x08059s$.dlg@40tude.net>
 <ee8ec904-60c1-405d-925f-89eba34a44d0@l20g2000yqm.googlegroups.com>
	<1lhdkikeh2sif.bd3pon3knbv8.dlg@40tude.net>
 <7027f0c6-d909-428c-ab8d-6ba1bd7ff4b2@x21g2000yqa.googlegroups.com>
	<1424bzz54867w.soj1iq72wkwl$.dlg@40tude.net>
 <be8d2cc5-c4b9-4e8b-908f-1c565198f11e@s9g2000yqd.googlegroups.com>
	<drsq220hoiki.ps12jg4zocj5.dlg@40tude.net>
NNTP-Posting-Host: 178.83.214.115
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
X-Trace: posting.google.com 1281430582 5712 127.0.0.1 (10 Aug 2010 08:56:22
 GMT)
X-Complaints-To: groups-abuse@google.com
NNTP-Posting-Date: Tue, 10 Aug 2010 08:56:22 +0000 (UTC)
Complaints-To: groups-abuse@google.com
Injection-Info: v15g2000yqe.googlegroups.com; posting-host=178.83.214.115;
	posting-account=aMKgaAoAAAAoW4eaAiNFNP4PjiOifrN6
User-Agent: G2/1.0
X-HTTP-UserAgent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.9.2.3)
	Gecko/20100524 Firefox/3.6.3,gzip(gfe)
Xref: g2news1.google.com comp.lang.ada:13041
Date: 2010-08-10T01:56:22-07:00
List-Id: <comp.lang.ada>

On Aug 9, 12:56 pm, "Dmitry A. Kazakov" <mail...@dmitry-kazakov.de>
wrote:
> On Mon, 9 Aug 2010 02:55:03 -0700 (PDT), Natacha Kerensikova wrote:
> > On Aug 8, 5:15 pm, "Dmitry A. Kazakov" <mail...@dmitry-kazakov.de>
> > wrote:
> >> I see. This confuses things even more. Why should I represent anything as a
> >> byte sequence? It already is, and in 90% cases I just don't care how the
> >> compiler does that. Why to convert byte sequences into other sequences and
> >> then into a text file. It just does not make sense to me. Any conversion
> >> must be accompanied by moving the thing from one medium to another.
> >> Otherwise it is wasting.
>
> > Representing something as a byte sequence is serialization (at least,
> > according to my (perhaps wrong) definition of serialization).
>
> "Byte" here is a RAM unit or a disk file item? I meant the former. All
> objects are already sequences of bytes in the RAM.

I have to admit I tend to confuse these two kinds of bytes. Given my
C-bitching background, I call "byte" what C calls char, i.e. the
smallest independently addressable memory unit from the language's
point of view. I'm well aware that modern i386 machines have a memory
bus much wider than 8 bits and transfer whole chunks at a time, but
the CPU follows the "as-if" rule, so it looks as if 8-bit units were
addressed independently, hence 8-bit chars on these platforms. My
confusion is that I often take for granted that disk files can be
addressed the same way, while I know perfectly well it's not always
the case (e.g. the DS has a 16-bit GBA bus, and 8-bit operations have
to load the whole 16-bit chunk, just like operating on 4-bit nibbles
on i386).

It must be because I've never thought of disk I/O as unbuffered, and
the buffer does follow RAM rules of addressing.

> > S-expressions are not a format on top or below that, it's a format
> > *besides* that, at the same level. Objects are serialized into byte
> > sequences forming S-expression atoms, and relations between objects/
> > atoms are serialized by the S-expression format. This is how one get
> > the canonical representation of a S-expression.
>
> I thought you wanted to represent *objects* ... as S-sequences?

It depends what you call object. Here again, my vocabulary might have
been tainted by the C Standard. Take for example a record: I would call
each component an object, as well as the whole record itself. I guess
I would call "object" an area of memory meaning something for the
compiler (e.g. a byte in some multibyte internal representation of
anything is not an object, while the byte sequence in RAM used by the
compiler to know what an access variable accesses is an object).

A S-expression atom must be built from something, so from this
definition it follows that whatever is turned into a S-expression atom
is an object (well, one could make an atom from multiple objects, but I
don't think it's a good idea). So there are objects meant to end up in
a single S-expression atom.

There are indeed objects that I want to represent as more elaborate S-
expressions. For example I wouldn't store a record into a single atom,
I'd rather define a S-expression specification to represent it and
store only simple components into atoms. But that's purely a personal
choice, motivated by the preference for human-readable disk files.
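
To make this concrete, here is a minimal sketch of what I mean. The
Connection record, its fixed-length Host component and the helper
names are hypothetical, made up for illustration around the
"tcp-connect" example from this thread. It uses plain String for atom
contents and the canonical form (decimal length, colon, raw bytes):

```ada
with Ada.Strings.Fixed;
with Ada.Text_IO;

procedure Record_To_Sexp is
   --  Hypothetical record, loosely based on the "tcp-connect" example.
   type Connection is record
      Host : String (1 .. 9);
      Port : Natural;
   end record;

   --  Decimal image without the leading blank that 'Image adds.
   function Image (N : Natural) return String is
   begin
      return Ada.Strings.Fixed.Trim (Natural'Image (N), Ada.Strings.Both);
   end Image;

   --  Canonical atom: decimal length, colon, then the raw content.
   function Atom (Data : String) return String is
   begin
      return Image (Data'Length) & ':' & Data;
   end Atom;

   --  One simple component per atom; the nesting of lists carries the
   --  relations between the components.
   function To_Sexp (C : Connection) return String is
   begin
      return "(" & Atom ("tcp-connect")
        & "(" & Atom ("host") & Atom (C.Host) & ")"
        & "(" & Atom ("port") & Atom (Image (C.Port)) & ")"
        & ")";
   end To_Sexp;

   Example : constant Connection := (Host => "localhost", Port => 80);
begin
   --  Self-check of the canonical encoding of the example value.
   if To_Sexp (Example)
     /= "(11:tcp-connect(4:host9:localhost)(4:port2:80))"
   then
      raise Program_Error;
   end if;

   Ada.Text_IO.Put_Line (To_Sexp (Example));
end Record_To_Sexp;
```

Only the simple components (the host name and the port number) end up
in atoms; the record itself is represented by the list structure.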

> Where these representations are supposed to live? In the RAM? Why are you
> then talking about text files, configurations and humans reading something?
> I cannot remember last time I read memory dump...

Ultimately these representations are supposed to be stored on disk or
transferred over a network or things like that, hence the need to be
built from serialized representations of objects.

However there is processing I might want to perform that requires them
to be buffered in memory. So they temporarily live in RAM until they are
stored or sent.

Text files, configurations and humans reading something are details of
some applications of S-expressions, that might or might not be
relevant in a discussion about the deep nature of what a S-expression
is. But they have to be taken into consideration when a S-expression
library is written, lest the library turn out unsuitable for these
applications.

> >> The difference is that Character represents code points and octet does
> >> atomic arrays of 8 bits.
>
> > Considering Ada's Character also spans over 8 bits (256-element
> > enumeration), both are equivalent, right?
>
> Equivalent defiled as? In terms of types they are not, because the types
> are different. In terms of the relation "=" they are not either, because
> "=" is not defined on the tuple Character x Unsigned_8 (or whatever).

Sorry, I meant "equivalent" in the mathematical sense that there is a
bijection between the set of Characters and the set of Octets, which
allows either of them to be used to represent the other. Agreed, this
is a very weak equivalence: it just means there are exactly as many
octet values as Character values.

On the other hand, Storage_Element and Character are not bijection-
equivalent because there is no guarantee they will always have the
same number of values, even though they often do.
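
As a rough illustration of that bijection, here is a small sketch. It
assumes Interfaces.Unsigned_8 as the octet type; the names Octet,
To_Octet and To_Character are mine, not from any existing library:

```ada
with Ada.Text_IO;
with Interfaces;
with System.Storage_Elements;

procedure Octet_Bijection is
   use type Interfaces.Unsigned_8;

   subtype Octet is Interfaces.Unsigned_8;

   --  Character -> Octet, by position in the 256-value enumeration.
   function To_Octet (C : Character) return Octet is
   begin
      return Octet (Character'Pos (C));
   end To_Octet;

   --  Octet -> Character, the inverse mapping.
   function To_Character (O : Octet) return Character is
   begin
      return Character'Val (Natural (O));
   end To_Character;
begin
   --  Round-trip every Character value: the mapping loses nothing.
   for C in Character loop
      if To_Character (To_Octet (C)) /= C then
         raise Program_Error;
      end if;
   end loop;

   if To_Octet ('A') /= 65 then
      raise Program_Error;
   end if;

   --  Storage_Element is implementation-defined, so its number of
   --  values can only be reported, not assumed to be 256.
   Ada.Text_IO.Put_Line
     ("Storage_Element values:"
      & Integer'Image
          (Integer (System.Storage_Elements.Storage_Element'Modulus)));
end Octet_Bijection;
```

The explicit conversions are needed precisely because the types are
different; the bijection is only about the sets of values.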

> > Actually I've started to wonder whether Stream_Element might even more
> > appropriated: considering a S-expression atom is the serialization of
> > an object, and I guess objects which know how to serialize themselves
> > do so using the Stream subsystem, so maybe I could more easily
> > leverage existing serialization codes if I use Stream_Element_Array
> > for atoms.
>
> Note that Stream_Element is machine-depended as well.

I'm sadly aware of that. I need an octet-sequence to follow the S-
expression standard, and there is here an implementation trade-off:
assuming objects already know how to serialize themselves into a
Stream_Element_Array, I can either code a converter from
Stream_Element_Array to octet-sequence, or reinvent the wheel and code
a converter for each type directly into an octet-sequence. For some
strange reason I prefer by far the first possibility.
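
A minimal sketch of such a converter, under the loudly stated
assumption that a Stream_Element happens to be exactly one octet on
the platform at hand (the code refuses to run otherwise). Octet_Array
and To_Octets are hypothetical names:

```ada
with Ada.Streams;
with Interfaces;

procedure Stream_To_Octets is
   use Ada.Streams;
   use type Interfaces.Unsigned_8;

   subtype Octet is Interfaces.Unsigned_8;
   type Octet_Array is array (Positive range <>) of Octet;

   --  Element-by-element conversion. Assumption: Stream_Element has
   --  exactly 256 values here, which the language does not guarantee.
   function To_Octets (Data : Stream_Element_Array) return Octet_Array is
      Result : Octet_Array (1 .. Data'Length);
      Next   : Positive := 1;
   begin
      if Stream_Element'Modulus /= 256 then
         raise Program_Error with "Stream_Element is not an octet here";
      end if;

      for I in Data'Range loop
         Result (Next) := Octet (Data (I));
         Next := Next + 1;
      end loop;

      return Result;
   end To_Octets;

   Sample : constant Stream_Element_Array := (1 => 16#48#, 2 => 16#69#);
   Octets : constant Octet_Array := To_Octets (Sample);
begin
   --  Self-check: same length, same values, only the type changed.
   if Octets'Length /= 2
     or else Octets (1) /= 16#48#
     or else Octets (2) /= 16#69#
   then
      raise Program_Error;
   end if;
end Stream_To_Octets;
```

On a platform where Stream_Element is wider than 8 bits, the converter
would have to split each element into several octets instead.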

> > But then I don't know whether it's possible to have object
> > hand over a Stream_Element_Array representing themselves,
>
> This does not make sense to me, it is mixing abstractions:
> Stream_Element_Array is a representation of an object in a stream.
> Storage_Array might be a representation of in the memory. These are two
> different objects. You cannot consider them same, even if they shared
> physically same memory (but they do not). The whole purpose of
> serialization to a raw stream is conversion of a Storage_Array to
> Stream_Element_Array. Deserialization is a backward conversion.

I don't consider them the same, otherwise I wouldn't be pondering
about which one to use.

If it helps, you can think of S-expressions as a standardized way of
serializing some relations between objects. However the objects still
have to be serialized, and that's outside of the scope of S-
expressions. From what I understood, the existing serializations of
objects use Stream_Element_Array as a low-level type. So the natural
choice for serializing the relations seems to be taking the
Stream_Element_Array from each object, and handing over to the lower-
level I/O a unified Stream_Element_Array.
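
Here is a rough sketch of how I picture getting a Stream_Element_Array
out of an object, assuming a small in-memory stream type. Memory_Stream
and its 64-element capacity are hypothetical, not an existing library
type; only Root_Stream_Type and the 'Write attribute come from the
language:

```ada
with Ada.Streams;
with Ada.Text_IO;

procedure Atom_From_Stream is
   use Ada.Streams;

   package Buffers is
      --  Hypothetical fixed-capacity stream keeping what is written
      --  to it in memory.
      type Memory_Stream is new Root_Stream_Type with record
         Data : Stream_Element_Array (1 .. 64);
         Last : Stream_Element_Offset := 0;
      end record;

      overriding procedure Read
        (Stream : in out Memory_Stream;
         Item   : out Stream_Element_Array;
         Last   : out Stream_Element_Offset);

      overriding procedure Write
        (Stream : in out Memory_Stream;
         Item   : Stream_Element_Array);
   end Buffers;

   package body Buffers is
      procedure Read
        (Stream : in out Memory_Stream;
         Item   : out Stream_Element_Array;
         Last   : out Stream_Element_Offset) is
      begin
         --  Write-only sketch: reading is deliberately unsupported.
         raise Program_Error;
      end Read;

      procedure Write
        (Stream : in out Memory_Stream;
         Item   : Stream_Element_Array) is
      begin
         --  The slice assignment raises Constraint_Error on overflow.
         Stream.Data (Stream.Last + 1 .. Stream.Last + Item'Length) := Item;
         Stream.Last := Stream.Last + Item'Length;
      end Write;
   end Buffers;

   Buffer : aliased Buffers.Memory_Stream;
begin
   --  Let the language-defined attribute serialize the object.
   Natural'Write (Buffer'Access, 80);

   --  Buffer.Data (1 .. Buffer.Last) now holds the content of one atom.
   Ada.Text_IO.Put_Line
     ("Atom length in stream elements:"
      & Stream_Element_Offset'Image (Buffer.Last));
end Atom_From_Stream;
```

The resulting array depends on the implementation's representation of
Natural, so it is not portable as such; it only shows where the
Stream_Element_Array for an atom could come from.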

Does it make sense or am I completely missing something?

> The point is that you never meet 80 before knowing that this is a "port",
> you never meet "port" before knowing it is of "tcp-connect". You always
> know all types in advance. It is strictly top-down.

Right, in that simple example it is the case. It is even quite often the
case, hence my thinking about a Sexp_Stream in another post, which
would allow S-expression I/O without having more than a single node at
the same time in memory.

But there are still situations where S-expressions have to be stored
in memory. For example the templates, where S-expressions represent a
kind of limited programming language that is re-interpreted for each
template extension. The choice is either to store the S-expression in
memory or to re-read the file from disk (which doesn't work
efficiently when there is more than one template in a file).

> > Thanks for your patience with me,
>
> You are welcome. I think from the responses of the people here you see that
> the difference between Ada and C is much deeper than begin/end instead of
> curly brackets.

Of course, if there was no other difference I would have stayed in C
(I still prefer brackets rather than words).

> Ada does not let you through without a clear concept of
> what are you going to do. Surely with some proficiency one could write
> classical C programs in Ada, messing everything up. You could even create
> buffer overflow in Ada. But it is difficult for a beginner...

I'm not here to C-ify Ada. C is undoubtedly the best language to do C-
styled stuff.

I felt a mismatch between the nature of C and my way of programming. I
got interested in Ada (rather than in any language in the plethora of
existing languages out there) because it seemed to have what I lacked
in C while still having what I like in C. The former being roughly the
emphasis on robustness, readability and quality; the latter being
roughly the low-level (as in "close to the hardware", at least close
enough not to rule out programming for embedded platforms and system
programming) and the performance. The previous sentence lacks a lot of
irrational elements that I can't manage to put into words and that can
be summarized as "personal taste".

Now I can't explain why your posts often make me feel Ada is
completely out of my tastes in programming languages, while other
people's posts often make me feel Ada is probably an improvement over
C in terms of personal preferences. However I have the feeling I got
much more personal improvement from your posts (improvement still
meaningful even if I eventually drop Ada to come back to C), and I'm
grateful for that.


Thanks for your help,
Natacha