From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level: 
X-Spam-Status: No, score=-0.9 required=5.0 tests=BAYES_00,FORGED_GMAIL_RCVD,
	FREEMAIL_FROM autolearn=no autolearn_force=no version=3.4.4
X-Google-Thread: 103376,24d7acf9b853aac8
X-Google-NewGroupId: yes
X-Google-Attributes: gida07f3367d7,domainid0,public,usenet
X-Google-Language: ENGLISH,ASCII
Path: 
 g2news1.google.com!news2.google.com!postnews.google.com!f42g2000yqn.googlegroups.com!not-for-mail
From: Natacha Kerensikova <lithiumcat@gmail.com>
Newsgroups: comp.lang.ada
Subject: Re: S-expression I/O in Ada
Date: Wed, 11 Aug 2010 02:43:58 -0700 (PDT)
Organization: http://groups.google.com
Message-ID: 
 <d2b60b2d-4170-4769-a7ff-529335a2c963@f42g2000yqn.googlegroups.com>
References: <547afa6b-731e-475f-a7f2-eaefefb25861@k8g2000prh.googlegroups.com>
	<bd4cba52-3e2c-4160-89bd-1f460271bcf9@5g2000yqz.googlegroups.com>
	<46866b8yq8nn$.151lqiwa0y2k6.dlg@40tude.net>
 <13b07f2c-2f35-43e0-83c5-1b572c65d323@y11g2000yqm.googlegroups.com>
	<13tpf7ya3evig$.h05p3x08059s$.dlg@40tude.net>
 <ee8ec904-60c1-405d-925f-89eba34a44d0@l20g2000yqm.googlegroups.com>
	<1lhdkikeh2sif.bd3pon3knbv8.dlg@40tude.net>
 <7027f0c6-d909-428c-ab8d-6ba1bd7ff4b2@x21g2000yqa.googlegroups.com>
	<1424bzz54867w.soj1iq72wkwl$.dlg@40tude.net>
 <be8d2cc5-c4b9-4e8b-908f-1c565198f11e@s9g2000yqd.googlegroups.com>
	<drsq220hoiki.ps12jg4zocj5.dlg@40tude.net>
 <9db37b80-acbb-4c9f-a646-34f108f52985@v15g2000yqe.googlegroups.com>
	<16xmnn0qe5yog.ii1p0ap9yuth$.dlg@40tude.net>
 <5d1d705a-008a-43f1-aa19-9b4878ec926b@m1g2000yqo.googlegroups.com>
	<mhclav6yb3gy.1dk04ags83k42$.dlg@40tude.net>
NNTP-Posting-Host: 178.83.214.115
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
X-Trace: posting.google.com 1281519838 30249 127.0.0.1 (11 Aug 2010 09:43:58
 GMT)
X-Complaints-To: groups-abuse@google.com
NNTP-Posting-Date: Wed, 11 Aug 2010 09:43:58 +0000 (UTC)
Complaints-To: groups-abuse@google.com
Injection-Info: f42g2000yqn.googlegroups.com; posting-host=178.83.214.115;
	posting-account=aMKgaAoAAAAoW4eaAiNFNP4PjiOifrN6
User-Agent: G2/1.0
X-HTTP-UserAgent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.9.2.3)
	Gecko/20100524 Firefox/3.6.3,gzip(gfe)
Xref: g2news1.google.com comp.lang.ada:13111
Date: 2010-08-11T02:43:58-07:00
List-Id: <comp.lang.ada>

On Aug 10, 5:46 pm, "Dmitry A. Kazakov" <mail...@dmitry-kazakov.de>
wrote:
> On Tue, 10 Aug 2010 05:06:29 -0700 (PDT), Natacha Kerensikova wrote:
> > The first object is the internal memory representation designed for
> > actual efficient use. For example, an integer will probably be
> > represented by its binary value with machine-defined endianness and
> > machine-defined size.
>
> > The other object is a "serialized" representation, in the sense that
> > it's designed for communication and storage, for example the same
> > integer, in a context where it will be sent over a network, can be for
> > example represented as an ASCII-encoded decimal number, or in binary
> > but with a predefined size and endianness.
>
> Why don't you send it at once?

As I said, I can't just insert the raw object in the stream, I need at
least to know its size. I might need further inspection of the
serialized representation in case I want a "smart" choice of atom
encoding, but I'm afraid it wasn't a good idea to mention that point
because it doesn't fit into the simplified concept of S-expression I
have been discussing with you for quite some posts.

However my proposed Sexp_Stream does send it as soon as it gets the
whole representation, and this idea comes from our discussion.

> > This is really the same
> > considerations as when storing or sending an object directly, except
> > that is has to reside in memory for a short time. There is no more
> > conversions or representations than when S-expression-lessly storing
> > or sending objects; the only difference is the memory buffering to
> > allow S-expression-specific information to be inserted around the
> > stream.
>
> This is impossible in general case, so the question why. As an example
> consider a stateful communication protocol (existing in real life) which is
> reacts only on changes. When you send your integer nothing happens because
> the device reacts only when the bit pattern changes. So if you wanted to
> really send it to another side you have to change something in the
> representation of integer, e.g. to toggle some extra bit.

Well obviously S-expressions aren't designed to be transmitted over
such a protocol. The basic assumption behind S-expressions is that we can
transmit/store/whatever octet sequences and receive/retrieve/whatever
them intact. When the assumption doesn't hold, either something must
be done to make it true (e.g. add another layer) or S-expressions must
be abandoned.

For example, S-expressions small enough to fit in one packet can be
easily transferred over UDP. S-expression parsers (or at least mine)
handle fragmented data well (even when unevenly fragmented) but fail
when data is missing or mis-ordered, which prevents large S-expressions
from simply being spread over as many packets as needed. One might
solve this issue by adding a sequence number inside the UDP payload,
along with a mechanism to re-send lost packets; however, that would be
(at least partially) re-inventing TCP.
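
To make the sequence-number idea concrete, here is a minimal sketch in
Python (not Ada). The 4-byte header layout and the names make_packets
and Reorderer are assumptions made up for this illustration; the
re-sending of lost packets is deliberately left out.

```python
import struct

# Hypothetical header: a 4-byte big-endian sequence number before each chunk.
HEADER = struct.Struct("!I")

def make_packets(data: bytes, payload_size: int) -> list:
    """Split data into chunks, each prefixed with its sequence number."""
    return [
        HEADER.pack(seq) + data[offset:offset + payload_size]
        for seq, offset in enumerate(range(0, len(data), payload_size))
    ]

class Reorderer:
    """Release payloads in sequence order, buffering early arrivals."""

    def __init__(self):
        self.next_seq = 0   # sequence number expected next
        self.pending = {}   # payloads that arrived ahead of their turn

    def receive(self, packet: bytes) -> bytes:
        """Accept one packet; return the octets that are now in order."""
        (seq,) = HEADER.unpack_from(packet)
        if seq >= self.next_seq:          # ignore duplicates of old packets
            self.pending[seq] = packet[HEADER.size:]
        ready = bytearray()
        while self.next_seq in self.pending:
            ready += self.pending.pop(self.next_seq)
            self.next_seq += 1
        return bytes(ready)
```

Whatever receive returns can be fed to the S-expression parser as
ordinary fragmented data. A lost packet simply stalls the output, which
is exactly where the re-send mechanism (and the re-invention of TCP)
would have to begin.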

> >> What was the problem then?
>
> > The problem is to organize different objects inside a single file. S-
> > expression standardize the organization and relations between objects,
> > while something else has to be done beyond S-expression to turn
> > objects into representations suitable to live in a file.
>
> > [...]
>
> Yes, I don't see how S-expression might help there. They do not add value,
> because the work of serialization or equivalent to serialization is already
> done while construction of the expression object.

There are two things that are to be serialized: objects into atoms,
and relations between objects into S-expression-specific stuff. The S-
expression object is an unserialized in-memory representation of
relations between serialized representations of objects. The writing
of an S-expression into a stream is the final serialization-of-
relations stage.

> There are two questions to discuss:
>
> 1. The external storage format: S-expressions vs. other
> 2. Construction of an intermediate S-expression object in the memory
>
> You justified #1 by an argument to legacy. You cannot [re-]use that
> argument for #2. (OK, since Ludovic had already done it, you could (:-))

I don't re-use that argument. And actually, if you followed the
description of my Sexp_Stream, I don't need an S-expression object in
memory, I only need the serialized representation of atoms. The rest can
be sent directly into a stream.
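
As a rough illustration of that point, here is a sketch in Python (not
Ada, and not the actual Sexp_Stream interface). The class name
SexpWriter and its methods are hypothetical. Only each atom's octets
exist in memory; the list structure goes straight to the stream.

```python
import io

class SexpWriter:
    """Write canonical S-expressions to a binary stream without building
    a tree: only one already-serialized atom is held at a time."""

    def __init__(self, stream):
        self.stream = stream
        self.depth = 0          # number of lists currently open

    def open_list(self) -> None:
        self.stream.write(b"(")
        self.depth += 1

    def close_list(self) -> None:
        if self.depth == 0:
            raise ValueError("no open list to close")
        self.stream.write(b")")
        self.depth -= 1

    def atom(self, octets: bytes) -> None:
        # The whole atom must be known here, because its size comes first.
        self.stream.write(str(len(octets)).encode("ascii") + b":" + octets)
```

Calling open_list, atom(b"point"), open_list, atom(b"x"), atom(b"10"),
close_list, close_list on an io.BytesIO stream writes
(5:point(1:x2:10)) without ever holding the nested structure in memory.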

And while I occasionally feel the need for an in-memory S-expression
object, so far it has never been for writing or sending, it was always
for specific sub-S-expressions that are read or received. I believe
this need happens only when I have a variable of type S-expression,
which I consider to be as good a type as String or Natural. It is then
a data-structure choice, which happens at a higher level than
encoding, serialization, I/O or most of what we have discussed so far.

> >> Why do you need S-sequence in the memory, while dumping
> >> objects directly into files as S-sequences (if you insist on having them)
> >> is simpler, cleaner, thinner, faster.
>
> > Because I need to examine the S-sequence before writing it to disk, in
> > order to have enough information to write S-expression metadata. At
> > the very lest, I need to know the total size of the atom before
> > allowing its first byte to be send into the file.
>
> That does not look like a stream! But this is again about abstraction
> layers. Why do you care?

The "verbatim encoding" of an atom, which is the only one allowed in
canonical representation of an S-expression, is defined as follows: a
size, represented as the ASCII encoding of the decimal representation
of the number of octets in the atom, without leading zero (therefore
of variable length); followed by the ASCII character ':'; followed by
the octet sequence of the atom.

You can't write an atom using such a format when you don't know in
advance the number of octets in the atom.
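
A minimal sketch of that constraint, in Python rather than Ada. The
names encode_verbatim and write_atom are hypothetical; the second one
shows the short-lived memory buffer that is needed because the size has
to be written before the first octet of the atom.

```python
import io

def encode_verbatim(atom: bytes) -> bytes:
    """Return the verbatim encoding: decimal size, ':', then the octets."""
    return str(len(atom)).encode("ascii") + b":" + atom

def write_atom(out, chunks) -> None:
    """Buffer all chunks of one atom, then write it with its size prefix."""
    buffer = io.BytesIO()
    for chunk in chunks:        # the object serializes itself piece by piece
        buffer.write(chunk)
    # Only now is the total size known, so only now can anything be sent.
    out.write(encode_verbatim(buffer.getvalue()))
```

For instance, write_atom(out, [b"hel", b"lo"]) writes 5:hello to out,
even though the object produced its representation in two pieces.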

The idea behind S-expressions could be seen as the serialization of a
list of serialized objects. When serializing such a list one must be
able to distinguish between the different objects; to the best of my
knowledge this can only be done either by keeping track of object
sizes, or by using separators. To avoid restricting the possible
atom contents, the first solution has been chosen.
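
To show why size tracking leaves atom contents unrestricted, here is a
sketch of a reader in Python (not Ada). The name parse_canonical is
hypothetical, and it only understands the canonical form with verbatim
atoms, nothing else.

```python
def parse_canonical(data: bytes):
    """Parse one canonical S-expression.

    Atoms become bytes objects and lists become Python lists.
    """
    def parse(pos):
        if data[pos:pos + 1] == b"(":
            items = []
            pos += 1
            while data[pos:pos + 1] != b")":
                if pos >= len(data):
                    raise ValueError("unterminated list")
                item, pos = parse(pos)
                items.append(item)
            return items, pos + 1
        colon = data.find(b":", pos)
        if colon < 0 or not data[pos:colon].isdigit():
            raise ValueError("expected a size prefix at offset %d" % pos)
        size = int(data[pos:colon].decode("ascii"))
        end = colon + 1 + size
        if end > len(data):
            raise ValueError("truncated atom")
        # The size alone delimits the atom; its octets are never inspected.
        return data[colon + 1:end], end

    value, pos = parse(0)
    if pos != len(data):
        raise ValueError("trailing data after the expression")
    return value
```

Because the reader skips over each atom by size, an atom may itself
contain parentheses or colons: parsing (3:a)b2:(:) gives the two atoms
a)b and (: in one list. With separators, those octets would have needed
escaping or would have been forbidden.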

> > That sounds like a very nice way of doing it. So in the most common
> > case, there will still be a stream, provided by the platform-specific
> > socket facilities, which will accept an array-of-octets, and said
> > array would have to be created from objects by custom code, right?
>
> Yes, if TCP sockets is what you use. There is a hell of other protocols
> even on the Ethernet, some of which are not stream-oriented.

But you were talking about Octet'Read and Octet'Write. Aren't those
based on Ada streams?

> >> In other post Jeffrey Carter described this as low-level. Why not to tell
> >> the object: store yourself and all relations you need, I just don't care
> >> which and how?
>
> > That's indeed a higher-level question. That's how it will happen at
> > some point in my code; however at some other point I will still have
> > to actually implement said object storage, and that's when I will
> > really care about which and how. I'm aware from the very beginning
> > that a S-expression library is low-level and is only used by mid-level
> > objects before reaching the application.
>
> This is what caused the questions. Because if the problem is serialization,
> then S-expression does not look good.

Why? Because it's a partial serialization? Because it serializes stuff
you deem as useless? Because it's a way of serializing stuff you would
have serialized in another way? I still don't understand what is so
bad about S-expressions. While I understand gut-rejection of anything-
with-parentheses (including Lisp and S-expressions), you seem way
above that.

> >> You do not need S-expressions here either. You can
> >> store/restore templates as S-sequences. A template in the memory would be
> >> an object with some operations like Instantiate_With_Parameters etc. The
> >> result of instantiation will be again an object and no S-sequence.
>
> > Well how would solve the problem described above without S-
> > expressions? (That's a real question, if something simpler and/or more
> > efficient than my way of doing it exists, I'm genuinely interested.)
>
> The PPN, a simple stack machine. Push arguments onto the stack, pop to
> execute an operation. Push the results back. Repeat.

Does that allow pushing an operation and its arguments, to have it
executed by another operation? S-expressions do it naturally, and I
find it very useful in conditional or loop constructs.
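
To illustrate what I mean, here is a toy sketch in Python (not Ada),
with nested lists standing in for parsed S-expressions. The function
evaluate and the operations "if" and "concat" are invented for this
example; they are not my actual template code.

```python
def evaluate(expr, env):
    """Evaluate a nested-list expression against a dict of variables."""
    if not isinstance(expr, list):
        # An atom is a variable name if known, otherwise literal text.
        return env.get(expr, expr)
    op, *args = expr
    if op == "if":
        # The branches arrive unevaluated: "if" decides which one runs.
        condition, then_branch, else_branch = args
        chosen = then_branch if evaluate(condition, env) else else_branch
        return evaluate(chosen, env)
    if op == "concat":
        return "".join(evaluate(arg, env) for arg in args)
    raise ValueError("unknown operation: %r" % (op,))
```

With the template ["concat", "Hello ", ["if", "logged_in", "name",
"guest"]], the "if" operation receives its branches as data and only
evaluates the one it selects. That is the part I don't see how to get
from a stack machine, which executes each operation as it is popped.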


Thanks for the discussion,
Natacha