From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level: 
X-Spam-Status: No, score=-0.9 required=5.0 tests=BAYES_00,FORGED_GMAIL_RCVD,
	FREEMAIL_FROM autolearn=no autolearn_force=no version=3.4.4
X-Google-Thread: 103376,24d7acf9b853aac8
X-Google-NewGroupId: yes
X-Google-Attributes: gida07f3367d7,domainid0,public,usenet
X-Google-Language: ENGLISH,ASCII-7-bit
Path: 
 g2news1.google.com!news4.google.com!feeder.news-service.com!newsfeed.straub-nv.de!eternal-september.org!feeder.eternal-september.org!.POSTED!not-for-mail
From: Natasha Kerensikova <lithiumcat@gmail.com>
Newsgroups: comp.lang.ada
Subject: Re: S-expression I/O in Ada
Date: Wed, 18 Aug 2010 13:55:51 +0000 (UTC)
Organization: A noiseless patient Spider
Message-ID: <slrni6npj7.1efq.lithiumcat@sigil.instinctive.eu>
References: <547afa6b-731e-475f-a7f2-eaefefb25861@k8g2000prh.googlegroups.com>
 <slrni6lg3e.1efq.lithiumcat@sigil.instinctive.eu>
 <i4em7q$1pcu$1@adenine.netfront.net>
 <slrni6nem3.1efq.lithiumcat@sigil.instinctive.eu>
 <ebc7b61e-f12d-4d44-9463-0d6a4947fd19@l6g2000yqb.googlegroups.com>
 <slrni6nioh.1efq.lithiumcat@sigil.instinctive.eu>
 <5f5303d4-075f-48ec-bd9b-17c9052cadd6@k10g2000yqa.googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Injection-Date: Wed, 18 Aug 2010 13:55:51 +0000 (UTC)
Injection-Info: mx01.eternal-september.org;
 posting-host="Mda950WjNwNLAFOE7yJXQw";
	logging-data="10715"; mail-complaints-to="abuse@eternal-september.org";
	posting-account="U2FsdGVkX1856a6iaHB2ogbXIsbHgLfo"
User-Agent: slrn/0.9.9p1 (FreeBSD)
Cancel-Lock: sha1:Cbg0dx2p6bJN2C2QOinsAozcKCI=
Xref: g2news1.google.com comp.lang.ada:13485
Date: 2010-08-18T13:55:51+00:00
List-Id: <comp.lang.ada>

On 2010-08-18, Ludovic Brenta <ludovic@ludovic-brenta.org> wrote:
> Natasha Kerensikova wrote on comp.lang.ada:
>> Then I don't really understand the point of having both Vectors and
>> Doubly_Linked_Lists. The interface of Vectors is a strict superset of
>> the interface of Doubly_Linked_Lists, and the only difference in
>> complexity (which is advice anyway, so non-guaranteed) is on Prepend.
>
> No, there are similar differences in complexity in Delete, Insert and
> Append (the Vector occasionally has to reallocate its internal storage
> and copy all existing elements over; the Doubly_Linked_List not).

Usually, and probably also in reality, yes. It's just that the ARM
doesn't mention them, so we can't rely on them unless we first define
the platform.

The ARM only mentions worst-case time complexity, and only as advice at
that: O(log N) for Vectors.Element, Vectors.Append, D_L_Lists.Element,
D_L_Lists.Insert and D_L_Lists.Delete; O(N log N) for Vectors.Prepend
and Vectors.Delete_First; and O(N**2) for sort.
These are quite conservative requirements anyway.

> Vector provides random access, Doubly_Linked_List does not. So, while
> the two interfaces are similar, Vectors and Lists are for different
> kinds of problems with different time complexities.

So as I said in another post, considering I don't need anything more
than the common interface and that I won't use expensive operations on
either of them, which one do I choose?

>> Obviously, discrete type serialization alone is not enough, as I will
>> have non-discrete atomic object to somehow turn into atoms. And your
>> _Blob functions suffer from the same issue as Jeffrey Carter's
>> implementation in that it's just a memory dump.
>
> Why is that a problem? The only reason I can think of is that your
> type contains access values, in which case it's simply not an atom but
> rather a cons pair (or a list).

Serialization isn't only about access values. Or maybe I'm misusing the
word.

Anyway, my (and Rivest's) S-expressions being meant for storage and
transport, there are other issues with memory dumps than access values.
For example, endianness, space-efficiency, or human-readability.
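
As a rough illustration of the endianness point, here is a minimal
sketch of serializing an integer into a fixed byte order instead of
dumping its memory. The names Octet, Octet_Array, To_Octets and
From_Octets are hypothetical and not taken from any existing package.

```ada
with Ada.Text_IO;
with Interfaces; use Interfaces;

procedure Portable_Atom_Demo is
   --  Hypothetical octet types, not from any existing package.
   type Octet is mod 2**8;
   type Octet_Array is array (Positive range <>) of Octet;

   --  Big-endian encoding: the result does not depend on how the
   --  host lays out Unsigned_32 in memory.
   function To_Octets (Value : Unsigned_32) return Octet_Array is
   begin
      return (1 => Octet (Shift_Right (Value, 24) and 16#FF#),
              2 => Octet (Shift_Right (Value, 16) and 16#FF#),
              3 => Octet (Shift_Right (Value, 8) and 16#FF#),
              4 => Octet (Value and 16#FF#));
   end To_Octets;

   --  Inverse operation: most significant octet comes first.
   function From_Octets (Data : Octet_Array) return Unsigned_32 is
      Result : Unsigned_32 := 0;
   begin
      for I in Data'Range loop
         Result := Shift_Left (Result, 8) or Unsigned_32 (Data (I));
      end loop;
      return Result;
   end From_Octets;

   Encoded : constant Octet_Array := To_Octets (16#DEADBEEF#);
begin
   if Encoded (1) = 16#DE#
     and then Encoded (4) = 16#EF#
     and then From_Octets (Encoded) = 16#DEADBEEF#
   then
      Ada.Text_IO.Put_Line ("round trip ok");
   end if;
end Portable_Atom_Demo;
```

A memory dump of the same value would give a different octet sequence
on little-endian and big-endian hosts, which is the kind of issue
meant here.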

>>> In my implementation, it is also possible to turn any kind of object
>>> into an atom by providing a pair of To_String, From_String operations,
>>> but these operations actually perform the serialization you were
>>> trying to hide ;/
>>
>> Actually I'm not trying to hide the serialization, I only want to have
>> it happen in the client object package, because I assume the object is
>> better qualified to know how to serialize itself rather than the
>> S-expression package.
>
> Then there is still no problem:
>
> with S_Expression;
> procedure Client is
>    type Complex_Atom is record ... end record;
>    function To_String (A : Complex_Atom) return String is separate;
>    -- does the difficult part of the serialization to string
>    function To_Atom (A : Complex_Atom) return S_Expression.T is
>    begin
>       return S_Expression.To_Atom (To_String (A));
>    end To_Atom;
> begin
>    ...
> end Client;
>
> Here, the serialization is indeed in the client.

Yes, and that's exactly what I would do if I were forced to hide the
Atom type definition; except I wouldn't use String as an intermediate
value, but rather Storage_Array or Stream_Element_Array, because I allow
both string and binary contents in atoms.

Which then raises the question: why not use an array of octets directly,
to avoid issues when Storage_Element or Stream_Element is not an octet?

But then I could make a public array-of-octets type, and provide
functions to "convert" back and forth between the private Atom type and
public array-of-octets type. Would that be acceptable?
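
To make the question concrete, the idea could look roughly like the
sketch below. It assumes Ada 2005 containers, and every name in it
(S_Expressions, Octet, Octet_Array, Atom, To_Atom, To_Octets) is
hypothetical; the vector-based representation is only one possible
choice for the private part.

```ada
with Ada.Containers.Vectors;
with Ada.Text_IO;

procedure Atom_Demo is

   package S_Expressions is
      --  Public array-of-octets type, independent of Storage_Element
      --  and Stream_Element sizes.
      type Octet is mod 2**8;
      type Octet_Array is array (Positive range <>) of Octet;

      type Atom is private;

      function To_Atom (Data : Octet_Array) return Atom;
      function To_Octets (A : Atom) return Octet_Array;
   private
      --  Assumed representation; clients cannot see or rely on it.
      package Octet_Vectors is new Ada.Containers.Vectors
        (Index_Type => Positive, Element_Type => Octet);
      type Atom is record
         Data : Octet_Vectors.Vector;
      end record;
   end S_Expressions;

   package body S_Expressions is
      function To_Atom (Data : Octet_Array) return Atom is
         Result : Atom;
      begin
         for I in Data'Range loop
            Result.Data.Append (Data (I));
         end loop;
         return Result;
      end To_Atom;

      function To_Octets (A : Atom) return Octet_Array is
         Result : Octet_Array (1 .. Natural (A.Data.Length));
      begin
         for I in Result'Range loop
            Result (I) := A.Data.Element (I);
         end loop;
         return Result;
      end To_Octets;
   end S_Expressions;

   use S_Expressions;
   Source : constant Octet_Array := (16#00#, 16#FF#, 16#41#);
   A      : constant Atom := To_Atom (Source);
begin
   --  Any octet sequence, textual or binary, survives the round trip.
   if To_Octets (A) = Source then
      Ada.Text_IO.Put_Line ("round trip ok");
   end if;
end Atom_Demo;
```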

> Therefore, as String is not a blob. The blob needs to be encoded into
> ASCII characters, the String does not because it already consists of
> characters. Therefore From_Blob hex-encodes the blob into a String.

My (and Rivest's) atoms can contain any octet sequence, which makes the
hex encoding irrelevant. So I do treat strings as blobs, they are both
data only a type away from being an atom.

You could somehow say that strings and blobs need only to be "cast" into
an atom, while other objects need to be serialized first (usually into
strings or machine-independent blobs).

>> Let's take for example a Wide_Wide_String (e.g. because that's how my
>> application handles strings internally), which is not so good as an
>> example because a memory dump wouldn't be so much of an issue (except
>> maybe for endianness). Let's further assume I want it serialized in
>> UTF-8. How do I do that?
>
> By modifying my implementation a little:

I think you're missing the point. I don't want to modify any
S-expression package implementation whenever I use a new type in an
application.

Let's say for example I have a thousand applications using Strings and
one using Wide_Wide_Strings; I can't see how it can be justified to
change (or fork) the S-expression package to take that one into account.

It seems much more natural, at least to my C-accustomed eyes, to ask
that one application to provide whatever is needed to serialize
Wide_Wide_Strings into atoms.
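
A minimal sketch of what that one application could provide on its
own, assuming an Ada 2012 compiler for the standard
Ada.Strings.UTF_Encoding.Wide_Wide_Strings package. The Octet and
Octet_Array types stand in for whatever the S-expression package would
export, and To_Octets is a hypothetical client-side function.

```ada
with Ada.Strings.UTF_Encoding.Wide_Wide_Strings;
with Ada.Text_IO;

procedure Client_Demo is
   --  Stand-in for what the S-expression package would export
   --  (hypothetical names).
   type Octet is mod 2**8;
   type Octet_Array is array (Positive range <>) of Octet;

   --  Client-side serialization: lives in the application, so the
   --  S-expression package never needs to know Wide_Wide_String.
   function To_Octets (Text : Wide_Wide_String) return Octet_Array is
      UTF_8 : constant Ada.Strings.UTF_Encoding.UTF_8_String :=
        Ada.Strings.UTF_Encoding.Wide_Wide_Strings.Encode (Text);
      Result : Octet_Array (1 .. UTF_8'Length);
   begin
      for I in Result'Range loop
         Result (I) := Character'Pos (UTF_8 (UTF_8'First + I - 1));
      end loop;
      return Result;
   end To_Octets;

   --  U+00E9, LATIN SMALL LETTER E WITH ACUTE
   E_Acute : constant Wide_Wide_String :=
     (1 => Wide_Wide_Character'Val (16#E9#));
   Encoded : constant Octet_Array := To_Octets (E_Acute);
begin
   --  U+00E9 is encoded in UTF-8 as the two octets C3 A9.
   if Encoded = Octet_Array'(16#C3#, 16#A9#) then
      Ada.Text_IO.Put_Line ("UTF-8 encoding ok");
   end if;
end Client_Demo;
```

The resulting octets would then be handed to the package's generic
"octets to atom" conversion, leaving the package itself untouched.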


Thanks for your thought-provoking comments,
Natacha