From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level: 
X-Spam-Status: No, score=-0.9 required=5.0 tests=BAYES_00,FORGED_GMAIL_RCVD,
	FREEMAIL_FROM autolearn=no autolearn_force=no version=3.4.4
X-Google-Thread: 103376,24d7acf9b853aac8
X-Google-NewGroupId: yes
X-Google-Attributes: gida07f3367d7,domainid0,public,usenet
X-Google-Language: ENGLISH,ASCII
Path: 
 g2news1.google.com!postnews.google.com!5g2000yqz.googlegroups.com!not-for-mail
From: Natacha Kerensikova <lithiumcat@gmail.com>
Newsgroups: comp.lang.ada
Subject: Re: S-expression I/O in Ada
Date: Sat, 7 Aug 2010 00:23:01 -0700 (PDT)
Organization: http://groups.google.com
Message-ID: <bd4cba52-3e2c-4160-89bd-1f460271bcf9@5g2000yqz.googlegroups.com>
References: <547afa6b-731e-475f-a7f2-eaefefb25861@k8g2000prh.googlegroups.com>
	<1qk2k63kzh7yv$.3jgc403xcqdw$.dlg@40tude.net>
 <8ae8e899-9eef-4c8c-982e-bfdfc10072f1@h17g2000pri.googlegroups.com>
	<258zlxrv4fn6.1vszho1rtmf48$.dlg@40tude.net>
 <984db477-973c-4a66-9bf6-e5348c9b95f2@n19g2000prf.googlegroups.com>
	<osztlhozsld6.cnzz5m4w13ts.dlg@40tude.net>
NNTP-Posting-Host: 178.83.214.115
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
X-Trace: posting.google.com 1281165781 7374 127.0.0.1 (7 Aug 2010 07:23:01
 GMT)
X-Complaints-To: groups-abuse@google.com
NNTP-Posting-Date: Sat, 7 Aug 2010 07:23:01 +0000 (UTC)
Complaints-To: groups-abuse@google.com
Injection-Info: 5g2000yqz.googlegroups.com; posting-host=178.83.214.115;
	posting-account=aMKgaAoAAAAoW4eaAiNFNP4PjiOifrN6
User-Agent: G2/1.0
X-HTTP-UserAgent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.9.2.3)
	Gecko/20100524 Firefox/3.6.3,gzip(gfe)
Xref: g2news1.google.com comp.lang.ada:12922
Date: 2010-08-07T00:23:01-07:00
List-Id: <comp.lang.ada>

On Aug 1, 11:13 pm, "Dmitry A. Kazakov" <mail...@dmitry-kazakov.de>
wrote:
> On Sun, 1 Aug 2010 13:06:10 -0700 (PDT), Natacha Kerensikova wrote:
> > On Aug 1, 8:49 pm, "Dmitry A. Kazakov" <mail...@dmitry-kazakov.de>
> > wrote:
> >> Hmm, why do you consider brackets as separate elements?
> > Because that's the definition of S-expressions :-)
> OK, why S-expressions, then? (:-))

First, because this is the existing format in most of my programs, so
for interoperability I can choose only between using this format or
converting from and to it (or rewriting everything). In both cases
there is a strong code-reuse argument in favor of writing an
S-expression library instead of writing a bunch of similar ad-hoc code
chunks.

Second, because I like this format and I find it good (see below).

> > I'm referring to this almost-RFC:http://people.csail.mit.edu/rivest/Sexp.txt
> I see, yet another poor data format.

Could you please explain why it is so poor?

I consider it good because it is flexible, expressive and simple. I
already mentioned quite a few times why it looks simple: my parser,
written from scratch, is ~1000 lines. It looks expressive because most
data structures I can think of, and all data structures I have
actually used, can be easily represented: an array can be written down
as a list, a hash table as a list of two-element lists (key then
value), and so on. And I see flexibility coming from the fact that any
sequence of bytes can be encoded in an atom.
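
A minimal sketch of these mappings, in Python for brevity and assuming
the canonical encoding from Rivest's draft (each atom written as its
decimal length, a colon, then the raw bytes). The names encode and
from_dict are hypothetical, made up for this example and not taken
from any existing library:

```python
def encode(node):
    """Encode nested lists of bytes objects in canonical S-expression form."""
    if isinstance(node, bytes):
        # An atom: decimal length, a colon, then the raw bytes unchanged.
        return str(len(node)).encode("ascii") + b":" + node
    # A list: the encoded children between parentheses.
    return b"(" + b"".join(encode(child) for child in node) + b")"


def from_dict(table):
    """Turn a hash table into a list of two-element lists (key then value)."""
    return [[key, value] for key, value in table.items()]


array = [b"red", b"green", b"blue"]
table = {b"host": b"example.org", b"port": b"8080"}

print(encode(array))             # an array as a list of atoms
print(encode(from_dict(table)))  # a hash table as a list of pairs
print(encode(b"\x00\xff"))       # arbitrary bytes fit in an atom
```

Since the length prefix says exactly how many bytes belong to an atom,
no escaping is needed, whatever the bytes are.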

What am I missing?

> > I use S-expression as a sort of universal text-based container, just
> > like most people use XML these days,
>
> These wonder me too. I see no need in text-based containers for binary
> data. Binary data aren't supposed to be read by people, they read texts.
> And conversely the machine shall not read texts, it has no idea of good
> literature...

Actually this is an interesting remark, it made me realize I'm mixing
two very different uses of a data format (though I still think
S-expressions are adequate for both of them):

I think a text-based format is very useful when the file has to be
dealt with by both humans and programs. The typical example would be
configuration files: read and written by humans, and used by the
program. And that's where I believe XML is really poor, because it's
too heavy for human use. I occasionally feel the need to embed
binary data in some configuration files, e.g. cryptographic keys or
binary initialization sequences to send as-is over whatever
communication channel. On these occasions I do use the text-based
binary encodings allowed by S-expressions, base-64 or hexadecimal, so
that the configuration file is still a text file. The huge advantage
of text files here is that there are already a lot of tools to deal
with them, while using a binary format would require writing specific
tools for humans to deal with it, which is IMO a waste of time
compared to the text-based approach.
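
As a rough illustration of those two encodings (a sketch in Python,
assuming the notation of Rivest's draft, where hexadecimal atoms are
written between '#' characters and base-64 atoms between '|'
characters; hex_atom and base64_atom are hypothetical helper names):

```python
import base64
import binascii


def hex_atom(data):
    """Write bytes as a hexadecimal atom: hex digits between '#' characters."""
    return "#" + binascii.hexlify(data).decode("ascii") + "#"


def base64_atom(data):
    """Write bytes as a base-64 atom: base-64 text between '|' characters."""
    return "|" + base64.b64encode(data).decode("ascii") + "|"


# A made-up binary value standing in for a key or an init sequence.
key = bytes([0x00, 0x01, 0xFE, 0xFF])

# Either line can sit in a configuration file that stays plain text.
print("(key " + hex_atom(key) + ")")
print("(key " + base64_atom(key) + ")")
```

Either way the file can still be opened in any text editor, and only
the library has to know how to turn the atom back into bytes.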

The other application is actual serialization, i.e. converting
internal types into a byte sequence in order to be stored on disk or
transmitted over a network or whatever. In this situation, humans
don't need to interact with the data (except for debugging purposes,
but it's an argument so light it's only justified when everything else
is otherwise equal).
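
For the reading side of such a serialization, here is a minimal
sketch in Python, assuming the canonical length-prefixed encoding from
Rivest's draft and returning a generic tree of nested lists. The
function name decode is hypothetical, and this is an illustration, not
a complete or hardened parser:

```python
def decode(data, pos=0):
    """Parse one canonical S-expression starting at data[pos].

    Returns (node, next_pos), where node is a bytes atom or a list of nodes.
    """
    if data[pos:pos + 1] == b"(":
        pos += 1
        children = []
        # Read children until the matching closing parenthesis.
        while data[pos:pos + 1] != b")":
            if pos >= len(data):
                raise ValueError("unterminated list")
            child, pos = decode(data, pos)
            children.append(child)
        return children, pos + 1
    # An atom: decimal length, a colon, then exactly that many raw bytes.
    colon = data.index(b":", pos)
    length = int(data[pos:colon])
    start = colon + 1
    end = start + length
    if end > len(data):
        raise ValueError("truncated atom")
    return data[start:end], end


tree, end = decode(b"(3:red(1:a2:bc))")
print(tree, end)
```

Atom contents are never inspected, so parentheses or any other bytes
inside an atom cannot confuse the reader.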


In my previous posts I have talked a lot about serialization, while my
actual use of S-expressions is more often the first one. And
historically I first used S-expressions for configuration files,
because they are more expressive than any other text format I know,
while still being extremely simple. I then used this format for
serialization mostly for the sake of code reuse: I already had code
for S-expressions, so it was very cheap to use it for serialization
purposes. Compared to using another serialization format, it leads to
less code being used more, hence fewer opportunities to write bugs and
more opportunities to find and fix bugs. So it seems like a very
rational choice.

> > The library is not supposed to care about what those some_stuff_ are.
> > Actually, the library is suppose to get binary data from the
> > application along with the tree structure described above, store it
> > into a sequence of bytes (on a disk or over a network or whatever),
> > and to retrieve from the byte sequence the original tree structure
> > along with the binary data provided.
>
> Serialize can yield a binary chunk, but if you have to write it anyway, why
> not to write text? You insisted on having text, why do mess with binary
> stuff?

I hope the above already answered this: I mostly use S-expressions for
text purposes, yet occasionally I feel the need to embed binary
data. Of course there are a lot of ways to textify binary data, like
hexadecimal or base-64, but considering S-expressions already handle
textification, I don't see the point of having the application deal
with it too.

> > But now that I think about it, I'm wondering whether I'm stuck in my C
> > way of thinking and trying to apply it to Ada. Am I missing an Ada way
> > of storing structured data in a text-based way?
>
> I think yes. Though it is not Ada-specific, rather commonly used OOP design
> patterns.

I have heard people claim that the first language shapes the mind of
coders (and they go on to say a whole generation of programmers has
been mind-crippled by BASIC). My first language happened to be 386
assembly, which might explain things. Anyway, I genuinely tried OOP
with C++ (which I dropped because it's way too complex for me (and I'm
tempted to say way too complex for the average coder, it should be
reserved for the few geniuses actually able to fully master it)), but I
never felt the need for anything beyond what can be done with a C
struct containing function pointers.

Now back to the topic, thanks to your post and some others in this
thread (for which I'm also thankful), I came to realize my mistake may
be wanting to parse S-expressions and atom contents separately. The
problem is, I just can't manage to imagine how to go in a single step
from the byte sequence containing an S-expression describing multiple
objects to the internal memory representation and vice versa.


Thanks for your help and your patience,
Natacha