From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level: 
X-Spam-Status: No, score=-0.9 required=5.0 tests=BAYES_00,FORGED_GMAIL_RCVD,
	FREEMAIL_FROM autolearn=no autolearn_force=no version=3.4.4
X-Google-Thread: 103376,24d7acf9b853aac8
X-Google-NewGroupId: yes
X-Google-Attributes: gida07f3367d7,domainid0,public,usenet
X-Google-Language: ENGLISH,ASCII
Path: 
 g2news1.google.com!postnews.google.com!l20g2000yqm.googlegroups.com!not-for-mail
From: Natacha Kerensikova <lithiumcat@gmail.com>
Newsgroups: comp.lang.ada
Subject: Re: S-expression I/O in Ada
Date: Sun, 8 Aug 2010 05:23:37 -0700 (PDT)
Organization: http://groups.google.com
Message-ID: 
 <ee8ec904-60c1-405d-925f-89eba34a44d0@l20g2000yqm.googlegroups.com>
References: <547afa6b-731e-475f-a7f2-eaefefb25861@k8g2000prh.googlegroups.com>
	<1qk2k63kzh7yv$.3jgc403xcqdw$.dlg@40tude.net>
 <8ae8e899-9eef-4c8c-982e-bfdfc10072f1@h17g2000pri.googlegroups.com>
	<258zlxrv4fn6.1vszho1rtmf48$.dlg@40tude.net>
 <984db477-973c-4a66-9bf6-e5348c9b95f2@n19g2000prf.googlegroups.com>
	<osztlhozsld6.cnzz5m4w13ts.dlg@40tude.net>
 <bd4cba52-3e2c-4160-89bd-1f460271bcf9@5g2000yqz.googlegroups.com>
	<46866b8yq8nn$.151lqiwa0y2k6.dlg@40tude.net>
 <13b07f2c-2f35-43e0-83c5-1b572c65d323@y11g2000yqm.googlegroups.com>
	<13tpf7ya3evig$.h05p3x08059s$.dlg@40tude.net>
NNTP-Posting-Host: 95.152.65.220
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
X-Trace: posting.google.com 1281270217 9666 127.0.0.1 (8 Aug 2010 12:23:37
 GMT)
X-Complaints-To: groups-abuse@google.com
NNTP-Posting-Date: Sun, 8 Aug 2010 12:23:37 +0000 (UTC)
Complaints-To: groups-abuse@google.com
Injection-Info: l20g2000yqm.googlegroups.com; posting-host=95.152.65.220;
	posting-account=aMKgaAoAAAAoW4eaAiNFNP4PjiOifrN6
User-Agent: G2/1.0
X-HTTP-UserAgent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.9.2.3)
	Gecko/20100524 Firefox/3.6.3,gzip(gfe)
Xref: g2news1.google.com comp.lang.ada:12949
Date: 2010-08-08T05:23:37-07:00
List-Id: <comp.lang.ada>

On Aug 7, 4:23 pm, "Dmitry A. Kazakov" <mail...@dmitry-kazakov.de>
wrote:
> On Sat, 7 Aug 2010 05:56:50 -0700 (PDT), Natacha Kerensikova wrote:
> > Can we at least agree on the fact that a sequence of bytes is a
> > general-purpose format, widely used for storing and transmitting data?
> > (this question is just a matter of vocabulary)
>
> I don't think so. Namely, I don't think that "general" is a synonym for
> "completeness." It is rather about the abstraction level under the
> condition of completeness.

Well, then I'm afraid I can't discuss this any further, because I fail
to understand your definition here.

I was using "general-purpose" as the opposite of "specific-purpose".
If we make the parallel with compression schemes, FLAC sure is as
complete as bzip2, yet the first one has a specific purpose
(compressing sounds) while the other is general-purpose. So back to
data formats: I drew the distinction by the amount of preliminary
assumptions made about the data to be contained. In that sense the raw
byte sequence is the most general format, in that it makes no
assumption about the contained data (except that its number of bits is
a multiple of the number of bits per byte).

> > So let's add as few semantics as possible, to keep as much generality
> > as possible. We end up with a bunch of byte sequences, whose semantics
> > are still left to the application, linked together by some kind of
> > semantic link. When the chosen links are "brother of" and "sublist of"
> > you get exactly S-expressions.
>
> Yes, the language of S-expressions is about hierarchical structures of
> elements lacking any semantics.
>
> I see no purpose in such descriptions.

Indeed, I don't see any either, and that's the point: there is room to
add your application-specific purpose on top of this format.

> > However from a purely practical point of view, and using the fact that
> > in my background languages (C and 386 asm) bytes sequences and strings
> > are so similar, these crude semantics are all I need (or at least, all
> > I've ever needed so far).
>
> The lower you descend through the abstraction levels, the fewer
> differences you see. Everything is a bunch of transistors...

In the Get procedure from your last post, you don't seem to draw much
of a distinction between a binary byte and a Character. It would seem
Ada Strings are also very similar to byte sequences/arrays.

> > Now if we agree that simplicity is a
> > desirable quality (because it leads to fewer bugs, more productivity,
> > etc), I still fail to see the issues of such a format.
>
> Programs in 386 Assembler are sufficiently more complex than programs in
> Ada. Simplicity of nature by no means implies simplicity of use.

Guess why I haven't written a single line in assembly during the last
8 years ;-)

> > Now regarding personal preferences about braces, I have to admit I'm
> > always shocked at the way so many people dismiss S-expressions on
> > first sight because of the LISP-looking parentheses.
>
> Do you mean LISP does not deserve its fame? (:-))

I honestly don't know enough about either LISP or its fame to mean
anything like that. I just meant that judging a format only by its
relatively heavy use of parentheses is about as silly as judging the
skills of a person only by the amount of melanin in their skin.

> > My point is, most of my (currently non-OOP) code can be expressed as
> > well in an OOP style. When I define a C structure along with a bunch
> > of functions that perform operations on it, I'm conceptually defining
> > a class and its methods, only expressed in a non-OOP language. I
> > sometimes put function pointers in the structure, to have a form of
> > dynamic dispatch or virtual methods. I occasionally even leave the
> > structure declared but undefined publicly, to hide internals (do you
> > call that encapsulation?), along with functions that could well be
> > called accessors and mutators. In my opinion that doesn't count as OOP
> > because it doesn't use OOP-specific features like inheritance.
>
> I disagree because in my view this is all what OO is about. OO is not about
> the tools (OOPL), it is about the way of programming.

Then I guess you could say I'm twisting C into OO programming, though
I readily violate OOP principles when it significantly improves code
readability or simplicity (which I guess happens much more often in C
than in Ada).

> > And the reason why I started this thread is only to
> > know how to buffer into memory the arrays of octets, because I need
> > (in my usual use of S-expressions) to resolve the links between atoms
> > before I can know the type of atoms. So I need a way to delay the
> > typing, and in the meantime handle data as a generic byte sequence
> > whose only known information is its size and its place in the S-
> > expression tree. What exactly is so bad with that approach?
>
> Nothing wrong when at the implementation level. However I don't see why
> links need to be resolved first. In comparable cases - I do much messy
> protocol/communication stuff - I usually first restore objects and then
> resolve links.

That's because some atom types are only known after having examined
other atoms. If you remember my example (tcp-connect (host foo.example)
(port 80)), here is how it would be interpreted: from the context or
initial state, we expect a list beginning with an atom which is a
string describing what to do with whatever comes after. "tcp-connect"
is therefore interpreted as a string, and from the string value we know
the following is a list of settings, each of them being a list whose
first element is an atom which is a string describing the particular
setting. "host" is therefore a string, and its value tells us the
following atoms are also strings, describing host names to connect to,
in decreasing priority order. There "foo.example" is a string to be
resolved into a network address. "port" is also a string, and from its
value we know it's followed by an atom being the decimal representation
of a port number, which in Ada would probably be a type of its own
(probably a modular type, mod 2**16, or something like that).

Of course, all those "we know" are actually knowledge derived from the
configuration file specification.
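
A rough sketch of that interpretation, in Python for brevity rather
than Ada. The names parse and interpret_tcp_connect are made up for
illustration, and the parser is a deliberate simplification: it only
handles bare tokens and parentheses, not quoted strings or
length-prefixed atoms. Atoms stay raw bytes until the context says
what they are.

```python
def parse(data: bytes):
    """Parse bare-token S-expressions into nested lists of bytes atoms."""
    stack = [[]]            # stack[0] collects the top-level nodes
    token = bytearray()     # bytes of the atom currently being read

    def flush():
        # Close the current atom, if any, and attach it to the open list.
        if token:
            stack[-1].append(bytes(token))
            token.clear()

    for byte in data:
        char = bytes([byte])
        if char == b"(":
            flush()
            stack.append([])
        elif char == b")":
            flush()
            if len(stack) == 1:
                raise ValueError("unbalanced ')'")
            done = stack.pop()
            stack[-1].append(done)
        elif char.isspace():
            flush()
        else:
            token.append(byte)
    flush()
    if len(stack) != 1:
        raise ValueError("missing ')'")
    return stack[0]


def interpret_tcp_connect(node):
    """Type the atoms of (tcp-connect (host ...) (port N)) from context."""
    command = node[0].decode("ascii")          # head atom: a command name
    if command != "tcp-connect":
        raise ValueError("unexpected command: " + command)
    settings = {"hosts": [], "port": None}
    for setting in node[1:]:                   # each setting is a sublist
        name = setting[0].decode("ascii")      # its head atom: setting name
        if name == "host":
            # The following atoms are host names, in decreasing priority.
            settings["hosts"] = [a.decode("ascii") for a in setting[1:]]
        elif name == "port":
            port = int(setting[1].decode("ascii"))   # decimal text
            if not 0 <= port < 2**16:
                raise ValueError("port out of range")
            settings["port"] = port
        else:
            raise ValueError("unknown setting: " + name)
    return settings


tree = parse(b"(tcp-connect (host foo.example) (port 80))")
config = interpret_tcp_connect(tree[0])
print(config)
```

Here the type of each atom is decided only by what was read before it:
the same bytes "80" would stay a plain string if they appeared after
"host" instead of "port".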

In this particular example, atoms are treated in the order in which
they appear in the byte stream, so there is already enough context to
know the type of an atom before reading it. This is not always the
case: for example, it might be necessary to build an associative array
from a list of lists before being able to know the type of non-head
atoms, or the S-expression might have to be kept uninterpreted (and
thus untyped) until some other run-time actions are performed (this
is quite common in the template system, where the template and the
data can change independently, and both changes induce an S-expression
re-interpretation).
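
The delayed-typing case could look something like the following Python
sketch. Everything in it is hypothetical (the keys, the to_table and
fetch helpers, the DECODERS table); it only shows atoms being kept as
raw byte sequences in an associative array and typed at the moment
some later code asks for them.

```python
# Untyped tree: atoms are raw bytes, sublists are Python lists.
data = [
    [b"title", b"Hello"],
    [b"count", b"3"],
    [b"payload", b"\x00\xff\x10"],
]


def to_table(list_of_lists):
    """Index each sublist by its head atom; the other atoms stay raw."""
    return {item[0]: item[1:] for item in list_of_lists}


# Hypothetical decoders, picked only once the expected type is known.
DECODERS = {
    "text": lambda atom: atom.decode("utf-8"),
    "integer": lambda atom: int(atom.decode("ascii")),
    "raw": lambda atom: atom,
}


def fetch(table, key, kind):
    """Type the first atom stored under key, at the time it is needed."""
    return DECODERS[kind](table[key][0])


table = to_table(data)
print(fetch(table, b"count", "integer"))
print(fetch(table, b"count", "text"))
```

The same atom under "count" comes back as an integer or as text
depending on what the caller (say, a template) expects, which is why
the buffer itself has to stay an untyped byte sequence.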

Is it clearer now?


Natacha