From: Natacha Kerensikova
Newsgroups: comp.lang.ada
Subject: Re: S-expression I/O in Ada
Date: Tue, 10 Aug 2010 01:56:22 -0700 (PDT)
Message-ID: <9db37b80-acbb-4c9f-a646-34f108f52985@v15g2000yqe.googlegroups.com>

On Aug 9, 12:56 pm, "Dmitry A. Kazakov" wrote:
> On Mon, 9 Aug 2010 02:55:03 -0700 (PDT), Natacha Kerensikova wrote:
> > On Aug 8, 5:15 pm, "Dmitry A. Kazakov" wrote:
> >> I see. This confuses things even more. Why should I represent anything as a
> >> byte sequence? It already is, and in 90% cases I just don't care how the
> >> compiler does that. Why to convert byte sequences into other sequences and
> >> then into a text file. It just does not make sense to me. Any conversion
> >> must be accompanied by moving the thing from one medium to another.
> >> Otherwise it is wasting.
>
> > Representing something as a byte sequence is serialization (at least,
> > according to my (perhaps wrong) definition of serialization).
>
> "Byte" here is a RAM unit or a disk file item? I meant the former. All
> objects are already sequences of bytes in the RAM.

I have to admit I tend to confuse these two kinds of bytes.

Considering my C-bitching background, I call "byte" C's char, i.e. the
smallest memory unit addressable independently. That is from the
language's point of view: I'm well aware that modern i386 machines have
a memory bus much wider than 8 bits and transfer whole chunks at a time,
but the CPU follows the "as-if" rule, so it looks like 8-bit units are
addressed independently, hence 8-bit chars on these platforms.

My confusion is that I often take for granted that disk files can be
addressed the same way, while I know perfectly well that it's not always
the case (e.g. the DS has a 16-bit GBA bus and 8-bit operations have to
load the whole 16-bit chunk, just like operating on 4-bit nibbles on
i386). It must be because I've never thought of disk I/O as unbuffered,
and the buffer does follow RAM rules of addressing.

> > S-expressions are not a format on top or below that, it's a format
> > *besides* that, at the same level.
> > Objects are serialized into byte
> > sequences forming S-expression atoms, and relations between objects/
> > atoms are serialized by the S-expression format. This is how one get
> > the canonical representation of a S-expression.
>
> I thought you wanted to represent *objects* ... as S-sequences?

It depends on what you call an object. Here again, my vocabulary might
have been tainted by the C Standard. Take a record, for example: I would
call each component an object, as well as the whole record itself. I
guess I would call "object" an area of memory that means something to
the compiler (e.g. a byte in some multibyte internal representation of
anything is not an object, while the byte sequence in RAM used by the
compiler to know what an access variable accesses is an object).

An S-expression atom must be built from something, so from this
definition it follows that whatever is turned into an S-expression atom
is an object (well, one could make an atom from multiple objects, but I
don't think it's a good idea). So there are objects meant to end up in a
single S-expression atom.

There are indeed objects that I want to represent as more elaborate
S-expressions. For example, I wouldn't store a record in a single atom;
I'd rather define an S-expression specification to represent it and
store only simple components in atoms. But that's purely a personal
choice, motivated by the preference for human-readable disk files.

> Where these representations are supposed to live? In the RAM? Why are you
> then talking about text files, configurations and humans reading something?
> I cannot remember last time I read memory dump...

Ultimately these representations are supposed to be stored on disk or
transferred over a network or things like that, hence the need for them
to be built from serialized representations of objects. However, there
is processing I might want to perform that requires them to be buffered
in memory. So they temporarily live in RAM until they are stored or sent.
Text files, configurations and humans reading something are details of
some applications of S-expressions, which might or might not be relevant
in a discussion about the deep nature of what an S-expression is. But
they have to be taken into consideration when an S-expression library is
written, lest the library turn out unsuitable for these applications.

> >> The difference is that Character represents code points and octet does
> >> atomic arrays of 8 bits.
>
> > Considering Ada's Character also spans over 8 bits (256-element
> > enumeration), both are equivalent, right?
>
> Equivalent defiled as? In terms of types they are not, because the types
> are different. In terms of the relation "=" they are not either, because
> "=" is not defined on the tuple Character x Unsigned_8 (or whatever).

Sorry, I meant "equivalent" in the mathematical sense that there is a
bijection between the set of Characters and the set of Octets, which
allows using either of them to represent the other. Agreed, this is a
very weak equivalence: it just means there are exactly as many octet
values as Character values.

On the other hand, Storage_Element and Character are not
bijection-equivalent, because there is no guarantee they will always
have the same number of values, even though they often do.

> > Actually I've started to wonder whether Stream_Element might even more
> > appropriated: considering a S-expression atom is the serialization of
> > an object, and I guess objects which know how to serialize themselves
> > do so using the Stream subsystem, so maybe I could more easily
> > leverage existing serialization codes if I use Stream_Element_Array
> > for atoms.
>
> Note that Stream_Element is machine-depended as well.

I'm sadly aware of that.
I need an octet sequence to follow the S-expression standard, and there
is an implementation trade-off here: assuming objects already know how
to serialize themselves into a Stream_Element_Array, I can either code a
single converter from Stream_Element_Array to octet sequence, or
reinvent the wheel and code a converter for each type directly into an
octet sequence. For some strange reason I prefer by far the first
possibility.

> > But then I don't know whether it's possible to have object
> > hand over a Stream_Element_Array representing themselves,
>
> This does not make sense to me, it is mixing abstractions:
> Stream_Element_Array is a representation of an object in a stream.
> Storage_Array might be a representation of in the memory. These are two
> different objects. You cannot consider them same, even if they shared
> physically same memory (but they do not). The whole purpose of
> serialization to a raw stream is conversion of a Storage_Array to
> Stream_Element_Array. Deserialization is a backward conversion.

I don't consider them the same, otherwise I wouldn't be pondering which
one to use.

If it helps, you can think of S-expressions as a standardized way of
serializing some relations between objects. However, the objects
themselves still have to be serialized, and that's outside the scope of
S-expressions. From what I understood, the existing serializations of
objects use Stream_Element_Array as a low-level type. So the natural
choice for serializing the relations seems to be taking the
Stream_Element_Array from each object, and handing a unified
Stream_Element_Array over to the lower-level I/O.

Does it make sense or am I completely missing something?

> The point is that you never meet 80 before knowing that this is a "port",
> you never meet "port" before knowing it is of "tcp-connect". You always
> know all types in advance. It is strictly top-down.

Right, in that simple example it is the case.
It is even quite often the case, hence my thinking about a Sexp_Stream
in another post, which would allow S-expression I/O without having more
than a single node in memory at the same time.

But there are still situations where S-expressions have to be stored in
memory. Take the templates, for example, where S-expressions represent a
kind of limited programming language that is re-interpreted for each
template extension. The choice is either to store the S-expression in
memory or to re-read the file from disk (which doesn't work efficiently
when there is more than one template in a file).

> > Thanks for your patience with me,
>
> You are welcome. I think from the responses of the people here you see that
> the difference between Ada and C is much deeper than begin/end instead of
> curly brackets.

Of course, if there was no other difference I would have stayed in C (I
still prefer brackets rather than words).

> Ada does not let you through without a clear concept of
> what are you going to do. Surely with some proficiency one could write
> classical C programs in Ada, messing everything up. You could even create
> buffer overflow in Ada. But it is difficult for a beginner...

I'm not here to C-ify Ada. C is undoubtedly the best language to do
C-styled stuff.

I felt a mismatch between the nature of C and my way of programming. I
got interested in Ada (rather than in any of the plethora of existing
languages out there) because it seemed to have what I lacked in C while
still having what I like in C. The former being roughly the emphasis on
robustness, readability and quality; the latter being roughly the
low-level nature (as in "close to the hardware", at least close enough
not to rule out programming for embedded platforms and system
programming) and the performance. The previous sentence lacks a lot of
irrational elements that I can't manage to put into words and that can
be summarized as "personal taste".
Now I can't explain why your posts often make me feel Ada is completely
out of my tastes in programming languages, while other people's posts
often make me feel Ada is probably an improvement over C in terms of
personal preferences. However, I have the feeling I got much more
personal improvement from your posts (improvement still meaningful even
if I eventually drop Ada to come back to C), and I'm grateful for that.

Thanks for your help,
Natacha
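
Illustration of the canonical representation mentioned above, where each
object is serialized into an octet-sequence atom and the S-expression
format only serializes the relations between atoms. This is a minimal
Python sketch, not the library under discussion: it assumes the usual
canonical encoding (an atom is written as its decimal length, a colon,
then the raw octets; a list is wrapped in parentheses). The names Sexp,
to_canonical and from_canonical are hypothetical, and the
tcp-connect/port/80 data is borrowed from the quoted example.

```python
from typing import List, Tuple, Union

# A node is either an atom (raw octets) or a list of nodes.
# "Sexp" is a hypothetical name used only for this sketch.
Sexp = Union[bytes, List["Sexp"]]


def to_canonical(node: Sexp) -> bytes:
    """Serialize a node: atoms become <length>:<octets>, lists get parentheses."""
    if isinstance(node, bytes):
        return str(len(node)).encode("ascii") + b":" + node
    return b"(" + b"".join(to_canonical(child) for child in node) + b")"


def _parse(data: bytes, pos: int) -> Tuple[Sexp, int]:
    """Parse one node starting at pos; return the node and the next position."""
    if pos >= len(data):
        raise ValueError("unexpected end of input")
    if data[pos:pos + 1] == b"(":
        pos += 1
        children: List[Sexp] = []
        while True:
            if pos >= len(data):
                raise ValueError("unterminated list")
            if data[pos:pos + 1] == b")":
                return children, pos + 1
            child, pos = _parse(data, pos)
            children.append(child)
    # Otherwise an atom: decimal length, colon, then that many raw octets.
    colon = data.find(b":", pos)
    if colon < 0 or not data[pos:colon].isdigit():
        raise ValueError("malformed atom length")
    length = int(data[pos:colon].decode("ascii"))
    start = colon + 1
    if start + length > len(data):
        raise ValueError("atom truncated")
    return data[start:start + length], start + length


def from_canonical(data: bytes) -> Sexp:
    """Parse exactly one canonical S-expression from data."""
    node, end = _parse(data, 0)
    if end != len(data):
        raise ValueError("trailing octets after S-expression")
    return node


if __name__ == "__main__":
    tree = [b"tcp-connect", [b"port", b"80"]]
    print(to_canonical(tree))
```

Because atoms are length-prefixed, any octet value may appear inside an
atom (including parentheses and colons), which is why the serialization
of each object can be left entirely to the object itself.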