From: "Dmitry A. Kazakov"
Subject: Re: S-expression I/O in Ada
Newsgroups: comp.lang.ada
Reply-To: mailbox@dmitry-kazakov.de
Organization: cbb software GmbH
References: <547afa6b-731e-475f-a7f2-eaefefb25861@k8g2000prh.googlegroups.com> <1qk2k63kzh7yv$.3jgc403xcqdw$.dlg@40tude.net> <8ae8e899-9eef-4c8c-982e-bfdfc10072f1@h17g2000pri.googlegroups.com> <258zlxrv4fn6.1vszho1rtmf48$.dlg@40tude.net> <984db477-973c-4a66-9bf6-e5348c9b95f2@n19g2000prf.googlegroups.com> <46866b8yq8nn$.151lqiwa0y2k6.dlg@40tude.net> <13b07f2c-2f35-43e0-83c5-1b572c65d323@y11g2000yqm.googlegroups.com> <13tpf7ya3evig$.h05p3x08059s$.dlg@40tude.net> <1lhdkikeh2sif.bd3pon3knbv8.dlg@40tude.net> <7027f0c6-d909-428c-ab8d-6ba1bd7ff4b2@x21g2000yqa.googlegroups.com> <1424bzz54867w.soj1iq72wkwl$.dlg@40tude.net>
Date: Mon, 9 Aug 2010 12:56:47 +0200

On Mon, 9 Aug 2010 02:55:03 -0700 (PDT), Natacha Kerensikova wrote:

> On Aug 8, 5:15 pm, "Dmitry A. Kazakov" wrote:
>> On Sun, 8 Aug 2010 06:49:09 -0700 (PDT), Natacha Kerensikova wrote:
>>> On Aug 8, 3:01 pm, "Dmitry A. Kazakov"
>>>> No I do.
>>>> But you have defined it as a text file. A streamed text file is a
>>>> sequence of Character items.
>>
>>> Actually, I didn't. I only defined it as a bunch of byte sequences
>>> organized in a certain way.
>>
>> I see. This confuses things even more. Why should I represent anything as a
>> byte sequence? It already is, and in 90% cases I just don't care how the
>> compiler does that. Why to convert byte sequences into other sequences and
>> then into a text file. It just does not make sense to me. Any conversion
>> must be accompanied by moving the thing from one medium to another.
>> Otherwise it is wasting.
>
> Representing something as a byte sequence is serialization (at least,
> according to my (perhaps wrong) definition of serialization).

Is "byte" here a RAM unit or a disk file item? I meant the former. All
objects are already sequences of bytes in RAM.

> S-expressions are not a format on top or below that, it's a format
> *besides* that, at the same level. Objects are serialized into byte
> sequences forming S-expression atoms, and relations between objects/
> atoms are serialized by the S-expression format. This is how one get
> the canonical representation of a S-expression.

I thought you wanted to represent *objects* ... as S-sequences? Where are
these representations supposed to live? In RAM? Why are you then talking
about text files, configurations and humans reading something? I cannot
remember the last time I read a memory dump...

> Now depending on the situation one might want additional constrains on
> the representation, for example human-readability or being text-based,
> and the S-expression standard defines non-canonical representations
> for such situations.
>
>>> I know very well these differences, except octet vs character,
>>> especially considering Ada's definition of a Character. Or is it only
>>> that Character is an enumeration while octet is a modular integer?
>>
>> The difference is that Character represents code points and octet does
>> atomic arrays of 8 bits.
>
> Considering Ada's Character also spans over 8 bits (256-element
> enumeration), both are equivalent, right?

Equivalent defined as? In terms of types they are not, because the types are
different. In terms of the relation "=" they are not either, because "=" is
not defined on the tuple Character x Unsigned_8 (or whatever).

> The only difference is the
> intent and the meaning of values, right?

Huh, there is *nothing* beyond the meaning (semantics).

>>> This leads to a question I had in mind since quite early in the
>>> thread, should I really use an array of Storage_Element, while S-
>>> expression standard considers only sequences of octets?
>>
>> That depends on what are you going to do. Storage_Element is a
>> machine-dependent addressable memory unit. Octet is a machine independent
>> presentation layer unit, a thing of 256 independent states. Yes
>> incidentally Character has 256 code points.
>
> Actually I've started to wonder whether Stream_Element might even more
> appropriated: considering a S-expression atom is the serialization of
> an object, and I guess objects which know how to serialize themselves
> do so using the Stream subsystem, so maybe I could more easily
> leverage existing serialization codes if I use Stream_Element_Array
> for atoms.

Note that Stream_Element is machine-dependent as well.

> But then I don't know whether it's possible to have object
> hand over a Stream_Element_Array representing themselves,

This does not make sense to me, it is mixing abstractions:
Stream_Element_Array is a representation of an object in a stream.
Storage_Array might be a representation of it in memory. These are two
different objects. You cannot consider them the same, even if they physically
shared the same memory (but they do not). The whole purpose of serialization
to a raw stream is the conversion of a Storage_Array to a
Stream_Element_Array.
Deserialization is the backward conversion.

> and I don't
> know either how to deal with cases where Stream_Element is not an
> octet.

By not using Stream_Element_Array, obviously. You should use the encoding
you want to. That is all. If the encoding is for a text file, you have to
read Characters. You don't care how they land in a Stream_Element_Array;
that is not your business, it is an implementation detail of the text
stream. If the encoding is about octets, you have to read octets. You have
to choose.

>>>> Once you matched "tcp-connect", you know all the types of the following
>>>> components.
>>
>>> Unfortunately, you know "80" is a 16-bit integer only after having
>>> matched "port".
>>
>> Nope, we certainly know that each TCP connection needs a port. There is
>> nothing to resolve since the notation is not reverse. Parse it top down, it
>> is simple, it is safe, it allows excellent diagnostics, it works.
>
> Consider:
> (tcp-connect (host foo.example) (port 80))
> and:
> (tcp-connect (port 80) (host foo.example))
>
> Both of these are semantically equivalent, but know which of the tail
> atom is a 16-bit integer and which is the string, you have to first
> match "port" and "host" head atoms.

Sure

> Or am I misunderstanding your point?

The point is that you never meet 80 before knowing that it is a "port", and
you never meet "port" before knowing that it belongs to "tcp-connect". You
always know all types in advance. It is strictly top-down.

> Thanks for your patience with me,

You are welcome.

I think from the responses of the people here you see that the difference
between Ada and C is much deeper than begin/end instead of curly brackets.
Ada does not let you through without a clear concept of what you are going
to do. Surely with some proficiency one could write classical C programs in
Ada, messing everything up. You could even create a buffer overflow in Ada.
But it is difficult for a beginner...

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de
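
The top-down argument above can be illustrated with a minimal sketch. It is written in Python only for brevity and is not code from this thread. The names tokenize, parse, check_port, FIELD_TYPES and read_tcp_connect are hypothetical, and the schema (host is a string, port is a 16-bit integer) is an assumption taken from the example in the quoted text. Each tail atom is converted only after its head atom has been matched, so the order of the components does not matter.

```python
def tokenize(text):
    """Split a textual S-expression into "(", ")" and atom tokens."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens):
    """Build nested lists of atom strings; consumes the token list in place."""
    token = tokens.pop(0)
    if token != "(":
        return token  # a bare atom
    items = []
    while tokens[0] != ")":
        items.append(parse(tokens))
    tokens.pop(0)  # drop the closing ")"
    return items


def check_port(value):
    """Accept only values that fit in 16 bits (assumed port type)."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError(f"port {value} does not fit in 16 bits")
    return value


# Assumed schema: the head atom of a component fixes the type of its tail.
FIELD_TYPES = {
    "host": str,
    "port": lambda atom: check_port(int(atom)),
}


def read_tcp_connect(text):
    """Interpret the tree top-down: outer head first, then each component."""
    head, *components = parse(tokenize(text))
    if head != "tcp-connect":
        raise ValueError(f"expected tcp-connect, got {head}")
    result = {}
    for name, atom in components:
        if name not in FIELD_TYPES:
            raise ValueError(f"unknown component {name}")
        # The type is already known here, before the tail atom is converted.
        result[name] = FIELD_TYPES[name](atom)
    return result
```

Under these assumptions both orderings from the quoted example give the same record, and an out-of-range port is rejected at the point where "port" has already been matched, which is where a precise diagnostic is possible.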