From: "Dmitry A. Kazakov"
Subject: Re: S-expression I/O in Ada
Newsgroups: comp.lang.ada
Reply-To: mailbox@dmitry-kazakov.de
Organization: cbb software GmbH
References: <547afa6b-731e-475f-a7f2-eaefefb25861@k8g2000prh.googlegroups.com>
 <1qk2k63kzh7yv$.3jgc403xcqdw$.dlg@40tude.net>
 <8ae8e899-9eef-4c8c-982e-bfdfc10072f1@h17g2000pri.googlegroups.com>
 <258zlxrv4fn6.1vszho1rtmf48$.dlg@40tude.net>
 <984db477-973c-4a66-9bf6-e5348c9b95f2@n19g2000prf.googlegroups.com>
 <46866b8yq8nn$.151lqiwa0y2k6.dlg@40tude.net>
 <13b07f2c-2f35-43e0-83c5-1b572c65d323@y11g2000yqm.googlegroups.com>
 <13tpf7ya3evig$.h05p3x08059s$.dlg@40tude.net>
Date: Sun, 8 Aug 2010 15:01:25 +0200
Message-ID: <1lhdkikeh2sif.bd3pon3knbv8.dlg@40tude.net>

On Sun, 8 Aug 2010 05:23:37 -0700 (PDT), Natacha Kerensikova wrote:

> On Aug 7, 4:23 pm, "Dmitry A. Kazakov" wrote:
>> On Sat, 7 Aug 2010 05:56:50 -0700 (PDT), Natacha Kerensikova wrote:
>>> Can we at least agree on the fact that a sequence of bytes is a
>>> general-purpose format, widely used for storing and transmitting data?
>>> (this question is just a matter of vocabulary)
>>
>> I don't think so. Namely I don't think that "general" is a synonym for
>> "completeness." It is rather about the abstraction level under the
>> condition of completeness.
>
> Well, then I'm afraid I can't discuss anymore, because I fail to
> understand your definition here.
>
> I was using "general-purpose" as the opposite of "specific-purpose".
> If we make the parallel with compression schemes, FLAC sure is as
> complete as bzip2, yet the first one has a specific purpose
> (compressing sounds) while the other is general-purpose. So back to
> data format, I made the distinction in the amount of preliminary
> assumptions about data to be contained. In that sense the raw
> byte-sequence is the most general format in that there is no assumption
> about the contained data (except that its number of bits is a multiple
> of the number of bits per byte).

And how are you going to make any assumptions at the level of raw bytes?
For a sequence of bytes to become sound you need to move many abstraction
layers - and OSI layers - up.

>>> However from a purely practical point of view, and using the fact that
>>> in my background languages (C and 386 asm) byte sequences and strings
>>> are so similar, these crude semantics are all I need (or at least, all
>>> I've ever needed so far).
>>
>> The lower you descend through the abstraction levels, the fewer
>> differences you see. Everything is a bunch of transistors...
>
> In the Get procedure from your last post, you don't seem to make that
> much difference between a binary byte and a Character.

No, I do. But you have defined it as a text file. A streamed text file is a
sequence of Character items.

> It would seem
> Ada Strings are also very similar to byte sequences/arrays.

I remember a machine where char was 32 bits long. Byte, octet, character
are three different things (and code point is a fourth).
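
To make the octet / character / code point distinction concrete, here is a
minimal sketch. It is written in Python rather than Ada purely for brevity,
and the function name `measure` and the sample string are assumptions made
for illustration, not anything from this thread. Python's `bytes` are always
octets, so the byte-versus-octet difference on machines with wider bytes is
outside what this sketch can show.

```python
import unicodedata


def measure(text: str) -> dict:
    """Count the same text as code points and as octets under two encodings.

    Hypothetical helper for illustration only.
    """
    # NFC normalization composes 'e' + combining accent into one code point
    # where a precomposed form exists.
    composed = unicodedata.normalize("NFC", text)
    return {
        "code_points": len(text),                      # as given
        "code_points_nfc": len(composed),              # after composition
        "octets_utf8": len(text.encode("utf-8")),      # as given, UTF-8
        "octets_utf8_nfc": len(composed.encode("utf-8")),
        "octets_latin1_nfc": len(composed.encode("iso-8859-1")),
    }


if __name__ == "__main__":
    # One user-visible character: 'e' followed by COMBINING ACUTE ACCENT.
    sample = "e\u0301"
    for name, count in measure(sample).items():
        print(f"{name}: {count}")
```

One visible character here is two code points as typed, one after
composition, and one, two or three octets depending on the form and the
encoding, which is why a count of "bytes" says little about a count of
"characters".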
> I just meant that judging a format only from its
> relatively heavy use of parentheses is about as silly as judging
> skills of a person only from the amount of melanin in their skin.

The amount of melanin is unrelated to the virtues we count in human beings.
An excessive need for indistinguishable brackets would definitely reduce
readability.

>>> And the reason why I started this thread is only to
>>> know how to buffer into memory the arrays of octets, because I need
>>> (in my usual use of S-expressions) to resolve the links between atoms
>>> before I can know the type of atoms. So I need a way to delay the
>>> typing, and in the meantime handle data as a generic byte sequence
>>> whose only known information is its size and its place in the
>>> S-expression tree. What exactly is so bad with that approach?
>>
>> Nothing wrong at the implementation level. However I don't see why
>> links need to be resolved first. In comparable cases - I do much messy
>> protocol/communication stuff - I usually first restore objects and then
>> resolve links.
>
> That's because some atom types are only known after having examined
> other atoms. If you remember my example (tcp-connect (host foo.example)
> (port 80)), here is how it would be interpreted: from the context or
> initial state, we expect a list beginning with an atom which is a
> string describing what to do with whatever is after. "tcp-connect" is
> therefore interpreted as a string, from the string value we know the
> following is a list of settings,

Once you have matched "tcp-connect", you know all the types of the
following components.

> This is not always the
> case, for example it might be necessary to build an associative array
> from a list of lists before being able to know the type of non-head
> atoms,

What for? Even if such cases might be invented, I see no reason to do that.
It is difficult to parse, it is difficult to read. So why mess with it?

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de
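
The two positions on the (tcp-connect (host foo.example) (port 80)) example
can be put side by side in a small sketch: atoms are first buffered as
untyped byte sequences, and types are assigned only once the head atom has
been matched. This is a sketch in Python rather than Ada, under loud
assumptions: the names `parse` and `interpret` are hypothetical, only bare
token atoms and parentheses are handled (no quoted or length-prefixed
atoms), and the choice of "host is text, port is an integer" is an
illustration, not a specification from the thread.

```python
def parse(data: bytes):
    """Parse one S-expression; atoms stay raw bytes, lists become lists."""
    pos = 0

    def skip_ws():
        nonlocal pos
        while pos < len(data) and data[pos:pos + 1].isspace():
            pos += 1

    def node():
        nonlocal pos
        skip_ws()
        if pos >= len(data):
            raise ValueError("unexpected end of input")
        current = data[pos:pos + 1]
        if current == b"(":
            pos += 1
            items = []
            while True:
                skip_ws()
                if pos >= len(data):
                    raise ValueError("missing ')'")
                if data[pos:pos + 1] == b")":
                    pos += 1
                    return items
                items.append(node())
        if current == b")":
            raise ValueError("unexpected ')'")
        # Bare atom: runs until whitespace or a parenthesis, kept untyped.
        start = pos
        while (pos < len(data)
               and data[pos:pos + 1] not in (b"(", b")")
               and not data[pos:pos + 1].isspace()):
            pos += 1
        return data[start:pos]

    result = node()
    skip_ws()
    if pos != len(data):
        raise ValueError("trailing data after expression")
    return result


def interpret(tree):
    """Assign types only after the head atom is known (assumed schema)."""
    if not isinstance(tree, list) or not tree:
        raise ValueError("expected a non-empty list")
    head, *settings = tree
    if head != b"tcp-connect":
        raise ValueError("unknown command")
    fields = {key: value for key, value in settings}  # each is [key, value]
    return {
        "host": fields[b"host"].decode("ascii"),      # text
        "port": int(fields[b"port"].decode("ascii")),  # integer
    }


if __name__ == "__main__":
    tree = parse(b"(tcp-connect (host foo.example) (port 80))")
    print(tree)
    print(interpret(tree))
```

In this shape the untyped tree exists only between `parse` and `interpret`;
whether that intermediate step deserves to be a public interface or should
remain an implementation detail is the point being argued above.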