From: "Dmitry A. Kazakov"
Subject: Re: S-expression I/O in Ada
Newsgroups: comp.lang.ada
Organization: cbb software GmbH
Reply-To: mailbox@dmitry-kazakov.de
Date: Sat, 7 Aug 2010 10:39:02 +0200
Message-ID: <46866b8yq8nn$.151lqiwa0y2k6.dlg@40tude.net>

On Sat, 7 Aug 2010 00:23:01 -0700 (PDT), Natacha Kerensikova wrote:

> On Aug 1, 11:13 pm, "Dmitry A. Kazakov" wrote:
>> On Sun, 1 Aug 2010 13:06:10 -0700 (PDT), Natacha Kerensikova wrote:
>>> On Aug 1, 8:49 pm, "Dmitry A. Kazakov" wrote:
>>>> Hmm, why do you consider brackets as separate elements?
>>> Because that's the definition of S-expressions :-)
>> OK, why S-expressions, then? (:-))
>
> First, because this is the existing format in most of my programs, so
> for interoperability I can choose only between using this format or
> converting from and to it (or rewrite everything). In both cases there
> is a strong code-reuse argument in favor of writing a S-expression
> library instead of writing a bunch of ad-hoc similar code chunks.

Legacy stuff also. That is a valid argument.

>>> I'm referring to this almost-RFC: http://people.csail.mit.edu/rivest/Sexp.txt
>> I see, yet another poor data format.
>
> Could you please explain why it is so poor?
>
> I consider it good because it is flexible, expressive and simple. I
> already mentioned quite a few times why it looks simple: my parser
> written from scratch in ~1000 lines. It looks expressive, because most
> data structures I can think of, and all data structures I have
> actually used, can be easily represented: an array can be written down
> as a list, a hash table as a list of two-element lists (key then
> value), and so on. And I see flexibility coming from the fact that any
> sequence of bytes can be encoded in an atom.
>
> What am I missing?

The requirements. One cannot judge a format without knowing what its
purpose is. Most formats like S-expressions are purposeless, in the sense
that there is no *rational* purpose behind them. As you wrote above, it is
either legacy (we have to overcome some limitations of some other poorly
designed components of the system) or personal preference (some people like
angle brackets, others prefer curly ones).

>>> I use S-expression as a sort of universal text-based container, just
>>> like most people use XML these days,
>>
>> These puzzle me too. I see no need for text-based containers for binary
>> data. Binary data aren't supposed to be read by people, they read texts.
>> And conversely the machine shall not read texts, it has no idea of good
>> literature...
>
> Actually this is an interesting remark, it made me realize I'm mixing two
> very different uses of a data format (though I still think
> S-expressions are adequate for both of them):
>
> I think text-based format is very useful when the file has to be dealt
> with by both humans and programs. The typical example would be
> configuration files: read and written by humans, and used by the
> program.

There should be no configuration files at all. The idea that a configuration
can be edited using a text editor is corrupt.

> And that's where I believe XML is really poor, because it's
> too heavy for human use. I occasionally feel the need of embedding
> binary data in some configuration files, e.g. cryptographic keys or
> binary initialization sequences to send as-is over whatever
> communication channel. On these occasions I do use the text-based
> binary encoding allowed by S-expressions, base-64 or hexadecimal, so
> that the configuration file is still a text file. The huge advantage
> of text files here is that there are already a lot of tools to deal
> with them, while using a binary format would require writing specific
> tools for humans to deal with it, which is IMO a waste of time compared
> to the text-based approach.

All these tools are here exclusively to handle the poor formats of these
files. They add absolutely nothing to the actual purpose of configuration,
namely handling the *semantics* of a given configuration parameter. None
answers simple questions like: how do I make the third button on the left
4 cm wide? Still less do they verify the parameter values. The king is naked.

> The other application is actual serialization,

That should not be text.

>>> But now that I think about it, I'm wondering whether I'm stuck in my C
>>> way of thinking and trying to apply it to Ada. Am I missing an Ada way
>>> of storing structured data in a text-based way?
>>
>> I think yes. Though it is not Ada-specific, rather commonly used OOP design
>> patterns.
>
> I heard people claiming that the first language shapes the mind of
> coders (and they continue saying a whole generation of programmers has
> been mind-crippled by BASIC). My first language happened to be 386
> assembly, that might explain things.

I see where mixing abstraction layers comes from...

> Anyway, I genuinely tried OOP
> with C++ (which I dropped because it's way too complex for me (and I'm
> tempted to say way too complex for the average coder, it should be
> reserved to the few geniuses actually able to fully master it)), but I
> never felt the need of anything beyond what can be done with a C
> struct containing function pointers.

Everything is Turing-complete you know... (:-))

> The
> problem is, I just can't manage to imagine how to go in a single step
> from the byte sequence containing a S-expression describing multiple
> objects to the internal memory representation and vice-versa.

You need not, that is the power of the OOP you dislike so much. Consider
that each object knows how to construct itself from a stream of octets. It
is trivial for simple objects like numbers. E.g. you read for as long as the
octets are '0'..'9' and generate the result by interpreting them as a decimal
representation. Or you take four octets and treat them as a big-endian
binary representation, etc. For a container type, you call the constructors
for each container member in order. If the container is unbounded, e.g. has
variable length, you read its bounds first or you use some terminator in the
stream to mark the container end. For containers of dynamically typed
elements you must learn the component type before you construct it.

In theory this is called a recursive descent parser, the simplest thing
ever.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de
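
For illustration only, here is a minimal sketch of the recursive descent
described above. It is written in Python rather than Ada, and it assumes a
toy grammar (decimal numbers and parenthesized lists), not the full syntax of
the Rivest draft. All function names are hypothetical and not taken from any
existing library.

```python
import io


def peek(stream):
    """Return the next octet without consuming it (b'' at end of stream)."""
    octet = stream.read(1)
    if octet:
        stream.seek(-1, io.SEEK_CUR)
    return octet


def skip_spaces(stream):
    """Consume any whitespace octets in front of the next element."""
    while peek(stream).isspace():
        stream.read(1)


def read_number(stream):
    """Simple object: read while the octets are '0'..'9', decode as decimal."""
    digits = b""
    while peek(stream).isdigit():
        digits += stream.read(1)
    if not digits:
        raise ValueError("expected a decimal digit")
    return int(digits.decode("ascii"))


def read_uint32_be(stream):
    """Simple object: take four octets as a big-endian binary number."""
    octets = stream.read(4)
    if len(octets) != 4:
        raise ValueError("expected four octets")
    return int.from_bytes(octets, "big")


def read_list(stream):
    """Container: construct members in order until the ')' terminator."""
    if stream.read(1) != b"(":
        raise ValueError("expected '('")
    items = []
    while True:
        skip_spaces(stream)
        nxt = peek(stream)
        if nxt == b"":
            raise ValueError("unterminated list")
        if nxt == b")":
            stream.read(1)  # consume the terminator
            return items
        items.append(read_value(stream))


def read_value(stream):
    """Dynamically typed element: learn the type first, then construct it."""
    skip_spaces(stream)
    nxt = peek(stream)
    if nxt == b"(":
        return read_list(stream)
    if nxt.isdigit():
        return read_number(stream)
    raise ValueError(f"unexpected octet {nxt!r}")


print(read_value(io.BytesIO(b"(1 (20 300) ())")))
print(read_uint32_be(io.BytesIO(b"\x00\x00\x01\x00")))
```

Each reader consumes exactly the octets of its own object and leaves the
stream positioned at the next one, which is why the container reader can
simply call the element reader in a loop.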