From: Natacha Kerensikova
Newsgroups: comp.lang.ada
Subject: Re: S-expression I/O in Ada
Date: Sat, 7 Aug 2010 00:23:01 -0700 (PDT)

On Aug 1, 11:13 pm, "Dmitry A. Kazakov" wrote:
> On Sun, 1 Aug 2010 13:06:10 -0700 (PDT), Natacha Kerensikova wrote:
> > On Aug 1, 8:49 pm, "Dmitry A. Kazakov" wrote:
> >> Hmm, why do you consider brackets as separate elements?
> > Because that's the definition of S-expressions :-)
> OK, why S-expressions, then? (:-))

First, because this is the existing format in most of my programs, so for interoperability I can choose only between using this format or converting from and to it (or rewriting everything). In both cases there is a strong code-reuse argument in favor of writing an S-expression library instead of writing a bunch of similar ad-hoc code chunks.

Second, because I like this format and I find it good (see below).

> > I'm referring to this almost-RFC: http://people.csail.mit.edu/rivest/Sexp.txt
> I see, yet another poor data format.

Could you please explain why it is so poor?

I consider it good because it is flexible, expressive and simple. I have already mentioned quite a few times why it looks simple: my parser was written from scratch in ~1000 lines. It looks expressive, because most data structures I can think of, and all data structures I have actually used, can be easily represented: an array can be written down as a list, a hash table as a list of two-element lists (key then value), and so on. And I see flexibility coming from the fact that any sequence of bytes can be encoded in an atom. What am I missing?

> > I use S-expression as a sort of universal text-based container, just
> > like most people use XML these days,
>
> These wonder me too. I see no need in text-based containers for binary
> data. Binary data aren't supposed to be read by people, they read texts.
> And conversely the machine shall not read texts, it has no idea of good
> literature...

Actually this is an interesting remark, it made me realize I'm mixing two very different uses of a data format (though I still think S-expressions are adequate for both of them):

I think a text-based format is very useful when the file has to be dealt with by both humans and programs. The typical example would be configuration files: read and written by humans, and used by the program.
And that's where I believe XML is really poor, because it's too heavy for human use.

I occasionally feel the need to embed binary data in some configuration files, e.g. cryptographic keys or binary initialization sequences to send as-is over whatever communication channel. On these occasions I do use the text-based binary encoding allowed by S-expressions, base-64 or hexadecimal, so that the configuration file is still a text file. The huge advantage of text files here is that there are already a lot of tools to deal with them, while using a binary format would require writing specific tools for humans to deal with it, which is IMO a waste of time compared to the text-based approach.

The other application is actual serialization, i.e. converting internal types into a byte sequence in order to be stored on disk or transmitted over a network or whatever. In this situation, humans don't need to interact with the data (except for debugging purposes, but that argument is so light that it only matters when everything else is equal).

In my previous posts I have talked a lot about serialization, while my actual use of S-expressions is more often the first one. Historically I first used S-expressions for configuration files, because they are more expressive than all other text formats I know, while still being extremely simple. I then used this format for serialization mostly for the sake of code reuse: I already had code for S-expressions, so it was very cheap to use it for serialization purposes. Compared to using another serialization format, it leads to less code that is used more, hence fewer opportunities to write bugs and more opportunities to find and fix bugs. So it seems like a very rational choice.

> > The library is not supposed to care about what those some_stuff_ are.
> > Actually, the library is suppose to get binary data from the
> > application along with the tree structure described above, store it
> > into a sequence of bytes (on a disk or over a network or whatever),
> > and to retrieve from the byte sequence the original tree structure
> > along with the binary data provided.
>
> Serialize can yield a binary chunk, but if you have to write it anyway, why
> not to write text? You insisted on having text, why do mess with binary
> stuff?

I hope the above already answered this: I mostly use S-expressions for text purposes, yet occasionally I feel the need to embed binary data. Of course there are a lot of ways to textify binary data, like hexadecimal or base-64, but considering S-expressions already handle textification, I don't see the point of having the application deal with it too.

> > But now that I think about it, I'm wondering whether I'm stuck in my C
> > way of thinking and trying to apply it to Ada. Am I missing an Ada way
> > of storing structured data in a text-based way?
>
> I think yes. Though it is not Ada-specific, rather commonly used OOP design
> patterns.

I have heard people claim that the first language shapes the mind of coders (and they go on to say that a whole generation of programmers has been mind-crippled by BASIC). My first language happened to be 386 assembly, which might explain things. Anyway, I genuinely tried OOP with C++ (which I dropped because it's way too complex for me (and I'm tempted to say way too complex for the average coder, it should be reserved to the few geniuses actually able to fully master it)), but I never felt the need for anything beyond what can be done with a C struct containing function pointers.

Now back to the topic: thanks to your post and some others in this thread (for which I'm also thankful), I came to realize that my mistake is maybe wanting to parse S-expressions and atom contents separately.
The problem is, I just can't manage to imagine how to go in a single step from the byte sequence containing an S-expression describing multiple objects to the internal memory representation and vice-versa.

Thanks for your help and your patience,
Natacha
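
To make the round trip described above concrete (a tree of atoms in, a byte sequence out, and back again), here is a minimal sketch in Python. It assumes the canonical encoding from the Rivest draft, where an atom is written as its decimal length, a colon, and then the raw bytes. The names encode, decode and _parse are hypothetical and not taken from any existing library, and the sample configuration is made up purely for illustration.

```python
# Hypothetical sketch: atoms are Python bytes, lists are Python lists.
# Assumes the canonical (length-prefixed) encoding only; the base-64,
# hexadecimal and token forms of the draft are not handled here.


def encode(node):
    """Serialize a tree of bytes/lists into canonical S-expression bytes."""
    if isinstance(node, bytes):
        # Canonical atom: decimal length, a colon, then the raw bytes.
        return str(len(node)).encode("ascii") + b":" + node
    return b"(" + b"".join(encode(child) for child in node) + b")"


def decode(data):
    """Parse canonical S-expression bytes back into a tree of bytes/lists."""
    node, pos = _parse(data, 0)
    if pos != len(data):
        raise ValueError("trailing data after S-expression")
    return node


def _parse(data, pos):
    """Parse one node starting at pos; return (node, next position)."""
    if pos >= len(data):
        raise ValueError("unexpected end of input")
    if data[pos:pos + 1] == b"(":
        pos += 1
        items = []
        while data[pos:pos + 1] != b")":
            if pos >= len(data):
                raise ValueError("unterminated list")
            item, pos = _parse(data, pos)
            items.append(item)
        return items, pos + 1  # skip the closing bracket
    colon = data.index(b":", pos)  # raises ValueError when no colon is left
    digits = data[pos:colon]
    if not digits.isdigit():
        raise ValueError("invalid atom length")
    start = colon + 1
    end = start + int(digits)
    if end > len(data):
        raise ValueError("atom longer than remaining input")
    return data[start:end], end


# Made-up configuration: an array as a list, a hash table as a list of
# two-element lists, and a binary atom that happens to contain brackets.
config = [
    b"server",
    [b"ports", b"80", b"443"],
    [b"limits", [b"cpu", b"2"], [b"mem", b"512"]],
    [b"key", bytes([0, 255, 40, 41])],
]

wire = encode(config)
assert decode(wire) == config
```

Because atoms are length-prefixed, the brackets inside the key atom are never mistaken for list delimiters. The array and key-value conventions live entirely in how the application arranges its lists, which is why the library itself does not need to know what the atoms mean.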