From: Natacha Kerensikova
Newsgroups: comp.lang.ada
Subject: Re: S-expression I/O in Ada
Date: Thu, 12 Aug 2010 05:16:41 -0700 (PDT)
Message-ID: <6a5068d7-b774-4c52-8b00-ddcc76865847@p7g2000yqa.googlegroups.com>

On Aug 12, 12:55 pm, Ludovic Brenta wrote:
> Natacha Kerensikova wrote on comp.lang.ada:
> [...]
>
> > Sexp_Stream is supposed to perform S-expression I/O over streams,
> > without ever constructing a S-expression into memory. It is supposed
> > to do only the encoding and decoding of S-expression format, to expose
> > its clients only atoms and relations.
>
> But how can it expose atoms and relations without an in-memory tree
> representation? Honestly, I do not think that such a concept is
> viable.

I have already shown how to write S-expressions to a stream without
an in-memory representation of the S-expression (though I still need an
in-memory representation of atoms, and I believe this can't be worked
around), using an interface like this:

   procedure OpenList(sstream: in out Sexp_Stream);
     -- more or less equivalent to output a '(' into the underlying stream

   procedure CloseList(sstream: in out Sexp_Stream);
     -- more or less equivalent to output a ')' into the underlying stream

   procedure PutAtom(sstream: in out Sexp_Stream; atom: in octet_array);
     -- encode the atom and send it into the underlying stream

I guess it will also need some functions to convert usual types to and
from Octet_Array.

The reading part is a bit more tricky, and I admitted when I proposed
Sexp_Stream that I didn't know how to do it.
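For readers more at home in C, the writing interface above (OpenList,
CloseList, PutAtom) could look roughly like the sketch below. It is only
an illustration under assumptions that are not in the post: atoms are
always written in verbatim form ("<length>:<octets>"), the underlying
stream is a FILE*, and every name (sexp_writer, sx_open_list,
sx_close_list, sx_put_atom, sx_put_string, sx_put_int, tcp_info_write)
is hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical C counterpart of Sexp_Stream for writing: it only wraps
   the underlying stream, no S-expression tree is ever built. */
typedef struct {
    FILE *out;
} sexp_writer;

/* OpenList: emit '(' into the underlying stream. */
static void sx_open_list(sexp_writer *w)  { fputc('(', w->out); }

/* CloseList: emit ')' into the underlying stream. */
static void sx_close_list(sexp_writer *w) { fputc(')', w->out); }

/* PutAtom: emit one atom in verbatim form, "<length>:<octets>"
   (an assumption of this sketch; it needs no escaping). */
static void sx_put_atom(sexp_writer *w, const void *data, size_t len)
{
    fprintf(w->out, "%zu:", len);
    fwrite(data, 1, len, w->out);
}

/* Conversion helpers from usual types to atoms. */
static void sx_put_string(sexp_writer *w, const char *s)
{
    sx_put_atom(w, s, strlen(s));
}

static void sx_put_int(sexp_writer *w, int value)
{
    char buf[32];
    int n = snprintf(buf, sizeof buf, "%d", value);
    assert(n > 0 && (size_t)n < sizeof buf);
    sx_put_atom(w, buf, (size_t)n);
}

/* Example client: writes (tcp-connect (host <host>) (port <port>))
   node by node, straight into the stream. */
static void tcp_info_write(sexp_writer *w, const char *host, int port)
{
    sx_open_list(w);
    sx_put_string(w, "tcp-connect");
    sx_open_list(w);
    sx_put_string(w, "host");
    sx_put_string(w, host);
    sx_close_list(w);
    sx_open_list(w);
    sx_put_string(w, "port");
    sx_put_int(w, port);
    sx_close_list(w);
    sx_close_list(w);
}
```

With a writer wrapping stdout, tcp_info_write(&w, "foo.example", 80)
emits (11:tcp-connect(4:host11:foo.example)(4:port2:80)). The only
in-memory data at any time is the atom being written.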
Having thought (maybe too much) since then, here is what the interface
might look like:

   type Node_Type is (S_None, S_List, S_Atom);

   function Current_Node_Type(sstream: in Sexp_Stream) return Node_Type;

   procedure Get_Atom(sstream: in Sexp_Stream; contents: out octet_array);
     -- raises an exception when Current_Node_Type is not S_Atom
     -- not sure "out octet_array" works, but that's the idea
     -- maybe turn it into a function for easier atom-to-object conversion

   procedure Move_Next(sstream: in out Sexp_Stream);
     -- when the client is finished with the current node, it calls this
     -- procedure to update stream internal state to reflect the next node in
     -- list

   procedure Move_Lower(sstream: in out Sexp_Stream);
     -- raises an exception when Current_Node_Type is not S_List
     -- update the internal state to reflect the first child of the current list

   procedure Move_Upper(sstream: in out Sexp_Stream);
     -- update the internal state to reflect the node following the list
     -- containing the current node. sort of "uncle" node
     -- the implementation is roughly skipping whatever nodes exist in the
     -- current list until reading its ')' and reading the following node

Such an interface supports data streams, i.e. reading new data without
seek or pushback. The data would probably be read octet-by-octet
(unless reading a verbatim encoded atom), hoping for an efficient
buffering in the underlying stream. If that's a problem I guess some
buffering can be transparently implemented in the private part of
Sexp_Stream.

Of course atoms are to be kept in memory, which means Sexp_Stream will
have to contain a dynamic array (Vector in Ada terminology, right?)
populated from the underlying stream until the atom is completely
read. Get_Atom would hand over a (static) array built from the private
dynamic array.

The implementation would for example rely on a procedure to advance
the underlying stream to the next node (i.e.
skipping white space), and from the first character know whether it's
the beginning of a new list (when '(' is encountered) and update the
state to S_List, and it's finished; whether it's the end of the list
(when ')' is encountered) and update the state to S_None; or whether
it's the beginning of an atom, so read it completely, updating the
internal dynamic array and setting the state to S_Atom.

Well actually, it's probably not the best idea, I'm not yet clear
enough on the specifications and on stream I/O to clearly think about
implementation, but that should be enough to make you understand the
idea of S-expression reading without S-expression objects.

Now regarding the actual use of this interface, I think my previous
writing example is enough, so here is the reading part.

My usual way of reading S-expression configuration files is to read
sequentially, one at a time, a list whose first element is an atom.
Atoms, empty lists and lists beginning with a list are considered as
comments and silently dropped.

"(tcp-connect …)" is meant to be one of these, processed by TCP_Info's
client, which will hand over only the "…" part to TCP_Info.Read (or
other initialization subprogram).

So just like TCP_Info's client processes something like
"(what-ever-config …) (tcp-connect …) (other-config …)", TCP_Info.Read
will process only "(host foo.example) (port 80)".

So TCP_Info's client, after having read the "tcp-connect" atom, will
call Move_Next on the Sexp_Stream and pass it to TCP_Info.Read.
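To make the reading interface described earlier more concrete, here is
a minimal C sketch of it. It is not taken from csexp.c and rests on
several assumptions: only bare token atoms are recognized (no verbatim,
quoted or base64 encodings), the atom buffer is a fixed array rather
than a dynamic one, error codes replace exceptions, and the
"no pushback" constraint is met with one octet of private lookahead
inside the reader. All names (sexp_reader, sx_init, sx_get_atom,
sx_move_next, sx_move_lower, sx_move_upper) are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

typedef enum { S_NONE, S_LIST, S_ATOM } node_type;

/* Hypothetical reader state: no tree, only the current node. */
typedef struct {
    FILE     *in;
    int       pending;    /* one octet of private lookahead, EOF if empty */
    node_type type;       /* type of the current node */
    char      atom[256];  /* fixed buffer for brevity (the post suggests a
                             dynamic array); longer atoms are truncated */
    size_t    atom_len;
} sexp_reader;

/* Next octet: the saved lookahead first, then the underlying stream. */
static int next_char(sexp_reader *r)
{
    int c = r->pending;
    if (c != EOF) { r->pending = EOF; return c; }
    return fgetc(r->in);
}

/* Advance to the next node: skip white space, then classify the node
   from its first octet. */
static void advance(sexp_reader *r)
{
    int c;
    do { c = next_char(r); } while (c != EOF && isspace(c));
    r->atom_len = 0;
    r->atom[0] = '\0';
    if (c == EOF || c == ')') { r->type = S_NONE; return; }
    if (c == '(')             { r->type = S_LIST; return; }
    /* Anything else starts a token atom: read it completely. */
    while (c != EOF && !isspace(c) && c != '(' && c != ')') {
        if (r->atom_len < sizeof r->atom - 1)
            r->atom[r->atom_len++] = (char)c;
        c = fgetc(r->in);
    }
    r->atom[r->atom_len] = '\0';
    r->pending = c;       /* keep the terminator for the next call */
    r->type = S_ATOM;
}

/* Consume octets up to and including the ')' closing the enclosing list. */
static void skip_to_close(sexp_reader *r)
{
    int depth = 1, c;
    while (depth > 0 && (c = next_char(r)) != EOF) {
        if (c == '(') depth++;
        else if (c == ')') depth--;
    }
}

static void sx_init(sexp_reader *r, FILE *in)
{
    assert(in != NULL);
    r->in = in;
    r->pending = EOF;
    advance(r);           /* position on the first node */
}

/* Get_Atom: NULL (instead of an exception) when not on an atom. */
static const char *sx_get_atom(const sexp_reader *r)
{
    return r->type == S_ATOM ? r->atom : NULL;
}

/* Move_Next: go to the next sibling, skipping an unvisited list body. */
static void sx_move_next(sexp_reader *r)
{
    if (r->type == S_NONE) return;   /* already past the last sibling */
    if (r->type == S_LIST) skip_to_close(r);
    advance(r);
}

/* Move_Lower: enter the current list; -1 when not on a list. */
static int sx_move_lower(sexp_reader *r)
{
    if (r->type != S_LIST) return -1;
    advance(r);
    return 0;
}

/* Move_Upper: go to the node following the enclosing list ("uncle"). */
static void sx_move_upper(sexp_reader *r)
{
    if (r->type == S_LIST) skip_to_close(r);  /* body of the current list */
    if (r->type != S_NONE) skip_to_close(r);  /* remaining siblings */
    advance(r);
}
```

On the input "(host foo.example) (port 80)", sx_init leaves the reader
on a list; sx_move_lower then yields the atom "host", sx_move_next the
atom "foo.example", and sx_move_upper lands on the "(port 80)" list.
At no point does more than one atom live in memory.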
Then TCP_Info.Read proceeds with:

   loop
      case Current_Node_Type(sstream) is
         when S_None => return;  -- TCP_Info configuration is over
         when S_Atom => Move_Next(sstream);  -- silent atom drop
         when S_List =>
            Move_Lower(sstream);
            Get_Atom(sstream, atom);
            -- make command of type String from atom
            -- if Get_Atom was successful
            Move_Next(sstream);
            if command = "host" then
               -- Get_Atom and turn it into host string
            elsif command = "port" then
               -- Get_Atom and turn it into port number
            else
               -- complain about unrecognized command
            end if;
            Move_Upper(sstream);
      end case;
   end loop;

TCP_Info's client S-expression parsing would be very similar, except
"if command = …" would be followed by a call to TCP_Info.Read rather
than a Get_Atom.

So where are the problems with my Sexp_Stream without in-memory
objects? What am I missing? Or is it so ugly and insane that it should
be kept in C's realm?

> > A further unnamed yet package would handle the memory representation
> > of S-expressions, which involve object creation, memory management and
> > things like that. It would be a client of Sexp_Stream for I/O, so the
> > parsing would only be done at one place (namely Sexp_Stream). As I
> > said Ludovic Brenta's code might take this place.
>
> No, it replaces both of your hypothetical packages and I don't think
> you can have the "stream" package without the "S-Expression" package.

Yes, I indeed realized this mistake shortly after having sent the
message. My bad.

> You could, however, have an S-Expression package without any I/O.

Indeed, though I have to admit I have trouble imagining what use it
can have.

> TCP_Info : constant String := "(tcp-connect (host foo.bar) (port
> 80))";
> TCP_Info_Structured := constant To_TCP_Info (To_Sexp (TCP_Info));

That's an interesting idea, which conceptually boils down to
serializing the S-expression by hand into a String, in order to
deserialize it as an in-memory object, in order to serialize it back
into a Stream.
Proofreading my post, the above might sound sarcastic though actually
it is not. It's a twist I hadn't thought of, but it might indeed turn
out to be the simplest practical way of doing it.

Actually for a very long time I used to write S-expressions to file
using only string facilities and a special sx_print_atom() function to
handle escaping of unsafe data. Back then I would have handled
TCP_Info.Write with the following C fragment (sorry I don't know yet
how to turn it into Ada, but I'm sure an equally simple equivalent
exists):

   fprintf(outfile, "(tcp-connect\n\t(host ");
   sx_print_atom(outfile, host);
   fprintf(outfile, ")\n\t(port %d)\n)\n", port);

> >> Your TCP_Info-handling pkg would convert the record into an S-expression, and
> >> call a single operation from your S-expression pkg to output the S-expression.
>
> > That's the tricky part. At least so tricky that I can't imagine how to
> > do it properly.
>
> Yes, the S_Expression.Read operation is quite involved. But it can be
> done, and has been done :)

Actually what I called "tricky" was the "single operation" part of
TCP_Info's job. As I said, 8 nodes to build and send, it's not that
easy to fit in a single operation, though your trick performs it quite
well. By comparison the S-expression reading from a stream feels less
tricky to me, though a bit more tiring, and has indeed been done, even
by me (if you can read C, that's the second half (lines 417 to the
end) of http://git.instinctive.eu/cgit/libnathandbag/tree/csexp.c ).

Thanks for your comments and your implementation,
Natacha