From: Ludovic Brenta
Newsgroups: comp.lang.ada
Subject: Re: S-expression I/O in Ada
Date: Thu, 12 Aug 2010 22:22:04 +0200
Message-ID: <87zkwrk2dv.fsf@ludovic-brenta.org>

I wrote on comp.lang.ada:
> I made some fixes and I now consider my S-Expression parser[1]
> feature-complete as of revision
> b13ccabbaf227bad264bde323138910751aa2c2b. There may still be some
> bugs though, and the error reporting (to diagnose syntax errors in
> the input) is very primitive.
>
> Highlights:
> * the procedure S_Expression.Read is a quasi-recursive descent
>   parser. "Quasi" because it only recurses when encountering an
>   opening parenthesis, but processes atoms without recursion, in the
>   same finite state machine.
> * the parser reads each character exactly once; there is no push_back
>   or backtracking involved. This makes the parser suitable to process
>   standard input on the fly.
> * to achieve this, I had to resort to using exceptions instead of
>   backtracking; this happens when the parser encounters a ')'
>   immediately after an S-Expression (atom or list).
> * the parser also supports lists of the form (a b c) (more than 2
>   elements) and properly translates them to (a (b c)). The Append()
>   procedure that does this is also public and available to clients.
> * the parser does not handle incomplete input well. If it gets
>   Ada.IO_Exceptions.End_Error in the middle of an S-Expression, it
>   will return an incomplete, possibly empty, S-Expression rather than
>   report the error. I'll try to improve that.
> * the test.adb program demonstrates how to construct an S-Expression
>   tree in memory (using cons()) and then send it to a stream (using
>   'Write).
> * the test.adb program also demonstrates how to read an S-Expression
>   from a stream (using 'Read) and then traverse the in-memory tree
>   (using car(), cdr()).
>
> [1] http://green.ada-france.org:8081/branch/changes/org.ludovic-brenta.s_expressions
>
> I have not yet tested the parser on your proposed input (IIUC,
> consisting of two S-Expressions with a missing closing parenthesis).
> I think this will trigger the bug where End_Error in the middle of an
> S-Expression is not diagnosed.
>
> I also still need to add the proper GPLv3 license text on each file.
>
> I'll probably add support for Lisp-style comments (starting with ';'
> and ending at end of line) in the future.
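To make the single-pass idea from the highlights concrete, here is a minimal sketch in Python rather than Ada. It is not the code from the repository. The names `ListEnd`, `SExprSyntaxError`, `nest`, `read` and `write` are hypothetical, the tuple representation is an assumption, and the right-nesting of lists longer than three elements is a guess at what Append() does. What it does show is the technique described: each character is read once, and an exception carries the news that a ')' has already been consumed.

```python
import io


class SExprSyntaxError(Exception):
    """Hypothetical error type for malformed input."""


class ListEnd(Exception):
    """Signals that a ')' was consumed; `last` is the atom it ended, if any."""

    def __init__(self, last=None):
        super().__init__("')' consumed")
        self.last = last


def nest(items):
    """Assumed Append-like step: [a, b, c] becomes (a, (b, c))."""
    if len(items) < 2:
        return tuple(items)          # "()" or a one-element list "(a)"
    result = items[-1]
    for item in reversed(items[:-1]):
        result = (item, result)
    return result


def read_expr(stream):
    """Read one expression, consuming each character exactly once."""
    ch = stream.read(1)
    while ch.isspace():
        ch = stream.read(1)
    if ch == "":
        raise EOFError("end of stream")
    if ch == ")":
        raise ListEnd()              # ')' right after a nested list
    if ch == "(":
        return read_list(stream)     # the only place that recurses
    atom = ch
    while True:
        ch = stream.read(1)
        if ch == "" or ch.isspace():
            return atom
        if ch == ")":
            raise ListEnd(atom)      # ')' right after an atom: no push-back
        if ch == "(":
            raise SExprSyntaxError("'(' directly after an atom")
        atom += ch


def read_list(stream):
    """Collect expressions until a ListEnd reports the closing ')'."""
    items = []
    while True:
        try:
            items.append(read_expr(stream))
        except ListEnd as end:
            if end.last is not None:
                items.append(end.last)
            return nest(items)
        except EOFError:
            raise SExprSyntaxError("end of stream inside a list") from None


def read(stream):
    """Top-level entry point: a ')' outside any list is an error."""
    try:
        return read_expr(stream)
    except ListEnd:
        raise SExprSyntaxError("unexpected ')' outside any list") from None


def write(expr):
    """Convert the in-memory tree back to text."""
    if isinstance(expr, str):
        return expr
    return "(" + " ".join(write(e) for e in expr) + ")"


stream = io.StringIO("(tcp-connect (host foo.bar) (port 80)) (a b c)")
print(write(read(stream)))  # (tcp-connect ((host foo.bar) (port 80)))
print(write(read(stream)))  # (a (b c))
```

Because nothing is pushed back, the same `read` can be called repeatedly on one stream, which is what makes this style usable on standard input.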
As of revision b60f80fba074431aeeffd95aa273a1d4fc81bf41, I now handle
end-of-stream in all situations and (I believe) react appropriately.

I have now tested the parser against this sample input file:

$ cat test_input
(tcp-connect (host foo.bar) (port 80))
(tcp-connect ((host foo.bar) (port 80))
(tcp-connect (host foo.bar) (port 80)))
$ ./test < test_input
(tcp-connect ((host foo.example) (port 80)))
Parsing the S-Expression: (tcp-connect ((host foo.bar) (port 80)))
Writing the S-Expression: (tcp-connect ((host foo.bar) (port 80)))
Parsing the S-Expression: Exception name: TEST.SYNTAX_ERROR
Message: Expected atom with value 'host'
Writing the S-Expression: (tcp-connect ((((host foo.bar) (port 80)) (tcp-connect (host foo.bar))) (port 80)))

raised S_EXPRESSION.SYNTAX_ERROR : Found ')' at the start of an expression

The very first line of output is the result of 'Write of an
S-Expression constructed from a hardcoded TCP_Connect record.

"Parsing" refers to the high-level part of the parsing that traverses
the in-memory S-Expression tree and converts it to a TCP_Connect_T
record.

"Writing" refers to both halves of the low-level parsing: reading the
character stream to produce the in-memory S-Expression tree, and
converting that tree back to a character stream.

The TEST.SYNTAX_ERROR occurs because the high-level parser found a list
instead of the expected atom "host"; the cause is the extra '(' before
"host" at line 3 in the input.

The S_EXPRESSION.SYNTAX_ERROR occurs because the low-level parser found
an extra ')' at the very end of line 4 in the input; it coalesced lines
3 and 4 into a single, valid, S-Expression, and was expecting a new
S-Expression starting with '('.

-- 
Ludovic Brenta.
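The "Parsing" step described above can be sketched as follows, again in Python rather than Ada and again not taken from test.adb. The tuple trees, the `TcpConnect` dataclass, `TreeSyntaxError`, `expect_atom` and `to_tcp_connect` are hypothetical names, and the trees are written out by hand from the "Writing" lines of the sample output. The point is only to show why a list in the position of the atom "host" produces the reported message.

```python
from dataclasses import dataclass


class TreeSyntaxError(Exception):
    """Hypothetical counterpart of TEST.SYNTAX_ERROR in the output above."""


@dataclass
class TcpConnect:
    host: str
    port: int


def _pair(expr):
    """Return expr if it is a (car, cdr) pair, else reject it."""
    if not (isinstance(expr, tuple) and len(expr) == 2):
        raise TreeSyntaxError("Expected a list")
    return expr


def car(expr):
    return _pair(expr)[0]


def cdr(expr):
    return _pair(expr)[1]


def expect_atom(expr, value):
    """Fail unless expr is exactly the atom `value` (a list never matches)."""
    if expr != value:
        raise TreeSyntaxError(f"Expected atom with value '{value}'")


def atom(expr):
    if not isinstance(expr, str):
        raise TreeSyntaxError("Expected an atom")
    return expr


def to_tcp_connect(expr):
    """Walk (tcp-connect ((host H) (port P))) and build a record."""
    expect_atom(car(expr), "tcp-connect")
    args = cdr(expr)
    host_pair, port_pair = car(args), cdr(args)
    expect_atom(car(host_pair), "host")
    expect_atom(car(port_pair), "port")
    return TcpConnect(host=atom(cdr(host_pair)), port=int(atom(cdr(port_pair))))


# Tree for: (tcp-connect ((host foo.bar) (port 80)))
good = ("tcp-connect", (("host", "foo.bar"), ("port", "80")))
print(to_tcp_connect(good))  # TcpConnect(host='foo.bar', port=80)

# Tree for the coalesced expression in the second "Writing" line.
bad = (
    "tcp-connect",
    (
        ((("host", "foo.bar"), ("port", "80")), ("tcp-connect", ("host", "foo.bar"))),
        ("port", "80"),
    ),
)
try:
    to_tcp_connect(bad)
except TreeSyntaxError as err:
    print(err)  # Expected atom with value 'host'
```

In the second tree the element where "host" should be is itself a list, so the check fails before any field is extracted.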