From: Natasha Kerensikova
Newsgroups: comp.lang.ada
Subject: Re: Reading the while standard input into a String
Date: Mon, 6 Jun 2011 10:46:20 +0000 (UTC)

Hello,

On 2011-06-06, Dmitry A. Kazakov wrote:
> On Sun, 5 Jun 2011 16:20:39 +0000 (UTC), Natasha Kerensikova wrote:
>
>> However I still read
>> character by character
>
> You have to, because the definition of line end is language/OS/encoding
> dependent, so in order to detect line ends properly you need to scan
> characters one by one, maybe recoding them into the encoding used by the
> parser (e.g. UTF-8). It does not make much sense to read input by arbitrary
> size chunks. Read it line by line. If parser needs returns over the line
> margin (unlikely), then keep read lines cached.

The line end detection problem is exactly why I wanted unprocessed input
bytes.
Each instance of the parser code can get at least LF-ended lines
(from Unix files) or CR&LF-ended lines (from a web form), so unless
Ada.Text_IO can deal with this (but I guess I cannot really count on it),
I have to do it in my own code.

>> into a temporary buffer,
>
> Read it into the destination buffer.

Well, the destination buffer for the processed text is a very inefficient
place to store input text, because the processing involves a lot of
insertions. Moreover, because of the forward reference issue I detailed in
another post, I cannot see how I can escape the scheme:

   input stream --> temporary buffer --> output stream/buffer/storage

> Don't use Unbounded_String; that is a
> bad idea in almost all cases, this one included.

Would you explain why?

Unless there is a way to predict the length of the input, I need some text
container able to grow as much as needed while reading. I will also need a
similar container during the processing, and even GNAT's improved
Unbounded_String still does a lot of reallocations in the process.

So I was considering implementing my own container, something like
Chunked_Unbounded_String, which would allocate memory in chunks of fixed
size (probably provided by a generic package parameter, a few kilobytes
each) and thereby greatly improve the performance of many small Appends.

But I guess you weren't calling Unbounded_String a bad idea only because
of performance, were you?

Thanks for your comments,
Natasha
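
For illustration, here is a minimal sketch of the chunked container idea
described above. It is only one possible reading of it, not an existing
library: the names Chunked_Buffers, Buffer, Append, Length, To_String and
the demo procedure are all hypothetical, and the chunk size is a generic
parameter as suggested in the post. Chunks are kept in a linked list, so
an Append never copies data that was stored earlier.

```ada
with Ada.Text_IO;

procedure Chunked_Demo is

   --  Hypothetical sketch: an append-only text buffer that grows by
   --  fixed-size chunks. Chunks are never freed here; a real version
   --  would add deallocation (e.g. through a controlled type).
   generic
      Chunk_Size : Positive := 4096;
   package Chunked_Buffers is
      type Buffer is limited private;
      procedure Append (B : in out Buffer; Item : in String);
      function Length (B : Buffer) return Natural;
      function To_String (B : Buffer) return String;
   private
      type Chunk;
      type Chunk_Access is access Chunk;
      type Chunk is record
         Data : String (1 .. Chunk_Size);
         Next : Chunk_Access;
      end record;
      type Buffer is limited record
         First, Last : Chunk_Access;   --  head and tail of the chunk list
         Used        : Natural := 0;   --  characters used in the tail chunk
         Total       : Natural := 0;   --  characters stored overall
      end record;
   end Chunked_Buffers;

   package body Chunked_Buffers is

      procedure Append (B : in out Buffer; Item : in String) is
         Pos : Natural := 0;  --  characters of Item already copied
      begin
         while Pos < Item'Length loop
            --  Allocate a new chunk only when there is none yet or the
            --  tail chunk is full.
            if B.Last = null then
               B.First := new Chunk;
               B.Last  := B.First;
               B.Used  := 0;
            elsif B.Used = Chunk_Size then
               B.Last.Next := new Chunk;
               B.Last      := B.Last.Next;
               B.Used      := 0;
            end if;
            declare
               Count : constant Positive :=
                 Natural'Min (Chunk_Size - B.Used, Item'Length - Pos);
            begin
               B.Last.Data (B.Used + 1 .. B.Used + Count) :=
                 Item (Item'First + Pos .. Item'First + Pos + Count - 1);
               B.Used := B.Used + Count;
               Pos    := Pos + Count;
            end;
         end loop;
         B.Total := B.Total + Item'Length;
      end Append;

      function Length (B : Buffer) return Natural is
      begin
         return B.Total;
      end Length;

      function To_String (B : Buffer) return String is
         Result : String (1 .. B.Total);
         Pos    : Natural := 0;
         Cur    : Chunk_Access := B.First;
      begin
         while Cur /= null loop
            declare
               Count : Natural := Chunk_Size;
            begin
               if Cur = B.Last then
                  Count := B.Used;  --  the tail chunk may be partly filled
               end if;
               Result (Pos + 1 .. Pos + Count) := Cur.Data (1 .. Count);
               Pos := Pos + Count;
            end;
            Cur := Cur.Next;
         end loop;
         return Result;
      end To_String;

   end Chunked_Buffers;

   --  A tiny chunk size, chosen only to force several allocations.
   package Small_Buffers is new Chunked_Buffers (Chunk_Size => 4);

   Text : Small_Buffers.Buffer;

begin
   Small_Buffers.Append (Text, "Hello, ");
   Small_Buffers.Append (Text, "comp.lang.ada!");
   Ada.Text_IO.Put_Line (Small_Buffers.To_String (Text));
   Ada.Text_IO.Put_Line (Natural'Image (Small_Buffers.Length (Text)));
end Chunked_Demo;
```

The trade-off in this sketch is that appends are cheap, while insertions
in the middle or random access would need extra work (walking the list),
so it fits the "read everything first" stage better than the processing
stage.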