From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level: 
X-Spam-Status: No, score=-0.3 required=5.0 tests=BAYES_00,
	REPLYTO_WITHOUT_TO_CC autolearn=no autolearn_force=no version=3.4.4
X-Google-Thread: 103376,b5cd7bf26d091c6f
X-Google-NewGroupId: yes
X-Google-Attributes: gida07f3367d7,domainid0,public,usenet
X-Google-Language: ENGLISH,ASCII-7-bit
Path: 
 g2news2.google.com!news3.google.com!feeder.news-service.com!94.75.214.39.MISMATCH!aioe.org!.POSTED!not-for-mail
From: "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de>
Newsgroups: comp.lang.ada
Subject: Re: Reading the while standard input into a String
Date: Mon, 6 Jun 2011 14:05:27 +0200
Organization: cbb software GmbH
Message-ID: <1ckesozpipi2y$.p1ei3zmgxjfc$.dlg@40tude.net>
References: <slrniunb6n.i18.lithiumcat@sigil.instinctive.eu>
 <pl7du6ibfnw.p03vhf1w4viu.dlg@40tude.net>
 <slrniupbvs.i18.lithiumcat@sigil.instinctive.eu>
Reply-To: mailbox@dmitry-kazakov.de
NNTP-Posting-Host: FbOMkhMtVLVmu7IwBnt1tw.user.speranza.aioe.org
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
X-Complaints-To: abuse@aioe.org
User-Agent: 40tude_Dialog/2.0.15.1
X-Notice: Filtered by postfilter v. 0.8.2
Xref: g2news2.google.com comp.lang.ada:20606
Date: 2011-06-06T14:05:27+02:00
List-Id: <comp.lang.ada>

On Mon, 6 Jun 2011 10:46:20 +0000 (UTC), Natasha Kerensikova wrote:

> Hello,
> 
> On 2011-06-06, Dmitry A. Kazakov <mailbox@dmitry-kazakov.de> wrote:
>> On Sun, 5 Jun 2011 16:20:39 +0000 (UTC), Natasha Kerensikova wrote:
>>
>>> However I still read
>>> character by character
>>
>> You have to, because the definition of line end is language/OS/encoding
>> dependent, so in order to detect line ends properly you need to scan
>> characters one by one, maybe recoding them into the encoding used by the
>> parser (e.g. UTF-8). It does not make much sense to read input by arbitrary
>> size chunks. Read it line by line. If parser needs returns over the line
>> margin (unlikely), then keep read lines cached.
> 
> The line end detection problem is exactly why I wanted unprocessed input
> bytes.

There is no such thing as unprocessed input.

>>>  into a temporary buffer,
>>
>> Read it into the destination buffer.
> 
> Well the destination buffer for the processed text is a very inefficient
> place to store input text, because the processing involves a lot of
> insertions.

A temporary buffer only makes it slower.

> Moreover, because of the forward reference issue I detailed in another
> post, I cannot see how I can escape the schema:
> input stream --> temporary buffer --> output stream/buffer/storage

Input stream ---recoding---> Line buffer

(The line buffer could be bounded from above as Ludovic suggested.)
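
A minimal sketch of that scheme, assuming Ada.Text_IO does the line end
detection and that 4096 characters is an acceptable upper bound (both the
bound and the name Read_Lines are only illustrative). The recoding step is
left out, and echoing the text stands in for handing it to the parser:

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Read_Lines is
   --  Assumed upper bound on the chunk read at once; adjust as needed.
   Max_Line : constant := 4096;
   Line     : String (1 .. Max_Line);
   Last     : Natural;
begin
   while not End_Of_File loop
      --  Reads up to the line end or until the buffer is full,
      --  whichever comes first.
      Get_Line (Line, Last);

      --  Line (1 .. Last) is the text read, without the line terminator.
      Put (Line (1 .. Last));

      --  If the buffer was filled, the rest of the line (possibly empty)
      --  comes with the next call; otherwise the line is complete.
      if Last < Line'Last then
         New_Line;
      end if;
   end loop;
end Read_Lines;
```

A line longer than the bound simply arrives in several pieces, so the
buffer never has to grow.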

>> Don't use Unbounded_String; that is a
>> bad idea in almost all cases, this one included.
> 
> Would you explain why?

Because they are inefficient and do not have an array view (they lack
indexing, slicing, and constraining).

> Unless there is a way to predict the left of the input, I need some
> text container able to grow as much as needed while reading.

String is such a container.

> But I guess you weren't calling using Unbounded_String a
> bad idea only because of performance, were you?

Yes, the missing array view is the problem. In all cases where you know the
size in advance, or where the size grows incrementally, use a String
allocated on the heap. Unbounded_Strings can be used, for example, as
members of non-controlled records.
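
For the incrementally growing case, a rough sketch of a heap-allocated
String that is doubled when it runs out of room. The names Slurp, Reserve
and String_Access, the initial size of 1024 and the choice of LF as the
stored line separator are all assumptions made for illustration, not
something prescribed by the language:

```ada
with Ada.Text_IO; use Ada.Text_IO;
with Ada.Characters.Latin_1;
with Ada.Unchecked_Deallocation;

procedure Slurp is
   type String_Access is access String;
   procedure Free is
      new Ada.Unchecked_Deallocation (String, String_Access);

   Buffer : String_Access := new String (1 .. 1024);  --  assumed start size
   Used   : Natural := 0;                             --  characters stored

   --  Ensure that Extra more characters fit, doubling the buffer if not.
   procedure Reserve (Extra : Natural) is
      New_Size : Natural := Buffer'Length;
      Bigger   : String_Access;
   begin
      if Used + Extra <= Buffer'Length then
         return;
      end if;
      while New_Size < Used + Extra loop
         New_Size := New_Size * 2;
      end loop;
      Bigger := new String (1 .. New_Size);
      Bigger (1 .. Used) := Buffer (1 .. Used);
      Free (Buffer);
      Buffer := Bigger;
   end Reserve;

   Line : String (1 .. 4096);
   Last : Natural;
begin
   while not End_Of_File loop
      Get_Line (Line, Last);
      Reserve (Last + 1);
      Buffer (Used + 1 .. Used + Last) := Line (1 .. Last);
      Used := Used + Last;
      if Last < Line'Last then
         --  A complete line was read; store a separator after it.
         Used := Used + 1;
         Buffer (Used) := Ada.Characters.Latin_1.LF;
      end if;
   end loop;

   --  Buffer (1 .. Used) is an ordinary String slice from here on.
   Put_Line ("Read" & Natural'Image (Used) & " characters");
   Free (Buffer);
end Slurp;
```

The result can be indexed, sliced and passed to anything expecting a
String, which is the array view in question.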

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de