From: "Dmitry A. Kazakov"
Newsgroups: comp.lang.ada
Subject: Re: Ada bench : count words
Date: Tue, 22 Mar 2005 18:34:45 +0100
Organization: cbb software GmbH
Reply-To: mailbox@dmitry-kazakov.de

On Tue, 22 Mar 2005 16:48:13 +0000, Marius Amado Alves wrote:

>> Get_Line does one extra line scan. So it will be inherently slower.
>> Then it would not take any advantage of having Buffer if lines are
>> shorter than 4K.
>
> I'm not sure I understand. An extra scan (I assume for EOL) in addition
> to what?

In addition to the scan that parses the line. Get_Line scans some n-th
buffer attached to the file and copies bytes from there into your
(n+1)-th buffer until it meets an EOL. I suppose it could be otherwise
under VMS, but we have what we deserve: UNIX and Windows. After that you
rescan those bytes yourself, which means that each byte is touched n+1
times.

> I don't understand why Get_Line has to be slower. I understand how a
> naive, non-buffered, implementation can be slower. But at least when
> reading from standard input the implementation should buffer, or cache,
> the data, on a buffer of Item'Size or greater, and remember the
> positions, and never read again.

That would help only if Get_Line were a function returning a slice that
referred to a system buffer made of multiple, reference-counted
segments... Forget it (:-))

BTW, I think this challenge is flawed, because it does not filter the
I/O issue out. In real-life applications it is not Text_IO (or its
equivalent in any other language) that is the bottleneck. Text_IO could
be 10 times slower and still cause little or no trouble, because string
processing in the true sense is usually 1000 times slower.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de
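
P.S. To make the double scan concrete, here is a minimal sketch of a
Get_Line-based word count (illustrative only, not anybody's benchmark
entry; it assumes lines shorter than the 4K buffer, so a word split
across two Get_Line calls would be counted twice):

   with Ada.Text_IO; use Ada.Text_IO;

   procedure Count_Words_Lines is
      Buffer  : String (1 .. 4_096);
      Last    : Natural;
      Words   : Natural := 0;
      In_Word : Boolean;
   begin
      while not End_Of_File loop
         Get_Line (Buffer, Last);  --  pass 1: Get_Line copies up to EOL
         In_Word := False;
         for I in Buffer'First .. Last loop  --  pass 2: rescan the copy
            if Buffer (I) = ' ' or else Buffer (I) = ASCII.HT then
               In_Word := False;
            elsif not In_Word then
               In_Word := True;
               Words   := Words + 1;
            end if;
         end loop;
      end loop;
      Put_Line (Natural'Image (Words));
   end Count_Words_Lines;

Every byte is copied into Buffer by Get_Line first and then inspected a
second time by the counting loop.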
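
For comparison, a sketch that takes most of Text_IO out of the picture
by reading 64K blocks through Ada.Streams.Stream_IO and touching each
byte exactly once (the file name "input.txt" is an assumption here; the
actual benchmark reads standard input):

   with Ada.Streams;           use Ada.Streams;
   with Ada.Streams.Stream_IO; use Ada.Streams.Stream_IO;
   with Ada.Text_IO;

   procedure Count_Words_Blocks is
      File    : Ada.Streams.Stream_IO.File_Type;
      Buffer  : Stream_Element_Array (1 .. 65_536);
      Last    : Stream_Element_Offset;
      Words   : Natural := 0;
      In_Word : Boolean := False;
   begin
      Open (File, In_File, "input.txt");  --  assumed input, not stdin
      while not End_Of_File (File) loop
         Read (File, Buffer, Last);  --  one bulk read, no per-line copy
         for I in Buffer'First .. Last loop
            case Character'Val (Buffer (I)) is
               when ' ' | ASCII.HT | ASCII.LF | ASCII.CR =>
                  In_Word := False;
               when others =>
                  if not In_Word then
                     In_Word := True;
                     Words   := Words + 1;
                  end if;
            end case;
         end loop;
      end loop;
      Close (File);
      Ada.Text_IO.Put_Line (Natural'Image (Words));
   end Count_Words_Blocks;

If both variants were timed, the difference would show how much of the
"count words" figure is really Get_Line overhead rather than string
processing.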