From: "Dmitry A. Kazakov"
Newsgroups: comp.lang.ada
Subject: Re: Ada bench : count words
Date: Tue, 22 Mar 2005 18:34:45 +0100
Organization: cbb software GmbH
Reply-To: mailbox@dmitry-kazakov.de

On Tue, 22 Mar 2005 16:48:13 +0000, Marius Amado Alves wrote:

>> Get_Line does one extra line scan. So it will be inherently slower.
>> Then it would not take any advantage of having Buffer if lines are
>> shorter than 4K.
>
> I'm not sure I understand. An extra scan (I assume for EOL) in addition
> to what?

In addition to the scan that parses the line. Get_Line scans some n-th
buffer attached to the file and copies bytes from there into your
(n+1)-th buffer until it meets an EOL. I suppose it could be otherwise
under VMS, but we have what we deserve: UNIX and Windows. After that you
rescan those bytes yourself, which means that each byte is touched n+1
times.

> I don't understand why Get_Line has to be slower. I understand how a
> naive, non-buffered, implementation can be slower. But at least when
> reading from standard input the implementation should buffer, or cache,
> the data, on a buffer of Item'Size or greater, and remember the
> positions, and never read again.

That would help only if Get_Line were a function returning a slice that
referred to a system buffer made of multiple, reference-counted
segments... Forget it (:-))

BTW, I think this challenge is flawed, because it does not filter the
I/O issue out. In real-life applications it is not Text_IO (or its
equivalent in any other language) that is the bottleneck. Text_IO could
be 10 times slower and still cause little or no trouble, because string
processing in the true sense is usually 1000 times slower.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de
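
P.S. To make the double scan concrete, here is a minimal sketch of a
Get_Line-based word count (illustrative only, not anybody's benchmark
entry; it assumes lines shorter than the 4K buffer, so a word split
across two Get_Line calls would be counted twice):

   with Ada.Text_IO; use Ada.Text_IO;

   procedure Count_Words_Lines is
      Buffer  : String (1 .. 4_096);
      Last    : Natural;
      Words   : Natural := 0;
      In_Word : Boolean;
   begin
      while not End_Of_File loop
         Get_Line (Buffer, Last);  --  pass 1: Get_Line copies up to EOL
         In_Word := False;
         for I in Buffer'First .. Last loop  --  pass 2: rescan the copy
            if Buffer (I) = ' ' or else Buffer (I) = ASCII.HT then
               In_Word := False;
            elsif not In_Word then
               In_Word := True;
               Words   := Words + 1;
            end if;
         end loop;
      end loop;
      Put_Line (Natural'Image (Words));
   end Count_Words_Lines;

Every byte is copied into Buffer by Get_Line first and then inspected a
second time by the counting loop.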
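
For comparison, a sketch that takes most of Text_IO out of the picture
by reading 64K blocks through Ada.Streams.Stream_IO and touching each
byte exactly once (the file name "input.txt" is an assumption here; the
actual benchmark reads standard input):

   with Ada.Streams;           use Ada.Streams;
   with Ada.Streams.Stream_IO; use Ada.Streams.Stream_IO;
   with Ada.Text_IO;

   procedure Count_Words_Blocks is
      File    : Ada.Streams.Stream_IO.File_Type;
      Buffer  : Stream_Element_Array (1 .. 65_536);
      Last    : Stream_Element_Offset;
      Words   : Natural := 0;
      In_Word : Boolean := False;
   begin
      Open (File, In_File, "input.txt");  --  assumed input, not stdin
      while not End_Of_File (File) loop
         Read (File, Buffer, Last);  --  one bulk read, no per-line copy
         for I in Buffer'First .. Last loop
            case Character'Val (Buffer (I)) is
               when ' ' | ASCII.HT | ASCII.LF | ASCII.CR =>
                  In_Word := False;
               when others =>
                  if not In_Word then
                     In_Word := True;
                     Words   := Words + 1;
                  end if;
            end case;
         end loop;
      end loop;
      Close (File);
      Ada.Text_IO.Put_Line (Natural'Image (Words));
   end Count_Words_Blocks;

If both variants were timed, the difference would show how much of the
"count words" figure is really Get_Line overhead rather than string
processing.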