From: "Dmitry A. Kazakov"
Subject: Re: Ada bench : count words
Newsgroups: comp.lang.ada
Date: Tue, 22 Mar 2005 14:08:04 +0100
Message-ID: <460oxs2p0hbc.yjqxjeasx37r.dlg@40tude.net>

On Tue, 22 Mar 2005 12:47:51 +0000, Marius Amado Alves wrote:

>>>> Is Text_IO that bad?
>>>
>>> No, if you can solve The Get_Line puzzle :-)
>>
>> What about Get (Item : out Character)?
>
> I tried and was too slow. Anyway I think I cracked the Get_Line puzzle
> (review welcome). So now it reads from standard input as required. But
> it's still 3 to 4 times slower than the C version.
>
> -- Count words in Ada for the language shootout
> -- by Marius Amado Alves
>
> with Ada.Characters.Handling;
> with Ada.Characters.Latin_1;
> with Ada.Text_IO;
>
> procedure Count_Words is
>
>    use Ada.Characters.Handling;
>    use Ada.Characters.Latin_1;
>    use Ada.Text_IO;
>
>    Buffer : String (1 .. 4096);
>    Lines : Natural := 0;
>    Words : Natural := 0;
>    Total : Natural := 0;
>    In_Word : Boolean := False;
>    N : Natural;
>
>    function Is_Separator (C : Character) return Boolean is
>    begin
>       return Is_Control (C) or C = ' ';
>    end;
>
>    procedure Begin_Word is
>    begin
>       In_Word := True;
>    end;
>
>    procedure End_Word is
>    begin
>       if In_Word then
>          Words := Words + 1;
>          In_Word := False;
>       end if;
>    end;
>
>    procedure End_Line is
>    begin
>       Lines := Lines + 1;
>       Total := Total + 1;
>       End_Word;
>    end;
>
>    procedure Count_Words (S : in String) is
>    begin
>       Total := Total + S'Length;
>       for I in S'Range loop
>          if Is_Separator (S (I)) then
>             if In_Word then End_Word; end if;
>          else
>             if not In_Word then Begin_Word; end if;
>          end if;
>       end loop;
>    end;
>
> begin
>    while not End_Of_File loop

Replace End_Of_File with End_Error handling.

>       Get_Line (Buffer, N);

Get_Line does one extra line scan, so it will be inherently slower. Nor
does it take any advantage of having Buffer if lines are shorter than 4K.
Once Count_Words is inlined the buffer size does not matter.

BTW, you can safely declare Buffer as either 1 byte or 1G bytes, because
hidden buffering happens in Text_IO anyway. (You only save calls to
Get_Line.) Who knows how large the buffers are there? This probably
disqualifies Text_IO, as well as C's getc! It should be a raw "read".

>       Count_Words (Buffer (1 .. N));

Wouldn't it count buffer ends as word separators for lines longer than 4K?

>       if N < Buffer'Length then
>          End_Line;
>       end if;
>    end loop;
>
>    Ada.Text_IO.Put_Line
>      (Natural'Image (Lines) &
>       Natural'Image (Words) &
>       Natural'Image (Total));
> end;

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de
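
For illustration, the End_Error suggestion made in the reply above could look roughly like the sketch below. It is only a sketch under assumptions, not the code that was posted or benchmarked: the procedure name Read_Until_End_Error and the counter Chunks are hypothetical, and the word counting itself is left out. The idea is that the loop no longer tests End_Of_File on every iteration and instead lets Get_Line raise End_Error once the input is exhausted.

```ada
with Ada.Text_IO;

--  Hypothetical example: read standard input with Get_Line and stop on
--  End_Error rather than testing End_Of_File before every call.
procedure Read_Until_End_Error is
   use Ada.Text_IO;

   Buffer : String (1 .. 4096);
   Last   : Natural;
   Chunks : Natural := 0;  --  number of successful Get_Line calls
begin
   begin
      loop
         --  Raises End_Error when called with no input left.
         Get_Line (Buffer, Last);
         --  The quoted program would call
         --  Count_Words (Buffer (1 .. Last)) here.
         Chunks := Chunks + 1;
      end loop;
   exception
      when End_Error =>
         null;  --  normal termination: end of input reached
   end;

   Put_Line (Natural'Image (Chunks));
end Read_Until_End_Error;
```

Note that Chunks counts calls to Get_Line, not lines: a line longer than the buffer is delivered in several pieces, which is why the quoted program only calls End_Line when N < Buffer'Length.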