* Ada bench @ 2005-03-19 16:22 Pascal Obry 2005-03-19 16:55 ` Dr. Adrian Wrigley 0 siblings, 1 reply; 122+ messages in thread From: Pascal Obry @ 2005-03-19 16:22 UTC (permalink / raw) I don't want to restart this thread. I just want to point out that the famous matrix bench from Shootout that has been talked out here is now running a bit faster with GNU/Ada than GNU/C and GNU/C++. I have contributed a new version. The 4 time slower was because the Ada program was poorly written and the Ada options passed to the compiler were not the same used for GNU/C and GNU/C++. This is a proof for nothing except maybe that GNU/Ada is not inherently slow. For what it's worth... Pascal. -- --|------------------------------------------------------ --| Pascal Obry Team-Ada Member --| 45, rue Gabriel Peri - 78114 Magny Les Hameaux FRANCE --|------------------------------------------------------ --| http://www.obry.org --| "The best way to travel is by means of imagination" --| --| gpg --keyserver wwwkeys.pgp.net --recv-key C1082595 ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench 2005-03-19 16:22 Ada bench Pascal Obry @ 2005-03-19 16:55 ` Dr. Adrian Wrigley 2005-03-19 21:32 ` Michael Bode 2005-04-07 20:59 ` David Sauvage 0 siblings, 2 replies; 122+ messages in thread From: Dr. Adrian Wrigley @ 2005-03-19 16:55 UTC (permalink / raw) On Sat, 19 Mar 2005 17:22:23 +0100, Pascal Obry wrote: > I don't want to restart this thread. here we go again... > I just want to point out > that the famous matrix bench from Shootout that has been talked > out here is now running a bit faster with GNU/Ada than GNU/C > and GNU/C++. I have contributed a new version. The 4 time slower > was because the Ada program was poorly written and the Ada options > passed to the compiler were not the same used for GNU/C and GNU/C++. > > This is a proof for nothing except maybe that GNU/Ada is not > inherently slow. it does show that apparently small differences in style can have a big effect on speed in Ada. (as always) you have to keep your wits about you! Looking at: http://shootout.alioth.debian.org/great/benchmark.php?test=all&lang=gnat&sort=fullcpu I see five entries with "error" and 14 with "no program". Who can we volunteer to complete the table? -- Adrian ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench 2005-03-19 16:55 ` Dr. Adrian Wrigley @ 2005-03-19 21:32 ` Michael Bode 2005-03-20 9:20 ` Pascal Obry 2005-04-07 20:59 ` David Sauvage 1 sibling, 1 reply; 122+ messages in thread From: Michael Bode @ 2005-03-19 21:32 UTC (permalink / raw) "Dr. Adrian Wrigley" <amtw@linuxchip.demon.co.uk.uk.uk> writes: > I see five entries with "error" and 14 with "no program". > Who can we volunteer to complete the table? The statistics bechmark has two broken line breaks in line 31 and 123, otherwise it works. ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench 2005-03-19 21:32 ` Michael Bode @ 2005-03-20 9:20 ` Pascal Obry 2005-03-20 9:39 ` Michael Bode ` (2 more replies) 0 siblings, 3 replies; 122+ messages in thread From: Pascal Obry @ 2005-03-20 9:20 UTC (permalink / raw) Michael Bode <m.g.bode@web.de> writes: > The statistics bechmark has two broken line breaks in line 31 and 123, > otherwise it works. Yes it was a cut&paste problem. There is some other bench that will work very soon. For example the Fibonacci is fine, only the expected output is wrong at the moment... Pascal. -- --|------------------------------------------------------ --| Pascal Obry Team-Ada Member --| 45, rue Gabriel Peri - 78114 Magny Les Hameaux FRANCE --|------------------------------------------------------ --| http://www.obry.org --| "The best way to travel is by means of imagination" --| --| gpg --keyserver wwwkeys.pgp.net --recv-key C1082595 ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench 2005-03-20 9:20 ` Pascal Obry @ 2005-03-20 9:39 ` Michael Bode 2005-03-20 11:16 ` Pascal Obry 2005-03-20 10:11 ` Adrian Knoth 2005-03-21 23:27 ` Georg Bauhaus 2 siblings, 1 reply; 122+ messages in thread From: Michael Bode @ 2005-03-20 9:39 UTC (permalink / raw) Pascal Obry <pascal@obry.net> writes: > Yes it was a cut&paste problem. There is some other bench that will work > very soon. For example the Fibonacci is fine, only the expected output > is wrong at the moment... Isn't that a problem of the testing environment? Almost any language has an error in fibonacci. And the few which have no error, have different input values than the others. ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench 2005-03-20 9:39 ` Michael Bode @ 2005-03-20 11:16 ` Pascal Obry 2005-03-20 12:20 ` Dmitry A. Kazakov 0 siblings, 1 reply; 122+ messages in thread From: Pascal Obry @ 2005-03-20 11:16 UTC (permalink / raw) Michael Bode <m.g.bode@web.de> writes: > Isn't that a problem of the testing environment? Almost any language > has an error in fibonacci. That's what I said. The expected output is wrong (should be fixed soon). Pascal. -- --|------------------------------------------------------ --| Pascal Obry Team-Ada Member --| 45, rue Gabriel Peri - 78114 Magny Les Hameaux FRANCE --|------------------------------------------------------ --| http://www.obry.org --| "The best way to travel is by means of imagination" --| --| gpg --keyserver wwwkeys.pgp.net --recv-key C1082595 ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench 2005-03-20 11:16 ` Pascal Obry @ 2005-03-20 12:20 ` Dmitry A. Kazakov 2005-03-20 12:32 ` Pascal Obry 2005-03-21 1:42 ` (see below) 0 siblings, 2 replies; 122+ messages in thread From: Dmitry A. Kazakov @ 2005-03-20 12:20 UTC (permalink / raw) On 20 Mar 2005 12:16:20 +0100, Pascal Obry wrote: > Michael Bode <m.g.bode@web.de> writes: > >> Isn't that a problem of the testing environment? Almost any language >> has an error in fibonacci. > > That's what I said. The expected output is wrong (should be fixed soon). Pascal, could you also take care of the sum of column integers example? The code creates a new string for each integer to be taken from! -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench 2005-03-20 12:20 ` Dmitry A. Kazakov @ 2005-03-20 12:32 ` Pascal Obry 2005-03-21 1:42 ` (see below) 1 sibling, 0 replies; 122+ messages in thread From: Pascal Obry @ 2005-03-20 12:32 UTC (permalink / raw) "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes: > Pascal, could you also take care of the sum of column integers example? The > code creates a new string for each integer to be taken from! I'll probably have a look... but it will take time! Pascal. -- --|------------------------------------------------------ --| Pascal Obry Team-Ada Member --| 45, rue Gabriel Peri - 78114 Magny Les Hameaux FRANCE --|------------------------------------------------------ --| http://www.obry.org --| "The best way to travel is by means of imagination" --| --| gpg --keyserver wwwkeys.pgp.net --recv-key C1082595 ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench 2005-03-20 12:20 ` Dmitry A. Kazakov 2005-03-20 12:32 ` Pascal Obry @ 2005-03-21 1:42 ` (see below) 2005-03-21 2:24 ` (see below) ` (2 more replies) 1 sibling, 3 replies; 122+ messages in thread From: (see below) @ 2005-03-21 1:42 UTC (permalink / raw) On 20/3/05 12:20 pm, in article 2huslr9uto15$.nbw7z2uaceua.dlg@40tude.net, "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> wrote: > Pascal, could you also take care of the sum of column integers example? The > code creates a new string for each integer to be taken from! I did some experiments. - I used GNAT 3.3, MacOS X build 1650, my 500MHz iBook, MacOS X 10.3.8 - I had four variants, of which the last is best - All compiled with "gnatmake -O3 *.adb" - All run with "time ./sumcol? < iofile.txt" - All runs used iofile.txt, which contains 1000 repetitions of the content of Shootout's file, and therefore sums to 500_000 (verified for each run) - I calculated the cpu time (user+sys) for each of 3 runs and took the mean - For each program I present the source, the 3 cpu times, and the mean SumCol4 is not only the fastest, with a MEAN of 0m2.320s, I think it is the most accurate transcription of the Icon program, since Icon is an event-driven (~= exception-driven) string language. Unlike versions 1..3, SumCol4 outputs an initial space. If necessary, this could be removed using Trim. For comparison, the given C program, compiled with Shootout's flags, but run as above, gave: cpu 0m0.880s cpu 0m0.920s cpu 0m0.860s MEAN 0m0.830s If Ada and C performance scale similarly we should look for a figure, for SumCol4 on Shootout's hardware, of around 1.04s++ for N = 1000. Ada Results: -- SumCol1 with Ada.Text_IO; use Ada.Text_IO; with Ada.Integer_Text_IO; use Ada.Integer_Text_IO; procedure SumCol1 is Line : String (1 .. 128); Sum : Integer := 0; Len : Natural; begin while not End_Of_File loop Get_Line (Item => Line, Last => Len); Sum := Sum + Integer'Value (Line (1 .. Len)); end loop; Ada.Integer_Text_IO.Put (Item => Sum, Width => 0); New_Line; end SumCol1; time ./sumcol1 cpu 0m2.840s cpu 0m2.790s cpu 0m2.830s MEAN 0m2.820s ================ -- SumCol2 with Ada.Text_IO; use Ada.Text_IO; with Ada.Integer_Text_IO; use Ada.Integer_Text_IO; procedure SumCol2 is Num : Integer; Sum : Integer := 0; begin while not End_Of_File loop Get (Num); Sum := Sum + Num; end loop; Ada.Integer_Text_IO.Put (Item => Sum, Width => 0); New_Line; end SumCol2; time ./sumcol2 cpu 0m5.280s cpu 0m5.330s cpu 0m5.350s MEAN 0m5.320s ================ -- SumCol3 with Ada.Text_IO; use Ada.Text_IO; with Ada.Integer_Text_IO; use Ada.Integer_Text_IO; procedure SumCol3 is Num : Integer; Sum : Integer := 0; begin loop Get (Num); Sum := Sum + Num; end loop; exception when End_Error => Ada.Integer_Text_IO.Put (Item => Sum, Width => 0); New_Line; end SumCol3; time ./sumcol3 cpu 0m4.740s cpu 0m4.680s cpu 0m4.700s MEAN 0m4.710s ================ -- SumCol4 with Ada.Text_IO; use Ada.Text_IO; procedure SumCol4 is Line : String (1 .. 128); Sum : Integer := 0; Len : Natural; begin loop Get_Line (Item => Line, Last => Len); Sum := Sum + Integer'Value (Line (1 .. Len)); end loop; exception when End_Error => Put_Line (Integer'Image(Sum)); end SumCol4; time ./sumcol4 cpu 0m2.250s cpu 0m2.310s cpu 0m2.410s MEAN 0m2.320s ================ -- Bill ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench 2005-03-21 1:42 ` (see below) @ 2005-03-21 2:24 ` (see below) 2005-03-21 15:00 ` (see below) 2005-03-21 3:54 ` Ed Falis 2005-03-21 18:07 ` Pascal Obry 2 siblings, 1 reply; 122+ messages in thread From: (see below) @ 2005-03-21 2:24 UTC (permalink / raw) Following up on these "benchmarks" is probably worthwhile, simply from the point of view of greater visibility. Since Brian Wichmann, one of the designers of Ada, popularized the use of Ackermann's function as a recursion benchmark (not to mention the use of the Whetstone numerics benchmark), it is unseemly that it should be in such a sad state at the Shootout. In that spirit I therefore looked at ackermann. At a first cut, on my machine, the C program takes about 8.9s. (This seems very slow by comparison with the Shootout's hardware!) The Ada program at Shootout takes about 14.0 s. Ditto, but with Ack moved out as a library unit, takes about 8.9s, essentially the same as C. -- Bill ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench 2005-03-21 2:24 ` (see below) @ 2005-03-21 15:00 ` (see below) 0 siblings, 0 replies; 122+ messages in thread From: (see below) @ 2005-03-21 15:00 UTC (permalink / raw) I said: > In that spirit I therefore looked at ackermann. > > At a first cut, on my machine, the C program takes about 8.9s. > (This seems very slow by comparison with the Shootout's hardware!) I've had a closer look at this and it seems to be a cache effect. For this test, the stack size increases as 2**N. For N>7 on my machine it swamped the 256K L2 cache. There will be an N that does the same on the Shootout machine, of course, so that higher values of N test only RAM bandwidth. -- Bill ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench 2005-03-21 1:42 ` (see below) 2005-03-21 2:24 ` (see below) @ 2005-03-21 3:54 ` Ed Falis 2005-03-21 12:20 ` Jeff C ` (2 more replies) 2005-03-21 18:07 ` Pascal Obry 2 siblings, 3 replies; 122+ messages in thread From: Ed Falis @ 2005-03-21 3:54 UTC (permalink / raw) You neglected two things: 1. suppress checks: -gnatp 2. Use the end of file function instead of the exception. ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench 2005-03-21 3:54 ` Ed Falis @ 2005-03-21 12:20 ` Jeff C 2005-03-21 15:18 ` (see below) 2005-03-21 21:31 ` Randy Brukardt 2 siblings, 0 replies; 122+ messages in thread From: Jeff C @ 2005-03-21 12:20 UTC (permalink / raw) Ed Falis wrote: > You neglected two things: > > 1. suppress checks: -gnatp > 2. Use the end of file function instead of the exception. The question is do the requirements of this benchmark require that you avoid exceptions. Generally you need to have an exception case anyway (for any non-trivial file) to handle files with bad termination conditions. It looks to me like the problem definition here would encourage that you do this "normal" processing without relying exceptions for the termination case. So I tend to agree with Ed. ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench 2005-03-21 3:54 ` Ed Falis 2005-03-21 12:20 ` Jeff C @ 2005-03-21 15:18 ` (see below) 2005-03-21 15:24 ` (see below) 2005-03-21 18:56 ` Isaac Gouy 2005-03-21 21:31 ` Randy Brukardt 2 siblings, 2 replies; 122+ messages in thread From: (see below) @ 2005-03-21 15:18 UTC (permalink / raw) On 21/3/05 3:54 am, in article opsnyydgl25afhvo@localhost, "Ed Falis" <falis@verizon.net> wrote: > You Me? > neglected two things: > 1. suppress checks: -gnatp I have tried it with "pragma Suppress(All_Checks);" and it makes no difference to the cpu time, as I expected. This test is dominated by the (poor) performance of Ada.Text_IO. > 2. Use the end of file function instead of the exception. As I said: > SumCol4 is not only the fastest, with a MEAN of 0m2.320s, > I think it is the most accurate transcription of the Icon program, > since Icon is an event-driven (~= exception-driven) string language. We have an Icon program as a model from Shootout, so I think that is fair. In any event, the version using End_Of_File takes 2.82s, 0.5s faster for 1_000_000 lines of input. -- Bill ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench 2005-03-21 15:18 ` (see below) @ 2005-03-21 15:24 ` (see below) 2005-03-21 18:56 ` Isaac Gouy 1 sibling, 0 replies; 122+ messages in thread From: (see below) @ 2005-03-21 15:24 UTC (permalink / raw) > > In any event, the version using End_Of_File takes 2.82s, 0.5s faster Doh! "slower", of course! > for 1_000_000 lines of input. > > -- > Bill ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench 2005-03-21 15:18 ` (see below) 2005-03-21 15:24 ` (see below) @ 2005-03-21 18:56 ` Isaac Gouy 1 sibling, 0 replies; 122+ messages in thread From: Isaac Gouy @ 2005-03-21 18:56 UTC (permalink / raw) > > SumCol4 is not only the fastest, with a MEAN of 0m2.320s, > > I think it is the most accurate transcription of the Icon program, > > since Icon is an event-driven (~= exception-driven) string language. > > We have an Icon program as a model from Shootout, so I think that is fair. The Icon program is simply an example - there's no intention that Icon's programming model be replicated by every other program! ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench 2005-03-21 3:54 ` Ed Falis 2005-03-21 12:20 ` Jeff C 2005-03-21 15:18 ` (see below) @ 2005-03-21 21:31 ` Randy Brukardt 2005-03-21 22:14 ` Ed Falis 2 siblings, 1 reply; 122+ messages in thread From: Randy Brukardt @ 2005-03-21 21:31 UTC (permalink / raw) "Ed Falis" <falis@verizon.net> wrote in message news:opsnyydgl25afhvo@localhost... > You neglected two things: > > 1. suppress checks: -gnatp > 2. Use the end of file function instead of the exception. I'm curious why you would say that. End_of_File has dismal performance in Ada (because of the requirements on line and page terminators), and it should be avoided anytime performance is important. Indeed, our coding standards specifically say to avoid this function in favor of handling End_Error. Of course, handling the exception inside the main loop would also be pretty expensive. Unless there is some requirement *not* to use exceptions, avoid End_of_File. Randy. ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench 2005-03-21 21:31 ` Randy Brukardt @ 2005-03-21 22:14 ` Ed Falis 0 siblings, 0 replies; 122+ messages in thread From: Ed Falis @ 2005-03-21 22:14 UTC (permalink / raw) On Mon, 21 Mar 2005 15:31:45 -0600, Randy Brukardt <randy@rrsoftware.com> wrote: > "Ed Falis" <falis@verizon.net> wrote in message > news:opsnyydgl25afhvo@localhost... >> You neglected two things: >> >> 1. suppress checks: -gnatp >> 2. Use the end of file function instead of the exception. > > I'm curious why you would say that. End_of_File has dismal performance in > Ada (because of the requirements on line and page terminators), and it > should be avoided anytime performance is important. Indeed, our coding > standards specifically say to avoid this function in favor of handling > End_Error. Of course, handling the exception inside the main loop would > also > be pretty expensive. > > Unless there is some requirement *not* to use exceptions, avoid > End_of_File. Fair enough - I stand corrected. ;-) - Ed ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench 2005-03-21 1:42 ` (see below) 2005-03-21 2:24 ` (see below) 2005-03-21 3:54 ` Ed Falis @ 2005-03-21 18:07 ` Pascal Obry 2 siblings, 0 replies; 122+ messages in thread From: Pascal Obry @ 2005-03-21 18:07 UTC (permalink / raw) "(see below)" <yaldnif.b@blueyonder.co.uk> writes: > Unlike versions 1..3, SumCol4 outputs an initial space. Version 2 and 3 are wrong. It is required to read the data line by line. The 4th version is indeed a bit faster, but I don't like using exception for this :) Pascal. -- --|------------------------------------------------------ --| Pascal Obry Team-Ada Member --| 45, rue Gabriel Peri - 78114 Magny Les Hameaux FRANCE --|------------------------------------------------------ --| http://www.obry.org --| "The best way to travel is by means of imagination" --| --| gpg --keyserver wwwkeys.pgp.net --recv-key C1082595 ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench 2005-03-20 9:20 ` Pascal Obry 2005-03-20 9:39 ` Michael Bode @ 2005-03-20 10:11 ` Adrian Knoth 2005-03-20 10:30 ` Michael Bode 2005-03-21 23:27 ` Georg Bauhaus 2 siblings, 1 reply; 122+ messages in thread From: Adrian Knoth @ 2005-03-20 10:11 UTC (permalink / raw) Pascal Obry <pascal@obry.net> wrote: > Yes it was a cut&paste problem. There is some other bench that will work > very soon. For example the Fibonacci is fine, only the expected output > is wrong at the moment... Is it necessary to use this recursive fibonacci-code? What about an iterative approach? function it_fibo (wert : in Positive) return Positive is n1 : Positive := 1; n2 : Positive := 1; result : Positive; begin if (wert < 3) then return 1; else for I in 1 .. wert - 2 loop result := n1 + n2; n1 := n2; n2 := result; end loop; return result; end if; end it_fibo; -- mail: adi@thur.de http://adi.thur.de PGP: v2-key via keyserver Treffen sich zwei Hellseher: "Dir gehts gut und wie gehts mir?" ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench 2005-03-20 10:11 ` Adrian Knoth @ 2005-03-20 10:30 ` Michael Bode 0 siblings, 0 replies; 122+ messages in thread From: Michael Bode @ 2005-03-20 10:30 UTC (permalink / raw) Adrian Knoth <adi@thur.de> writes: > Is it necessary to use this recursive fibonacci-code? What about an > iterative approach? This is the task: Each program should calculate the Fibonacci function using the same naï¿œve recursive-algorithm F(n) n = 0 = 0 n = 1 = 1 otherwise = F(n-2) + F(n-1) ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench 2005-03-20 9:20 ` Pascal Obry 2005-03-20 9:39 ` Michael Bode 2005-03-20 10:11 ` Adrian Knoth @ 2005-03-21 23:27 ` Georg Bauhaus 2005-03-22 1:16 ` Ada bench : count words Marius Amado Alves 2005-03-22 7:05 ` Ada bench Pascal Obry 2 siblings, 2 replies; 122+ messages in thread From: Georg Bauhaus @ 2005-03-21 23:27 UTC (permalink / raw) Pascal Obry wrote: > For example the Fibonacci is fine, only the expected output > is wrong at the moment... I get consistently better results (7% and much more) when placing the Fib function (as online) in a package. Georg ^ permalink raw reply [flat|nested] 122+ messages in thread
* Ada bench : count words 2005-03-21 23:27 ` Georg Bauhaus @ 2005-03-22 1:16 ` Marius Amado Alves 2005-03-22 10:59 ` Dmitry A. Kazakov ` (2 more replies) 2005-03-22 7:05 ` Ada bench Pascal Obry 1 sibling, 3 replies; 122+ messages in thread From: Marius Amado Alves @ 2005-03-22 1:16 UTC (permalink / raw) To: comp.lang.ada I took a shot at the count-words benchmark, a program to count lines, words and characters. The Ada program currently published there is broken. My program is correct and portable but: - the speed is circa 1/3 of the GCC C version - it fails to comply with the requirement that the input be taken from standard input. To implement buffering, I have resorted to Ada.Direct_IO, which I think cannot apply to standard input. Can you help with any of these points? Thanks. The complete program follows. -- Count words in Ada for the language shootout -- by Marius Amado Alves with Ada.Characters.Handling; with Ada.Characters.Latin_1; with Ada.Command_Line; with Ada.Direct_IO; with Ada.Strings.Fixed; with Ada.Text_IO; procedure Count_Words is use Ada.Characters.Handling; use Ada.Characters.Latin_1; use Ada.Command_Line; Filename : String := Argument (1); Buffer_Size : constant := 4096; EOF : Character := FS; EOL : String := (1 => LF); Lines : Natural := 0; Words : Natural := 0; Total : Natural := 0; In_Word : Boolean := False; function Is_Separator (C : Character) return Boolean is begin return Is_Control (C) or C = ' '; end; procedure Start_Word is begin In_Word := True; end; procedure Finish_Word is begin Words := Words + 1; In_Word := False; end; procedure Process (S : in String) is begin Lines := Lines + Ada.Strings.Fixed.Count (S, EOL); for I in S'Range loop if Is_Separator (S (I)) then if In_Word then Finish_Word; end if; else if not In_Word then Start_Word; end if; end if; end loop; end; begin declare package Character_IO is new Ada.Direct_IO (Character); use Character_IO; File : File_Type; begin Open (File, In_File, Filename); Total := Natural (Size (File)); Close (File); end; declare subtype Buffer_Type is String (1 .. Buffer_Size); package Buffer_IO is new Ada.Direct_IO (Buffer_Type); use Buffer_IO; File : File_Type; S : Buffer_Type; begin Open (File, In_File, Filename); for I in 1 .. Total / Buffer_Size loop Read (File, S); Process (S); end loop; Close (File); end; declare subtype Rest_Type is String (1 .. Total rem Buffer_Size); package Character_IO is new Ada.Direct_IO (Character); use Character_IO; File : File_Type; S : Rest_Type; begin Open (File, In_File, Filename); Set_Index (File, Count (Total - S'Length)); for I in 1 .. S'Length loop Read (File, S (I)); end loop; Close (File); Process (S); end; if In_Word then Finish_Word; end if; Ada.Text_IO.Put_Line (Natural'Image (Lines) & Natural'Image (Words) & Natural'Image (Total)); end; ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : count words 2005-03-22 1:16 ` Ada bench : count words Marius Amado Alves @ 2005-03-22 10:59 ` Dmitry A. Kazakov 2005-03-22 11:57 ` Marius Amado Alves ` (2 more replies) 2005-03-22 19:49 ` tmoran 2005-03-22 22:27 ` Dmitry A. Kazakov 2 siblings, 3 replies; 122+ messages in thread From: Dmitry A. Kazakov @ 2005-03-22 10:59 UTC (permalink / raw) On Tue, 22 Mar 2005 01:16:09 +0000, Marius Amado Alves wrote: > I took a shot at the count-words benchmark, a program to count lines, > words and characters. The Ada program currently published there is > broken. My program is correct and portable but: > > - the speed is circa 1/3 of the GCC C version > > - it fails to comply with the requirement that the input be taken from > standard input. To implement buffering, I have resorted to > Ada.Direct_IO, which I think cannot apply to standard input. Is Text_IO that bad? > Can you help with any of these points? Thanks. The complete program > follows. > > -- Count words in Ada for the language shootout > -- by Marius Amado Alves > > with Ada.Characters.Handling; > with Ada.Characters.Latin_1; > with Ada.Command_Line; > with Ada.Direct_IO; > with Ada.Strings.Fixed; > with Ada.Text_IO; > > procedure Count_Words is > > use Ada.Characters.Handling; > use Ada.Characters.Latin_1; > use Ada.Command_Line; > > Filename : String := Argument (1); > Buffer_Size : constant := 4096; > EOF : Character := FS; > EOL : String := (1 => LF); > Lines : Natural := 0; > Words : Natural := 0; > Total : Natural := 0; > In_Word : Boolean := False; > > function Is_Separator (C : Character) return Boolean is > begin > return Is_Control (C) or C = ' '; > end; Why don't you use character map here? > procedure Start_Word is > begin > In_Word := True; > end; > > procedure Finish_Word is > begin > Words := Words + 1; > In_Word := False; > end; > > procedure Process (S : in String) is > begin > Lines := Lines + Ada.Strings.Fixed.Count (S, EOL); Isn't it an extra pass? I think you should do parsing using FSM. Character classes are: EOL, delimiter, letter. It is either two character map tests or one case statement. I don't know what is faster. Probably you should test both. > for I in S'Range loop > if Is_Separator (S (I)) then > if In_Word then Finish_Word; end if; > else > if not In_Word then Start_Word; end if; > end if; > end loop; > end; > > begin > declare > package Character_IO is new Ada.Direct_IO (Character); > use Character_IO; > File : File_Type; > begin > Open (File, In_File, Filename); > Total := Natural (Size (File)); > Close (File); > end; > > declare > subtype Buffer_Type is String (1 .. Buffer_Size); > package Buffer_IO is new Ada.Direct_IO (Buffer_Type); > use Buffer_IO; > File : File_Type; > S : Buffer_Type; > begin > Open (File, In_File, Filename); > for I in 1 .. Total / Buffer_Size loop > Read (File, S); > Process (S); > end loop; > Close (File); > end; > > declare > subtype Rest_Type is String (1 .. Total rem Buffer_Size); > package Character_IO is new Ada.Direct_IO (Character); > use Character_IO; > File : File_Type; > S : Rest_Type; > begin > Open (File, In_File, Filename); > Set_Index (File, Count (Total - S'Length)); > for I in 1 .. S'Length loop > Read (File, S (I)); > end loop; > Close (File); > Process (S); > end; > > if In_Word then Finish_Word; end if; > > Ada.Text_IO.Put_Line > (Natural'Image (Lines) & > Natural'Image (Words) & > Natural'Image (Total)); > end; P.S. For word frequencies: gcc version uses hash + sort. I wonder if binary trees could be better here. Or even sorted arrays for simplicity, there is no item deletion, search is more often than insert... Of course, for tree node allocation one could use a stack pool instead of heap. (That's one of the weaknesses of this contest. Actually the method should have been specified) -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : count words 2005-03-22 10:59 ` Dmitry A. Kazakov @ 2005-03-22 11:57 ` Marius Amado Alves 2005-03-22 12:17 ` Dmitry A. Kazakov 2005-03-22 12:22 ` Jeff C 2005-03-23 17:06 ` Isaac Gouy 2 siblings, 1 reply; 122+ messages in thread From: Marius Amado Alves @ 2005-03-22 11:57 UTC (permalink / raw) To: comp.lang.ada >> ... To implement buffering, I have resorted to >> Ada.Direct_IO, which I think cannot apply to standard input. > > Is Text_IO that bad? No, if you can solve The Get_Line puzzle :-) >> procedure Process (S : in String) is >> begin >> Lines := Lines + Ada.Strings.Fixed.Count (S, EOL); > > Isn't it an extra pass? I think you should do parsing using FSM. > Character > classes are: EOL, delimiter, letter. It is either two character map > tests > or one case statement. I don't know what is faster. Probably you should > test both. > >> for I in S'Range loop >> if Is_Separator (S (I)) then >> if In_Word then Finish_Word; end if; >> else >> if not In_Word then Start_Word; end if; >> end if; >> end loop; >> end; Note EOL is not a character, but a string, because in some environments the thing is a combination of two characters. This one-character version improves speed (but still only to 1/2 of C): for I in S'Range loop if S (I) = EOL then Lines := Lines + 1; end if; if Is_Separator (S (I)) then if In_Word then Finish_Word; end if; else if not In_Word then Start_Word; end if; end if; end loop; I have not tried with string matching (if that's what you mean with "FSM") because the iteration was already there, and I doubt the standard library implements it more efficiently than that. ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : count words 2005-03-22 11:57 ` Marius Amado Alves @ 2005-03-22 12:17 ` Dmitry A. Kazakov 2005-03-22 12:47 ` Marius Amado Alves ` (3 more replies) 0 siblings, 4 replies; 122+ messages in thread From: Dmitry A. Kazakov @ 2005-03-22 12:17 UTC (permalink / raw) On Tue, 22 Mar 2005 11:57:22 +0000, Marius Amado Alves wrote: >>> ... To implement buffering, I have resorted to >>> Ada.Direct_IO, which I think cannot apply to standard input. >> >> Is Text_IO that bad? > > No, if you can solve The Get_Line puzzle :-) What about Get (Item : out Character)? I wonder if calling C-lib's read would qualify! (:-)) >>> procedure Process (S : in String) is >>> begin >>> Lines := Lines + Ada.Strings.Fixed.Count (S, EOL); >> >> Isn't it an extra pass? I think you should do parsing using FSM. >> Character >> classes are: EOL, delimiter, letter. It is either two character map >> tests >> or one case statement. I don't know what is faster. Probably you should >> test both. >> >>> for I in S'Range loop >>> if Is_Separator (S (I)) then >>> if In_Word then Finish_Word; end if; >>> else >>> if not In_Word then Start_Word; end if; >>> end if; >>> end loop; >>> end; > > Note EOL is not a character, but a string, because in some environments > the thing is a combination of two characters. I think Text_IO should translate it into LF. But anyway you can always assume LF = new line, CR = space in the FSM. > This one-character > version improves speed (but still only to 1/2 of C): > > for I in S'Range loop > if S (I) = EOL then Lines := Lines + 1; end if; > if Is_Separator (S (I)) then > if In_Word then Finish_Word; end if; > else > if not In_Word then Start_Word; end if; > end if; > end loop; > > I have not tried with string matching (if that's what you mean with > "FSM") because the iteration was already there, and I doubt the > standard library implements it more efficiently than that. No, I meant a finite state machine: <<Word>> read character; case Char is when EOL => Inc line count; goto Blank; when Space => goto Blank; when Letter => goto Word; end case; <<Blank>> read character; case Char is when EOL => Inc line count; goto Blank; when Space => goto Blank; when Letter => Inc word count; goto Word; end case; <<Word>> read character; case Char is when EOL => Inc line count; goto Blank; when Space => goto Blank; when Letter => goto Word; end case; FSM is one of that rare cases where gotos are natural. What a pity that Ada does not have arrays of labels! (:-)) -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : count words 2005-03-22 12:17 ` Dmitry A. Kazakov @ 2005-03-22 12:47 ` Marius Amado Alves 2005-03-22 13:08 ` Dmitry A. Kazakov 2005-03-22 12:53 ` Marius Amado Alves ` (2 subsequent siblings) 3 siblings, 1 reply; 122+ messages in thread From: Marius Amado Alves @ 2005-03-22 12:47 UTC (permalink / raw) To: comp.lang.ada >>> Is Text_IO that bad? >> >> No, if you can solve The Get_Line puzzle :-) > > What about Get (Item : out Character)? I tried and was too slow. Anyway I think I cracked the Get_Line puzzle (review welcome). So now it reads from standard input as required. But it's still 3 to 4 times slower than the C version. -- Count words in Ada for the language shootout -- by Marius Amado Alves with Ada.Characters.Handling; with Ada.Characters.Latin_1; with Ada.Text_IO; procedure Count_Words is use Ada.Characters.Handling; use Ada.Characters.Latin_1; use Ada.Text_IO; Buffer : String (1 .. 4096); Lines : Natural := 0; Words : Natural := 0; Total : Natural := 0; In_Word : Boolean := False; N : Natural; function Is_Separator (C : Character) return Boolean is begin return Is_Control (C) or C = ' '; end; procedure Begin_Word is begin In_Word := True; end; procedure End_Word is begin if In_Word then Words := Words + 1; In_Word := False; end if; end; procedure End_Line is begin Lines := Lines + 1; Total := Total + 1; End_Word; end; procedure Count_Words (S : in String) is begin Total := Total + S'Length; for I in S'Range loop if Is_Separator (S (I)) then if In_Word then End_Word; end if; else if not In_Word then Begin_Word; end if; end if; end loop; end; begin while not End_Of_File loop Get_Line (Buffer, N); Count_Words (Buffer (1 .. N)); if N < Buffer'Length then End_Line; end if; end loop; Ada.Text_IO.Put_Line (Natural'Image (Lines) & Natural'Image (Words) & Natural'Image (Total)); end; ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : count words 2005-03-22 12:47 ` Marius Amado Alves @ 2005-03-22 13:08 ` Dmitry A. Kazakov 2005-03-22 13:28 ` Marius Amado Alves 2005-03-22 16:48 ` Marius Amado Alves 0 siblings, 2 replies; 122+ messages in thread From: Dmitry A. Kazakov @ 2005-03-22 13:08 UTC (permalink / raw) On Tue, 22 Mar 2005 12:47:51 +0000, Marius Amado Alves wrote: >>>> Is Text_IO that bad? >>> >>> No, if you can solve The Get_Line puzzle :-) >> >> What about Get (Item : out Character)? > > I tried and was too slow. Anyway I think I cracked the Get_Line puzzle > (review welcome). So now it reads from standard input as required. But > it's still 3 to 4 times slower than the C version. > > -- Count words in Ada for the language shootout > -- by Marius Amado Alves > > with Ada.Characters.Handling; > with Ada.Characters.Latin_1; > with Ada.Text_IO; > > procedure Count_Words is > > use Ada.Characters.Handling; > use Ada.Characters.Latin_1; > use Ada.Text_IO; > > Buffer : String (1 .. 4096); > Lines : Natural := 0; > Words : Natural := 0; > Total : Natural := 0; > In_Word : Boolean := False; > N : Natural; > > function Is_Separator (C : Character) return Boolean is > begin > return Is_Control (C) or C = ' '; > end; > > procedure Begin_Word is > begin > In_Word := True; > end; > > procedure End_Word is > begin > if In_Word then > Words := Words + 1; > In_Word := False; > end if; > end; > > procedure End_Line is > begin > Lines := Lines + 1; > Total := Total + 1; > End_Word; > end; > > procedure Count_Words (S : in String) is > begin > Total := Total + S'Length; > for I in S'Range loop > if Is_Separator (S (I)) then > if In_Word then End_Word; end if; > else > if not In_Word then Begin_Word; end if; > end if; > end loop; > end; > > begin > while not End_Of_File loop Replace End_Of_File with End_Error handling. > Get_Line (Buffer, N); Get_Line does one extra line scan. So it will be inherently slower. Then it would not take any advantage of having Buffer if lines are shorter than 4K. Once Count_Words is inlined the buffer size does not matter. BTW, you can safely declare Buffer either 1 or 1G bytes, because hidden buffering happens anyway in Text_IO. (You only save calls to Get_Line.) Who knows how large are buffers there? This probably disqualifies Text_IO, as well as C's getc! It should be raw "read". > Count_Words (Buffer (1 .. N)); Wouldn't it count buffer ends as word separators for lines longer than 4K? > if N < Buffer'Length then > End_Line; > end if; > end loop; > > Ada.Text_IO.Put_Line > (Natural'Image (Lines) & > Natural'Image (Words) & > Natural'Image (Total)); > end; -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : count words 2005-03-22 13:08 ` Dmitry A. Kazakov @ 2005-03-22 13:28 ` Marius Amado Alves 2005-03-22 16:48 ` Marius Amado Alves 1 sibling, 0 replies; 122+ messages in thread From: Marius Amado Alves @ 2005-03-22 13:28 UTC (permalink / raw) To: comp.lang.ada > Replace End_Of_File with End_Error handling. I tried. No speed gain. >> Get_Line (Buffer, N); > > Get_Line does one extra line scan. So it will be inherently slower. > Then it > would not take any advantage of having Buffer if lines are shorter > than 4K. I know, but how can you do it otherwise with Text_IO? > BTW, you can safely declare Buffer either 1 or 1G bytes... I know, but the benchmark specifies 4096. > ... It should be raw "read". It should, but Text_IO doesn't have it. >> Count_Words (Buffer (1 .. N)); > > Wouldn't it count buffer ends as word separators for lines longer than > 4K? No. Count_Words (S) does not call End_Word at the end of S. The state of In_Word passes unchanged to the next call of Count_Words (if not changed by End_Line). Only a separator or End_Line ends a word. ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : count words 2005-03-22 13:08 ` Dmitry A. Kazakov 2005-03-22 13:28 ` Marius Amado Alves @ 2005-03-22 16:48 ` Marius Amado Alves 2005-03-22 17:34 ` Dmitry A. Kazakov 1 sibling, 1 reply; 122+ messages in thread From: Marius Amado Alves @ 2005-03-22 16:48 UTC (permalink / raw) To: comp.lang.ada > Get_Line does one extra line scan. So it will be inherently slower. > Then it would not take any advantage of having Buffer if lines are > shorter than 4K. I'm not sure I understand. An extra scan (I assume for EOL) in addition to what? I don't understand why Get_Line has to be slower. I understand how a naive, non-buffered, implementation can be slower. But at least when reading from standard input the implementation should buffer, or cache, the data, on a buffer of Item'Size or greater, and remember the positions, and never read again. ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : count words 2005-03-22 16:48 ` Marius Amado Alves @ 2005-03-22 17:34 ` Dmitry A. Kazakov 2005-03-27 20:14 ` jtg 0 siblings, 1 reply; 122+ messages in thread From: Dmitry A. Kazakov @ 2005-03-22 17:34 UTC (permalink / raw) On Tue, 22 Mar 2005 16:48:13 +0000, Marius Amado Alves wrote: >> Get_Line does one extra line scan. So it will be inherently slower. >> Then it would not take any advantage of having Buffer if lines are >> shorter than 4K. > > I'm not sure I understand. An extra scan (I assume for EOL) in addition > to what? To parse the line. Get_Line scans some n-th buffer attached to the file and copies bytes from there into your n+1th buffer until it meets EOL. I suppose under VMS it could be otherwise, but we have what we deserve: UNIX and Windows. After that you rescan these bytes again. This means that each byte will be touched n+1 times. > I don't understand why Get_Line has to be slower. I understand how a > naive, non-buffered, implementation can be slower. But at least when > reading from standard input the implementation should buffer, or cache, > the data, on a buffer of Item'Size or greater, and remember the > positions, and never read again. Only if Get_Line were a function returning a slice referring to a system buffer that had multiple segments with reference counting ... Forget it (:-)) BTW, I think that this challenge is wrong because it does not filter I/O issue out. In real-life applications it is not Text_IO (or its equivalent in any other language) which is the bottleneck. Text_IO could be 10 times slower and still impose no or little problems, because string processing in the true sense, is usually 1000 times slower. -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : count words 2005-03-22 17:34 ` Dmitry A. Kazakov @ 2005-03-27 20:14 ` jtg 2005-03-27 21:22 ` Dmitry A. Kazakov 0 siblings, 1 reply; 122+ messages in thread From: jtg @ 2005-03-27 20:14 UTC (permalink / raw) Dmitry A. Kazakov wrote: > > BTW, I think that this challenge is wrong because it does not filter I/O > issue out. In real-life applications it is not Text_IO (or its equivalent > in any other language) which is the bottleneck. Text_IO could be 10 times > slower and still impose no or little problems, because string processing in > the true sense, is usually 1000 times slower. > IMHO the challenge is OK. Look at the system commands (in UNIX). In many cases text I/O performance is crucial. But since existing commands are very fast, nobody even thinks about it. Recently I had to write some simple programs in C (after years of pure Ada) and I/O performance was my biggest problem. I compared my I/O results with wc command and wc was substantially faster than my pure I/O! So I downloaded the code and looked into it. That was mess&magic. They must have spent much time to get really fast I/O, and they had a good reason for it. I think that instead of blaming the challenge we must face reality: this test indicates that existing implementation of Ada is worse than C in pure I/O applications (like system commands). Maybe instead of improving the program it would be better to look into the GNAT source and improve things there. Or to notify GNAT team about the problem. It would not only give Ada better position in this challenge, but also speed up text I/O operations of all the programs compiled with future versions of GNAT. ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : count words 2005-03-27 20:14 ` jtg @ 2005-03-27 21:22 ` Dmitry A. Kazakov 2005-03-28 19:54 ` jtg 0 siblings, 1 reply; 122+ messages in thread From: Dmitry A. Kazakov @ 2005-03-27 21:22 UTC (permalink / raw) On Sun, 27 Mar 2005 22:14:04 +0200, jtg wrote: > Dmitry A. Kazakov wrote: >> >> BTW, I think that this challenge is wrong because it does not filter I/O >> issue out. In real-life applications it is not Text_IO (or its equivalent >> in any other language) which is the bottleneck. Text_IO could be 10 times >> slower and still impose no or little problems, because string processing in >> the true sense, is usually 1000 times slower. >> > IMHO the challenge is OK. Look at the system commands (in UNIX). In > many cases text I/O performance is crucial. But since > existing commands are very fast, nobody even thinks about it. Ah, you hit my sore spot! In my view this is the problem of bad design. It is characteristic of UNIX which said has three mouse drives, one per button, but only for UNIX. [ The famous idea of splitting an application into a chain of tiny commands like wc, sort etc is attractive but altogether wrong. There is no universal way to do it. Then what about pipes overhead and need of multiple re-scans. I don't buy it, and scripting languages too. ] > Recently I had to write some simple programs in C (after years of pure > Ada) and I/O performance was my biggest problem. I compared my I/O > results with wc command and wc was substantially faster than my > pure I/O! So I downloaded > the code and looked into it. That was mess&magic. They must have > spent much time to get really fast I/O, and they had a good > reason for it. > I think that instead of blaming the challenge we must face reality: > this test indicates that existing implementation of Ada is worse than C > in pure I/O applications (like system commands). But people don't use wc, they do OpenOffice (I suppose it wasn't written in awk! (:-)). The style of programming acceptable for wc would be a disaster for a product like OpenOffice. > Maybe instead of > improving the program it would be better to look into the GNAT > source and improve things there. In my experience, GNAT always had bad performance, and not only in Ada.Text_IO, which is a minor problem for the mentioned above reasons, but in far more important parts. As for patching GNAT, well, maybe next time... (:-)) Anyway Ada.Text_IO performance is not on the top of my list. I'd better improve ADT, add full MI and MD, make T'Class for all types, tasks and protected types tagged, allow multiple parties rendezvous etc. (:-)) > Or to notify GNAT team about the > problem. It would not only give Ada better > position in this challenge, but also speed up text I/O operations of > all the programs compiled with future versions of GNAT. If you are a paying customer of ACT, you can try... (:-)) -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : count words 2005-03-27 21:22 ` Dmitry A. Kazakov @ 2005-03-28 19:54 ` jtg 2005-03-28 20:56 ` Dmitry A. Kazakov 0 siblings, 1 reply; 122+ messages in thread From: jtg @ 2005-03-28 19:54 UTC (permalink / raw) Dmitry A. Kazakov wrote: >> >>IMHO the challenge is OK. Look at the system commands (in UNIX). In >>many cases text I/O performance is crucial. But since >>existing commands are very fast, nobody even thinks about it. > > > Ah, you hit my sore spot! In my view this is the problem of bad design. It > is characteristic of UNIX which said has three mouse drives, one per > button, but only for UNIX. [ The famous idea of splitting an application > into a chain of tiny commands like wc, sort etc is attractive but > altogether wrong. There is no universal way to do it. Then what about pipes > overhead and need of multiple re-scans. I don't buy it, and scripting > languages too. ] > It depends on what you are doing. Maybe you just didn't encounter problems that needed scripting languages or pipe chain. From my own experience I remember a tiny awk script I assisted to create, which needed a minute to automatically perform a task that usually took one day of human-assisted work with office tools. It took several minutes to create the script itself. "Office people" then looked at it and spent some time to create "office interface" in VB, that was just GUI for the script. They didn't even think of writing its contents in VB. And if they did, it would take several days to create, test and remove bugs from the program (integrated awk, diff and several other functions, and file I/O too). The way wc and other UNIX commands cooperate is briliant, it is the right way to do certain things and it does not exclude other ways of solving other problems. If you need Office, you can run it, even on UNIX. What's the problem? > > > In my experience, GNAT always had bad performance, and not only in > Ada.Text_IO, which is a minor problem for the mentioned above reasons, but > in far more important parts. As for patching GNAT, well, maybe next time... > (:-)) Anyway Ada.Text_IO performance is not on the top of my list. I'd > better improve ADT, add full MI and MD, make T'Class for all types, tasks > and protected types tagged, allow multiple parties rendezvous etc. (:-)) > Well... I don't need it either. Maybe Ada programmers don't need Text_IO performance. > >>Or to notify GNAT team about the >>problem. It would not only give Ada better >>position in this challenge, but also speed up text I/O operations of >>all the programs compiled with future versions of GNAT. > > > If you are a paying customer of ACT, you can try... (:-)) > No. I was just fascinated with all the efforts shown in this thread. :-) ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : count words 2005-03-28 19:54 ` jtg @ 2005-03-28 20:56 ` Dmitry A. Kazakov 2005-03-29 12:40 ` jtg 0 siblings, 1 reply; 122+ messages in thread From: Dmitry A. Kazakov @ 2005-03-28 20:56 UTC (permalink / raw) On Mon, 28 Mar 2005 21:54:32 +0200, jtg wrote: > Dmitry A. Kazakov wrote: > >>>IMHO the challenge is OK. Look at the system commands (in UNIX). In >>>many cases text I/O performance is crucial. But since >>>existing commands are very fast, nobody even thinks about it. >> >> Ah, you hit my sore spot! In my view this is the problem of bad design. It >> is characteristic of UNIX which said has three mouse drives, one per >> button, but only for UNIX. [ The famous idea of splitting an application >> into a chain of tiny commands like wc, sort etc is attractive but >> altogether wrong. There is no universal way to do it. Then what about pipes >> overhead and need of multiple re-scans. I don't buy it, and scripting >> languages too. ] > It depends on what you are doing. Maybe you just didn't encounter > problems that needed scripting languages or pipe chain. > From my own experience I remember a tiny awk script I assisted to > create, which needed a minute to automatically perform a task > that usually took one day of human-assisted work with office > tools. It took several minutes to create the script itself. I write such things in Ada. I am too stupid to understand bash man pages. Cryptography isn't my strength. Then when something does not work as expected, it takes hours before I can find the problem, usually a pretty idiotic one, which would never happen if it was written in something more advanced than mumbling. All that is just in vain. > "Office people" then looked at it and spent some time to create > "office interface" in VB, that was just GUI for the script. > They didn't even think of writing its contents in VB. And if they > did, it would take several days to create, test and remove > bugs from the program (integrated awk, diff and several > other functions, and file I/O too). > The way wc and other UNIX commands cooperate is briliant, Maybe, but I just don't need wc! There is no occupation "word counting". The most close thing is linguistics and text analysis. Can wc that? > it is the > right way to do certain things and it does not exclude other ways > of solving other problems. If you need Office, you can run it, even > on UNIX. What's the problem? The problem is that there are 1000 different commands with 50 options each, which gives 1000*2**50 variants to remember. Every UNIX admin starts with making an alias to ls and 100 further commands. It is just unusable. It took decades before people finally understood it and wrote KDE, Gnome control centers. Admins despise them, but they are lost. >> If you are a paying customer of ACT, you can try... (:-)) >> > No. I was just fascinated with all the efforts shown in this thread. :-) I think we have tolerated that C guys telling us that Ada is slow for too long time. Game is over! (:-)) -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : count words 2005-03-28 20:56 ` Dmitry A. Kazakov @ 2005-03-29 12:40 ` jtg 2005-03-29 13:00 ` [OT] " Tapio Kelloniemi ` (2 more replies) 0 siblings, 3 replies; 122+ messages in thread From: jtg @ 2005-03-29 12:40 UTC (permalink / raw) Dmitry A. Kazakov wrote: > > Maybe, but I just don't need wc! There is no occupation "word counting". > The most close thing is linguistics and text analysis. Can wc that? > There are other tools for text analysis. With wc you can get the number of lines in a file or the number of entries filtered by previous commands. Like cat *.ad? |wc Which gives you immediately the total number of lines in your ada files > > The problem is that there are 1000 different commands with 50 options each, > which gives 1000*2**50 variants to remember. Every UNIX admin starts with > making an alias to ls and 100 further commands. It is just unusable. It > took decades before people finally understood it and wrote KDE, Gnome > control centers. Admins despise them, but they are lost. Really? But there are some 1000000 different Windows programs and each of them has 1000 options, so you have to remember 1000000*2**1000 variants :-) If you don't need command line tools, you don't have to use them, but don't call them bad design just because you don't need them. > > I think we have tolerated that C guys telling us that Ada is slow for too > long time. Game is over! (:-)) > I'm afraid I don't understand. Who is the winner? Maybe you meant: The game just begins :-) ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: [OT] Ada bench : count words 2005-03-29 12:40 ` jtg @ 2005-03-29 13:00 ` Tapio Kelloniemi 2005-03-29 13:47 ` Dmitry A. Kazakov 2005-03-29 13:47 ` Dmitry A. Kazakov 2005-04-01 20:58 ` Georg Bauhaus 2 siblings, 1 reply; 122+ messages in thread From: Tapio Kelloniemi @ 2005-03-29 13:00 UTC (permalink / raw) jtg <jtg77@poczta.onet.pl> wrote: >Dmitry A. Kazakov wrote: >> The problem is that there are 1000 different commands with 50 options each, >> which gives 1000*2**50 variants to remember. Every UNIX admin starts with >> making an alias to ls and 100 further commands. It is just unusable. It >> took decades before people finally understood it and wrote KDE, Gnome >> control centers. Admins despise them, but they are lost. > >Really? But there are some 1000000 different Windows programs and each >of them has 1000 options, so you have to remember 1000000*2**1000 >variants :-) Millions of giant size programs with their own bugs and almost all of them fail to do something correctly. If some command-line tool sucks, just replace it with some other and combine the better one with other known-good tools. But if some GUI program has a feature you need, but it is unable to share data (or otherwise co-operate) with other programs, it is useless. Also I haven't heard of wizard applications which automate using GUI programs. I think it is much easier to write a bash script which invokes those hard to remember commands rather than writing an application which automates moving the mouse pointer to specified coordinates and then simulates a click. And that clicker wizard is even not portable to other screen resolutions... -- Tapio ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: [OT] Ada bench : count words 2005-03-29 13:00 ` [OT] " Tapio Kelloniemi @ 2005-03-29 13:47 ` Dmitry A. Kazakov 2005-03-29 15:53 ` Tapio Kelloniemi 0 siblings, 1 reply; 122+ messages in thread From: Dmitry A. Kazakov @ 2005-03-29 13:47 UTC (permalink / raw) On Tue, 29 Mar 2005 13:00:22 GMT, Tapio Kelloniemi wrote: > jtg <jtg77@poczta.onet.pl> wrote: >>Dmitry A. Kazakov wrote: >>> The problem is that there are 1000 different commands with 50 options each, >>> which gives 1000*2**50 variants to remember. Every UNIX admin starts with >>> making an alias to ls and 100 further commands. It is just unusable. It >>> took decades before people finally understood it and wrote KDE, Gnome >>> control centers. Admins despise them, but they are lost. >> >>Really? But there are some 1000000 different Windows programs and each >>of them has 1000 options, so you have to remember 1000000*2**1000 >>variants :-) > > Millions of giant size programs with their own bugs and almost all of them > fail to do something correctly. If some command-line tool sucks, just > replace it with some other and combine the better one with other known-good > tools. You get the idea right: let others do my job if they don't like how I do it. (:-)) > But if some GUI program has a feature you need, but it is unable to > share data (or otherwise co-operate) with other programs, it is useless. Even if "what I need" /= "share data with other programs"? > Also I haven't heard of wizard applications which automate using > GUI programs. I think it is much easier to write a bash script which > invokes those hard to remember commands rather than writing an application > which automates moving the mouse pointer to specified coordinates and then > simulates a click. And that clicker wizard is even not portable to other > screen resolutions... You seem to mix program and user interfaces of an OS. I don't argue against the former. -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: [OT] Ada bench : count words 2005-03-29 13:47 ` Dmitry A. Kazakov @ 2005-03-29 15:53 ` Tapio Kelloniemi 2005-03-29 16:17 ` Dmitry A. Kazakov [not found] ` <k33j419lgei1ui89s26o1dlr9ccf1qe1hd@4ax.com> 0 siblings, 2 replies; 122+ messages in thread From: Tapio Kelloniemi @ 2005-03-29 15:53 UTC (permalink / raw) "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> wrote: >On Tue, 29 Mar 2005 13:00:22 GMT, Tapio Kelloniemi wrote: >> But if some GUI program has a feature you need, but it is unable to >> share data (or otherwise co-operate) with other programs, it is useless. > >Even if "what I need" /= "share data with other programs"? This is a bit poor example, but itshows what I mean. I return to the office example. I have written a book using that cool WYSIWYG application which has a very nice look and feel. But the WYSIWYG application has bulit-in support for some print spoolers. My host doesn't happen to run any of them. It also refuses to output raw postscript. My friend is running the required print spooler on his host, so I copy my book (10 MB) to his computer. Unfortunately I have just wasted bandwidth since that cool WYSIWYG program refuses to start up without an X display when I invoke it through SSH. >> Also I haven't heard of wizard applications which automate using >> GUI programs. I think it is much easier to write a bash script which >> invokes those hard to remember commands rather than writing an application >> which automates moving the mouse pointer to specified coordinates and then >> simulates a click. And that clicker wizard is even not portable to other >> screen resolutions... > >You seem to mix program and user interfaces of an OS. I don't argue against >the former. But GUIs cannot be automated (as command-line tools can be). GUI (or TUI) is a must for some things (like text editing), but command-line is much more powerful for many other things. -- Tapio ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: [OT] Ada bench : count words 2005-03-29 15:53 ` Tapio Kelloniemi @ 2005-03-29 16:17 ` Dmitry A. Kazakov [not found] ` <k33j419lgei1ui89s26o1dlr9ccf1qe1hd@4ax.com> 1 sibling, 0 replies; 122+ messages in thread From: Dmitry A. Kazakov @ 2005-03-29 16:17 UTC (permalink / raw) On Tue, 29 Mar 2005 15:53:43 GMT, Tapio Kelloniemi wrote: > "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> wrote: >>On Tue, 29 Mar 2005 13:00:22 GMT, Tapio Kelloniemi wrote: >>> But if some GUI program has a feature you need, but it is unable to >>> share data (or otherwise co-operate) with other programs, it is useless. >> >>Even if "what I need" /= "share data with other programs"? > > This is a bit poor example, but itshows what I mean. > I return to the office example. I have written a book using that > cool WYSIWYG application which has a very nice look and feel. But the > WYSIWYG application has bulit-in support for some print spoolers. My host > doesn't happen to run any of them. It also refuses to output raw postscript. > My friend is running the required print spooler on his host, so I copy > my book (10 MB) to his computer. Unfortunately I have just wasted bandwidth > since that cool WYSIWYG program refuses to start up without an X display > when I invoke it through SSH. One can always construct a situation where tool X is unsuitable for solving problem Y, like to write a book in postscript using vi editor. BTW, one move of Linux I enjoy is that there is finally no need to edit /etc/printcap, You can tell me that sometimes it would be useful to have an ability to do so, but I'd rather leave that business to those who enjoy it. As for sharing data that is another question to aged UNIX based on files, unstructured piles of bytes. A modern OS should have no files, but persistent objects with primitive operations defined on them. In such OS there would probably be no place for I/O and stuff like wc. >>> Also I haven't heard of wizard applications which automate using >>> GUI programs. I think it is much easier to write a bash script which >>> invokes those hard to remember commands rather than writing an application >>> which automates moving the mouse pointer to specified coordinates and then >>> simulates a click. And that clicker wizard is even not portable to other >>> screen resolutions... >> >>You seem to mix program and user interfaces of an OS. I don't argue against >>the former. > > But GUIs cannot be automated (as command-line tools can be). There is no need to automate user interfaces. They are for users which are not automata (I hope (:-)) > GUI (or TUI) > is a must for some things (like text editing), but command-line is > much more powerful for many other things. And programming interface is far more powerful than both. This is why I use it and Ada when I need to automate something. In all other cases I'd like to have GUI. I have no need in anything in between. -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 122+ messages in thread
[parent not found: <k33j419lgei1ui89s26o1dlr9ccf1qe1hd@4ax.com>]
* Re: [OT] Ada bench : count words [not found] ` <k33j419lgei1ui89s26o1dlr9ccf1qe1hd@4ax.com> @ 2005-04-11 23:04 ` Marius Amado Alves 0 siblings, 0 replies; 122+ messages in thread From: Marius Amado Alves @ 2005-04-11 23:04 UTC (permalink / raw) To: comp.lang.ada On 29 Mar 2005, at 18:29, Dennis Lee Bieber wrote: > On Tue, 29 Mar 2005 15:53:43 GMT, Tapio Kelloniemi <spam17@thack.org> > declaimed the following in comp.lang.ada: > >> >> But GUIs cannot be automated (as command-line tools can be). GUI (or >> TUI) > > The ancient Amiga could be... Windows can to. Sometime ago I consider that to solve a problem. I found a commercial application that claimed did it. I ended up not using it, or testing it, but I'm pretty sure it can be done. ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : count words 2005-03-29 12:40 ` jtg 2005-03-29 13:00 ` [OT] " Tapio Kelloniemi @ 2005-03-29 13:47 ` Dmitry A. Kazakov 2005-04-01 20:58 ` Georg Bauhaus 2 siblings, 0 replies; 122+ messages in thread From: Dmitry A. Kazakov @ 2005-03-29 13:47 UTC (permalink / raw) On Tue, 29 Mar 2005 14:40:53 +0200, jtg wrote: > Dmitry A. Kazakov wrote: > >> Maybe, but I just don't need wc! There is no occupation "word counting". >> The most close thing is linguistics and text analysis. Can wc that? >> > There are other tools for text analysis. With wc you can get the number > of lines in a file or the number of entries filtered by previous > commands. Like > cat *.ad? |wc > Which gives you immediately the total number of lines in your ada files I don't need that number. What I need is the list of all overrides of the primitive operation Foo in the instantiations of the generic package Bar with the formal parameter Baz set to the type Unbounded_String. >> The problem is that there are 1000 different commands with 50 options each, >> which gives 1000*2**50 variants to remember. Every UNIX admin starts with >> making an alias to ls and 100 further commands. It is just unusable. It >> took decades before people finally understood it and wrote KDE, Gnome >> control centers. Admins despise them, but they are lost. > > Really? But there are some 1000000 different Windows programs and each > of them has 1000 options, so you have to remember 1000000*2**1000 > variants :-) It is unfair to compare anything with Windows! (:-)) But yes, to achieve its goal to be the worst ever OS, Windows extensively uses UNIX ideas. It replaces a plethora of useless utilities with an equivalent set of dialog boxes and control tabs. Because it is still significantly less than UNIX's geometric explosion, MS shuffles the GUI at random to get the desired effect: you newer know "where is that damn thing!" > If you don't need command line tools, you don't have to use them, > but don't call them bad design just because you don't need them. depends on the requirements, the design is supposed to fulfill... -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : count words 2005-03-29 12:40 ` jtg 2005-03-29 13:00 ` [OT] " Tapio Kelloniemi 2005-03-29 13:47 ` Dmitry A. Kazakov @ 2005-04-01 20:58 ` Georg Bauhaus 2005-04-01 20:18 ` Pascal Obry 2 siblings, 1 reply; 122+ messages in thread From: Georg Bauhaus @ 2005-04-01 20:58 UTC (permalink / raw) jtg wrote: >> I think we have tolerated that C guys telling us that Ada is slow for too >> long time. Game is over! (:-)) >> > I'm afraid I don't understand. Who is the winner? > Maybe you meant: The game just begins :-) Here's another program for the shootout, regular expression matching. This is the Spitbol patterns version. Maybe the report printed by the program needs the removal of leeding spaces in Natural'image? with GNAT.Spitbol.Patterns, Ada.Command_Line, Ada.Strings.Maps.Constants, Ada.Text_IO; procedure regexmatch is use GNAT.Spitbol.Patterns, Ada.Text_IO, Ada.Strings.Maps; package Spit renames GNAT.Spitbol; dollar_1, dollar_2, dollar_3: Spit.VString; Digit: constant Character_Set := Constants.Decimal_Digit_Set; D: constant Pattern := Any(Digit); Area_Code: constant Pattern := (D & Fence & D & D) ** dollar_1; Exchange: constant Pattern := (D & D & D) ** dollar_2; Last_Four: constant Pattern := (D & D & D & D) ** dollar_3; -- Fence: consider "(123 " after two BreakX calls in `re` below re: constant Pattern := BreakX(Digit or To_Set('(')) -- scan upto digit or left paren & ( '(' & Area_Code & ')' -- parenthesized area code or -- or Area_Code) -- just area code & ' ' -- one space & Exchange & Any(" -") & Last_Four & ( RPos(0) or NotAny(Digit) ); -- at EOL or a non-digit type String_Pointer is access String; phones: array(1 .. 50) of String_Pointer; line_count: Natural; match_count: Natural := 0; NUM: Natural; procedure set_NUM is use Ada.Command_Line; begin NUM := Positive'value(Argument(1)); exception when Constraint_Error => NUM := 1; end set_NUM; procedure read_lines is last: Natural; buf: String(1 .. 80); begin line_count := phones'first; loop Get_Line(buf, last); phones(line_count) := new String'(buf(buf'first .. last)); line_count := line_count + 1; end loop; exception when End_Error => line_count := line_count - 1; end read_lines; begin set_NUM; read_lines; Anchored_Mode := True; while NUM > 0 loop NUM := NUM - 1; matching: for p in phones'first .. line_count loop if Match(phones(p).all, re) then if NUM = 0 then match_count := match_count + 1; put_line(Natural'image(match_count) & ": " & "(" & Spit.S(dollar_1) & ") " & Spit.S(dollar_2) & "-" & Spit.S(dollar_3)); end if; end if; end loop matching; end loop; end regexmatch; ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : count words 2005-04-01 20:58 ` Georg Bauhaus @ 2005-04-01 20:18 ` Pascal Obry 0 siblings, 0 replies; 122+ messages in thread From: Pascal Obry @ 2005-04-01 20:18 UTC (permalink / raw) Georg Bauhaus <sb463ba@user1.uni-duisburg.de> writes: > Here's another program for the shootout, regular expression > matching. This is the Spitbol patterns version. Maybe the > report printed by the program needs the removal of leeding > spaces in Natural'image? I've just checked-in a version (also base on GNAT Spitbol) for this test this afternoon. I have put a version based on GNAT Regpat as alternate solution. Both should appear soon... It would be nice to check if your pattern matcher is faster than mine as it is slightly different. Pascal. -- --|------------------------------------------------------ --| Pascal Obry Team-Ada Member --| 45, rue Gabriel Peri - 78114 Magny Les Hameaux FRANCE --|------------------------------------------------------ --| http://www.obry.org --| "The best way to travel is by means of imagination" --| --| gpg --keyserver wwwkeys.pgp.net --recv-key C1082595 ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : count words 2005-03-22 12:17 ` Dmitry A. Kazakov 2005-03-22 12:47 ` Marius Amado Alves @ 2005-03-22 12:53 ` Marius Amado Alves [not found] ` <f205219321dd18dba878fab16b7cb50d@netcabo.pt> 2005-03-22 13:58 ` Robert A Duff 3 siblings, 0 replies; 122+ messages in thread From: Marius Amado Alves @ 2005-03-22 12:53 UTC (permalink / raw) To: comp.lang.ada >> for I in S'Range loop >> if S (I) = EOL then Lines := Lines + 1; end if; >> if Is_Separator (S (I)) then >> if In_Word then Finish_Word; end if; >> else >> if not In_Word then Start_Word; end if; >> end if; >> end loop; >> >> I have not tried with string matching (if that's what you mean with >> "FSM").... > > No, I meant a finite state machine: Of course. Well, there is already the little FSM above for the word count. I didn't extend the FSM to the whole process because the other counts (lines, characters) don't need it (they don't depend on any state). And I doubt it would increase speed. But someone should try. > ... FSM is one of that rare cases where gotos are natural. What a pity > that Ada > does not have arrays of labels! (:-)) Indeed. Usually I use an enumeration for states, and a case statement on the state, inside a loop. ^ permalink raw reply [flat|nested] 122+ messages in thread
[parent not found: <f205219321dd18dba878fab16b7cb50d@netcabo.pt>]
* Re: Ada bench : count words [not found] ` <f205219321dd18dba878fab16b7cb50d@netcabo.pt> @ 2005-03-22 13:12 ` Marius Amado Alves 2005-03-23 16:58 ` Isaac Gouy 0 siblings, 1 reply; 122+ messages in thread From: Marius Amado Alves @ 2005-03-22 13:12 UTC (permalink / raw) To: comp.lang.ada In case someone's interested, to repeat the input file N=500 times as seemingly required by the benchmark, I use the program below. with Ada.Command_Line; with Ada.Strings.Unbounded; with Ada.Text_IO; procedure Repeat_File is use Ada.Command_Line; use Ada.Strings.Unbounded; use Ada.Text_IO; N : Natural; U : Unbounded_String; C : Character; begin N := Natural'Value (Argument (1)); while not End_Of_File loop Get_Immediate (C); Append (U, C); end loop; for I in 1 .. N loop Put (To_String (U)); end loop; end; ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : count words 2005-03-22 13:12 ` Marius Amado Alves @ 2005-03-23 16:58 ` Isaac Gouy 0 siblings, 0 replies; 122+ messages in thread From: Isaac Gouy @ 2005-03-23 16:58 UTC (permalink / raw) Marius Amado Alves wrote: > In case someone's interested, to repeat the input file N=500 times as > seemingly required by the benchmark The program just needs to process stdin Input-files of various sizes will be re-directed to stdin ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : count words 2005-03-22 12:17 ` Dmitry A. Kazakov ` (2 preceding siblings ...) [not found] ` <f205219321dd18dba878fab16b7cb50d@netcabo.pt> @ 2005-03-22 13:58 ` Robert A Duff 2005-03-22 16:30 ` Marius Amado Alves ` (2 more replies) 3 siblings, 3 replies; 122+ messages in thread From: Robert A Duff @ 2005-03-22 13:58 UTC (permalink / raw) "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes: > On Tue, 22 Mar 2005 11:57:22 +0000, Marius Amado Alves wrote: > > >>> ... To implement buffering, I have resorted to > >>> Ada.Direct_IO, which I think cannot apply to standard input. > >> > >> Is Text_IO that bad? > > > > No, if you can solve The Get_Line puzzle :-) > > What about Get (Item : out Character)? > > I wonder if calling C-lib's read would qualify! (:-)) Text_IO is pretty broken, from both a usability perspective and an efficiency perspective. I normally avoid it, and use pragma Import on the C routines (open, read, etc). Suitably wrapped in a clean interface, of course. I recently installed some third-party code into my project, which was using Direct_IO (not Text_IO). I rewrote it to use my I/O routines, and that part of the program sped up by a couple orders of magnitude. > I think Text_IO should translate it into LF. But anyway you can always > assume LF = new line, CR = space in the FSM. It's astonishing how much human intellectual work has been wasted on stupid issues like CR/LF vs. LF! And endianness. > FSM is one of that rare cases where gotos are natural. What a pity that Ada > does not have arrays of labels! (:-)) I prefer a loop containing a case statement. This has the advantage that you can insert extra code that happens on every state transition (e.g. assert that you didn't forget to set the next state explicitly, print out the state transitions for debugging, etc). But in *this* case, I'd just use a bunch of while loops and case statements and so forth. In my experience, most state machines have plenty of structure, so the totally unstructured methods: (loop-with-case or rats-nest-of-gotos or state-transition-table) seem unnecessary. - Bob ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : count words 2005-03-22 13:58 ` Robert A Duff @ 2005-03-22 16:30 ` Marius Amado Alves 2005-03-22 16:41 ` Tapio Kelloniemi ` (2 more replies) [not found] ` <1820eab50b57f2fe1c4e8e50bb0f4fe5@netcabo.pt> 2005-03-22 23:27 ` Larry Kilgallen 2 siblings, 3 replies; 122+ messages in thread From: Marius Amado Alves @ 2005-03-22 16:30 UTC (permalink / raw) To: comp.lang.ada > Text_IO is pretty broken, from both a usability perspective and an > efficiency perspective. I normally avoid it, and use pragma Import on > the C routines (open, read, etc). Suitably wrapped in a clean > interface, of course. Bob, would you contribute an efficient Read procedure to this benchmark? The speed difference is greater than I thought. Ada (with Text_IO) is about 10 times slower than the C benchmark. I had forgotten to compile the C benchmark with -O3. This would be dandy: procedure Read (Buffer : out String; Length : out Natural); reading from Standard_Input (this is required). Implemented with the C library if necessary (this is allowed, I think). (I know C well, but I'll be damned if I ever understand the view of C in Interfaces.C) Thanks a lot. ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : count words 2005-03-22 16:30 ` Marius Amado Alves @ 2005-03-22 16:41 ` Tapio Kelloniemi 2005-03-22 17:39 ` Marius Amado Alves 2005-03-22 18:34 ` Georg Bauhaus 2005-03-22 19:32 ` Robert A Duff 2 siblings, 1 reply; 122+ messages in thread From: Tapio Kelloniemi @ 2005-03-22 16:41 UTC (permalink / raw) Marius Amado Alves <amado.alves@netcabo.pt> wrote: >> Text_IO is pretty broken, from both a usability perspective and an >> efficiency perspective. I normally avoid it, and use pragma Import on >> the C routines (open, read, etc). Suitably wrapped in a clean >> interface, of course. > >Bob, would you contribute an efficient Read procedure to this benchmark? And to all other benchmarks which need read... >The speed difference is greater than I thought. Ada (with Text_IO) is >about 10 times slower than the C benchmark. I had forgotten to compile >the C benchmark with -O3. Why not use GNAT.OS_Lib. I think it does not add so much overhead, though it is most likely still a bit slower than libc read. But if that matters, why not use syscall (or is it denied). Also adding pragma Inlines somewhere might give us some juicy nanoseconds in case gcc does not inline everything automatically. -- Tapio ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : count words 2005-03-22 16:41 ` Tapio Kelloniemi @ 2005-03-22 17:39 ` Marius Amado Alves 2005-03-22 18:59 ` Dmitry A. Kazakov 2005-03-22 19:08 ` Tapio Kelloniemi 0 siblings, 2 replies; 122+ messages in thread From: Marius Amado Alves @ 2005-03-22 17:39 UTC (permalink / raw) To: comp.lang.ada > Why not use GNAT.OS_Lib. I'm trying, but the program does not work properly. It seems to terminate too early, and the results oscillate between 20 and 49 lines. I'll be damned if I understand what's happening. -- Count words in Ada for the language shootout -- by Marius Amado Alves with Ada.Characters.Handling; with Ada.Characters.Latin_1; with Ada.Strings.Fixed; with Ada.Text_IO; with GNAT.OS_Lib; procedure Count_Words_OS_Lib is use Ada.Characters.Handling; use Ada.Characters.Latin_1; use Ada.Text_IO; Buffer : String (1 .. 4096); EOL : String := (1 => LF); Lines : Natural := 0; Words : Natural := 0; Total : Natural := 0; In_Word : Boolean := False; N : Natural; function Is_Separator (C : Character) return Boolean is begin return Is_Control (C) or C = ' '; end; procedure Begin_Word is begin In_Word := True; end; procedure End_Word is begin if In_Word then Words := Words + 1; In_Word := False; end if; end; procedure End_Line is begin Lines := Lines + 1; Total := Total + 1; End_Word; end; procedure Count_Words (S : in String) is begin Total := Total + S'Length; Lines := Lines + Ada.Strings.Fixed.Count (S, EOL); for I in S'Range loop if Is_Separator (S (I)) then if In_Word then End_Word; end if; else if not In_Word then Begin_Word; end if; end if; end loop; end; pragma Inline (Begin_Word, End_Word, End_Line, Count_Words); begin loop N := GNAT.OS_Lib.Read (GNAT.OS_Lib.Standin, Buffer'Address, Buffer'Length); Count_Words (String (Buffer (1 .. N))); exit when N < Buffer'Length; end loop; Ada.Text_IO.Put_Line (Natural'Image (Lines) & Natural'Image (Words) & Natural'Image (Total)); end; ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : count words 2005-03-22 17:39 ` Marius Amado Alves @ 2005-03-22 18:59 ` Dmitry A. Kazakov 2005-03-22 19:08 ` Tapio Kelloniemi 1 sibling, 0 replies; 122+ messages in thread From: Dmitry A. Kazakov @ 2005-03-22 18:59 UTC (permalink / raw) On Tue, 22 Mar 2005 17:39:42 +0000, Marius Amado Alves wrote: >> Why not use GNAT.OS_Lib. > > I'm trying, but the program does not work properly. It seems to > terminate too early, and the results oscillate between 20 and 49 lines. > I'll be damned if I understand what's happening. > > -- Count words in Ada for the language shootout > -- by Marius Amado Alves > > with Ada.Characters.Handling; > with Ada.Characters.Latin_1; > with Ada.Strings.Fixed; > with Ada.Text_IO; > with GNAT.OS_Lib; > > procedure Count_Words_OS_Lib is > > use Ada.Characters.Handling; > use Ada.Characters.Latin_1; > use Ada.Text_IO; > > Buffer : String (1 .. 4096); > EOL : String := (1 => LF); > Lines : Natural := 0; > Words : Natural := 0; > Total : Natural := 0; > In_Word : Boolean := False; > N : Natural; > > function Is_Separator (C : Character) return Boolean is > begin > return Is_Control (C) or C = ' '; > end; > > procedure Begin_Word is > begin > In_Word := True; > end; > > procedure End_Word is > begin > if In_Word then > Words := Words + 1; > In_Word := False; > end if; > end; > > procedure End_Line is > begin > Lines := Lines + 1; > Total := Total + 1; > End_Word; > end; > > procedure Count_Words (S : in String) is > begin > Total := Total + S'Length; > Lines := Lines + Ada.Strings.Fixed.Count (S, EOL); > for I in S'Range loop > if Is_Separator (S (I)) then > if In_Word then End_Word; end if; > else > if not In_Word then Begin_Word; end if; > end if; > end loop; > end; > > pragma Inline (Begin_Word, End_Word, End_Line, Count_Words); > > begin > loop > N := GNAT.OS_Lib.Read > (GNAT.OS_Lib.Standin, > Buffer'Address, Hmm, why not Buffer (Buffer'First)'Address? > Buffer'Length); > Count_Words (String (Buffer (1 .. N))); > exit when N < Buffer'Length; > end loop; > > Ada.Text_IO.Put_Line > (Natural'Image (Lines) & > Natural'Image (Words) & > Natural'Image (Total)); > end; -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : count words 2005-03-22 17:39 ` Marius Amado Alves 2005-03-22 18:59 ` Dmitry A. Kazakov @ 2005-03-22 19:08 ` Tapio Kelloniemi 1 sibling, 0 replies; 122+ messages in thread From: Tapio Kelloniemi @ 2005-03-22 19:08 UTC (permalink / raw) Marius Amado Alves <amado.alves@netcabo.pt> wrote: >> Why not use GNAT.OS_Lib. > >I'm trying, but the program does not work properly. It seems to >terminate too early, and the results oscillate between 20 and 49 lines. >I'll be damned if I understand what's happening. [--] > procedure End_Line is > begin > Lines := Lines + 1; Remove the following line: > Total := Total + 1; > End_Word; > end; > > procedure Count_Words (S : in String) is > begin > Total := Total + S'Length; Remove the following line (it checks every character of the buffer once more): > Lines := Lines + Ada.Strings.Fixed.Count (S, EOL); > for I in S'Range loop Add: if S (I) = LF then End_Line; elsif Is_Separator (S (I)) then End_Word; Note that the In-Word condition was checked in the loop and in In_Word. elsif not In_Word then Begin_Word; > end if; > end loop; > end; > > pragma Inline (Begin_Word, End_Word, End_Line, Count_Words); > >begin > loop > N := GNAT.OS_Lib.Read > (GNAT.OS_Lib.Standin, > Buffer'Address, Replace the above line with Buffer (Buffer'First)'Address, And Read is given the address of first byte to fill, not some internal pointers which are used for bounds checking. Notice that when you leave that safe Text_IO, it is very easy to make the same mistakes as in C (we are programming in C style). -- Tapio ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : count words 2005-03-22 16:30 ` Marius Amado Alves 2005-03-22 16:41 ` Tapio Kelloniemi @ 2005-03-22 18:34 ` Georg Bauhaus 2005-03-22 19:32 ` Robert A Duff 2 siblings, 0 replies; 122+ messages in thread From: Georg Bauhaus @ 2005-03-22 18:34 UTC (permalink / raw) Marius Amado Alves wrote: >> Text_IO is pretty broken, > > Bob, would you contribute an efficient Read procedure to this benchmark? If Ada.Text_IO is pretty broken (in this context), then it is only fair if this is admitted. Cheating here will reflect on the whole effort, I'd say. Remember there are things that you can do with Text_IO that cannot easily be done with I/O-libraries of other languages. OTOH, I recall that streams have been suggested as a faster replacement when the Text_IO features are not needed. There are wordcount programs in the AI302 examples directory. Georg ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : count words 2005-03-22 16:30 ` Marius Amado Alves 2005-03-22 16:41 ` Tapio Kelloniemi 2005-03-22 18:34 ` Georg Bauhaus @ 2005-03-22 19:32 ` Robert A Duff 2005-03-22 20:04 ` tmoran 2005-03-23 16:55 ` Isaac Gouy 2 siblings, 2 replies; 122+ messages in thread From: Robert A Duff @ 2005-03-22 19:32 UTC (permalink / raw) Marius Amado Alves <amado.alves@netcabo.pt> writes: > > Text_IO is pretty broken, from both a usability perspective and an > > efficiency perspective. I normally avoid it, and use pragma Import on > > the C routines (open, read, etc). Suitably wrapped in a clean > > interface, of course. > > Bob, would you contribute an efficient Read procedure to this benchmark? You could use OS_Lib.Read. I believe it's portable to non-GNAT Ada compilers, and I believe its license is pretty liberal. I haven't looked at any of these benchmarks. It sounds like this one is measuring speed of text input. Is that intentional? If so, then Text_IO is the appropriate thing to measure, because that's the standard way of doing it in Ada. This is one of the few areas where C is simply better than Ada. Of course, as somebody pointed out, input speed is irrelevant for many programs. But I don't think Ada compilers typically use their own Text_IO to read their input! - Bob ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : count words 2005-03-22 19:32 ` Robert A Duff @ 2005-03-22 20:04 ` tmoran 2005-03-23 16:55 ` Isaac Gouy 1 sibling, 0 replies; 122+ messages in thread From: tmoran @ 2005-03-22 20:04 UTC (permalink / raw) >measuring speed of text input. Is that intentional? If so, then >Text_IO is the appropriate thing to measure, because that's the standard >way of doing it in Ada. This is one of the few areas where C is simply >better than Ada. Commenting out the call on Count_Words, so the program basically just calls End_Of_File and Get_Line, yields a rather modest speedup. ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : count words 2005-03-22 19:32 ` Robert A Duff 2005-03-22 20:04 ` tmoran @ 2005-03-23 16:55 ` Isaac Gouy 1 sibling, 0 replies; 122+ messages in thread From: Isaac Gouy @ 2005-03-23 16:55 UTC (permalink / raw) Robert A Duff wrote: -snip- > If so, then Text_IO is the appropriate thing to measure, because that's the > standard way of doing it in Ada. This is one of the few areas where C is > simply better than Ada. Yes! Contribute programs that show 'the standard way of doing it in Ada'. (Contribute wierd twisted stuff as-well but first show the Ada way. Y'all welcome to complain that wierd twisted C programs should be demoted.) ^ permalink raw reply [flat|nested] 122+ messages in thread
[parent not found: <1820eab50b57f2fe1c4e8e50bb0f4fe5@netcabo.pt>]
* Re: Ada bench : count words [not found] ` <1820eab50b57f2fe1c4e8e50bb0f4fe5@netcabo.pt> @ 2005-03-22 22:49 ` Stephen Leake 2005-03-22 22:58 ` Robert A Duff 0 siblings, 1 reply; 122+ messages in thread From: Stephen Leake @ 2005-03-22 22:49 UTC (permalink / raw) To: comp.lang.ada Marius Amado Alves <amado.alves@netcabo.pt> writes: >> Text_IO is pretty broken, from both a usability perspective and an >> efficiency perspective. I normally avoid it, and use pragma Import on >> the C routines (open, read, etc). Suitably wrapped in a clean >> interface, of course. > > Bob, would you contribute an efficient Read procedure to this benchmark? How about Stream_IO? that's closer to the C library, but is Pure Ada. -- -- Stephe ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : count words 2005-03-22 22:49 ` Stephen Leake @ 2005-03-22 22:58 ` Robert A Duff 0 siblings, 0 replies; 122+ messages in thread From: Robert A Duff @ 2005-03-22 22:58 UTC (permalink / raw) Stephen Leake <stephen_leake@acm.org> writes: > Marius Amado Alves <amado.alves@netcabo.pt> writes: > > >> Text_IO is pretty broken, from both a usability perspective and an > >> efficiency perspective. I normally avoid it, and use pragma Import on > >> the C routines (open, read, etc). Suitably wrapped in a clean > >> interface, of course. > > > > Bob, would you contribute an efficient Read procedure to this benchmark? > > How about Stream_IO? that's closer to the C library, but is Pure Ada. Yes, that makes sense. It's probably quite fast, but I haven't measured it. It's a bit of a pain because you have to deal in Stream_Elements instead of characters. But that shouldn't slow it down -- just makes the source code a little more complicated. - Bob ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : count words 2005-03-22 13:58 ` Robert A Duff 2005-03-22 16:30 ` Marius Amado Alves [not found] ` <1820eab50b57f2fe1c4e8e50bb0f4fe5@netcabo.pt> @ 2005-03-22 23:27 ` Larry Kilgallen 2005-03-23 22:33 ` Robert A Duff ` (2 more replies) 2 siblings, 3 replies; 122+ messages in thread From: Larry Kilgallen @ 2005-03-22 23:27 UTC (permalink / raw) In article <wccvf7jdboy.fsf@shell01.TheWorld.com>, Robert A Duff <bobduff@shell01.TheWorld.com> writes: > Text_IO is pretty broken, from both a usability perspective and an > efficiency perspective. I normally avoid it, and use pragma Import on > the C routines (open, read, etc). Suitably wrapped in a clean > interface, of course. Do those C routines handle line lengths, page lengths, and column specifications ? If not, perhaps you are looking for something different from the full feature set of Text_IO. But how Text_IO performs from an efficiency standpoint should depend upon the particular implementation. If you have a file system that does not maintain record boundaries, then Text_IO must do more work. But _something_ must handle the differences between various text file formats (Stream-LF, Stream-CR, Stream-CRLF and out-of-band record breaks). The C routines with which I am familiar (VMS) seem to force the user to convert everything into a stream of bytes and then the C routines convert it back into the actual file format (like those listed above). _That_ is my idea of inefficient. ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : count words 2005-03-22 23:27 ` Larry Kilgallen @ 2005-03-23 22:33 ` Robert A Duff 2005-03-24 5:02 ` Larry Kilgallen [not found] ` <wccpsxqro0c.fsfOrganization: LJK Software <bM40pHW6P2KW@eisner.encompasserve.org> 2 siblings, 0 replies; 122+ messages in thread From: Robert A Duff @ 2005-03-23 22:33 UTC (permalink / raw) Kilgallen@SpamCop.net (Larry Kilgallen) writes: > In article <wccvf7jdboy.fsf@shell01.TheWorld.com>, Robert A Duff <bobduff@shell01.TheWorld.com> writes: > > > Text_IO is pretty broken, from both a usability perspective and an > > efficiency perspective. I normally avoid it, and use pragma Import on > > the C routines (open, read, etc). Suitably wrapped in a clean > > interface, of course. > > Do those C routines handle line lengths, page lengths, and column > specifications ? No. > If not, perhaps you are looking for something different from the full > feature set of Text_IO. Yes, indeed. ;-) For one thing, simple character-by-character (buffered) input is inefficient using Text_IO. I don't usually want to do line-by-line input, but when I do, the features in Text_IO are a pain. Witness the endless series of questions from beginners about that issue in this newsgroup. Another design flaw is that Text_IO mixes I/O with data formatting, which should be treated as two separate issues. > But how Text_IO performs from an efficiency standpoint should depend > upon the particular implementation. Well, in part, but *every* implementation must deal with those column numbers, and that's a waste for applications that don't need it. >... If you have a file system that > does not maintain record boundaries, then Text_IO must do more work. > But _something_ must handle the differences between various text file > formats (Stream-LF, Stream-CR, Stream-CRLF and out-of-band record > breaks). True. It's a shame the computer industry can't standardize on one method. I suspect it's impossible to design a method that is maximally efficient and maximally portable and maximally convenient in this regard. > The C routines with which I am familiar (VMS) seem to force the user > to convert everything into a stream of bytes and then the C routines > convert it back into the actual file format (like those listed above). > _That_ is my idea of inefficient. I'm not a big fan of Record Management Services, by the way. ;-) - Bob ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : count words 2005-03-22 23:27 ` Larry Kilgallen 2005-03-23 22:33 ` Robert A Duff @ 2005-03-24 5:02 ` Larry Kilgallen [not found] ` <wccpsxqro0c.fsfOrganization: LJK Software <bM40pHW6P2KW@eisner.encompasserve.org> 2 siblings, 0 replies; 122+ messages in thread From: Larry Kilgallen @ 2005-03-24 5:02 UTC (permalink / raw) In article <wccpsxqro0c.fsf@shell01.TheWorld.com>, Robert A Duff <bobduff@shell01.TheWorld.com> writes: > Kilgallen@SpamCop.net (Larry Kilgallen) writes: >> The C routines with which I am familiar (VMS) seem to force the user >> to convert everything into a stream of bytes and then the C routines >> convert it back into the actual file format (like those listed above). >> _That_ is my idea of inefficient. > > I'm not a big fan of Record Management Services, by the way. ;-) So it sounds like your life does not have much need to share indexed files between programming languages. ^ permalink raw reply [flat|nested] 122+ messages in thread
[parent not found: <wccpsxqro0c.fsfOrganization: LJK Software <bM40pHW6P2KW@eisner.encompasserve.org>]
* Re: Ada bench : count words [not found] ` <wccpsxqro0c.fsfOrganization: LJK Software <bM40pHW6P2KW@eisner.encompasserve.org> @ 2005-03-25 1:34 ` Robert A Duff 0 siblings, 0 replies; 122+ messages in thread From: Robert A Duff @ 2005-03-25 1:34 UTC (permalink / raw) Kilgallen@SpamCop.net (Larry Kilgallen) writes: > In article <wccpsxqro0c.fsf@shell01.TheWorld.com>, Robert A Duff <bobduff@shell01.TheWorld.com> writes: > > Kilgallen@SpamCop.net (Larry Kilgallen) writes: > > >> The C routines with which I am familiar (VMS) seem to force the user > >> to convert everything into a stream of bytes and then the C routines > >> convert it back into the actual file format (like those listed above). > >> _That_ is my idea of inefficient. > > > > I'm not a big fan of Record Management Services, by the way. ;-) > > So it sounds like your life does not have much need to share indexed > files between programming languages. True. VMS does an admirable job of being language independent, and that's a Good Thing. - Bob ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : count words 2005-03-22 10:59 ` Dmitry A. Kazakov 2005-03-22 11:57 ` Marius Amado Alves @ 2005-03-22 12:22 ` Jeff C 2005-03-23 16:48 ` Isaac Gouy 2005-03-23 17:06 ` Isaac Gouy 2 siblings, 1 reply; 122+ messages in thread From: Jeff C @ 2005-03-22 12:22 UTC (permalink / raw) Dmitry A. Kazakov wrote: > On Tue, 22 Mar 2005 01:16:09 +0000, Marius Amado Alves wrote: > > >>I took a shot at the count-words benchmark, a program to count lines, >>words and characters. The Ada program currently published there is >>broken. My program is correct and portable but: >> >>- the speed is circa 1/3 of the GCC C version >> >>- it fails to comply with the requirement that the input be taken from >>standard input. To implement buffering, I have resorted to >>Ada.Direct_IO, which I think cannot apply to standard input. > > > Is Text_IO that bad? It can be. It does a few things that real programs often do not need to do (counting pages) and may not do buffering that is optimal for the job here. The problem is of course that it is a requirement to read from standard input and if I were running the website I would reject a program that does not. Perhaps this could be done by using ada.text_io.text_streams to get access to a stream based on standard input and then use the "Read" procedure (v.s. 'read attribute) to get 4k blocks of data (recommended chunking size from problem definition). The Read procedure uses an "Item" and "Last" approach so you can look for Last to see how much you actually read. Note since Ada 95 came out I hardly ever use direct_io anymore because even problems that seem to line up well with direct IO over time degrade as file formats change. I tend to jump (perhaps to soon) to ada.streams.stream_io approaches and or approaches like getting access to text_io streams. Note in this case I am not saying that the text_io stream approach will be faster..I dont really know..But at least it can be made compliant with the requirements of the test. with Text_IO; with Ada.Text_IO.Text_Streams; use Ada.Text_IO.Text_Streams; use Text_IO; with Ada.Streams; use Ada.Streams; procedure stre_test is S : Stream_Access := Stream(Standard_Input); Data : Stream_Element_Array(1 .. 10); Last : Stream_Element_Offset; begin Read(S.all, Data, Last); Text_IO.Put_Line(Stream_Element_Offset'image(Last)); end stre_test; ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : count words 2005-03-22 12:22 ` Jeff C @ 2005-03-23 16:48 ` Isaac Gouy 0 siblings, 0 replies; 122+ messages in thread From: Isaac Gouy @ 2005-03-23 16:48 UTC (permalink / raw) > The problem is of course that it is a requirement to read from > standard input and if I were running the website I would reject a > program that does not. We don't even need to reject such a program - it'll fail all-by-itself. ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : count words 2005-03-22 10:59 ` Dmitry A. Kazakov 2005-03-22 11:57 ` Marius Amado Alves 2005-03-22 12:22 ` Jeff C @ 2005-03-23 17:06 ` Isaac Gouy 2 siblings, 0 replies; 122+ messages in thread From: Isaac Gouy @ 2005-03-23 17:06 UTC (permalink / raw) Dmitry A. Kazakov wrote: -snip- > P.S. For word frequencies: gcc version uses hash + sort. I wonder if binary > trees could be better here. Or even sorted arrays for simplicity, there is > no item deletion, search is more often than insert... Of course, for tree > node allocation one could use a stack pool instead of heap. > > (That's one of the weaknesses of this contest. Actually the method should > have been specified) Yes, that's one of the more glaring weaknesses. (It's mostly laziness and jaundiced experience.) ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : count words 2005-03-22 1:16 ` Ada bench : count words Marius Amado Alves 2005-03-22 10:59 ` Dmitry A. Kazakov @ 2005-03-22 19:49 ` tmoran 2005-03-22 21:51 ` Dmitry A. Kazakov ` (2 more replies) 2005-03-22 22:27 ` Dmitry A. Kazakov 2 siblings, 3 replies; 122+ messages in thread From: tmoran @ 2005-03-22 19:49 UTC (permalink / raw) > - the speed is circa 1/3 of the GCC C version > ... > The complete program follows. I missed the original post with the URL of the benchmark, but this appears on my machine to be about 5x as fast (gnatmake -gnato -O2): with Ada.Calendar, Ada.Streams, Ada.Streams.Stream_IO, Ada.Text_IO, Ada.Text_IO.Text_Streams; procedure Cw is use Ada.Streams; use type Ada.Calendar.Time; Stream : Ada.Text_IO.Text_Streams.Stream_Access; Buffer : Stream_Element_Array(1 .. 4096); Last : Stream_Element_Offset; Lines, Words, Total : Natural := 0; In_Word : Boolean := False; LF : constant Stream_Element := Character'pos(Ascii.LF); CR : constant Stream_Element := Character'pos(Ascii.CR); EOF_Char: constant Stream_Element := 16#1A#; Is_Separator: constant array (Stream_Element) of Boolean := (0 .. 32 | 127 .. 159 => True, others => False); T0, T1 : Ada.Calendar.Time; begin T0 := Ada.Calendar.Clock; Stream := Ada.Text_IO.Text_Streams.Stream(Ada.Text_IO.Current_Input); Through_File: loop Ada.Streams.Read(Ada.Streams.Root_Stream_Type'Class(Stream.all), Buffer, Last); for I in 1 .. Last loop exit Through_File when Buffer(I) = EOF_Char; Total := Total + 1; if Is_Separator(Buffer(I)) then In_Word := False; if Buffer(I) = LF then -- LF counts toward Total and toward Lines Lines := Lines + 1; elsif Buffer(I) = CR then -- don't count CR as content or as a line Total := Total-1; end if; else if not In_Word then Words := Words + 1; In_Word := True; end if; end if; end loop; exit Through_File when Last < Buffer'Last; end loop Through_File; T1 := Ada.Calendar.Clock; Ada.Text_IO.Put_Line(Natural'Image(Lines) & Natural'Image(Words) & Natural'Image(Total)); Ada.Text_IO.Put_Line("took" & Duration'Image(T1 - T0)); end Cw; ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : count words 2005-03-22 19:49 ` tmoran @ 2005-03-22 21:51 ` Dmitry A. Kazakov 2005-03-23 0:16 ` tmoran 2005-03-22 22:33 ` Marius Amado Alves [not found] ` <00b362390273e6c04844dd4ff1885ee0@netcabo.pt> 2 siblings, 1 reply; 122+ messages in thread From: Dmitry A. Kazakov @ 2005-03-22 21:51 UTC (permalink / raw) On Tue, 22 Mar 2005 13:49:26 -0600, tmoran@acm.org wrote: >> - the speed is circa 1/3 of the GCC C version >> ... >> The complete program follows. > > I missed the original post with the URL of the benchmark, I saw two: http://shootout.alioth.debian.org/great/benchmark.php?test=wc&lang=all&sort=fullcpu, and http://dada.perl.it/shootout/wc.html > but this > appears on my machine to be about 5x as fast (gnatmake -gnato -O2): > > with Ada.Calendar, > Ada.Streams, > Ada.Streams.Stream_IO, > Ada.Text_IO, > Ada.Text_IO.Text_Streams; > procedure Cw is > use Ada.Streams; > use type Ada.Calendar.Time; > Stream : Ada.Text_IO.Text_Streams.Stream_Access; > Buffer : Stream_Element_Array(1 .. 4096); > Last : Stream_Element_Offset; > Lines, Words, Total : Natural := 0; > In_Word : Boolean := False; > LF : constant Stream_Element := Character'pos(Ascii.LF); > CR : constant Stream_Element := Character'pos(Ascii.CR); > EOF_Char: constant Stream_Element := 16#1A#; > Is_Separator: constant array (Stream_Element) of Boolean > := (0 .. 32 | 127 .. 159 => True, others => False); It seems (from the description) that the separators are HT, SP, LF. Everything else is a "letter". > T0, T1 : Ada.Calendar.Time; > begin > > T0 := Ada.Calendar.Clock; No, they count load time as the run-time to be run as: for i in 1 2 3 4 5 6 7 8 9 10; do time command done So better link all static! (:-)) > Stream := Ada.Text_IO.Text_Streams.Stream(Ada.Text_IO.Current_Input); > Through_File: > loop > Ada.Streams.Read(Ada.Streams.Root_Stream_Type'Class(Stream.all), > Buffer, Last); > for I in 1 .. Last loop > exit Through_File when Buffer(I) = EOF_Char; > Total := Total + 1; > if Is_Separator(Buffer(I)) then > In_Word := False; > if Buffer(I) = LF then -- LF counts toward Total and toward Lines > Lines := Lines + 1; > elsif Buffer(I) = CR then -- don't count CR as content or as a line > Total := Total-1; Remove this, CR is a letter! (:-)) > end if; > else > if not In_Word then > Words := Words + 1; > In_Word := True; > end if; > end if; > end loop; > exit Through_File when Last < Buffer'Last; > end loop Through_File; > > T1 := Ada.Calendar.Clock; > > Ada.Text_IO.Put_Line(Natural'Image(Lines) > & Natural'Image(Words) > & Natural'Image(Total)); > Ada.Text_IO.Put_Line("took" & Duration'Image(T1 - T0)); > end Cw; -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : count words 2005-03-22 21:51 ` Dmitry A. Kazakov @ 2005-03-23 0:16 ` tmoran 2005-03-23 7:25 ` Dmitry A. Kazakov 0 siblings, 1 reply; 122+ messages in thread From: tmoran @ 2005-03-23 0:16 UTC (permalink / raw) >It seems (from the description) that the separators are HT, SP, LF. > ... The "same thing" FAQ says "We prefer plain vanilla programs - after all we're trying to compare language implementations not programmer effort and skill." So the "Ada version" should match the "Timing Trials" C version, which uses a block read, not the line oriented "gets". So Ada.Text_IO.Get_Line should not be used. (I grant that probably zero people paid attention to this "prefer".) When I browsed to the "input" page and cut&pasted to save the part that appeared to be the input file, the file size was 6102, whereas it's apparently supposed to be 6096. Also, if CR is not a separator, then the lines consisting of CR-LF only should count CR as a short word, no? Or is the file supposed to be Unix-style, with CRs removed? ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : count words 2005-03-23 0:16 ` tmoran @ 2005-03-23 7:25 ` Dmitry A. Kazakov 0 siblings, 0 replies; 122+ messages in thread From: Dmitry A. Kazakov @ 2005-03-23 7:25 UTC (permalink / raw) On Tue, 22 Mar 2005 18:16:16 -0600, tmoran@acm.org wrote: >>It seems (from the description) that the separators are HT, SP, LF. >> ... > > The "same thing" FAQ says "We prefer plain vanilla programs - after all > we're trying to compare language implementations not programmer effort and > skill." So the "Ada version" should match the "Timing Trials" C version, > which uses a block read, not the line oriented "gets". So > Ada.Text_IO.Get_Line should not be used. (I grant that probably zero > people paid attention to this "prefer".) > > When I browsed to the "input" page and cut&pasted to save the part that > appeared to be the input file, the file size was 6102, whereas it's > apparently supposed to be 6096. Also, if CR is not a separator, then the > lines consisting of CR-LF only should count CR as a short word, no? Probably yes. Look at this: /* -*- mode: c -*- * $Id: wc-gcc.code,v 1.9 2005/03/21 08:36:50 bfulgham Exp $ * http://www.bagley.org/~doug/shootout/ * * Author: Waldemar Hebisch (hebisch@math.uni.wroc.pl) * Optimizations: Michael Herf (mike@herfconsulting.com) * Further Revisions: Paul Hsieh (qed@pobox.com) */ #include <stdio.h> #include <unistd.h> #include <limits.h> #define BSIZ 4096 unsigned long ws[UCHAR_MAX + 1]; unsigned long nws[UCHAR_MAX + 1]; char buff[BSIZ]; int main(void) { unsigned long prev_nws = 0x10000L, w_cnt = 0, l_cnt = 0, b_cnt = 0, cnt; /* Fill tables */ for (cnt = 0; cnt <= UCHAR_MAX; cnt++) { ws[cnt] = (cnt == ' ' || cnt == '\n' || cnt == '\t') + (0x10000L & -(cnt == '\n')); nws[cnt] = !(cnt == ' ' || cnt == '\n' || cnt == '\t') + 0x10000L; } /* Main loop */ while (0 != (cnt = read (0, buff, BSIZ))) { unsigned long vect_count = 0; unsigned char *pp, *pe; b_cnt += cnt; pe = buff + cnt; pp = buff; while (pp < pe) { vect_count += ws[*pp] & prev_nws; prev_nws = nws[*pp]; pp ++; } w_cnt += vect_count & 0xFFFFL; l_cnt += vect_count >> 16; } w_cnt += 1 & prev_nws; printf ("%d %d %d\n", l_cnt, w_cnt, b_cnt); return 0; } This will count CR as a word. > Or is the file supposed to be Unix-style, with CRs removed? Yes if they accept "read", and no if they don't. Usual schism of all contests of this sort... -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : count words 2005-03-22 19:49 ` tmoran 2005-03-22 21:51 ` Dmitry A. Kazakov @ 2005-03-22 22:33 ` Marius Amado Alves [not found] ` <00b362390273e6c04844dd4ff1885ee0@netcabo.pt> 2 siblings, 0 replies; 122+ messages in thread From: Marius Amado Alves @ 2005-03-22 22:33 UTC (permalink / raw) To: comp.lang.ada On 22 Mar 2005, at 19:49, tmoran@acm.org wrote: >> - the speed is circa 1/3 of the GCC C version >> ... >> The complete program follows. > > I missed the original post with the URL of the benchmark, but this > appears on my machine to be about 5x as fast (gnatmake -gnato -O2): Excelent. > with Ada.Calendar, > Ada.Streams, > Ada.Streams.Stream_IO, > Ada.Text_IO, > Ada.Text_IO.Text_Streams; I'm so stupid! I had forgotten about Text_Streams! I'll review this tomorrow on the bus to work (I think your program is separating words at buffer end, and it should not). ^ permalink raw reply [flat|nested] 122+ messages in thread
[parent not found: <00b362390273e6c04844dd4ff1885ee0@netcabo.pt>]
* Re: Ada bench : count words [not found] ` <00b362390273e6c04844dd4ff1885ee0@netcabo.pt> @ 2005-03-23 15:09 ` Marius Amado Alves 2005-03-23 19:00 ` tmoran 2005-03-30 16:08 ` Andre 0 siblings, 2 replies; 122+ messages in thread From: Marius Amado Alves @ 2005-03-23 15:09 UTC (permalink / raw) To: comp.lang.ada > I'll review this tomorrow on the bus to work (I think your program is > separating words at buffer end, and it should not). Done. Tmoran, sorry, my first sight was wrong, your algorithm is mostly fine. Minor reservations: - special statute given to CR; personally I think all characters (control or not) should count to total - reliance on Stream_Element representing one character; portable across all range of environments? My own program rewritten with Text_Streams attains the same speed as yours. The fact that it is structured doesn't hit performance significantly. Pragma Inline improves a little bit. Real times (ms) on my iBook, for input file repeated 2500 times, all compiled with only -O3: C ........................ 600 Yours, or mine inlined ... 700 Mine not inlined ......... 750 So we should submit yours, or mine inlined. It will put Ada right after the Cs w.r.t. speed. W.r.t. executable file size Ada is much bigger. On my iBook: C ....... 15k Mine .... 423k Yours ... 459k W.r.t. source code size all are similar. Number of significant semicolons (Ada), or semicolons + {} blocks + #includes + #defines (C): C .............. 27 Yours .......... 33 Mine inlined ... 52 My program follows for reference. It accepts the code for EOL as an argument. Default = 10 (LF). -- The Great Computer Language Shootout -- http://shootout.alioth.debian.org/ -- -- contributed by Guys De Cla with Ada.Characters.Handling; with Ada.Characters.Latin_1; with Ada.Command_Line; with Ada.Streams; with Ada.Streams.Stream_IO; with Ada.Strings.Fixed; with Ada.Text_IO; with Ada.Text_IO.Text_Streams; procedure Count_Words_Portable is use Ada.Characters.Handling; use Ada.Characters.Latin_1; use Ada.Command_Line; use Ada.Streams; use Ada.Streams.Stream_IO; use Ada.Text_IO; use Ada.Text_IO.Text_Streams; Buffer : Stream_Element_Array (1 .. 4096); Input_Stream : Ada.Text_IO.Text_Streams.Stream_Access := Ada.Text_IO.Text_Streams.Stream (Current_Input); EOL_Character_Pos : Stream_Element := Character'Pos (LF); Lines : Natural := 0; Words : Natural := 0; Total : Natural := 0; In_Word : Boolean := False; N : Stream_Element_Offset; Is_Separator : array (Stream_Element) of Boolean := (0 .. 32 | 127 .. 159 => True, others => False); procedure Begin_Word is begin Words := Words + 1; In_Word := True; end; procedure End_Word is begin In_Word := False; end; procedure End_Line is begin Lines := Lines + 1; End_Word; end; procedure Count_Words (S : in Stream_Element_Array) is begin Total := Total + S'Length; for I in S'Range loop if S (I) = EOL_Character_Pos then End_Line; else if Is_Separator (S (I)) then if In_Word then End_Word; end if; else if not In_Word then Begin_Word; end if; end if; end if; end loop; end; pragma Inline (Begin_Word, End_Word, End_Line, Count_Words); begin begin EOL_Character_Pos := Stream_Element'Value (Argument (1)); exception when Constraint_Error => null; end; Ada.Text_IO.Put_Line ("EOL =>" & Stream_Element'Image (EOL_Character_Pos)); loop Read (Root_Stream_Type'Class (Input_Stream.all), Buffer, N); Count_Words (Buffer (1 .. N)); exit when N < Buffer'Length; end loop; Ada.Text_IO.Put_Line (Natural'Image (Lines) & Natural'Image (Words) & Natural'Image (Total)); end; ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : count words 2005-03-23 15:09 ` Marius Amado Alves @ 2005-03-23 19:00 ` tmoran 2005-03-23 19:30 ` tmoran ` (2 more replies) 2005-03-30 16:08 ` Andre 1 sibling, 3 replies; 122+ messages in thread From: tmoran @ 2005-03-23 19:00 UTC (permalink / raw) > - special statute given to CR; personally I think all characters > (control or not) should count to total Right. I hadn't yet seen the original problem description. > - reliance on Stream_Element representing one character; portable > across all range of environments? I would argue that that's in effect assumed by the C and probably most of the other programs, so it should be assumed for Ada. (Similarly for the input file being an unstructured byte stream.) If you truly want a portable program, then the competition should be written to be portable also. I'll bet that would change some results. ;) >So we should submit yours, or mine inlined. It will put Ada right after >the Cs w.r.t. speed. W.r.t. executable file size Ada is much bigger. On In my (minimal) trial, Dmitry A. Kazakov's (Message-ID: <1j5jfikipztuc.jx5n2zhbg2np$.dlg@40tude.net>) appears still faster. I don't understand why, unless Gnat.OS.Lib is significantly faster than Gnat's Ada.Streams.Read, which seems unlikely. ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : count words 2005-03-23 19:00 ` tmoran @ 2005-03-23 19:30 ` tmoran 2005-03-23 21:38 ` tmoran 2005-03-23 19:54 ` Tapio Kelloniemi 2005-03-24 20:19 ` tmoran 2 siblings, 1 reply; 122+ messages in thread From: tmoran @ 2005-03-23 19:30 UTC (permalink / raw) If the object is to compare substantially different programming languages, it would be interesting to make a multitasking Ada version (two or more counting tasks being fed buffers from the Read loop) and run it on a multiiple or hyperthreaded CPU. ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : count words 2005-03-23 19:30 ` tmoran @ 2005-03-23 21:38 ` tmoran 2005-03-25 7:30 ` Simon Wright 0 siblings, 1 reply; 122+ messages in thread From: tmoran @ 2005-03-23 21:38 UTC (permalink / raw) Here's a simple multitasking conversion. I would appreciate it if someone with a dual core or hyperthreaded CPU tried it out. Probably increasing the buffer size, by lessening task interactions, would speed it up. With multiple CPUs, the array Counter : array(1 .. 2) of Counters; might be increased to more tasks, also. with Ada.Calendar, Ada.Streams, Ada.Streams.Stream_IO, Ada.Text_IO, Ada.Text_IO.Text_Streams; procedure Cwt is use Ada.Streams; use type Ada.Calendar.Time; T0 : Ada.Calendar.Time := Ada.Calendar.Clock; -- start timing T1 : Ada.Calendar.Time; subtype Buffer_Type is Stream_Element_Array(1 .. 4096); LF : constant Stream_Element := Character'pos(Ascii.LF); Is_Whitespace : constant array(Stream_Element) of Boolean := (Character'pos(Ascii.LF) | Character'pos(Ascii.HT) | Character'pos(' ') => True, others => False); -- "Counters" tasks run independently, asking Data_Source for buffer loads -- and tallying word and line counts. When done, they wait for their -- (partial) counts to be harvested, then terminate. task type Counters is entry Harvest(Line_Count, Word_Count : out Natural); end Counters; -- Data_Source is the IO task, supplying data as requested to Counters. -- Note that Data_Source counts the whitespace->non-white transitions -- at the ends of buffers. Counters' partial counts are added to that. -- Data_Source also counts the Total number of characters. task Data_Source is entry Get(Buffer : out Buffer_Type; Last : out Stream_Element_Offset); end Data_Source; task body Counters is Buffer : Buffer_Type; Last : Stream_Element_Offset; Lines, Words, Total : Natural := 0; In_Whitespace : Boolean; begin loop Data_Source.Get(Buffer, Last); exit when Last = 0; In_Whitespace := Is_Whitespace(Buffer(1)); for I in 1 .. Last loop if Is_Whitespace(Buffer(I)) then if Buffer(I) = LF then Lines := Lines + 1; end if; In_Whitespace := True; elsif In_Whitespace then In_Whitespace := False; Words := Words + 1; end if; end loop; exit when Last < Buffer'last; end loop; accept Harvest(Line_Count, Word_Count : out Natural) do Line_Count := Lines; Word_Count := Words; end; end Counters; Lines, Words, Total : Natural := 0; task body Data_Source is Stream : Ada.Text_IO.Text_Streams.Stream_Access; At_End : Boolean := False; In_Whitespace : Boolean := True; -- we must count at edges of buffer loads begin Stream := Ada.Text_IO.Text_Streams.Stream(Ada.Text_IO.Current_Input); loop select accept Get(Buffer : out Buffer_Type; Last : out Stream_Element_Offset) do if At_End then Last := 0; else Ada.Streams.Read(Ada.Streams.Root_Stream_Type'Class(Stream.all), Buffer, Last); Total := Total+Integer(Last); if Last > 0 then if In_Whitespace and not Is_Whitespace(Buffer(1)) then Words := Words+1; end if; In_Whitespace := Is_Whitespace(Buffer(Last)); end if; if Last < Buffer'last then At_End := True;end if; end if; end; or terminate; end select; end loop; end Data_Source; A_Line_Count, A_Word_Count : Natural := 0; Counter : array(1 .. 2) of Counters; begin for i in Counter'range loop Counter(i).Harvest(A_Line_Count, A_Word_Count); Lines := Lines+A_Line_Count; Words := Words+A_Word_Count; end loop; T1 := Ada.Calendar.Clock; Ada.Text_IO.Put_Line(Natural'Image(Lines) & Natural'Image(Words) & Natural'Image(Total)); Ada.Text_IO.Put_Line("took" & Duration'Image(T1 - T0)); end Cwt; ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : count words 2005-03-23 21:38 ` tmoran @ 2005-03-25 7:30 ` Simon Wright 2005-03-25 9:38 ` tmoran 0 siblings, 1 reply; 122+ messages in thread From: Simon Wright @ 2005-03-25 7:30 UTC (permalink / raw) tmoran@acm.org writes: > Here's a simple multitasking conversion. I would appreciate it if someone > with a dual core or hyperthreaded CPU tried it out. Probably increasing > the buffer size, by lessening task interactions, would speed it up. With > multiple CPUs, the array > Counter : array(1 .. 2) of Counters; > might be increased to more tasks, also. This is a dual-processor 400 MHz Celeron running Mandrake Linux 8.2 and GNAT 5.02a1. On my ~/RMAIL file (must trim out some of that old mail!) the settled values were: with one task: 417604 1638075 19453698 took 1.030899000 with 2 tasks: 417604 1638075 19453698 took 0.666758000 with 3 tasks: 417604 1638075 19453698 took 0.637034000 -- Simon Wright 100% Ada, no bugs. ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : count words 2005-03-25 7:30 ` Simon Wright @ 2005-03-25 9:38 ` tmoran 2005-03-25 16:20 ` Tapio Kelloniemi 0 siblings, 1 reply; 122+ messages in thread From: tmoran @ 2005-03-25 9:38 UTC (permalink / raw) > with one task: > > 417604 1638075 19453698 > took 1.030899000 > > with 2 tasks: > > 417604 1638075 19453698 > took 0.666758000 > > with 3 tasks: > > 417604 1638075 19453698 > took 0.637034000 Excellent! By my calculations then, Dmitry's program (with Ada.Streams.Read substituted for Gnat.OS.Lib) should run on your machine in 0.746 seconds and the original C should take 0.883 seconds. So Ada is about 40% faster than C on the word count shootout. It's too late at night for me to make Dmitry's multitasking right now, but that would be simple and ought to be nearly 80% faster than C. Those would be nice numbers to post to the shootout page. ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : count words 2005-03-25 9:38 ` tmoran @ 2005-03-25 16:20 ` Tapio Kelloniemi 2005-03-25 22:18 ` Ada:The High Performance Language for Hyperthreaded and Multicore CPUs; was " tmoran 0 siblings, 1 reply; 122+ messages in thread From: Tapio Kelloniemi @ 2005-03-25 16:20 UTC (permalink / raw) tmoran@acm.org wrote: >> with one task: >> >> 417604 1638075 19453698 >> took 1.030899000 >> >> with 2 tasks: >> >> 417604 1638075 19453698 >> took 0.666758000 >> >> with 3 tasks: >> >> 417604 1638075 19453698 >> took 0.637034000 > Excellent! By my calculations then, Dmitry's program (with >Ada.Streams.Read substituted for Gnat.OS.Lib) should run on your machine >in 0.746 seconds and the original C should take 0.883 seconds. So Ada is >about 40% faster than C on the word count shootout. It's too late at >night for me to make Dmitry's multitasking right now, but that would be >simple and ought to be nearly 80% faster than C. Those would be nice >numbers to post to the shootout page. Except that shootout benchmarks are run on a single CPU. I don't know whether it is hyperthreaded or not. -- Tapio ^ permalink raw reply [flat|nested] 122+ messages in thread
* Ada:The High Performance Language for Hyperthreaded and Multicore CPUs; was Re: Ada bench : count words 2005-03-25 16:20 ` Tapio Kelloniemi @ 2005-03-25 22:18 ` tmoran 0 siblings, 0 replies; 122+ messages in thread From: tmoran @ 2005-03-25 22:18 UTC (permalink / raw) > > > > Here's a simple multitasking conversion. > > > This is a dual-processor 400 MHz Celeron running Mandrake Linux 8.2 > > > and GNAT 5.02a1. On my ~/RMAIL file (must trim out some of that > > > old mail!) the settled values were: > > > > > > with one task: > > > > > > 417604 1638075 19453698 > > > took 1.030899000 > > > > > > with 2 tasks: > > > > > > 417604 1638075 19453698 > > > took 0.666758000 > >multitasking ... ought to be nearly 80% faster than C. Those would be nice > >numbers to post to the shootout page. > > Except that shootout benchmarks are run on a single CPU. ^^^ have been Most of the shootout examples could be run on a 16 bit DOS system. If "16 bit DOS programs" was a requirement even fewer people would pay any attention to the shootout site(s). Hyperthreading and now multicore CPUs are clearly coming, and a language that makes it easy to get benefits from them is to be preferred to a language that doesn't do that. ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : count words 2005-03-23 19:00 ` tmoran 2005-03-23 19:30 ` tmoran @ 2005-03-23 19:54 ` Tapio Kelloniemi 2005-03-23 20:39 ` Ada bench : word frequency Marius Amado Alves 2005-03-23 21:38 ` Ada bench : count words tmoran 2005-03-24 20:19 ` tmoran 2 siblings, 2 replies; 122+ messages in thread From: Tapio Kelloniemi @ 2005-03-23 19:54 UTC (permalink / raw) tmoran@acm.org wrote: > In my (minimal) trial, Dmitry A. Kazakov's >(Message-ID: <1j5jfikipztuc.jx5n2zhbg2np$.dlg@40tude.net>) >appears still faster. I don't understand why, unless Gnat.OS.Lib is >significantly faster than Gnat's Ada.Streams.Read, which seems unlikely. Remember that GNAT.OS_Lib just calls read(2). Stream_IO's read performs additional checking and calculations. Also File_Type is a tagged type, but I'm not sure how much overhead it causes, IIRC it is a controlled type. Also Stream_IO (and all other file IO in GNAT) are based on C streams which introduces yet some more complexity and buffering. Perhaps replacing Dmitry-version's Is_Control_Character function with the array scheme would result in fastest operation (provided that -gnatp is used, as it should be). -- Tapio ^ permalink raw reply [flat|nested] 122+ messages in thread
* Ada bench : word frequency 2005-03-23 19:54 ` Tapio Kelloniemi @ 2005-03-23 20:39 ` Marius Amado Alves 2005-03-23 21:26 ` Isaac Gouy 2005-04-28 4:04 ` Matthew Heaney 2005-03-23 21:38 ` Ada bench : count words tmoran 1 sibling, 2 replies; 122+ messages in thread From: Marius Amado Alves @ 2005-03-23 20:39 UTC (permalink / raw) To: comp.lang.ada Here's a shot at the word frequency benchmark. By my calculations, it is as fast as the GCC C benchmark. I did not compare the output with the reference output. A line-by-line comparison requires a criterion for words with the same frequency, which AFAICT is not documented. Also, my current concept of word separator does not include punctuation marks. Again, I could not find a reference definition. The program does not use any external data structures library. It includes its own specific structures, an n-ary tree keyed by word form, and a binary tree keyed by word frequency. It increments the counts in the n-ary tree, then traverses the n-ary tree to populate the binary tree (to order by count), then dumps the latter. with Ada.Characters.Handling; with Ada.Characters.Latin_1; with Ada.Command_Line; with Ada.Streams; with Ada.Streams.Stream_IO; with Ada.Strings.Fixed; with Ada.Text_IO; with Ada.Text_IO.Text_Streams; procedure Word_Frequency is use Ada.Characters.Handling; use Ada.Characters.Latin_1; use Ada.Command_Line; use Ada.Streams; use Ada.Streams.Stream_IO; use Ada.Text_IO; use Ada.Text_IO.Text_Streams; Buffer : Stream_Element_Array (1 .. 4096); Input_Stream : Ada.Text_IO.Text_Streams.Stream_Access := Ada.Text_IO.Text_Streams.Stream (Current_Input); N : Stream_Element_Offset; Is_Separator : array (Stream_Element) of Boolean := (0 .. 32 | 127 .. 159 => True, others => False); -- N-ary tree of word counts -- used to increment the counts in one pass of the input file -- branches on the letter -- carries the count -- very fast -- but very space consuming subtype Letter is Stream_Element range 0 .. 255; type Word is array (Positive range <>) of Letter; type Tree; type Tree_Ptr is access Tree; type Node is record Count : Natural := 0; Subtree : Tree_Ptr := null; end record; type Tree is array (Letter) of Node; procedure Inc (X : in out Integer) is begin X := X + 1; end; procedure Dec (X : in out Integer) is begin X := X - 1; end; procedure Inc_Word (Parent : Tree_Ptr; Descendents : Word) is begin if Descendents'Length > 0 then declare Child_Index : Positive := Descendents'First; Child : Letter renames Descendents (Child_Index); begin if Descendents'Length = 1 then Inc (Parent (Child).Count); else if Parent (Child).Subtree = null then Parent (Child).Subtree := new Tree; end if; Inc_Word (Parent (Child).Subtree, Descendents (Child_Index + 1 .. Descendents'Last)); end if; end; end if; end; -- Binary tree of word counts -- used for sorting the result by the count (frequency) -- branches on the word count -- carries the word form type Form_Ptr is access Word; type Binary_Tree; type Binary_Tree_Ptr is access Binary_Tree; type Binary_Tree is record Form : Form_Ptr; Count : Natural; Left, Right : Binary_Tree_Ptr; end record; procedure Add_Node (Parent : in out Binary_Tree_Ptr; Form : Form_Ptr; Count : Natural) is begin if Parent = null then Parent := new Binary_Tree; Parent.Form := Form; Parent.Count := Count; else if Count < Parent.Count then Add_Node (Parent.Left, Form, Count); else Add_Node (Parent.Right, Form, Count); end if; end if; end; -- end of binary tree primitives Root : Tree_Ptr := new Tree; Btree : Binary_Tree_Ptr := null; Current_Word : Word (1 .. 1000); Current_Word_Length : Natural range 0 .. Current_Word'Last := 0; In_Word : Boolean := False; procedure Append_To_Word (E : Stream_Element) is begin Inc (Current_Word_Length); Current_Word (Current_Word_Length) := E; In_Word := True; end; procedure End_Word is begin if Current_Word_Length > 0 then Inc_Word (Root, Current_Word (1 .. Current_Word_Length)); end if; Current_Word_Length := 0; In_Word := False; end; procedure Process (S : Stream_Element_Array) is begin for I in S'Range loop if Is_Separator (S (I)) then if In_Word then End_Word; end if; else Append_To_Word (S (I)); end if; end loop; end; procedure Populate_Btree (Ntree : Tree_Ptr) is begin Inc (Current_Word_Length); for I in Letter'Range loop Current_Word (Current_Word_Length) := I; if Ntree (I).Count > 0 then Add_Node (Btree, Form => new Word'(Current_Word (1 .. Current_Word_Length)), Count => Ntree (I).Count); end if; if Ntree (I).Subtree /= null then Populate_Btree (Ntree (I).Subtree); end if; end loop; Dec (Current_Word_Length); end; procedure Populate_Btree is begin Current_Word_Length := 0; Populate_Btree (Root); end; function To_String (X : Form_Ptr) return String is S : String (X'Range); begin for I in X'Range loop S (I) := Character'Val (X (I)); end loop; return S; end; procedure Dump_Btree (X : Binary_Tree_Ptr := Btree) is begin if X /= null then Dump_Btree (X.Right); Ada.Text_IO.Put_Line (To_String (X.Form) & Natural'Image (X.Count)); Dump_Btree (X.Left); end if; end; begin loop Read (Root_Stream_Type'Class (Input_Stream.all), Buffer, N); Process (Buffer (1 .. N)); exit when N < Buffer'Length; end loop; if In_Word then End_Word; end if; Populate_Btree; Dump_Btree; end; ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : word frequency 2005-03-23 20:39 ` Ada bench : word frequency Marius Amado Alves @ 2005-03-23 21:26 ` Isaac Gouy 2005-03-24 1:24 ` Marius Amado Alves 2005-04-28 4:04 ` Matthew Heaney 1 sibling, 1 reply; 122+ messages in thread From: Isaac Gouy @ 2005-03-23 21:26 UTC (permalink / raw) Marius Amado Alves wrote: > Here's a shot at the word frequency benchmark. By my calculations, it > is as fast as the GCC C benchmark. I did not compare the output with > the reference output. A line-by-line comparison requires a criterion > for words with the same frequency, which AFAICT is not documented. "print each word and word frequency, in descending order by frequency and descending alphabetic order by word" http://shootout.alioth.debian.org/benchmark.php?test=wordfreq&lang=all&sort=fullcpu > Also, my current concept of word separator does not include punctuation > marks. Again, I could not find a reference definition. Hmmm and yet there are 20 programs accepted as correct. ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : word frequency 2005-03-23 21:26 ` Isaac Gouy @ 2005-03-24 1:24 ` Marius Amado Alves 2005-03-24 17:23 ` Isaac Gouy ` (3 more replies) 0 siblings, 4 replies; 122+ messages in thread From: Marius Amado Alves @ 2005-03-24 1:24 UTC (permalink / raw) To: comp.lang.ada Program fixed. Zero differences with the reference result. It's two times slower than the GCC C benchmark :-( (CPU times (user+sys) on my iBook for n=25 repetitions of the input file: C => 0.75, Ada => 1.45) However that's already enough to put Ada on the 7th place, after OCaml and before Eiffel :-) Program follows. with Ada.Characters.Handling; with Ada.Characters.Latin_1; with Ada.Command_Line; with Ada.Streams; with Ada.Streams.Stream_IO; with Ada.Strings.Fixed; with Ada.Text_IO; with Ada.Text_IO.Text_Streams; procedure Word_Frequency is use Ada.Characters.Handling; use Ada.Characters.Latin_1; use Ada.Command_Line; use Ada.Streams; use Ada.Streams.Stream_IO; use Ada.Text_IO; use Ada.Text_IO.Text_Streams; Buffer : Stream_Element_Array (1 .. 4096); Input_Stream : Ada.Text_IO.Text_Streams.Stream_Access := Ada.Text_IO.Text_Streams.Stream (Current_Input); N : Stream_Element_Offset; Is_Separator : array (Stream_Element) of Boolean := (Character'Pos ('A') .. Character'Pos ('Z') | Character'Pos ('a') .. Character'Pos ('z') => False, others => True); -- N-ary tree of word counts -- used to increment the counts in one pass of the input file -- branches on the letter -- carries the count -- very fast -- but very space consuming subtype Letter is Stream_Element range Character'Pos ('a') .. Character'Pos ('z'); type Word is array (Positive range <>) of Letter; type Tree; type Tree_Ptr is access Tree; type Node is record Count : Natural := 0; Subtree : Tree_Ptr := null; end record; type Tree is array (Letter) of Node; procedure Inc (X : in out Integer) is begin X := X + 1; end; procedure Dec (X : in out Integer) is begin X := X - 1; end; procedure Inc_Word (Parent : Tree_Ptr; Descendents : Word) is begin if Descendents'Length > 0 then declare Child_Index : Positive := Descendents'First; Child : Letter renames Descendents (Child_Index); begin if Descendents'Length = 1 then Inc (Parent (Child).Count); else if Parent (Child).Subtree = null then Parent (Child).Subtree := new Tree; end if; Inc_Word (Parent (Child).Subtree, Descendents (Child_Index + 1 .. Descendents'Last)); end if; end; end if; end; -- Binary tree of word counts -- used for sorting the result by the count (frequency) -- branches on the word count -- carries the word form type Form_Ptr is access Word; type Binary_Tree; type Binary_Tree_Ptr is access Binary_Tree; type Binary_Tree is record Form : Form_Ptr; Count : Natural; Left, Right : Binary_Tree_Ptr; end record; procedure Add_Node (Parent : in out Binary_Tree_Ptr; Form : Form_Ptr; Count : Natural) is begin if Parent = null then Parent := new Binary_Tree; Parent.Form := Form; Parent.Count := Count; else if Count < Parent.Count then Add_Node (Parent.Left, Form, Count); else Add_Node (Parent.Right, Form, Count); end if; end if; end; -- end of binary tree primitives Root : Tree_Ptr := new Tree; Btree : Binary_Tree_Ptr := null; Current_Word : Word (1 .. 1000); Current_Word_Length : Natural range 0 .. Current_Word'Last := 0; In_Word : Boolean := False; procedure Append_To_Word (E : Letter) is begin Inc (Current_Word_Length); Current_Word (Current_Word_Length) := E; In_Word := True; end; procedure End_Word is begin if Current_Word_Length > 0 then Inc_Word (Root, Current_Word (1 .. Current_Word_Length)); end if; Current_Word_Length := 0; In_Word := False; end; To_Lower : array (Stream_Element) of Letter; procedure Initialise_To_Lower_Map is D : Integer := Character'Pos ('a') - Character'Pos ('A'); begin for I in Character'Pos ('a') .. Character'Pos ('z') loop To_Lower (Stream_Element (I)) := Letter (I); To_Lower (Stream_Element (I - D)) := Letter (I); end loop; end; procedure Process (S : Stream_Element_Array) is begin for I in S'Range loop if Is_Separator (S (I)) then if In_Word then End_Word; end if; else Append_To_Word (To_Lower (S (I))); end if; end loop; end; procedure Populate_Btree (Ntree : Tree_Ptr) is begin Inc (Current_Word_Length); for I in Letter'Range loop Current_Word (Current_Word_Length) := I; if Ntree (I).Count > 0 then Add_Node (Btree, Form => new Word'(Current_Word (1 .. Current_Word_Length)), Count => Ntree (I).Count); end if; if Ntree (I).Subtree /= null then Populate_Btree (Ntree (I).Subtree); end if; end loop; Dec (Current_Word_Length); end; procedure Populate_Btree is begin Current_Word_Length := 0; Populate_Btree (Root); end; function To_String (X : Form_Ptr) return String is S : String (X'Range); begin for I in X'Range loop S (I) := Character'Val (X (I)); end loop; return S; end; subtype String7 is String (1 .. 7); function Img7 (X : Natural) return String7 is S : String := Natural'Image (X); begin return String' (1 .. 8 - S'Length => ' ') & S (2 .. S'Last); end; procedure Dump_Btree (X : Binary_Tree_Ptr := Btree) is begin if X /= null then Dump_Btree (X.Right); Ada.Text_IO.Put_Line (Img7 (X.Count) & " " & To_String (X.Form)); Dump_Btree (X.Left); end if; end; begin Initialise_To_Lower_Map; loop Read (Root_Stream_Type'Class (Input_Stream.all), Buffer, N); Process (Buffer (1 .. N)); exit when N < Buffer'Length; end loop; if In_Word then End_Word; end if; Populate_Btree; Dump_Btree; end; ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : word frequency 2005-03-24 1:24 ` Marius Amado Alves @ 2005-03-24 17:23 ` Isaac Gouy 2005-03-24 19:52 ` Martin Dowie ` (2 subsequent siblings) 3 siblings, 0 replies; 122+ messages in thread From: Isaac Gouy @ 2005-03-24 17:23 UTC (permalink / raw) And thanks for sending it to the Shootout! (It'll will appear on the website in the next week or so) Marius Amado Alves wrote: > Program fixed. Zero differences with the reference result. > > It's two times slower than the GCC C benchmark :-( ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : word frequency 2005-03-24 1:24 ` Marius Amado Alves 2005-03-24 17:23 ` Isaac Gouy @ 2005-03-24 19:52 ` Martin Dowie 2005-04-11 15:11 ` Marius Amado Alves 2005-03-24 20:16 ` Pascal Obry 2005-03-24 21:18 ` Gautier Write-only 3 siblings, 1 reply; 122+ messages in thread From: Martin Dowie @ 2005-03-24 19:52 UTC (permalink / raw) Marius Amado Alves wrote: > function To_String (X : Form_Ptr) return String is > S : String (X'Range); > begin > for I in X'Range loop > S (I) := Character'Val (X (I)); > end loop; > return S; > end; For faster code isn't this a case where an overlay is safe?... Cheers -- Martin ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : word frequency 2005-03-24 19:52 ` Martin Dowie @ 2005-04-11 15:11 ` Marius Amado Alves 0 siblings, 0 replies; 122+ messages in thread From: Marius Amado Alves @ 2005-04-11 15:11 UTC (permalink / raw) To: comp.lang.ada On 24 Mar 2005, at 19:52, Martin Dowie wrote: > Marius Amado Alves wrote: >> function To_String (X : Form_Ptr) return String is >> S : String (X'Range); >> begin >> for I in X'Range loop >> S (I) := Character'Val (X (I)); >> end loop; >> return S; >> end; > > For faster code isn't this a case where an overlay is safe?... Maybe. I never remember if the RM guarantees that the overlay (unchecked conversion?) works here. Because this function is used only at the end to output the results I didn't care. But someone should test it ;-) ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : word frequency 2005-03-24 1:24 ` Marius Amado Alves 2005-03-24 17:23 ` Isaac Gouy 2005-03-24 19:52 ` Martin Dowie @ 2005-03-24 20:16 ` Pascal Obry 2005-03-24 22:54 ` tmoran 2005-03-24 21:18 ` Gautier Write-only 3 siblings, 1 reply; 122+ messages in thread From: Pascal Obry @ 2005-03-24 20:16 UTC (permalink / raw) Marius Amado Alves <amado.alves@netcabo.pt> writes: > function To_String (X : Form_Ptr) return String is Returning unconstrained object in a function cost a lot. To have comparable timing on the C and Ada side we need comaprable code. Pascal. -- --|------------------------------------------------------ --| Pascal Obry Team-Ada Member --| 45, rue Gabriel Peri - 78114 Magny Les Hameaux FRANCE --|------------------------------------------------------ --| http://www.obry.org --| "The best way to travel is by means of imagination" --| --| gpg --keyserver wwwkeys.pgp.net --recv-key C1082595 ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : word frequency 2005-03-24 20:16 ` Pascal Obry @ 2005-03-24 22:54 ` tmoran 2005-04-11 15:38 ` Marius Amado Alves 0 siblings, 1 reply; 122+ messages in thread From: tmoran @ 2005-03-24 22:54 UTC (permalink / raw) > > function To_String (X : Form_Ptr) return String is > > Returning unconstrained object in a function cost a lot. True, but here To_String is only called once/Put_Line. How about eliminating the recursive call to Inc_Word for every character? ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : word frequency 2005-03-24 22:54 ` tmoran @ 2005-04-11 15:38 ` Marius Amado Alves 0 siblings, 0 replies; 122+ messages in thread From: Marius Amado Alves @ 2005-04-11 15:38 UTC (permalink / raw) To: comp.lang.ada On 24 Mar 2005, at 22:54, tmoran@acm.org wrote: > How about eliminating the recursive call to Inc_Word for every > character? I made this structure recursive because words are small (circa 10 characters). I don't believe 10 nested calls perform poorer than a non-recursive algorithm most certainly requiring additional bookkeeping. ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : word frequency 2005-03-24 1:24 ` Marius Amado Alves ` (2 preceding siblings ...) 2005-03-24 20:16 ` Pascal Obry @ 2005-03-24 21:18 ` Gautier Write-only 2005-04-11 15:32 ` Marius Amado Alves 3 siblings, 1 reply; 122+ messages in thread From: Gautier Write-only @ 2005-03-24 21:18 UTC (permalink / raw) Marius Amado Alves: > Program fixed. Zero differences with the reference result. Great! > It's two times slower than the GCC C benchmark :-( > > (CPU times (user+sys) on my iBook for n=25 repetitions of the input > file: C => 0.75, Ada => 1.45) > > However that's already enough to put Ada on the 7th place, after OCaml > and before Eiffel :-) I hope you compiled with at least -O3 or -finline-(something). Otherwise it won't inline even the Inc & Dec - then, see the horrendous result in the assembler code produced... I would try to inline Append_To_Word, End_Word also (needs pragma Inline). The speed problem is that the text is so "small" (hum, or the computers so fast ?... ) that the loading & initialization times may overshadow the time of the algorithm itself. I'd suggest a bigger test text (the matrices in the matrix test are crazily small too - and/or the job to do is too simple). BTW, maybe the several unused packages have costly elaborations. ______________________________________________________________ Gautier -- http://www.mysunrise.ch/users/gdm/index.htm Ada programming -- http://www.mysunrise.ch/users/gdm/gsoft.htm NB: For a direct answer, e-mail address on the Web site! ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : word frequency 2005-03-24 21:18 ` Gautier Write-only @ 2005-04-11 15:32 ` Marius Amado Alves 2005-04-11 23:56 ` Robert A Duff 0 siblings, 1 reply; 122+ messages in thread From: Marius Amado Alves @ 2005-04-11 15:32 UTC (permalink / raw) To: comp.lang.ada > I hope you compiled with at least -O3 or -finline-(something). It's compiled with -O3. > I would try to inline Append_To_Word, End_Word also (needs pragma > Inline). Done. (I have only not inlined the recursive functions, fearing a compilation result of infinite size...) ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : word frequency 2005-04-11 15:32 ` Marius Amado Alves @ 2005-04-11 23:56 ` Robert A Duff 0 siblings, 0 replies; 122+ messages in thread From: Robert A Duff @ 2005-04-11 23:56 UTC (permalink / raw) Marius Amado Alves <amado.alves@netcabo.pt> writes: > Done. (I have only not inlined the recursive functions, fearing a > compilation result of infinite size...) There's nothing wrong with putting pragma Inline on a recursive procedure. The compiler can inline some number of levels, or not inline at all, but it won't generate an infinite-sized amount of machine code, nor take an infinite amount of time to generate that code. I've seen a compiler where: Put(Factorial(7)); was optimized to: Put(5040); where Factorial is the usual recursive definition of the factorial function. It inlined 7 levels, and then did some constant propagation, thus figuring out the answer at compile time. - Bob ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : word frequency 2005-03-23 20:39 ` Ada bench : word frequency Marius Amado Alves 2005-03-23 21:26 ` Isaac Gouy @ 2005-04-28 4:04 ` Matthew Heaney 2005-04-28 20:40 ` Matthew Heaney 1 sibling, 1 reply; 122+ messages in thread From: Matthew Heaney @ 2005-04-28 4:04 UTC (permalink / raw) Marius Amado Alves <amado.alves@netcabo.pt> writes: > Here's a shot at the word frequency benchmark... > > The program does not use any external data structures library. I have updated the ai302/wordfreq example so that it's now in closer harmony with what's specified at the shootout page: <http://charles.tigris.org/source/browse/charles/src/ai302/examples/wordfreq/> The first version is implemented using a hashed map. In order to demonstrate features of the Ada 2005 container library, I have also provided a second version, implemented using a hashed set: <http://charles.tigris.org/source/browse/charles/src/ai302/examples/wordfreq2/> I was mostly interested in seeing how easily I could solve the wordfreq problem using the standard container library, so I haven't attempted any kind of optimization. But I assume both versions will have respectable performance, compared to other entries in the shootout. Regards, Matt ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : word frequency 2005-04-28 4:04 ` Matthew Heaney @ 2005-04-28 20:40 ` Matthew Heaney 0 siblings, 0 replies; 122+ messages in thread From: Matthew Heaney @ 2005-04-28 20:40 UTC (permalink / raw) "Matthew Heaney" <matthewjheaney@earthlink.net> wrote in message news:uhdhry2bw.fsf@earthlink.net... > > I have updated the ai302/wordfreq example so that it's now in closer > harmony with what's specified at the shootout page... I have just added another wordfreq example. This version is implemented using a vector, that I keep sorted: <http://charles.tigris.org/source/browse/charles/src/ai302/examples/wordfreq3/> After populating the vector in word order (which allows me to use a binary search to perform word lookups), I then re-sort the vector in frequency order, and print the results. Regards, Matt ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : count words 2005-03-23 19:54 ` Tapio Kelloniemi 2005-03-23 20:39 ` Ada bench : word frequency Marius Amado Alves @ 2005-03-23 21:38 ` tmoran 1 sibling, 0 replies; 122+ messages in thread From: tmoran @ 2005-03-23 21:38 UTC (permalink / raw) > Perhaps replacing Dmitry-version's Is_Control_Character function with > the array scheme would result in fastest operation (provided that > -gnatp is used, as it should be). I tried a completely table-lookup version but it was slower. OTOH, that was without -gnatp. ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : count words 2005-03-23 19:00 ` tmoran 2005-03-23 19:30 ` tmoran 2005-03-23 19:54 ` Tapio Kelloniemi @ 2005-03-24 20:19 ` tmoran 2005-03-24 21:00 ` Pascal Obry 2 siblings, 1 reply; 122+ messages in thread From: tmoran @ 2005-03-24 20:19 UTC (permalink / raw) > In my (minimal) trial, Dmitry A. Kazakov's >(Message-ID: <1j5jfikipztuc.jx5n2zhbg2np$.dlg@40tude.net>) >appears still faster. I don't understand why, unless Gnat.OS.Lib is >significantly faster than Gnat's Ada.Streams.Read, which seems unlikely. I changed Dmitry's program to use Ada.Streams.Read instead of Gnat.OS.Lib, compiled both with -gnato -gnatp -O3, and the portable version appears notably faster. ie, Gnat.OS.Lib.Read is slower than Ada.Streams.Read. Apparently that's because it spends time deleting CRs from the data. It's more efficient here to read the whole block with Ada.Streams.Read and then add a "when CR" branch to the case statements. ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : count words 2005-03-24 20:19 ` tmoran @ 2005-03-24 21:00 ` Pascal Obry 2005-03-24 22:54 ` tmoran 0 siblings, 1 reply; 122+ messages in thread From: Pascal Obry @ 2005-03-24 21:00 UTC (permalink / raw) tmoran@acm.org writes: > I changed Dmitry's program to use Ada.Streams.Read instead of > Gnat.OS.Lib, compiled both with -gnato -gnatp -O3, and the portable Why compile using -gnato ? Pascal. -- --|------------------------------------------------------ --| Pascal Obry Team-Ada Member --| 45, rue Gabriel Peri - 78114 Magny Les Hameaux FRANCE --|------------------------------------------------------ --| http://www.obry.org --| "The best way to travel is by means of imagination" --| --| gpg --keyserver wwwkeys.pgp.net --recv-key C1082595 ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : count words 2005-03-24 21:00 ` Pascal Obry @ 2005-03-24 22:54 ` tmoran 0 siblings, 0 replies; 122+ messages in thread From: tmoran @ 2005-03-24 22:54 UTC (permalink / raw) > > I changed Dmitry's program to use Ada.Streams.Read instead of > > Gnat.OS.Lib, compiled both with -gnato -gnatp -O3, and the portable > > Why compile using -gnato ? Habit. It makes Gnat do the required Ada overflow checks. But I admit it's silly to do -gnato and -gnatp together. ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : count words 2005-03-23 15:09 ` Marius Amado Alves 2005-03-23 19:00 ` tmoran @ 2005-03-30 16:08 ` Andre 2005-03-30 16:36 ` Pascal Obry 1 sibling, 1 reply; 122+ messages in thread From: Andre @ 2005-03-30 16:08 UTC (permalink / raw) Marius Amado Alves wrote: >> I'll review this tomorrow on the bus to work (I think your program is >> separating words at buffer end, and it should not). > > > Done. Tmoran, sorry, my first sight was wrong, your algorithm is mostly > fine. Minor reservations: > - special statute given to CR; personally I think all characters > (control or not) should count to total > - reliance on Stream_Element representing one character; portable across > all range of environments? > > My own program rewritten with Text_Streams attains the same speed as > yours. The fact that it is structured doesn't hit performance > significantly. Pragma Inline improves a little bit. Real times (ms) on > my iBook, for input file repeated 2500 times, all compiled with only -O3: > C ........................ 600 > Yours, or mine inlined ... 700 > Mine not inlined ......... 750 > > So we should submit yours, or mine inlined. It will put Ada right after > the Cs w.r.t. speed. W.r.t. executable file size Ada is much bigger. On > my iBook: > C ....... 15k > Mine .... 423k > Yours ... 459k > > W.r.t. source code size all are similar. Number of significant > semicolons (Ada), or semicolons + {} blocks + #includes + #defines (C): > C .............. 27 > Yours .......... 33 > Mine inlined ... 52 > > My program follows for reference. It accepts the code for EOL as an > argument. Default = 10 (LF). > > -- The Great Computer Language Shootout > -- http://shootout.alioth.debian.org/ > -- > -- contributed by Guys De Cla > > with Ada.Characters.Handling; > with Ada.Characters.Latin_1; > with Ada.Command_Line; > with Ada.Streams; > with Ada.Streams.Stream_IO; > with Ada.Strings.Fixed; > with Ada.Text_IO; > with Ada.Text_IO.Text_Streams; > > procedure Count_Words_Portable is > > use Ada.Characters.Handling; > use Ada.Characters.Latin_1; > use Ada.Command_Line; > use Ada.Streams; > use Ada.Streams.Stream_IO; > use Ada.Text_IO; > use Ada.Text_IO.Text_Streams; > > Buffer : Stream_Element_Array (1 .. 4096); > Input_Stream : Ada.Text_IO.Text_Streams.Stream_Access > := Ada.Text_IO.Text_Streams.Stream (Current_Input); > EOL_Character_Pos : Stream_Element := Character'Pos (LF); > Lines : Natural := 0; > Words : Natural := 0; > Total : Natural := 0; > In_Word : Boolean := False; > N : Stream_Element_Offset; > Is_Separator : array (Stream_Element) of Boolean := > (0 .. 32 | 127 .. 159 => True, others => False); > > procedure Begin_Word is > begin > Words := Words + 1; > In_Word := True; > end; > > procedure End_Word is > begin > In_Word := False; > end; > > procedure End_Line is > begin > Lines := Lines + 1; > End_Word; > end; > > procedure Count_Words (S : in Stream_Element_Array) is > begin > Total := Total + S'Length; > for I in S'Range loop > if S (I) = EOL_Character_Pos then > End_Line; > else > if Is_Separator (S (I)) then > if In_Word then End_Word; end if; > else > if not In_Word then Begin_Word; end if; > end if; > end if; > end loop; > end; > > pragma Inline (Begin_Word, End_Word, End_Line, Count_Words); > > begin > begin > EOL_Character_Pos := Stream_Element'Value (Argument (1)); > exception > when Constraint_Error => null; > end; > Ada.Text_IO.Put_Line ("EOL =>" & Stream_Element'Image > (EOL_Character_Pos)); > > loop > Read (Root_Stream_Type'Class (Input_Stream.all), Buffer, N); > Count_Words (Buffer (1 .. N)); > exit when N < Buffer'Length; > end loop; > > Ada.Text_IO.Put_Line > (Natural'Image (Lines) & > Natural'Image (Words) & > Natural'Image (Total)); > end; > I checked with the Shootout side. The Ada program sent in was rated with Error. So, maybe you can check why and correct it. Andr� ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : count words 2005-03-30 16:08 ` Andre @ 2005-03-30 16:36 ` Pascal Obry 0 siblings, 0 replies; 122+ messages in thread From: Pascal Obry @ 2005-03-30 16:36 UTC (permalink / raw) Andre <avsaway@hotmail.com> writes: > I checked with the Shootout side. The Ada program sent in was rated with > Error. So, maybe you can check why and correct it. It is now fixed. Pascal. -- --|------------------------------------------------------ --| Pascal Obry Team-Ada Member --| 45, rue Gabriel Peri - 78114 Magny Les Hameaux FRANCE --|------------------------------------------------------ --| http://www.obry.org --| "The best way to travel is by means of imagination" --| --| gpg --keyserver wwwkeys.pgp.net --recv-key C1082595 ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : count words 2005-03-22 1:16 ` Ada bench : count words Marius Amado Alves 2005-03-22 10:59 ` Dmitry A. Kazakov 2005-03-22 19:49 ` tmoran @ 2005-03-22 22:27 ` Dmitry A. Kazakov 2005-03-23 7:46 ` Pascal Obry 2 siblings, 1 reply; 122+ messages in thread From: Dmitry A. Kazakov @ 2005-03-22 22:27 UTC (permalink / raw) On Tue, 22 Mar 2005 01:16:09 +0000, Marius Amado Alves wrote: Here is a FSM/OS_Lib variant, a shameless one, I must admit. Anyway it narrowly beats: http://dada.perl.it/shootout/wc.gcc.html on my FC3 machine. But of course it has no chance against: http://shootout.alioth.debian.org/great/benchmark.php?test=wc&lang=gcc&id=0&sort=fullcpu, which uses even dirtier tricks than me! (:-)) ---------------------------------------------------- with Ada.Characters.Latin_1; use Ada.Characters.Latin_1; with Ada.Text_IO; use Ada.Text_IO; with GNAT.OS_Lib; use GNAT.OS_Lib; procedure Ada_WC is subtype Buffer_Index is Integer range 1..4099; type File_Buffer is array (Buffer_Index) of aliased Character; Buffer : File_Buffer; Index : Buffer_Index := Buffer_Index'First; Last : Integer := Buffer_Index'First; Lines : Natural := 0; Words : Natural := 0; Total : Natural := 0; begin <<Blank>> if Index < Last then -- This "if" should be a subprogram, Index := Index + 1; -- but I don't trust GNAT! else Last := Read (Standin, Buffer (1)'Address, 4098); -- 4K! to read if Last = 0 then goto Done; end if; Total := Total + Last; Index := Buffer_Index'First; end if; case Buffer (Index) is when ' ' | HT => goto Blank; when LF => Lines := Lines + 1; goto Blank; when others => Words := Words + 1; end case; <<Word>> if Index < Last then Index := Index + 1; else Last := Read (Standin, Buffer (1)'Address, 4098); if Last = 0 then goto Done; end if; Total := Total + Last; Index := Buffer_Index'First; end if; case Buffer (Index) is when ' ' | HT => goto Blank; when LF => Lines := Lines + 1; goto Blank; when others => goto Word; end case; <<Done>> Put_Line ( Natural'Image (Lines) & Natural'Image (Words) & Natural'Image (Total) ); end Ada_WC; -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : count words 2005-03-22 22:27 ` Dmitry A. Kazakov @ 2005-03-23 7:46 ` Pascal Obry 2005-03-23 7:56 ` Dmitry A. Kazakov 0 siblings, 1 reply; 122+ messages in thread From: Pascal Obry @ 2005-03-23 7:46 UTC (permalink / raw) "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes: > procedure Ada_WC is > > subtype Buffer_Index is Integer range 1..4099; ^^^^ 4097 ... > Index := Index + 1; -- but I don't trust GNAT! > else > Last := Read (Standin, Buffer (1)'Address, 4098); -- 4K! to read ^^^^ 4096 ... > <<Word>> > if Index < Last then > Index := Index + 1; > else > Last := Read (Standin, Buffer (1)'Address, 4098); ^^^^ 4096 Pascal. -- --|------------------------------------------------------ --| Pascal Obry Team-Ada Member --| 45, rue Gabriel Peri - 78114 Magny Les Hameaux FRANCE --|------------------------------------------------------ --| http://www.obry.org --| "The best way to travel is by means of imagination" --| --| gpg --keyserver wwwkeys.pgp.net --recv-key C1082595 ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : count words 2005-03-23 7:46 ` Pascal Obry @ 2005-03-23 7:56 ` Dmitry A. Kazakov 2005-03-23 13:38 ` Robert A Duff 0 siblings, 1 reply; 122+ messages in thread From: Dmitry A. Kazakov @ 2005-03-23 7:56 UTC (permalink / raw) On 23 Mar 2005 08:46:11 +0100, Pascal Obry wrote: > "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes: > >> procedure Ada_WC is >> >> subtype Buffer_Index is Integer range 1..4099; > ^^^^ > 4097 > ... > >> Index := Index + 1; -- but I don't trust GNAT! >> else >> Last := Read (Standin, Buffer (1)'Address, 4098); -- 4K! to read > ^^^^ > 4096 > ... > >> <<Word>> >> if Index < Last then >> Index := Index + 1; >> else >> Last := Read (Standin, Buffer (1)'Address, 4098); > ^^^^ > 4096 > Pascal. Oh, sorry, of course. I always forget that powers of 2... -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench : count words 2005-03-23 7:56 ` Dmitry A. Kazakov @ 2005-03-23 13:38 ` Robert A Duff 0 siblings, 0 replies; 122+ messages in thread From: Robert A Duff @ 2005-03-23 13:38 UTC (permalink / raw) "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes: > >> Last := Read (Standin, Buffer (1)'Address, 4098); -- 4K! to read > > ^^^^ > > 4096 > Oh, sorry, of course. I always forget that powers of 2... That's what computers are for: ;-) 4*(2**10) - Bob ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench 2005-03-21 23:27 ` Georg Bauhaus 2005-03-22 1:16 ` Ada bench : count words Marius Amado Alves @ 2005-03-22 7:05 ` Pascal Obry 1 sibling, 0 replies; 122+ messages in thread From: Pascal Obry @ 2005-03-22 7:05 UTC (permalink / raw) Georg Bauhaus <sb463ba@user1.uni-duisburg.de> writes: > I get consistently better results (7% and much more) when placing > the Fib function (as online) in a package. Yes, I know and I have proposed a new version :) Pascal. -- --|------------------------------------------------------ --| Pascal Obry Team-Ada Member --| 45, rue Gabriel Peri - 78114 Magny Les Hameaux FRANCE --|------------------------------------------------------ --| http://www.obry.org --| "The best way to travel is by means of imagination" --| --| gpg --keyserver wwwkeys.pgp.net --recv-key C1082595 ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench 2005-03-19 16:55 ` Dr. Adrian Wrigley 2005-03-19 21:32 ` Michael Bode @ 2005-04-07 20:59 ` David Sauvage 2005-04-07 23:40 ` David Sauvage 2005-04-08 17:11 ` Pascal Obry 1 sibling, 2 replies; 122+ messages in thread From: David Sauvage @ 2005-04-07 20:59 UTC (permalink / raw) > Who can we volunteer to complete the table? > -- > Adrian Hi, I've just finish one implementation on the spellcheck benchmark for Ada 95 GNAT using the AI-302 Reference Implementation. http://shootout.alioth.debian.org/benchmark.php?test=spellcheck&lang=all&sort=fullcpu Before submitting it, i need to use a faster Get_Line than Ada.Text_IO.Get_Line I expect to submit it (if people of The Computer Shootout accept the AI-302 Reference Implementation) in one or two days. For N=1 i'm about 0m0.288s Full CPU time (user 0m0.198s, sys 0m0.005s) (Debian GNU/Linux kernel 2.6.10, 1.2Ghz PC) i'll upload the tarball somewhere on the web soon. -- David ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench 2005-04-07 20:59 ` David Sauvage @ 2005-04-07 23:40 ` David Sauvage 2005-04-08 17:11 ` Pascal Obry 1 sibling, 0 replies; 122+ messages in thread From: David Sauvage @ 2005-04-07 23:40 UTC (permalink / raw) David Sauvage wrote: > > Who can we volunteer to complete the table? > > -- > > Adrian > > Hi, > > I've just finish one implementation on the spellcheck benchmark for Ada > 95 GNAT using the AI-302 Reference Implementation. > http://shootout.alioth.debian.org/benchmark.php?test=spellcheck&lang=all&sort=fullcpu > > Before submitting it, i need to use a faster Get_Line than > Ada.Text_IO.Get_Line > > I expect to submit it (if people of The Computer Shootout accept the > AI-302 Reference Implementation) in one or two days. > > For N=1 i'm about 0m0.288s Full CPU time (user 0m0.198s, sys 0m0.005s) > (Debian GNU/Linux kernel 2.6.10, 1.2Ghz PC) > > i'll upload the tarball somewhere on the web soon. > > > -- > David Here it is http://alioth.debian.org/tracker/index.php?func=detail&aid=301407&group_id=30402&atid=411646 -- David ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench 2005-04-07 20:59 ` David Sauvage 2005-04-07 23:40 ` David Sauvage @ 2005-04-08 17:11 ` Pascal Obry 2005-04-08 17:39 ` tmoran ` (2 more replies) 1 sibling, 3 replies; 122+ messages in thread From: Pascal Obry @ 2005-04-08 17:11 UTC (permalink / raw) "David Sauvage" <pariakaman@gmail.com> writes: > I expect to submit it (if people of The Computer Shootout accept the > AI-302 Reference Implementation) in one or two days. This won't be very good. The AI-302 containers will be part of the bench and the lines will be counted as part of it! I think that all the remaining bench should wait for a version of FSF GNAT containing the Ada.Containers. Pascal. -- --|------------------------------------------------------ --| Pascal Obry Team-Ada Member --| 45, rue Gabriel Peri - 78114 Magny Les Hameaux FRANCE --|------------------------------------------------------ --| http://www.obry.net --| "The best way to travel is by means of imagination" --| --| gpg --keyserver wwwkeys.pgp.net --recv-key C1082595 ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench 2005-04-08 17:11 ` Pascal Obry @ 2005-04-08 17:39 ` tmoran 2005-04-08 18:49 ` David Sauvage 2005-04-18 19:14 ` David Sauvage 2 siblings, 0 replies; 122+ messages in thread From: tmoran @ 2005-04-08 17:39 UTC (permalink / raw) >This won't be very good. The AI-302 containers will be part of the bench and >the lines will be counted as part of it! > >I think that all the remaining bench should wait for a version of FSF GNAT >containing the Ada.Containers. But Ada.Containers won't be part of "Ada" until Ada 2005 is official, so the programming language will have to be listed as "FSF GNAT". Of course a "language" that is in fact vendor-specific probably won't bother the Shootout audience anyway. #.# I haven't looked at the "spell check" program, but would be surprised if it couldn't easily be written with fixed buffer sizes with Stream_IO. (Having written one of the early commercial spell checkers, I don't think I want to look at the Shootout version. @.@) ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench 2005-04-08 17:11 ` Pascal Obry 2005-04-08 17:39 ` tmoran @ 2005-04-08 18:49 ` David Sauvage 2005-04-18 19:14 ` David Sauvage 2 siblings, 0 replies; 122+ messages in thread From: David Sauvage @ 2005-04-08 18:49 UTC (permalink / raw) "Pascal Obry" <pas...@obry.net> writes: > This won't be very good. The AI-302 containers will be part of the bench and > the lines will be counted as part of it! > > I think that all the remaining bench should wait for a version of FSF GNAT > containing the Ada.Containers. Hi, I've proposed the "The Great Computer Language Shootout" team to take the AI-302 Reference Implementation as Ada 95 GNAT Libraries. -- David ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench 2005-04-08 17:11 ` Pascal Obry 2005-04-08 17:39 ` tmoran 2005-04-08 18:49 ` David Sauvage @ 2005-04-18 19:14 ` David Sauvage 2005-04-19 16:43 ` Matthew Heaney 2 siblings, 1 reply; 122+ messages in thread From: David Sauvage @ 2005-04-18 19:14 UTC (permalink / raw) Hi, I've proposed a new spellcheck version without using the AI-302 Containers, i've used GNAT.Spitbol.Table. The app is twice faster. I also used Ada.Streams.Stream_IO.Read -- David ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench 2005-04-18 19:14 ` David Sauvage @ 2005-04-19 16:43 ` Matthew Heaney 2005-04-19 23:22 ` David Sauvage 0 siblings, 1 reply; 122+ messages in thread From: Matthew Heaney @ 2005-04-19 16:43 UTC (permalink / raw) David Sauvage wrote: > > I've proposed a new spellcheck version without using the AI-302 > Containers, i've used GNAT.Spitbol.Table. The app is twice faster. I > also used Ada.Streams.Stream_IO.Read. How do you know from where the improvement came -- the change in data structure or the change in read mechanism? Also, what hash function have you been using? I just compared Ada.Strings.Hash to the hash function used in the C++ example, and the C++ hash function does much better on the words in spellcheck-dict.txt. Have you tried using the C++ hash function with an Ada program that uses Ada.Containers.Hashed_Maps? Note that strictly speaking, you don't need to use the hashed map, since the shootout is comparing "hash tables." In the standard library, there are two kinds of hash tables, hashed maps and hashed sets. In the shootout spellcheck example, you only need a membership test, so a hashed set will do just as well. (Probably better, in fact.) -Matt ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench 2005-04-19 16:43 ` Matthew Heaney @ 2005-04-19 23:22 ` David Sauvage 2005-04-20 0:49 ` Matthew Heaney 2005-04-20 9:41 ` Pascal Obry 0 siblings, 2 replies; 122+ messages in thread From: David Sauvage @ 2005-04-19 23:22 UTC (permalink / raw) Hi, thanks for your advised comments, "Matthew Heaney" <mhea...@on2.com> wrote > How do you know from where the improvement came -- the change in > data structure or the change in read mechanism? please not i run the program just a few times to get an average value. (System config : GNU/Debian Sid kernel 2.6.10, AMD 1.2 Ghz) - Values are in millisecond, N=1 - for the GNAT.Spitbol.Table method, i just instantiate the generic with 40_000 as discriminant (same value as the C spellcheck implementation) - for the AI302.Containers.Indefinite_Hashed_Maps method i used the default AI302.Strings.Hash as hash function - When using the Ada.Streams.Stream_IO method, i used a 4_096 buffer. GNAT.Spitbol.Table + Ada.Streams.Stream_IO (*1) : 145 GNAT.Spitbol.Table + Ada.Text_IO : 175 AI302.Containers.Indefinite_Hashed_Maps + Ada.Streams.Stream_IO : 278 AI302.Containers.Indefinite_Hashed_Maps + Ada.Text_IO : 310 "Matthew Heaney" <mhea...@on2.com> wrote > Have you tried using the C++ hash function with an Ada program that > uses Ada.Containers.Hashed_Maps? nop, excellent i will try this soon. (*1) this implementaton can be also find here http://psycose.apinc.org/ada_shootout/ -- David ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench 2005-04-19 23:22 ` David Sauvage @ 2005-04-20 0:49 ` Matthew Heaney 2005-04-20 4:22 ` Georg Bauhaus 2005-04-20 21:24 ` David Sauvage 2005-04-20 9:41 ` Pascal Obry 1 sibling, 2 replies; 122+ messages in thread From: Matthew Heaney @ 2005-04-20 0:49 UTC (permalink / raw) "David Sauvage" <pariakaman@gmail.com> writes: > - for the GNAT.Spitbol.Table method, i just instantiate the generic > with 40_000 as discriminant (same value as the C spellcheck > implementation) I would also call Reserve_Capacity with a value of 40_000, in both the hashed map and hashed set versions. > - for the AI302.Containers.Indefinite_Hashed_Maps method i used the > default AI302.Strings.Hash as hash function > - When using the Ada.Streams.Stream_IO method, i used a 4_096 buffer. > > GNAT.Spitbol.Table + Ada.Streams.Stream_IO (*1) : 145 > GNAT.Spitbol.Table + Ada.Text_IO : 175 > AI302.Containers.Indefinite_Hashed_Maps + Ada.Streams.Stream_IO : 278 > AI302.Containers.Indefinite_Hashed_Maps + Ada.Text_IO : 310 This is good data. Can you make a version that uses the (indefinite) hashed set too, and post those results? Also, can you post a data set with the new hash function (see below)? > "Matthew Heaney" <mhea...@on2.com> wrote > > Have you tried using the C++ hash function with an Ada program that > > uses Ada.Containers.Hashed_Maps? > > nop, excellent i will try this soon. Here's the hash function: function Hash_Word (S : String) return Hash_Type is H : Hash_Type := 0; begin for I in S'Range loop H := 5 * H + Character'Pos (S (I)); end loop; return H; end Hash_Word; -Matt ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench 2005-04-20 0:49 ` Matthew Heaney @ 2005-04-20 4:22 ` Georg Bauhaus 2005-04-20 21:24 ` David Sauvage 1 sibling, 0 replies; 122+ messages in thread From: Georg Bauhaus @ 2005-04-20 4:22 UTC (permalink / raw) Matthew Heaney wrote: > "David Sauvage" <pariakaman@gmail.com> writes: > > >>- for the GNAT.Spitbol.Table method, i just instantiate the generic >>with 40_000 as discriminant (same value as the C spellcheck >>implementation) > > > I would also call Reserve_Capacity with a value of 40_000, in both the > hashed map and hashed set versions. Tried for the maps version, and not really noticed an effect, using GCC 4.0rc2. -- Georg ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench 2005-04-20 0:49 ` Matthew Heaney 2005-04-20 4:22 ` Georg Bauhaus @ 2005-04-20 21:24 ` David Sauvage 2005-04-20 23:06 ` Georg Bauhaus 1 sibling, 1 reply; 122+ messages in thread From: David Sauvage @ 2005-04-20 21:24 UTC (permalink / raw) Hi, "Matthew Heaney" <mhea...@on2.com> wrote > This is good data. Can you make a version that uses the (indefinite) > hashed set too, and post those results? > > Also, can you post a data set with the new hash function (see below)? - The same Ada.Streams.Stream_IO.Read method is used in all the following cases - Please do not take the following real time values as _serious_, as just a few runs have been done to make a rapid average. I'm also afraid about the influence of my system load average on the time command used to get the real cpu time. - I did not figure out how to use Reserve_Capacity in Indefinite_Hashed_Maps & Indefinite_Hashed_Sets. - Using GNAT 3.15p-12 on GNU Debian/Linux, for other system characteristics please have a look on my previous post. - http://psycose.apinc.org/ada_shootout/ also contain the spellcheck_ada-ai302_hashed_set & spellcheck_ada-ai302_hashed_map tarball. AI302.Containers.Indefinite_Hashed_Maps + C++ spellcheck hash function : 240 AI302.Containers.Indefinite_Hashed_Maps + AI302.Strings.Hash function : 257 AI302.Containers.Indefinite_Hashed_Sets : 370 -- David ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench 2005-04-20 21:24 ` David Sauvage @ 2005-04-20 23:06 ` Georg Bauhaus 0 siblings, 0 replies; 122+ messages in thread From: Georg Bauhaus @ 2005-04-20 23:06 UTC (permalink / raw) David Sauvage wrote: > - I did not figure out how to use Reserve_Capacity in > Indefinite_Hashed_Maps & Indefinite_Hashed_Sets. Maybe this will work: begin Dictionnary.Reserve_Capacity(Map, 40_000); end Dictionnary_Mgr; -- Georg ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench 2005-04-19 23:22 ` David Sauvage 2005-04-20 0:49 ` Matthew Heaney @ 2005-04-20 9:41 ` Pascal Obry 2005-04-20 11:44 ` Matthew Heaney 1 sibling, 1 reply; 122+ messages in thread From: Pascal Obry @ 2005-04-20 9:41 UTC (permalink / raw) "David Sauvage" <pariakaman@gmail.com> writes: > GNAT.Spitbol.Table + Ada.Streams.Stream_IO (*1) : 145 > GNAT.Spitbol.Table + Ada.Text_IO : 175 > AI302.Containers.Indefinite_Hashed_Maps + Ada.Streams.Stream_IO : 278 > AI302.Containers.Indefinite_Hashed_Maps + Ada.Text_IO : 310 There is probably a not so good hash routin implementation in the AI302 containers and you end up with many collisions. You should try the one from GNAT s-htable.adb: << function Hash (Key : String) return Header_Num is type Uns is mod 2 ** 32; function Rotate_Left (Value : Uns; Amount : Natural) return Uns; pragma Import (Intrinsic, Rotate_Left); Hash_Value : Uns; begin Hash_Value := 0; for J in Key'Range loop Hash_Value := Rotate_Left (Hash_Value, 3) + Character'Pos (Key (J)); end loop; return Header_Num'First + Header_Num'Base (Hash_Value mod Header_Num'Range_Length); end Hash; >> Pascal. -- --|------------------------------------------------------ --| Pascal Obry Team-Ada Member --| 45, rue Gabriel Peri - 78114 Magny Les Hameaux FRANCE --|------------------------------------------------------ --| http://www.obry.net --| "The best way to travel is by means of imagination" --| --| gpg --keyserver wwwkeys.pgp.net --recv-key C1082595 ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench 2005-04-20 9:41 ` Pascal Obry @ 2005-04-20 11:44 ` Matthew Heaney 2005-04-20 14:47 ` Pascal Obry 0 siblings, 1 reply; 122+ messages in thread From: Matthew Heaney @ 2005-04-20 11:44 UTC (permalink / raw) Pascal Obry <pascal@obry.net> writes: > There is probably a not so good hash routin implementation in the AI302 > containers and you end up with many collisions. > > You should try the one from GNAT s-htable.adb: That was my source for the algorithm (in fact there's a comment to that effect), but it looks like it was changed since I adapted it. ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench 2005-04-20 11:44 ` Matthew Heaney @ 2005-04-20 14:47 ` Pascal Obry 2005-04-20 19:26 ` Georg Bauhaus 0 siblings, 1 reply; 122+ messages in thread From: Pascal Obry @ 2005-04-20 14:47 UTC (permalink / raw) Matthew Heaney <matthewjheaney@earthlink.net> writes: > That was my source for the algorithm (in fact there's a comment to that > effect), but it looks like it was changed since I adapted it. Yes it has. The rotate of 1 was not good for small strings. Pascal. -- --|------------------------------------------------------ --| Pascal Obry Team-Ada Member --| 45, rue Gabriel Peri - 78114 Magny Les Hameaux FRANCE --|------------------------------------------------------ --| http://www.obry.net --| "The best way to travel is by means of imagination" --| --| gpg --keyserver wwwkeys.pgp.net --recv-key C1082595 ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench 2005-04-20 14:47 ` Pascal Obry @ 2005-04-20 19:26 ` Georg Bauhaus 2005-04-20 19:34 ` Pascal Obry 0 siblings, 1 reply; 122+ messages in thread From: Georg Bauhaus @ 2005-04-20 19:26 UTC (permalink / raw) Pascal Obry wrote: > Matthew Heaney <matthewjheaney@earthlink.net> writes: > > >>That was my source for the algorithm (in fact there's a comment to that >>effect), but it looks like it was changed since I adapted it. > > > Yes it has. The rotate of 1 was not good for small strings. Will it make sense to expressed the shifting amount in terms of some 'Length dependent constant? -- Georg ^ permalink raw reply [flat|nested] 122+ messages in thread
* Re: Ada bench 2005-04-20 19:26 ` Georg Bauhaus @ 2005-04-20 19:34 ` Pascal Obry 0 siblings, 0 replies; 122+ messages in thread From: Pascal Obry @ 2005-04-20 19:34 UTC (permalink / raw) Georg Bauhaus <bauhaus@futureapps.de> writes: > Will it make sense to expressed the shifting amount in terms > of some 'Length dependent constant? Don't know. This is quite tricky and it is difficult to have a good general purpose hash routine. We also must keep in mind that the hash routine must be very fast. Pascal. -- --|------------------------------------------------------ --| Pascal Obry Team-Ada Member --| 45, rue Gabriel Peri - 78114 Magny Les Hameaux FRANCE --|------------------------------------------------------ --| http://www.obry.net --| "The best way to travel is by means of imagination" --| --| gpg --keyserver wwwkeys.pgp.net --recv-key C1082595 ^ permalink raw reply [flat|nested] 122+ messages in thread
end of thread, other threads:[~2005-04-28 20:40 UTC | newest] Thread overview: 122+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2005-03-19 16:22 Ada bench Pascal Obry 2005-03-19 16:55 ` Dr. Adrian Wrigley 2005-03-19 21:32 ` Michael Bode 2005-03-20 9:20 ` Pascal Obry 2005-03-20 9:39 ` Michael Bode 2005-03-20 11:16 ` Pascal Obry 2005-03-20 12:20 ` Dmitry A. Kazakov 2005-03-20 12:32 ` Pascal Obry 2005-03-21 1:42 ` (see below) 2005-03-21 2:24 ` (see below) 2005-03-21 15:00 ` (see below) 2005-03-21 3:54 ` Ed Falis 2005-03-21 12:20 ` Jeff C 2005-03-21 15:18 ` (see below) 2005-03-21 15:24 ` (see below) 2005-03-21 18:56 ` Isaac Gouy 2005-03-21 21:31 ` Randy Brukardt 2005-03-21 22:14 ` Ed Falis 2005-03-21 18:07 ` Pascal Obry 2005-03-20 10:11 ` Adrian Knoth 2005-03-20 10:30 ` Michael Bode 2005-03-21 23:27 ` Georg Bauhaus 2005-03-22 1:16 ` Ada bench : count words Marius Amado Alves 2005-03-22 10:59 ` Dmitry A. Kazakov 2005-03-22 11:57 ` Marius Amado Alves 2005-03-22 12:17 ` Dmitry A. Kazakov 2005-03-22 12:47 ` Marius Amado Alves 2005-03-22 13:08 ` Dmitry A. Kazakov 2005-03-22 13:28 ` Marius Amado Alves 2005-03-22 16:48 ` Marius Amado Alves 2005-03-22 17:34 ` Dmitry A. Kazakov 2005-03-27 20:14 ` jtg 2005-03-27 21:22 ` Dmitry A. Kazakov 2005-03-28 19:54 ` jtg 2005-03-28 20:56 ` Dmitry A. Kazakov 2005-03-29 12:40 ` jtg 2005-03-29 13:00 ` [OT] " Tapio Kelloniemi 2005-03-29 13:47 ` Dmitry A. Kazakov 2005-03-29 15:53 ` Tapio Kelloniemi 2005-03-29 16:17 ` Dmitry A. Kazakov [not found] ` <k33j419lgei1ui89s26o1dlr9ccf1qe1hd@4ax.com> 2005-04-11 23:04 ` Marius Amado Alves 2005-03-29 13:47 ` Dmitry A. Kazakov 2005-04-01 20:58 ` Georg Bauhaus 2005-04-01 20:18 ` Pascal Obry 2005-03-22 12:53 ` Marius Amado Alves [not found] ` <f205219321dd18dba878fab16b7cb50d@netcabo.pt> 2005-03-22 13:12 ` Marius Amado Alves 2005-03-23 16:58 ` Isaac Gouy 2005-03-22 13:58 ` Robert A Duff 2005-03-22 16:30 ` Marius Amado Alves 2005-03-22 16:41 ` Tapio Kelloniemi 2005-03-22 17:39 ` Marius Amado Alves 2005-03-22 18:59 ` Dmitry A. Kazakov 2005-03-22 19:08 ` Tapio Kelloniemi 2005-03-22 18:34 ` Georg Bauhaus 2005-03-22 19:32 ` Robert A Duff 2005-03-22 20:04 ` tmoran 2005-03-23 16:55 ` Isaac Gouy [not found] ` <1820eab50b57f2fe1c4e8e50bb0f4fe5@netcabo.pt> 2005-03-22 22:49 ` Stephen Leake 2005-03-22 22:58 ` Robert A Duff 2005-03-22 23:27 ` Larry Kilgallen 2005-03-23 22:33 ` Robert A Duff 2005-03-24 5:02 ` Larry Kilgallen [not found] ` <wccpsxqro0c.fsfOrganization: LJK Software <bM40pHW6P2KW@eisner.encompasserve.org> 2005-03-25 1:34 ` Robert A Duff 2005-03-22 12:22 ` Jeff C 2005-03-23 16:48 ` Isaac Gouy 2005-03-23 17:06 ` Isaac Gouy 2005-03-22 19:49 ` tmoran 2005-03-22 21:51 ` Dmitry A. Kazakov 2005-03-23 0:16 ` tmoran 2005-03-23 7:25 ` Dmitry A. Kazakov 2005-03-22 22:33 ` Marius Amado Alves [not found] ` <00b362390273e6c04844dd4ff1885ee0@netcabo.pt> 2005-03-23 15:09 ` Marius Amado Alves 2005-03-23 19:00 ` tmoran 2005-03-23 19:30 ` tmoran 2005-03-23 21:38 ` tmoran 2005-03-25 7:30 ` Simon Wright 2005-03-25 9:38 ` tmoran 2005-03-25 16:20 ` Tapio Kelloniemi 2005-03-25 22:18 ` Ada:The High Performance Language for Hyperthreaded and Multicore CPUs; was " tmoran 2005-03-23 19:54 ` Tapio Kelloniemi 2005-03-23 20:39 ` Ada bench : word frequency Marius Amado Alves 2005-03-23 21:26 ` Isaac Gouy 2005-03-24 1:24 ` Marius Amado Alves 2005-03-24 17:23 ` Isaac Gouy 2005-03-24 19:52 ` Martin Dowie 2005-04-11 15:11 ` Marius Amado Alves 2005-03-24 20:16 ` Pascal Obry 2005-03-24 22:54 ` tmoran 2005-04-11 15:38 ` Marius Amado Alves 2005-03-24 21:18 ` Gautier Write-only 2005-04-11 15:32 ` Marius Amado Alves 2005-04-11 23:56 ` Robert A Duff 2005-04-28 4:04 ` Matthew Heaney 2005-04-28 20:40 ` Matthew Heaney 2005-03-23 21:38 ` Ada bench : count words tmoran 2005-03-24 20:19 ` tmoran 2005-03-24 21:00 ` Pascal Obry 2005-03-24 22:54 ` tmoran 2005-03-30 16:08 ` Andre 2005-03-30 16:36 ` Pascal Obry 2005-03-22 22:27 ` Dmitry A. Kazakov 2005-03-23 7:46 ` Pascal Obry 2005-03-23 7:56 ` Dmitry A. Kazakov 2005-03-23 13:38 ` Robert A Duff 2005-03-22 7:05 ` Ada bench Pascal Obry 2005-04-07 20:59 ` David Sauvage 2005-04-07 23:40 ` David Sauvage 2005-04-08 17:11 ` Pascal Obry 2005-04-08 17:39 ` tmoran 2005-04-08 18:49 ` David Sauvage 2005-04-18 19:14 ` David Sauvage 2005-04-19 16:43 ` Matthew Heaney 2005-04-19 23:22 ` David Sauvage 2005-04-20 0:49 ` Matthew Heaney 2005-04-20 4:22 ` Georg Bauhaus 2005-04-20 21:24 ` David Sauvage 2005-04-20 23:06 ` Georg Bauhaus 2005-04-20 9:41 ` Pascal Obry 2005-04-20 11:44 ` Matthew Heaney 2005-04-20 14:47 ` Pascal Obry 2005-04-20 19:26 ` Georg Bauhaus 2005-04-20 19:34 ` Pascal Obry
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox