* [Shootout] Spellcheck.adb @ 2005-04-25 21:30 albert.bachmann 2005-04-25 21:57 ` Spellcheck.adb Albert Bachmann ` (4 more replies) 0 siblings, 5 replies; 23+ messages in thread From: albert.bachmann @ 2005-04-25 21:30 UTC (permalink / raw) Dear newsgroup, on http://shootout.alioth.debian.org/ I discovered that the Ada entry for the "spellcheck" benchmark produces an error according to the information on that website. The specification for this benchmark can be found here: http://shootout.alioth.debian.org/benchmark.php?test=spellcheck&lang=all&sort=fullcpu Since I am new to Ada I decided that it might be a good idea to implement a solution and see how it performs. Finally I came up with a naive program that performs not very well in comparison to other language implementations. I produced a naive implementation in C++ as well which solves the problem in less than half the time. The entry for C++ at the Shootout takes half the time of my naive solution. Now what I'd like to ask you is how I can speed up my Ada solution? As I said I'm a novice and certainly there is a huge optimization potential here. Here is the output of 'time': Ada: real 0m0.742s user 0m0.720s sys 0m0.009s C++: real 0m0.217s user 0m0.192s sys 0m0.008s First I post the C++ solution since it is much shorter and easy to understand: #include <map> #include <string> #include <fstream> #include <iostream> using namespace std; int main() { map<string, bool> dict; string key; ifstream in("spellcheck-dict.txt"); while (getline(in, key)) dict[key] = true; while (cin >> key) { if (!dict[key]) cout << key << endl; } } And now spellcheck.adb: with ada.text_io; with ada.integer_text_io; with gnat.htable; use ada; procedure spellcheck is subtype word is string(1 .. 128); subtype index is natural range 0 .. 2760; file : text_io.file_type; item : word; item_offset : natural; no_item : constant boolean := false; function hash(item : word) return index is a : constant positive := 127; h : index := 0; begin for i in item'range loop h := (a * h + character'pos(item(i))) mod index'last; end loop; return h; end hash; package dictionary is new gnat.htable.simple_htable(header_num => index, element => boolean, no_element => no_item, key => word, hash => hash, equal => "="); begin text_io.open(file, text_io.in_file, "spellcheck-dict.txt"); while text_io.end_of_file(file) = false loop text_io.get_line(file, item, item_offset); strings.fixed.delete(item, item_offset + 1, item'last); dictionary.set(item, true); end loop; while text_io.end_of_file = false loop text_io.get_line(item, item_offset); strings.fixed.delete(item, item_offset + 1, item'last); if dictionary.get(item) = no_item then text_io.put_line(item); null; end if; end loop; text_io.close(file); end spellcheck; Regards, Albert ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Spellcheck.adb 2005-04-25 21:30 [Shootout] Spellcheck.adb albert.bachmann @ 2005-04-25 21:57 ` Albert Bachmann 2005-04-25 23:27 ` [Shootout] Spellcheck.adb Marius Amado Alves ` (3 subsequent siblings) 4 siblings, 0 replies; 23+ messages in thread From: Albert Bachmann @ 2005-04-25 21:57 UTC (permalink / raw) That "null;" statement is a holdover from previous testing and should be deleted. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Shootout] Spellcheck.adb 2005-04-25 21:30 [Shootout] Spellcheck.adb albert.bachmann 2005-04-25 21:57 ` Spellcheck.adb Albert Bachmann @ 2005-04-25 23:27 ` Marius Amado Alves 2005-04-26 18:04 ` Spellcheck.adb Matthew Heaney ` (2 subsequent siblings) 4 siblings, 0 replies; 23+ messages in thread From: Marius Amado Alves @ 2005-04-25 23:27 UTC (permalink / raw) To: comp.lang.ada At first sight the thing to improve is the hash function. The current function seems too expensive. Some time ago I had this problem. I invented a less expensive function with good dispersion. I don't have the code with me now, but it was as follows. Treat both the input and output values as bit strings X (1 .. M) and Y (1 .. N) respectively. (Use representation clauses, packed arrays, and unchecked conversion for this.) Make N small, say 16. Set Y bits to the value of X bits as follows: if M > N, Y (J) = X (1 + (J - 1) * M / N), for J in 1 .. N if M = N, Y = X if M < N, initialize Y to all zeros, then let Y (1 + (I - 1) * N / M) = X (I), for I in 1 .. M Another thing to improve may be the text input. Buffering with streams has proved faster than Text_IO. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Spellcheck.adb 2005-04-25 21:30 [Shootout] Spellcheck.adb albert.bachmann 2005-04-25 21:57 ` Spellcheck.adb Albert Bachmann 2005-04-25 23:27 ` [Shootout] Spellcheck.adb Marius Amado Alves @ 2005-04-26 18:04 ` Matthew Heaney 2005-04-26 19:36 ` [Shootout] Spellcheck.adb David Sauvage 2005-04-26 20:36 ` [Shootout] Spellcheck.adb Dmitry A. Kazakov 4 siblings, 0 replies; 23+ messages in thread From: Matthew Heaney @ 2005-04-26 18:04 UTC (permalink / raw) albert.bachmann@gmx.de wrote: > > > First I post the C++ solution since it is much shorter and easy to > understand: The Ada solution based on the Ada 2005 standard container library isn't much different from your C++ example: with Hashed_Word_Sets; use Hashed_Word_Sets; with Ada.Text_IO; use Ada.Text_IO; procedure Spellcheck is Words : Set; Line : String (1 .. 132); Last : Natural; File : File_Type; begin Open (File, In_File, "spellcheck-dict.txt"); while not End_Of_File (File) loop Get_Line (File, Line, Last); Insert (Words, New_Item => Line (Line'First .. Last)); end loop; while not End_Of_File loop Get_Line (Line, Last); if not Contains (Words, Item => Line (Line'First .. Last)) then Put_Line (Line (Line'First .. Last)); end if; end loop; end Spellcheck; <http://charles.tigris.org/source/browse/charles/src/ai302/examples/spellcheck/> We did determine that the hash function doesn't perform well for small strings, but I haven't fixed it yet. If you're desperate I can fix it tonight. -Matt ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Shootout] Spellcheck.adb 2005-04-25 21:30 [Shootout] Spellcheck.adb albert.bachmann ` (2 preceding siblings ...) 2005-04-26 18:04 ` Spellcheck.adb Matthew Heaney @ 2005-04-26 19:36 ` David Sauvage 2005-04-26 20:50 ` Spellcheck.adb Albert Bachmann 2005-04-26 21:21 ` Spellcheck.adb Albert Bachmann 2005-04-26 20:36 ` [Shootout] Spellcheck.adb Dmitry A. Kazakov 4 siblings, 2 replies; 23+ messages in thread From: David Sauvage @ 2005-04-26 19:36 UTC (permalink / raw) Hi, happy to read your post, also because i'm the folk who proposed the spellcheck ada implementation :-), i don't undestand yet the shootout spellcheck error as it works nicely on my system, does it failed on yours too ? may be you could describe the errors here if so ? thanks albert.bachmann@gmx.de wrote: > Now what I'd like to ask you is how I can speed up my Ada solution? You can find interesting discussion on the spellcheck implementation & speed issue in the thread named "Ada Bench" of comp.lang.ada you could use : - Ada.Streams.Stream_IO.Read instead of Ada.Text_IO.Get_Line - the shootout spellcheck C++ hash function as hash function albert.bachmann@gmx.de wrote: > > real 0m0.742s > user 0m0.720s > sys 0m0.009s > Is it for N=1 ? could you describe your system configuration & the command line used to launch the program ? Well done, i was about 5 min on my first implementation ;-) here is the shootout spellcheck implementation (for N=1) duration, you can take a look on my system configuration on the Ada Bench thread : real 0m0.126s user 0m0.078s sys 0m0.013s To compare (just because we are talking about benchmark and how to speed up our programs) here is your posted implementation on my system (i add the line with Ada.Strings.Fixed; as "strings.fixed.delete" was not handle by my compiler without it) real 0m1.432s user 0m1.361s sys 0m0.014s Your next version should be far far better ;-) Hope it could help ;-) David ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Spellcheck.adb 2005-04-26 19:36 ` [Shootout] Spellcheck.adb David Sauvage @ 2005-04-26 20:50 ` Albert Bachmann 2005-04-27 15:11 ` Spellcheck.adb Matthew Heaney 2005-04-26 21:21 ` Spellcheck.adb Albert Bachmann 1 sibling, 1 reply; 23+ messages in thread From: Albert Bachmann @ 2005-04-26 20:50 UTC (permalink / raw) Hello David, my system is a Fedora Core 3 Linux box with a Pentium 4 running at 2.4 GHz with 512MB RAM. I use GNAT from GCC-4.0.0. with -O3. All my timings are for n = 1. In the meantime I followed the advice from Matthew Heaney (thanks Matthew) and went on with Ada.Containers. I again tried a simple implementation. Unfortunately I recognized that controlled types have to be instantiated at library level which requires now a total of 3 files since the instantiation of ada.containers.hashed_maps needs a definite subtype for its key_type. Anyway the resulting solution runs with the following times: real 0m0.352s user 0m0.318s sys 0m0.015s Here are the file contents: types.ads: ---------- package types is subtype word is string(1 .. 128); end types; hashmap.ads: ------------ with ada.strings.fixed; with ada.strings.hash; with ada.containers.hashed_maps; with types; package hashmap is new ada.containers.hashed_maps(key_type => types.word, element_type => boolean, hash => ada.strings.hash, equivalent_keys => "="); and finally spellcheck2.adb: ---------------------------- with ada.strings.fixed; with ada.strings.hash; with ada.text_io; with ada.integer_text_io; with hashmap; with types; use types; use ada; procedure spellcheck2 is subtype index is natural range 0 .. 2760; file : text_io.file_type; item : word; item_offset : natural; dict : hashmap.map; begin text_io.open(file, text_io.in_file, "spellcheck-dict.txt"); while not text_io.end_of_file(file) loop text_io.get_line(file, item, item_offset); strings.fixed.delete(item, item_offset + 1, word'last); hashmap.insert(dict, item, true); end loop; while not text_io.end_of_file loop text_io.get_line(item, item_offset); strings.fixed.delete(item, item_offset + 1, word'last); if not hashmap.contains(dict, item) then text_io.put_line(item); end if; end loop; text_io.close(file); end spellcheck2; Please note that those attempts are only quick and dirty hacks. If someone knows how to get rid of the seperate types.ads file I would welcome his advice. So far I have not tested your proposed solution but eventually I will do so and let you know about any possible issues. Regards, Albert ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Spellcheck.adb 2005-04-26 20:50 ` Spellcheck.adb Albert Bachmann @ 2005-04-27 15:11 ` Matthew Heaney 2005-04-27 22:49 ` Spellcheck.adb Albert Bachmann 0 siblings, 1 reply; 23+ messages in thread From: Matthew Heaney @ 2005-04-27 15:11 UTC (permalink / raw) "Albert Bachmann" <albert.bachmann@gmx.de> wrote in message news:1114548638.851249.246280@f14g2000cwb.googlegroups.com... > > In the meantime I followed the advice from Matthew Heaney (thanks > Matthew) and went on with Ada.Containers. I again tried a simple > implementation. Unfortunately I recognized that controlled types have > to be instantiated at library level This is no longer true in Ada 2005. Eventually you'll be able to instantiate the containers in your main subprogram. > which requires now a total of 3 > files since the instantiation of ada.containers.hashed_maps needs a > definite subtype for its key_type. True, but that's why we have the Indefinite_Hashed_Maps container package. If you're manipulating strings, you should probably be using that. Also, as I mentioned in an earlier post, you don't really need a map, since all you're doing is performing a membership test. Hence, a (hashed) set will do. > types.ads: > ---------- > > package types is > subtype word is string(1 .. 128); > end types; Get rid of this. Instantiate the Indefinite_Hashed_Sets (or _Maps, if you prefer) with type String. > hashmap.ads: > ------------ > > package hashmap is new ada.containers.hashed_maps(key_type => > types.word, No. This should just be type String. > and finally spellcheck2.adb: > ---------------------------- > > strings.fixed.delete(item, item_offset + 1, word'last); You can get rid of this. > hashmap.insert(dict, item, true); If you used a (hashed) set, then all you'd have to say is insert(dict, item); > strings.fixed.delete(item, item_offset + 1, word'last); You can get rid of this. > if not hashmap.contains(dict, item) then See, you're not using the map element at all. You're only using the key. > Please note that those attempts are only quick and dirty hacks. If > someone knows how to get rid of the seperate types.ads file I would > welcome his advice. You can get rid of the types file, by using the indefinite hashed map (or set), and instantiating it with type String. -Matt ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Spellcheck.adb 2005-04-27 15:11 ` Spellcheck.adb Matthew Heaney @ 2005-04-27 22:49 ` Albert Bachmann 2005-04-27 23:35 ` Spellcheck.adb Marius Amado Alves 2005-04-28 0:41 ` Spellcheck.adb Matthew Heaney 0 siblings, 2 replies; 23+ messages in thread From: Albert Bachmann @ 2005-04-27 22:49 UTC (permalink / raw) On Wed, 27 Apr 2005 11:11:06 -0400, Matthew Heaney wrote: > > "Albert Bachmann" <albert.bachmann@gmx.de> wrote in message > news:1114548638.851249.246280@f14g2000cwb.googlegroups.com... >> >> In the meantime I followed the advice from Matthew Heaney (thanks >> Matthew) and went on with Ada.Containers. I again tried a simple >> implementation. Unfortunately I recognized that controlled types have >> to be instantiated at library level > > This is no longer true in Ada 2005. Eventually you'll be able to > instantiate the containers in your main subprogram. That may be true but with GNAT (as of GCC-4.0.0) I get an error: gcc -c -gnat05 -O3 spellcheck3.adb spellcheck3.adb:10:09: instantiation error at a-cohata.ads:29 spellcheck3.adb:10:09: instantiation error at a-cihase.ads:214 spellcheck3.adb:10:09: controlled type must be declared at the library level gnatmake: "spellcheck3.adb" compilation error > > >> which requires now a total of 3 >> files since the instantiation of ada.containers.hashed_maps needs a >> definite subtype for its key_type. > > True, but that's why we have the Indefinite_Hashed_Maps container package. > If you're manipulating strings, you should probably be using that. > > Also, as I mentioned in an earlier post, you don't really need a map, since > all you're doing is performing a membership test. Hence, a (hashed) set > will do. > When I use and instance of indefinite_hashed_sets or indefinite_hashed_maps the program compiles fine but I get an exception during runtime: raised CONSTRAINT_ERROR : a-cihama.adb:443 explicit raise (for the map) raised CONSTRAINT_ERROR : a-cihase.adb:375 explicit raise (for the set) >> types.ads: >> ---------- >> >> package types is >> subtype word is string(1 .. 128); >> end types; > > Get rid of this. Instantiate the Indefinite_Hashed_Sets (or _Maps, if you > prefer) with type String. > Ok. > >> hashmap.ads: >> ------------ >> >> package hashmap is new ada.containers.hashed_maps(key_type => >> types.word, > > No. This should just be type String. > Ok. > >> and finally spellcheck2.adb: >> ---------------------------- >> >> strings.fixed.delete(item, item_offset + 1, word'last); > > You can get rid of this. I did. > > >> hashmap.insert(dict, item, true); > > If you used a (hashed) set, then all you'd have to say is > > insert(dict, item); > > >> strings.fixed.delete(item, item_offset + 1, word'last); > > You can get rid of this. > Here too. > >> if not hashmap.contains(dict, item) then > > See, you're not using the map element at all. You're only using the key. > > >> Please note that those attempts are only quick and dirty hacks. If >> someone knows how to get rid of the seperate types.ads file I would >> welcome his advice. > > You can get rid of the types file, by using the indefinite hashed map (or > set), and instantiating it with type String. > > -Matt I post the modified source. Maybe I've overlooked something because as I said above during runtime I get an exception. However thanks for those valuable information Matthew! hashed_set.ads: -------------- with ada.containers.indefinite_hashed_sets; with ada.strings.hash; --with hash_function; package hashed_set is new ada.containers.indefinite_hashed_set( element_type => string, hash => ada.strings.hash, --hash_function, equivalent_keys => "="); spellcheck3.adb: ---------------- with ada.strings.fixed; with ada.text_io; with hashed_set; use ada; procedure spellcheck3 is file : text_io.file_type; item : string(1 .. 128); item_offset : natural; dict : hashed_set.set; begin text_io.open(file, text_io.in_file, "spellcheck-dict.txt"); while not text_io.end_of_file(file) loop text_io.get_line(file, item, item_offset); hashed_set.insert(dict, item); end loop; while not text_io.end_of_file loop text_io.get_line(item, item_offset); if not hashed_set.contains(dict, item) then text_io.put_line(item); end if; end loop; text_io.close(file); end spellcheck3; Regards, Albert ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Spellcheck.adb 2005-04-27 22:49 ` Spellcheck.adb Albert Bachmann @ 2005-04-27 23:35 ` Marius Amado Alves 2005-04-27 23:58 ` Spellcheck.adb Albert Bachmann 2005-04-28 0:41 ` Spellcheck.adb Matthew Heaney 1 sibling, 1 reply; 23+ messages in thread From: Marius Amado Alves @ 2005-04-27 23:35 UTC (permalink / raw) To: Albert Bachmann; +Cc: comp.lang.ada > ... > while not text_io.end_of_file(file) loop > text_io.get_line(file, item, item_offset); > hashed_set.insert(dict, item); Don't pass the whole item, but the slice item (1 .. item_offset). > ... > while not text_io.end_of_file loop > text_io.get_line(item, item_offset); > if not hashed_set.contains(dict, item) then > text_io.put_line(item); > Ditto. (And item_offset should be called item_length or similar.) ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Spellcheck.adb 2005-04-27 23:35 ` Spellcheck.adb Marius Amado Alves @ 2005-04-27 23:58 ` Albert Bachmann 2005-04-28 0:19 ` Spellcheck.adb Marius Amado Alves 2005-04-28 0:58 ` Spellcheck.adb Matthew Heaney 0 siblings, 2 replies; 23+ messages in thread From: Albert Bachmann @ 2005-04-27 23:58 UTC (permalink / raw) On Thu, 28 Apr 2005 00:35:21 +0100, Marius Amado Alves wrote: >> ... >> while not text_io.end_of_file(file) loop >> text_io.get_line(file, item, item_offset); >> hashed_set.insert(dict, item); > > Don't pass the whole item, but the slice item (1 .. item_offset). > >> ... >> while not text_io.end_of_file loop >> text_io.get_line(item, item_offset); >> if not hashed_set.contains(dict, item) then >> text_io.put_line(item); >> > > Ditto. > > (And item_offset should be called item_length or similar.) Yes. That's it. I should have known. And timing is not bad with hashed_set: -- with custom hash function: real 0m0.119s user 0m0.101s sys 0m0.009s -- with ada.strings.hash: real 0m0.130s user 0m0.116s sys 0m0.005s Using streams makes for another 10ms. But the GNAT.Spitbol.Table still seems to be the fastest version so far. Regards, Albert ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Spellcheck.adb 2005-04-27 23:58 ` Spellcheck.adb Albert Bachmann @ 2005-04-28 0:19 ` Marius Amado Alves 2005-04-29 10:22 ` Spellcheck.adb Albert Bachmann 2005-04-28 0:58 ` Spellcheck.adb Matthew Heaney 1 sibling, 1 reply; 23+ messages in thread From: Marius Amado Alves @ 2005-04-28 0:19 UTC (permalink / raw) To: comp.lang.ada > -- with custom hash function: > real 0m0.119s > user 0m0.101s > sys 0m0.009s > > -- with ada.strings.hash: > real 0m0.130s > user 0m0.116s > sys 0m0.005s Have you tried my bit-based hash function? ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Spellcheck.adb 2005-04-28 0:19 ` Spellcheck.adb Marius Amado Alves @ 2005-04-29 10:22 ` Albert Bachmann 0 siblings, 0 replies; 23+ messages in thread From: Albert Bachmann @ 2005-04-29 10:22 UTC (permalink / raw) On Thu, 28 Apr 2005 01:19:03 +0100, Marius Amado Alves wrote: >> -- with custom hash function: >> real 0m0.119s >> user 0m0.101s >> sys 0m0.009s >> >> -- with ada.strings.hash: >> real 0m0.130s >> user 0m0.116s >> sys 0m0.005s > > Have you tried my bit-based hash function? Not yet ;-) But if time permits I will try it during the next days. Regards, Albert ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Spellcheck.adb 2005-04-27 23:58 ` Spellcheck.adb Albert Bachmann 2005-04-28 0:19 ` Spellcheck.adb Marius Amado Alves @ 2005-04-28 0:58 ` Matthew Heaney 2005-04-29 10:19 ` Spellcheck.adb Albert Bachmann 1 sibling, 1 reply; 23+ messages in thread From: Matthew Heaney @ 2005-04-28 0:58 UTC (permalink / raw) Albert Bachmann <albert.bachmann@gmx.de> writes: > Yes. That's it. I should have known. And timing is not bad with hashed_set: > > -- with custom hash function: What is your custom hash algorithm? > -- with ada.strings.hash: I just checked in Pascal Obry's improved algorithm for ada.strings.hash. See if it does any better on your input. > Using streams makes for another 10ms. But the GNAT.Spitbol.Table still > seems to be the fastest version so far. What were those numbers again? Regards, Matt ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Spellcheck.adb 2005-04-28 0:58 ` Spellcheck.adb Matthew Heaney @ 2005-04-29 10:19 ` Albert Bachmann 0 siblings, 0 replies; 23+ messages in thread From: Albert Bachmann @ 2005-04-29 10:19 UTC (permalink / raw) On Thu, 28 Apr 2005 00:58:29 +0000, Matthew Heaney wrote: > Albert Bachmann <albert.bachmann@gmx.de> writes: > >> Yes. That's it. I should have known. And timing is not bad with hashed_set: >> >> -- with custom hash function: > > What is your custom hash algorithm? > Just the one used in the C++ entry. With "custom" I just wanted to distinguish it from Ada.Strings.Hash. with ada.containers; use ada.containers; function hash_function(key : string) return hash_type is n : hash_type := 0; begin for i in key'range loop n := (n * 5 + character'pos(key(i))); end loop; return n; end hash_function; > >> -- with ada.strings.hash: > > I just checked in Pascal Obry's improved algorithm for ada.strings.hash. > See if it does any better on your input. > > >> Using streams makes for another 10ms. But the GNAT.Spitbol.Table still >> seems to be the fastest version so far. > > What were those numbers again? > > Regards, > Matt With David Sauvage's version (with uses GNAT.Spitbol.Table) I get the following times: real 0m0.074s user 0m0.054s sys 0m0.012s I have no specific times for the streams version at hand since I deleted it already. But I remember that back then the difference was in the range of 10ms. Note that David Sauvage's version uses streams which contribute to his excellent results. http://charles.tigris.org/source/browse/charles/src/ai302/examples/spellcheck/ still shows only this initial revisions and the hash function there is the same as the one I used. Regards, Albert ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Spellcheck.adb 2005-04-27 22:49 ` Spellcheck.adb Albert Bachmann 2005-04-27 23:35 ` Spellcheck.adb Marius Amado Alves @ 2005-04-28 0:41 ` Matthew Heaney 2005-04-28 21:20 ` Spellcheck.adb Simon Wright 2005-04-29 10:05 ` Spellcheck.adb Albert Bachmann 1 sibling, 2 replies; 23+ messages in thread From: Matthew Heaney @ 2005-04-28 0:41 UTC (permalink / raw) Albert Bachmann <albert.bachmann@gmx.de> writes: > When I use and instance of indefinite_hashed_sets or > indefinite_hashed_maps the program compiles fine but I get an exception > during runtime: > > raised CONSTRAINT_ERROR : a-cihama.adb:443 explicit raise (for the map) > raised CONSTRAINT_ERROR : a-cihase.adb:375 explicit raise (for the set) What version of the container packages are you using? Either you found a bug in the sources you're using, or you're doing something wrong. Can you send me (or just post) the portion of the code that surrounds the line numbers in the exception message above? Thanks, Matt ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Spellcheck.adb 2005-04-28 0:41 ` Spellcheck.adb Matthew Heaney @ 2005-04-28 21:20 ` Simon Wright 2005-04-29 10:05 ` Spellcheck.adb Albert Bachmann 1 sibling, 0 replies; 23+ messages in thread From: Simon Wright @ 2005-04-28 21:20 UTC (permalink / raw) Matthew Heaney <matthewjheaney@earthlink.net> writes: > Albert Bachmann <albert.bachmann@gmx.de> writes: > > > When I use and instance of indefinite_hashed_sets or > > indefinite_hashed_maps the program compiles fine but I get an exception > > during runtime: > > > > raised CONSTRAINT_ERROR : a-cihama.adb:443 explicit raise (for the map) > > raised CONSTRAINT_ERROR : a-cihase.adb:375 explicit raise (for the set) > > What version of the container packages are you using? Either you found > a bug in the sources you're using, or you're doing something wrong. What revision of the tigris.org sources is in GCC 4.0.0? (I was most impressed to find them there!) With 4.0.0 on Mac OS X 10.3.9, I have managed (after a few minutes) to run test_maps with no assertion failures. I couldn't see instructions for running the tests? If in ai302/testdir, * to compile with assertions against the 4.0.0 Ada.Containers, gnatmake -gnat05 -gnata test_maps * To compile with assertions against the local Ada.Containers, gnatmake -gnat05 -gnata -I.. -a test_maps (the -a says rebuild runtime components if necessary). -- Simon Wright 100% Ada, no bugs. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Spellcheck.adb 2005-04-28 0:41 ` Spellcheck.adb Matthew Heaney 2005-04-28 21:20 ` Spellcheck.adb Simon Wright @ 2005-04-29 10:05 ` Albert Bachmann 2005-04-29 12:35 ` Spellcheck.adb Matthew Heaney 1 sibling, 1 reply; 23+ messages in thread From: Albert Bachmann @ 2005-04-29 10:05 UTC (permalink / raw) On Thu, 28 Apr 2005 00:41:22 +0000, Matthew Heaney wrote: > Albert Bachmann <albert.bachmann@gmx.de> writes: > >> When I use and instance of indefinite_hashed_sets or >> indefinite_hashed_maps the program compiles fine but I get an exception >> during runtime: >> >> raised CONSTRAINT_ERROR : a-cihama.adb:443 explicit raise (for the map) >> raised CONSTRAINT_ERROR : a-cihase.adb:375 explicit raise (for the set) > > What version of the container packages are you using? Either you found > a bug in the sources you're using, or you're doing something wrong. > > Can you send me (or just post) the portion of the code that surrounds > the line numbers in the exception message above? > > Thanks, > Matt Well, Marius Amado Alves advised me not to pass the whole string to the hashed_set (see the code in my post you replied to). I get no error anymore then. Do you consider this an error of Ada.Containers despite this? I assumed it was solely my fault. Regards, Albert ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Spellcheck.adb 2005-04-29 10:05 ` Spellcheck.adb Albert Bachmann @ 2005-04-29 12:35 ` Matthew Heaney 0 siblings, 0 replies; 23+ messages in thread From: Matthew Heaney @ 2005-04-29 12:35 UTC (permalink / raw) Albert Bachmann <albert.bachmann@gmx.de> writes: > Well, Marius Amado Alves advised me not to pass the whole string to > the hashed_set (see the code in my post you replied to). I get no > error anymore then. Do you consider this an error of Ada.Containers > despite this? I assumed it was solely my fault. The exception behavior you observed is in harmony with what's described in the latest AI-302 draft. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Spellcheck.adb 2005-04-26 19:36 ` [Shootout] Spellcheck.adb David Sauvage 2005-04-26 20:50 ` Spellcheck.adb Albert Bachmann @ 2005-04-26 21:21 ` Albert Bachmann 2005-04-26 22:04 ` Spellcheck.adb David Sauvage 1 sibling, 1 reply; 23+ messages in thread From: Albert Bachmann @ 2005-04-26 21:21 UTC (permalink / raw) David, regarding your entry at the Shootout--it compiled and ran fine on my system. Timing is excellent: real 0m0.074s user 0m0.054s sys 0m0.012s I don't know the infrastructure used by the Shootout. I think there is a mailing list for questions like this. Regards, Albert. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Spellcheck.adb 2005-04-26 21:21 ` Spellcheck.adb Albert Bachmann @ 2005-04-26 22:04 ` David Sauvage 0 siblings, 0 replies; 23+ messages in thread From: David Sauvage @ 2005-04-26 22:04 UTC (permalink / raw) Hello again, Albert Bachmann wrote: > regarding your entry at the Shootout--it compiled and ran fine on my > system. Timing is excellent: > > real 0m0.074s > user 0m0.054s > sys 0m0.012s > Surprisely your timing is just twice faster than mine ;-) may be the 1.2Ghz to 2.4 Ghz ;-) Excellent to hear that it run well on your system too. > I don't know the infrastructure used by the Shootout. I think there is > a mailing list for questions like this. > yes thanks for your advise, i will try posting on the mailing list. David ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Shootout] Spellcheck.adb 2005-04-25 21:30 [Shootout] Spellcheck.adb albert.bachmann ` (3 preceding siblings ...) 2005-04-26 19:36 ` [Shootout] Spellcheck.adb David Sauvage @ 2005-04-26 20:36 ` Dmitry A. Kazakov 2005-04-26 21:50 ` David Sauvage 4 siblings, 1 reply; 23+ messages in thread From: Dmitry A. Kazakov @ 2005-04-26 20:36 UTC (permalink / raw) On 25 Apr 2005 14:30:42 -0700, albert.bachmann@gmx.de wrote: > on http://shootout.alioth.debian.org/ I discovered that the Ada entry > for the "spellcheck" benchmark produces an error according to the > information on that website. The specification for this benchmark can > be found here: > http://shootout.alioth.debian.org/benchmark.php?test=spellcheck&lang=all&sort=fullcpu > > Since I am new to Ada I decided that it might be a good idea to > implement a solution and see how it performs. Finally I came up with a > naive program that performs not very well in comparison to other > language implementations. I produced a naive implementation in C++ as > well which solves the problem in less than half the time. The entry for > C++ at the Shootout takes half the time of my naive solution. > > Now what I'd like to ask you is how I can speed up my Ada solution? As > I said I'm a novice and certainly there is a huge optimization > potential here. Here is the output of 'time': -------------- spell_check.adb ----------------- with Ada.Text_IO; use Ada.Text_IO; with Dictionary_Tables; use Dictionary_Tables; procedure Spell_Check is File : File_type; Line : String (1..130); Last : Natural; Dictionary : Table; begin Open (File, In_File, "spellcheck-dict.txt"); begin -- Reading dictionary loop Get_Line (File, Line, Last); Add (Dictionary, Line (Line'First..Last), true); end loop; exception when End_Error => null; end; Close (File); begin -- Reading test file loop Get_Line (Line, Last); begin if Find (Dictionary, Line (Line'First..Last)) then null; end if; exception when End_Error => Put_Line (Line (Line'First..Last)); end; end loop; exception when End_Error => null; end; end Spell_Check; ----------- dictionary_tables.adb ---------------------- with Tables; package Dictionary_Tables is new Tables (Boolean); The above is approximately five times faster on my computer under FC3. Probably, because it does not use hash tables? [ The package tables is here: http://www.dmitry-kazakov.de/ada/tables.htm ] -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Shootout] Spellcheck.adb 2005-04-26 20:36 ` [Shootout] Spellcheck.adb Dmitry A. Kazakov @ 2005-04-26 21:50 ` David Sauvage 2005-04-27 8:40 ` Dmitry A. Kazakov 0 siblings, 1 reply; 23+ messages in thread From: David Sauvage @ 2005-04-26 21:50 UTC (permalink / raw) Dmitry A. Kazakov wrote: > ... > The above is approximately five times faster on my computer under FC3. > Probably, because it does not use hash tables? > GNAT.Spitbol.Table generic use hash tables (i wonder if it could be called an hashed table maps method?), may be people would like to have a look on it's hash function (see #1) The published shootout Ada spellcheck using GNAT.Spitbol.Table : real 0m0.120s user 0m0.083s sys 0m0.007s (i run all the programs several times before each post ;-), so i get fresh time average each post) I try using the Tables generic package implementation (see #2) instead of GNAT.Spitbol.Table generic on the shootout Ada spellcheck implementation, which gave me : real 0m0.271s user 0m0.197s sys 0m0.014s system configuration : Debian GNU/Linux Sid - kernel 2.6.10 - 1.2 Ghz gcc 3-3, GNAT 3.15p-12 #1 the GNAT.Spitbol.Table generic hash function ------------------------------------------------ (see g-spitbo.adb) function Hash (Str : String) return Unsigned_32 is Result : Unsigned_32 := Str'Length; begin for J in Str'Range loop Result := Rotate_Left (Result, 1) + Unsigned_32 (Character'Pos (Str (J))); end loop; return Result; end Hash; #2 The Tables generic package : ------------------------------- (see previous post) > [ http://www.dmitry-kazakov.de/ada/tables.htm ] David ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Shootout] Spellcheck.adb 2005-04-26 21:50 ` David Sauvage @ 2005-04-27 8:40 ` Dmitry A. Kazakov 0 siblings, 0 replies; 23+ messages in thread From: Dmitry A. Kazakov @ 2005-04-27 8:40 UTC (permalink / raw) On Tue, 26 Apr 2005 23:50:12 +0200, David Sauvage wrote: > Dmitry A. Kazakov wrote: >> ... >> The above is approximately five times faster on my computer under FC3. >> Probably, because it does not use hash tables? > > GNAT.Spitbol.Table generic use hash tables (i wonder if it could be > called an hashed table maps method?), may be people would like to have a > look on it's hash function (see #1) > > The published shootout Ada spellcheck using GNAT.Spitbol.Table : > real 0m0.120s > user 0m0.083s > sys 0m0.007s > > (i run all the programs several times before each post ;-), so i get > fresh time average each post) > > I try using the Tables generic package implementation (see #2) instead > of GNAT.Spitbol.Table generic on the shootout Ada spellcheck > implementation, which gave me : > real 0m0.271s > user 0m0.197s > sys 0m0.014s Aha, now it is how it must be! (:-)) Hash table should be slower than sorted arrays on natural dictionaries. But because it is allocated in advance (40_000 items), that should give it an advantage over a dynamically allocated sorted array, when sizes of the dictionary and of the text are close: 1 x malloc + n x string compare >> n x (hash + integer compare), though string compare < hash + integer compare when n ~ 1 two times looks OK. [I can't stop wondering how badly wrong the benchmarks were designed] BTW, you can try to use streams in the second part of your code. That should bring another 10ms or so. -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 23+ messages in thread
end of thread, other threads:[~2005-04-29 12:35 UTC | newest] Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2005-04-25 21:30 [Shootout] Spellcheck.adb albert.bachmann 2005-04-25 21:57 ` Spellcheck.adb Albert Bachmann 2005-04-25 23:27 ` [Shootout] Spellcheck.adb Marius Amado Alves 2005-04-26 18:04 ` Spellcheck.adb Matthew Heaney 2005-04-26 19:36 ` [Shootout] Spellcheck.adb David Sauvage 2005-04-26 20:50 ` Spellcheck.adb Albert Bachmann 2005-04-27 15:11 ` Spellcheck.adb Matthew Heaney 2005-04-27 22:49 ` Spellcheck.adb Albert Bachmann 2005-04-27 23:35 ` Spellcheck.adb Marius Amado Alves 2005-04-27 23:58 ` Spellcheck.adb Albert Bachmann 2005-04-28 0:19 ` Spellcheck.adb Marius Amado Alves 2005-04-29 10:22 ` Spellcheck.adb Albert Bachmann 2005-04-28 0:58 ` Spellcheck.adb Matthew Heaney 2005-04-29 10:19 ` Spellcheck.adb Albert Bachmann 2005-04-28 0:41 ` Spellcheck.adb Matthew Heaney 2005-04-28 21:20 ` Spellcheck.adb Simon Wright 2005-04-29 10:05 ` Spellcheck.adb Albert Bachmann 2005-04-29 12:35 ` Spellcheck.adb Matthew Heaney 2005-04-26 21:21 ` Spellcheck.adb Albert Bachmann 2005-04-26 22:04 ` Spellcheck.adb David Sauvage 2005-04-26 20:36 ` [Shootout] Spellcheck.adb Dmitry A. Kazakov 2005-04-26 21:50 ` David Sauvage 2005-04-27 8:40 ` Dmitry A. Kazakov
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox