* [Shootout] Spellcheck.adb
@ 2005-04-25 21:30 albert.bachmann
2005-04-25 21:57 ` Spellcheck.adb Albert Bachmann
` (4 more replies)
0 siblings, 5 replies; 23+ messages in thread
From: albert.bachmann @ 2005-04-25 21:30 UTC (permalink / raw)
Dear newsgroup,
on http://shootout.alioth.debian.org/ I discovered that the Ada entry
for the "spellcheck" benchmark produces an error according to the
information on that website. The specification for this benchmark can
be found here:
http://shootout.alioth.debian.org/benchmark.php?test=spellcheck&lang=all&sort=fullcpu
Since I am new to Ada I decided that it might be a good idea to
implement a solution and see how it performs. Finally I came up with a
naive program that performs not very well in comparison to other
language implementations. I produced a naive implementation in C++ as
well which solves the problem in less than half the time. The entry for
C++ at the Shootout takes half the time of my naive solution.
Now what I'd like to ask you is how I can speed up my Ada solution? As
I said I'm a novice and certainly there is a huge optimization
potential here. Here is the output of 'time':
Ada:
real 0m0.742s
user 0m0.720s
sys 0m0.009s
C++:
real 0m0.217s
user 0m0.192s
sys 0m0.008s
First I post the C++ solution since it is much shorter and easy to
understand:
#include <map>
#include <string>
#include <fstream>
#include <iostream>
using namespace std;
int main()
{
map<string, bool> dict;
string key;
ifstream in("spellcheck-dict.txt");
while (getline(in, key))
dict[key] = true;
while (cin >> key)
{
if (!dict[key])
cout << key << endl;
}
}
And now spellcheck.adb:
with ada.text_io;
with ada.integer_text_io;
with gnat.htable;
use ada;
procedure spellcheck is
subtype word is string(1 .. 128);
subtype index is natural range 0 .. 2760;
file : text_io.file_type;
item : word;
item_offset : natural;
no_item : constant boolean := false;
function hash(item : word) return index is
a : constant positive := 127;
h : index := 0;
begin
for i in item'range loop
h := (a * h + character'pos(item(i))) mod index'last;
end loop;
return h;
end hash;
package dictionary is new gnat.htable.simple_htable(header_num =>
index,
element => boolean,
no_element => no_item,
key => word,
hash => hash,
equal => "=");
begin
text_io.open(file, text_io.in_file, "spellcheck-dict.txt");
while text_io.end_of_file(file) = false loop
text_io.get_line(file, item, item_offset);
strings.fixed.delete(item, item_offset + 1, item'last);
dictionary.set(item, true);
end loop;
while text_io.end_of_file = false loop
text_io.get_line(item, item_offset);
strings.fixed.delete(item, item_offset + 1, item'last);
if dictionary.get(item) = no_item then
text_io.put_line(item);
null;
end if;
end loop;
text_io.close(file);
end spellcheck;
Regards,
Albert
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Spellcheck.adb
2005-04-25 21:30 [Shootout] Spellcheck.adb albert.bachmann
@ 2005-04-25 21:57 ` Albert Bachmann
2005-04-25 23:27 ` [Shootout] Spellcheck.adb Marius Amado Alves
` (3 subsequent siblings)
4 siblings, 0 replies; 23+ messages in thread
From: Albert Bachmann @ 2005-04-25 21:57 UTC (permalink / raw)
That "null;" statement is a holdover from previous testing and should
be deleted.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Shootout] Spellcheck.adb
2005-04-25 21:30 [Shootout] Spellcheck.adb albert.bachmann
2005-04-25 21:57 ` Spellcheck.adb Albert Bachmann
@ 2005-04-25 23:27 ` Marius Amado Alves
2005-04-26 18:04 ` Spellcheck.adb Matthew Heaney
` (2 subsequent siblings)
4 siblings, 0 replies; 23+ messages in thread
From: Marius Amado Alves @ 2005-04-25 23:27 UTC (permalink / raw)
To: comp.lang.ada
At first sight the thing to improve is the hash function. The current
function seems too expensive. Some time ago I had this problem. I
invented a less expensive function with good dispersion. I don't have
the code with me now, but it was as follows. Treat both the input and
output values as bit strings X (1 .. M) and Y (1 .. N) respectively.
(Use representation clauses, packed arrays, and unchecked conversion
for this.) Make N small, say 16. Set Y bits to the value of X bits as
follows:
if M > N, Y (J) = X (1 + (J - 1) * M / N), for J in 1 .. N
if M = N, Y = X
if M < N, initialize Y to all zeros,
then let Y (1 + (I - 1) * N / M) = X (I), for I in 1 .. M
Another thing to improve may be the text input. Buffering with streams
has proved faster than Text_IO.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Spellcheck.adb
2005-04-25 21:30 [Shootout] Spellcheck.adb albert.bachmann
2005-04-25 21:57 ` Spellcheck.adb Albert Bachmann
2005-04-25 23:27 ` [Shootout] Spellcheck.adb Marius Amado Alves
@ 2005-04-26 18:04 ` Matthew Heaney
2005-04-26 19:36 ` [Shootout] Spellcheck.adb David Sauvage
2005-04-26 20:36 ` [Shootout] Spellcheck.adb Dmitry A. Kazakov
4 siblings, 0 replies; 23+ messages in thread
From: Matthew Heaney @ 2005-04-26 18:04 UTC (permalink / raw)
albert.bachmann@gmx.de wrote:
>
>
> First I post the C++ solution since it is much shorter and easy to
> understand:
The Ada solution based on the Ada 2005 standard container library isn't
much different from your C++ example:
with Hashed_Word_Sets; use Hashed_Word_Sets;
with Ada.Text_IO; use Ada.Text_IO;
procedure Spellcheck is
Words : Set;
Line : String (1 .. 132);
Last : Natural;
File : File_Type;
begin
Open (File, In_File, "spellcheck-dict.txt");
while not End_Of_File (File) loop
Get_Line (File, Line, Last);
Insert (Words, New_Item => Line (Line'First .. Last));
end loop;
while not End_Of_File loop
Get_Line (Line, Last);
if not Contains (Words, Item => Line (Line'First .. Last)) then
Put_Line (Line (Line'First .. Last));
end if;
end loop;
end Spellcheck;
<http://charles.tigris.org/source/browse/charles/src/ai302/examples/spellcheck/>
We did determine that the hash function doesn't perform well for small
strings, but I haven't fixed it yet. If you're desperate I can fix it
tonight.
-Matt
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Shootout] Spellcheck.adb
2005-04-25 21:30 [Shootout] Spellcheck.adb albert.bachmann
` (2 preceding siblings ...)
2005-04-26 18:04 ` Spellcheck.adb Matthew Heaney
@ 2005-04-26 19:36 ` David Sauvage
2005-04-26 20:50 ` Spellcheck.adb Albert Bachmann
2005-04-26 21:21 ` Spellcheck.adb Albert Bachmann
2005-04-26 20:36 ` [Shootout] Spellcheck.adb Dmitry A. Kazakov
4 siblings, 2 replies; 23+ messages in thread
From: David Sauvage @ 2005-04-26 19:36 UTC (permalink / raw)
Hi,
happy to read your post, also because i'm the folk who proposed the
spellcheck ada implementation :-), i don't undestand yet the shootout
spellcheck error as it works nicely on my system, does it failed on
yours too ? may be you could describe the errors here if so ? thanks
albert.bachmann@gmx.de wrote:
> Now what I'd like to ask you is how I can speed up my Ada solution?
You can find interesting discussion on the spellcheck implementation &
speed issue in the thread named "Ada Bench" of comp.lang.ada
you could use :
- Ada.Streams.Stream_IO.Read instead of Ada.Text_IO.Get_Line
- the shootout spellcheck C++ hash function as hash function
albert.bachmann@gmx.de wrote:
>
> real 0m0.742s
> user 0m0.720s
> sys 0m0.009s
>
Is it for N=1 ? could you describe your system configuration & the
command line used to launch the program ?
Well done, i was about 5 min on my first implementation ;-)
here is the shootout spellcheck implementation (for N=1) duration, you
can take a look on my system configuration on the Ada Bench thread :
real 0m0.126s
user 0m0.078s
sys 0m0.013s
To compare (just because we are talking about benchmark and how to speed
up our programs) here is your posted implementation on my system (i add
the line with Ada.Strings.Fixed; as "strings.fixed.delete" was not
handle by my compiler without it)
real 0m1.432s
user 0m1.361s
sys 0m0.014s
Your next version should be far far better ;-)
Hope it could help ;-)
David
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Shootout] Spellcheck.adb
2005-04-25 21:30 [Shootout] Spellcheck.adb albert.bachmann
` (3 preceding siblings ...)
2005-04-26 19:36 ` [Shootout] Spellcheck.adb David Sauvage
@ 2005-04-26 20:36 ` Dmitry A. Kazakov
2005-04-26 21:50 ` David Sauvage
4 siblings, 1 reply; 23+ messages in thread
From: Dmitry A. Kazakov @ 2005-04-26 20:36 UTC (permalink / raw)
On 25 Apr 2005 14:30:42 -0700, albert.bachmann@gmx.de wrote:
> on http://shootout.alioth.debian.org/ I discovered that the Ada entry
> for the "spellcheck" benchmark produces an error according to the
> information on that website. The specification for this benchmark can
> be found here:
> http://shootout.alioth.debian.org/benchmark.php?test=spellcheck&lang=all&sort=fullcpu
>
> Since I am new to Ada I decided that it might be a good idea to
> implement a solution and see how it performs. Finally I came up with a
> naive program that performs not very well in comparison to other
> language implementations. I produced a naive implementation in C++ as
> well which solves the problem in less than half the time. The entry for
> C++ at the Shootout takes half the time of my naive solution.
>
> Now what I'd like to ask you is how I can speed up my Ada solution? As
> I said I'm a novice and certainly there is a huge optimization
> potential here. Here is the output of 'time':
-------------- spell_check.adb -----------------
with Ada.Text_IO; use Ada.Text_IO;
with Dictionary_Tables; use Dictionary_Tables;
procedure Spell_Check is
File : File_type;
Line : String (1..130);
Last : Natural;
Dictionary : Table;
begin
Open (File, In_File, "spellcheck-dict.txt");
begin -- Reading dictionary
loop
Get_Line (File, Line, Last);
Add (Dictionary, Line (Line'First..Last), true);
end loop;
exception
when End_Error =>
null;
end;
Close (File);
begin -- Reading test file
loop
Get_Line (Line, Last);
begin
if Find (Dictionary, Line (Line'First..Last)) then
null;
end if;
exception
when End_Error =>
Put_Line (Line (Line'First..Last));
end;
end loop;
exception
when End_Error =>
null;
end;
end Spell_Check;
----------- dictionary_tables.adb ----------------------
with Tables;
package Dictionary_Tables is new Tables (Boolean);
The above is approximately five times faster on my computer under FC3.
Probably, because it does not use hash tables?
[ The package tables is here: http://www.dmitry-kazakov.de/ada/tables.htm ]
--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Spellcheck.adb
2005-04-26 19:36 ` [Shootout] Spellcheck.adb David Sauvage
@ 2005-04-26 20:50 ` Albert Bachmann
2005-04-27 15:11 ` Spellcheck.adb Matthew Heaney
2005-04-26 21:21 ` Spellcheck.adb Albert Bachmann
1 sibling, 1 reply; 23+ messages in thread
From: Albert Bachmann @ 2005-04-26 20:50 UTC (permalink / raw)
Hello David,
my system is a Fedora Core 3 Linux box with a Pentium 4 running at 2.4
GHz with 512MB RAM. I use GNAT from GCC-4.0.0. with -O3. All my timings
are for n = 1.
In the meantime I followed the advice from Matthew Heaney (thanks
Matthew) and went on with Ada.Containers. I again tried a simple
implementation. Unfortunately I recognized that controlled types have
to be instantiated at library level which requires now a total of 3
files since the instantiation of ada.containers.hashed_maps needs a
definite subtype for its key_type.
Anyway the resulting solution runs with the following times:
real 0m0.352s
user 0m0.318s
sys 0m0.015s
Here are the file contents:
types.ads:
----------
package types is
subtype word is string(1 .. 128);
end types;
hashmap.ads:
------------
with ada.strings.fixed;
with ada.strings.hash;
with ada.containers.hashed_maps;
with types;
package hashmap is new ada.containers.hashed_maps(key_type =>
types.word,
element_type => boolean,
hash => ada.strings.hash,
equivalent_keys => "=");
and finally spellcheck2.adb:
----------------------------
with ada.strings.fixed;
with ada.strings.hash;
with ada.text_io;
with ada.integer_text_io;
with hashmap;
with types;
use types;
use ada;
procedure spellcheck2 is
subtype index is natural range 0 .. 2760;
file : text_io.file_type;
item : word;
item_offset : natural;
dict : hashmap.map;
begin
text_io.open(file, text_io.in_file, "spellcheck-dict.txt");
while not text_io.end_of_file(file) loop
text_io.get_line(file, item, item_offset);
strings.fixed.delete(item, item_offset + 1, word'last);
hashmap.insert(dict, item, true);
end loop;
while not text_io.end_of_file loop
text_io.get_line(item, item_offset);
strings.fixed.delete(item, item_offset + 1, word'last);
if not hashmap.contains(dict, item) then
text_io.put_line(item);
end if;
end loop;
text_io.close(file);
end spellcheck2;
Please note that those attempts are only quick and dirty hacks. If
someone knows how to get rid of the seperate types.ads file I would
welcome his advice.
So far I have not tested your proposed solution but eventually I will
do so and let you know about any possible issues.
Regards,
Albert
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Spellcheck.adb
2005-04-26 19:36 ` [Shootout] Spellcheck.adb David Sauvage
2005-04-26 20:50 ` Spellcheck.adb Albert Bachmann
@ 2005-04-26 21:21 ` Albert Bachmann
2005-04-26 22:04 ` Spellcheck.adb David Sauvage
1 sibling, 1 reply; 23+ messages in thread
From: Albert Bachmann @ 2005-04-26 21:21 UTC (permalink / raw)
David,
regarding your entry at the Shootout--it compiled and ran fine on my
system. Timing is excellent:
real 0m0.074s
user 0m0.054s
sys 0m0.012s
I don't know the infrastructure used by the Shootout. I think there is
a mailing list for questions like this.
Regards,
Albert.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Shootout] Spellcheck.adb
2005-04-26 20:36 ` [Shootout] Spellcheck.adb Dmitry A. Kazakov
@ 2005-04-26 21:50 ` David Sauvage
2005-04-27 8:40 ` Dmitry A. Kazakov
0 siblings, 1 reply; 23+ messages in thread
From: David Sauvage @ 2005-04-26 21:50 UTC (permalink / raw)
Dmitry A. Kazakov wrote:
> ...
> The above is approximately five times faster on my computer under FC3.
> Probably, because it does not use hash tables?
>
GNAT.Spitbol.Table generic use hash tables (i wonder if it could be
called an hashed table maps method?), may be people would like to have a
look on it's hash function (see #1)
The published shootout Ada spellcheck using GNAT.Spitbol.Table :
real 0m0.120s
user 0m0.083s
sys 0m0.007s
(i run all the programs several times before each post ;-), so i get
fresh time average each post)
I try using the Tables generic package implementation (see #2) instead
of GNAT.Spitbol.Table generic on the shootout Ada spellcheck
implementation, which gave me :
real 0m0.271s
user 0m0.197s
sys 0m0.014s
system configuration :
Debian GNU/Linux Sid - kernel 2.6.10 - 1.2 Ghz
gcc 3-3, GNAT 3.15p-12
#1 the GNAT.Spitbol.Table generic hash function
------------------------------------------------
(see g-spitbo.adb)
function Hash (Str : String) return Unsigned_32 is
Result : Unsigned_32 := Str'Length;
begin
for J in Str'Range loop
Result := Rotate_Left (Result, 1) +
Unsigned_32 (Character'Pos (Str (J)));
end loop;
return Result;
end Hash;
#2 The Tables generic package :
-------------------------------
(see previous post)
> [ http://www.dmitry-kazakov.de/ada/tables.htm ]
David
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Spellcheck.adb
2005-04-26 21:21 ` Spellcheck.adb Albert Bachmann
@ 2005-04-26 22:04 ` David Sauvage
0 siblings, 0 replies; 23+ messages in thread
From: David Sauvage @ 2005-04-26 22:04 UTC (permalink / raw)
Hello again,
Albert Bachmann wrote:
> regarding your entry at the Shootout--it compiled and ran fine on my
> system. Timing is excellent:
>
> real 0m0.074s
> user 0m0.054s
> sys 0m0.012s
>
Surprisely your timing is just twice faster than mine ;-) may be the
1.2Ghz to 2.4 Ghz ;-)
Excellent to hear that it run well on your system too.
> I don't know the infrastructure used by the Shootout. I think there is
> a mailing list for questions like this.
>
yes thanks for your advise, i will try posting on the mailing list.
David
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Shootout] Spellcheck.adb
2005-04-26 21:50 ` David Sauvage
@ 2005-04-27 8:40 ` Dmitry A. Kazakov
0 siblings, 0 replies; 23+ messages in thread
From: Dmitry A. Kazakov @ 2005-04-27 8:40 UTC (permalink / raw)
On Tue, 26 Apr 2005 23:50:12 +0200, David Sauvage wrote:
> Dmitry A. Kazakov wrote:
>> ...
>> The above is approximately five times faster on my computer under FC3.
>> Probably, because it does not use hash tables?
>
> GNAT.Spitbol.Table generic use hash tables (i wonder if it could be
> called an hashed table maps method?), may be people would like to have a
> look on it's hash function (see #1)
>
> The published shootout Ada spellcheck using GNAT.Spitbol.Table :
> real 0m0.120s
> user 0m0.083s
> sys 0m0.007s
>
> (i run all the programs several times before each post ;-), so i get
> fresh time average each post)
>
> I try using the Tables generic package implementation (see #2) instead
> of GNAT.Spitbol.Table generic on the shootout Ada spellcheck
> implementation, which gave me :
> real 0m0.271s
> user 0m0.197s
> sys 0m0.014s
Aha, now it is how it must be! (:-))
Hash table should be slower than sorted arrays on natural dictionaries. But
because it is allocated in advance (40_000 items), that should give it an
advantage over a dynamically allocated sorted array, when sizes of the
dictionary and of the text are close:
1 x malloc + n x string compare >> n x (hash + integer compare),
though string compare < hash + integer compare
when n ~ 1 two times looks OK.
[I can't stop wondering how badly wrong the benchmarks were designed]
BTW, you can try to use streams in the second part of your code. That
should bring another 10ms or so.
--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Spellcheck.adb
2005-04-26 20:50 ` Spellcheck.adb Albert Bachmann
@ 2005-04-27 15:11 ` Matthew Heaney
2005-04-27 22:49 ` Spellcheck.adb Albert Bachmann
0 siblings, 1 reply; 23+ messages in thread
From: Matthew Heaney @ 2005-04-27 15:11 UTC (permalink / raw)
"Albert Bachmann" <albert.bachmann@gmx.de> wrote in message
news:1114548638.851249.246280@f14g2000cwb.googlegroups.com...
>
> In the meantime I followed the advice from Matthew Heaney (thanks
> Matthew) and went on with Ada.Containers. I again tried a simple
> implementation. Unfortunately I recognized that controlled types have
> to be instantiated at library level
This is no longer true in Ada 2005. Eventually you'll be able to
instantiate the containers in your main subprogram.
> which requires now a total of 3
> files since the instantiation of ada.containers.hashed_maps needs a
> definite subtype for its key_type.
True, but that's why we have the Indefinite_Hashed_Maps container package.
If you're manipulating strings, you should probably be using that.
Also, as I mentioned in an earlier post, you don't really need a map, since
all you're doing is performing a membership test. Hence, a (hashed) set
will do.
> types.ads:
> ----------
>
> package types is
> subtype word is string(1 .. 128);
> end types;
Get rid of this. Instantiate the Indefinite_Hashed_Sets (or _Maps, if you
prefer) with type String.
> hashmap.ads:
> ------------
>
> package hashmap is new ada.containers.hashed_maps(key_type =>
> types.word,
No. This should just be type String.
> and finally spellcheck2.adb:
> ----------------------------
>
> strings.fixed.delete(item, item_offset + 1, word'last);
You can get rid of this.
> hashmap.insert(dict, item, true);
If you used a (hashed) set, then all you'd have to say is
insert(dict, item);
> strings.fixed.delete(item, item_offset + 1, word'last);
You can get rid of this.
> if not hashmap.contains(dict, item) then
See, you're not using the map element at all. You're only using the key.
> Please note that those attempts are only quick and dirty hacks. If
> someone knows how to get rid of the seperate types.ads file I would
> welcome his advice.
You can get rid of the types file, by using the indefinite hashed map (or
set), and instantiating it with type String.
-Matt
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Spellcheck.adb
2005-04-27 15:11 ` Spellcheck.adb Matthew Heaney
@ 2005-04-27 22:49 ` Albert Bachmann
2005-04-27 23:35 ` Spellcheck.adb Marius Amado Alves
2005-04-28 0:41 ` Spellcheck.adb Matthew Heaney
0 siblings, 2 replies; 23+ messages in thread
From: Albert Bachmann @ 2005-04-27 22:49 UTC (permalink / raw)
On Wed, 27 Apr 2005 11:11:06 -0400, Matthew Heaney wrote:
>
> "Albert Bachmann" <albert.bachmann@gmx.de> wrote in message
> news:1114548638.851249.246280@f14g2000cwb.googlegroups.com...
>>
>> In the meantime I followed the advice from Matthew Heaney (thanks
>> Matthew) and went on with Ada.Containers. I again tried a simple
>> implementation. Unfortunately I recognized that controlled types have
>> to be instantiated at library level
>
> This is no longer true in Ada 2005. Eventually you'll be able to
> instantiate the containers in your main subprogram.
That may be true but with GNAT (as of GCC-4.0.0) I get an error:
gcc -c -gnat05 -O3 spellcheck3.adb
spellcheck3.adb:10:09: instantiation error at a-cohata.ads:29
spellcheck3.adb:10:09: instantiation error at a-cihase.ads:214
spellcheck3.adb:10:09: controlled type must be declared at the library level
gnatmake: "spellcheck3.adb" compilation error
>
>
>> which requires now a total of 3
>> files since the instantiation of ada.containers.hashed_maps needs a
>> definite subtype for its key_type.
>
> True, but that's why we have the Indefinite_Hashed_Maps container package.
> If you're manipulating strings, you should probably be using that.
>
> Also, as I mentioned in an earlier post, you don't really need a map, since
> all you're doing is performing a membership test. Hence, a (hashed) set
> will do.
>
When I use and instance of indefinite_hashed_sets or
indefinite_hashed_maps the program compiles fine but I get an exception
during runtime:
raised CONSTRAINT_ERROR : a-cihama.adb:443 explicit raise (for the map)
raised CONSTRAINT_ERROR : a-cihase.adb:375 explicit raise (for the set)
>> types.ads:
>> ----------
>>
>> package types is
>> subtype word is string(1 .. 128);
>> end types;
>
> Get rid of this. Instantiate the Indefinite_Hashed_Sets (or _Maps, if you
> prefer) with type String.
>
Ok.
>
>> hashmap.ads:
>> ------------
>>
>> package hashmap is new ada.containers.hashed_maps(key_type =>
>> types.word,
>
> No. This should just be type String.
>
Ok.
>
>> and finally spellcheck2.adb:
>> ----------------------------
>>
>> strings.fixed.delete(item, item_offset + 1, word'last);
>
> You can get rid of this.
I did.
>
>
>> hashmap.insert(dict, item, true);
>
> If you used a (hashed) set, then all you'd have to say is
>
> insert(dict, item);
>
>
>> strings.fixed.delete(item, item_offset + 1, word'last);
>
> You can get rid of this.
>
Here too.
>
>> if not hashmap.contains(dict, item) then
>
> See, you're not using the map element at all. You're only using the key.
>
>
>> Please note that those attempts are only quick and dirty hacks. If
>> someone knows how to get rid of the seperate types.ads file I would
>> welcome his advice.
>
> You can get rid of the types file, by using the indefinite hashed map (or
> set), and instantiating it with type String.
>
> -Matt
I post the modified source. Maybe I've overlooked something because as I
said above during runtime I get an exception. However thanks for those
valuable information Matthew!
hashed_set.ads:
--------------
with ada.containers.indefinite_hashed_sets;
with ada.strings.hash;
--with hash_function;
package hashed_set is new ada.containers.indefinite_hashed_set(
element_type => string,
hash => ada.strings.hash, --hash_function,
equivalent_keys => "=");
spellcheck3.adb:
----------------
with ada.strings.fixed;
with ada.text_io;
with hashed_set;
use ada;
procedure spellcheck3 is
file : text_io.file_type;
item : string(1 .. 128);
item_offset : natural;
dict : hashed_set.set;
begin
text_io.open(file, text_io.in_file, "spellcheck-dict.txt");
while not text_io.end_of_file(file) loop
text_io.get_line(file, item, item_offset);
hashed_set.insert(dict, item);
end loop;
while not text_io.end_of_file loop
text_io.get_line(item, item_offset);
if not hashed_set.contains(dict, item) then
text_io.put_line(item);
end if;
end loop;
text_io.close(file);
end spellcheck3;
Regards,
Albert
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Spellcheck.adb
2005-04-27 22:49 ` Spellcheck.adb Albert Bachmann
@ 2005-04-27 23:35 ` Marius Amado Alves
2005-04-27 23:58 ` Spellcheck.adb Albert Bachmann
2005-04-28 0:41 ` Spellcheck.adb Matthew Heaney
1 sibling, 1 reply; 23+ messages in thread
From: Marius Amado Alves @ 2005-04-27 23:35 UTC (permalink / raw)
To: Albert Bachmann; +Cc: comp.lang.ada
> ...
> while not text_io.end_of_file(file) loop
> text_io.get_line(file, item, item_offset);
> hashed_set.insert(dict, item);
Don't pass the whole item, but the slice item (1 .. item_offset).
> ...
> while not text_io.end_of_file loop
> text_io.get_line(item, item_offset);
> if not hashed_set.contains(dict, item) then
> text_io.put_line(item);
>
Ditto.
(And item_offset should be called item_length or similar.)
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Spellcheck.adb
2005-04-27 23:35 ` Spellcheck.adb Marius Amado Alves
@ 2005-04-27 23:58 ` Albert Bachmann
2005-04-28 0:19 ` Spellcheck.adb Marius Amado Alves
2005-04-28 0:58 ` Spellcheck.adb Matthew Heaney
0 siblings, 2 replies; 23+ messages in thread
From: Albert Bachmann @ 2005-04-27 23:58 UTC (permalink / raw)
On Thu, 28 Apr 2005 00:35:21 +0100, Marius Amado Alves wrote:
>> ...
>> while not text_io.end_of_file(file) loop
>> text_io.get_line(file, item, item_offset);
>> hashed_set.insert(dict, item);
>
> Don't pass the whole item, but the slice item (1 .. item_offset).
>
>> ...
>> while not text_io.end_of_file loop
>> text_io.get_line(item, item_offset);
>> if not hashed_set.contains(dict, item) then
>> text_io.put_line(item);
>>
>
> Ditto.
>
> (And item_offset should be called item_length or similar.)
Yes. That's it. I should have known. And timing is not bad with hashed_set:
-- with custom hash function:
real 0m0.119s
user 0m0.101s
sys 0m0.009s
-- with ada.strings.hash:
real 0m0.130s
user 0m0.116s
sys 0m0.005s
Using streams makes for another 10ms. But the GNAT.Spitbol.Table
still seems to be the fastest version so far.
Regards,
Albert
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Spellcheck.adb
2005-04-27 23:58 ` Spellcheck.adb Albert Bachmann
@ 2005-04-28 0:19 ` Marius Amado Alves
2005-04-29 10:22 ` Spellcheck.adb Albert Bachmann
2005-04-28 0:58 ` Spellcheck.adb Matthew Heaney
1 sibling, 1 reply; 23+ messages in thread
From: Marius Amado Alves @ 2005-04-28 0:19 UTC (permalink / raw)
To: comp.lang.ada
> -- with custom hash function:
> real 0m0.119s
> user 0m0.101s
> sys 0m0.009s
>
> -- with ada.strings.hash:
> real 0m0.130s
> user 0m0.116s
> sys 0m0.005s
Have you tried my bit-based hash function?
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Spellcheck.adb
2005-04-27 22:49 ` Spellcheck.adb Albert Bachmann
2005-04-27 23:35 ` Spellcheck.adb Marius Amado Alves
@ 2005-04-28 0:41 ` Matthew Heaney
2005-04-28 21:20 ` Spellcheck.adb Simon Wright
2005-04-29 10:05 ` Spellcheck.adb Albert Bachmann
1 sibling, 2 replies; 23+ messages in thread
From: Matthew Heaney @ 2005-04-28 0:41 UTC (permalink / raw)
Albert Bachmann <albert.bachmann@gmx.de> writes:
> When I use and instance of indefinite_hashed_sets or
> indefinite_hashed_maps the program compiles fine but I get an exception
> during runtime:
>
> raised CONSTRAINT_ERROR : a-cihama.adb:443 explicit raise (for the map)
> raised CONSTRAINT_ERROR : a-cihase.adb:375 explicit raise (for the set)
What version of the container packages are you using? Either you found
a bug in the sources you're using, or you're doing something wrong.
Can you send me (or just post) the portion of the code that surrounds
the line numbers in the exception message above?
Thanks,
Matt
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Spellcheck.adb
2005-04-27 23:58 ` Spellcheck.adb Albert Bachmann
2005-04-28 0:19 ` Spellcheck.adb Marius Amado Alves
@ 2005-04-28 0:58 ` Matthew Heaney
2005-04-29 10:19 ` Spellcheck.adb Albert Bachmann
1 sibling, 1 reply; 23+ messages in thread
From: Matthew Heaney @ 2005-04-28 0:58 UTC (permalink / raw)
Albert Bachmann <albert.bachmann@gmx.de> writes:
> Yes. That's it. I should have known. And timing is not bad with hashed_set:
>
> -- with custom hash function:
What is your custom hash algorithm?
> -- with ada.strings.hash:
I just checked in Pascal Obry's improved algorithm for ada.strings.hash.
See if it does any better on your input.
> Using streams makes for another 10ms. But the GNAT.Spitbol.Table still
> seems to be the fastest version so far.
What were those numbers again?
Regards,
Matt
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Spellcheck.adb
2005-04-28 0:41 ` Spellcheck.adb Matthew Heaney
@ 2005-04-28 21:20 ` Simon Wright
2005-04-29 10:05 ` Spellcheck.adb Albert Bachmann
1 sibling, 0 replies; 23+ messages in thread
From: Simon Wright @ 2005-04-28 21:20 UTC (permalink / raw)
Matthew Heaney <matthewjheaney@earthlink.net> writes:
> Albert Bachmann <albert.bachmann@gmx.de> writes:
>
> > When I use and instance of indefinite_hashed_sets or
> > indefinite_hashed_maps the program compiles fine but I get an exception
> > during runtime:
> >
> > raised CONSTRAINT_ERROR : a-cihama.adb:443 explicit raise (for the map)
> > raised CONSTRAINT_ERROR : a-cihase.adb:375 explicit raise (for the set)
>
> What version of the container packages are you using? Either you found
> a bug in the sources you're using, or you're doing something wrong.
What revision of the tigris.org sources is in GCC 4.0.0? (I was most
impressed to find them there!)
With 4.0.0 on Mac OS X 10.3.9, I have managed (after a few minutes) to
run test_maps with no assertion failures.
I couldn't see instructions for running the tests? If in ai302/testdir,
* to compile with assertions against the 4.0.0 Ada.Containers,
gnatmake -gnat05 -gnata test_maps
* To compile with assertions against the local Ada.Containers,
gnatmake -gnat05 -gnata -I.. -a test_maps
(the -a says rebuild runtime components if necessary).
--
Simon Wright 100% Ada, no bugs.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Spellcheck.adb
2005-04-28 0:41 ` Spellcheck.adb Matthew Heaney
2005-04-28 21:20 ` Spellcheck.adb Simon Wright
@ 2005-04-29 10:05 ` Albert Bachmann
2005-04-29 12:35 ` Spellcheck.adb Matthew Heaney
1 sibling, 1 reply; 23+ messages in thread
From: Albert Bachmann @ 2005-04-29 10:05 UTC (permalink / raw)
On Thu, 28 Apr 2005 00:41:22 +0000, Matthew Heaney wrote:
> Albert Bachmann <albert.bachmann@gmx.de> writes:
>
>> When I use and instance of indefinite_hashed_sets or
>> indefinite_hashed_maps the program compiles fine but I get an exception
>> during runtime:
>>
>> raised CONSTRAINT_ERROR : a-cihama.adb:443 explicit raise (for the map)
>> raised CONSTRAINT_ERROR : a-cihase.adb:375 explicit raise (for the set)
>
> What version of the container packages are you using? Either you found
> a bug in the sources you're using, or you're doing something wrong.
>
> Can you send me (or just post) the portion of the code that surrounds
> the line numbers in the exception message above?
>
> Thanks,
> Matt
Well, Marius Amado Alves advised me not to pass the whole string to the
hashed_set (see the code in my post you replied to). I get no error
anymore then. Do you consider this an error of Ada.Containers despite
this? I assumed it was solely my fault.
Regards,
Albert
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Spellcheck.adb
2005-04-28 0:58 ` Spellcheck.adb Matthew Heaney
@ 2005-04-29 10:19 ` Albert Bachmann
0 siblings, 0 replies; 23+ messages in thread
From: Albert Bachmann @ 2005-04-29 10:19 UTC (permalink / raw)
On Thu, 28 Apr 2005 00:58:29 +0000, Matthew Heaney wrote:
> Albert Bachmann <albert.bachmann@gmx.de> writes:
>
>> Yes. That's it. I should have known. And timing is not bad with hashed_set:
>>
>> -- with custom hash function:
>
> What is your custom hash algorithm?
>
Just the one used in the C++ entry. With "custom" I just wanted to
distinguish it from Ada.Strings.Hash.
with ada.containers;
use ada.containers;
function hash_function(key : string) return hash_type is
n : hash_type := 0;
begin
for i in key'range loop
n := (n * 5 + character'pos(key(i)));
end loop;
return n;
end hash_function;
>
>> -- with ada.strings.hash:
>
> I just checked in Pascal Obry's improved algorithm for ada.strings.hash.
> See if it does any better on your input.
>
>
>> Using streams makes for another 10ms. But the GNAT.Spitbol.Table still
>> seems to be the fastest version so far.
>
> What were those numbers again?
>
> Regards,
> Matt
With David Sauvage's version (with uses GNAT.Spitbol.Table) I get the
following times:
real 0m0.074s
user 0m0.054s
sys 0m0.012s
I have no specific times for the streams version at hand since I deleted
it already. But I remember that back then the difference was in the range
of 10ms. Note that David Sauvage's version uses streams which contribute
to his excellent results.
http://charles.tigris.org/source/browse/charles/src/ai302/examples/spellcheck/
still shows only this initial revisions and the hash function there is the
same as the one I used.
Regards,
Albert
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Spellcheck.adb
2005-04-28 0:19 ` Spellcheck.adb Marius Amado Alves
@ 2005-04-29 10:22 ` Albert Bachmann
0 siblings, 0 replies; 23+ messages in thread
From: Albert Bachmann @ 2005-04-29 10:22 UTC (permalink / raw)
On Thu, 28 Apr 2005 01:19:03 +0100, Marius Amado Alves wrote:
>> -- with custom hash function:
>> real 0m0.119s
>> user 0m0.101s
>> sys 0m0.009s
>>
>> -- with ada.strings.hash:
>> real 0m0.130s
>> user 0m0.116s
>> sys 0m0.005s
>
> Have you tried my bit-based hash function?
Not yet ;-) But if time permits I will try it during the next days.
Regards,
Albert
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Spellcheck.adb
2005-04-29 10:05 ` Spellcheck.adb Albert Bachmann
@ 2005-04-29 12:35 ` Matthew Heaney
0 siblings, 0 replies; 23+ messages in thread
From: Matthew Heaney @ 2005-04-29 12:35 UTC (permalink / raw)
Albert Bachmann <albert.bachmann@gmx.de> writes:
> Well, Marius Amado Alves advised me not to pass the whole string to
> the hashed_set (see the code in my post you replied to). I get no
> error anymore then. Do you consider this an error of Ada.Containers
> despite this? I assumed it was solely my fault.
The exception behavior you observed is in harmony with what's described
in the latest AI-302 draft.
^ permalink raw reply [flat|nested] 23+ messages in thread
end of thread, other threads:[~2005-04-29 12:35 UTC | newest]
Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2005-04-25 21:30 [Shootout] Spellcheck.adb albert.bachmann
2005-04-25 21:57 ` Spellcheck.adb Albert Bachmann
2005-04-25 23:27 ` [Shootout] Spellcheck.adb Marius Amado Alves
2005-04-26 18:04 ` Spellcheck.adb Matthew Heaney
2005-04-26 19:36 ` [Shootout] Spellcheck.adb David Sauvage
2005-04-26 20:50 ` Spellcheck.adb Albert Bachmann
2005-04-27 15:11 ` Spellcheck.adb Matthew Heaney
2005-04-27 22:49 ` Spellcheck.adb Albert Bachmann
2005-04-27 23:35 ` Spellcheck.adb Marius Amado Alves
2005-04-27 23:58 ` Spellcheck.adb Albert Bachmann
2005-04-28 0:19 ` Spellcheck.adb Marius Amado Alves
2005-04-29 10:22 ` Spellcheck.adb Albert Bachmann
2005-04-28 0:58 ` Spellcheck.adb Matthew Heaney
2005-04-29 10:19 ` Spellcheck.adb Albert Bachmann
2005-04-28 0:41 ` Spellcheck.adb Matthew Heaney
2005-04-28 21:20 ` Spellcheck.adb Simon Wright
2005-04-29 10:05 ` Spellcheck.adb Albert Bachmann
2005-04-29 12:35 ` Spellcheck.adb Matthew Heaney
2005-04-26 21:21 ` Spellcheck.adb Albert Bachmann
2005-04-26 22:04 ` Spellcheck.adb David Sauvage
2005-04-26 20:36 ` [Shootout] Spellcheck.adb Dmitry A. Kazakov
2005-04-26 21:50 ` David Sauvage
2005-04-27 8:40 ` Dmitry A. Kazakov
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox