From mboxrd@z Thu Jan  1 00:00:00 1970
X-Google-Thread: 103376,64fc02e079586f1b,start
X-Google-Attributes: gid103376,public
X-Google-Language: ENGLISH,ASCII-7-bit
Path: 
 g2news1.google.com!postnews.google.com!f14g2000cwb.googlegroups.com!not-for-mail
From: albert.bachmann@gmx.de
Newsgroups: comp.lang.ada
Subject: [Shootout] Spellcheck.adb
Date: 25 Apr 2005 14:30:42 -0700
Organization: http://groups.google.com
Message-ID: <1114464642.518876.137610@f14g2000cwb.googlegroups.com>
NNTP-Posting-Host: 62.227.18.117
Mime-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
X-Trace: posting.google.com 1114464647 6753 127.0.0.1 (25 Apr 2005 21:30:47
 GMT)
X-Complaints-To: groups-abuse@google.com
NNTP-Posting-Date: Mon, 25 Apr 2005 21:30:47 +0000 (UTC)
User-Agent: G2/0.2
Complaints-To: groups-abuse@google.com
Injection-Info: f14g2000cwb.googlegroups.com; posting-host=62.227.18.117;
   posting-account=f7MwlQ0AAADH3JnxnSY_MnjSYBkvfEa7
Xref: g2news1.google.com comp.lang.ada:10702
Date: 2005-04-25T14:30:42-07:00
List-Id: <comp.lang.ada>

Dear newsgroup,

On http://shootout.alioth.debian.org/ I noticed that the Ada entry
for the "spellcheck" benchmark is reported as producing an error. The
specification for this benchmark can be found here:
http://shootout.alioth.debian.org/benchmark.php?test=spellcheck&lang=all&sort=fullcpu

Since I am new to Ada, I decided it would be a good exercise to
implement a solution and see how it performs. I ended up with a
naive program that does not perform very well compared to other
language implementations. I also wrote a naive implementation in C++,
which solves the problem in less than half the time. The C++ entry at
the Shootout takes half the time of my naive solution.

What I'd like to ask is: how can I speed up my Ada solution? As
I said, I'm a novice, and there is certainly a lot of optimization
potential here. Here is the output of 'time':

Ada:

real    0m0.742s
user    0m0.720s
sys     0m0.009s

C++:

real    0m0.217s
user    0m0.192s
sys     0m0.008s


First the C++ solution, since it is much shorter and easier to
understand:

#include <map>
#include <string>
#include <fstream>
#include <iostream>

using namespace std;

int main()
{
	map<string, bool> dict;
	string key;

	ifstream in("spellcheck-dict.txt");

	while (getline(in, key))
		dict[key] = true;

	while (cin >> key)
	{
		if (!dict[key])
			cout << key << endl;
	}
}
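
One detail of the listing above: dict[key] on a std::map inserts a default entry whenever the key is missing, so every misspelled word is also added to the map. The following is only a sketch of a membership-only variant, assuming a hash set is acceptable for the benchmark; the helper names load_dictionary and misspelled are made up for illustration and are not part of the Shootout entry.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper: read one word per line into a hash set.
std::unordered_set<std::string> load_dictionary(std::istream& in)
{
    std::unordered_set<std::string> dict;
    std::string word;
    while (std::getline(in, word))
        dict.insert(word);
    return dict;
}

// Hypothetical helper: collect the whitespace-separated words from `in`
// that are not in `dict`, in input order. find() never modifies the set,
// unlike operator[] on std::map.
std::vector<std::string> misspelled(const std::unordered_set<std::string>& dict,
                                    std::istream& in)
{
    std::vector<std::string> out;
    std::string word;
    while (in >> word)
        if (dict.find(word) == dict.end())
            out.push_back(word);
    return out;
}
```

A main() would open "spellcheck-dict.txt" with std::ifstream, pass it to load_dictionary, and print each element returned by misspelled(dict, std::cin). Whether this is actually faster than the std::map version would have to be measured.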


And now spellcheck.adb:


with ada.text_io;
with ada.integer_text_io;
with ada.strings.fixed;  -- needed for strings.fixed.delete below
with gnat.htable;
use  ada;

procedure spellcheck is

	subtype word is string(1 .. 128);
	subtype index is natural range 0 .. 2760;

	file        : text_io.file_type;
	item        : word;
	item_offset : natural;
	no_item     : constant boolean := false;

	function hash(item : word) return index is
		a : constant positive := 127;
		h : index := 0;
	begin
		for i in item'range loop
			h := (a * h + character'pos(item(i))) mod index'last;
		end loop;

		return h;
	end hash;

	package dictionary is new gnat.htable.simple_htable(header_num => index,
							element => boolean,
							no_element => no_item,
							key => word,
							hash => hash,
							equal => "=");

begin
	text_io.open(file, text_io.in_file, "spellcheck-dict.txt");

	while text_io.end_of_file(file) = false loop
		text_io.get_line(file, item, item_offset);
		strings.fixed.delete(item, item_offset + 1, item'last);
		dictionary.set(item, true);
	end loop;

	while text_io.end_of_file = false loop
		text_io.get_line(item, item_offset);
		strings.fixed.delete(item, item_offset + 1, item'last);
		if dictionary.get(item) = no_item then
			-- print only the word itself, not the blank padding
			text_io.put_line(item(1 .. item_offset));
		end if;
	end loop;
	
	text_io.close(file);

end spellcheck;
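
Two properties of the hash function above may matter for speed. It loops over item'range, so all 128 characters of the blank-padded buffer are hashed for every word, and it reduces with mod index'last, so bucket 2760 is never used. The sketch below restates the same polynomial scheme in C++ (to match the other listing) but hashes only the characters that were actually read. The names hash_prefix, pad_to and kBuckets are hypothetical, and whether the hash is really the bottleneck is an assumption that would need profiling.

```cpp
#include <cstddef>
#include <string>

// Bucket count for an index range of 0 .. 2760 (assumed from the Ada subtype).
constexpr unsigned kBuckets = 2761;

// Polynomial hash with multiplier 127, as in spellcheck.adb, but limited to
// the first `len` characters and reduced modulo the number of buckets.
unsigned hash_prefix(const std::string& buf, std::size_t len)
{
    unsigned h = 0;
    for (std::size_t i = 0; i < len && i < buf.size(); ++i)
        h = (127u * h + static_cast<unsigned char>(buf[i])) % kBuckets;
    return h;
}

// Mimic the fixed-length Ada string: pad `word` with blanks up to `width`.
std::string pad_to(const std::string& word, std::size_t width)
{
    std::string out = word;
    if (out.size() < width)
        out.append(width - out.size(), ' ');
    return out;
}
```

With this shape, hash_prefix(pad_to("cat", 128), 3) does three iterations instead of 128. In the Ada program the equivalent change would require the key to carry its length (for example item_offset) alongside the buffer, which is a design change rather than a drop-in fix.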


Regards,
Albert