From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level: 
X-Spam-Status: No, score=0.7 required=5.0 tests=BAYES_00,INVALID_DATE,
	MSGID_SHORT,REPLYTO_WITHOUT_TO_CC autolearn=no autolearn_force=no
	version=3.4.4
Path: utzoo!mnetor!uunet!lll-winken!lll-lcc!ames!pasteur!ucbvax!hplabs!sdcrdcf!trwrb!trwrc!agnew
From: agnew@trwrc.UUCP (R.A. Agnew)
Newsgroups: comp.lang.ada
Subject: Re: Hashing Function wanted
Message-ID: <239@trwrc.UUCP>
Date: 29 Feb 88 21:29:05 GMT
References: <8802231548.AA16212@ajpo.sei.cmu.edu>
Reply-To: agnew@trwrc.UUCP (R.A. Agnew)
Organization: TRW/MEAD San Diego, Ca.
List-Id: <comp.lang.ada>

In article <8802231548.AA16212@ajpo.sei.cmu.edu> "EDWARD CRAGG" <ecragg@gmuvax.gmu.edu> writes:
>Does anyone have either ideas or experience related to a hashing
>function in Ada for converting a string of unspecified length
>the type conversion using attributes (to keep totaly machine 
>independent) is eating me alive.  Does anybody know what kind of
>
    If using character'pos(s(i)) is generating slow code, you need a new compiler.
This should only be overhead for the compiler but should generate very little code.
I generally define my own enumeration type, say Identifier_Char and type convert
with Identifier_Char'Pos(s(i)). I just benched a minimally perfect hash code for 
Ada reserved words using this technique on a Sun3/75 with the Telesoft Telegen2
compiler and got 133 microseconds worst case (case where tokens are always Ada
reserved words).

	Just to check the above, I put 10000 character'pos conversions in a loop
and timed them under Unix and got 0.0 seconds on all three times. In order to
get a better comparison, I wrote the following two subprograms:


procedure test_mod is

	i: positive;
	n: natural := 0;
	c: Character;

begin
	for i in 1..10000 loop
		for c in 'A' .. 'Z' loop
			n := n mod 16381;
		end loop;
	end loop;
end test_mod;

procedure test_pos is

	i: positive;
	n: natural := 0;
	c: Character;

begin
	for i in 1..10000 loop
		for c in 'A' .. 'Z' loop
			n := (n + Character'Pos(c)) mod 16381;
		end loop;
	end loop;
end test_pos;

The only difference between the two is the character to integer conversion and
so the difference in execution times between them is the cost of ten thousand
conversions. I did not use the optimizer. They both timed out between 1.9 and
2.0 seconds elapsed time on the Sun3/75 so I still claim that the compiler
generates no code due to the type conversion.  Perhaps your hash agolrithm is
inherently slow. I have developed a very efficient hashcode algorithm for arbitrary
length strings which is based on Fermat and Mersenne primes and uses only shifts
and integer addition.

    And no, that's not the reason why the Ada compilers are so slow. The language
that they are compiling is just a couple orders of magnitude more complex than 
most others. To appreciate this, visualize writing a compiler to optimize code
generated by nested generics instantiated on a data type like a complex tensor!