comp.lang.ada
 help / color / mirror / Atom feed
From: "John B. Matthews" <nospam@nospam.invalid>
Subject: Re: find words that contains some specific letters
Date: Sun, 07 Jun 2009 15:30:31 -0400
Date: 2009-06-07T15:30:31-04:00	[thread overview]
Message-ID: <nospam-5B13F6.15303107062009@news.aioe.org> (raw)
In-Reply-To: alpine.DEB.1.10.0906070035110.5405@urchin.earth.li

In article <alpine.DEB.1.10.0906070035110.5405@urchin.earth.li>,
 Tom Anderson <twic@urchin.earth.li> wrote:

> > John B. Matthews wrote:
> >> The jumble algorithm relies on a hashed map for efficiency, but a 
> >> perfect hash is not essential. I recently implemented the 
> >> algorithm in both Java and Ada using library routines:
> >> 
> >> <http://en.wikipedia.org/wiki/Jumble>
> >> <http://sites.google.com/site/drjohnbmatthews/jumble>
> >> <http://home.roadrunner.com/~jbmatthews/jumble.html>
> >> 
> >> I was intrigued to compare the two languages' library 
> >> implementations of a general purpose string hash. Using a 
> >> multiplicative function with an initial value of zero, both offer 
> >> excellent performance. Java uses a multiplier of 31 on a signed 
> >> 32-bit integer; Ada uses a multiplier of eight on an unsigned 
> >> (modular) value whose size is implementation defined (e.g. mod 
> >> 2**32):
> >> 
> >> <http://www.docjar.org/html/api/java/lang/String.java.html>
> >> <http://gcc.gnu.org/viewcvs/trunk/gcc/ada/a-strhas.adb>
> 
> John, i think you've missed a trick there - doesn't Rotate_Left do a 
> barrel shift, where the bits that fall off the left-hand end get 
> shoved back in at the right, rather than a plain shift? If so, the 
> basic algebraic structure is actually a bit different to java's hash.

D'oh, you're absolutely right. I overlooked this when I was tinkering 
with my own hash function, too. Thank you for looking at this.

[...]

> I still think this is a shoddy hash.

Well, it's a legacy from GNAT.HTable.Hash, but it does very well in the 
context of Ada.Containers.Hashed_Maps. Java uses a bucket list whose 
size is an integral power of two, while (this implementation of) Ada 
uses the next largest prime. Modeling an Ada collision counter on your 
example <http://urchin.earth.li/~twic/tmp/Collisions.java>, I got these 
results for collision count, number of buckets having that count, and 
the percentage of the total:

$ ./collisions 
 0: 218586 (55.6%)
 1: 126250 (32.1%)
 2: 38432 (9.8%)
 3: 8362 (2.1%)
 4: 1354 (0.3%)
 5: 229 (0.1%)
 6: 24 (0.0%)
 7: 4 (0.0%)
 8: 1 (0.0%)

The code is here: <http://home.roadrunner.com/~jbmatthews/jumble.html>

[Followup-To set to comp.lang.ada.]
-- 
John B. Matthews
trashgod at gmail dot com
<http://sites.google.com/site/drjohnbmatthews>



      reply	other threads:[~2009-06-07 19:30 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <edaeda6b-3793-48f2-8f8f-6c9701759de5@r33g2000yqn.googlegroups.com>
     [not found] ` <f350aaae-d045-450d-b255-308eb7fb8f8a@e24g2000vbe.googlegroups.com>
     [not found]   ` <d121e6f6-6afa-43e7-bcd1-c9a770e12704@z9g2000yqi.googlegroups.com>
     [not found]     ` <nospam-83CE7B.07382201062009@news.aioe.org>
     [not found]       ` <78i1lbF1m69gkU1@mid.individual.net>
     [not found]         ` <78i2ajF1m4nglU1@mid.individual.net>
     [not found]           ` <h00l40$r5h$1@news.albasani.net>
     [not found]             ` <78i4i2F1magkfU1@mid.individual.net>
     [not found]               ` <3f1d007f-bcae-42b4-afb0-215b18f51b9c@n21g2000vba.googlegroups.com>
     [not found]                 ` <78ib19F1mfh7qU1@mid.individual.net>
     [not found]                   ` <3b519936-3db9-4dad-85ba-371fa4b29c8f@z5g2000vba.googlegroups.com>
     [not found]                     ` <78quiaF1n95fsU1@mid.individual.net>
     [not found]                       ` <h09q6f$41t$1@news.albasani.net>
     [not found]                         ` <MJWdnUzVI6hADrXXnZ2dnUVZ_q2dnZ2d@earthlink.com>
     [not found]                           ` <78rpbkF1mqrc7U1@mid.individual.net>
2009-06-06 16:13                             ` find words that contains some specific letters John B. Matthews
2009-06-06 16:22                               ` Patricia Shanahan
2009-06-06 20:42                                 ` Giovanni Azua
2009-06-06 23:43                                   ` Giovanni Azua
2009-06-06 17:11                               ` Mark Space
2009-06-06 18:23                               ` Lew
2009-06-06 23:58                                 ` Tom Anderson
2009-06-07 19:30                                   ` John B. Matthews [this message]
replies disabled

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox