comp.lang.ada
 help / color / mirror / Atom feed
From: "Giovanni Azua" <bravegag@hotmail.com>
Subject: Re: find words that contains some specific letters
Date: Sat, 6 Jun 2009 22:42:45 +0200
Date: 2009-06-06T22:42:45+02:00	[thread overview]
Message-ID: <7902mcF1nf9ngU1@mid.individual.net> (raw)
In-Reply-To: se-dnWhNM6ZWCrfXnZ2dnUVZ_vmdnZ2d@earthlink.com


"Patricia Shanahan" <pats@acm.org> wrote:
>>>> A perfect hash function is one that guarantees distinct hash codes for 
>>>> a specific set of keys. There are algorithms for constructing one, 
>>>> given a set of keys.

"Giovanni Azua" <bravegag@hotmail.com> wrote:
>>> For example the Character class.

"John B. Matthews" <nospam@nospam.invalid> wrote:
>> I think all of the primitive wrappers (except Double) have trivially 
>> perfect integer hashes, as they wrap distinct values. Patricia was 
>> perhaps referring to algorithms of the kind discussed here:
>>
>> <http://en.wikipedia.org/wiki/Perfect_hash_function>

"Patricia Shanahan" <pats@acm.org> wrote:
> Yes, that's the sort of thing I meant.
>
> The Long hash code is also not perfect, and cannot be, because there are
> more longs than ints.
>

To achieve the perfect, optimal and ideal no collisions Hash table "perfect 
Hash table" these are the two factors that play a role:

1-. the "perfect" property of the hash function as defined in the wiki page 
cited above
2-. the size of the elements set

If #2 goes arbitrarily large meaning allocating a hash table of the size of 
that elements set becomes prohibitive memory-wise then it does not really 
matter how perfect the hash function is, there will be collisions anyway and 
an additional search algorithm will be needed to disambiguate the colliding 
elements that fall under the same buckets. This is what I think John 
Matthews hasn't realized yet about his implementation of the Jumble 
algorithm. For a realistic dataset he will be returning elements that do not 
match the of input (String built on top of the sorted character input) but 
matches that happen to collide under the same bucket.

e.g. trying to allocate a Hash table of the size ( Integer.MAX_VALUE - 
Integer.MIN_VALUE) would be prohibitive unless you place unrealistic 
requirements so you have the Integer with a perfect hash function but you 
will not get a "perfect Hash table". It is of course worse for the String 
class where neither #1 nor #2 are fulfilled.

Character class offers the perfect hash function but since the elements set 
is discrete i.e. 26 elements http://en.wikipedia.org/wiki/English_alphabet 
you can build a perfect Hash structure like the one I described and edited 
by Tom McGlynn.

Best regards,
Giovanni 





  reply	other threads:[~2009-06-06 20:42 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <edaeda6b-3793-48f2-8f8f-6c9701759de5@r33g2000yqn.googlegroups.com>
     [not found] ` <f350aaae-d045-450d-b255-308eb7fb8f8a@e24g2000vbe.googlegroups.com>
     [not found]   ` <d121e6f6-6afa-43e7-bcd1-c9a770e12704@z9g2000yqi.googlegroups.com>
     [not found]     ` <nospam-83CE7B.07382201062009@news.aioe.org>
     [not found]       ` <78i1lbF1m69gkU1@mid.individual.net>
     [not found]         ` <78i2ajF1m4nglU1@mid.individual.net>
     [not found]           ` <h00l40$r5h$1@news.albasani.net>
     [not found]             ` <78i4i2F1magkfU1@mid.individual.net>
     [not found]               ` <3f1d007f-bcae-42b4-afb0-215b18f51b9c@n21g2000vba.googlegroups.com>
     [not found]                 ` <78ib19F1mfh7qU1@mid.individual.net>
     [not found]                   ` <3b519936-3db9-4dad-85ba-371fa4b29c8f@z5g2000vba.googlegroups.com>
     [not found]                     ` <78quiaF1n95fsU1@mid.individual.net>
     [not found]                       ` <h09q6f$41t$1@news.albasani.net>
     [not found]                         ` <MJWdnUzVI6hADrXXnZ2dnUVZ_q2dnZ2d@earthlink.com>
     [not found]                           ` <78rpbkF1mqrc7U1@mid.individual.net>
2009-06-06 16:13                             ` find words that contains some specific letters John B. Matthews
2009-06-06 16:22                               ` Patricia Shanahan
2009-06-06 20:42                                 ` Giovanni Azua [this message]
2009-06-06 23:43                                   ` Giovanni Azua
2009-06-06 17:11                               ` Mark Space
2009-06-06 18:23                               ` Lew
2009-06-06 23:58                                 ` Tom Anderson
2009-06-07 19:30                                   ` John B. Matthews
replies disabled

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox