From mboxrd@z Thu Jan 1 00:00:00 1970 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,FREEMAIL_FROM autolearn=ham autolearn_force=no version=3.4.4 X-Google-Thread: a07f3367d7,918406c619dd104d X-Google-Attributes: gida07f3367d7,public,usenet X-Google-NewGroupId: yes X-Google-Language: ENGLISH,ASCII-7-bit Path: g2news2.google.com!news4.google.com!feeder.news-service.com!newsfeed.freenet.de!fu-berlin.de!uni-berlin.de!individual.net!not-for-mail From: "Giovanni Azua" Newsgroups: comp.lang.java.programmer,comp.lang.ada Subject: Re: find words that contains some specific letters Date: Sat, 6 Jun 2009 22:42:45 +0200 Message-ID: <7902mcF1nf9ngU1@mid.individual.net> References: <78i1lbF1m69gkU1@mid.individual.net> <78i2ajF1m4nglU1@mid.individual.net> <78i4i2F1magkfU1@mid.individual.net> <3f1d007f-bcae-42b4-afb0-215b18f51b9c@n21g2000vba.googlegroups.com> <78ib19F1mfh7qU1@mid.individual.net> <3b519936-3db9-4dad-85ba-371fa4b29c8f@z5g2000vba.googlegroups.com> <78quiaF1n95fsU1@mid.individual.net> <78rpbkF1mqrc7U1@mid.individual.net> X-Trace: individual.net Va9dhyoKTs8cpJGTUkLtlA0ID+ONeV3auI77LSXxrCIuEobRDe Cancel-Lock: sha1:aapD+oToAX4VJe5TSS88oojEc/4= X-Priority: 3 X-MSMail-Priority: Normal X-Newsreader: Microsoft Outlook Express 6.00.2900.5512 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.5579 X-RFC2646: Format=Flowed; Response Xref: g2news2.google.com comp.lang.java.programmer:42923 comp.lang.ada:6343 Date: 2009-06-06T22:42:45+02:00 List-Id: "Patricia Shanahan" wrote: >>>> A perfect hash function is one that guarantees distinct hash codes for >>>> a specific set of keys. There are algorithms for constructing one, >>>> given a set of keys. "Giovanni Azua" wrote: >>> For example the Character class. "John B. Matthews" wrote: >> I think all of the primitive wrappers (except Double) have trivially >> perfect integer hashes, as they wrap distinct values. Patricia was >> perhaps referring to algorithms of the kind discussed here: >> >> "Patricia Shanahan" wrote: > Yes, that's the sort of thing I meant. > > The Long hash code is also not perfect, and cannot be, because there are > more longs than ints. > To achieve the perfect, optimal and ideal no collisions Hash table "perfect Hash table" these are the two factors that play a role: 1-. the "perfect" property of the hash function as defined in the wiki page cited above 2-. the size of the elements set If #2 goes arbitrarily large meaning allocating a hash table of the size of that elements set becomes prohibitive memory-wise then it does not really matter how perfect the hash function is, there will be collisions anyway and an additional search algorithm will be needed to disambiguate the colliding elements that fall under the same buckets. This is what I think John Matthews hasn't realized yet about his implementation of the Jumble algorithm. For a realistic dataset he will be returning elements that do not match the of input (String built on top of the sorted character input) but matches that happen to collide under the same bucket. e.g. trying to allocate a Hash table of the size ( Integer.MAX_VALUE - Integer.MIN_VALUE) would be prohibitive unless you place unrealistic requirements so you have the Integer with a perfect hash function but you will not get a "perfect Hash table". It is of course worse for the String class where neither #1 nor #2 are fulfilled. Character class offers the perfect hash function but since the elements set is discrete i.e. 26 elements http://en.wikipedia.org/wiki/English_alphabet you can build a perfect Hash structure like the one I described and edited by Tom McGlynn. Best regards, Giovanni