* Re: Hash Type Size
2013-09-03 17:18 0% ` Peter Brooks
@ 2013-09-03 21:21 0% ` John B. Matthews
0 siblings, 0 replies; 3+ results
From: John B. Matthews @ 2013-09-03 21:21 UTC (permalink / raw)
In article <c6c24f9a-c5ba-4068-be58-c7d1cff4889b@googlegroups.com>,
Peter Brooks <peter.h.m.brooks@gmail.com> wrote:
> On Tuesday, 3 September 2013 12:50:04 UTC+2, John B. Matthews wrote:
> >
> > As a concrete example, this program [1] tallies collisions
> > using an instance of Ada.Strings.Bounded.Hash on words found
> > in a dictionary. While it's specific to GNAT, it may suggest
> > how to obtain a baseline against which to compare your
> > implementation(s).
> >
> > [1] <http://home.roadrunner.com/~jbmatthews/jumble.html#sec2>
> >
> Thank you for that -it's extremely useful. 32% collisions is far
> too high for me, so I certainly need a better hash function.
You are welcome. That 2009 data was obtained using GNAT 4.3.4;
newer versions are typically better. For example, the GNAT 4.6
implementation yields the following result:
./collisions
0: 305643 (77.72%)
1: 76947 (19.57%)
2: 9790 (2.49%)
3: 805 (0.20%)
4: 56 (0.01%)
5: 1 (0.00%)
--
John B. Matthews
trashgod at gmail dot com
<http://sites.google.com/site/drjohnbmatthews>
^ permalink raw reply [relevance 0%]
* Re: Hash Type Size
2013-09-03 10:50 7% ` John B. Matthews
@ 2013-09-03 17:18 0% ` Peter Brooks
2013-09-03 21:21 0% ` John B. Matthews
0 siblings, 1 reply; 3+ results
From: Peter Brooks @ 2013-09-03 17:18 UTC (permalink / raw)
On Tuesday, 3 September 2013 12:50:04 UTC+2, John B. Matthews wrote:
>
> As a concrete example, this program [1] tallies collisions using an
> instance of Ada.Strings.Bounded.Hash on words found in a dictionary.
> While it's specific to GNAT, it may suggest how to obtain a baseline
> against which to compare your implementation(s).
>
> [1] <http://home.roadrunner.com/~jbmatthews/jumble.html#sec2>
>
Thank you for that -it's extremely useful. 32% collisions is far too high for me, so I certainly need a better hash function.
^ permalink raw reply [relevance 0%]
* Re: Hash Type Size
@ 2013-09-03 10:50 7% ` John B. Matthews
2013-09-03 17:18 0% ` Peter Brooks
0 siblings, 1 reply; 3+ results
From: John B. Matthews @ 2013-09-03 10:50 UTC (permalink / raw)
In article <8268e85c-e372-4883-8449-ef5253e2c77e@googlegroups.com>,
Peter Brooks <peter.h.m.brooks@gmail.com> wrote:
> The triples are of the form:
>
> <http://dbpedia.org/resource/Alabama>
> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
> <http://schema.org/AdministrativeArea>
>
> subject - predicate - object
>
> Clearly the '<http://' and '>' don't need to be part of the hash,
> I'll need a mapping both ways, from subject -> predicate ->
> object and from object -> predicate -> subject, that's what'll be
> in the binary trees.
As a concrete example, this program [1] tallies collisions using an
instance of Ada.Strings.Bounded.Hash on words found in a dictionary.
While it's specific to GNAT, it may suggest how to obtain a baseline
against which to compare your implementation(s).
[1] <http://home.roadrunner.com/~jbmatthews/jumble.html#sec2>
--
John B. Matthews
trashgod at gmail dot com
<http://sites.google.com/site/drjohnbmatthews>
^ permalink raw reply [relevance 7%]
Results 1-3 of 3 | reverse | options above
-- pct% links below jump to the message on this page, permalinks otherwise --
2013-08-18 21:05 Hash Type Size sbelmont700
2013-08-19 22:12 ` Randy Brukardt
2013-08-31 6:22 ` Peter Brooks
2013-08-31 15:57 ` sbelmont700
2013-09-03 1:47 ` Randy Brukardt
2013-09-03 2:31 ` Peter Brooks
2013-09-03 10:50 7% ` John B. Matthews
2013-09-03 17:18 0% ` Peter Brooks
2013-09-03 21:21 0% ` John B. Matthews
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox