* hashing @ 2002-04-10 21:31 chris.danx 2002-04-10 21:24 ` hashing Steven Deller ` (7 more replies) 0 siblings, 8 replies; 19+ messages in thread From: chris.danx @ 2002-04-10 21:31 UTC (permalink / raw) Hi, Does anyone have a good hash function for strings (length is arbitary, number of buckets is determined when package is instantiated and collisions are handled with chaining)? I'm new to writing hash tables and functions (used them in libs, but this has to be all my own work!), and have chosen this technique to implement in an assignment -- had enough linked lists, and want to expand my horizons. I'm searching for a function just now, so this isn't laziness just wanted to utilise another resource (your brains) in case nothing turns up. After the assignment submission in a few weeks, I'll tidy up the sources for all the useful adts I've written this term and GMGPL them (lists (singly, doubly), hash tables, and whatever else comes up) if ppl express an interest (all adts are generic so far). Thanks, Chris p.s. does anyone have any uses for circular lists? Tom Swans Tp5.5 book mentions them, what are they used for? Just curious... p.p.s. should performance/complexity information be included? I seem to recall a conversation on this a while back, but can't remember if ppl thought it was a good idea. ^ permalink raw reply [flat|nested] 19+ messages in thread
* RE: hashing 2002-04-10 21:31 hashing chris.danx @ 2002-04-10 21:24 ` Steven Deller 2002-04-10 22:41 ` hashing Nick Williams ` (2 more replies) 2002-04-10 21:56 ` hashing Chad R. Meiners ` (6 subsequent siblings) 7 siblings, 3 replies; 19+ messages in thread From: Steven Deller @ 2002-04-10 21:24 UTC (permalink / raw) > -----Original Message----- > [mailto:comp.lang.ada-admin@ada.eu.org] On Behalf Of chris.danx > Sent: Wednesday, April 10, 2002 5:31 PM > To: comp.lang.ada@ada.eu.org > Subject: hashing > Hi, > > Does anyone have a good hash function for strings (length is > arbitary, number of buckets is determined when package is I'd check out the PERL 5 sources. The hashing function there seems to be excellent for numerous cases I have looked into. Not sure, but the internal documentation might just explain the algorithm without having to reverse engineer C code. Other than that, I'd try one of the numerous ISO-defined hash functions, such as CRC-32, etc. I did a web search on "hash function algorithms" and came up with NUMEROUS interesting sites. >...> > p.s. does anyone have any uses for circular lists? Tom Swans > Tp5.5 book mentions them, what are they used for? Just curious... I have used circular lists in numerous buffering situations where a simple double-buffer is insufficient. Typically the situation is one where the list is created once and the "head" and "tail" positions move as data is put in and taken out. Using a circular list allows different sizes to be put in and taken out. I've also used a 10ary-buffer system (the buffers themselves formed a circular list) to meet some system transfer and error-correction requirements. Some transmission systems use a (short) counter of items sent and look for return ACK's for the items; that algorithm is easy to write using a circular list. > p.p.s. should performance/complexity information be included? > I seem to recall a conversation on this a while back, but > can't remember if ppl thought it was a good idea. Complexity information always raises its head at some point. Best to include it. Regards, Steve ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: hashing 2002-04-10 21:24 ` hashing Steven Deller @ 2002-04-10 22:41 ` Nick Williams 2002-04-10 23:36 ` hashing chris.danx 2002-04-11 23:18 ` hashing(HATs. A more efficient algorithm?) Wannabe h4x0r 2 siblings, 0 replies; 19+ messages in thread From: Nick Williams @ 2002-04-10 22:41 UTC (permalink / raw) Could I just register my objection to the use of CRC-32 as a hashing algorithm? Cyclic redundancy checks are not designed to distribute through their range, but rather to detect multiple bit errors. They are much less than ideal when hashing potentially similar strings. Cheers, Nick. "Steven Deller" <deller@smsail.com> wrote in message news:mailman.1018477621.31315.comp.lang.ada@ada.eu.org... > Other than that, I'd try one of the numerous ISO-defined hash functions, > such as CRC-32, etc. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: hashing 2002-04-10 21:24 ` hashing Steven Deller 2002-04-10 22:41 ` hashing Nick Williams @ 2002-04-10 23:36 ` chris.danx 2002-04-11 14:18 ` hashing joevl 2002-04-15 15:32 ` hashing Ken Burtch 2002-04-11 23:18 ` hashing(HATs. A more efficient algorithm?) Wannabe h4x0r 2 siblings, 2 replies; 19+ messages in thread From: chris.danx @ 2002-04-10 23:36 UTC (permalink / raw) "Steven Deller" <deller@smsail.com> wrote in message news:mailman.1018477621.31315.comp.lang.ada@ada.eu.org... > > Subject: hashing > > Hi, > > > > Does anyone have a good hash function for strings (length is > > arbitary, number of buckets is determined when package is I'll maybe look into that (if it doesn't involve reverse engineering C). I found one that may suffice for the purposes of the exercise, it's nothing fancy but it should work. Sum all ascii vals of chars in string, then take modulo bucket size. > Other than that, I'd try one of the numerous ISO-defined hash functions, > such as CRC-32, etc. ISO define hash functions? > I did a web search on "hash function algorithms" and came up with > NUMEROUS interesting sites. oops wrong search string, "hash function strings". Got loads of stuff in code, but code is not as useful. Algorithm/function descriptions (mathematics included) are better than code IMO, since they express algorithms functions in a programming language independant manner. When we were learning about bstrees they gave us Ada code which made no sense at the time due to poor descriptions of the algorithms and considerations involved. However when we learned how to do bstrees in Haskell, just before easter break it made perfect sense since the lecturer took the time to describe the algorithms. The code was secondary to the algorithm. [snip] > > p.p.s. should performance/complexity information be included? > > I seem to recall a conversation on this a while back, but > > can't remember if ppl thought it was a good idea. > > Complexity information always raises its head at some point. Best to > include it. it does makes sense to include it and it'll definitely go in the gmgpl version... But does complexity information come under the header of an implementation detail? i don't think it does but if it does and I include it in the spec, my tutor might disagree and I'll get penalised. This is one of my grumps about uni, not knowing what they want with things. In first year, you'd get penalised for *not* putting pointless comments in. Ignore that, I'm probably worrying about nothing!!! Thanks, Chris ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: hashing 2002-04-10 23:36 ` hashing chris.danx @ 2002-04-11 14:18 ` joevl 2002-04-15 15:32 ` hashing Ken Burtch 1 sibling, 0 replies; 19+ messages in thread From: joevl @ 2002-04-11 14:18 UTC (permalink / raw) >> "Steven Deller" <deller@smsail.com> wrote in message >> Other than that, I'd try one of the numerous ISO-defined hash >> functions, such as CRC-32, etc. "chris.danx" replied: > ISO define hash functions? I think Steve is referring to the CRC-32 (and also CRC-16) algorithms defined by the ITU (formerly CCITT). Can't recall the specific ITU-R recommendation (V.33?). ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: hashing 2002-04-10 23:36 ` hashing chris.danx 2002-04-11 14:18 ` hashing joevl @ 2002-04-15 15:32 ` Ken Burtch 1 sibling, 0 replies; 19+ messages in thread From: Ken Burtch @ 2002-04-15 15:32 UTC (permalink / raw) I believe there's a simple string hash value generator (HashOf) included in the strings package of my drawtools suite http://www.pegasoft.ca Ken B. "chris.danx" wrote: > > "Steven Deller" <deller@smsail.com> wrote in message > news:mailman.1018477621.31315.comp.lang.ada@ada.eu.org... > > > Subject: hashing > > > Hi, > > > > > > Does anyone have a good hash function for strings (length is > > > arbitary, number of buckets is determined when package is > > I'll maybe look into that (if it doesn't involve reverse engineering C). I > found one that may suffice for the purposes of the exercise, it's nothing > fancy but it should work. > > Sum all ascii vals of chars in string, then take modulo bucket size. > > > Other than that, I'd try one of the numerous ISO-defined hash functions, > > such as CRC-32, etc. > > ISO define hash functions? > > > I did a web search on "hash function algorithms" and came up with > > NUMEROUS interesting sites. > > oops wrong search string, "hash function strings". Got loads of stuff in > code, but code is not as useful. Algorithm/function descriptions > (mathematics included) are better than code IMO, since they express > algorithms functions in a programming language independant manner. When we > were learning about bstrees they gave us Ada code which made no sense at the > time due to poor descriptions of the algorithms and considerations involved. > However when we learned how to do bstrees in Haskell, just before easter > break it made perfect sense since the lecturer took the time to describe the > algorithms. The code was secondary to the algorithm. > > [snip] > > > > p.p.s. should performance/complexity information be included? > > > I seem to recall a conversation on this a while back, but > > > can't remember if ppl thought it was a good idea. > > > > Complexity information always raises its head at some point. Best to > > include it. > > it does makes sense to include it and it'll definitely go in the gmgpl > version... But does complexity information come under the header of an > implementation detail? i don't think it does but if it does and I include > it in the spec, my tutor might disagree and I'll get penalised. This is one > of my grumps about uni, not knowing what they want with things. In first > year, you'd get penalised for *not* putting pointless comments in. > > Ignore that, I'm probably worrying about nothing!!! > > Thanks, > Chris -- Ken O. Burtch: http://www.pegasoft.ca : Pegasoft System Manager in a Box / Business Shell : R.R.#1 Bio: 36;Bsc,UI,Lang,Games;Toons,Elves,SF,Pizza;Xian : Jordan Station, ON ``````````````````````````````````````````````````````` Canada L0R 1S0 ^ permalink raw reply [flat|nested] 19+ messages in thread
* RE: hashing(HATs. A more efficient algorithm?) 2002-04-10 21:24 ` hashing Steven Deller 2002-04-10 22:41 ` hashing Nick Williams 2002-04-10 23:36 ` hashing chris.danx @ 2002-04-11 23:18 ` Wannabe h4x0r 2002-04-11 23:29 ` chris.danx 2 siblings, 1 reply; 19+ messages in thread From: Wannabe h4x0r @ 2002-04-11 23:18 UTC (permalink / raw) Lately I've been using Hashed Array Trees(or Tables), which have cut the execution time by about %40, and has given me a memory savings of about %20-%30. Note that I'm just progressing from the beginner to the intermediate stage in my "skillz". Basically, you start with an initial array of n number of elements. The first element is a "pointer"(however one may choose to define that) to the last element of the array. When the array is filled to n-2 elements, automatically add x additional elements to the array, and them set the pointer in the first element to point to the end of the array again(which should be n+x_l). One can use any number of methods to decide how many elements to add at any time, it's not fixed. For an additional bit of buffer overflow protection, one could set the pointer to point to n-2 (or n-1 or whatever) depending on whatever makes sense at the time. This is about the simplest damn algorithm in the world, at least to me it is. Heh. Chris ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: hashing(HATs. A more efficient algorithm?) 2002-04-11 23:18 ` hashing(HATs. A more efficient algorithm?) Wannabe h4x0r @ 2002-04-11 23:29 ` chris.danx 2002-04-12 7:24 ` Wannabe h4x0r 0 siblings, 1 reply; 19+ messages in thread From: chris.danx @ 2002-04-11 23:29 UTC (permalink / raw) "Wannabe h4x0r" <chris@dont.spam.me> wrote in message news:mRot8.209679$Yv2.66927@rwcrnsc54... > Lately I've been using Hashed Array Trees(or Tables), > which have cut the execution time by about %40, and has > given me a memory savings of about %20-%30. Note that I'm > just progressing from the beginner to the intermediate > stage in my "skillz". > > Basically, you start with an initial array of n number of > elements. > The first element is a "pointer"(however one may choose to > define that) to the last element of the array. When the > array is filled to n-2 elements, automatically add x > additional elements to the array, and them set the > pointer in the first element to point to the end of the > array again(which should be n+x_l). > One can use any number of methods to decide how many > elements to add at any time, it's not fixed. > For an additional bit of buffer overflow protection, one > could set the pointer to point to n-2 (or n-1 or > whatever) depending on whatever makes sense at the time. > > This is about the simplest damn algorithm in the world, > at least to me it is. Heh. I don't really get it, but this sounds like chains to me. Can you explain it a bitty more? bye, Chris ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: hashing(HATs. A more efficient algorithm?) 2002-04-11 23:29 ` chris.danx @ 2002-04-12 7:24 ` Wannabe h4x0r 2002-04-12 15:00 ` Chad R. Meiners 2002-04-12 17:22 ` tmoran 0 siblings, 2 replies; 19+ messages in thread From: Wannabe h4x0r @ 2002-04-12 7:24 UTC (permalink / raw) On Thu, 11 Apr 2002 19:29:16 -0400, chris.danx wrote: > "Wannabe h4x0r" <chris@dont.spam.me> wrote in message > news:mRot8.209679$Yv2.66927@rwcrnsc54... >> Lately I've been using Hashed Array Trees(or Tables), which have cut >> the execution time by about %40, and has given me a memory savings of >> about %20-%30. Note that I'm just progressing from the beginner to the >> intermediate stage in my "skillz". >> >> Basically, you start with an initial array of n number of elements. The >> first element is a "pointer"(however one may choose to define that) to >> the last element of the array. When the array is filled to n-2 >> elements, automatically add x additional elements to the array, and >> them set the pointer in the first element to point to the end of the >> array again(which should be n+x_l). >> One can use any number of methods to decide how many elements to add at >> any time, it's not fixed. For an additional bit of buffer overflow >> protection, one could set the pointer to point to n-2 (or n-1 or >> whatever) depending on whatever makes sense at the time. >> >> This is about the simplest damn algorithm in the world, at least to me >> it is. Heh. > > I don't really get it, but this sounds like chains to me. Can you > explain it a bitty more? > > bye, > Chris > > Alright. Suppose you have some numbers like 1,2,3,4(or could be characters, or whatever). The actual array would look like (using psuedocode) array := (pointer(array'range - 1),1,2,3,4,NULL); -- The pointer points to the NULL at the end of -- The array. Of course I'm using NULL here as -- a placeholder for the psuedocode. The NULL -- element could hold anything to indicate that -- says "This is the end of the array" Now you want extend that array to hold four more values. Look at where the pointer in the first element is pointing, jump to that element in the array, add five more elements from that point(one to hold the reference element), then change the pointer to point at the end of the array again. The extended array will look like this... array := (pointer(array'range-1),1,2,3,4, , , , ,NULL --The pointer now points here.--); Then you just fill in the elements. array := (pointer(array'range-1),1,2,3,4,5,6,7,8,NULL); This beats making a new larger array and copying all the elements over, plus the new ones. Plus you can determine with fair degree of certainty how much memory you'll need at any one time. Since your only working with the specific elements you need, it saves processor time. And because your not duplicating data in various moves and copies, it saves memory(especially on very large files.) It could be said that one does not need a reference pointer, but that becomes problematic if you have more than routine modifying data in the array. To make the array smaller, just do the reverse. Move the reference character(or integer, or bit, or whatever) to a point in the array, and free all memory above that point. Amy I making any better sense. Dr. Dobbs had an article on this. I'll see if I can find it. Chris ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: hashing(HATs. A more efficient algorithm?) 2002-04-12 7:24 ` Wannabe h4x0r @ 2002-04-12 15:00 ` Chad R. Meiners 2002-04-12 17:22 ` tmoran 1 sibling, 0 replies; 19+ messages in thread From: Chad R. Meiners @ 2002-04-12 15:00 UTC (permalink / raw) So what does it have to do with hash tables? All you have here is an efficient method of expanding the size of an array. -CRM "Wannabe h4x0r" <chris@dont.spam.me> wrote in message news:YYvt8.11521$Gl6.6265@sccrnsc01... > On Thu, 11 Apr 2002 19:29:16 -0400, chris.danx wrote: > > > > "Wannabe h4x0r" <chris@dont.spam.me> wrote in message > > news:mRot8.209679$Yv2.66927@rwcrnsc54... > >> Lately I've been using Hashed Array Trees(or Tables), which have cut > >> the execution time by about %40, and has given me a memory savings of > >> about %20-%30. Note that I'm just progressing from the beginner to the > >> intermediate stage in my "skillz". > >> > >> Basically, you start with an initial array of n number of elements. The > >> first element is a "pointer"(however one may choose to define that) to > >> the last element of the array. When the array is filled to n-2 > >> elements, automatically add x additional elements to the array, and > >> them set the pointer in the first element to point to the end of the > >> array again(which should be n+x_l). > >> One can use any number of methods to decide how many elements to add at > >> any time, it's not fixed. For an additional bit of buffer overflow > >> protection, one could set the pointer to point to n-2 (or n-1 or > >> whatever) depending on whatever makes sense at the time. > >> > >> This is about the simplest damn algorithm in the world, at least to me > >> it is. Heh. > > > > I don't really get it, but this sounds like chains to me. Can you > > explain it a bitty more? > > > > bye, > > Chris > > > > > > Alright. > > Suppose you have some numbers like 1,2,3,4(or could be > characters, or whatever). > The actual array would look like (using psuedocode) > > array := (pointer(array'range - 1),1,2,3,4,NULL); > -- The pointer points to the NULL at the end of > -- The array. Of course I'm using NULL here as > -- a placeholder for the psuedocode. The NULL > -- element could hold anything to indicate that > -- says "This is the end of the array" > > Now you want extend that array to hold four more values. > Look at where the pointer in the first element is > pointing, jump to that element in the array, add five > more elements from that point(one to hold the reference > element), then change the pointer to point at the end > of the array again. The extended array will look like > this... > > array := (pointer(array'range-1),1,2,3,4, , , , ,NULL --The pointer now > points here.--); > > Then you just fill in the elements. > > array := (pointer(array'range-1),1,2,3,4,5,6,7,8,NULL); > > > This beats making a new larger array and copying all the elements over, > plus the new ones. Plus you can determine with fair degree of certainty > how much memory you'll need at any one time. Since your only working with > the specific elements you need, it saves processor time. And because your > not duplicating data in various moves and copies, it saves > memory(especially on very large files.) > It could be said that one does not need a reference pointer, but that > becomes problematic if you have more than routine modifying data in the > array. > To make the array smaller, just do the reverse. Move the reference > character(or integer, or bit, or whatever) to a point in the array, and > free all memory above that point. > > Amy I making any better sense. Dr. Dobbs had an article on this. I'll see > if I can find it. > > Chris ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: hashing(HATs. A more efficient algorithm?) 2002-04-12 7:24 ` Wannabe h4x0r 2002-04-12 15:00 ` Chad R. Meiners @ 2002-04-12 17:22 ` tmoran 1 sibling, 0 replies; 19+ messages in thread From: tmoran @ 2002-04-12 17:22 UTC (permalink / raw) I surely don't understand. There's no hashing indicated, and the array+pointer structure, if you make the "pointer" the index of the last full cell, is simply an (array, Last) structure. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: hashing 2002-04-10 21:31 hashing chris.danx 2002-04-10 21:24 ` hashing Steven Deller @ 2002-04-10 21:56 ` Chad R. Meiners 2002-04-10 23:43 ` hashing Ted Dennison ` (5 subsequent siblings) 7 siblings, 0 replies; 19+ messages in thread From: Chad R. Meiners @ 2002-04-10 21:56 UTC (permalink / raw) Look in "The Art of Computer Programming" Volume 3 Sorting and Searching by Donald E. Knuth. Chapter 6 section 4 is about hashing. -CRM "chris.danx" <chris.danx@ntlworld.com> wrote in message news:1b2t8.507$na.19833@news8-gui.server.ntli.net... > Hi, > > Does anyone have a good hash function for strings (length is arbitary, > number of buckets is determined when package is instantiated and collisions > are handled with chaining)? I'm new to writing hash tables and functions > (used them in libs, but this has to be all my own work!), and have chosen > this technique to implement in an assignment -- had enough linked lists, and > want to expand my horizons. > > I'm searching for a function just now, so this isn't laziness just wanted to > utilise another resource (your brains) in case nothing turns up. After the > assignment submission in a few weeks, I'll tidy up the sources for all the > useful adts I've written this term and GMGPL them (lists (singly, doubly), > hash tables, and whatever else comes up) if ppl express an interest (all > adts are generic so far). > > > > Thanks, > Chris > > p.s. does anyone have any uses for circular lists? Tom Swans Tp5.5 book > mentions them, what are they used for? Just curious... > > > p.p.s. should performance/complexity information be included? I seem to > recall a conversation on this a while back, but can't remember if ppl > thought it was a good idea. > > ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: hashing 2002-04-10 21:31 hashing chris.danx 2002-04-10 21:24 ` hashing Steven Deller 2002-04-10 21:56 ` hashing Chad R. Meiners @ 2002-04-10 23:43 ` Ted Dennison 2002-04-11 16:12 ` hashing Georg Bauhaus ` (4 subsequent siblings) 7 siblings, 0 replies; 19+ messages in thread From: Ted Dennison @ 2002-04-10 23:43 UTC (permalink / raw) chris.danx wrote: > p.s. does anyone have any uses for circular lists? Tom Swans Tp5.5 book > mentions them, what are they used for? Just curious... I use them all the time in my real-time work, and in device drivers. They are awesome for implementing queues using a fixed set of nodes. In real-time work, you really can't go around dynamicly allocating nodes after initilization, so this is very important. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: hashing 2002-04-10 21:31 hashing chris.danx ` (2 preceding siblings ...) 2002-04-10 23:43 ` hashing Ted Dennison @ 2002-04-11 16:12 ` Georg Bauhaus 2002-04-11 22:36 ` hashing Simon Wright ` (3 subsequent siblings) 7 siblings, 0 replies; 19+ messages in thread From: Georg Bauhaus @ 2002-04-11 16:12 UTC (permalink / raw) chris.danx <chris.danx@ntlworld.com> wrote: : Does anyone have a good hash function for strings There is a wealth of implementations of languages with extensive table support, and with source code to look at. So there is something to compare. (ICON, Tcl, SPITBOL, lua, PostScript, ...) ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: hashing 2002-04-10 21:31 hashing chris.danx ` (3 preceding siblings ...) 2002-04-11 16:12 ` hashing Georg Bauhaus @ 2002-04-11 22:36 ` Simon Wright 2002-05-03 10:16 ` hashing Antonio Duran ` (2 subsequent siblings) 7 siblings, 0 replies; 19+ messages in thread From: Simon Wright @ 2002-04-11 22:36 UTC (permalink / raw) "chris.danx" <chris.danx@ntlworld.com> writes: > Does anyone have a good hash function for strings (length is > arbitary, number of buckets is determined when package is > instantiated and collisions are handled with chaining)? I'm new to > writing hash tables and functions (used them in libs, but this has > to be all my own work!), and have chosen this technique to implement > in an assignment -- had enough linked lists, and want to expand my > horizons. Not sure they're _good_, but see http://www.pushface.org/components/bc/coldframe-hash.zip (if that fails, mail me) ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: hashing 2002-04-10 21:31 hashing chris.danx ` (4 preceding siblings ...) 2002-04-11 22:36 ` hashing Simon Wright @ 2002-05-03 10:16 ` Antonio Duran 2002-05-05 13:55 ` hashing Robert Dewar 2002-05-06 20:42 ` hashing Antonio Duran 7 siblings, 0 replies; 19+ messages in thread From: Antonio Duran @ 2002-05-03 10:16 UTC (permalink / raw) I've been using for that purposes the same hash algorithm used in java that is, as a matter of fact, the hash algorithm cited on Robert Sedgewick's Algorithms book. Look at the source of the hash method for Java String class and translate it into Ada. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: hashing 2002-04-10 21:31 hashing chris.danx ` (5 preceding siblings ...) 2002-05-03 10:16 ` hashing Antonio Duran @ 2002-05-05 13:55 ` Robert Dewar 2002-05-06 20:42 ` hashing Antonio Duran 7 siblings, 0 replies; 19+ messages in thread From: Robert Dewar @ 2002-05-05 13:55 UTC (permalink / raw) "chris.danx" <chris.danx@ntlworld.com> wrote in message news:<1b2t8.507$na.19833@news8-gui.server.ntli.net>... > p.s. does anyone have any uses for circular lists? Tom > Swans Tp5.5 book mentions them, what are they used for? > Just curious... Lots and lots of good uses for circular lists. I regard them as one of the fundamental data strucures for queues. Consider the following. Implement a circular list where the tail of the list points to the head (singly linked). Now point the head pointer to the LAST item on the list, not the FIRST. That way, since the LAST is linked to the FIRST you have immediate access to both and it is easy to remove the head or add to the tail, without needing double links, or a separate pointer to the head and tail. There are many uses, but in the operating systems I wrote for Honeywell, all queues were represented this way (space was very important in those systems which were for machines that by todays standards had small memories -- the large members of the family had 32K bytes of memory, and these were full featured real time executives and time sharing operating systems :-) ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: hashing 2002-04-10 21:31 hashing chris.danx ` (6 preceding siblings ...) 2002-05-05 13:55 ` hashing Robert Dewar @ 2002-05-06 20:42 ` Antonio Duran 2002-05-06 23:36 ` hashing Jeffrey Carter 7 siblings, 1 reply; 19+ messages in thread From: Antonio Duran @ 2002-05-06 20:42 UTC (permalink / raw) "chris.danx" <chris.danx@ntlworld.com> wrote in message news:<1b2t8.507$na.19833@news8-gui.server.ntli.net>... > Hi, > > Does anyone have a good hash function for strings (length is arbitary, > number of buckets is determined when package is instantiated and collisions > are handled with chaining)? I'm new to writing hash tables and functions > (used them in libs, but this has to be all my own work!), and have chosen > this technique to implement in an assignment -- had enough linked lists, and > want to expand my horizons. > > I'm searching for a function just now, so this isn't laziness just wanted to > utilise another resource (your brains) in case nothing turns up. After the > assignment submission in a few weeks, I'll tidy up the sources for all the > useful adts I've written this term and GMGPL them (lists (singly, doubly), > hash tables, and whatever else comes up) if ppl express an interest (all > adts are generic so far). > > > > Thanks, > Chris > > p.s. does anyone have any uses for circular lists? Tom Swans Tp5.5 book > mentions them, what are they used for? Just curious... > > > p.p.s. should performance/complexity information be included? I seem to > recall a conversation on this a while back, but can't remember if ppl > thought it was a good idea. I've found the hash function: with Interfaces; function Hash( What : in String) return Interfaces.Unsigned_32 is Hash_Code : Interfaces.Unsigned_32 := 0; begin for I in What'Range loop Hash_Code := (31 * Hash_Code) + Interfaces.Unsigned_32(Character'Pos(What(I))); end loop; return Hash_Code; end Hash; The returned value must be mod'ed by the number of buckets in your hash table (that number should be a prime number). ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: hashing 2002-05-06 20:42 ` hashing Antonio Duran @ 2002-05-06 23:36 ` Jeffrey Carter 0 siblings, 0 replies; 19+ messages in thread From: Jeffrey Carter @ 2002-05-06 23:36 UTC (permalink / raw) Antonio Duran wrote: > > "chris.danx" <chris.danx@ntlworld.com> wrote in message news:<1b2t8.507$na.19833@news8-gui.server.ntli.net>... > > Hi, > > > > Does anyone have a good hash function for strings (length is arbitary, > > number of buckets is determined when package is instantiated and collisions > > are handled with chaining)? I'm new to writing hash tables and functions > > (used them in libs, but this has to be all my own work!), and have chosen > > this technique to implement in an assignment -- had enough linked lists, and > > want to expand my horizons. You might want to look at P. K. Pearson, "Fast Hashing of Variable-Length Text Strings," Comm. ACM, 1990 Jun. It's a fast "hashing function specifically tailored to variable-length text strings." "Similar strings are not likely to collide." It produces an unsigned 8-bit value for the hash. An implementation is available as part of the PragmAda Reusable Components, available from http://home.earthlink.net/~jrcarter010/pragmarc.htm and the mirror at www.adapower.com. Look for PragmARC.Hash_Fast_Variable_Length. -- Jeff Carter "You empty-headed animal-food-trough wiper." Monty Python & the Holy Grail ^ permalink raw reply [flat|nested] 19+ messages in thread
end of thread, other threads:[~2002-05-06 23:36 UTC | newest] Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2002-04-10 21:31 hashing chris.danx 2002-04-10 21:24 ` hashing Steven Deller 2002-04-10 22:41 ` hashing Nick Williams 2002-04-10 23:36 ` hashing chris.danx 2002-04-11 14:18 ` hashing joevl 2002-04-15 15:32 ` hashing Ken Burtch 2002-04-11 23:18 ` hashing(HATs. A more efficient algorithm?) Wannabe h4x0r 2002-04-11 23:29 ` chris.danx 2002-04-12 7:24 ` Wannabe h4x0r 2002-04-12 15:00 ` Chad R. Meiners 2002-04-12 17:22 ` tmoran 2002-04-10 21:56 ` hashing Chad R. Meiners 2002-04-10 23:43 ` hashing Ted Dennison 2002-04-11 16:12 ` hashing Georg Bauhaus 2002-04-11 22:36 ` hashing Simon Wright 2002-05-03 10:16 ` hashing Antonio Duran 2002-05-05 13:55 ` hashing Robert Dewar 2002-05-06 20:42 ` hashing Antonio Duran 2002-05-06 23:36 ` hashing Jeffrey Carter
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox