Re: Hashing Function wanted

comp.lang.ada
 help / color / mirror / Atom feed

From: djones@megatest.UUCP (Dave Jones)
Subject: Re: Hashing Function wanted
Date: 27 Feb 88 00:07:22 GMT	[thread overview]
Message-ID: <292@goofy.megatest.UUCP> (raw)
In-Reply-To: 8802231548.AA16212@ajpo.sei.cmu.edu

in article <8802231548.AA16212@ajpo.sei.cmu.edu>, ecragg@GMUVAX.GMU.EDU ("EDWARD CRAGG") says:
> 
> Does anyone have either ideas or experience related to a hashing
> function in Ada ...
>

I don't speak Ada. So I don't know if my input will help at all,
but I'll give it a try. (If you are wondering how I came to find
this posting: a cohort told me about it, and suggested I might be able to 
help.)

Like I said, I don't speak Ada, but if conversion of characters into
unsigned integers is efficient (it should be), you can use the following
hash function, which I will write in my native dialect, C:

	int hash(str)
	char *str;
	{
	  int result;
	  while(*str) result += result + (unsigned int)(*str++);
	  return (result % table_size);
	}

If you don't speak C, we are going to need an interpreter. 

I have discovered a rehashing technique which will save time on the 
mod-by-table-size part.  It is based on the rather remarkable fact,
which I have very tediously proven, that the range of the sequence
defined below is the set of all numbers in 0..table_size-1  which
are congruent to 0 or 3 mod 4, provided that table_size is a power of two.

	S(0) = 0
	S(n+1) = ((S(n) + 1)*3) % table_size

You use the function R(n) = ((n+1)*3) % table_size as a rehash function
for collision resolution. You always use a table_size which is an even power 
of two, so that modding it out can be strength-reduced to masking with 
table_size-1. (Will the Ada strong-typing let you do that?)

I am including a sharfile which may be of some use.  It is a hash-table
package written in C++.  If you can translate it into Ada you are in
business.  But the type-checking may make that impossible.  In C++
you have strong type-checking when you want it, but if it gets in the
way, you can chuck it.

Good luck,

		Dave Jones
		Megatest Corp.
		880 Fox Lane
		San Jose, CA.
		95131

		(408) 437-9700 Ext 3227
		UUCP: ucbvax!sun!megatest!djones
		ARPA: megatest!djones@riacs.ARPA

#! /bin/sh
# This is a shell archive.  Remove anything before this line, then unpack
# it by saving it into a file and typing "sh file".  To overwrite existing
# files, type "sh file -c".  You can also feed this as standard input via
# unshar, or by typing "sh <file", e.g..  If this archive is complete, you
# will see the following message at the end:
#		"End of shell archive."
# Contents:  DISCLAIMER Assoc.C Assoc.H Assoc.doc assoc_demo.C heap.C
#   heap.H heap.doc idents.C idents.H idents.doc queue.C queue.H
#   queue.doc
# Wrapped by djones@goofy on Fri Feb 26 15:35:22 1988
PATH=/bin:/usr/bin:/usr/ucb ; export PATH
if test -f DISCLAIMER -a "${1}" != "-c" ; then 
  echo shar: Will not over-write existing file \"DISCLAIMER\"
else
echo shar: Extracting \"DISCLAIMER\" \(243 characters\)
sed "s/^X//" >DISCLAIMER <<'END_OF_DISCLAIMER'
X
X
X//
X//  DISCLAIMER.  
X// 
X//  Neither the author nor Megatest Corp. issues any waranty of the
X//  contents of these files.  No claim is made that they are adequate to any
X//  purpose, nor that any is error-free, nor that any works at all.
X//
END_OF_DISCLAIMER
if test 243 -ne `wc -c <DISCLAIMER`; then
    echo shar: \"DISCLAIMER\" unpacked with wrong size!
fi
# end of overwriting check
fi
if test -f Assoc.C -a "${1}" != "-c" ; then 
  echo shar: Will not over-write existing file \"Assoc.C\"
else
echo shar: Extracting \"Assoc.C\" \(6850 characters\)
sed "s/^X//" >Assoc.C <<'END_OF_Assoc.C'
X
X#include "Assoc.H"
X
X//  This is a ASSOCIATIVE ARRAY class, which maps arbitrary things to
X//  structures of arbitary size.  There is also an ITERATOR CLASS for 
X//  the associative arrays.
X//
X//  C.f.  C++ Programming Language book, sections 2.3.10 and 6.7.
X//
X//            
X
X// Author: Dave Jones
X
X//
X//  NOTES:
X//
X//  Here's how it works.  We keep an array of Pairs, 
X//     { char* key; VOID* contents }
X//
X//  A Pair is valid if and only if the contents is not null.
X//  If contents is not null, it points to a struct of the given size.
X//  We hash into the array to find keys, using a rehash function for
X//  collision avoidance.  If you look up a previously invalid key,
X//  we create a valid Pair for it.  By default, the first long int of a 
X//  new field is zeroed, so a user may use it to determine whether or not an
X//  array element is a new one or not.  Users may specify some other
X//  initialization routine. (It's "virtual".)
X//
X//  To save space, the user may invalidate entries with the procedure remove().
X//
X//  When the array approaches half full, we double its size.
X//
X//  WARNING:  These algorithms are quite tricky. They are built for speed,
X//  not clarity.  If you change something, you may very well introduce a bug 
X//  which will not manifest itself except in very infrequent situations,
X//  and which will be impossible to track down.  Fore-warned is skeptical,
X//  but fore-armed.
X//
X//  For example, the algorithms will not work unless the table size is a 
X//  power of two, and the table is kept strictly less than half full.
X//  In particular, the lookup routine only provably terminates
X//  because for a given pos. int. M, the sequence
X//
X//        s(0) = 0
X//
X//       s(i+1) = ((s(i)+1)*3) mod 2**M
X//
X//  repeats after exactly 2**(M-1) steps.  I have proved this true
X//  for all pos. ints. M, but my proof is very inelegant.  You can
X//  prove it true for small M, say less than 33, with a simple computer
X//  program.  I have done that also, just to be on the safe side.
X//
X
X\f
X
Xstatic Assoc_entry* new_table(int);
Xstatic int round_up_to_pwr2(int);
X
X
XAssoc::Assoc( int struct_size, int initial_size)
X     : heap(struct_size)
X{
X  size = round_up_to_pwr2(initial_size);
X  num_entries = 0;
X  max_entries = size/2 - 1;
X  mask = size - 1;  // Binary arithmetic. x mod size == x & (size-1)
X  hash_table = new_table(size);
X}
X
X
X
X
X
XAssoc::~Assoc()
X{
X  Assoc_iterator next(*this);
X  Assoc_entry* bucket;
X
X  while(bucket=next())
X    delete_key(bucket->key);
X
X  delete hash_table;
X}
X\f
X
X
X
XVOID*
XAssoc::lookup( char* key )
X{
X  // The lookup may add an entry, so..
X  if ( num_entries >= max_entries ) overflow();
X  
X  register int bucket_number = HASH(key);
X  register Assoc_entry * bucket;
X
X  while(1)
X    {
X
X      bucket = hash_table + bucket_number;
X
X      if ( bucket->contents == 0 )
X	{ 
X	  bucket->key = copy_key(key);
X	  bucket->contents = (VOID*)heap.alloc();
X	  init(bucket->contents);
X	  num_entries++;
X	  break;    // <====== created new entry
X	}
X      
X      if ( !equiv( bucket->key, key) )
X	{ 
X	  bucket_number = REHASH(bucket_number);
X	  continue; // <====== search some more (collision)
X	}
X
X      break;        // <====== found old Assoc_entry
X
X    }
X  
X  return bucket->contents;
X
X}
X\f
X
X
Xvoid
XAssoc::overflow()
X{
X  Assoc_entry* old_hash = hash_table;
X  int old_size = size;
X
X  max_entries = size - 1;
X  size = size * 2;
X  mask = size - 1;
X  hash_table = new_table(size);
X  
X  /* Take everything that was in the old table, and put it
X  ** into the new one.
X  */
X
X  register int recno;
X  for (recno = 0; recno < old_size; recno++)
X    { Assoc_entry *mem = old_hash + recno;
X      if ( mem->key )
X	{ 
X	  register int bucket_number = HASH(mem->key);
X	  while(1)
X	    {
X	      register Assoc_entry* bucket = hash_table + bucket_number;
X	      if ( bucket->contents == 0 )
X		{ 
X		  bucket->key = mem->key;
X		  bucket->contents = mem->contents;
X		  break; // <==== copied it
X		}
X
X	      // search some more
X	      bucket_number = REHASH(bucket_number);
X	      
X	    }
X	}
X      
X    }
X
X  delete old_hash;
X}
X\f
X
X
X
X
Xvoid
XAssoc::remove( char* key )
X{
X  register int  bucket_number  = HASH(key);
X  register struct Assoc_entry * bucket;
X
X  // Find old entry and remove it.
X  while(1)
X    { bucket = hash_table+bucket_number;
X       
X       if ( bucket->contents == 0 )
X	 return;  // <===== nothing to delete
X
X       if ( equiv(bucket->key, key ))
X	 { // found the condemned item
X	   delete_key(bucket->key);
X	   heap.free(bucket->contents);
X	   bucket->contents = 0;
X	   num_entries-=1;
X	   break; // <====== clobbered it
X	 }
X       // collision
X       bucket_number = REHASH(bucket_number);
X     }
X  
X  // Entries that might have been "bumped down the rehash chain"
X  // when they collided with the now defunct Assoc_entry, may now be misplaced.
X  // So we remove them, then reinsert them.   This is the trickiest part.
X  while(
X        bucket_number = REHASH(bucket_number),
X	bucket = hash_table + bucket_number,
X	bucket->contents != 0
X       )
X    { // check for bumpee
X      register char* key = bucket->key;
X      register int new_bucket_number = HASH(key);
X
X      if(new_bucket_number != bucket_number) // unnecessary test (accelerator)
X	{ // remove and reinsert
X	  VOID* contents = bucket->contents;
X	  bucket->contents = 0;  // remove it.
X	  
X	  while(1)
X	    {
X	      Assoc_entry* new_bucket = hash_table + new_bucket_number;
X	      
X	      if ( new_bucket->contents == 0 )
X		{ 
X		  new_bucket->key = key;
X		  new_bucket->contents = contents;
X		  break;    // <====== reinserted entry
X		}
X	      
X	      new_bucket_number = REHASH(new_bucket_number);
X	    }
X
X	}// end remove and reinsert
X    } // end check for bumpee
X
X  return;
X
X} // end Assoc::remove()
X  \f
X
X
X
XAssoc_entry*
XAssoc_iterator::operator()()
X{
X  for (; i < cs->size; i++)
X    if ( cs->hash_table[i].contents)
X      { 
X	return &cs->hash_table[i++];
X      }
X  i = 0;  // Table is exhausted. We will start the sequence anew next time.
X  return 0;
X}
X\f
X
Xint
XName_table::hash(register  char* key)
X{
X  /* I looked at the assembly language generated for various hashing
X  ** algorithms, (Sun-3, release 3.4), and this one won.
X  */
X  register int retval = 0;
X  while(*key) retval += retval + (unsigned char)(*key++);
X  return retval;
X}
X
Xvoid
XName_table::delete_key(char* key)
X{ delete key; }
X
Xchar*
XName_table::copy_key(char* key)
X{ return strcpy( new char[strlen(key)], key ); }
X
Xint
XName_table::equiv(char* key1, char* key2)
X{ return strcmp(key1, key2) == 0; }
X
X\f
X
X
XAssoc_entry* Assoc::new_table(int size)
X{
X  Assoc_entry* table = new Assoc_entry [size];
X  int i = 0;
X  for(i = 0; i < size; i++) table[i].contents = 0;
X  return table;
X}
X
Xstatic int round_up_to_pwr2(int initial_size)
X{
X  const int SMALLEST = 4;
X  int size = SMALLEST;
X
X  while(initial_size > SMALLEST)
X    { size *= 2;
X      initial_size = ( initial_size + 1)/2;
X    }
X  return size*2;
X}
END_OF_Assoc.C
if test 6850 -ne `wc -c <Assoc.C`; then
    echo shar: \"Assoc.C\" unpacked with wrong size!
fi
# end of overwriting check
fi
if test -f Assoc.H -a "${1}" != "-c" ; then 
  echo shar: Will not over-write existing file \"Assoc.H\"
else
echo shar: Extracting \"Assoc.H\" \(2495 characters\)
sed "s/^X//" >Assoc.H <<'END_OF_Assoc.H'
X#ifndef NAME_TABLE_INCLUDED
X#define NAME_TABLE_INCLUDED
X
X#include <string.h>
X#include "heap.H"
X#include <generic.h>
X
X// Author: Dave Jones
X
Xstruct VOID   // structure of unknown size and content
X{ char* pointer;  // Assoc's make this null
X};
X
Xclass Assoc_entry
X{  
X  friend class Assoc;
X  friend class Assoc_iterator;
X  char* key;
X  VOID* contents;
X public:
X  inline const char* index() { return key; }
X  inline const VOID* value() { return contents; }
X};
X
X
Xclass Assoc  // generic associative array
X{
X  virtual int hash( char* whatever ) { return (int) whatever; }
X
X  friend class Assoc_iterator;
X  Heap heap;
X  Assoc_entry* hash_table;  
X  int size;
X  int max_entries;
X  int mask;
X  int num_entries;
X  inline int HASH( char* key) { return (hash(key)) & mask;}
X  inline int REHASH(int num)  { return ((((num)+1)*3) & mask); }
X  void overflow();
X  Assoc_entry* new_table(int);
X
X public:
X  
X  virtual char* copy_key(char* key) { return key; }
X  virtual void delete_key(char* key) {}
X  virtual int equiv( char* key1,  char* key2) { return key1 == key2; }
X  virtual void init( VOID *member ) { member->pointer = 0; }
X
X  Assoc( int struct_size, int initial_size = 4 );
X
X  ~Assoc();
X
X  VOID* lookup(char*);
X  VOID& operator[] (char* key) { return *Assoc::lookup(key); }
X  void remove( char * );
X
X  inline int entries( ) { return num_entries; }
X
X};
X
Xclass Name_table : public Assoc
X{
X  int hash( char* );
X  
Xpublic:
X  char* copy_key(char *);
X  void delete_key(char*);
X  int equiv(char*, char*);
X  Name_table(int struct_size, int initial_size = 4) : 
X     (struct_size, initial_size) {}
X
X};
X
X
X// CAVEAT
X// Do not use assoc::operator[] or assoc::remove() while an assoc_iterator
X// is active, or risk having the iterator miss some entries.
X
Xclass Assoc_iterator  
X{
X  Assoc* cs;
X  int i;
X public:
X  Assoc_iterator(Assoc& s) { cs = &s; i = 0; }
X  Assoc_entry* operator() ();
X};
X\f
X
X#define NAME_TABLE(_name_,_type_) \
Xclass name2(_name_,_table) : public Name_table \
X{ \
X  public: \
X    name2(_name_,_table) () : (sizeof(_type_)) {} \
X    _type_& operator[] (char* key) \
X      { return *((_type_*)Name_table::lookup(key));} \
X}; \
X\
Xclass name2(_name_,_entry) : public Assoc_entry \
X{ \
X  public: \
X    inline const _type_* value() { return (_type_*)(Assoc_entry::value()); } \
X};  \
X\
Xclass name2(_name_,_iterator): public Assoc_iterator \
X{ \
X  public: \
X    name2(_name_,_entry)* operator() () \
X      { return (name2(_name_,_entry)*) Assoc_iterator::operator() (); } \
X}; 
X#endif NAME_TABLE_INCLUDED
END_OF_Assoc.H
if test 2495 -ne `wc -c <Assoc.H`; then
    echo shar: \"Assoc.H\" unpacked with wrong size!
fi
# end of overwriting check
fi
if test -f Assoc.doc -a "${1}" != "-c" ; then 
  echo shar: Will not over-write existing file \"Assoc.doc\"
else
echo shar: Extracting \"Assoc.doc\" \(1716 characters\)
sed "s/^X//" >Assoc.doc <<'END_OF_Assoc.doc'
X//
X// Class Assoc is a generic associative-array class.
X// Class Name_table is a somewhat less generic derived class which
X// maps null-terminated strings to arbitrary things.
X//
X// There is also an iterator class for the arrays.
X//
X// These classes use open hash-tables.   The "keys" are typed a char*, but 
X// as far as Assoc is concerned, they could be anything.
X// 
X// Virutal functions:
X//
X//    copy_key(char*)     Makes a copy of the key, to be stored in the hash-
X//                          table for future comparison
X//    delete_key(char*)   Deletes the key when no longer needed
X//    equiv(char*,char*)  Returns 1 iff the keys are equivalent.
X//    hash(char*)         Used by hashing algorithm. 
X//                          Requirement: hash(key1)==hash(key2) if
X//                          equiv(key1,key2) is true
X//    init(VOID *)        Initializes an entry which has not been
X//                          previously referenced
X//
X// See definitions of Assoc and Name_table in Assoc.C and Assoc.H
X// for the default function definitions of these virtual functions.
X//
X// The macro NAME_TABLE defines three derived classes of Name_table.  
X// For example,
X//
X// NAME_TABLE(Identifier, struct id_rec )
X// 
X// defines classes "Identifier_table", "Identifier_entry", and 
X// "Identifier_iterator".
X//
X// You can use an object of type "Identifier_table" as though it were 
X// an array of struct id_rec's :
X//
X// table["foo"] = some_struct_id_rec;
X//
X// Identifier_entry has methods index() and value() which return the
X// key associated with an entry, and the contents of the entry, respectively.
X//
X// Identifier_iterator will iterate through all the entries in the table.
X//
X// See assoc_demo.C.
X// 
END_OF_Assoc.doc
if test 1716 -ne `wc -c <Assoc.doc`; then
    echo shar: \"Assoc.doc\" unpacked with wrong size!
fi
# end of overwriting check
fi
if test -f assoc_demo.C -a "${1}" != "-c" ; then 
  echo shar: Will not over-write existing file \"assoc_demo.C\"
else
echo shar: Extracting \"assoc_demo.C\" \(1112 characters\)
sed "s/^X//" >assoc_demo.C <<'END_OF_assoc_demo.C'
X#include "Assoc.H"
X#include <stream.h>
X
Xchar* copy(char* str)
X{
X  return strcpy(new char[strlen(str)], str);
X}
X
XNAME_TABLE(String, char*)
X
XString_table table;
X
Xmain()
X{
X  cout << "\nTo put an entry, type p <key> <contents>\n"
X       << "To get an entry, type g <key>\n"
X       << "To show all entries, type s\n"
X       << "To delete an entry, type d\n"
X       << "To count the entries, type c\n\n"
X  ;
X 
X  char key[128];
X  char cont[128];
X  char command;
X
X  do
X    {
X      cout << ". ";
X      cin >> command;
X      switch(command)
X	{
X	  break;case 'q': {}
X
X	  break;case 'p':
X            cin >> key >> cont;
X	    table[key] = copy(cont);
X
X	  break;case 'g':
X	    cin >> key;
X	    cout << table[key] << "\n";
X
X	  break;case 's':
X	  { 
X	    String_entry *p;
X	    String_iterator next(table);
X	    while(p = next())
X	      cout << p->index() << " " << *(p->value()) << "\n";
X	  }
X
X	  break;case 'd':
X	    cin >> key;
X	    delete table[key];
X	    table.remove(key);
X
X	  break;case 'c':
X	    cout << table.entries() << "\n";
X
X	  break; default:
X	    cout << "Huh?\n";
X
X	  break;
X	}
X    }
X  while(command != 'q');
X
X}
X
END_OF_assoc_demo.C
if test 1112 -ne `wc -c <assoc_demo.C`; then
    echo shar: \"assoc_demo.C\" unpacked with wrong size!
fi
# end of overwriting check
fi
if test -f heap.C -a "${1}" != "-c" ; then 
  echo shar: Will not over-write existing file \"heap.C\"
else
echo shar: Extracting \"heap.C\" \(1209 characters\)
sed "s/^X//" >heap.C <<'END_OF_heap.C'
X#include "heap.H"
X#include "Assoc.H"
X
X// Everything is calculated in "heap-units", which is the amount of space
X// required for an arbitrary pointer.
X//
X// Each block of packets is linked to the previous ones through the
X// first "heap-unit". The head of the list is "cache".
X//
X// All free packets are similarly linked. The head of the list is
X// "next_free".
X//
X
X
XHeap::Heap(unsigned int size, unsigned int init_size )
X{
X  // calculate size of element in heap-units.
X  packet_size = (size + (sizeof(heap_unit) - 1)) / sizeof(heap_unit);
X  cache = 0;
X  num_elements = init_size;
X  next_free = 0;
X}
X
Xvoid Heap::underflow()
X{
X  heap_unit* more = cache;
X  cache = new heap_unit[packet_size*num_elements + 1]; // one extra for link
X  cache->next = more;  // This is the link.
X  heap_unit* freebee = cache + 1;  // First free element is 1 beyond link.
X
X  int i;
X  for(i=0; i<num_elements; i++)
X    { free(freebee);  // This is "our" free, not C-library's free.
X      freebee += packet_size;
X    }
X
X  // If we run out again, we'll get this many more.
X  num_elements = next_load(num_elements);
X
X}
X
XHeap::~Heap()
X{
X  while(cache)
X    { heap_unit* more = cache->next;
X      delete cache;
X      cache = more;
X    }
X}
X
END_OF_heap.C
if test 1209 -ne `wc -c <heap.C`; then
    echo shar: \"heap.C\" unpacked with wrong size!
fi
# end of overwriting check
fi
if test -f heap.H -a "${1}" != "-c" ; then 
  echo shar: Will not over-write existing file \"heap.H\"
else
echo shar: Extracting \"heap.H\" \(1133 characters\)
sed "s/^X//" >heap.H <<'END_OF_heap.H'
X#ifndef HEAP_DEFD
X#define HEAP_DEFD
X
X
Xclass heap_unit  
X{ friend class Heap;
X  heap_unit* next;
X};
X
Xclass Heap  // implements heap of packets of a fixed size
X{
X  virtual int next_load(int n) // Strategy determines size of subsequent
X    { return n*2; }            //   block, calculated from size of 
X			       //   previous block.
X
X  int packet_size;      // in units of sizeof(heap_unit)
X  int num_elements;     // number of packets to allocate in next block
X  heap_unit* cache;     // queue of all blocks of packets
X  heap_unit* next_free; // free store of packets
X  void underflow();     // called when free store is depleted
X
Xpublic:
X
X  Heap(unsigned int     /* packet-size in bytes */, 
X       unsigned int = 4 /* number of packets in first block */);
X
X  ~Heap();
X
X  inline void* 
X  alloc()
X    { heap_unit* retval;
X      if(next_free == 0) underflow();
X      retval=next_free;  next_free = next_free->next;
X      return (void*)retval;
X    }
X
X  inline void
X  free(void* packet)
X    {
X      heap_unit *more = next_free;
X      next_free = (heap_unit*)packet;
X      ((heap_unit*)packet)->next = more;
X    }
X}; 
X
X
X#endif HEAP_DEFD
END_OF_heap.H
if test 1133 -ne `wc -c <heap.H`; then
    echo shar: \"heap.H\" unpacked with wrong size!
fi
# end of overwriting check
fi
if test -f heap.doc -a "${1}" != "-c" ; then 
  echo shar: Will not over-write existing file \"heap.doc\"
else
echo shar: Extracting \"heap.doc\" \(252 characters\)
sed "s/^X//" >heap.doc <<'END_OF_heap.doc'
X//
X// Class Heap implements a free-store (heap) of packets of one size.
X// It can be significantly faster than default methods, if
X// packets are repeatedly allocated and freed.  Uses C++ built-in 
X// function "new" to allocate blocks of packets.
X//
X
X
END_OF_heap.doc
if test 252 -ne `wc -c <heap.doc`; then
    echo shar: \"heap.doc\" unpacked with wrong size!
fi
# end of overwriting check
fi
if test -f idents.C -a "${1}" != "-c" ; then 
  echo shar: Will not over-write existing file \"idents.C\"
else
echo shar: Extracting \"idents.C\" \(2304 characters\)
sed "s/^X//" >idents.C <<'END_OF_idents.C'
X//  COPYRIGHT 1988 Megatest Corp.
X
X#include <ctype.h>
X#include "idents.H"
X
Xinline unsigned char
XLOWER(unsigned char ch)  { return (isupper(ch)?tolower(ch):(ch)); }
X
XIdents_UC::equiv(register  char* str1, register  char* str2)
X{
X  while( isalnum(*str1) || *str1 == '_' )
X    if(*str1++ != *str2++)
X      return 0;
X  return !(isalnum(*str2) || *str2 == '_');
X}
X
XIdents_uC::equiv(register  char* str1, register  char* str2)
X{
X  while( isalnum(*str1))
X    if(*str1++ != *str2++)
X      return 0;
X  return !isalnum(*str2);
X}
X
XIdents_uc::equiv(register  char* str1, register  char* str2)
X{
X  while( isalnum(*str1))
X    if( LOWER(*str1) != LOWER( *str2))
X      return 0;
X    else
X      { str1++; str2++; }
X  return !isalnum(*str2);
X}
X
XIdents_Uc::equiv(register  char* str1, register  char* str2)
X{
X  while(isalnum(*str1) || *str1 == '_' )
X    if( LOWER(*str1) != LOWER( *str2))
X      return 0;
X    else
X      { str1++; str2++; }
X  return !(isalnum(*str2) || *str2 == '_');
X}
X
XIdents_Tc::equiv(register  char* str1, register  char* str2)
X{
X  do
X    if( LOWER(*str1) != LOWER(*str2))
X      return 0;
X    else
X      { do str1++; while (*str1 == '-');
X	do str2++; while (*str2 == '-');
X      }
X  while(isalnum(*str1));
X  return !(isalnum(*str2));
X}
X
XIdents_UC::hash(register  char* str)
X{
X  register int hash = 0;
X  while( isalnum(*str) || *str == '_' )
X    hash += hash + (unsigned char)(*str++);
X  return hash;
X}
X
X/* underscores not valid. case matters. */
XIdents_uC::hash(register  char* str)
X{
X  register int hash = 0;
X  while( isalnum(*str))
X      { hash += hash + (unsigned char)(*str++); }
X  return hash;
X}
X
X
X/* underscores are not valid. case does not matter. (Standard Pascal) */
XIdents_uc::hash(register  char* str)
X{
X  register int hash = 0;
X  while( isalnum(*str))
X      { hash += hash + LOWER((unsigned char)(*str++));
X      }
X  return hash;
X}
X
X/* Telephone scheme */
XIdents_Tc::hash(register  char* str)
X{
X  register int hash = 0;
X  do
X      { hash += hash + LOWER((unsigned char)(*str));
X	do str++; while (*str == '-');
X      }
X  while( isalnum(*str));
X  return hash;
X}
X
X/* underscores are valid. case does not matter. */
XIdents_Uc::hash(register  char* str)
X{
X  register int hash = 0;
X  while(isalnum(*str) || *str == '_' )
X    { hash += hash + LOWER((unsigned char)(*str++));
X    }
X  return hash;
X}
END_OF_idents.C
if test 2304 -ne `wc -c <idents.C`; then
    echo shar: \"idents.C\" unpacked with wrong size!
fi
# end of overwriting check
fi
if test -f idents.H -a "${1}" != "-c" ; then 
  echo shar: Will not over-write existing file \"idents.H\"
else
echo shar: Extracting \"idents.H\" \(1652 characters\)
sed "s/^X//" >idents.H <<'END_OF_idents.H'
X#ifndef IDENTS_INCLUDED
X#define IDENTS_INCLUDED
X
X#include "Assoc.H"
X
X
X/* Underscores are valid, Case matters. (Mega-Pascal/C/C++) */
Xclass Idents_UC: public Assoc
X{ int hash( char*);
X  int equiv( char*,  char*);
X
X public:
X  Idents_UC(int struct_size, int initial_size = 4):(struct_size,initial_size)
X    {}
X};
X
X
X/* Underscores are not valid. Case does not matter.  (Standard Pascal) */
Xclass Idents_uc: public Assoc
X{ int hash( char*);
X  int equiv( char*,  char*);
X
X public:
X  Idents_uc(int struct_size, int initial_size = 4):(struct_size,initial_size)
X    {}
X};
X
X/* Underscores are valid, but "transparent glue" when not in the first
X** position. Case does not matter.  E.g. "foo_bar" is equivalent to
X** "FooBar". But "_foo" is not equivalent to "foo".
X**
X** This scheme passes the "telephone test":  You can read programs to
X** people over the telephone, without constantly saying, "all caps",
X** "underscore", etc..  I wish this were standard practice.
X*/
Xclass Idents_Tc: public Assoc  // T for telephone.
X{ int hash( char*);
X  int equiv( char*,  char*);
X
X public:
X  Idents_Tc(int struct_size, int initial_size = 4):(struct_size,initial_size)
X    {}
X};
X
X
X/* Underscores are not valid.  Case matters.  (Modula II) */
Xclass Idents_uC: public Assoc
X{ int hash( char*);
X  int equiv( char*,  char*);
X
X public:
X  Idents_uC(int struct_size, int initial_size = 4):(struct_size,initial_size)
X    {}
X};
X
X
X/* Underscores are valid. Case does not matter. */
Xclass Idents_Uc: public Assoc
X{ int hash( char*);
X  int equiv( char*,  char*);
X
X public:
X  Idents_Uc(int struct_size, int initial_size = 4):(struct_size,initial_size)
X    {}
X};
X
X#endif IDENTS_INCLUDED
END_OF_idents.H
if test 1652 -ne `wc -c <idents.H`; then
    echo shar: \"idents.H\" unpacked with wrong size!
fi
# end of overwriting check
fi
if test -f idents.doc -a "${1}" != "-c" ; then 
  echo shar: Will not over-write existing file \"idents.doc\"
else
echo shar: Extracting \"idents.doc\" \(668 characters\)
sed "s/^X//" >idents.doc <<'END_OF_idents.doc'
X//
X// These are classes derived from class Assoc, for indexing with identifiers
X// (as defined by various languages).  They are intended for use in compilers
X// and such.
X//
X// The keys need not be null-terminated, so you can just read an entire 
X// source file into a buffer, then use pointers into the buffer as keys.
X// The redefined virtual functions hash() and equiv() recongize the
X// ends of the keys by context.
X//
X// The virtual functions copy_key() and delete_key() are not redefined.
X// The default version does not actually make a copy, it just returns
X// the char* it is given.  So the keys must remain untouched so long
X// as the lookup table is active.
END_OF_idents.doc
if test 668 -ne `wc -c <idents.doc`; then
    echo shar: \"idents.doc\" unpacked with wrong size!
fi
# end of overwriting check
fi
if test -f queue.C -a "${1}" != "-c" ; then 
  echo shar: Will not over-write existing file \"queue.C\"
else
echo shar: Extracting \"queue.C\" \(1049 characters\)
sed "s/^X//" >queue.C <<'END_OF_queue.C'
X#include "queue.H"
X
X
X// All Queue_lists will come from this private heap.  MUCH quicker.
X
Xstatic Heap heap(sizeof(Queue_list));
X
X
Xvoid* Queue::push(void* cont)
X{
X  size++;
X  Queue_list* old_head = head;
X  head = (struct Queue_list*)heap.alloc(); 
X  head->cdr = old_head;
X  head->car = cont;
X  if(!tail) tail = head;
X  return cont;
X}
X
Xvoid* Queue::pop()
X{
X  if(size == 0) return 0;
X  size--;
X  void* cont = head->car;
X  Queue_list* condemned = head;
X  head = head->cdr;
X  heap.free(condemned);
X  if(size==0) tail = 0;
X  return cont;
X}
X
Xvoid* Queue::append(void *cont)
X{
X  size++;
X  Queue_list* old_tail = tail;
X  tail = (struct Queue_list*)heap.alloc();
X  tail->cdr=0;
X  tail->car=cont;
X  if(old_tail)
X    old_tail->cdr = tail;
X  if(head==0)head = tail;
X  return cont;
X}
X
X
Xvoid* Queue_iterator::operator() ()
X{
X  if(done()) { rest = lq->head;  return 0; }
X  void* retval = rest->car;
X  rest=rest->cdr;
X  return retval;
X}
X
XQueue::~Queue()
X{
X  while(head)
X    { Queue_list* condemned = head;
X      head = head->cdr;
X      heap.free(condemned);
X    }
X}
END_OF_queue.C
if test 1049 -ne `wc -c <queue.C`; then
    echo shar: \"queue.C\" unpacked with wrong size!
fi
# end of overwriting check
fi
if test -f queue.H -a "${1}" != "-c" ; then 
  echo shar: Will not over-write existing file \"queue.H\"
else
echo shar: Extracting \"queue.H\" \(779 characters\)
sed "s/^X//" >queue.H <<'END_OF_queue.H'
X#ifndef QUEUE_DEFD
X#define QUEUE_DEFD
X
X#include "heap.H"
X
X
Xclass Queue_list
X{
X  
Xfriend class Queue_iterator;
Xfriend class Queue;
X  
X  Queue_list* cdr;
X  void* car;
X  
X};
X
X
Xclass Queue
X{
X
X  Queue_list* tail;
X  Queue_list* head;
X  int size;
X  friend class Queue_iterator;
X
X public:
X  Queue() { size = 0; tail=0; head=0; }
X  ~Queue();
X
X  void* push(void*);
X  void* pop();
X  void* append(void*);
X
X  inline  
X  void* first()
X    { return(size==0)? 0 : head->car; }
X
X  inline  
X  void* last()
X    { return (size==0) ? 0:  tail->car; }
X  
X  inline int length() { return size; }
X};
X
Xclass Queue_iterator
X{
X  Queue* lq;
X  Queue_list* rest;
X
X public:
X  Queue_iterator(Queue& q) { lq=&q; rest = q.head; }
X  void* operator() ();
X  int done() { return rest==0; }
X				 
X};
X
X#endif QUEUE_DEFD
END_OF_queue.H
if test 779 -ne `wc -c <queue.H`; then
    echo shar: \"queue.H\" unpacked with wrong size!
fi
# end of overwriting check
fi
if test -f queue.doc -a "${1}" != "-c" ; then 
  echo shar: Will not over-write existing file \"queue.doc\"
else
echo shar: Extracting \"queue.doc\" \(267 characters\)
sed "s/^X//" >queue.doc <<'END_OF_queue.doc'
X
X// Class Queue impelements queues of pointers.  Can be used
X// as FIFO (use method "append") or LIFO (use method "push").
X//
X// Public methods are
X//
X
X  void* push(void*);
X  void* pop();
X  void* append(void*);
X  void* first();
X  void* last();
X  inline int length();
END_OF_queue.doc
if test 267 -ne `wc -c <queue.doc`; then
    echo shar: \"queue.doc\" unpacked with wrong size!
fi
# end of overwriting check
fi
echo shar: End of shell archive.
exit 0

next prev parent reply	other threads:[~1988-02-27  0:07 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
1988-02-23 15:43 Hashing Function wanted "EDWARD CRAGG"
1988-02-27  0:07 ` Dave Jones [this message]
1988-02-29 21:29 ` R.A. Agnew

replies disabled

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox