From mboxrd@z Thu Jan 1 00:00:00 1970 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00 autolearn=ham autolearn_force=no version=3.4.4 X-Google-Language: ENGLISH,ASCII-7-bit X-Google-Thread: 103376,ac39a12d5faf5b14 X-Google-Attributes: gid103376,public X-Google-ArrivalTime: 2002-04-24 08:26:58 PST Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!news-spur1.maxwell.syr.edu!news.maxwell.syr.edu!newsfeed1.cidera.com!Cidera!cyclone.socal.rr.com!cyclone3.kc.rr.com!news3.kc.rr.com!twister.socal.rr.com.POSTED!not-for-mail Message-ID: <3CC6CED0.C5E203F9@san.rr.com> From: Darren New X-Mailer: Mozilla 4.77 [en] (Windows NT 5.0; U) X-Accept-Language: en MIME-Version: 1.0 Newsgroups: comp.lang.ada Subject: Re: Grace and Maps (was Re: Development process in the Ada community) References: <3CB46975.90408@snafu.de> <3CBAFFEE.2080708@snafu.de> <4519e058.0204171036.6f0a7394@posting.google.com> <3CBDD795.4060706@snafu.de> <4519e058.0204180800.44fac012@posting.google.com> <3CBF0341.8020406@mail.com> <4519e058.0204190529.559a47ae@posting.google.com> <3CC1C6B3.6060306@telepath.com> <3CC21747.5000501@telepath.com> <4519e058.0204220534.2eb33730@posting.go <3CC48F34.5A474E0F@boeing.com> <3CC49C50.485AE213@san.rr.com> <3CC4B5C0.4D16077C@acm.org> <3CC4E9DA.E02BE0DA@san.rr.com> <7vznzukhzf.fsf@vlinux.voxelvision.no> <3CC5B161.C719C5D8@san.rr.com> <7vg01lwaku.fsf@vlinux.voxelvision.no> <3CC6B932.75C468E1@san.rr.com> <7vbsc9w367.fsf@vlinux.voxelvision.no> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Date: Wed, 24 Apr 2002 15:25:55 GMT NNTP-Posting-Host: 66.75.151.160 X-Complaints-To: abuse@rr.com X-Trace: twister.socal.rr.com 1019661955 66.75.151.160 (Wed, 24 Apr 2002 08:25:55 PDT) NNTP-Posting-Date: Wed, 24 Apr 2002 08:25:55 PDT Organization: RoadRunner - West Xref: archiver1.google.com comp.lang.ada:23058 Date: 2002-04-24T15:25:55+00:00 List-Id: Ole-Hjalmar Kristensen wrote: > OK, but how does this malicious user manage to do this when I'm using > a hash function which on average changes half the bits of its output > for any single bit change in its input? It's called the birthday paradox. You hash enough values, you're going to find collisions. > I'm not saying it's > impossible, but it seems pretty far-fetched to me. It seemed far-fetched to me that someone would disassemble IIS or XP in order to find unchecked buffer overflows, but they managed within days of the release of the software, too. > Of course, if he's got the source to my program I kind of thought that was the point of this exercise - to make source, yes? >and is good at inverting hash functions > he can do it. In particular (pardon the C code, but as you say > yourself, in C it's easy), how do you construct your keys so they all > hash to the same value given this particular hash function? : Well, it's hard to read when it isn't complete C, but assuming "u4" is an unsigned integer four bytes long, it's pretty easy. /* Crappy pseudo-C... This should find about 2**16 collisions. */ for (i = 0; i < 2**32; i++) { if (burtle(i, ...) == 27) { printf("Found one that hashes to 27: %ul\n", i); } } > > Hmmm... So if you have keys you can't break up? I mean, I'm not sure how > > you would make a generic hash function that would hash any record, or > > any array, or any private type, or ... > > > > Sure, in that case you will have to call the hash function for each > part of the key. But in that case, wouldn't you also need to define a > comparision operator if you used a tree? I think there are many more private types that can be ordered than those that can be hashed, in Ada at least. Most of the fundamental types in Ada come with ordering operators defined, while few come with hashes defined. (In Java, "hash" is defined everywhere, for just this reason.) > In Ada, you can get rid of any packing if you insist, at least more > portably than in C. Well, yeah, but I don't think you'd want to. > Hmm. What's wrong with having a generic hash function which you > instantiate for your private type? Provided that you can read the key > as an array of bytes, it should work the same in both C and Ada. Well, yeah, but I dont think you can. Doing so would mean that you're dependent on the packing of the structure, for example. If your private type contains pointers or gaps in the structure or fields that don't participate in an equality comparison anything like that, your hash is going to fail. Remember that you not only have to generate a hash, but that the hash has to be the same for all values that are considered equal. If you have a string implementation that has a length count but leaves junk at the end of the array of characters, a hash based on the physical contents of the record isn't going to work. > BTW., I'm not arguing for hash tables everywhere. I'm just saying > there are some appliactions where O(log N) just isn't good enough, and that > hash tables *in practice* is what you need. Agreed. There are many applications where you can control the kind of key and almost never want the results softed, in which hash tables are perfect. Symbol tables leap to mind. Perhaps the hashtable would be the third data structure, after sorted maps. :-) Hashtables are actually pretty simple to do, if you make the user supply the hash value and don't make keys or values limited. A hash table with implementations specifically having keys being the various string types would probably satisfy 90% of the need for hash tables, as well. There aren't a whole lot of operations on a hash table, so it's pretty easy to put one together, methinks. -- Darren New San Diego, CA, USA (PST). Cryptokeys on demand. The 90/10 rule of toothpaste: the last 10% of the tube lasts as long as the first 90%.