From mboxrd@z Thu Jan 1 00:00:00 1970 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00 autolearn=ham autolearn_force=no version=3.4.4 X-Google-Language: ENGLISH,ASCII-7-bit X-Google-Thread: 103376,ac39a12d5faf5b14 X-Google-Attributes: gid103376,public X-Google-ArrivalTime: 2002-04-24 12:36:38 PST Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!news-spur1.maxwell.syr.edu!news.maxwell.syr.edu!news-out.visi.com!hermes.visi.com!uunet!ash.uu.net!spool0900.news.uu.net!reader0901.news.uu.net!not-for-mail Sender: DB3L@CTWD0143 Newsgroups: comp.lang.ada Subject: Re: Grace and Maps (was Re: Development process in the Ada community) References: <3CB46975.90408@snafu.de> <3CBAFFEE.2080708@snafu.de> <4519e058.0204171036.6f0a7394@posting.google.com> <3CBDD795.4060706@snafu.de> <4519e058.0204180800.44fac012@posting.google.com> <3CBF0341.8020406@mail.com> <4519e058.0204190529.559a47ae@posting.google.com> <3CC1C6B3.6060306@telepath.com> <3CC21747.5000501@telepath.com> <3CC59ED2.1000803@home.com> <3CC5B286.6FE61551@san.rr.com> <3CC5B9EE.32F3060@san.rr.com> From: David Bolen Date: 24 Apr 2002 15:38:25 -0400 Message-ID: Organization: Fitlinxx, Inc. - Stamford, CT X-Newsreader: Gnus v5.7/Emacs 20.6 NNTP-Posting-Host: 208.247.212.3 X-Trace: 1019676996 reader1.ash.ops.us.uu.net 14973 208.247.212.3 Xref: archiver1.google.com comp.lang.ada:23076 Date: 2002-04-24T15:38:25-04:00 List-Id: Stephen Leake writes: > But if I just declare a map in Python, there's no way it's guaranteed > to be best case. So I think you are saying "yes, they live with the > O(n) worst case, and hope for the O(1) best case, and take whatever > they actually get". Yes. But it's also that a good deal of implementation effort has gone into making the odds of the O(1) case much better than that of the O(n) case. > Hmm. Maybe there is a way to tell Python what the hash table size > should be, and what hash function to use? Then I can get O(1). I > confess to never having used either a Perl or Python map. Python automatically maintains (grows) the hash table so that it is no more than 2/3 full. It is limited to a 32-bit key, and is open-addressed, not chained. Any "hashable" object in Python may be used as a dictionary key - objects may implement their own hashing function if they prefer. The default hashing function for user created objects is to hash to the object's location in memory. The built-in types tend to have type-specific tuned hashing functions. The source indicates the algorithm was originally based on Algorithm D from Knuth Vol. 3, Sec. 6.4. > Well, the question is, is a "typical" map in Perl or Python "close to" > O(n), or "close to" O(1)? Anybody have any data? In Python it's expected to be average O(1), and only extremely rarely O(n) (e.g., if everything hashes to the same value). But formally the operations are still O(n) - a lot of tuning over the course of Python's evolution just makes the typical case much closer to best than worst. In particular, Python's hash table does not depend on the randomness of the hashing function, and in fact works very well for very regular (even monotonically increasing) hash functions. There are some newsgroup posts over the years (look for stuff by Tim Peters in particular), but I don't know of a formal algorithm analysis online anywhere. The dictionary object source itself has quite a bit of information: http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/python/dist/src/Objects/dictobject.c?rev=HEAD (Hoping this isn't too off-topic for an Ada group) -- -- David -- /-----------------------------------------------------------------------\ \ David Bolen \ E-mail: db3l@fitlinxx.com / | FitLinxx, Inc. \ Phone: (203) 708-5192 | / 860 Canal Street, Stamford, CT 06902 \ Fax: (203) 316-5150 \ \-----------------------------------------------------------------------/