From mboxrd@z Thu Jan 1 00:00:00 1970 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,FREEMAIL_FROM autolearn=unavailable autolearn_force=no version=3.4.4 X-Received: by 10.224.52.6 with SMTP id f6mr19412499qag.2.1378024734811; Sun, 01 Sep 2013 01:38:54 -0700 (PDT) X-Received: by 10.49.98.162 with SMTP id ej2mr29501qeb.10.1378024734797; Sun, 01 Sep 2013 01:38:54 -0700 (PDT) Path: eternal-september.org!reader01.eternal-september.org!reader02.eternal-september.org!news.eternal-september.org!news.eternal-september.org!news.eternal-september.org!feeder.eternal-september.org!feeder01.blueworldhosting.com!usenet.blueworldhosting.com!feeder02.blueworldhosting.com!npeer01.iad.highwinds-media.com!news.highwinds-media.com!feed-me.highwinds-media.com!fx3no6402647qab.0!news-out.google.com!p7ni0qas.0!nntp.google.com!fx3no6402643qab.0!postnews.google.com!glegroupsg2000goo.googlegroups.com!not-for-mail Newsgroups: comp.lang.ada Date: Sun, 1 Sep 2013 01:38:54 -0700 (PDT) In-Reply-To: Complaints-To: groups-abuse@google.com Injection-Info: glegroupsg2000goo.googlegroups.com; posting-host=105.236.92.204; posting-account=p-xPhAkAAADjHQWEO7sFME2XBdF1P_2H NNTP-Posting-Host: 105.236.92.204 References: <9fc26b8a-7a5c-4941-81cb-8f0e9c17a660@googlegroups.com> <2383e534-646c-4500-ad05-40caf9460c35@googlegroups.com> User-Agent: G2/1.0 MIME-Version: 1.0 Message-ID: Subject: Re: B-tree performance From: Peter Brooks Injection-Date: Sun, 01 Sep 2013 08:38:54 +0000 Content-Type: text/plain; charset=ISO-8859-1 X-Received-Bytes: 3191 Xref: news.eternal-september.org comp.lang.ada:17076 Date: 2013-09-01T01:38:54-07:00 List-Id: On Sunday, 1 September 2013 10:05:27 UTC+2, Niklas Holsti wrote: > > > If for some reason you want to the keep the hash-table small, although > this leads to many collisions, I suppose you could organize the > colliding object sets into some kind of keyed containers, for example > B-trees, but then you should use the actual key to index these > containers, not the hash-value. > I'm using a tree structure to represent a sparse array because it makes inserts and deletes easier and quicker, and cleaner. I'm using the large hash index to minimise the number of collisions. If there was a better sparse array algorithm that'd probably suit me better. Then I'd simply hash everything to a 32 bit (or even larger) array index and be done with it. With dynamic allocation, memory is used more efficiently. If you use a sparse array, it can grow in size slowly (needing periodic cleaning) because deletions don't delete the location, but simply put the contents to null, and such null entries accumulate. > > > I assume your data structures will be in main memory. If some parts are > in secondary memory (disk files) there are more trade-offs to consider. > I'm hoping to use persistent objects to reduce the extra complications there - using OS paging makes more sense, from a performance point of view, than using files. If the persistent objects are shadowed in RAM disc and that's shared over a network, it can make the solution nicely scalable across multiple networked machines. > > Either make up an analytical model (which will of course depend on lots > of parameters of the problem and processor) or experiment and measure. I > would do the latter. But it will probably also depend on the pattern of > accesses. For example, if the accesses have temporal locality, using > some kind of "move to front" heuristic on the lists of colliding objects > can help. > Yes, some sort of caching might help, particularly for predicates.