From mboxrd@z Thu Jan 1 00:00:00 1970 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,FREEMAIL_FROM autolearn=unavailable autolearn_force=no version=3.4.4 X-Received: by 10.66.26.5 with SMTP id h5mr8442884pag.1.1378174543088; Mon, 02 Sep 2013 19:15:43 -0700 (PDT) X-Received: by 10.49.70.138 with SMTP id m10mr5140qeu.9.1378174542769; Mon, 02 Sep 2013 19:15:42 -0700 (PDT) Path: border1.nntp.dca3.giganews.com!border3.nntp.dca.giganews.com!border1.nntp.dca.giganews.com!nntp.giganews.com!n2no17353330pbg.1!news-out.google.com!z6ni30971pbu.0!nntp.google.com!j7no108323qai.0!postnews.google.com!glegroupsg2000goo.googlegroups.com!not-for-mail Newsgroups: comp.lang.ada Date: Mon, 2 Sep 2013 19:15:42 -0700 (PDT) In-Reply-To: Complaints-To: groups-abuse@google.com Injection-Info: glegroupsg2000goo.googlegroups.com; posting-host=105.236.92.252; posting-account=p-xPhAkAAADjHQWEO7sFME2XBdF1P_2H NNTP-Posting-Host: 105.236.92.252 References: <9fc26b8a-7a5c-4941-81cb-8f0e9c17a660@googlegroups.com> <2383e534-646c-4500-ad05-40caf9460c35@googlegroups.com> User-Agent: G2/1.0 MIME-Version: 1.0 Message-ID: Subject: Re: B-tree performance From: Peter Brooks Injection-Date: Tue, 03 Sep 2013 02:15:42 +0000 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable X-Original-Bytes: 2531 Xref: number.nntp.dca.giganews.com comp.lang.ada:183262 Date: 2013-09-02T19:15:42-07:00 List-Id: On Monday, 2 September 2013 06:28:57 UTC+2, Niklas Holsti wrote: >=20 >=20 > But insertion and deletion are simple and fast in the basic hash-table > structure, too, as long as the collision sets are small. >=20 Only if the hash-table is large relative to the date. The test data I'm usi= ng contains 13,225,167 triples - a real triplestore might have ten times th= at. The file is nearly 2Gb in size. >=20 > In the traditional hash-table structure, the collision sets are > implemented as unordered linked lists of objects. You can insert or > delete any list entry with a few link changes. If you adjust the size of > the hash table to keep the collision set size bounded by some constant, >=20 I am meaning, as I said, to use a binary tree - a linked list. So collision= s will just be further links. The problem I see is the tree getting too deep, so that it takes too long t= o find items (as usual the time to add something is less critical than the = time to find something). The fewer collisions, the less lookup time, but th= e more items in the top-level hash table, the shallower the overall structu= re.=20