From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level: 
X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,FREEMAIL_FROM
	autolearn=unavailable autolearn_force=no version=3.4.4
X-Received: by 10.224.137.68 with SMTP id v4mr14030312qat.1.1377933262605;
 Sat, 31 Aug 2013 00:14:22 -0700 (PDT)
X-Received: by 10.49.1.112 with SMTP id 16mr34906qel.7.1377933262541; Sat, 31
 Aug 2013 00:14:22 -0700 (PDT)
Path: 
 border1.nntp.dca3.giganews.com!border2.nntp.dca3.giganews.com!border4.nntp.dca.giganews.com!border2.nntp.dca.giganews.com!nntp.giganews.com!q10no2258353qai.0!news-out.google.com!p7ni0qas.0!nntp.google.com!fx3no6284886qab.0!postnews.google.com!glegroupsg2000goo.googlegroups.com!not-for-mail
Newsgroups: comp.lang.ada
Date: Sat, 31 Aug 2013 00:14:22 -0700 (PDT)
Complaints-To: groups-abuse@google.com
Injection-Info: glegroupsg2000goo.googlegroups.com;
 posting-host=105.236.92.204;
 posting-account=p-xPhAkAAADjHQWEO7sFME2XBdF1P_2H
NNTP-Posting-Host: 105.236.92.204
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <9fc26b8a-7a5c-4941-81cb-8f0e9c17a660@googlegroups.com>
Subject: B-tree performance
From: Peter Brooks <peter.h.m.brooks@gmail.com>
Injection-Date: Sat, 31 Aug 2013 07:14:22 +0000
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
X-Original-Bytes: 3484
Xref: number.nntp.dca.giganews.com comp.lang.ada:183241
Date: 2013-08-31T00:14:22-07:00
List-Id: <comp.lang.ada>

I'm interested to know if anybody has any real-world evidence for this hypo=
thesis, or if they agree with my gut feel.

If I build a B-tree, then the conventional means to search it would be to s=
tart in the middle, then cascade left and right following the pointer struc=
ture. That's slow because most searches won't turn up a value in the first =
N levels of the b-tree.

So it'd make sense to have an array which contains the address values of th=
e B-tree down to a certain level. That way, for a look up, you search the a=
rray, which is nice and quick, and this leads you to the correct sub-tree a=
nd you then continue as usual, from the top of that sub-tree.

The question is how many levels would it make sense to keep in the array. T=
oo many and the array itself consumes too many resources and slows things d=
own. Too few and you don't gain much, particularly if the B-tree is sparse.=
 The sizes of the buckets, or sub-b-trees would be:

Level      Buckets
1                 18446744073709551616
3                 9223372036854775808
5                 4611686018427387904
9                 2305843009213693952
17                1152921504606846976
33                576460752303423488
65                576460752303423488
129               288230376151711744
257               144115188075855872
513               72057594037927936
1025              36028797018963968
2049              18014398509481984
4097              9007199254740992
8193              2251799813685248
16385             1125899906842624
32769             562949953421312
65537             140737488355328
131073            70368744177664
262145            35184372088832
524289            17592186044416
1048577           8796093022208
2097153           4398046511104
4194305           2199023255552
8388609           1099511627776
16777217          549755813888
33554433          274877906944
67108865          137438953472
134217729         68719476736
268435457         34359738368
536870913         17179869184
1073741825        8589934592
2147483649        4294967296
4294967297        2147483648
8589934593        1073741824

13 levels  would, I think, speed things up considerably, with only 4k's wor=
th of overhead. How much deeper would it make sense to go before you get di=
minishing returns?