From mboxrd@z Thu Jan 1 00:00:00 1970 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,FREEMAIL_FROM autolearn=unavailable autolearn_force=no version=3.4.4 X-Received: by 10.224.137.68 with SMTP id v4mr14030312qat.1.1377933262605; Sat, 31 Aug 2013 00:14:22 -0700 (PDT) X-Received: by 10.49.1.112 with SMTP id 16mr34906qel.7.1377933262541; Sat, 31 Aug 2013 00:14:22 -0700 (PDT) Path: border1.nntp.dca3.giganews.com!border2.nntp.dca3.giganews.com!border4.nntp.dca.giganews.com!border2.nntp.dca.giganews.com!nntp.giganews.com!q10no2258353qai.0!news-out.google.com!p7ni0qas.0!nntp.google.com!fx3no6284886qab.0!postnews.google.com!glegroupsg2000goo.googlegroups.com!not-for-mail Newsgroups: comp.lang.ada Date: Sat, 31 Aug 2013 00:14:22 -0700 (PDT) Complaints-To: groups-abuse@google.com Injection-Info: glegroupsg2000goo.googlegroups.com; posting-host=105.236.92.204; posting-account=p-xPhAkAAADjHQWEO7sFME2XBdF1P_2H NNTP-Posting-Host: 105.236.92.204 User-Agent: G2/1.0 MIME-Version: 1.0 Message-ID: <9fc26b8a-7a5c-4941-81cb-8f0e9c17a660@googlegroups.com> Subject: B-tree performance From: Peter Brooks Injection-Date: Sat, 31 Aug 2013 07:14:22 +0000 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable X-Original-Bytes: 3484 Xref: number.nntp.dca.giganews.com comp.lang.ada:183241 Date: 2013-08-31T00:14:22-07:00 List-Id: I'm interested to know if anybody has any real-world evidence for this hypo= thesis, or if they agree with my gut feel. If I build a B-tree, then the conventional means to search it would be to s= tart in the middle, then cascade left and right following the pointer struc= ture. That's slow because most searches won't turn up a value in the first = N levels of the b-tree. So it'd make sense to have an array which contains the address values of th= e B-tree down to a certain level. That way, for a look up, you search the a= rray, which is nice and quick, and this leads you to the correct sub-tree a= nd you then continue as usual, from the top of that sub-tree. The question is how many levels would it make sense to keep in the array. T= oo many and the array itself consumes too many resources and slows things d= own. Too few and you don't gain much, particularly if the B-tree is sparse.= The sizes of the buckets, or sub-b-trees would be: Level Buckets 1 18446744073709551616 3 9223372036854775808 5 4611686018427387904 9 2305843009213693952 17 1152921504606846976 33 576460752303423488 65 576460752303423488 129 288230376151711744 257 144115188075855872 513 72057594037927936 1025 36028797018963968 2049 18014398509481984 4097 9007199254740992 8193 2251799813685248 16385 1125899906842624 32769 562949953421312 65537 140737488355328 131073 70368744177664 262145 35184372088832 524289 17592186044416 1048577 8796093022208 2097153 4398046511104 4194305 2199023255552 8388609 1099511627776 16777217 549755813888 33554433 274877906944 67108865 137438953472 134217729 68719476736 268435457 34359738368 536870913 17179869184 1073741825 8589934592 2147483649 4294967296 4294967297 2147483648 8589934593 1073741824 13 levels would, I think, speed things up considerably, with only 4k's wor= th of overhead. How much deeper would it make sense to go before you get di= minishing returns?