comp.lang.ada
 help / color / mirror / Atom feed
* B-tree performance
@ 2013-08-31  7:14 Peter Brooks
  2013-08-31 15:26 ` Shark8
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Peter Brooks @ 2013-08-31  7:14 UTC (permalink / raw)


I'm interested to know if anybody has any real-world evidence for this hypothesis, or if they agree with my gut feel.

If I build a B-tree, then the conventional means to search it would be to start in the middle, then cascade left and right following the pointer structure. That's slow because most searches won't turn up a value in the first N levels of the b-tree.

So it'd make sense to have an array which contains the address values of the B-tree down to a certain level. That way, for a look up, you search the array, which is nice and quick, and this leads you to the correct sub-tree and you then continue as usual, from the top of that sub-tree.

The question is how many levels would it make sense to keep in the array. Too many and the array itself consumes too many resources and slows things down. Too few and you don't gain much, particularly if the B-tree is sparse. The sizes of the buckets, or sub-b-trees would be:

Level      Buckets
1                 18446744073709551616
3                 9223372036854775808
5                 4611686018427387904
9                 2305843009213693952
17                1152921504606846976
33                576460752303423488
65                576460752303423488
129               288230376151711744
257               144115188075855872
513               72057594037927936
1025              36028797018963968
2049              18014398509481984
4097              9007199254740992
8193              2251799813685248
16385             1125899906842624
32769             562949953421312
65537             140737488355328
131073            70368744177664
262145            35184372088832
524289            17592186044416
1048577           8796093022208
2097153           4398046511104
4194305           2199023255552
8388609           1099511627776
16777217          549755813888
33554433          274877906944
67108865          137438953472
134217729         68719476736
268435457         34359738368
536870913         17179869184
1073741825        8589934592
2147483649        4294967296
4294967297        2147483648
8589934593        1073741824

13 levels  would, I think, speed things up considerably, with only 4k's worth of overhead. How much deeper would it make sense to go before you get diminishing returns?



^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2013-09-05 10:12 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-08-31  7:14 B-tree performance Peter Brooks
2013-08-31 15:26 ` Shark8
2013-08-31 15:47   ` Peter Brooks
2013-09-01  4:57 ` Niklas Holsti
2013-09-01  6:05   ` Peter Brooks
2013-09-01  8:05     ` Niklas Holsti
2013-09-01  8:38       ` Peter Brooks
2013-09-02  4:28         ` Niklas Holsti
2013-09-03  2:15           ` Peter Brooks
2013-09-05  8:25         ` Memory mapping and network file systems (Was: B-tree performance) Jacob Sparre Andersen
2013-09-05 10:12           ` Peter Brooks
2013-09-03 16:14 ` B-tree performance Dan'l Miller
2013-09-03 17:43   ` Peter Brooks

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox