From: Peter Brooks <peter.h.m.brooks@gmail.com>
Subject: B-tree performance
Date: Sat, 31 Aug 2013 00:14:22 -0700 (PDT)
Date: 2013-08-31T00:14:22-07:00 [thread overview]
Message-ID: <9fc26b8a-7a5c-4941-81cb-8f0e9c17a660@googlegroups.com> (raw)
I'm interested to know if anybody has any real-world evidence for this hypothesis, or if they agree with my gut feel.
If I build a B-tree, then the conventional means to search it would be to start in the middle, then cascade left and right following the pointer structure. That's slow because most searches won't turn up a value in the first N levels of the b-tree.
So it'd make sense to have an array which contains the address values of the B-tree down to a certain level. That way, for a look up, you search the array, which is nice and quick, and this leads you to the correct sub-tree and you then continue as usual, from the top of that sub-tree.
The question is how many levels would it make sense to keep in the array. Too many and the array itself consumes too many resources and slows things down. Too few and you don't gain much, particularly if the B-tree is sparse. The sizes of the buckets, or sub-b-trees would be:
Level Buckets
1 18446744073709551616
3 9223372036854775808
5 4611686018427387904
9 2305843009213693952
17 1152921504606846976
33 576460752303423488
65 576460752303423488
129 288230376151711744
257 144115188075855872
513 72057594037927936
1025 36028797018963968
2049 18014398509481984
4097 9007199254740992
8193 2251799813685248
16385 1125899906842624
32769 562949953421312
65537 140737488355328
131073 70368744177664
262145 35184372088832
524289 17592186044416
1048577 8796093022208
2097153 4398046511104
4194305 2199023255552
8388609 1099511627776
16777217 549755813888
33554433 274877906944
67108865 137438953472
134217729 68719476736
268435457 34359738368
536870913 17179869184
1073741825 8589934592
2147483649 4294967296
4294967297 2147483648
8589934593 1073741824
13 levels would, I think, speed things up considerably, with only 4k's worth of overhead. How much deeper would it make sense to go before you get diminishing returns?
next reply other threads:[~2013-08-31 7:14 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-08-31 7:14 Peter Brooks [this message]
2013-08-31 15:26 ` B-tree performance Shark8
2013-08-31 15:47 ` Peter Brooks
2013-09-01 4:57 ` Niklas Holsti
2013-09-01 6:05 ` Peter Brooks
2013-09-01 8:05 ` Niklas Holsti
2013-09-01 8:38 ` Peter Brooks
2013-09-02 4:28 ` Niklas Holsti
2013-09-03 2:15 ` Peter Brooks
2013-09-05 8:25 ` Memory mapping and network file systems (Was: B-tree performance) Jacob Sparre Andersen
2013-09-05 10:12 ` Peter Brooks
2013-09-03 16:14 ` B-tree performance Dan'l Miller
2013-09-03 17:43 ` Peter Brooks
replies disabled
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox