comp.lang.ada
 help / color / mirror / Atom feed
* B-Tree v Skip-List Insertion Benchmark
@ 2014-09-18  1:15 Jeffrey Carter
  2014-09-18  7:41 ` Dmitry A. Kazakov
  0 siblings, 1 reply; 7+ messages in thread
From: Jeffrey Carter @ 2014-09-18  1:15 UTC (permalink / raw)


I ran a quick comparison of the insertion times for Kazakov's Generic_B_Tree pkg
against PragmARC.Skip_List_Unbounded from the PragmAda Reusable Components.

Both data structures are used for similar purposes, allowing O(log N) look-up
times. Insertion and deletion are expensive operations on balanced trees due to
re-balancing, and one reason skip lists were invented was to have fast insertion
and deletion times compared to balanced trees. I had never actually compared
times, and Kazakov's recent post comparing DB times got me thinking about it.

A typical run inserting the same 1,000 items into both structures gives times of

1.484 ms for the B tree
0.709 ms for the skip list

(Divide by 1,000 for average per-insertion times.)

The trade off is similar to using heap sort or quick sort. Both are O(N log N),
with quick sort usually being faster. Heap sort is always O(N log N), but in
rare cases, however, quick sort has worst-case performance of O(N**2).

A skip list is probabilistically balanced, and has a worst-case search time of
O(N) in very rare circumstances, almost always for small N (< 256).

-- 
Jeff Carter
"Perfidious English mouse-dropping hoarders."
Monty Python & the Holy Grail
10


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: B-Tree v Skip-List Insertion Benchmark
  2014-09-18  1:15 B-Tree v Skip-List Insertion Benchmark Jeffrey Carter
@ 2014-09-18  7:41 ` Dmitry A. Kazakov
  2014-09-18 16:53   ` Jeffrey Carter
  2014-09-18 17:39   ` Jeffrey Carter
  0 siblings, 2 replies; 7+ messages in thread
From: Dmitry A. Kazakov @ 2014-09-18  7:41 UTC (permalink / raw)


On Wed, 17 Sep 2014 18:15:15 -0700, Jeffrey Carter wrote:

> I ran a quick comparison of the insertion times for Kazakov's Generic_B_Tree pkg
> against PragmARC.Skip_List_Unbounded from the PragmAda Reusable Components.
> 
> Both data structures are used for similar purposes, allowing O(log N) look-up
> times. Insertion and deletion are expensive operations on balanced trees due to
> re-balancing, and one reason skip lists were invented was to have fast insertion
> and deletion times compared to balanced trees. I had never actually compared
> times, and Kazakov's recent post comparing DB times got me thinking about it.
> 
> A typical run inserting the same 1,000 items into both structures gives times of
> 
> 1.484 ms for the B tree
> 0.709 ms for the skip list
> 
> (Divide by 1,000 for average per-insertion times.)
> 
> The trade off is similar to using heap sort or quick sort. Both are O(N log N),
> with quick sort usually being faster. Heap sort is always O(N log N), but in
> rare cases, however, quick sort has worst-case performance of O(N**2).
> 
> A skip list is probabilistically balanced, and has a worst-case search time of
> O(N) in very rare circumstances, almost always for small N (< 256).

Thanks for pointing this.

Can skip list made fit into a series fixed-size blocks?

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: B-Tree v Skip-List Insertion Benchmark
  2014-09-18  7:41 ` Dmitry A. Kazakov
@ 2014-09-18 16:53   ` Jeffrey Carter
  2014-09-18 17:54     ` Dmitry A. Kazakov
  2014-09-18 17:39   ` Jeffrey Carter
  1 sibling, 1 reply; 7+ messages in thread
From: Jeffrey Carter @ 2014-09-18 16:53 UTC (permalink / raw)


On 09/18/2014 12:41 AM, Dmitry A. Kazakov wrote:
> 
> Can skip list made fit into a series fixed-size blocks?

I don't understand the question.

-- 
Jeff Carter
"From this day on, the official language of San Marcos will be Swedish."
Bananas
28


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: B-Tree v Skip-List Insertion Benchmark
  2014-09-18  7:41 ` Dmitry A. Kazakov
  2014-09-18 16:53   ` Jeffrey Carter
@ 2014-09-18 17:39   ` Jeffrey Carter
  1 sibling, 0 replies; 7+ messages in thread
From: Jeffrey Carter @ 2014-09-18 17:39 UTC (permalink / raw)


One comment on Generic_B_Tree: O(log N)-searchable structures are commonly used
as maps, sets, and priority queues. The interface of Generic_B_Tree seems skewed
towards use as a map; for the other uses the client must supply a dummy Value
parameter. It might be better to have a more general B-tree pkg, and, if
desired, build a more specific map interface on top of that.

PragmARC.Skip_List_Unbounded takes the opposite approach, providing a general
interface and presuming that the client knows how to use it to implement a map.

Ada.Containers provides the specific interfaces (Ordered_Maps and such) but does
not provide the underlying data structure.

-- 
Jeff Carter
"From this day on, the official language of San Marcos will be Swedish."
Bananas
28


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: B-Tree v Skip-List Insertion Benchmark
  2014-09-18 16:53   ` Jeffrey Carter
@ 2014-09-18 17:54     ` Dmitry A. Kazakov
  2014-09-18 19:33       ` Jeffrey Carter
  0 siblings, 1 reply; 7+ messages in thread
From: Dmitry A. Kazakov @ 2014-09-18 17:54 UTC (permalink / raw)


On Thu, 18 Sep 2014 09:53:50 -0700, Jeffrey Carter wrote:

> On 09/18/2014 12:41 AM, Dmitry A. Kazakov wrote:
>> 
>> Can skip list made fit into a series fixed-size blocks?
> 
> I don't understand the question.

The structure must be chunked into blocks in order to be stored. When you
update the container, the change should be localized to one block when
possible. This is a property B-trees have. You make its buckets of the file
block size and then insert or delete would update just one block, most of
the time.

If, say, a doubly-linked list element is inserted or deleted this would
involve, most likely, tree blocks (to fix the links), which would devaluate
the fact that these operations have O(1) complexity.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: B-Tree v Skip-List Insertion Benchmark
  2014-09-18 17:54     ` Dmitry A. Kazakov
@ 2014-09-18 19:33       ` Jeffrey Carter
  2014-09-18 20:31         ` Dmitry A. Kazakov
  0 siblings, 1 reply; 7+ messages in thread
From: Jeffrey Carter @ 2014-09-18 19:33 UTC (permalink / raw)


On 09/18/2014 10:54 AM, Dmitry A. Kazakov wrote:
> 
> The structure must be chunked into blocks in order to be stored. When you
> update the container, the change should be localized to one block when
> possible. This is a property B-trees have. You make its buckets of the file
> block size and then insert or delete would update just one block, most of
> the time.

So you store the tree structure in the file, as well as the data? A node in a
skip list has a variable number of links to other nodes, so as implemented in
PragmARC.Skip_List_Unbounded would probably not lend itself to that. It could be
implemented with a fixed number of links, of which only a certain number are
actually used, which might work better.

I was only interested in checking the claim that insertion in a skip list is
quicker than in a B tree.

> If, say, a doubly-linked list element is inserted or deleted this would
> involve, most likely, tree blocks (to fix the links), which would devaluate
> the fact that these operations have O(1) complexity.

Insertion and deletion in a skip list are O(log N), since you have to search for
where to insert the new node. The time savings comes from not having a
re-balancing step.

-- 
Jeff Carter
"From this day on, the official language of San Marcos will be Swedish."
Bananas
28


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: B-Tree v Skip-List Insertion Benchmark
  2014-09-18 19:33       ` Jeffrey Carter
@ 2014-09-18 20:31         ` Dmitry A. Kazakov
  0 siblings, 0 replies; 7+ messages in thread
From: Dmitry A. Kazakov @ 2014-09-18 20:31 UTC (permalink / raw)


On Thu, 18 Sep 2014 12:33:16 -0700, Jeffrey Carter wrote:

> On 09/18/2014 10:54 AM, Dmitry A. Kazakov wrote:
>> 
>> The structure must be chunked into blocks in order to be stored. When you
>> update the container, the change should be localized to one block when
>> possible. This is a property B-trees have. You make its buckets of the file
>> block size and then insert or delete would update just one block, most of
>> the time.
> 
> So you store the tree structure in the file, as well as the data?

Yes. Everything is in the same file.

> A node in a
> skip list has a variable number of links to other nodes, so as implemented in
> PragmARC.Skip_List_Unbounded would probably not lend itself to that. It could be
> implemented with a fixed number of links, of which only a certain number are
> actually used, which might work better.

There is a persistent memory pool, so it would be no problem to allocate
variable-length nodes. The concern is the case when the nodes become too
large for one block. The memory pool supports a stream interface. So a node
can be read from a chain of blocks and reconstructed in the memory then
updated and written back (in other blocks), no matter how many blocks it
spans. But it would be a considerable performance hit, I guess.

> I was only interested in checking the claim that insertion in a skip list is
> quicker than in a B tree.

Yes, there are many structures more efficient than B-tree, but they might
be difficult to apply to the case of segmented (as blocks) storage rather
than flat storage. It is interesting to learn if skip lists could be used
this way. Which is why I asked.

>> If, say, a doubly-linked list element is inserted or deleted this would
>> involve, most likely, tree blocks (to fix the links), which would devaluate
>> the fact that these operations have O(1) complexity.
> 
> Insertion and deletion in a skip list are O(log N), since you have to search for
> where to insert the new node. The time savings comes from not having a
> re-balancing step.

No, I consider search as another operation. Let the location be already
determined. Then maintaining the structure gets a hit when not localized to
few blocks.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2014-09-18 20:31 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-09-18  1:15 B-Tree v Skip-List Insertion Benchmark Jeffrey Carter
2014-09-18  7:41 ` Dmitry A. Kazakov
2014-09-18 16:53   ` Jeffrey Carter
2014-09-18 17:54     ` Dmitry A. Kazakov
2014-09-18 19:33       ` Jeffrey Carter
2014-09-18 20:31         ` Dmitry A. Kazakov
2014-09-18 17:39   ` Jeffrey Carter

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox