From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level: 
X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00 autolearn=ham
	autolearn_force=no version=3.4.4
X-Google-Language: ENGLISH,ASCII-7-bit
X-Google-Thread: 103376,ac39a12d5faf5b14
X-Google-Attributes: gid103376,public
X-Google-ArrivalTime: 2002-04-24 10:37:40 PST
Newsgroups: comp.lang.ada
Path: 
 archiver1.google.com!news1.google.com!newsfeed.stanford.edu!logbridge.uoregon.edu!uunet!sea.uu.net!ash.uu.net!xyzzy!nntp
From: Jeffrey Carter <jeffrey.carter@boeing.com>
Subject: Re: Grace and Maps (was Re: Development process in the Ada community)
X-Nntp-Posting-Host: e246420.msc.az.boeing.com
Message-ID: <3CC6EA85.CA2735B2@boeing.com>
Sender: nntp@news.boeing.com (Boeing NNTP News Access)
Organization: The Boeing Company
X-Accept-Language: en
References: <3CB46975.90408@snafu.de> <3CBAFFEE.2080708@snafu.de>
 <4519e058.0204171036.6f0a7394@posting.google.com> <3CBDD795.4060706@snafu.de>
 <4519e058.0204180800.44fac012@posting.google.com> <3CBF0341.8020406@mail.com>
 <a9n0tk$6fs$1@nh.pace.co.uk>
 <4519e058.0204190529.559a47ae@posting.google.com>
 <a9p98h$8hn$1@nh.pace.co.uk> <3CC1C6B3.6060306@telepath.com>
 <3CC21747.5000501@telepath.com> <V+UgdHreGTJg@eisner.encompasserve.org>
 <x7vr8l871it.fsf@smaug.pushface.org> <4519e058.0204220534.2eb33730@posting.go
Mime-Version: 1.0
Date: Wed, 24 Apr 2002 17:25:25 GMT
X-Mailer: Mozilla 4.73 [en]C-CCK-MCD Boeing Kit  (WinNT; U)
Xref: archiver1.google.com comp.lang.ada:23067
Date: 2002-04-24T17:25:25+00:00
List-Id: <comp.lang.ada>

		 <3CC48F34.5A474E0F@boeing.com> <3CC4C800.10107@telepath.com> <3CC5F26A.6F74BD0F@boeing.com> <3CC6C8DA.162604F7@san.rr.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Darren New wrote:
> 
> Jeffrey Carter wrote:
> > Skip lists are probabilistically balanced. Average search times are
> > O(log N). The good news is that even a halfway decent random-number
> > generator ensures decent balancing if N is large enough to matter. See
> > figure 8 in
> >
> > ftp://ftp.cs.umd.edu/pub/skipLists/skiplists.pdf

I was a bit imprecise in my language here. The quality of the generator
is not very important if insertions are fairly random. If insertions
tend to be ordered, a poor generator can degrade performance. With a
decent generator, such as the Ada.Float_Random that comes with GNAT, the
order of insertions is immaterial.

> 
> Excellent reference. But I think there's still a number of advantages to
> red-black trees over skip lists. I'm interested in hearing why you think
> skip-lists are superior.

This is the paper in CACM in which Pugh introduced skip lists. Those
were the good old days. You don't find useful algorithms in CACM anymore
...

> 
> Here's my "Top Ten" list...

Looks like 6 to me, with three of them being reasons skip lists are
better :)

> 
> 10) A malicious user could make even the best skip list perform really
> poorly. In these days of internet break-ins, this could make a
> difference. All the user has to do is know what the random number
> generator generates. The user can either delete all the non-level-1
> nodes, or run a copy of the PRNG and insert nodes whose value equals the
> height of the node. In the latter case, you'd wind up with a string of 1
> values, all at height 1, then a string of 2 values all at height 2, then
> a string of 3 values all at height 3, etc.

This is difficult to do if, as one would expect, the random number
generator is seeded using the time. Not impossible, I agree. Is it worth
worrying about in the general case? For small N, the difference in time
is negligible; for large N, the best you can achieve is to slow the
system down. I don't expect generic components to be suitable for secure
applications.

> 
> 9) The skip-lists look like they take more space. In addition to an
> average of 2 pointers for each node, you also have the 'Range value.
> Since tree nodes are fixed size, I would think a decent compiler
> wouldn't bother to actually encode the size of the node in each node.

With the recommended p=0.25, they take an average of 1.333
pointers/node. They have a Height component lacking in trees, but
balanced trees tend to have balance information in the nodes. You've
addressed this below. Some argue that skip lists take less space, but I
consider them equivalent.

> 
> 8) The skip-lists look like they take more space, part deux. If I do
> "insert 1, insert 2, delete 1, insert 3, insert 4, delete 3, insert 5,
> insert 6, delete 5, ..." 100 times on a tree, I'm going to use up 100
> nodes of memory and have one free left over. With a skip-list, since the
> nodes are different sizes, I'll likely have some additional memory
> fragmentation.

True for an unbounded implementation. You could make all the nodes the
same size, if you don't mind the wasted space :) Naturally, a bounded
implementation avoids fragmentation.

> 6) The tree has no obvious way of going from a node to the next node,
> making an iterator complex to implement. That is, it's easy to traverse
> a tree in order recursively, but somewhat more difficult to traverse a
> tree iteratively (in the sense used in Grace.Lists). In a skip-list,
> each node points to the next node. On the other hand, I don't see any
> obvious way, given an arbitrary node, to do an insert/delete on a
> skip-list without repeating the search.

For a general purpose component, I don't see trees offering
insert/delete without a search, either. If you're going to keep state
information about the last search and use it to optimize insert/delete,
you could do that for any implementation.

> 
> 5) The algorithms are admittedly a little more complicated, with
> probably a few more assignments of pointers than with skip-lists.
> However, the algorithms are well-known, widely available, and only need
> to be implemented once, so I don't see the complexity factor as very
> important. It might be a touch slower tho, with maybe 8 or 10 pointer
> assignments where a skip-list doing the same thing might have 2 or 4.

A *little* more complicated? I've seen balanced tree deletion algorithms
in texts that run to 4 pages; compare that to the 20 lines or so for a
skip list.

As for timing, Pugh compares the CPU time of skip lists against several
kinds of balanced trees. Skip lists are about as fast or faster than the
trees for searching, and faster than the trees for insert/delete.
http://www.csihq.com/analysis/skiplist.pdf compares insertion times in
skip lists to B-trees (N=50,000 to 950,000 in 50,000 steps) and found
that Unix system time increased exponentially for N>=600,000, while it
increased linearly for skip lists, indicating that large B-trees did
significantly. Considering only Unix CPU time, skip lists were 5-10
times as fast as B-trees; an order of magnitude is not a "touch" slower.

> I'd be interested in hearing what the advantages of the skip-list are
> over trees, other than slightly simpler code. Is it just the
> performance? (Which is, of course, important.)

Between the significantly simpler algorithms, which are more likely to
be implemented correctly, and the observed performance advantages of
skip lists, the choice seems obvious.

-- 
Jeffrey Carter