From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level: 
X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00 autolearn=ham
	autolearn_force=no version=3.4.4
X-Google-Language: ENGLISH,ASCII-7-bit
X-Google-Thread: 103376,ac39a12d5faf5b14
X-Google-Attributes: gid103376,public
X-Google-ArrivalTime: 2002-04-24 08:01:32 PST
Path: 
 archiver1.google.com!news1.google.com!newsfeed.stanford.edu!news-spur1.maxwell.syr.edu!news.maxwell.syr.edu!newsfeed1.cidera.com!Cidera!cyclone.socal.rr.com!cyclone3.kc.rr.com!news3.kc.rr.com!twister.socal.rr.com.POSTED!not-for-mail
Message-ID: <3CC6C8DA.162604F7@san.rr.com>
From: Darren New <dnew@san.rr.com>
X-Mailer: Mozilla 4.77 [en] (Windows NT 5.0; U)
X-Accept-Language: en
MIME-Version: 1.0
Newsgroups: comp.lang.ada
Subject: Re: Grace and Maps (was Re: Development process in the Ada community)
References: <3CB46975.90408@snafu.de> <3CBAFFEE.2080708@snafu.de>
 <4519e058.0204171036.6f0a7394@posting.google.com> <3CBDD795.4060706@snafu.de>
 <4519e058.0204180800.44fac012@posting.google.com> <3CBF0341.8020406@mail.com>
 <a9n0tk$6fs$1@nh.pace.co.uk>
 <4519e058.0204190529.559a47ae@posting.google.com>
 <a9p98h$8hn$1@nh.pace.co.uk> <3CC1C6B3.6060306@telepath.com>
 <3CC21747.5000501@telepath.com> <V+UgdHreGTJg@eisner.encompasserve.org>
 <x7vr8l871it.fsf@smaug.pushface.org> <4519e058.0204220534.2eb33730@posting.go
	 <3CC48F34.5A474E0F@boeing.com> <3CC4C800.10107@telepath.com>
 <3CC5F26A.6F74BD0F@boeing.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Date: Wed, 24 Apr 2002 15:01:07 GMT
NNTP-Posting-Host: 66.75.151.160
X-Complaints-To: abuse@rr.com
X-Trace: twister.socal.rr.com 1019660467 66.75.151.160 (Wed,
 24 Apr 2002 08:01:07 PDT)
NNTP-Posting-Date: Wed, 24 Apr 2002 08:01:07 PDT
Organization: RoadRunner - West
Xref: archiver1.google.com comp.lang.ada:23056
Date: 2002-04-24T15:01:07+00:00
List-Id: <comp.lang.ada>

Jeffrey Carter wrote:
> Skip lists are probabilistically balanced. Average search times are
> O(log N). The good news is that even a halfway decent random-number
> generator ensures decent balancing if N is large enough to matter. See
> figure 8 in
> 
> ftp://ftp.cs.umd.edu/pub/skipLists/skiplists.pdf

Excellent reference. But I think there's still a number of advantages to
red-black trees over skip lists. I'm interested in hearing why you think
skip-lists are superior.

Here's my "Top Ten" list...

10) A malicious user could make even the best skip list perform really
poorly. In these days of internet break-ins, this could make a
difference. All the user has to do is know what the random number
generator generates. The user can either delete all the non-level-1
nodes, or run a copy of the PRNG and insert nodes whose value equals the
height of the node. In the latter case, you'd wind up with a string of 1
values, all at height 1, then a string of 2 values all at height 2, then
a string of 3 values all at height 3, etc.

9) The skip-lists look like they take more space. In addition to an
average of 2 pointers for each node, you also have the 'Range value.
Since tree nodes are fixed size, I would think a decent compiler
wouldn't bother to actually encode the size of the node in each node.

8) The skip-lists look like they take more space, part deux. If I do
"insert 1, insert 2, delete 1, insert 3, insert 4, delete 3, insert 5,
insert 6, delete 5, ..." 100 times on a tree, I'm going to use up 100
nodes of memory and have one free left over. With a skip-list, since the
nodes are different sizes, I'll likely have some additional memory
fragmentation. 

On the other hand, I think after re-reading some red-black and AVL
algorithms again, I'm mistaken about the "more space" bit. The problems
with a red-black tree are...

7) The balanced trees require you to carry balance information around in
the nodes - the red-black tree carries the color and the AVL trees carry
the balance.

6) The tree has no obvious way of going from a node to the next node,
making an iterator complex to implement. That is, it's easy to traverse
a tree in order recursively, but somewhat more difficult to traverse a
tree iteratively (in the sense used in Grace.Lists). In a skip-list,
each node points to the next node. On the other hand, I don't see any
obvious way, given an arbitrary node, to do an insert/delete on a
skip-list without repeating the search.

5) The algorithms are admittedly a little more complicated, with
probably a few more assignments of pointers than with skip-lists.
However, the algorithms are well-known, widely available, and only need
to be implemented once, so I don't see the complexity factor as very
important. It might be a touch slower tho, with maybe 8 or 10 pointer
assignments where a skip-list doing the same thing might have 2 or 4.

+=+=+

Overall, it might be that the best implementation is a doubly-linked
list with either a tree or a skip-list on top. That is, either a
skip-list where each node also includes back pointers to the previous
nodes, or a red-black or an AVL tree where each node includes pointers
to the previous and next value. This way, there's an very similar
interface to the List package, except that "insert" takes both a key and
a value, and instantiation requires the "<" operator. You could build it
as essentially a doubly-linked list that keeps itself in order
efficiently. 

I don't see any compelling value of using a skip-list over using a tree,
or vica versa. Perhaps providing both implementations would be of
interest, proving that the interface is sufficiently generic that the
implementation can be completely changed under the covers. This would
allow the real-time folks to get deterministic behavior when they wanted
at the expense of better probabilistic performance.

I'd be interested in hearing what the advantages of the skip-list are
over trees, other than slightly simpler code. Is it just the
performance? (Which is, of course, important.)

> "Although a skip list may not be balanced, the probability of a skip
> list being substantially unbalanced is insignificant.

I think he means a *random* skip-list. The probability is insignificant
if you don't actually *try* to unbalance it.

-- 
Darren New 
San Diego, CA, USA (PST). Cryptokeys on demand.
   The 90/10 rule of toothpaste: the last 10% of 
         the tube lasts as long as the first 90%.