From mboxrd@z Thu Jan 1 00:00:00 1970 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00 autolearn=ham autolearn_force=no version=3.4.4 X-Google-Language: ENGLISH,ASCII-7-bit X-Google-Thread: 103376,ac39a12d5faf5b14 X-Google-Attributes: gid103376,public X-Google-ArrivalTime: 2002-04-24 10:37:40 PST Newsgroups: comp.lang.ada Path: archiver1.google.com!news1.google.com!newsfeed.stanford.edu!logbridge.uoregon.edu!uunet!sea.uu.net!ash.uu.net!xyzzy!nntp From: Jeffrey Carter Subject: Re: Grace and Maps (was Re: Development process in the Ada community) X-Nntp-Posting-Host: e246420.msc.az.boeing.com Message-ID: <3CC6EA85.CA2735B2@boeing.com> Sender: nntp@news.boeing.com (Boeing NNTP News Access) Organization: The Boeing Company X-Accept-Language: en References: <3CB46975.90408@snafu.de> <3CBAFFEE.2080708@snafu.de> <4519e058.0204171036.6f0a7394@posting.google.com> <3CBDD795.4060706@snafu.de> <4519e058.0204180800.44fac012@posting.google.com> <3CBF0341.8020406@mail.com> <4519e058.0204190529.559a47ae@posting.google.com> <3CC1C6B3.6060306@telepath.com> <3CC21747.5000501@telepath.com> <4519e058.0204220534.2eb33730@posting.go Mime-Version: 1.0 Date: Wed, 24 Apr 2002 17:25:25 GMT X-Mailer: Mozilla 4.73 [en]C-CCK-MCD Boeing Kit (WinNT; U) Xref: archiver1.google.com comp.lang.ada:23067 Date: 2002-04-24T17:25:25+00:00 List-Id: <3CC48F34.5A474E0F@boeing.com> <3CC4C800.10107@telepath.com> <3CC5F26A.6F74BD0F@boeing.com> <3CC6C8DA.162604F7@san.rr.com> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Darren New wrote: > > Jeffrey Carter wrote: > > Skip lists are probabilistically balanced. Average search times are > > O(log N). The good news is that even a halfway decent random-number > > generator ensures decent balancing if N is large enough to matter. See > > figure 8 in > > > > ftp://ftp.cs.umd.edu/pub/skipLists/skiplists.pdf I was a bit imprecise in my language here. The quality of the generator is not very important if insertions are fairly random. If insertions tend to be ordered, a poor generator can degrade performance. With a decent generator, such as the Ada.Float_Random that comes with GNAT, the order of insertions is immaterial. > > Excellent reference. But I think there's still a number of advantages to > red-black trees over skip lists. I'm interested in hearing why you think > skip-lists are superior. This is the paper in CACM in which Pugh introduced skip lists. Those were the good old days. You don't find useful algorithms in CACM anymore ... > > Here's my "Top Ten" list... Looks like 6 to me, with three of them being reasons skip lists are better :) > > 10) A malicious user could make even the best skip list perform really > poorly. In these days of internet break-ins, this could make a > difference. All the user has to do is know what the random number > generator generates. The user can either delete all the non-level-1 > nodes, or run a copy of the PRNG and insert nodes whose value equals the > height of the node. In the latter case, you'd wind up with a string of 1 > values, all at height 1, then a string of 2 values all at height 2, then > a string of 3 values all at height 3, etc. This is difficult to do if, as one would expect, the random number generator is seeded using the time. Not impossible, I agree. Is it worth worrying about in the general case? For small N, the difference in time is negligible; for large N, the best you can achieve is to slow the system down. I don't expect generic components to be suitable for secure applications. > > 9) The skip-lists look like they take more space. In addition to an > average of 2 pointers for each node, you also have the 'Range value. > Since tree nodes are fixed size, I would think a decent compiler > wouldn't bother to actually encode the size of the node in each node. With the recommended p=0.25, they take an average of 1.333 pointers/node. They have a Height component lacking in trees, but balanced trees tend to have balance information in the nodes. You've addressed this below. Some argue that skip lists take less space, but I consider them equivalent. > > 8) The skip-lists look like they take more space, part deux. If I do > "insert 1, insert 2, delete 1, insert 3, insert 4, delete 3, insert 5, > insert 6, delete 5, ..." 100 times on a tree, I'm going to use up 100 > nodes of memory and have one free left over. With a skip-list, since the > nodes are different sizes, I'll likely have some additional memory > fragmentation. True for an unbounded implementation. You could make all the nodes the same size, if you don't mind the wasted space :) Naturally, a bounded implementation avoids fragmentation. > 6) The tree has no obvious way of going from a node to the next node, > making an iterator complex to implement. That is, it's easy to traverse > a tree in order recursively, but somewhat more difficult to traverse a > tree iteratively (in the sense used in Grace.Lists). In a skip-list, > each node points to the next node. On the other hand, I don't see any > obvious way, given an arbitrary node, to do an insert/delete on a > skip-list without repeating the search. For a general purpose component, I don't see trees offering insert/delete without a search, either. If you're going to keep state information about the last search and use it to optimize insert/delete, you could do that for any implementation. > > 5) The algorithms are admittedly a little more complicated, with > probably a few more assignments of pointers than with skip-lists. > However, the algorithms are well-known, widely available, and only need > to be implemented once, so I don't see the complexity factor as very > important. It might be a touch slower tho, with maybe 8 or 10 pointer > assignments where a skip-list doing the same thing might have 2 or 4. A *little* more complicated? I've seen balanced tree deletion algorithms in texts that run to 4 pages; compare that to the 20 lines or so for a skip list. As for timing, Pugh compares the CPU time of skip lists against several kinds of balanced trees. Skip lists are about as fast or faster than the trees for searching, and faster than the trees for insert/delete. http://www.csihq.com/analysis/skiplist.pdf compares insertion times in skip lists to B-trees (N=50,000 to 950,000 in 50,000 steps) and found that Unix system time increased exponentially for N>=600,000, while it increased linearly for skip lists, indicating that large B-trees did significantly. Considering only Unix CPU time, skip lists were 5-10 times as fast as B-trees; an order of magnitude is not a "touch" slower. > I'd be interested in hearing what the advantages of the skip-list are > over trees, other than slightly simpler code. Is it just the > performance? (Which is, of course, important.) Between the significantly simpler algorithms, which are more likely to be implemented correctly, and the observed performance advantages of skip lists, the choice seems obvious. -- Jeffrey Carter