From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level: 
X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00 autolearn=ham
	autolearn_force=no version=3.4.4
X-Google-Language: ENGLISH,ASCII-7-bit
X-Google-Thread: 103376,cda33fc7f63c2885
X-Google-Attributes: gid103376,public
X-Google-ArrivalTime: 2002-01-09 10:37:03 PST
Path: archiver1.google.com!news1.google.com!postnews1.google.com!not-for-mail
From: dennison@telepath.com (Ted Dennison)
Newsgroups: comp.lang.ada
Subject: Re: list strawman
Date: 9 Jan 2002 10:37:02 -0800
Organization: http://groups.google.com/
Message-ID: <4519e058.0201091037.325fbdbb@posting.google.com>
References: <mailman.1010451062.16465.comp.lang.ada@ada.eu.org>
 <7iE_7.8661$cD4.15714@www.newsranger.com>
 <3c3b13ba$0$212$626a54ce@news.free.fr>
 <sHF_7.8807$cD4.16831@www.newsranger.com>
 <3c3b2aa0$0$212$626a54ce@news.free.fr>
 <zwH_7.9026$cD4.17983@www.newsranger.com>
 <3c3bfabc$0$3190$626a54ce@news.free.fr>
NNTP-Posting-Host: 65.115.221.98
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
X-Trace: posting.google.com 1010601422 21891 127.0.0.1 (9 Jan 2002 18:37:02
 GMT)
X-Complaints-To: groups-abuse@google.com
NNTP-Posting-Date: 9 Jan 2002 18:37:02 GMT
Xref: archiver1.google.com comp.lang.ada:18699
Date: 2002-01-09T18:37:02+00:00
List-Id: <comp.lang.ada>

Jean-Marc Bourguet <jm@bourguet.org> wrote in message news:<3c3bfabc$0$3190$626a54ce@news.free.fr>...
> Ted Dennison<dennison@telepath.com> writes:
> 
> > That's a good point. It would probably be fairly easy to alternate
> > which side the pivot is taken from.
> 
> This will not change the behaviour in N^2 for a (nearly) sorted input.

After thinking about it harder, you are right. Everything will allways
go on one side of the pivot, no matter which side you take it from. So
no "dividing" will actually happen, which is where this
divide-and-conquer algorithm gets its speed from.

> > I believe the splitting part of mergesort would involve a linear
> > search for a linked-list. That would take us into O(n*2) territory,
> > if not worse. In my student days I could figure out the exact
> > factor...hmmm...perhaps its just another nlogn...
> 
> Merge sort can be implemented in O(N log N) worst case, with a O(N)
> for an already sorted input.

OK. By arguing through repeated assertion, you are forcing to either
ignore you, or do the work of proof myself. Since I made the first
claim, its only fair that I do the work I guess...

...OK it looks like the work involved in doing *all* the splits using
linear searches is (n/2 + 2(n/4) + 4(n/8) ....) where n>1. Since there
is one of these factors for each "level", and there are logn levels,
that should work out to roughly (n/2)logn, or O(n logn). So you are
right that it is O(N log N) worst case. However, you are wrong about
the best case. With a linked-list, that is *also* O(N log N).

However, this also shows one the reasons why Quicksort is going to be
faster in practice. We haven't even done any of the real sorting work
yet, and we *already* have a factor of n logn to deal with. Quicksort
in the best case will do exactly that same amount of linear searching,
and be done sorting at the end of it!

Also note that if you were to modify the Quicksort routine to take a
middle pivot, you would get the exact same linear search factor added
in, and it would no longer suck at unordered lists. So your issue
actually degenerates to whether it is worth the speed hit to modify
Quicksort to use a middle pivot. If it is, *then* we can talk about if
mergesort would be quicker.

So again, its a matter of priorities. Algorithm aside, I don't think
that large sorted lists are worth slowing down everyone else.

> Most litterature I've read is about sorting arrays.  AFAIK, it is not
I noticed that too.

> possible to do merge sort in place for arrays (but it is for linked
> list, even singly linked list) and that's probably why this algorithm
> is often neglected.

Perhaps, but then the linear searches required to do the splits kind
of make up for that, don't they?

The impression I got from my last two Algorithms courses was the the
basic problem with Mergesort wasn't its time behavior, but (in
algorithm terms) the huge constant factor that it is multiplied by.
The reason Qucksort generally beats Mergesort's butt is that its
average case O(n log n) is a heck of a lot smaller than mergesort's
average case O(n log n). Until the input gets large, its worst case is
often better than mergesort's best case.