From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level: 
X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00 autolearn=ham
	autolearn_force=no version=3.4.4
X-Google-Language: ENGLISH,ASCII-7-bit
X-Google-Thread: fc89c,97188312486d4578
X-Google-Attributes: gidfc89c,public
X-Google-Thread: 109fba,baaf5f793d03d420
X-Google-Attributes: gid109fba,public
X-Google-Thread: 1014db,6154de2e240de72a
X-Google-Attributes: gid1014db,public
X-Google-Thread: 103376,97188312486d4578
X-Google-Attributes: gid103376,public
From: ok@goanna.cs.rmit.edu.au (Richard A. O'Keefe)
Subject: Re: Teaching sorts [was Re: What's the best language to start with?]
Date: 1996/08/27
Message-ID: <4vtq9g$24j@goanna.cs.rmit.edu.au>
X-Deja-AN: 176689767
references: <31FBC584.4188@ivic.qc.ca>
 <01bb83f5$923391e0$87ee6fce@timpent.airshields.com>
 <EACHUS.96Aug8155946@spectre.mitre.org>
 <Dw49yC.Mur@ukpsshp1.serigate.philips.nl> <dewar.840034153@schonberg>
 <4v98io$e99@goanna.cs.rmit.edu.au> <dewar.840491732@schonberg>
 <dewar.840493975@schonberg>
organization: Comp Sci, RMIT, Melbourne, Australia
newsgroups: comp.lang.c,comp.lang.c++,comp.unix.programmer,comp.lang.ada
nntp-posting-user: ok
Date: 1996-08-27T00:00:00+00:00
List-Id: <comp.lang.ada>


dewar@cs.nyu.edu (Robert Dewar) writes:
>In fact the critical thing to preserve the number of compares is to run
>the inner loop backwards. That comes naturally in C, but not in Ada,
>where the preferable code is to find the insertion point and then do a
>slice assignment to shuffle the data (the slice assignment is not only
>at a preferable higher level, but with a good optimizing compiler has
>the potential of generating much more efficient code on some architectures
>than the individual moves.

Whether the language is C or Ada has nothing to do with it.
The key step in insertion sort is moving from

	+--------+-+--------+
    A:	|   ab   |x|    w   |
	+--------+-+--------+
	 L        I        U

where	A[L..I-1] is-permutation-of old A[L..I-1]
  and	A[L..I-1] is-sorted
  and	A[I..U] = old A[I..U]

to

	+----+-+----+--------+
    A:	|  a |x| b  |    w   |
	+----+-+----+--------+
	 L    J    I        U

where	all(A[L..J-1] <= x)
  and	all(x < A[j+1..I])

This can be viewed as two sub-tasks:
    - find J
    - rotate A[J..I] one place

I would expect any good programmer to think "hmm, I *have* to touch
the `b' elements in order to move them; do I really have to touch the
`a' elements as well to find J, or can I fuse these two tasks to do
both while touching only the `b' elements?"

As for the slice being faster, here are the figures I get from
Gnat 3.04 using
	gnatmake -O4 -gnatp -gnatr insert.adb
I already had several versions of insertion sort coded, amongst other
things to explore the question of whether slice assignment helped.
Thanks for your bubble sort procedures.  I have retyped them like this
to make the comparison more direct:

    procedure bubbl1(A: in out Sequence) is
	Switched: Boolean;
	T: Element;
    begin
	loop
	    Switched := False;
	    for J in A'First .. A'Last - 1 loop
		if A(J+1) < A(J) then
		    T := A(J);
		    A(J) := A(J+1);
		    A(J+1) := T;
		    Switched := True;
		end if;
	    end loop;
	    exit when not Switched;
	end loop;
    end bubbl1;

    procedure bubbl2(A: in out Sequence) is
	T: Element;
    begin
	<<Restart>>
	    for J in A'First .. A'Last - 1 loop
		if A(J+1) < A(J) then
		    T := A(J);
		    A(J) := A(J+1);
		    A(J+1) := T;
		    for K in J+1 .. A'Last-1 loop
			if A(K+1) < A(K) then
			    T := A(K);
			    A(K) := A(K+1);
			    A(K+1) := T;
			end if;
		    end loop;
		    goto Restart;
		end if;
	    end loop;
    end bubbl2;

Now, here are the times I got on a SPARCstation-10 (sun4m).
All times are in nanoseconds.  The best case times should not be
taken very seriously; all the methods go so fast it is hard to time
them.

Best	Average  	Worst	Method
400*N	 45*N**2    90*N**2   Naive insertion sort
300*N	110*N**2   240*N**2   Slice assignment insertion sort
200*N	180*N**2   200*N**2   bubbl1 (bubble sort with boolean variable)
200*N	170*N**2   200*N**2   bubbl2 (bubble sort with jump)

Result 1:  I was unable to obtain any evidence that bubbl2 is faster
	   than bubbl1.  (Remember that these times are _not_ precise.)
Result 2:  Using GNAT 3.04 with array bounds checking switched off and
	   optimisation high, the slice assignment does not pay off.
Result 3:  Insertion sort is much faster than bubble sort, except for
	   the case of an array that is already in sort order.

>So, you ran the inner loop backwards and indeed got the number of
>compares right, but you still have the overhead of both loops testing
>the termination condition,

and the evidence *does* show this happening in the best case.

>I find Hillam's initial bubble sort algorithm curious. I certainly don't
>call this bubble sort.

I call your code bubble sort too, and was rather surprised by Hillam's
book.  I used his code merely because it happened to be handy.  I guess
this is evidence that Hillam's book should be moved to the "Sturgeon's Law"
pile, which would not surprise me greatly.

>You missed the point. In the bubble sort, the outer loop is executed only
>once for a sorted list. For your insertion sort, the outer loop runs a full
>number of times. Your insertion sort has 2*N loop termination tests, while
>a properly written bubble sort has N loop termination tests.

Of course, the times I reported above apply to sorting an array of Integer.
I would say that bubble sort does N index comparisons and N element
comparisons in the best case, while insertion sort does 2N index comparisons
and N element comparisons, so we're comparing 2N with 3N, rather than 1N with
2N.  In the cases I have cared about in the past, the element comparisons have
been much more costly than index comparisons.

>I do not seem to count five variables here! More like two, one of which
>is a loop control constant.

Expanding Exch() inline, I count three variables for bubble sort:
	- a boolean
	- an index
	- an element
which is pretty close to insertion sort's two indices and one element.

>I do not know any way to write the insertion sort so that it has only N
>loop termination tests, certainly your example has 2*N tests.

Well, there are several pretty obvious things to try.

(1) Do one pass of selection sort, to find the left-most smallest element,
    and rotate the array so that it goes first.  Now we have a sentinel,
    and the index comparison can be dropped from the inner loop.

(2) Insert a "check for sorted prefix".  Something like this:

	void insertion_sort(elt *a, int n) {
	    int i, j;

	    /* find sorted prefix */
	    for (i = 1; i < N && !(a[i] < a[i-1]); i++) {}
	    /* now resume normal insertion sort */
	    for (; i < N; i++) {
		elt const t = a[i];
		for (j = i; j > 0 && t < a[j-1]; j--)
		    a[j] = a[j-1];
		a[j] = t;
	    }
	}

    If the array is already sorted, the first loop will do N index
    comparisons and N-1 element comparisons.

(3) We can actually write insertion sort as
	<<initialise i>>
	loop
	    <<search forward for out of place element>>
	    exit when <<there isn't any>>
	    <<fused find J and rotate>>
	end loop;
    so that the cost is O(N + X.N) where X is the number of out-of-order
    elements encountered.

This actually bears on another thread in this group:  you do NOT need to
think in assembly code terms to think of improvements such as this.

-- 
Australian citizen since 14 August 1996.  *Now* I can vote the xxxs out!
Richard A. O'Keefe; http://www.cs.rmit.edu.au/~ok; RMIT Comp.Sci.