From: ok@goanna.cs.rmit.edu.au (Richard A. O'Keefe)
Subject: Re: Teaching sorts [was Re: What's the best language to start with?]
Date: 1996/08/19
Message-ID: <4v98io$e99@goanna.cs.rmit.edu.au>
references: <31FBC584.4188@ivic.qc.ca> <01bb83f5$923391e0$87ee6fce@timpent.airshields.com>
organization: Comp Sci, RMIT, Melbourne, Australia
newsgroups: comp.lang.c,comp.lang.c++,comp.unix.programmer,comp.lang.ada

dewar@cs.nyu.edu (Robert Dewar) writes:
>The one advantage of bubble sort is that it is close to optimal on sorted
>or nearly sorted arrays.  You have to be very careful how you write insertion
>sort not to require more compares in the fully sorted case, and you will
>almost certainly find you require more overhead, because of the two nested
>loops.

Hmm.  Here's C code to sort an N-element array a[0..N-1], with N passed
as the parameter n.

    void insertion_sort(elt *a, int n) {
        int i, j;

        /* invariant: a[0..i-1] is a sorted permutation of old a[0..i-1] */
        for (i = 1; i < n; i++) {
            elt const t = a[i];
            for (j = i; j > 0 && t < a[j-1]; j--) a[j] = a[j-1];
            a[j] = t;
        }
    }

If a is already sorted, this does N-1 comparisons, which is optimal.
I don't see any need for extreme care here.
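
The N-1 figure is easy to check by counting.  What follows is only a
minimal sketch under stated assumptions: elt is taken to be int, and
the names compares, less and count_compares are hypothetical helpers
introduced here for illustration.  The loop structure is the same as
in insertion_sort above, with the element comparison routed through a
counting function.

```c
typedef int elt;        /* assumption: the element type is not given in the post */

static long compares;   /* hypothetical counter of element comparisons */

/* Returns x < y and records the call as one element comparison. */
static int less(elt x, elt y)
{
    compares++;
    return x < y;
}

/* Same loops as the insertion sort above; only the comparison
   t < a[j-1] has been replaced by a call to less(). */
static void insertion_sort(elt *a, int n)
{
    int i, j;

    for (i = 1; i < n; i++) {
        elt const t = a[i];
        for (j = i; j > 0 && less(t, a[j-1]); j--)
            a[j] = a[j-1];
        a[j] = t;
    }
}

/* Sorts a[0..n-1] in place and returns the number of element
   comparisons that the sort performed. */
static long count_compares(elt *a, int n)
{
    compares = 0;
    insertion_sort(a, n);
    return compares;
}
```

With this sketch, count_compares on the already sorted 8-element array
{1, 2, 3, 4, 5, 6, 7, 8} returns 7, one comparison per outer iteration.
On the reversed array {8, 7, 6, 5, 4, 3, 2, 1} it returns 28, because
the test j > 0 ends each inner loop without a further element
comparison.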

Let's put that into Ada:

    generic
        type Element is private;
        with function "<"(Left, Right: Element) return Boolean;
        type Index is (<>);
        type Vector is array (Index) of Element;
    procedure Insertion_Sort(A: in out Vector);

    procedure Insertion_Sort(A: in out Vector) is
    begin
        for I in Index'Succ(A'First) .. A'Last loop
            -- invariant: A(A'First .. Index'Pred(I)) is a sorted permutation
            -- of old A(A'First .. Index'Pred(I)).
            declare
                T: constant Element := A(I);
            begin
                Insert: for J in reverse A'First .. I loop
                    if J > A'First and then T < A(Index'Pred(J)) then
                        A(J) := A(Index'Pred(J));
                    else
                        A(J) := T;
                        exit Insert;
                    end if;
                end loop Insert;
            end;
        end loop;
    end Insertion_Sort;

Now let's see bubble-sort, from "Introduction to Abstract Data Types
using Ada" by Hillam.  It's figure 11.1.2 on p380.

    generic
        type ITEM_TYPE is private;
        type VECTOR is array (integer range < >) of ITEM_TYPE;
        with function "<"(LEFT, RIGHT : ITEM_TYPE) return boolean;
    procedure BUBBLE_SORT (V : in out VECTOR);

    procedure BUBBLE_SORT (V : in out VECTOR) is
        TEMP_ITEM : ITEM_TYPE;
    begin
        for OUTER IN V'first .. V'last-1 loop
            -- note same number of outer loop iterations as insertion sort
            for INNER in V'first + 1 .. V'last loop
                -- note no early exit
                if V(INNER) < V(INNER - 1) then
                    TEMP_ITEM := V(INNER);
                    V(INNER) := V(INNER - 1);
                    V(INNER - 1) := TEMP_ITEM;
                end if;
            end loop;
        end loop;
    end BUBBLE_SORT;

This clearly cannot be anywhere near close to optimal for sorted or nearly
sorted arrays, because it always does (N-1)**2 element comparisons:
N-1 outer iterations, each making a full inner pass of N-1 comparisons.
The version of bubble sort that does well in those cases is called
"modified bubble sort" in Hillam, and is in his figure 11.1.4

    generic
        type ITEM_TYPE is private;
        type VECTOR is array (integer range < >) of ITEM_TYPE;
        with function "<"(LEFT, RIGHT : ITEM_TYPE) return boolean;
    procedure BUBBLE_SORT (V : in out VECTOR);

    procedure BUBBLE_SORT (V : in out VECTOR) is
        SORTED    : boolean false;      -- sic!
        TEMP_ITEM : ITEM_TYPE;
        IN_PLACE  : integer := 0;       -- keeps track of number of items known
                                        -- to be in their final place at the
                                        -- beginning of each phase
        INDEX     : integer := V'first;
    begin
        while not SORTED and then INDEX < V'last loop
            SORTED := true;
            INDEX := INDEX + 1;
            for INNER in V'first + 1 .. V'last - IN_PLACE loop
                if (V(INNER) < V(INNER - 1) then        -- sic!
                    TEMP_ITEM := V(INNER);
                    V(INNER) := V(INNER - 1);
                    V(INNER - 1) := TEMP_ITEM;
                    SORTED := false;
                end if;
            end loop;
            IN_PLACE := IN_PLACE + 1;
        end loop;
    end BUBBLE_SORT;

This is 20 non-comment lines for the body of "modified bubble sort",
compared with 17 for the body of my Ada insertion sort.  But that
could have been shortened if I hadn't declared T as locally as possible
so that I could declare it as a constant.  Let's eliminate that block,
and while we're at it, let's eliminate the loop exit in the name of
structured programming purity.

    procedure Insertion_Sort(A: in out Vector) is
        T: Element;
        J: Index;
    begin
        for I in Index'Succ(A'First) .. A'Last loop
            T := A(I);
            J := I;
            while J > A'First and then T < A(Index'Pred(J)) loop
                A(J) := A(Index'Pred(J));
                J := Index'Pred(J);
            end loop;
            A(J) := T;
        end loop;
    end Insertion_Sort;

Now the funny thing here is that Robert Dewar wrote
>you will almost certainly find you require more overhead
[for insertion sort than bubble sort]
>because of the two nested loops.

But both versions of bubble sort have two nested loops as well!

>A bubble sort is certainly a much simpler solution to the problem
>of optimal sorting of a sorted list,

I do not call 20 lines with two loops and 5 variables "much simpler"
than 14 lines with two loops and 3 variables.  This leaves no apparent
use for bubble sort at all.

>For quick sorts, I prefer heapsort to quicksort, because of its bounded
>worst case behavior.
>Note that there is a little-known modification to
>heap sort that reduces the number of compares to about NlogN compared
>with the normal 2NlogN (the 2 is where Eachus got the O(2N), though of
>course constants don't belong in big-O formulas).  As far as I know this
>is not really properly reported in the literature -- I treat it in detail
>in my 1968 thesis, and it is an exercise in Knuth volume 3 (although his
>original answer was wrong, I think I kept that $1 Wells Fargo colorful
>check somewhere as a souvenir :-)

Papers were still appearing in The Computer Journal well after 1968
with improvements on heapsort; I feel so *stupid* for not including
the proper citation in the source code I have.  I don't suppose your
thesis is on the Web anywhere (mine certainly isn't).

--
Australian citizen since 14 August 1996.  *Now* I can vote the xxxs out!
Richard A. O'Keefe; http://www.cs.rmit.edu.au/~ok; RMIT Comp.Sci.