From mboxrd@z Thu Jan 1 00:00:00 1970 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00 autolearn=ham autolearn_force=no version=3.4.4 X-Google-Language: ENGLISH,ASCII-7-bit X-Google-Thread: 109fba,baaf5f793d03d420 X-Google-Attributes: gid109fba,public X-Google-Thread: 103376,97188312486d4578 X-Google-Attributes: gid103376,public X-Google-Thread: fc89c,97188312486d4578 X-Google-Attributes: gidfc89c,public X-Google-Thread: 1014db,6154de2e240de72a X-Google-Attributes: gid1014db,public From: dewar@cs.nyu.edu (Robert Dewar) Subject: Re: Teaching sorts [was Re: What's the best language to start with?] Date: 1996/08/19 Message-ID: X-Deja-AN: 175318211 references: <31FBC584.4188@ivic.qc.ca> <01bb83f5$923391e0$87ee6fce@timpent.airshields.com> <4v98io$e99@goanna.cs.rmit.edu.au> organization: Courant Institute of Mathematical Sciences newsgroups: comp.lang.c,comp.lang.c++,comp.unix.programmer,comp.lang.ada Date: 1996-08-19T00:00:00+00:00 List-Id: (slightly corrected version, I noticed a minor typo in the first version, which I canceled). Richard O'Keefe says void insertion_sort(elt *a, int n) { int i, j; /* invariant: a[0..i-1] is a sorted permutation of old a[0..i-1] */ for (i = 1; i < N; i++) { elt const t = a[i]; for (j = i; j > 0 && t < a[j-1]; j--) a[j] = a[j-1]; a[j] = t; } } If a is already sorted, this does N-1 comparisons, which is optimal. I don't see any need for extreme care here. Well the extreme here is your word not mine, I simply said "very careful". In fact the critical thing to preserve the number of compares is to run the inner loop backwards. That comes naturally in C, but not in Ada, where the preferable code is to find the insertion point and then do a slice assignment to shuffle the data (the slice assignment is not only at a preferable higher level, but with a good optimizing compiler has the potential of generating much more efficient code on some architectures than the individual moves. So, you ran the inner loop backwards and indeed got the number of compares right, but you still have the overhead of both loops testing the termination condition, whereas a bubble sort written in the normal manner will have only one loop, and therefore execution only half the number of loop termination tests, that's what I meant by extra overhead even if you are careful to ensure no extra comparisons. I find Hillam's initial bubble sort algorithm curious. I certainly don't call this bubble sort. To me you repeat the outer loop only if you have found that the inner loop made at least one exchange. Well anyone can call any algorithm by any name, *I* use the word bubble sort to include the notation of early termination of the outer loop if no exchanges have been made in the inner loop. To avoid too much overhead in setting this flag, it is useful to use the PC to encode the binary state of this flag. The modified bubble sort from Hillam has two "optimizations" compared to his original. First he uses the flag SORTED in the manner I suggest above, but encoding this as a boolean flag is inefficient, since it keeps getting set, better to encode it in the program counter, then the cost of setting it is almost free (one extra jump, executed once per inner loop). The second is the IN_PLACE optimization, which is reasonable, and helps sometimes, but is not necessary for the point of view of this discussion. "Now the funny thing here is that Robert Dewar wrote >you will almost certainly find you require more overhead [for insertion sort than bubble sort] >because of the two nested loops. But both versions of bubble sort have two nested loops as well!" You missed the point. In the bubble sort, the outer loop is executed only once for a sorted list. For your insertion sort, the outer loop runs a full number of times. Your insertion sort has 2*N loop termination tests, while a properly written bubble sort has N loop termination tests. I do not know any way to write the insertion sort so that it has only N loop termination tests, certainly your example has 2*N tests. Here is a simple example of bubble sort the way I would code it procedure Sort (N : Positive; Exch : Exch_Procedure; Lt : Lt_Function) is Switched : Boolean; begin loop Switched := False; for J in 1 .. N - 1 loop if Lt (J + 1, J) then Exch (J, J + 1); Switched := True; end if; end loop; exit when not Switched; end loop; end Sort; I do not seem to count five variables here! More like two, one of which is a loop control constant. True, it uses an exit, but I find the attempt to eliminate this to read to less readable, not more readable code (I judge code by its readability, not by whether it satisfies someone's idea of what structured code should be :-) Now this uses a flag to encode the switch. If we want to encode this switch in the PC, which is more efficient, but a little less clear: procedure Sort (N : Positive; Move : Move_Procedure; Lt : Lt_Function) is begin for J in 1 .. N - 1 loop if Lt (J + 1, J) then Exch (J, J + 1); for K in J + 1 .. N - 1 loop if Lt (K + 1, K) then Exch (K, K + 1); end if; end loop; Sort (N, Move, Lt); end if; end loop; end Sort; If you don't trust your compiler to eliminate the tail recursion, then you probabably should replace the recursive call with a goto!