From mboxrd@z Thu Jan 1 00:00:00 1970 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me X-Spam-Level: X-Spam-Status: No, score=0.4 required=5.0 tests=BAYES_00,FORGED_MUA_MOZILLA autolearn=no autolearn_force=no version=3.4.4 X-Google-Thread: 103376,e0c23e7a19a435c4 X-Google-NewGroupId: yes X-Google-Attributes: gida07f3367d7,domainid0,public,usenet X-Google-Language: ENGLISH,CP1252 Received: by 10.68.219.170 with SMTP id pp10mr8276507pbc.1.1340482380411; Sat, 23 Jun 2012 13:13:00 -0700 (PDT) Path: l9ni11840pbj.0!nntp.google.com!news1.google.com!news4.google.com!fu-berlin.de!uni-berlin.de!individual.net!not-for-mail From: Niklas Holsti Newsgroups: comp.lang.ada Subject: Re: Recapping on =?windows-1252?Q?=93Bug_Sort=94=2E?= Date: Sat, 23 Jun 2012 23:12:59 +0300 Organization: Tidorum Ltd Message-ID: References: <169bdbcb-cb43-4db9-9d48-3be2a88473eb@googlegroups.com> <77963856-3a25-4477-9510-769df7a9b85c@googlegroups.com> Mime-Version: 1.0 X-Trace: individual.net qcfQJ4z1HczGzBRaHASqkQJsKeq7aB8TSOUCFVbMJ0/Uo+ah8St/3Pt6sFkNaPfcMP Cancel-Lock: sha1:Wl3VUmucQFABYHHQFyihxq7/uzg= User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:12.0) Gecko/20120428 Thunderbird/12.0.1 In-Reply-To: Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 8bit Date: 2012-06-23T23:12:59+03:00 List-Id: On 12-06-23 22:07 , Austin Obyrne wrote: > On Saturday, June 23, 2012 7:05:56 PM UTC+1, Niklas Holsti wrote: >> On 12-06-23 13:20 , Austin Obyrne wrote: >>> On Saturday, June 23, 2012 8:54:52 AM UTC+1, Austin Obyrne >>> wrote: >>>> On Friday, June 22, 2012 9:45:53 PM UTC+1, Jeffrey Carter >>>> wrote: >>>>> On 06/22/2012 12:55 PM, Austin Obyrne wrote: >>>>>> >>>>>> I have been told that my program resembles a known sort >>>>>> program called �Counting Sort�. I would hate to be guilty >>>>>> of plagiarism and I would like to point out therefore that >>>>>> the salient thing about my �Parallel Sort� is that my >>>>>> implementation is geared to capturing data during any >>>>>> unrelated program run-time and assigning the data in such a >>>>>> way that the separate elements index their own addresses in >>>>>> the sorting arrays. A similarity with some other existing >>>>>> paper algorithm is simply fortuitous. >>>>> >>>>> What you have presented is an implementation of counting >>>>> sort, nothing more. There is nothing new or unique about your >>>>> implementation. >> >> Austin, >> >> While I agree with Jeff that you have rediscovered Counting Sort, >> this does not mean that you are being accused of plagiarism. It is >> common for programmers to rediscover algorithms that are basically >> simple, but very good for some special cases -- and perhaps for >> just these reasons are not very well known. > > Thanks for that. > > The very salient thing that everybody is missing is the way the data > is collected and sorted simultaneously in "Parallel Sort" compared to > Count Sort. Yes, it is a neat feature of Counting Sort that the first step of the sort -- adding the values to the "count" array -- can be done for each new value separately, independently of the other values (since it is not a comparison sort). Only the second step, the final scan of the "count" array and output of the values in sorted order, must be delayed until all values are present. This means, as you say, that the program can do the first step (add to "count" array) immediately when a new value is generated; it is not necessary to first collect all the values in some list, say, and then sort them. But I don't think that the group is "missing" this fact; rather, we see that it does not change the order of complexity (big-oh) of the algorithm, so it is not very interesting. In fact, I'm not at all sure that adding the new values to the "count" array immediately is faster than first collecting them in a dense, unordered structure (array or list) and then sorting them. The frequent scattered references to the large "count" array can stress the data cache and virtual memory system, perhaps causing more data-cache misses and page faults which could make the program run slower. > If I asked you to demonstrate "Count Sort" being used to sort a > specimen set of data how would you go about making the data available > to the program. This is not an interesting question, again because the complexity of writing out and reading in a list of N integers is linear, O(N), in the number of integers. > Somewhere along the line you would have to prepare an external batch > file for reading in or you would key in the data interactively I can't make head or tail of that. You seem not to know how one can pass data from one part of a program to another, or from one program to another, without human intervention. > The counting part is trivial. It is the capturing concept that is > the important difference between what I am calling Parallel Sort and > what anyone else may call Count Sort . The counting is almost benign > when the elements of data index their own addresses in an array in > both cases. My view is the opposite. The "capturing" is just normal programming, passing data from here to there within a program. Adding the data to the "count" array at once may or may not be an advantage for speed, see above. From the point of view of the sorting problem, the central point in Counting Sort is the "count" array that is indexed directly by the values to be sorted. *When* then values are added to the array is not imporant, IMO. -- Niklas Holsti Tidorum Ltd niklas holsti tidorum fi . @ .