From: miker3@ix.netcom.com (Mike Rubenstein)
Subject: Re: Teaching sorts [was Re: What's the best language to start with?]
Newsgroups: comp.lang.c,comp.lang.c++,comp.unix.programmer,comp.lang.ada
Date: 1996/08/22
Message-ID: <321ba583.1531452@nntp.ix.netcom.com>
References: <4v2qkd$40f@news1.mnsinc.com> <4vd05f$rt5@news1.mnsinc.com>
 <01bb8f1b$ce59c820$32ee6fce@timhome2>
 <01bb8f83$e115a340$87ee6fce@timpent.airshields.com>
Organization: Netcom

"Tim Behrendsen" wrote:

> Tanmoy Bhattacharya wrote in article ...
> > In article <01bb8f1b$ce59c820$32ee6fce@timhome2>
> > "Tim Behrendsen" writes:
> >
> > B: I have to admit, I have to side with Szu-Wen.  I've never really
> > B: thought about this case for the O() notation, but it seems from
> > B: a purely mathematical standpoint that O(f(n)) means that f(n)
> > B: is *the* function.  If f(n) happens to be a non-continuous function,
> >
> > It is not a question of convenience or usefulness: it is a question
> > of definitions.  As it is almost universally defined, O(n^2+n) and
> > O(n^2) mean exactly the same, as does
> >
> > B:
> > B:    O( (n < 1000) ? n : n^2 )
> > B:
> >
> > and hence when you write it instead of the more easily read O(n^2),
> > it tells people that you do not know the meaning of O().
>
> Perhaps.  My understanding of O() notation is that it does
> not give a prediction of running time, it gives a rough
> approximation of behavior based on data set size.
>
> IOW, O(n*n) has quadratic behavior relative to O(n), but the
> O() function does not imply an absolute k*n*n running time.
>
> > B: The latter is not only more useful, but absolutely accurate.
> >
> > Which is irrelevant.  Use a different notation if you want to invent
> > a new concept: a concept that is more useful in what you want to do.
>
> Well, *somebody* has to fix the concepts that may be dumbly
> defined! :)
>
> > B: AFAIK, there are no limits on the complexity of the Big-Oh
> > B: function, and no requirement that it be a continuous
> > B: function.
> >
> > No, but a complex Big-Oh function has *exactly* the same meaning as
> > a simpler one in most cases.  Calling something an O(f(n)) function
> > means saying that (this may not be the technical definition, but it
> > expresses the meaning commonly used):
> >
> > 1) There exist k and N depending on k such that for all n > N, the
> >    function in question is bounded in absolute value by k |f(n)|.
> > 2) The minimum such k is non-zero.  (This is to prevent an O(n)
> >    function also being classified as O(n log n), for example.)
> >
> > The notation works quite well in practice: generally what happens is
> > that for very small n (for n < N in the definition), the `main loop'
> > is not dominant: there is other stuff, like initialization, which
> > dominates the time taken.  The big O notation says that we are
> > interested in the behaviour of the function when such subdominant
> > pieces are unimportant.  In fact, such a division is important
> > because there is no reason why the ratio of time taken in the loop
> > to the time taken outside it should be independent of the machine in
> > use: if you want big O notation to be a property of the algorithm
> > and not the hardware (as long as we stick to a reasonable
> > architecture), you cannot include subdominant terms in it.
> > Incidentally, the constant k simply means that it is not important
> > what the units of measurement are; again, this serves to make the
> > notation independent of the precise hardware on which it is being
> > run.
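
To put concrete numbers on that definition, here is a small
illustrative C sketch.  The cost function T(n) and the witnesses
k = 4 and N = 60 are made up purely for the example; the point is
only that once n is past the threshold, the linear and constant
terms never affect the bound, which is why O(n^2+n) names exactly
the same class as O(n^2).

#include <stdio.h>

/* Hypothetical operation count: T(n) = 3n^2 + 40n + 1000.
 * Saying "T(n) is O(n^2)" means some constant k and threshold N
 * exist with T(n) <= k*n^2 for all n > N.  Here k = 4 and N = 60
 * happen to work for these made-up coefficients. */
static long T(long n)
{
    return 3 * n * n + 40 * n + 1000;
}

int main(void)
{
    const long k = 4, N = 60;
    long n;

    for (n = N + 1; n <= 20000; n *= 4)
        printf("n = %6ld  T(n) = %10ld  k*n^2 = %10ld  %s\n",
               n, T(n), k * n * n,
               T(n) <= k * n * n ? "bounded" : "NOT bounded");
    return 0;
}
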

> > However, there are pathological cases: one such pathological case is
> > when the initialization (or something similar) takes a large amount
> > of time.  Thus, for example, we can think of an algorithm that takes
> > 0.00001 n (main loop) + 1000 (initialization) amount of time.  Now,
> > for n >> 100000000, the initialization is unimportant, and big O
> > notation will latch on to that case and call it an O(n) problem.
> > However, in practice, all that you will ever see for a reasonable n
> > is the initialization time, and hence, in practice, it will look
> > much like an O(1) problem.
> >
> > Similarly, sometimes the time taken to really do a problem (say
> > adding two numbers) is proportional to the number of digits in the
> > number: i.e. to log n.  However, on most hardware, the time taken to
> > do this is a constant, albeit with a limit to the size of numbers it
> > can add.  In such cases, again, the observed complexity will be
> > O(1), whereas truly, the complexity is O(log n).
>
> Agreed; the O() function is meant to be used in context, not
> as an absolute statement of algorithm efficiency.
>
> > But these caveats about order O are the penalty we pay for using a
> > useful concept.  The definition you called more useful, most will
> > find *less* useful: to use such a concept for a sorting routine, one
> > would need to know the relative time for a swap and a compare on the
> > particular hardware, for example.  Giving up the machine
> > independence in a concept designed to measure abstract algorithmic
> > efficiency is a stiff price to pay for a perceived usefulness.
>
> By "useful", I mean conveying information about the algorithm.  I
> agree with you that in practice the less significant terms may
> be left off the function, i.e. O(n*n + n) is effectively equivalent
> to O(n*n).  But the particular case that started all this was the
> non-continuous case, where one range of (n) produces O(n)
> behavior, and another range produces O(n*n).  In practice, this
> would normally be described exactly as I have just done, but I
> don't think it's either accurate or complete to only declare
> the algorithm O(n*n) if both ranges are possible input data.
>
> If the definition of O() notation calls for the linear case to
> simply be ignored, then the definition should be changed,
> because it is simply not accurate.  Note I'm not expecting
> precision, I'm expecting accuracy.

What do you mean "if"?  The definition of O() notation is a given.
It's been around much too long for us to consider changing it just
because you don't like it.

O(n^2) and O((n < 1000) ? n : n^2) mean exactly the same thing.
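
With the definition quoted above this is easy to check: for every
n >= 1000 the piecewise expression is exactly n^2, and below 1000 it
is smaller still, so k = 1 and N = 999 already witness the O(n^2)
bound.  A throwaway C illustration (the constants are chosen only
for the demo, and doubles are used just to avoid overflow):

#include <stdio.h>

/* The piecewise cost from the discussion: n for n < 1000,
 * n^2 otherwise. */
static double f(double n)
{
    return n < 1000.0 ? n : n * n;
}

int main(void)
{
    const double k = 1.0;    /* witness constant */
    const double N = 999.0;  /* witness threshold */
    double n;

    /* For every n > N, f(n) <= k*n^2, which is all the standard
     * definition of O(n^2) asks for; the n < 1000 branch can never
     * matter once n is past the threshold. */
    for (n = N + 1.0; n <= 1.0e6; n *= 10.0)
        printf("n = %8.0f  f(n) = %14.0f  k*n^2 = %14.0f\n",
               n, f(n), k * n * n);
    return 0;
}
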

> > Most people instead choose to give you the O() of the algorithm
> > with a statement pointing out the caveats.  I have seen statements
> > that `addition is O(1) with the caveat that if the numbers really
> > became big, it would become O(log n)'.  Such statements are
> > necessarily mathematically imprecise, but we know what they mean.
> > And if someone were to question me as to what the precise meaning
> > is, I could of course phrase it trivially: it would be a
> > formalization of `what I mean is that if we enlarge the Turing
> > machine with an oracle that can add arbitrarily large numbers in one
> > move, the algorithm would become O(1) instead of O(n)'.
>
> The reason we can safely ignore the O(n) case is that the large
> number case is not part of our input domain.  If it was, then
> we would necessarily need to have a non-continuous O() function.

Huh?  This doesn't make any sense at all.  As long as we are talking
about integers and the standard definition of continuity, all
functions are continuous.

> > Calling something O(n^2+n) is still an abomination!
>
> Agreed.

It is exactly the same abomination as calling something
O((n < 1000) ? n : n^2).  Both mean that the writer was too ignorant
or too lazy to reduce it to the equivalent O(n^2).

Michael M Rubenstein
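
P.S.  For anyone wondering where the `O(log n)' for addition comes
from: it is the number of digits in the value, not the value itself,
that sets the work.  A toy C sketch of schoolbook addition on
little-endian digit arrays (purely illustrative, not anyone's real
bignum code) makes the loop-per-digit cost visible; a single machine
`+' is constant time only because the operand size is capped.

#include <stdio.h>

/* a and b are little-endian digit arrays (a[0] is the ones digit).
 * The loop runs once per digit, so the work grows with the digit
 * count, i.e. with log10 of the values being added. */
static int add_digits(const int *a, int na, const int *b, int nb,
                      int *sum)
{
    int i, carry = 0, n = na > nb ? na : nb;

    for (i = 0; i < n; i++) {
        int d = carry;
        if (i < na) d += a[i];
        if (i < nb) d += b[i];
        sum[i] = d % 10;
        carry = d / 10;
    }
    if (carry)
        sum[n++] = carry;
    return n;               /* digits in the result */
}

int main(void)
{
    int a[] = { 7, 8, 9 };  /* 987 */
    int b[] = { 6, 5, 4 };  /* 456 */
    int sum[8];
    int i, n = add_digits(a, 3, b, 3, sum);

    for (i = n - 1; i >= 0; i--)
        printf("%d", sum[i]);
    printf("\n");           /* prints 1443 */
    return 0;
}
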