From: "Tim Behrendsen"
Subject: Re: Teaching sorts [was Re: What's the best language to start with?]
Date: 1996/08/21
Message-ID: <01bb8f83$e115a340$87ee6fce@timpent.airshields.com>
References: <4v2qkd$40f@news1.mnsinc.com> <4vd05f$rt5@news1.mnsinc.com> <01bb8f1b$ce59c820$32ee6fce@timhome2>
Organization: A-SIS
Newsgroups: comp.lang.c,comp.lang.c++,comp.unix.programmer,comp.lang.ada

Tanmoy Bhattacharya wrote in article ...
> In article <01bb8f1b$ce59c820$32ee6fce@timhome2>
> "Tim Behrendsen" writes:
>
> B: I have to admit, I have to side with Szu-Wen.  I've never really
> B: thought about this case for the O() notation, but it seems from
> B: a purely mathematical standpoint that O(f(n)) means that f(n)
> B: is *the* function.  If f(n) happens to be a non-continuous function,
>
> It is not a question of convenience or usefulness: it is a question of
> definitions.  As it is almost universally defined, O(n^2+n) and O(n^2)
> mean exactly the same, as does
>
> B:   O( (n < 1000) ? n : n^2 )
>
> and hence when you write it instead of the more easily read O(n^2), it
> tells people that you do not know the meaning of O().

Perhaps.  My understanding of O() notation is that it does not give a
prediction of running time; it gives a rough approximation of behavior
based on data set size.  IOW, O(n*n) has quadratic behavior relative
to O(n), but the O() function does not imply an absolute k*n*n
running time.
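To make that concrete, here is a toy sketch of my own (names and all,
nothing from Tanmoy's post): both routines below are O(n*n), even
though the second does ten times the work per pair.  The constant
factor is exactly what the notation throws away.

    /* Both functions take a number of steps proportional to n*n;
       only the constant k differs, so both are O(n*n). */
    long count_pairs(long n)
    {
        long i, j, steps = 0;
        for (i = 0; i < n; i++)
            for (j = 0; j < n; j++)
                steps++;                /* ~ 1 * n*n steps */
        return steps;
    }

    long count_pairs_slow(long n)
    {
        long i, j, k, steps = 0;
        for (i = 0; i < n; i++)
            for (j = 0; j < n; j++)
                for (k = 0; k < 10; k++)
                    steps++;            /* ~ 10 * n*n steps, still O(n*n) */
        return steps;
    }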
> B: The latter is not only more useful, but absolutely accurate.
>
> Which is irrelevant.  Use a different notation if you want to invent
> a new concept: a concept that is more useful in what you want to do.

Well, *somebody* has to fix the concepts that may be dumbly
defined! :)

> B: AFAIK, there are no limits on the complexity of the Big-Oh
> B: function, and no requirement that it be a continuous
> B: function.
>
> No, but a complex Big-Oh function has *exactly* the same meaning as a
> simpler one in most cases.  Calling something an O(f(n)) function
> means saying that (this may not be the technical definition, but it
> expresses the meaning commonly used):
>
> 1) There exists k, and N depending on k, such that for all n > N, the
>    function in question is bounded in absolute value by k |f(n)|.
> 2) The minimum such k is non-zero.  (This is to prevent an O(n)
>    function also being classified as an O(n log n) one, for example.)
>
> The notation works quite well in practice: generally what happens is
> that for very small n (for n < N in the definition), the `main loop'
> is not dominant: there is other stuff, like initialization, which
> dominates the time taken.  The big O notation says that we are
> interested in the behaviour of the function when such subdominant
> pieces are unimportant.  In fact, such a division is important
> because there is no reason why the ratio of time taken in the loop
> to the time taken outside it should be independent of the machine in
> use: if you want big O notation to be a property of the algorithm
> and not the hardware (as long as we stick to a reasonable
> architecture), you cannot include subdominant terms in it.
> Incidentally, the constant k simply means that it is not important
> what the units of measurement are, and again serves to make the
> notation independent of the precise hardware on which it is being
> run.
>
> However, there are pathological cases: one such pathological case is
> when the initialization (or something similar) takes a large amount
> of time.  Thus, for example, we can think of an algorithm that takes
> 0.00001 n (main loop) + 1000 (initialization) units of time.  Now,
> for n >> 100000000, the initialization is unimportant, and big O
> notation will latch on to that case and call it an O(n) problem.
> However, in practice, all that you will ever see for a reasonable n
> is the initialization time, and hence, in practice, it will look
> much like an O(1) problem.
>
> Similarly, sometimes the time taken to really do a problem (say,
> adding two numbers) is proportional to the number of digits in the
> number: i.e. to log n.  However, on most hardware, the time taken to
> do this is a constant, albeit with a limit to the size of numbers it
> can add.  In such cases, again, the observed complexity will be
> O(1), whereas truly the complexity is O(log n).

Agreed; the O() function is meant to be used in context, not as an
absolute statement of algorithm efficiency.

> But these caveats about big O are the penalty we pay for using a
> useful concept.  The definition you called more useful, most will
> find *less* useful: to use such a concept for a sorting routine, one
> would need to know the relative time for a swap and a compare on the
> particular hardware, for example.  Giving up the machine
> independence in a concept designed to measure abstract algorithmic
> efficiency is a stiff price to pay for a perceived usefulness.

By "useful", I mean conveying information about the algorithm.  I
agree with you that in practice the less significant terms may be
left off the function, i.e. O(n*n + n) is effectively equivalent to
O(n*n).  But the particular case that started all this was the
non-continuous case, where one range of n produces O(n) behavior, and
another range produces O(n*n).  In practice, this would normally be
described exactly as I have just done, but I don't think it's either
accurate or complete to declare the algorithm only O(n*n) if both
ranges are possible input data.  If the definition of O() notation
calls for the linear case to simply be ignored, then the definition
should be changed, because it is simply not accurate.  Note I'm not
expecting precision; I'm expecting accuracy.
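To make the shape of the disputed case concrete, here is a toy
routine of my own (purely illustrative, not any real sort) whose step
count really is (n < 1000) ? n : n*n:

    #define CUTOFF 1000

    /* A single linear pass for small n, a quadratic pass otherwise.
       Hypothetical; it exists only to give the piecewise behavior
       a concrete form. */
    long piecewise_work(long n)
    {
        long i, j, steps = 0;
        if (n < CUTOFF) {
            for (i = 0; i < n; i++)      /* O(n) range */
                steps++;
        } else {
            for (i = 0; i < n; i++)      /* O(n*n) range */
                for (j = 0; j < n; j++)
                    steps++;
        }
        return steps;
    }

The standard definition labels this O(n*n), because only the behavior
beyond some N matters; my point is that the label loses real
information when the inputs you actually see live below the cutoff.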
> Most people instead choose to give you the O() of the algorithm with
> a statement pointing out the caveats.  I have seen statements that
> `addition is O(1), with the caveat that if the numbers really became
> big, it would become O(log n)'.  Such statements are necessarily
> mathematically imprecise, but we know what they mean.  And if
> someone were to question me as to what the precise meaning is, I
> could of course phrase it trivially: it would be a formalization of
> `what I mean is that if we enlarge the Turing machine with an oracle
> that can add arbitrarily large numbers in one move, the algorithm
> would become O(1) instead of O(n)'.

The reason we can safely ignore the O(n) case is that the
large-number case is not part of our input domain.  If it were, then
we would necessarily need to have a non-continuous O() function.

> Calling something O(n^2+n) is still an abomination!

Agreed.

-- 
Tim Behrendsen (tim@airshields.com)