From: Dan.Pop@cern.ch (Dan Pop)
Subject: Re: What's the best language to start with? [was: Re: Should I learn C or Pascal?]
Date: 1996/08/09
Sender: news@news.cern.ch (USENET News System)
X-NNTP-Posting-Host: ues5.cern.ch
References: <31FBC584.4188@ivic.qc.ca> <01bb8342$88cc6f40$32ee6fcf@timhome2> <4u7grn$eb0@news1.mnsinc.com> <01bb83ad$29c3cfa0$87ee6fce@timpent.airshields.com> <4u89c4$p7p@solutions.solon.com> <01bb83f5$923391e0$87ee6fce@timpent.airshields.com> <01bb8534$b2718bc0$87ee6fce@timpent.airshields.com>
Organization: CERN European Lab for Particle Physics
Newsgroups: comp.lang.c,comp.lang.c++,comp.unix.programmer,comp.lang.ada

In <01bb8534$b2718bc0$87ee6fce@timpent.airshields.com> "Tim Behrendsen" writes:

>Dan Pop wrote in article
>...
>> In <01bb83f5$923391e0$87ee6fce@timpent.airshields.com> "Tim Behrendsen"
>> writes:
>>
>> >The problem is that we *can't* think purely abstractly,
>> >otherwise we end up with slow crap code.
>>
>> Care to provide some concrete examples?
>
>Look at the code-bloated and slow-software world we live in,
>particularly on desktop platforms.  I think this is caused by
>people not truly understanding what's *really* going on.

I was asking for a concrete example, like a quicksort implementation
made by someone thinking purely abstractly, as opposed to a quicksort
implementation made by someone who understands the low-level details.

>For example, look at OOP.  Very naive implementations of OOP
>used a huge amount of dynamic memory allocation, which can cause
>severe performance problems.  That's why I don't use C++ for my
>products; to do it right you have to do a very careful analysis
>of how your classes are going to fit together efficiently.
>
>I've spoken to enough people who have had C++ disasters to
>convince me that the more abstraction there is, the more
>danger there is of inefficient code.  This shouldn't be that
>hard to believe; any time you abstract away details, you are
>giving up knowledge of what is going to be efficient.

Nonsense.  The efficiency of an implementation can also be reasoned
about in abstract terms; you don't need to know how the compiler
works, or assembly language, in order to implement your application
in an efficient way.  Or you may know all these irrelevant details
and still write inefficient code, because writing inefficient code is
so much easier in many OOPLs.

>I alluded to this in another post, but a good example is Motif
>and X11.  A programmer who only understands Motif, but does not
>understand X11, is going to write slow crap, period.

More nonsense.  Unless your applications spend the bulk of their CPU
time in the user interface, this should be a non-issue.
>Here's an example:
>
>int a[50000],b[50000],c[50000],d[50000],e[50000];
>
>void test1()
>{
>    int i, j;
>    for (j = 0; j < 10; ++j) {
>        for (i = 0; i < 50000; ++i) {
>            ++a[i]; ++b[i]; ++c[i]; ++d[i]; ++e[i];
>        }
>    }
>}
>
>void test2()
>{
>    int i, j;
>    for (j = 0; j < 10; ++j) {
>        for (i = 0; i < 50000; ++i) ++a[i];
>        for (i = 0; i < 50000; ++i) ++b[i];
>        for (i = 0; i < 50000; ++i) ++c[i];
>        for (i = 0; i < 50000; ++i) ++d[i];
>        for (i = 0; i < 50000; ++i) ++e[i];
>    }
>}
>
>On my AIX system, test1 runs in 2.47 seconds, and test2
>runs in 1.95 seconds using maximum optimization (-O3).  The
>reason I knew the second would be faster is because I know
>to limit the amount of context information the optimizer has
>to deal with in the inner loops, and I know to keep memory
>localized.

1. For a marginal speed increase (~25%), you compromised the
   readability of the code.

2. Another compiler, on another system, might generate faster code
   out of test1.  This is especially true for supercomputers, which
   have no cache memory (and where the micro-optimizations are done
   based on a completely different set of criteria) and where the
   CPU time is really expensive.

3. You used exclusively abstract concepts to justify why test2 is
   faster on your particular platform/compiler combination: no
   references to the length of a cache line, or to the compiler being
   able to use a more efficient instruction or instruction
   combination in one case than in the other.

Let's see what happens on my 486DX33 box:

    ues4:~/afs/tmp 32> cc -O2 -o test1 test1.c
    ues4:~/afs/tmp 33> time ./test1
    1.100u 0.230s 0:01.51 88.0% 0+0k 0+0io 13pf+0w
    ues4:~/afs/tmp 34> cc -O2 -o test2 test2.c
    ues4:~/afs/tmp 35> time ./test2
    1.170u 0.180s 0:01.45 93.1% 0+0k 0+0io 13pf+0w

So, it's 1.10 + 0.23 = 1.33 seconds of CPU time for test1 versus
1.17 + 0.18 = 1.35 seconds for test2.

Conclusions:

1. My 486DX33 is faster than your AIX system (or maybe gcc is faster
   than xlc) :-)

2. Your results are not reproducible.  Your extra "insight" simply
   "helped" you to generate less readable code.

>Now I submit that if I showed the average C programmer
>both programs, they would guess that test1 is faster because
>it has "less code",

And he might be right, both on a CP/M-80 micro and on a Cray
supercomputer.  Or on most systems with a compiler that can do loop
unrolling.

>and that is where abstraction,
>ignorance, and naivete begin to hurt.

The key to proper optimization is profiling.  Additional knowledge
about a certain platform is a lot less important.  And if the wrong
algorithm has been chosen in the first place, no amount of
micro-optimization will save the code's performance.  The guy who can
do a decent algorithm analysis (an entirely abstract operation) will
always beat the one who is an expert assembly programmer but prefers
to spend his time coding instead of dealing with abstractions.

This is an old story (it happened during the early to mid eighties)
and I have forgotten the details.  One software company had a nice
product for the PC, but it was rather slow.  To get the "best"
performance, it was coded in assembly.  Another company decided to
emulate that product, and they did it quite successfully (their
version was about 4 times faster).  They implemented it in C, but
they had carefully chosen their algorithms.

Dan
--
Dan Pop
CERN, CN Division
Email: Dan.Pop@cern.ch
Mail:  CERN - PPE, Bat. 31 R-004, CH-1211 Geneve 23, Switzerland
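
For anyone who wants to reproduce the measurements above: neither
post shows the driver code, but the shell transcript implies that
each function was compiled into its own program (test1.c, test2.c).
A minimal harness along those lines might look like the sketch below;
the main() shown here is an assumption, not code from either poster.

int a[50000], b[50000], c[50000], d[50000], e[50000];

void test1(void)                /* loop exactly as posted above */
{
    int i, j;
    for (j = 0; j < 10; ++j) {
        for (i = 0; i < 50000; ++i) {
            ++a[i]; ++b[i]; ++c[i]; ++d[i]; ++e[i];
        }
    }
}

int main(void)                  /* assumed driver: one test per program */
{
    test1();                    /* test2() in test2.c */
    return a[0] & 1;            /* use a result, as a precaution, so an
                                   aggressive optimizer has no excuse
                                   to discard the loops */
}

It would be built and timed exactly as in the transcript above, e.g.
"cc -O2 -o test1 test1.c" followed by "time ./test1".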