From: clodius@hotspec.lanl.gov (William Clodius)
Subject: Re: C is 'better' than Ada because...
Date: 1996/07/30
Organization: Los Alamos National Laboratory
References: <31e02c32.342948604@netline-fddi.jpl.nasa.gov>
Newsgroups: comp.lang.ada,comp.lang.c

In article <01bb7e2f$aeef4660$87ee6fce@timpent.airshields.com> "Tim
Behrendsen" writes:

> I'm not going to claim to have used every compiler, obviously, but I
> base this opinion on often viewing the assembly language output when I
> get a core dump or during debugging sessions. Now, I've used a lot of
> assembly language, so I usually write my C code to be easy on the
> optimizer (use pointers when needed rather than indexes, for example),
> and I admit that sometimes optimizers are very good. The problem is,
> the compiler often doesn't have enough contextual information to know
> what to do compared to the human writing the program, which brings us
> to ...

There is no reason why a good compiler should generate worse code for an
array access than for a pointer access. An array can be laid out by the
compiler as a single block of memory whose access via indices is the
same as through a pointer. Au contraire: because a pointer can be
associated with any object of compatible type, the analysis for a highly
optimizing compiler is more difficult with pointers than with arrays.
The problem with arrays in C is their lack of flexibility in terms of
dynamic memory allocation, the fact that if they are associated with a
pointer at any point in the scope, alias analysis becomes more
complicated for the compiler, and the fact that when passed as an
argument to a user-defined function they are not distinguished from
pointers.

> Global optimization: The compiler can often make decisions better than
> the human can because of the large scope of global optimizations, but
> again, the information needed to know when certain areas are going to
> be used a lot versus other areas is not known at compile time, which
> moves us to ...

Most compilers (the ones O'Keefe cites are exceptions) do relatively
little global analysis. O'Keefe's were developed for languages that
would otherwise be extremely difficult to optimize. Many languages need
much less global information for the most useful optimizations, because
standard conforming code is required to supply typing, data structure,
side effect, and aliasing information. Standard conforming C code
provides the essential typing and data structure information, but does
not provide full side effect and aliasing information, so it is
intermediate in optimizability.

> Run time profiling/compiler feedback: This can certainly provide good
> information back to the compiler, but it seems a little dangerous. I
> have not used this type of optimization, so my opinion means little in
> this case, except for just my general impression. For anything other
> than a trivial program, it is going to be very difficult to simulate
> real-world conditions in order to get an optimal result. The downside
> is if your test conditions are wrong, you may end up with a "most
> pessimum" optimization.

But if your test conditions are wrong, you have typically not
appropriately tested all aspects of the code, including the correctness
of its output, its robustness, and its general efficiency.

> It just seems to me that there is no substitute for a well thought-out
> design.
> Depending on the compiler to save a bad design is just a bad idea.

True, a compiler will not change an O(n*n) algorithm into an O(n*ln(n))
algorithm, but it can have a significant effect on the multiplicative
factor. Further, probably every language has limitations in terms of
expressibility of design. Algorithms that are essentially independent of
data type, e.g., lists, are difficult to specify in a type-independent
form in languages such as C, and must be reimplemented for each new
special case. Aspects of the design that are independent of the details
of the implementation, but that can be used by the compiler for
optimization, e.g., that a list will not be cyclic or that an object
will not be modified after its creation, are also difficult to express
in C and many other commonly used languages.

> The above are all worthwhile optimizations; I'm not claiming that
> optimizers do nothing. All I'm saying is that too often programmers
> fall back on the lazy line, "Oh well, the compiler should take care of
> that." For example, if I have ...
>
>     for (i = 0; i < 10000000; ++i)
>         array[i] *= abs(values[i]);
>
> Should I be *sure* this is going to be done efficiently, and use
> pointers, or should I just let the compiler handle it? I say that a
> competent programmer should use pointers out of habit, because it's
> just good design and practice. And when the problem becomes less
> trivial than the one above, the programmer will already be using the
> proper techniques *the first time*, rather than having to go back and
> fix the inevitable performance bottlenecks.

Again, a good compiler, if anything, should generate better code for an
array than for a pointer.

> I think you are being too optimistic that optimizers produce better
> code than "almost every programmer there is". If that were true, then
> projects would *never* be written in assembly language. As it stands,
> we know that things are often written in assembly for performance
> reasons, such as device drivers.
> The perfect example of this is computer games. If your theory were
> true, then they would all be written in high-level languages to gain
> the best performance. We know that this isn't the case, however. A
> bright human can always outdo the optimizer.

Again, history affects usage. First, the most efficient code generators
are produced first in a research environment, and only propagate to
commercial compilers after their robustness has been thoroughly tested.
Second, we rarely judge anything without the prejudices of our years of
experience. People typically expect what was standard when they first
started in a field, not what is currently standard. Optimizers are
typically better now than they were ten years ago. Third, users do not
always put a large emphasis on general code efficiency. For example, the
original Fortran I and II compilers used a number of optimizations that
were not used in any other compilers, Fortran or otherwise, until the
late sixties, and were only common in compilers by the mid to late
seventies. Fourth, while high level languages are not commonly used for
such tasks, they are becoming more commonly used; Erlang in particular
comes to mind.

> > And you know what? C is one of the *harder* languages to optimise!
>
> You are probably right, if you let the optimizer handle everything.
> But C also gives the capability to do things such as ...
>
>     if ((array[++n] = GetValue(arg)) != 0) {
>         ....
>
> vs. a typical non-C language ...
>
>     n = n + 1;
>     array[n] = GetValue(arg);
>     if (array[n] != 0) {
>         ...
>     }
>
> Now, which is easier to optimize? For this trivial case, a modern
> compiler would probably do exactly the same thing, but what about when
> it starts getting non-trivial? If I code my expressions to take
> advantage of the "data flow" nature of C's syntax, I can make it very
> easy for the optimizer to do a great job. And it's not any more work;
> it's just a matter of experience and living in reality, and not
> pretending that the compiler is smarter than it really is.
Actually, C has relatively little in the way of data flow concepts as
data flow is commonly understood. A true data flow language would avoid
side effects in almost every context except I/O. Your example is full of
side effects, some of which, particularly GetValue(arg), are difficult
to optimize. Depending on the context, a data flow language might have
the syntax

    array[1:n] = GetValue(arg(1:n));
    where (array /= 0) { ... }

In the above, GetValue is required to not have side effects, so that the
definition

    array[1:n] = GetValue(arg(1:n));

can be understood as occurring in parallel. In C, for full optimization,
the compiler has to determine that arg is not aliased with elements of
array, and that GetValue has no side effects. Note that a highly
pipelined architecture is comparable to a truly parallel architecture in
many optimization issues.

-- 
William B. Clodius              Phone: (505)-665-9370
Los Alamos National Laboratory  Email: wclodius@lanl.gov
Los Alamos, NM 87545