From: ok@goanna.cs.rmit.edu.au (Richard A. O'Keefe)
Newsgroups: comp.lang.ada,comp.lang.c++
Subject: Three C++ problems
Date: 14 Mar 1995 12:55:52 +1100
Organization: Comp Sci, RMIT, Melbourne, Australia
Message-ID: <3k2t38$hck@goanna.cs.rmit.edu.au>
Summary: inspired by D&E

I've recently been reading Bjarne Stroustrup's "The Design and Evolution
of C++".  I've also been reading a C++ critique, whose strident tone and
occasional factual errors really put me off.  I've never liked C++, but
Stroustrup's book (henceforth D&E) struck me as so sane and balanced that
it made me look at the language with new eyes.  However, three "warts"
still stick out where C++ is weak on its _own_ terms.

Below I mention Ada a couple of times because none of these problems is a
problem in Ada.  Do *NOT* take this as a "my language is better than your
language" flame; I cite Ada and other languages as "existence proofs"
that things could be different.  I would be delighted to hear that these
weaknesses are fixed in the proposed standard.

1.  Topic: Numbers
    Goal : Portability.

First, I have to establish that portability _is_ a goal of C++.  The D&E
book does state that "extreme portability and ... were necessities"
(p65), but this seems to refer to Cfront, not user code, and again p125
refers to "portability of at least some C++ _implementations_".  However,
p22 says "the emphasis I placed on portability", and p43 cites as one of
the major reasons for choosing C as the foundation of C++ that "[4] C is
portable".

One of the main tools for making C code more portable is the
preprocessor, but p119 spells out Stroustrup's desire that "preprocessor
usage should be eliminated".  This cannot be willed unless one also wills
that the requisite level of portability should be obtainable directly in
the language.

To summarise, I believe it is fair to say that _one_ of the goals of C++
is that programmers should be enabled to write portable code if they wish
to, with a reasonable level of effort, and without extra-linguistic tools
such as the preprocessor.

C++ inherits a major defect from C, which C in turn copied from Algol 68.
It was, in my view, the worst design mistake in Algol 68.  That is,
programmers cannot ask for a numeric type in application-related terms
(as in Pascal and Ada) but must ask in _implementation_-related terms.
Specifically, there are three sets of numeric types:

    signed char;   signed short;   signed int;   signed long
    unsigned char; unsigned short; unsigned int; unsigned long
    float;         double;         long double

with some versions of C also having signed and unsigned long long.

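(Purely as an illustrative sketch: about the best that portable C or C++
code can do today is to check the <limits.h> macros with the
preprocessor.  Here count_type is just a made-up name for a type that
must hold values up to 1 000 000.)

    #include <limits.h>

    /* Sketch only: pick an integral type wide enough for 1 000 000. */
    #if INT_MAX >= 1000000
    typedef int  count_type;        /* plain int is wide enough         */
    #else
    typedef long count_type;        /* fall back to long on 16-bit ints */
    #endif

This is exactly the kind of extra-linguistic machinery that eliminating
preprocessor usage is supposed to make unnecessary.
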
I have lost count of the number of times I've seen UNIX C code fail when
ported to a PC ('int' was used for types needing more than 16 bits), and
I've also lost count of the number of times I've seen PC code fail when
ported to a non-PC system because it assumed that int and short were the
"same" type and that int* and short* were also the "same".  I have also
had my own code break when I increased a limit (say from 100 to 1000) and
thus some quantity increased from below-guaranteed-int-size (in this case
from 10 000) to above-guaranteed-int-size (in this case to 1 000 000).

Unlike C, C++ has the syntactic resources to permit a solution to the
problem.  If

    signed  <e1>        signed  <e1, e2>
    unsigned<e1>        unsigned<e1, e2>
    float   <e1>        float   <e1, e2>

where e1, e2 are constant expressions of maximal signed, unsigned, and
floating-point types, were interpreted as _synonyms for existing types_
large enough to represent e1 (and e2 if provided) as distinct values, it
would be possible to write things like

    const int limit = 100;
    typedef signed<limit*limit> two_d_index;
    typedef float<1.000000001e30, 1.000000002e30> myfloat;

However, no such feature is present in the C++ ARM, nor is any such
extension listed in D&E.

2.  Topic: Arrays
    Goal : Run-Time Efficiency (Stack allocation).

One important contrast between C++ and many other object-oriented
languages such as Simula, Smalltalk, and CLOS is that C++ allows objects
to be allocated statically or on the run-time stack, without requiring
the use of heap storage for _all_ objects as the other three do.  This is
not accidental.  See p32 of D&E.  Section 2.3 (p31) classifies this as
one of 8 "key design decisions".

Arrays were third-class citizens in C.  They are fourth-class citizens in
C++, because there are so many more rights that they do not enjoy.  In
particular, _some_ kind of inefficiency is mandatory with C++ arrays.
The problem is that the size of an array must be known to the compiler.

Now Algol 60, Algol W, Simula 67, Algol 68, and PL/I all supported
*stack* allocation of arrays with run-time sizes.  These days, even
Fortran can manage it.  If you don't have run-time sizing, you have to
determine an upper bound on array size when you write the code.  I think
everyone has had the same experience: when you first write the program,
you find yourself wasting a lot of space because the upper bound was
unrealistically big.  A couple of years later the program starts crashing
because the upper bound is unrealistically small (despite not having
changed in between!).

Because C++ allows overloading of [], and has templates, it is possible
to define

    template <class T> class vector {
      private:
        int size;
        T*  data;
      public:
        vector(int i)           { /* allocate and assign data, set size */ }
        ~vector()               { /* return data to free storage */ }
        T& operator[](int i)    { /* return reference to data[i] */ }
        ...
    };

    vector<T> a(N);     // N may be a run-time expression

which can then be used as conveniently and (almost) as safely as
stack-allocated arrays.  I say _almost_ as safely because of
setjmp()/longjmp(), inherited from C.  (Using exception handling instead
will reduce this problem but not eliminate it.)

This approach hides the dynamic allocation from the user, but the
run-time cost is still there.  There is no particular reason why
malloc()/free() or new[]/delete[] _should_ be excruciatingly slow, but
all too often they _are_.  Note that I am not objecting to having to
write a template or a class to get run-time sizing; I am objecting to the
use of malloc()/free() or new[]/delete[].

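(Again only a sketch, assuming the member bodies above are filled in in
the obvious way, with new[] in the constructor and delete[] in the
destructor; smooth() is just an invented example.)

    // Sketch only: assumes vector<T>'s constructor does data = new T[i]
    // and its destructor does delete [] data.
    double smooth(int n)            // n is known only at run time
    {
        vector<double> a(n);        // heap allocation here, not stack
        for (int i = 0; i < n; i++)
            a[i] = i / double(n);   // uses the overloaded operator[]
        return a[n-1];              // assumes n >= 1
    }                               // ~vector frees the heap block here

Every call to smooth() makes a round trip through the free store, which
is precisely the cost that a language with run-time array bounds does not
impose.
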
Run-time array sizing is not an exotic technique whose efficient
implementation is only now becoming understood in the most advanced
computer science research laboratories.  It has been well understood for
30 years.  Ada 83 had it.  Fortran 90 has it.  Ada 95 manages to combine
it with OOP.

3.  Topic: Nested functions.
    Goal : Not forcing programmers to do things a particular way.

Stroustrup explains his philosophy in section 1.3.  On p23, he writes

    'Respect for groups that doesn't include respect for individuals of
    those groups isn't respect at all.  Many C++ design decisions have
    their roots in my dislike for forcing people to do things in some
    particular way.  In history, some of the worst disasters have been
    caused by idealists trying to force people into "doing what is good
    for them."  Such idealism not only leads to suffering among its
    innocent victims, but also to delusion and corruption of the
    idealists applying the force.  ...  Where ideals clash and sometimes
    even when pundits seem to agree, I prefer to provide support that
    gives the programmer a choice.'

Given this contrast between the philosophy behind C++ and the philosophy
behind Ada, it is ironic that Ada 83 and Ada 95 actually offer the
programmer _more_ choice than C++ in several areas.  I have already
pointed out that an Ada programmer has the option of defining her numeric
data types portably, while the C++ programmer does not have that choice.

Nested functions are another topic where Ada gives programmers a choice.
Style guides and pundits may make their recommendations, but the feature
is _there_ in the language, and if a programmer deems nested functions
appropriate for an application she can go ahead and use them in Ada.

In C++,
    blocks     can be nested inside blocks,
    classes    can be nested inside classes, and
    namespaces can be nested inside namespaces (p415),
but
    functions cannot be nested inside functions.

This is especially peculiar in view of the facts that

(1) on p415 of D&E, Stroustrup appeals to "the simple reason that
    constructs ought to nest unless there is a strong reason for them
    not to", and

(2) functions _can_ be nested inside functions, sort of.  What I'm
    referring to here is that a class (A) may have a member function (B)
    which contains a class definition (C) which may have a member
    function (D).  But (B) may not contain (D) directly.  To be sure,
    (D) must be inline, but that only adds to the confusion.

So there are three reasons _internal_ to C++ why nested functions ought
to be allowed:

 -- it gives the programmer a choice
 -- Stroustrup says constructs should nest unless there is a strong
    reason not to
 -- nested functions are already permitted in a restricted form, and
    relaxing the rules would make the language easier to explain and
    teach (an explicit goal for C++).

Once again, Algol 60, Algol W, Algol 68, PL/I, Simula 67, and Pascal all
provided nested functions.  Ada 83 and Ada 95 provide them.  GNU C
provides them, without adverse impact on efficiency when they are not
used.

-- 
"The complex-type shall be a simple-type."  ISO 10206:1991 (Extended Pascal)
Richard A. O'Keefe; http://www.cs.rmit.edu.au/~ok; RMIT Comp.Sci.