From: ok@goanna.cs.rmit.edu.au (Richard A. O'Keefe)
Newsgroups: comp.lang.ada,comp.lang.c++
Subject: Three C++ problems
Date: 14 Mar 1995 12:55:52 +1100
Organization: Comp Sci, RMIT, Melbourne, Australia
Message-ID: <3k2t38$hck@goanna.cs.rmit.edu.au>
Summary: inspired by D&E

I've recently been reading Bjarne Stroustrup's "The Design and Evolution
of C++".  I've also been reading a C++ critique, whose strident tone and
occasional factual errors really put me off.  I've never liked C++, but
Stroustrup's book (henceforth D&E) struck me as so sane and balanced that
it made me look at the language with new eyes.  However, three "warts"
still stick out where C++ is weak on its _own_ terms.

Below I mention Ada a couple of times because none of these problems is a
problem in Ada.  Do *NOT* take this as a "my language is better than your
language" flame; I cite Ada and other languages as "existence proofs"
that things could be different.  I would be delighted to hear that these
weaknesses are fixed in the proposed standard.

1.  Topic: Numbers
    Goal : Portability.

First, I have to establish that portability _is_ a goal of C++.  The D&E
book does state that "extreme portability and ... were necessities"
(p65), but this seems to refer to Cfront, not user code, and again p125
refers to "portability of at least some C++ _implementations_".  However,
p22 says "the emphasis I placed on portability", and p43 cites as one of
the major reasons for choosing C as the foundation of C++ that "[4] C is
portable".

One of the main tools for making C code more portable is the
preprocessor, but p119 spells out Stroustrup's desire that "preprocessor
usage should be eliminated".  This cannot be willed unless one also wills
that the requisite level of portability should be obtainable directly in
the language.

To summarise, I believe it is fair to say that _one_ of the goals of C++
is that programmers should be enabled to write portable code if they wish
to, with a reasonable level of effort, and without extra-linguistic tools
such as the preprocessor.

C++ inherits a major defect from C, which C in turn copied from Algol 68.
It was, in my view, the worst design mistake in Algol 68.  That is,
programmers cannot ask for a numeric type in application-related terms
(as in Pascal and Ada) but must ask in _implementation_-related terms.
Specifically, there are three sets of numeric types:

    signed char;   signed short;   signed int;   signed long
    unsigned char; unsigned short; unsigned int; unsigned long
    float;         double;         long double

with some versions of C also having signed and unsigned long long.

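(Purely as an illustrative sketch: about the best that portable C or C++
code can do today is to check the <limits.h> macros with the
preprocessor.  Here count_type is just a made-up name for a type that
must hold values up to 1 000 000.)

    #include <limits.h>

    /* Sketch only: pick an integral type wide enough for 1 000 000. */
    #if INT_MAX >= 1000000
    typedef int  count_type;        /* plain int is wide enough         */
    #else
    typedef long count_type;        /* fall back to long on 16-bit ints */
    #endif

This is exactly the kind of extra-linguistic machinery that eliminating
preprocessor usage is supposed to make unnecessary.
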
I have lost count of the number of times I've seen UNIX C code fail when
ported to a PC ('int' was used for types needing more than 16 bits), and
I've also lost count of the number of times I've seen PC code fail when
ported to a non-PC system because it assumed that int and short were the
"same" type and that int* and short* were also the "same".  I have also
had my own code break when I increased a limit (say from 100 to 1000) and
thus some quantity increased from below-guaranteed-int-size (in this case
from 10 000) to above-guaranteed-int-size (in this case to 1 000 000).

Unlike C, C++ has the syntactic resources to permit a solution to the
problem.  If

    signed  <e1>        signed  <e1, e2>
    unsigned<e1>        unsigned<e1, e2>
    float   <e1>        float   <e1, e2>

where e1, e2 are constant expressions of maximal signed, unsigned, and
floating-point types, were interpreted as _synonyms for existing types_
large enough to represent e1 (and e2 if provided) as distinct values, it
would be possible to write things like

    const int limit = 100;
    typedef signed<limit*limit> two_d_index;
    typedef float<1.000000001e30, 1.000000002e30> myfloat;

However, no such feature is present in the C++ ARM, nor is any such
extension listed in D&E.

2.  Topic: Arrays
    Goal : Run-Time Efficiency (Stack allocation).

One important contrast between C++ and many other object-oriented
languages such as Simula, Smalltalk, and CLOS is that C++ allows objects
to be allocated statically or on the run-time stack, without requiring
the use of heap storage for _all_ objects as the other three do.  This is
not accidental.  See p32 of D&E.  Section 2.3 (p31) classifies this as
one of 8 "key design decisions".

Arrays were third-class citizens in C.  They are fourth-class citizens in
C++, because there are so many more rights that they do not enjoy.  In
particular, _some_ kind of inefficiency is mandatory with C++ arrays.
The problem is that the size of an array must be known to the compiler.

Now Algol 60, Algol W, Simula 67, Algol 68, and PL/I all supported
*stack* allocation of arrays with run-time sizes.  These days, even
Fortran can manage it.  If you don't have run-time sizing, you have to
determine an upper bound on array size when you write the code.  I think
everyone has had the same experience: when you first write the program,
you find yourself wasting a lot of space because the upper bound was
unrealistically big.  A couple of years later the program starts crashing
because the upper bound is unrealistically small (despite not having
changed in between!).

Because C++ allows overloading of [], and has templates, it is possible
to define

    template <class T> class vector {
      private:
        int size;
        T*  data;
      public:
        vector(int i)           { /* allocate and assign data, set size */ }
        ~vector()               { /* return data to free storage */ }
        T& operator[](int i)    { /* return reference to data[i] */ }
        ...
    };

    vector<T> a(N);     // N may be a run-time expression

which can then be used as conveniently and (almost) as safely as
stack-allocated arrays.  I say _almost_ as safely because of
setjmp()/longjmp(), inherited from C.  (Using exception handling instead
will reduce this problem but not eliminate it.)

This approach hides the dynamic allocation from the user, but the
run-time cost is still there.  There is no particular reason why
malloc()/free() or new[]/delete[] _should_ be excruciatingly slow, but
all too often they _are_.  Note that I am not objecting to having to
write a template or a class to get run-time sizing; I am objecting to the
use of malloc()/free() or new[]/delete[].

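(Again only a sketch, assuming the member bodies above are filled in in
the obvious way, with new[] in the constructor and delete[] in the
destructor; smooth() is just an invented example.)

    // Sketch only: assumes vector<T>'s constructor does data = new T[i]
    // and its destructor does delete [] data.
    double smooth(int n)            // n is known only at run time
    {
        vector<double> a(n);        // heap allocation here, not stack
        for (int i = 0; i < n; i++)
            a[i] = i / double(n);   // uses the overloaded operator[]
        return a[n-1];              // assumes n >= 1
    }                               // ~vector frees the heap block here

Every call to smooth() makes a round trip through the free store, which
is precisely the cost that a language with run-time array bounds does not
impose.
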
Run-time array sizing is not an exotic technique whose efficient
implementation is only now becoming understood in the most advanced
computer science research laboratories.  It has been well understood for
30 years.  Ada 83 had it.  Fortran 90 has it.  Ada 95 manages to combine
it with OOP.

3.  Topic: Nested functions.
    Goal : Not forcing programmers to do things a particular way.

Stroustrup explains his philosophy in section 1.3.  On p23, he writes

    'Respect for groups that doesn't include respect for individuals of
    those groups isn't respect at all.  Many C++ design decisions have
    their roots in my dislike for forcing people to do things in some
    particular way.  In history, some of the worst disasters have been
    caused by idealists trying to force people into "doing what is good
    for them."  Such idealism not only leads to suffering among its
    innocent victims, but also to delusion and corruption of the
    idealists applying the force.  ...  Where ideals clash and sometimes
    even when pundits seem to agree, I prefer to provide support that
    gives the programmer a choice.'

Given this contrast between the philosophy behind C++ and the philosophy
behind Ada, it is ironic that Ada 83 and Ada 95 actually offer the
programmer _more_ choice than C++ in several areas.  I have already
pointed out that an Ada programmer has the option of defining her numeric
data types portably, while the C++ programmer does not have that choice.

Nested functions are another topic where Ada gives programmers a choice.
Style guides and pundits may make their recommendations, but the feature
is _there_ in the language, and if a programmer deems nested functions
appropriate for an application she can go ahead and use them in Ada.

In C++,
    blocks     can be nested inside blocks,
    classes    can be nested inside classes, and
    namespaces can be nested inside namespaces (p415),
but
    functions cannot be nested inside functions.

This is especially peculiar in view of the facts that

(1) on p415 of D&E, Stroustrup appeals to "the simple reason that
    constructs ought to nest unless there is a strong reason for them
    not to", and

(2) functions _can_ be nested inside functions, sort of.  What I'm
    referring to here is that a class (A) may have a member function (B)
    which contains a class definition (C) which may have a member
    function (D).  But (B) may not contain (D) directly.  To be sure,
    (D) must be inline, but that only adds to the confusion.

So there are three reasons _internal_ to C++ why nested functions ought
to be allowed:

 -- it gives the programmer a choice
 -- Stroustrup says constructs should nest unless there is a strong
    reason not to
 -- nested functions are already permitted in a restricted form, and
    relaxing the rules would make the language easier to explain and
    teach (an explicit goal for C++).

Once again, Algol 60, Algol W, Algol 68, PL/I, Simula 67, and Pascal all
provided nested functions.  Ada 83 and Ada 95 provide them.  GNU C
provides them, without adverse impact on efficiency when they are not
used.

-- 
"The complex-type shall be a simple-type."  ISO 10206:1991 (Extended Pascal)
Richard A. O'Keefe; http://www.cs.rmit.edu.au/~ok; RMIT Comp.Sci.