From: robert_dewar@my-dejanews.com
Subject: Re: Ada vs C++ vs Java
Date: 1999/01/21
Message-ID: <787f4b$jl9$1@nnrp1.dejanews.com>
References: <369C1F31.AE5AF7EF@concentric.net> <369DDDC3.FDE09999@sea.ericsson.se> <369e309a.32671759@news.demon.co.uk> <77ledn$eu7$1@remarQ.com> <77pnqc$cgi$1@newnews.global.net.uk> <8p64spq5lo5.fsf@Eng.Sun.COM> <782r25$k18$1@nnrp1.dejanews.com>
Organization: Deja News - The Leader in Internet Discussion
Newsgroups: comp.lang.ada,comp.vxworks,comp.realtime

In article ,

> A data point is a data point, not a universal law.
> I stand by what I said. The K&R White Book was the sole
> C manual for many years, so it'll have to do. ANSI C is
> not quite the same language, and came many years later.

That's not the point. Your "data point" is invalid because it was an
apples-and-oranges comparison. You cannot compare an informal
description like the K&R white book with an ANSI standard. It is not
good enough to say "well, I know I was comparing X and Y, and they are
not comparable, but that's all I could find to compare"; the data point
is still entirely invalid.

> > As for the size of the compilers, you were at that time
> > not looking at modern optimizing compilers. If you
> > repeat this experiment with modern optimizing
> > compilers, you will find that for all the languages,
> > the majority of the complexity and weight of the
> > compiler is in the optimizer.
>
> Is the optimiser common to both languages? Back in the
> 1980s, I couldn't isolate the parts of the two compilers,
> and nothing was shared anyway.

Again, the fact that you couldn't do the experiment properly does not
make it valid! Of course you could have isolated the parts of the
compilers back then; you just did not know how to. As for the optimizer
being common to both languages, yes, that is typically the case now:
back-end optimization is language independent in most compilers. In any
case, even if the code isn't shared, both compilers have large chunks
of language-independent optimization circuitry.

> So, tell us, what are the weights of the various front
> ends, and the optimiser or optimisers? Is there still a
> pure "C" front end, or has it been subsumed into the C++
> front end now?

This varies, of course, from one compiler to another; I already gave
some idea for the gcc compiler. There are some cases of separate C
front ends (e.g. GNU C), and some cases of integrated front ends. But
even confining your measurements to a particular front end, or set of
front ends, may say more about quality of implementation than about the
language itself. For example, compare a C front end and an Ada front
end: if you are trying to extract accurate aliasing information (gcc
does not!), this is far harder in C, and takes more apparatus than in
Ada.
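
To make the aliasing point concrete, here is a small sketch (the type
and subprogram names are invented for the example). Because P and Q
have distinct access types designating distinct types, an Ada front end
may assume the two stores cannot interfere; a C front end handed two
bare pointers has to work much harder to establish the same thing.

   procedure Alias_Demo is
      type Int_Access   is access Integer;
      type Float_Access is access Float;

      IP : Int_Access   := new Integer'(0);
      FP : Float_Access := new Float'(0.0);

      procedure Update (P : Int_Access; Q : Float_Access) is
      begin
         P.all := 10;
         Q.all := 2.5;        --  cannot affect P.all: distinct designated types
         P.all := P.all + 1;  --  so the old value of P.all may stay in a register
      end Update;
   begin
      Update (IP, FP);
   end Alias_Demo;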
Also, there is pretty much a free choice of where to do certain global
optimizations: they can be done early in the front end, or later in the
back end. You need to know the structure of a compiler really well to
get any kind of meaningful measurements. Your attempts to "weigh"
compilers cannot possibly give anything but vague, misleading results.
You can "stand by" these meaningless results if you like, but that does
not make them meaningful.

<>

You seem to confuse languages and their implementations here (and
elsewhere in the post). Yes, there was one particular implementation of
C++ that preprocessed into C (just as there is at least one
implementation of Ada 95 that preprocesses into C). That approach is
rarely used for serious C++ work these days, since there are many C++
compilers around that generate efficient object code directly (the same
can be said of Ada 95, of course).

> Yep. One would assume that all these complexities are
> more or less proportional to each other. In particular,
> complex syntax has got to cause complex compiler code.

No, not at all. That is an easy mistake for someone who does not know
compiler internals, but syntax is always a trivial part of any
compiler. And it is NOT AT ALL the case that these complexities are
more or less proportional; that was the whole point of my post (and I
gave examples, please reread).

Here is another example; I can give literally hundreds of similar
ones. A language is easier to describe, define, and use if its features
are orthogonal. This is a well-known principle, first clearly
enunciated by the Algol 68 design. Let's take a particular example. In
Ada 95 (the same applies to any similar Algol-style language), we have
records with fields of whatever type we like. We have array types, and
it is nice for these to be dynamically sizable. These decisions, in an
orthogonal design, mean that records can have fields whose type is a
variable-sized array (sketched below).

This combination causes absolutely NO increase in complexity of
description or use. Indeed, a restriction that made this particular
combination illegal would be a non-orthogonal "odd" rule, and would
make the language more complex from these points of view. However, in
the implementation domain, dynamic arrays are easy to implement, and
ordinary records without dynamic array components are easy to
implement, but the combination is annoying, and definitely increases
the complexity of an implementation.

Let's take another example, along a different dimension: goto
statements certainly do not significantly complicate the informal
description or use of a language like C. Typically they don't much
affect implementation either, though they do make certain optimization
techniques harder. But when it comes to formal description, gotos are a
real menace. In a denotational semantics definition, a goto is often
far more trouble to define than, say, an IF or LOOP statement. Have a
look at the original SETL sources for Ada to see this complication
effect (the SETL implementation of Ada was essentially a formal
definition that executed, so it was more affected by this consideration
than a normal implementation would be).

I can rattle on like this for a long time; there are such trade-offs
between nearly every aspect of complexity. During the Ada 95 design, we
often had arguments about complexity, and when I first proposed this
more elaborate notion of complexity it was quite helpful to realize
that very often the reason we disagreed on an issue was that we were
arguing along different dimensions.
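
To make the record example above concrete, the declaration in question
looks roughly like this (the names are invented for illustration):

   procedure Ortho_Demo is
      --  The array component's size is fixed, per object, by the
      --  discriminant: trivial to describe and to use, but the
      --  implementation must now cope with objects of different sizes
      --  in layout, assignment, and parameter passing.
      type Buffer (Length : Natural) is record
         Count : Natural := 0;
         Data  : String (1 .. Length);
      end record;

      Small : Buffer (Length => 16);
      Large : Buffer (Length => 4_096);
   begin
      Small.Data := (others => ' ');
      Large.Data (1) := '*';
   end Ortho_Demo;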
For example, the design team often concentrated on making things simple
for the Ada programmer, sometimes at the expense of other aspects of
complexity. What seemed at first like a disagreement ("This is far too
complex", "No it isn't", "Yes it is") turned out to be an argument
about trade-offs between different goals, and once we understood this
clearly, it was far easier to have a clear discussion and come to the
right conclusion.

One more example, from a recent thread: the ALL keyword for general
access types in Ada 95 adds no significant implementation complexity,
or complexity in the formal description. However, it definitely adds
complexity for the user, as noted in previous posts to this group.

> Have you suitable numerical metrics for the other
> components of complexity?

I don't know that anyone has tried to do this; it would be difficult.
Certainly this decomposition of the notion of complexity makes the
problem more tractable, but it is still extremely difficult. But we
don't have to have numerical metrics for this to be a useful principle
for discussion (indeed, bogus numerical metrics, such as your attempts
to weigh compilers, can obfuscate clear discussion -- better no metrics
than bogus ones).

> > > Assembly language is simpler than any high-order
> > > language, but it's lots more work to code in
> > > assembly.
> >
> > Now let me guess. The last time you looked at machine
> > language was in the 80's, right? Yes, in those days,
> > the semantics of machine language was pretty simple.
>
> No, 1998. We do read and sometimes write PowerPC
> assembly code. Many 1980s machines had more complex
> instruction sets than the PowerPC.

I *strongly* disagree. If you are just focusing on the functionality of
the instruction set, sure, but that is trivial in all cases. The
complex part of any instruction set is the execution efficiency
semantics.

> > I am afraid that things have changed. At this stage the
> > full execution semantics of a modern chip with
> > extensive instruction-level parallelism is remarkably
> > complex along ALL the dimensions I mention above. With
> > a chip like the Pentium II, if you include efficiency
> > issues, which are indeed not fully documented publicly,
> > let alone formally specified, you have something far
> > MORE complicated than any of the languages we are
> > talking about here.
>
> All true, but what has all this to do with the original
> question, the relative complexity of C, C++, Ada83, and
> Ada95?

Not sure. Why not ask Joe Gwinn; it was he who gave assembly language
as an example of a simple language :-)

> > >> Yet, people still use assembly.
> >
> > Well barely ...
>
> We try to avoid it, but don't always succeed. Another
> variant is to code in a high-order language (HOL),
> inspect the generated assembly, paraphrase the HOL source
> trying to improve the assembly, iteratively.

That is exactly what is extremely difficult to do. To know how to
optimize the code, you have to fully understand the scheduling and
other instruction-level-parallelism aspects of the code. We quite often
get comments on the generated code from GNAT that show that people do
not understand such things. A little example: many people would expect
that in

   if (A > 4) AND [THEN] (B > 4) then

the addition of THEN would speed things up in the case where A is
indeed greater than 4.
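
Written out in full (the enclosing procedure and the counter are
invented for the example), the two forms being compared are:

   procedure Short_Circuit_Demo (A, B : Integer) is
      Hits : Natural := 0;
   begin
      --  Plain AND: both comparisons are always evaluated, so the test
      --  can often be compiled as straight-line code with at most one
      --  conditional jump (or none at all on a predicated machine).
      if (A > 4) and (B > 4) then
         Hits := Hits + 1;
      end if;

      --  AND THEN: B > 4 is evaluated only when A > 4 is True; honoring
      --  that ordering is naturally compiled as two conditional jumps.
      if (A > 4) and then (B > 4) then
         Hits := Hits + 1;
      end if;
   end Short_Circuit_Demo;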
That expectation is of course quite likely to be false on modern
machines, where jumps can be expensive; indeed, an important
optimization on predicated machines like the Merced is to eliminate
short-circuiting, and in general to convert if's into straight-line
code without jumps.

> This can be very effective, but it does give a false
> impression that the code is in a HOL. It isn't really,
> because a new compiler will force a repeat of the
> tuning steps just described.

Sounds like a highly undesirable practice to me. I would recommend
instead that you put your energy into learning more about how to use
and choose compilers effectively. With modern machines, you are more
likely to create a mess by mucking with the generated assembly code.
This may have been a reasonable approach in the 70's, but not today!