From: ok@goanna.cs.rmit.edu.au (Richard A. O'Keefe)
Subject: Re: What's the best language to start with? [was: Re: Should I learn C or Pascal?]
Date: 1996/08/26
Message-ID: <4vroh3$17f@goanna.cs.rmit.edu.au>
Organization: Comp Sci, RMIT, Melbourne, Australia
Newsgroups: comp.lang.c,comp.lang.c++,comp.unix.programmer,comp.lang.ada

"Tim Behrendsen" writes:
>They may exist, but does it exist universally?  The point is that to
>go out of your way to use left-field optimizations is just bad
>portability practice.

In other words, "Effen it wur good enough fur granpaw, it's good
enough fur me."

By the way, I find it EXTREMELY odd to call a technique that has been
known and used for nearly 20 years a "left-field" one.  Behrendsen is
putting things exactly backwards; *he* is advocating "do not use a
technique in your programming language, no matter how natural or
readable it is, unless you know that every compiler you use will
implement it reasonably".  Well, I *can't* know that.  Heck, I can't
even know that
	X := X+1;
will take constant or even small time; almost all programming language
standardisers say "that's a quality of implementation issue which is
outside our scope".

For an *exact* parallel, consider function inlining.  These days, I
expect a good compiler at the very minimum to support
programmer-supplied inlining hints, and the Fortran, Pascal, and C
compilers I normally use do it _automatically_.  This is another of
those optimisations which has been known and used for decades in the
Lisp community, and which lets you write your program using a function
call instead of a macro.  I guess inline functions are another
"left-field optimisation".

>For example, if someone does this:
>
>void sub(char *s)
>{
>    ...
>    if (strlen(s) == 0) {	/* check for null string */
>        ...
>    }
>}
>
>and they know it's stupid, but they also know the compiler
>just happens to have optimization for it, should they
>still be shot?

"Diana, get me my rifle."

You have prejudged the issue by stipulating that the programmer doing
this KNOWS it is stupid.  To start with, can I just make the obvious
point that in a very large number of currently used languages, the
direct equivalent of this would be the very best thing to do?  There
are a lot of *good* programmers out there (definition: in a reasonable
amount of time they can write readable code that works) who would
*not* "know that it's stupid".
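To make the comparison concrete, here is a sketch of the usual ways of
testing a C string for emptiness (the function names are mine, purely
for illustration):

	#include <string.h>

	/* Three ways to ask "is s empty?"; s is a NUL-terminated string. */
	int is_empty_by_length(const char *s) { return strlen(s) == 0; } /* clear; O(|s|) if not optimised */
	int is_empty_by_first(const char *s)  { return *s == '\0'; }     /* O(1), but a C-specific idiom   */
	/* NOTE: if (s == "") compares POINTERS, not contents: almost never what you want. */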
The strlen(s) == 0 test is, for example, much less stupid than writing
	if (s == "")

Let's see what is good about this code:
- it is correct (not a NULL test, not a "" test, but a length test)
- it is clear (much clearer to people not immersed in C than "!*s")
Doesn't look stupid to me yet.  What's bad about it?
- it takes O(|s|) time.
BUT
- standard optimisations, like common subexpression elimination, can
  be applied, so that any number of calls to strlen(s) without any
  intervening potential change to s are no more expensive than a
  single call
- in many applications, all or a large part of s will be processed if
  it is not empty, so the cost of sub() will be O(|s|) anyway
so it's only "stupid" if it occurs in a loop and you know your
compiler _doesn't_ move loop invariants out.

I would go so far as to say that

	int count_blanks(char const * const s)
	{
	    int i, r;

	    r = 0;
	    for (i = 0; i < strlen(s); i++)
	        if (s[i] == ' ')
	            r++;
	    return r;
	}

is a _good_ function.  The ONLY strlen() optimisation that is needed
here is one that is completely general: strlen() is a pure function
and its argument does not change, so the strlen(s) computation can be
hoisted out of the loop.  Since the optimisation in question is one
which has been in use since the very early Fortran days, it can hardly
be called "left-field".  A compiler that _doesn't_ do this is indeed a
poor compiler, these days.

Here I am actually arguing against my own inclination, because I
strongly dislike seeing strlen() used this way, and will take
advantage of any excuse to "fix" it.  I certainly tell students why
not to do this.  But I cannot in good conscience call clear, correct
code "stupid".

>> Scuse please?  WHAT reality?  The reality in this case is that you
>> can in fact in just about any reasonable programming language sort
>> data WITHOUT moving it.

>Wait, hold the phone!  "Sorts the data without moving it"?  What,
>is APL's sorting algorithm O(1)?

No, of course not.  I didn't say that.  I said that in just about any
reasonable programming language [I did not say APL] you can sort data
WITHOUT moving it [I did not say O(1)].  My claim is that YOU CAN SORT
THE DATA WITHOUT MOVING *THE DATA*.  I don't see how any sane reader
could translate that into O(1), unless perhaps said reader was under
the delusion that sorting involves nothing but movement!

>Yes, it may not actually get
>sorted until it gets printed, but that's irrelevent to the fact
>that it eventually gets sorted.

Printing is a complete irrelevancy, dragged in solely by Behrendsen.
It is another of the straw men I could well do without.

Sorting is not about movement.  It is about *determining a
permutation*.  Depending on the type of the keys, it may do divisions
(as in bucket sort and radix sort) or comparisons (as in merge sort).
When you are sorting large or variable-length records, it is of
practical importance that you can sort the data by returning a
permutation vector WITHOUT MOVING THE ORIGINAL DATA.  This is not an
academic point, because the permutation vector may be 10s or 100s of
times smaller than the original data.  My reference to the APL
grade-up primitive made all this clear: grade-up returns a permutation
vector.  You can use a permutation vector to access the original
elements in ascending order, descending order, or by binary search, or
anything you want, without having to pay the price of moving large
records.
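For the sceptical, here is the permutation-vector idea spelled out in
C.  This is only a sketch: the record type and data are invented for
the example, and the file-scope variable is there only because qsort()
gives the comparator no context argument.

	#include <stdio.h>
	#include <stdlib.h>
	#include <string.h>

	struct rec { char name[64]; /* imagine many more large fields here */ };

	static const struct rec *records;       /* seen by the comparator */

	static int cmp_by_name(const void *a, const void *b)
	{
	    size_t i = *(const size_t *)a;
	    size_t j = *(const size_t *)b;
	    return strcmp(records[i].name, records[j].name);
	}

	int main(void)
	{
	    struct rec data[] = { {"banana"}, {"apple"}, {"cherry"} };
	    size_t n = sizeof data / sizeof data[0];
	    size_t perm[sizeof data / sizeof data[0]];
	    size_t k;

	    for (k = 0; k < n; k++)
	        perm[k] = k;                    /* identity permutation */
	    records = data;
	    qsort(perm, n, sizeof perm[0], cmp_by_name);  /* only indices move */
	    for (k = 0; k < n; k++)             /* records in ascending order */
	        printf("%s\n", data[perm[k]].name);
	    return 0;
	}

The large records never move; qsort() shuffles three machine words,
and the final loop visits the records in sorted order through the
permutation vector.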
One of the things which made the Bentley & McIlroy "engineered"
version of qsort() complicated was having to work around the fact that
qsort() is set up to sort large fixed-length records; writing the code
to take advantage of the case where the elements turn out to fit into
machine registers is tricky, and in fact not portable.

>Indeed, and I think APL is kind of a neat language to learn.  But
>it hides so much about the internals of the computer, that I think
>it would give a student too many false impressions about how
>things really work.

Of course APL hides the internals of the computer.  That's what a high
level language is FOR!  I've done enough programming in ESPOL (B6700
only) and BLISS-10 (DEC-10 only) to be grateful for languages that
hide machine details.  I have manuals for
- System/360
- Z80 (somewhere)
- 8051
- 8086
- 80960
- Intergraph Clipper
- Motorola 68000
- Motorola 88000
- SPARC
- PA-RISC
- (part of) MIPS 3000 (in Hennessy & someone)
and have studied them.  I have not written assembly code for the Z80,
8051, 80960, or PA-RISC, though I have debugged other people's code
for the Z80 and 8051.  I have written assembler for the others, and
for the DEC-10, PDP-11, VAX, Pyramid, IBM RT-PC, and NS32532 (oh yes,
and the Computer Automation Alpha LSI 2 and the PRIME 400).  I know
more about the internals of computers than I really want to, and the
better a job the language I'm using does of hiding them, the better I
like it.  (There is a *lot* to like about Oberon, but I cannot like
Wirth's reversion to C-style integral types.)

>I actually quite agree with this; it's how you get the "high level
>thinking" that I think is an issue.

There is an Edinburgh paper from the 70s that I want to quote from.
It came from the EMAS project.  The EMAS operating system was designed
and written at Edinburgh, in an Algol-like language called IMP 77.
They observed that the operating system modules written by people
skilled in assembly language (people who even went to the extreme of
checking the assembly code produced by the compiler to make sure it
was efficient) tended to be
- bigger,
- less readable, and
- SLOWER
than those produced by people who had a "high level language"
perspective.  I'll try to remember to bring the paper in tomorrow so I
can quote it exactly.

>Even in a mathematical proof,
>you are talking about a sequence of micro-steps.  Yes, most
>proofs are built of other proofs, but I think this is more of
>a "proof macro language" than a "high-level mathematical
>language" (whatever that means).

I don't quite see proofs by duality or symmetry as simple macro
substitution.  I think your example here actually gives more comfort
to the anti-assembly camp than to your own.  I once took a
masters-level course on the Spinor Calculus.  The lecturer (who came
from the Sorbonne) wanted to make sure we understood the whole thing
thoroughly, so he started with metamathematics and constructed
everything from the ground up.  We were nearly half way through the
course before we got to the rational numbers.  We ended up with only
two lectures that were actually about spinors, and I _still_ don't
know what they are.  (I've started reading Penrose & Rindler to find
out.)  In that case at least, over-emphasis on "fundamentals" was a
major impediment to the high level understanding I really wanted.

For what it's worth, students here *do* get taught about computer
architecture, and they *do* get taught a bit of 68000 assembly code,
and they *do* work with little single-board machines so they can try
their 68000 code out.
But this is understood as a computer architecture course, not as a
data structures and algorithms course.  (The course is not confined to
the 68000, but that's all _they_ have to program at that level.)
-- 
Australian citizen since 14 August 1996.  *Now* I can vote the xxxs out!
Richard A. O'Keefe; http://www.cs.rmit.edu.au/~ok; RMIT Comp.Sci.