From: "Robert I. Eachus"
Subject: Re: Compiler for Z80/6510
Date: 1999/12/01
Organization: The MITRE Corporation
Newsgroups: comp.lang.ada

Robert Dewar wrote:

> In practice, really skilled assembly language programmers (there
> are very few around, especially these days), can always outrun
> any compiler given enough time and effort.

Once, a long time ago, I taught the APL section of a Computing Languages course right after the PL/I section. (As I said, this was a long time ago.) The students were using an IBM 3033U mainframe. (As I said...) The last assignment for the PL/I segment was to write a program to generate magic squares in PL/I.

In the first APL session (before the assignment was due), I put a $20 bill on the lectern and said I'd give it to anyone whose submitted PL/I program was faster than the following APL code. (APL one-liner omitted to spare eyesight.) I then proceeded to explain some of the reasons why an interpreted language could be a reasonable choice for many applications.

One student, who probably didn't really need the course, asked after class how fast the APL version ran. I handed him a listing showing another one-liner which timed the first program, and the time: 87 ms.

About a week later, he ambushed me again after class and showed me what he had done. First, he took the very clever algorithm from the APL program and coded it in PL/I. It took over a second. Next he wrote a PL/I null program, which still took 99 ms to run. (He said: "So you made a pretty safe bet." Me: "Did you think I didn't know that?") Next he created an APL workspace that closed itself immediately: 19 ms. Hmmm. So he rewrote the code in IBM 370 assembler, and did a very good job: 233 ms. Hmmm. Hmmm!

What was going on? I explained that the IBM 3033 had a built-in special accelerator for APL code: instructions that could not be generated by the assembler. This set included some vector operations which were very fast when just moving bits around; arithmetic was still limited by the ALU bottleneck. Since the magic square algorithm used only array permutations, it was very, very fast.
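(For the curious: the flavor of such a permutation-only construction, though emphatically not the omitted APL one-liner, can be sketched in a few lines of Ada. This is the classic De la Loubere rule for odd orders, written as a closed-form shuffle of the index grid; take it as an illustration of the idea, nothing more.)

   with Ada.Text_IO; use Ada.Text_IO;

   procedure Magic is
      N : constant := 3;  --  any odd order works for this rule
      M : array (0 .. N - 1, 0 .. N - 1) of Integer;
   begin
      --  Each (I, J) is sent to a unique value in 1 .. N*N; every
      --  row, column, and diagonal then sums to N * (N*N + 1) / 2.
      for I in 0 .. N - 1 loop
         for J in 0 .. N - 1 loop
            M (I, J) := N * ((I + J + 1 + N / 2) mod N)
                        + ((I + 2 * J + 1) mod N) + 1;
         end loop;
      end loop;
      --  For N = 3 this prints 8 1 6 / 3 5 7 / 4 9 2 (all sums 15).
      for I in 0 .. N - 1 loop
         for J in 0 .. N - 1 loop
            Put (Integer'Image (M (I, J)));
         end loop;
         New_Line;
      end loop;
   end Magic;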
Now this is a pretty special case, but in general modern processors have hundreds of special-purpose operations which are not available from the assembler, because the assembler limits itself to the set of instructions specified for all implementations of the architecture. For example, on Sun SPARCs it was at first the case that you could use (integer) multiply and divide instructions which weren't available in compiler-generated code but could be generated by the assembler. Then it became possible to provide (in Ada terms) code inserts that generated these instructions. Then the compilers started generating them. Next the OS provided routines which trapped the instructions if they were not provided by the particular chip you were running on, and so on. But about when the Version 8 SPARC reference came out, things proceeded in exactly the reverse order with some of the new instructions: the assembler was the last to support them. (I could go into details, but there were serious timing issues for these instructions, and it was thought that they would only be used in compiler run-times for special purposes.) So for a while, I had some Ada code for the SPARC that ran faster than the tightest assembler I could write.

Currently we are in this regime again. The Intel Pentium II and III and the AMD Athlon have different graphics support enhancements. We could argue about which is better, but using special libraries and run-times, in some cases specific to both the chip AND the graphics card, you can generate code which is significantly faster than any Pentium assembly code. Of course, to get these advantages, not only do you have to have processor-specific knowledge, but you often have to be able to program in machine language, and that is very definitely a dying art.

-- 
Robert I. Eachus

with Standard_Disclaimer;
use Standard_Disclaimer;
function Message (Text: in Clever_Ideas) return Better_Ideas is...