From: "Robert I. Eachus"
Newsgroups: comp.lang.ada
Subject: Re: Is the Writing on the Wall for Ada?
Date: Mon, 22 Sep 2003 13:15:40 GMT
Message-ID: <3F6EF608.7010704@attbi.com>
References: <3F650BBE.4080107@attbi.com> <3F67AAC6.2000906@attbi.com>

Wes Groleau wrote:

> The story was interesting, but whether due to my IQ
> or the fact it's an hour past my bedtime, I must confess
> I do not know whether you were illustrating my point
> or agreeing with Russ.

Neither, I guess. The reality is that whether or not a temporary is needed may be an artifact of the hardware, and how the type is represented. All that Ada "requires" is that if you do an increment in memory, other tasks can never see a value out of range.
This usually only applies of course to data visible to more than one task, but my example was a "clever" sequence which could trip up vector processing hardware--and often occurred in practice with vectors of complex values.

> (Russ insists that A += 1 never needs temporaries
> and that A := A + 1 always does.)

In practice on modern hardware this statement is so far from the truth that it isn't even wrong. For example, at the point where the increment is done, the value of A may be in main memory, L3, L2, and L1 cache, and in a (virtual) register. After a successful increment is done, the register file will contain both a new and an old copy of A, and the processor logic will determine which value is used for other instructions in flight.

For example, the code:

   A, B: Integer;
   ...
   B := A;
   A := A + 1;

may be executed by the processor out of order. If so, the assignment to B will use the old value mentioned above, and if the assignment to A causes a trap, the processor will create a state where the first assignment has been done before calling the trap handler. In fact, the actual microinstructions executed may look like:

   M1: load A to R1
   M2: load B to R2
   M3: copy R1 to R2
   M4: check if R1 = A's subtype'Last
   M5: if so cause a trap
   M6: increment R1

An OoO processor will then reify this code by assigning register file entries to the register names in the code. This will result in two register file entries each for R1 and R2. (And in practice, the second assignment for B will be to the renaming register initially assigned to A. This makes the "move" instruction a no-op, and it will be treated as such.) It will then execute the code in any order that is consistent with the register file assignments. This can even result in later instructions being "speculatively" executed before instructions they depend on. It would not be at all uncommon for M4 and M6 to be executed simultaneously, with M6 "unwound" (in this case the result is just ignored) if the trap in M5 occurs.
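
To make the trap case concrete, here is a minimal sketch of the same two assignments with a constrained subtype, so that the range check actually fails. The names Increment_Trap and Small are made up for illustration, and the comments tying statements to M1 .. M6 are only a rough correspondence, not what any particular compiler emits.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Increment_Trap is
   --  Hypothetical subtype, chosen only so the range check is easy to hit.
   subtype Small is Integer range 0 .. 10;

   A : Small := Small'Last;
   B : Small := 0;
begin
   B := A;       --  roughly M1 .. M3: B receives the old value of A
   A := A + 1;   --  roughly M4 .. M6: the range check fails here
   Put_Line ("not reached");
exception
   when Constraint_Error =>
      --  Whatever order the hardware actually used, the handler should
      --  see the state "first assignment done, second one not done".
      Put_Line ("B =" & Integer'Image (B) & ", A =" & Integer'Image (A));
end Increment_Trap;
```

I would expect the handler to report both B and A equal to Small'Last: the copy to B is visible, and the speculative increment of A has been thrown away. Nothing in the source tells you whether the hardware kept one copy of A or five while getting there.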
Oh, and if this code is executed often, for example in a loop, EVERY time through the instructions may be executed in a different order than the previous time through.

If instead of A and B being scalars, they are vectors, and the intent is to add one to every element (instead of adding a unit vector), the same sort of thing will happen. But in that case, the increment and test instructions for several dozen elements of A may be in flight simultaneously, and only if a trap occurs will the processor have to think about generating a consistent state.

To condense the above into one paragraph, learning assembly code for an ISA will result in the programmer thinking in terms of a finite state machine. But modern processors are NOT finite state machines. In particular, the processors are only guaranteed to produce an image of a state consistent with the ISA implemented when certain instructions are executed. The processor is never in any consistent state. The most you can hope for is that at some points during the execution, you can define a subset of the actual hardware registers that corresponds to some externally known state. (For example, in an x86 ISA processor, the instructions are retired in order. But two of the measures of goodness for an x86 processor are how many instructions can be dispatched at once, and how many can be simultaneously retired.)

In particular, trying to eliminate copying is often bogus, since in practice the copy gets optimized away, either by the compiler, or by the processor front end.

-- 
Robert I. Eachus

Ryan gunned down the last of his third white wine and told himself it would all be over in a few minutes. One thing he'd learned from Operation Beatrix: This field work wasn't for him. --from Red Rabbit by Tom Clancy.