From: gisle@struts.ii.uib.no (Gisle Sælensminde)
Subject: Re: ratioanl number type
Date: 1999/12/17
References: <38473D8A.9BB68676@gte.net> <3847E5B3.B96339A8@iforex.net>
 <385252FF.BB4C7A5@gte.net> <01bf43ea$646d5bc0$022a6282@dieppe>
 <38527A46.B4833D6A@gte.net> <01bf43f3$3aadb600$022a6282@dieppe>
 <82upl4$dra$1@nnrp1.deja.com>
Organization: University of Bergen, Norway
Newsgroups: comp.lang.ada

Vladimir Olensky wrote:
>
>Robert Dewar wrote in message <82upl4$dra$1@nnrp1.deja.com>...
>> "Vladimir Olensky" wrote:
>>> Maybe only some core assembler routines that make use of
>>> processor SIMD extensions. That would allow drastically
>>> improve efficiency.
>>
>>Well we know Vladimir likes the SIMD extensions,
>
>Yes I like them.
>This technique was used for more than 20 years in Russian
>supercomputers "Elbrus". So no wonder that one of the
>leading scientists from "Elbrus" team (Pentkovsky) was invited
>to the Intel and he lead the development of SIMD extensions
>for Intel chips.
>
>>Vladimir, have you actually done assembly coding using
>>these instructions?
>
>Not too much myself but there are a lot of SIMD code around
>(MMX and SSE). I have copies of almost all most interesting
>examples of code on this topic (in different domains) from Intel.

I have in fact tried the MMX instructions, and found them remarkably
difficult to use.

- They do not introduce full 64-bit arithmetic.
- The parallelism is difficult to use. You can't add in the upper half
  and multiply in the lower half of the register, and it's also
  difficult to arrange your data so that the parallelism in the
  instruction set is used efficiently.

- MMX adds more registers, and that's nice, but there are latencies
  when moving data from 32-bit registers to MMX registers, which in
  many cases remove the benefit of the MMX registers.

- The MMX instruction set has no FP instructions, and using it blocks
  the FP stack. You must execute the emms instruction before you can
  use FP instructions again.

- The MMX instructions are not very general, which makes it necessary
  to use 32-bit instructions anyway in most cases. Since moves between
  32-bit and MMX registers incur latencies, this often eats up the
  speed benefit of MMX.

- Their use will often introduce the need for a '32-bit version' of
  the code, since many computers do not have MMX instructions
  available.

In some cases the MMX instructions can make your code more efficient,
but in practice it is hard to write faster programs using them.

-- 
Gisle Sælensminde ( gisle@ii.uib.no )

ln -s /dev/null ~/.netscape/cookies