From: Georg Bauhaus
Date: Tue, 29 Sep 2009 17:26:04 +0200
Newsgroups: comp.lang.ada
Subject: Re: Tasking for Mandelbrot program

jonathan wrote:
> On Sep 27, 10:27 pm, Georg Bauhaus <bug.bauh...@maps.futureapps.de> wrote:
>> Some more observations:
>>
>> - SSE2 code performs 8% faster when suitable compilation options
>>   are present, -mfpmath=sse -msse2 (this is currently the case).
>>   Then digits 15 should probably stay in the declaration of Real.
>>
>
> Some more notes on this puzzle ...

Indeed...
I might well have lost track in the labyrinth of settings, but the factor of nearly 2 that I see in Mandelbrot speed when switching from digits 15 to digits 16 (or back) looks odd, especially since the combinations covered by the test setup below do not appear to be similarly far apart.

Next thing I'll do is work through the exponential table of options, floating point type (FPT) definitions, CPUs, compilers, OSs, and so on, carefully measuring each cell.

For now, here is a little test setup that does some of this. If you like, unpack it in a fresh directory and type "make". This will compile and run a few FPT-related combinations, taking the core loop of Mandelbrot as an example. (On Windows, type "make all-not-native" after switching three OS-related variables near the head of the Makefile.)

http://home.arcor.de/bauhaus/Ada/test1516fpt.zip

I will be short of time during the next few days, and maybe offline.

FTR, as an experiment I have tried

   pragma Import(Intrinsic, MULPD, "__builtin_ia32_mulpd");

i.e. GCC builtins for SIMD multiplication etc., like the leading programs do. Formally, this appears to work, but in the end the compiler spat out a bug box.
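
A minimal sketch for probing the 15-versus-16-digits switch discussed above. It is not taken from the zip file, and the names Show_Digits, Real_15 and Real_16 are made up for illustration. It only prints which machine representation the compiler picks for each declaration. With GNAT on x86 I would expect digits 15 to land on the 64-bit IEEE double and digits 16 to need the 80-bit extended format, for which SSE2 has no arithmetic instructions, but that is an assumption to check on the machine at hand, not a measured result.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Show_Digits is
   --  Hypothetical stand-ins for the two variants of Real.
   --  Note: on targets without a wider-than-double format, the
   --  compiler may reject "digits 16" altogether.
   type Real_15 is digits 15;
   type Real_16 is digits 16;
begin
   --  'Base'Digits is the precision of the underlying machine type,
   --  'Machine_Mantissa its mantissa length in machine-radix digits,
   --  'Size the number of bits needed for a value.
   Put_Line ("Real_15: base digits ="
             & Integer'Image (Real_15'Base'Digits)
             & ", mantissa =" & Integer'Image (Real_15'Machine_Mantissa)
             & ", size =" & Integer'Image (Real_15'Size));
   Put_Line ("Real_16: base digits ="
             & Integer'Image (Real_16'Base'Digits)
             & ", mantissa =" & Integer'Image (Real_16'Machine_Mantissa)
             & ", size =" & Integer'Image (Real_16'Size));
end Show_Digits;
```

If the two lines report different mantissa lengths, the two declarations are not running on the same hardware type, and options such as -mfpmath=sse cannot be expected to treat them alike.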