From: "Randy Brukardt"
Newsgroups: comp.lang.ada
Subject: Re: Interesting performance quirk.
Date: Thu, 6 Nov 2008 18:44:09 -0600
References: <4903c066$0$28676$4d3efbfe@news.sover.net>

"Peter C. Chapin" wrote in message
news:4903c066$0$28676$4d3efbfe@news.sover.net...
...
> Now the interesting part. My main development system is a Windows XP
> laptop. On this system my "optimized" Blowfish benchmark encrypts or
> decrypts at about 11 MB/s (curiously decryption is a little faster than
> encryption, which seems odd). It also happens that I have OpenSUSE 10.2
> Linux running on the same box in a VMware virtual machine. In that
> environment my benchmark encrypts or decrypts at fully 27 MB/s. It's
> over twice as fast! I'm using GNAT GPL 2008 in both cases with the same
> compiler options and exactly the same source code. I'm even using the
> same basic hardware although, as I said, one of my systems---the faster
> one---is a virtual machine.
>
> Should I be surprised at this performance difference? I wasn't expecting
> it. Note that I'm using Ada.Calendar.Clock to track execution time. At
> first I wondered if the virtual machine's notion of time was distorted
> in some way but, no... the program is definitely faster in the VM (it
> runs long enough so that the difference in speed is easily perceptible
> by a human).

I can't answer whether you should be surprised, but I'm not. My
experience is that modern CPU chips have performance characteristics
that seem random and depend on things that no one has any control over.

My most recent example was a hobby program, much like yours. I was
surprised to see that fixing a memory management flaw caused the program
to run twice as fast. That temporarily caused rejoicing, until improving
the behavior of a non-critical piece of the program caused the program
to slow by 50%! (This effect showed up on several Windows OSes on
different Intel processors. But not on the old Pentium IIIs.)

Experimenting, I discovered that I could change code in units totally
unrelated to the "hot" areas of the program and cause vast changes in
the performance of the inner loops. I of course verified that the
generated code really was unchanged (it was).

I went as far as reading the latest Intel literature on these topics
(and it is huge). I thought that the effect might have had something to
do with the alignment of the innermost loops, but adding options to
control that to Janus/Ada didn't help much (it did get rid of the
slowest versions, but the performance still could vary wildly, about 30%
if I remember correctly). Having wasted most of a nice weekend messing
with this (and having no customer requirements at the time), I finally
gave up and just twiddled with some unrelated code until the program ran
fast.

So I don't quite know what is going on.
I suspect it is related in some way to alignment, but it might be
necessary for some code to be page aligned for maximum performance (and
that is way too expensive to use within loops and other code that is
going to be executed - you have to fill the empty space with no-ops, and
executing them takes some time. Intel actually recommends particular
no-op sequences for filling the space in order to minimize that time -
yuck).

So it is possible that the performance difference has everything to do
with unrelated parts of your program (such as the I/O libraries), which
are going to be different for the two OSes, and nothing to do with your
Ada code or anything that your compiler has control over.

                                    Randy.