From: Jeffrey Creem
Date: Sat, 21 Oct 2006 12:35:54 -0400
Newsgroups: comp.lang.ada
Subject: Re: GNAT compiler switches and optimization

Gautier wrote:
> Jeffrey Creem:
>
>> Note, I am the first one to jump to the defense of "Ada" in general
>> but in this case, GNAT just plain sucks compared to GNU FORTRAN as it
>> does a poor job on (at least) the inner loop (verified by looking at
>> the output assembly)
>
> There is something strange...
> Martin Krischik was able to trim the
> overall time for the Ada code down to 24% of the first version (GNAT/GCC
> 4.1.1).
> This should make the Ada program as fast as the FORTRAN one, shouldn't it ?
> Maybe it's because the test is done on a 64 bit machine ?
> It needs some reconciliation...
> A good thing in that discussion would be that everybody shows each time
> - which GCC version
> - which machine
> - the execution time of the multiplication for both Ada and Fortran
> - which version of the Ada code (matrix on stack/heap, Fortran or Ada
>   convention)
>
> Cheers, Gautier
> ______________________________________________________________
> Ada programming -- http://www.mysunrise.ch/users/gdm/gsoft.htm
>
> NB: For a direct answer, e-mail address on the Web site!

I'd certainly be willing to run a few benchmarks, but the important thing
here is that rather innocent-looking code is running 2-4x slower than it
"should".

There are things that I think we can really rule out as being "the" factor.

1) Random number generator - I did timings (for both the Ada and FORTRAN)
   with timing moved to only cover the matrix multiply.
2) Different GCC versions - I built a fresh GCC from the GCC trunk for both
   Ada and FORTRAN.
3) The machine - I am running both on the same machine, though I suppose
   there could be differences in 32 bit vs. 64 bit comparisons.
4) Runtime checks - both the original author and I ran with checks
   suppressed.
5) O2/O3 - Actually, I could look at this some more with some other
   versions, but a quick look when I first started seemed to indicate this
   was not the issue.

A few other thoughts. Once the timing is limited to just the matrix
multiply, the stack/heap thing 'should' generally not matter. Some of the
changes made to the Ada version make it not really the same program as the
FORTRAN version, and the same changes made to the FORTRAN one would also
cause it to speed up (e.g. not counting the zeroing of the target array
during the accumulation phase).
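To make the measurement point concrete, here is a minimal sketch of what "timing moved to only cover the matrix multiply" can look like. It is written in Python purely to show the structure of the harness; it is not the Ada or FORTRAN program from this thread and says nothing about GNAT or gfortran performance. The names make_matrix, multiply and time_multiply are made up for this illustration. The random fill happens before the clock starts, and the zeroing of the target element is deliberately kept inside the timed region so both get counted the same way in every version being compared.

```python
import random
import time


def make_matrix(n, rng):
    """Build an n x n matrix of random values (setup work, never timed)."""
    return [[rng.random() for _ in range(n)] for _ in range(n)]


def multiply(a, b, c):
    """Naive triple-loop multiply of square matrices a and b into c."""
    n = len(a)
    for i in range(n):
        for j in range(n):
            c[i][j] = 0.0  # zeroing the target is counted as part of the work
            for k in range(n):
                c[i][j] += a[i][k] * b[k][j]


def time_multiply(n, seed=0):
    """Return (seconds spent in multiply only, result matrix)."""
    rng = random.Random(seed)
    a = make_matrix(n, rng)  # random number generation is outside the timer
    b = make_matrix(n, rng)
    c = [[0.0] * n for _ in range(n)]
    start = time.perf_counter()
    multiply(a, b, c)
    elapsed = time.perf_counter() - start
    return elapsed, c


if __name__ == "__main__":
    seconds, _ = time_multiply(100)
    print(f"multiply only: {seconds:.4f} s")
```

Whatever the language, the same split applies: anything that differs between the two programs outside the timed region (random numbers, allocation on stack or heap) drops out of the comparison, and anything left inside it has to be the same work on both sides.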
I have certainly seen some amazing performance from some Ada compilers in
the past, and in general, on non-trivial benchmarks, I am usually pretty
happy with the output of GNAT as well, but in this case it is not great.

Further, I tried playing a bit with the new autovectorization capability of
the newer 4.X series of GCC (it has to be specifically enabled) and found
that even very, very trivial cases would refuse to vectorize under Ada
(though after I submitted the bug report to GCC, I found that FORTRAN fails
to vectorize these too).

One thing everyone needs to remember is that this example was (probably)
not "Find the way to get the smallest value out of this test program",
because there are always ways of doing some tweaks to a small enough region
of code to make it better. If there is a 2-4x global slowdown in your
100KLOC program, you will never "get there" following the conventional
wisdom of profiling and looking for the problems.

Now, I am not suggesting that GNAT is globally 2-4x slower than GFORTRAN or
anything like that (since that does not line up with what I have generally
seen on larger code bases), but if I were a manager picking a new language
based on a set of long-term goals for a project and saw that GNAT was
running 2-4x slower, and was still running 1.X to 3X slower after 2 days of
Ada gurus looking at it, I'd probably jettison Ada (I know, I am mixing
compilers and languages here, but in reality, that is what happens in the
real world) and go with something else.

And before the chorus of the "processors are so fast that performance does
not matter as much as safety and correctness" crowd starts getting too
loud, let me point out that there are still many segments of the industry
where performance does still indeed matter. Especially when one is trading
adding a second processor to an embedded box against a vague promise of
"betterness" in terms of safety down the road... OK, off the soapbox.
So, in closing, if someone thinks they have "the best" version of that
program that they want timed against gfortran, post it here and I'll run it.