From: Samuel Tardieu
Newsgroups: comp.lang.ada
Subject: Re: GNAT compiler switches and optimization
Date: 20 Oct 2006 14:09:45 +0200
Message-ID: <871wp3p4s6.fsf@willow.rfc1149.net>
References: <1161341264.471057.252750@h48g2000cwc.googlegroups.com>

>>>>> "tkrauss" == tkrauss writes:

tkrauss> Running them on 800x800 matrices (on my 2GHz laptop)
tkrauss> for Ada: "tst_array 800" runs in 18 seconds for Fortran
tkrauss> "tst_array 800" runs in 6 seconds
tkrauss> (if I use the fortran "matmul" intrinsic the fortran time
tkrauss> drops to 2.5 seconds)

tkrauss> Note, I tried reordering the loops, removing the random
tkrauss> calls, etc. none of which made huge changes. There is
tkrauss> something killing performance and/or a switch or two that I'm
tkrauss> missing, but I can't seem to find it. Any thoughts?

First of all, what you measure is not only the matrix multiplication
time but also the time spent filling the matrices with random numbers.
I've moved the "start" initialization after the matrix initialization.
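
For illustration, here is a minimal sketch of that change. It is not the
original tst_array source: the procedure name, the variable names, the
fixed size N and the use of Ada.Calendar for timing are all assumptions.
The point is only that Clock is read after the matrices have been filled,
so the random number generation is left out of the measurement.

```ada
with Ada.Calendar;               use Ada.Calendar;
with Ada.Numerics.Float_Random;
with Ada.Text_IO;

procedure Time_Matmul is
   --  Hypothetical size, kept small so the matrices fit on the stack.
   N : constant := 200;

   type Real_Matrix is array (1 .. N, 1 .. N) of Float;

   Gen     : Ada.Numerics.Float_Random.Generator;
   A, B, C : Real_Matrix;
   Sum     : Float;
   Start   : Time;
   Elapsed : Duration;
begin
   --  Fill the inputs first; this part is deliberately not timed.
   Ada.Numerics.Float_Random.Reset (Gen);
   for I in 1 .. N loop
      for J in 1 .. N loop
         A (I, J) := Ada.Numerics.Float_Random.Random (Gen);
         B (I, J) := Ada.Numerics.Float_Random.Random (Gen);
      end loop;
   end loop;

   --  Start the clock only once initialization is done.
   Start := Clock;

   for I in 1 .. N loop
      for J in 1 .. N loop
         Sum := 0.0;
         for K in 1 .. N loop
            Sum := Sum + A (I, K) * B (K, J);
         end loop;
         C (I, J) := Sum;
      end loop;
   end loop;

   Elapsed := Clock - Start;

   Ada.Text_IO.Put_Line
     ("Multiplication took" & Duration'Image (Elapsed) & " s");
   --  Print one element so the result is visibly used.
   Ada.Text_IO.Put_Line ("C (1, 1) =" & Float'Image (C (1, 1)));
end Time_Matmul;
```

Ada.Calendar measures wall-clock time, which should be good enough for
runs lasting several seconds; shorter runs would need a finer clock.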

The following optimizations make the difference smaller (9.47 seconds
for Fortran vs. 11.90 seconds for Ada on my machine):

  - use -fomit-frame-pointer on the gnatmake command line (this doesn't
    change anything in the Fortran case)

  - add: pragma Convention (Fortran, Real_Matrix) to switch the storage
    order (row-major vs. column-major); I guess this helps keep more
    data in the cache

  - use 1 .. N as loop indices instead of A'Range (1) and friends;
    this is closer to the Fortran code you posted

Still, this is a huge penalty for Ada. Unfortunately, I don't have the
time to investigate further right now. However, I would be interested
in other people's findings.

  Sam
-- 
Samuel Tardieu -- sam@rfc1149.net -- http://www.rfc1149.net/