From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level: 
X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00 autolearn=ham
	autolearn_force=no version=3.4.4
X-Google-Thread: 103376,7767a311e01e1cd
X-Google-Attributes: gid103376,public
X-Google-Language: ENGLISH,ASCII-7-bit
Path: 
 g2news2.google.com!news4.google.com!border1.nntp.dca.giganews.com!nntp.giganews.com!newsfeed00.sul.t-online.de!t-online.de!feeder.news-service.com!news2.euro.net!62.253.162.218.MISMATCH!news-in.ntli.net!newsrout1-win.ntli.net!ntli.net!news.highwinds-media.com!newspeer1-win.ntli.net!newsfe4-gui.ntli.net.POSTED!53ab2750!not-for-mail
From: "Dr. Adrian Wrigley" <amtw@linuxchip.demon.co.uk.uk.uk>
Subject: Re: GNAT compiler switches and optimization
User-Agent: Pan/0.14.2 (This is not a psychotic episode. It's a cleansing
 moment of clarity.)
Message-ID: <pan.2006.10.21.12.41.40.352403@linuxchip.demon.co.uk.uk.uk>
Newsgroups: comp.lang.ada
References: <1161341264.471057.252750@h48g2000cwc.googlegroups.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
Date: Sat, 21 Oct 2006 12:39:30 GMT
NNTP-Posting-Host: 82.10.238.153
X-Trace: newsfe4-gui.ntli.net 1161434370 82.10.238.153 (Sat,
 21 Oct 2006 13:39:30 BST)
NNTP-Posting-Date: Sat, 21 Oct 2006 13:39:30 BST
Organization: NTL
Xref: g2news2.google.com comp.lang.ada:7113
Date: 2006-10-21T12:39:30+00:00
List-Id: <comp.lang.ada>

On Fri, 20 Oct 2006 03:47:44 -0700, tkrauss wrote:

> I'm a bit stuck trying to figure out how to coax more performance
> out of some Ada code.  I suspect there is something simple (like
> compiler switches) but I'm missing it.  As an example I'm using
> a simple matrix multiply and comparing it to similar code in
> Fortran.  Unfortunately the Ada code takes 3-4 times as long.
> 
> I'm using GNAT (GPL 2006) and GFortran (4.2.0) and the following
> compile options:
> 
> gnat make -O3 -gnatp tst_array
> gfortran -O3 tst_array.f95
> 
> Running them on 800x800 matrices (on my 2GHz laptop)
> 
> for Ada: "tst_array 800" runs in 18 seconds
> for Fortran "tst_array 800" runs in 6 seconds
> 
> (if I use the fortran "matmul" intrinsic the fortran time drops to
> 2.5 seconds)
> 
> Note, I tried reordering the loops, removing the random calls, etc.
> none of which made huge changes.  There is something killing
> performance and/or a switch or two that I'm missing, but I can't
> seem to find it.  Any thoughts?

When I started using Ada, I found exactly the same thing.  Programs
ran slower.  Sometimes much slower.  Perhaps this is the biggest
single disadvantage of Ada (GNAT) in practice, particularly for
heavy numerical codes (compared to Fortran or C).

Looking closer, I found the assembly output was sometimes
littered with apparently redundant or unintended function
calls, extra checks and other baggage.
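
To make "extra checks" concrete, here is a rough sketch in C (not
GNAT output, and not the original poster's code) of a matrix multiply
written twice: once with an explicit index check on every subscript,
standing in for a compiler-inserted run-time check, and once without.
The names checked, matmul_checked, matmul_plain and self_check, and the
size N, are all made up for illustration.

```c
#include <assert.h>
#include <stdlib.h>

#define N 4  /* small illustrative size; the thread used 800 */

/* Hypothetical stand-in for a compiler-inserted index check. */
static size_t checked(size_t i)
{
    if (i >= N)
        abort();  /* roughly where Ada would raise Constraint_Error */
    return i;
}

/* Variant 1: every subscript goes through the check. */
void matmul_checked(double a[N][N], double b[N][N], double c[N][N])
{
    for (size_t i = 0; i < N; i++)
        for (size_t j = 0; j < N; j++) {
            double sum = 0.0;
            for (size_t k = 0; k < N; k++)
                sum += a[checked(i)][checked(k)] * b[checked(k)][checked(j)];
            c[checked(i)][checked(j)] = sum;
        }
}

/* Variant 2: logically identical, with no checks. */
void matmul_plain(double a[N][N], double b[N][N], double c[N][N])
{
    for (size_t i = 0; i < N; i++)
        for (size_t j = 0; j < N; j++) {
            double sum = 0.0;
            for (size_t k = 0; k < N; k++)
                sum += a[i][k] * b[k][j];
            c[i][j] = sum;
        }
}

/* Fill two small matrices, run both variants, confirm they agree. */
void self_check(void)
{
    double a[N][N], b[N][N], c1[N][N], c2[N][N];
    for (size_t i = 0; i < N; i++)
        for (size_t j = 0; j < N; j++) {
            a[i][j] = (double)(i + j);
            b[i][j] = (double)(i * N + j);
        }
    matmul_checked(a, b, c1);
    matmul_plain(a, b, c2);
    for (size_t i = 0; i < N; i++)
        for (size_t j = 0; j < N; j++)
            assert(c1[i][j] == c2[i][j]);
}
```

Both variants compute the same result.  Whether they cost the same
depends on the compiler: an optimiser may prove the checks redundant
and drop them, or it may not, and the only way to know is to look at
the generated assembly for each.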

The prevailing claims at the time were that Ada was roughly
as fast as C, sometimes faster (because of deeper semantics).
Whenever logically identical code was compared, however,
the output assembly code was often identical, giving the
same performance.

In real code, I found big differences when using enumerations
instead of integers, multi-dimensional arrays etc.  And
language-defined maths libraries, random number generators,
I/O etc. all risked major slow-downs.

The only solution I found was to examine the output code,
and modify the source until I got code that was acceptable.
Sometimes this meant dropping useful language features, or
using less clear constructs.

This is a real problem with the language (Ada using GNAT).
For a lot of applications, the optimisation effort isn't
worth it.  And even for performance critical applications,
most code is outside of any hot-spots.

I get the impression that Fortran is an excellent language
for getting efficient code easily.  C is also quite good.
But C++ and Ada both seem to need "hand-holding" to keep them
efficient.  Perl is the worst I know(!)

If you really care about performance, you'll check the
assembly code or compare against expectations.  As you
fix unexpected bottlenecks, you'll find out what type of
code compiles well with your compiler, and write future
code avoiding the problem areas.  Of course, when the
compiler changes, you may find the rules change and your
old code appears quaint or idiosyncratic.

The other posters to this thread have given some useful
optimisations to your original code.  Let us know
whether this bridges the performance gap for you!
--
Adrian Wrigley, Cambridge, UK.