From mboxrd@z Thu Jan  1 00:00:00 1970
From: Markus Schöpflin <no.spam@spam.spam>
Newsgroups: comp.lang.ada
Subject: Re: Profiling Ada binaries
Date: Tue, 26 Jul 2016 10:37:29 +0200
Organization: Aioe.org NNTP Server
Message-ID: <04e12bd0-2c9d-f90d-2497-bf58593addfd@spam.spam>
References: <nmt6qe$3d6$1@gioia.aioe.org> <nmtcbq$6fh$1@dont-email.me>
 <nmtcns$81a$1@dont-email.me> <nn4dfh$1cog$1@gioia.aioe.org>
 <a9f6bc88-81e3-480f-9e2d-91060e1dbdb5@googlegroups.com>
NNTP-Posting-Host: MdpKeRr+sx3LK7JQiK5aNw.user.gioia.aioe.org
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
List-Id: <comp.lang.ada>

On 25.07.2016 at 18:45, rieachus@comcast.net wrote:

> Gee. I would never think to compile the math libraries with -O1.
> Seriously, the math libraries are written with ease of understanding in
> mind.  You may have thousands of calls in the implementation of one
> function, and due to the packages being generic, every one of those calls
> will do an elaboration check.  How can that be efficient?

GNAT uses its static elaboration model by default, so there should be no 
run-time elaboration checks when calling the generic versions. Or am I 
mistaken here?

> I believe GNAT
> has non-generic versions for Short_Float, Float, and Long_Float which use
> the hardware built-ins.  But I doubt you would get that automatically with
> -O1.

Even when using the non-generic versions, I have not been able to get the 
hardware built-ins. The best I can achieve for a call to e.g. Cos (X) is:

         call    ada__numerics__long_elementary_functions__cos
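
For anyone who wants to reproduce this, here is a minimal sketch of the kind 
of test I mean. The procedure name Cos_Test and reading the argument from the 
command line are just placeholders for illustration (the latter only to keep 
the compiler from folding the call into a constant); the exact assembly will 
depend on compiler version and target.

```ada
with Ada.Command_Line;
with Ada.Numerics.Long_Elementary_Functions;
with Ada.Text_IO;

procedure Cos_Test is
   use Ada.Numerics.Long_Elementary_Functions;

   --  Take the input at run time so the call to Cos cannot be
   --  evaluated at compile time.
   X : constant Long_Float :=
     Long_Float'Value (Ada.Command_Line.Argument (1));
begin
   --  This is the call to look for in the generated assembly.
   Ada.Text_IO.Put_Line (Long_Float'Image (Cos (X)));
end Cos_Test;
```

Saved as cos_test.adb, something like "gcc -S -O2 cos_test.adb" should 
produce cos_test.s, where the call to Cos can be inspected.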

> Try compiling everything with -O3 (or whatever you use) then recompile only
> the unit you want the tracing for with -O1 and -fno-inline-functions-called-once.

-O3 is explicitly discouraged by the documentation, so we normally use -O2. 
And to get a general feeling for where the application is burning its CPU 
cycles, -O1 seems to be OK, since the execution time is normally dominated by 
the choice of algorithms rather than by differences in the optimization level.

Markus