From: Markus Schöpflin
Newsgroups: comp.lang.ada
Subject: Re: Profiling Ada binaries
Date: Tue, 26 Jul 2016 10:37:29 +0200
Message-ID: <04e12bd0-2c9d-f90d-2497-bf58593addfd@spam.spam>

On 25.07.2016 at 18:45, rieachus@comcast.net wrote:

> Gee. I would never think to compile the math libraries with -O1.
> Seriously, the math libraries are written with ease of understanding in
> mind. You may have thousands of calls in the implementation of one
> function, and due to the packages being generic, every one of those calls
> will do an elaboration check. How can that be efficient?

GNAT by default uses static elaboration. There should be no elaboration
checks when calling the generic versions. Or am I mistaken here?

> I believe GNAT
> has non-generic versions for Short_Float, Float, and Long_Float which use
> the hardware built-ins. But I doubt you would get that automatically with
> -O1.

Even using the non-generic versions, I have not been able to get the
hardware built-ins. The best I can achieve for a call to, e.g., cos(X) is:

    call ada__numerics__long_elementary_functions__cos

> Try compiling everything with -O3 (or whatever you use) then recompile only
> the unit you want the tracing for with -O1 and -fno-inline-functions-called-once.

-O3 is explicitly discouraged by the documentation, so we normally use -O2.
And to get a general feeling for where the application is burning its CPU
cycles, -O1 seems to be OK, as the execution time is normally dominated by
the choice of algorithms and not by differences in the optimization level.

Markus