From mboxrd@z Thu Jan 1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level:
X-Spam-Status: No, score=-0.3 required=5.0 tests=BAYES_00,
	REPLYTO_WITHOUT_TO_CC autolearn=no autolearn_force=no version=3.4.4
X-Google-Thread: a07f3367d7,13280cdb905844e4
X-Google-Attributes: gida07f3367d7,public,usenet
X-Google-NewGroupId: yes
X-Google-Language: ENGLISH,ASCII
Path: g2news1.google.com!news4.google.com!feeder.news-service.com!weretis.net!feeder2.news.weretis.net!feeder.eternal-september.org!eternal-september.org!.POSTED!not-for-mail
From: Colin Paul Gloster
Newsgroups: comp.lang.ada
Subject: Re: Is there an Ada compiler whose Ada.Numerics.Generic_Elementary_Functions.Log(Base=>10, X=>variable) is efficient?
Date: Tue, 16 Feb 2010 16:50:22 +0000
Organization: A noiseless patient Spider
Message-ID:
References: <7b3d1a4c-e61f-41c0-a35b-a9d13c6f4f67@j31g2000yqa.googlegroups.com>
Reply-To: Colin Paul Gloster
Mime-Version: 1.0
Content-Type: MULTIPART/MIXED; BOUNDARY="8323328-959037319-1266337310=:21651"
Injection-Date: Tue, 16 Feb 2010 16:52:40 +0000 (UTC)
Injection-Info: feeder.eternal-september.org; posting-host="kheEuXGHhE2Z5eF1gAST+A"; logging-data="9374"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18BRoh1O+Hjzg13JxX09cXysOcZw0xm2UQCKB71pWwRSQ=="
User-Agent: Alpine 2.00 (LNX 1167 2008-08-23)
In-Reply-To:
Content-ID:
Cancel-Lock: sha1:h4J8NlAcHvpXXgmdBRQ77gm5ZOo=
X-X-Sender: Colin_Paul@Bluewhite64.example.net
Xref: g2news1.google.com comp.lang.ada:9268
Date: 2010-02-16T16:50:22+00:00
List-Id:

This message is in MIME format. The first part should be readable text,
while the remaining parts are likely unreadable without MIME-aware tools.

--8323328-959037319-1266337310=:21651
Content-Type: TEXT/PLAIN; CHARSET=ISO-8859-15; format=flowed
Content-Transfer-Encoding: QUOTED-PRINTABLE
Content-ID:

On Mon, 15 Feb 2010, S. J. W.
posted:

|---------------------------------------------------------------------------|
|"[..]                                                                      |
|>                                                                          |
|> You see now what's happening.  With the gnatn switch the                 |
|> compiler is smart enough to call the Log just once, rather               |
|> than 10**6 times.                                                        |
|>                                                                          |
|> If you remove the -gnatn or -gnatN switches, then it runs in             |
|>  0m0.024s again.                                                         |
|                                                                           |
|The trouble is that that benchmark does something other than Colin's!"     |
|---------------------------------------------------------------------------|

That is not the problem. The code which I posted at the beginning of
this thread was not an end in itself: it was intended for timing
implementations of base-ten logarithm functions in a manner
representative of real code which I use. The real code is not dedicated
to calculating something approximately equal to 6.3E+08. I could have
written 500 * 1_000_000 calls or 3.14 * 1000 calls or a single call,
though a single call might have been overwhelmed by overhead unrelated
to the logarithm function.

In the case of the C++ version with one particular compilation switch,
I failed in that task, because the hardcoded arguments I provided
allowed a trivial and dramatic optimization which would not happen in
the real code. While it is unfortunate for Ada code in general that Ada
compilers fail to mimic this optimization of G++'s, that particular
optimization would not benefit the usage of logarithms in the real code
I mentioned. Dr. Jonathan Parker is free to pursue this problem in a
subthread or with vendors.
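
The point about hardcoded arguments can be illustrated with a minimal
C++ sketch. It is an illustration under stated assumptions, not the
benchmark posted at the start of this thread: the function name
sum_log10 and its parameters are hypothetical. The starting argument
and the step arrive as parameters, so the sum cannot be evaluated at
compile time as long as the caller obtains them at run time (for
example from argv).

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper, not taken from the thread.
// Sums log10(x) for x = start, start + step, ... (`inner` terms per pass)
// and repeats the pass `outer` times, mirroring the nested loops of the
// benchmark. The arguments are parameters rather than literals; the caller
// is assumed to obtain them at run time so that they stay opaque.
double sum_log10(double start, double step, int inner, long outer)
{
    assert(start > 0.0 && step > 0.0);  // log10 needs positive arguments
    double answer = 0.0;
    for (long i = 0; i < outer; ++i) {
        double x = start;
        for (int j = 0; j < inner; ++j) {
            answer += std::log10(x);
            x += step;
        }
    }
    return answer;
}
```

A caller might pass a value parsed from the command line as start, with
0.1, 500 and 1000000 for the remaining parameters. An aggressive
optimizer may still simplify work that is invariant across the outer
loop, so it is worth inspecting the generated code before trusting a
timing.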

|---------------------------------------------------------------------------|
|"This might be a more accurate translation:                                |
|                                                                           |
|with Ada.Numerics.Generic_Elementary_Functions;                            |
|with Text_IO; use Text_IO;                                                 |
|procedure Log_Bench_0 is                                                   |
|   type Real is digits 15;                                                 |
|   package Math is new Ada.Numerics.Generic_Elementary_Functions (Real);   |
|   use Math;                                                               |
|   Answer : Real := 0.0;                                                   |
|   Log_Base_10_Of_E : constant := 0.434_294_481_903_251_827_651_129;       |
|begin                                                                      |
|   for I in 1 .. 1_000_000 loop                                            |
|      declare                                                              |
|         X : Real := 0.1;                                                  |
|      begin                                                                |
|         for J in 1 .. 500 loop                                            |
|            Answer := Answer + Log_Base_10_Of_E * Log (X);                 |
|            X := X + 0.1;                                                  |
|         end loop;                                                         |
|      end;                                                                 |
|   end loop;                                                               |
|   Put (Real'Image(Answer));                                               |
|end Log_Bench_0;                                                           |
|                                                                           |
|I've tried inlining GNAT's implementation (GCC 4.5.0,                      |
|x86_64-apple-darwin10.2.0) and even just calling up the C log10 routine    |
|using an inline. None was very impressive compared to the g++ result.      |
|                                                                           |
|Colin's time: 37s                                                          |
|Jonathan's time (-O3 -ffast-math -gnatp): 16s                              |
|Jonathan's time (-O3 -ffast-math -gnatp -gnatN -funroll-loops): 14s        |
|Jonathan's time (same opts, but using C log10()): 11s"                     |
|---------------------------------------------------------------------------|

That ordering does not necessarily hold...

GCC4.2.4...

gnatmake -O3 -ffast-math -gnatp Log_Bench_0.adb -o Log_Bench_0_compiled_by_GCC4.2.4_with_-ffast-math_and_-gnatp
time ./Log_Bench_0_compiled_by_GCC4.2.4_with_-ffast-math_and_-gnatp
6.34086408606382E+08
real    0m14.328s
user    0m14.329s
sys     0m0.000s

gnatmake -O3 -ffast-math -gnatp -gnatN -funroll-loops Log_Bench_0.adb -o Log_Bench_0_compiled_by_GCC4.2.4_with_-ffast-math_and_-gnatp_and_-gnatN_and_-funroll-loops
time ./Log_Bench_0_compiled_by_GCC4.2.4_with_-ffast-math_and_-gnatp_and_-gnatN_and_-funroll-loops
6.34086408606382E+08
real    0m14.346s
user    0m14.341s
sys     0m0.004s

GCC4.4.3 (slower than GCC4.2.4 for this program)...

gnatmake -O3 -ffast-math Log_Bench_0.adb -o Log_Bench_0_compiled_by_GCC4.4.3_with_-ffast-math
time ./Log_Bench_0_compiled_by_GCC4.4.3_with_-ffast-math
6.34086408606382E+08
real    0m14.713s
user    0m14.689s
sys     0m0.000s

gnatmake -O3 -ffast-math -gnatp Log_Bench_0.adb -o Log_Bench_0_compiled_by_GCC4.4.3_with_-ffast-math_and_-gnatp
time ./Log_Bench_0_compiled_by_GCC4.4.3_with_-ffast-math_and_-gnatp
6.34086408606382E+08
real    0m14.691s
user    0m14.693s
sys     0m0.000s

gnatmake -O3 -ffast-math -gnatp -gnatN -funroll-loops Log_Bench_0.adb -o Log_Bench_0_compiled_by_GCC4.4.3_with_-ffast-math_and_-gnatp_and_-gnatN_and_-funroll-loops
time ./Log_Bench_0_compiled_by_GCC4.4.3_with_-ffast-math_and_-gnatp_and_-gnatN_and_-funroll-loops
6.34086408606382E+08
real    0m14.690s
user    0m14.689s
sys     0m0.000s

|---------------------------------------------------------------------------|
|"so we still have 3 orders of magnitude to go to get to the g++ result:    |
|0.02s                                                                      |
|                                                                           |
|This is my final version, with the inlined GNAT implementation too:        |
|                                                                           |
|with Ada.Numerics.Generic_Elementary_Functions;                            |
|with System.Machine_Code; use System.Machine_Code;                         |
|with Text_IO; use Text_IO;                                                 |
|procedure Log_Bench is                                                     |
|   type Real is digits 15;                                                 |
|   package Math is new Ada.Numerics.Generic_Elementary_Functions (Real);   |
|   use Math;                                                               |
|   Answer : Real := 0.0;                                                   |
|   Log_Base_10_Of_E : constant := 0.434_294_481_903_251_827_651_129;       |
|   function LogM (X : Real) return Real;                                   |
|   pragma Inline_Always (LogM);                                            |
|   function LogM (X : Real) return Real is                                 |
|      Result : Real;                                                       |
|      NL : constant String := ASCII.LF & ASCII.HT;                         |
|   begin                                                                   |
|      Asm (Template =>                                                     |
|           "fldln2 " & NL                                                  |
|         & "fxch " & NL                                                    |
|         & "fyl2x " & NL,                                                  |
|           Outputs => Real'Asm_Output ("=t", Result),                      |
|           Inputs => Real'Asm_Input ("0", X));                             |
|      return Result;                                                       |
|   end LogM;                                                               |
|   function LogL (X : Real) return Real;                                   |
|   pragma Import (C, LogL, "log10");                                       |
|begin                                                                      |
|   for I in 1 .. 1_000_000 loop                                            |
|      declare                                                              |
|         X : Real := 0.1;                                                  |
|      begin                                                                |
|         for J in 1 .. 500 loop                                            |
|--          Answer := Answer + Log_Base_10_Of_E * LogM (X);                |
|            Answer := Answer + LogL (X);                                   |
|            X := X + 0.1;                                                  |
|         end loop;                                                         |
|      end;                                                                 |
|   end loop;                                                               |
|   Put (Real'Image(Answer));                                               |
|end Log_Bench;                                                             |
|                                                                           |
|[..]"                                                                      |
|---------------------------------------------------------------------------|

Not all of those switches would yield fair proxies for timings of
logarithms in the real code which inspired this thread, but anyway...

64bit GCC4.2.4...

gnatmake -O3 -ffast-math Log_Bench.adb -o Log_Bench_compiled_by_GCC4.2.4_with_-ffast-math -largs /lib/libm.so.6
time ./Log_Bench_compiled_by_GCC4.2.4_with_-ffast-math
6.34086408606382E+08
real    0m34.497s
user    0m34.494s
sys     0m0.004s

gnatmake -gnatp -O3 -ffast-math Log_Bench.adb -o Log_Bench_compiled_by_GCC4.2.4_with_-ffast-math_and_-gnatp -largs /lib/libm.so.6
time ./Log_Bench_compiled_by_GCC4.2.4_with_-ffast-math_and_-gnatp
6.34086408606382E+08
real    0m34.503s
user    0m34.506s
sys     0m0.000s

gnatmake -gnatN -funroll-loops -gnatp -O3 -ffast-math Log_Bench.adb -o Log_Bench_compiled_by_GCC4.2.4_with_-ffast-math_and_-gnatp_and_-gnatN_and_-funroll-loops -largs /lib/libm.so.6
time ./Log_Bench_compiled_by_GCC4.2.4_with_-ffast-math_and_-gnatp_and_-gnatN_and_-funroll-loops
6.34086408606382E+08
real    0m34.547s
user    0m34.546s
sys     0m0.004s

64bit GCC4.4.3...

gnatmake -O3 -ffast-math Log_Bench.adb -o Log_Bench_compiled_by_GCC4.4.3_with_-ffast-math -largs /lib/libm.so.6
time ./Log_Bench_compiled_by_GCC4.4.3_with_-ffast-math
6.34086408606382E+08
real    0m34.257s
user    0m34.258s
sys     0m0.000s

gnatmake -gnatp -O3 -ffast-math Log_Bench.adb -o Log_Bench_compiled_by_GCC4.4.3_with_-ffast-math_and_-gnatp -largs /lib/libm.so.6
time ./Log_Bench_compiled_by_GCC4.4.3_with_-ffast-math_and_-gnatp
6.34086408606382E+08
real    0m34.474s
user    0m34.478s
sys     0m0.000s

gnatmake -gnatN -funroll-loops -gnatp -O3 -ffast-math Log_Bench.adb -o Log_Bench_compiled_by_GCC4.4.3_with_-ffast-math_and_-gnatp_and_-gnatN_and_-funroll-loops -largs /lib/libm.so.6
time ./Log_Bench_compiled_by_GCC4.4.3_with_-ffast-math_and_-gnatp_and_-gnatN_and_-funroll-loops
6.34086408606382E+08
real    0m34.188s
user    0m34.182s
sys     0m0.004s

--8323328-959037319-1266337310=:21651--