On Mon, 15 Feb 2010, S. J. W. posted:

|---------------------------------------------------------------------------|
|"[..]                                                                      |
|>                                                                          |
|> You see now what's happening.  With the gnatn switch the                 |
|> compiler is smart enough to call the Log just once, rather               |
|> than 10**6 times.                                                        |
|>                                                                          |
|> If you remove the -gnatn or -gnatN switches, then it runs in             |
|>  0m0.024s again.                                                         |
|                                                                           |
|The trouble is that that benchmark does something other than Colin's!"     |
|---------------------------------------------------------------------------|

That is not the problem. The code which I posted at the beginning of
this thread was not an end in itself, but was intended for timing
implementations of base-ten logarithm functions in a manner
representative of real code which I use. The real code is not dedicated
to calculating something approximately equal to 6.3E+08. I could have
written 500 * 1_000_000 calls or 3.14 * 1000 calls or a single call. A
single call might have been overwhelmed by overhead unrelated to the
logarithm function. In the case of the C++ version when using a
particular compilation switch, I failed in the task because the
hardcoded arguments I provided resulted in a trivial and dramatic
optimization which would not happen in the real code. While it is
unfortunate for Ada code in general that Ada compilers fail to mimic
this optimization of G++'s, that particular optimization would not
benefit the usage of logarithms in the real code I mentioned. Dr.
Jonathan Parker is free to pursue this problem in a subthread or with
vendors.
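As an illustration of guarding a timing loop against that kind of
folding, here is a minimal C++ sketch. It is not the code timed in this
thread: the function names sum_log10 and repeated_sum_log10, their
parameters, and the use of volatile are all assumptions made for the
example. The idea is only that the logarithm arguments reach the loop as
run-time values, which leaves the optimizer less scope to evaluate the
sum once (or at compile time) and reuse it.

```cpp
// Illustrative sketch only; names and parameters are hypothetical and are
// not taken from the programs timed in this thread.
#include <cassert>
#include <cmath>
#include <cstddef>

// Sum log10 over the arguments start, start + step, start + 2*step, ...
// (count terms in total). The arguments are ordinary run-time parameters.
double sum_log10(double start, double step, std::size_t count)
{
    assert(start > 0.0 && step >= 0.0);  // log10 needs positive arguments
    double answer = 0.0;
    double x = start;
    for (std::size_t j = 0; j < count; ++j) {
        answer += std::log10(x);
        x += step;
    }
    return answer;
}

// Repeat the inner sum `repeats` times. The starting value is re-read
// through a volatile object on every pass, so the compiler may not assume
// that all passes see the same input and merge them into one.
double repeated_sum_log10(double start, double step,
                          std::size_t count, std::size_t repeats)
{
    volatile double opaque_start = start;  // each read is an observable access
    double total = 0.0;
    for (std::size_t i = 0; i < repeats; ++i) {
        total += sum_log10(opaque_start, step, count);
    }
    return total;
}
```

A call such as repeated_sum_log10(0.1, 0.1, 500, 1000000) would mirror
the loop counts of the Ada programs quoted in this thread. Whether a
given compiler and switch combination still simplifies the loop is best
checked by inspecting the generated code.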
|---------------------------------------------------------------------------|
|"This might be a more accurate translation:                                |
|                                                                           |
|with Ada.Numerics.Generic_Elementary_Functions;                            |
|with Text_IO; use Text_IO;                                                 |
|procedure Log_Bench_0 is                                                   |
|   type Real is digits 15;                                                 |
|   package Math is new Ada.Numerics.Generic_Elementary_Functions           |
|(Real);                                                                    |
|   use Math;                                                               |
|   Answer : Real := 0.0;                                                   |
|   Log_Base_10_Of_E : constant := 0.434_294_481_903_251_827_651_129;       |
|begin                                                                      |
|   for I in 1 .. 1_000_000 loop                                            |
|      declare                                                              |
|         X : Real := 0.1;                                                  |
|      begin                                                                |
|         for J in 1 .. 500 loop                                            |
|            Answer := Answer + Log_Base_10_Of_E * Log (X);                 |
|            X := X + 0.1;                                                  |
|         end loop;                                                         |
|      end;                                                                 |
|   end loop;                                                               |
|   Put (Real'Image(Answer));                                               |
|end Log_Bench_0;                                                           |
|                                                                           |
|I've tried inlining GNAT's implementation (GCC 4.5.0, x86_64-aqpple-       |
|darwin10.2.0) and even just calling up the C log10 routine using an        |
|inline. None was very impressive compared to the g++ result.               |
|                                                                           |
|Colin's time: 37s                                                          |
|Jonathan's time (-O3 -ffast-math -gnatp): 16s                              |
|Jonathan;s time (-O3 -ffast-math -gnatp -gnatN -funroll-loops): 14s        |
|Jonathan's time (same opts, but using C log10()): 11s"                     |
|---------------------------------------------------------------------------|

That ordering does not necessarily hold...

GCC4.2.4...

gnatmake -O3 -ffast-math -gnatp Log_Bench_0.adb -o Log_Bench_0_compiled_by_GCC4.2.4_with_-ffast-math_and_-gnatp
time ./Log_Bench_0_compiled_by_GCC4.2.4_with_-ffast-math_and_-gnatp
 6.34086408606382E+08
real 0m14.328s
user 0m14.329s
sys 0m0.000s

gnatmake -O3 -ffast-math -gnatp -gnatN -funroll-loops Log_Bench_0.adb -o Log_Bench_0_compiled_by_GCC4.2.4_with_-ffast-math_and_-gnatp_and_-gnatN_and_-funroll-loops
time ./Log_Bench_0_compiled_by_GCC4.2.4_with_-ffast-math_and_-gnatp_and_-gnatN_and_-funroll-loops
 6.34086408606382E+08
real 0m14.346s
user 0m14.341s
sys 0m0.004s

GCC4.4.3 (slower than GCC4.2.4 for this program)...

gnatmake -O3 -ffast-math Log_Bench_0.adb -o Log_Bench_0_compiled_by_GCC4.4.3_with_-ffast-math
time ./Log_Bench_0_compiled_by_GCC4.4.3_with_-ffast-math
 6.34086408606382E+08
real 0m14.713s
user 0m14.689s
sys 0m0.000s

gnatmake -O3 -ffast-math -gnatp Log_Bench_0.adb -o Log_Bench_0_compiled_by_GCC4.4.3_with_-ffast-math_and_-gnatp
time ./Log_Bench_0_compiled_by_GCC4.4.3_with_-ffast-math_and_-gnatp
 6.34086408606382E+08
real 0m14.691s
user 0m14.693s
sys 0m0.000s

gnatmake -O3 -ffast-math -gnatp -gnatN -funroll-loops Log_Bench_0.adb -o Log_Bench_0_compiled_by_GCC4.4.3_with_-ffast-math_and_-gnatp_and_-gnatN_and_-funroll-loops
time ./Log_Bench_0_compiled_by_GCC4.4.3_with_-ffast-math_and_-gnatp_and_-gnatN_and_-funroll-loops
 6.34086408606382E+08
real 0m14.690s
user 0m14.689s
sys 0m0.000s

|---------------------------------------------------------------------------|
|"so we still have 3 orders of magnitude to go to get to the g++ result:    |
|0.02s                                                                      |
|                                                                           |
|This is my final version, with the inlined GNAT implementation too:        |
|                                                                           |
|with Ada.Numerics.Generic_Elementary_Functions;                            |
|with System.Machine_Code; use System.Machine_Code;                         |
|with Text_IO; use Text_IO;                                                 |
|procedure Log_Bench is                                                     |
|   type Real is digits 15;                                                 |
|   package Math is new Ada.Numerics.Generic_Elementary_Functions           |
|(Real);                                                                    |
|   use Math;                                                               |
|   Answer : Real := 0.0;                                                   |
|   Log_Base_10_Of_E : constant := 0.434_294_481_903_251_827_651_129;       |
|   function LogM (X : Real) return Real;                                   |
|   pragma Inline_Always (LogM);                                            |
|   function LogM (X : Real) return Real is                                 |
|      Result : Real;                                                       |
|      NL : constant String := ASCII.LF & ASCII.HT;                         |
|   begin                                                                   |
|      Asm (Template =>                                                     |
|             "fldln2 " & NL                                                |
|           & "fxch " & NL                                                  |
|           & "fyl2x " & NL,                                                |
|           Outputs => Real'Asm_Output ("=t", Result),                      |
|           Inputs => Real'Asm_Input ("0", X));                             |
|      return Result;                                                       |
|   end LogM;                                                               |
|   function LogL (X : Real) return Real;                                   |
|   pragma Import (C, LogL, "log10");                                       |
|begin                                                                      |
|   for I in 1 .. 1_000_000 loop                                            |
|      declare                                                              |
|         X : Real := 0.1;                                                  |
|      begin                                                                |
|         for J in 1 .. 500 loop                                            |
|--          Answer := Answer + Log_Base_10_Of_E * LogM (X);                |
|            Answer := Answer + LogL (X);                                   |
|            X := X + 0.1;                                                  |
|         end loop;                                                         |
|      end;                                                                 |
|   end loop;                                                               |
|   Put (Real'Image(Answer));                                               |
|end Log_Bench;                                                             |
|                                                                           |
|[..]"                                                                      |
|---------------------------------------------------------------------------|

Not all of those switches would yield fair proxies for timings of
logarithms in the real code which inspired this thread, but anyway...

64bit GCC4.2.4...

gnatmake -O3 -ffast-math Log_Bench.adb -o Log_Bench_compiled_by_GCC4.2.4_with_-ffast-math -largs /lib/libm.so.6
time ./Log_Bench_compiled_by_GCC4.2.4_with_-ffast-math
 6.34086408606382E+08
real 0m34.497s
user 0m34.494s
sys 0m0.004s

gnatmake -gnatp -O3 -ffast-math Log_Bench.adb -o Log_Bench_compiled_by_GCC4.2.4_with_-ffast-math_and_-gnatp -largs /lib/libm.so.6
time ./Log_Bench_compiled_by_GCC4.2.4_with_-ffast-math_and_-gnatp
 6.34086408606382E+08
real 0m34.503s
user 0m34.506s
sys 0m0.000s

gnatmake -gnatN -funroll-loops -gnatp -O3 -ffast-math Log_Bench.adb -o Log_Bench_compiled_by_GCC4.2.4_with_-ffast-math_and_-gnatp_and_-gnatN_and_-funroll-loops -largs /lib/libm.so.6
time ./Log_Bench_compiled_by_GCC4.2.4_with_-ffast-math_and_-gnatp_and_-gnatN_and_-funroll-loops
 6.34086408606382E+08
real 0m34.547s
user 0m34.546s
sys 0m0.004s

64bit GCC4.4.3...

gnatmake -O3 -ffast-math Log_Bench.adb -o Log_Bench_compiled_by_GCC4.4.3_with_-ffast-math -largs /lib/libm.so.6
time ./Log_Bench_compiled_by_GCC4.4.3_with_-ffast-math
 6.34086408606382E+08
real 0m34.257s
user 0m34.258s
sys 0m0.000s

gnatmake -gnatp -O3 -ffast-math Log_Bench.adb -o Log_Bench_compiled_by_GCC4.4.3_with_-ffast-math_and_-gnatp -largs /lib/libm.so.6
time ./Log_Bench_compiled_by_GCC4.4.3_with_-ffast-math_and_-gnatp
 6.34086408606382E+08
real 0m34.474s
user 0m34.478s
sys 0m0.000s

gnatmake -gnatN -funroll-loops -gnatp -O3 -ffast-math Log_Bench.adb -o Log_Bench_compiled_by_GCC4.4.3_with_-ffast-math_and_-gnatp_and_-gnatN_and_-funroll-loops -largs /lib/libm.so.6
time ./Log_Bench_compiled_by_GCC4.4.3_with_-ffast-math_and_-gnatp_and_-gnatN_and_-funroll-loops
 6.34086408606382E+08
real 0m34.188s
user 0m34.182s
sys 0m0.004s