From: sjw <simon.j.wright@mac.com>
Subject: Re: Is there an Ada compiler whose Ada.Numerics.Generic_Elementary_Functions.Log(Base=>10, X=>variable) is efficient?
Date: Mon, 15 Feb 2010 11:50:06 -0800 (PST)
Message-ID: <d9ca32ef-5582-48a2-8c1e-423d2a35a5c1@d27g2000yqf.googlegroups.com>
In-Reply-To: d2f2e726-5e73-4b8d-ab6e-76d4c0b39f44@15g2000yqa.googlegroups.com
On Feb 15, 3:04 pm, jonathan <johns...@googlemail.com> wrote:
> On Feb 15, 2:54 pm, jonathan <johns...@googlemail.com> wrote:
> > On Feb 15, 10:58 am, Colin Paul Gloster <Colin_Paul_Glos...@ACM.org>
> > wrote:
>
> > > Hello,
>
> > > I have been improving a program by suppressing C++ in it. After
> > > speeding it up a lot by making changes, I have found one considerable
> > > part which calls a library routine, which is unfortunately very slow
> > > as provided as standard with a number of Ada compilers but which is
> > > very fast with implementations of other languages....
>
> > Here is my own bench ... easier to play with:
>
> > with ada.numerics.generic_elementary_functions;
> > with text_io; use text_io;
>
> > procedure log_bench is
>
> >    type Real is digits 15;
> >    package Math is new ada.numerics.generic_elementary_functions
> >      (Real);
> >    use Math;
> >    x, y : Real := 0.1;
>
> >    -- might (or might not!) be faster to use:
> >    -- log_base_10 (x) = log_base_10_of_e * log_base_e (x)
>
> >    Log_base_10_of_e : constant := 0.434_294_481_903_251_827_651_129;
>
> > begin
>
> >    for i in 1 .. 1_000_000 loop
> >       x := x + Log_base_10_of_e * Log (y);
> >       y := y + 0.0000000000001;
> >    end loop;
> >    put (Real'Image(x));
>
> > gnatmake -gnatnp -O2 -march=native log_bench.adb
>
> Oops, accidentally hit the send button! Continuing the discussion:
>
> gnatmake -gnatnp -O2 -march=native log_bench.adb
>
> -9.99999682845800E+05
> real 0m0.024s
> user 0m0.024s
> sys 0m0.000s
>
> NOW, comment out the y := y + ... so that the inner loop
> is constant, and use the same compilation switches:
>
> gnatmake -gnatnp -O2 -march=native log_bench.adb
>
> -9.99999900000000E+05
> real 0m0.002s
> user 0m0.000s
> sys 0m0.000s
>
> You see now what's happening. With the -gnatn switch the
> compiler is smart enough to call Log just once, rather
> than 10**6 times.
>
> If you remove the -gnatn or -gnatN switches, then it runs in
> 0m0.024s again.
The trouble is that that benchmark does something other than Colin's!
This might be a more accurate translation:

with Ada.Numerics.Generic_Elementary_Functions;
with Text_IO; use Text_IO;

procedure Log_Bench_0 is
   type Real is digits 15;
   package Math is new Ada.Numerics.Generic_Elementary_Functions
     (Real);
   use Math;
   Answer : Real := 0.0;
   Log_Base_10_Of_E : constant := 0.434_294_481_903_251_827_651_129;
begin
   for I in 1 .. 1_000_000 loop
      declare
         X : Real := 0.1;
      begin
         for J in 1 .. 500 loop
            Answer := Answer + Log_Base_10_Of_E * Log (X);
            X := X + 0.1;
         end loop;
      end;
   end loop;
   Put (Real'Image(Answer));
end Log_Bench_0;

I've tried inlining GNAT's implementation (GCC 4.5.0,
x86_64-apple-darwin10.2.0) and even just calling the C log10 routine
through a pragma Import. Neither was very impressive compared to the
g++ result.
Colin's time: 37s
Jonathan's time (-O3 -ffast-math -gnatp): 16s
Jonathan's time (-O3 -ffast-math -gnatp -gnatN -funroll-loops): 14s
Jonathan's time (same opts, but using C log10()): 11s
so we still have 3 orders of magnitude to go to get to the g++ result:
0.02s
This is my final version, with the inlined GNAT implementation too:

with Ada.Numerics.Generic_Elementary_Functions;
with System.Machine_Code; use System.Machine_Code;
with Text_IO; use Text_IO;

procedure Log_Bench is
   type Real is digits 15;
   package Math is new Ada.Numerics.Generic_Elementary_Functions
     (Real);
   use Math;
   Answer : Real := 0.0;
   Log_Base_10_Of_E : constant := 0.434_294_481_903_251_827_651_129;

   --  Natural log via x87 instructions: ln (2) * log2 (X).
   function LogM (X : Real) return Real;
   pragma Inline_Always (LogM);

   function LogM (X : Real) return Real is
      Result : Real;
      NL : constant String := ASCII.LF & ASCII.HT;
   begin
      Asm (Template =>
             "fldln2 " & NL
             & "fxch " & NL
             & "fyl2x " & NL,
           Outputs => Real'Asm_Output ("=t", Result),
           Inputs => Real'Asm_Input ("0", X));
      return Result;
   end LogM;

   --  The C library's log10, imported directly.
   function LogL (X : Real) return Real;
   pragma Import (C, LogL, "log10");
begin
   for I in 1 .. 1_000_000 loop
      declare
         X : Real := 0.1;
      begin
         for J in 1 .. 500 loop
            --  Answer := Answer + Log_Base_10_Of_E * LogM (X);
            Answer := Answer + LogL (X);
            X := X + 0.1;
         end loop;
      end;
   end loop;
   Put (Real'Image(Answer));
end Log_Bench;

This is a fragment of the assembler output from the g++ code:
movsd LC0(%rip), %xmm0
call _log10
movsd %xmm0, -3976(%rbp)
movsd LC1(%rip), %xmm0
call _log10
movsd %xmm0, -3968(%rbp)
movsd LC2(%rip), %xmm0
call _log10
and this is a fragment of the assembler from the last Ada:
movapd %xmm1, %xmm0
movsd %xmm1, -144(%rbp)
addl $10, %ebx
call _log10
movsd -144(%rbp), %xmm9
addsd -120(%rbp), %xmm0
addsd LC1(%rip), %xmm9
movsd %xmm0, -120(%rbp)
movapd %xmm9, %xmm0
movsd %xmm9, -144(%rbp)
call _log10
movsd -144(%rbp), %xmm8
addsd -120(%rbp), %xmm0
addsd LC1(%rip), %xmm8
movsd %xmm0, -120(%rbp)
movapd %xmm8, %xmm0
movsd %xmm8, -144(%rbp)
call _log10
at which point I have to leave it to the experts: why do the 7 Ada
instructions take so much more time than the 2 g++ instructions?
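
One possible reading of the two fragments, offered as an assumption
rather than anything measured in this thread: the g++ code passes
constants (LC0, LC1, LC2) to log10 and stores each result in its own
stack slot, which would fit the 500 values being computed once and
reused, while the Ada code calls log10 inside the loop on every pass.
If so, the comparison is between 500 calls and 500 million calls, not
between 2 and 7 instructions. A minimal sketch of the same hoisting
done by hand in Ada follows; the names Log_Table_Sketch, Table,
Hoisted and Direct are made up for illustration, and the outer count
is cut to 1_000 so the direct loop finishes quickly.

```ada
with Ada.Numerics.Generic_Elementary_Functions;
with Ada.Text_IO; use Ada.Text_IO;

procedure Log_Table_Sketch is
   type Real is digits 15;
   package Math is new Ada.Numerics.Generic_Elementary_Functions (Real);

   Inner : constant := 500;
   Outer : constant := 1_000;  --  reduced from 1_000_000 (assumption)

   --  Hypothetical table holding log10 of 0.1, 0.2, ... computed once.
   Table : array (1 .. Inner) of Real;

   X       : Real := 0.1;
   Hoisted : Real := 0.0;
   Direct  : Real := 0.0;
begin
   --  Fill the table: Inner calls to Log in total.
   for J in Table'Range loop
      Table (J) := Math.Log (X, 10.0);
      X := X + 0.1;
   end loop;

   --  Hoisted version: the outer loop only adds stored values.
   for I in 1 .. Outer loop
      for J in Table'Range loop
         Hoisted := Hoisted + Table (J);
      end loop;
   end loop;

   --  Direct version: Outer * Inner calls to Log, as in Log_Bench_0.
   for I in 1 .. Outer loop
      X := 0.1;
      for J in 1 .. Inner loop
         Direct := Direct + Math.Log (X, 10.0);
         X := X + 0.1;
      end loop;
   end loop;

   --  Both loops add the same values in the same order.
   if abs (Hoisted - Direct) <= 1.0E-6 * abs Direct then
      Put_Line ("same sum");
   else
      Put_Line ("sums differ");
   end if;
end Log_Table_Sketch;
```

Timing the two outer loops separately would show whether the call
count alone accounts for the gap; this sketch only checks that the two
forms give the same sum.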