comp.lang.ada
From: sjw <simon.j.wright@mac.com>
Subject: Re: Is there an Ada compiler whose Ada.Numerics.Generic_Elementary_Functions.Log(Base=>10, X=>variable) is efficient?
Date: Mon, 15 Feb 2010 11:50:06 -0800 (PST)
Message-ID: <d9ca32ef-5582-48a2-8c1e-423d2a35a5c1@d27g2000yqf.googlegroups.com>
In-Reply-To: d2f2e726-5e73-4b8d-ab6e-76d4c0b39f44@15g2000yqa.googlegroups.com

On Feb 15, 3:04 pm, jonathan <johns...@googlemail.com> wrote:
> On Feb 15, 2:54 pm, jonathan <johns...@googlemail.com> wrote:
> > On Feb 15, 10:58 am, Colin Paul Gloster <Colin_Paul_Glos...@ACM.org>
> > wrote:
>
> > > Hello,
>
> > > I have been improving a program by suppressing C++ in it. After
> > > speeding it up a lot by making changes, I have found one considerable
> > > part which calls a library routine, which is unfortunately very slow
> > > as provided as standard with a number of Ada compilers but which is
> > > very fast with implementations of other languages....
>
> > Here is my own bench ... easier to play with:
>
> > with ada.numerics.generic_elementary_functions;
> > with text_io; use text_io;
>
> > procedure log_bench is
>
> >   type Real is digits 15;
> > >   package Math is new ada.numerics.generic_elementary_functions (Real);
> >   use Math;
> >   x, y : Real := 0.1;
>
> >   -- might (or might not!) be faster to use:
> >   --    log_base_10 (x) = log_base_10_of_e * log_base_e (x)
>
> >   Log_base_10_of_e : constant := 0.434_294_481_903_251_827_651_129;
>
> > begin
>
> >   for i in 1 .. 1_000_000 loop
> >     x := x + Log_base_10_of_e * Log (y);
> >     y := y + 0.0000000000001;
> >   end loop;
> >   put (Real'Image(x));
>
> > > gnatmake -gnatnp -O2 -march=native log_bench.adb
>
> Oops, accidentally hit the send button!  Continuing discussion:
>
> gnatmake -gnatnp -O2 -march=native log_bench.adb
>
> -9.99999682845800E+05
> real    0m0.024s
> user    0m0.024s
> sys     0m0.000s
>
> NOW, comment out the y := y + ... so that the inner loop
> is constant, and use the same compilation switches:
>
> gnatmake -gnatnp -O2 -march=native log_bench.adb
>
> -9.99999900000000E+05
> real    0m0.002s
> user    0m0.000s
> sys     0m0.000s
>
> You see now what's happening.  With the -gnatn switch the
> compiler is smart enough to call Log just once, rather
> than 10**6 times.
>
> If you remove the -gnatn or -gnatN switches, then it runs in
>  0m0.024s again.

The trouble is that that benchmark does something other than Colin's!
This might be a more accurate translation:

with Ada.Numerics.Generic_Elementary_Functions;
with Text_IO; use Text_IO;
procedure Log_Bench_0 is
   type Real is digits 15;
   package Math is new Ada.Numerics.Generic_Elementary_Functions (Real);
   use Math;
   Answer : Real := 0.0;
   Log_Base_10_Of_E : constant := 0.434_294_481_903_251_827_651_129;
begin
   for I in 1 .. 1_000_000 loop
      declare
         X : Real := 0.1;
      begin
         for J in 1 .. 500 loop
            Answer := Answer + Log_Base_10_Of_E * Log (X);
            X := X + 0.1;
         end loop;
      end;
   end loop;
   Put (Real'Image(Answer));
end Log_Bench_0;

I've tried inlining GNAT's implementation (GCC 4.5.0,
x86_64-apple-darwin10.2.0) and even just calling the C log10 routine
using an inline. Neither was very impressive compared to the g++ result.

Colin's time: 37s
Jonathan's time (-O3 -ffast-math -gnatp): 16s
Jonathan's time (-O3 -ffast-math -gnatp -gnatN -funroll-loops): 14s
Jonathan's time (same opts, but using C log10()): 11s

so we still have nearly 3 orders of magnitude (11s / 0.02s is a factor of
about 550) to go to get to the g++ result: 0.02s

This is my final version, with the inlined GNAT implementation too:

with Ada.Numerics.Generic_Elementary_Functions;
with System.Machine_Code; use System.Machine_Code;
with Text_IO; use Text_IO;
procedure Log_Bench is
   type Real is digits 15;
   package Math is new Ada.Numerics.Generic_Elementary_Functions (Real);
   use Math;
   Answer : Real := 0.0;
   Log_Base_10_Of_E : constant := 0.434_294_481_903_251_827_651_129;
   function LogM (X : Real) return Real;
   pragma Inline_Always (LogM);
   function LogM (X : Real) return Real is
      Result : Real;
      NL : constant String := ASCII.LF & ASCII.HT;
   begin
      Asm (Template =>
         "fldln2               " & NL
       & "fxch                 " & NL
       & "fyl2x                " & NL,
         Outputs  => Real'Asm_Output ("=t", Result),
         Inputs   => Real'Asm_Input  ("0", X));
      return Result;
   end LogM;
   function LogL (X : Real) return Real;
   pragma Import (C, LogL, "log10");
begin
   for I in 1 .. 1_000_000 loop
      declare
         X : Real := 0.1;
      begin
         for J in 1 .. 500 loop
--              Answer := Answer + Log_Base_10_Of_E * LogM (X);
            Answer := Answer + LogL (X);
            X := X + 0.1;
         end loop;
      end;
   end loop;
   Put (Real'Image(Answer));
end Log_Bench;


This is a fragment of the assembler output from the g++ code:

	movsd	LC0(%rip), %xmm0
	call	_log10
	movsd	%xmm0, -3976(%rbp)
	movsd	LC1(%rip), %xmm0
	call	_log10
	movsd	%xmm0, -3968(%rbp)
	movsd	LC2(%rip), %xmm0
	call	_log10

and this is a fragment of the assembler from the last Ada:

	movapd	%xmm1, %xmm0
	movsd	%xmm1, -144(%rbp)
	addl	$10, %ebx
	call	_log10
	movsd	-144(%rbp), %xmm9
	addsd	-120(%rbp), %xmm0
	addsd	LC1(%rip), %xmm9
	movsd	%xmm0, -120(%rbp)
	movapd	%xmm9, %xmm0
	movsd	%xmm9, -144(%rbp)
	call	_log10
	movsd	-144(%rbp), %xmm8
	addsd	-120(%rbp), %xmm0
	addsd	LC1(%rip), %xmm8
	movsd	%xmm0, -120(%rbp)
	movapd	%xmm8, %xmm0
	movsd	%xmm8, -144(%rbp)
	call	_log10

at which point I have to leave it to the experts: why do the 7 Ada
instructions per call take so much more time than the 2 g++ instructions?
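
One experiment that might narrow this down, offered as a sketch rather
than a diagnosis: in the g++ fragment each log10 result is stored to its
own stack slot (-3976, -3968, ...), which would be consistent with the
500 values being computed once and then reused, though the fragment alone
does not prove that. The hypothetical variant below (Log_Bench_Table and
Logs are names invented here, not part of either benchmark) does the same
thing by hand in Ada: it fills a table with the 500 logarithms once and
leaves only the additions in the two nested loops.

```ada
with Ada.Numerics.Generic_Elementary_Functions;
with Ada.Text_IO; use Ada.Text_IO;

procedure Log_Bench_Table is
   type Real is digits 15;
   package Math is new Ada.Numerics.Generic_Elementary_Functions (Real);

   --  Hypothetical table holding the 500 base-10 logarithms.
   type Log_Table is array (1 .. 500) of Real;

   Logs   : Log_Table;
   X      : Real := 0.1;
   Answer : Real := 0.0;
begin
   --  Compute each logarithm exactly once, stepping X by 0.1 as the
   --  inner loop of the benchmark does.
   for J in Logs'Range loop
      Logs (J) := Math.Log (X, 10.0);
      X := X + 0.1;
   end loop;

   --  The nested loops now only accumulate the precomputed values,
   --  in the same order as the original benchmark.
   for I in 1 .. 1_000_000 loop
      for J in Logs'Range loop
         Answer := Answer + Logs (J);
      end loop;
   end loop;

   Put_Line (Real'Image (Answer));
end Log_Bench_Table;
```

If this version runs in something like the g++ time, that would suggest
the gap comes from how many times log10 is called rather than from the
instructions around each call. It can be built with the same switches as
the other versions.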



