comp.lang.ada
From: "Jeffrey R. Carter" <spam.not.jrcarter@acm.not.spam.org>
Subject: Re: GNAT compiler switches and optimization
Date: Sun, 22 Oct 2006 07:39:27 GMT
Message-ID: <OeF_g.1031657$084.91539@attbi_s22>
In-Reply-To: <sj3r04-rlv.ln1@newserver.thecreems.com>

Jeffrey Creem wrote:
> 
> Actually, as a result of this, I submitted a bug report to the GCC 
> bugzilla list. You can follow progress on it here:
> 
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29543
> 
> Interesting initial feedback is that
> 1) Not an Ada bug.
> 2) Is a FORTRAN bug
> 3) Is a backend limitation of the optimizer.
> 
> Of course, the FORTRAN one still runs correctly so I don't think most 
> users will care that it is because of a bug :)

Interesting. I've been experimenting with some variations simply out of 
curiosity and found some things that seem a bit strange. (All results 
for an argument of 800.)

Adding the Sum variable makes an important difference, as others have 
reported, in my case from 5.82 to 4.38 s. Hoisting the indexing 
calculation for the result (C) matrix location is a basic optimization, 
and I would be surprised if it isn't done. The only thing I can think of 
is that it's a cache issue: that all 3 matrices can't be kept in cache 
at once. Perhaps compiler writers would be able to make sense of this.
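
For anyone who wants to compare the two forms side by side, here is a minimal sketch. It assumes the version without Sum accumulated directly into C (Row, Column); the names Sum_Sketch, C_Direct, and C_Summed and the small fixed size are illustrative only and are not from the timed program.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Sum_Sketch is
   N : constant := 4; -- small fixed size, for illustration only

   type Real_Matrix is array (1 .. N, 1 .. N) of Float;

   A : constant Real_Matrix := (others => (others => 1.0) );
   B : constant Real_Matrix := (others => (others => 2.0) );

   C_Direct : Real_Matrix := (others => (others => 0.0) );
   C_Summed : Real_Matrix := (others => (others => 0.0) );

   Sum : Float;
begin -- Sum_Sketch
   -- Form 1 (assumed original): accumulate directly into the result
   -- matrix, so the inner loop indexes C on every iteration.
   for Row in A'range (1) loop
      for Column in B'range (2) loop
         for R in A'range (2) loop
            C_Direct (Row, Column) :=
               C_Direct (Row, Column) + A (Row, R) * B (R, Column);
         end loop;
      end loop;
   end loop;

   -- Form 2: accumulate in a local scalar and store once per element.
   for Row in A'range (1) loop
      for Column in B'range (2) loop
         Sum := 0.0;

         for R in A'range (2) loop
            Sum := Sum + A (Row, R) * B (R, Column);
         end loop;

         C_Summed (Row, Column) := Sum;
      end loop;
   end loop;

   -- Report whether the two forms agree element for element.
   if C_Direct = C_Summed then
      Put_Line ("Both forms give the same result");
   else
      Put_Line ("Results differ");
   end if;
end Sum_Sketch;
```

This only shows that the two forms compute the same thing; it says nothing about timing, which needs the full-size program and the options listed further down.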

Previously, I found no difference between -O2 and -O3. With this change, 
-O2 is faster.

The issue of using 'range compared to using "1 .. N" makes no difference 
in my version of the program.
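
To be clear about what is being compared, the two loop styles look like this. This is a small sketch with illustrative names (Range_Sketch, Total_Attribute, Total_Literal) and a fixed size, not the timed program.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Range_Sketch is
   N : constant := 3; -- illustrative size

   type Real_Matrix is array (1 .. N, 1 .. N) of Float;

   A : constant Real_Matrix := (others => (others => 1.0) );

   Total_Attribute : Float := 0.0;
   Total_Literal   : Float := 0.0;
begin -- Range_Sketch
   -- Bounds taken from the array object through the 'range attribute.
   for Row in A'range (1) loop
      for Column in A'range (2) loop
         Total_Attribute := Total_Attribute + A (Row, Column);
      end loop;
   end loop;

   -- The same bounds written out explicitly.
   for Row in 1 .. N loop
      for Column in 1 .. N loop
         Total_Literal := Total_Literal + A (Row, Column);
      end loop;
   end loop;

   -- Both loops visit the same elements, so the totals should match.
   if Total_Attribute = Total_Literal then
      Put_Line ("Same total either way");
   else
      Put_Line ("Totals differ");
   end if;
end Range_Sketch;
```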

Something I found really surprising is that putting the multiplication 
in a procedure makes the program faster, down to 4.03 s. I have no idea 
why this would be so.

Compiled with MinGW GNAT 3.4.2, -O2, -gnatnp -fomit-frame-pointer. Run 
under Windows XP SP2 on a 3.2 GHz Pentium 4 HT with 1 GB RAM.

Here's the code:

with Ada.Numerics.Float_Random;
with Ada.Command_Line;          use Ada.Command_Line;
with Ada.Text_IO;               use Ada.Text_IO;
with Ada.Calendar;              use Ada.Calendar;

procedure Tst_Array is
    package F_IO is new Ada.Text_IO.Float_IO (Float);
    package D_IO is new Ada.Text_IO.Fixed_IO (Duration);

    N : constant Positive := Integer'Value (Argument (1) );

    type Real_Matrix is array (1 .. N, 1 .. N) of Float;
    pragma Convention (FORTRAN, Real_Matrix);

    G : Ada.Numerics.Float_Random.Generator;

    A, B : Real_Matrix :=
       (others => (others => Ada.Numerics.Float_Random.Random (G) ) );
    C : Real_Matrix := (others => (others => 0.0) );
    Start, Finish : Ada.Calendar.Time;

    procedure Multiply is
       Sum : Float;
    begin -- Multiply
       All_Rows : for Row in A'range (1) loop
          All_Columns : for Column in B'range (2) loop
             Sum := 0.0;

             All_Common : for R in A'range (2) loop
                Sum := Sum + A (Row, R) * B (R, Column);
             end loop All_Common;

             C (Row, Column) := Sum;
          end loop All_Columns;
       end loop All_Rows;
    end Multiply;
begin
    Start := Ada.Calendar.Clock;
    Multiply;
    Finish := Ada.Calendar.Clock;

    F_IO.Put (C (1, 1) );
    F_IO.Put (C (1, 2) );
    New_Line;
    F_IO.Put (C (2, 1) );
    F_IO.Put (C (2, 2) );
    New_Line;

    Put ("Time: ");
    D_IO.Put (Finish - Start);
    New_Line;
end Tst_Array;

Next, since a meaningful speed-up of quick sort on a Pentium 4 HT 
processor from using 2 tasks has been reported, I thought I'd see what 
effect that had here. With 2 tasks, I got a time of 3.70 s. That's not a 
significant speed-up, about 9.1%.

Same compilation options and platform.

Here's that code:

with Ada.Numerics.Float_Random;
with Ada.Command_Line;          use Ada.Command_Line;
with Ada.Text_IO;               use Ada.Text_IO;
with Ada.Calendar;              use Ada.Calendar;

procedure Tst_Array is
    package F_IO is new Ada.Text_IO.Float_IO (Float);
    package D_IO is new Ada.Text_IO.Fixed_IO (Duration);

    N : constant Positive := Integer'Value (Argument (1) );

    type Real_Matrix is array (1 .. N, 1 .. N) of Float;
    pragma Convention (FORTRAN, Real_Matrix);

    G : Ada.Numerics.Float_Random.Generator;

    A, B : Real_Matrix :=
       (others => (others => Ada.Numerics.Float_Random.Random (G) ) );
    C : Real_Matrix := (others => (others => 0.0) );
    Start, Finish : Ada.Calendar.Time;

    procedure Multiply is
       procedure Multiply
          (Start_Row : in Positive; Stop_Row : in Positive)
       is
          Sum : Float;
       begin -- Multiply
          All_Rows : for Row in Start_Row .. Stop_Row loop
             All_Columns : for Column in B'range (2) loop
                Sum := 0.0;

                All_Common : for R in A'range (2) loop
                   Sum := Sum + A (Row, R) * B (R, Column);
                end loop All_Common;

                C (Row, Column) := Sum;
             end loop All_Columns;
          end loop All_Rows;
       end Multiply;

       task type Multiplier (Start_Row : Positive; Stop_Row : Positive);

       task body Multiplier is
          -- null;
       begin -- Multiplier
          Multiply (Start_Row => Start_Row, Stop_Row => Stop_Row);
       end Multiplier;

       Stop  : constant Positive := N / 2;
       Start : constant Positive := Stop + 1;

       Mult : Multiplier (Start_Row => 1, Stop_Row => Stop);
    begin -- Multiply
       Multiply (Start_Row => Start, Stop_Row => N);
    end Multiply;
begin
    Start := Ada.Calendar.Clock;
    Multiply;
    Finish := Ada.Calendar.Clock;

    F_IO.Put (C (1, 1) );
    F_IO.Put (C (1, 2) );
    New_Line;
    F_IO.Put (C (2, 1) );
    F_IO.Put (C (2, 2) );
    New_Line;

    Put ("Time: ");
    D_IO.Put (Finish - Start);
    New_Line;
end Tst_Array;

If I inline the inner Multiply, or put equivalent code in the task and 
the outer Multiply, the time is much more than for the sequential 
version, presumably due to cache effects.

Since it appears you have 2 physical processors ("Dual Xeon 2.8 Ghz"), I 
would be interested in seeing what effect this concurrent version has on 
that platform. I also wonder how easy such a version would be to create 
in FORTRAN.

-- 
Jeff Carter
"Ada has made you lazy and careless. You can write programs in C that
are just as safe by the simple application of super-human diligence."
E. Robert Tisdale
72


