comp.lang.ada
From: Keean Schupke <keean.schupke@googlemail.com>
Subject: Re: GNAT (GCC) Profile Guided Compilation
Date: Mon, 2 Jul 2012 16:48:33 -0700 (PDT)
Message-ID: <fed934c8-9cff-4905-811d-9f9d3050d0b1@googlegroups.com>
In-Reply-To: <cdbe38d2-c8b0-41b2-9830-d913aefa200c@googlegroups.com>

On Monday, 2 July 2012 18:26:58 UTC+1, Keean Schupke  wrote:
> On Monday, 2 July 2012 18:15:28 UTC+1, Georg Bauhaus  wrote:
> > On 02.07.12 00:57, Keean Schupke wrote:
> > > The real benefit (and performance gains) from profile guided compilation come from correcting branch prediction. As such the gains will be most apparent when there is an 'if' statement in the inner loop of the code. Try something where you are taking the sign of an int in the formula and have three cases <0 =0 >0.
> > 
> > 
> > Thanks for your lucid words, I was mostly guessing at what profile
> > guided compilation might actually do. Indeed, now that I have started
> > playing with conditionals, the translations show very different effects
> > already, for variations of the procedure below,
> > 
> >    procedure Compute_1D (A : in out Matrix_1D) is
> >    begin
> >       for K in A'First + Len + 1 .. A'Last - Len - 1 loop
> >          case K mod Len is
> >          when 0 | Len - 1 => null;
> >          when others =>
> >             A (K) := (A(K + 1)
> >                         + A(K - Len)
> >                         + A(K - 1)
> >                         + A(K + Len)) mod Num'Last;
> >          end case;
> >          if A (K) mod 6 = 0 then
> >             A (K) := (A (K) - 1) mod Num'Last;
> >          else
> >             A (K) := K mod Num'Last;
> >          end if;
> >       end loop;
> >    end Compute_1D;
> > 
> > Ada and C++ are mostly on a par without help from a profile
> > (the 2D approach is still better in the Ada case; perhaps mod 6
> > isn't true for that many K). C++ gains 8%, Ada only 4%, though.
> > 
> > 
> > Cheers,
> > Georg
> 
> 
> As it happens, the branch predictor is quite good at predicting regular 'mod' patterns. See:
> 
> http://en.wikipedia.org/wiki/Branch_predictor
> 
> And look for the section on the two level adaptive predictor.
> 
> I think Monte-Carlo techniques must be particularly sensitive to branch predictor error, as each iteration the branching is controlled by a pseudo random number (and we hope the branch predictor cannot predict that).
> 
> So if for each iteration you pick a random number, and that controls your branch pattern in the inner loop, you should see a stronger effect from the profile-guided optimisation.
> 
> 
> Cheers,
> Keean.

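A rough illustration of the point quoted above, as a minimal sketch rather than a model of any real CPU: it simulates one branch under a textbook two-level adaptive predictor (a history register indexing a table of 2-bit saturating counters) and compares a regular `mod 6` pattern with a branch driven by a pseudo-random number. The function name `mispredict_rate`, the 8-bit history length, the sample size and the seed are all assumptions chosen for the example.

```python
import random

def mispredict_rate(outcomes, history_bits=8):
    """Simulate one branch under a two-level adaptive predictor.

    The last `history_bits` outcomes index a table of 2-bit saturating
    counters; a counter value of 2 or 3 predicts "taken".
    """
    table = [1] * (1 << history_bits)   # every counter starts at "weakly not taken"
    mask = (1 << history_bits) - 1
    history = 0
    misses = 0
    for taken in outcomes:
        predicted = table[history] >= 2
        if predicted != taken:
            misses += 1
        # Train the counter selected by the current history.
        if taken:
            table[history] = min(3, table[history] + 1)
        else:
            table[history] = max(0, table[history] - 1)
        # Shift the actual outcome into the history register.
        history = ((history << 1) | int(taken)) & mask
    return misses / len(outcomes)

N = 100_000  # number of simulated loop iterations (arbitrary)

# Regular pattern, like the `mod 6 = 0` test in the loop quoted above.
periodic = [k % 6 == 0 for k in range(N)]

# Branch controlled by a pseudo-random number, as in a Monte Carlo loop.
rng = random.Random(2012)  # fixed seed so the run is repeatable
randomised = [rng.random() < 0.5 for _ in range(N)]

periodic_rate = mispredict_rate(periodic)
random_rate = mispredict_rate(randomised)
print(f"periodic: {periodic_rate:.5f}  random: {random_rate:.5f}")
```

In this toy model the periodic branch is mispredicted only during warm-up, while the random branch is mispredicted about half the time, which is the kind of workload where a profile has something to work with.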

I have done some testing with the Linux "perf" tool. These are some figures for the Ada version:

         1,014,900 l1-dcache-load-misses     #    0.01% of all L1-dcache hits
    12,462,973,199 l1-dcache-loads
         7,311,495 cache-references
            38,804 cache-misses              #    0.531 % of all cache refs
     2,588,686,069 branch-instructions
       388,460,030 branch-misses             #   15.01% of all branches
      21.885512117 seconds time elapsed

And here are the results for the C++ version:

           840,245 l1-dcache-load-misses     #    0.01% of all L1-dcache hits
    11,140,761,995 l1-dcache-loads
         6,019,321 cache-references
            27,584 cache-misses              #    0.458 % of all cache refs
     3,049,597,029 branch-instructions
       560,173,316 branch-misses             #   18.37% of all branches
      17.823476294 seconds time elapsed


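So the interesting thing is that the Ada version has fewer branches overall and fewer branch misses than the C++ version (15.01% of branches mispredicted against 18.37%), so it seems the profile-guided compilation has achieved as much as it can. Yet the Ada version is still slower (21.9 seconds against 17.8), so there must be another factor limiting performance. The interesting figure would appear to be the cache-misses: 38,804 for Ada against 27,584 for C++.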

So it would appear I need to focus on the cache utilisation of the Ada code.
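
To make the comparison easier to read, here is a small sketch that recomputes the ratios from the perf counters quoted above. The counter values are copied from the two listings; the dictionary layout and the helper name `percent` are just assumptions for the example, not anything perf provides.

```python
# Counter values copied from the perf listings above.
runs = {
    "Ada": {
        "l1_loads": 12_462_973_199,
        "cache_refs": 7_311_495,
        "cache_misses": 38_804,
        "branches": 2_588_686_069,
        "branch_misses": 388_460_030,
        "seconds": 21.885512117,
    },
    "C++": {
        "l1_loads": 11_140_761_995,
        "cache_refs": 6_019_321,
        "cache_misses": 27_584,
        "branches": 3_049_597_029,
        "branch_misses": 560_173_316,
        "seconds": 17.823476294,
    },
}

def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole

# Per-version rates: the same percentages perf prints, plus per-second counts.
summary = {}
for name, c in runs.items():
    summary[name] = {
        "branch_miss_pct": percent(c["branch_misses"], c["branches"]),
        "cache_miss_pct": percent(c["cache_misses"], c["cache_refs"]),
        "branch_misses_per_sec": c["branch_misses"] / c["seconds"],
        "cache_misses_per_sec": c["cache_misses"] / c["seconds"],
    }

# Ada relative to C++ for each raw counter (values above 1.0 mean Ada is higher).
ada_vs_cpp = {key: runs["Ada"][key] / runs["C++"][key] for key in runs["Ada"]}

for name, s in summary.items():
    print(name, {k: round(v, 3) for k, v in s.items()})
print("Ada / C++:", {k: round(v, 3) for k, v in ada_vs_cpp.items()})
```

The last line prints each Ada counter as a multiple of the C++ one, so the counters where Ada does more work than C++ stand out at a glance.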


Cheers,
Keean.


