comp.lang.ada
From: Keean Schupke <keean.schupke@googlemail.com>
Subject: Re: GNAT (GCC) Profile Guided Compilation
Date: Sun, 1 Jul 2012 15:57:37 -0700 (PDT)
Message-ID: <a4f2a43e-5593-48f6-9e0f-7d0057874f94@googlegroups.com> (raw)
In-Reply-To: <4ff08cb2$0$6575$9b4e6d93@newsspool3.arcor-online.net>

On Sunday, 1 July 2012 18:45:22 UTC+1, Georg Bauhaus  wrote:
> On 29.06.12 19:03, Keean Schupke wrote:
> > Note: this is not really answering my question about the lack of improvement due to profile-guided compilation - but perhaps that side of things is more a compiler issue. Does anyone have any ideas about that side of things?
> 
> Some more observations, collected with the help of the two simple
> example programs appended below.
> 
> For Ada and GNAT at least, any advantages to be had from profile-guided
> compilation seem to vary with optimization options and sizes of data.
> The same is true about a 1D approach vs a 2D approach.
> 
> Among the various arrangements, one of the best I have got
> from both GNAT GPL 2012, and from an FSF GNAT as of June 2012
> uses
> *) -O2 -funroll-loops -gnatp
> *)  the 2D approach
> 
> (If it means anything, adding -ftree-vectorize to the -O2 set
> produces only the second-best Ada result, the same as the -O3 times
> listed below. I had chosen -ftree-vectorize as this switch is in the
> -O3 set.) Trying profile-guided compilation at higher optimization
> levels consistently slowed the Ada program down (other Ada programs
> were faster after profile guided compilation, as mentioned elsewhere).
> 
> The result at -O2 seems nice, though, in that the 2D approach is
> natural, the compiler switches are the ones that the manual recommends,
> and this combination produces the fastest Ada program.
> 
> In short, from worst to best 1D / 2D, Runs = 100:
> 
> Ada profile guided -> .88 / .80   (this program only!)
> Ada at -O3         -> .68 / .66
> C++ at -O3         -> .66
> Ada at -O2 -funr.. -> .68 / .47
> C++ profile guided -> .31
> 
> 
> Some differences seem to vary with hardware, too.
> On one computer I have seen a minor speedup, on another
> a very small slowdown, and a different ratio of 1D / 2D.
> 
> For the C++ case I have tested a 1D approach only, not
> knowing how to write 2D arrays in C++ using C style arrays.
> I'm not a C++ programmer, apologies for my C++.
> 
> 
> Ada at -O2 -funroll-loops -gnatp -mtune=native
> 1D:  0.680750000 27303
> 2D:  0.469094000 27303  <-- Best for Ada!
> 
> Ada at -O3 -gnatNp -fomit-frame-pointer -mtune=native
> 1D:  0.676616000 27303
> 2D:  0.664194000 27303
> 
> previous with profile-guided compilation
> 1D:  0.888206000 27303
> 2D:  0.806196000 27303
> 
> C++ at -O3 -mtune=native
> 1D: 0.681721 0
> 
> previous with profile-guided compilation
> 1D: 0.31165 0
> 
> Clang++
> 1D: 0.611522 0
> 
> The GNU C++ compiler is from the GNAT GPL 2012 distribution for Mac OS X,
> Clang++ is Apple's v 3.1.
> 
> 
> === 8< === 8< === 8< === 8< ===
> 
> package Fringes is
> 
>     pragma Pure (Fringes);
> 
>     type Full_Index is mod 2**32;
>     subtype Num is Full_Index range 0 .. 100_000;
>     subtype Index is Full_Index range 0 .. 1000;
>     Len : constant Full_Index := Index'Last - Index'First + 1;
> 
>     type Matrix_1D is
>       array (Index'First .. Index'First + Len * Len - 1) of Num;
>     type Matrix_2 is array (Index, Index) of Num;
> 
>     procedure Compute_1D (A : in out Matrix_1D);
>     procedure Compute_2  (A : in out Matrix_2);
> 
> end Fringes;
> 
> package body Fringes is
> 
>     --
>     --  Each Compute procedure has a pointless loop that assigns each
>     --  inner field the sum of non-diagonal neighbors modulo Num'Last
>     --
> 
>     procedure NOT_USED_Compute_1D_Slow (A : in out Matrix_1D) is
>        J : Full_Index;
>     begin
>        J := A'First + Len;
>        loop
>          exit when J > A'Last - Len;
>          for K in J + 1 .. J + Len - 1 - 1 loop
>             A (K) := (A(K + 1)
>                       + A(K - Len)
>                       + A(K - 1)
>                       + A(K + Len)) mod Num'Last;
>          end loop;
>          J := J + Len;
>        end loop;
>     end NOT_USED_Compute_1D_Slow;
> 
>     procedure Compute_1D (A : in out Matrix_1D) is
>     begin
>        for K in A'First + Len + 1 .. A'Last - Len - 1 loop
>           case K mod Len is
>           when 0 | Len - 1 => null;
>           when others =>
>              A (K) := (A(K + 1)
>                        + A(K - Len)
>                        + A(K - 1)
>                        + A(K + Len)) mod Num'Last;
>           end case;
>        end loop;
>     end Compute_1D;
> 
>     procedure Compute_2 (A : in out Matrix_2) is
>     begin
>        for J in A'First(1) + 1 .. A'Last(1) - 1 loop
>           for K in A'First(2) + 1 .. A'Last(2) - 1 loop
>              A (J, K) := (A(J, K + 1)
>                           + A(J - 1, K)
>                           + A(J, K - 1)
>                           + A(J + 1, K)) mod Num'Last;
>           end loop;
>        end loop;
>     end Compute_2;
> 
> end Fringes;
> 
> 
> with Fringes;
> with Ada.Command_Line, Ada.Real_Time;
> procedure Test_Fringes is
> 
>     use Ada.Real_Time;
> 
>     Runs : constant Natural :=
>       Natural'Value (Ada.Command_Line.Argument(1));
> 
>     Start, Stop: Time;
> 
>     package Os_Or_Gnat_Stacksize_Nuisance is
>        use Fringes;
>        type P1 is access Matrix_1D;
>        type P2 is access Matrix_2;
>        M1P : constant P1 := new Matrix_1D'(others => 123);
>        M2P : constant P2 := new Matrix_2'(Index => (Index => 123));
>        M1D : Matrix_1D renames M1P.all;
>        M2 : Matrix_2 renames M2P.all;
>     end Os_Or_Gnat_Stacksize_Nuisance;
>     use Os_Or_Gnat_Stacksize_Nuisance;
> 
>     procedure Print_Timing (Part : String; N : Fringes.Num) is separate;
>     use type Fringes.Full_Index;
> begin
>     Start := Clock;
>     for Run in 1 .. Runs loop
>        Fringes.Compute_1D (M1D);
>     end loop;
>     Stop := Clock;
>     Print_Timing ("1D", M1D ((M1D'First + 1 * Fringes.Len) + (2)));
> 
>     Start := Clock;
>     for Run in 1 .. Runs loop
>        Fringes.Compute_2 (M2);
>     end loop;
>     Stop := Clock;
>     Print_Timing ("2D", M2 (1, 2));
> end Test_Fringes;
> 
> with Ada.Text_IO;
> separate (Test_Fringes)
> procedure Print_Timing (Part : String; N : Fringes.Num) is
>     use Ada.Text_IO;
> begin
>     Put (Part);
>     Put (": ");
>     Put (Duration'Image (To_Duration(Stop - Start)));
>     Put (Fringes.Num'Image (N));
>     New_Line;
> end Print_Timing;
> 
> 
> === 8< === 8< === 8< === 8< ===
> 
> #include <stdint.h>
> 
> namespace Fringes {
> 
>      typedef  uint32_t Full_Index;
> #define Num_Last 100000
>      typedef Full_Index Num;
> #define Index_Last 1000
>      typedef Full_Index Index;
>      const Full_Index Len = Index_Last + 1;
> 
> #define A_Last ((Len * Len) - 1)
>      typedef Num Matrix_1D[A_Last + 1];
> 
>      void Compute_1D (Matrix_1D&  A);
>      //  Each Compute procedure has a pointless loop that assigns each
>      //  inner field the sum of non-diagonal neighbors modulo Num_Last
>      void Compute_1D (Matrix_1D&  A) {
> 
>          for (Full_Index K = Len + 1; K <= A_Last - Len - 1; ++K) {  // <= matches the Ada loop's inclusive upper bound
>              switch (K % Len) {
>              case 0: case Len - 1: break;
>              default:
>                  A [K] = (A[K + 1]
>                            + A[K - Len]
>                            + A[K - 1]
>                            + A[K + Len]) % Num_Last;
>              }
>          }
>      }
> }
> 
> #include <sys/time.h>
> #include <string>
> 
> class Reporter {
>      struct timeval Start, Stop;
> public:
>      void logStart();
>      void logStop();
>      void Print_Timing (const std::string Part, const Fringes::Num N);
> };
> 
> 
> #include <sstream>
> 
> int main(int argc, char* argv[]) {
>      using namespace Fringes;
> 
>      int Runs;
>      Matrix_1D* M1D = new Matrix_1D[1]();  // allocate and zero-initialise a single Len*Len matrix
>      Reporter history;
> 
>      if (argc < 2) {  // need the run count in argv[1]
>          throw "usage";
>      } else {
>          std::istringstream converter(argv[1]);
> 
>          if (! ((converter >> Runs) && (Runs >= 0))) {
>              throw "argument error?";
>          }
>      }
>      
>      history.logStart();
>      for (int Run = 1; Run <= Runs; ++Run) {
>          Compute_1D (*M1D);
>      }
>      history.logStop();
>      history.Print_Timing ("1D", (*M1D) [(1 * Len) + (2)]);
> }
> 
> 
> #include <iostream>
> #include <sys/time.h>
> 
> void Reporter::Print_Timing (const std::string Part, const Fringes::Num N) {
>      double difference = (Stop.tv_sec - Start.tv_sec) * 1000000
>              + (Stop.tv_usec - Start.tv_usec);
>      std::cout << Part << ": ";
>      std::cout << (difference / 1000000.0) << " " << N;
>      std::cout << '\n';
> }
> 
> void Reporter::logStart() {
>      gettimeofday(&this->Start, 0);
> }
> 
> void Reporter::logStop() {
>      gettimeofday(&this->Stop, 0);
> }

Interesting stuff; I need a bit more time to read and digest it, but I did notice one thing that I think is worth pointing out.

The real benefit (and the performance gain) from profile-guided compilation comes from correcting branch prediction. As such, the gains will be most apparent when there is an 'if' statement in the inner loop of the code. Try something where you take the sign of an integer in the formula and branch on the three cases < 0, = 0 and > 0; a sketch of the kind of thing I mean follows.
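
For example, something along these lines (an untested sketch on my part; Compute_2_Signed is a made-up name, and it reuses Matrix_2 and Num from your Fringes package, so it would also need a matching declaration in the spec):

procedure Compute_2_Signed (A : in out Matrix_2) is
   --  Sketch only: like Compute_2, but with a data-dependent,
   --  three-way branch in the inner loop, so profile feedback has a
   --  branch whose bias it can discover and exploit.
   S : Integer;
begin
   for J in A'First (1) + 1 .. A'Last (1) - 1 loop
      for K in A'First (2) + 1 .. A'Last (2) - 1 loop
         --  The sign of the horizontal difference selects the branch.
         S := Integer (A (J, K + 1)) - Integer (A (J, K - 1));
         if S < 0 then
            A (J, K) := (A (J - 1, K) + 1) mod Num'Last;
         elsif S = 0 then
            A (J, K) := A (J + 1, K);
         else
            A (J, K) := (A (J - 1, K) + A (J + 1, K)) mod Num'Last;
         end if;
      end loop;
   end loop;
end Compute_2_Signed;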

The other point is that for any single branch there is at least a 50% chance it is already optimal, for example:

if x > 0 then
    y := 1 + y;
else
    y := 1 - y;
end if;

will get implemented in assembly as something like:

 tst rx
 ble xyz        ; branch taken when x <= 0
 add #1, ry     ; fall-through, x > 0:  y := 1 + y
 bra zyx
.xyz
 ld #1, r0      ; x <= 0:  y := 1 - y
 sub r0, ry
.zyx

OR

 tst rx
 bgt xyz        ; branch taken when x > 0
 ld #1, r0      ; fall-through, x <= 0:  y := 1 - y
 sub r0, ry
 bra zyx
.xyz
 add #1, ry     ; x > 0:  y := 1 + y
.zyx

Whether this is optimal depends on how often x <= 0 and on what the CPU's branch predictor guesses for 'ble xyz'. Most branch predictors start with the assumption that a backwards branch will be taken (a loop) and a forward branch will not be taken, but they also build statistical tables keyed on the address of the branch instruction to improve that guess on later iterations.

The speed difference can be large because a branch misprediction has a high penalty: a pipeline stall. Modern CPUs use speculative execution (they keep the pipeline filled by using the branch predictor to guess whether the branch will be taken), so when the prediction is correct there is no stall and the CPU runs through the code at full speed; when it is wrong, the stall can cost tens or hundreds of clock cycles each time it happens.

I initially expected the dynamic branch predictor to be doing a good job, so that changing the order of the 'if' statements would have no effect, but benchmarking on 64-bit x86 hardware showed a strong effect from changing the order of branches.

Anyway, the point is that even if you have an 'if' statement in the inner loop, you may be lucky and already have it the 'faster' way round.

So you need to benchmark:

if x > 0 then
    y := 1 + y;
else
    y := 1 - y;
end if;

AND

if x <= 0 then
    y := 1 - y;
else
    y := 1 + y;
end if;

Then you can confirm that profile-guided compilation gives you the faster of the two benchmark times, no matter which way round the 'if' statement is written in the source.
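
A minimal harness along these lines would do it (an untested sketch; the procedure name, the data mix and the iteration count are arbitrary choices of mine, and it wants something like -O2 -gnatp so the branch dominates; it is also worth glancing at the generated assembly, since the compiler may turn a branch this simple into a conditional move, which would hide the effect):

with Ada.Text_IO, Ada.Real_Time;
procedure Branch_Order_Bench is
   use Ada.Real_Time;
   Iterations : constant Integer := 100_000_000;
   --  Mostly positive data, so "x > 0" is the common case.
   Data   : array (0 .. 1023) of Integer;
   Y      : Integer := 0;
   T0, T1 : Time;
begin
   for I in Data'Range loop
      if I mod 16 = 0 then
         Data (I) := -1;
      else
         Data (I) := 1;
      end if;
   end loop;

   --  Variant A: "x > 0" tested first.
   T0 := Clock;
   for N in 1 .. Iterations loop
      if Data (N mod Data'Length) > 0 then
         Y := 1 + Y;
      else
         Y := 1 - Y;
      end if;
   end loop;
   T1 := Clock;
   Ada.Text_IO.Put_Line
     ("x > 0 first: " & Duration'Image (To_Duration (T1 - T0)));

   --  Variant B: "x <= 0" tested first.
   T0 := Clock;
   for N in 1 .. Iterations loop
      if Data (N mod Data'Length) <= 0 then
         Y := 1 - Y;
      else
         Y := 1 + Y;
      end if;
   end loop;
   T1 := Clock;
   Ada.Text_IO.Put_Line
     ("x <= 0 first:" & Duration'Image (To_Duration (T1 - T0)));

   --  Print Y so the compiler cannot optimise the loops away.
   Ada.Text_IO.Put_Line ("(ignore:" & Integer'Image (Y) & ")");
end Branch_Order_Bench;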

Obviously, for one 'if' statement in the inner loop you can practically benchmark both alternatives and choose the faster.

With my Monte-Carlo simulation the inner-loop logic is a complex state machine, and it would be impractical to try all the possible combinations: the number of orderings explodes as 2^N, so just 8 branches would need 256 change-and-rebuild cycles, 16 branches 65,536, and so on.

This is where profile-guided compilation is a valuable tool. The Ada code and the C++ code share the same branching in the state machine, and both have the same initial performance now (approx 40k simulations per second). However, the C++ compiler gains 25% from profile-guided compilation (by changing branch order and reordering functions). This gain should be available to the Ada compiler too, since the same branches appear in the output assembly - but it is just not working as well.

My latest results are better: I now have both the C++ and the Ada code building with the same GCC 4.7.1 compiler. The performance figures follow, after a quick note on the build procedure.

Just so it's clear: you first build with '-fprofile-generate', then run the program so that it writes the .gcda profile files, then build again with '-fprofile-use' to get the 'profile-guided' version. My code runs 1,000,000 iterations between the 'generate' and the 'use' steps.
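
For the Ada side that amounts to something like the following (a sketch from memory rather than a recipe - the program name is made up, and the exact switch placement may need adjusting for your project; -cargs passes switches to the compiler, -largs to the linker, and -f forces a full recompilation for the second build):

$ gnatmake -O2 -funroll-loops -gnatp my_prog.adb \
      -cargs -fprofile-generate -largs -fprofile-generate
$ ./my_prog 1000000        # training run; writes the *.gcda files
$ gnatmake -f -O2 -funroll-loops -gnatp my_prog.adb -cargs -fprofile-use
$ ./my_prog 1000000        # the profile-guided binary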


C++ first compile: 39k
C++ profile guided: 55k

Ada first compile: 41k
Ada profile guided: 46k


So GCC 4.7.1 seems to improve the Ada results slightly, but it still falls short of the potential improvement. I am becoming more convinced that nothing in the Ada code itself is causing this; more likely the Ada GCC front end is not using the profiling information as effectively as the C++ front end does.
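
One thing worth checking (an assumption on my part - I have not verified this with GNAT specifically): whether the profile data is actually being consumed for the Ada units at all. Since gcov reads the .gcno/.gcda files the back end writes regardless of the front end, running it with branch statistics on the unit that contains the inner loop, e.g.

$ gcov -b some_unit.adb       # unit name is just an example

after the training run should show non-zero branch counts; if it does not, the profile is never reaching the Ada compilation in the first place.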


Cheers,
Keean.


