comp.lang.ada
 help / color / mirror / Atom feed
* sub-optimal code for packed boolean arrays -- bug or inherent limitation
@ 2007-07-02 19:34 Alinabi
  2007-07-02 20:08 ` Ludovic Brenta
                   ` (6 more replies)
  0 siblings, 7 replies; 29+ messages in thread
From: Alinabi @ 2007-07-02 19:34 UTC (permalink / raw)


Hello, everyone.

I am writing a chess program in Ada, using bitboards for position
representation. Bitboards are just 64 bit integers in which each bit
represents a predicate about a square of the board. For example, you
would have a bitboard called WhitePawns, which has a one for every
square that contains a white pawn, and zero for every other square. In
Ada, there are two possible implementations of a bitboard: using
Unsigned_64, which would closely mirror the way things are done in C,
and using packed arrays of booleans. The C-like version, using
Unsigned_64, leads to some rather obfuscated code, so I decided to
investigate the efficiency of the code that gnat generates for some
basic operations on packed arrays, which I implemented in the
following package:

package Test is

    type Index_T is new Integer range 0..63;

    type Bitboard_T is private;

    procedure Set   (B : in out Bitboard_T; I : in Index_T);
    procedure Clear (B : in out Bitboard_T; I : in Index_T);
    procedure Clear (B : in out Bitboard_T);
    procedure Flip  (B : in out Bitboard_T; I : in Index_T);
    function  Is_Set(B : in     Bitboard_T; I : in Index_T)
	return Boolean ;

private

    type Bitboard_T is array(Index_T) of Boolean;
    pragma Pack(Bitboard_T);

    pragma Inline_Always(Set, Clear, Flip, Is_Set);

end Test;




package body Test is

    pragma Optimize(Time);

    procedure Set(B : in out Bitboard_T; I : in Index_T) is
    begin
	B(I) := True;
    end;

    procedure Clear(B : in out Bitboard_T; I : in Index_T) is
    begin
	B(I) := False;
    end;

    procedure Clear(B : in out Bitboard_T) is
    begin
	B    := B xor B;
    end;

    procedure Flip(B : in out Bitboard_T; I : in Index_T) is
    begin
	B(i) := not B(i);
    end;

    function Is_Set(B : in Bitboard_T; I : in Index_T)
	return Boolean is
    begin
	return B(I);
    end;


end Test;


When compiled with
gnatmake -g -gnatp -gnatn -march=opteron -O3 -fdump-tree-optimized
test.adb
using the ada compiler in gcc 4.1.2, the code generated is generally
optimal, or close to optimal. For example, the code generated for Set
is

procedure Set(B : in out Bitboard_T; I : in Index_T) is
   0:   89 f1                         mov   %esi,%ecx
    begin
        B(I) := True;
   2:   b8 01 00 00 00         mov    $0x1,%eax
   7:   48 d3 e0                   shl      %cl,%rax
   a:   48 09 f8                    or       %rdi,%rax
    end;
   d:   c3                              retq


The only notable exception is procedure Flip, which becomes

procedure Flip(B : in out Bitboard_T; I : in Index_T) is
  30:   89 f1                   mov    %esi,%ecx
    begin
        B(i) := not B(i);
  32:   48 c7 c0 fe ff ff ff    mov    $0xfffffffffffffffe,%rax
  39:   48 d3 c0                   rol    %cl,%rax
  3c:   48 21 f8                   and   %rdi,%rax
  3f:   48 d3 ef                    shr    %cl,%rdi
  42:   83 f7 01                   xor    $0x1,%edi
  45:   83 e7 01                  and    $0x1,%edi
  48:   48 d3 e7                  shl     %cl,%rdi
  4b:   48 09 f8                   or      %rdi,%rax
    end;
  4e:   c3                             retq

instead of the shorter

mov  %esi,  %ecx
mov  0x1,   %rax
shl    %cl,    %rax
xor    % rax, %rdx
retq

I don't know much (if anything) about compiler internals, so I was
wondering if I should file a bug report. Is there some good
theoretical justification for all that extraneous shifts and
rotations? I would think that if the compiler can figure out the best
way to set or clear a bit in a register using shift and logical ops,
than it should also be able to negate it efficiently.




^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: sub-optimal code for packed boolean arrays -- bug or inherent limitation
  2007-07-02 19:34 sub-optimal code for packed boolean arrays -- bug or inherent limitation Alinabi
@ 2007-07-02 20:08 ` Ludovic Brenta
  2007-07-03  1:01 ` Jeffrey R. Carter
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 29+ messages in thread
From: Ludovic Brenta @ 2007-07-02 20:08 UTC (permalink / raw)


Alinabi <alexander.the.average@gmail.com> writes:
> Hello, everyone.

Unfortunately I don't know enough of GCC's optimisers to answer.  I
think you should ask your (interesting) question on the GCC mailing
list, where not only the GNAT developers but the optimiser developers
dwell.  The list is gcc@gcc.gnu.org.

-- 
Ludovic Brenta.



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: sub-optimal code for packed boolean arrays -- bug or inherent limitation
  2007-07-02 19:34 sub-optimal code for packed boolean arrays -- bug or inherent limitation Alinabi
  2007-07-02 20:08 ` Ludovic Brenta
@ 2007-07-03  1:01 ` Jeffrey R. Carter
  2007-07-03  7:22   ` Harald Korneliussen
  2007-07-03  7:59 ` gautier_niouzes
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 29+ messages in thread
From: Jeffrey R. Carter @ 2007-07-03  1:01 UTC (permalink / raw)


Alinabi wrote:
> 
> The only notable exception is procedure Flip, which becomes
> 
> procedure Flip(B : in out Bitboard_T; I : in Index_T) is
>   30:   89 f1                   mov    %esi,%ecx
>     begin
>         B(i) := not B(i);
>   32:   48 c7 c0 fe ff ff ff    mov    $0xfffffffffffffffe,%rax
>   39:   48 d3 c0                   rol    %cl,%rax
>   3c:   48 21 f8                   and   %rdi,%rax
>   3f:   48 d3 ef                    shr    %cl,%rdi
>   42:   83 f7 01                   xor    $0x1,%edi
>   45:   83 e7 01                  and    $0x1,%edi
>   48:   48 d3 e7                  shl     %cl,%rdi
>   4b:   48 09 f8                   or      %rdi,%rax
>     end;
>   4e:   c3                             retq
> 
> instead of the shorter
> 
> mov  %esi,  %ecx
> mov  0x1,   %rax
> shl    %cl,    %rax
> xor    % rax, %rdx
> retq
> 
> I don't know much (if anything) about compiler internals, so I was
> wondering if I should file a bug report. Is there some good
> theoretical justification for all that extraneous shifts and
> rotations? I would think that if the compiler can figure out the best
> way to set or clear a bit in a register using shift and logical ops,
> than it should also be able to negate it efficiently.

The code gives the correct results, so it can't be a compiler bug. 
Sometimes a compiler will do better than hand-coded assembler; sometimes 
it will do worse. I doubt the compiler's code will make your chess 
program fail to meet its timing requirements; I doubt if you'll notice 
the difference at all, so I don't see why you care. If you must have a 
specific sequence of machine code operations, you should use a 
machine-code insertion.

-- 
Jeff Carter
"Strange women lying in ponds distributing swords
is no basis for a system of government."
Monty Python & the Holy Grail
66



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: sub-optimal code for packed boolean arrays -- bug or inherent limitation
  2007-07-03  1:01 ` Jeffrey R. Carter
@ 2007-07-03  7:22   ` Harald Korneliussen
  2007-07-03  8:37     ` Georg Bauhaus
  0 siblings, 1 reply; 29+ messages in thread
From: Harald Korneliussen @ 2007-07-03  7:22 UTC (permalink / raw)


On Jul 3, 3:01 am, "Jeffrey R. Carter"
<spam.jrcarter....@acm.nospam.org> wrote:
> I doubt the compiler's code will make your chess
> program fail to meet its timing requirements; I doubt if you'll notice
> the difference at all, so I don't see why you care. If you must have a
> specific sequence of machine code operations, you should use a
> machine-code insertion.
>
Now now, chess programs written to compete with other chess programs
need all the performance they can get. You could argue that he should
be using C with inline assembly (and probably another compiler than
GCC), but wouldn't it be kind of neat if a competitive and readable
chess program was written in Ada? It would be a feather in the hat for
Ada's low-level features, which is what this is all about.

I would ask the GCC team, after looking at the intermediate code. I'm
sure they're not content with merely having the compiler generate
correct code.




^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: sub-optimal code for packed boolean arrays -- bug or inherent limitation
  2007-07-02 19:34 sub-optimal code for packed boolean arrays -- bug or inherent limitation Alinabi
  2007-07-02 20:08 ` Ludovic Brenta
  2007-07-03  1:01 ` Jeffrey R. Carter
@ 2007-07-03  7:59 ` gautier_niouzes
  2007-07-03  9:25 ` Stefan Lucks
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 29+ messages in thread
From: gautier_niouzes @ 2007-07-03  7:59 UTC (permalink / raw)


...
> The only notable exception is procedure Flip, which becomes
>
> procedure Flip(B : in out Bitboard_T; I : in Index_T) is
>   30:   89 f1                   mov    %esi,%ecx
>     begin
>         B(i) := not B(i);
>   32:   48 c7 c0 fe ff ff ff    mov    $0xfffffffffffffffe,%rax
>   39:   48 d3 c0                   rol    %cl,%rax
>   3c:   48 21 f8                   and   %rdi,%rax
>   3f:   48 d3 ef                    shr    %cl,%rdi
>   42:   83 f7 01                   xor    $0x1,%edi
>   45:   83 e7 01                  and    $0x1,%edi
>   48:   48 d3 e7                  shl     %cl,%rdi
>   4b:   48 09 f8                   or      %rdi,%rax
>     end;
>   4e:   c3                             retq
>
> instead of the shorter
>
> mov  %esi,  %ecx
> mov  0x1,   %rax
> shl    %cl,    %rax
> xor    % rax, %rdx
> retq
>
> I don't know much (if anything) about compiler internals, so I was
> wondering if I should file a bug report. Is there some good
> theoretical justification for all that extraneous shifts and
> rotations? I would think that if the compiler can figure out the best
> way to set or clear a bit in a register using shift and logical ops,
> than it should also be able to negate it efficiently.

Did you time both codes ? Often a much shorter code is not faster at
all.
BTW, you can also do assembler insertions with Ada (although it would
be a pity if you had to).

G.




^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: sub-optimal code for packed boolean arrays -- bug or inherent limitation
  2007-07-03  7:22   ` Harald Korneliussen
@ 2007-07-03  8:37     ` Georg Bauhaus
  0 siblings, 0 replies; 29+ messages in thread
From: Georg Bauhaus @ 2007-07-03  8:37 UTC (permalink / raw)


On Tue, 2007-07-03 at 00:22 -0700, Harald Korneliussen wrote:
> On Jul 3, 3:01 am, "Jeffrey R. Carter"
> <spam.jrcarter....@acm.nospam.org> wrote:
> > I doubt the compiler's code will make your chess
> > program fail to meet its timing requirements; I doubt if you'll notice
> > the difference at all, so I don't see why you care. If you must have a
> > specific sequence of machine code operations, you should use a
> > machine-code insertion.
> >
> Now now, chess programs written to compete with other chess programs
> need all the performance they can get. You could argue that he should
> be using C with inline assembly (and probably another compiler than
> GCC), but wouldn't it be kind of neat if a competitive and readable
> chess program was written in Ada?

Clear Ada is the point of this Ada program as the OP has explained
on the GCC mailing list.


> I would ask the GCC team, after looking at the intermediate code. I'm
> sure they're not content with merely having the compiler generate
> correct code.

The suggestion was different. Still, the problem is identified
as non-optimal code.





^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: sub-optimal code for packed boolean arrays -- bug or inherent limitation
  2007-07-02 19:34 sub-optimal code for packed boolean arrays -- bug or inherent limitation Alinabi
                   ` (2 preceding siblings ...)
  2007-07-03  7:59 ` gautier_niouzes
@ 2007-07-03  9:25 ` Stefan Lucks
  2007-07-03 12:40   ` Stefan Lucks
  2007-07-03 15:42 ` Adam Beneschan
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 29+ messages in thread
From: Stefan Lucks @ 2007-07-03  9:25 UTC (permalink / raw)


As Clear, internally using XOR, is optimised well, have you tried to 
rewrite Flip using XOR for bit-flipping, e.g., as follows:

     procedure Flip(B : in out Bitboard_T; I : in Index_T) is
        One: constant Bitborard := (others => True);
     begin
         B(i) := B(i) xor One;
     end;

(Even though the optimiser appararently could be improved here -- I would 
not consider this a bug, it is more like a missing feature -- this might 
perhaps solve your current problem without having to use inline-assembly.)


-- 
Stefan Lucks      (moved to Bauhaus-University Weimar, Germany)
 		       <Stefan.Lucks at medien.uni-weimar.de>
------  I  love  the  taste  of  Cryptanalysis  in  the  morning!  ------





^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: sub-optimal code for packed boolean arrays -- bug or inherent limitation
  2007-07-03  9:25 ` Stefan Lucks
@ 2007-07-03 12:40   ` Stefan Lucks
  0 siblings, 0 replies; 29+ messages in thread
From: Stefan Lucks @ 2007-07-03 12:40 UTC (permalink / raw)


On Tue, 3 Jul 2007, Stefan Lucks wrote:

> As Clear, internally using XOR, is optimised well, have you tried to rewrite 
> Flip using XOR for bit-flipping, e.g., as follows:

sorry, I meant

     procedure Flip(B : in out Bitboard_T; I : in Index_T) is
     begin
         B(I) := B(I) xor True;
     end;



-- 
Stefan Lucks      (moved to Bauhaus-University Weimar, Germany)
 		       <Stefan.Lucks at medien.uni-weimar.de>
------  I  love  the  taste  of  Cryptanalysis  in  the  morning!  ------





^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: sub-optimal code for packed boolean arrays -- bug or inherent limitation
  2007-07-02 19:34 sub-optimal code for packed boolean arrays -- bug or inherent limitation Alinabi
                   ` (3 preceding siblings ...)
  2007-07-03  9:25 ` Stefan Lucks
@ 2007-07-03 15:42 ` Adam Beneschan
  2007-07-03 18:04 ` Alinabi
  2007-07-08 22:58 ` Robert A Duff
  6 siblings, 0 replies; 29+ messages in thread
From: Adam Beneschan @ 2007-07-03 15:42 UTC (permalink / raw)


On Jul 2, 12:34 pm, Alinabi <alexander.the.aver...@gmail.com> wrote:


>     procedure Flip(B : in out Bitboard_T; I : in Index_T) is
>     begin
>         B(i) := not B(i);
>     end;

Just as an experiment, I wonder what the following does.  I realize
that this code isn't useful.

   procedure Sideways_Flip(B : in out Bitboard_T; I, J : in Index_T)
is
   begin
       B(i) := not B(j);
   end;

If the assembly code looks substantially the same as your example,
that's an indication that your compiler isn't recognizing that the
B(i) on the left and right sides of the assignment are the same.  (It
makes perfect sense to me that the compiler might be able to recognize
a simple variable that's the same on both sides, but not an indexed
variable.)  I don't know that this helps solve the problem, except to
indicate that most likely any other "solution" that has B(i) on both
sides of := isn't going to be any better.

Hmmm... it just occurred to me that if the compiler can figure out to
optimize simple names on both sides of := but not indexed variables,
maybe this will work:

   procedure Flip(B : in out Bitboard_T; I : Index_T) is
      The_Bit : Boolean renames B(I);
   begin
      The_Bit := not The_Bit;
   end;

It's worth a try.  Sometimes doing stuff like this is enough to
confuse a compiler into doing the right thing.

If nothing else works, here's a possible workaround, but be warned
that I haven't tried this:

   procedure Flip(B : in out Bitboard_T; I : in Index_T) is
      type Int_64 is mod 2**64;
      B_Overlay : Int_64;
      for B_Overlay'Address use B'Address;
   begin
      B_Overlay := B_Overlay xor 2**Integer(I);  --or
      B_Overlay := B_Overlay xor 2**Integer(63-I);
   end;

I don't know which of the xor's will work, or if either of them work;
it depends on how the elements of Bitboard_T are ordered.  The above
code is evil, and it's the kind of thing I'd do only out of
desperation---but at least the obfuscation in your program will be
limited to one place.

                   -- Adam





^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: sub-optimal code for packed boolean arrays -- bug or inherent limitation
  2007-07-02 19:34 sub-optimal code for packed boolean arrays -- bug or inherent limitation Alinabi
                   ` (4 preceding siblings ...)
  2007-07-03 15:42 ` Adam Beneschan
@ 2007-07-03 18:04 ` Alinabi
  2007-07-03 18:09   ` Alinabi
                     ` (2 more replies)
  2007-07-08 22:58 ` Robert A Duff
  6 siblings, 3 replies; 29+ messages in thread
From: Alinabi @ 2007-07-03 18:04 UTC (permalink / raw)


First, let me thank all of you for your replies. I feel that I
have to clarify a few things about this pet project of mine.

I used to play chess at a competition level and I always
wanted to write a chess program, but part of the motivation for
starting the project is to learn the "dark corners" of Ada. So,
I am not interested in switching to C.

As I said, the bitboard operations can be implemented in C-like
fashion, using Unsigned_64, masks and shifts, and that's how I
have implemented them so far. This makes the static evaluation
function hard to understand for someone who is not necessarily
an experienced programmer. While I used to be a candidate master
once upon a time, I do not play at that strength anymore and I
could benefit from the advice of a strong player (who might not
be a programmer). For such a person B(i) := not B(i) is much
easier to understand than B := B xor Bitborad_T(Shift_Left(1,
i)).

I asked my question on the gcc mailing list and the answer was
that I should not bother with a bug report. Their point of view,
which I can understand, is that since the code generated is
correct and this is such a niche feature, nobody will be in any
hurry to work on it. They did suggest that I try to fix it
myself, and they even pointed me at the file of interest in the
gcc source. I barely have free time for my chess program, so I
don't think I'll take them up on that any time soon, but if
anybody here wants to give it a try, the relevant file is
exp_pakd.adb

Now, concerning efficiency: these basic operations on bitboards
are used 1e8 to 1e9 times every time the program tries to decide
what to move next. Even the smallest improvement in speed can
mean the difference between searching 10 or 11 ply deep, which
can mean an improvement of over 50 Elo points in strength.

For Stefan Lucks: I tried both

    procedure Flip(B : in out Bitboard_T; I : in Index_T) is
    begin
	B(I) := B(I) xor True;
    end;

and

    procedure Flip(B : in out Bitboard_T; I : in Index_T) is
	One: constant Bitborard := (others => False);
    begin
	One(i) := True;
	B(i)   := B(i) xor One;
    end;

The first one results in the same code as the original (B(i) :=
not B(i)) version, while the second one generates the optimal
code, but if I am to explicitly use masks, I might as well go
with the Unsigned_64 version. The whole point of using packed
arrays is to write easy to read code and let the compiler take
care of the masks and shifts behind the scene.

For Adam Beneschan:

   procedure Flip(B : in out Bitboard_T; I : Index_T) is
      The_Bit : Boolean renames B(I);
   begin
      The_Bit := not The_Bit;
   end;

results in the same code as my initial version.




^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: sub-optimal code for packed boolean arrays -- bug or inherent limitation
  2007-07-03 18:04 ` Alinabi
@ 2007-07-03 18:09   ` Alinabi
  2007-07-03 18:17     ` Alinabi
  2007-07-03 18:36   ` Jeffrey R. Carter
  2007-07-04  9:00   ` Jean-Pierre Rosen
  2 siblings, 1 reply; 29+ messages in thread
From: Alinabi @ 2007-07-03 18:09 UTC (permalink / raw)


Sorry, I meant to say

procedure Flip(B : in out Bitboard_T; I : in Index_T) is
   One: constant Bitborard := (others => False);
begin
   One(i) := True;
   B   := B xor One;
end;

On Jul 3, 2:04 pm, Alinabi <alexander.the.aver...@gmail.com> wrote:
>
> For Stefan Lucks: I tried both
>
>     procedure Flip(B : in out Bitboard_T; I : in Index_T) is
>     begin
>         B(I) := B(I) xor True;
>     end;
>
> and
>
>     procedure Flip(B : in out Bitboard_T; I : in Index_T) is
>         One: constant Bitborard := (others => False);
>     begin
>         One(i) := True;
>         B(i)   := B(i) xor One;
>     end;
>




^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: sub-optimal code for packed boolean arrays -- bug or inherent limitation
  2007-07-03 18:09   ` Alinabi
@ 2007-07-03 18:17     ` Alinabi
  2007-07-10  2:06       ` Randy Brukardt
  0 siblings, 1 reply; 29+ messages in thread
From: Alinabi @ 2007-07-03 18:17 UTC (permalink / raw)


One more thing: some inline assembly is unavoidable. There are two
other bitboard operations that are required: First_One(Bitboard_T)
returns Natural and Last_One(Bitboard_T) returns Natural; They return
the position of the least significant and most significant set bit
respectively. On x86 processors you can do this with one instruction
BSF (bit scan forward) and BSR (bit scan reverse), and you cannot
expect a compiler to generate them.





^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: sub-optimal code for packed boolean arrays -- bug or inherent limitation
  2007-07-03 18:04 ` Alinabi
  2007-07-03 18:09   ` Alinabi
@ 2007-07-03 18:36   ` Jeffrey R. Carter
  2007-07-03 19:42     ` Alinabi
  2007-07-08 22:53     ` Robert A Duff
  2007-07-04  9:00   ` Jean-Pierre Rosen
  2 siblings, 2 replies; 29+ messages in thread
From: Jeffrey R. Carter @ 2007-07-03 18:36 UTC (permalink / raw)


Alinabi wrote:
> 
> Now, concerning efficiency: these basic operations on bitboards
> are used 1e8 to 1e9 times every time the program tries to decide
> what to move next. Even the smallest improvement in speed can
> mean the difference between searching 10 or 11 ply deep, which
> can mean an improvement of over 50 Elo points in strength.

I'm sure that's true. However, you have not demonstrated that there's a 
speed difference between the 2 versions. Even if there is, it doesn't 
necessarily mean that there will be a difference in the # of plys that 
can be searched. The old rule, "1st get it right, then make it fast," 
still applies. Once you have it finished you can easily see if modifying 
this single procedure actually makes a difference. Until then, you're 
wasting a lot of effort on something you don't know is important.

You should also be aware that you're sending the compiler conflicting 
messages. Packing the array indicates that you want the compiler to 
minimize storage, even at the expense of speed, while "pragma Optimize 
(Time);" indicates that you want to minimize time, even at the expense 
of storage. Obtaining speed often requires wasting storage; the fastest 
implementation might be to not pack the array. I don't know how many of 
these you have, but with the large memories on modern machines, you 
might be better off with the additional 56 bytes per array. Again, you 
won't know for sure until you have a working version you can measure.

-- 
Jeff Carter
"Gentlemen, you can't fight in here. This is the War Room!"
Dr. Strangelove
30



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: sub-optimal code for packed boolean arrays -- bug or inherent limitation
  2007-07-03 18:36   ` Jeffrey R. Carter
@ 2007-07-03 19:42     ` Alinabi
  2007-07-04  1:12       ` Jeffrey R. Carter
  2007-07-04  3:22       ` Steve
  2007-07-08 22:53     ` Robert A Duff
  1 sibling, 2 replies; 29+ messages in thread
From: Alinabi @ 2007-07-03 19:42 UTC (permalink / raw)


The bitboard representation does not to save space. In fact it is
quite wasteful, which is why it is not used in embedded devices, like
cell phones. If you wonder how can this save time, here is an example:
suppose that you have a white pawn on c4 and you want to determine if
it is a passed pawn. That means that the pawn can no longer be blocked
or attacked by a black pawn, and therefore it threatens to advance and
get promoted. You would have to check if there are any black pawns at
b5, b6, b7, c5, c6, c7, d5, d6, d7. In the bitboard representation,
you would have a bitboard called BlackPawns which has a one for every
square occupied by a black pawn and than you would keep an array of
bitboard masks BlackBlockers : array(a1..h8) of Bitboard, such that,
for example, BlackBlockers(c4) has 1's at b5, b6, b7, c5, c6, c7, d5,
d6, d7, and 0's everywhere else. Then the white pawn at c4 is passed
if BlackPawns and BlackBlockers(c4) = 0, which is two machine
instruction, one for indexing and one for the logical operator.

As for the proof that the hand coded version is faster, here it is:

Hand coded: 2 MOV, 1 SHL, 1 XOR
Compiler:   2 MOV, 1 SHL, 1 XOR, 1 SHR, 1 ROL, 2 AND, 1 OR
---------------------
5 extraneous instructions, Q.E.D.

As I mentioned in a previous post, I already implemented all these
bitboard operations the usual way, using Unsigned_64, shifts, masks
and even some inline assembly. I am not opposed to that. I was just
wondering if I could use packed arrays of booleans to make the code
more legible and delegate the low level work to the compiler. I did
not expect the compiler to perform as well as it did. Since it only
failed far off the mark in one case, I was wondering if that would be
considered a bug. That question was answered in the negative. You are
getting way to defensive about this. I don't have a gnat bashing
agenda.

On Jul 3, 2:36 pm, "Jeffrey R. Carter"
<spam.jrcarter....@acm.nospam.org> wrote:
> Alinabi wrote:
>
> > Now, concerning efficiency: these basic operations on bitboards
> > are used 1e8 to 1e9 times every time the program tries to decide
> > what to move next. Even the smallest improvement in speed can
> > mean the difference between searching 10 or 11 ply deep, which
> > can mean an improvement of over 50 Elo points in strength.
>
> I'm sure that's true. However, you have not demonstrated that there's a
> speed difference between the 2 versions. Even if there is, it doesn't
> necessarily mean that there will be a difference in the # of plys that
> can be searched. The old rule, "1st get it right, then make it fast,"
> still applies. Once you have it finished you can easily see if modifying
> this single procedure actually makes a difference. Until then, you're
> wasting a lot of effort on something you don't know is important.
>
> You should also be aware that you're sending the compiler conflicting
> messages. Packing the array indicates that you want the compiler to
> minimize storage, even at the expense of speed, while "pragma Optimize
> (Time);" indicates that you want to minimize time, even at the expense
> of storage. Obtaining speed often requires wasting storage; the fastest
> implementation might be to not pack the array. I don't know how many of
> these you have, but with the large memories on modern machines, you
> might be better off with the additional 56 bytes per array. Again, you
> won't know for sure until you have a working version you can measure.
>
> --
> Jeff Carter
> "Gentlemen, you can't fight in here. This is the War Room!"
> Dr. Strangelove
> 30





^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: sub-optimal code for packed boolean arrays -- bug or inherent limitation
  2007-07-03 19:42     ` Alinabi
@ 2007-07-04  1:12       ` Jeffrey R. Carter
  2007-07-04 10:15         ` Jeffrey Creem
  2007-07-04  3:22       ` Steve
  1 sibling, 1 reply; 29+ messages in thread
From: Jeffrey R. Carter @ 2007-07-04  1:12 UTC (permalink / raw)


Alinabi wrote:
> The bitboard representation does not to save space. In fact it is
> quite wasteful, which is why it is not used in embedded devices, like
> cell phones. If you wonder how can this save time, here is an example:
> suppose that you have a white pawn on c4 and you want to determine if
> it is a passed pawn. That means that the pawn can no longer be blocked
> or attacked by a black pawn, and therefore it threatens to advance and
> get promoted. You would have to check if there are any black pawns at
> b5, b6, b7, c5, c6, c7, d5, d6, d7. In the bitboard representation,
> you would have a bitboard called BlackPawns which has a one for every
> square occupied by a black pawn and than you would keep an array of
> bitboard masks BlackBlockers : array(a1..h8) of Bitboard, such that,
> for example, BlackBlockers(c4) has 1's at b5, b6, b7, c5, c6, c7, d5,
> d6, d7, and 0's everywhere else. Then the white pawn at c4 is passed
> if BlackPawns and BlackBlockers(c4) = 0, which is two machine
> instruction, one for indexing and one for the logical operator.

Zero is not an array of Boolean. You'd have to compare against (others 
=> False). That the compiler recognizes this as 0 might be something 
else you'll want to check.

> As for the proof that the hand coded version is faster, here it is:
> 
> Hand coded: 2 MOV, 1 SHL, 1 XOR
> Compiler:   2 MOV, 1 SHL, 1 XOR, 1 SHR, 1 ROL, 2 AND, 1 OR
> ---------------------
> 5 extraneous instructions, Q.E.D.

This demonstrates only that the hand-coded version has fewer 
instructions. To prove that it's faster, you have to time them both.

However, granting that the hand-coded version is faster, you haven't 
demonstrated that it makes an actual difference.

> As I mentioned in a previous post, I already implemented all these
> bitboard operations the usual way, using Unsigned_64, shifts, masks
> and even some inline assembly. I am not opposed to that. I was just
> wondering if I could use packed arrays of booleans to make the code
> more legible and delegate the low level work to the compiler. I did
> not expect the compiler to perform as well as it did. Since it only
> failed far off the mark in one case, I was wondering if that would be
> considered a bug. That question was answered in the negative. You are
> getting way to defensive about this. I don't have a gnat bashing
> agenda.

My reaction is due to an allergy to premature optimization, the root of 
all evil.

-- 
Jeff Carter
"Gentlemen, you can't fight in here. This is the War Room!"
Dr. Strangelove
30



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: sub-optimal code for packed boolean arrays -- bug or inherent limitation
  2007-07-03 19:42     ` Alinabi
  2007-07-04  1:12       ` Jeffrey R. Carter
@ 2007-07-04  3:22       ` Steve
  2007-07-04  6:31         ` Harald Korneliussen
  1 sibling, 1 reply; 29+ messages in thread
From: Steve @ 2007-07-04  3:22 UTC (permalink / raw)


"Alinabi" <alexander.the.average@gmail.com> wrote in message 
news:1183491750.177186.154490@k29g2000hsd.googlegroups.com...
[snip]
>
> As for the proof that the hand coded version is faster, here it is:
>
> Hand coded: 2 MOV, 1 SHL, 1 XOR
> Compiler:   2 MOV, 1 SHL, 1 XOR, 1 SHR, 1 ROL, 2 AND, 1 OR
> ---------------------
> 5 extraneous instructions, Q.E.D.
>

Twenty years ago I might ave agreed with this logic, but not today.

In the good ole' days you could associate an amount of time with each 
instructions, add them up and get a total amount of execution time.  This 
hasn't been possible for a long time.  Ever heard of "instruction 
scheduling" and "concurrent execution"?

Todays CPU's contain "pipelines" that sometimes merge several operations 
into a single clock.  In some cases you will find that NOP's are added to 
increase the speed of execution based on a detailed knowledge of the 
underlying processor.

I agree with Jeff's assesment.  You must benchmark to measure and compare 
performance.

I work frequently with time intensive code where simulating more 
permutations translate to more value recovered.  These small differences in 
code generation seldom have a significant impact on the overall performance. 
Greater benefits are found by changes to algorithms or approaches to a 
problem.

Regards,
Steve
(The Duck)





^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: sub-optimal code for packed boolean arrays -- bug or inherent limitation
  2007-07-04  3:22       ` Steve
@ 2007-07-04  6:31         ` Harald Korneliussen
  0 siblings, 0 replies; 29+ messages in thread
From: Harald Korneliussen @ 2007-07-04  6:31 UTC (permalink / raw)


On Jul 4, 5:22 am, "Steve" <nospam_steve...@comcast.net> wrote:

> I work frequently with time intensive code where simulating more
> permutations translate to more value recovered.  These small differences in
> code generation seldom have a significant impact on the overall performance.
> Greater benefits are found by changes to algorithms or approaches to a
> problem.
>

If Alinabi went looking for a better chess playing algorithm, he'd
have to be either a seriously good SC researcher, or wasting his time.
It's well-trodden ground. Arguably, the opitimization isn't premature
either, since it seems this is very much what writing a competitive
chess program is about.




^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: sub-optimal code for packed boolean arrays -- bug or inherent limitation
  2007-07-03 18:04 ` Alinabi
  2007-07-03 18:09   ` Alinabi
  2007-07-03 18:36   ` Jeffrey R. Carter
@ 2007-07-04  9:00   ` Jean-Pierre Rosen
  2007-07-04 18:27     ` tmoran
  2007-07-04 18:51     ` tmoran
  2 siblings, 2 replies; 29+ messages in thread
From: Jean-Pierre Rosen @ 2007-07-04  9:00 UTC (permalink / raw)


Alinabi a �crit :
> While I used to be a candidate master
> once upon a time, I do not play at that strength anymore and I
> could benefit from the advice of a strong player (who might not
> be a programmer). For such a person B(i) := not B(i) is much
> easier to understand than B := B xor Bitborad_T(Shift_Left(1,
> i)).

Hmmm... that person will look at "Flip (B(I))", not at the details of 
the Flip function. Therefore, you should use whatever implementation is 
most efficient. But do not forget to make measurements, efficiency 
issues are often... surprising to say the least.
-- 
---------------------------------------------------------
            J-P. Rosen (rosen@adalog.fr)
Visit Adalog's web site at http://www.adalog.fr



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: sub-optimal code for packed boolean arrays -- bug or inherent limitation
  2007-07-04  1:12       ` Jeffrey R. Carter
@ 2007-07-04 10:15         ` Jeffrey Creem
  2007-07-04 18:28           ` Jeffrey R. Carter
  0 siblings, 1 reply; 29+ messages in thread
From: Jeffrey Creem @ 2007-07-04 10:15 UTC (permalink / raw)


Jeffrey R. Carter wrote:

> My reaction is due to an allergy to premature optimization, the root of 
> all evil.
> 

"Premature Optimization" is one of those pieces of conventional wisdom 
that gets repeated often but which is in fact not the problem at 
all..Not really.

In any real-world problem, you rarely get to spend 6 months making it 
faster because you did such a poor job up front in designing the problem.

Often, by the time one has created a design mess that is slow 
everywhere, spending the short time optimizing the scary loops is not 
enough because the design itself is broken.

The real problem is not so much premature optimization but crazy 
micro-optimizations done early that also hurt maintainabilty/clarity of 
the code and often do little if anything to actually make the code faster.

Spending time up front thinking about the overall design and even 
worrying a little about specific performance details when not taken to 
the extreme and when done with the benefit of metrics backed experience 
(v.s. I heard one time that compiler X was slow on this) is actually not 
all that bad.

This is of course especially true when one considers that by the end of 
the project when you go to start profiling the code to find the hotspots 
and find out that your lousy tool set does not support profiling with 
programs with tasks, or does not support profiling of non-trivial 
programs at all.



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: sub-optimal code for packed boolean arrays -- bug or inherent limitation
  2007-07-04  9:00   ` Jean-Pierre Rosen
@ 2007-07-04 18:27     ` tmoran
  2007-07-04 19:16       ` Pascal Obry
  2007-07-05  4:53       ` Jeffrey R. Carter
  2007-07-04 18:51     ` tmoran
  1 sibling, 2 replies; 29+ messages in thread
From: tmoran @ 2007-07-04 18:27 UTC (permalink / raw)


Using -O2 with the old gnat 3.15p on a 3GHz dual core Pentium (but only
programming to use one core) under W2k, I find that the versions with
        B(I) := not B(I);
and
        B(I) := B(I) xor True;
take about 14 nanoseconds, while the version with
        One(i) := True;
        B   := B xor One;
takes about 6.  With 1e9 calls, that's an 8 second difference.



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: sub-optimal code for packed boolean arrays -- bug or inherent limitation
  2007-07-04 10:15         ` Jeffrey Creem
@ 2007-07-04 18:28           ` Jeffrey R. Carter
  0 siblings, 0 replies; 29+ messages in thread
From: Jeffrey R. Carter @ 2007-07-04 18:28 UTC (permalink / raw)


Jeffrey Creem wrote:
> 
> In any real-world problem, you rarely get to spend 6 months making it 
> faster because you did such a poor job up front in designing the problem.

The rule is, "1st get it right, then make it fast." If you did a poor 
job up front, you didn't get it right 1st. And usually no amount of 
small, local optimization such as the OP is discussing will help you.

> The real problem is not so much premature optimization but crazy 
> micro-optimizations done early that also hurt maintainabilty/clarity of 
> the code and often do little if anything to actually make the code faster.

That's pretty much the definition of premature optimization: extra 
(sometimes substantial) effort spent on obfuscation "for speed", done 
early, without any measurements to indicate that it will help.

-- 
Jeff Carter
"Hello! Smelly English K...niggets."
Monty Python & the Holy Grail
08



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: sub-optimal code for packed boolean arrays -- bug or inherent limitation
  2007-07-04  9:00   ` Jean-Pierre Rosen
  2007-07-04 18:27     ` tmoran
@ 2007-07-04 18:51     ` tmoran
  1 sibling, 0 replies; 29+ messages in thread
From: tmoran @ 2007-07-04 18:51 UTC (permalink / raw)


> Using -O2 with the old gnat 3.15p on a 3GHz dual core Pentium (but only
Sorry, that was -O3



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: sub-optimal code for packed boolean arrays -- bug or inherent limitation
  2007-07-04 18:27     ` tmoran
@ 2007-07-04 19:16       ` Pascal Obry
  2007-07-05  1:45         ` tmoran
  2007-07-05  4:53       ` Jeffrey R. Carter
  1 sibling, 1 reply; 29+ messages in thread
From: Pascal Obry @ 2007-07-04 19:16 UTC (permalink / raw)
  To: tmoran

tmoran@acm.org a �crit :
> Using -O2 with the old gnat 3.15p on a 3GHz dual core Pentium (but only
> programming to use one core) under W2k, I find that the versions with
>         B(I) := not B(I);
> and
>         B(I) := B(I) xor True;
> take about 14 nanoseconds, while the version with
>         One(i) := True;
>         B   := B xor One;
> takes about 6.  With 1e9 calls, that's an 8 second difference.

That's not a proper timing. 3.15p is using a very old backend (2.x if my
memory is good). Since then there have been two major versions of the
GCC backend, both bringing better optimizations. It would be better to
do the measurement with GCC/FSF 4.2 or GNAT GPL 2007.

Pascal.

-- 

--|------------------------------------------------------
--| Pascal Obry                           Team-Ada Member
--| 45, rue Gabriel Peri - 78114 Magny Les Hameaux FRANCE
--|------------------------------------------------------
--|              http://www.obry.net
--| "The best way to travel is by means of imagination"
--|
--| gpg --keyserver wwwkeys.pgp.net --recv-key C1082595



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: sub-optimal code for packed boolean arrays -- bug or inherent limitation
  2007-07-04 19:16       ` Pascal Obry
@ 2007-07-05  1:45         ` tmoran
  0 siblings, 0 replies; 29+ messages in thread
From: tmoran @ 2007-07-05  1:45 UTC (permalink / raw)



> That's not a proper timing. 3.15p is using a very old backend (2.x if my
> memory is good). Since then there have been two major versions of the
> GCC backend, both bringing better optimizations. It would be better to
> do the measurement with GCC/FSF 4.2 or GNAT GPL 2007.
  Gnat 3.15p was handy.  Please post your results on timing the different
codes with a different Ada compiler.
  Since my machine does have two cores, I tried it with a second task
running the benchmark subroutine, ie two copies of the benchmark were
running simultaneously.  Elapsed time till they were both done was only
20% longer than running a single copy.  So it might be worth looking into
a multi-tasking design of the chess program.



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: sub-optimal code for packed boolean arrays -- bug or inherent limitation
  2007-07-04 18:27     ` tmoran
  2007-07-04 19:16       ` Pascal Obry
@ 2007-07-05  4:53       ` Jeffrey R. Carter
  1 sibling, 0 replies; 29+ messages in thread
From: Jeffrey R. Carter @ 2007-07-05  4:53 UTC (permalink / raw)


tmoran@acm.org wrote:
> Using -O2 with the old gnat 3.15p on a 3GHz dual core Pentium (but only
> programming to use one core) under W2k, I find that the versions with
>         B(I) := not B(I);
> and
>         B(I) := B(I) xor True;
> take about 14 nanoseconds, while the version with
>         One(i) := True;
>         B   := B xor One;
> takes about 6.  With 1e9 calls, that's an 8 second difference.

Yes. But at 6 mins/move, is 8 secs enough to search another ply? That's 
the important time. Since the OP said it might be the difference between 
searching 10 and 11 plys, that implies 36 secs/ply, so the answer would 
seem to be no.

Actually, Flip is only 1 of the 5 basic operations, and it's all the 
basic operations that are called 1e8 to 1e9 times per move, so the 
savings are more like 1.6 secs.

A real improvement would be to parallelize the algorithm and run it on a 
multiprocessor.

-- 
Jeff Carter
"Hello! Smelly English K...niggets."
Monty Python & the Holy Grail
08



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: sub-optimal code for packed boolean arrays -- bug or inherent limitation
  2007-07-03 18:36   ` Jeffrey R. Carter
  2007-07-03 19:42     ` Alinabi
@ 2007-07-08 22:53     ` Robert A Duff
  2007-07-09  6:09       ` tmoran
  1 sibling, 1 reply; 29+ messages in thread
From: Robert A Duff @ 2007-07-08 22:53 UTC (permalink / raw)


"Jeffrey R. Carter" <spam.jrcarter.not@acm.nospam.org> writes:

> You should also be aware that you're sending the compiler conflicting
> messages. Packing the array indicates that you want the compiler to
> minimize storage, even at the expense of speed,...

Well, sort of.  Pack means to minimize storage even at the (possible)
expense of speed of accessing the components.  Speed overall can be
improved by packing.  For example, speed of copying and comparing the
whole packed object, and passing it around as parameters, will typically
be improved by packing.

And of course using less memory will typically improve cache and paging
performance.

So on a desktop computer with "plenty" of memory, packing is mainly used
to improve speed -- there's no "time/space" tradeoff.  A small embedded
system might have such tradeoffs.

>... while "pragma Optimize
> (Time);" indicates that you want to minimize time, even at the expense
> of storage. Obtaining speed often requires wasting storage; the fastest
                              ^^^^^

I'd say, "sometimes", nor "often".  Usually, wasting storage means
wasting time.

> implementation might be to not pack the array. I don't know how many of
> these you have, but with the large memories on modern machines, you
> might be better off with the additional 56 bytes per array. Again, you
> won't know for sure until you have a working version you can measure.

Measurement is a Good Thing.

- Bob



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: sub-optimal code for packed boolean arrays -- bug or inherent limitation
  2007-07-02 19:34 sub-optimal code for packed boolean arrays -- bug or inherent limitation Alinabi
                   ` (5 preceding siblings ...)
  2007-07-03 18:04 ` Alinabi
@ 2007-07-08 22:58 ` Robert A Duff
  6 siblings, 0 replies; 29+ messages in thread
From: Robert A Duff @ 2007-07-08 22:58 UTC (permalink / raw)


Alinabi <alexander.the.average@gmail.com> writes:

> I don't know much (if anything) about compiler internals, so I was
> wondering if I should file a bug report.

Well, it's not quite a bug, but there's no harm in sending an
"enhancement request" to AdaCore.  We might do something about it.

I would like to see operations on packed arrays of Booleans be as
efficient as similar operations on modular/unsigned integers.

>... Is there some good
> theoretical justification for all that extraneous shifts and
> rotations? I would think that if the compiler can figure out the best
> way to set or clear a bit in a register using shift and logical ops,
> than it should also be able to negate it efficiently.

Yes, it could.

- Bob



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: sub-optimal code for packed boolean arrays -- bug or inherent limitation
  2007-07-08 22:53     ` Robert A Duff
@ 2007-07-09  6:09       ` tmoran
  0 siblings, 0 replies; 29+ messages in thread
From: tmoran @ 2007-07-09  6:09 UTC (permalink / raw)


> Well, sort of.  Pack means to minimize storage even at the (possible)
> expense of speed of accessing the components.  Speed overall can be
> improved by packing.  For example, speed of copying and comparing the
> ...
> Measurement is a Good Thing.
  In my benchmark runs, the
        B(I) := not B(I);
and
        B(I) := B(I) xor True;
versions went from 10 down to 7 nanoseconds when I removed the pragma
pack, but the version with
        One(i) := True;
        B   := B xor One;
went from 5 to 700 nanoseconds.



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: sub-optimal code for packed boolean arrays -- bug or inherent limitation
  2007-07-03 18:17     ` Alinabi
@ 2007-07-10  2:06       ` Randy Brukardt
  0 siblings, 0 replies; 29+ messages in thread
From: Randy Brukardt @ 2007-07-10  2:06 UTC (permalink / raw)


"Alinabi" <alexander.the.average@gmail.com> wrote in message
news:1183486662.806585.71680@q69g2000hsb.googlegroups.com...
> One more thing: some inline assembly is unavoidable. There are two
> other bitboard operations that are required: First_One(Bitboard_T)
> returns Natural and Last_One(Bitboard_T) returns Natural; They return
> the position of the least significant and most significant set bit
> respectively. On x86 processors you can do this with one instruction
> BSF (bit scan forward) and BSR (bit scan reverse), and you cannot
> expect a compiler to generate them.

There's a reason for that: Intel recommended to compiler writers to not use
"complex" instructions, as sets of simple instructions are often supposed to
be faster. The "complex" instructions were implemented essentially as
"macros" of simple instructions. (This is information is somewhat old, so it
may vary on the most recent processors.)

My point is that it might not actually be faster to use those instructions
than to use a loop of your own design (and there is a small chance that
they'd be slower). As suggested elsewhere, you need to test that in a
suitable benchmark. The number of instructions has had no real bearing on
the execution time since the 486 came out (and indeed, given that some
instructions are much slower than others, it never had much bearing on
Intel's x86 processors).

                                    Randy.





^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2007-07-10  2:06 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-07-02 19:34 sub-optimal code for packed boolean arrays -- bug or inherent limitation Alinabi
2007-07-02 20:08 ` Ludovic Brenta
2007-07-03  1:01 ` Jeffrey R. Carter
2007-07-03  7:22   ` Harald Korneliussen
2007-07-03  8:37     ` Georg Bauhaus
2007-07-03  7:59 ` gautier_niouzes
2007-07-03  9:25 ` Stefan Lucks
2007-07-03 12:40   ` Stefan Lucks
2007-07-03 15:42 ` Adam Beneschan
2007-07-03 18:04 ` Alinabi
2007-07-03 18:09   ` Alinabi
2007-07-03 18:17     ` Alinabi
2007-07-10  2:06       ` Randy Brukardt
2007-07-03 18:36   ` Jeffrey R. Carter
2007-07-03 19:42     ` Alinabi
2007-07-04  1:12       ` Jeffrey R. Carter
2007-07-04 10:15         ` Jeffrey Creem
2007-07-04 18:28           ` Jeffrey R. Carter
2007-07-04  3:22       ` Steve
2007-07-04  6:31         ` Harald Korneliussen
2007-07-08 22:53     ` Robert A Duff
2007-07-09  6:09       ` tmoran
2007-07-04  9:00   ` Jean-Pierre Rosen
2007-07-04 18:27     ` tmoran
2007-07-04 19:16       ` Pascal Obry
2007-07-05  1:45         ` tmoran
2007-07-05  4:53       ` Jeffrey R. Carter
2007-07-04 18:51     ` tmoran
2007-07-08 22:58 ` Robert A Duff

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox