From mboxrd@z Thu Jan 1 00:00:00 1970 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me X-Spam-Level: X-Spam-Status: No, score=-0.9 required=5.0 tests=BAYES_00,FORGED_GMAIL_RCVD, FREEMAIL_FROM autolearn=no autolearn_force=no version=3.4.4 X-Google-Thread: 103376,5ff6e0c3de8331c0,start X-Google-Attributes: gid103376,public,usenet X-Google-Language: ENGLISH,ASCII-7-bit Path: g2news1.google.com!postnews.google.com!q69g2000hsb.googlegroups.com!not-for-mail From: Alinabi Newsgroups: comp.lang.ada Subject: sub-optimal code for packed boolean arrays -- bug or inherent limitation Date: Mon, 02 Jul 2007 19:34:16 -0000 Organization: http://groups.google.com Message-ID: <1183404856.375083.160890@q69g2000hsb.googlegroups.com> NNTP-Posting-Host: 75.75.8.72 Mime-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" X-Trace: posting.google.com 1183404856 31874 127.0.0.1 (2 Jul 2007 19:34:16 GMT) X-Complaints-To: groups-abuse@google.com NNTP-Posting-Date: Mon, 2 Jul 2007 19:34:16 +0000 (UTC) User-Agent: G2/1.0 X-HTTP-UserAgent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.8.1.4) Gecko/20061201 Firefox/2.0.0.4 (Ubuntu-feisty),gzip(gfe),gzip(gfe) Complaints-To: groups-abuse@google.com Injection-Info: q69g2000hsb.googlegroups.com; posting-host=75.75.8.72; posting-account=gD74RA0AAABm9rsBG7oeOmJ-iO5c3KUQ Xref: g2news1.google.com comp.lang.ada:16387 Date: 2007-07-02T19:34:16+00:00 List-Id: Hello, everyone. I am writing a chess program in Ada, using bitboards for position representation. Bitboards are just 64 bit integers in which each bit represents a predicate about a square of the board. For example, you would have a bitboard called WhitePawns, which has a one for every square that contains a white pawn, and zero for every other square. In Ada, there are two possible implementations of a bitboard: using Unsigned_64, which would closely mirror the way things are done in C, and using packed arrays of booleans. The C-like version, using Unsigned_64, leads to some rather obfuscated code, so I decided to investigate the efficiency of the code that gnat generates for some basic operations on packed arrays, which I implemented in the following package: package Test is type Index_T is new Integer range 0..63; type Bitboard_T is private; procedure Set (B : in out Bitboard_T; I : in Index_T); procedure Clear (B : in out Bitboard_T; I : in Index_T); procedure Clear (B : in out Bitboard_T); procedure Flip (B : in out Bitboard_T; I : in Index_T); function Is_Set(B : in Bitboard_T; I : in Index_T) return Boolean ; private type Bitboard_T is array(Index_T) of Boolean; pragma Pack(Bitboard_T); pragma Inline_Always(Set, Clear, Flip, Is_Set); end Test; package body Test is pragma Optimize(Time); procedure Set(B : in out Bitboard_T; I : in Index_T) is begin B(I) := True; end; procedure Clear(B : in out Bitboard_T; I : in Index_T) is begin B(I) := False; end; procedure Clear(B : in out Bitboard_T) is begin B := B xor B; end; procedure Flip(B : in out Bitboard_T; I : in Index_T) is begin B(i) := not B(i); end; function Is_Set(B : in Bitboard_T; I : in Index_T) return Boolean is begin return B(I); end; end Test; When compiled with gnatmake -g -gnatp -gnatn -march=opteron -O3 -fdump-tree-optimized test.adb using the ada compiler in gcc 4.1.2, the code generated is generally optimal, or close to optimal. For example, the code generated for Set is procedure Set(B : in out Bitboard_T; I : in Index_T) is 0: 89 f1 mov %esi,%ecx begin B(I) := True; 2: b8 01 00 00 00 mov $0x1,%eax 7: 48 d3 e0 shl %cl,%rax a: 48 09 f8 or %rdi,%rax end; d: c3 retq The only notable exception is procedure Flip, which becomes procedure Flip(B : in out Bitboard_T; I : in Index_T) is 30: 89 f1 mov %esi,%ecx begin B(i) := not B(i); 32: 48 c7 c0 fe ff ff ff mov $0xfffffffffffffffe,%rax 39: 48 d3 c0 rol %cl,%rax 3c: 48 21 f8 and %rdi,%rax 3f: 48 d3 ef shr %cl,%rdi 42: 83 f7 01 xor $0x1,%edi 45: 83 e7 01 and $0x1,%edi 48: 48 d3 e7 shl %cl,%rdi 4b: 48 09 f8 or %rdi,%rax end; 4e: c3 retq instead of the shorter mov %esi, %ecx mov 0x1, %rax shl %cl, %rax xor % rax, %rdx retq I don't know much (if anything) about compiler internals, so I was wondering if I should file a bug report. Is there some good theoretical justification for all that extraneous shifts and rotations? I would think that if the compiler can figure out the best way to set or clear a bit in a register using shift and logical ops, than it should also be able to negate it efficiently.