From: "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de>
Subject: Re: GNAT 4.8 atomic access to 64-bit objects
Date: Sat, 16 Nov 2013 13:02:11 +0100
Message-ID: <1k2fx07tfbs4o$.1xxik0ffw6ud6.dlg@40tude.net>
In-Reply-To: 52874401$0$9514$9b4e6d93@newsspool1.arcor-online.net
On Sat, 16 Nov 2013 11:08:00 +0100, Georg Bauhaus wrote:
> On 15.11.13 22:33, Dmitry A. Kazakov wrote:
>
>> Try this:
>>
>> with Interfaces;
>> with Ada.Unchecked_Conversion;
>> with Ada.Text_IO;
>>
>> procedure Test is
>>    type T is mod 2**64;
>>    type Atomic_T is new Interfaces.IEEE_Float_64;
>>    ...
>> end Test;
>>
>> The code generated looks horrific.
>
> Maybe according to
> http://stackoverflow.com/questions/15843159/are-32-bit-software-builds-typically-64-bit-optimized
> simply wanting movq is not "mode compatible"; however,
> if there are MMX registers in the CPU you are targeting,
> the following may be a valid way to get movq nevertheless,
> albeit using a 64 bit signed integer.
> The program was translated in 32 bit GNU/Linux, using GNAT GPL 2012.
> It uses compiler intrinsics in ways adapted from GNAT.SSE.
>
> with Ada.Text_IO;
> with GNAT.SSE;
>
> procedure Atoms is
>    use GNAT.SSE;
>
>    type m64 is array (0 .. 0) of Integer64;
>    for m64'Alignment use 8;
>    pragma Machine_Attribute (m64, "vector_type");
>    pragma Machine_Attribute (m64, "may_alias");
>
>    function ia32_psllq (Left : m64; Right : m64) return m64;
>    pragma Import (Intrinsic, ia32_psllq, "__builtin_ia32_psllq");
>
>    X : Integer64;
>    F : m64;
>    for X'Address use F'Address;
> begin
>    X := 123;
>    F := ia32_psllq (F, m64'(0 => 1));
>    Ada.Text_IO.Put_Line (Integer64'Image (X)); -- 246
> end Atoms;

With the -mmmx switch, it does indeed use movq to load the register.
In the test example I wrote, an atomic load becomes:

   movq
   psllq
   movq to another location (through Unchecked_Conversion)

An atomic store is the reverse.

Surprisingly (at least to me), this is about ten times faster than the
floating-point trick. That is, the sequence

   Load + Increment + Store

takes 16 ns using psllq versus 168 ns using IEEE 64, on an i7-2700K at
3.5 GHz.

It would be nice to get rid of psllq, which is a waste.
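
For anyone wanting to reproduce this kind of comparison, a timing harness
could look roughly like the sketch below. It is not the test program
referred to above: the names Time_Loop, Counter, Shared and Iterations are
made up for illustration, and the loop body is a plain volatile
read-modify-write standing in for whichever Load + Increment + Store
variant (movq/psllq or the IEEE 64 trick) is being measured. On a 32-bit
target the placeholder itself is not an atomic 64-bit access.

```ada
with Ada.Real_Time; use Ada.Real_Time;
with Ada.Text_IO;

procedure Time_Loop is
   --  Hypothetical 64-bit counter type; not taken from the original test
   type Counter is mod 2**64;

   Iterations : constant := 10_000_000;

   --  Volatile keeps the compiler from collapsing the loop into one add.
   --  This is only a placeholder for the atomic load/store under test.
   Shared : Counter := 0;
   pragma Volatile (Shared);

   Start   : Time;
   Elapsed : Time_Span;
begin
   Start := Clock;
   for I in 1 .. Iterations loop
      --  Replace with: atomic load, increment, atomic store
      Shared := Shared + 1;
   end loop;
   Elapsed := Clock - Start;

   --  Average cost of one Load + Increment + Store, in nanoseconds
   Ada.Text_IO.Put_Line
     ("ns per iteration:"
      & Float'Image (Float (To_Duration (Elapsed)) * 1.0E9
                     / Float (Iterations)));
end Time_Loop;
```

The figure printed includes loop overhead, so it is best used to compare
two variants built with the same switches rather than as an absolute cost.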
--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de