From: Markus Schöpflin
Newsgroups: comp.lang.ada
Subject: Re: GNAT and register allocation
Date: Fri, 20 Apr 2012 15:48:06 +0200

On 20.04.2012 13:17, Georg Bauhaus wrote:
> On 20.04.12 12:21, Georg Bauhaus wrote:
>> Hi,
>>
>> in short, is there anything that one can do to make
>> GNAT use many registers? MMX registers in particular,
>> but not only.
>
> SSE, that is, darn, the registers being named xmmN.
>
>> It appears that the number of registers allocated increases
>> with inline expansion enabled. But only so much.
>
> In the (meaningless) example below, the translations of both
> Comp.A and Comp.B will use the same set of registers, xmm0 .. xmm1.
> If they were instead using xmm0 .. xmm1 and xmm2 .. xmm3,
> respectively, speed might well double.
[snip code]

As long as the functions are not inlined, I'd expect them to obey the
calling convention of the ABI of the platform on which they are
compiled. Therefore you won't see A using xmm0/1 and B using xmm2/3.

If they are inlined, however, things look quite different. Here is what
I get when I compile your code:

---%<---
> gnatmake -g -O3 comp
gcc-4.4 -c -g -O3 comp.adb
> objdump -d comp.o

comp.o:     file format elf64-x86-64

Disassembly of section .text:
...
0000000000000050 :
...
  5b:   e8 00 00 00 00          callq  60
  60:   84 c0                   test   %al,%al
  62:   0f 84 8f 00 00 00       je     f7
  68:   66 0f 57 c9             xorpd  %xmm1,%xmm1
  6c:   31 c0                   xor    %eax,%eax
  6e:   f2 0f 10 05 00 00 00    movsd  0x0(%rip),%xmm0        # 76
  75:   00
  76:   f2 0f 10 15 00 00 00    movsd  0x0(%rip),%xmm2        # 7e
  7d:   00
  7e:   f2 0f 5c c8             subsd  %xmm0,%xmm1
  82:   f2 0f 10 35 00 00 00    movsd  0x0(%rip),%xmm6        # 8a
  89:   00
  8a:   f2 0f 10 2d 00 00 00    movsd  0x0(%rip),%xmm5        # 92
  91:   00
  92:   f2 0f 10 25 00 00 00    movsd  0x0(%rip),%xmm4        # 9a
  99:   00
  9a:   66 0f 1f 44 00 00       nopw   0x0(%rax,%rax,1)
  a0:   66 0f 28 d9             movapd %xmm1,%xmm3
  a4:   f2 0f 59 cd             mulsd  %xmm5,%xmm1
  a8:   f2 0f 5c c2             subsd  %xmm2,%xmm0
  ac:   83 c0 01                add    $0x1,%eax
  af:   f2 0f 5e de             divsd  %xmm6,%xmm3
  b3:   3d 40 42 0f 00          cmp    $0xf4240,%eax
  b8:   f2 0f 58 c9             addsd  %xmm1,%xmm1
  bc:   f2 0f 58 c0             addsd  %xmm0,%xmm0
  c0:   f2 0f 58 c3             addsd  %xmm3,%xmm0
  c4:   f2 0f 59 c0             mulsd  %xmm0,%xmm0
  c8:   66 0f 28 d8             movapd %xmm0,%xmm3
  cc:   f2 0f 58 da             addsd  %xmm2,%xmm3
  d0:   f2 0f 58 cb             addsd  %xmm3,%xmm1
  d4:   f2 0f 59 c9             mulsd  %xmm1,%xmm1
  d8:   f2 0f 5e cc             divsd  %xmm4,%xmm1
  dc:   75 c2                   jne    a0
  de:   f2 0f 58 c1             addsd  %xmm1,%xmm0
  e2:   f2 0f 59 05 00 00 00    mulsd  0x0(%rip),%xmm0        # ea
  e9:   00
  ea:   f2 0f 11 05 00 00 00    movsd  %xmm0,0x0(%rip)        # f2
--->%---

As you can see, the xmmN registers up to xmm6 are used. Do you get
different results?

Markus
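P.S. If you want to compare register usage between builds without reading the listing by eye, a small script along these lines might help. This is only a sketch in Python, not part of GNAT or binutils: the function name `xmm_registers` and the regular expression are my own assumptions, and it simply scans the AT&T-syntax text that objdump prints for `%xmmN` operands.

```python
import re

# Assumption: AT&T-syntax operands as printed by objdump, e.g. %xmm0 .. %xmm15.
XMM_RE = re.compile(r"%xmm(\d+)")


def xmm_registers(disassembly: str) -> list:
    """Return the sorted, distinct xmm register numbers mentioned in the text."""
    return sorted({int(number) for number in XMM_RE.findall(disassembly)})


if __name__ == "__main__":
    # A few lines taken from the loop body in the listing above.
    sample = """\
    a0: 66 0f 28 d9    movapd %xmm1,%xmm3
    a4: f2 0f 59 cd    mulsd  %xmm5,%xmm1
    a8: f2 0f 5c c2    subsd  %xmm2,%xmm0
    af: f2 0f 5e de    divsd  %xmm6,%xmm3
    d8: f2 0f 5e cc    divsd  %xmm4,%xmm1
    """
    print(xmm_registers(sample))  # prints [0, 1, 2, 3, 4, 5, 6]
```

To use it on a real build, pass the full text of `objdump -d comp.o` to `xmm_registers` instead of the sample, once for the inlined build and once for the non-inlined one, and compare the two lists.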