From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level: 
X-Spam-Status: No, score=0.4 required=5.0 tests=BAYES_00,FORGED_MUA_MOZILLA
	autolearn=no autolearn_force=no version=3.4.4
X-Google-Thread: 103376,56525db28240414a
X-Google-NewGroupId: yes
X-Google-Attributes: gida07f3367d7,domainid0,public,usenet
X-Google-Language: ENGLISH,ASCII-7-bit
Received: by 10.180.96.42 with SMTP id dp10mr217740wib.2.1343200149293;
        Wed, 25 Jul 2012 00:09:09 -0700 (PDT)
Path: 
 ge7ni59194211wib.0!nntp.google.com!volia.net!news2.volia.net!feed-A.news.volia.net!fu-berlin.de!uni-berlin.de!individual.net!not-for-mail
From: Niklas Holsti <niklas.holsti@tidorum.invalid>
Newsgroups: comp.lang.ada
Subject: Re: Efficient Sequential Access to Arrays
Date: Wed, 25 Jul 2012 10:09:09 +0300
Organization: Tidorum Ltd
Message-ID: <a79kcjFufvU1@mid.individual.net>
References: <01983f1c-f842-4b1f-a180-bcef531dad4c@googlegroups.com>
 <87ipdf4vh6.fsf@mid.deneb.enyo.de> <a72rt6Fq5sU1@mid.individual.net>
 <4ce44d2d-d789-42a0-a6ed-035f7f8d58be@googlegroups.com>
Mime-Version: 1.0
X-Trace: individual.net
 TJW5z9ek+RxymTzP1NqV/gXcysrZUSGCob4xsynRhwefLmRMY8TwsotL1OwQCKuy7E
Cancel-Lock: sha1:0rYP2xrhTr01rlYNUs5m3aeJoc0=
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6;
 rv:13.0) Gecko/20120614 Thunderbird/13.0.1
In-Reply-To: <4ce44d2d-d789-42a0-a6ed-035f7f8d58be@googlegroups.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Date: 2012-07-25T10:09:09+03:00
List-Id: <comp.lang.ada>

On 12-07-24 19:00 , robin.vowels@gmail.com wrote:
> On Monday, 23 July 2012 03:34:26 UTC+10, Niklas Holsti  wrote:
>> On 12-07-22 19:21 , Florian Weimer wrote:
>> > * Keean Schupke:
>> >
>> >> So assuming we need this level of performance, what would be the
>> >> best (and most idiomatic Ada) way to package this type of usage
>> >> pattern as an abstract datatype?
>> >
>> > You should look at compiler dumps to figure out why the strength
>> > reduction optimization does not kick in.
>> >
>>
>> As I understand it, the program is accessing A(I,J), where A is a 2D
>> array, and also accessing A(I+1,J), A(I,J+1), and the other neighbours
>> of A(I,J), eight in all.
>>
>> I,J are indices that are *not* traversed in sequential order, but in
>> some scattered order.
>>
>> Computing the address of A(I,J) involves multiplications of J and I with
>> the size of an array element and the size of an array row, respectively.
>> The problem seems to be that the results of these multiplications are
>> not reused in computing the address of A(I+1,J) etc.
>>
>> If the index constraints (or at least the row length) of A are static
>> (compile-time) constants, and the address of A(I,J) has been computed,
>> and c1 and c2 are some other compile-time constants (-1, 0, +1 in this
>> program), then the address of A(I+c1,J+c2) can be computed by adding a
>> static offset to the address of A(I,J). This is what happens when A(I,J)
>> and the neighbours are accessed through a C pointer, or when Keean uses
>> address arithmetic explicitly in the Ada program. The question is why
>> the Ada compiler is not doing this optimization.
>>
>> As I understand it, "strength reduction" is a loop optimization where an
>> expression of the form C*i, with "i" the loop index, is replaced by a
>> new variable that is incremented by C on each loop iteration. But in
>> Keean's program the indices I,J are not loop indices, nor linearly
>> related to loop indices.
>>
>> The optimization that would be required here is to replace (X+C)*F by
>> X*F+C*F, when C and F are static constants and X*F has already been
>> computed and is available. This would replace the multiplication by an
>> addition of a static constant. I've never seen this special kind of
>> optimization proposed or described.
>
> I think that it's a standard optimisation.

I'm willing to be convinced of that. But it seems that GNAT/GCC is not 
performing this optimization in Keean's program.
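
To make the rewrite (X+C)*F = X*F + C*F concrete, here is a minimal
sketch in C rather than Ada. ROWS, COLS and both function names are
hypothetical, and the array is assumed to be row-major with static
bounds. The first function recomputes every index from scratch; the
second computes the address of the centre element once and reaches the
eight neighbours by static offsets, which is the form one would hope an
optimizer derives.

```c
#include <assert.h>

enum { ROWS = 6, COLS = 8 };   /* hypothetical static bounds */

/* Row-major flat storage: element (i, j) lives at a[i * COLS + j].
   Every neighbour index is written as (i + c1) * COLS + (j + c2). */
int sum_neighbours_multiply(const int *a, int i, int j)
{
    assert(i >= 1 && i < ROWS - 1 && j >= 1 && j < COLS - 1);
    return a[(i - 1) * COLS + (j - 1)] + a[(i - 1) * COLS + j]
         + a[(i - 1) * COLS + (j + 1)] + a[i * COLS + (j - 1)]
         + a[i * COLS + (j + 1)]       + a[(i + 1) * COLS + (j - 1)]
         + a[(i + 1) * COLS + j]       + a[(i + 1) * COLS + (j + 1)];
}

/* The address of (i, j) is computed once; each neighbour is then
   that address plus the compile-time constant c1 * COLS + c2. */
int sum_neighbours_offset(const int *a, int i, int j)
{
    assert(i >= 1 && i < ROWS - 1 && j >= 1 && j < COLS - 1);
    const int *p = &a[i * COLS + j];
    return p[-COLS - 1] + p[-COLS] + p[-COLS + 1]
         + p[-1]                   + p[1]
         + p[COLS - 1]  + p[COLS]  + p[COLS + 1];
}
```

Both functions return the same sum for any interior (i, j); only the
address arithmetic differs.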

> For the PL/I optimising compiler for the CDC Cyber (c. 1980),
> multiplication was eliminated for matrix elements
> by using a table of offsets into each row.
> Thus, double subscripting reduced to simple addition.

That sounds like a specific choice of code generation for arrays, rather 
than a general optimization.

Keean could simulate this code by defining the array as a 
one-dimensional array of one-dimensional arrays (rows), which would 
replace the row-length multiplication by an additional indexing.
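
As a rough illustration of the row-offset idea, again in C rather than
Ada: the sketch below keeps a precomputed table of row start offsets, so
that locating (i, j) takes a table lookup and an addition instead of a
multiplication. The Grid type, its fields and the function names are
hypothetical, and this is not a description of what the CDC PL/I
compiler actually generated.

```c
#include <assert.h>
#include <stddef.h>

enum { N_ROWS = 4, N_COLS = 5 };   /* hypothetical static bounds */

typedef struct {
    int    cells[N_ROWS * N_COLS];   /* row-major storage */
    size_t row_start[N_ROWS];        /* offset of the first cell of each row */
} Grid;

/* Build the offset table by repeated addition and zero the cells. */
void grid_init(Grid *g)
{
    size_t start = 0;
    for (size_t i = 0; i < N_ROWS; i++) {
        g->row_start[i] = start;
        start += N_COLS;
    }
    for (size_t k = 0; k < N_ROWS * N_COLS; k++)
        g->cells[k] = 0;
}

/* Locate (i, j) with one table lookup and one addition. */
int *grid_at(Grid *g, size_t i, size_t j)
{
    assert(i < N_ROWS && j < N_COLS);
    return &g->cells[g->row_start[i] + j];
}
```

The trade-off is an extra memory read per access in place of the
multiplication, so whether it is faster depends on the target machine.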

>> (We are of course (I believe) assuming that index range checks are
>> turned off.)
>
> Index range checks should be left on.

Sure, in general. But here the OP (Keean) is trying to beat the 
performance of C++ code that uses pointers without index checks. Enabling 
index checks in the Ada code would probably (in this case, where the 
algorithm has scattered indexing of the array) slow it by an amount 
that swamps the effect of the multiplication optimization.

-- 
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
       .      @       .