From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level: 
X-Spam-Status: No, score=0.4 required=5.0 tests=BAYES_00,FORGED_MUA_MOZILLA
	autolearn=no autolearn_force=no version=3.4.4
X-Google-Thread: a07f3367d7,56525db28240414a
X-Google-Attributes: gida07f3367d7,public,usenet
X-Google-NewGroupId: yes
X-Google-Language: ENGLISH,ASCII-7-bit
Received: by 10.180.88.195 with SMTP id bi3mr1297853wib.3.1343313104683;
        Thu, 26 Jul 2012 07:31:44 -0700 (PDT)
Path: 
 ge7ni75591628wib.0!nntp.google.com!volia.net!news2.volia.net!feed-A.news.volia.net!border1.nntp.ams2.giganews.com!border3.nntp.ams.giganews.com!border1.nntp.ams.giganews.com!border4.nntp.ams.giganews.com!border2.nntp.ams.giganews.com!nntp.giganews.com!newsfeed.straub-nv.de!texta.sil.at!news.mind.de!news.cs.uni-magdeburg.de!fu-berlin.de!uni-berlin.de!individual.net!not-for-mail
From: Niklas Holsti <niklas.holsti@tidorum.invalid>
Newsgroups: comp.lang.ada
Subject: Re: Efficient Sequential Access to Arrays
Date: Sun, 22 Jul 2012 20:34:26 +0300
Organization: Tidorum Ltd
Message-ID: <a72rt6Fq5sU1@mid.individual.net>
References: <01983f1c-f842-4b1f-a180-bcef531dad4c@googlegroups.com>
 <87ipdf4vh6.fsf@mid.deneb.enyo.de>
Mime-Version: 1.0
X-Trace: individual.net
 aH93JoMe2ugGAl9omFEbFgdyVQ+F6tLPs/Wte1+rG3+fC/ZxrRsy8r06Ww40KlK1q6
Cancel-Lock: sha1:PLfsWo2t1GlVSfWb6klp1rvWZWA=
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6;
 rv:13.0) Gecko/20120614 Thunderbird/13.0.1
In-Reply-To: <87ipdf4vh6.fsf@mid.deneb.enyo.de>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Date: 2012-07-22T20:34:26+03:00
List-Id: <comp.lang.ada>

On 12-07-22 19:21 , Florian Weimer wrote:
> * Keean Schupke:
>
>> So assuming we need this level of performance, what would be the
>> best (and most idiomatic Ada) way to package this type of usage
>> pattern as an abstract datatype?
>
> You should look at compiler dumps to figure out why the strength
> reduction optimization does not kick in.
>

As I understand it, the program is accessing A(I,J), where A is a 2D 
array, and also accessing A(I+1,J), A(I,J+1), and the other neighbours 
of A(I,J), eight in all.

I,J are indices that are *not* traversed in sequential order, but in 
some scattered order.

Computing the address of A(I,J) involves multiplications of J and I with 
the size of an array element and the size of an array row, respectively. 
The problem seems to be that the results of these multiplications are 
not reused in computing the address of A(I+1,J) etc.

If the index constraints (or at least the row length) of A are static 
(compile-time) constants, and the address of A(I,J) has been computed, 
and c1 and c2 are some other compile-time constants (-1, 0, +1 in this 
program), then the address of A(I+c1,J+c2) can be computed by adding a 
static offset to the address of A(I,J). This is what happens when A(I,J) 
and the neighbours are accessed through a C pointer, or when Keean uses 
address arithmetic explicitly in the Ada program. The question is why 
the Ada compiler is not doing this optimization.

As I understand it, "strength reduction" is a loop optimization where an 
expression of the form C*i, with "i" the loop index, is replaced by a 
new variable that is incremented by C on each loop iteration. But in 
Keean's program the indices I,J are not loop indices, nor linearly 
related to loop indices.

The optimization that would be required here is to replace (X+C)*F by 
X*F+C*F, when C and F are static constants and X*F has already been 
computed and is available. This would replace the multiplication by an 
addition of a static constant. I've never seen this special kind of 
optimization proposed or described.

(We are of course (I believe) assuming that index range checks are 
turned off.)

-- 
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
       .      @       .