Re: Efficient Sequential Access to Arrays

comp.lang.ada
 help / color / mirror / Atom feed

From: Keean Schupke <keean.schupke@googlemail.com>
Cc: mailbox@dmitry-kazakov.de
Subject: Re: Efficient Sequential Access to Arrays
Date: Wed, 18 Jul 2012 01:10:03 -0700 (PDT)
Date: 2012-07-18T01:10:03-07:00	[thread overview]
Message-ID: <9d4d4463-4c7e-40f4-a167-933eb056c6a5@googlegroups.com> (raw)
In-Reply-To: <9s89w8awedu5.r6kl7s7vwvx9$.dlg@40tude.net>

On Tuesday, 17 July 2012 10:01:59 UTC+1, Dmitry A. Kazakov  wrote:
> On Mon, 16 Jul 2012 23:40:21 -0700 (PDT), Keean Schupke wrote:
> 
> &gt; The real problem is that each element needs to be addressed spatially (say
> &gt; as a 2D grid) so that all 4 direct, and 4 diagonal neighbours can be
> &gt; quickly found (using indexed-addressing), another is that we need to be
> &gt; able to tell if any two elements are in the same group (using disjoint-set
> &gt; union-find), and another is that we need to be able to iterate through all
> &gt; the members of a group given any element in that group.
> &gt; 
> &gt; If there is a faster data structure that meets all those requirements, then I am listening.
> 
> Depends on details.
> 
> k-d trees might be worth looking.
> 
> Less known are &quot;pyramid&quot; structures (used in image processing, filtering,
> compression).
> 
> For image segmenting, e.g. by area growing (8th-neighbourhood), one
> sometimes uses recursive elements permutations like (1,1) (1,2) (2,1) (2,2)
> (1,3) (1,4) (2,3) (2,4). That allows to perform segmenting in strictly
> consequent order visiting each pixel once. I remember it was popular in
> PDP-11 days when integer multiplication was indeed expensive.
> 
> -- 
> Regards,
> Dmitry A. Kazakov
> http://www.dmitry-kazakov.de


A tree structure is going to be slower than an array for the spatial lookup part. Indexing is faster than links (I think you said so yourself). So why suggesting a linked structure to replace the array?

One reason we use links to represent the groups is that by using a cirular list we can join two groups together very quickly using an exchange. If X is any element from the first circular list, and Y any element from the second circular list we can join the two into a single circular list by doing:

T := X.Next; X.Next = Y.Next; Y.Next := T; -- or Swap(X.Next, Y.Next)

And yes, joining groups is the most frequent operation on the groups, so copying one group into the other using arrays is slower.

So 2D spatial indexing (array structure) is optimal for spatial lookup. Circular lists between element is optimal for joining groups. The link to the canonical element from the eager-union algorithm is optimal for lookup dominated disjoint set.

So you can see the data structure is already optimal for the operations frequently required. It is an array of elements that contain a next element pointer to implement a circular linked list and a canonical element pointer to identify the sub-set.

So we have the algorithm, and the data structure, which is optimal from an analytical perspective. Some observations:

- The data structure and algorithm is optimal from a machine operation perspective (if we consider the relative speeds of the machine code operations)

- If the language cannot encode this structure, it cannot get the fastest performance as the language cannot be faster than the machine code operations.

- I may have misdirected people by looking at the loop behaviour of index optimisation. In my tests it is just as fast as pointer arithmetic. The real problem is that we need to combine the circular linked list into the array and allow addressing of neighbours from the resulting pointer dereference.

All this is fine, the problem in Ada is this:

- When we lookup an element spatially by index, and then iterate through the subset members (following the next links) we lose the ability to index spatially from the next element.

- If we instead store the index of the next element instead of the link, it is slower because we have to use a relative addressing mode instead of an absolute one.

- The compiler cannot use any optimisation of the array indexing because the index comes from main-memory. It is not stepping through in sequence so it cannot use strength reduction. There is no point in reading compiler documentation, this is beyond any current optimisation technology to do this, and no amount of shuffling the code about will change this.

- doing a division to convert the address of the next node back into an index to address the neighbours does not make any sense as division is slower than multiplication.

- The conclusion is the optimal way to do this is to access the neighbours by using the 'access' stored in the next pointer, and calculate the memory location of the neighbours by address arithmetic.

- We need to package this into an ADT to hide all the address stuff so it can be unit tested and validated in isolation from the rest of the application.


Does that make sense?


Cheers,
Keean.

next prev parent reply	other threads:[~2012-07-24  3:08 UTC|newest]

Thread overview: 88+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-07-15 18:40 Efficient Sequential Access to Arrays Keean Schupke
2012-07-15 19:48 ` Dmitry A. Kazakov
2012-07-15 20:03   ` Keean Schupke
2012-07-15 20:56     ` Keean Schupke
2012-07-16 11:34       ` Jacob Sparre Andersen
2012-07-15 21:05     ` tmoran
2012-07-15 21:08       ` Keean Schupke
2012-07-16  1:19         ` tmoran
2012-07-15 21:44     ` Georg Bauhaus
2012-07-16  7:47     ` Dmitry A. Kazakov
2012-07-16 10:16       ` Keean Schupke
2012-07-16 14:57         ` Dmitry A. Kazakov
2012-07-16 16:39           ` Keean Schupke
2012-07-16 17:02             ` Georg Bauhaus
2012-07-16 18:48             ` Dmitry A. Kazakov
2012-07-16 19:12               ` Keean Schupke
2012-07-17  6:31                 ` Dmitry A. Kazakov
2012-07-17  6:50                   ` Georg Bauhaus
2012-07-17  8:42                     ` Dmitry A. Kazakov
2012-07-17  7:24                   ` Keean Schupke
2012-07-16 19:43             ` John B. Matthews
2012-07-16 20:44               ` Keean Schupke
2012-07-16 22:23             ` tmoran
2012-07-17  6:40               ` Keean Schupke
2012-07-17  9:01                 ` Dmitry A. Kazakov
2012-07-18  8:10                   ` Keean Schupke [this message]
2012-07-18 18:11                     ` tmoran
2012-07-19 10:41                       ` Keean Schupke
2012-07-19 11:18                         ` Georg Bauhaus
2012-07-19 19:58                           ` Keean Schupke
2012-07-21  8:24                             ` Keean Schupke
2012-07-21 11:52                               ` Georg Bauhaus
2012-07-21 16:14                                 ` Keean Schupke
2012-07-21 17:09                                   ` Keean Schupke
     [not found]                                   ` <i78m081tccmp078spmsei1l5vnj3k0kbkj@invalid.netcom.com>
2012-07-22 15:50                                     ` Keean Schupke
2012-07-22  5:13                               ` Shark8
2012-07-15 21:35   ` J-P. Rosen
2012-07-15 19:52 ` Shark8
2012-07-15 20:16   ` Keean Schupke
2012-07-15 21:33     ` Georg Bauhaus
2012-07-17 18:16 ` Florian Weimer
2012-07-22 10:01 ` robin.vowels
2012-07-30  6:31   ` Jacob Sparre Andersen
2012-07-30  7:16     ` Keean Schupke
2012-07-30  9:20     ` Georg Bauhaus
2012-07-30 14:04       ` Keean Schupke
2012-07-22 10:09 ` robin.vowels
2012-07-22 16:02   ` Keean Schupke
2012-07-22 16:21 ` Florian Weimer
2012-07-22 16:46   ` Keean Schupke
2012-07-22 18:41     ` Florian Weimer
2012-07-24  8:21       ` Keean Schupke
2012-07-22 17:34   ` Niklas Holsti
2012-07-22 18:21     ` Keean Schupke
2012-07-30 14:18       ` Jacob Sparre Andersen
2012-07-24 16:00     ` robin.vowels
2012-07-25  7:09       ` Niklas Holsti
2012-07-25  9:03         ` Keean Schupke
2012-07-26  0:15           ` Randy Brukardt
2012-07-26  8:38             ` Keean Schupke
2012-07-26 10:10               ` Brian Drummond
2012-07-26 12:32                 ` Keean Schupke
2012-07-26 13:46                   ` Dmitry A. Kazakov
2012-07-26 17:40                     ` Shark8
2012-07-26 18:56                       ` Dmitry A. Kazakov
2012-07-27  5:25                         ` Shark8
2012-07-27  6:28                           ` Dmitry A. Kazakov
2012-07-27  7:01                             ` Vasiliy Molostov
2012-07-27  8:48                               ` Dmitry A. Kazakov
2012-07-27 12:01                                 ` Georg Bauhaus
2012-07-27 16:49                                 ` Vasiliy Molostov
2012-07-27 19:38                                   ` Dmitry A. Kazakov
2012-07-28  5:32                             ` Shark8
2012-07-28  7:37                               ` Dmitry A. Kazakov
2012-07-28 18:32                                 ` Shark8
     [not found]                     ` <6ov218tnkbqu3vpkuo4t77rd7de0a3aesf@invalid.netcom.com>
2012-07-26 18:49                       ` Dmitry A. Kazakov
     [not found]                         ` <86d31898ne39maimbl24knds7rf38qg7vc@invalid.netcom.com>
2012-07-27  6:45                           ` Dmitry A. Kazakov
2012-07-27  8:21                   ` Keean Schupke
2012-07-27  8:50                     ` Dmitry A. Kazakov
2012-07-27  9:17                       ` Keean Schupke
2013-02-25 21:18                     ` Eryndlia Mavourneen
2012-07-26 13:08               ` Georg Bauhaus
2012-07-26 19:37                 ` stefan-lucks
2012-07-26 19:21               ` stefan-lucks
2012-07-31  3:12               ` Randy Brukardt
2012-07-26  7:11           ` Markus Schöpflin
2012-07-26 14:47         ` Robin Vowels
2012-07-26 15:18           ` Martin

replies disabled

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox