From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level: 
X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,FREEMAIL_FROM
	autolearn=ham autolearn_force=no version=3.4.4
X-Google-Thread: a07f3367d7,56525db28240414a
X-Google-Attributes: gida07f3367d7,public,usenet
X-Google-NewGroupId: yes
X-Google-Language: ENGLISH,ASCII-7-bit
Received: by 10.66.89.225 with SMTP id br1mr1637773pab.3.1343038781971;
        Mon, 23 Jul 2012 03:19:41 -0700 (PDT)
Path: 
 p10ni38413267pbh.1!nntp.google.com!border1.nntp.dca.giganews.com!nntp.giganews.com!ctu-peer!news.nctu.edu.tw!goblin1!goblin.stu.neva.ru!postnews.google.com!glegroupsg2000goo.googlegroups.com!not-for-mail
From: Keean Schupke <keean.schupke@googlemail.com>
Newsgroups: comp.lang.ada
Subject: Re: Efficient Sequential Access to Arrays
Date: Mon, 16 Jul 2012 13:44:31 -0700 (PDT)
Organization: http://groups.google.com
Message-ID: <b62a7fa2-e04e-456f-86fa-892bf74899de@googlegroups.com>
References: <01983f1c-f842-4b1f-a180-bcef531dad4c@googlegroups.com>
 <kpltof5a8xvm.arygetd4esmc.dlg@40tude.net>
 <44f2bfd0-75ed-4b9c-982e-61d5308f9c01@googlegroups.com>
 <1s8rnmtyqnrr$.1o78f1qaxeulh$.dlg@40tude.net>
 <e43d83de-d992-475b-bcfc-fafa13c4ea40@googlegroups.com>
 <1lgvwexsx8ovd.xdbvohz8eowp.dlg@40tude.net>
 <4c86d54d-dfeb-44f1-9638-3c416ea51d74@googlegroups.com>
 <nospam-12F94A.15435316072012@news.aioe.org>
NNTP-Posting-Host: 82.44.19.199
Mime-Version: 1.0
X-Trace: posting.google.com 1342471471 11435 127.0.0.1 (16 Jul 2012 20:44:31
 GMT)
X-Complaints-To: groups-abuse@google.com
NNTP-Posting-Date: Mon, 16 Jul 2012 20:44:31 +0000 (UTC)
In-Reply-To: <nospam-12F94A.15435316072012@news.aioe.org>
Complaints-To: groups-abuse@google.com
Injection-Info: glegroupsg2000goo.googlegroups.com; posting-host=82.44.19.199;
 posting-account=T5Z2vAoAAAB8ExE3yV3f56dVATtEMNcM
User-Agent: G2/1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Date: 2012-07-16T13:44:31-07:00
List-Id: <comp.lang.ada>

On Monday, 16 July 2012 20:43:54 UTC+1, John B. Matthews  wrote:
>  Keean Schupke wrote:
>=20
> &gt; &gt; On Mon, 16 Jul 2012 03:16:09 -0700 (PDT), Keean Schupke wrote:
> &gt; &gt;=20
> &gt; &gt; &amp;gt; On Monday, 16 July 2012 08:47:01 UTC+1, Dmitry A. Kaza=
kov  wrote:
> &gt; &gt; &gt; &gt; On Sun, 15 Jul 2012 13:03:12 -0700 (PDT), Keean Schup=
ke wrote:
> &gt; &gt; &gt; &gt; =20
> &gt; &gt; &gt; &gt; Absolute figures tell little. How big is the relative=
 difference?
> &gt; &gt; &gt;=20
> &gt; &gt; &gt; About 25% (30k with indexing, 40k with relative addressing=
)
> &gt; &gt;=20
> &gt; &gt; Which itself looks like a problem. It takes no less than 25% of=
=20
> &gt; &gt; overall performance for mere shuffling data in a task unrelated=
 to=20
> &gt; &gt; stuff like recoding etc. This looks wrong to me. I would look f=
or a=20
> &gt; &gt; better algorithm/structure.
> &gt;=20
> &gt; I find this comment puzzling and does not match my experience. What=
=20
> &gt; sort of algorithms do you work with. What algorithm for disjoint-set=
=20
> &gt; do you suggest I use? Have a read of this:
> &gt;=20
> &gt; http://www.cs.princeton.edu/~rs/AlgsDS07/01UnionFind.pdf
>=20
> Apropos to this, may I impose on you to elaborate on your previous=20
> decision not to use WQUPC? In particular, is your application lookup=20
> dominated due in some part to the choice of algorithm itself?
>=20
> Sorry if I&#39;ve overlooked an intervening clarification.
>=20
> &lt;https://groups.google.com/d/msg/comp.lang.ada/EDgDNVw9tgc/bUYu34D_8LE=
J&gt;
>=20
> &gt; Can you come up with a faster algorithm? Until you have read it I=20
> &gt; don&#39;t think you can comment on the quality of algorithm I am usi=
ng.
>=20
> --=20
> John B. Matthews
> trashgod at gmail dot com
> &lt;http://sites.google.com/site/drjohnbmatthews&gt;

That is correct, it is lookup dominated. By joining the smaller set to the =
lager set and updating the canonical pointer for each node in the smaller s=
et at join time we get better performance for this particular problem.

The second part of the question is harder, is it lookup dominated due to th=
e choice of algorithm? I don't think so. Certainly the choice of Eager-Unio=
n vs WUQPC has no influence on the number of lookups.

So the question then becomes what is the wider algorithm, what are we using=
 the disjoint-set to do, but this is getting off topic. The original point =
was that there exist a class of algorithms for which array indexing is a no=
n-optimal approach to accessing the container. There exists a class of algo=
rithms that require both indexes and linked access within the same data str=
ucture, a linked-hash table for example.

So lets try and focus here: Given that such a class of algorithms exist, an=
d for some problems will be optimal, how do we best implement them in Ada?


Cheers,
Keean.