From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level: 
X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,FREEMAIL_FROM
	autolearn=ham autolearn_force=no version=3.4.4
X-Google-Thread: a07f3367d7,56525db28240414a
X-Google-Attributes: gida07f3367d7,public,usenet
X-Google-NewGroupId: yes
X-Google-Language: ENGLISH,CP1252
Received: by 10.224.221.17 with SMTP id ia17mr6495442qab.2.1342919296187;
        Sat, 21 Jul 2012 18:08:16 -0700 (PDT)
Received: by 10.66.82.2 with SMTP id e2mr591491pay.40.1342919296119;
        Sat, 21 Jul 2012 18:08:16 -0700 (PDT)
Path: 
 a15ni43739939qag.0!nntp.google.com!x2no11887284qaj.0!news-out.google.com!p10ni21063256pbh.1!nntp.google.com!npeer01.iad.highwinds-media.com!news.highwinds-media.com!feed-me.highwinds-media.com!border3.nntp.dca.giganews.com!border1.nntp.dca.giganews.com!border4.nntp.dca.giganews.com!border2.nntp.dca.giganews.com!nntp.giganews.com!novia!news-peer1!btnet!zen.net.uk!hamilton.zen.co.uk!xlned.com!feeder7.xlned.com!multikabel.net!newsfeed10.multikabel.net!feed.xsnews.nl!border-1.ams.xsnews.nl!plix.pl!newsfeed2.plix.pl!news.mi.ras.ru!aspen.stu.neva.ru!goblin1!goblin.stu.neva.ru!postnews.google.com!glegroupsg2000goo.googlegroups.com!not-for-mail
From: Keean Schupke <keean.schupke@googlemail.com>
Newsgroups: comp.lang.ada
Subject: Re: Efficient Sequential Access to Arrays
Date: Tue, 17 Jul 2012 00:24:08 -0700 (PDT)
Organization: http://groups.google.com
Message-ID: <0bcbd157-91fa-4eaa-a797-cd1b9079ab75@googlegroups.com>
References: <01983f1c-f842-4b1f-a180-bcef531dad4c@googlegroups.com>
 <kpltof5a8xvm.arygetd4esmc.dlg@40tude.net>
 <44f2bfd0-75ed-4b9c-982e-61d5308f9c01@googlegroups.com>
 <1s8rnmtyqnrr$.1o78f1qaxeulh$.dlg@40tude.net>
 <e43d83de-d992-475b-bcfc-fafa13c4ea40@googlegroups.com>
 <1lgvwexsx8ovd.xdbvohz8eowp.dlg@40tude.net>
 <4c86d54d-dfeb-44f1-9638-3c416ea51d74@googlegroups.com>
 <iqxu0jiz121b.1i6qo6y7twnvp.dlg@40tude.net>
 <ca3e1b59-5bcc-4359-a6da-8ef3235502af@googlegroups.com>
 <n5f2c468x3m3.11so2phokj0ze$.dlg@40tude.net>
NNTP-Posting-Host: 82.44.19.199
Mime-Version: 1.0
X-Trace: posting.google.com 1342509970 13400 127.0.0.1 (17 Jul 2012 07:26:10
 GMT)
X-Complaints-To: groups-abuse@google.com
NNTP-Posting-Date: Tue, 17 Jul 2012 07:26:10 +0000 (UTC)
Cc: mailbox@dmitry-kazakov.de
In-Reply-To: <n5f2c468x3m3.11so2phokj0ze$.dlg@40tude.net>
Complaints-To: groups-abuse@google.com
Injection-Info: glegroupsg2000goo.googlegroups.com; posting-host=82.44.19.199;
 posting-account=T5Z2vAoAAAB8ExE3yV3f56dVATtEMNcM
User-Agent: G2/1.0
X-Received-Bytes: 13737
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: quoted-printable
Date: 2012-07-17T00:24:08-07:00
List-Id: <comp.lang.ada>

On Tuesday, 17 July 2012 07:31:11 UTC+1, Dmitry A. Kazakov  wrote:
> On Mon, 16 Jul 2012 12:12:48 -0700 (PDT), Keean Schupke wrote:
>=20
> &gt; On Monday, 16 July 2012 19:48:22 UTC+1, Dmitry A. Kazakov  wrote:
> &gt;&gt; On Mon, 16 Jul 2012 09:39:45 -0700 (PDT), Keean Schupke wrote:
> &gt;&gt;=20
> &gt;&gt; &amp;gt; On Monday, 16 July 2012 15:57:04 UTC+1, Dmitry A. Kazak=
ov  wrote:
> &gt;&gt;=20
> &gt;&gt; &amp;gt;&amp;gt; Which itself looks like a problem. It takes no =
less than 25% of overall
> &gt;&gt; &amp;gt;&amp;gt; performance for mere shuffling data in a task u=
nrelated to stuff like
> &gt;&gt; &amp;gt;&amp;gt; recoding etc. This looks wrong to me. I would l=
ook for a better
> &gt;&gt; &amp;gt;&amp;gt; algorithm/structure.
> &gt;&gt; &amp;gt;=20
> &gt;&gt; &amp;gt; I find this comment puzzling and does not match my expe=
rience.
> &gt;&gt;=20
> &gt;&gt; It is just an observation that more than 25% of CPU load is spen=
t to heat
> &gt;&gt; the atmosphere rather than doing something useful.
> &gt;&gt;=20
> &gt;&gt; &amp;gt; What algorithm for disjoint-set do you suggest I use?
> &gt;&gt;=20
> &gt;&gt; I would try to avoid using such things if the problem is large.
> &gt;=20
> &gt; Perhaps you would like to outline your methodology for algorithm
> &gt; development?
>=20
> I don&#39;t develop algorithms I apply them.
>=20
> 25% overhead on indexing is too much.
>=20
> &gt; How would you go about designing an efficient algorithm to
> &gt; solve these kind of problems:
> &gt;=20
> &gt; =95 Network connectivity.
>=20
> 1. Open amazon site
> 2. Search &quot;netwok adapter&quot;=20
> 3. Choose 5 different with minimal price
> 4. Among them select one with maximal positive customer reports
> 5. Buy it
>=20
> (Network connectivity is not an algorithmic problem)
>=20
> &gt;&gt; No idea what you are trying to do. But indexing *must* be faster=
 than
> &gt;&gt; indirection. You should make your data structures straight.
> &gt;=20
> &gt; Straight data structures require copying data. If we were to use an =
array
> &gt; instead of a linked list then every state change would require copyi=
ng
> &gt; stuff.
>=20
> In Ada you rename array elements. This lets the compiler to choose betwee=
n
> copying and handling it in-place. BTW copying might turn more effective
> depending on the architecture.
>=20
> &gt;&gt; &amp;gt;&amp;gt; If you have performance problems with arrays, w=
hich are built-in and
> &gt;&gt; &amp;gt;&amp;gt; subject of tailored optimization, you cannot ex=
pect situation improved by
> &gt;&gt; &amp;gt;&amp;gt; puzzling the compiler with more abstract types.=
 That could only disable
> &gt;&gt; &amp;gt;&amp;gt; optimization.
> &gt;&gt; &amp;gt;=20
> &gt;&gt; &amp;gt; Maybe you are thinking of something more complex than I=
 am. An ADT is just
> &gt;&gt; &amp;gt; a private type with associated functions in a package.
> &gt;&gt;=20
> &gt;&gt; It is difficult for the compiler to optimize wrappers of array o=
perations
> &gt;&gt; like indexing. You said you failed to make GCC to optimize raw a=
rray
> &gt;&gt; access. Why wrapping that into functions should do any better?
> &gt;=20
> &gt; It doesn&#39;t do any worse... the functions are mostly single line =
and get
> &gt; inlined before the optimisation pass.
>=20
> Did you verify that?
>=20
> &gt; On the whole many positives, and no negatives (because its just as f=
ast in my tests).
>=20
> But you said you failed to optimize multiplication away.


I should probably expose the array from the package directly and operate on=
 it instead of using an ADT interface and test the performance. Its worth a=
 try anyway. However, to me this seems worse than the ADT with address arit=
hmetic.=20

At least with an ADT with internal address arithmetic, you can test and val=
idate the component in isolation through its interface, and package it as a=
 library for the 'user' code to safely use.

Exposing implementation details of a software component (like directly expo=
sing an array type) breaks encapsulation, and prevents separate testing and=
 validation.

When developing large pieces of software, I find good structure and abstrac=
tion vital for managing and understanding the code. I can live with address=
 arithmetic as long as it is contained within a component with a good inter=
face.

It is important when optimising complex problems to have good structure, so=
 that for example you can benchmark the whole program with different disjoi=
nt-set algorithms (eager-union vs WQUPC for example).

It should be possible to replace an Array based implementation with a hash =
table.

Each algorithm is based on a set of fundamental concepts, An iterator is pe=
rhaps the most commonly used example. You have forward iterator, bidirectio=
nal iterator, indexed iteratorm, random iterator. If your algorithm only re=
quires a forward iterator, you only use the interface for a forward iterato=
r, you don't require an indexed iterator. You can then supply a data-struct=
ure that meets or exceeds those requirements. An algorithm that uses an ind=
exed iterator could use a linked-list or an array as the data structure for=
 example. So we break software into containers that provide the interfaces,=
 and algorithms that use them.

I would rather use address arithmetic to achieve good performance and maint=
ain the good code structure, than break the structure to avoid address arit=
hmetic and have good performance. There is a third option, avoid address ar=
ithmetic, keep the good structure, but not have good performance, but this =
is not an option for this application.


Cheers,
Keean.


>=20
> &gt;&gt; &amp;gt; Element&amp;#39;s is different from the STL. It include=
s native &amp;#39;concepts&amp;#39; that
> &gt;&gt; &amp;gt; define axiomatic constraints on software modules.
> &gt;&gt;=20
> &gt;&gt; I have no idea what &amp;quot;axiomatic constraints on software =
modules&amp;quot; could mean.
> &gt;=20
> &gt; See:
> &gt;=20
> &gt; http://akrzemi1.wordpress.com/2012/01/11/concept-axioms-what-for/
>=20
> Ah, you meant invariants, propositions regarding program semantics
> maintained true during code transformations, e.g. for optimization.
>=20
> In more complex cases user could help the compiler giving some advices. B=
ut
> for low-level optimizations like ones of indexing this stuff is buried in
> the compiler guts.
>=20
> However, yes, Ada should have interfaces of integer types, interfaces of
> index types, interfaces of array types etc, annotated by
> pre-/post-conditions and invariants. Unfortunately, presently interfaces
> are tied with by-reference types. It is unclear when Ada will provide
> interfaces for all types. Otherwise, regarding program correctness stuff
> there is SPARK and some in Ada 2012.
>=20
> --=20
> Regards,
> Dmitry A. Kazakov
> http://www.dmitry-kazakov.de


On Tuesday, 17 July 2012 07:31:11 UTC+1, Dmitry A. Kazakov  wrote:
> On Mon, 16 Jul 2012 12:12:48 -0700 (PDT), Keean Schupke wrote:
>=20
> &gt; On Monday, 16 July 2012 19:48:22 UTC+1, Dmitry A. Kazakov  wrote:
> &gt;&gt; On Mon, 16 Jul 2012 09:39:45 -0700 (PDT), Keean Schupke wrote:
> &gt;&gt;=20
> &gt;&gt; &amp;gt; On Monday, 16 July 2012 15:57:04 UTC+1, Dmitry A. Kazak=
ov  wrote:
> &gt;&gt;=20
> &gt;&gt; &amp;gt;&amp;gt; Which itself looks like a problem. It takes no =
less than 25% of overall
> &gt;&gt; &amp;gt;&amp;gt; performance for mere shuffling data in a task u=
nrelated to stuff like
> &gt;&gt; &amp;gt;&amp;gt; recoding etc. This looks wrong to me. I would l=
ook for a better
> &gt;&gt; &amp;gt;&amp;gt; algorithm/structure.
> &gt;&gt; &amp;gt;=20
> &gt;&gt; &amp;gt; I find this comment puzzling and does not match my expe=
rience.
> &gt;&gt;=20
> &gt;&gt; It is just an observation that more than 25% of CPU load is spen=
t to heat
> &gt;&gt; the atmosphere rather than doing something useful.
> &gt;&gt;=20
> &gt;&gt; &amp;gt; What algorithm for disjoint-set do you suggest I use?
> &gt;&gt;=20
> &gt;&gt; I would try to avoid using such things if the problem is large.
> &gt;=20
> &gt; Perhaps you would like to outline your methodology for algorithm
> &gt; development?
>=20
> I don&#39;t develop algorithms I apply them.
>=20
> 25% overhead on indexing is too much.
>=20
> &gt; How would you go about designing an efficient algorithm to
> &gt; solve these kind of problems:
> &gt;=20
> &gt; =95 Network connectivity.
>=20
> 1. Open amazon site
> 2. Search &quot;netwok adapter&quot;=20
> 3. Choose 5 different with minimal price
> 4. Among them select one with maximal positive customer reports
> 5. Buy it
>=20
> (Network connectivity is not an algorithmic problem)
>=20
> &gt;&gt; No idea what you are trying to do. But indexing *must* be faster=
 than
> &gt;&gt; indirection. You should make your data structures straight.
> &gt;=20
> &gt; Straight data structures require copying data. If we were to use an =
array
> &gt; instead of a linked list then every state change would require copyi=
ng
> &gt; stuff.
>=20
> In Ada you rename array elements. This lets the compiler to choose betwee=
n
> copying and handling it in-place. BTW copying might turn more effective
> depending on the architecture.
>=20
> &gt;&gt; &amp;gt;&amp;gt; If you have performance problems with arrays, w=
hich are built-in and
> &gt;&gt; &amp;gt;&amp;gt; subject of tailored optimization, you cannot ex=
pect situation improved by
> &gt;&gt; &amp;gt;&amp;gt; puzzling the compiler with more abstract types.=
 That could only disable
> &gt;&gt; &amp;gt;&amp;gt; optimization.
> &gt;&gt; &amp;gt;=20
> &gt;&gt; &amp;gt; Maybe you are thinking of something more complex than I=
 am. An ADT is just
> &gt;&gt; &amp;gt; a private type with associated functions in a package.
> &gt;&gt;=20
> &gt;&gt; It is difficult for the compiler to optimize wrappers of array o=
perations
> &gt;&gt; like indexing. You said you failed to make GCC to optimize raw a=
rray
> &gt;&gt; access. Why wrapping that into functions should do any better?
> &gt;=20
> &gt; It doesn&#39;t do any worse... the functions are mostly single line =
and get
> &gt; inlined before the optimisation pass.
>=20
> Did you verify that?
>=20
> &gt; On the whole many positives, and no negatives (because its just as f=
ast in my tests).
>=20
> But you said you failed to optimize multiplication away.
>=20
> &gt;&gt; &amp;gt; Element&amp;#39;s is different from the STL. It include=
s native &amp;#39;concepts&amp;#39; that
> &gt;&gt; &amp;gt; define axiomatic constraints on software modules.
> &gt;&gt;=20
> &gt;&gt; I have no idea what &amp;quot;axiomatic constraints on software =
modules&amp;quot; could mean.
> &gt;=20
> &gt; See:
> &gt;=20
> &gt; http://akrzemi1.wordpress.com/2012/01/11/concept-axioms-what-for/
>=20
> Ah, you meant invariants, propositions regarding program semantics
> maintained true during code transformations, e.g. for optimization.
>=20
> In more complex cases user could help the compiler giving some advices. B=
ut
> for low-level optimizations like ones of indexing this stuff is buried in
> the compiler guts.
>=20
> However, yes, Ada should have interfaces of integer types, interfaces of
> index types, interfaces of array types etc, annotated by
> pre-/post-conditions and invariants. Unfortunately, presently interfaces
> are tied with by-reference types. It is unclear when Ada will provide
> interfaces for all types. Otherwise, regarding program correctness stuff
> there is SPARK and some in Ada 2012.
>=20
> --=20
> Regards,
> Dmitry A. Kazakov
> http://www.dmitry-kazakov.de