From mboxrd@z Thu Jan 1 00:00:00 1970 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,FREEMAIL_FROM autolearn=ham autolearn_force=no version=3.4.4 X-Google-Thread: a07f3367d7,56525db28240414a X-Google-Attributes: gida07f3367d7,public,usenet X-Google-NewGroupId: yes X-Google-Language: ENGLISH,CP1252 Received: by 10.224.221.17 with SMTP id ia17mr6495442qab.2.1342919296187; Sat, 21 Jul 2012 18:08:16 -0700 (PDT) Received: by 10.66.82.2 with SMTP id e2mr591491pay.40.1342919296119; Sat, 21 Jul 2012 18:08:16 -0700 (PDT) Path: a15ni43739939qag.0!nntp.google.com!x2no11887284qaj.0!news-out.google.com!p10ni21063256pbh.1!nntp.google.com!npeer01.iad.highwinds-media.com!news.highwinds-media.com!feed-me.highwinds-media.com!border3.nntp.dca.giganews.com!border1.nntp.dca.giganews.com!border4.nntp.dca.giganews.com!border2.nntp.dca.giganews.com!nntp.giganews.com!novia!news-peer1!btnet!zen.net.uk!hamilton.zen.co.uk!xlned.com!feeder7.xlned.com!multikabel.net!newsfeed10.multikabel.net!feed.xsnews.nl!border-1.ams.xsnews.nl!plix.pl!newsfeed2.plix.pl!news.mi.ras.ru!aspen.stu.neva.ru!goblin1!goblin.stu.neva.ru!postnews.google.com!glegroupsg2000goo.googlegroups.com!not-for-mail From: Keean Schupke Newsgroups: comp.lang.ada Subject: Re: Efficient Sequential Access to Arrays Date: Tue, 17 Jul 2012 00:24:08 -0700 (PDT) Organization: http://groups.google.com Message-ID: <0bcbd157-91fa-4eaa-a797-cd1b9079ab75@googlegroups.com> References: <01983f1c-f842-4b1f-a180-bcef531dad4c@googlegroups.com> <44f2bfd0-75ed-4b9c-982e-61d5308f9c01@googlegroups.com> <1s8rnmtyqnrr$.1o78f1qaxeulh$.dlg@40tude.net> <1lgvwexsx8ovd.xdbvohz8eowp.dlg@40tude.net> <4c86d54d-dfeb-44f1-9638-3c416ea51d74@googlegroups.com> NNTP-Posting-Host: 82.44.19.199 Mime-Version: 1.0 X-Trace: posting.google.com 1342509970 13400 127.0.0.1 (17 Jul 2012 07:26:10 GMT) X-Complaints-To: groups-abuse@google.com NNTP-Posting-Date: Tue, 17 Jul 2012 07:26:10 +0000 (UTC) Cc: mailbox@dmitry-kazakov.de In-Reply-To: Complaints-To: groups-abuse@google.com Injection-Info: glegroupsg2000goo.googlegroups.com; posting-host=82.44.19.199; posting-account=T5Z2vAoAAAB8ExE3yV3f56dVATtEMNcM User-Agent: G2/1.0 X-Received-Bytes: 13737 Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: quoted-printable Date: 2012-07-17T00:24:08-07:00 List-Id: On Tuesday, 17 July 2012 07:31:11 UTC+1, Dmitry A. Kazakov wrote: > On Mon, 16 Jul 2012 12:12:48 -0700 (PDT), Keean Schupke wrote: >=20 > > On Monday, 16 July 2012 19:48:22 UTC+1, Dmitry A. Kazakov wrote: > >> On Mon, 16 Jul 2012 09:39:45 -0700 (PDT), Keean Schupke wrote: > >>=20 > >> &gt; On Monday, 16 July 2012 15:57:04 UTC+1, Dmitry A. Kazak= ov wrote: > >>=20 > >> &gt;&gt; Which itself looks like a problem. It takes no = less than 25% of overall > >> &gt;&gt; performance for mere shuffling data in a task u= nrelated to stuff like > >> &gt;&gt; recoding etc. This looks wrong to me. I would l= ook for a better > >> &gt;&gt; algorithm/structure. > >> &gt;=20 > >> &gt; I find this comment puzzling and does not match my expe= rience. > >>=20 > >> It is just an observation that more than 25% of CPU load is spen= t to heat > >> the atmosphere rather than doing something useful. > >>=20 > >> &gt; What algorithm for disjoint-set do you suggest I use? > >>=20 > >> I would try to avoid using such things if the problem is large. > >=20 > > Perhaps you would like to outline your methodology for algorithm > > development? >=20 > I don't develop algorithms I apply them. >=20 > 25% overhead on indexing is too much. >=20 > > How would you go about designing an efficient algorithm to > > solve these kind of problems: > >=20 > > =95 Network connectivity. >=20 > 1. Open amazon site > 2. Search "netwok adapter"=20 > 3. Choose 5 different with minimal price > 4. Among them select one with maximal positive customer reports > 5. Buy it >=20 > (Network connectivity is not an algorithmic problem) >=20 > >> No idea what you are trying to do. But indexing *must* be faster= than > >> indirection. You should make your data structures straight. > >=20 > > Straight data structures require copying data. If we were to use an = array > > instead of a linked list then every state change would require copyi= ng > > stuff. >=20 > In Ada you rename array elements. This lets the compiler to choose betwee= n > copying and handling it in-place. BTW copying might turn more effective > depending on the architecture. >=20 > >> &gt;&gt; If you have performance problems with arrays, w= hich are built-in and > >> &gt;&gt; subject of tailored optimization, you cannot ex= pect situation improved by > >> &gt;&gt; puzzling the compiler with more abstract types.= That could only disable > >> &gt;&gt; optimization. > >> &gt;=20 > >> &gt; Maybe you are thinking of something more complex than I= am. An ADT is just > >> &gt; a private type with associated functions in a package. > >>=20 > >> It is difficult for the compiler to optimize wrappers of array o= perations > >> like indexing. You said you failed to make GCC to optimize raw a= rray > >> access. Why wrapping that into functions should do any better? > >=20 > > It doesn't do any worse... the functions are mostly single line = and get > > inlined before the optimisation pass. >=20 > Did you verify that? >=20 > > On the whole many positives, and no negatives (because its just as f= ast in my tests). >=20 > But you said you failed to optimize multiplication away. I should probably expose the array from the package directly and operate on= it instead of using an ADT interface and test the performance. Its worth a= try anyway. However, to me this seems worse than the ADT with address arit= hmetic.=20 At least with an ADT with internal address arithmetic, you can test and val= idate the component in isolation through its interface, and package it as a= library for the 'user' code to safely use. Exposing implementation details of a software component (like directly expo= sing an array type) breaks encapsulation, and prevents separate testing and= validation. When developing large pieces of software, I find good structure and abstrac= tion vital for managing and understanding the code. I can live with address= arithmetic as long as it is contained within a component with a good inter= face. It is important when optimising complex problems to have good structure, so= that for example you can benchmark the whole program with different disjoi= nt-set algorithms (eager-union vs WQUPC for example). It should be possible to replace an Array based implementation with a hash = table. Each algorithm is based on a set of fundamental concepts, An iterator is pe= rhaps the most commonly used example. You have forward iterator, bidirectio= nal iterator, indexed iteratorm, random iterator. If your algorithm only re= quires a forward iterator, you only use the interface for a forward iterato= r, you don't require an indexed iterator. You can then supply a data-struct= ure that meets or exceeds those requirements. An algorithm that uses an ind= exed iterator could use a linked-list or an array as the data structure for= example. So we break software into containers that provide the interfaces,= and algorithms that use them. I would rather use address arithmetic to achieve good performance and maint= ain the good code structure, than break the structure to avoid address arit= hmetic and have good performance. There is a third option, avoid address ar= ithmetic, keep the good structure, but not have good performance, but this = is not an option for this application. Cheers, Keean. >=20 > >> &gt; Element&#39;s is different from the STL. It include= s native &#39;concepts&#39; that > >> &gt; define axiomatic constraints on software modules. > >>=20 > >> I have no idea what &quot;axiomatic constraints on software = modules&quot; could mean. > >=20 > > See: > >=20 > > http://akrzemi1.wordpress.com/2012/01/11/concept-axioms-what-for/ >=20 > Ah, you meant invariants, propositions regarding program semantics > maintained true during code transformations, e.g. for optimization. >=20 > In more complex cases user could help the compiler giving some advices. B= ut > for low-level optimizations like ones of indexing this stuff is buried in > the compiler guts. >=20 > However, yes, Ada should have interfaces of integer types, interfaces of > index types, interfaces of array types etc, annotated by > pre-/post-conditions and invariants. Unfortunately, presently interfaces > are tied with by-reference types. It is unclear when Ada will provide > interfaces for all types. Otherwise, regarding program correctness stuff > there is SPARK and some in Ada 2012. >=20 > --=20 > Regards, > Dmitry A. Kazakov > http://www.dmitry-kazakov.de On Tuesday, 17 July 2012 07:31:11 UTC+1, Dmitry A. Kazakov wrote: > On Mon, 16 Jul 2012 12:12:48 -0700 (PDT), Keean Schupke wrote: >=20 > > On Monday, 16 July 2012 19:48:22 UTC+1, Dmitry A. Kazakov wrote: > >> On Mon, 16 Jul 2012 09:39:45 -0700 (PDT), Keean Schupke wrote: > >>=20 > >> &gt; On Monday, 16 July 2012 15:57:04 UTC+1, Dmitry A. Kazak= ov wrote: > >>=20 > >> &gt;&gt; Which itself looks like a problem. It takes no = less than 25% of overall > >> &gt;&gt; performance for mere shuffling data in a task u= nrelated to stuff like > >> &gt;&gt; recoding etc. This looks wrong to me. I would l= ook for a better > >> &gt;&gt; algorithm/structure. > >> &gt;=20 > >> &gt; I find this comment puzzling and does not match my expe= rience. > >>=20 > >> It is just an observation that more than 25% of CPU load is spen= t to heat > >> the atmosphere rather than doing something useful. > >>=20 > >> &gt; What algorithm for disjoint-set do you suggest I use? > >>=20 > >> I would try to avoid using such things if the problem is large. > >=20 > > Perhaps you would like to outline your methodology for algorithm > > development? >=20 > I don't develop algorithms I apply them. >=20 > 25% overhead on indexing is too much. >=20 > > How would you go about designing an efficient algorithm to > > solve these kind of problems: > >=20 > > =95 Network connectivity. >=20 > 1. Open amazon site > 2. Search "netwok adapter"=20 > 3. Choose 5 different with minimal price > 4. Among them select one with maximal positive customer reports > 5. Buy it >=20 > (Network connectivity is not an algorithmic problem) >=20 > >> No idea what you are trying to do. But indexing *must* be faster= than > >> indirection. You should make your data structures straight. > >=20 > > Straight data structures require copying data. If we were to use an = array > > instead of a linked list then every state change would require copyi= ng > > stuff. >=20 > In Ada you rename array elements. This lets the compiler to choose betwee= n > copying and handling it in-place. BTW copying might turn more effective > depending on the architecture. >=20 > >> &gt;&gt; If you have performance problems with arrays, w= hich are built-in and > >> &gt;&gt; subject of tailored optimization, you cannot ex= pect situation improved by > >> &gt;&gt; puzzling the compiler with more abstract types.= That could only disable > >> &gt;&gt; optimization. > >> &gt;=20 > >> &gt; Maybe you are thinking of something more complex than I= am. An ADT is just > >> &gt; a private type with associated functions in a package. > >>=20 > >> It is difficult for the compiler to optimize wrappers of array o= perations > >> like indexing. You said you failed to make GCC to optimize raw a= rray > >> access. Why wrapping that into functions should do any better? > >=20 > > It doesn't do any worse... the functions are mostly single line = and get > > inlined before the optimisation pass. >=20 > Did you verify that? >=20 > > On the whole many positives, and no negatives (because its just as f= ast in my tests). >=20 > But you said you failed to optimize multiplication away. >=20 > >> &gt; Element&#39;s is different from the STL. It include= s native &#39;concepts&#39; that > >> &gt; define axiomatic constraints on software modules. > >>=20 > >> I have no idea what &quot;axiomatic constraints on software = modules&quot; could mean. > >=20 > > See: > >=20 > > http://akrzemi1.wordpress.com/2012/01/11/concept-axioms-what-for/ >=20 > Ah, you meant invariants, propositions regarding program semantics > maintained true during code transformations, e.g. for optimization. >=20 > In more complex cases user could help the compiler giving some advices. B= ut > for low-level optimizations like ones of indexing this stuff is buried in > the compiler guts. >=20 > However, yes, Ada should have interfaces of integer types, interfaces of > index types, interfaces of array types etc, annotated by > pre-/post-conditions and invariants. Unfortunately, presently interfaces > are tied with by-reference types. It is unclear when Ada will provide > interfaces for all types. Otherwise, regarding program correctness stuff > there is SPARK and some in Ada 2012. >=20 > --=20 > Regards, > Dmitry A. Kazakov > http://www.dmitry-kazakov.de