From mboxrd@z Thu Jan 1 00:00:00 1970 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,FREEMAIL_FROM autolearn=ham autolearn_force=no version=3.4.4 X-Google-Thread: 103376,103803355c3db607 X-Google-NewGroupId: yes X-Google-Attributes: gida07f3367d7,domainid0,public,usenet X-Google-Language: ENGLISH,ASCII-7-bit Received: by 10.68.241.162 with SMTP id wj2mr1968558pbc.2.1340974299436; Fri, 29 Jun 2012 05:51:39 -0700 (PDT) Path: l9ni33310pbj.0!nntp.google.com!news2.google.com!postnews.google.com!glegroupsg2000goo.googlegroups.com!not-for-mail From: Keean Schupke Newsgroups: comp.lang.ada Subject: Re: GNAT (GCC) Profile Guided Compilation Date: Fri, 29 Jun 2012 05:51:38 -0700 (PDT) Organization: http://groups.google.com Message-ID: <8ab283e5-6aaa-47f6-acee-fefdb60afb80@googlegroups.com> References: <62d099a8-d754-4c13-b8c8-d8eea2d6a764@googlegroups.com> <87ebfdda-559f-4661-a36d-7933f2c89d7e@googlegroups.com> NNTP-Posting-Host: 82.44.19.199 Mime-Version: 1.0 X-Trace: posting.google.com 1340974299 22125 127.0.0.1 (29 Jun 2012 12:51:39 GMT) X-Complaints-To: groups-abuse@google.com NNTP-Posting-Date: Fri, 29 Jun 2012 12:51:39 +0000 (UTC) Cc: stefan-lucks@see-the.signature In-Reply-To: Complaints-To: groups-abuse@google.com Injection-Info: glegroupsg2000goo.googlegroups.com; posting-host=82.44.19.199; posting-account=T5Z2vAoAAAB8ExE3yV3f56dVATtEMNcM User-Agent: G2/1.0 Content-Type: text/plain; charset=ISO-8859-1 Date: 2012-06-29T05:51:38-07:00 List-Id: On Friday, 29 June 2012 13:26:23 UTC+1, (unknown) wrote: > On Fri, 29 Jun 2012, Keean Schupke wrote: > > > On Friday, 29 June 2012 11:01:30 UTC+1, Keean Schupke wrote: > > > > Maintaining the address of the current cell is faster than using > > > indexed addressing (mainly because of the extra parameter that needs > > > to be passed around increasing the pressure on register allocation). > > I guess you did some benchmarking, regarding that issue? > > > > I would be interested to understand how an algorithm can be > > > implemented in a more Ada way? > > Faster or slower, you are extensively using access types (Element_Access > and Set_Access). This is something Ada programmers usually try to avoid, > except when there is an urgent need for some dynamic memory management > (e.g., implementing one's own Container library). > > > generic > > type Element_Type is limited private; > > type Element_Access is not null access all Element_Type; > > type Set_Index is range <>; > > package Eager_Disjointsets is > > type Set_Type is limited private; > > type Set_Access is not null access all Set_Type; > > What, exactly, do you use Element_Access for? And why, exactly, do you > bother to define Set_Access? I use Element_Access because the element is going to be a record that needs to be updated in place. To copy the record is too slow. If you look at the get method we want to do: Get(Set, Index).Something := xxx The record stored in each Node of the set datatype will be limited, so you cannot return it from a function, you can only return an access to it. Why Set_Access, because in order to avoid an access-level problem you cannot define the element-access at a different block level from the set-access, so in order to safely return an element-access we must pass a set-access in. > > > function Find( > > Set : in Set_Access; > > Index : in Set_Index > > ) return Set_Index; > > pragma Inline_Always(Find); > > Why don't you just write the following? (Same for your other operations.) > > function find > (Set : in Set_Type; > Index : in Set_Index) return Set_Index; Because of the above requirements of the Get function, it is better to make all the methods take a Set_Access for consistency. The we only need to pass a Set_Access around, never a set, forcing pass-by-reference and not copying. > > [...] > > > private > > type Node_Type is limited record > > Canonical : Set_Index; > > Successor : Set_Index; > > Size : Natural; > > Value : aliased Element_Type; > > end record; > > type Set_Type is array (Set_Index) of Node_Type; > > end Eager_Disjointsets; > > Why, exactly, do you make Value aliased? Is this related to your "faster > than indexed addressing" point above? This is because we need to return an access to the value from 'Get' so that it can be updated in place. Remember 'Element_Type' will be limited. > > This may be hard for the compiler to properly optimize: An aliased > component Element_Type in a record ... and then there is an array of such > records, and all is inside a generic package. Actually the compiler does pretty well. 40k per second it matches the C++ version. It is only when trying to use profile-guided optimisation that Ada is not improving. > > If you really need Value to be aliased (do you?), then you can try out > rewriting Set_Type: > > type Set_Index_Array is array(Set_Index) of Set_Index; > type Size_Array is array(Set_Index) of Natural; > type Value_Array is array(Set_Index) of aliased Element_Type; > > type Set_Type is record > Canonical, Successor: Set_Index_Array; > Size: Size_Array; > Value: Value_Array; > end record; I dont think this will be faster as we now have to compute the array index offset for accessing each property. Happy to benchmark for you... > > I have no idea if the above modification helps to speed up the > computation, or not. It may backfire, I don't know. > > > package body Eager_Disjointsets is > > function Find( > > Set : in Set_Access; > > Index : in Set_Index > > ) return Set_Index is begin > > return Set(Index).Canonical; > > end Find; > > Now, here you would have to write "return Set.Canonical(Index);" > > > -- (C)2011 Keean Schupke, all rights reserved. > > generic > > Size : Int; > > type Element_Type is limited private; > > type Element_Access is not null access all Element_Type; > > package Eager_Disjointsets is > > type Set_Type is limited private; > > type Set_Access is not null access all Set_Type; > > type Node_Type is limited private; > > type Cursor is access all Node_Type; > > Is Int another name for Integer? Sorry, its a type name for the local best performing integer type, which is Integer_64 in my case. > > I understand that for performance reason you introduced Cursor as access > all Node_Type and later Set_Array as aliased array of Node_Type. > > But again, what do you need Element_Access and Set_Access for? See above. > > > package Node_Address is new > > System.Address_To_Access_Conversions(Node_Type); > > Probably, the compiler finds that out by itself -- but you may want to > make it explicit that Node_Size doesn't change: > > > Node_Size : constant Storage_Offset := Node_Type'Object_Size > > / System.Storage_Elements.Storage_Element'Size; Yes, thats a good spot - I missed that. > > Also, if Node_Size isn't a power of 2, it may be difficult for the > compiler to decide on the alignment of objects of type Node_Type -- with > possible effects on caching. On one hand, it may be good to pack the > objects closely together, to make as many fit into a single cache line as > possible. On the other hand, it is bad if a part of a certain node is in > one cache line, another part in another cache line. Packing is slower, the whole data set is small enough to fit in the cache (<5k) so alignment wins over packing. > > > function Find( > > Node : in Cursor > > ) return Cursor is begin > > return Cursor(Node.Canonical); > > end Find; > > Not related to performance, but why don't you just write > "return Node.Canonical;" as Node.Canonical is of type Cursor, anyway? Again, good spot, Canonical was not always a Cursor, so I missed it. > > > -- > ---- Stefan.Lucks (at) uni-weimar.de, University of Weimar, Germany ---- > > ------ I love the taste of Cryptanalysis in the morning! ------ Cheers, Keean.