Efficiency of returning big objects

comp.lang.ada
 help / color / mirror / Atom feed

* Efficiency of returning big objects
@ 2006-09-25 12:14 Alex R. Mosteo
  2006-09-25 13:09 ` Dmitry A. Kazakov
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Alex R. Mosteo @ 2006-09-25 12:14 UTC (permalink / raw)


Hello, 

I had a concern over the efficiency of returning big (tagged) types, since I
like to store them in indefinite collections (a la Object'Class) instead of
storing pointers to them. It is said that, as tagged types are passed by
reference, they are always efficiently passed around. However, when you
are /returning/ and not calling, I had my doubts, so I have made this
simple testcase (see at bottom).

Here is a typical execution:

$ ./ref
Direct access: 341312
Indirect access: 76992
Pass constant: 336507
Pass value: 72573

It has been compiled with just gnatmake -O3. It reveals that a copy is
created in the stack for returning every time. I was hoping that, as long
as the type were used as a constant, a reference would be returned and
passed around. This would be very interesting in collections, since we
could do, for example:

Some_Map.Element (Key).Method_Or_Member

without efficiency penalties.

Now, in C++ I would force this behavior explicitly returning references (I
guess this is the usual practice, but I'm not sure). In Ada, I have no idea
how to match this efficient behavior.

So, what do you think? Is there some error in my reasoning or testbed? Is
this a particular shortcoming of this compiler (gnat gpl 2006), or I'm
overlooking something that makes this very non-trivial to implement?

Thanks in advance for your comments,

A. Mosteo.

-------8<---------

with Ada.Calendar; use Ada.Calendar;
with Ada.Text_Io;  use Ada.Text_Io;
procedure Ref is

   type Big_Thing is array (1 .. 64_000) of Character;

   type T is tagged record
      Dummy : Big_Thing := (others => 'c');
      X     : Integer   := 0;
   end record;

   procedure Check (Y : T) is
   begin
      if Y.X /= 0 then
         raise Constraint_Error;
      end if;
   end Check;

   Borg : T;

   function Create return T is
   begin
      return Borg;
   end Create;

   Start : Time;
   Iters : Natural;

begin
   --  Direct access, cheap.
   Iters := 0;
   Start := Clock;
   while Clock - Start < 1.0 loop
      if Borg.X /= 0 then 
         raise Constraint_Error;
      end if;
      Iters := Iters + 1;
   end loop;
   Put_line ("Direct access:" & Natural'Image (Iters));  

   --  Indirect access.
   Iters := 0;
   Start := Clock;
   while Clock - Start < 1.0 loop
      if Create.X /= 0 then 
         raise Constraint_Error;
      end if;
      Iters := Iters + 1;
   end loop;
   Put_line ("Indirect access:" & Natural'Image (Iters));  

   --  Pass by reference, cheap.
   Iters := 0;
   Start := Clock;
   while Clock - Start < 1.0 loop
      Check (Borg);
      Iters := Iters + 1;
   end loop;
   Put_line ("Pass constant:" & Natural'Image (Iters));  

   --  Pass - how?
   Iters := 0;
   Start := Clock;
   while Clock - Start < 1.0 loop
      Check (Create);
      Iters := Iters + 1;
   end loop;
   Put_line ("Pass value:" & Natural'Image (Iters));  
end Ref;

-------8<--------------------------------



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Efficiency of returning big objects
  2006-09-25 12:14 Efficiency of returning big objects Alex R. Mosteo
@ 2006-09-25 13:09 ` Dmitry A. Kazakov
  2006-09-25 13:51   ` Alex R. Mosteo
  2006-09-25 15:24 ` Georg Bauhaus
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 9+ messages in thread
From: Dmitry A. Kazakov @ 2006-09-25 13:09 UTC (permalink / raw)


On Mon, 25 Sep 2006 14:14:44 +0200, Alex R. Mosteo wrote:

> I had a concern over the efficiency of returning big (tagged) types, since I
> like to store them in indefinite collections (a la Object'Class) instead of
> storing pointers to them. It is said that, as tagged types are passed by
> reference, they are always efficiently passed around. However, when you
> are /returning/ and not calling, I had my doubts, so I have made this
> simple testcase (see at bottom).

Same with arrays - when returning a slice of a non-local string, GNAT
copies it. It does this even when the callee is inlined.
 
> It has been compiled with just gnatmake -O3. It reveals that a copy is
> created in the stack for returning every time. I was hoping that, as long
> as the type were used as a constant, a reference would be returned and
> passed around. This would be very interesting in collections, since we
> could do, for example:
> 
> Some_Map.Element (Key).Method_Or_Member
> 
> without efficiency penalties.
> 
> Now, in C++ I would force this behavior explicitly returning references (I
> guess this is the usual practice, but I'm not sure). In Ada, I have no idea
> how to match this efficient behavior.

In Ada 2006 an equivalent would be an anonymous access type of the result.
As ugly as C++, but works.

> So, what do you think? Is there some error in my reasoning or testbed? Is
> this a particular shortcoming of this compiler (gnat gpl 2006), or I'm
> overlooking something that makes this very non-trivial to implement?

You mean, checking if the scope of the name of the function result is in
the scope of the object returned? It is a global + temporal variables
optimizations, AFAIK. And the callee should be inlined. [ However, I
suppose, GNAT would make a copy even if your Create were inlined. ]

In the specific case:

   function Get (X : Container_Type; ...) return Element_Type;

the compiler could know that the result is from X. But it were still
possible something like:

   Get (Read_Container_From_File, First_Element); -- Shall copy

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Efficiency of returning big objects
  2006-09-25 13:09 ` Dmitry A. Kazakov
@ 2006-09-25 13:51   ` Alex R. Mosteo
  0 siblings, 0 replies; 9+ messages in thread
From: Alex R. Mosteo @ 2006-09-25 13:51 UTC (permalink / raw)


Dmitry A. Kazakov wrote:

> On Mon, 25 Sep 2006 14:14:44 +0200, Alex R. Mosteo wrote:
> 
>> I had a concern over the efficiency of returning big (tagged) types,
>> since I like to store them in indefinite collections (a la Object'Class)
>> instead of storing pointers to them. It is said that, as tagged types are
>> passed by reference, they are always efficiently passed around. However,
>> when you are /returning/ and not calling, I had my doubts, so I have made
>> this simple testcase (see at bottom).
> 
> Same with arrays - when returning a slice of a non-local string, GNAT
> copies it. It does this even when the callee is inlined.
>  
>> It has been compiled with just gnatmake -O3. It reveals that a copy is
>> created in the stack for returning every time. I was hoping that, as long
>> as the type were used as a constant, a reference would be returned and
>> passed around. This would be very interesting in collections, since we
>> could do, for example:
>> 
>> Some_Map.Element (Key).Method_Or_Member
>> 
>> without efficiency penalties.
>> 
>> Now, in C++ I would force this behavior explicitly returning references
>> (I guess this is the usual practice, but I'm not sure). In Ada, I have no
>> idea how to match this efficient behavior.
> 
> In Ada 2006 an equivalent would be an anonymous access type of the result.
> As ugly as C++, but works.

Yep, it is ugly and sadly(?) missing from the standard library. Also makes
current gnat go boom with relative easiness.

I've experimented with it recently and having both functions (return the
proper type and an access to it) leads to ambiguities, so I finally end
doing

Some_Function.all.blah 

to disambiguate, which is even uglier.

>> So, what do you think? Is there some error in my reasoning or testbed? Is
>> this a particular shortcoming of this compiler (gnat gpl 2006), or I'm
>> overlooking something that makes this very non-trivial to implement?
> 
> You mean, checking if the scope of the name of the function result is in
> the scope of the object returned? It is a global + temporal variables
> optimizations, AFAIK. And the callee should be inlined. [ However, I
> suppose, GNAT would make a copy even if your Create were inlined. ]

Just in case I've tried with -O3 -gnatn and -O3 -gnatN, and no differences.

Last time I did serious C++ (several years ago) I remember that returning
references was prone to give warnings about temporary copies being
necessary in some cases. I think this is precisely what we are talking:
when it is determined that a reference won't suffice because the returned
value will outlive the original reference, the warning is issued (and
copies made).



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Efficiency of returning big objects
  2006-09-25 12:14 Efficiency of returning big objects Alex R. Mosteo
  2006-09-25 13:09 ` Dmitry A. Kazakov
@ 2006-09-25 15:24 ` Georg Bauhaus
  2006-09-25 16:41   ` Alex R. Mosteo
  2006-09-25 19:31 ` Jeffrey R. Carter
  2006-09-26  0:33 ` Adam Beneschan
  3 siblings, 1 reply; 9+ messages in thread
From: Georg Bauhaus @ 2006-09-25 15:24 UTC (permalink / raw)


On Mon, 2006-09-25 at 14:14 +0200, Alex R. Mosteo wrote:

> Some_Map.Element (Key).Method_Or_Member

Could you consider using Query_Element as an efficient
alternative? Or Update_Element if Method modifies.


> So, what do you think? Is there some error in my reasoning or testbed? Is
> this a particular shortcoming of this compiler (gnat gpl 2006), or I'm
> overlooking something that makes this very non-trivial to implement?

Making T limited reveals a few things:

Compiling: ref.adb (source file time stamp: 2006-09-25 15:22:05)

    23.       return Borg;
                     |
        >>> (Ada 2005) cannot copy object of a limited type (RM-2005 6.5(5.5/2))
        >>> return by reference not permitted in Ada 2005
        >>> consider switching to return of access type

    51.       if Create.X /= 0 then
                 |
        >>> (Ada 2005) limited function call in this context is not yet implemented




-- Georg 




^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Efficiency of returning big objects
  2006-09-25 15:24 ` Georg Bauhaus
@ 2006-09-25 16:41   ` Alex R. Mosteo
  0 siblings, 0 replies; 9+ messages in thread
From: Alex R. Mosteo @ 2006-09-25 16:41 UTC (permalink / raw)

Georg Bauhaus wrote:

> On Mon, 2006-09-25 at 14:14 +0200, Alex R. Mosteo wrote:
> 
>> Some_Map.Element (Key).Method_Or_Member
> 
> Could you consider using Query_Element as an efficient
> alternative? Or Update_Element if Method modifies.

I use these when common sense indicates that extracting a copy is too
heavyweight (and of course Update_Element is necessary for in-place
modifications).

I've once used Update_Element to make a function returning an access to the
actual copy in the container. This has required using 'Unrestricted_Access
inside Update_Element, so I'm not comfortable with it (though it worked).

It's maybe a problem of personal taste, but I find that
[Query/Update]_Element breaks algorithm flow, and being that Ada mainly
follows the imperative paradigm it is a bit awkward for me. Usually I
simply want to call a method of an stored object and instead I must insert
a "huge" declare block with the Query/Update procedure stub that simply
calls the object method, and after the fact call to Query/Update. For
example, for a querying, instead of

if Collection.Element (Cursor).Some_Function then ...

I must do

declare
   Ok : Bool;
   procedure Query_Stub ...
begin
   Some_Package.Query_Element (Cursor, Query_Stub'Access);
   if Ok then ...
end;

Even if all of this tends to be hidden in bodies of opaque types, it is
still a bore when writing these types. If there is some gain (in clarity or
reuse) in encapsulating all this in a proper function, it is still overhead
(in calls and in writing). As I say, maybe it is simply a matter of
personal likings. Still, I find myself musing in a compulsive way about
these things every time :)

>> So, what do you think? Is there some error in my reasoning or testbed? Is
>> this a particular shortcoming of this compiler (gnat gpl 2006), or I'm
>> overlooking something that makes this very non-trivial to implement?
> 
> Making T limited reveals a few things:
> 
> Compiling: ref.adb (source file time stamp: 2006-09-25 15:22:05)
> 
>     23.       return Borg;
>                      |
>         >>> (Ada 2005) cannot copy object of a limited type (RM-2005
>         >>> 6.5(5.5/2)) return by reference not permitted in Ada 2005
>         >>> consider switching to return of access type
>     51.       if Create.X /= 0 then
>                  |
>         >>> (Ada 2005) limited function call in this context is not yet
>         >>> implemented

If I'm not mistaken, these are both sides of the coin: you can't return
limited types, because this implies a copy, unless (second case) this is a
new object being created in-place. 

Conceptually, it's clear that returning an object implies obtaining a copy.
I was hoping for the kind of optimization that also allows the omission of
some temporary copies (for example I think there are some rules about this
in controlled types).

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Efficiency of returning big objects
  2006-09-25 12:14 Efficiency of returning big objects Alex R. Mosteo
  2006-09-25 13:09 ` Dmitry A. Kazakov
  2006-09-25 15:24 ` Georg Bauhaus
@ 2006-09-25 19:31 ` Jeffrey R. Carter
  2006-09-26  7:45   ` Alex R. Mosteo
  2006-09-26  0:33 ` Adam Beneschan
  3 siblings, 1 reply; 9+ messages in thread
From: Jeffrey R. Carter @ 2006-09-25 19:31 UTC (permalink / raw)

Alex R. Mosteo wrote:
> 
> I had a concern over the efficiency of returning big (tagged) types, since I
> like to store them in indefinite collections (a la Object'Class) instead of
> storing pointers to them. It is said that, as tagged types are passed by
> reference, they are always efficiently passed around. However, when you
> are /returning/ and not calling, I had my doubts, so I have made this
> simple testcase (see at bottom).

This is a basic Ada concept. A function returns an object. A return 
statement gives the value of that object, so "return Borg;" means the 
returned object has the same value as Borg, but it is not the same 
object as Borg. Since the function returns a new object, it must create 
a copy.

If the object is immediately assigned to a variable, an intermediate 
copy may be optimized away in some cases. You may want to compare 
assigning Borg to assigning a function call.

Note that Borg is not a constant. That may affect the amount of 
optimization that is done.

You seem to expect the function to return an alias. In Ada, if that's 
what you want, you have to explicitly return an alias.

> So, what do you think? Is there some error in my reasoning or testbed?

I think so. The real questions should be, what are the timing (or 
storage; it's not clear if you're discussing time or space efficiency) 
requirements for this application? Does the clearest implementation 
(returning an object) meet those requirements? If not, is complicating 
the code to return aliases the only way to meet them? If not, does one 
of the other ways have less of an effect on clarity?

-- 
Jeff Carter
"Don't knock masturbation. It's sex with someone I love."
Annie Hall
45

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Efficiency of returning big objects
  2006-09-25 12:14 Efficiency of returning big objects Alex R. Mosteo
                   ` (2 preceding siblings ...)
  2006-09-25 19:31 ` Jeffrey R. Carter
@ 2006-09-26  0:33 ` Adam Beneschan
  2006-09-26  7:43   ` Alex R. Mosteo
  3 siblings, 1 reply; 9+ messages in thread
From: Adam Beneschan @ 2006-09-26  0:33 UTC (permalink / raw)

Alex R. Mosteo wrote:
> Hello,
>
> I had a concern over the efficiency of returning big (tagged) types......

> Now, in C++ I would force this behavior explicitly returning references (I
> guess this is the usual practice, but I'm not sure). In Ada, I have no idea
> how to match this efficient behavior.
>
> So, what do you think? Is there some error in my reasoning or testbed? Is
> this a particular shortcoming of this compiler (gnat gpl 2006), or I'm
> overlooking something that makes this very non-trivial to implement?

If I'm understanding you correctly, in your original program, you'd
like it if Create simply returned a pointer to Borg, and then the
parameter to Check used the pointer to Borg, and no copying would be
done.  But then the following program wouldn't work right.  Note that
according to the Ada rules, since the function Create is *not*
return-by-reference, the Ada semantics indicate that the result of
Create is an anonymous object (6.5(21)); therefore, Borg and Y are not
supposed to denote the same object, and the rule about distinct access
paths (6.2(12)) doesn't apply to the example below.  I think that
expecting a compiler to figure out what's going on, so that it wouldn't
optimize the copy away in my example but *would* optimize it away in
yours, may well be too much to ask.

                              -- Adam

with Ada.Calendar; use Ada.Calendar;
with Ada.Text_Io;  use Ada.Text_Io;
procedure Ref is

   type Big_Thing is array (1 .. 64_000) of Character;

   type T is tagged record
      Dummy : Big_Thing := (others => 'c');
      X     : Integer   := 0;
   end record;

   Borg : T;

   procedure Check (Y : T) is
   begin
      Borg.X := 1133;  --- We can't let this modify Y!!!
      if Y.X /= 0 then
         raise Constraint_Error;
      end if;
   end Check;

   function Create return T is
   begin
      return Borg;
   end Create;

   Start : Time;
   Iters : Natural;

begin
   --  Pass - how?
   Iters := 0;
   Start := Clock;
   while Clock - Start < 1.0 loop
      Check (Create);
      Iters := Iters + 1;
   end loop;
   Put_line ("Pass value:" & Natural'Image (Iters));  
end Ref;

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Efficiency of returning big objects
  2006-09-26  0:33 ` Adam Beneschan
@ 2006-09-26  7:43   ` Alex R. Mosteo
  0 siblings, 0 replies; 9+ messages in thread
From: Alex R. Mosteo @ 2006-09-26  7:43 UTC (permalink / raw)


Adam Beneschan wrote:

> Alex R. Mosteo wrote:
>> Hello,
>>
>> I had a concern over the efficiency of returning big (tagged) types......
> 
>> Now, in C++ I would force this behavior explicitly returning references
>> (I guess this is the usual practice, but I'm not sure). In Ada, I have no
>> idea how to match this efficient behavior.
>>
>> So, what do you think? Is there some error in my reasoning or testbed? Is
>> this a particular shortcoming of this compiler (gnat gpl 2006), or I'm
>> overlooking something that makes this very non-trivial to implement?
> 
> If I'm understanding you correctly, in your original program, you'd
> like it if Create simply returned a pointer to Borg, and then the
> parameter to Check used the pointer to Borg, and no copying would be
> done.  But then the following program wouldn't work right.  Note that
> according to the Ada rules, since the function Create is *not*
> return-by-reference, the Ada semantics indicate that the result of
> Create is an anonymous object (6.5(21)); therefore, Borg and Y are not
> supposed to denote the same object, and the rule about distinct access
> paths (6.2(12)) doesn't apply to the example below.  I think that
> expecting a compiler to figure out what's going on, so that it wouldn't
> optimize the copy away in my example but *would* optimize it away in
> yours, may well be too much to ask.

Yep, the more I think about it and coupled with the rules you and others
have mentioned are convincing me that this is no way so obvious as I had
thought in principle.

Now that you name it I was once bitten by the distinct access path
details... there was no error in the compiler but in my code, but it didn't
do the realization of it less painful :( Alas, learning the hard way. It
never ceases to amaze me how many things are explicitly considered and
determined in the ARM.

> 
>                               -- Adam
> 
> 
> with Ada.Calendar; use Ada.Calendar;
> with Ada.Text_Io;  use Ada.Text_Io;
> procedure Ref is
> 
>    type Big_Thing is array (1 .. 64_000) of Character;
> 
>    type T is tagged record
>       Dummy : Big_Thing := (others => 'c');
>       X     : Integer   := 0;
>    end record;
> 
>    Borg : T;
> 
>    procedure Check (Y : T) is
>    begin
>       Borg.X := 1133;  --- We can't let this modify Y!!!
>       if Y.X /= 0 then
>          raise Constraint_Error;
>       end if;
>    end Check;
> 
>    function Create return T is
>    begin
>       return Borg;
>    end Create;
> 
>    Start : Time;
>    Iters : Natural;
> 
> begin
>    --  Pass - how?
>    Iters := 0;
>    Start := Clock;
>    while Clock - Start < 1.0 loop
>       Check (Create);
>       Iters := Iters + 1;
>    end loop;
>    Put_line ("Pass value:" & Natural'Image (Iters));
> end Ref;




^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Efficiency of returning big objects
  2006-09-25 19:31 ` Jeffrey R. Carter
@ 2006-09-26  7:45   ` Alex R. Mosteo
  0 siblings, 0 replies; 9+ messages in thread
From: Alex R. Mosteo @ 2006-09-26  7:45 UTC (permalink / raw)

Jeffrey R. Carter wrote:

> Alex R. Mosteo wrote:
>> So, what do you think? Is there some error in my reasoning or testbed?
> 
> I think so. The real questions should be, what are the timing (or
> storage; it's not clear if you're discussing time or space efficiency)

I'm more worried about time efficiency when copying large objects, as long
as the objects aren't that large to risk stack overflows. In any case, here
time efficiency is a side effect of less extra copies.

> requirements for this application? Does the clearest implementation
> (returning an object) meet those requirements? If not, is complicating
> the code to return aliases the only way to meet them? If not, does one
> of the other ways have less of an effect on clarity?

At least now I'm more informed to make these tradeoffs, instead of relying
in some compiler magic that isn't happening...

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2006-09-26  7:45 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-09-25 12:14 Efficiency of returning big objects Alex R. Mosteo
2006-09-25 13:09 ` Dmitry A. Kazakov
2006-09-25 13:51   ` Alex R. Mosteo
2006-09-25 15:24 ` Georg Bauhaus
2006-09-25 16:41   ` Alex R. Mosteo
2006-09-25 19:31 ` Jeffrey R. Carter
2006-09-26  7:45   ` Alex R. Mosteo
2006-09-26  0:33 ` Adam Beneschan
2006-09-26  7:43   ` Alex R. Mosteo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox