From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level: 
X-Spam-Status: No, score=-1.3 required=5.0 tests=BAYES_00,INVALID_MSGID
	autolearn=no autolearn_force=no version=3.4.4
X-Google-Language: ENGLISH,ASCII-7-bit
X-Google-Thread: 103376,cf34599caf2fa938
X-Google-Attributes: gid103376,public
From: griest-tom@cs.yale.edu (Tom Griest)
Subject: Re: GNAT function calling overhead
Date: 1995/04/07
Message-ID: <3m40vpINN3p8@RA.DEPT.CS.YALE.EDU>#1/1
X-Deja-AN: 100070397
references: <3m0nv1$pv2@nef.ens.fr> <3m0psq$fl2@stout.entertain.com>
organization: Yale University Computer Science Dept., New Haven, CT 06520-2158
newsgroups: comp.lang.ada
Date: 1995-04-07T00:00:00+00:00
List-Id: <comp.lang.ada>

In article <3m0nv1$pv2@nef.ens.fr>, Duncan Sands <sands@clipper.ens.fr> wrote:
[snipped stuff about 10_000 matrix multiplies in an application]
>>In summary: 55 seconds with function calling overhead.
>>            44 seconds without function calling overhead.
>>
>>Now, a 15 by 15 matrix means 225 entries.  225 entries at,
>>say, 8 bytes an entry makes a bit less than 2k.  So, depending on
>>whether GNAT takes function parameters by reference or by copy,
>>this makes anything between 2k and, say, 10k bytes to be copied
>>on each function call.
>>
>>Does this explain the time difference?  It shouldn't!  The amount
>>of time spent copying memory should be completely overwhelmed by
>>the amount of time taken to do the floating point operations!

First, I doubt very much that the matrix is passed by copy.
Basically, any composite object larger than 8 bytes will be passed
by reference.

Second, what are your assumptions about the time to perform a
floating point multiply on a 486DX?   The 486 ref manual indicates
that fpadds are typically 10 clocks and an fpmult is around 16 clocks.
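
As a rough check on those figures, here is a back-of-the-envelope
sketch of the pure floating-point cost of the workload described in
the quoted post. It assumes a naive triple-loop product of two 15 by 15
matrices and uses the typical clock counts above. The CPU clock rate
(CLOCK_HZ) is my assumption, since the post does not give one, and the
function names are only illustrative. Loop, indexing and memory traffic
are ignored, so the real figure would be higher.

```python
# Rough floating-point budget for 10_000 products of 15 by 15 matrices.
# Clock counts are the typical 486 figures quoted in the post.

N = 15                  # matrix dimension (from the quoted post)
PRODUCTS = 10_000       # matrix multiplies in the application
FADD_CLOCKS = 10        # typical 486 floating-point add
FMUL_CLOCKS = 16        # typical 486 floating-point multiply
CLOCK_HZ = 33_000_000   # ASSUMPTION: the post does not state the CPU clock


def clocks_per_product(n: int) -> int:
    """FP clocks for one naive n-by-n matrix product (FP ops only)."""
    multiplies = n ** 3          # one multiply per (i, j, k) triple
    adds = n * n * (n - 1)       # n - 1 adds to sum each n-term dot product
    return multiplies * FMUL_CLOCKS + adds * FADD_CLOCKS


def fp_seconds(n: int, products: int, clock_hz: int) -> float:
    """Seconds spent on FP adds and multiplies for all products."""
    return clocks_per_product(n) * products / clock_hz


if __name__ == "__main__":
    print(clocks_per_product(N), "FP clocks per product")
    print(round(fp_seconds(N, PRODUCTS, CLOCK_HZ), 1), "s of pure FP work")
```

With these numbers one product costs 85,500 FP clocks, so the
arithmetic alone is a large share of a 44 second run at clock rates
typical for a 486. Change CLOCK_HZ to match the actual machine.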

Since the formal parameters for your function are unconstrained types,
there is probably a dynamic allocation/initialization/deallocation
of the dope vectors for each of the parameters.  This might account
for some of the overhead.
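
To put a scale on that overhead, the quoted timings can be turned into
a per-call budget. This is only a sketch: it assumes one function call
per matrix multiply (CALLS), and the clock rate (CLOCK_HZ) is again my
assumption, not something stated in the post. The function names are
hypothetical.

```python
# Per-call overhead implied by the quoted timings (55 s vs 44 s).

TIME_WITH_CALLS_S = 55.0   # from the quoted post
TIME_INLINED_S = 44.0      # from the quoted post
CALLS = 10_000             # ASSUMPTION: one call per matrix multiply
CLOCK_HZ = 33_000_000      # ASSUMPTION: CPU clock not given in the post


def overhead_per_call_s(with_calls: float, inlined: float, calls: int) -> float:
    """Extra wall-clock seconds attributable to each function call."""
    return (with_calls - inlined) / calls


def overhead_per_call_clocks(with_calls: float, inlined: float,
                             calls: int, clock_hz: int) -> float:
    """The same overhead expressed in CPU clocks."""
    return overhead_per_call_s(with_calls, inlined, calls) * clock_hz


if __name__ == "__main__":
    per_call = overhead_per_call_s(TIME_WITH_CALLS_S, TIME_INLINED_S, CALLS)
    print(per_call * 1000.0, "ms per call")
    print(overhead_per_call_clocks(TIME_WITH_CALLS_S, TIME_INLINED_S,
                                   CALLS, CLOCK_HZ), "clocks per call")
```

Under those assumptions the difference is about 1.1 ms per call, which
is far more than a bare call and return would cost. That is why
per-call work such as allocation is worth looking for in the listing.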

To really give you an answer, you should either get an assembly
listing (-S flag) or provide us the source of both versions.  It is
very hard to give a detailed answer (as opposed to the sort
Colin likes to supply :-)) without this information.

-Tom