* GNAT function calling overhead @ 1995-04-06 0:00 Duncan Sands 1995-04-06 0:00 ` Norman H. Cohen ` (4 more replies) 0 siblings, 5 replies; 20+ messages in thread From: Duncan Sands @ 1995-04-06 0:00 UTC (permalink / raw) system: DOS 486 (with math coprocessor), gcc 2.6.3, GNAT 2.03 Essentially the question is: why so much function calling overhead in GNAT? I'm writing a set of matrix factorization routines (Schur etc) so of course I need routines for multiplying matrices etc. For me a matrix is type Matrix is array(Positive range <>, Positive range <>) of Float; I overloaded "*" for matrix multiplication: function "*"(Left : in Matrix; Right : in Matrix) return Matrix; Multiplying two 15 by 15 matrices 10_000 times using this function takes about 55 seconds on my machine. The algorithm is the obvious one: loop over rows and columns, add up the appropriate products and assign them. I then "inlined" this function: rather than using "*", I put the code for "*" directly into my 10_000 times loop, of course renaming Left and Right to the names of my matrices, and assigning directly to the matrix which is to hold the answer. In this way I eliminated the function calling overhead. Using this method, multiplying two 15 by 15 matrices 10_000 times takes about 44 seconds. All this was done with optimisation (-O3) and -gnatp (i.e. no range checking etc). In summary: 55 seconds with function calling overhead. 44 seconds without function calling overhead. Now, a 15 by 15 matrix means 225 entries. 225 entries at, say, 8 bytes an entry makes a bit less than 2k. So, depending on whether GNAT takes function parameters by reference or by copy, this makes anything between 2k and, say, 10k bytes to be copied on each function call. Does this explain the time difference? It shouldn't! The amount of time spent copying memory should be completely overwhelmed by the amount of time taken to do the floating point operations! 
That is, for each of the 225 entries there are 15 floating point multiplications to be performed. The amount of time taken to copy the 225 entries, even if you need to copy them several times, should be MUCH smaller than the amount of time spent in the calculation. But the timings above indicate that function calling overhead makes up something like 25% of the time taken! So, the question is: why so much function calling overhead in GNAT? Can anyone please enlighten me? Thanks a lot, Duncan Sands. PS: The corresponding C code takes about 6 seconds. This surprises me too. ^ permalink raw reply [flat|nested] 20+ messages in thread
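[Editor's note: for readers who want to reproduce Duncan's experiment, here is a rough C harness. The 15x15 size, Float element type, "obvious" row-by-column algorithm, and 10_000 repetitions follow the post; the variable names and initial values are the editor's inventions, not code from the thread.]

```c
#include <time.h>

#define N 15

static float left[N][N], right[N][N], result[N][N];

/* One 15x15 multiply, the "obvious" row-by-column algorithm
   described in the post. */
static void multiply(void) {
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            float sum = 0.0f;
            for (int k = 0; k < N; k++)
                sum += left[i][k] * right[k][j];
            result[i][j] = sum;
        }
}

/* Time `reps` back-to-back multiplications; returns elapsed seconds. */
double benchmark(int reps) {
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            left[i][j]  = (float)(i + j);  /* arbitrary test values */
            right[i][j] = (float)(i - j);
        }
    clock_t t0 = clock();
    for (int r = 0; r < reps; r++)
        multiply();
    return (double)(clock() - t0) / CLOCKS_PER_SEC;
}
```

Compiled with `gcc -O3`, `benchmark(10000)` is the C-side analogue of the 10_000-iteration Ada loop, and can be compared against the same loop with the multiply wrapped in a function that returns its result by value.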
* Re: GNAT function calling overhead 1995-04-06 0:00 GNAT function calling overhead Duncan Sands @ 1995-04-06 0:00 ` Norman H. Cohen 1995-04-06 0:00 ` Colin James III ` (3 subsequent siblings) 4 siblings, 0 replies; 20+ messages in thread From: Norman H. Cohen @ 1995-04-06 0:00 UTC (permalink / raw) In article <3m0nv1$pv2@nef.ens.fr>, sands@clipper.ens.fr (Duncan Sands) writes: |> system: DOS 486 (with math coprocessor), gcc 2.6.3, GNAT 2.03 |> |> Essentially the question is: why so much function calling overhead |> in GNAT? |> |> I'm writing a set of matrix factorization routines (Schur etc) so |> of course I need routines for multiplying matrices etc. |> For me a matrix is |> type Matrix is array(Positive range <>, Positive range <>) of Float; |> |> I overloaded "*" for matrix multiplication: |> function "*"(Left : in Matrix; Right : in Matrix) return Matrix; Functions returning results of an unconstrained array type are notoriously expensive. Because the compiler cannot determine the size of the result beforehand, it cannot leave space for it in an ordinary stack frame, so the result must be put somewhere else and then copied to its final destination after the function returns. Unless your compiler is clever enough to realize that your local variable inside the function, say Result, is the only variable used in a return statement, it will probably copy twice: from Result to "somewhere else" and from "somewhere else" to the variable in the calling subprogram to which you assign the function result. Here are some other experiments you could try: 1. Use a procedure procedure Multiply (Left, Right: in Matrix; Product: out Matrix); instead of a function. (This makes the caller responsible for knowing the size of the result and declaring an object with those dimensions to be passed as the third actual parameter.) 2. (Poor SW engineering, but a worthwhile experiment:) Restrict your function to work with a CONSTRAINED array subtype: subtype Matrix_15 is Matrix (1 ..
15, 1 .. 15); function "*" (Left : in Matrix_15; Right : in Matrix_15) return Matrix_15; |> Multiplying two 15 by 15 matrices 10_000 times using this function |> takes about 55 seconds on my machine. The algorithm is the obvious |> one: loop over rows and columns, add up the appropriate products and |> assign them. |> |> I then "inlined" this function: rather than using "*", I put the code |> for "*" directly into my 10_000 times loop, of course renaming Left |> and Right to the names of my matrices, and assigning directly to the |> matrix which is to hold the answer. In this way I eliminated the |> function calling overhead. Using this method, multiplying two 15 by |> 15 matrices 10_000 times takes about 44 seconds. If you wrote for I in Left'Range(1) loop for J in Right'Range(2) loop for K in Left'Range(2) loop Ultimate_Target(I,J) := Ultimate_Target(I,J) + Left(I,K) * Right(K,J); end loop; end loop; end loop; rather than for I in Left'Range(1) loop for J in Right'Range(2) loop for K in Left'Range(2) loop Result(I,J) := Result(I,J) + Left(I,K) * Right(K,J); end loop; end loop; end loop; Ultimate_Target := Result; (as seems sensible) then you eliminated more than the function-call overhead: You also eliminated the overhead that was originally associated with the ":=" in Ultimate_Target := Left * Right; |> All this was done with optimisation (-O3) and -gnatp (i.e. no range |> checking etc). |> |> In summary: 55 seconds with function calling overhead. |> 44 seconds without function calling overhead. |> |> Now, a 15 by 15 matrix means 225 entries. 225 entries at, |> say, 8 bytes an entry makes a bit less than 2k. So, depending on |> whether GNAT takes function parameters by reference or by copy, |> this makes anything between 2k and, say, 10k bytes to be copied |> on each function call. |> |> Does this explain the time difference? It shouldn't!
The amount |> of time spent copying memory should be completely overwhelmed by |> the amount of time taken to do the floating point operations! As modern processors have become faster and faster, loads and stores have become the bottleneck in many computations. I don't know the details of timings on the 486, but on the PowerPC architecture, once you fill your floating-point pipeline with multiply-adds the way that the inner loop above does, you get one righthand side expression completely evaluated on each cycle, PROVIDED that you can get your operands into floating-point registers fast enough. ("Floating-point registers?" ask the Intel users, "What are floating-point registers?") Accounting for leading-edge and trailing-edge effects, the fifteen iterations of the inner loop could take on the order of 20-25 cycles. In contrast, a single load of a value not in cache could cost you on the order of 10 cycles. (Once you pay that penalty, an entire cache line is read in, which, assuming row-major order, buys you something as you traverse a row of Left, but not as you traverse a column of Right.) If you get unlucky, you find that parts of different matrices you are touching at about the same time (say row I of Result and Row I of Left), or different parts of the same matrix that you are touching at about the same time (say Right(K,J) and Right(K+1,J)) are mapped to the same cache line. Then, unless you have a highly associative cache, you encounter an inordinate number of cache misses and slow your computation down dramatically. Memory latencies play such an important role in the running time of numerical algorithms that professional matrix multipliers almost never use the familiar "for i/for j/for k" loops that I wrote above. 
We are more likely to see something like for K in Left'Range(2) loop for I in Left'Range(1) loop for J in Right'Range(2) loop Result(I,J) := Result(I,J) + Left(I,K) * Right(K,J); end loop; end loop; end loop; or, better yet, if it is convenient to keep the matrices that will be used as left operands in transposed form, Result(I,J) := Result(I,J) + Transpose_Of_Left(K,I) * Right(K,J); which (again assuming row-major ordering of array components) does a much better job of reusing the contents of cache lines once they have been loaded from memory and the contents of registers once they have been loaded from cache. It is because these effects are so powerful that Fortran preprocessors performing these kinds of transformations are able to increase certain SPECfp benchmark scores by orders of magnitude. |> That is, for each of the 225 entries there are 15 floating point |> multiplications to be performed. The amount of time taken to |> copy the 225 entries, even if you need to copy them several times, |> should be MUCH smaller than the amount of time spent in the |> calculation. But the timings above indicate that function |> calling overhead makes up something like 25% of the time taken! Well, 20% (11/55), but in any event I'm not surprised. Adding and multiplying floating-point numbers is the easy part. It's copying them that can slow you down. -- Norman H. Cohen ncohen@watson.ibm.com ^ permalink raw reply [flat|nested] 20+ messages in thread
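[Editor's note: Cohen's point about loop order and cache-line reuse can be illustrated in C (his examples are Ada; this sketch and its names are the editor's). Both orderings compute the same product, but in the i-k-j version the inner loop walks rows of both the right operand and the result, so consecutive iterations reuse cache lines that were just loaded, whereas the i-j-k version strides down a column of the right operand.]

```c
#include <string.h>

#define N 15

/* Naive i-j-k order: the inner loop walks a COLUMN of b, so
   (row-major) nearly every access lands on a different cache line. */
void mat_mul_ijk(float a[N][N], float b[N][N], float c[N][N]) {
    memset(c, 0, sizeof(float) * N * N);
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            for (int k = 0; k < N; k++)
                c[i][j] += a[i][k] * b[k][j];
}

/* i-k-j order: the inner loop walks ROWS of b and c, so each loaded
   cache line serves several consecutive iterations. */
void mat_mul_ikj(float a[N][N], float b[N][N], float c[N][N]) {
    memset(c, 0, sizeof(float) * N * N);
    for (int i = 0; i < N; i++)
        for (int k = 0; k < N; k++) {
            float aik = a[i][k];   /* hoist the invariant operand */
            for (int j = 0; j < N; j++)
                c[i][j] += aik * b[k][j];
        }
    }
```

For 15x15 matrices the whole working set may already fit in cache, so the effect is most visible at larger sizes; the transformation is the same one Cohen describes Fortran preprocessors applying automatically.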
* Re: GNAT function calling overhead 1995-04-06 0:00 GNAT function calling overhead Duncan Sands 1995-04-06 0:00 ` Norman H. Cohen @ 1995-04-06 0:00 ` Colin James III 1995-04-06 0:00 ` Robb Nebbe ` (4 more replies) 1995-04-07 0:00 ` Robert Dewar ` (2 subsequent siblings) 4 siblings, 5 replies; 20+ messages in thread From: Colin James III @ 1995-04-06 0:00 UTC (permalink / raw) In article <3m0nv1$pv2@nef.ens.fr>, Duncan Sands <sands@clipper.ens.fr> wrote: >system: DOS 486 (with math coprocessor), gcc 2.6.3, GNAT 2.03 > >Essentially the question is: why so much function calling overhead >in GNAT? > >I'm writing a set of matrix factorization routines (Schur etc) so >of course I need routines for multiplying matrices etc. >For me a matrix is > type Matrix is array(Positive range <>, Positive range <>) of Float; > >I overloaded "*" for matrix multiplication: > function "*"(Left : in Matrix; Right : in Matrix) return Matrix; > >Multiplying two 15 by 15 matrices 10_000 times using this function >takes about 55 seconds on my machine. The algorithm is the obvious >one: loop over rows and columns, add up the appropriate products and >assign them. > >I then "inlined" this function: rather than using "*", I put the code >for "*" directly into my 10_000 times loop, of course renaming Left >and Right to the names of my matrices, and assigning directly to the >matrix which is to hold the answer. In this way I eliminated the >function calling overhead. Using this method, multiplying two 15 by >15 matrices 10_000 times takes about 44 seconds. > >All this was done with optimisation (-O3) and -gnatp (i.e. no range >checking etc). > >In summary: 55 seconds with function calling overhead. > 44 seconds without function calling overhead. > >Now, a 15 by 15 matrix means 225 entries. 225 entries at, >say, 8 bytes an entry makes a bit less than 2k. 
So, depending on >whether GNAT takes function parameters by reference or by copy, >this makes anything between 2k and, say, 10k bytes to be copied >on each function call. > >Does this explain the time difference? It shouldn't! The amount >of time spent copying memory should be completely overwhelmed by >the amount of time taken to do the floating point operations! >That is, for each of the 225 entries there are 15 floating point >multiplications to be performed. The amount of time taken to >copy the 225 entries, even if you need to copy them several times, >should be MUCH smaller than the amount of time spent in the >calculation. But the timings above indicate that function >calling overhead makes up something like 25% of the time taken! > >So, the question is: why so much function calling overhead in GNAT? > >Can anyone please enlighten me? > >Thanks a lot, > >Duncan Sands. > >PS: The corresponding C code takes about 6 seconds. This surprises >me too. At the most abstract level, it's because GNAT is a failed government project which was never finished and was mismanaged from the start by a bunch of flaky educators posing as "capable professionals". At the most detailed level, it's because GNAT emits poorly optimized, and hence very evil, C code. And what makes anyone think that ACT will change anything with regard to GNAT support, documentation or enhancements. The ACT principals have already demonstrated that they failed with GNAT, by even starting ACT. In other words, if GNAT were such a smashing success and quality product, then there would be no need for ACT. Good grief, what moral and intellectual dishonesty ! ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: GNAT function calling overhead 1995-04-06 0:00 ` Colin James III @ 1995-04-06 0:00 ` Robb Nebbe 1995-04-07 0:00 ` Robert Dewar 1995-04-07 0:00 ` Duncan Sands 1995-04-06 0:00 ` Samuel Tardieu ` (3 subsequent siblings) 4 siblings, 2 replies; 20+ messages in thread From: Robb Nebbe @ 1995-04-06 0:00 UTC (permalink / raw) In article <3m0nv1$pv2@nef.ens.fr>, Duncan Sands <sands@clipper.ens.fr> wrote: >PS: The corresponding C code takes about 6 seconds. This surprises >me too. The main reason is most likely that the Ada code is not at all equivalent to the C code. If you declare type Matrix is array( 0 .. 14, 0 .. 14 ) of Float; and write the loops in a way that allows the compiler to optimize out the bounds checks (not sure if GNAT does this) then you should get the same result as with C. Robb Nebbe ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: GNAT function calling overhead 1995-04-06 0:00 ` Robb Nebbe @ 1995-04-07 0:00 ` Robert Dewar 1995-04-07 0:00 ` Duncan Sands 1 sibling, 0 replies; 20+ messages in thread From: Robert Dewar @ 1995-04-07 0:00 UTC (permalink / raw) Robb comments that the bounds checks can make a difference, yes indeed! and GNAT is not yet doing much on optimizing bounds checks. But if you look at the post carefully, you will see that the comparison was with checks turned off, at least that is the way I read it, in which case more subtle things are at work! ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: GNAT function calling overhead 1995-04-06 0:00 ` Robb Nebbe 1995-04-07 0:00 ` Robert Dewar @ 1995-04-07 0:00 ` Duncan Sands 1 sibling, 0 replies; 20+ messages in thread From: Duncan Sands @ 1995-04-07 0:00 UTC (permalink / raw) In article <1995Apr6.163740@di.epfl.ch>, Robb.Nebbe@di.epfl.ch (Robb Nebbe) writes: |> In article <3m0nv1$pv2@nef.ens.fr>, Duncan Sands <sands@clipper.ens.fr> wrote: |> >PS: The corresponding C code takes about 6 seconds. This surprises |> >me too. |> |> |> The main reason is most likely that the Ada code is not at all equivalent |> to the C code. |> |> If you declare |> |> type Matrix is array( 0 .. 14, 0 .. 14 ) of Float; |> |> and write the loops in a way that allow the compiler to optimize out |> the bounds checks (not sure if GNAT does this) then you should get |> the same result as with C; Thanks for your comments. Actually I compiled with all checking suppressed (range checking and others) for exactly this reason. Therefore range checking is not the culprit. I'm not too sure what the culprit could possibly be. In any case, the comparison I made with C was quick and nasty so should be taken with a pinch of salt. Duncan Sands. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: GNAT function calling overhead 1995-04-06 0:00 ` Colin James III 1995-04-06 0:00 ` Robb Nebbe @ 1995-04-06 0:00 ` Samuel Tardieu 1995-04-07 0:00 ` Tom Griest ` (2 subsequent siblings) 4 siblings, 0 replies; 20+ messages in thread From: Samuel Tardieu @ 1995-04-06 0:00 UTC (permalink / raw) From: cjames@stout.entertain.com (Colin James III) Subject: Re: GNAT function calling overhead Newsgroups: comp.lang.ada Date: 6 Apr 1995 07:21:30 -0600 Organization: A poorly-installed InterNetNews site ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Colin, go and configure your site first, then you can write your crap. Sam -- "La cervelle des petits enfants, ca doit avoir comme un petit gout de noisette" Charles Baudelaire ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: GNAT function calling overhead 1995-04-06 0:00 ` Colin James III 1995-04-06 0:00 ` Robb Nebbe 1995-04-06 0:00 ` Samuel Tardieu @ 1995-04-07 0:00 ` Tom Griest 1995-04-07 0:00 ` Robert Dewar 1995-04-07 0:00 ` Robert Dewar 1995-04-07 0:00 ` Philip Brashear 4 siblings, 1 reply; 20+ messages in thread From: Tom Griest @ 1995-04-07 0:00 UTC (permalink / raw) In article <3m0nv1$pv2@nef.ens.fr>, Duncan Sands <sands@clipper.ens.fr> wrote: [snipped stuff about 10_000 matrix-multiplies in an application] >>In summary: 55 seconds with function calling overhead. >> 44 seconds without function calling overhead. >> >>Now, a 15 by 15 matrix means 225 entries. 225 entries at, >>say, 8 bytes an entry makes a bit less than 2k. So, depending on >>whether GNAT takes function parameters by reference or by copy, >>this makes anything between 2k and, say, 10k bytes to be copied >>on each function call. >> >>Does this explain the time difference? It shouldn't! The amount >>of time spent copying memory should be completely overwhelmed by >>the amount of time taken to do the floating point operations! First, I doubt very much that the matrix is passed by copy. Basically, any composite object larger than 8 bytes will be passed by reference. Second, what are your assumptions about the time to perform a floating point multiply on a 486DX? The 486 ref manual indicates that fpadds are typically 10 clocks and an fpmult is around 16 clocks. Since the formal parameters for your function are unconstrained types, there is probably a dynamic allocation/initialization/deallocation of the dope vectors for each of the parameters. This might account for some of the overhead. To really give you an answer, you should either get an assembly listing (-S flag) or provide us the source of both versions. It is very hard to give a detailed answer (as opposed to the sort Colin likes to supply :-)) without this information. -Tom ^ permalink raw reply [flat|nested] 20+ messages in thread
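[Editor's note: Tom's clock counts give a quick back-of-envelope check on how much of the measured time raw floating point could account for. This helper is the editor's, and the 33 MHz clock rate below is an assumption about Duncan's 486, not stated in the thread.]

```c
/* Estimate total FPU clocks for `reps` n-by-n matrix multiplies,
   using the 486 timings Tom Griest quotes: ~16 clocks per fmul,
   ~10 per fadd. Each of the n*n result entries needs n multiplies
   and n adds. */
long long fp_clocks(int n, int reps) {
    const int fmul = 16, fadd = 10;
    return (long long)n * n * n * (fmul + fadd) * reps;
}
```

`fp_clocks(15, 10000)` gives 877,500,000 clocks, i.e. roughly 27 seconds on a 33 MHz 486 if the FPU were the only cost. That is a sizable fraction of the 44 seconds Duncan measured, which would leave memory traffic and loop overhead, rather than the arithmetic alone, to explain the rest.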
* Re: GNAT function calling overhead 1995-04-07 0:00 ` Tom Griest @ 1995-04-07 0:00 ` Robert Dewar 0 siblings, 0 replies; 20+ messages in thread From: Robert Dewar @ 1995-04-07 0:00 UTC (permalink / raw) Tom Griest said: "Since the formal parameters for your function are unconstrained types, there is probably a dynamic allocation/initialization/deallocation of the dope vectors for each of the parameters. This might account for some of the overhead." Nope, no dynamic allocation is ever involved for bounds templates for arrays in this situation. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: GNAT function calling overhead 1995-04-06 0:00 ` Colin James III ` (2 preceding siblings ...) 1995-04-07 0:00 ` Tom Griest @ 1995-04-07 0:00 ` Robert Dewar 1995-04-07 0:00 ` Philip Brashear 4 siblings, 0 replies; 20+ messages in thread From: Robert Dewar @ 1995-04-07 0:00 UTC (permalink / raw) >At the most abstract level, it's because GNAT is a failed government >project which was never finished and was mismanaged from the start by a >bunch of flaky educators posing as "capable professionals". Maybe things are different in other government projects, and they all get finished before the expected termination date ?? Anyway, to get things absolutely clear on this. The GNAT project is indeed not finished. The project terminates on June 30th, 1995, and by that time, we will indeed be finished, in the sense of having completed the full implementation of Ada 95, including all the annexes. It's right to be suspicious of anything coming out of academic environments. I am myself one of the most sceptical people when it comes to software coming out of universities. So I understand this concern. The best advice is to pay no attention to what I or CJIII say on this, but instead take a close look at GNAT itself! >At the most detailed level, it's because GNAT emits poorly optimized, and >hence very evil, C code. This can't be based on looking at the alleged "very evil" C code. How do I know this? Because in no sense does GNAT emit C code AT ALL. It is a true compiler, not a translator to C. Both the C and GNAT front ends for GCC emit a common intermediate language (RTL) that is optimized by the backend of GCC. So this remark is nothing but fantasy. >And what makes anyone think that ACT will change anything with regard to >GNAT support, documentation or enhancements. The ACT principals have >already demonstrated that they failed with GNAT, by even starting ACT. >In other words, if GNAT were such a smashing success and quality product, >then there would be no need for ACT.
Here I think that Colin James misunderstands what ACT is about. The idea that quality compilers need no support might make some sense in an ideal world, but in practice I know of no major project that would use a compiler for *any* language without having guaranteed support. After all we expect warranties on any products we buy, no matter how excellent. Actually if GNAT is such a dismal failure, *then* there is definitely no need for commercial support. No one is going to use a junk compiler, even if support is available (you do not buy products that are rated as terrible by Consumer Reports just because they have guarantees!) If people sign up for support for GNAT, then it is because they think it meets their needs. By the way, this is a good time to reemphasize that GNAT will continue to be freely available, and continue to be maintained after the official government contract is completed. All improvements and maintenance fixes will continue to be available free via anonymous FTP, on CD-ROM's etc. This is one of the advantages of the free software mode of operation. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: GNAT function calling overhead 1995-04-06 0:00 ` Colin James III ` (3 preceding siblings ...) 1995-04-07 0:00 ` Robert Dewar @ 1995-04-07 0:00 ` Philip Brashear 4 siblings, 0 replies; 20+ messages in thread From: Philip Brashear @ 1995-04-07 0:00 UTC (permalink / raw) In article <3m0psq$fl2@stout.entertain.com>, Colin James III <cjames@stout.entertain.com> wrote: > >At the most abstract level, it's because GNAT is a failed government >project which was never finished and was mismanaged from the start by a >bunch of flaky educators posing as "capable professionals". > >At the most detailed level, it's because GNAT emits poorly optimized, and >hence very evil, C code. > >And what makes anyone think that ACT will change anything with regard to >GNAT support, documentation or enhancements. The ACT principals have >already demonstrated that they failed with GNAT, by even starting ACT. >In other words, if GNAT were such a smashing success and quality product, >then there would be no need for ACT. > >Good grief, what moral and intellectual dishonesty ! I KNOW that one shouldn't waste time responding to either Mr. James or Mr. Aharonian, but this is ridiculous and near the point of libel. First, GNAT is not strictly a government project; other organizations have partially funded it (yes?). Second, it doesn't claim to be finished. Third, I don't believe that it emits C code at all. Fourth, ACT was founded to provide services related to GNAT, not to "finish" it. Colin, for Heaven's sake, learn the meaning of "homework" and "self-control"!!! Phil Brashear ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: GNAT function calling overhead 1995-04-06 0:00 GNAT function calling overhead Duncan Sands 1995-04-06 0:00 ` Norman H. Cohen 1995-04-06 0:00 ` Colin James III @ 1995-04-07 0:00 ` Robert Dewar 1995-04-07 0:00 ` Theodore Dennison 1995-04-07 0:00 ` Kenneth Almquist 4 siblings, 0 replies; 20+ messages in thread From: Robert Dewar @ 1995-04-07 0:00 UTC (permalink / raw) Two issues here: first, in your example, you return unconstrained arrays. This always involves a fair amount of overhead. Some compilers will use the heap for this (GNAT used to, and I think Alsys still does in some of their compilers), and do two copies. Some other compilers will do two copies, using a secondary stack (that's what GNAT does now). Some compilers will use specialized calling sequences, and manage to do only one copy in some cases, but still two copies in many cases. Anyway, there will be at least one extra copy, so that probably accounts for the overhead of the call that you see. If you are concerned with maximum efficiency, try to avoid returning unconstrained arrays (note that this facility does not exist at all in Fortran, C or C++). Second, the comparisons between GNAT and C are odd. Normally when you write equivalent code in Ada and C and compile both with GCC you will get identical object code. In almost all cases that we have examined, it turns out that such discrepancies are caused by using high level features in Ada that have no analog in C, thus rendering it an apples-vs-oranges comparison. Anyway, I can't comment further without details. Send me the sources at dewar@cs.nyu.edu, and I will analyze what is going on, and post a followup when I figure it out. ^ permalink raw reply [flat|nested] 20+ messages in thread
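[Editor's note: the "extra copy" Dewar describes for unconstrained-array function results has a loose C analogue: returning a large struct by value versus filling a caller-supplied destination, which is the shape of Cohen's suggested procedure Multiply. This sketch and its names are the editor's; a real compiler may elide the struct-return copy, just as an Ada compiler sometimes avoids one of the two copies.]

```c
#include <string.h>

#define N 15

typedef struct { float e[N][N]; } Matrix;

/* Function-return style: the ~900-byte result is built locally and
   returned by value -- analogous to function "*" returning a Matrix,
   where the result may be copied to its final destination. */
Matrix multiply_ret(const Matrix *a, const Matrix *b) {
    Matrix r;
    memset(&r, 0, sizeof r);
    for (int i = 0; i < N; i++)
        for (int k = 0; k < N; k++)
            for (int j = 0; j < N; j++)
                r.e[i][j] += a->e[i][k] * b->e[k][j];
    return r;   /* potential extra copy happens here */
}

/* Out-parameter style: the caller supplies the destination, so the
   result is written in place with no return copy -- analogous to
   procedure Multiply (Left, Right : in Matrix; Product : out Matrix). */
void multiply_out(const Matrix *a, const Matrix *b, Matrix *product) {
    memset(product, 0, sizeof *product);
    for (int i = 0; i < N; i++)
        for (int k = 0; k < N; k++)
            for (int j = 0; j < N; j++)
                product->e[i][j] += a->e[i][k] * b->e[k][j];
}
```

The trade-off is the one Cohen noted: the out-parameter form avoids the copy but makes the caller responsible for declaring an object of the right size.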
* Re: GNAT function calling overhead 1995-04-06 0:00 GNAT function calling overhead Duncan Sands ` (2 preceding siblings ...) 1995-04-07 0:00 ` Robert Dewar @ 1995-04-07 0:00 ` Theodore Dennison 1995-04-07 0:00 ` Robert Dewar 1995-04-07 0:00 ` Kenneth Almquist 4 siblings, 1 reply; 20+ messages in thread From: Theodore Dennison @ 1995-04-07 0:00 UTC (permalink / raw) sands@clipper.ens.fr (Duncan Sands) wrote: >Essentially the question is: why so much function calling overhead >in GNAT? > >In summary: 55 seconds with function calling overhead. > 44 seconds without function calling overhead. >PS: The corresponding C code takes about 6 seconds. This surprises >me too. Did you try compiling gnat with the optimizations turned on? T.E.D. (structured programming bigot) ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: GNAT function calling overhead 1995-04-07 0:00 ` Theodore Dennison @ 1995-04-07 0:00 ` Robert Dewar 0 siblings, 0 replies; 20+ messages in thread From: Robert Dewar @ 1995-04-07 0:00 UTC (permalink / raw) T.E.D. asks a good question, did you turn optimizations on? The Unix style in compilers is to default to no optimization. The code generated by GCC with no optimization is horrible! It is very important that any performance measurements are made with optimization turned on (-O3), otherwise they are completely meaningless. We have wondered whether on the PC ports, it would be better to have optimization on be the default, because this is more common with PC compilers, and the extra time for compiling in -O3 mode on the PC is very small (unlike some of the RISC machines). I would be interested in people's input on this issue (optimization default). ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: GNAT function calling overhead 1995-04-06 0:00 GNAT function calling overhead Duncan Sands ` (3 preceding siblings ...) 1995-04-07 0:00 ` Theodore Dennison @ 1995-04-07 0:00 ` Kenneth Almquist 1995-04-07 0:00 ` Colin James III ` (2 more replies) 4 siblings, 3 replies; 20+ messages in thread From: Kenneth Almquist @ 1995-04-07 0:00 UTC (permalink / raw) (Duncan Sands wants to know why matrix multiplication is slow using: type Matrix is array(Positive range <>, Positive range <>) of Float; function "*"(Left : in Matrix; Right : in Matrix) return Matrix; He observes that he gets about a 20% speedup if he manually inlines the matrix multiplication and wants to know why the calling overhead is so high.) This has little to do with function call overhead per se. The problem is that GNAT produces inefficient code for subscripting operations on Matrix variables when the bounds of the matrix are not known at compile time. GNAT does somewhat better when the matrix is not passed as an argument; hence the performance improvement from inlining. I hope that GNAT funding will cover a bunch of performance tuning, but unimplemented features and bug fixing presumably have higher priority right now. Of course one of the advantages that GNAT gains from using the GCC back end is that even if nobody on the GNAT project gets around to looking at this particular performance problem, somebody from another project, such as GNU Fortran, might fix it. Kenneth Almquist ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: GNAT function calling overhead 1995-04-07 0:00 ` Kenneth Almquist @ 1995-04-07 0:00 ` Colin James III 1995-04-07 0:00 ` Robert Dewar 1995-04-07 0:00 ` Robert Dewar 1995-04-07 0:00 ` Larry Kilgallen 2 siblings, 1 reply; 20+ messages in thread From: Colin James III @ 1995-04-07 0:00 UTC (permalink / raw) In article <D6nA9u.Hq7@nntpa.cb.att.com>, Kenneth Almquist <ka@socrates.hr.att.com> wrote: > >I hope that GNAT funding will cover a bunch of performance tunin ... > Kenneth Almquist In about April, 1994 (one year ago) at a FRAWG meeting, I believe it was General Little, before he retired, answered my question of whether GNAT would be funded said, "No, no more money for GNAT". If ACT goes public, maybe you could buy stock in that "hope". ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: GNAT function calling overhead 1995-04-07 0:00 ` Colin James III @ 1995-04-07 0:00 ` Robert Dewar 0 siblings, 0 replies; 20+ messages in thread From: Robert Dewar @ 1995-04-07 0:00 UTC (permalink / raw) Just to cut through some of the smoke here! There will be no more funding for the GNAT project per se from the government after the contract termination at the end of June. The government may wish to establish support contracts with SGI, Labtek, ACT or other organizations supporting GNAT, but that's a different matter entirely. GNAT stands on its own feet after June 30th, and that seems quite appropriate to me! ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: GNAT function calling overhead 1995-04-07 0:00 ` Kenneth Almquist 1995-04-07 0:00 ` Colin James III @ 1995-04-07 0:00 ` Robert Dewar 1995-04-07 0:00 ` Larry Kilgallen 2 siblings, 0 replies; 20+ messages in thread From: Robert Dewar @ 1995-04-07 0:00 UTC (permalink / raw) Kenneth says: " This has little to do with function call overhead per se. The problem is that GNAT produces inefficient code for subscripting operations on Matrix variables when the bounds of the matrix are not known at compile time. GNAT does somewhat better when the matrix is not passed as an argument; hence the performance improvement from inlining." I don't see this: in the inner loop, with checks turned off, the lower bound calculation should be moved out of the loop. I can't yet duplicate the reported results. I asked for the code but did not get it yet. Certainly for the straightforward way of computing matrix multiplication, for example, there is no extra overhead in the inner loop in the GNAT code, at least on the i386 where I am looking at the assembly code. In my experience, it is futile to guess what might be behind such differences without the actual code at hand; there can be MANY variables. In future, if people want to discuss performance differences between Ada and C on particular code, it would be useful to post the allegedly comparable source code. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: GNAT function calling overhead 1995-04-07 0:00 ` Kenneth Almquist 1995-04-07 0:00 ` Colin James III 1995-04-07 0:00 ` Robert Dewar @ 1995-04-07 0:00 ` Larry Kilgallen 1995-04-07 0:00 ` Robert Dewar 2 siblings, 1 reply; 20+ messages in thread From: Larry Kilgallen @ 1995-04-07 0:00 UTC (permalink / raw) In article <D6nA9u.Hq7@nntpa.cb.att.com>, ka@socrates.hr.att.com (Kenneth Almquist) writes: > I hope that GNAT funding will cover a bunch of performance tuning, but > unimplemented features and bug fixing presumably have higher priority Actually, I would hope GNAT funds would be devoted toward correctness on the largest possible number of platforms. Then let commercial vendors sell high-performance compilers to those who need high performance. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: GNAT function calling overhead 1995-04-07 0:00 ` Larry Kilgallen @ 1995-04-07 0:00 ` Robert Dewar 0 siblings, 0 replies; 20+ messages in thread From: Robert Dewar @ 1995-04-07 0:00 UTC (permalink / raw) Larry says: "Actually, I would hope GNAT funds would be devoted toward correctness on the largest possible number of platforms. Then let commercial vendors sell high-performance compilers to those who need high performance." I am not quite sure what "GNAT funds" means here. If it means the money we have left for the remaining 84 days of the contract, then this will be devoted to finishing off the implementation of Ada 95, and fixing bugs. If you mean the funds that SGI, Labtek, ACT etc generate for maintenance of GNAT, those will be directed in whatever manner corresponds to customer needs, and high performance will definitely be one of these needs. At that point extension of GNAT to new platforms will happen only if volunteers do ports, or if people want ports to appear and can pay for them. But in any case, it has always been our intention to generate a high-performance compiler that will compete on its own terms. This will help push the quality barrier for all Ada 95 compilers, which can only help users of the language, no matter what compiler they are using. Remember that the ground on which we are building GNAT, namely GCC, is itself a high performance system. On many machines, GCC is the fastest C compiler available. On some systems, such as Nextstep, it is the ONLY C compiler available. Of course there are lots more optimizations that could be done to improve GNAT, but then that's a statement that can be made about most Ada compilers! Note that a relatively small amount of the NYU resources (which are after all fairly limited), has been spent on generating new ports. Yet there are lots of ports of GNAT. These have come from volunteers around the world. I am sure that this will continue to occur! ^ permalink raw reply [flat|nested] 20+ messages in thread
end of thread, other threads:[~1995-04-07 0:00 UTC | newest] Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 1995-04-06 0:00 GNAT function calling overhead Duncan Sands 1995-04-06 0:00 ` Norman H. Cohen 1995-04-06 0:00 ` Colin James III 1995-04-06 0:00 ` Robb Nebbe 1995-04-07 0:00 ` Robert Dewar 1995-04-07 0:00 ` Duncan Sands 1995-04-06 0:00 ` Samuel Tardieu 1995-04-07 0:00 ` Tom Griest 1995-04-07 0:00 ` Robert Dewar 1995-04-07 0:00 ` Robert Dewar 1995-04-07 0:00 ` Philip Brashear 1995-04-07 0:00 ` Robert Dewar 1995-04-07 0:00 ` Theodore Dennison 1995-04-07 0:00 ` Robert Dewar 1995-04-07 0:00 ` Kenneth Almquist 1995-04-07 0:00 ` Colin James III 1995-04-07 0:00 ` Robert Dewar 1995-04-07 0:00 ` Robert Dewar 1995-04-07 0:00 ` Larry Kilgallen 1995-04-07 0:00 ` Robert Dewar