From: Markus Kuhn
Subject: Re: Performance Ada and C
Date: 1998/07/03
Message-ID: <359D4C6F.5691A370@cl.cam.ac.uk>#1/1
References: <35921271.E51E36DF@aonix.fr> <3598358A.73FF35CC@pipeline.com> <6nh762$66i@netline.jpl.nasa.gov> <359CB19D.EDAD6D1F@cl.cam.ac.uk>
Organization: Cambridge University, Computer Laboratory
Newsgroups: comp.lang.ada

Robert Dewar wrote:
> > and while the best available C implementation
> > did 27 Mbit/s, I wasn't able to get with Ada more than
> > 20 Mbit/s on the same processor (Pentium II, 300 MHz) using
> > the same compiler (gnat-3.10p).
>
> It is always possible to duplicate the object code of any C code writing
> in 100% Ada using GNAT. Of course you may have to write at a lower semantic
> level than you would wish.
>
> But if you didn't close the gap, it just means you didn't use the right
> approach. Most probably you were using some Ada specific feature that you
> thought was equivalent to the C code when it was not, that is the most common
> reason for this kind of failure.

Basically the only difference is that I replaced the macros in the C
version by inline functions. As far as performance and optimization are
concerned, I just think of inlined functions as a sort of macro with
type checking, so this should not cause the difference.

In this type of encryption algorithm, some functions are called 30
times to fiddle around with 4 registers.
In the C version, the functions are replaced by macros and manual loop
unrolling is performed. I did the same in Ada, except that I used inline
functions instead of macros. The other difference is that while the C
version has 8 different macros S0 to S7, in Ada I have only a single
function S with an integer parameter 0..7 that determines with a few
ifs which of S0..S7 is executed. I had hoped (and the objdump -S output
suggests this) that after the inline functions are textually
substituted, the optimizer removes the statically unreachable if
branches, so that there would be no difference from the C version.

The time-consuming part looks like

   pragma Inline(S, Tr, Keying);

   Keying(W, 0, X0, X1, X2, X3);
   S(0, X0, X1, X2, X3); Tr(X0, X1, X2, X3);
   Keying(W, 1, X0, X1, X2, X3);
   S(1, X0, X1, X2, X3); Tr(X0, X1, X2, X3);
   Keying(W, 2, X0, X1, X2, X3);
   S(2, X0, X1, X2, X3); Tr(X0, X1, X2, X3);
   ...
   Keying(W, 30, X0, X1, X2, X3);
   S(6, X0, X1, X2, X3); Tr(X0, X1, X2, X3);

> I am always surprised how often Ada programmers have no idea of the
> consequences of what they write. By the way the -gnatdg switch in GNAT is
> a useful tool in this regard.

Thanks for the hint! I guess -gnatdg will be very helpful in getting
over the uncomfortable feeling I have about Ada being a language where
the compiler silently inserts a lot of code between the lines.

For this purpose, however, -gnatdg seems to dump at too early a stage.
Is there also a debugging dump available for the stage where at least
some of the architecture-independent optimizations (function inlining,
dead code removal, common subexpression elimination) have already been
done, but where there is still a clear relationship with the source
code (e.g., where the variable names are still used where possible)?
The objdump -S output alone is rather difficult to read.
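
A minimal sketch in C of the two styles compared above, not the actual
cipher code. The S-box bodies are placeholders invented for illustration,
only two of the eight selectors are shown, the names two_rounds_macro and
two_rounds_inline are hypothetical, and the assert merely stands in for
the range check on the selector parameter.

```c
#include <assert.h>
#include <stdint.h>

/* Macro style: one macro per S-box, as in the C version.
   Placeholder bodies, NOT the real cipher's S-boxes. */
#define S0(a, b, c, d) do { (a) ^= (b) & (c); (d) ^= (a); } while (0)
#define S1(a, b, c, d) do { (b) ^= (c) | (d); (a) ^= (b); } while (0)

/* Inline style: a single function whose selector n picks the S-box,
   mirroring the single Ada function S with a few ifs. */
static inline void S(int n, uint32_t *a, uint32_t *b,
                     uint32_t *c, uint32_t *d)
{
    assert(n >= 0 && n <= 1);   /* stand-in for the Ada range check */
    if (n == 0) {
        *a ^= *b & *c;
        *d ^= *a;
    } else {
        *b ^= *c | *d;
        *a ^= *b;
    }
}

/* Two manually unrolled rounds using the macros. */
void two_rounds_macro(uint32_t x[4])
{
    S0(x[0], x[1], x[2], x[3]);
    S1(x[0], x[1], x[2], x[3]);
}

/* The same two rounds using the inline function with a constant
   selector; once inlined, the untaken branch is a candidate for
   removal by the optimizer. */
void two_rounds_inline(uint32_t x[4])
{
    S(0, &x[0], &x[1], &x[2], &x[3]);
    S(1, &x[0], &x[1], &x[2], &x[3]);
}
```

Both functions compute the same result by construction. Whether the
generated code is also the same depends on the compiler and the
optimization flags, so comparing the objdump -S output of the two
functions is one way to check whether the branches on the constant
selector were actually folded away.
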

It also seems that -gnatdg does not indicate how and where the
variable-length arrays that functions can return are handled and
deallocated, or which compiler-generated runtime checks remain, so I
cannot check for myself where potential memory leaks are lurking. The
RM says in H.3.1 that pragma Reviewable should cause such information
to be produced, but I haven't yet found out how gnat makes that
information available. Or is objdump -S all there is at the moment to
satisfy RM H.3.1?

Markus

-- 
Markus G. Kuhn, Security Group, Computer Lab, Cambridge University, UK
email: mkuhn at acm.org, home page: