From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level: 
X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00 autolearn=ham
	autolearn_force=no version=3.4.4
X-Google-Language: ENGLISH,ASCII-7-bit
X-Google-Thread: 103376,dab7d920e4340f12
X-Google-Attributes: gid103376,public
X-Google-Thread: 1014db,dab7d920e4340f12
X-Google-Attributes: gid1014db,public
From: Mark  McKinney <mckmark@mail.concentric.net>
Subject: Re: C is 'better' than Ada because...
Date: 1996/08/02
Message-ID: <4tu2gd$rt8@herald.concentric.net>
X-Deja-AN: 171650311
references: <Pine.A32.3.91.960801090756.64586H-100000@wc.airshields.com>
content-type: text/plain; charset=us-ascii
organization: Concentric Internet Services
mime-version: 1.0
newsgroups: comp.lang.c,comp.lang.ada
x-mailer: Mozilla 1.22-INTUIT (Windows; U; 16bit)
Date: 1996-08-02T00:00:00+00:00
List-Id: <comp.lang.ada>


"Let the compiler do the optimization ?"
Someone said try it so I did.           
           
Background :  To say that only c does pointer arithmetic 
and Ada does not is not entirely accurate. I wrote a bubble sort using 3 
slightly different techniques. THe second closely resembles c pointer 
arithmetic. Except it should calculate the offset from the original address 
instead of accumulating it. Possible at the expense of a multiply opertion. 
Not used by the c method. 

The concept : Use a simple algorithym use arrays and try to optimize it in 
Ada. Compare the unoptimized results with the optimized results. See which is 
better for the compilers I have access to.

The Test : I chose a bubble sort because it has nested loops so that elements 
of the array would be used over and over. It uses Two different array offsets 
and the swap repeats them.

The Methods : Sort 1 is the standard way of using arrays. Sort 2 attempts 
to use an address clause to eliminat redundant use of array offsets. 
Sort 3 uses a renames clause to for the same purpose. 

The expectation : Sort 2 and Sort 3 should perform better than sort 1 when 
no optimization is used. The results are unpredictble with optimization 
but they should be faster when optimized as well.


The Loops: 
Sort 1 The standard way.

   -- I expect that for each array reference. 
   -- The offset must be recalculated.        
   -- There are Six potential Calculationsbut 
   -- only two are needed.The optimizer should  
   -- use extract common expressions if used. 

   for INDEX_1 in BIG_ARRAY'RANGE loop
      for INDEX_2 in INDEX_1 .. BIG_ARRAY'LAST loop
         if BIG_ARRAY(INDEX_2) < BIG_ARRAY(INDEX_1) then  
            -- Swap                                       
            TMP := BIG_ARRAY(INDEX_1);                    
            BIG_ARRAY(INDEX_1) := BIG_ARRAY(INDEX_2);      
            BIG_ARRAY(INDEX_2) := TMP;                    
         end if;
      end loop;
   end loop;

Sort 2 Declare an Addres clause.
   for INDEX_1 in BIG_ARRAY'RANGE loop
      for INDEX_2 in INDEX_1 .. BIG_ARRAY'LAST loop  
         -- This technique should force the compiler to use one calculation
         -- for each of the two references. 
         declare                                            
            LOW_ADR : constant SYSTEM.ADDRESS := BIG_ARRAY(INDEX_1)'ADDRESS;                         
                  
            LOW_INT : INTEGER;
            for LOW_INT use at LOW_ADR;     -- This is roughly equivelent to poiter arithmetic.

            HI_ADR : constant SYSTEM.ADDRESS := BIG_ARRAY(INDEX_2)'ADDRESS;                          
                 
            HI_INT  : INTEGER;
            for HI_INT use at HI_ADR;
         begin
            if HI_INT < LOW_INT then        -- This is easier to read.
               -- Swap
               TMP := LOW_INT;
               LOW_INT := HI_INT;
               HI_INT := TMP;
            end if;
         end;
      end loop;
   end loop;
           
Sort 3 Use a renames clause.
   for INDEX_1 in BIG_ARRAY'RANGE loop
      for INDEX_2 in INDEX_1 .. BIG_ARRAY'LAST loop
         -- This method only gives the compiler a hint.
         -- The compiler could treat this only as a macro though.
         declare
            LOW_INT : INTEGER renames BIG_ARRAY(INDEX_1);   
            HI_INT  : INTEGER renames BIG_ARRAY(INDEX_2);   
         begin
            if HI_INT < LOW_INT then        -- This is also easier to read.
               -- Swap
               TMP := LOW_INT;
               LOW_INT := HI_INT;
               HI_INT := TMP;
            end if;
         end;
      end loop;
   end loop;


The results :
   All compilations retained expception handling (error_checking).
   Execution times for various compilers.
                  
                            ------ Unoptimized --------   ------ Optimized  --------- 
Compiler                    Sort 1    Sort 2   Sort 3     Sort 1   Sort 2    Sort 3   
                            _________ ________ ________   ________ ________  ________ 
ALSYS DOS (90MHZPENTIUM)      27.8472  19.7731  19.2239    13.5117  19.7731   15.3242 
ALSYS HP 755 (PA-RISC)        37.0996  29.0010  28.9375    21.8467  28.9854   23.8672 
Verdix 3b2                  1417.0000 984.0000 782.0000   815.0000 606.0000  487.0000 
Gnat WIN95/NT(90MHzPentium)   28.9500  27.7400  26.9700     8.0200   6.9200    6.9200 
ALSYS(WIN32)(90MHzPentium)    29.0600  21.6900  21.1500    15.2200  24.6600   24.9300 

   
   Best optimizer show optimization on Sort 1
   Best method shows the most effective combination of sort and optimization for a compiler. 
   
                                  -------- Best -------                                        
                                  Optimizer      Method                                        
                                  _________    _________                                       
ALSYS DOS (90MHZPENTIUM)          206.0973%    206.0973%  Let the optomizer Do it              
ALSYS HP 755 (PA-RISC)            169.8181%    169.8181%  Let the optomizer Do it              
Verdix 3b2                        173.8650%    290.9651%  Combination (Optimizer + Sort 3)     
Gnat WIN95/NT(90MHzPentium)       360.9726%    418.3526%  Combination (Optimizer + Sort 2 or 3)
ALSYS(WIN32)(90MHzPentium)        190.9326%    190.9326%  Let the optomizer Do it              
                              

Conclusion :
   If highest performance is a requirement for a routine. Try and test differnt techniques. 
   I suspect this applies in any programming language. Idealy you should pick a compiler that 
   meets this requirement. Different implementations of a language can have different results. 
   Use a test that meets your requirements.(Event though the pentium machine outperforms the HP in
   this test text reports genrated off of flat files in both C and Ada are about 60 times as 
   fast on the HP as The Pentium Better Io?)
   The "I'll be sure and code it in assembly" argument has merit when justified by requirements. 
   I you care about performance find many compilers and test them on the target platform. 
   
Surprizes :
   In the case of the ALSYS compilers it is clearly better to let the optimizer do the work.
   GNAT Did exceptionally well. (I haven't used it much I'll use it more, if Someone can help 
   me get the Windows API to compile correctly)    

Comments :
   A better optimization would be to use a better sorting algorithym. Generally this should 
   be tried first.
   
   There is no substitute for hard work if you really want to know.

   These test results are not intended to endorse any particular product. The test 
   were done hastily and in less than Ideal conditions. Let your system requirements 
   determine which compiler you choose using appropriate test cases.

   Any opinion are mine and mine alone.
   All other disclaimers apply.
   
   If you want complete source I'll send it. 
   send requests to mckmark@mail.concentric.net

             Mark McKinney