From mboxrd@z Thu Jan  1 00:00:00 1970
Date: 19 Nov 91 18:30:58 GMT
From: psinntp!vitro.com!v7.vitro.com!vaxs09@uunet.uu.net
Subject: Re: Numeric problem with Ada
Message-ID: <1991Nov19.133058.25@v7.vitro.com>
List-Id: <comp.lang.ada>

> I've a strange problem with numeric expressions in Ada.
...
> procedure TEST is
>       Mz : integer := 7;
>       Z2 : float;
>       package FLOAT_TEXT_IO is new TEXT_IO.FLOAT_IO (float);
> begin
>       Z2 := 2.6 * float (Mz) - 0.2;
>       FLOAT_TEXT_IO.PUT (Z2, Aft => 6, Exp => 0);
> end TEST;        
>
(Program displays 17.999998)
...
> main ()
> {
>       int mz = 7;
>       float z2;
>       z2 = 2.6 * mz - 0.2;
>       printf ("%2.6f\n", z2);
> }
(Program displays 18.000000)

I can't speak for your results on the Sun.  On the VAX, however, I was able
to reproduce your results and found an underlying reason.

Using DEC's Ada compiler and DEC's C compiler on VAX/VMS, I got the
following machine code:

        ADA                                  C

        cvtlf   -80(fp),r0                   cvtld      #7,r0
        mulf2   #1717977382,r0               muld2      $CODE,r0
        subf2   #-858964148,r0               subd2      $CODE+8,r0
        movf    r0,-84(fp)                   cvtdf      r0,ap

Result: FFFF428F (single)                    FFFF428F FFFFFFFF (double)
        = 18.0 minus 2**-19                  00004290 (single)
        ~ 18.0 minus .0000019                = 18.0 exactly
        ~ 17.999998

As you can see, the C program performed the calculations in double
precision and converted the result back to single.  The Ada program
performed the calculations in single precision.

Examining the values actually stored reveals that the two results differ
only in the least significant bit of the mantissa.  (The word order of
VAX F-floating format, which puts the sign, exponent and high-order
mantissa bits in the low-order word, makes the longword integer
representations look far more different than they are.)  The C program
computed the result of the calculation as 18.0 exactly.  Although the
double precision intermediate result was inexact, rounding it to single
precision produced a result that happens to match the mathematically
correct answer exactly.

Can a C guru confirm or deny that common practice in C compilers is to
perform floating point calculations in double precision?

VAX F-floating format has

	(Numbering bits in the longword with low order = bit 0 and high order = bit 31)
        Sign bit in bit 15
        Binary exponent in bits 7..14, encoded excess 128
        High order bit of mantissa not encoded.  Because of normalization,
         it is always set.
        Next seven bits of mantissa in bits 0..6
        Low order sixteen bits of mantissa in bits 16..31

00004290 = sign bit clear (positive)
           exponent = 5 (multiply mantissa by 2**5)
           binary mantissa = .100100000000000000000000

FFFF428F = sign bit clear (positive)
           exponent = 5 (multiply mantissa by 2**5)
           binary mantissa = .100011111111111111111111

The D-floating format is identical, with an additional 32 bits of
low-order mantissa.

	John Briggs		vaxs09@v7.vitro.com