Re: precise floats - Robert Dewar

comp.lang.ada

From: dewar@merv.cs.nyu.edu (Robert Dewar)
Subject: Re: precise floats
Date: 1998/08/08
Message-ID: <dewar.902579704@merv>
In-Reply-To: uu33pxltv.fsf@sg.adisys.com.au

Paul said

<<The Ada.Interfaces package that came with ObjectAda 7.1.1
(ObjectAda/lib/src/INTERF$1.ADS) contains the following definitions:

    type Integer_32 is range -2**31 .. 2**31-1;  --2's complement
    type Integer_16 is range -2**15 .. 2**15-1;  --2's complement
    type Integer_8  is range -2**7  .. 2**7 -1;  --2's complement

    type Unsigned_32 is mod 2**32;
    type Unsigned_16 is mod 2**16;
    type Unsigned_8  is mod 2**8;

    type IEEE_Float_32 is new FLOAT;
    type IEEE_Float_64 is new LONG_FLOAT;
>>

That's reasonable, and the names here are expected (since they correspond
to the implementation note B.2(10.a) in the AARM). Note that purists
might object to the appearance of IEEE rather than the proper ISO name
here, but in practice everyone talks about IEEE floating-point (it has
been one of the real name recognition successes for IEEE :-)

However, the above definitions are most certainly inadequate on the x86.
This machine fully supports the 80-bit floating-point format corresponding
to the optional IEEE extended type. Indeed all arithmetic in the
floating-point unit is in fact performed in this format. The ia32 (x86) is
essentially the only widely used chip that has this extended precision, and
it is an extremely important capability, since it allows algorithms to be
used that otherwise would not be usable (e.g. you cannot compute x**y using
log/exp in 64-bit arithmetic without introducing horrible errors, but if you
use extended precision to compute x**y using log/exp, you can get accurate
64-bit answers -- see chapter 5 of my book Microprocessors: A Programmer's
View, McGraw Hill 1990 for details).
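
A minimal sketch of the log/exp idea follows (it is not the algorithm from
the book). The relative error of exp(t) is roughly |t| times the relative
error of t, so rounding t = y*log(x) to a 53-bit mantissa costs accuracy in
proportion to |t|. The extra mantissa bits of the extended format are there
to absorb that loss before the final rounding to 64 bits. The helper name
Power is hypothetical, the code assumes X > 0.0, and it assumes that
Long_Long_Float is the 80-bit format (as in GNAT on the x86).

```ada
with Ada.Text_IO;
with Ada.Numerics.Generic_Elementary_Functions;

procedure Power_Demo is

   --  Assumption: Long_Long_Float is wider than Long_Float on this
   --  target (the 80-bit extended format under GNAT on the x86).
   package Ext_Math is
     new Ada.Numerics.Generic_Elementary_Functions (Long_Long_Float);

   --  Hypothetical helper: x**y for X > 0.0, computed as exp (y * log (x)).
   --  Every intermediate is held in extended precision; only the final
   --  result is rounded back to the 64-bit type.
   function Power (X, Y : Long_Float) return Long_Float is
      Log_X : constant Long_Long_Float :=
        Ext_Math.Log (Long_Long_Float (X));
   begin
      return Long_Float (Ext_Math.Exp (Long_Long_Float (Y) * Log_X));
   end Power;

begin
   Ada.Text_IO.Put_Line (Long_Float'Image (Power (2.0, 10.0)));
end Power_Demo;
```

On a target where Long_Long_Float is no wider than Long_Float, the same
source still compiles but gains nothing, which is one reason the presence
of the extended type matters.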

This type then is most definitely covered by the requirement in B.2(10),
and for an x86 implementation, must be defined in Interfaces. The relevant
definitions from the GNAT version of Interfaces are:

   type IEEE_Float_32       is new Short_Float;
   pragma Float_Representation (IEEE_Float, IEEE_Float_32);

   type IEEE_Float_64       is new Long_Float;
   pragma Float_Representation (IEEE_Float, IEEE_Float_64);

   type IEEE_Extended_Float is new Long_Long_Float;
   pragma Float_Representation (IEEE_Float, IEEE_Extended_Float);

and Long_Long_Float is indeed the 80-bit format on the x86. Note that
Long_Long_Float is usually the same as long double in C, but this is not
always the case. On the SPARC, the 128-bit format is NOT supported by the
hardware, and we consider that the proper approach is not to provide this
128-bit format as a first class citizen among the base types, which are
intended to reflect the types implemented in the hardware.
Note: some people sometimes get confused into thinking that somehow the
80-bit format is "reserved" and should not be used. This derives from the
intention in the IEEE-754 standard that the extended format be used for
the purposes of getting accurate 64-bit results (e.g. in the log/exp
case above).

However, this is not an excuse for disobeying B.2(10), because:

(a) this is only an intention reflected in the design of 754, which is in
no way reflected in the hardware, and is essentially just programming
advice which a programmer is free to ignore. If your application runs
successfully with 64-bit mantissa precision but not with 53-bit mantissa
precision, then you will be happy to find out that the x86 (and incidentally
the forthcoming ia64 [merced]) satisfies your needs, and you will not be
happy to find that your Ada compiler arbitrarily eliminates this possibility.

(b) code that reflects the intention of IEEE-754 and uses the extended format
for intermediate results in the intended manner may perfectly well be
written in Ada!
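
As a hedged illustration of point (b), here is a sketch of 64-bit data
summed through an extended-precision accumulator, with a single rounding
back to 64 bits at the end. The names Vector, Sum and Data are
hypothetical, and the code assumes that Long_Long_Float has a 64-bit
mantissa (the 80-bit format).

```ada
with Ada.Text_IO;

procedure Sum_Demo is

   type Vector is array (Positive range <>) of Long_Float;

   --  Hypothetical helper: accumulate in the extended format and round
   --  to the 64-bit type only once, on return.
   function Sum (V : Vector) return Long_Float is
      Acc : Long_Long_Float := 0.0;
   begin
      for I in V'Range loop
         Acc := Acc + Long_Long_Float (V (I));
      end loop;
      return Long_Float (Acc);
   end Sum;

   --  1.0E16 + 1.0 needs more than 53 mantissa bits to be held exactly,
   --  but fits within 64.
   Data : constant Vector := (1.0E16, 1.0, -1.0E16);

begin
   Ada.Text_IO.Put_Line (Long_Float'Image (Sum (Data)));
end Sum_Demo;
```

With a 64-bit mantissa the middle term should survive the cancellation.
With only 53 bits it would be expected to be lost in the first addition.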

Note that I have no idea what other Ada compilers do here. All I know is
that the standard mandates the inclusion of the 80-bit format in the
Interfaces definition for a conforming Ada 95 compiler for the x86.
We have already seen that people can get confused when they tell us
what is and what is not in some particular compiler. So you should check
for yourself to make sure that your compiler meets this requirement. It
is indeed a requirement that should be checked by the validation suite,
but in practice this kind of checking is hard. You can parametrize the
validation suite by specifying the maximum floating-point precision, but it
is difficult to check that this parametrization is done correctly when
it is machine dependent.
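
One simple way to check a given compiler is sketched below. It assumes that
the compiler's Interfaces declares IEEE_Extended_Float under that name, as
the GNAT version quoted above does. Other compilers may use a different
name, or none at all, in which case the unit fails to compile, which
already answers the question.

```ada
with Ada.Text_IO; use Ada.Text_IO;
with Interfaces;

procedure Check_Extended is

   --  Assumption: Interfaces.IEEE_Extended_Float exists (GNAT naming).
   subtype Ext is Interfaces.IEEE_Extended_Float;

begin
   Put_Line ("IEEE_Float_64 mantissa bits:"
             & Integer'Image (Interfaces.IEEE_Float_64'Machine_Mantissa));
   Put_Line ("IEEE_Extended_Float mantissa bits:"
             & Integer'Image (Ext'Machine_Mantissa));
   Put_Line ("IEEE_Extended_Float decimal digits:"
             & Integer'Image (Ext'Digits));
end Check_Extended;
```

A mantissa of 64 bits indicates the 80-bit extended format. A value of 53
would mean the type is just the ordinary 64-bit format under another name.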

Robert Dewar
Ada Core Technologies
