Re: Trigonometric operations on x86 and x64 CPUs

comp.lang.ada

From: already5chosen@yahoo.com
Subject: Re: Trigonometric operations on x86 and x64 CPUs
Date: Wed, 21 Dec 2016 01:29:20 -0800 (PST)
Message-ID: <80f5343e-6aa2-4d95-9abe-0b2bf7ee6c0c@googlegroups.com>
In-Reply-To: <o3cldi$7md$1@franka.jacob-sparre.dk>

On Wednesday, December 21, 2016 at 3:20:52 AM UTC+2, Randy Brukardt wrote:
> <already5chosen@yahoo.com> wrote in message 
> news:c3725fa2-7eaa-4020-b0ae-6ddcfc2a3d1d@googlegroups.com...
> ...
> >At first glance it seems that Randy Brukardt is correct.
> 
> It seems likely, even though today I'd be hard pressed to explain why in any 
> detail. We're talking about work done in the mid-1990s.
> 
> >Assuming IEEE binary64, for any x in range [-2**26.5..2**26.5] requirements
> >want Sin(x) to return the number in range [exact_sin(x)*(1-d*2**(-53)) ..
> >exact_sin(x)*(1+d*2**(-53))] where d=2.
> >x87 SIN instruction by itself achieves specified precision in smaller range
> > (~ up to abs(x) < 2^14).
> >
> >It means that conforming implementations of Ada libraries are forced to spend
> >a significant effort doing reduction of stupidly big arguments of 
> >sin/cos/tan.
> 
> Also on doing sanity checks of operands 

That's not the same.
Unlike argument reduction, range/Inf/NaN checks are computationally cheap.

> (but of course that's generally a 
> strength of Ada).

In the specific case of numeric libraries, sanity checks are not unique to Ada. Right now I can't recall a language that does *not* do sanity checks in its numeric library. The exception, of course, is functions like sin/cos, where some values of the input arguments make no physical sense but all inputs are nevertheless legal.

> A version of GEF (Generic_Elementary_Functions) that used 
> Ada 2012 preconditions could avoid much of that overhead, and would be an 
> advantage for really speed-critical operations.
> 
> >On the other hand The Requirements allow rather poor precision for small
> >arguments (d=2) where better precision (d=1.25 or d=1.125) is both
> >desirable and not especially hard to achieve.
> 
> Keep in mind that the requirements were written by a team of numerical 
> analysis experts in 1992-4 based on the state of the art at that time. 
> (Probably the Cody/Waite algorithms.) Those of us who maintain the Standard 
> today don't really have the expertise to make any informed changes to these 
> rules, so for the most part we keep our hands off (the worst thing we could 
> do would be to mess up carefully considered rules - but we have fixed a 
> number of obvious errors).
> 
> >There is a consolation, too - the range reduction in the minimal range 
> >required
> >by the standard can be done without big tables. And it can be implemented
> >relatively quickly if the hardware features fused multiply-add.
> 
> My recollection is that it isn't even that bad if one writes the entire 
> algorithm in Ada. (Since our compiler generally keeps intermediate results 
> in the 80-bit extended format, we tended to write as much as possible as a 
> single large expression. Probably could do that even better today with Ada 
> 2012 conditional expressions.)

The 80-bit format helps when the underlying machine has an 80-bit format. Many machines do not.
Also, I suppose that even on i386/AMD64 machines that do support the 80-bit format, modern code generators by default use SSE/AVX registers rather than x87 registers.
FMA makes argument reduction easier even when extended precision is not available.
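
As a rough illustration of the FMA point, here is a minimal Python sketch of a two-constant (Cody/Waite-style) reduction modulo pi/2. The names C1, C2, fma, reduce_fma and exact_remainder are made up for this example. The fma here is only an emulation built on exact rational arithmetic; real code would call the platform's fused multiply-add (e.g. C's fma() from <math.h>) instead. It is a sketch for moderate arguments, not a full-range implementation.

```python
from fractions import Fraction

# 50 decimal digits of pi, used only as a high-precision reference.
PI = Fraction("3.14159265358979323846264338327950288419716939937510")
HALF_PI = PI / 2

# Split pi/2 into a leading double and a correction term (hypothetical names).
C1 = float(HALF_PI)
C2 = float(HALF_PI - Fraction(C1))


def fma(a, b, c):
    """Emulate a fused multiply-add: a*b + c with a single final rounding."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))


def reduce_fma(x):
    """Return (k, r) with r close to x - k*pi/2, for moderate |x|."""
    k = round(x / C1)        # nearest multiple of pi/2 (approximately)
    r = fma(-k, C1, x)       # subtract the leading part of k*pi/2
    r = fma(-k, C2, r)       # subtract the correction part
    return k, r


def exact_remainder(x, k):
    """Reference value of x - k*pi/2 in rational arithmetic."""
    return Fraction(x) - k * HALF_PI


if __name__ == "__main__":
    for x in (1000.5, 2.0 ** 26, 2.0 ** 26.5):
        k, r = reduce_fma(x)
        err = abs(Fraction(r) - exact_remainder(x, k))
        print(x, k, r, float(err))
```

The point of the sketch is that each fma step rounds only once, so the wide product k*C1 never has to be stored in a 53-bit intermediate.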

Besides, it seems to me that at the upper edge of our problematic range, 80-bit precision of intermediate results is insufficient for really simple range reduction. To make things really simple, one needs intermediates with 53+26.5 = 79.5, i.e. about 80, bits of mantissa. The 80-bit extended format has only a 64-bit mantissa, so you'll still need to do the reduction in several carefully measured steps. FMA, on the other hand, when used smartly, gives you the equivalent of an intermediate with a 106-bit mantissa.
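
To make the "106-bit" remark concrete, here is a small Python sketch of the usual error-free product built on FMA: the rounded product p plus the FMA-recovered error e together represent a*b exactly (barring overflow and underflow). The names fma and two_prod are hypothetical, and fma is again only emulated with exact rationals; hardware FMA is what one would use in practice.

```python
from fractions import Fraction
import math


def fma(a, b, c):
    """Emulate a fused multiply-add: a*b + c with a single final rounding."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))


def two_prod(a, b):
    """Return (p, e) such that p + e equals a*b exactly (no over/underflow)."""
    p = a * b               # product rounded to 53 bits
    e = fma(a, b, -p)       # the part of a*b that the rounding discarded
    return p, e


if __name__ == "__main__":
    a = 1.0 + 2.0 ** -30
    b = 1.0 + 2.0 ** -29
    p, e = two_prod(a, b)
    # The exact product is 1 + 3*2**-30 + 2**-59; the last term does not
    # fit into a 53-bit mantissa and ends up in e.
    print(p, e)
    print(two_prod(2.0 ** 26.5, math.pi))
```

The pair (p, e) is what plays the role of a double-width intermediate in a reduction step.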

> 
> >On somewhat related note, it seems to me that forward trigonometric 
> >functions
> >with 'Cycle' argument are underspecified. I'd like the requirements for 
> >extremely
> >useful special cases of 'Cycle' == exact power of 2 to be more pronounced.
> 
> It's certainly possible. We'd welcome someone with substantial numeric 
> experience contributing to the ARG for Ada Standards maintenance. Most of us 
> know enough to understand the Ada numeric model and when something is 
> incorrect for it, but we don't have anyone very good at the subtle details.
> 
>                                            Randy.

I am certainly not enough of an expert.
All I can say is that when Cycle = 2, Sin(x, 2) becomes equivalent to IEEE-754 sinPi(x), which computes sin(pi*x) (Cycle = 1 corresponds to sinPi(2*x)). So maybe you can copy the IEEE requirements for sinPi(), if not in the general case then at least for IEEE-754 based platforms.
On a more general note, asking the IEEE-754 committee for help sounds like a good idea.
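
A small Python sketch of why the power-of-two case is attractive: when the cycle is a power of two, the whole argument reduction can be done without any rounding error, even for huge arguments. The function name sin_cycle is made up for this example, and it assumes that Cycle denotes the full period, as in Ada's Sin(X, Cycle); it says nothing about what the Standard should require.

```python
import math


def sin_cycle(x, cycle):
    """Sketch of sin(2*pi*x/cycle), assuming cycle is a power of two."""
    r = math.fmod(x, cycle)      # fmod is exact in floating point
    q = 4.0 * r / cycle          # position in quarter-cycles, exact scaling
    n = round(q)                 # nearest quarter-cycle boundary
    f = q - n                    # exact offset from it, |f| <= 0.5
    a = (math.pi / 2) * f        # the only rounded step before sin/cos
    quadrant = n % 4
    if quadrant == 0:
        return math.sin(a)
    if quadrant == 1:
        return math.cos(a)
    if quadrant == 2:
        return -math.sin(a)
    return -math.cos(a)


if __name__ == "__main__":
    big = 1e15 + 0.25            # exactly representable in binary64
    print(sin_cycle(big, 1.0))               # quarter of a cycle past a period
    print(math.sin(2 * math.pi * big))       # naive formula, for comparison
    print(sin_cycle(0.5, 2.0))               # sin(pi * 0.5)
```

With this convention sin_cycle(x, 2.0) computes sin(pi*x), and results at exact multiples of a quarter cycle come out as exactly 0, 1 or -1.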
