From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level: 
X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,FREEMAIL_FROM
	autolearn=unavailable autolearn_force=no version=3.4.4
X-Received: by 10.13.252.7 with SMTP id m7mr914523ywf.83.1482312561251;
        Wed, 21 Dec 2016 01:29:21 -0800 (PST)
X-Received: by 10.157.11.142 with SMTP id 14mr60763oth.20.1482312561205; Wed,
 21 Dec 2016 01:29:21 -0800 (PST)
Path: 
 eternal-september.org!reader01.eternal-september.org!reader02.eternal-september.org!news.eternal-september.org!news.eternal-september.org!feeder.eternal-september.org!news.glorb.com!n6no1651605qtd.0!news-out.google.com!c1ni2485itd.0!nntp.google.com!b123no2312752itb.0!postnews.google.com!glegroupsg2000goo.googlegroups.com!not-for-mail
Newsgroups: comp.lang.ada
Date: Wed, 21 Dec 2016 01:29:20 -0800 (PST)
In-Reply-To: <o3cldi$7md$1@franka.jacob-sparre.dk>
Complaints-To: groups-abuse@google.com
Injection-Info: glegroupsg2000goo.googlegroups.com;
 posting-host=199.203.251.52;
 posting-account=ow8VOgoAAAAfiGNvoH__Y4ADRwQF1hZW
NNTP-Posting-Host: 199.203.251.52
References: <8d0f7f03-9324-4702-9100-d6b8a1f16fc5@googlegroups.com>
 <o31i2o$k1l$1@franka.jacob-sparre.dk>
 <ce9bddec-64ba-40e9-8fc9-c70adc0555c1@googlegroups.com>
 <a6734b29-5ffc-4938-bbc2-453f7ae92325@googlegroups.com>
 <o39pet$gfc$1@franka.jacob-sparre.dk>
 <18af86db-21fd-4fa0-90a3-c87b6486b439@googlegroups.com>
 <ebrtq2F2igqU1@mid.individual.net>
 <c3725fa2-7eaa-4020-b0ae-6ddcfc2a3d1d@googlegroups.com>
 <o3cldi$7md$1@franka.jacob-sparre.dk>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <80f5343e-6aa2-4d95-9abe-0b2bf7ee6c0c@googlegroups.com>
Subject: Re: Trigonometric operations on x86 and x64 CPUs
From: already5chosen@yahoo.com
Injection-Date: Wed, 21 Dec 2016 09:29:21 +0000
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
Xref: news.eternal-september.org comp.lang.ada:32933
Date: 2016-12-21T01:29:20-08:00
List-Id: <comp.lang.ada>

On Wednesday, December 21, 2016 at 3:20:52 AM UTC+2, Randy Brukardt wrote:
> <already5chosen@yahoo.com> wrote in message=20
> news:c3725fa2-7eaa-4020-b0ae-6ddcfc2a3d1d@googlegroups.com...
> ...
> >At first glance it seems that Randy Brukardt is correct.
>=20
> It seems likely, even though today I'd be hard pressed to explain why in =
any=20
> detail. We're talking about work done in the mid-1990s.
>=20
> >Assuming IEEE binary64, for any x in range [-2**26.5..2**26.5] requireme=
nts
> >want Sin(x) to return the number in range [exact_sin(x)*(1-d*2**53) ..
> >exact_sin(x)*(1+d*2**(-53))] where d=3D2.
> >x87 SIN instruction by itself achieves specified precision in smaller ra=
nge
> > (~ up to abs(x) < 2^14).
> >
> >It means that conforming implementations of  Ada libraries forced to spe=
nds
> >a significant effort doing reduction of stupidly big arguments of=20
> >sin/cos/tan.
>=20
> Also on doing sanity checks of operands=20

That's not the same.
Unlike argument reduction, range/Inf/NaN checks are computationally cheap.

> (but of course that's generally a=20
> strength of Ada).

In specific case of numeric libraries, sanity checks are not unique to Ada.=
 Right now I can't recollect the language that does *not* do sanity checks =
in its numeric lib. Except, of course, for functions like sin/cos where som=
e values of input arguments do not make physical sense, but nevertheless al=
l inputs are legal.

> A version of GEF (Generic_Elementary_Functions) that used=20
> Ada 2012 preconditions could avoid much of that overhead, and would be an=
=20
> advantage for really speed-critical operations.
>=20
> >On the other hand The Requirements allow rather poor precision for small
> >arguments (d=3D2) where better precision (d=3D1.25 or d=3D1.125) is both
> >desirable and not especially hard to achieve.
>=20
> Keep in mind that the requirements were written by a team of numerical=20
> analysis experts in 1992-4 based on the state of the art at that time.=20
> (Probably the Cody/Waite algorithms.) Those of us who maintain the Standa=
rd=20
> today don't really have the expertise to make any informed changes to the=
se=20
> rules, so for the most part we keep our hands off (the worst thing we cou=
ld=20
> do would be to mess up carefully considered rules - but we have fixed a=
=20
> number of obvious errors).
>=20
> >There is a consolation, too - the range reduction in the minimal range=
=20
> >required
> >by the standard can be done without big tables. And it can be implemente=
d
> >relatively quickly if the hardware features fused multiply-add.
>=20
> My recollection is that it isn't even that bad if one writes the entire=
=20
> algorithm in Ada. (Since our compiler generally keeps intermediate result=
s=20
> in the 80-bit extended format, we tended to write as much as possible as =
a=20
> single large expression. Probably could do that even better today with Ad=
a=20
> 2012 conditional expressions.)

80-bit format helps when underlying machine has 80-bit format. Many machine=
s have not.
Also, I suppose that even on x386/AMD64 machines that do support 80-bit for=
mat,  modern code generators by default use SSE/AVX registers rather than x=
87 registers.
FMA makes argument reduction easier even when extended precision is not ava=
ilable.

Besides, it seems to me that at upper edge of our problematic range 80-bit =
precision of intermediate results is insufficient for really simple range r=
eduction. To make things really simple, one needs intermediates with 53+26.=
5=3D80 bits of mantissa. 80-bit extended precision has only 64 bits, so you=
'll still need to do the reduction by several carefully measured steps. FMA=
, on the other hand, when used smartly, gives you equivalent of intermediat=
e with 106-bit mantissa.


>=20
> >On somewhat related note, it seems to me that forward trigonometric=20
> >functions
> >with 'Cycle' argument are underspecified. I'd like the requirements for=
=20
> >extremely
> >useful special cases of 'Cycle' =3D=3D exact power of 2 to be more prono=
unced.
>=20
> It's certainly possible. We'd welcome someone with substantial numeric=20
> experience contributing to the ARG for Ada Standards maintenance. Most of=
 us=20
> know enough to understand the Ada numeric model and when something is=20
> incorrect for it, but we don't have anyone very good at the subtle detail=
s.
>=20
>                                            Randy.

I am certainly not enough of an expert.
All I can say is that when Cycle=3D=3D1 then Sin(x, 1) becomes an equivalen=
t of IEEE-754 sinPi(x), so, may be, you can copy IEEE requirements for sinP=
i() if not in general case than, at least for IEEE-754 based platforms.
On more general note, asking IEEE-754 committee for help sounds like a good=
 idea.