From: Adam Beneschan
Newsgroups: comp.lang.ada
Subject: Re: Cannot summate small float values
Date: Mon, 22 Nov 2010 08:30:48 -0800 (PST)
References: <8kq1usFojgU1@mid.individual.net>

On Nov 21, 1:06 pm, tolkamp wrote:
> On 20 nov, 14:49, Niklas Holsti wrote:
> > tolkamp wrote:
> > > When I summate Float values smaller then 1.0E-6 then the summation is
> > > not done.
> > >
> > > Code Example:
> > >
> > > X, Dx : Float;
> > > X := 0.0;
> > > Dx := 1.0E-7;
> > > lwhile X < 1.0 loop
> > >     X = X + Dx;
> > >     Float_Io.Put(X, 3,9,0); New_Line;
> > > end loop;
> >
> > Certainly the addition is done.
> > Your program (after some small syntactic
> > corrections) prints:
> >
> >    0.000000100
> >    0.000000200
> >    0.000000300
> >    0.000000400
> >    0.000000500
> >    0.000000600
> >    0.000000700
> >
> > and so on. If your program prints out something else, please show the
> > source code of your whole program, exactly as you compile and run it.
> > Don't re-type it into your message.
> >
> > However, when X approaches 1.0, at some point the addition of 1.0E-7 may
> > be lost in round-off, since it is close to the precision limit of the
> > Float type, relative to 1.0. On my system (Debian, Gnat) the X variable
> > does reach 1.0 and the program stops.
> >
> > What are you really trying to do? There are probably safer and more
> > accurate ways of doing it.
> >
> > Here is the program that I used:
> >
> > with Ada.Text_IO;
> > with Ada.Float_Text_IO;
> >
> > procedure Sums
> > is
> >     use Ada.Text_IO, Ada.Float_Text_IO;
> >     X, Dx : Float;
> > begin
> >     X := 0.0;
> >     Dx := 1.0E-7;
> >     while X < 1.0 loop
> >         X := X + Dx;
> >         Put(X, 3,9,0); New_Line;
> >     end loop;
> > end Sums;
>
> Thank you your reaction.
> Using your procedure Sums I found out that when the start value of X <
> 0.24 the summation works correct with Dx = 1.0E-8
> When start X > 0.25 the summation remains 0.250000000.

You seem to lack a fundamental understanding of how floating-point
works. In a 32-bit float, 23 of those bits are the "mantissa"; the
rest are used for the sign and exponent. If the 23 bits are
bbbb---bbbb, then the value represented by the 32-bit float is

   [possibly negative] 1.bbbb---bbbb * (2**exp)

where exp is the exponent (represented by the other bits of the
float). The 1.bbbb---bbbb is in binary notation, so that the first
"b" represents 2**-1, the second is 2**-2, etc. and the last is 2**-23.
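Since 2**-23 is about 1.192E-7, this means that the ratio between
1.0000----0000 and 1.0000----0001 will be 1 + 1.192E-7. So if you
start with the number 0.25, the smallest number greater than 0.25 that
can be represented is 0.25 + (0.25 * 1.192E-7). This last part is
2.98E-8, about three times the 1E-8 that you're trying to add. Since
1E-8 is less than half of that gap, the sum rounds back to 0.25, which
is why adding 1E-8 makes no difference. Just below 0.25 the gap is
half as large (1.49E-8), so there the same addition still rounds up
to the next representable value, which matches what you saw for start
values under 0.24.

                                -- Adam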
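
The gap argument above can be checked directly. The following is a
minimal sketch, not part of the original exchange: the procedure name
Gap_Check and the variable Sum are made up for illustration, and it
assumes that Float is a 32-bit IEEE single-precision type and that the
additions are carried out in that precision (typical for GNAT on
x86-64, but not guaranteed by the language).

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Gap_Check is
   --  Hypothetical example. X and Dx are variables rather than
   --  constants so that the sums are computed at run time in type
   --  Float instead of being folded exactly by the compiler as
   --  static expressions.
   X   : Float := 0.25;
   Dx  : Float := 1.0E-8;
   Sum : Float;
begin
   --  Float'Succ yields the next machine number above its argument,
   --  so the difference is the gap described in the message.
   Put_Line ("gap above 0.25:   " & Float'Image (Float'Succ (X) - X));

   Sum := X + Dx;
   Put_Line ("0.25 + Dx = 0.25: " & Boolean'Image (Sum = X));

   --  Repeat the experiment one machine number below 0.25, where the
   --  exponent is one lower and the gap is half as large.
   X   := Float'Pred (0.25);
   Sum := X + Dx;
   Put_Line ("gap below 0.25:   " & Float'Image (Float'Succ (X) - X));
   Put_Line ("Pred + Dx = Pred: " & Boolean'Image (Sum = X));
end Gap_Check;
```

Under the stated assumptions, the reasoning in the message predicts
that the first comparison is TRUE (the addition is lost) and the
second is FALSE (the sum rounds up to 0.25). A compiler that evaluates
in extended precision, or a Float type with a different mantissa
length, could behave differently.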