* Cannot summate small float values
@ 2010-11-20 12:47 tolkamp
2010-11-20 13:49 ` Niklas Holsti
0 siblings, 1 reply; 7+ messages in thread
From: tolkamp @ 2010-11-20 12:47 UTC (permalink / raw)
When I sum Float values smaller than 1.0E-6, the summation is
not done.
Code example:
X, Dx : Float;
X := 0.0;
Dx := 1.0E-7;
lwhile X < 1.0 loop
X = X + Dx;
Float_Io.Put(X, 3,9,0); New_Line;
end loop;
* Re: Cannot summate small float values
From: Niklas Holsti @ 2010-11-20 13:49 UTC (permalink / raw)
tolkamp wrote:
> When I sum Float values smaller than 1.0E-6, the summation is
> not done.
>
> Code Example:
>
> X, Dx : Float;
> X := 0.0;
> Dx := 1.0E-7;
> lwhile X < 1.0 loop
> X = X + Dx;
> Float_Io.Put(X, 3,9,0); New_Line;
> end loop;
Certainly the addition is done. Your program (after some small syntactic
corrections) prints:
0.000000100
0.000000200
0.000000300
0.000000400
0.000000500
0.000000600
0.000000700
and so on. If your program prints out something else, please show the
source code of your whole program, exactly as you compile and run it.
Don't re-type it into your message.
However, when X approaches 1.0, at some point the addition of 1.0E-7 may
be lost in round-off, since it is close to the precision limit of the
Float type, relative to 1.0. On my system (Debian, Gnat) the X variable
does reach 1.0 and the program stops.
What are you really trying to do? There are probably safer and more
accurate ways of doing it.
Here is the program that I used:
with Ada.Text_IO;
with Ada.Float_Text_IO;

procedure Sums
is
   use Ada.Text_IO, Ada.Float_Text_IO;
   X, Dx : Float;
begin
   X := 0.0;
   Dx := 1.0E-7;
   while X < 1.0 loop
      X := X + Dx;
      Put(X, 3,9,0); New_Line;
   end loop;
end Sums;
--
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
. @ .
* Re: Cannot summate small float values
From: tolkamp @ 2010-11-21 21:06 UTC (permalink / raw)
Thank you for your reply.
Using your procedure Sums, I found that when the start value of X is below
0.24, the summation works correctly with Dx = 1.0E-8.
When the start value of X is above 0.25, the sum stays at 0.250000000.
* Re: Cannot summate small float values
From: Niklas Holsti @ 2010-11-21 21:18 UTC (permalink / raw)
tolkamp wrote:
> Thank you for your reply.
> Using your procedure Sums, I found that when the start value of X is below
> 0.24, the summation works correctly with Dx = 1.0E-8.
> When the start value of X is above 0.25, the sum stays at 0.250000000.
That is still correct behaviour: floating-point addition can lose the
smaller addend entirely when the two addends have very different
magnitudes. If you need to compute the sum of a large set of
floating-point numbers, you should use a floating-point type with more
digits or use a smart summation algorithm like the one by Kahan,
http://en.wikipedia.org/wiki/Kahan_summation_algorithm
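As a rough sketch of the compensated summation mentioned above, the
following program adds Dx = 1.0E-8 to a start value of 0.25 both plainly
and with Kahan's correction term. The procedure name, the step count and
the variable names are only illustrative, and it assumes that Float is a
32-bit type whose sums are not kept in wider intermediate precision.

```ada
with Ada.Text_IO;
with Ada.Float_Text_IO;

procedure Kahan_Sums is
   use Ada.Text_IO, Ada.Float_Text_IO;

   Dx    : constant Float := 1.0E-8;
   Steps : constant := 1_000_000;  -- illustrative count, not from the thread

   Naive : Float := 0.25;  -- plain running sum
   Sum   : Float := 0.25;  -- compensated running sum
   Comp  : Float := 0.0;   -- rounding error carried over from earlier steps
   Y, T  : Float;
begin
   for I in 1 .. Steps loop
      Naive := Naive + Dx;

      Y    := Dx - Comp;      -- correct the addend by the carried error
      T    := Sum + Y;        -- low-order bits of Y may be lost here
      Comp := (T - Sum) - Y;  -- what was actually added minus what was meant
      Sum  := T;
   end loop;

   Put ("Naive: "); Put (Naive, 3, 9, 0); New_Line;
   Put ("Kahan: "); Put (Sum, 3, 9, 0); New_Line;
end Kahan_Sums;
```

Under those assumptions the plain sum should stay at 0.25 while the
compensated sum should end close to 0.26, since the million steps add up
to 0.01.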
--
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
. @ .
* Re: Cannot summate small float values
From: Gautier write-only @ 2010-11-22 1:23 UTC (permalink / raw)
On 21 nov, 22:06, tolkamp wrote:
> Thank you for your reply.
> Using your procedure Sums, I found that when the start value of X is below
> 0.24, the summation works correctly with Dx = 1.0E-8.
> When the start value of X is above 0.25, the sum stays at 0.250000000.
Anyway, since Dx is constant, you should never add it over and over:
it will accumulate numerical errors (with very rare exceptions) even
if it appears to work correctly for a few iterations.
The right way is
   X := Real(N) * Dx;
   N := N + 1;
where Real is your floating-point type and N is an integer step counter.
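A minimal sketch of that idea as a complete program might look like the
following. The names Steps_By_Product, Step_Count and X0 are made up for
illustration, Real is declared with "digits 6" as a stand-in for whatever
floating-point type you use, and the start value 0.25 and step 1.0E-8 are
taken from the earlier message.

```ada
with Ada.Text_IO;

procedure Steps_By_Product is
   use Ada.Text_IO;

   type Real is digits 6;                    -- hypothetical floating-point type
   type Step_Count is range 0 .. 5_000_000;  -- hypothetical step counter type
   package Real_IO is new Float_IO (Real);

   X0 : constant Real := 0.25;    -- start value reported in the thread
   Dx : constant Real := 1.0E-8;  -- step reported in the thread
   X  : Real;
begin
   for N in Step_Count loop
      --  Recompute X from the counter each time; there is no running
      --  sum, so rounding errors cannot build up from step to step.
      X := X0 + Real (N) * Dx;

      --  Print only every millionth value to keep the output short.
      if N mod 1_000_000 = 0 then
         Real_IO.Put (X, 3, 9, 0);
         New_Line;
      end if;
   end loop;
end Steps_By_Product;
```

Each X is still rounded to the nearest value that Real can represent, so
neighbouring steps may print the same number, but the error in any one
value stays at about one rounding step instead of growing with N.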
Cheers
______________________________________________________________
Gautier's Ada programming -- http://gautiersblog.blogspot.com/
NB: follow the above link for a working e-mail address
* Re: Cannot summate small float values
From: Julian Leyh @ 2010-11-22 8:35 UTC (permalink / raw)
On 21 Nov., 22:06, tolkamp <f.tolk...@gmail.com> wrote:
> Thank you for your reply.
> Using your procedure Sums, I found that when the start value of X is below
> 0.24, the summation works correctly with Dx = 1.0E-8.
> When the start value of X is above 0.25, the sum stays at 0.250000000.
This should be taught in the basics of computer numerics or processor
architecture. Have a look at IEEE 754 floating-point arithmetic.
* Re: Cannot summate small float values
From: Adam Beneschan @ 2010-11-22 16:30 UTC (permalink / raw)
On Nov 21, 1:06 pm, tolkamp <f.tolk...@gmail.com> wrote:
> Thank you for your reply.
> Using your procedure Sums, I found that when the start value of X is below
> 0.24, the summation works correctly with Dx = 1.0E-8.
> When the start value of X is above 0.25, the sum stays at 0.250000000.
You seem to lack a fundamental understanding of how floating-point
works.
In a 32-bit float (IEEE 754 single precision), 23 of those bits are the
"mantissa"; the rest are used for the sign and exponent. If the 23 bits
are bbbb---bbbb, then the value represented by the 32-bit float is
   [possibly negative] 1.bbbb---bbbb * (2**exp)
where exp is the exponent (represented by the other bits of the
float). The 1.bbbb---bbbb is in binary notation, so that the first
"b" represents 2**-1, the second is 2**-2, etc. and the last is
2**-23.
Since 2**-23 is about 1.192E-7, this means that the ratio between
1.0000----0000 and 1.0000----0001 will be 1 + 1.192E-7. So if you
start with the number 0.25, the smallest number greater than 0.25 that
can be represented is 0.25 + (0.25 * 1.192E-7). This last part is
2.98E-8. The exact sum 0.25 + 1E-8 falls between two representable
values and has to be rounded to one of them; with the usual
round-to-nearest, anything less than half of 2.98E-8 (that is, less
than 1.49E-8) rounds back to 0.25, which is why 1E-8 is too small to
make a difference when added. Just below 0.25 the spacing is only
1.49E-8, so 1E-8 is more than half a step and the sum does advance
(by 1.49E-8 per step rather than by 1E-8).
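The spacing argument above can be checked with Ada's floating-point
attributes. This is only a sketch: the procedure name and the Wide type
are made up for illustration, and it assumes that the target supports a
15-digit type and that Float is IEEE single precision.

```ada
with Ada.Text_IO;

procedure Float_Spacing is
   use Ada.Text_IO;

   type Wide is digits 15;  -- hypothetical wider type, used only for printing
   package Wide_IO is new Float_IO (Wide);

   --  Variables rather than constants, so that X + Dx is computed in
   --  Float arithmetic and is not a static expression that the
   --  compiler evaluates exactly.
   X   : Float := 0.25;
   Dx  : Float := 1.0E-8;
   Sum : Float;
   Gap : Wide;
begin
   --  Float'Succ gives the nearest machine number above X.
   Gap := Wide (Float'Succ (X)) - Wide (X);
   Sum := X + Dx;

   Put_Line ("Mantissa bits, implicit 1 included:"
             & Integer'Image (Float'Machine_Mantissa));
   Put ("Gap above X : "); Wide_IO.Put (Gap, 1, 6, 3); New_Line;
   Put ("Half the gap: "); Wide_IO.Put (Gap / 2.0, 1, 6, 3); New_Line;

   if Sum = X then
      Put_Line ("X + Dx rounded back to X");
   else
      Put_Line ("X + Dx moved to a neighbouring value");
   end if;
end Float_Spacing;
```

Changing the initial value of X to something below 0.25, such as 0.24,
should show a gap half as large and a sum that does move.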
-- Adam
Thread overview: 7+ messages
2010-11-20 12:47 Cannot summate small float values tolkamp
2010-11-20 13:49 ` Niklas Holsti
2010-11-21 21:06 ` tolkamp
2010-11-21 21:18 ` Niklas Holsti
2010-11-22 1:23 ` Gautier write-only
2010-11-22 8:35 ` Julian Leyh
2010-11-22 16:30 ` Adam Beneschan