comp.lang.ada
From: tmoran@acm.org
Subject: Re: OT: Incremental statistics functions
Date: Tue, 27 Jan 2004 04:56:52 GMT
Message-ID: <ocmRb.31182$U%5.206154@attbi_s03>
In-Reply-To: MMednVmf58F6QYjd4p2dnA@comcast.com

>Yes, be very, very careful.  The problem/issue is that for the general
>case computing the mean and then the standard deviation has very good
>mathematical properties.  Computing them incrementally does not.  If you
>have an 'expected' value for the mean (call it u) you can use it to
>accumulate (Xi-u)**2 and if u is 'close' to the mean, the numerical
  If your incoming data is random, you could use the average of the first
bunch of values as u.  Or if N is large and the incoming data is really
random, you might want to act like a pollster and just take a subsample.
If the data has a pattern, it might be worth using other Monte Carlo
techniques.
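
  As a rough sketch of the shifted accumulation discussed above (in
Python rather than Ada, for brevity): the name shifted_mean_variance and
the warmup parameter are made up for illustration, and taking u as the
plain average of the first warmup values is just one assumed reading of
"the first bunch".

```python
from itertools import chain, islice


def shifted_mean_variance(values, warmup=10):
    """One-pass mean and sample variance around a provisional mean u.

    Assumption: u is the average of the first `warmup` values.
    """
    it = iter(values)
    head = list(islice(it, warmup))      # the "first bunch"
    if len(head) < 2:
        raise ValueError("need at least two values")
    u = sum(head) / len(head)            # provisional mean

    n = 0
    sum_d = 0.0                          # running sum of (x - u)
    sum_d2 = 0.0                         # running sum of (x - u)**2
    for x in chain(head, it):
        d = x - u
        n += 1
        sum_d += d
        sum_d2 += d * d

    mean = u + sum_d / n
    # Sum of squares about the true mean, recovered from the shifted sums.
    variance = (sum_d2 - sum_d * sum_d / n) / (n - 1)
    return mean, variance


if __name__ == "__main__":
    data = [1e9 + v for v in (4.0, 7.0, 13.0, 16.0)]
    print(shifted_mean_variance(data, warmup=2))
```

  The result is algebraically the same for any u; a u near the true mean
only keeps the accumulated sums small, so less is lost to rounding.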
  Even the average can be a problem if N is large.  Say you use 6 decimal
digit floating point for the running sum and the first million values are
all 10.0, followed by a million 9.0s, so the correct average is 9.5.  After
adding the million 10.0s your floating sum is ten million and has lost its
units position (its last digit now counts hundreds), so all the additions
of 9.0 change nothing.  After all the data is in, the sum is still ten
million, which divided by N of 2 million gives you a value of 5.0, not the
correct 9.5.
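
  The arithmetic above can be replayed with Python's decimal module, which
lets you choose the number of significant digits.  This is only a
simulation of a 6 digit decimal float, not any particular hardware format,
and running_average and example_data are names invented here.

```python
from decimal import Context, Decimal, ROUND_HALF_EVEN
from itertools import chain, repeat


def running_average(values, digits=6):
    """Average `values` with a running sum rounded to `digits` digits."""
    ctx = Context(prec=digits, rounding=ROUND_HALF_EVEN)
    total = Decimal(0)
    n = 0
    for x in values:
        total = ctx.add(total, x)   # every addition is rounded to `digits`
        n += 1
    return ctx.divide(total, Decimal(n))


def example_data():
    """A million 10.0s followed by a million 9.0s, as in the post."""
    return chain(repeat(Decimal("10.0"), 1_000_000),
                 repeat(Decimal("9.0"), 1_000_000))


if __name__ == "__main__":
    # With 6 digits the 9.0s are rounded away once the sum hits ten million.
    print(running_average(example_data(), digits=6))
    # With plenty of digits the running sum stays exact.
    print(running_average(example_data(), digits=28))
```

  The 6 digit run comes out equal to 5 and the 28 digit run equal to 9.5,
matching the figures in the paragraph above.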
  If you are just calculating the average and variance (standard deviation
squared) of the heights of people in your office, all these considerations
are irrelevant and you probably don't need a statistician. :)



