From: "Robert I. Eachus"
Newsgroups: comp.lang.ada
Subject: Re: OT: Incremental statistics functions
Date: Mon, 26 Jan 2004 22:37:08 -0500
References: <86k73e9uzk.fsf@lucretia.kaos>

Mats Karlssohn wrote:

> Basically I don't want to keep a (limited) buffer of samples but would like
> to add the values one at a time when they are
> calculated.
>
> Probably I googled bad, since I didn't find anything I could understand.
>
> Any suggestions please?

Yes, be very, very careful. The issue is that, in the general case, computing the mean first and then the standard deviation has very good numerical properties. Computing them incrementally does not. If you have an 'expected' value for the mean (call it u), you can use it to accumulate (Xi-u)**2, and if u is 'close' to the mean, the numerical characteristics will be fine. (To put that technically, if your estimate u is less than half a standard deviation from the sample mean, you should have nothing to worry about.)

To put all this in perspective, say you are monitoring daily low temperatures in New Hampshire in January. There is no problem: the day-to-day differences are larger than the difference between zero and the average. Try the same thing in Iraq in July, and you won't do as well. Even if you are monitoring low temperatures, eventually the difference between the sum of the squares and n times x-bar squared will be much smaller than those two numbers. And if you are using floating point, that ratio determines how much significance you have lost. Your estimate of the variance or standard deviation may have just a few significant bits (one significant digit) or, worse, no significant bits or digits.

You can guard against this to some extent by using IEEE double or extended for computing the sum of the squares, but that only postpones the point at which you run out of significance; it doesn't prevent it.

-- 
Robert I. Eachus

"The war on terror is a different kind of war, waged capture by capture, cell by cell, and victory by victory. Our security is assured by our perseverance and by our sure belief in the success of liberty." -- George W. Bush
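
As a rough illustration of the shifted accumulation described above, here is a minimal sketch. It is in Python rather than Ada purely for brevity, and the class name ShiftedStats, its methods, and the sample values are all made up for the example; nothing here comes from the thread itself. It accumulates sums of (Xi-u) and (Xi-u)**2 one sample at a time, then compares a good guess for u with the naive choice u = 0.

```python
import math


class ShiftedStats:
    """Running mean and sample variance, accumulated around a guess u.

    Hypothetical helper for illustration only.
    """

    def __init__(self, u=0.0):
        self.u = u          # assumed 'expected' mean, supplied by the caller
        self.n = 0          # number of samples seen so far
        self.sum_d = 0.0    # running sum of (x - u)
        self.sum_d2 = 0.0   # running sum of (x - u)**2

    def add(self, x):
        d = x - self.u
        self.n += 1
        self.sum_d += d
        self.sum_d2 += d * d

    def mean(self):
        return self.u + self.sum_d / self.n

    def variance(self):
        # Sample variance: (sum d^2 - (sum d)^2 / n) / (n - 1).
        # The subtraction is where significance is lost when u is far
        # from the true mean.
        return (self.sum_d2 - self.sum_d * self.sum_d / self.n) / (self.n - 1)

    def stddev(self):
        # Clamp at zero: rounding can push a poor estimate negative.
        return math.sqrt(max(self.variance(), 0.0))


# Illustrative data: a large offset with a small spread (true variance 30).
samples = [1e9 + 4.0, 1e9 + 7.0, 1e9 + 13.0, 1e9 + 16.0]

good = ShiftedStats(u=1e9)   # guess close to the sample mean
naive = ShiftedStats(u=0.0)  # plain sum of squares, no shift
for x in samples:
    good.add(x)
    naive.add(x)

print(good.variance())   # 30.0
print(naive.variance())  # far from 30: the significance is gone
```

With IEEE doubles the shifted accumulator recovers the variance exactly for this data, while the unshifted one does not even get the sign right. Shrinking the offset or widening the spread makes the two agree again, which is the New Hampshire versus Iraq point in miniature.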