From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level: 
X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00 autolearn=ham
	autolearn_force=no version=3.4.4
X-Google-Language: ENGLISH,ASCII-7-bit
X-Google-Thread: 103376,608f4b25931220fc
X-Google-Attributes: gid103376,public
X-Google-ArrivalTime: 2004-01-28 11:56:34 PST
Path: 
 archiver1.google.com!news2.google.com!newsfeed2.dallas1.level3.net!news.level3.com!crtntx1-snh1.gtei.net!news.gtei.net!chcgil2-snh1.gtei.net!news.bbnplanet.com!wn11feed!worldnet.att.net!attbi_feed3!attbi.com!attbi_s02.POSTED!not-for-mail
From: tmoran@acm.org
Newsgroups: comp.lang.ada
Subject: OT: large sums; was Incremental statistics functions
References: <dhDRb.130506$sv6.691674@attbi_s52>
X-Newsreader: Tom's custom newsreader
Message-ID: <RtURb.174943$xy6.818300@attbi_s02>
NNTP-Posting-Host: 67.161.24.134
X-Complaints-To: abuse@comcast.net
X-Trace: attbi_s02 1075319793 67.161.24.134 (Wed, 28 Jan 2004 19:56:33 GMT)
NNTP-Posting-Date: Wed, 28 Jan 2004 19:56:33 GMT
Organization: Comcast Online
Date: Wed, 28 Jan 2004 19:56:33 GMT
Xref: archiver1.google.com comp.lang.ada:5006
Date: 2004-01-28T19:56:33+00:00
List-Id: <comp.lang.ada>

  If you add many similar numbers with a non-zero average, say in
calulating an average or the sum of squares for a variance, you soon find
that a value is being added to about N* that value, which loses log2(N)
bits in floating point.  For instance, using 4 byte IEEE floating point,
summing 8000 times the number  111111.0, then dividing by 8000.0,
yields an incorrect average of 111105.0

  If you take the sum of the first two values, then the sum of the
next two, and then add those two sums, you get a more accurate
value for the sum of all four.  Building on that, here's an algorithm
for calculating a large sum.  This algorithm is new to me; I would be
interested in seeing references to it.

  type Registers is record
    Full    : Boolean := False;
    Content : Float;
  end record;

  -- To add N values we need log2(N) registers.
  subtype Register_Indices is Integer range 1 .. 64; -- allow 2**64 values

  type Register_Sets is array (Register_Indices range <>) of Registers;

  type Accumulators(N : Register_Indices) is record
    Reg : Register_Sets(1 .. N);
  end record;

  procedure Reset(Acc : in out Accumulators) is
  begin
    Acc.Reg := (others => (Full => False, Content => 0.0));
  end Reset;

  procedure Add(Acc : in out Accumulators;
                X   : in     Float) is
    Partial_Sum: Float := X;
  begin
    for I in Acc.Reg'range loop
      if not Acc.Reg(I).Full then
        Acc.Reg(I).Content := Partial_Sum;
        Acc.Reg(I).Full := True;
        return;
      end if;
      Partial_Sum := Partial_Sum + Acc.Reg(I).Content;
      Acc.Reg(I).Full := False;
    end loop;
    raise Constraint_Error; -- overflow
  end Add;

  function Sum(Acc : Accumulators) return Float is
  -- Return current sum in Acc.
  -- Best performance when N is a power of two, worst when it's a bit larger.
    Result  : Float := 0.0;
  begin
    for I in Acc.Reg'range loop -- start by adding the small values
      if Acc.Reg(I).Full then
        Result := Result + Acc.Reg(I).Content;
      end if;
    end loop;
    return Result;
  end Sum;