From: Dr Adrian Wrigley <amtw@linuxchip.demon.co.uk>
Subject: Re: Problems with large records (GNAT) [continued]
Date: Thu, 01 Mar 2001 13:52:28 -0800
Message-ID: <3A9EC49C.9F5CD8D6@linuxchip.demon.co.uk>
In-Reply-To: esmn6.41158$5M5.2034803@news1.frmt1.sfba.home.com
tmoran@acm.org wrote:
> I realize you are interested in the generic large-memory-with-Gnat
> problem, but, as they say, sometimes it's better to improve the
> algorithm than the hardware.
I have the same frustration with this problem as with things like
segmented memory architectures, short index registers etc.
They all tend to result in less robust code, or a lot more work.
Hitting one of the various memory limits is one of the
common problems I encounter running GNAT/Linux.
I plan to go to (partial) intra-day data sometime, so that will
need a better representation.
...
> For historical data, stock prices can be 16 bit fixed point with delta of
> 1/8, rather than 32 bit floats (excluding Berkshire-Hathaway). Even
> with prices in pennies nowadays, 24 bits should be quite enough for a
> stock price. Similarly, a 24 bit Volume (16 million shares of one
> stock traded in one day) should normally be adequate, perhaps with
> an exception list for anything that doesn't fit. A sixteen bit fixed
> point value, with suitable delta, should be fine for holding the
> split correction, or 24 bits if you really want to allow for even the
> most bizarre changes.
I decided that 16 bits was inadequate. Even with prices in the
range $0.05 to $500, you need 20 bits to accommodate a delta
representing 1% at the bottom end. Companies that have had
a lot of splits and dividends in their history have very
small prices back in the '70s. Perhaps a 16 bit logarithm of
the share price would be OK. (and even speed up volatility
calculations!)
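
The 20-bit figure, and the 16-bit logarithm idea, can be checked with a
short sketch. The function names and the log range of $0.0001 to $100000
are assumptions made for illustration; they are not from the thread.

```python
import math

def bits_for_fixed_point(lowest, highest, rel_step):
    """Bits needed for an unsigned fixed-point type that reaches `highest`
    with a delta equal to `rel_step` of the `lowest` price."""
    delta = lowest * rel_step
    return math.ceil(math.log2(highest / delta))

# Hypothetical price range covered by the 16-bit logarithmic code.
LOG_MIN = math.log(0.0001)
LOG_MAX = math.log(100000.0)
SCALE = 65535 / (LOG_MAX - LOG_MIN)   # codes per unit of natural log

def encode_log16(price):
    """Map a price to a 16-bit code proportional to log(price)."""
    return round((math.log(price) - LOG_MIN) * SCALE)

def decode_log16(code):
    """Recover an approximate price from a 16-bit code."""
    return math.exp(code / SCALE + LOG_MIN)

def log_return(code_then, code_now):
    """Log return between two stored codes: a subtraction and a scale."""
    return (code_now - code_then) / SCALE
```

With these assumptions, bits_for_fixed_point(0.05, 500.0, 0.01) gives 20,
and the log code has a roughly constant relative step of about 0.03%
across the whole range, which is why volatility work on log returns
reduces to differences of stored codes.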
With volume, I think that really needs to be better than a 32 bit range.
Once you start to calculate weekly or monthly volumes, quite a number
of companies exceed 2**32 shares (and in some countries, they
even trade fractional shares routinely). Maybe you've seen the
WWW sites of historic data that show Intel's monthly share volume
as things like "-1518500200 shares". I mentioned this problem
to Yahoo nearly a year ago, but they haven't fixed it.
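
A negative volume like that is what a signed 32 bit counter shows after
it wraps. A minimal sketch follows; the "true" volume used in it is a
hypothetical value, chosen only because it wraps to the figure quoted
above (the real volume is not known here).

```python
def wrap_int32(n):
    """Reinterpret an integer as a signed 32-bit two's-complement value."""
    return ((n + 2**31) % 2**32) - 2**31

def fits_int64(n):
    """True if n is representable as a signed 64-bit integer."""
    return -2**63 <= n < 2**63

# Hypothetical monthly volume, just above 2**31 shares.
ASSUMED_TRUE_VOLUME = 2_776_467_096

if __name__ == "__main__":
    print(wrap_int32(ASSUMED_TRUE_VOLUME))   # value a 32-bit field would show
    print(fits_int64(ASSUMED_TRUE_VOLUME))   # a 64-bit field has ample room
```

Note that the wrap already happens above 2**31 for a signed field, so the
practical limit is about 2.1 billion shares rather than 4.3 billion.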
When it comes down to it, it is a matter of confidence and simplicity.
Fixed point for this wide ranging data doesn't give me the confidence
I want from a (mission critical) financial application.
I hadn't thought of using 24 bit values, and I think they would
not be worthwhile here given the issues involved.
> I don't know what kind of processing you are
> doing, but usually one processes a small number of complete time
> series, or the complete market for just a few days, so only a few
> rows or columns of the complete matrix need be in RAM at any one time.
That's why I want a very fast data access method... I want to
scan all the stocks over all the times. Sometimes I access the data
sparsely as well. With mmap, the data from one invocation to another
remain in RAM, and can be completely scanned in only a few seconds.
Maybe someday there will be a standard persistent object store
package in the Ada standard. Loading data from files into RAM
tends to be amazingly slow, when the file and the in-memory
representation are both as big as the physical memory - and
my machine has no free memory slots :(
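
For readers unfamiliar with the approach, here is a minimal sketch of
memory-mapped access to a stock-by-day price matrix. It is written in
Python rather than Ada, and the file layout (row-major native 32 bit
floats), the tiny dimensions and all names are assumptions made for
illustration; it is not the code discussed in this thread.

```python
import mmap
import os
import struct
import tempfile

N_STOCKS, N_DAYS = 4, 5   # hypothetical, tiny on purpose

def write_prices(path):
    """Create a demo file: one native float32 per (stock, day), row-major."""
    with open(path, "wb") as f:
        for stock in range(N_STOCKS):
            for day in range(N_DAYS):
                f.write(struct.pack("f", float(stock * 100 + day)))

def scan_total(path):
    """Scan every stock over every day through a read-only mapping."""
    with open(path, "rb") as f, \
         mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm, \
         memoryview(mm) as raw, raw.cast("f") as prices:
        return sum(prices)

def price_at(path, stock, day):
    """Sparse access: only the page holding this element is touched."""
    with open(path, "rb") as f, \
         mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm, \
         memoryview(mm) as raw, raw.cast("f") as prices:
        return prices[stock * N_DAYS + day]

if __name__ == "__main__":
    demo = os.path.join(tempfile.mkdtemp(), "prices.bin")
    write_prices(demo)
    print(scan_total(demo), price_at(demo, 2, 3))
```

Nothing is parsed or copied when the file is opened: the file is the
in-memory representation, and pages that the operating system still has
cached from an earlier run are available to the next run immediately.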
--
Adrian Wrigley