From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level: 
X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,FREEMAIL_FROM
	autolearn=unavailable autolearn_force=no version=3.4.4
X-Received: by 10.36.48.67 with SMTP id q64mr4718785itq.55.1516807103399;
        Wed, 24 Jan 2018 07:18:23 -0800 (PST)
X-Received: by 10.157.31.57 with SMTP id x54mr692842otd.1.1516807103318; Wed,
 24 Jan 2018 07:18:23 -0800 (PST)
Path: 
 eternal-september.org!reader01.eternal-september.org!reader02.eternal-september.org!feeder.eternal-september.org!paganini.bofh.team!weretis.net!feeder6.news.weretis.net!feeder.usenetexpress.com!feeder-in1.iad1.usenetexpress.com!border1.nntp.dca1.giganews.com!nntp.giganews.com!g80no104426itg.0!news-out.google.com!s63ni4142itb.0!nntp.google.com!g80no104424itg.0!postnews.google.com!glegroupsg2000goo.googlegroups.com!not-for-mail
Newsgroups: comp.lang.ada
Date: Wed, 24 Jan 2018 07:18:22 -0800 (PST)
In-Reply-To: <f1cc3c80-9b31-4a74-a201-0e209a789fc4@googlegroups.com>
Complaints-To: groups-abuse@google.com
Injection-Info: glegroupsg2000goo.googlegroups.com;
 posting-host=2601:191:8303:2100:7466:f44c:da21:40b1;
 posting-account=fdRd8woAAADTIlxCu9FgvDrUK4wPzvy3
NNTP-Posting-Host: 2601:191:8303:2100:7466:f44c:da21:40b1
References: <f1cc3c80-9b31-4a74-a201-0e209a789fc4@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <1892f04b-0223-4060-90a7-91983f775f18@googlegroups.com>
Subject: Re: How to optimize use of RAM/disk access ?
From: Robert Eachus <rieachus@comcast.net>
Injection-Date: Wed, 24 Jan 2018 15:18:23 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Xref: reader02.eternal-september.org comp.lang.ada:50098
Date: 2018-01-24T07:18:22-08:00
List-Id: <comp.lang.ada>

On Saturday, January 20, 2018 at 1:16:00 AM UTC-5, reinert wrote:
>
> Any hint for how I can optimize a bit smarter?
>

First, realize that you are not the only one with this problem.  There are programs that run on supercomputers for megayears of CPU time.  It may take weeks (often on smaller systems) to figure out the "right" parameters for a given run.  Why does it take so long?  The usual approach is to create a linear regression model, usually with linear and squared terms for each model parameter, and sometimes cross-products.  Then choose enough test points to get a decent fit; usually this is on the order of three or four data points for each model parameter.  For example, your model might be t/p = 1/m + 1/m^2 + 1/N + 1/N^2 + s + s^2 + s^3 + d/s + d/(s^2), with a fitted coefficient on each term, where t is time in seconds per iteration, p is the number of processors, m is memory size per CPU core in gigabytes, N is an internal model sizing parameter, s is problem size in data points, and d is total (free) disk space in gigabytes.
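
As a rough illustration of the example model, here is a minimal Python sketch that builds the nine terms for one test point and evaluates a prediction.  The names model_terms and predict_time and the coefficient list are hypothetical; the only thing taken from the text is the list of terms and the fact that the left-hand side is t/p.

```python
from typing import List, Sequence

# Term order matches the example formula:
# t/p = 1/m + 1/m^2 + 1/N + 1/N^2 + s + s^2 + s^3 + d/s + d/(s^2)
TERM_NAMES = ["1/m", "1/m^2", "1/N", "1/N^2", "s", "s^2", "s^3", "d/s", "d/s^2"]


def model_terms(m: float, n: float, s: float, d: float) -> List[float]:
    """Return the nine regression terms for one test point, in TERM_NAMES order.

    m: memory per CPU core (GB), n: internal sizing parameter N,
    s: problem size (data points), d: free disk space (GB).
    """
    return [1 / m, 1 / m**2, 1 / n, 1 / n**2, s, s**2, s**3, d / s, d / s**2]


def predict_time(coeffs: Sequence[float], p: float,
                 m: float, n: float, s: float, d: float) -> float:
    """Predicted seconds per iteration, t = p * sum(c_i * term_i).

    The coefficients are assumed to come from a regression fit; they are
    not given in the text.
    """
    terms = model_terms(m, n, s, d)
    if len(coeffs) != len(terms):
        raise ValueError("need one coefficient per model term")
    return p * sum(c * x for c, x in zip(coeffs, terms))
```

Each measured run contributes one row of model_terms(...) and one observed t/p, so 30 runs give a 30 x 9 regression problem.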

Now pick 30 or so points, including some where you expect the model to crash or run impossibly slowly.  Do the runs, probably with a 1000-second limit per iteration per run.  Then eliminate any timeouts or crashes (you are not going to do big runs in that part of the parameter space) and find the parameter values for the model.  From experience, you are then going to repeat the experiment on the big machine, with test parameters close to what you expect on a full run, but again with one to a few time steps.
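
The "drop the failures, then fit" step could be sketched in Python as follows.  This is only a sketch under assumptions: a crashed run is recorded as None, a timeout as a time at or above the limit, and the fit uses plain normal equations.  All function names are hypothetical.  For a real model with terms like s, s^2 and s^3, which are nearly collinear, a QR- or SVD-based solver from a numerical library would be a safer choice.

```python
from typing import List, Optional, Sequence, Tuple

TIME_LIMIT_S = 1000.0  # per-iteration cutoff mentioned in the text

# One run: (regression terms for the test point, measured response or None if it crashed)
Run = Tuple[Sequence[float], Optional[float]]


def usable_runs(runs: Sequence[Run], limit: float = TIME_LIMIT_S) -> List[Run]:
    """Keep only runs that finished under the time limit."""
    return [(x, y) for x, y in runs if y is not None and y < limit]


def solve_linear(a: Sequence[Sequence[float]], b: Sequence[float]) -> List[float]:
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:  # crude absolute threshold, fine for a sketch
            raise ValueError("singular system: add more or better-spread test points")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_coefficients(runs: Sequence[Run]) -> List[float]:
    """Ordinary least squares via the normal equations (X^T X) c = X^T y."""
    k = len(runs[0][0])
    xtx = [[sum(x[i] * x[j] for x, _ in runs) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * y for x, y in runs) for i in range(k)]
    return solve_linear(xtx, xty)
```

Usage would be fit_coefficients(usable_runs(all_runs)), with at least as many surviving runs as model terms.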

Now you know enough to ask for time (and number of CPU cores) on the big machine.  Today, you will probably want to try running on both the CPU cores and on the GPUs.

Is this a lot of work?  Sure, but if it saves a few CPU centuries, it is worth the effort.

In your case, you might want to "fool around" with various model parameters that are combinations of your N and memory per CPU.  Oh, and I often have algorithm parameters which correspond to L1, L2 and L3 data cache sizes.  A typical result for a "simple" matrix multiplication (A*B=C) might have A fitted to L1, and B to L2.  If you are doing something expressed in linear algebra, check out the ATLAS version of the BLAS library: http://math-atlas.sourceforge.net/  The big advantage of using ATLAS is that it will give good results for ALL functions with more than n^2 work, since it implements them in terms of matrix multiplication.  So even if you use some other BLAS, you can use the ATLAS libraries for some LAPACK calls.  (There are several Ada bindings to BLAS floating around.  I'm not recommending one, since your choices of OS and compiler will affect your choice.)
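
To make the idea of a cache-related algorithm parameter concrete, here is a hedged Python sketch of a blocked (tiled) matrix multiplication.  The function name blocked_matmul and the single block parameter are hypothetical; a tuned library such as ATLAS uses separate, machine-specific blockings per cache level, and pure Python will not show the cache benefit.  The sketch only shows the loop structure whose tile size you would include in the regression model.

```python
from typing import List, Sequence


def blocked_matmul(a: Sequence[Sequence[float]],
                   b: Sequence[Sequence[float]],
                   block: int = 64) -> List[List[float]]:
    """Compute C = A * B one block x block tile at a time.

    `block` is the tunable parameter: in a compiled language it would be
    chosen so that the tiles being reused stay resident in a given cache level.
    """
    n, k, m = len(a), len(b), len(b[0])
    if any(len(row) != k for row in a):
        raise ValueError("inner dimensions of A and B do not match")
    c = [[0.0] * m for _ in range(n)]
    for i0 in range(0, n, block):            # tile rows of A and C
        for k0 in range(0, k, block):        # tile the shared dimension
            for j0 in range(0, m, block):    # tile columns of B and C
                for i in range(i0, min(i0 + block, n)):
                    row_c = c[i]
                    for kk in range(k0, min(k0 + block, k)):
                        a_ik = a[i][kk]
                        row_b = b[kk]
                        for j in range(j0, min(j0 + block, m)):
                            row_c[j] += a_ik * row_b[j]
    return c
```

The result does not depend on block (up to floating-point summation order), so the block size can be varied freely across timing runs.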

Too much information?  Probably.  But if you do have a program that requires CPU years to run, or one that can be simplified by using LAPACK or BLAS, have at it.