Newsgroups: comp.lang.ada
Subject: Re: GNAT can't vectorize Real_Matrix multiplication from Ada.Numerics.Real_Arrays. What a surprise!
From: Bojan Bozovic
Date: Mon, 19 Feb 2018 18:31:24 -0800 (PST)
Message-ID: <06efbe02-cdae-4fac-a17d-6d0c1be7848c@googlegroups.com>
References: <83493d20-7001-405b-8658-8a3f5d6c90fa@googlegroups.com>

On Monday, February 19, 2018 at 10:08:41 PM UTC+1, Robert Eachus wrote:
> On Sunday, February 18, 2018 at 4:48:42 PM UTC-5, Nasser M. Abbasi wrote:
> > On 2/18/2018 1:38 PM, Bojan Bozovic wrote:
> >
> > If you are doing A*B by hand, then you are doing something
> > wrong. Almost all languages end up calling BLAS
> > Fortran libraries for these operations. Your code and
> > the Ada code can't be faster.
> >
> > http://www.netlib.org/blas/
> >
> > Intel Math Kernel Library has all these.
> >
> > https://en.wikipedia.org/wiki/Math_Kernel_Library
>
> For multiplying two small matrices, BLAS is overkill and will be slower. If you have, say, 1000x1000 matrices, then you should be using BLAS. But which BLAS? Intel and AMD both have math libraries optimized for their CPUs. However, I tend to use ATLAS. ATLAS will build a BLAS targeted at your specific hardware. This is not just about instruction set additions like SIMD2. It will tailor the implementation to your number of cores and supported threads, cache sizes, and memory speeds. I've also used the Goto BLAS, but ATLAS, even though not perfect, builds all of BLAS3 using matrix multiplication and BLAS2, such that all operations slower than O(n^2) have their speed determined by matrix multiplication. (It then uses multiple matrix multiplication codes with different parameters to find the fastest.)
>
> Usually hardware vendor libraries catch up to and surpass ATLAS, but by then the hardware is obsolete. :-( The other problem right now is that BLAS libraries are pretty dumb when it comes to multiprocessor systems. I'm working on fixing that.
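Just so we are all talking about the same operation: the product in the subject line is the one the standard library already provides. Here is a toy sketch of mine (sizes and values are arbitrary, and whether "*" ends up vectorized or bound to an external BLAS depends entirely on how your runtime was built):

   with Ada.Numerics.Real_Arrays; use Ada.Numerics.Real_Arrays;
   with Ada.Text_IO;              use Ada.Text_IO;

   procedure Multiply_Demo is
      N : constant := 4;
      A : constant Real_Matrix (1 .. N, 1 .. N) := (others => (others => 1.0));
      B : constant Real_Matrix (1 .. N, 1 .. N) := (others => (others => 2.0));
      C : Real_Matrix (1 .. N, 1 .. N);
   begin
      C := A * B;   --  the predefined product from Ada.Numerics.Real_Arrays
      Put_Line ("C (1, 1) =" & Float'Image (C (1, 1)));
   end Multiply_Demo;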
;-) I have looked at ATLAS; however, it can't spawn more threads than were specified at compile time, so there is plenty of room to optimize there by spawning as many threads as the machine supports at run time. Ada would do much better here than C, because you could write portable code, without resorting to the ugly hacks C needs, and use parallelism no matter what the underlying processor architecture is. That's my $0.02, worthless or not (and if you want to use assembler to "optimize" further, as is done in C, that can be done from any language, which I fear the Intel MKL and other vendor libraries do).
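To make the run-time part concrete, here is a rough sketch of my own (not ATLAS and not anyone's library code): System.Multiprocessors.Number_Of_CPUs is queried when the program runs, and one task per reported CPU fills in a band of rows of the product. The sizes, the row-band split, and the naive inner loop are assumptions purely for illustration:

   with Ada.Numerics.Real_Arrays; use Ada.Numerics.Real_Arrays;
   with Ada.Text_IO;              use Ada.Text_IO;
   with System.Multiprocessors;   use System.Multiprocessors;

   procedure Parallel_Multiply is
      N : constant := 512;

      A : constant Real_Matrix (1 .. N, 1 .. N) := (others => (others => 1.0));
      B : constant Real_Matrix (1 .. N, 1 .. N) := (others => (others => 2.0));
      C : Real_Matrix (1 .. N, 1 .. N);

      --  Number_Of_CPUs is evaluated at run time, so the task count
      --  adapts to whatever machine the program is running on.
      Workers : constant Positive := Positive (Number_Of_CPUs);
      Band    : constant Positive := (N + Workers - 1) / Workers;

      task type Worker is
         entry Start (First, Last : Integer);
      end Worker;

      task body Worker is
         Lo, Hi : Integer;
      begin
         accept Start (First, Last : Integer) do
            Lo := First;
            Hi := Last;
         end Start;
         --  Plain triple loop over this task's band of rows; a real
         --  implementation would block for cache the way ATLAS does.
         for I in Lo .. Hi loop
            for J in 1 .. N loop
               declare
                  Sum : Float := 0.0;
               begin
                  for K in 1 .. N loop
                     Sum := Sum + A (I, K) * B (K, J);
                  end loop;
                  C (I, J) := Sum;
               end;
            end loop;
         end loop;
      end Worker;

   begin
      declare
         Pool : array (1 .. Workers) of Worker;
      begin
         for W in Pool'Range loop
            Pool (W).Start
              (First => (W - 1) * Band + 1,
               Last  => Integer'Min (W * Band, N));
         end loop;
      end;  --  the block waits here until every Worker has terminated
      Put_Line ("C (1, 1) =" & Float'Image (C (1, 1)));
   end Parallel_Multiply;

A real library would of course block for cache and vectorize the inner loop, which is exactly the part ATLAS tunes; the point is only that the worker count does not have to be frozen at compile time.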