Newsgroups: comp.lang.ada
Subject: Re: Extend slices for n dimensional arrays for Ada 202X
From: Robert Eachus
Date: Fri, 27 Jan 2017 16:58:36 -0800 (PST)

On Friday, January 27, 2017 at 6:30:32 PM UTC-5, Randy Brukardt wrote:
> Single dimensional slices are of course completely different, and I wasn't
> talking about them. (Although your advice for copying them is 25 years out
> of date: Intel hardware, at least, automatically makes many of those
> optimizations so there is no value to manually doing them yourself - that
> actually would make your code slower.)
As it happens, when I get down to the point of considering actual cache sizes and TLBs, I am coding for AMD CPUs. (Even have one in this system. ;-) But believe me, a 4 or 5% time improvement is considered worthwhile, and I am usually finding 50% or more. It also turns out that I am usually working on low-level routines, often in BLAS, that are eating loads of CPU time. Oh, and I have yet to convince anyone to make the main program Ada instead of Fortran, but that's another issue.

Yes, modern CPUs will reorder operations to improve performance, but the reorder buffers have a limited reach. So if you are transposing an array which is thousands of rows and columns in size, the programmer has to bring the instructions that should be brought together close enough that the CPU can do the rest.
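
To make the transpose point concrete, here is a minimal sketch of one common way to bring related accesses close together: loop tiling (blocking). It is only an illustration of the general idea, not the routine discussed above, and it is written in C rather than Ada or Fortran purely for brevity. The function name transpose_blocked and the tile size TILE = 32 are assumptions made for this example; a real tile size would be chosen by measuring against the cache and TLB of the target CPU.

```c
#include <stddef.h>

/* Tile edge length in elements. Assumed value for illustration only;
 * tune it by measurement for the actual cache and TLB sizes. */
#define TILE 32

/* Transpose the rows x cols row-major matrix src into the
 * cols x rows row-major matrix dst. src and dst must not overlap.
 *
 * The two outer loops walk the matrix one TILE x TILE block at a time,
 * so the reads from src and the writes to dst inside a block touch
 * only a small set of cache lines and pages before moving on. */
void transpose_blocked(const double *src, double *dst,
                       size_t rows, size_t cols)
{
    for (size_t ib = 0; ib < rows; ib += TILE) {
        /* Clamp the block edge for sizes that are not a multiple of TILE. */
        size_t i_end = (ib + TILE < rows) ? ib + TILE : rows;
        for (size_t jb = 0; jb < cols; jb += TILE) {
            size_t j_end = (jb + TILE < cols) ? jb + TILE : cols;
            for (size_t i = ib; i < i_end; ++i) {
                for (size_t j = jb; j < j_end; ++j) {
                    dst[j * rows + i] = src[i * cols + j];
                }
            }
        }
    }
}
```

In a naive row-by-row transpose of a matrix with thousands of columns, consecutive writes to dst land thousands of elements apart, far beyond anything the reorder buffer can bring together. With tiling, the strided writes stay inside one block at a time, and the hardware can handle the rest. How much this gains on a given machine is something to measure, not assume.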