Newsgroups: comp.lang.ada
Date: Tue, 1 Aug 2017 19:23:19 -0700 (PDT)
References: <9e51f87c-3b54-4d09-b9ca-e3c6a6e8940a@googlegroups.com> <49d02dda-8f1b-4005-a164-7af34e1993cc@googlegroups.com>
Message-ID: <914ae4df-cc52-4e6e-b342-584bcac98e88@googlegroups.com>
Subject: Re: Real tasking problems with Ada.
From: Robert Eachus

On Tuesday, August 1, 2017 at 12:45:43 AM UTC-4, Randy Brukardt wrote:
> Yes, really. Use a discriminant of type CPU, and use that in the aspect.
> That's an age-old technique, and indeed is the major reason that tasks have
> discriminants.
> You then can allocate the tasks (which would be my
> suggestion), or you could create the entire set in an aggregate (assuming
> you have Ada 2020).

Sorry, you missed what all the shouting was about. ;-) On the processor I am using (an AMD FX-6300 Vishera), running on all CPU cores causes contention for the floating-point units. So for efficiency I have to run on one core from each pair of CPU cores. Currently my program uses 2, 4, and 6. Creating an array indexed by CPU doesn't work. If we had Algol style indexing--but I am certainly not going to advocate that. This is not a problem unique to one family of CPUs. I'm upgrading to an AMD Ryzen 7, which will have 8 cores and 16 threads. It is going, IMNSHO, to require the same thing. Same for Intel processors with Hyperthreading enabled.

As for cache line sizes affecting code, yes, the garbage case was a bug in my code--or in GNAT, or in expectations. (GNAT 2017 has Standard'Maximum_Alignment equal to 16. At least the version I am using does. I was trying to trick it into 64 byte (cache line) alignment by using computed Address clauses. On AMD processors cache lines are 64 bytes, but usually two lines (128 bytes) are read if no other thread is waiting for a cache line. Intel does it the other way around: 256 byte cache lines, and the CPU will only fetch 128 if there are other requests queued.)

Yes, I knew what I was doing was messy and dangerous--or at least required careful checking. My point was that if Maximum_Alignment was large enough, I wouldn't be going through the pain. Was it worth it? That is what this is all about. I have a program which spreads a matrix multiplication over multiple processors--and compares the result with the single processor case. Right now, unfortunately, every time I get the tasking version faster, the non-tasking version improves as well. (I'm currently at about 700 Million multiplications, 1.4 GigaFLOPS ignoring the integer indexing.)
Now if I could get up to 2 GigaFLOPS on the tasking version I'd be happy. Of course, once I move to the Ryzen 7 I expect much better numbers, and better still for video cards.
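
For readers following the thread, here is a minimal sketch of the discriminant technique quoted above, applied to a non-contiguous set of cores. It is not code from my program. The names Pinned_Workers, Core_Of, Worker and Worker_Access are made up for illustration, and the core numbers 2, 4 and 6 are simply the ones mentioned above. The idea is to index the tasks by worker number and look the core up in a small table, rather than indexing an array by CPU.

```ada
with Ada.Text_IO;
with System.Multiprocessors; use System.Multiprocessors;

procedure Pinned_Workers is

   --  Assumption: one core from each pair is used, i.e. cores 2, 4 and 6
   --  as on the FX-6300 described above. Adjust the table for other machines.
   Core_Of : constant array (1 .. 3) of CPU := (2, 4, 6);

   --  The discriminant carries the core number into the CPU aspect.
   task type Worker (Core : CPU) with CPU => Core;

   task body Worker is
   begin
      --  Placeholder: this worker's slice of the matrix multiplication
      --  would go here.
      null;
   end Worker;

   type Worker_Access is access Worker;

   --  Indexed by worker number, not by CPU.
   Workers : array (Core_Of'Range) of Worker_Access;

begin
   --  Guard so the sketch does not request a core the machine lacks.
   if Number_Of_CPUs < 6 then
      Ada.Text_IO.Put_Line ("Fewer than 6 CPUs; adjust Core_Of.");
      return;
   end if;

   --  Allocate one task per table entry, pinned to the listed core.
   for I in Workers'Range loop
      Workers (I) := new Worker (Core => Core_Of (I));
   end loop;

   --  The procedure waits here until all allocated workers terminate.
end Pinned_Workers;
```

Whether the extra indirection is acceptable is a judgment call; it keeps the set of cores in one table, so moving to a different processor should only mean editing Core_Of.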