From: "Peter C. Chapin"
Newsgroups: comp.lang.ada
Subject: Re: Ada and OpenMP
Date: Thu, 07 Mar 2013 19:39:10 -0500
References: <87k3pjht79.fsf@ludovic-brenta.org>

On 03/07/2013 06:42 PM, Randy Brukardt wrote:

> Isn't OpenMP aimed at SIMD-type machines (as in video processors), as
> opposed to generalized cores as in typical Intel and ARM designs?
> Fine-grained parallelism doesn't make much sense on the latter, because
> cache coherence and core scheduling issues will eat up gains in almost
> all circumstances. Ada tasks are a much better model.

Well, I used OpenMP for a program targeting x64 architectures and it
worked well in my case. It was easy to use: my program became 8x faster
with the addition of a single line of source text (there is a sketch of
the kind of line I mean below). It even computed the right answer. My
program was very well suited to the OpenMP model of computation,
however, so of course I wouldn't expect such a dramatic result in every
case.

> Well, this doesn't make much sense. If the pragma doesn't change the
> semantics of the loop, then it's not necessary at all (the compiler
> can and ought to do the optimization when it makes sense, possibly
> under the control of global flags). Programmers are lousy at
> determining where and how the best use of machine resources can be
> made.

I only used the pragma above to follow the mindset of OpenMP under C; I
agree it might not be the best way to do it in Ada. I'm a little
uncertain, though, about how well the compiler can be expected to find
this sort of parallelization on its own, at least with current
technology. The compiler I was using for the program above, and it
wasn't an ancient one, certainly had no idea how to do such things by
itself.

In a high performance application, nested loops are common, and often
the body of a loop calls a subprogram implemented in a library that
itself contains loops. I don't want all of these nested loops
parallelized, because that would create huge overheads.
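To make both points concrete, here is roughly what I mean in C; the
names and the computation are invented for illustration, not taken from
my actual program. The pragma is the entire change, and it parallelizes
only the outermost loop; the inner loop, and whatever loops hide inside
the library call, still run serially within each thread:

   #include <stddef.h>

   /* Stand-in for a library subprogram that may well contain
      loops of its own. */
   extern double library_transform(double x);

   void process(double grid[], size_t rows, size_t cols)
   {
       /* The one added line: distribute the outer iterations
          across the available cores. Compiled without OpenMP
          support (e.g., without -fopenmp under GCC), the pragma
          is ignored and the loop runs serially, unchanged. */
       #pragma omp parallel for
       for (long r = 0; r < (long)rows; r++)
           for (long c = 0; c < (long)cols; c++)
               grid[r * cols + c] =
                  library_transform(grid[r * cols + c]);
   }

Each outer iteration touches a disjoint slice of the array, which is
what makes the pragma safe here.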
Yet without detailed semantic information about what the library
subprograms do, I'm not sure how the compiler can know it's safe to
parallelize the top-level loop. I'm no expert in writing parallelizing
compilers, to be sure, but it seemed to me, when I was experimenting
with OpenMP, that it did a nice job of taking care of the grunt work
while still allowing me to apply my broad knowledge of the application
to find good places to parallelize.

I certainly could have written my earlier program with tasks. In fact I
had a version that used threads before I tried OpenMP. It worked, but
it was ugly and a bit flaky. Doing the job with one line was a lot
nicer and proved to be more reliable (and faster running too, in my
case).

Peter
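P.S. For comparison, the hand-threaded version I abandoned had roughly
the shape below. This is a heavily simplified sketch with invented
names, not my actual code, but it shows the chunking and joining that
the single OpenMP pragma does for you:

   #include <pthread.h>
   #include <stddef.h>

   extern double library_transform(double x);

   /* One worker's share of the rows. */
   struct slice {
       double *grid;
       size_t  cols;
       long    r_first, r_last;
   };

   static void *worker(void *arg)
   {
       struct slice *s = arg;
       for (long r = s->r_first; r < s->r_last; r++)
           for (size_t c = 0; c < s->cols; c++)
               s->grid[r * s->cols + c] =
                  library_transform(s->grid[r * s->cols + c]);
       return NULL;
   }

   void process_threaded(double grid[], long rows, size_t cols,
                         int nthreads)
   {
       /* Assumes nthreads <= 64 to keep the sketch short. */
       pthread_t    tid[64];
       struct slice part[64];
       long chunk = (rows + nthreads - 1) / nthreads;

       for (int t = 0; t < nthreads; t++) {
           part[t].grid    = grid;
           part[t].cols    = cols;
           part[t].r_first = t * chunk;
           part[t].r_last  =
              (t + 1) * chunk < rows ? (t + 1) * chunk : rows;
           pthread_create(&tid[t], NULL, worker, &part[t]);
       }
       for (int t = 0; t < nthreads; t++)
           pthread_join(tid[t], NULL);
   }

All of that bookkeeping, plus the error handling I have omitted, is
what made the threaded version ugly; getting the chunk boundaries and
the joins right is exactly the grunt work OpenMP took over.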