* Ada and OpenMP
@ 2013-03-07 18:04 Rego, P.
  2013-03-07 20:04 ` Ludovic Brenta
  2013-03-07 22:52 ` Simon Wright
  0 siblings, 2 replies; 26+ messages in thread

From: Rego, P. @ 2013-03-07 18:04 UTC (permalink / raw)

Dear friends,

I'm trying some parallel-computing exercises using the OpenMP pragmas in
C, but it would be good to use them in Ada as well. Is it possible to use
the OpenMP pragmas in Ada? And does GNAT GPL support them?

Regards,
P. Rego

^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Ada and OpenMP
  2013-03-07 18:04 Ada and OpenMP Rego, P.
@ 2013-03-07 20:04 ` Ludovic Brenta
  2013-03-07 22:22   ` Peter C. Chapin
  2013-03-07 22:52 ` Simon Wright
  1 sibling, 1 reply; 26+ messages in thread

From: Ludovic Brenta @ 2013-03-07 20:04 UTC (permalink / raw)

Rego, P. writes on comp.lang.ada:
> I'm trying some exercises of parallel computing using that pragmas
> from OpenMP in C, but it would be good to use it also in Ada. Is it
> possible to use that pragmas from OpenMP in Ada? And...does gnat gpl
> supports it?

Why would you use pragmas when Ada supports tasking directly in the
language?

--
Ludovic Brenta.

^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Ada and OpenMP
  2013-03-07 20:04 ` Ludovic Brenta
@ 2013-03-07 22:22   ` Peter C. Chapin
  2013-03-07 23:42     ` Randy Brukardt
                        ` (2 more replies)
  0 siblings, 3 replies; 26+ messages in thread

From: Peter C. Chapin @ 2013-03-07 22:22 UTC (permalink / raw)

OpenMP is a different animal from Ada tasks. It provides fine-grained
parallelism where, for example, it is possible to have the compiler
automatically parallelize a loop. In C:

   #pragma omp parallel for
   for( i = 0; i < MAX; ++i ) {
       array[i]++;
   }

The compiler automatically splits the loop iterations over an
"appropriate" number of threads (probably based on the number of cores).
In Ada one might write, perhaps:

   pragma OMP(Parallel_For)
   for I in 1 .. MAX loop
      A(I) := A(I) + 1;
   end loop;

Doing this with Ada tasks in such a way that it uses an optimal number of
threads on each execution (based on core count) would be much more
complicated, I should imagine. Please correct me if I'm wrong!

OpenMP has various other features, some of which could be done naturally
with tasks, but much of what OpenMP is about is semi-automatic,
fine-grained parallelization. It is to Ada tasking what Ada tasking is to
the explicit handling of locks, etc.

Peter

On 03/07/2013 03:04 PM, Ludovic Brenta wrote:
> Rego, P. writes on comp.lang.ada:
>> I'm trying some exercises of parallel computing using that pragmas
>> from OpenMP in C, but it would be good to use it also in Ada. Is it
>> possible to use that pragmas from OpenMP in Ada? And...does gnat gpl
>> supports it?
>
> Why would you use pragmas when Ada supports tasking directly in the
> language?

^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Ada and OpenMP
  2013-03-07 22:22 ` Peter C. Chapin
@ 2013-03-07 23:42   ` Randy Brukardt
  2013-03-08  0:39     ` Peter C. Chapin
                        ` (3 more replies)
  2013-03-07 23:43   ` Georg Bauhaus
  2013-03-08 14:24   ` Rego, P.
  2 siblings, 4 replies; 26+ messages in thread

From: Randy Brukardt @ 2013-03-07 23:42 UTC (permalink / raw)

"Peter C. Chapin" <PChapin@vtc.vsc.edu> wrote in message
news:hr-dnULuncyRjqTM4p2dnAA@giganews.com...
> OpenMP is a different animal from Ada tasks. It provides fine-grained
> parallelism where, for example, it is possible to have the compiler
> automatically parallelize a loop. In C:
>
>    #pragma omp parallel for
>    for( i = 0; i < MAX; ++i ) {
>        array[i]++;
>    }
>
> The compiler automatically splits the loop iterations over an
> "appropriate" number of threads (probably based on the number of cores).

Isn't OpenMP aimed at SIMD-type machines (as in video processors), as
opposed to generalized cores as in typical Intel and ARM designs?
Fine-grained parallelism doesn't make much sense on the latter, because
cache coherence and core scheduling issues will eat up the gains in almost
all circumstances. Ada tasks are a much better model.

> In Ada one might write, perhaps
>
>    pragma OMP(Parallel_For)
>    for I in 1 .. MAX loop
>       A(I) := A(I) + 1;
>    end loop;
>
> Doing this with Ada tasks in such a way that it uses an optimal number of
> threads on each execution (based on core count) would be much more
> complicated, I should imagine. Please correct me if I'm wrong!

Well, this doesn't make much sense. If the pragma doesn't change the
semantics of the loop, then it's not necessary at all (the compiler can
and ought to do the optimization when it makes sense, possibly under the
control of global flags). Programmers are lousy at determining where and
how best to use machine resources. (Pragma Inline is a similar thing that
should never have existed and certainly shouldn't be necessary.)
If the pragma does change the semantics, then it violates "good taste in
pragmas". It would be much better for the change to be indicated by syntax
or by an aspect. Pragmas, IMHO, are the worst way to do anything. Compiler
writers tend to use them because they can do so without appearing to
modify the language, but it's all an illusion: the program probably won't
work right without the pragma, so you're still locked into that particular
vendor. Might as well have done it right in the first place (and make a
proposal to the ARG, backed with practice, so it can get done right in the
next version of Ada).

Randy Brukardt, President, Anti-Pragma Society. :-)

> OpenMP has various other features, some of which could be done naturally
> with tasks, but much of what OpenMP is about is semi-automatic,
> fine-grained parallelization. It is to Ada tasking what Ada tasking is
> to the explicit handling of locks, etc.
>
> Peter

^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Ada and OpenMP
  2013-03-07 23:42 ` Randy Brukardt
@ 2013-03-08  0:39   ` Peter C. Chapin
  2013-03-08  3:31     ` Randy Brukardt
  2013-03-08 14:40     ` Rego, P.
  2013-03-08  1:15   ` Shark8
                      ` (2 subsequent siblings)
  3 siblings, 2 replies; 26+ messages in thread

From: Peter C. Chapin @ 2013-03-08 0:39 UTC (permalink / raw)

On 03/07/2013 06:42 PM, Randy Brukardt wrote:
> Isn't OpenMP aimed at SIMD-type machines (as in video processors), as
> opposed to generalized cores as in typical Intel and ARM designs?
> Fine-grained parallelism doesn't make much sense on the latter, because
> cache coherence and core scheduling issues will eat up the gains in
> almost all circumstances. Ada tasks are a much better model.

Well, I used OpenMP for a program targeting x64 architectures and it
worked well in my case. It was easy to use: my program became 8x faster by
the addition of a single line of source text. It even computed the right
answer. My program was very well suited to the OpenMP model of
computation, however, so I wouldn't expect such a dramatic result in all
cases, of course.

> Well, this doesn't make much sense. If the pragma doesn't change the
> semantics of the loop, then it's not necessary at all (the compiler can
> and ought to do the optimization when it makes sense, possibly under the
> control of global flags). Programmers are lousy at determining where and
> how best to use machine resources.

I only used the pragma above to follow the mindset of OpenMP under C. I
agree it might not be the best way to do it in Ada.

I'm a little uncertain, though, about how well the compiler can be
expected to find this sort of parallelization... at least with current
technology. The compiler I was using for the program above, and it wasn't
an ancient one, certainly had no idea how to do such things on its own.

In a high-performance application nested loops are common, and often the
body of a loop calls a subprogram implemented in a library that itself
has loops.
I don't want all of these nested loops parallelized because that would
create huge overheads. Yet without detailed semantic information about
what the library subprograms do, I'm not sure how the compiler can know
it's safe to parallelize the top-level loop.

I'm not an expert in writing parallelizing compilers, for sure, but it
seemed to me, when I was experimenting with OpenMP, that it did a nice job
of taking care of the grunt work while still allowing me to apply my broad
knowledge of the application to find good places to parallelize.

I certainly could have written my earlier program with tasks. In fact I
had a version that used threads before I tried OpenMP. It worked, but it
was ugly and a bit flaky. Doing the job with one line was certainly a lot
nicer and proved to be more reliable (and faster running too, in my case).

Peter

^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Ada and OpenMP
  2013-03-08  0:39 ` Peter C. Chapin
@ 2013-03-08  3:31   ` Randy Brukardt
  2013-03-08  7:17     ` Simon Wright
  2013-03-08 12:07     ` Peter C. Chapin
  2013-03-08 14:40   ` Rego, P.
  1 sibling, 2 replies; 26+ messages in thread

From: Randy Brukardt @ 2013-03-08 3:31 UTC (permalink / raw)

"Peter C. Chapin" <PChapin@vtc.vsc.edu> wrote in message
news:za6dnTU03LGyrqTM4p2dnAA@giganews.com...
>> Isn't OpenMP aimed at SIMD-type machines (as in video processors), as
>> opposed to generalized cores as in typical Intel and ARM designs?
>> Fine-grained parallelism doesn't make much sense on the latter, because
>> cache coherence and core scheduling issues will eat up the gains in
>> almost all circumstances. Ada tasks are a much better model.
>
> Well, I used OpenMP for a program targeting x64 architectures and it
> worked well in my case. It was easy to use: my program became 8x faster
> by the addition of a single line of source text. It even computed the
> right answer. My program was very well suited to the OpenMP model of
> computation, however, so I wouldn't expect such a dramatic result in
> all cases, of course.

But (based on the rest of your note) that isn't "fine-grained
parallelism". You called a bunch of expensive library functions in the
loop, and thus your actual computations are large enough that the
mechanism would work well. But so would an arrangement like Paraffin
(with a bit more code rearrangement).

...

>> Well, this doesn't make much sense. If the pragma doesn't change the
>> semantics of the loop, then it's not necessary at all (the compiler
>> can and ought to do the optimization when it makes sense, possibly
>> under the control of global flags). Programmers are lousy at
>> determining where and how best to use machine resources.
>
> I only used the pragma above to follow the mindset of OpenMP under C. I
> agree it might not be the best way to do it in Ada.
> I'm a little uncertain, though, about how well the compiler can be
> expected to find this sort of parallelization... at least with current
> technology. The compiler I was using for the program above, and it
> wasn't an ancient one, certainly had no idea how to do such things on
> its own.

Well, the problem is that if you follow Ada semantics (which are
sequential for loops), you probably can't parallelize even if you have a
pragma. That's because the sequential semantics are observable because of
exceptions: an exception happening after the second iteration of a loop
had better not modify anything that the fourth iteration would do.

And if you want to make such a major change to Ada semantics, writing a
pragma is NOT the right way, IMHO. It would be many times better for Ada
to simply have parallel loops:

   for I in 1 .. 10 loop in parallel
      ...
   end loop;

In which case depending upon the sequential execution wouldn't be allowed.
(There also would have to be some restrictions on what the "..." could
be.) We're exploring some such ideas for a future version of Ada, and it
would be nice if some trial implementations appeared.

> In a high-performance application nested loops are common, and often
> the body of a loop calls a subprogram implemented in a library that
> itself has loops. I don't want all of these nested loops parallelized
> because that would create huge overheads.

This is my point: "fine-grained parallelism" means that *everything* is
(potentially) parallelized. (See Tucker's ParaSail for an example.)
You're essentially saying that it doesn't work.

Also, there is a strong argument that you're prematurely optimizing your
code if you're worrying about the "overheads" created. The compiler can
figure these out far better than a human can -- when you do it, you're
only guessing -- a compiler has a lot more information with which to
decide.
It's best to tell the compiler where you don't care about exceptions (for
instance) and let it pick the best parallelization.

> Yet without detailed semantic information about what the library
> subprograms do, I'm not sure how the compiler can know it's safe to
> parallelize the top-level loop.

The compiler *has* to have such information, or it can't do *anything*
useful. By anything, I mean optimizations (both sequential and parallel),
proofs, static checking, and the like. Either it has to have access to
the bodies, or it needs *strong* contracts covering everything. We
proposed all of those for Ada 2012, but we didn't have the energy to
finish the global in/out contracts. You can get a bit of information from
"Pure", but that's about it.

For Janus/Ada, we always assume a subprogram can do anything, and that
prevents about 98% of optimizations from happening across subprogram
calls. But that simply isn't acceptable today, especially with Ada 2012's
assertions (you have to be able to remove redundant assertion checks to
make the cost cheap enough that they don't need to be left on all the
time). And parallelization is a similar situation: the compiler needs to
know about the side effects of every function, in detail, before it can
generate code to take advantage of modern features.

> I'm not an expert in writing parallelizing compilers, for sure, but it
> seemed to me, when I was experimenting with OpenMP, that it did a nice
> job of taking care of the grunt work while still allowing me to apply
> my broad knowledge of the application to find good places to
> parallelize.

I suspect it works better with C, which doesn't have error semantics. In
that case, how the loop is implemented doesn't really matter. Similarly,
no one cares that you can easily introduce hard-to-find bugs if there is
any overlap between your iterations. Ada is different in both of these
regards -- new ways to introduce erroneous execution are not tolerated by
most Ada users.
In any case, these sorts of things are hacks to use until the compilers
and languages catch up. Programs shouldn't be specifying in-lining or
parallelism in detail; at most, some hints might be provided. Compilers do
this sort of grunt work a whole lot better than humans.

Randy.

^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Ada and OpenMP
  2013-03-08  3:31 ` Randy Brukardt
@ 2013-03-08  7:17   ` Simon Wright
  2013-03-08 23:40     ` Randy Brukardt
  2013-03-08 12:07   ` Peter C. Chapin
  1 sibling, 1 reply; 26+ messages in thread

From: Simon Wright @ 2013-03-08 7:17 UTC (permalink / raw)

"Randy Brukardt" <randy@rrsoftware.com> writes:

> you have to be able to remove redundant assertion checks to make the
> cost cheap enough that they don't need to be left on all the time

*off* all the time?

^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Ada and OpenMP
  2013-03-08  7:17 ` Simon Wright
@ 2013-03-08 23:40   ` Randy Brukardt
  0 siblings, 0 replies; 26+ messages in thread

From: Randy Brukardt @ 2013-03-08 23:40 UTC (permalink / raw)

"Simon Wright" <simon@pushface.org> wrote in message
news:lya9qenyw9.fsf@pushface.org...
> "Randy Brukardt" <randy@rrsoftware.com> writes:
>
>> you have to be able to remove redundant assertion checks to make the
>> cost cheap enough that they don't need to be left on all the time
>
> *off* all the time?

Yes, of course, sorry.

Randy.

^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Ada and OpenMP
  2013-03-08  3:31 ` Randy Brukardt
  2013-03-08  7:17 ` Simon Wright
@ 2013-03-08 12:07 ` Peter C. Chapin
  1 sibling, 0 replies; 26+ messages in thread

From: Peter C. Chapin @ 2013-03-08 12:07 UTC (permalink / raw)

On 03/07/2013 10:31 PM, Randy Brukardt wrote:
> But (based on the rest of your note) that isn't "fine-grained
> parallelism". You called a bunch of expensive library functions in the
> loop, and thus your actual computations are large enough that the
> mechanism would work well. But so would an arrangement like Paraffin
> (with a bit more code rearrangement).

Yes, "fine-grained" was probably a poor choice of words. Yet splitting
loops, at any level, into parallel tasks is tedious to do manually. I
think this is where you're coming from when you say it's something the
compiler should be doing. I don't disagree with you.

My post was in response to an earlier question wondering whether OpenMP-
style parallelism is really necessary in a language with explicit
tasking. My answer is "yes," but if it's implemented by having the
compiler parallelize things automatically, then all the better. Either
way the programmer doesn't have to roll out the machinery of tasking to
manually parallelize loops.

From your post you are optimistic about the capabilities of current and
upcoming compilers, and you would know more about that than I do. Indeed,
I look forward to using such compilers and would be happy to provide
whatever hints they need via aspects, or annotations, or pragmas, or
whatever might be necessary. Alas, when I wrote the program I mentioned
before, the necessary compiler technology to do automatic loop
parallelization was not available to me. Hence OpenMP.

Maybe OpenMP is obsolete or maybe it's not. After all, the pragmas it
uses (in the C world) are a form of annotation. They may not be the best
way to do things, especially in Ada, but ultimately it seems like the
programmer would have to be involved in some way.
Peter

^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Ada and OpenMP
  2013-03-08  0:39 ` Peter C. Chapin
  2013-03-08  3:31 ` Randy Brukardt
@ 2013-03-08 14:40 ` Rego, P.
  1 sibling, 0 replies; 26+ messages in thread

From: Rego, P. @ 2013-03-08 14:40 UTC (permalink / raw)

>> Isn't OpenMP aimed at SIMD-type machines (as in video processors), as
>> opposed to generalized cores as in typical Intel and ARM designs?
>> Fine-grained parallelism doesn't make much sense on the latter, because
>> cache coherence and core scheduling issues will eat up the gains in
>> almost all circumstances. Ada tasks are a much better model.
>
> Well, I used OpenMP for a program targeting x64 architectures and it
> worked well in my case. It was easy to use: my program became 8x faster
> by the addition of a single line of source text. It even computed the
> right answer. My program was very well suited to the OpenMP model of
> computation, however, so I wouldn't expect such a dramatic result in
> all cases, of course.

In my case, I'm using OpenMP on a parallel computer (whose architecture I
don't actually know). But I would like to use it also on my i7 x64 with
6/4 cores, just out of curiosity; sure, it's not a supermachine, but it
would be good to compare the performance of the same algorithms on it.
The idea of using Ada came up because, while studying OpenMP, I found
several of its parallel concepts very similar to the Ada tasking scheme,
so why not (if it could be done)?

^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Ada and OpenMP
  2013-03-07 23:42 ` Randy Brukardt
  2013-03-08  0:39 ` Peter C. Chapin
@ 2013-03-08  1:15 ` Shark8
  2013-03-08  3:42   ` Randy Brukardt
  2013-03-08  7:37 ` Simon Wright
  2013-03-10 18:00 ` Waldek Hebisch
  3 siblings, 1 reply; 26+ messages in thread

From: Shark8 @ 2013-03-08 1:15 UTC (permalink / raw)

On Thursday, March 7, 2013 4:42:59 PM UTC-7, Randy Brukardt wrote:
>
> Pragmas, IMHO, are the worst way to do anything. Compiler writers tend
> to use them because they can do so without appearing to modify the
> language, but it's all an illusion: the program probably won't work
> right without the pragma, so you're still locked into that particular
> vendor. Might as well have done it right in the first place (and make a
> proposal to the ARG, backed with practice, so it can get done right in
> the next version of Ada).

I'm not sure I totally agree with your sentiment; IIRC, pragmas in Ada
were supposed to be parentheticals to the compiler that were not
essential to program correctness -- such as Optimize or the
source-printing pragma Page.

That sentiment was thwarted with the inclusion of representation
pragmas. -- However, the Ada specification *does* allow for
implementation-defined pragmas. It seems to me that such pragmas would be
ideal for experimental or compiler-specific code generation. The previous
poster's "pragma OMP(...)" is an excellent example (though I think the
FOR_LOOP parameter is stupid; the feature [OMP] should be turned on/off
as needed, letting the compiler determine the optimal technique to use).
{A program written in such a way would be correct even if the code were
ported to a non-OMP compiler, provided the underlying algorithm was
correct.}

^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Ada and OpenMP
  2013-03-08  1:15 ` Shark8
@ 2013-03-08  3:42   ` Randy Brukardt
  2013-03-08 14:53     ` Rego, P.
  2013-03-08 16:52     ` Shark8
  0 siblings, 2 replies; 26+ messages in thread

From: Randy Brukardt @ 2013-03-08 3:42 UTC (permalink / raw)

"Shark8" <onewingedshark@gmail.com> wrote in message
news:3e01ac49-4427-4f50-8577-8edab7e539a6@googlegroups.com...
On Thursday, March 7, 2013 4:42:59 PM UTC-7, Randy Brukardt wrote:
>>
>> Pragmas, IMHO, are the worst way to do anything. Compiler writers tend
>> to use them because they can do so without appearing to modify the
>> language, but it's all an illusion: the program probably won't work
>> right without the pragma, so you're still locked into that particular
>> vendor. Might as well have done it right in the first place (and make
>> a proposal to the ARG, backed with practice, so it can get done right
>> in the next version of Ada).
>
> I'm not sure I totally agree with your sentiment; IIRC, pragmas in Ada
> were supposed to be parentheticals to the compiler that were not
> essential to program correctness -- such as Optimize or the
> source-printing pragma Page.
>
> That sentiment was thwarted with the inclusion of representation
> pragmas.

Which are now obsolescent. We've finally removed most of a basic mistake
in the design of the language. It would be a pity to bring it back.

> -- However, the Ada specification *does* allow for
> implementation-defined pragmas. It seems to me that such pragmas would
> be ideal for experimental or compiler-specific code generation.

That's fine if that is all that it does. But that's not possible here.

> The previous poster's "pragma OMP(...)" is an excellent example

No, it changes the semantics drastically. Exceptions don't work, I/O and
progress counters have to be avoided in the loop, etc.

> (though I think the FOR_LOOP parameter is stupid; the feature [OMP]
> should be turned on/off as needed, letting the compiler determine the
> optimal technique to use).
> {A program written in such a way would be correct even if the code were
> ported to a non-OMP compiler, provided the underlying algorithm was
> correct.}

In order for that to be the case, the pragma would have to make various
constructs illegal in the loop and in the surrounding code (exception
handlers, any code where one iteration of the loop depends on the next,
erroneous use of shared variables). But a pragma shouldn't be changing
the legality rules of the language. (And it's not clear this would really
fix the problem.)

Alternatively, the parallel version could simply be erroneous if any of
those things happened. But that means you have no idea what will happen,
and you've introduced new forms of erroneousness (meaning that there is
no chance that the pragma would ever be standardized).

Ada is about doing things right, and that should be true even for
implementation-defined stuff. And we *need* people to figure out good
ways of doing these things (for instance, a "parallel" classification for
functions would be very helpful). The sloppy way helps little.

Randy "No New Pragmas" Brukardt

^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Ada and OpenMP
  2013-03-08  3:42 ` Randy Brukardt
@ 2013-03-08 14:53   ` Rego, P.
  2013-03-08 15:47     ` Georg Bauhaus
  2013-03-08 23:40     ` Randy Brukardt
  1 sibling, 2 replies; 26+ messages in thread

From: Rego, P. @ 2013-03-08 14:53 UTC (permalink / raw)

> On Thursday, March 7, 2013 4:42:59 PM UTC-7, Randy Brukardt wrote:
>
> Ada is about doing things right, and that should be true even for
> implementation-defined stuff. And we *need* people to figure out good
> ways of doing these things (for instance, a "parallel" classification
> for functions would be very helpful). The sloppy way helps little.

Got your point. Would you have a suggestion on how I could parallelize a
loop such as

   pragma OMP(Parallel_For)
   for I in 1 .. MAX loop
      A(I) := A(I) + 1;
   end loop;

but without using pragmas?

^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Ada and OpenMP
  2013-03-08 14:53 ` Rego, P.
@ 2013-03-08 15:47   ` Georg Bauhaus
  2013-03-08 23:40   ` Randy Brukardt
  1 sibling, 0 replies; 26+ messages in thread

From: Georg Bauhaus @ 2013-03-08 15:47 UTC (permalink / raw)

On 08.03.13 15:53, Rego, P. wrote:
>> On Thursday, March 7, 2013 4:42:59 PM UTC-7, Randy Brukardt wrote:
>>
>> Ada is about doing things right, and that should be true even for
>> implementation-defined stuff. And we *need* people to figure out good
>> ways of doing these things (for instance, a "parallel" classification
>> for functions would be very helpful). The sloppy way helps little.
>
> Got your point. Would you have a suggestion on how I could parallelize
> a loop such as
>
>    pragma OMP(Parallel_For)
>    for I in 1 .. MAX loop
>       A(I) := A(I) + 1;
>    end loop;

If A is an array of small objects, then with GNAT on Intel you can turn
on -ftree-vectorize (or -O3) and see what this gives. Adding
-ftree-vectorizer-verbose=2 (old) or -fopt-info-optimize instructs GCC
to report successful vectorizations.

^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Ada and OpenMP
  2013-03-08 14:53 ` Rego, P.
  2013-03-08 15:47 ` Georg Bauhaus
@ 2013-03-08 23:40 ` Randy Brukardt
  1 sibling, 0 replies; 26+ messages in thread

From: Randy Brukardt @ 2013-03-08 23:40 UTC (permalink / raw)

"Rego, P." <pvrego@gmail.com> wrote in message
news:f6fff4ba-e9a1-4335-a4a0-cb9d60152ad9@googlegroups.com...
>> On Thursday, March 7, 2013 4:42:59 PM UTC-7, Randy Brukardt wrote:
>>
>> Ada is about doing things right, and that should be true even for
>> implementation-defined stuff. And we *need* people to figure out good
>> ways of doing these things (for instance, a "parallel" classification
>> for functions would be very helpful). The sloppy way helps little.
>
> Got your point. Would you have a suggestion on how I could parallelize
> a loop such as
>
>    pragma OMP(Parallel_For)
>    for I in 1 .. MAX loop
>       A(I) := A(I) + 1;
>    end loop;
>
> but without using pragmas?

(1) Use a compiler that does this automatically (apparently GNAT does
this in some circumstances).

(2) Use a library like Paraffin; a bit less convenient, but it will work
on any Ada compiler for any target. Some of the Ada 2012 features may
make such a library more convenient to write (I haven't been keeping up
with Brad's work on this).

(3) Use a compiler with an appropriate extension for parallel loops. One
possibility would be something like:

   for I in 1 .. MAX loop in parallel
      A(I) := A(I) + 1;
   end loop;

This of course ties you to a particular implementation, or to waiting for
Ada 202x. Of course, so does a pragma, and it's much less likely to be
standardized. So I suggest (1) or (2).

Randy.

^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Ada and OpenMP
  2013-03-08  3:42 ` Randy Brukardt
  2013-03-08 14:53 ` Rego, P.
@ 2013-03-08 16:52 ` Shark8
  2013-03-08 23:36   ` Randy Brukardt
  1 sibling, 1 reply; 26+ messages in thread

From: Shark8 @ 2013-03-08 16:52 UTC (permalink / raw)

On Thursday, March 7, 2013 8:42:15 PM UTC-7, Randy Brukardt wrote:
>
> In order for that to be the case, the pragma would have to make various
> constructs illegal in the loop and in the surrounding code (exception
> handlers, any code where one iteration of the loop depends on the next,
> erroneous use of shared variables). But a pragma shouldn't be changing
> the legality rules of the language. (And it's not clear this would
> really fix the problem.)

Why would that have to change the semantics of the program? Since there
would have to be a non-implementation-defined code-generation method (for
when the pragma is off), the compiler should just use that if those
constructs are used. It limits the usefulness, yes; but there's no reason
to assume that it should cause erroneous execution. -- I suppose that in
early versions of an experimental implementation there might be some,
because of some case that was overlooked. -- Though this does seem to
bring us back to the idea of exceptions (and how to indicate/assert to
the compiler that none, or only a limited set, can be raised).

> Alternatively, the parallel version could simply be erroneous if any of
> those things happened. But that means you have no idea what will
> happen, and you've introduced new forms of erroneousness (meaning that
> there is no chance that the pragma would ever be standardized).
>
> Ada is about doing things right, and that should be true even for
> implementation-defined stuff. And we *need* people to figure out good
> ways of doing these things (for instance, a "parallel" classification
> for functions would be very helpful). The sloppy way helps little.
> Randy "No New Pragmas" Brukardt

^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Ada and OpenMP
  2013-03-08 16:52 ` Shark8
@ 2013-03-08 23:36   ` Randy Brukardt
  2013-03-09  4:13     ` Brad Moore
  0 siblings, 1 reply; 26+ messages in thread

From: Randy Brukardt @ 2013-03-08 23:36 UTC (permalink / raw)

"Shark8" <onewingedshark@gmail.com> wrote in message
news:9e0bbbdf-ccfa-4d4c-90af-2d56d46242b3@googlegroups.com...
> On Thursday, March 7, 2013 8:42:15 PM UTC-7, Randy Brukardt wrote:
>>
>> In order for that to be the case, the pragma would have to make
>> various constructs illegal in the loop and in the surrounding code
>> (exception handlers, any code where one iteration of the loop depends
>> on the next, erroneous use of shared variables). But a pragma
>> shouldn't be changing the legality rules of the language. (And it's
>> not clear this would really fix the problem.)
>
> Why would that have to change the semantics of the program? Since there
> would have to be a non-implementation-defined code-generation method
> (for when the pragma is off), the compiler should just use that if
> those constructs are used.

Mainly because 95% of Ada code is going to fail such tests; it would
virtually never be able to use the fancy code.

Take the OP's example, for example:

   for I in 1 .. MAX loop
      A(I) := A(I) + 1;  -- Can raise Constraint_Error because of
                         -- overflow or range checks.
   end loop;

This can be done in parallel only if (A) there is no exception handler
for Constraint_Error or others anywhere in the program; or (B) pragma
Suppress applies to the loop (nasty; we never, ever want an incentive to
use Suppress); or (C) no exception handler or code following a handler
can ever access A (generally only possible if A is a local variable, not
a parameter or global). For some loops there would be a (D): being able
to prove from subtypes and constraints that no exception can happen --
but that is never possible for increment or decrement operations like
the above.
These conditions aren't going to happen that often, and unless a compiler has access to the source code for the entire program, (A) isn't possible to determine anyway. And if the compiler is going to go through all of that anyway, it might as well just do it whenever it can, no pragma is necessary or useful. The whole advantage of having a "marker" here is to allow a change in the semantics in the error case. If you're not going to do that, you're hardly ever going to be able to parallelize, so what's the point of a pragma? Randy. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Ada and OpenMP 2013-03-08 23:36 ` Randy Brukardt @ 2013-03-09 4:13 ` Brad Moore 2013-03-10 4:24 ` Randy Brukardt 0 siblings, 1 reply; 26+ messages in thread From: Brad Moore @ 2013-03-09 4:13 UTC (permalink / raw) On 08/03/2013 4:36 PM, Randy Brukardt wrote: > "Shark8" <onewingedshark@gmail.com> wrote in message > news:9e0bbbdf-ccfa-4d4c-90af-2d56d46242b3@googlegroups.com... >> On Thursday, March 7, 2013 8:42:15 PM UTC-7, Randy Brukardt wrote: >>> >>> In order for that to be the case, the pragma would have to make various >>> constructs illegal in the loop and in the surrounding code (exception >>> handlers, any code where one iteration of the loop depends on the next, >>> erroneous use of shared variables). But a pragma shouldn't be changing >>> the >>> legality rules of the language. (And it's not clear this would really fix >>> the problem.) >> >> Why would that have to change the semantics of the program: since there >> would have >> to be a non-implementation-defined code-generation method (for when the >> pragma >> was off) the compiler should just use that if those constructs are used. > > Mainly because 95% of Ada code is going to fail such tests; it would > virtually never be able to use the fancy code. > > Take the OP's example, for example:
>
> for I in 1 .. MAX loop
>    A(I) := A(I) + 1; -- Can raise Constraint_Error because of overflow or range checks.
> end loop;
>
> This can be done in parallel only if (A) there is no exception handler for > Constraint_Error or others anywhere in the program; or

I am working towards a new version of Paraffin to be released soon that handles exceptions in such loops (as well as a number of other features). The technique, though, is to have the workers catch any exception that might have been raised in the user's code, and then call Ada.Exceptions.Save_Occurrence to save the exception to be raised later.
Once all workers have completed their work, before returning to let the sequential code continue on, a check is made to see if any occurrences were saved. If so, then Ada.Exceptions.Reraise_Occurrence is called, to get the exception to appear in the same task that invoked the parallelism. Testing so far indicates this seems to work well, maintaining the exception abstraction as though the code were being executed sequentially. Currently only the most recent exception is saved, so if more than one exception is raised by the parallel workers, only one will get fed back to the calling task, but I think that's OK, as that would have been the behaviour for the sequential case. Such an exception also sets a flag indicating the work is complete, which attempts to get other workers to abort their work as soon as possible. Also, under GNAT at least, this exception handling logic doesn't appear to impact performance. Apparently they use zero-cost exception handling, which might be why. I'm not sure what sort of impact that might have on other implementations that model exceptions differently. Hopefully, it wouldn't be a significant impact. Brad > (B) pragma Suppress > applies to the loop (nasty, we never, ever want an incentive to use > Suppress); or (C) no exception handler or code following the handler can > ever access A (generally only possible if A is a local variable, not a > parameter or global). For some loops there would be a (D) be able to prove > from subtypes and constraints that no exception can happen -- but that is > never possible for increment or decrement operations like the above. These > conditions aren't going to happen that often, and unless a compiler has > access to the source code for the entire program, (A) isn't possible to > determine anyway. > > And if the compiler is going to go through all of that anyway, it might as > well just do it whenever it can, no pragma is necessary or useful.
> > The whole advantage of having a "marker" here is to allow a change in the > semantics in the error case. If you're not going to do that, you're hardly > ever going to be able to parallelize, so what's the point of a pragma? > > Randy. > > > > ^ permalink raw reply [flat|nested] 26+ messages in thread
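[The save-and-reraise scheme Brad describes is not specific to Ada or to Paraffin. As a purely hypothetical illustration -- sketched in C++ rather than Ada only because the pattern maps directly onto std::exception_ptr, which plays the role of the saved Exception_Occurrence -- each worker catches whatever the loop body raises, the saved occurrence is re-raised after the join, and as in Paraffin only one occurrence survives if several workers fail:]

```cpp
#include <algorithm>
#include <exception>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical sketch of the save-and-reraise pattern (not Paraffin code):
// each worker catches anything thrown by the loop body and saves it; after
// all workers join, the invoking thread re-raises one saved occurrence, so
// the exception appears in the thread that invoked the parallelism.
void parallel_for(int first, int last, void (*body)(int)) {
    std::exception_ptr saved;  // analogue of the saved Exception_Occurrence
    std::mutex save_lock;      // only the most recent occurrence is kept
    unsigned n = std::max(2u, std::thread::hardware_concurrency());
    int chunk = (last - first) / static_cast<int>(n) + 1;
    std::vector<std::thread> workers;
    for (int lo = first; lo <= last; lo += chunk) {
        int hi = std::min(last, lo + chunk - 1);
        workers.emplace_back([=, &saved, &save_lock] {
            try {
                for (int i = lo; i <= hi; ++i) body(i);
            } catch (...) {
                std::lock_guard<std::mutex> g(save_lock);
                saved = std::current_exception();  // "Save_Occurrence"
            }
        });
    }
    for (auto& t : workers) t.join();
    if (saved) std::rethrow_exception(saved);      // "Reraise_Occurrence"
}
```

[A capture-less lambda converts to the plain function pointer, so parallel_for(0, 99, [](int i) { ... }); is enough to try it; compile with -pthread on POSIX systems.]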
* Re: Ada and OpenMP 2013-03-09 4:13 ` Brad Moore @ 2013-03-10 4:24 ` Randy Brukardt 0 siblings, 0 replies; 26+ messages in thread From: Randy Brukardt @ 2013-03-10 4:24 UTC (permalink / raw) "Brad Moore" <brad.moore@shaw.ca> wrote in message news:513AB6D3.6030106@shaw.ca... > On 08/03/2013 4:36 PM, Randy Brukardt wrote: ... >> Take the OP's example, for example:
>>
>> for I in 1 .. MAX loop
>>    A(I) := A(I) + 1; -- Can raise Constraint_Error because of overflow or range checks.
>> end loop;
>>
>> This can be done in parallel only if (A) there is no exception handler >> for >> Constraint_Error or others anywhere in the program; or > > I am working towards a new version of Paraffin to be released soon that > handles exceptions in such loops (as well as a number of other features). > > The technique, though, is to have the workers catch any exception that > might have been raised in the user's code, and then call > Ada.Exceptions.Save_Occurrence to save the exception to be raised later.

I'd expect this to work fine - it's how I'd implement it if I were doing that. The issue, though, is that this changes the semantics of the loop WRT exceptions. Specifically, the parts of A that get modified would be unspecified, while that's not true for the sequential loop (the items that are modified have to be a contiguous group at the lower end of the array). That's fine for Paraffin, because no one will accidentally use it expecting deterministic behavior. It's not so clear when you actually write the loop syntax. Which is why a parallel loop syntax seems valuable, as it would make it explicit that parallelism is expected (and would also allow checking for dependencies between iterations, which usually can't be allowed). Of course, an alternative would be just to standardize a library like Paraffin for this purpose, possibly with some tie-in to the iterator syntax. (I know you proposed something on this line, but too late to include in Ada 2012.) Randy. 
^ permalink raw reply [flat|nested] 26+ messages in thread
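[The determinism Randy describes can be made concrete. A hypothetical sketch -- in C++ only because the property itself is language-independent: when a sequential loop fails at iteration 42, a handler can rely on exactly the contiguous prefix a[0 .. 41] having been modified, which is precisely the guarantee a parallel execution cannot give:]

```cpp
#include <stdexcept>
#include <vector>

// Sequential loops fail deterministically: if iteration 'fail_at' raises,
// exactly the elements before it have been modified -- a contiguous prefix
// at the lower end of the array. A parallel execution would instead leave
// an unspecified subset of the elements modified.
std::vector<int> increment_until_failure(int fail_at, int size) {
    std::vector<int> a(size, 0);
    try {
        for (int i = 0; i < size; ++i) {
            if (i == fail_at) throw std::runtime_error("overflow stand-in");
            a[i]++;
        }
    } catch (const std::runtime_error&) {
        // handler: a[0 .. fail_at-1] are 1; everything from fail_at on is 0
    }
    return a;
}
```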
* Re: Ada and OpenMP 2013-03-07 23:42 ` Randy Brukardt 2013-03-08 0:39 ` Peter C. Chapin 2013-03-08 1:15 ` Shark8 @ 2013-03-08 7:37 ` Simon Wright 2013-03-10 18:00 ` Waldek Hebisch 3 siblings, 0 replies; 26+ messages in thread From: Simon Wright @ 2013-03-08 7:37 UTC (permalink / raw) "Randy Brukardt" <randy@rrsoftware.com> writes: > Pragmas, IMHO, are the worst way to do anything. Compiler writers tend > to use them because they can do so without appearing to modify the > language, but it's all an illusion: the program probably won't work > right without the pragma, so you're still locked into that particular > vendor. You'd be just as locked in with implementation-defined aspects (GNAT certainly has these). But at least another compiler would have to fail if it didn't support the aspect at all (you'd be as badly off as with pragmas if it did, but with different semantics). ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Ada and OpenMP 2013-03-07 23:42 ` Randy Brukardt ` (2 preceding siblings ...) 2013-03-08 7:37 ` Simon Wright @ 2013-03-10 18:00 ` Waldek Hebisch 3 siblings, 0 replies; 26+ messages in thread From: Waldek Hebisch @ 2013-03-10 18:00 UTC (permalink / raw) Randy Brukardt <randy@rrsoftware.com> wrote: > "Peter C. Chapin" <PChapin@vtc.vsc.edu> wrote in message > news:hr-dnULuncyRjqTM4p2dnAA@giganews.com... > > OpenMP is a different animal than Ada tasks. It provides fine grained > > parallelism where, for example, it is possible to have the compiler > > automatically parallelize a loop. In C: > > > > #pragma omp parallel for > > for( i = 0; i < MAX; ++i ) { > > array[i]++; > > } > > > > The compiler automatically splits the loop iterations over an > > "appropriate" number of threads (probably based on the number of cores). > > Isn't OpenMP aimed at SIMD-type machines (as in video processors), as > opposed to generalized cores as in typical Intel and ARM designs? > Fine-grained parallelism doesn't make much sense on the latter, because > cache coherence and core scheduling issues will eat up gains in almost all > circumstances. Ada tasks are a much better model. Actually OpenMP only looks like fine-grained parallelism, but is not: OpenMP creates (and destroys) tasks as needed. The main advantage of OpenMP is that it automates some common parallel patterns, and consequently the code is much closer to the sequential version. It is very hard to get a similar effect in a fully automatic way, without pragmas: a fully automatic approach takes losses on fine-grained cases, while the pragmas tell the compiler that the code is coarse enough to use several tasks. Also, OpenMP pragmas control memory consistency -- without them the compiler would have to assume the worst case and generate slower code. -- Waldek Hebisch hebisch@math.uni.wroc.pl ^ permalink raw reply [flat|nested] 26+ messages in thread
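[One of the "common parallel patterns" Waldek mentions is reduction, which also comes up later in this thread. A minimal hypothetical sketch (written in C++, though the pragma is identical in C): each thread accumulates a private partial sum and the OpenMP runtime combines them when the loop ends. Built without -fopenmp, the pragma is simply ignored and the loop runs sequentially with the same result:]

```cpp
#include <vector>

// OpenMP automates the reduction pattern: 'sum' is privatized per thread
// and the partial sums are combined when the parallel loop completes.
// Without -fopenmp the pragma is ignored and this is an ordinary loop.
long sum_of_squares(const std::vector<int>& v) {
    long sum = 0;
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < static_cast<int>(v.size()); ++i) {
        sum += static_cast<long>(v[i]) * v[i];
    }
    return sum;
}
```

[Writing the same thing with explicit threads would need per-worker partial sums and a combining step by hand -- the boilerplate the pragma hides, and roughly what Randy's point about memory consistency refers to: the pragma also tells the compiler which accesses need synchronizing.]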
* Re: Ada and OpenMP 2013-03-07 22:22 ` Peter C. Chapin 2013-03-07 23:42 ` Randy Brukardt @ 2013-03-07 23:43 ` Georg Bauhaus 2013-03-08 10:18 ` Georg Bauhaus 2013-03-08 14:24 ` Rego, P. 2 siblings, 1 reply; 26+ messages in thread From: Georg Bauhaus @ 2013-03-07 23:43 UTC (permalink / raw) "Peter C. Chapin" <PChapin@vtc.vsc.edu> wrote: > OpenMP is a different animal than Ada tasks. It provides fine grained > parallelism where, for example, it is possible to have the compiler > automatically parallelize a loop. In C: > > #pragma omp parallel for > for( i = 0; i < MAX; ++i ) { > array[i]++; > } Fortunately, OpenMP is no longer needed to achieve automatic parallelism in either C or Ada at the low level. GCC's vectorizer produces code that runs in parallel for a number of loop patterns. These are documented, and they work in GNAT GPL or more recent FSF GNATs. Later 4.7s IIRC. The higher level constructs added to OpenMP 3 do have a lot in common with Ada tasking, if I have understood some slides correctly. How could it be otherwise? ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Ada and OpenMP 2013-03-07 23:43 ` Georg Bauhaus @ 2013-03-08 10:18 ` Georg Bauhaus 0 siblings, 0 replies; 26+ messages in thread From: Georg Bauhaus @ 2013-03-08 10:18 UTC (permalink / raw) On 08.03.13 00:43, Georg Bauhaus wrote: > "Peter C. Chapin" <PChapin@vtc.vsc.edu> wrote: >> OpenMP is a different animal than Ada tasks. It provides fine grained >> parallelism where, for example, it is possible to have the compiler >> automatically parallelize a loop. In C: >> >> #pragma omp parallel for >> for( i = 0; i < MAX; ++i ) { >> array[i]++; >> } > > Fortunately, OpenMP is no longer needed to achieve automatic > parallelism in either C or Ada at the low level. GCC's vectorizer > produces code that runs in parallel for a number of loop patterns. > These are documented, and they work in GNAT GPL or more > recent FSF GNATs. Later 4.7s IIRC. For example, adding -ftree-vectorize to the set of options (-O2 ...) increases the speed of the program below by factors up to 3, depending to some extent on the value of MAX. (Option -O3 is even easier in this case, and yields improvements when MAX = 8.) The assembly listing includes instructions like MOVDQA and PADDD used with SSE registers. GNAT will report successful optimizations when -fopt-info-optimized is among the switches (or -ftree-vectorizer-verbose=2 for older GNATs).

package Fast is
   MAX : constant := 50;
   subtype Number is Integer;
   type Index is new Natural range 0 .. MAX;
   type Vect is array (Index) of Number;
   procedure Inc_Array (V : in out Vect);
end Fast;

package body Fast is
   procedure Inc_Array (V : in out Vect) is
   begin
      for K in Index loop
         V (K) := V (K) + 1;
      end loop;
   end Inc_Array;
end Fast;

with Ada.Real_Time; use Ada.Real_Time;
with Ada.Text_IO;
with Fast; use Fast;

procedure Test_Fast is
   Start, Finish : Time;
   Data : Vect;
   Result : Integer := 0;
   pragma Volatile (Result);
begin
   Start := Clock;
   for Run in 1 .. 500_000_000 / MAX loop
      Inc_Array (Data);
      if Data (Index (MAX / 2 + Run mod MAX / 2)) rem 2 = 1 then
         Result := 1;
      end if;
   end loop;
   Finish := Clock;
   Ada.Text_IO.Put_Line (Duration'Image (To_Duration (Finish - Start)));
end Test_Fast;

^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Ada and OpenMP 2013-03-07 22:22 ` Peter C. Chapin 2013-03-07 23:42 ` Randy Brukardt 2013-03-07 23:43 ` Georg Bauhaus @ 2013-03-08 14:24 ` Rego, P. 2 siblings, 0 replies; 26+ messages in thread From: Rego, P. @ 2013-03-08 14:24 UTC (permalink / raw) On Thursday, March 7, 2013 at 7:22:03 PM UTC-3, Peter C. Chapin wrote: > OpenMP is a different animal than Ada tasks. It provides fine grained > parallelism where, for example, it is possible to have the compiler > automatically parallelize a loop. In C: > #pragma omp parallel for > for( i = 0; i < MAX; ++i ) { > array[i]++; > } > The compiler automatically splits the loop iterations over an > "appropriate" number of threads (probably based on the number of cores). > In Ada one might write, perhaps > pragma OMP(Parallel_For) > for I in 1 .. MAX loop > A(I) := A(I) + 1 > end loop; > Doing this with Ada tasks in such a way that it uses an optimal number > of threads on each execution (based on core count) would be much more > complicated, I should imagine. Please correct me if I'm wrong! > OpenMP has various other features, some of which could be done naturally > with tasks, but much of what OpenMP is about is semi-automatic fine > grained parallelization. It is to Ada tasking what Ada tasking is to the > explicit handling of locks, etc. > Peter Yes, that's the idea. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Ada and OpenMP 2013-03-07 18:04 Ada and OpenMP Rego, P. 2013-03-07 20:04 ` Ludovic Brenta @ 2013-03-07 22:52 ` Simon Wright 2013-03-08 21:37 ` Brad Moore 1 sibling, 1 reply; 26+ messages in thread From: Simon Wright @ 2013-03-07 22:52 UTC (permalink / raw) "Rego, P." <pvrego@gmail.com> writes: > I'm trying some exercises of parallel computing using that pragmas > from OpenMP in C, but it would be good to use it also in Ada. Is it > possible to use that pragmas from OpenMP in Ada? And...does gnat gpl > supports it? GNAT doesn't support OpenMP pragmas. But you might take a look at Paraffin: http://sourceforge.net/projects/paraffin/ ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Ada and OpenMP 2013-03-07 22:52 ` Simon Wright @ 2013-03-08 21:37 ` Brad Moore 0 siblings, 0 replies; 26+ messages in thread From: Brad Moore @ 2013-03-08 21:37 UTC (permalink / raw) On 07/03/2013 3:52 PM, Simon Wright wrote: > "Rego, P." <pvrego@gmail.com> writes: > >> I'm trying some exercises of parallel computing using that pragmas >> from OpenMP in C, but it would be good to use it also in Ada. Is it >> possible to use that pragmas from OpenMP in Ada? And...does gnat gpl >> supports it? > > GNAT doesn't support OpenMP pragmas. > > But you might take a look at Paraffin: > http://sourceforge.net/projects/paraffin/ > To give an example using the Paraffin libraries, the following code shows the same problem executed sequentially and then executed with the Paraffin libraries.

with Ada.Real_Time; use Ada.Real_Time;
with Ada.Command_Line;
with Ada.Text_IO; use Ada.Text_IO;
with Parallel.Iteration.Work_Stealing;

procedure Test_Loops is

   procedure Integer_Loops is new Parallel.Iteration.Work_Stealing
     (Iteration_Index_Type => Integer);

   Start : Time;
   Array_Size : Natural := 50;
   Iterations : Natural := 10_000_000;

begin
   -- Allow first command line parameter to override default iteration count
   if Ada.Command_Line.Argument_Count >= 1 then
      Iterations := Integer'Value (Ada.Command_Line.Argument (1));
   end if;

   -- Allow second command line parameter to override default array size
   if Ada.Command_Line.Argument_Count >= 2 then
      Array_Size := Integer'Value (Ada.Command_Line.Argument (2));
   end if;

   Data_Block : declare
      Data : array (1 .. Array_Size) of Natural := (others => 0);
   begin
      -- Sequential version of the code; any parallelization must be
      -- auto-generated by the compiler
      Start := Clock;
      for I in Data'Range loop
         for J in 1 .. Iterations loop
            Data (I) := Data (I) + 1;
         end loop;
      end loop;

      Put_Line ("Sequential Elapsed=" &
                Duration'Image (To_Duration (Clock - Start)));

      Data := (others => 0);
      Start := Clock;

      -- Parallel version of the code, explicitly parallelized using Paraffin
      declare
         procedure Iterate (First : Integer; Last : Integer) is
         begin
            for I in First .. Last loop
               for J in 1 .. Iterations loop
                  Data (I) := Data (I) + 1;
               end loop;
            end loop;
         end Iterate;
      begin
         Integer_Loops (From => Data'First,
                        To => Data'Last,
                        Worker_Count => 4,
                        Process => Iterate'Access);
      end;

      Put_Line ("Parallel Elapsed=" &
                Duration'Image (To_Duration (Clock - Start)));
   end Data_Block;
end Test_Loops;

When run on my machine (an AMD quad-core) with parameters 100_000 100_000, with full optimization turned on with -ftree-vectorize, I get:

Sequential Elapsed= 6.874298000
Parallel Elapsed= 6.287230000

With optimization turned off, I get:

Sequential Elapsed= 32.428908000
Parallel Elapsed= 8.424717000

gcc with GNAT does a good job of optimization when it's enabled, for these cases as shown, but the differences between optimization and using Paraffin can be more pronounced in other cases that are more complex, such as loops that involve reduction (e.g. calculating a sum). Brad ^ permalink raw reply [flat|nested] 26+ messages in thread
end of thread, other threads:[~2013-03-10 18:00 UTC | newest] Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2013-03-07 18:04 Ada and OpenMP Rego, P. 2013-03-07 20:04 ` Ludovic Brenta 2013-03-07 22:22 ` Peter C. Chapin 2013-03-07 23:42 ` Randy Brukardt 2013-03-08 0:39 ` Peter C. Chapin 2013-03-08 3:31 ` Randy Brukardt 2013-03-08 7:17 ` Simon Wright 2013-03-08 23:40 ` Randy Brukardt 2013-03-08 12:07 ` Peter C. Chapin 2013-03-08 14:40 ` Rego, P. 2013-03-08 1:15 ` Shark8 2013-03-08 3:42 ` Randy Brukardt 2013-03-08 14:53 ` Rego, P. 2013-03-08 15:47 ` Georg Bauhaus 2013-03-08 23:40 ` Randy Brukardt 2013-03-08 16:52 ` Shark8 2013-03-08 23:36 ` Randy Brukardt 2013-03-09 4:13 ` Brad Moore 2013-03-10 4:24 ` Randy Brukardt 2013-03-08 7:37 ` Simon Wright 2013-03-10 18:00 ` Waldek Hebisch 2013-03-07 23:43 ` Georg Bauhaus 2013-03-08 10:18 ` Georg Bauhaus 2013-03-08 14:24 ` Rego, P. 2013-03-07 22:52 ` Simon Wright 2013-03-08 21:37 ` Brad Moore
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox