comp.lang.ada
* Ada and OpenMP
@ 2013-03-07 18:04 Rego, P.
  2013-03-07 20:04 ` Ludovic Brenta
  2013-03-07 22:52 ` Simon Wright
  0 siblings, 2 replies; 26+ messages in thread
From: Rego, P. @ 2013-03-07 18:04 UTC (permalink / raw)


Dear friends,

I'm trying some exercises in parallel computing using the pragmas from OpenMP in C, but it would be good to use them in Ada as well. Is it possible to use the OpenMP pragmas in Ada? And... does GNAT GPL support them?

Regards.
P.Rego




* Re: Ada and OpenMP
  2013-03-07 18:04 Ada and OpenMP Rego, P.
@ 2013-03-07 20:04 ` Ludovic Brenta
  2013-03-07 22:22   ` Peter C. Chapin
  2013-03-07 22:52 ` Simon Wright
  1 sibling, 1 reply; 26+ messages in thread
From: Ludovic Brenta @ 2013-03-07 20:04 UTC (permalink / raw)


Rego, P. writes on comp.lang.ada:
> I'm trying some exercises in parallel computing using the pragmas
> from OpenMP in C, but it would be good to use them in Ada as well. Is
> it possible to use the OpenMP pragmas in Ada? And... does GNAT GPL
> support them?

Why would you use pragmas when Ada supports tasking directly in the
language?

-- 
Ludovic Brenta.




* Re: Ada and OpenMP
  2013-03-07 20:04 ` Ludovic Brenta
@ 2013-03-07 22:22   ` Peter C. Chapin
  2013-03-07 23:42     ` Randy Brukardt
                       ` (2 more replies)
  0 siblings, 3 replies; 26+ messages in thread
From: Peter C. Chapin @ 2013-03-07 22:22 UTC (permalink / raw)


OpenMP is a different animal than Ada tasks. It provides fine grained 
parallelism where, for example, it is possible to have the compiler 
automatically parallelize a loop. In C:

#pragma omp parallel for
for( i = 0; i < MAX; ++i ) {
   array[i]++;
}

The compiler automatically splits the loop iterations over an 
"appropriate" number of threads (probably based on the number of cores).

In Ada one might write, perhaps

pragma OMP(Parallel_For)
for I in 1 .. MAX loop
   A(I) := A(I) + 1
end loop;

Doing this with Ada tasks in such a way that it uses an optimal number 
of threads on each execution (based on core count) would be much more 
complicated, I should imagine. Please correct me if I'm wrong!
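
Just to make the comparison concrete, here is roughly what I have in
mind -- only a minimal sketch of my own, with a hard-coded worker count
and hand-written chunking, not anything OpenMP or GNAT provides:

procedure Manual_Parallel_For is

   MAX          : constant := 100_000;
   Worker_Count : constant := 4;   --  would have to be tuned to the core count
   Chunk        : constant := MAX / Worker_Count;

   type Vector is array (1 .. MAX) of Integer;
   A : Vector := (others => 0);

   --  Each worker increments its own contiguous, non-overlapping slice of A.
   task type Worker is
      entry Start (First, Last : in Positive);
   end Worker;

   task body Worker is
      F, L : Positive;
   begin
      accept Start (First, Last : in Positive) do
         F := First;
         L := Last;
      end Start;
      for I in F .. L loop
         A (I) := A (I) + 1;
      end loop;
   end Worker;

   Workers : array (1 .. Worker_Count) of Worker;

begin
   --  Hand out the chunks (MAX is a multiple of Worker_Count here).
   for W in Workers'Range loop
      Workers (W).Start (First => (W - 1) * Chunk + 1,
                         Last  => W * Chunk);
   end loop;
   --  The procedure does not return until every worker has terminated.
end Manual_Parallel_For;

That is a lot of scaffolding compared with one pragma, which is really
all I meant.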

OpenMP has various other features, some of which could be done naturally 
with tasks, but much of what OpenMP is about is semi-automatic fine 
grained parallelization. It is to Ada tasking what Ada tasking is to the 
explicit handling of locks, etc.

Peter

On 03/07/2013 03:04 PM, Ludovic Brenta wrote:
> Rego, P. writes on comp.lang.ada:
>> I'm trying some exercises in parallel computing using the pragmas
>> from OpenMP in C, but it would be good to use them in Ada as well. Is
>> it possible to use the OpenMP pragmas in Ada? And... does GNAT GPL
>> support them?
>
> Why would you use pragmas when Ada supports tasking directly in the
> language?
>




* Re: Ada and OpenMP
  2013-03-07 18:04 Ada and OpenMP Rego, P.
  2013-03-07 20:04 ` Ludovic Brenta
@ 2013-03-07 22:52 ` Simon Wright
  2013-03-08 21:37   ` Brad Moore
  1 sibling, 1 reply; 26+ messages in thread
From: Simon Wright @ 2013-03-07 22:52 UTC (permalink / raw)


"Rego, P." <pvrego@gmail.com> writes:

> I'm trying some exercises in parallel computing using the pragmas
> from OpenMP in C, but it would be good to use them in Ada as well. Is
> it possible to use the OpenMP pragmas in Ada? And... does GNAT GPL
> support them?

GNAT doesn't support OpenMP pragmas.

But you might take a look at Paraffin:
http://sourceforge.net/projects/paraffin/




* Re: Ada and OpenMP
  2013-03-07 22:22   ` Peter C. Chapin
@ 2013-03-07 23:42     ` Randy Brukardt
  2013-03-08  0:39       ` Peter C. Chapin
                         ` (3 more replies)
  2013-03-07 23:43     ` Georg Bauhaus
  2013-03-08 14:24     ` Rego, P.
  2 siblings, 4 replies; 26+ messages in thread
From: Randy Brukardt @ 2013-03-07 23:42 UTC (permalink / raw)


"Peter C. Chapin" <PChapin@vtc.vsc.edu> wrote in message 
news:hr-dnULuncyRjqTM4p2dnAA@giganews.com...
> OpenMP is a different animal than Ada tasks. It provides fine grained 
> parallelism where, for example, it is possible to have the compiler 
> automatically parallelize a loop. In C:
>
> #pragma omp parallel for
> for( i = 0; i < MAX; ++i ) {
>   array[i]++;
> }
>
> The compiler automatically splits the loop iterations over an 
> "appropriate" number of threads (probably based on the number of cores).

Isn't OpenMP aimed at SIMD-type machines (as in video processors), as 
opposed to generalized cores as in typical Intel and ARM designs? 
Fine-grained parallelism doesn't make much sense on the latter, because 
cache coherence and core scheduling issues will eat up gains in almost all 
circumstances. Ada tasks are a much better model.

> In Ada one might write, perhaps
>
> pragma OMP(Parallel_For)
> for I in 1 .. MAX loop
>   A(I) := A(I) + 1
> end loop;
>
> Doing this with Ada tasks in such a way that it uses an optimal number of 
> threads on each execution (based on core count) would be much more 
> complicated, I should imagine. Please correct me if I'm wrong!

Well, this doesn't make much sense. If the pragma doesn't change the 
semantics of the loop, then it's not necessary at all (the compiler can and 
ought to do the optimization when it makes sense, possibly under the control 
of global flags). Programmers are lousy at determining where and how the 
best use of machine resources can be made. (Pragma Inline is a similar 
thing that should never have existed and certainly shouldn't be necessary.)

If the pragma does change the semantics, then it violates "good taste in 
pragmas". It would be much better for the change to be indicated by syntax 
or by an aspect.

Pragmas, IMHO, are the worst way to do anything. Compiler writers tend to 
use them because they can do so without appearing to modify the language, 
but it's all an illusion: the program probably won't work right without the 
pragma, so you're still locked into that particular vendor. Might as well 
have done it right in the first place (and make a proposal to the ARG, 
backed with practice, so it can get done right in the next version of Ada).

                      Randy Brukardt,
                      President, Anti-Pragma Society. :-)


> OpenMP has various other features, some of which could be done naturally 
> with tasks, but much of what OpenMP is about is semi-automatic fine 
> grained parallelization. It is to Ada tasking what Ada tasking is to the 
> explicit handling of locks, etc.
>
> Peter
>
> On 03/07/2013 03:04 PM, Ludovic Brenta wrote:
>> Rego, P. writes on comp.lang.ada:
>>> I'm trying some exercises in parallel computing using the pragmas
>>> from OpenMP in C, but it would be good to use them in Ada as well. Is
>>> it possible to use the OpenMP pragmas in Ada? And... does GNAT GPL
>>> support them?
>>
>> Why would you use pragmas when Ada supports tasking directly in the
>> language?
>> 






* Re: Ada and OpenMP
  2013-03-07 22:22   ` Peter C. Chapin
  2013-03-07 23:42     ` Randy Brukardt
@ 2013-03-07 23:43     ` Georg Bauhaus
  2013-03-08 10:18       ` Georg Bauhaus
  2013-03-08 14:24     ` Rego, P.
  2 siblings, 1 reply; 26+ messages in thread
From: Georg Bauhaus @ 2013-03-07 23:43 UTC (permalink / raw)


"Peter C. Chapin" <PChapin@vtc.vsc.edu> wrote:
> OpenMP is a different animal than Ada tasks. It provides fine grained
> parallelism where, for example, it is possible to have the compiler
> automatically parallelize a loop. In C:
> 
> #pragma omp parallel for
> for( i = 0; i < MAX; ++i ) {
>   array[i]++;
> }

Fortunately, OpenMP is no longer needed to achieve automatic
parallelism in either C or Ada at the low level. GCC's vectorizer
produces code that runs in parallel for a number of loop patterns.
These are documented, and they work in GNAT GPL and in more recent
FSF GNATs (the later 4.7 releases, IIRC).

The higher level constructs added to OpenMP 3 do have a lot
in common with Ada tasking, if I have understood some
slides correctly. How could it be otherwise?




* Re: Ada and OpenMP
  2013-03-07 23:42     ` Randy Brukardt
@ 2013-03-08  0:39       ` Peter C. Chapin
  2013-03-08  3:31         ` Randy Brukardt
  2013-03-08 14:40         ` Rego, P.
  2013-03-08  1:15       ` Shark8
                         ` (2 subsequent siblings)
  3 siblings, 2 replies; 26+ messages in thread
From: Peter C. Chapin @ 2013-03-08  0:39 UTC (permalink / raw)



On 03/07/2013 06:42 PM, Randy Brukardt wrote:

> Isn't OpenMP aimed at SIMD-type machines (as in video processors), as
> opposed to generalized cores as in typical Intel and ARM designs?
> Fine-grained parallelism doesn't make much sense on the latter, because
> cache coherence and core scheduling issues will eat up gains in almost all
> circumstances. Ada tasks are a much better model.

Well, I used OpenMP for a program targeting x64 architectures and it 
worked well in my case. It was easy to use: my program became 8x faster 
by the addition of a single line of source text. It even computed the 
right answer. My program was very well suited to the OpenMP model of 
computation, however, so I wouldn't expect such a dramatic result in all 
cases of course.

> Well, this doesn't make much sense. If the pragma doesn't change the
> semantics of the loop, then it's not necessary at all (the compiler can and
> ought to do the optimization when it makes sense, possibly under the control
> of global flags). Programmers are lousy at determining where and how the
> best use of machine resources can be made.

I only used the pragma above to follow the mindset of OpenMP under C. I 
agree it might not be the best way to do it in Ada.

I'm a little uncertain, though, about how well the compiler can be 
expected to find this sort of parallelization... at least with current 
technology. The compiler I was using for the program above, and it 
wasn't an ancient one, certainly had no idea how to do such things on 
its own.

In a high performance application nested loops are common and often the 
body of a loop calls a subprogram implemented in a library that itself 
has loops. I don't want all of these nested loops parallelized because 
that would create huge overheads. Yet without detailed semantic 
information about what the library subprograms do, I'm not sure how the 
compiler can know it's safe to parallelize the top level loop.

I'm not an expert in writing parallelizing compilers for sure, but it 
seemed to me, when I was experimenting with OpenMP, that it did a nice 
job of taking care of the grunt work while still allowing me to apply my 
broad knowledge of the application to find good places to parallelize.

I certainly could have written my earlier program with tasks. In fact I 
had a version that used threads before I tried OpenMP. It worked but it 
was ugly and a bit flaky. Doing the job with one line was certainly a 
lot nicer and proved to be more reliable (and faster running too, in my 
case).

Peter




* Re: Ada and OpenMP
  2013-03-07 23:42     ` Randy Brukardt
  2013-03-08  0:39       ` Peter C. Chapin
@ 2013-03-08  1:15       ` Shark8
  2013-03-08  3:42         ` Randy Brukardt
  2013-03-08  7:37       ` Simon Wright
  2013-03-10 18:00       ` Waldek Hebisch
  3 siblings, 1 reply; 26+ messages in thread
From: Shark8 @ 2013-03-08  1:15 UTC (permalink / raw)


On Thursday, March 7, 2013 4:42:59 PM UTC-7, Randy Brukardt wrote:
> 
> Pragmas, IMHO, are the worst way to do anything. Compiler writers tend to 
> use them because they can do so without appearing to modify the language, 
> but it's all an illusion: the program probably won't work right without the 
> pragma, so you're still locked into that particular vendor. Might as well 
> have done it right in the first place (and make a proposal to the ARG, 
> backed with practice, so it can get done right in the next version of Ada).

I'm not sure I totally agree with your sentiment; IIRC, Pragmas in Ada were supposed to be parentheticals to the compiler that were not essential to program correctness -- such as Optimize or the source-printing pragma Page.

That sentiment was thwarted with the inclusion of representation pragmas. -- However, the Ada specification *does* allow for implementation-defined pragmas. It seems to me that such pragmas would be ideal for experimental or compiler-specific code generation. The previous poster's "pragma OMP(...)" is an excellent example (though I think the Parallel_For parameter is stupid; the feature [OMP] should be turned on/off as needed, letting the compiler determine the optimal technique to use). {A program written in such a way would be correct even if the code were ported to a non-OMP compiler, provided the underlying algorithm was correct.}




* Re: Ada and OpenMP
  2013-03-08  0:39       ` Peter C. Chapin
@ 2013-03-08  3:31         ` Randy Brukardt
  2013-03-08  7:17           ` Simon Wright
  2013-03-08 12:07           ` Peter C. Chapin
  2013-03-08 14:40         ` Rego, P.
  1 sibling, 2 replies; 26+ messages in thread
From: Randy Brukardt @ 2013-03-08  3:31 UTC (permalink / raw)


"Peter C. Chapin" <PChapin@vtc.vsc.edu> wrote in message 
news:za6dnTU03LGyrqTM4p2dnAA@giganews.com...

>> Isn't OpenMP aimed at SIMD-type machines (as in video processors), as
>> opposed to generalized cores as in typical Intel and ARM designs?
>> Fine-grained parallelism doesn't make much sense on the latter, because
>> cache coherence and core scheduling issues will eat up gains in almost 
>> all
>> circumstances. Ada tasks are a much better model.
>
>Well, I used OpenMP for a program targeting x64 architectures and it worked 
>well in my case. It was easy to use: my program became 8x faster by the 
>addition of a single line of source text. It even computed the right 
>answer. My program was very well suited to the OpenMP model of computation, 
>however, so I wouldn't expect such a dramatic result in all cases of 
>course.

But (based on the rest of your note) that isn't "fine-grained parallelism". You 
called a bunch of expensive library functions in the loop, and thus your 
actual computations are large enough that the mechanism would work well. But 
so would an arrangement like Paraffin (with a bit more code 
rearrangement).

...
>> Well, this doesn't make much sense. If the pragma doesn't change the
>> semantics of the loop, then it's not necessary at all (the compiler can 
>> and
>> ought to do the optimization when it makes sense, possibly under the 
>> control
>> of global flags). Programmers are lousy at determining where and how the
>> best use of machine resources can be made.
>
> I only used the pragma above to follow the mindset of OpenMP under C. I 
> agree it might not be the best way to do it in Ada.
>
> I'm a little uncertain, though, about how well the compiler can be 
> expected to find this sort of parallelization... at least with current 
> technology. The compiler I was using for the program above, and it wasn't 
> an ancient one, certainly had no idea how to do such things on its own.

Well, the problem is that if you follow Ada semantics (which are sequential 
for loops), you probably can't parallelize even if you have a pragma. That's 
because the sequential semantics are observable through exceptions: an 
exception raised after the second iteration of a loop had better not leave 
behind anything that the fourth iteration would have done.
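
A contrived illustration of what I mean (the handler and the element that
overflows are of course invented for the example):

procedure Observable_Order is
   A : array (1 .. 10) of Natural := (others => 0);
begin
   A (6) := Natural'Last;           --  iteration 6 will overflow
   for I in A'Range loop
      A (I) := A (I) + 1;           --  raises Constraint_Error when I = 6
   end loop;
exception
   when Constraint_Error =>
      --  With sequential semantics, A (1 .. 5) has been incremented and
      --  A (7 .. 10) is untouched, and this handler (or code after it) is
      --  entitled to observe exactly that.  A loop chunked across threads
      --  could leave any mixture of updated and untouched elements, so the
      --  transformation would be visible.
      null;
end Observable_Order;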

And if you want to make such a major change to Ada semantics, writing a 
pragma is NOT the right way, IMHO. It would be many times better for Ada to 
simply have parallel loops:

     for I in 1 .. 10 loop in parallel
         ...
     end loop;

In which case depending upon the sequential execution wouldn't be allowed. 
(There also would have to be some restrictions on what the "..." could be). 
We're exploring some such ideas for a future version of Ada, and it would be 
nice if some trial implementations appeared.

> In a high performance application nested loops are common and often the 
> body of a loop calls a subprogram implemented in a library that itself has 
> loops. I don't want all of these nested loops parallelized because that 
> would create huge overheads.

This is my point: "fine-grained parallelism" means that *everything* is 
(potentially) parallelized. (See Tucker's Parasail for an example.) You're 
essentially saying that it doesn't work.

Also, there is a strong argument that you're prematurely optimizing your 
code if you're worrying about "overheads" created. The compiler can figure 
these out far better than a human can -- when you do it, you're only 
guessing -- a compiler has a lot more information with which to decide. It's 
best to tell the compiler where you don't care about exceptions (for 
instance) and let it pick the best parallelization.

> Yet without detailed semantic information about what the library 
> subprograms do, I'm not sure how the compiler can know it's safe to 
> parallelize the top level loop.

The compiler *has* to have such information, or it can't do *anything* 
useful. By anything, I mean optimizations (both sequential or parallel), 
proofs, static checking, and the like. Either it has to have access to the 
bodies, or it needs *strong* contracts covering everything. We proposed all 
of those for Ada 2012, but we didn't have the energy to finish the global 
in/out contracts. You can get a bit of information from "Pure", but that's 
about it.
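
For illustration only, here is the shape of the information I mean, written
with the SPARK-style Global aspect (that aspect is not part of Ada 2012, and
the package and its names are made up):

package Imaging is

   Call_Count : Natural := 0;

   --  No side effects and no global state: given a contract like this, a
   --  compiler knows that calls made in different iterations are independent.
   function Brighten (P : Integer) return Integer
     with Global => null;

   --  Reads and writes package state, so iterations that call this cannot
   --  be overlapped without further analysis.
   procedure Log_Pixel (P : Integer)
     with Global => (In_Out => Call_Count);

end Imaging;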

For Janus/Ada, we always assume a subprogram can do anything, and that 
prevents about 98% of optimizations from happening across subprogram calls. 
But that simply isn't acceptable today, especially with Ada 2012's 
assertions (you have to be able to remove redundant assertion checks to make 
the cost cheap enough that they don't need to be left on all the time). And 
parallelization is a similar situation: the compiler needs to know about 
side-effects of every function, in detail, before it can generate code to 
take advantage of modern features.

> I'm not an expert in writing parallelizing compilers for sure, but it 
> seemed to me, when I was experimenting with OpenMP, that it did a nice job 
> of taking care of the grunt work while still allowing me to apply my broad 
> knowledge of the application to find good places to parallelize.

I suspect it works better with C, which doesn't have error semantics. In 
that case, how the loop is implemented doesn't really matter. Similarly, no 
one cares that you can easily introduce hard to find bugs if there is any 
overlap between your iterations. Ada is different in both of these 
regards -- new ways to introduce erroneous execution are not tolerated by 
most Ada users.

In any case, these sorts of things are hacks to use until the compilers and 
languages catch up. Programs shouldn't be specifying in-lining or 
parallelism in detail; at most, some hints might be provided. Compilers do 
this sort of grunt work a whole lot better than humans.

                                                Randy.






* Re: Ada and OpenMP
  2013-03-08  1:15       ` Shark8
@ 2013-03-08  3:42         ` Randy Brukardt
  2013-03-08 14:53           ` Rego, P.
  2013-03-08 16:52           ` Shark8
  0 siblings, 2 replies; 26+ messages in thread
From: Randy Brukardt @ 2013-03-08  3:42 UTC (permalink / raw)


"Shark8" <onewingedshark@gmail.com> wrote in message 
news:3e01ac49-4427-4f50-8577-8edab7e539a6@googlegroups.com...
On Thursday, March 7, 2013 4:42:59 PM UTC-7, Randy Brukardt wrote:
>>
>> Pragmas, IMHO, are the worst way to do anything. Compiler writers tend to
>> use them because they can do so without appearing to modify the language,
>> but it's all an illusion: the program probably won't work right without 
>> the
>> pragma, so you're still locked into that particular vendor. Might as well
>> have done it right in the first place (and make a proposal to the ARG,
>> backed with practice, so it can get done right in the next version of 
>> Ada).
>
>I'm not sure I totally agree with your sentiment; IIRC, Pragmas in Ada were 
>supposed
>to be parentheticals to the compiler that were not essential to program 
>correctness -- such
>as Optimize or the source-printing pragma Page.
>
>That sentiment was thwarted with the inclusion of representation pragmas.

Which are now obsolescent. We've finally removed most of a basic mistake in 
the design of the language. It would be a pity to bring it back.

> -- However, the Ada specification *does* allow for implementation-defined 
> pragmas.
> It seems to me that such would be ideal for experimental or 
> compiler-specific
> code-generation.

That's fine if that is all that it does. But that's not possible here.

> The previous poster's "Pragma OMP(...)" is an excellent example

No, it changes the semantics drastically. Exceptions don't work, I/O and 
progress counters have to be avoided in the loop, etc.

> (though I think the FOR_LOOP parameter is stupid, the feature [OMP] should 
> be
> turned on/off as needed, letting the compiler determine the optimal 
> technique to use.
> {A program written in such a way would be correct even if the code were 
> ported
> to a non-OMP compiler, provided the underlying algorithm was correct.}

In order for that to be the case, the pragma would have to make various 
constructs illegal in the loop and in the surrounding code (exception 
handlers, any code where one iteration of the loop depends on the next, 
erroneous use of shared variables). But a pragma shouldn't be changing the 
legality rules of the language. (And it's not clear this would really fix 
the problem.)

Alternatively, the parallel version could simply be erroneous if any of 
those things happened. But that means you have no idea what will happen, and 
you've introduced new forms of erroneousness (meaning that there is no 
chance that the pragma would ever be standardized).

Ada is about doing things right, and that should be true even for 
implementation-defined stuff. And we *need* people to figure out good ways 
of doing these things (for instance, a "parallel" classification for 
functions would be very helpful). The sloppy way helps little.

                                       Randy "No New Pragmas" Brukardt






* Re: Ada and OpenMP
  2013-03-08  3:31         ` Randy Brukardt
@ 2013-03-08  7:17           ` Simon Wright
  2013-03-08 23:40             ` Randy Brukardt
  2013-03-08 12:07           ` Peter C. Chapin
  1 sibling, 1 reply; 26+ messages in thread
From: Simon Wright @ 2013-03-08  7:17 UTC (permalink / raw)


"Randy Brukardt" <randy@rrsoftware.com> writes:

> you have to be able to remove redundant assertion checks to make the
> cost cheap enough that they don't need to be left on all the time

*off* all the time?




* Re: Ada and OpenMP
  2013-03-07 23:42     ` Randy Brukardt
  2013-03-08  0:39       ` Peter C. Chapin
  2013-03-08  1:15       ` Shark8
@ 2013-03-08  7:37       ` Simon Wright
  2013-03-10 18:00       ` Waldek Hebisch
  3 siblings, 0 replies; 26+ messages in thread
From: Simon Wright @ 2013-03-08  7:37 UTC (permalink / raw)


"Randy Brukardt" <randy@rrsoftware.com> writes:

> Pragmas, IMHO, are the worst way to do anything. Compiler writers tend
> to use them because they can do so without appearing to modify the
> language, but it's all an illusion: the program probably won't work
> right without the pragma, so you're still locked into that particular
> vendor.

You'd be just as locked in with implementation-defined aspects (GNAT
certainly has these).

But at least another compiler would have to fail if it didn't support
the aspect at all (you'd be as badly off as with pragmas if it did, but
with different semantics).




* Re: Ada and OpenMP
  2013-03-07 23:43     ` Georg Bauhaus
@ 2013-03-08 10:18       ` Georg Bauhaus
  0 siblings, 0 replies; 26+ messages in thread
From: Georg Bauhaus @ 2013-03-08 10:18 UTC (permalink / raw)


On 08.03.13 00:43, Georg Bauhaus wrote:
> "Peter C. Chapin" <PChapin@vtc.vsc.edu> wrote:
>> OpenMP is a different animal than Ada tasks. It provides fine grained
>> parallelism where, for example, it is possible to have the compiler
>> automatically parallelize a loop. In C:
>>
>> #pragma omp parallel for
>> for( i = 0; i < MAX; ++i ) {
>>    array[i]++;
>> }
>
> Fortunately, OpenMP is no longer needed to achieve automatic
> parallelism in either C or Ada at the low level. GCC's vectorizer
> produces code that runs in parallel for a number of loop patterns.
> These are documented, and they work in GNAT GPL or more
> recent FSF GNATs. Later 4.7s IIRC.

For example, adding -ftree-vectorize to the set of options (-O2 ...)
increases the speed of the program below by a factor of up to 3, depending
to some extent on the value of MAX. (Option -O3 is even easier in this
case, and yields improvements when MAX = 8.)

The assembly listing includes instructions like MOVDQA and PADDD used
with SSE registers.

GNAT will report successful optimizations when -fopt-info-optimized
is among the switches (or -ftree-vectorizer-verbose=2 for older GNATs).

package Fast is

    MAX : constant := 50;

    subtype Number is Integer;
    type Index is new Natural range 0 .. MAX;

    type Vect is array (Index) of Number;

    procedure Inc_Array (V : in out Vect);

end Fast;

package body Fast is

    procedure Inc_Array (V : in out Vect) is
    begin
       for K in Index loop
          V (K) := V (K) + 1;
       end loop;
    end Inc_Array;

end Fast;


with Ada.Real_Time;    use Ada.Real_Time;
with Ada.Text_IO;
with Fast;    use Fast;
procedure Test_Fast is
    Start, Finish : Time;
    Data : Vect;
    Result : Integer := 0;
    pragma Volatile (Result);
begin
    Start := Clock;
    for Run in 1 .. 500_000_000/MAX loop
       Inc_Array (Data);
       if Data (Index(MAX/2 + Run mod MAX/2)) rem 2 = 1 then
          Result := 1;
       end if;
    end loop;
    Finish := Clock;
    Ada.Text_IO.Put_Line
      (Duration'Image (To_Duration (Finish - Start)));
end Test_Fast;






* Re: Ada and OpenMP
  2013-03-08  3:31         ` Randy Brukardt
  2013-03-08  7:17           ` Simon Wright
@ 2013-03-08 12:07           ` Peter C. Chapin
  1 sibling, 0 replies; 26+ messages in thread
From: Peter C. Chapin @ 2013-03-08 12:07 UTC (permalink / raw)




On 03/07/2013 10:31 PM, Randy Brukardt wrote:

> But (based on the rest of your note) that isn't "fine-grained parallelism". You
> called a bunch of expensive library functions in the loop, and thus your
> actual computations are large enough that the mechanism would work well. But
> so would an arrangement like Paraffin (with a bit more code
> rearrangement).

Yes, "fine-grained" was probably a poor choice of words. Yet splitting 
loops, at any level, into parallel tasks is tedious to do manually. I 
think this is where you're coming from when you say it's something the 
compiler should be doing.

I don't disagree with you. My post was in response to an earlier question 
wondering whether OpenMP-style parallelism was really necessary in a language 
with explicit tasking. My answer is "yes," but if it's implemented by 
having the compiler parallelize things automatically, then all the 
better. Either way the programmer doesn't have to roll out the machinery 
of tasking to manually parallelize loops.

From your post you seem optimistic about the capabilities of current and 
upcoming compilers, and you would know more about that than I do. Indeed, 
I look forward to using such compilers and would be happy to provide 
whatever hints they need via aspects, or annotations, or pragmas, or 
whatever might be necessary. Alas, when I wrote the program I mentioned 
before, the necessary compiler technology to do automatic loop 
parallelization was not available to me. Hence OpenMP.

Maybe OpenMP is obsolete or maybe it's not. After all, the pragmas it 
uses (in the C world) are a form of annotation. They may not be the best 
way to do things, especially in Ada, but ultimately it seems like the 
programmer would have to be involved in some way.

Peter




* Re: Ada and OpenMP
  2013-03-07 22:22   ` Peter C. Chapin
  2013-03-07 23:42     ` Randy Brukardt
  2013-03-07 23:43     ` Georg Bauhaus
@ 2013-03-08 14:24     ` Rego, P.
  2 siblings, 0 replies; 26+ messages in thread
From: Rego, P. @ 2013-03-08 14:24 UTC (permalink / raw)


On Thursday, March 7, 2013, at 19:22:03 UTC-3, Peter C. Chapin wrote:
> OpenMP is a different animal than Ada tasks. It provides fine grained 
> parallelism where, for example, it is possible to have the compiler 
> automatically parallelize a loop. In C:
> #pragma omp parallel for
> for( i = 0; i < MAX; ++i ) {
>    array[i]++;
> }
> The compiler automatically splits the loop iterations over an 
> "appropriate" number of threads (probably based on the number of cores).
> In Ada one might write, perhaps
> pragma OMP(Parallel_For)
> for I in 1 .. MAX loop
>    A(I) := A(I) + 1
> end loop;
> Doing this with Ada tasks in such a way that it uses an optimal number 
> of threads on each execution (based on core count) would be much more 
> complicated, I should imagine. Please correct me if I'm wrong!
> OpenMP has various other features, some of which could be done naturally 
> with tasks, but much of what OpenMP is about is semi-automatic fine 
> grained parallelization. It is to Ada tasking what Ada tasking is to the 
> explicit handling of locks, etc.
> Peter

Yes, that's the idea.




* Re: Ada and OpenMP
  2013-03-08  0:39       ` Peter C. Chapin
  2013-03-08  3:31         ` Randy Brukardt
@ 2013-03-08 14:40         ` Rego, P.
  1 sibling, 0 replies; 26+ messages in thread
From: Rego, P. @ 2013-03-08 14:40 UTC (permalink / raw)


> > Isn't OpenMP aimed at SIMD-type machines (as in video processors), as
> > opposed to generalized cores as in typical Intel and ARM designs?
> > Fine-grained parallelism doesn't make much sense on the latter, because
> > cache coherence and core scheduling issues will eat up gains in almost all
> > circumstances. Ada tasks are a much better model.
> 
> Well, I used OpenMP for a program targeting x64 architectures and it 
> worked well in my case. It was easy to use: my program became 8x faster 
> by the addition of a single line of source text. It even computed the 
> right answer. My program was very well suited to the OpenMP model of 
> computation, however, so I wouldn't expect such a dramatic result in all 
> cases of course.

In my case, I'm using OpenMP on a parallel computer (I don't know the details of its architecture). But I would also like to use it on my i7 x64 with 6/4 cores, just out of curiosity; sure, it's not a supermachine, but it would be good to compare the performance of the same algorithms on it. The idea of using Ada came up because, while studying OpenMP, I found several parallel OpenMP concepts very similar to the Ada tasking scheme, so why not (if it could be done).





* Re: Ada and OpenMP
  2013-03-08  3:42         ` Randy Brukardt
@ 2013-03-08 14:53           ` Rego, P.
  2013-03-08 15:47             ` Georg Bauhaus
  2013-03-08 23:40             ` Randy Brukardt
  2013-03-08 16:52           ` Shark8
  1 sibling, 2 replies; 26+ messages in thread
From: Rego, P. @ 2013-03-08 14:53 UTC (permalink / raw)


> On Thursday, March 7, 2013 4:42:59 PM UTC-7, Randy Brukardt wrote:

> Ada is about doing things right, and that should be true even for 
> implementation-defined stuff. And we *need* people to figure out good ways 
> of doing these things (for instance, a "parallel" classification for 
> functions would be very helpful). The sloppy way helps little.

Got your point. 
Would you have a suggestion on how I could do a loop "parallelization" like

pragma OMP(Parallel_For) 
for I in 1 .. MAX loop 
   A(I) := A(I) + 1 
end loop; 

but without using pragmas?




* Re: Ada and OpenMP
  2013-03-08 14:53           ` Rego, P.
@ 2013-03-08 15:47             ` Georg Bauhaus
  2013-03-08 23:40             ` Randy Brukardt
  1 sibling, 0 replies; 26+ messages in thread
From: Georg Bauhaus @ 2013-03-08 15:47 UTC (permalink / raw)


On 08.03.13 15:53, Rego, P. wrote:
>> On Thursday, March 7, 2013 4:42:59 PM UTC-7, Randy Brukardt wrote:
> 
>> Ada is about doing things right, and that should be true even for 
>> implementation-defined stuff. And we *need* people to figure out good ways 
>> of doing these things (for instance, a "parallel" classification for 
>> functions would be very helpful). The sloppy way helps little.
> 
> Got your point. 
> Would you have a suggestion on how I could do a loop "parallelization" like
> 
> pragma OMP(Parallel_For) 
> for I in 1 .. MAX loop 
>    A(I) := A(I) + 1 
> end loop; 

If A is an array of small objects, then with GNAT on Intel,
you can turn on -ftree-vectorize (or -O3) and see what this gives.

Adding -ftree-vectorizer-verbose=2 (old) or -fopt-info-optimized
instructs GCC to report successful vectorizations.





* Re: Ada and OpenMP
  2013-03-08  3:42         ` Randy Brukardt
  2013-03-08 14:53           ` Rego, P.
@ 2013-03-08 16:52           ` Shark8
  2013-03-08 23:36             ` Randy Brukardt
  1 sibling, 1 reply; 26+ messages in thread
From: Shark8 @ 2013-03-08 16:52 UTC (permalink / raw)


On Thursday, March 7, 2013 8:42:15 PM UTC-7, Randy Brukardt wrote:
> 
> In order for that to be the case, the pragma would have to make various 
> constructs illegal in the loop and in the surrounding code (exception 
> handlers, any code where one iteration of the loop depends on the next, 
> erroneous use of shared variables). But a pragma shouldn't be changing the 
> legality rules of the language. (And it's not clear this would really fix 
> the problem.)

Why would that have to change the semantics of the program? Since there would have to be a non-implementation-defined code-generation method (for when the pragma was off), the compiler should just use that whenever those constructs are used.

It limits the usefulness, yes; but there's no reason to assume that it should cause erroneous execution. -- I suppose that in early versions of an experimental implementation there might be some, because of some case that was overlooked. -- Though this does seem to bring us back to the idea of exceptions (and how to indicate/assert to the compiler that none, or only a limited set, can be raised).

> Alternatively, the parallel version could simply be erroneous if any of
> those things happened. But that means you have no idea what will happen, and
> you've introduced new forms of erroneousness (meaning that there is no
> chance that the pragma would ever be standardized).
>
> Ada is about doing things right, and that should be true even for
> implementation-defined stuff. And we *need* people to figure out good ways
> of doing these things (for instance, a "parallel" classification for
> functions would be very helpful). The sloppy way helps little.
>
>                                        Randy "No New Pragmas" Brukardt





* Re: Ada and OpenMP
  2013-03-07 22:52 ` Simon Wright
@ 2013-03-08 21:37   ` Brad Moore
  0 siblings, 0 replies; 26+ messages in thread
From: Brad Moore @ 2013-03-08 21:37 UTC (permalink / raw)


On 07/03/2013 3:52 PM, Simon Wright wrote:
> "Rego, P." <pvrego@gmail.com> writes:
>
>> I'm trying some exercises in parallel computing using the pragmas
>> from OpenMP in C, but it would be good to use them in Ada as well. Is
>> it possible to use the OpenMP pragmas in Ada? And... does GNAT GPL
>> support them?
>
> GNAT doesn't support OpenMP pragmas.
>
> But you might take a look at Paraffin:
> http://sourceforge.net/projects/paraffin/
>

To give an example using the Paraffin libraries, the following code shows
the same problem executed sequentially and then executed with Paraffin.

with Ada.Real_Time;    use Ada.Real_Time;
with Ada.Command_Line;
with Ada.Text_IO; use Ada.Text_IO;
with Parallel.Iteration.Work_Stealing;

procedure Test_Loops is

    procedure Integer_Loops is new
      Parallel.Iteration.Work_Stealing (Iteration_Index_Type => Integer);

    Start : Time;

    Array_Size : Natural := 50;
    Iterations : Natural := 10_000_000;

begin

    --  Allow first command line parameter to override default iteration count
    if Ada.Command_Line.Argument_Count >= 1 then
       Iterations := Integer'Value (Ada.Command_Line.Argument (1));
    end if;

    --  Allow second command line parameter to override default array size
    if Ada.Command_Line.Argument_Count >= 2 then
       Array_Size := Integer'Value (Ada.Command_Line.Argument (2));
    end if;

    Data_Block : declare
       Data : array (1 .. Array_Size) of Natural := (others => 0);
    begin

       --  Sequential Version of the code, any parallelization must be auto
       --  generated by the compiler

       Start := Clock;

       for I in Data'Range loop
          for J in 1 .. Iterations loop
             Data (I) := Data (I) + 1;
          end loop;
       end loop;

       Put_Line ("Sequential Elapsed=" & Duration'Image (To_Duration 
(Clock - Start)));

       Data := (others => 0);
       Start := Clock;

       --  Parallel version of the code, explicitly parallelized using Paraffin
       declare

          procedure Iterate (First : Integer; Last : Integer) is
          begin
             for I in First .. Last loop
                for J in 1 .. Iterations loop
                   Data (I) := Data (I) + 1;
                end loop;
             end loop;
          end Iterate;

       begin
          Integer_Loops (From         => Data'First,
                         To           => Data'Last,
                         Worker_Count => 4,
                         Process      => Iterate'Access);
       end;

       Put_Line ("Parallel Elapsed=" & Duration'Image (To_Duration 
(Clock - Start)));

    end Data_Block;

end Test_Loops;

When run on my machine (an AMD quad-core) with parameters 100_000 100_000, 
and with full optimization plus -ftree-vectorize turned on, I get:

Sequential Elapsed= 6.874298000
Parallel Elapsed= 6.287230000

With optimization turned off, I get

Sequential Elapsed= 32.428908000
Parallel Elapsed= 8.424717000

GCC with GNAT does a good job of optimization when it's enabled, for 
these cases as shown, but the differences between compiler optimization and 
using Paraffin can be more pronounced in other cases that are more complex, 
such as loops that involve reduction (e.g. calculating a sum).
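
Not Paraffin code, but to show what I mean by a reduction being more work
when done by hand: below is a bare-bones parallel sum using plain tasks and
a protected accumulator (the worker count and all the names are arbitrary).
Paraffin's generics package this kind of machinery up for you.

with Ada.Text_IO; use Ada.Text_IO;

procedure Parallel_Sum is

   N            : constant := 100_000;
   Worker_Count : constant := 4;
   Chunk        : constant := N / Worker_Count;

   type Vector is array (1 .. N) of Integer;
   Data : constant Vector := (others => 1);

   --  The shared accumulator must be protected; this merge step is what
   --  distinguishes a reduction from the independent updates shown earlier.
   protected Accumulator is
      procedure Add (X : in Integer);
      function Total return Integer;
   private
      Sum : Integer := 0;
   end Accumulator;

   protected body Accumulator is
      procedure Add (X : in Integer) is
      begin
         Sum := Sum + X;
      end Add;
      function Total return Integer is
      begin
         return Sum;
      end Total;
   end Accumulator;

   task type Worker (First, Last : Natural);

   task body Worker is
      Local : Integer := 0;
   begin
      for I in First .. Last loop
         Local := Local + Data (I);
      end loop;
      Accumulator.Add (Local);   --  one protected call per worker
   end Worker;

begin
   declare
      --  Each worker sums its own slice; the block waits for all of them.
      W1 : Worker (1,             Chunk);
      W2 : Worker (Chunk + 1,     2 * Chunk);
      W3 : Worker (2 * Chunk + 1, 3 * Chunk);
      W4 : Worker (3 * Chunk + 1, N);
   begin
      null;
   end;
   Put_Line ("Sum =" & Integer'Image (Accumulator.Total));
end Parallel_Sum;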

Brad




* Re: Ada and OpenMP
  2013-03-08 16:52           ` Shark8
@ 2013-03-08 23:36             ` Randy Brukardt
  2013-03-09  4:13               ` Brad Moore
  0 siblings, 1 reply; 26+ messages in thread
From: Randy Brukardt @ 2013-03-08 23:36 UTC (permalink / raw)


"Shark8" <onewingedshark@gmail.com> wrote in message 
news:9e0bbbdf-ccfa-4d4c-90af-2d56d46242b3@googlegroups.com...
>On Thursday, March 7, 2013 8:42:15 PM UTC-7, Randy Brukardt wrote:
>>
>> In order for that to be the case, the pragma would have to make various
>> constructs illegal in the loop and in the surrounding code (exception
>> handlers, any code where one iteration of the loop depends on the next,
>> erroneous use of shared variables). But a pragma shouldn't be changing 
>> the
>> legality rules of the language. (And it's not clear this would really fix
>> the problem.)
>
>Why would that have to change the semantics of the program: since there 
>would have
>to be a non-implementation-defined code-generation method (for when the 
>pragma
>was off) the compiler should just use that if those constructs are used.

Mainly because 95% of Ada code is going to fail such tests; it would 
virtually never be able to use the fancy code.

Take the OP's example, for example:

for I in 1 .. MAX loop
   A(I) := A(I) + 1; -- Can raise Constraint_Error because of overflow or
                     -- range checks.
end loop;

This can be done in parallel only if (A) there is no exception handler for 
Constraint_Error or others anywhere in the program; or (B) pragma Suppress 
applies to the loop (nasty, we never, ever want an incentive to use 
Suppress); or (C) no exception handler or code following the handler can 
ever access A (generally only possible if A is a local variable, not a 
parameter or global). For some loops there would be a (D): being able to prove 
from subtypes and constraints that no exception can happen -- but that is 
never possible for increment or decrement operations like the above. These 
conditions aren't going to happen that often, and unless a compiler has 
access to the source code for the entire program, (A) isn't possible to 
determine anyway.

And if the compiler is going to go through all of that anyway, it might as 
well just do it whenever it can, no pragma is necessary or useful.

The whole advantage of having a "marker" here is to allow a change in the 
semantics in the error case. If you're not going to do that, you're hardly 
ever going to be able to parallelize, so what's the point of a pragma?

                                       Randy.








* Re: Ada and OpenMP
  2013-03-08 14:53           ` Rego, P.
  2013-03-08 15:47             ` Georg Bauhaus
@ 2013-03-08 23:40             ` Randy Brukardt
  1 sibling, 0 replies; 26+ messages in thread
From: Randy Brukardt @ 2013-03-08 23:40 UTC (permalink / raw)


"Rego, P." <pvrego@gmail.com> wrote in message 
news:f6fff4ba-e9a1-4335-a4a0-cb9d60152ad9@googlegroups.com...
>> On Thursday, March 7, 2013 4:42:59 PM UTC-7, Randy Brukardt wrote:
>
>> Ada is about doing things right, and that should be true even for
>> implementation-defined stuff. And we *need* people to figure out good 
>> ways
>> of doing these things (for instance, a "parallel" classification for
>> functions would be very helpful). The sloppy way helps little.
>
> Got your point.
> Would you have a suggestion on how I could do a loop "parallelization" 
> like
>
> pragma OMP(Parallel_For)
> for I in 1 .. MAX loop
>   A(I) := A(I) + 1
> end loop;
>
> but without using pragmas?

(1) Use a compiler that does this automatically (apparently GNAT does this 
in some circumstances).
(2) Use a library like Paraffin; a bit less convenient, but it will work on 
any Ada compiler for any target. Some of the Ada 2012 features may make such 
a library more convenient to write (I haven't been keeping up with Brad's 
work on this).
(3) Use a compiler with an appropriate extension for parallel loops. One 
possibility would be something like:

for I in 1 .. MAX loop in parallel
  A(I) := A(I) + 1
end loop;

This of course ties you to a particular implementation, or to wait for Ada 
202x. Of course, so does a pragma, and it's much less likely to be 
standardized. So I suggest (1) or (2).

                                              Randy.






* Re: Ada and OpenMP
  2013-03-08  7:17           ` Simon Wright
@ 2013-03-08 23:40             ` Randy Brukardt
  0 siblings, 0 replies; 26+ messages in thread
From: Randy Brukardt @ 2013-03-08 23:40 UTC (permalink / raw)


"Simon Wright" <simon@pushface.org> wrote in message 
news:lya9qenyw9.fsf@pushface.org...
> "Randy Brukardt" <randy@rrsoftware.com> writes:
>
>> you have to be able to remove redundant assertion checks to make the
>> cost cheap enough that they don't need to be left on all the time
>
> *off* all the time?

Yes, of course, sorry.

                        Randy. 






* Re: Ada and OpenMP
  2013-03-08 23:36             ` Randy Brukardt
@ 2013-03-09  4:13               ` Brad Moore
  2013-03-10  4:24                 ` Randy Brukardt
  0 siblings, 1 reply; 26+ messages in thread
From: Brad Moore @ 2013-03-09  4:13 UTC (permalink / raw)


On 08/03/2013 4:36 PM, Randy Brukardt wrote:
> "Shark8" <onewingedshark@gmail.com> wrote in message
> news:9e0bbbdf-ccfa-4d4c-90af-2d56d46242b3@googlegroups.com...
>> On Thursday, March 7, 2013 8:42:15 PM UTC-7, Randy Brukardt wrote:
>>>
>>> In order for that to be the case, the pragma would have to make various
>>> constructs illegal in the loop and in the surrounding code (exception
>>> handlers, any code where one iteration of the loop depends on the next,
>>> erroneous use of shared variables). But a pragma shouldn't be changing
>>> the
>>> legality rules of the language. (And it's not clear this would really fix
>>> the problem.)
>>
>> Why would that have to change the semantics of the program: since there
>> would have
>> to be a non-implementation-defined code-generation method (for when the
>> pragma
>> was off) the compiler should just use that if those constructs are used.
>
> Mainly because 95% of Ada code is going to fail such tests; it would
> virtually never be able to use the fancy code.
>
> Take the OP's example, for example:
>
> for I in 1 .. MAX loop
>     A(I) := A(I) + 1; -- Can raise Constraint_Error because of overflow or
> range checks.
> end loop;
>
> This can be done in parallel only if (A) there is no exception handler for
> Constraint_Error or others anywhere in the program; or

I am working towards a new version of Paraffin to be released soon that 
handles exceptions in such loops (as well as a number of other features).

The technique, though, is to have the workers catch any exception that 
might have been raised in the user's code, and then call 
Ada.Exceptions.Save_Occurrence to save the exception to be raised later.

Once all workers have completed their work, and before returning to let the 
sequential code continue on, a check is made to see if any occurrences 
were saved. If so, Ada.Exceptions.Reraise_Occurrence is called, to
get the exception to appear in the same task that invoked the parallelism.
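
Stripped right down, the pattern is along these lines (not the actual
Paraffin code; in particular the saved occurrence would need to be
protected, and the "stop early" flag is left out):

with Ada.Exceptions; use Ada.Exceptions;

procedure Exception_Relay_Sketch is

   --  Written by a failing worker, read by the task that started the work.
   Saved : Exception_Occurrence;

begin
   Save_Occurrence (Target => Saved, Source => Null_Occurrence);

   declare
      task Worker;

      task body Worker is
      begin
         raise Program_Error with "failure inside the parallel region";
      exception
         when E : others =>
            --  Don't let the exception silently kill the task; stash it.
            Save_Occurrence (Target => Saved, Source => E);
      end Worker;
   begin
      null;   --  the block does not exit until Worker has terminated
   end;

   --  Back in the invoking task: re-raise whatever a worker caught, so the
   --  caller sees the exception as though the loop had run sequentially.
   if Exception_Identity (Saved) /= Null_Id then
      Reraise_Occurrence (Saved);
   end if;
end Exception_Relay_Sketch;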

Testing so far indicates this seems to work well, maintaining the 
exception abstraction as though the code were being executed 
sequentially. Currently only the most recent exception is saved, so if
more than one exception is raised by the parallel workers, only one will
get fed back to the calling task, but I think that's OK, as that would 
have been the behaviour for the sequential case. Such an exception also 
sets a flag indicating the work is complete, which attempts to get other 
workers to abort their work as soon as possible. Also, under GNAT at 
least, this exception handling logic doesn't appear to impact 
performance. Apparently they use zero-cost exception handling, which 
might be why. I'm not sure what sort of impact that might have on other 
implementations that model exceptions differently. Hopefully, it 
wouldn't be a significant impact.

Brad

> (B) pragma Suppress
> applies to the loop (nasty, we never, ever want an incentive to use
> Suppress); or (C) no exception handler or code following the handler can
> ever access A (generally only possible if A is a local variable, not a
> parameter or global). For some loops there would be a (D) be able to prove
> from subtypes and constraints that no exception can happen -- but that is
> never possible for increment or decrement operations like the above. These
> conditions aren't going to happen that often, and unless a compiler has
> access to the source code for the entire program, (A) isn't possible to
> determine anyway.
>
> And if the compiler is going to go through all of that anyway, it might as
> well just do it whenever it can, no pragma is necessary or useful.
>
> The whole advantage of having a "marker" here is to allow a change in the
> semantics in the error case. If you're not going to do that, you're hardly
> ever going to be able to parallelize, so what's the point of a pragma?
>
>                                         Randy.
>
>
>
>





* Re: Ada and OpenMP
  2013-03-09  4:13               ` Brad Moore
@ 2013-03-10  4:24                 ` Randy Brukardt
  0 siblings, 0 replies; 26+ messages in thread
From: Randy Brukardt @ 2013-03-10  4:24 UTC (permalink / raw)


"Brad Moore" <brad.moore@shaw.ca> wrote in message 
news:513AB6D3.6030106@shaw.ca...
> On 08/03/2013 4:36 PM, Randy Brukardt wrote:
...
>> Take the OP's example, for example:
>>
>> for I in 1 .. MAX loop
>>     A(I) := A(I) + 1; -- Can raise Constraint_Error because of overflow 
>> or
>> range checks.
>> end loop;
>>
>> This can be done in parallel only if (A) there is no exception handler 
>> for
>> Constraint_Error or others anywhere in the program; or
>
> I am working towards a new version of Paraffin to be released soon that 
> handles exceptions in such loops (as well as a number of other features).
>
> The technique though, is to have the workers catch any exception that 
> might have been raised in the users code, and then call 
> Ada.Exceptions.Save_Occurence to save the exception to be raised later.

I'd expect this to work fine - it's how I'd implement it if I were doing 
that. The issue, though, is that this changes the semantics of the loop WRT 
exceptions. Specifically, the parts of A that get modified would be 
unspecified, while that's not true for the sequential loop (the items that 
are modified have to be a contiguous group at the lower end of the array).

That's fine for Paraffin, because no one will accidentally use it expecting 
deterministic behavior. It's not so clear when you actually write the loop 
syntax. Which is why a parallel loop syntax seems valuable, as it would make 
it explicit that parallelism is expected (and would also allow checking for 
dependencies between iterations, which usually can't be allowed).

Of course, an alternative would be just to standardize a library like 
Paraffin for this purpose, possibly with some tie-in to the iterator syntax. 
(I know you proposed something along this line, but it was too late to include in Ada 
2012.)

                                        Randy.






* Re: Ada and OpenMP
  2013-03-07 23:42     ` Randy Brukardt
                         ` (2 preceding siblings ...)
  2013-03-08  7:37       ` Simon Wright
@ 2013-03-10 18:00       ` Waldek Hebisch
  3 siblings, 0 replies; 26+ messages in thread
From: Waldek Hebisch @ 2013-03-10 18:00 UTC (permalink / raw)


Randy Brukardt <randy@rrsoftware.com> wrote:
> "Peter C. Chapin" <PChapin@vtc.vsc.edu> wrote in message 
> news:hr-dnULuncyRjqTM4p2dnAA@giganews.com...
> > OpenMP is a different animal than Ada tasks. It provides fine grained 
> > parallelism where, for example, it is possible to have the compiler 
> > automatically parallelize a loop. In C:
> >
> > #pragma omp parallel for
> > for( i = 0; i < MAX; ++i ) {
> >   array[i]++;
> > }
> >
> > The compiler automatically splits the loop iterations over an 
> > "appropriate" number of threads (probably based on the number of cores).
> 
> Isn't OpenMP aimed at SIMD-type machines (as in video processors), as 
> opposed to generalized cores as in typical Intel and ARM designs? 
> Fine-grained parallelism doesn't make much sense on the latter, because 
> cache coherence and core scheduling issues will eat up gains in almost all 
> circumstances. Ada tasks are a much better model.

Actually OpenMP only looks like fine-grained parallelism, but is not:
OpenMP creates (and destroys) tasks as needed.  The main advantage of
OpenMP is that it automates some common parallel patterns, and
consequently the code stays much closer to the sequential version.

It is very hard to get a similar effect in a fully automatic way,
without pragmas.  Simply put, automatic parallelization takes losses
on fine-grained cases, and the pragmas tell the compiler that the code
is coarse enough to use several tasks.  Also, OMP pragmas control memory
consistency -- without them the compiler would have to assume the
worst case and generate slower code.

-- 
                              Waldek Hebisch
hebisch@math.uni.wroc.pl 



