comp.lang.ada
* Loops and parallel execution
@ 2011-01-25 15:40 Georg Bauhaus
  2011-01-25 16:37 ` Dmitry A. Kazakov
                   ` (3 more replies)
  0 siblings, 4 replies; 22+ messages in thread
From: Georg Bauhaus @ 2011-01-25 15:40 UTC


A quick idea.  Assume that some subprogram Op from package P
is reentrant (and does not depend on global state). Then,

with P;
...
   for K in all First .. Last loop
       P.Op (K);
   end loop;

should have the effect of the following being permitted:

(a) to pick K from  First .. Last  in any order

(b) to execute P.Op (J) in parallel with P.Op (K) for J, K from
First .. Last

The same would be allowed for sufficiently simple expressions:

   for K in all First .. Last loop
       L(K) := Standard."*" (K, 3);
   end loop;

Can this be borrowed from HPF (IIUC)?
Is pragma Pure (P) sufficient to signal reentrance?
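(For reference, what Pure would impose, as a minimal sketch of P's spec:)

   package P is
      pragma Pure (P);  --  no library-level state; may depend only on pure units
      procedure Op (K : Integer);
   end P;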




* Re: Loops and parallel execution
  2011-01-25 15:40 Loops and parallel execution Georg Bauhaus
@ 2011-01-25 16:37 ` Dmitry A. Kazakov
  2011-01-25 17:36   ` Georg Bauhaus
  2011-01-26  8:46 ` Egil Høvik
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 22+ messages in thread
From: Dmitry A. Kazakov @ 2011-01-25 16:37 UTC


On Tue, 25 Jan 2011 16:40:38 +0100, Georg Bauhaus wrote:

> A quick idea.  Assume that some subprogram Op from package P
> is reentrant (and does not depend on global state). Then,
> 
> with P;
> ...
>    for K in all First .. Last loop
>        P.Op (K);
>    end loop;
> 
> should have the effect of the following being permitted:
> 
> (a) to pick K from  First .. Last  in any order
> 
> (b) to execute P.Op (J) in parallel with P.Op (K) for J, K from
> First .. Last
> 
> The same would be allowed for sufficiently simple expressions:
> 
>    for K in all First .. Last loop
>        L(K) := Standard."*" (K, 3);
>    end loop;
> 
> Can this be borrowed from HPF (IIUC)?
> Is pragma Pure (P) sufficient to signal reentrance?

No, it is not sufficient, because it is wrong. P cannot be pure, because
all instances of P.Op must be synchronized at the end of the "loop." You
need some frame, a context relative to which P might become pure: an
"embedded task", thread, fiber, call it what you want. If you have that at
the language level, then it no longer matters whether the thing executed in
such a context is pure or not. This is similar to proper Ada tasks: you can
access shared data from a task as you wish, and if you do so
inconsistently, that is your problem (erroneous execution).

The point is: if you are going to somehow derive concurrency from a
sequentially written program using pragmas and a "mind-reading" compiler, I
doubt that could go anywhere. If you want to add light-weight,
embedded-in-code tasking constructs à la Occam, that might work, but I
don't think they would be very useful. You need to map them onto OS
services in order to gain anything, because normally there is no direct
access to the cores. That is not light-weight. Do you have a particular OS
in mind?

The thing you want, in present-day Ada:

   with Ada.Text_IO;  use Ada.Text_IO;  --  for Put_Line below

   task type Worker (Do_Me : not null access procedure (K : Integer)) is
      entry Op (K : Integer);
   end Worker;
   task body Worker is
      I : Integer;
   begin
      accept Op (K : Integer) do  --  rendezvous: take the work item
         I := K;
      end Op;
      Do_Me (I);  --  do the actual work outside the rendezvous
   end Worker;

   procedure Print (K : Integer) is
   begin
      Put_Line (Integer'Image (K));
   end Print;
...
   declare
      Threads : array (1 .. 20) of Worker (Print'Access);
   begin
      for K in Threads'Range loop
         Threads (K).Op (K);  --  hand each worker its index
      end loop;
   end;  --  the block waits here until all workers have terminated

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de




* Re: Loops and parallel execution
  2011-01-25 16:37 ` Dmitry A. Kazakov
@ 2011-01-25 17:36   ` Georg Bauhaus
  2011-01-25 17:38     ` Georg Bauhaus
  2011-01-25 21:32     ` Dmitry A. Kazakov
  0 siblings, 2 replies; 22+ messages in thread
From: Georg Bauhaus @ 2011-01-25 17:36 UTC


On 25.01.11 17:37, Dmitry A. Kazakov wrote:
> On Tue, 25 Jan 2011 16:40:38 +0100, Georg Bauhaus wrote:

>> Can this be borrowed from HPF (IIUC)?
>> Is pragma Pure (P) sufficient to signal reentrance?
> 
> No, it is not sufficient because it is wrong. P cannot be pure because all
> instances of P.Op must be synchronized at the end of the "loop." 

I was thinking of relatively small things, starting from this
observation:

If a compiler detects two independent paths inside a loop,
it will generate instruction sequences that, while written
sequentially, are executed in parallel by the processor,
because the processor can distribute the work across its
many registers and execution units.  Generalize that.
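
For instance (a toy sketch; A and B are hypothetical, disjoint
arrays of Float):

   for K in A'Range loop
      A (K) := A (K) * 2.0;  --  path 1
      B (K) := B (K) + 1.0;  --  path 2, independent of path 1
   end loop;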

The Paraffin library just posted by Brad Moore looks like
it will resolve all other issues nicely! :-)

On an AS-IF basis, I thought of a tasking profile much smaller than
Ravenscar: other than exiting with a value, each P.Op (K) has no need
for communication. Each P.Op (K) is associated with a result object,
like this (pseudocode):

   compiler_type Result_Object is limited record
       Ready : Boolean_Slot;  -- a CAS bit?
       Result : Value_Type;
   end record;
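
(In standard Ada, a comparable result slot might be sketched with an
atomic flag; Value_Type is the placeholder from above:)

   type Atomic_Boolean is new Boolean;
   pragma Atomic (Atomic_Boolean);

   type Result_Object is limited record
      Ready  : Atomic_Boolean := False;  --  set by the worker, polled by readers
      Result : Value_Type;               --  meaningful once Ready is True
   end record;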

The idea of employing Pure or something similar was to have
the compiler prevent each P.Op (K) from meddling in P.Op (J)'s
affairs, J /= K, as with protected objects, again on an
AS-IF basis, as much as possible.





* Re: Loops and parallel execution
  2011-01-25 17:36   ` Georg Bauhaus
@ 2011-01-25 17:38     ` Georg Bauhaus
  2011-01-25 21:32     ` Dmitry A. Kazakov
  1 sibling, 0 replies; 22+ messages in thread
From: Georg Bauhaus @ 2011-01-25 17:38 UTC


On 25.01.11 18:36, Georg Bauhaus wrote:

> The idea of employing Pure or something similar was to have
> the compiler prevent each P.Op (K) from meddling in P.Op (J)'s
> affairs, J /= K, as with protected objects, again on an
> AS-IF basis, as much as possible.

IOW, be better than OpenMP!




* Re: Loops and parallel execution
  2011-01-25 17:36   ` Georg Bauhaus
  2011-01-25 17:38     ` Georg Bauhaus
@ 2011-01-25 21:32     ` Dmitry A. Kazakov
  2011-01-25 22:07       ` Georg Bauhaus
  2011-01-26  1:06       ` Yannick Duchêne (Hibou57)
  1 sibling, 2 replies; 22+ messages in thread
From: Dmitry A. Kazakov @ 2011-01-25 21:32 UTC


On Tue, 25 Jan 2011 18:36:29 +0100, Georg Bauhaus wrote:

> If a compiler detects two independent paths inside a loop,

I don't see any application for this. Can you remember the last time you
wrote such a loop? I cannot.

Occam's par-statement could be a better candidate, but I don't see how
this could be useful under a modern general-purpose OS with its
"vertical" parallelism, where each task is assigned to one core. What
you propose is "horizontal" parallelism, where a task/process would run on
all cores simultaneously. Inmos' Occam ran under no true OS, and the
processor architecture was well suited to such ad-hoc parallelism. Modern
processors are very different from the T805, and I doubt they would allow
an efficient implementation of this.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de




* Re: Loops and parallel execution
  2011-01-25 21:32     ` Dmitry A. Kazakov
@ 2011-01-25 22:07       ` Georg Bauhaus
  2011-01-26  1:31         ` Yannick Duchêne (Hibou57)
  2011-01-26  9:04         ` Dmitry A. Kazakov
  2011-01-26  1:06       ` Yannick Duchêne (Hibou57)
  1 sibling, 2 replies; 22+ messages in thread
From: Georg Bauhaus @ 2011-01-25 22:07 UTC


On 1/25/11 10:32 PM, Dmitry A. Kazakov wrote:
> On Tue, 25 Jan 2011 18:36:29 +0100, Georg Bauhaus wrote:
>
>> If a compiler detects two independent paths inside a loop,
>
> I don't see any application for this. Can you remember the last time you
> wrote such loop? I cannot.

In fact, I have seen such a loop recently; it computes a Mandelbrot
set twice as fast.  (I am confident that the lessons learned in
finding this loop have found applications in other loops that
manipulate larger amounts of numeric data.)  The author found a way
to split Ada's Complex type into its constituent parts (two
floating-point objects) so that the program is a lot more efficient.
(One would wish that types like Complex were treated specially by the
compiler.)
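
(Roughly, the technique, as a hedged sketch: keep the real and
imaginary parts in separate scalars, so the two update paths are
independent; Z_Re, Z_Im, C_Re, C_Im are hypothetical names.)

   --  one Mandelbrot iteration, Z := Z*Z + C, on split components
   declare
      T_Re : constant Long_Float := Z_Re * Z_Re - Z_Im * Z_Im + C_Re;
      T_Im : constant Long_Float := 2.0 * Z_Re * Z_Im + C_Im;
   begin
      Z_Re := T_Re;  --  the two products above are independent paths
      Z_Im := T_Im;
   end;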


> Occam's par-statement could be a better candidate, but I don't see how
> this could be useful under a modern general-purpose OS with its
> "vertical" parallelism, where each task is assigned to one core.  What
> you propose is "horizontal" parallelism, where a task/process would run on
> all cores simultaneously. Inmos' Occam ran under no true OS, and the
> processor architecture was well suited to such ad-hoc parallelism. Modern
> processors are very different from the T805, and I doubt they would allow
> an efficient implementation of this.


I have recently seen small boards carrying one processor each
that could be connected to one another on all sides, IIRC.
What matters then is, I think, the efficiency of
(a) the distribution of small computations, and
(b) the delivery of results at some nodes.
Is it therefore so unthinkable to have something like a transputer
these days?

BTW, FUD has it that the whole idea (from the early days, I guess)
is nowadays subject to patent litigation, under names such
as map-reduce...





* Re: Loops and parallel execution
  2011-01-25 21:32     ` Dmitry A. Kazakov
  2011-01-25 22:07       ` Georg Bauhaus
@ 2011-01-26  1:06       ` Yannick Duchêne (Hibou57)
  2011-01-26 10:08         ` Dmitry A. Kazakov
  2011-01-31 13:01         ` Paul Colin Gloster
  1 sibling, 2 replies; 22+ messages in thread
From: Yannick Duchêne (Hibou57) @ 2011-01-26  1:06 UTC


On Tue, 25 Jan 2011 22:32:57 +0100, Dmitry A. Kazakov
<mailbox@dmitry-kazakov.de> wrote:

> On Tue, 25 Jan 2011 18:36:29 +0100, Georg Bauhaus wrote:
>
>> If a compiler detects two independent paths inside a loop,
>
> I don't see any application for this. Can you remember the last time you
> wrote such loop? I cannot.
I can see one: a kind of compiler optimization. I used to think about
something similar to what Georg describes (except not strictly with loops),
which I called “micro-parallelism”. There are many cases in an application
where some short sequence of instructions or groups of instructions does
not need to be sequenced. Typically I notice this when I cannot decide
which order to give them to make the source clearest, as many orders would
be equivalent. Unfortunately, tasking is inefficient here (too much
overhead). You mentioned Occam, which I do not know (just the principle),
but it could be fine, yes.

This kind of parallelism needs to be handled at a low level (by the CPU or
similar). It could be either marked explicitly by the author or detected by
the compiler, as Georg suggested, as a kind of compiler optimization.
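
(A trivial sketch of such a pair; X, Y, F, G, A, B are hypothetical,
and neither statement depends on the other, so any order, or an
overlapped execution, gives the same result:)

   X := F (A);  --  independent of the next statement
   Y := G (B);  --  could run concurrently with the previous one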

-- 
If cats meow and make so many strange vocalizations, it's not for
the dogs.

“I am fluent in ASCII” [Warren 2010]




* Re: Loops and parallel execution
  2011-01-25 22:07       ` Georg Bauhaus
@ 2011-01-26  1:31         ` Yannick Duchêne (Hibou57)
  2011-01-26  9:04         ` Dmitry A. Kazakov
  1 sibling, 0 replies; 22+ messages in thread
From: Yannick Duchêne (Hibou57) @ 2011-01-26  1:31 UTC


On Tue, 25 Jan 2011 23:07:01 +0100, Georg Bauhaus
<rm-host.bauhaus@maps.futureapps.de> wrote:
> BTW, FUD has it that the whole idea (from the early days, I guess)
> is nowadays subject to patent litigation, under names such
> as map-reduce...
What FUD? What patent? Please tell us more.


-- 
If cats meow and make so many strange vocalizations, it's not for
the dogs.

“I am fluent in ASCII” [Warren 2010]




* Re: Loops and parallel execution
  2011-01-25 15:40 Loops and parallel execution Georg Bauhaus
  2011-01-25 16:37 ` Dmitry A. Kazakov
@ 2011-01-26  8:46 ` Egil Høvik
  2011-01-26 10:47   ` Georg Bauhaus
  2011-01-26 11:29 ` Peter C. Chapin
  2011-01-26 21:57 ` Randy Brukardt
  3 siblings, 1 reply; 22+ messages in thread
From: Egil Høvik @ 2011-01-26  8:46 UTC


On Tuesday, January 25, 2011 4:40:38 PM UTC+1, Georg Bauhaus wrote:
> A quick idea.  Assume that some subprogram Op from package P
> is reentrant (and does not depend on global state). Then,
> 
> with P;
> ...
>    for K in all First .. Last loop
>        P.Op (K);
>    end loop;
> 
> should have the effect of the following being permitted:
> 
> (a) to pick K from  First .. Last  in any order
> 
> (b) to execute P.Op (J) in parallel with P.Op (K) for J, K from
> First .. Last
> 

You should take a look at Tucker Taft's blog about ParaSail:

"ParaSail allows many things to proceed in parallel by default, effectively inserting implicit parallelism everywhere"

http://parasail-programming-language.blogspot.com/2009/09/parasail-language-themes-and-philosophy.html

-- 
~egilhh




* Re: Loops and parallel execution
  2011-01-25 22:07       ` Georg Bauhaus
  2011-01-26  1:31         ` Yannick Duchêne (Hibou57)
@ 2011-01-26  9:04         ` Dmitry A. Kazakov
  1 sibling, 0 replies; 22+ messages in thread
From: Dmitry A. Kazakov @ 2011-01-26  9:04 UTC


On Tue, 25 Jan 2011 23:07:01 +0100, Georg Bauhaus wrote:

> On 1/25/11 10:32 PM, Dmitry A. Kazakov wrote:

>> Occam's par-statement could be a better candidate, but I don't see how
>> this could be useful under a modern general-purpose OS with its
>> "vertical" parallelism, where each task is assigned to one core.  What
>> you propose is "horizontal" parallelism, where a task/process would run on
>> all cores simultaneously. Inmos' Occam ran under no true OS, and the
>> processor architecture was well suited to such ad-hoc parallelism. Modern
>> processors are very different from the T805, and I doubt they would allow
>> an efficient implementation of this.
> 
> I have recently seen small boards carrying one processor each
> that could be connected to one another on all sides, IIRC.
> What matters then is, I think, the efficiency of
> (a) the distribution of small computations, and
> (b) the delivery of results at some nodes.

The Parix OS (actually a monitor) did that. E.g. if you called, say,
"printf" on a node which didn't have a direct link to the server (the
server was an MS-DOS PC or a Solaris workstation), the output would be
routed to the node connected to the server and from there to the server,
which printed the output.

> Is it therefore so unthinkable to have something like a transputer
> these days?

I saw them too. BTW, they are in some sense a step back compared to the
level Inmos had reached before its fall. Inmos introduced a programmable
transputer-link switch, so that you could reconnect the network of
transputers on the fly.

But here is the problem: I really see no use for the par-statement or the
like. The main argument against par is that using threads causes too much
overhead. If that argument stands, i.e. if you don't have very long code
alternatives running in parallel for seconds, then using a mesh of
processors would only make things worse. The overhead of distributing the
code and data over a mesh of processors is much bigger than doing this on a
machine with shared memory (multi-core). There certainly exist examples of
long independent code alternatives, but I would say that most of them are
contrived or marginal.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de




* Re: Loops and parallel execution
  2011-01-26  1:06       ` Yannick Duchêne (Hibou57)
@ 2011-01-26 10:08         ` Dmitry A. Kazakov
  2011-01-31 13:01         ` Paul Colin Gloster
  1 sibling, 0 replies; 22+ messages in thread
From: Dmitry A. Kazakov @ 2011-01-26 10:08 UTC


On Wed, 26 Jan 2011 02:06:57 +0100, Yannick Duchêne (Hibou57) wrote:

> On Tue, 25 Jan 2011 22:32:57 +0100, Dmitry A. Kazakov
> <mailbox@dmitry-kazakov.de> wrote:
> 
>> On Tue, 25 Jan 2011 18:36:29 +0100, Georg Bauhaus wrote:
>>
>>> If a compiler detects two independent paths inside a loop,
>>
>> I don't see any application for this. Can you remember the last time you
>> wrote such loop? I cannot.

> I can see one: a kind of compiler optimization. I used to think about
> something similar to what Georg describes (except not strictly with loops),
> which I called “micro-parallelism”.

OK, but you need no special constructs for this. The "mind-reading"
compiler could optimize the standard for-loop without further hints.

> This kind of parallelism needs to be handled at a low level (by the CPU or
> similar).

If you had a dataflow signal processor...

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de




* Re: Loops and parallel execution
  2011-01-26  8:46 ` Egil Høvik
@ 2011-01-26 10:47   ` Georg Bauhaus
  2011-02-14 23:27     ` Tuck
  0 siblings, 1 reply; 22+ messages in thread
From: Georg Bauhaus @ 2011-01-26 10:47 UTC


On 26.01.11 09:46, Egil Høvik wrote:

> You should take a look at Tucker Taft's blog about ParaSail:

I have, but I need to look more closely.  Is anything known yet
about what John Barnes's Integrate example would look like if
written in idiomatic ParaSail?
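
(For reference, Barnes's example is essentially the classic generic
integrator; a from-memory sketch of its profile, parameter names
approximate:)

   generic
      with function F (X : Float) return Float;
   function Integrate (A, B : Float) return Float;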




* Re: Loops and parallel execution
  2011-01-25 15:40 Loops and parallel execution Georg Bauhaus
  2011-01-25 16:37 ` Dmitry A. Kazakov
  2011-01-26  8:46 ` Egil Høvik
@ 2011-01-26 11:29 ` Peter C. Chapin
  2011-01-26 21:57 ` Randy Brukardt
  3 siblings, 0 replies; 22+ messages in thread
From: Peter C. Chapin @ 2011-01-26 11:29 UTC


On Tue, 25 Jan 2011, Georg Bauhaus wrote:

> A quick idea.  Assume that some subprogram Op from package P
> is reentrant (and does not depend on global state). Then,
>
> with P;
> ...
>   for K in all First .. Last loop
>       P.Op (K);
>   end loop;
>
> should have the effect of the following being permitted:
>
> (a) to pick K from  First .. Last  in any order
>
> (b) to execute P.Op (J) in parallel with P.Op (K) for J, K from
> First .. Last

I've often wondered what it would take to support OpenMP (or something like 
it) in Ada. The advantage of such an approach is that OpenMP is well 
documented and widely used and understood. Right now the OpenMP standard 
supports only C, C++, and Fortran. Why not Ada?
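
(Purely hypothetical: an OpenMP-style annotation in Ada dress might
look like this; the pragma name is invented for illustration, and no
compiler defines it.)

   pragma Parallel_Loop;  --  invented, OpenMP-inspired directive
   for K in First .. Last loop
      P.Op (K);           --  iterations assumed independent
   end loop;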

Peter




* Re: Loops and parallel execution
  2011-01-25 15:40 Loops and parallel execution Georg Bauhaus
                   ` (2 preceding siblings ...)
  2011-01-26 11:29 ` Peter C. Chapin
@ 2011-01-26 21:57 ` Randy Brukardt
  2011-01-27 23:01   ` tmoran
  3 siblings, 1 reply; 22+ messages in thread
From: Randy Brukardt @ 2011-01-26 21:57 UTC


"Georg Bauhaus" <rm.dash-bauhaus@futureapps.de> wrote in message 
news:4d3eeef7$0$6879$9b4e6d93@newsspool2.arcor-online.net...
...
> Can this be borrowed from HPF (IIUC)?
> Is pragma Pure (P) sufficient to signal reentrance?

I've thought about such an idea. But it requires restrictions well beyond 
those enforced by pragma Pure. For instance, Pure packages can write 
through dereferenced pointers to keep global state. Moreover, there can't 
be any "cheating", which is common in pragma Pure packages.

So there would need to be a new kind of categorization for this. I was 
hoping that we could use the proposed global-in/global-out categorizations 
to do the job, but those got dropped from Ada 2012.

Also, I think that "no communication" is impractical in most real 
applications. But it is sufficient if the communication is tightly limited 
(via atomic and protected objects, and/or synchronized interfaces - you'll 
need to access global data, just safely). That's another reason why "checked 
global in/global out" is needed.

Finally, like Dmitry, I'm skeptical about fine-grained parallelism buying 
much. Unless there is specific architectural support (something that doesn't 
exist in commonly used processors -- and especially in commonly used target 
OSes/RTOSes), the management overhead will kill any savings on "small" 
expressions. Thread creation is not cheap! The "win" is on larger tasks - 
which means that subprograms - and separately compiled subprograms - have to 
be involved in some way.
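
(To make the granularity point concrete, a minimal chunking sketch,
assuming independent iterations and the P.Op of the original posting:
each worker runs a whole contiguous slice of the range, so the thread
cost is paid once per chunk rather than once per element.)

   task type Chunk_Worker (From, To : Integer);
   task body Chunk_Worker is
   begin
      for K in From .. To loop
         P.Op (K);  --  one contiguous slice per task
      end loop;
   end Chunk_Worker;

   declare
      Mid : constant Integer := (First + Last) / 2;
      W1  : Chunk_Worker (First, Mid);
      W2  : Chunk_Worker (Mid + 1, Last);
   begin
      null;  --  both chunks execute here; the block awaits their termination
   end;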

My main interest in this technology is to make it much easier to create 
programs that use threads but don't deadlock, livelock, or have dangerous 
use of globals. That seems to require restrictions on what you can do, and 
definitely requires some form of compile-time checking to enforce those 
restrictions. If done usefully, that could be a giant win, as you could use 
sequential reasoning for the majority of your programming and debugging, and 
still get parallelism when useful.

                                 Randy.







* Re: Loops and parallel execution
  2011-01-26 21:57 ` Randy Brukardt
@ 2011-01-27 23:01   ` tmoran
  2011-01-29  0:23     ` Randy Brukardt
  0 siblings, 1 reply; 22+ messages in thread
From: tmoran @ 2011-01-27 23:01 UTC


> Finally, like Dmitry, I'm skeptical about fine-grained parallelism buying
> much. Unless there is specific architectural support (something that doesn't
> exist in commonly used processors -- and especially in commonly used target
> OSes/RTOSes), the management overhead will kill any savings on "small"

  What about the SIMD (vector) instructions in Intel CPUs?  Or is that
better done by simply calling their optimized, CPU-capability-detecting
libraries?




* Re: Loops and parallel execution
  2011-01-27 23:01   ` tmoran
@ 2011-01-29  0:23     ` Randy Brukardt
  2011-02-06 20:10       ` Yannick Duchêne (Hibou57)
  0 siblings, 1 reply; 22+ messages in thread
From: Randy Brukardt @ 2011-01-29  0:23 UTC


<tmoran@acm.org> wrote in message news:ihsth1$igr$1@speranza.aioe.org...
>> Finally, like Dmitry, I'm skeptical about fine-grained parallelism buying
>> much. Unless there is specific architectural support (something that
>> doesn't exist in commonly used processors -- and especially in commonly
>> used target OSes/RTOSes), the management overhead will kill any savings
>> on "small"
>
>  What about the SIMD (vector) instructions in Intel CPUs?  Or is that
> better done by simply calling their optimized, CPU-capability-detecting
> libraries?

That's a code generation problem; I don't believe there is much if any 
value in the programmer cluttering their code with parallel operations at 
that point.

To expand on that a bit: code generation for a CISC machine is primarily a 
pattern-matching problem. That is, the intermediate code is a list of very 
simple pseudo-instructions, and the code generator needs to map those to 
more complex machine instructions (along with simple ones when the pattern 
matching fails). Matching SIMD instructions is a more complex problem than 
the simple matcher used in Janus/Ada handles (to take the example I'm most 
familiar with), but it is fundamentally the same problem. In this case, I 
would probably apply a loop-unrolling optimization, then a series of 
pattern-matching operations to create the SIMD instructions.

We already do something like this for aggregates in Janus/Ada. An aggregate 
assignment like:

                My_Str :=  (others => Ch)

can get turned into the Intel STOSB (I think that's the right opcode) 
instruction (plus a bit of setup code), which is a lot simpler than the 
loop that would otherwise be generated.
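
(The loop it replaces would be, roughly:)

   for K in My_Str'Range loop
      My_Str (K) := Ch;
   end loop;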

In either case, you'll automatically get the benefit of the advanced 
instructions when they can be used, and no code changes are needed. Of 
course, if your code doesn't match the pattern, the advanced instructions 
wouldn't be used, but it's unlikely that adding a "parallel" directive to 
the loop would somehow change that.

I'd be surprised if GCC doesn't already do something like this. (This 
particular problem hasn't been on my radar, in part because I didn't even 
have a machine that supported most of those instructions until last year.)

                                         Randy.
 






* Re: Loops and parallel execution
  2011-01-26  1:06       ` Yannick Duchêne (Hibou57)
  2011-01-26 10:08         ` Dmitry A. Kazakov
@ 2011-01-31 13:01         ` Paul Colin Gloster
  2011-02-06 20:06           ` Yannick Duchêne (Hibou57)
  1 sibling, 1 reply; 22+ messages in thread
From: Paul Colin Gloster @ 2011-01-31 13:01 UTC


In news:op.vpv5dvb1ule2fv@garhos it was mentioned:
|------------------------------------------------------------------------|
|"[..]                                                                   |
|                                                                        |
|[..]                                                                    |
|                                                                        |
|"[..] There are many case in an application                             |
|where some short sequence of instructions or groups of instructions does|
|not need to be sequenced. [..]                                          |
|[..]                                                                    |
|                                                                        |
|[..]"                                                                   |
|------------------------------------------------------------------------|

Instruction-level parallelism?




* Re: Loops and parallel execution
  2011-01-31 13:01         ` Paul Colin Gloster
@ 2011-02-06 20:06           ` Yannick Duchêne (Hibou57)
  2011-02-07 11:43             ` Nicholas Paul Collin Gloster
  0 siblings, 1 reply; 22+ messages in thread
From: Yannick Duchêne (Hibou57) @ 2011-02-06 20:06 UTC


On Mon, 31 Jan 2011 14:01:04 +0100, Paul Colin Gloster
<Colin_Paul_Gloster@acm.org> wrote:

> In news:op.vpv5dvb1ule2fv@garhos it was mentioned:
> |------------------------------------------------------------------------|
> |"[..]                                                                   |
> |                                                                        |
> |[..]                                                                    |
> |                                                                        |
> |"[..] There are many cases in an application                            |
> |where some short sequence of instructions or groups of instructions does|
> |not need to be sequenced. [..]                                          |
> |[..]                                                                    |
> |                                                                        |
> |[..]"                                                                   |
> |------------------------------------------------------------------------|
>
> Instruction-level parallelism?
Like in graphics cards?

-- 
If cats meow and make so many strange vocalizations, it's not for
the dogs.

“I am fluent in ASCII” [Warren 2010]




* Re: Loops and parallel execution
  2011-01-29  0:23     ` Randy Brukardt
@ 2011-02-06 20:10       ` Yannick Duchêne (Hibou57)
  0 siblings, 0 replies; 22+ messages in thread
From: Yannick Duchêne (Hibou57) @ 2011-02-06 20:10 UTC


On Sat, 29 Jan 2011 01:23:13 +0100, Randy Brukardt <randy@rrsoftware.com>
wrote:
> We already do something like this for aggregates in Janus/Ada. An  
> aggregate
> assignment like:
>
>                 My_Str :=  (others => Ch)
>
> can get turned into the Intel STOSB (I think that's the right opcode)
Yes, if I remember correctly, this is precisely “rep stosb”, with a
counter in the ECX (or CX) register. Stosb alone would just store one byte
and advance the destination register (DI). That use, although valid, is
less common.

-- 
If cats meow and make so many strange vocalizations, it's not for
the dogs.

“I am fluent in ASCII” [Warren 2010]




* Re: Loops and parallel execution
  2011-02-06 20:06           ` Yannick Duchêne (Hibou57)
@ 2011-02-07 11:43             ` Nicholas Paul Collin Gloster
  0 siblings, 0 replies; 22+ messages in thread
From: Nicholas Paul Collin Gloster @ 2011-02-07 11:43 UTC


On 2011-02-06, Yannick Duchêne <yannick_duchene@Yahoo.Fr> sent:
|--------------------------------|
|"[..]                           |
|> Instruction-level parallelism?|
|Like in graphic cards ?"        |
|--------------------------------|

Like in pipelined C.P.U.s: the C.P.U. can fetch one
instruction while an earlier instruction is being executed.




* Re: Loops and parallel execution
  2011-01-26 10:47   ` Georg Bauhaus
@ 2011-02-14 23:27     ` Tuck
  2011-02-15 21:10       ` Georg Bauhaus
  0 siblings, 1 reply; 22+ messages in thread
From: Tuck @ 2011-02-14 23:27 UTC


On Wednesday, January 26, 2011 5:47:00 AM UTC-5, Georg Bauhaus wrote:
> On 26.01.11 09:46, Egil Høvik wrote:
> 
> > You should take a look at Tucker Taft's blog about ParaSail:
> 
> I have, but I need to look more closely.  Is anything known yet
> about what John Barnes's Integrate example would look like if
> written in idiomatic ParaSail?

It might look something like this:

 function Integrate(
   function Func(X : Floating is Float<>) -> Floating;
   Over : Interval<Floating>)
     -> Floating is
   ...
 end function Integrate;

 ...

 const Result := Integrate(Sin, Over => 0.0 .. Pi/2.0);

-Tuck




* Re: Loops and parallel execution
  2011-02-14 23:27     ` Tuck
@ 2011-02-15 21:10       ` Georg Bauhaus
  0 siblings, 0 replies; 22+ messages in thread
From: Georg Bauhaus @ 2011-02-15 21:10 UTC


On 2/15/11 12:27 AM, Tuck wrote:
> On Wednesday, January 26, 2011 5:47:00 AM UTC-5, Georg Bauhaus wrote:
>>   Is anything known yet
>> about what John Barnes's Integrate example would look like if
>> written in idiomatic ParaSail?
>
> It might look something like this:
>
>   function Integrate(
>     function Func(X : Floating is Float<>) ->  Floating;
>     Over : Interval<Floating>)
>       ->  Floating is
>     ...
>   end function Integrate;
>
>   ...
>
>   const Result := Integrate(Sin, Over =>  0.0 .. Pi/2.0);

Thanks, I must have overlooked nested declarations.
(I thought one would have to make the passing of values
explicit, so that using an outer function's argument
within another function nested inside it would have
been ruled out.  It doesn't seem so!)





