comp.lang.ada
 help / color / mirror / Atom feed
* GNAT and Tasklets
@ 2014-12-10 16:31 vincent.diemunsch
  2014-12-11 10:02 ` Jacob Sparre Andersen
                   ` (2 more replies)
  0 siblings, 3 replies; 73+ messages in thread
From: vincent.diemunsch @ 2014-12-10 16:31 UTC (permalink / raw)


Hello,

I have read parts of Alan Burns' book "Concurrent and Real-Time Programming in Ada". In chapter 11 he presents an implementation of jobs that are submitted to a pool of tasks (see: Callables, Executors, Futures). These jobs are typical of a parallel computation done on a multicore system, like computing a complex image, for instance.

I find this really interesting and want to use this mechanism in the future, but it raises a question: if we use a library in Ada to do this tasking, we in fact lose the ability to use tasking directly inside the Ada language! And it is easier and cleaner to create a local task inside the subprogram for each job.
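For instance, something along these lines (a minimal sketch; Render_Image and Compute_Slice are just illustrative names, and Compute_Slice stands for the real per-job work):

   procedure Render_Image is

      procedure Compute_Slice (Band : Natural) is null;
      --  placeholder: the real per-band computation would go here

      task type Worker is
         entry Start (Band : Natural);
      end Worker;

      task body Worker is
         My_Band : Natural;
      begin
         accept Start (Band : Natural) do
            My_Band := Band;
         end Start;
         Compute_Slice (My_Band);
      end Worker;

      Workers : array (1 .. 8) of Worker;   --  one local task per job

   begin
      for I in Workers'Range loop
         Workers (I).Start (I);
      end loop;
      --  Render_Image returns only once every local task has terminated.
   end Render_Image;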

So my question is: does GNAT create a kernel thread for each local task, or is it able to compile local tasks as jobs sent to a pool of tasks created in the runtime?

Kind regards,

Vincent


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-10 16:31 GNAT and Tasklets vincent.diemunsch
@ 2014-12-11 10:02 ` Jacob Sparre Andersen
  2014-12-11 16:30   ` Anh Vo
                     ` (3 more replies)
  2014-12-14  0:18 ` Hubert
  2014-12-16  4:42 ` Brad Moore
  2 siblings, 4 replies; 73+ messages in thread
From: Jacob Sparre Andersen @ 2014-12-11 10:02 UTC (permalink / raw)


Vincent Diemunsch wrote:

> So my question is : does GNAT create a kernel thread for each local
> task or is it able to compile local tasks as jobs sent to a pool of
> tasks created in the runtime ?

It can do both.  There exist multiple run-time libraries for GNAT: some
have no tasking at all, some implement tasking directly on the hardware,
some use user-space threads, and some use operating system threads.

Ada 202X is likely to include slightly more implicit parallel processing
than tasks.  The Gang of Four is busy(?) working out a proposal for how
to do it.

Greetings,

Jacob
-- 
People in cars cause accidents. Accidents in cars cause people.

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-11 10:02 ` Jacob Sparre Andersen
@ 2014-12-11 16:30   ` Anh Vo
  2014-12-11 18:15     ` David Botton
  2014-12-11 21:45     ` Egil H H
  2014-12-11 23:09   ` Randy Brukardt
                     ` (2 subsequent siblings)
  3 siblings, 2 replies; 73+ messages in thread
From: Anh Vo @ 2014-12-11 16:30 UTC (permalink / raw)


On Thursday, December 11, 2014 2:02:41 AM UTC-8, Jacob Sparre Andersen wrote:
> Vincent Diemunsch wrote:
>  
> Ada 202X is likely to include slightly more implicit parallel processing
> than tasks.  The Gang of Four is busy(?) working out a proposal for how
> to do it.

Can you give a hint as to who The Gang of Four are? Thanks.

Anh Vo

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-11 16:30   ` Anh Vo
@ 2014-12-11 18:15     ` David Botton
  2014-12-11 21:45     ` Egil H H
  1 sibling, 0 replies; 73+ messages in thread
From: David Botton @ 2014-12-11 18:15 UTC (permalink / raw)


> Can you give a hint as to who The Gang of Four are? Thanks.

Well, the Gang of Three were the three OO methodology guys who came together for UML:

Grady Booch, James Rumbaugh, and Ivar Jacobson

Then someone said "that is cool, let's call the design-pattern guys the Gang of Four":

Erich Gamma, Richard Helm, Ralph Johnson, John Vlissides

But perhaps this is a new Gang of Four.

I am thinking of starting a new Gang too, where we rough everyone up until they use Ada as the only way to write software :)

We could have a Gang war: OO and Patterns vs. the Ada Gang.

Ada would win, since it does OO, Patterns, and a host of other proven software development techniques that are being forgotten as one fad rolls over the next in the Gang wars.

David Botton


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-11 16:30   ` Anh Vo
  2014-12-11 18:15     ` David Botton
@ 2014-12-11 21:45     ` Egil H H
  1 sibling, 0 replies; 73+ messages in thread
From: Egil H H @ 2014-12-11 21:45 UTC (permalink / raw)


On Thursday, December 11, 2014 5:30:14 PM UTC+1, Anh Vo wrote:
> On Thursday, December 11, 2014 2:02:41 AM UTC-8, Jacob Sparre Andersen wrote:
> > Vincent Diemunsch wrote:
> >  
> > Ada 202X is likely to include slightly more implicit parallel processing
> > than tasks.  The Gang of Four is busy(?) working out a proposal for how
> > to do it.
> 
> Can you give a hint as to who The Gang of Four are? Thanks.
> 
> Anh Vo


I believe it was a Gang of Three, consisting of Luis Miguel Pinho, Brad Moore and Stephen Michell until Tucker Taft joined to make a Gang of Four.

-- 
~egilhh

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-11 10:02 ` Jacob Sparre Andersen
  2014-12-11 16:30   ` Anh Vo
@ 2014-12-11 23:09   ` Randy Brukardt
  2014-12-12  2:28     ` Jacob Sparre Andersen
  2014-12-12  8:46   ` vincent.diemunsch
  2014-12-13  2:06   ` Brad Moore
  3 siblings, 1 reply; 73+ messages in thread
From: Randy Brukardt @ 2014-12-11 23:09 UTC (permalink / raw)


"Jacob Sparre Andersen" <jacob@jacob-sparre.dk> wrote in message 
news:87oaray05e.fsf@adaheads.sparre-andersen.dk...
...
> Ada 202X is likely to include slightly more implicit parallel processing
> than tasks.  The Gang of Four is busy(?) working out a proposal for how
> to do it.

The parallel subgroup includes Tucker Taft, Brad Moore, and Stephen Michell. 
I've forgotten who the fourth person is, and since there are no authors 
listed on the HILT paper that Tucker sent the ARG, I'm not going to guess.

In any case, the rough proposals look useful. That's especially true as the 
support is partitioned into groups of features, many of which would be 
useful even to programs that never use the parallel features. (For instance, 
additional kinds of contracts will help with correctness and static analysis 
of all programs.)

But the problems with these sorts of proposals only show up when the details 
are worked out, and there aren't many of those to date. That is, don't 
expect this stuff to show up in your favorite compiler in the near future.

                                      Randy.


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-11 23:09   ` Randy Brukardt
@ 2014-12-12  2:28     ` Jacob Sparre Andersen
  0 siblings, 0 replies; 73+ messages in thread
From: Jacob Sparre Andersen @ 2014-12-12  2:28 UTC (permalink / raw)


Randy Brukardt wrote:

> The parallel subgroup includes Tucker Taft, Brad Moore, and Stephen
> Michell.

And Luís Miguel Pinho.

Greetings,

Jacob
-- 
"In school, and in most aspects of life, a 90% is an A.
 In software, a 99.9% is, or may be, an utter disaster."


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-11 10:02 ` Jacob Sparre Andersen
  2014-12-11 16:30   ` Anh Vo
  2014-12-11 23:09   ` Randy Brukardt
@ 2014-12-12  8:46   ` vincent.diemunsch
  2014-12-12 23:33     ` Georg Bauhaus
  2014-12-13  2:06   ` Brad Moore
  3 siblings, 1 reply; 73+ messages in thread
From: vincent.diemunsch @ 2014-12-12  8:46 UTC (permalink / raw)


Thanks, Jacob, for your response.

> > So my question is : does GNAT create a kernel thread for each local
> > task or is it able to compile local tasks as jobs sent to a pool of
> > tasks created in the runtime ?
> 
> It can do both.  There exists multiple run-time libraries for GNAT, some
> have no tasking at all, some implement tasking directly on the hardware,
> some use user-space threads and some use operating system threads.

Where can I find information on which implementation does what?
For instance, on Mac OS X, does GNAT use only kernel threads? Are user-space
threads somehow related to the Florist package (through POSIX pthreads)?


> Ada 202X is likely to include slightly more implicit parallel processing
> than tasks.  The Gang of Four is busy(?) working out a proposal for how
> to do it.

I am looking forward to hearing from it.
Kind regards,

Vincent


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-12  8:46   ` vincent.diemunsch
@ 2014-12-12 23:33     ` Georg Bauhaus
  0 siblings, 0 replies; 73+ messages in thread
From: Georg Bauhaus @ 2014-12-12 23:33 UTC (permalink / raw)


<vincent.diemunsch@gmail.com> wrote:

> Where can I find information on which implementation does what?
> For instance, on Mac OS X, does GNAT use only kernel threads? Are user-space
> threads somehow related to the Florist package (through POSIX pthreads)?

In the past, I found this information in the source text of GNAT. Not
ideal, but maybe the system-specific documentation has it; it does
address Mac specifics. I am not sure there is a documentation
requirement that would let one predict its presence.


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-11 10:02 ` Jacob Sparre Andersen
                     ` (2 preceding siblings ...)
  2014-12-12  8:46   ` vincent.diemunsch
@ 2014-12-13  2:06   ` Brad Moore
  2014-12-13  6:50     ` Dirk Craeynest
  3 siblings, 1 reply; 73+ messages in thread
From: Brad Moore @ 2014-12-13  2:06 UTC (permalink / raw)


On 2014-12-11 3:02 AM, Jacob Sparre Andersen wrote:
> Ada 202X is likely to include slightly more implicit parallel processing
> than tasks.  The Gang of Four is busy(?) working out a proposal for how
> to do it.
>

The ideas continue to evolve with improvements and refinements. Even 
since the last HILT paper we have some new ideas that might include 
significant changes in how we deal with parallel loops, but we have yet 
to work through those ideas in greater detail. Right now we are focusing 
on a paper on parallelism for Real Time, for possible consideration for 
the next IRTAW (International Real Time Ada Workshop) next April, as 
well as for the next Ada Europe. Deadlines for Ada Europe submissions 
are coming up soon, so yes, we are busy....

Brad


> Greetings,
>
> Jacob
>


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-13  2:06   ` Brad Moore
@ 2014-12-13  6:50     ` Dirk Craeynest
  0 siblings, 0 replies; 73+ messages in thread
From: Dirk Craeynest @ 2014-12-13  6:50 UTC (permalink / raw)


In article <41Hiw.315423$ZT5.146602@fx07.iad>,
Brad Moore  <brad.moore@shaw.ca> wrote:
>[...] Right now we are focusing on a paper on parallelism for Real
>Time, [...]  Deadlines for Ada Europe submissions are coming up soon,
>so yes, we are busy....

For all others who would like to prepare a submission for next year's
Ada-Europe conference, the deadlines are:

11 January 2015: regular papers, tutorial and workshop proposals
25 January 2015: industrial presentation proposals

Dirk
Dirk.Craeynest@cs.kuleuven.be (for Ada-Belgium/Ada-Europe/SIGAda/WG9)

*** 20th Intl.Conf.on Reliable Software Technologies - Ada-Europe'2015
*** June 22-26, 2015 **** Madrid, Spain **** http://www.ada-europe.org

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-10 16:31 GNAT and Tasklets vincent.diemunsch
  2014-12-11 10:02 ` Jacob Sparre Andersen
@ 2014-12-14  0:18 ` Hubert
  2014-12-14 21:29   ` vincent.diemunsch
  2014-12-16  4:42 ` Brad Moore
  2 siblings, 1 reply; 73+ messages in thread
From: Hubert @ 2014-12-14  0:18 UTC (permalink / raw)


>
> So my question is : does GNAT create a kernel thread for each local task or is it able to compile local tasks as jobs sent to a pool of tasks created in the runtime ?

I asked a similar question about 2 years ago when I was busy implementing 
a request system on a server (albeit in C++, but I was already ogling 
Ada), and I got different answers, depending on the compiler.
For me there was an additional requirement, though: I had to deal with 
several thousand "Jobs" in parallel, not in hard real time, but they had 
to be served as soon as possible. The result of my research was that, 
depending on the OS the Ada program was running on, you could get several 
hundred OS threads, or maybe 1-2K on Linux, but there is an upper limit, 
because every OS thread that runs a task has a stack associated with it, 
so mostly the available memory is the limit, I think.
When your Jobs use only very limited stack resources, this can be a great 
waste, especially when they have different stack requirements, so you 
have to size every stack for the largest requirement, which means lots of 
waste.
My solution was to implement my own pre-emptive Job system on top of the 
OS threads. I allocate as many threads (or tasks in Ada) as there are 
processor cores and then assign a number of Jobs to each.
The Jobs work internally with a state machine. Whenever they perform a 
blocking operation (mostly communicating with other Jobs), they send a 
message and return control, and when the answer arrives, my system calls 
them again with the next state, where they fetch the answer.

Depending on what your requirements are (a great number of parallel 
Jobs), this may very well be your only reliable solution.
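
In Ada terms, the skeleton of such a job system might look roughly like 
this (only a sketch: the names are made up for illustration, and a real 
queue would be unbounded rather than one slot deep):

   with System.Multiprocessors;

   package Job_System is
      type Job is access procedure;   --  simplest possible job: an action to run
      procedure Submit (J : Job);     --  hand a job to one of the workers
   end Job_System;

   package body Job_System is

      protected Queue is
         entry Put (J : Job);
         entry Get (J : out Job);
      private
         Pending : Job;
         Full    : Boolean := False;  --  a one-slot buffer keeps the sketch short
      end Queue;

      protected body Queue is

         entry Put (J : Job) when not Full is
         begin
            Pending := J;
            Full    := True;
         end Put;

         entry Get (J : out Job) when Full is
         begin
            J    := Pending;
            Full := False;
         end Get;

      end Queue;

      task type Worker;

      task body Worker is
         J : Job;
      begin
         loop
            Queue.Get (J);
            J.all;                     --  run the job on this worker's thread
         end loop;
      end Worker;

      Workers : array (1 .. System.Multiprocessors.Number_Of_CPUs) of Worker;
      --  one task (and thus one OS thread) per core, as described above

      procedure Submit (J : Job) is
      begin
         Queue.Put (J);
      end Submit;

   end Job_System;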








^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-14  0:18 ` Hubert
@ 2014-12-14 21:29   ` vincent.diemunsch
  2014-12-16  5:09     ` Brad Moore
  0 siblings, 1 reply; 73+ messages in thread
From: vincent.diemunsch @ 2014-12-14 21:29 UTC (permalink / raw)


On Sunday, December 14, 2014 at 01:18:42 UTC+1, Hubert wrote:

> The result of my research was that, depending on the OS the Ada program 
> was running on, you could get several hundred OS threads, or maybe 1-2K 
> on Linux, but there is an upper limit, because every OS thread that runs 
> a task has a stack associated with it, so mostly the available memory is 
> the limit, I think.
> [...]
> My solution was to implement my own pre-emptive Job system on top of the 
> OS threads. I allocate as many threads (or Tasks in Ada) as there are 
> processor cores and then assign a number of Jobs to each.
> [...]
> Depending on what your requirements are (great number of parallel Jobs), 
> this may very well be your only reliable solution.
> 

Yes, I think you are completely right. Is your library private, or do you plan to release
it as open source?

This shows clearly that the compiler wasn't able to produce an adequate solution, even 
though the case of a lot of little local tasks is quite simple and has become a standard way of
using multicore computers (see for instance Grand Central Dispatch on Mac OS X). 

I really hope that Ada 202X will limit new features to a few real improvements, and that effort will go into improving the compilers.

Kind regards,

Vincent

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-10 16:31 GNAT and Tasklets vincent.diemunsch
  2014-12-11 10:02 ` Jacob Sparre Andersen
  2014-12-14  0:18 ` Hubert
@ 2014-12-16  4:42 ` Brad Moore
  2014-12-17 13:06   ` vincent.diemunsch
  2 siblings, 1 reply; 73+ messages in thread
From: Brad Moore @ 2014-12-16  4:42 UTC (permalink / raw)


On 2014-12-10 9:31 AM, vincent.diemunsch@gmail.com wrote:
> Hello,
>
> I have read parts of Alan Burns' book "Concurrent and Real-Time Programming in Ada". In chapter 11 he presents an implementation of jobs that are submitted to a pool of tasks (see: Callables, Executors, Futures). These jobs are typical of a parallel computation done on a multicore system, like computing a complex image, for instance.
>
> I find this really interesting and want to use this mechanism in the future, but it raises a question: if we use a library in Ada to do this tasking, we in fact lose the ability to use tasking directly inside the Ada language! And it is easier and cleaner to create a local task inside the subprogram for each job.

I don't think this is accurate, as creating tasks in Ada generally serves 
a different purpose than adding parallelism. Tasks are useful constructs 
for creating independent concurrent activities. They are a way of 
breaking an application into separate, independent logical executions 
that separate concerns, improving the logic and understanding of a 
program. Parallelism, on the other hand, is only about making the program 
execute faster. If the parallelism does not do that, it fails to serve 
its purpose.

So the availability of a parallelism library shouldn't really affect the 
way one structures their program into a collection of tasks.

I find such a library useful when one wants to improve the execution 
time of one or more of the tasks in the application where performance is 
not adequate. Tasks and parallelism libraries can complement each other 
to achieve the best of both worlds.


>
> So my question is : does GNAT create a kernel thread for each local task or is it able to compile local tasks as jobs sent to a pool of tasks created in the runtime ?

My understanding is that GNAT generally maps tasks to OS threads on a 
one-to-one basis, but as others have pointed out, there may be 
configurations where other mappings are also available.

My understanding also is that at one time, GNAT had an implementation 
built on top of FSU threads developed at Florida State University, by 
Ted Baker. This implementation ran all tasks under one OS thread. This 
highlights the difference between concurrency and parallelism. The FSU 
thread implementation gives you concurrency by allowing tasks to execute 
independently from each other, using some preemptive scheduling model to 
shift the processor between the multiple tasks of an application.

Using a parallelism library with a GNAT implementation based on FSU 
threads likely won't give you improved performance, though, if all the 
tasks are executing concurrently within a single OS thread, running on a 
single core of a multicore platform.

My understanding is that GNAT no longer supports this option to use FSU 
threads, mostly because there isn't sufficient customer interest in 
maintaining this. Or at least that is my recollection of what Robert 
Dewar stated in one of his emails a few years back.

Brad

>
> Kind regards,
>
> Vincent
>


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-14 21:29   ` vincent.diemunsch
@ 2014-12-16  5:09     ` Brad Moore
  2014-12-17 13:24       ` vincent.diemunsch
  0 siblings, 1 reply; 73+ messages in thread
From: Brad Moore @ 2014-12-16  5:09 UTC (permalink / raw)


On 2014-12-14 2:29 PM, vincent.diemunsch@gmail.com wrote:
> On Sunday, December 14, 2014 at 01:18:42 UTC+1, Hubert wrote:
>
>> the result of my research was that depending
>> on the OS the Ada program was running on you could get several 100 OS
>> threads or maybe 1-2K on Linux but there is an upper limit because every
>> OS thread that runs a task will have a stack associated with it, so
>> mostly the available memory is the limit, I think.
>> [...]
>> My solution was to implement my own pre-emptive Job system on top of the
>> OS threads. I allocate as many threads (or Tasks in Ada) as there are
>> processor cores and then assign a number of Jobs to each.
>> [...]
>> Depending on what your requirements are (great number of parallel Jobs),
>> this may very well be your only reliable solution.
>>
>
> Yes, I think you are completely right. Is your library private or do you plan to release
> it as Open Source ?

As another alternative, you could look at the Paraffin libraries, which 
can be found at

https://sourceforge.net/projects/paraffin/

These libraries are a set of open source generics that provide several 
different strategies to use for parallel loops, parallel recursion, and 
parallel blocks. You can choose between different parallelism strategies 
such as a static load balancing (work sharing), or dynamic load 
balancing using work stealing approach for loops, or what I call work 
seeking which is another variation of load balancing.

You can also choose between using task pools, or creating worker tasks 
dynamically on the fly.

Generally I found results similar to those reported by Hubert: the 
optimal number of workers is typically based on the number of available 
cores in the system. Adding more workers above that typically does not 
improve performance, and eventually degrades it, as each worker 
introduces some overhead; if the cores are already fully loaded with 
work, adding more workers only adds overhead without adding performance 
benefits.
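
(For reference, Ada 2012 lets a library query the core count directly; a 
trivial, self-contained sketch:)

   with Ada.Text_IO;
   with System.Multiprocessors;

   procedure Show_Cores is
      use System.Multiprocessors;
   begin
      Ada.Text_IO.Put_Line ("Cores available:" & CPU'Image (Number_Of_CPUs));
      --  a sensible default worker-pool size is simply Number_Of_CPUs
   end Show_Cores;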

>
> This shows clearly that the compiler wasn't able to produce an adequate solution, even
> if the case of a lot of little local tasks is quite simple, and has become a standard way of
> using multicore computers (see for instance Grand Central Dispatch on Mac OS X).

The compiler is already allowed to use implicit parallelism when it sees 
fit, if it can achieve the same semantic effects that would result from 
sequential execution.

RM 9.11 "Concurrent task execution may be implemented on multicomputers, 
multiprocessors, or with interleaved execution on a single physical 
processor. On the other hand, whenever an implementation can determine 
that the required semantic effects can be achieved when parts of the 
execution of a given task are performed by different physical processors 
acting in parallel, it may choose to perform them in this way"

However, there are limits to what the compiler can do implicitly. For 
instance, it cannot determine whether the following loop can execute in parallel.

   Sum : Integer := 0;
   for I in 1 .. 1000 loop
      Sum := Sum + Foo (I);
   end loop;

For one thing, there is a data race on the variable Sum. If the loop were 
to be broken up into multiple tasklets executing in parallel, the compiler 
would need to structure the implementation of the loop very differently 
than written, and the semantics of execution would not be the same as in 
the sequential case, particularly if an exception is raised inside the 
loop. Secondly, if Foo is a third-party library call, the compiler cannot 
know whether the Foo function itself modifies global variables, which 
would make parallelization unsafe.
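
To make that concrete, a hand-written parallel version of the loop above 
has to be restructured along these lines (a sketch only: it ignores 
exception propagation and assumes Foo is the same function as above and 
has no side effects):

   declare
      Num_Workers : constant := 4;
      Chunk       : constant := 1000 / Num_Workers;
      Partial     : array (1 .. Num_Workers) of Integer := (others => 0);
      Sum         : Integer := 0;
   begin
      declare
         task type Summer is
            entry Start (Id : Positive);
         end Summer;

         task body Summer is
            My_Id : Positive;
            S     : Integer := 0;
         begin
            accept Start (Id : Positive) do
               My_Id := Id;
            end Start;
            for I in (My_Id - 1) * Chunk + 1 .. My_Id * Chunk loop
               S := S + Foo (I);
            end loop;
            Partial (My_Id) := S;  --  each worker writes only its own slot
         end Summer;

         Workers : array (1 .. Num_Workers) of Summer;
      begin
         for W in Workers'Range loop
            Workers (W).Start (W);
         end loop;
      end;  --  the inner block waits here until every worker has terminated

      for P of Partial loop        --  combine the partial results sequentially
         Sum := Sum + P;
      end loop;
   end;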


>
> I really hope that Ada 202X will limit new features to a few real improvements, and that effort will go into improving the compilers.

In order for the compiler to generate implicit parallelism for code such 
as the example above, it needs to be given additional semantic 
information so that it can guarantee the parallel transformation can be 
done safely. We are looking at ways of providing such information to the 
compiler via new aspects that it can check statically. Whether such 
proposals will actually become part of Ada 202x is 
another question. It depends on the demand for such features, and how 
well they can be worked out, without adding too much complexity to the 
language, or implementation burden to the compiler vendors. I think the 
general goal at this point will be to limit Ada 202x in terms of new 
features, but that is the future, and the future is unknown.

Brad

>
> Kind regards,
>
> Vincent
>

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-16  4:42 ` Brad Moore
@ 2014-12-17 13:06   ` vincent.diemunsch
  2014-12-17 20:31     ` Niklas Holsti
                       ` (3 more replies)
  0 siblings, 4 replies; 73+ messages in thread
From: vincent.diemunsch @ 2014-12-17 13:06 UTC (permalink / raw)


Hello Brad,

> I dont think this is accurate, as creating tasks in Ada generally serves 
> a different purpose than adding improved parallelism. Tasks are useful 
> constructs for creating independent concurrent activities. It is a way 
> of breaking an application into separate independent logical executions 
> that separate concerns, improving the logic and understanding of a 
> program. Parallelism on the other hand is only about making the program 
> execute faster. If the parallelism does not do that, it fails to serve 
> its purpose.

I am rather surprised that you made a distinction between creating tasks
and parallelism. I agree that the goal of parallelism is to increase CPU
usage and therefore make the program run faster. For me, creating tasks is
the Ada way of implementing parallelism. And it is a sound way of doing it,
since compilers, as far as I know, are not really able to find parallelism
in a program automatically. Moreover, using things like state machines to
create parallelism is too complex for a programmer and needs the use of a
dedicated language. So tasks are fine.

> So the availability of a parallelism library shouldn't really affect the 
> way one structures their program into a collection of tasks.
> I find such a library is useful when one ones to improve the execution 
> time of one/some of the tasks in the application where performance is 
> not adequate. Tasks and parallelism libraries can complement each other 
> to achieve the best of both worlds.

I am sorry to disagree: the very existence of a parallelism library shows
the inability of current Ada technology to deal directly with parallelism
inside the Ada language. I really think this is due to the weakness of
current compilers, but if there are also problems inside the language they
should be addressed (like the Ravenscar restriction that allowed predictable
tasking, or special constructs to express parallelism, or "aspects" to
indicate that a task should be run on a GPU...). These should be only a few
features of Ada 202X.


> My understanding is that GNAT generally maps tasks to OS threads on a 
> one to one basis, but as others have pointed out, there may be 
> configurations where other mappings are also available.

I could understand that a library-level task (i.e. a task declared immediately
in a package that is at library level) would be mapped to an OS thread, but a
simple local task definitely should not be. And even that is a simplification,
since, as you pointed out, there is often no point in creating more kernel
threads than the number of available CPUs.
 
> My understanding also is that at one time, GNAT had an implementation 
> built on top of FSU threads developed at Florida State University, by 
> Ted Baker. This implementation ran all tasks under one OS thread. 
> [...] The FSU 
> thread implementation gives you concurrency by allowing tasks to execute 
> independently from each other, using some preemptive scheduling model to 
> shift the processor between the multiple tasks of an application.

The solution of all tasks under one kernel thread is good for monoprocessors, and since user-level threads are lightweight compared to kernel threads, it was acceptable to map a task to a thread.
But with multiple cores, we need all tasks running on a pool of kernel threads, one thread per core. And I suppose that when multicores came, it was considered easier to drop the FSU implementation and simply map one task to a kernel thread. But doing this is an oversimplification that gives poor performance for pure parallel computing, and gave rise to the need for a parallelism library! (Not to mention GPUs, which are commonly used for highly demanding computations and are not supported by GNAT...)

What we need now is a new implementation of tasking in GNAT, able to treat
local tasks as jobs.

Regards,

Vincent


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-16  5:09     ` Brad Moore
@ 2014-12-17 13:24       ` vincent.diemunsch
  0 siblings, 0 replies; 73+ messages in thread
From: vincent.diemunsch @ 2014-12-17 13:24 UTC (permalink / raw)


> As another alternative, you could look at the Paraffin libraries, which 
> can be found at
> 
> https://sourceforge.net/projects/paraffin/
> 

This is really impressive. It strongly gives me the feeling that it should
definitely be integrated into Ada compilers, maybe with some compiler
directives standardized in Ada 202X?

Regards,

Vincent


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-17 13:06   ` vincent.diemunsch
@ 2014-12-17 20:31     ` Niklas Holsti
  2014-12-17 22:08       ` Randy Brukardt
  2014-12-18  8:42       ` Dmitry A. Kazakov
  2014-12-17 21:08     ` Brad Moore
                       ` (2 subsequent siblings)
  3 siblings, 2 replies; 73+ messages in thread
From: Niklas Holsti @ 2014-12-17 20:31 UTC (permalink / raw)


On 14-12-17 15:06 , vincent.diemunsch@gmail.com wrote:
> Hello Brad,
>
>> I dont think this is accurate, as creating tasks in Ada generally serves
>> a different purpose than adding improved parallelism. Tasks are useful
>> constructs for creating independent concurrent activities. It is a way
>> of breaking an application into separate independent logical executions
>> that separate concerns, improving the logic and understanding of a
>> program. Parallelism on the other hand is only about making the program
>> execute faster. If the parallelism does not do that, it fails to serve
>> its purpose.
>
> I am rather surprised that you made a distinction between creating tasks
> and parallelism. I agree that the goal of parallelism is to increase CPU
> usage and therefore make the program run faster. For me creating tasks is
> the Ada way of implementing parallelism.

Ada uses tasks for parallelism, yes, but it is not the only purpose of 
Ada tasks. As Brad said, another purpose is to separate logical threads 
of control, and I would add a third purpose, which is to prioritize 
tasks of different urgencies, for real-time systems.

>> My understanding is that GNAT generally maps tasks to OS threads on a
>> one to one basis, but as others have pointed out, there may be
>> configurations where other mappings are also available.
>
> I could understand that a library-level task (i.e. a task declared immediately
> in a package that is at library level) would be mapped to an OS thread, but a
> simple local task definitely should not be.

I disagree; I don't see any logical difference between a library-level 
task and a local task that would imply different implementations.

That said, I might welcome a standard ability by which the programmer 
could suggest suitable implementations for specific tasks, for pragmatic 
reasons. Assuming, of course, that an Ada programming system (compiler + 
run-time support) provides more than one implementation of tasks.

>> My understanding also is that at one time, GNAT had an implementation
>> built on top of FSU threads developed at Florida State University, by
>> Ted Baker. This implementation ran all tasks under one OS thread.
>> [...] The FSU
>> thread implementation gives you concurrency by allowing tasks to execute
>> independently from each other, using some preemptive scheduling model to
>> shift the processor between the multiple tasks of an application.
>
> The solution of all tasks under one kernel thread is good
> for monoprocessors,

As I remember, the user-level thread solution in GNAT had the drawback 
that if one thread blocked on an OS call, the whole program was blocked.

> and since User Level threads are lightweight
> compared to Kernel threads,

My impression is that this is no longer the case, but perhaps things 
have changed again in recent years.

> But with multiple cores, we need all tasks running on a pool of
> kernel threads, one thread per core.

In what way would that be better than having one kernel thread per task? 
In either case, each task would have to have its own stack area, and I 
don't see why task switching would be radically faster, either (assuming 
that these kernel threads share the same virtual memory space).

> And I suppose that when multicores came, it has been considered
> easier to drop the FSU implementation and simply map one task to
> a kernel thread.

My impression is that dropping the user thread model had more to do with 
improved kernel thread support in OSes and decreased kernel thread 
overheads compared to user threads. Of course, in proportion to the 
maintenance cost of the user-thread systems. The multi-core thing was 
probably a contributing factor, I agree.

> But doing this is an oversimplification that
> gives poor performances for pure parallel computing, and gave
> rise to the need of parallelism Library !

I believe the parallelism library is there for the same reason as any 
other library: to implement reusable functionality, not specifically for 
fixing any problems in the Ada task system.

-- 
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
       .      @       .

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-17 13:06   ` vincent.diemunsch
  2014-12-17 20:31     ` Niklas Holsti
@ 2014-12-17 21:08     ` Brad Moore
  2014-12-18  8:47       ` vincent.diemunsch
  2014-12-17 22:18     ` Randy Brukardt
  2014-12-18  0:56     ` Shark8
  3 siblings, 1 reply; 73+ messages in thread
From: Brad Moore @ 2014-12-17 21:08 UTC (permalink / raw)


On 14-12-17 06:06 AM, vincent.diemunsch@gmail.com wrote:
> Hello Brad,
>
>> I dont think this is accurate, as creating tasks in Ada generally serves
>> a different purpose than adding improved parallelism. Tasks are useful
>> constructs for creating independent concurrent activities. It is a way
>> of breaking an application into separate independent logical executions
>> that separate concerns, improving the logic and understanding of a
>> program. Parallelism on the other hand is only about making the program
>> execute faster. If the parallelism does not do that, it fails to serve
>> its purpose.
>
> I am rather surprised that you made a distinction between creating tasks
> and parallelism. I agree that the goal of parallelism is to increase CPU
> usage and therefore make the program run faster. For me creating tasks is
> the Ada way of implementing parallelism. And it is a sound way of doing it, since compilers, as far as I know, are not really able to find parallelism in a program automatically. Moreover, using things like state machines to
> create parallelism is too complex for a programmer and needs the use of a
> dedicated language. So tasks are fine.

I made the distinction because they are not the same. Parallelism is 
really just a subset of concurrency. In parallelism, multiple tasks are 
executing at the same time. In more general concurrency, this is not 
necessarily the case. Time slicing might be used, for example, so that in 
reality only one task is executing at a given instant in time.

I am not disagreeing that tasks can be useful for implementing 
parallelism. The Paraffin libraries, for instance, use Ada tasks as the 
underlying workers. It is just that, to use multicore/manycore 
architectures effectively for parallelism, tasks generally are too 
coarse-grained a construct for the application programmer to have to use 
as a starting point every time they want to introduce parallelism. It's 
just too much code to have to write each time if you want to take 
advantage of things like load balancing, reductions, variable numbers of 
cores, prevention of parallelism oversubscription, choosing the right 
number of workers for the job, obtaining workers from a task pool, etc.
Having a library that does all this work for you makes sense, so that 
the programmer can focus more on the algorithm without having to think 
so much about the parallelism.

The other alternative is to build more smarts into the compiler so that 
it can implicitly generate parallelism.


Ada compilers are already able to automatically parallelize some things, 
and I believe some of them (all of them?) do.

Someone posted an example a year or so ago of a loop in Ada that GNAT 
could optimize to utilize the cores. It might be that the parallelism 
was implemented as vectorization of the GCC backend, and it might have 
been that the compiler was taking advantage of hardware parallelism 
instructions rather than using a software thread based approach.

But there are limits to what the compiler can safely parallelize. The 
compiler needs to ensure that data races are not introduced, for 
instance, which could cause an otherwise good sequential program to fail 
disastrously.

In our HILT paper from last October, we presented a notion of being able 
to define a Global aspect, which identifies dependencies on global data 
and can be applied to subprograms. This is an extension of the Global 
aspect associated with SPARK. We also have a similar aspect that 
identifies subprograms that are potentially blocking. If the compiler 
can statically tell which subprograms have unsafe dependencies on global 
data, or can tell that no such dependencies exist, then it should be 
able to implicitly parallelize a loop that includes such calls.  Without 
such information being statically available to the compiler, it cannot 
safely inject parallelism, and has to play it safe and generate 
sequential code.
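
For instance, in today's SPARK-style notation (our proposed aspects follow 
the same idea, although the details in the paper differ):

   Total : Integer := 0;

   function Foo (I : Integer) return Integer
     with Global => null;
   --  touches no global state, so calls to Foo could safely run in parallel

   procedure Accumulate (I : Integer)
     with Global => (In_Out => Total);
   --  depends on Total, so unsynchronized parallel calls would be a data race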

If we get into the realm of having the compiler generate the parallelism 
implicitly, then the underlying worker does not necessarily need to be a 
task. It could be, but if the compiler can make use of some lighter 
weight mechanism, it can do so, so long as the semantic effect of the 
parallelism is the same.


>
>> So the availability of a parallelism library shouldn't really affect the
>> way one structures their program into a collection of tasks.
>> I find such a library is useful when one ones to improve the execution
>> time of one/some of the tasks in the application where performance is
>> not adequate. Tasks and parallelism libraries can complement each other
>> to achieve the best of both worlds.
>
> I am sorry to disagree: the very existence of a parallelism library shows
> the inability of current Ada technology to deal directly with parallelism
> inside the Ada language. I really think this is due to the weakness of current compilers, but if there are also problems inside the language they should be addressed (like the Ravenscar restriction that allowed predictable tasking, or
> special constructs to express parallelism, or "aspects" to indicate that a task
> should be run on a GPU...). These should be only a few features of Ada 202X.

I don't think we are disagreeing here. I was only mentioning that for 
non-parallelism concurrency, or for coarser-grained parallelism, tasks 
can still be used in much the same way they are used for concurrency on 
a single core. We are hoping that some new aspects and syntax could be 
considered for Ada 202x, but if that doesn't get the support needed for 
standardization, then one could either ask one's compiler vendor for 
implementation-defined, non-portable support, or resort to using 
libraries such as Paraffin, which actually are portable.
I have used Paraffin on both GNAT and the ICC Ada compiler.


>
>
>> My understanding is that GNAT generally maps tasks to OS threads on a
>> one to one basis, but as others have pointed out, there may be
>> configurations where other mappings are also available.
>
> I could understand that a library-level task (i.e. a task declared immediately
> in a package that is at library level) would be mapped to an OS thread, but a
> simple local task definitely should not be.

Why not?

> And even that is a simplification, since, as you pointed out, there is often
> no point in creating more kernel threads than the number of available CPUs.

It depends. For example, if those threads block, then while they are 
blocked the cores are not being used, so in such cases it is actually 
beneficial to have more threads than there are CPUs.


>
>> My understanding also is that at one time, GNAT had an implementation
>> built on top of FSU threads developed at Florida State University, by
>> Ted Baker. This implementation ran all tasks under one OS thread.
>> [...] The FSU
>> thread implementation gives you concurrency by allowing tasks to execute
>> independently from each other, using some preemptive scheduling model to
>> shift the processor between the multiple tasks of an application.
>
> The solution of all tasks under one kernel thread is good for monoprocessors, and since user-level threads are lightweight compared to kernel threads, it was acceptable to map a task to a thread.
> But with multiple cores, we need all tasks running on a pool of kernel threads, one thread per core. And I suppose that when multicores came, it was considered easier to drop the FSU implementation and simply map one task to a kernel thread. But doing this is an oversimplification that gives poor performance for pure parallel computing, and gave rise to the need for a parallelism library! (Not to mention GPUs, which are commonly used for highly demanding computations and are not supported by GNAT...)
>
> What we need now is a new implementation of tasking in GNAT, able to treat
> local tasks as jobs.

I'm not convinced here. In Paraffin, I have several versions of the 
libraries. Some that use task pools where worker tasks are started 
beforehand, and some that create local tasks on the fly. I was quite 
surprised at the good performance of creating local tasks on the fly on 
the platforms I've tried. (Windows, Linux, and Android).

Brad

>
> Regards,
>
> Vincent
>


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-17 20:31     ` Niklas Holsti
@ 2014-12-17 22:08       ` Randy Brukardt
  2014-12-17 22:52         ` Björn Lundin
  2014-12-18  8:42       ` Dmitry A. Kazakov
  1 sibling, 1 reply; 73+ messages in thread
From: Randy Brukardt @ 2014-12-17 22:08 UTC (permalink / raw)


"Niklas Holsti" <niklas.holsti@tidorum.invalid> wrote in message 
news:cfe7hnFaj8oU1@mid.individual.net...
> On 14-12-17 15:06 , vincent.diemunsch@gmail.com wrote:
...
>> and since User Level threads are lightweight
>> compared to Kernel threads,
>
> My impression is that this is no longer the case, but perhaps things have 
> changed again in recent years.
>
>> But with multiple cores, we need all tasks running on a pool of
>> kernel threads, one thread per core.
>
> In what way would that be better than having one kernel thread per task? 
> In either case, each task would have to have its own stack area, and I 
> don't see why task switching would be radically faster, either (assuming 
> that these kernel threads share the same virtual memory space).

I don't agree. We did experiments on this back in the early days of Windows 
XP, and the Janus/Ada implementation (cooperative tasks) was several times 
faster than the threaded implementations of other compilers tested. As such, 
we didn't pursue a threaded implementation at that time. (Obviously, we 
didn't anticipate multicore machines at that time.)

While I'm sure that OS threads have less overhead now, they'll always be 
behind since they have to save a lot more stuff during a task switch. Since 
a task switch mainly happens when a task routine is called (a task 
dispatching point in Ada parlance), and we can make sure that little needs 
to be saved in that case (nothing is in registers, for instance), task 
switches could be very fast. (They're not anywhere near as fast as they 
could be for historical reasons.)

(It's the same reason that preemption would be much slower in our 
implementation than normal task switching. That doesn't matter since we 
don't support any priorities; priorities don't really work on multicore 
anyway since tasks generally don't migrate so it is best that programmers 
live without them.)

I've considered looking into building a task supervisor based on 
work-stealing, which would only use a small number of OS threads (probably 
one per core). That would make better use of the unique aspects of Ada 
tasks, but I worry that it wouldn't be compatible with C++ and the like 
(sadly, a business requirement these days).

> As I remember, the user-level thread solution in GNAT had the drawback 
> that if one thread blocked on an OS call, the whole program was blocked.

True, but that's not a real problem in most software (that has to respond in 
human timescales - if you need microsecond response times, forget 
Janus/Ada!). It requires a bit of care to avoid waiting on anything that 
could block for a long time (typically reads of some sort), but most OS 
operations happen fast enough that you can't notice them.

After all, Claw programs work (including those with multiple tasks 
displaying in a window) essentially the same when compiled with Janus/Ada, 
or with GNAT, or with the old Rational compiler. The tasking model doesn't 
matter to Claw. Similarly, our web server and mail filter both work fine 
with the Janus/Ada implementation and I'll soon be porting them to Gnat on 
Linux - I expect them to work fine there, too.

                           Randy.




^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-17 13:06   ` vincent.diemunsch
  2014-12-17 20:31     ` Niklas Holsti
  2014-12-17 21:08     ` Brad Moore
@ 2014-12-17 22:18     ` Randy Brukardt
  2014-12-18  0:56     ` Shark8
  3 siblings, 0 replies; 73+ messages in thread
From: Randy Brukardt @ 2014-12-17 22:18 UTC (permalink / raw)


<vincent.diemunsch@gmail.com> wrote in message 
news:f9828477-a98e-4795-803d-5926aa7a1fdb@googlegroups.com...
...
>I am rather surprised that you made a distinction between creating tasks
>and parallelism. I agree that the goal of parallelism is to increase CPU
>usage and therefore make the program run faster. For me creating tasks is
>the Ada way of implementing parallelism. And it is a sound way of doing it
>since compilers, as far as I know, are not really able to find parallelism
>in a program automatically. Moreover, using things like state machines to
>create parallelism is too complex for a programmer and needs the use of a
>dedicated language. So tasks are fine.

No, tasks are *not* fine. They are far too heavy-weight for the average 
programmer (or even the one that wants to accomplish something fast). And it 
is far too easy to introduce race conditions or deadlocks into them. On top 
of which (as Brad notes), the optimal parallelism structure depends on the 
actual OS/hardware that you run on. You surely don't want to have to encode 
that into your program and redo it every time you find out about a better way 
to work on a particular target.

For a lot of uses, what you want is a parallel block or loop that eliminates 
the implicit sequential behavior. In that case, the compiler can execute the 
code in parallel or sequentially depending on the actual execution 
environment. But to make that easy *and* safe, there have to be restrictions 
on what you can put into such constructs. That's what the group of four are 
looking at (and I would, if I could spare the time).
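
Purely as an illustration of the general shape being discussed (the 
eventual syntax, if any, may well differ, and Process and the two Render 
procedures are just placeholders):

   --  A parallel loop: the compiler is free to split the iterations across cores.
   parallel
   for I in 1 .. 1_000 loop
      Process (I);
   end loop;

   --  A parallel block: the two arms may run concurrently.
   parallel do
      Render_Left_Half;
   and
      Render_Right_Half;
   end do;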

Tasks are good for separating relatively large and loosely related jobs. 
They work great for dealing with web server requests, for instance, because 
the requests don't need to talk to each other and just need to do whatever 
they do. They're not a very good way to split up a single calculation, 
because the granularity is almost certainly going to be too coarse (not 
enough parallelism) or too fine (too much overhead), and they're too 
inflexible to be able to change that on the fly.

                                       Randy.



^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-17 22:08       ` Randy Brukardt
@ 2014-12-17 22:52         ` Björn Lundin
  2014-12-17 23:58           ` Randy Brukardt
  0 siblings, 1 reply; 73+ messages in thread
From: Björn Lundin @ 2014-12-17 22:52 UTC (permalink / raw)


On 2014-12-17 23:08, Randy Brukardt wrote:
> "Niklas Holsti" <niklas.holsti@tidorum.invalid> wrote in message 


> 
>> As I remember, the user-level thread solution in GNAT had the drawback 
>> that if one thread blocked on an OS call, the whole program was blocked.
> 
> True, but that's not a real problem in most software (that has to respond in 
> human timescales - if you need microsecond response times, forget 
> Janus/Ada!). 

That kind of rules out a common pattern we have in communication
processes.
Usually we set up a socket, globally, in a package body.

One task does a blocking select() - thus hanging on it - for, say, 5-30 s.
When a message arrives, it
* logs it in a database table
* acks back to the sender, meaning we own the message now
* notifies another process to treat the message
* goes back to the blocking select

If select() returns a timeout, a check for shutdown is made. The shutdown
flag is set in a PO (protected object) by someone else - the main program.

Another task waits in a select/accept for clients to call it to send
messages - on the same socket - or to shut down.

A third task periodically calls the second task with keep-alive messages,
with checks for shutdown mode.

A client of the package (usually main) may call shutdown, which sets
shutdown mode in the PO and closes the global socket.

Closing the socket causes the receiving task to unblock, read, realize it
is in shutdown mode, and thus exit.
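
In skeleton form it looks roughly like this (Wait_For_Message and the 
three handlers are placeholders standing in for the real socket and 
message code; only the receiving task and the shutdown PO are shown):

   procedure Comm_Skeleton is

      protected Control is
         procedure Request_Shutdown;
         function Shutdown_Requested return Boolean;
      private
         Stop : Boolean := False;
      end Control;

      protected body Control is

         procedure Request_Shutdown is
         begin
            Stop := True;
         end Request_Shutdown;

         function Shutdown_Requested return Boolean is
         begin
            return Stop;
         end Shutdown_Requested;

      end Control;

      --  Placeholders for the real socket and message handling:

      procedure Log_To_Database is null;
      procedure Send_Ack        is null;
      procedure Notify_Worker   is null;

      procedure Wait_For_Message (Timeout : Duration; Received : out Boolean) is
      begin
         delay Timeout;   --  stands in for the blocking select() on the socket
         Received := False;
      end Wait_For_Message;

      task Receiver;

      task body Receiver is
         Got_One : Boolean;
      begin
         loop
            Wait_For_Message (Timeout => 30.0, Received => Got_One);
            exit when Control.Shutdown_Requested;
            if Got_One then
               Log_To_Database;
               Send_Ack;
               Notify_Worker;
            end if;
         end loop;
      end Receiver;

   begin
      delay 5.0;                  --  the real main program does its work here ...
      Control.Request_Shutdown;   --  ... then asks the receiver to stop
   end Comm_Skeleton;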



-- 
--
Björn

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-17 22:52         ` Björn Lundin
@ 2014-12-17 23:58           ` Randy Brukardt
  2014-12-18 10:39             ` Björn Lundin
  0 siblings, 1 reply; 73+ messages in thread
From: Randy Brukardt @ 2014-12-17 23:58 UTC (permalink / raw)


"Björn Lundin" <b.f.lundin@gmail.com> wrote in message 
news:m6t1f1$5th$1@dont-email.me...
> On 2014-12-17 23:08, Randy Brukardt wrote:
>> "Niklas Holsti" <niklas.holsti@tidorum.invalid> wrote in message

>>> As I remember, the user-level thread solution in GNAT had the drawback
>>> that if one thread blocked on an OS call, the whole program was blocked.
>>
>> True, but that's not a real problem in most software (that has to respond 
>> in
>> human timescales - if you need microsecond response times, forget
>> Janus/Ada!).
>
> That kind of rules out a common pattern we have in communication
> processes.

Certainly not: my web and mail servers do plenty of communication!

> Usually we set up a socket, globally, in a package body.
>
> One task does a blocking select() - thus hanging on it - for say 5-30 s.

Here's the problem: you're thinking at much too low a level. The Claw/NC 
sockets libraries abstract socket communication into an I/O model (no such 
thing as "select"!). And the implementation can avoid actual blocking (even 
though at the call level you will see what appears to be blocking).

   Get (My_Socket, Timeout => 30.0, Item => Buffer, Last => Last);

This will appear to block for 30 seconds, but it surely doesn't have to be 
*implemented* that way.

For servers, there is a server object. "Greet" returns an open socket for a 
connection. Again, this appears blocking, but doesn't have to be implemented 
that way.

[Truth-in-advertising notice: The Claw libraries actually do block. I wrote 
an abstraction extension on top of them that doesn't block -- but it uses 
the exact same subprogram profiles, so there's no operational difference.]

I don't see any sense to the other tasks that you have (I realize I don't 
understand the precise problem that you are trying to solve). But it all 
seems WAY too low-level to me; the reason for using such a pattern is that 
you need high speed responsiveness (far faster than human speeds) --  
otherwise sticking with a simple I/O model is much easier to understand for 
maintenance.

On top of which, using the same (Ada) object from two different tasks 
without synchronization is an invalid use of shared variables. Such a 
program is technically erroneous, and as such, it could do anything at all. 
So I'm dubious that your pattern even works on other compilers (regardless 
of the blocking issue). It's unfortunate that Ada doesn't have any static 
checking for such things, because it's all too easy to write something that 
works today but won't work in the future.

                                  Randy.



^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-17 13:06   ` vincent.diemunsch
                       ` (2 preceding siblings ...)
  2014-12-17 22:18     ` Randy Brukardt
@ 2014-12-18  0:56     ` Shark8
  3 siblings, 0 replies; 73+ messages in thread
From: Shark8 @ 2014-12-18  0:56 UTC (permalink / raw)


On 17-Dec-14 06:06, vincent.diemunsch@gmail.com wrote:
> I am sorry to disagree: the very existence of a parallelism library shows
> the inability of current Ada technology to deal directly with parallelism
> inside the Ada language. I really think this is due to the weakness of current compilers,
> but if there are also problems inside the language they should be addressed (like the
> Ravenscar restriction that allowed predictable tasking, or
> special constructs to express parallelism, or "aspects" to indicate that a task
> should be run on a GPU...).

I don't think there's anything keeping, say, NVIDIA from making an Ada 
compiler that automatically throws tasks/subprograms onto the GPU [a la 
CUDA] with an implementation-defined aspect/pragma... in fact, IIRC, 
that's *exactly* what pragmas were intended for in the '83 standard.
^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-17 20:31     ` Niklas Holsti
  2014-12-17 22:08       ` Randy Brukardt
@ 2014-12-18  8:42       ` Dmitry A. Kazakov
  2014-12-18  8:56         ` vincent.diemunsch
  2014-12-18  9:34         ` GNAT and Tasklets Niklas Holsti
  1 sibling, 2 replies; 73+ messages in thread
From: Dmitry A. Kazakov @ 2014-12-18  8:42 UTC (permalink / raw)


On Wed, 17 Dec 2014 22:31:57 +0200, Niklas Holsti wrote:

> As Brad said, another purpose is to separate logical threads 
> of control, and I would add a third purpose, which is to prioritize 
> tasks of different urgencies, for real-time systems.

This is a very important point.

In my branch of work (data/event-driven architectures) a great many things
could be designed much more easily and safely if state machines were
replaced by a logical chain of control (a "task"). That would not mean a
separate physical task behind each one. Actually, in most cases it is just
a few physical tasks and thousands of logical tasks scheduled by events.

If Ada supported this (co-routines), it would greatly simplify I/O, GUI,
and Web designs.
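
To make the point concrete, today each logical task ends up hand-written as
a state machine that a few physical tasks dispatch on events; a minimal
sketch of that style (all names are illustrative):

   type Session_State is (Awaiting_Request, Awaiting_Reply, Done);

   type Session is record
      State : Session_State := Awaiting_Request;
      --  plus whatever the logical task would otherwise keep in local variables
   end record;

   procedure Handle_Event (S : in out Session) is
   begin
      case S.State is
         when Awaiting_Request =>
            --  with a co-routine this would just be a blocking "Get (Request);"
            S.State := Awaiting_Reply;
         when Awaiting_Reply =>
            --  with a co-routine this would simply be the next statement
            S.State := Done;
         when Done =>
            null;
      end case;
   end Handle_Event;

   --  A handful of physical tasks pull events from a queue and call
   --  Handle_Event on the matching Session; a co-routine would let the same
   --  logic be written top to bottom in one body.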

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-17 21:08     ` Brad Moore
@ 2014-12-18  8:47       ` vincent.diemunsch
  2014-12-18 21:58         ` Randy Brukardt
  0 siblings, 1 reply; 73+ messages in thread
From: vincent.diemunsch @ 2014-12-18  8:47 UTC (permalink / raw)


On Wednesday, December 17, 2014 at 22:08:33 UTC+1, Brad Moore wrote:

>> I could understand that a Library level task (i.e. a task declared immediately 
>> in a package that is at lirary level) be mapped to an OS thread, but a 
>> simple local task should definetly not. 
>
> Why not? 

Because a thread is, in my understanding of an OS, an abstraction of a CPU. And
technically a kernel thread is a scarce resource, because it needs to store data
inside the kernel, and it carries significant overhead, because creating it,
destroying it and even using it go through expensive kernel/process transitions.

A local task is a simple way to express parallelism in a part of an algorithm inside a subprogram. It must be lightweight, otherwise it is useless. And that is exactly the problem we encountered, for the current compiler creates heavy kernel threads even for little tasks. Remember, that was also my motivation at the beginning
of this thread: to have a library to do parallelism.

So I have a good answer now, for I think you have solved the problem very well with your library, which I will try to use!

But there remains the question of the Ada language: it is not acceptable that Ada
slowly degenerates into a low-level language, where we need to replace a task abstraction by
- a procedure,
- an access to that procedure,
- the instantiation of a spawning mechanism, taking the access-to-subprogram value.
This is exactly the job of a compiler!
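
To illustrate (a compilable sketch; the Spawn/Wait/Work_Handle names in the
final comment are placeholders, not any real library's API):

   with Ada.Text_IO;

   procedure Two_Ways is

      procedure Part_A is
      begin
         Ada.Text_IO.Put_Line ("part A");
      end Part_A;

      procedure Part_B is
      begin
         Ada.Text_IO.Put_Line ("part B");
      end Part_B;

   begin
      --  The task abstraction: the compiler does the spawning and the waiting.
      declare
         task Do_A;
         task body Do_A is
         begin
            Part_A;
         end Do_A;
      begin
         Part_B;
      end;  --  both parts are complete here

      --  The library style replaces that with an access-to-procedure handed
      --  to some Spawn/Wait pair (placeholder names, not a real API):
      --
      --     Handle : Work_Handle := Spawn (Part_A'Access);
      --     Part_B;
      --     Wait (Handle);
   end Two_Ways;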

So the next step is to integrate the good work you did in your library into a compiler...

Kind regards,

Vincent

 


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-18  8:42       ` Dmitry A. Kazakov
@ 2014-12-18  8:56         ` vincent.diemunsch
  2014-12-18  9:36           ` Dmitry A. Kazakov
  2014-12-18  9:34         ` GNAT and Tasklets Niklas Holsti
  1 sibling, 1 reply; 73+ messages in thread
From: vincent.diemunsch @ 2014-12-18  8:56 UTC (permalink / raw)


On Thursday, December 18, 2014 at 09:41:53 UTC+1, Dmitry A. Kazakov wrote:
> On Wed, 17 Dec 2014 22:31:57 +0200, Niklas Holsti wrote:
> 
> > As Brad said, another purpose is to separate logical threads 
> > of control, and I would add a third purpose, which is to prioritize 
> > tasks of different urgencies, for real-time systems.
> 
> This is a very important point.
> 
> In my branch of work (data/event driven architectures) a great deal of
> things could be designed much easily and safely if state machines were
> replaced by a logical chain of control ("task"). It would not mean a
> separate physical task behind. Actually in most cases it is just few
> physical tasks and thousands of logical tasks scheduled by events.
> 
> If Ada supported this (co-routines) it would greatly simplify I/O, GUI, Web
> designs.
> 
> -- 
> Regards,
> Dmitry A. Kazakov
> http://www.dmitry-kazakov.de

Hello Dmitry,

What is a "physical task" and a "logical task"?
For me:
- a task is a "logical task", because it is a concept of the Ada language.
- a thread is what you call a "physical task", because it is an OS feature.

And if there should be a clear rule inside the Ada language, I would argue for
a simple rule like:
 - tasks (and task types) declared at library level are mapped to kernel threads
 - tasks (and task types) declared locally are mapped to simple jobs spawned onto a
   pool of tasks.

And add the use of pragmas, or aspects, for special cases (GPU...).

Kind regards,

Vincent


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-18  8:42       ` Dmitry A. Kazakov
  2014-12-18  8:56         ` vincent.diemunsch
@ 2014-12-18  9:34         ` Niklas Holsti
  2014-12-18  9:50           ` Dmitry A. Kazakov
  1 sibling, 1 reply; 73+ messages in thread
From: Niklas Holsti @ 2014-12-18  9:34 UTC (permalink / raw)


On 14-12-18 10:42 , Dmitry A. Kazakov wrote:
> On Wed, 17 Dec 2014 22:31:57 +0200, Niklas Holsti wrote:
>
>> As Brad said, another purpose is to separate logical threads
>> of control, and I would add a third purpose, which is to prioritize
>> tasks of different urgencies, for real-time systems.
>
> This is a very important point.
>
> In my branch of work (data/event driven architectures) a great deal of
> things could be designed much easily and safely if state machines were
> replaced by a logical chain of control ("task"). It would not mean a
> separate physical task behind. Actually in most cases it is just few
> physical tasks and thousands of logical tasks scheduled by events.
>
> If Ada supported this (co-routines) it would greatly simplify I/O, GUI, Web
> designs.

The problem is that implementing a co-routine is not much easier/lighter 
than implementing a task, *if* you let the co-routine use subprograms 
and let it "yield" (that is, pass control to some other co-routine) 
inside a nest of subprogram calls.

If you only allow "yield" at the outermost or "main" subprogram of a 
co-routine, you can make the co-routines share the same stack and the 
same "physical task". Of course you must still arrange for separate 
storage of the local variables of the main subprogram of each 
co-routine, but that is not too hard.
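
[Editorial sketch, with invented names.] A hedged illustration of what "yield
only at the outermost level" amounts to in plain Ada today: each co-routine
keeps its "locals" in a record and its resume point in a discrete state, and a
single physical task drives all of them.

   type Step is (Starting, Waiting_For_Event, Finished);

   type Coroutine_State is record
      Where   : Step    := Starting;
      Counter : Natural := 0;  --  a "local variable" that survives a yield
   end record;

   procedure Resume (S : in out Coroutine_State) is
   begin
      case S.Where is
         when Starting =>
            S.Counter := 0;
            S.Where   := Waiting_For_Event;   --  "yield"
         when Waiting_For_Event =>
            S.Counter := S.Counter + 1;
            if S.Counter = 10 then
               S.Where := Finished;
            end if;                            --  otherwise yield again
         when Finished =>
            null;
      end case;
   end Resume;  --  returning to the caller is the "yield"

The scheduler is then an ordinary loop over an array of Coroutine_State
values, which is exactly the state-machine style that language-supported
co-routines would let programmers avoid.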

But if "yield" can occur at any depth in a nest of subprogram calls, 
then each co-routine must have its own stack. Perhaps the overhead of 
co-routine switching could still be less than for task switching, 
because co-routine switching would not be pre-emptive, but IMO that is 
an unimportant optimisation.

So it seems to me that there is a fundamental conflict between the idea 
of light-weight co-routines on the one hand, and the use of procedural 
abstraction of behaviour, on the other hand.

-- 
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
       .      @       .

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-18  8:56         ` vincent.diemunsch
@ 2014-12-18  9:36           ` Dmitry A. Kazakov
  2014-12-18 10:32             ` vincent.diemunsch
  0 siblings, 1 reply; 73+ messages in thread
From: Dmitry A. Kazakov @ 2014-12-18  9:36 UTC (permalink / raw)


On Thu, 18 Dec 2014 00:56:30 -0800 (PST), vincent.diemunsch@gmail.com
wrote:

> On Thursday, December 18, 2014 at 09:41:53 UTC+1, Dmitry A. Kazakov wrote:
>> On Wed, 17 Dec 2014 22:31:57 +0200, Niklas Holsti wrote:
>> 
>>> As Brad said, another purpose is to separate logical threads 
>>> of control, and I would add a third purpose, which is to prioritize 
>>> tasks of different urgencies, for real-time systems.
>> 
>> This is a very important point.
>> 
>> In my branch of work (data/event driven architectures) a great deal of
>> things could be designed much easily and safely if state machines were
>> replaced by a logical chain of control ("task"). It would not mean a
>> separate physical task behind. Actually in most cases it is just few
>> physical tasks and thousands of logical tasks scheduled by events.
>> 
>> If Ada supported this (co-routines) it would greatly simplify I/O, GUI, Web
>> designs.
>> 
> What is a "physical task" and a "logical task" ?
> For me :
> - a task is a "logical task", because it is a concept of the Ada langage.
> - a thread is what you call a "physical task", because it is an OS feature.

RM mingles task and thread:

"The execution of an Ada program consists of the execution of one or more
tasks. Each task represents a separate thread of control that proceeds
independently and concurrently between the points where it interacts with
other tasks." RM 9(1)

Your "thread" is an implementation of a task.

> And if there should be a clear rule inside the Ada langage I would argue for
> a simple rule like :
>  - tasks (and task types) declared at Library level are mapped to kernel threads

Why? One is about visibility, scope, and elaboration; the other is about
scheduling. Scheduling by the OS is IMO not related to scope.

>  - tasks (and task types) declared localy are mapped to simple jobs spawned to a
>    pool of tasks.
> 
> And add the use of pragma, or aspects, for special cases (GPU...)

I don't think a pragma would be enough. It is all about scheduling. Physical
tasks are scheduled by timer interrupts or other steady event sources,
which makes task interactions, such as rendezvous, more or less reliable. 

The case of co-routine scheduling is different because the driving events
(new data are here, a GUI event is signaled, an HTTP multipart chunk is
received) are no longer *independent*.

If scheduling is not independent then the tasks are not. Note the RM's
definition, which mentions independence. It is not only about things like
task-owned objects and state. It is also the concept that a task can be
preempted without a change of behavior. This is less true for co-routines,
which are less of a "task" in this sense.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-18  9:34         ` GNAT and Tasklets Niklas Holsti
@ 2014-12-18  9:50           ` Dmitry A. Kazakov
  0 siblings, 0 replies; 73+ messages in thread
From: Dmitry A. Kazakov @ 2014-12-18  9:50 UTC (permalink / raw)


On Thu, 18 Dec 2014 11:34:18 +0200, Niklas Holsti wrote:

> On 14-12-18 10:42 , Dmitry A. Kazakov wrote:
>> On Wed, 17 Dec 2014 22:31:57 +0200, Niklas Holsti wrote:
>>
>>> As Brad said, another purpose is to separate logical threads
>>> of control, and I would add a third purpose, which is to prioritize
>>> tasks of different urgencies, for real-time systems.
>>
>> This is a very important point.
>>
>> In my branch of work (data/event driven architectures) a great deal of
>> things could be designed much easily and safely if state machines were
>> replaced by a logical chain of control ("task"). It would not mean a
>> separate physical task behind. Actually in most cases it is just few
>> physical tasks and thousands of logical tasks scheduled by events.
>>
>> If Ada supported this (co-routines) it would greatly simplify I/O, GUI, Web
>> designs.
> 
> The problem is that implementing a co-routine is not much easier/lighter 
> than implementing a task, *if* you let the co-routine use subprograms 
> and let it "yield" (that is, pass control to some other co-routine) 
> inside a nest of subprogram calls.
> 
> If you only allow "yield" at the outermost or "main" subprogram of a 
> co-routine, you can make the co-routines share the same stack and the 
> same "physical task". Of course you must still arrange for separate 
> storage of the local variables of the main subprogram of each 
> co-routine, but that is not too hard.
> 
> But if "yield" can occur at any depth in a nest of subprogram calls, 
> then each co-routine must have its own stack. Perhaps the overhead of 
> co-routine switching could still be less than for task switching, 
> because co-routine switching would not be pre-emptive, but IMO that is 
> an unimportant optimisation.

My concern is OS limitations and the overhead for the scheduler caused by
merely having a thread. A co-routine will never appear in the list of
active tasks.

> So it seems to me that there is a fundamental conflict between the idea 
> of light-weight co-routines on the one hand, and the use of procedural 
> abstraction of behaviour, on the other hand.

Maybe, but in earlier days the same was said about processes. Why should
anybody introduce threads? Still, threads are lighter than processes and
have become even lighter by comparison.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-18  9:36           ` Dmitry A. Kazakov
@ 2014-12-18 10:32             ` vincent.diemunsch
  2014-12-18 11:19               ` Dmitry A. Kazakov
  2014-12-18 22:33               ` Randy Brukardt
  0 siblings, 2 replies; 73+ messages in thread
From: vincent.diemunsch @ 2014-12-18 10:32 UTC (permalink / raw)



> >> 
> > What is a "physical task" and a "logical task" ?
> > For me :
> > - a task is a "logical task", because it is a concept of the Ada langage.
> > - a thread is what you call a "physical task", because it is an OS feature.
> 
> RM mingles task and thread:
> 
> "The execution of an Ada program consists of the execution of one or more
> tasks. Each task represents a separate thread of control that proceeds
> independently and concurrently between the points where it interacts with
> other tasks." RM 9(1)
> 
> Your "thread" is an implementation of a task.
> 

I agree that a thread is an implementation of the logical concept of a task, and
that is how I understand the above sentence of the RM that you quote. But there are different kinds of threads:

- a kernel thread is created and scheduled by the kernel, with great power and also great cost (kernel memory and a high overhead for context switches). But it allows the use of different CPUs (multicore) and concurrent execution even by time slicing. Basically this is an abstraction of a CPU.
 
- a user-level thread is created and scheduled inside the process by the runtime; it is much faster and lighter, but less powerful: typically time slicing is not supported and threads need to cooperate more. User-level threads are often multiplexed onto one or more kernel threads.

- finally, jobs, executed on a pool of workers (kernel threads), provide very lightweight threads.

All are threads, each class corresponding to a level of complexity of an Ada task: library-level permanent tasks are typically kernel threads, library-level dynamic tasks created on demand are typically user-level threads, and little local tasks, needed for parallel computation, are often simple jobs.

And if the compiler chooses only one implementation, like always a kernel thread for a task, which is the case now, I have to say that it is not a mature implementation of tasking.

Finally, I really hope that the new version of the language will keep Ada simple and add "intelligence" to compilers, rather than add different new features such as:
- tasks for kernel threads
- coroutines or tasklets for user-level threads
- jobs for lightweight threads,
because this would be a major conceptual regression. I believe that aspects on tasks could be an inelegant but decent way to solve the problem.

Regards,

Vincent


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-17 23:58           ` Randy Brukardt
@ 2014-12-18 10:39             ` Björn Lundin
  2014-12-18 23:01               ` Randy Brukardt
  0 siblings, 1 reply; 73+ messages in thread
From: Björn Lundin @ 2014-12-18 10:39 UTC (permalink / raw)


On 2014-12-18 00:58, Randy Brukardt wrote:
>> That kind of rules out a common pattern we have in communication
>> processes.
> 
> Certainly not: my web and mail servers do plenty of communication!

Sorry, I should have been clearer.
I meant a common pattern we have at work.


>> Usually we set up a socket, globally, in a package body.
>>
>> One task does a blocking select() - thus hanging on it - for say 5-30 s.
> 
> Here's the problem, you're thinking at much too low of a level. The Claw/NC 
> sockets libraries abstract sockets comminucation into an I/O model (no such 
> thing as "select"!). And the implementation can avoid actual blocking (even 
> though at the call level you will see what appears to be blocking).
> 
>                      Get (My_Socket, Timeout => 30.0, Item => Buffer, Last 
> => Last);
> 
> This will appear to block for 30 seconds, but it surely doesn't have to be 
> *implemented* that way.

But then you get to poll, no?
Well, that will do it, of course.


> 
> I don't see any sense to the other tasks that you have (I realize I don't 
> understand the precise problem that you are trying to solve). But it all 
> seems WAY too low-level to me; 

Not really.
Task 1 just receives data, secures it, and notifies others.
Task 2 serializes _writes_ to the socket.
POs can't be involved in potentially blocking stuff.

Task 3 handles keep-alive telegrams. Keep-alives are usually
mandatory at the protocol level, so in order not to jam that logic into
the writer task, it is taken out into a separate one.
Easier to maintain.

The shutdown handling and select() are just there to be able to send
a message (on a named pipe) that the process is to go down
and thus the tasks must shut down. They will not if they are blocking on read().
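
[Editorial sketch, not the poster's actual code.] A hedged outline of the
three-task pattern described above: one task blocks reading the socket, one
task serializes writes through a rendezvous, and one task sends periodic
keep-alives. Connection, Read, Write, Notify_Business_Logic and
Keep_Alive_Telegram all stand in for whatever socket binding and protocol are
really in use; shutdown handling is left out.

   task Reader;              --  blocks on the socket, secures data, notifies
   task Keep_Alive_Sender;   --  ticks every few seconds

   task Writer is            --  all writes funnel through this entry
      entry Send (Item : String);
   end Writer;

   task body Writer is
   begin
      loop
         select
            accept Send (Item : String) do
               Write (Connection, Item);          --  assumed binding call
            end Send;
         or
            terminate;
         end select;
      end loop;
   end Writer;

   task body Keep_Alive_Sender is
   begin
      loop
         delay 5.0;
         Writer.Send (Keep_Alive_Telegram);       --  assumed constant
      end loop;
   end Keep_Alive_Sender;

   task body Reader is
      Buffer : String (1 .. 1024);
      Last   : Natural;
   begin
      loop
         Read (Connection, Buffer, Last);         --  assumed blocking call
         Notify_Business_Logic (Buffer (1 .. Last));
      end loop;
   end Reader;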



>the reason for using such a pattern is that 
> you need high speed responsiveness (far faster than human speeds) --  

No. The reason is to have the communication logic for, say, a
crane or conveyor system in one process, and the business logic
for that mechanical device in another process.
(I'm talking about a warehouse control system here,
with 20-50 daemon processes.)

Having several I/O processes makes it easy to change the
way of communication and still have the same business logic.
(Say an installation from 1992 talks Siemens 3964r/k512 over a serial
line (it's an old standard protocol) and they want to switch PLCs
to ones that talk TCP/IP with another transmission protocol.
If the messages within the protocol are kept intact,
then it's just a matter of a new I/O process.
The rest of the system is untouched, including the business logic processes.
I just did this this spring.)


> otherwise sticking with a simple I/O model is much easier to understand for 
> maintenance.

Yes, but it also has to work.


> 
> On top of which, using the same (Ada) object from two different tasks 
> without synchronization is an invalid use of shared variables. Such a 
> program is technically erroneous, and as such, it could do anything at all. 

Hmm, is it?

I got this from stackoverflow.

<http://stackoverflow.com/questions/13021796/simultaneously-read-and-write-on-the-same-socket-in-c-or-c>

"You don't have to worry about it. One thread reading and one thread
writing will work as you expect. Sockets are full duplex, so you can
read while you write and vice-versa. You'd have to worry if you had
multiple writers, but this is not the case."


And that is basically what I heard before.
So it does work, and works well.
But of course, it may be illegal anyway.


> So I'm dubious that your pattern even works on other compilers (regardless 
> of the blocking issue). 

It did work well with ObjectAda too, but that was 10 years ago.

AlsysAda for AIX did not like this. It had, as Janus has,
tasking in its runtime, and anything blocking would block everything.


> It's unfortunate that Ada doesn't have any static 
> checking for such things, because it's all too easy to write something that 
> works today but won't work in the future.
> 

Yes. This makes me think that access to it should be wrapped in a PO.

--
Björn


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-18 10:32             ` vincent.diemunsch
@ 2014-12-18 11:19               ` Dmitry A. Kazakov
  2014-12-18 12:09                 ` vincent.diemunsch
  2014-12-18 22:33               ` Randy Brukardt
  1 sibling, 1 reply; 73+ messages in thread
From: Dmitry A. Kazakov @ 2014-12-18 11:19 UTC (permalink / raw)


On Thu, 18 Dec 2014 02:32:44 -0800 (PST), vincent.diemunsch@gmail.com
wrote:

> Finaly, I really hope that the new version of the langage will keep Ada
> simple and add "intelligence" in compilers and not add different new
> features with :
> - tasks for kernel threads
> - coroutines or tasklets for user-level threads
> - jobs for lightweight threads,
> because this would be a major conceptual regression. I believe that
> aspects on tasks could be an inelegant but decent way to solve the
> problem.

Ideally yes, but if you consider the implications, you will have to
reconsider protected actions and rendezvous. E.g. a protected action
interlocking co-routines driven by the same thread does not need locking. Do
you want to take advantage of this? Or, is a rendezvous between them a
deadlock? Statically resolved? What about exception propagation on both
ends? Forbidden? etc.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-18 11:19               ` Dmitry A. Kazakov
@ 2014-12-18 12:09                 ` vincent.diemunsch
  2014-12-18 13:07                   ` Dmitry A. Kazakov
  2014-12-19 10:40                   ` Georg Bauhaus
  0 siblings, 2 replies; 73+ messages in thread
From: vincent.diemunsch @ 2014-12-18 12:09 UTC (permalink / raw)


On Thursday, December 18, 2014 at 12:19:50 UTC+1, Dmitry A. Kazakov wrote:
> On Thu, 18 Dec 2014 02:32:44 -0800 (PST), vincent.diemunsch@gmail.com
> wrote:
> 
> > Finaly, I really hope that the new version of the langage will keep Ada
> > simple and add "intelligence" in compilers and not add different new
> > features with :
> > - tasks for kernel threads
> > - coroutines or tasklets for user-level threads
> > - jobs for lightweight threads,
> > because this would be a major conceptual regression. I believe that
> > aspects on tasks could be an inelegant but decent way to solve the
> > problem.
> 
> Ideally yes, but if you consider the implications, you will have to
> reconsider protected actions and rendezvous. E.g. a protected action
> interlocking co-routines driven by the same thread do not need locking. Do
> you want to get advantage from this? Or, a rendezvous between them is a
> deadlock? Statically resolved? What with exception propagation on both
> ends? Forbidden? etc.
> 
> -- 
> Regards,
> Dmitry A. Kazakov
> http://www.dmitry-kazakov.de

Yes, this is not a trivial issue. It needs real study.

It would be interesting to do a little survey on existing code using tasking.
I have the impression that only tasks at library level do rendezvous and protected-object synchronisation, and that local tasks, most of the time, are limited to a rendezvous with their parent task at the beginning or at the end. So maybe we should put restrictions on local tasks, so that we can map them to jobs.

This situation can be compared to the Ravenscar restrictions, which allow
fast, predictable tasking.

Kind regards,

Vincent

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-18 12:09                 ` vincent.diemunsch
@ 2014-12-18 13:07                   ` Dmitry A. Kazakov
  2014-12-19 10:40                   ` Georg Bauhaus
  1 sibling, 0 replies; 73+ messages in thread
From: Dmitry A. Kazakov @ 2014-12-18 13:07 UTC (permalink / raw)


On Thu, 18 Dec 2014 04:09:22 -0800 (PST), vincent.diemunsch@gmail.com
wrote:

> It would be interesting to do a little survey on existing code using tasking.

Don't forget the huge amount of code implemented without tasking because
there are no co-routines. Furthermore, if designed with co-routines, this
code would have looked totally different. Ergo, I think a lot of
theoretical work should be done as well.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-18  8:47       ` vincent.diemunsch
@ 2014-12-18 21:58         ` Randy Brukardt
  0 siblings, 0 replies; 73+ messages in thread
From: Randy Brukardt @ 2014-12-18 21:58 UTC (permalink / raw)



<vincent.diemunsch@gmail.com> wrote in message 
news:0b33cce3-7b1b-47e0-9d6d-48fafcfb025c@googlegroups.com...
>On Wednesday, December 17, 2014 at 22:08:33 UTC+1, Brad Moore wrote:

>>> I could understand that a Library level task (i.e. a task declared 
>>> immediately
>>> in a package that is at library level) be mapped to an OS thread, but a
>>> simple local task should definitely not.
>>
>> Why not?
>
>Because a thread is, in my understanding of an OS, an abstraction of a CPU. 
>And
>technically a kernel thread is a scarce resource, because it needs to store 
>data
>inside the kernel, with a strong overhead because creating it, destroying 
>it and
>even using it is done through expensive kernel/process transitions.

Which matches almost exactly the language requirements of an Ada task. These 
are fairly heavyweight structures, requiring their own separate stacks, 
exception management, finalization management, termination management, etc. 
Initializing and destroying these things is not trivial.

>A local task is a simple way to express parallelism in a part of an 
>algorithm inside a
>subprogram.

Maybe for you, but not in Ada.

>It must be lightweight otherwise it is useless.

Surely not; there are plenty of uses that do not involve explicitly creating 
parallelism. "Lightweight Ada task" is an oxymoron. On top of which, the 
compiler should be creating and managing lightweight parallelism, not the 
programmer (the compiler/runtime know a lot more about the execution 
environment than the programmer can or should). If you have to write much 
beyond a keyword, we've already failed.

>And it is exactly the problem we encountered, for current compiler creates
>heavy kernel threads even for little tasks.

There are no "little tasks" in Ada. You want something else, which Ada 
doesn't have at the moment.

> Remember it was also my motivation
>at the beginning of this post : to have a Library to do parallelism.
> So I have a good response now, for I think you have very well
>solved the problem with your Library, that I will try to use !

>But it remains the question of the Ada language: it is not acceptable that
>Ada degenerates slowly into a low-level language, where we need to
>replace a task abstraction by
>- a procedure
>- an access to that procedure
>- the instantiation of a spawning mechanism, taking the access to 
>subprogram.
>This is exactly the job of a compiler!

Yes, of course. The reason Brad built his library was to gain experience with 
what can reasonably be done and not done with the language as it stands. 
And now he and others are working on language proposals for the future, for 
parallel loops and blocks among other things, in order to make these things a 
first-class part of the language.

But language change is a long and slow process, in part because we have to 
get it right the first time. 'cause we're stuck with whatever we come up 
with forever (note the other discussion on anonymous access types).

                                     Randy.


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-18 10:32             ` vincent.diemunsch
  2014-12-18 11:19               ` Dmitry A. Kazakov
@ 2014-12-18 22:33               ` Randy Brukardt
  2014-12-19 13:01                 ` GNAT and Tasklets vincent.diemunsch
  1 sibling, 1 reply; 73+ messages in thread
From: Randy Brukardt @ 2014-12-18 22:33 UTC (permalink / raw)


<vincent.diemunsch@gmail.com> wrote in message 
news:9e1d2b9f-1b97-4679-8eec-5ba75f3c357c@googlegroups.com...
...
>And if the compiler chose only one implementation, like always a kernel 
>thread
>for a task, which is the case now, I have to say that it is not a mature
>implementation of tasking.

In which case I think you will have an approximately 0% chance of ever 
encountering your idea of a "mature implementation of tasking" in an Ada 
compiler. Ada tasking is far too complex to implement it more than once, 
especially with all of the potential race conditions and deadlocks that 
mixed implementations would face.

There is no such thing as a "lightweight Ada task". The syntax is heavy, the 
semantics is heavy (separate stack, exception handling, finalization, 
termination), and as a corollary, the implementation is heavy.

>Finally, I really hope that the new version of the language will keep Ada 
>simple ...

That ship sailed with Ada 95, if it ever was true at all.

Besides, you confuse the appearance of simple (that is simple to use, simple 
to learn) with simple in language terms.

    My_Vector(I) := 10;

looks simple, but if My_Vector is a vector container, what actually happens 
is anything but simple. But so what? Only people that have to write/debug 
containers ought to care; we've made usage simple at the cost of making 
creation harder.

Similarly,
    for I in parallel 1 .. 10 loop
       ...
    end loop;

looks and feels simple, even though what would have to happen under the 
covers to implement that is anything but simple.

>and add "intelligence" in compilers and not add different new features with 
>:
>- tasks for kernel threads
>- coroutines or tasklets for user-level threads
>- jobs for lightweight threads,
>because this would be a major conceptual regression. I believe that aspects
>on tasks could be an inelegant but decent way to solve the problem.

You are still thinking way too low-level. Creating a parallel program should 
be as easy as creating a sequential one. There should (almost) be no special 
constructs at all. Ideally, most programs would be made up of parallel loops 
and blocks, and there would be hardly any tasks (mainly to be used for 
separate conceptual things).

Writing a task correctly is a nearly impossible job, especially as it isn't 
possible to statically eliminate deadlocks and race conditions. It's not 
something for the "normal" programmer to do. We surely don't want to put 
Ada's future in that box -- it would eventually have no future (especially 
if very-many-core machines become as common as some predict).

In any case, there won't be any major upgrade of the Ada language for at 
least another 5 years. The upcoming Corrigendum (bug fix) has few new 
features and those are all tied to specific bugs in the Ada 2012. So I 
wouldn't wait for language changes to bail you out; Brad's libraries are the 
best option for now.

Also please note that language enhancements occur through a process of 
consensus. Most of the ARG has to agree on a direction before it gets into 
the language standard. You should have noted by now that pretty much 
everyone who has answered here has disagreed with your position. It's highly 
unlikely that the ARG would invent language changes that the majority of the 
Ada community think are the wrong direction. That's especially true as Brad 
and I are on the ARG and are working on these language changes. The ARG has 
already voted to continue work in the direction that the "Group of Four" 
proposed (which is similar to what I would propose if it was up to me). I'd 
be very surprised if we made a u-turn at this point (especially as the 
proposals fit well into the existing framework of Ada and most are useful 
outside of the narrow area of parallel operations). This will be a big job 
and it will be a long time before we get to the real nitty-gritty of these 
proposals (the problems will be in the details, not the broad outline --  
anyone can do that!). Anyway, you can swim against the current if you like, 
but you most likely aren't going to get anywhere doing that.

                                        Randy.



^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-18 10:39             ` Björn Lundin
@ 2014-12-18 23:01               ` Randy Brukardt
  2014-12-19  8:39                 ` Natasha Kerensikova
                                   ` (2 more replies)
  0 siblings, 3 replies; 73+ messages in thread
From: Randy Brukardt @ 2014-12-18 23:01 UTC (permalink / raw)




"Björn Lundin" <b.f.lundin@gmail.com> wrote in message 
news:m6uast$1rh$1@dont-email.me...
> On 2014-12-18 00:58, Randy Brukardt wrote:
...
>>> Usually we set up a socket, globally, in a package body.
>>>
>>> One task does a blocking select() - thus hanging on it - for say 5-30 s.
>>
>> Here's the problem, you're thinking at much too low of a level. The 
>> Claw/NC
>> sockets libraries abstract sockets comminucation into an I/O model (no 
>> such
>> thing as "select"!). And the implementation can avoid actual blocking 
>> (even
>> though at the call level you will see what appears to be blocking).
>>
>>                      Get (My_Socket, Timeout => 30.0, Item => Buffer, 
>> Last => Last);
>>
>> This will appear to block for 30 seconds, but it surely doesn't have to 
>> be
>> *implemented* that way.
>
> But then you get to poll, no?
> well that will do it of course.

(Careful) polling is better, IMHO, no matter what you're doing. Timeouts 
were very poorly implemented on Windows (perhaps that's been fixed), so 
things could block for long periods even with short timeouts. (DNS queries 
for instance, always seem to block for 30 seconds no matter what timeout is 
used.)

For something like sockets, which are relatively slow compared to the 
computer, polling has no visible effect on performance and ensures that 
nothing will get starved.

If you're using a high-level communication library (not some low-level 
sockets bindings - yuck!), how that's implemented is invisible to the 
client. They shouldn't care.

>> I don't see any sense to the other tasks that you have (I realize I don't
>> understand the precise problem that you are trying to solve). But it all
>> seems WAY too low-level to me;
>
> Not really
> Task 1 is for just receiving data,secure it, and notify others.
> Task 2 is to serialize _writes_ to the socket.
> PO's cant be involved in potentially blocking stuff.

As I noted later, there should only be one task reading/writing the socket 
at a time. Unless, of course, the Ada binding is documented to do the 
locking needed to support that. That's true of *any* Ada object -- without 
documentation and/or code to manage it, it should be assumed to only work 
with one task at a time.

That's of course especially true with a proper higher-level communications 
library, since there almost certainly will be instance data that needs to be 
protected against multi-access.

>>the reason for using such a pattern is that
>> you need high speed responsiveness (far faster than human speeds) --
>
> No. The reason is to have communication logic for say a
> crane or conveyor system in one process, and the business logic
> for that mechanical device in another process.
> (I'm talking about a warehouse control system here,
> with 20-50 daemon processes)

That sounds confused to me. One would expect a communication abstraction 
that all the business logic calls as needed. How that abstraction is 
organized should not be visible to the business logic at all. The 
communication abstraction would just be a set of packages to send and 
receive various kinds of messages.
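
[Editorial sketch, invented names.] A hedged example of the kind of
abstraction meant here: a small package spec that the business logic calls,
with the transport (serial line, TCP/IP, named pipe) hidden entirely in the
body.

   package Crane_Link is

      type Command_Kind is (Move, Pick, Drop);

      procedure Send (What : Command_Kind; Position : Natural);
      --  Queues a command for the device; how it reaches the PLC
      --  is a private matter of the body.

      procedure Get_Status (Ready : out Boolean; Fault : out Boolean);
      --  Reports the latest status received from the device.

   end Crane_Link;

Swapping protocols then means writing a new body for Crane_Link, while every
caller stays unchanged.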

But we're getting way off-topic here, you're not going to redesign your 
application because I think it sounds confused (and I probably don't know 
enough to make an informed opinion anyway).

 > Having several I/O-processes makes it easy to change the
> way of communication, and still have the same business logic.
> (say an installation from 1992 talks Siemens 3964r/k512 over a serial
> line (its a old standard protocol) and they want to switch plc's
> which talks tcp/ip with another transmission protocol.
> If keeping the messages within the protocol intact,
> then it's just matter of a new I/O process.
> The rest of the system is untouched including business logic processes.
> (I just did this this spring)

I still think this sounds way too low-level. (I shouldn't talk, I tend to do 
whatever works rather than writing a proper abstraction. It's one of the 
reasons I haven't been in a hurry to open-source Janus/Ada; it was designed 
by a bunch of college students that didn't know better. :-)

>> otherwise sticking with a simple I/O model is much easier to understand 
>> for
>> maintenance.
>
> Yes, but it also has to work.

Surely.

>> On top of which, using the same (Ada) object from two different tasks
>> without synchronization is an invalid use of shared variables. Such a
>> program is technically erroneous, and as such, it could do anything at 
>> all.
>
> Hmm, is it?
>
> I got this from stackoverflow.
>
> <http://stackoverflow.com/questions/13021796/simultaneously-read-and-write-on-the-same-socket-in-c-or-c>
>
> "You don't have to worry about it. One thread reading and one thread
> writing will work as you expect. Sockets are full duplex, so you can
> read while you write and vice-versa. You'd have to worry if you had
> multiple writers, but this is not the case."

Sure, the underlying C might be task safe (although I wouldn't trust it 
myself). But I hope you're using a nice Ada abstraction of sockets and not 
just calling raw C routines to do stuff. (If you are, you're hopeless and 
this whole discussion is irrelevant.) And that latter abstraction is only 
safe if it is programmed to be safe (that is, uses locks as needed) or is 
wrapped in a PO to make it safe. Specifically, Claw sockets and its bastard 
child NC_Sockets are only task safe to the extent that separate tasks can 
operate on separate sockets objects.

> And that is basically what I heard before.
> So it does work, and works well.
> But of course, it may be illegal anyway.
>
>> So I'm dubious that your pattern even works on other compilers 
>> (regardless
>> of the blocking issue).
>
> It did work well with ObjectAda too, but that was 10 years ago.
>
> AlsysAda for AIX did not like this. It had, as Janus has,
> tasking in its runtime, and anything blocking would block all.

Right. And even on the systems where it worked, you're dependent on your 
sockets library not doing any buffering (NC_Sockets can), not having any 
local data other than a socket handle, and the like. An updated sockets 
library might change that behavior.

>> It's unfortunate that Ada doesn't have any static
>> checking for such things, because it's all too easy to write something 
>> that
>> works today but won't work in the future.
>>
>
> Yes, This makes me think that access to it should be wrapped in a PO

I think so. It certainly would be safer that way (it would surely work 
rather than "probably work"). Remember that you can't really find race 
conditions by testing -- they can occur quite rarely (two tasks have to be 
doing exactly the iffy thing at exactly the same time) and usually they'll 
leave no trace other than a malfunctioning system. You could have some and 
not even know it. (Indeed, I'd be surprised if there were more than a 
handful of Ada programs that contain tasks that don't have any race 
conditions. One of the goals of the parallel work is to be able to construct 
programs in a subset of Ada that cannot have any race conditions, and get an 
compile-time error if that is not true.)

                                     Randy.


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-18 23:01               ` Randy Brukardt
@ 2014-12-19  8:39                 ` Natasha Kerensikova
  2014-12-19 23:39                   ` Randy Brukardt
  2014-12-19  8:59                 ` Dmitry A. Kazakov
  2014-12-19 11:56                 ` Björn Lundin
  2 siblings, 1 reply; 73+ messages in thread
From: Natasha Kerensikova @ 2014-12-19  8:39 UTC (permalink / raw)


Hello,

On 2014-12-18, Randy Brukardt <randy@rrsoftware.com> wrote:
> [...]                           That's true of *any* Ada object -- without 
> documentation and/or code to manage it, it should be assumed to only work 
> with one task at a time.

That's a bit at odds with a recent (or not so recent) discussion, where
I couldn't find where the RM says that concurrent read-only
basic operations (like access dereference, array indexing, etc.) are
task-safe, and the answer was that everything should work as expected
unless explicitly noted otherwise, even when concurrency is involved.

For example, I wrote Constant_Indefinite_Ordered_Maps (available at
https://github.com/faelys/natools/blob/trunk/src/natools-constant_indefinite_ordered_maps.ads )
which is an ordered map based on a binary search in a sorted array.
It is meant to be task-safe without any code to manage it, basing its
safety solely under the assumption that concurrent access dereference
and concurrent array indexing are safe.

I hope it's not a brittle assumption...
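
[Editorial sketch, not taken from the natools sources.] A hedged,
self-contained illustration of that assumption: several tasks may run
read-only binary searches over the same constant sorted array with no
locking, because every operation involved is a read and concurrent reads are
not conflicting actions.

   type Key_Array is array (Positive range <>) of Integer;

   Sorted : constant Key_Array := (1, 3, 7, 9, 12, 20, 31);

   function Contains (Key : Integer) return Boolean is
      Lo : Positive := Sorted'First;
      Hi : Natural  := Sorted'Last;
   begin
      while Lo <= Hi loop
         declare
            Mid : constant Positive := (Lo + Hi) / 2;
         begin
            if Sorted (Mid) = Key then
               return True;
            elsif Sorted (Mid) < Key then
               Lo := Mid + 1;
            else
               Hi := Mid - 1;
            end if;
         end;
      end loop;
      return False;
   end Contains;

   --  Any number of tasks may call Contains at the same time; none of them
   --  ever writes to Sorted, and Lo and Hi are per-call locals on each
   --  task's own stack.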


Natasha


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-18 23:01               ` Randy Brukardt
  2014-12-19  8:39                 ` Natasha Kerensikova
@ 2014-12-19  8:59                 ` Dmitry A. Kazakov
  2014-12-19 11:56                 ` Björn Lundin
  2 siblings, 0 replies; 73+ messages in thread
From: Dmitry A. Kazakov @ 2014-12-19  8:59 UTC (permalink / raw)


On Thu, 18 Dec 2014 17:01:18 -0600, Randy Brukardt wrote:

> As I noted later, there should only be one task reading/writing the socket 
> at a time.

That might be OK for a server application, but clients must read and write
from the socket concurrently. In a half-duplex scenario (which client-server
is), it is highly advisable that one of the peers reads and writes
simultaneously in order to prevent deadlock. Consider a case where a peer
tries to write while the other is reading. This may happen in a half-duplex
exchange on protocol errors.

For full-duplex exchanges you simply must do it concurrently. There are
lots of full-duplex protocols over sockets. 

> That's of course especially true with a proper higher-level communications 
> library, since there almost certainly will be instance data that needs to be 
> protected against multi-access.

That is not a problem, usually. The state manipulated when reading and the
state manipulated when writing are usually well insulated.

>> Having several I/O-processes makes it easy to change the
>> way of communication, and still have the same business logic.
>> (say an installation from 1992 talks Siemens 3964r/k512 over a serial
>> line (its a old standard protocol) and they want to switch plc's
>> which talks tcp/ip with another transmission protocol.
>> If keeping the messages within the protocol intact,
>> then it's just matter of a new I/O process.
>> The rest of the system is untouched including business logic processes.
>> (I just did this this spring)
> 
> I still think this sounds way too low-level.

A typical socket architecture is

1. Low-level I/O
2. Protocol stack(s)
3. Application logic

Any of 1-3 can be handled by tasks taken from some worker pool.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-18 12:09                 ` vincent.diemunsch
  2014-12-18 13:07                   ` Dmitry A. Kazakov
@ 2014-12-19 10:40                   ` Georg Bauhaus
  2014-12-19 11:01                     ` Dmitry A. Kazakov
  1 sibling, 1 reply; 73+ messages in thread
From: Georg Bauhaus @ 2014-12-19 10:40 UTC (permalink / raw)


<vincent.diemunsch@gmail.com> wrote:

> It would be interesting to do a little survey on existing code using tasking.
> I have the impression that only tasks at Library level do rendez-vous and
> protected object synchronisation, and local tasks, most of the time, are
> limited to a rendez-vous with their parent task at the beginning or at
> the end. So maybe we should put restrictions on local tasks, so that we
> can map them to jobs.

Won't the parallel loop feature be providing
for this kind of mini job:

for Row of Matrix loop in parallel
   Remove_fractional (Row);
end loop;

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-19 10:40                   ` Georg Bauhaus
@ 2014-12-19 11:01                     ` Dmitry A. Kazakov
  2014-12-19 16:42                       ` Brad Moore
  2014-12-19 23:51                       ` Randy Brukardt
  0 siblings, 2 replies; 73+ messages in thread
From: Dmitry A. Kazakov @ 2014-12-19 11:01 UTC (permalink / raw)


On Fri, 19 Dec 2014 10:40:03 +0000 (UTC), Georg Bauhaus wrote:

> <vincent.diemunsch@gmail.com> wrote:
> 
>> It would be interesting to do a little survey on existing code using tasking.
>> I have the impression that only tasks at Library level do rendez-vous and
>> protected object synchronisation, and local tasks, most of the time, are
>> limited to a rendez-vous with their parent task at the beginning or at
>> the end. So maybe we should put restrictions on local tasks, so that we
>> can map them to jobs.
> 
> Won't the parallel loop feature be providing
> for this kind of mini job:

The parallel loop is useless for practical purposes. It puzzles me why people
are wasting time on this.

They could start with logical operations instead:

    X and Y

is already parallel by default. AFAIK nothing in RM forbids concurrent
evaluation of X and Y if they are independent. Same with Ada arithmetic.
E.g.

   A + B + C + D

So far no compiler evaluates arguments concurrently or vectorizes
sub-expressions like:

   A
   B  +
   C      +
   D  +

Because if they did, the result would run slower than sequential code. It
is simply not worth the effort with existing machine architectures.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-18 23:01               ` Randy Brukardt
  2014-12-19  8:39                 ` Natasha Kerensikova
  2014-12-19  8:59                 ` Dmitry A. Kazakov
@ 2014-12-19 11:56                 ` Björn Lundin
  2014-12-20  0:02                   ` Randy Brukardt
  2 siblings, 1 reply; 73+ messages in thread
From: Björn Lundin @ 2014-12-19 11:56 UTC (permalink / raw)


On 2014-12-19 00:01, Randy Brukardt wrote:
> For something like sockets, which are relatively slow compared to the 
> computer, polling has no visible effect on performance and ensures that 
> nothing will get starved.
> 
> If you're using a high-level communication library (not some low-level 
> sockets bindings - yuck!), how that's implemented is invisible to the 
> client. They shouldn't care.

Well, it is the yuck alternative.
A homebrew binding from 1996 with far too much
exposed in the spec, making change difficult.
Too much backwards incompatibility.
It resembles a bad version of AdaSockets.



>>
>> No. The reason is to have communication logic for say a
>> crane or conveyor system in one process, and the business logic
>> for that mechanical device in another process.
>> (I'm talking about a warehouse control system here,
>> with 20-50 daemon processes)
> 
> That sounds confused to me. One would expect a communication abstraction 
> that all the business logic calls as needed. 

No, you want to separate the business logic from the actual protocol.
You could do it with libraries, but this design uses a set of specialized
processes with IPC via named pipes. So the I/O process gets commands
from the business process; the I/O process then converts these commands
to the actual protocol, and is responsible for securing data until acked.

With this design, changing just the I/O is fairly easy and maintainable:

  business-process <-IPC-> I/O <-whatever proto-> PLC

The internal IPC protocol does not change.
The business process does not care about socket/serial/USB;
it is the isolated I/O process' responsibility.
Change the I/O process, and you are done with the protocol-to-PLC upgrade.


>How that abstraction is 
> organized should not be visible to the business logic at all. 

and it is not, as per the above.


> But we're getting way off-topic here, you're not going to redesign your 
> application because I think it sounds confused 

correct, let's leave it here.

> I still think this sounds way too low-level. 

I/O in our industry _is_ low-level.


> But I hope you're using a nice Ada abstraction of sockets and not 
> just calling raw C routines to do stuff. (If you are, you're hopeless and 
> this whole discussion is irrelevant.) 

So, just call me hopeless.
However, I do know about Murphy's law:
"if it _can_ happen it _will_ happen".

This design runs at a site with 50,000+ tote movements per day, and has
for many years. A tote move will need, say, 10 messages back and forth
to cranes and conveyors for going out of the store and back in again.

This gives about half a million messages a day on a busy site.
Murphy should have kicked in here a long time ago.

This design does work very well.


> Right. And even on the systems where it worked, you're dependent that your 
> sockets library doesn't do any buffering (NC_Sockets can), doesn't have any 
> local data other than a sockets handle, and the like. A updated sockets 
> library might change that behavior.

correct, but homebrew (yuck) also means full control. No buffering.

--
Björn

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-18 22:33               ` Randy Brukardt
@ 2014-12-19 13:01                 ` vincent.diemunsch
  2014-12-19 17:46                   ` GNAT and Tasklets Brad Moore
                                     ` (2 more replies)
  0 siblings, 3 replies; 73+ messages in thread
From: vincent.diemunsch @ 2014-12-19 13:01 UTC (permalink / raw)


Hello Randy,

Thank you for your response. I find it a little bit confusing in fact.

> There is no such thing as a "lightweight Ada task". The syntax is heavy, the 
> semantics is heavy (separate stack, exception handling, finalization, 
> termination), and as a correlary, the implementation is heavy.

Have you heard of Ravenscar? The idea is to restrict tasking to a simple subset
that is both predictable, and therefore easy to get right, and light to implement.
I am pretty sure that local tasks could be restricted a lot and yet be very useful.
And tasks with very limited synchronization, without rendezvous, are easy to
manage.

> >Finaly, I really hope that the new version of the langage will keep Ada 
> >simple ...
> That ship sailed with Ada 95, if it ever was true at all.

So what? Will we continue to make it more and more complex? Isn't it possible
to stop that trend and try to do better?
 
> Besides, you confuse the appearance of simple (that is simple to use, simple 
> to learn) with simple in language terms.

When I say "simple", I simply means "simple in langage terms". For me a compiler
is a tool that allows us to generate machine code, without the difficulty of using
assembly langage. That's why it brings simplicity. And I know well that a compiler
is complex : parsing, AST expansion, type evaluation, code generation, all this 
requires a lot of theatrical work. I only have the feeling that compilers are still
lacking theoretical grounding upon tasking, and therefore are not able to deal
with it in a useful way.

> Similarly,
>     for I in parallel 1 .. 10 loop
>        ...
>     end loop;
> 
> looks and feels simple, even though what would have to happen under the 
> covers to implement that is anything but simple.

We agree on that point :-).

> You are still thinking way too low-level. Creating a parallel program should 
> be as easy as creating a sequential one. There should (almost) be no special 
> constructs at all. Ideally, most programs would be made up of parallel loops 
> and blocks, and there would be hardly any tasks (mainly to be used for 
> separate conceptual things).

That's funny! Really, are you serious? Ada created the task abstraction
to deal with parallelism, and for me it is an abstraction that can be implemented
in different ways, depending on the compiler, the OS, but also the way it is used 
inside the program. First you are telling me that it is too difficult for a compiler to
implement this abstraction in any other way than as a heavy OS thread, because it 
would be too complex to automatically find the simple cases allowing reduced tasking,
and because a runtime library using mixed tasking would be too difficult to write.
And I strongly disagree with both arguments: 
1. the Ravenscar profile is a counter-example, and using aspects or pragmas should 
allow us to make useful restrictions on local tasks;
2. the Paraffin library shows clearly that we can have mixed tasking, for it can be
used in parallel with tasks. 
And then you claim in all seriousness that a compiler should be able to find 
parallelism in the program by itself! That seems to me a major contradiction!

No, creating a parallel program is far more complex than creating a sequential
one, and until we have a compiler smart enough to do this, I prefer to rely on
explicit tasking.
 
> Writing a task correctly is a nearly impossible job, especially as it isn't 
> possible to statically eliminate deadlocks and race conditions. It's not 
> something for the "normal" programmer to do. We surely don't want to put 
> Ada's future in that box -- it would eventually have no future (especially 
> if very-many-core machines become as common as some predict).

See Linear Temporal Logic and Model Checking based on that logic. It is
the foundation of concurrency. But for parallel computation, we really
don't need such complexity.

> In any case, there won't be any major upgrade of the Ada language for at 
> least another 5 years. The upcoming Corrigendum (bug fix) has few new 
> features and those are all tied to specific bugs in the Ada 2012. So I 
> wouldn't wait for language changes to bail you out; Brad's libraries are the 
> best option for now.

Fine.

> Also please note that language enhancements occur through a process of 
> consensus. Most of the ARG has to agree on a direction before it gets into 
> the language standard.

That is why I am taking the time to discuss this on this forum.

> You should have noted by now that pretty much everyone who has answered 
> here has disagreed with your position. 

No, I haven't noticed. I had the feeling that at least Dmitry and Brad were
receptive to some of my points. But even if that were the case, it doesn't mean that 
I am wrong :-) Montesquieu said "The less you think, the more people agree with you" :-).

Kind regards,

Vincent


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-19 11:01                     ` Dmitry A. Kazakov
@ 2014-12-19 16:42                       ` Brad Moore
  2014-12-19 17:28                         ` Dmitry A. Kazakov
  2014-12-19 23:51                       ` Randy Brukardt
  1 sibling, 1 reply; 73+ messages in thread
From: Brad Moore @ 2014-12-19 16:42 UTC (permalink / raw)


On 14-12-19 04:01 AM, Dmitry A. Kazakov wrote:
> On Fri, 19 Dec 2014 10:40:03 +0000 (UTC), Georg Bauhaus wrote:
>
>> <vincent.diemunsch@gmail.com> wrote:
>>
>>> It would be interesting to do a little survey on existing code using tasking.
>>> I have the impression that only tasks at Library level do rendez-vous and
>>> protected object synchronisation, and local tasks, most of the time, are
>>> limited to a rendez-vous with their parent task at the beginning or at
>>> the end. So maybe we should put restrictions on local tasks, so that we
>>> can map them to jobs.
>>
>> Won't the parallel loop feature be providing
>> for this kind of mini job:
>
> Parallel loop is useless for practical purposes. It wonders me why people
> wasting time with this.

For multicore, the idea is to make better use of the cores when doing so 
will improve performance. To the best of my knowledge, all frameworks and 
language enhancements in other languages have some concept of a 
parallel loop (e.g. OpenMP, Cilk, TBB).

Just because a loop can be parallelized doesn't mean it should be; the 
gains from the parallelism need to be greater than the overhead introduced 
to inject the parallelism.

For example, a loop to do image processing such as darkening the color 
of all pixels in a large image might benefit from parallelism.

Also, the number of iterations does not need to be large to see 
parallelism benefits.

     for I in parallel 1 .. 10 loop
         Lengthy_Processing_Of_Image (I);
     end loop;


>
> They could start with logical operations instead:
>
>      X and Y
>
> is already parallel by default. AFAIK nothing in RM forbids concurrent
> evaluation of X and Y if they are independent. Same with Ada arithmetic.
> E.g.
>
>     A + B + C + D
>
> So far no compiler evaluates arguments concurrently or vectorizes
> sub-expressions like:
>
>     A
>     B  +
>     C      +
>     D  +
>
> Because if they did the result would work slower than sequential code. It
> simply does not worth the efforts with existing machine architectures.
>

The compiler should be able to make the decision to parallelize these if 
there is any benefit to doing so. Likely the decision would be to *not* 
parallelize these, if A, B, C, and D are objects of some elementary type.

But it depends on the datatype of A, B, C, and D.

Also A, B, C, and D might be function calls, not simple data references, 
and these calls might involve lengthy processing, in which case, adding 
parallelism might make sense.

Or, if these are objects of a Big Number library with arbitrary 
precision, you might have a rational number with pages of digits each 
for numerator and denominator. Performing math on such values might very 
well benefit from parallelism.

We looked at being able to explicitly state parallelism for subprograms 
(parallel subprograms), but found that the syntax was messy, and there were 
too many other problems.

We are currently thinking a parallel block syntax better provides this 
capability, if the programmer wants to explicitly indicate where 
parallelism is desired.

eg.

      Left, Right, Total : Integer := 0;

      parallel
           Left := A + B;
      and
           Right := C + D;
      end parallel;

      Total := Left + Right;

or possibly allow some automatic reduction


    Total : Integer with Reduce := 0;

    parallel
         Total := A + B;
    and
         Total := C + D;
    end parallel;

Here, two "tasklets" would get created that can execute in parallel, 
each with their local instance of the Total result (i.e. thread local 
storage), and at the end of the parallel block, the two results are 
reduced into one and assigned to the actual Total.

The reduction operation and identity value used to initialize the local 
instances of the Total could be defaulted by the compiler for simple 
data types, but could be explicitly stated if desired.

eg.

    Total : Integer with Reduce => (Reducer => "+", Identity => 0) := 0;

    parallel
         Total := A + B;
    and
         Total := C + D;
    end parallel;
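
[Editorial sketch, assuming A, B, C and D are visible Integer values or
function calls.] A hedged illustration of what that reduction amounts to when
spelled out by hand in today's Ada: each worker keeps its own partial total,
and the combination happens only after both workers have terminated.

    function Parallel_Total return Integer is
       Left, Right : Integer := 0;
    begin
       declare
          task First_Half;
          task Second_Half;

          task body First_Half is
          begin
             Left := A + B;     --  local instance of the result
          end First_Half;

          task body Second_Half is
          begin
             Right := C + D;    --  local instance of the result
          end Second_Half;
       begin
          null;                 --  wait here for both workers to finish
       end;
       return Left + Right;     --  the "reduction"
    end Parallel_Total;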

Brad

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-19 16:42                       ` Brad Moore
@ 2014-12-19 17:28                         ` Dmitry A. Kazakov
  2014-12-19 18:35                           ` Brad Moore
                                             ` (3 more replies)
  0 siblings, 4 replies; 73+ messages in thread
From: Dmitry A. Kazakov @ 2014-12-19 17:28 UTC (permalink / raw)


On Fri, 19 Dec 2014 09:42:52 -0700, Brad Moore wrote:

> On 14-12-19 04:01 AM, Dmitry A. Kazakov wrote:
>> On Fri, 19 Dec 2014 10:40:03 +0000 (UTC), Georg Bauhaus wrote:
>>
>>> <vincent.diemunsch@gmail.com> wrote:
>>>
>>>> It would be interesting to do a little survey on existing code using tasking.
>>>> I have the impression that only tasks at Library level do rendez-vous and
>>>> protected object synchronisation, and local tasks, most of the time, are
>>>> limited to a rendez-vous with their parent task at the beginning or at
>>>> the end. So maybe we should put restrictions on local tasks, so that we
>>>> can map them to jobs.
>>>
>>> Won't the parallel loop feature be providing
>>> for this kind of mini job:
>>
>> Parallel loop is useless for practical purposes. It wonders me why people
>> wasting time with this.
> 
> For multicore, the idea is to make better use of the cores when doing so 
> will improve performance.

I don't think multi-core would bring any advantage. Starting / activating /
reusing / feeding / re-synchronizing threads is too expensive.

Parallel loops could be useful on some massively vectorized architectures
in some very specialized form, or on architectures with a practically
infinite number of cores (e.g. molecular computers). Anyway, feeding threads
with inputs and gathering outputs may still mean more overhead than any
gain.

> Also, the number of iterations does not need to be large to see 
> parallelism benefits.
> 
>      for I in parallel 1 .. 10 loop
>          Lengthy_Processing_Of_Image (I);
>      end loop;

Certainly not for image processing. When processing an image by segments,
you need to sew the segments together along their borders, in practically
all algorithms. That makes parallelization far too complicated to be
handled by such a blunt instrument as a parallel loop.

>> They could start with logical operations instead:
>>
>>      X and Y
>>
>> is already parallel by default. AFAIK nothing in RM forbids concurrent
>> evaluation of X and Y if they are independent. Same with Ada arithmetic.
>> E.g.
>>
>>     A + B + C + D
>>
>> So far no compiler evaluates arguments concurrently or vectorizes
>> sub-expressions like:
>>
>>     A
>>     B  +
>>     C      +
>>     D  +
>>
>> Because if they did the result would work slower than sequential code. It
>> simply does not worth the efforts with existing machine architectures.
> 
> The compiler should be able to make the decision to parallelize these if 
> there is any benefit to doing so. Likely the decision would be to *not* 
> parallelize these, if A, B, C, and D are objects of some elementary type..
> 
> But it depends on the datatype of A, B, C, and D.
> 
> Also A, B, C, and D might be function calls, not simple data references, 
> and these calls might involve lengthy processing, in which case, adding 
> parallelism might make sense.

Yes, provided the language has the means to describe side effects of such
computations in a way that makes the decision safe.

> Or, if these are objects of a Big Number library with infinite 
> precision, you might have an irrational number with pages of digits each 
> for numerator and denominator. Performing math on such values might very 
> well benefit from parallelism.

It won't, because a big number library will use the heap or a shared part
of the stack, which will require interlocking, and thus will either be marked
as "impure", so that the compiler will not try to parallelize, or else will
force the compiler to use locks, which will effectively kill parallelism.

> We looked at being able to explicitly state parallelism for subprograms, 
> (parallel subprograms), but found that syntax was messy, and there were 
> too many other problems.
> 
> We are currently thinking a parallel block syntax better provides this 
> capability, if the programmer wants to explicitly indicate where 
> parallelism is desired.
> 
> eg.
> 
>       Left, Right, Total : Integer := 0;
> 
>       parallel
>            Left := A + B;
>       and
>            Right := C + D;
>       end parallel;
> 
>       Total := Left + Right;
> 
> or possibly allow some automatic reduction
> 
> 
>     Total : Integer with Reduce := 0;
> 
>     parallel
>          Total := A + B;
>     and
>          Total := C + D;
>     end parallel;
> 
> Here, two "tasklets" would get created that can execute in parallel, 
> each with their local instance of the Total result (i.e. thread local 
> storage), and at the end of the parallel block, the two results are 
> reduced into one and assigned to the actual Total.
> 
> The reduction operation and identity value used to initialize the local 
> instances of the Total could be defaulted by the compiler for simple 
> data types, but could be explicitly stated if desired.
> 
> eg.
> 
>     Total : Integer with Reduce => (Reducer => "+", Identity => 0) := 0;
> 
>     parallel
>          Total := A + B;
>     and
>          Total := C + D;
>     end parallel;

The semantics is not clear. What happens if:

   parallel
      Total := Total + 1;
      Total := A + B;
   and
      Total := C + D;
   end parallel;

and of course the question of exceptions raised within concurrent paths.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-19 13:01                 ` GNAT and Tasklets vincent.diemunsch
@ 2014-12-19 17:46                   ` Brad Moore
  2014-12-20  0:39                   ` GNAT and Tasklets Peter Chapin
  2014-12-20  0:58                   ` GNAT and Tasklets Randy Brukardt
  2 siblings, 0 replies; 73+ messages in thread
From: Brad Moore @ 2014-12-19 17:46 UTC (permalink / raw)


On 14-12-19 06:01 AM, vincent.diemunsch@gmail.com wrote:
> Hello Randy,
>
> Thank you for your response. I find it a little bit confusing in fact.
>
>> There is no such thing as a "lightweight Ada task". The syntax is heavy, the
>> semantics is heavy (separate stack, exception handling, finalization,
>> termination), and as a correlary, the implementation is heavy.
>
> Have you heard of Ravenscar ?

I presume this is tongue in cheek, or perhaps you were unaware that 
Randy is the editor for the RM standard work. I can assure you that 
Randy would be well familiar with Ravenscar. He would have had to type 
that word many times, let alone discuss it and understand it in great 
detail at ARG meetings. :-)

> The idea is to restrict tasking to a simple subset
> that is both predictable, therefore easy to set up right, and light to implement.
> I am pretty sure, that local tasks could be restricted a lot and yet be very useful.
> And tasks with very limited synchronization, without rendez-vous, are easy to
> manage.

I think both Randy and I would say that while tasks in Ravenscar might 
involve a simpler run-time model than the general case, they can still 
be considered far too heavy a mechanism to use where each "tasklet" is 
given its own task. That's a key point. If the goal of this is speed, 
then ultimately you want to use the lightest-weight construct possible, 
so that the overhead of introducing parallelism is as small 
as possible. This means more things can be parallelized. Tasks have too 
many semantics that you cannot just eliminate with restrictions. It 
doesn't mean that an implementation can't use tasks (or OS threads) as a 
basis for implementing parallelism, because it certainly can work as 
evidenced in Paraffin, but we don't want to mandate in the standard that 
the underlying mechanism be tasks, if better alternatives exist.

In Ravenscar, the simplicity helps the real-time analysis, but a lot of 
the semantics and overhead associated with tasks are still present even 
in Ravenscar. Ravenscar and real-time are not about having faster tasks; 
they are about being able to statically guarantee that tasks will meet 
their deadlines.
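
For reference, pragma Profile (Ravenscar) is defined in RM D.13 as being 
equivalent to a set of scheduling policies plus a list of restrictions, 
several of which bear directly on the local-task idea:

     pragma Profile (Ravenscar);
     --  includes, among other things:
     --    pragma Task_Dispatching_Policy (FIFO_Within_Priorities);
     --    pragma Locking_Policy (Ceiling_Locking);
     --    pragma Restrictions (No_Task_Hierarchy,   --  no nested/local tasks
     --                         No_Task_Allocators,
     --                         No_Task_Termination,
     --                         No_Relative_Delay, ...);

In particular, No_Task_Hierarchy means every task depends directly on the 
environment task, so the local tasks being discussed here aren't even legal 
under Ravenscar; the profile is not a path to making them cheap.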

>
>>> Finaly, I really hope that the new version of the langage will keep Ada
>>> simple ...
>> That ship sailed with Ada 95, if it ever was true at all.
>
> So what ? We will continue to make it more and more complex ? Isn't it possible
> to stop that trend and try to do better ?

Complexity is in the eye of the beholder. Adding parallelism support, for 
example, makes the language more complex, but might make adding 
parallelism simpler for the programmer who wants to do that sort of 
thing. To the compiler writer, though, it likely involves more than a 
fair bit of work.

>
>> Besides, you confuse the appearance of simple (that is simple to use, simple
>> to learn) with simple in language terms.
>
> When I say "simple", I simply means "simple in langage terms". For me a compiler
> is a tool that allows us to generate machine code, without the difficulty of using
> assembly langage. That's why it brings simplicity. And I know well that a compiler
> is complex : parsing, AST expansion, type evaluation, code generation, all this
> requires a lot of theatrical work. I only have the feeling that compilers are still
> lacking theoretical grounding upon tasking, and therefore are not able to deal
> with it in a useful way.
>
>> Similarly,
>>      for I in parallel 1 .. 10 loop
>>         ...
>>      end loop;
>>
>> looks and feels simple, even though what would have to happen under the
>> covers to implement that is anything but simple.
>
> We agree on that point :-).
>
>> You are still thinking way too low-level. Creating a parallel program should
>> be as easy as creating a sequential one. There should (almost) be no special
>> constructs at all. Ideally, most programs would be made up of parallel loops
>> and blocks, and there would be hardly any tasks (mainly to be used for
>> separate conceptual things).
>
> That's funny ! Really are you serious ? Ada have created the task abstraction
> to deal with parallelism,

Tasks were designed to support concurrency, which is broader than 
parallelism. They are useful for coarse grained parallelism, but too 
low-level for fine grained parallelism, such as parallel loops, etc.


> and for me it is an abstraction, that can be implemented
> by different ways, depending on the compiler, the OS but also the way it is used
> inside the program.

This certainly is true, but all the ways one implements tasks need to 
support the semantics described in the RM. Randy will also agree because 
he has actually written an Ada compiler, and his implementation is quite 
different from, say, the GNAT compiler.


> First you are telling me that it is too difficult for a compiler to
> implement this abstraction in an other way than a heavy OS thread, because it
> would be to complex to automatically find simple cases allowing reduce tasking,
> and because a runtime library using mixed tasking would be to difficult to write.
> And strongly disagree to both arguments :
> 1. the Ravenscar profile is a counter example, and using aspects or pragma should
> allow us to make useful restrictions on local tasks
> 2. the Paraffin library shows clearly that we can have mixed tasking, for it can be
> used in parallel to tasks.
> And after, that you pretend seriously that a compiler should be able to find by its
> own parallelism in the program ! It seems to me a major contradiction !

If the compiler has enough static knowledge of the program, for example 
that there are no data races or blocking operations, and it knows how 
much executable code is being parallelized, then the compiler is likely 
better than the human at deciding where and how parallelism should occur 
in many or most cases.


Brad
>
> No, creating a parallel program is far more complex than creating a sequential
> one, and until we have a compiler so smart to do this, I prefer to rely on
> explicit tasking.
>
>> Writing a task correctly is a nearly impossible job, especially as it isn't
>> possible to statically eliminate deadlocks and race conditions. It's not
>> something for the "normal" programmer to do. We surely don't want to put
>> Ada's future in that box -- it would eventually have no future (especially
>> if very-many-core machines become as common as some predict).
>
> See Linear Temporal Logic and Model Checking based on that logic. It is
> the foundation of concurrency. But to make parallel computation, we really
> don't need such complexity.
>
>> In any case, there won't be any major upgrade of the Ada language for at
>> least another 5 years. The upcoming Corrigendum (bug fix) has few new
>> features and those are all tied to specific bugs in the Ada 2012. So I
>> wouldn't wait for language changes to bail you out; Brad's libraries are the
>> best option for now.
>
> Fine.
>
>> Also please note that language enhancements occur through a process of
>> consensus. Most of the ARG has to agree on a direction before it gets into
>> the language standard.
>
> That is why I am taking time to discuss on this forum.
>
>> You should have noted by now that pretty much everyone who has answered
>> here has disagreed with your position.
>
> No, I haven't noticed. I had the feeling that at least Dmitry and Brad were
> receptive to some of my points. But even if it was the case, that doesn't mean that
> I went wrong :-) Montesquieu said "The less you think, the more people agree with you" :-).
>
> Kind regards,
>
> Vincent
>


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-19 17:28                         ` Dmitry A. Kazakov
@ 2014-12-19 18:35                           ` Brad Moore
  2014-12-19 20:37                             ` Dmitry A. Kazakov
  2014-12-20 16:49                             ` Dennis Lee Bieber
  2014-12-19 19:43                           ` Peter Chapin
                                             ` (2 subsequent siblings)
  3 siblings, 2 replies; 73+ messages in thread
From: Brad Moore @ 2014-12-19 18:35 UTC (permalink / raw)


On 14-12-19 10:28 AM, Dmitry A. Kazakov wrote:
> On Fri, 19 Dec 2014 09:42:52 -0700, Brad Moore wrote:
>
>> On 14-12-19 04:01 AM, Dmitry A. Kazakov wrote:
>>> On Fri, 19 Dec 2014 10:40:03 +0000 (UTC), Georg Bauhaus wrote:
>>>
>>>> <vincent.diemunsch@gmail.com> wrote:
>>>>
>>>>> It would be interesting to do a little survey on existing code using tasking.
>>>>> I have the impression that only tasks at Library level do rendez-vous and
>>>>> protected object synchronisation, and local tasks, most of the time, are
>>>>> limited to a rendez-vous with their parent task at the beginning or at
>>>>> the end. So maybe we should put restrictions on local tasks, so that we
>>>>> can map them to jobs.
>>>>
>>>> Won't the parallel loop feature be providing
>>>> for this kind of mini job:
>>>
>>> Parallel loop is useless for practical purposes. It wonders me why people
>>> wasting time with this.
>>
>> For multicore, the idea is to make better use of the cores when doing so
>> will improve performance.
>
> I don't think multi-core would bring any advantage. Starting / activating /
> reusing / feeding / re-synchronizing threads is too expensive.

Yes, there is overhead, but if the amount of overhead is small relative 
to the work being performed, and the work can be divided up between 
multiple cores, then there obviously is a performance advantage.
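
As a purely illustrative calculation (numbers invented for the sake of 
argument): if each iteration of the loop body takes 10 ms and handing a 
chunk of work to a core costs on the order of 50 microseconds, then 1000 
iterations spread over 4 cores finish in roughly 2.5 s instead of 10 s, 
and the overhead is noise. If each iteration takes only a microsecond and 
the work has to be handed out iteration by iteration, the same overhead 
swamps any gain and the sequential loop wins.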


>
> Parallel loops could be useful on some massively vectorized architectures
> in some very specialized form, or on architectures with practically
> infinite number of cores (e.g. molecular computers). Anyway feeding threads
> with inputs and gathering outputs may still mean more overhead than any
> gain.
>
>> Also, the number of iterations does not need to be large to see
>> parallelism benefits.
>>
>>       for I in parallel 1 .. 10 loop
>>           Lengthy_Processing_Of_Image (I);
>>       end loop;
>
> Certainly not for image processing. In image processing when doing it by
> segments, you need to sew the segments along their borders, practically in
> all algorithms. That makes parallelization far more complicated to be
> handled by such a blunt thing as parallel loop.


If you have special hardware, such as vectorization support, that is 
another form of parallelism. Not all platforms have such hardware 
support though. A compiler might decide to take advantage of both 
hardware and software parallelism for a particular problem.

The example here was meant to be a trivial one, showing manipulation of a 
large data structure such as a bit array. Another example might be solving 
a large matrix of differential equations.

>
>>> They could start with logical operations instead:
>>>
>>>       X and Y
>>>
>>> is already parallel by default. AFAIK nothing in RM forbids concurrent
>>> evaluation of X and Y if they are independent. Same with Ada arithmetic.
>>> E.g.
>>>
>>>      A + B + C + D
>>>
>>> So far no compiler evaluates arguments concurrently or vectorizes
>>> sub-expressions like:
>>>
>>>      A
>>>      B  +
>>>      C      +
>>>      D  +
>>>
>>> Because if they did the result would work slower than sequential code. It
>>> simply does not worth the efforts with existing machine architectures.
>>
>> The compiler should be able to make the decision to parallelize these if
>> there is any benefit to doing so. Likely the decision would be to *not*
>> parallelize these, if A, B, C, and D are objects of some elementary type..
>>
>> But it depends on the datatype of A, B, C, and D.
>>
>> Also A, B, C, and D might be function calls, not simple data references,
>> and these calls might involve lengthy processing, in which case, adding
>> parallelism might make sense.
>
> Yes, provided the language has means to describe side effects of such
> computations in a way making the decision safe.
>
>> Or, if these are objects of a Big Number library with infinite
>> precision, you might have an irrational number with pages of digits each
>> for numerator and denominator. Performing math on such values might very
>> well benefit from parallelism.
>
> It won't, because a big number library will use the heap or a shared part
> of the stack which will require interlocking and thus will either be marked
> as "impure", so that the compiler will not try to parallelize, or else will
> make the compiler to use locks, which will effectively kill parallelism.

A while back I wrote an example tackling the problem associated with the 
first computer program (the one written by Ada Lovelace), which is to 
generate Bernoulli numbers. I originally wanted to use an algorithm that 
closely matched the processing of Babbage's Analytical Engine, but found 
that algorithm couldn't be parallelized, since the results of each 
iteration depended on the results from previous iterations.

However, I did find another algorithm to generate Bernoulli numbers that 
could be parallelized.

For that implementation, I wrote a Big Number library, and found indeed 
that the parallelism did work, and was beneficial.

At the time I believe I was just using the default memory pool, which 
would have had the interlocking you mention.

If I were to try again, I would try to use a pool such as one of the 
Deepend storage pools, where each worker thread has its own local pool, 
and thus no interlocking is needed. I would expect to see even more 
performance gains using this approach.


>
>> We looked at being able to explicitly state parallelism for subprograms,
>> (parallel subprograms), but found that syntax was messy, and there were
>> too many other problems.
>>
>> We are currently thinking a parallel block syntax better provides this
>> capability, if the programmer wants to explicitly indicate where
>> parallelism is desired.
>>
>> eg.
>>
>>        Left, Right, Total : Integer := 0;
>>
>>        parallel
>>             Left := A + B;
>>        and
>>             Right := C + D;
>>        end parallel;
>>
>>        Total := Left + Right;
>>
>> or possibly allow some automatic reduction
>>
>>
>>      Total : Integer with Reduce := 0;
>>
>>      parallel
>>           Total := A + B;
>>      and
>>           Total := C + D;
>>      end parallel;
>>
>> Here, two "tasklets" would get created that can execute in parallel,
>> each with their local instance of the Total result (i.e. thread local
>> storage), and at the end of the parallel block, the two results are
>> reduced into one and assigned to the actual Total.
>>
>> The reduction operation and identity value used to initialize the local
>> instances of the Total could be defaulted by the compiler for simple
>> data types, but could be explicitly stated if desired.
>>
>> eg.
>>
>>      Total : Integer with Reduce => (Reducer => "+", Identity => 0) := 0;
>>
>>      parallel
>>           Total := A + B;
>>      and
>>           Total := C + D;
>>      end parallel;
>
> The semantics is not clear. What happens if:
>
>     parallel
>        Total := Total + 1;
>        Total := A + B;
>     and
>        Total := C + D;
>     end parallel;
>
> and of course the question of exceptions raised within concurrent paths.
>


I haven't listed all the semantics but for the questions you ask,
each arm of the parallel block is a separate thread of execution (which 
we have been calling a tasklet).

Each tasklet starts off with its own local declaration of Total, 
initialized to 0, which is the Identity value for the reduction.

So, for the top Total, you end up with {Top_}Total := 1 + A + B;
for the bottom Total, you end up with {Bottom_}Total := C + D;

Then during the reduction phase, those two results get reduced using the 
reduction operation, which in this case is "+".

So the end result is Total = 1 + A + B + C + D;

As for exceptions, we are thinking that execution does not continue past 
the parallel block until all branches have finished executing. If 
exceptions are raised in multiple branches of the parallel block, then 
only one of those exceptions would be selected, and only that one exception 
would get propagated outside the block.
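
Roughly, the intent is this shape (a sketch of mine; F, G, and Handle are 
just placeholders):

     begin
        parallel
           Left := F (N);
        and
           Right := G (N);
        end parallel;
     exception
        when E : others =>
           --  The block does not complete until both arms have finished;
           --  if both arms raise, only one occurrence reaches this handler.
           Handle (E);
     end;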

Brad


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-19 17:28                         ` Dmitry A. Kazakov
  2014-12-19 18:35                           ` Brad Moore
@ 2014-12-19 19:43                           ` Peter Chapin
  2014-12-19 20:45                           ` Georg Bauhaus
  2014-12-19 23:55                           ` Randy Brukardt
  3 siblings, 0 replies; 73+ messages in thread
From: Peter Chapin @ 2014-12-19 19:43 UTC (permalink / raw)


On Fri, 19 Dec 2014, Dmitry A. Kazakov wrote:

>> For multicore, the idea is to make better use of the cores when doing 
>> so will improve performance.
>
> I don't think multi-core would bring any advantage. Starting / 
> activating / reusing / feeding / re-synchronizing threads is too 
> expensive.
>
> Parallel loops could be useful on some massively vectorized 
> architectures in some very specialized form, or on architectures with 
> practically infinite number of cores (e.g. molecular computers). Anyway 
> feeding threads with inputs and gathering outputs may still mean more 
> overhead than any gain.

I've written some numerical programs using OpenMP, and in one particularly 
dramatic case I achieved a speed-up of nearly 4 (about 3.98) on a quad-core 
system by marking a single critical loop as parallel. I made no other 
changes. Literally, it was a one-line change to my sequential program.

The program was not particularly odd, nor was the hardware I was using. So 
there is definitely a benefit to be had in some cases with parallel loops.

Peter

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-19 18:35                           ` Brad Moore
@ 2014-12-19 20:37                             ` Dmitry A. Kazakov
  2014-12-20  1:05                               ` Randy Brukardt
  2014-12-20 16:49                             ` Dennis Lee Bieber
  1 sibling, 1 reply; 73+ messages in thread
From: Dmitry A. Kazakov @ 2014-12-19 20:37 UTC (permalink / raw)


On Fri, 19 Dec 2014 11:35:05 -0700, Brad Moore wrote:

> I haven't listed all the semantics but for the questions you ask,
> each arm of the parallel block is a separate thread of execution (which 
> we have been calling a tasklet).
> 
> Each tasklet starts off with its own local declaration of Total, 
> initialized to 0, which is the Identity value for the reduction.
> 
> So, for the top Total, you end up with {Top_}Total := 1 + A + B;
> for the bottom Total, you end up with {Bottom_}Total := C + D;
> 
> Then during the reduction phase, those two results get reduced using the 
> reduction operation, which in this case is "+".
> 
> So the end result is Total = 1 + A + B + C + D;

I think that the block should have explicit parameters, e.g. Total must be
an in-out parameter of the block. The syntax should be similar to the
selective accept. Each arm must also have parameters, and only those and the
parameters of the block must be visible within an arm. E.g. A, B must be
parameters. Nothing else should be visible.

> As for exceptions, we are thinking that execution does not continue past 
> the parallel block until all branches have finished executing. If 
> exceptions are raised in multiple branches of the parallel block, then 
> only of those exceptions would be selected, and only that one exception 
> would get propagated outside the block.

In this model, an Exception_Occurrence should be an out-parameter. Each arm
must convert exceptions to an occurrence and channel it into that parameter;
the reduction rule will then take over and select one to propagate or ignore.
If an exception is unhandled, it is Program_Error.

------------------------
As a practical example of the concept I suggest this one: a parallel block
used for a spawned process. The process has the standard input pipe (to
write) and the standard output and error pipes (to read). The block has
3 arms: one writes the input, the other two read the output and error.

When the process completes, the block ends. If executed on a single-CPU
machine, the arms get switched from one to another when they block on pipe I/O.
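
Sketched in the proposed syntax (the pipe objects and their Write_All /
Read_All operations are invented here just to show the shape):

   parallel
      Write_All (Input_Pipe, Request);       -- feed the child's standard input
   and
      Read_All (Output_Pipe, Reply);         -- drain its standard output
   and
      Read_All (Error_Pipe, Diagnostics);    -- drain its standard error
   end parallel;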

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-19 17:28                         ` Dmitry A. Kazakov
  2014-12-19 18:35                           ` Brad Moore
  2014-12-19 19:43                           ` Peter Chapin
@ 2014-12-19 20:45                           ` Georg Bauhaus
  2014-12-19 20:56                             ` Dmitry A. Kazakov
  2014-12-19 23:55                           ` Randy Brukardt
  3 siblings, 1 reply; 73+ messages in thread
From: Georg Bauhaus @ 2014-12-19 20:45 UTC (permalink / raw)


"Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> wrote:
> On Fri, 19 Dec 2014 09:42:52 -0700, Brad . In image processing when doing it by
> segments, you need to sew the segments along their borders, practically in
> all algorithms. That makes parallelization far more complicated to be
> handled by such a blunt thing as parallel loop.

Conclusions drawn from one example
are fallacious here, as I am sure you
know well. Take some other algorithm, like
inverting all pixels, or anything written 
for parallel APL, or R. Anything whose
parts need not be run on a full machine
with I/O layers and active control.


> It won't, because a big number library will use the heap or a shared part
> of the stack which will require interlocking and thus will either be marked
> as "impure", so that the compiler will not try to parallelize, or else will
> make the compiler to use locks, which will effectively kill parallelism.

All the aliasing analysis explored at
SofCheck/AdaCore, and considerations
regarding purity, and the experience gained
when developing the SPARK language
based on Ada, should be a good basis for
describing aspects of subprograms
that are candidates for parallel execution?

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-19 20:45                           ` Georg Bauhaus
@ 2014-12-19 20:56                             ` Dmitry A. Kazakov
  0 siblings, 0 replies; 73+ messages in thread
From: Dmitry A. Kazakov @ 2014-12-19 20:56 UTC (permalink / raw)


On Fri, 19 Dec 2014 20:45:49 +0000 (UTC), Georg Bauhaus wrote:

> "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> wrote:
>> On Fri, 19 Dec 2014 09:42:52 -0700, Brad . In image processing when doing it by
>> segments, you need to sew the segments along their borders, practically in
>> all algorithms. That makes parallelization far more complicated to be
>> handled by such a blunt thing as parallel loop.
> 
> Conclusions drawn from one example
> are fallacious here,

Exactly. Doing something on image pixels independently is such a fallacious
example.

> as I am sure you
> know well. Take some other algorthm like
> Inverting all pixels, or anything written 
> for parallel APL, or R. Anything whose
> parts need not be run on a full machine
> with I/O layers and active control.

Parallel loop is useless for symmetric and vectorized processing.

>> It won't, because a big number library will use the heap or a shared part
>> of the stack which will require interlocking and thus will either be marked
>> as "impure", so that the compiler will not try to parallelize, or else will
>> make the compiler to use locks, which will effectively kill parallelism.
> 
> All the aliasing analysis explored at
> SofCheck/AdaCore and considerations
> regarding purity and experience gained
> when developing the Spark language
> based on Ada should be a good basis for
> describing aspects of subprograms
> that are candidates for parallel execution?

How does that change anything? You either access the heap from concurrent
threads or you don't. If you do, you have to lock. Copying portions of the
heap into local memory and running transactions on the heap is a non-starter.
That is.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-19  8:39                 ` Natasha Kerensikova
@ 2014-12-19 23:39                   ` Randy Brukardt
  0 siblings, 0 replies; 73+ messages in thread
From: Randy Brukardt @ 2014-12-19 23:39 UTC (permalink / raw)


"Natasha Kerensikova" <lithiumcat@instinctive.eu> wrote in message 
news:slrnm97p2q.nrc.lithiumcat@nat.rebma.instinctive.eu...
> Hello,
>
> On 2014-12-18, Randy Brukardt <randy@rrsoftware.com> wrote:
>> [...]                           That's true of *any* Ada object --  
>> without
>> documentation and/or code to manage it, it should be assumed to only work
>> with one task at a time.
>
> That's a bit at odds with a recent (or not so recent) discussion, where
> I didn't find in the RM where it is written that concurrent read-only
> basic operations (like access dereference, array indexing, etc) are
> task-safe, and the answer was that everything should work as expected
> unless explicitly noted otherwise, even when concurrency is involved.

No it's not. You are talking about "operations" and I'm talking about 
"objects". Very different things! Now, any ADT is going to have both objects 
and operations, so it's rather hard to do a concurrent operation without 
using concurrent objects.

> For example, I wrote Constant_Indefinite_Ordered_Maps (available at
> https://github.com/faelys/natools/blob/trunk/src/natools-constant_indefinite_ordered_maps.ads )
> which is an ordered map based on a binary search in a sorted array.
> It is meant to be task-safe without any code to manage it, basing its
> safety solely under the assumption that concurrent access dereference
> and concurrent array indexing are safe.

Sure, those OPERATIONS are safe. But not (necessarily) access to the object: 
that's only safe if the object is volatile or if the object is never 
written. See 9.10 and the associated A(3).

I personally do not believe in the "read-only" data structure (although 
others disagree). My experience is that such structures are usually 
read-mostly -- which means they still need some sort of protection. Some 
have claimed that protection by convention is enough -- which to me brings 
to mind what they say about the "rhythm" method of contraception ("people 
that use rhythm are called parents"). Someone will break the convention and 
then you'll get cars that will accelerate without commands and other lovely 
things.

> I hope it's not a brittle assumption...

It's safe if you can truly prove that the data structure is read-only and 
*never* gets written. I think *that* is brittle, but YMMV.

Example: the parameters in my spam filter rarely change. But rarely is not 
never, and unless I was willing to restart the filter for every parameter 
change, they still have to be protected.
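
The usual Ada answer is a protected object, which costs the readers almost 
nothing; a minimal sketch (the threshold parameter is invented, this is not 
my actual filter code):

     protected Filter_Parameters is
        function  Threshold return Float;          --  many concurrent readers
        procedure Set_Threshold (Value : Float);   --  the rare writer
     private
        Current : Float := 0.5;                    --  illustrative default
     end Filter_Parameters;

     protected body Filter_Parameters is
        function Threshold return Float is
        begin
           return Current;
        end Threshold;

        procedure Set_Threshold (Value : Float) is
        begin
           Current := Value;
        end Set_Threshold;
     end Filter_Parameters;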

                                        Randy.



>
> Natasha 


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-19 11:01                     ` Dmitry A. Kazakov
  2014-12-19 16:42                       ` Brad Moore
@ 2014-12-19 23:51                       ` Randy Brukardt
  1 sibling, 0 replies; 73+ messages in thread
From: Randy Brukardt @ 2014-12-19 23:51 UTC (permalink / raw)


"Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> wrote in message 
news:yjrnhk0w8gjd.k6ht3uh7raiw.dlg@40tude.net...
> On Fri, 19 Dec 2014 10:40:03 +0000 (UTC), Georg Bauhaus wrote:
>
>> <vincent.diemunsch@gmail.com> wrote:
>>
>>> It would be interesting to do a little survey on existing code using 
>>> tasking.
>>> I have the impression that only tasks at Library level do rendez-vous 
>>> and
>>> protected object synchronisation, and local tasks, most of the time, are
>>> limited to a rendez-vous with their parent task at the beginning or at
>>> the end. So maybe we should put restrictions on local tasks, so that we
>>> can map them to jobs.
>>
>> Won't the parallel loop feature be providing
>> for this kind of mini job:
>
> Parallel loop is useless for practical purposes. It wonders me why people
> wasting time with this.

Because it's about medium-grained parallelism rather than the impractical 
fine-grained parallelism or the heavyweight nature of Ada tasks.

> They could start with logical operations instead:
>
>    X and Y
>
> is already parallel by default. AFAIK nothing in RM forbids concurrent
> evaluation of X and Y if they are independent.

Not really true in practice -- it's rare that a compiler can prove 
independence of anything but the simplest entities (which are too cheap to 
benefit from optimization). Common-subexpression elimination is hard and 
rarely does anything.

Additionally, 1.1.4(18) is taken to mean that "arbitrary order" means any 
possible sequential order, but NOT a parallel order. Thus parallel execution 
is only allowed in cases that (almost) never happen in practice (large, 
independent expressions).

One of the reasons for having parallel blocks and loops is to signal to the 
compiler and reader that parallel execution IS allowed, even though it might 
not be an allowed sequential order.

> Same with Ada arithmetic.
> E.g.
>
>   A + B + C + D
>
> So far no compiler evaluates arguments concurrently or vectorizes
> sub-expressions like:
>
>   A
>   B  +
>   C      +
>   D  +
>
> Because if they did the result would work slower than sequential code. It
> simply does not worth the efforts with existing machine architectures.

I agree with you vis-a-vis fine-grained parallelism. The overhead would kill 
you.

But the reason for parallel loops and blocks is to make expensive subprogram 
calls (or other decent-sized chunks of code) in parallel without writing a 
massive amount of overhead. That is not allowed in Ada today because of 
issues with exception handling and rules like 1.1.4(18).

While
     for i in parallel 1 .. 1000 loop
         A := A + I;
     end loop;
would be madness to execute in parallel, the more realistic
     for I in parallel 1 .. 1000 loop
         A := A + Expensive_Function(I);
    end loop;
probably makes sense. It's the latter that we're interested in.
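
For comparison, here is roughly what the hand-written version of the latter 
looks like in today's Ada (my sketch: four workers, fixed chunking, no 
exception handling, with A and Expensive_Function as above). This is 
exactly the boilerplate the parallel loop is meant to generate for you:

     declare
        Partial : array (1 .. 4) of Integer := (others => 0);
     begin
        declare
           task type Worker (Id, From, To : Integer);
           task body Worker is
           begin
              for I in From .. To loop
                 Partial (Id) := Partial (Id) + Expensive_Function (I);
              end loop;
           end Worker;
           W1 : Worker (1,   1,  250);
           W2 : Worker (2, 251,  500);
           W3 : Worker (3, 501,  750);
           W4 : Worker (4, 751, 1000);
        begin
           null;   --  nothing for the parent to do; it waits at "end" for the workers
        end;
        for P of Partial loop
           A := A + P;   --  combine the partial sums
        end loop;
     end;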

                                               Randy.


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-19 17:28                         ` Dmitry A. Kazakov
                                             ` (2 preceding siblings ...)
  2014-12-19 20:45                           ` Georg Bauhaus
@ 2014-12-19 23:55                           ` Randy Brukardt
  3 siblings, 0 replies; 73+ messages in thread
From: Randy Brukardt @ 2014-12-19 23:55 UTC (permalink / raw)


"Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> wrote in message 
news:1r2ziulc78imb$.ad6zx5upic6s$.dlg@40tude.net...
...
> and of course the question of exceptions raised within concurrent paths.

That's one of the details that the rest of the ARG is waiting to find out. 
:-) This sort of language design is hard, and it will be a long time before 
all of the details are worked out. And maybe it won't work once that is 
done. But how are we going to find that out if the effort isn't made??

                                  Randy.



^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-19 11:56                 ` Björn Lundin
@ 2014-12-20  0:02                   ` Randy Brukardt
  0 siblings, 0 replies; 73+ messages in thread
From: Randy Brukardt @ 2014-12-20  0:02 UTC (permalink / raw)


"Björn Lundin" <b.f.lundin@gmail.com> wrote in message 
news:m713pl$l7a$1@dont-email.me...
> On 2014-12-19 00:01, Randy Brukardt wrote:
...
>> Right. And even on the systems where it worked, you're dependent that 
>> your
>> sockets library doesn't do any buffering (NC_Sockets can), doesn't have 
>> any
>> local data other than a sockets handle, and the like. A updated sockets
>> library might change that behavior.
>
> correct, but homebrew (yuck) also means full control. No buffering.

True enough. But I would expect approximately 1% of programmers would be 
using a homebrew sockets library, as that's a lot of work to maintain. If 
you're using some library from your compiler vendor or some other source, 
than change is always a potential problem. (Can be mitigated by staying with 
ancient versions of software, but that doesn't work in the long haul.)

                                          Randy.



^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-19 13:01                 ` GNAT and Tasklets vincent.diemunsch
  2014-12-19 17:46                   ` GNAT and Tasklets Brad Moore
@ 2014-12-20  0:39                   ` GNAT and Tasklets Peter Chapin
  2014-12-20  9:03                     ` Dmitry A. Kazakov
  2014-12-20  0:58                   ` GNAT and Tasklets Randy Brukardt
  2 siblings, 1 reply; 73+ messages in thread
From: Peter Chapin @ 2014-12-20  0:39 UTC (permalink / raw)


On Fri, 19 Dec 2014, vincent.diemunsch@gmail.com wrote:

> Ada have created the task abstraction to deal with parallelism...

I haven't tried any parallel programming with Ada but I have used both 
OpenMP and "direct" pthreads calls in parallel C programs. Creating and 
managing threads manually is a major pain compared to what OpenMP does. 
Sure it's possible to use threads directly but it clutters the program's 
logic and it's tricky to get it both right and efficient at the same time.

Ada's tasking features are a lot nicer than pthreads, but I agree with the 
point others have made that they are the wrong tool for writing parallel 
code... at least writing parallel code without extreme pain. Tasks are 
good for large, relatively independent chunks of logic that need to 
execute concurrently such as different control loops in an embedded 
system. However, when writing parallel code that will execute over highly 
regular data structures such as large matrices, explicit tasks are just a 
distraction. In such applications the existence of tasks is an 
implementation detail, not a design element.

Trying to use explicit tasks for parallel programming is a kind of 
abstraction inversion: using a high level construct to implement some low 
level functionality. Sure it is possible but the result is ugly and 
inefficient.

Peter


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-19 13:01                 ` GNAT and Tasklets vincent.diemunsch
  2014-12-19 17:46                   ` GNAT and Tasklets Brad Moore
  2014-12-20  0:39                   ` GNAT and Tasklets Peter Chapin
@ 2014-12-20  0:58                   ` Randy Brukardt
  2 siblings, 0 replies; 73+ messages in thread
From: Randy Brukardt @ 2014-12-20  0:58 UTC (permalink / raw)


<vincent.diemunsch@gmail.com> wrote in message 
news:2d4fc21f-5739-4100-9551-959b6822c761@googlegroups.com...
>Hello Randy,
>
>Thank you for your response. I find it a little bit confusing in fact.
>
>> There is no such thing as a "lightweight Ada task". The syntax is heavy, 
>> the
>> semantics is heavy (separate stack, exception handling, finalization,
>> termination), and as a correlary, the implementation is heavy.
>
>Have you heard of Ravenscar ? The idea is to restrict tasking to a simple 
>subset
>that is both predictable, therefore easy to set up right, and light to 
>implement.
>I am pretty sure, that local tasks could be restricted a lot and yet be 
>very useful.
>And tasks with very limited synchronization, without rendez-vous, are easy 
>to
>manage.

Brad already answered this, so I'll just add that the stuff that makes these 
tasks expensive (a separate stack, finalization, exception handling, 
termination) can't be eliminated by Ravenscar-like restrictions. You could 
make draconian restrictions on what you can do in them, of course, but you'd 
be throwing out almost all of Ada's good parts if you did. (Ada without ADTs 
or exceptions is just another programming language, and an overly wordy one 
at that).

Also, Ravenscar gets much of its simplicity by having tasks that never 
terminate. That's because task termination in Ada is very complex, and 
pretty much the only way to simplify it is to ensure it doesn't happen. 
That's not going to work for your purposes (local tasks better stop or the 
containing subprogram can never return).

>> >Finaly, I really hope that the new version of the langage will keep Ada
>> >simple ...
>> That ship sailed with Ada 95, if it ever was true at all.
>
>So what ? We will continue to make it more and more complex ? Isn't it 
>possible
>to stop that trend and try to do better ?

No, it's not possible. Adding anything at all to the language makes it more 
complex, and compatibility concerns mean that almost nothing can be deleted. 
The only way to make a simpler language would be to start over, and that 
wouldn't be Ada anymore [doing that killed Algol 60, for instance].

The alternative way to keep the complexity the same is simply to give up and 
not change the language at all. But then it soon will end up in the museum 
of disused languages.

...
>> You are still thinking way too low-level. Creating a parallel program 
>> should
>> be as easy as creating a sequential one. There should (almost) be no 
>> special
>> constructs at all. Ideally, most programs would be made up of parallel 
>> loops
>> and blocks, and there would be hardly any tasks (mainly to be used for
>> separate conceptual things).

>That's funny ! Really are you serious ? Ada have created the task 
>abstraction
>to deal with parallelism, and for me it is an abstraction, that can be 
>implemented
>by different ways, depending on the compiler, the OS but also the way it is 
>used
>inside the program. First you are telling me that it is too difficult for a 
>compiler to
>implement this abstraction in an other way than a heavy OS thread, because 
>it
>would be to complex to automatically find simple cases allowing reduce 
>tasking,
>and because a runtime library using mixed tasking would be to difficult to 
>write.

There are no practical "simple" tasking cases. Every task has to terminate, 
deal with exceptions, deal with finalization, and the like. The only tasks 
that could eliminate those things would contain no calls, no uses of ADTs 
(meaning no containers), no heap use. And such tasks are too simple to be 
worth parallelizing anyway (see Dmitry's message and my reply to it).

Janus/Ada has some code-generation cases for cutting the costs of subprogram 
calls for very simple subprograms. And when I tested coverage of the code 
generator on our in-house collection of Ada test programs, it turned up as 
having never been tested. There was not a single usage of the "very simple" 
subprogram anywhere. It was just a bunch of work for no gain. That almost 
always happens with "simple" anythings.

>And strongly disagree to both arguments :
>1. the Ravenscar profile is a counter example, and using aspects or pragma 
>should
>allow us to make useful restrictions on local tasks
>2. the Paraffin library shows clearly that we can have mixed tasking, for 
>it can be
>used in parallel to tasks.
>And after, that you pretend seriously that a compiler should be able to 
>find by its
>own parallelism in the program ! It seems to me a major contradiction !

Surely, because you are inventing things. I'm not in favor of the compiler 
automatically doing anything, in large part because it violates the rules of 
Ada as they stand. I'm in favor of Brad's parallel loops and blocks, because 
they *declare* to the compiler that code should be executed in parallel if 
that makes sense. Plus there is static checking that the code is 
parallel-safe (or a suppression of such checking, so the compiler doesn't 
have to care whether it works).

The result is easy to read, easy to write, and relatively easy to compile. 
"Simple" in your description.

Writing tasks for short parallelization is just too heavyweight, especially 
as most people will not understand what is and is not allowed for 
communication. (And there are very few truly independent tasks.)

As an example, Ada 2005 used procedure parameters to do iteration. It works, 
and it didn't add any new features or complexity -- and people hated it 
because it required inverting the way one thinks about a loop. So Ada 2012 
added user-defined iterator syntax. The code does essentially the same 
thing, but it's much more readable (almost too readable, since now people 
are complaining about the overhead).
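
For anyone who hasn't seen the two styles side by side, a small sketch 
(mine) using the standard vector container:

     with Ada.Containers.Vectors;
     with Ada.Text_IO;
     procedure Iterate_Demo is
        package Int_Vectors is new Ada.Containers.Vectors (Positive, Integer);
        use Int_Vectors;
        V : Vector := To_Vector (42, Length => 3);

        procedure Process (Position : Cursor) is
        begin
           Ada.Text_IO.Put_Line (Integer'Image (Element (Position)));
        end Process;
     begin
        V.Iterate (Process'Access);   --  Ada 2005: the loop body becomes a procedure
        for E of V loop               --  Ada 2012: reads like an ordinary loop
           Ada.Text_IO.Put_Line (Integer'Image (E));
        end loop;
     end Iterate_Demo;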

We don't want to repeat that mistake; tasks are just too heavyweight to 
describe loop contents, just as subprograms are too heavyweight for the same 
purpose.

>No, creating a parallel program is far more complex than creating a 
>sequential
>one, and until we have a compiler so smart to do this, I prefer to rely on
>explicit tasking.

But that's the whole point: we HAVE to have a compiler smart enough to at 
least check this, else Ada will be passed by other languages that provide 
the same thing. Indeed, Ada's entire future is in having the compiler check 
a lot more things (both sequential and parallel). The proposed parallel loop 
construct will only allow code that is task-safe (we intend to allow a 
programmer to suppress the check, but it will default to safe).

There's no point in any half-measures; certainly not at the language level.

...
>> Also please note that language enhancements occur through a process of
>> consensus. Most of the ARG has to agree on a direction before it gets 
>> into
>> the language standard.
>
>That is why I am taking time to discuss on this forum.
>
>> You should have noted by now that pretty much everyone who has answered
>> here has disagreed with your position.
>
>No I haven't noticed.

Which is why I wrote this part of my reply. Please notice!

> I had the feeling that at least Dmitry and Brad were
> receptive to some of my points.

Every squirrel finds some nuts. :-)

> But even if it was the case, that doesn't mean that
> I went wrong :-) Montesquieu said "The less you think, the more people 
> agree with you" :-).

And this IS the problem. You don't seem to understand that many of us have 
been working on this problem for many years, have already seen many failed 
solutions (your ideas are very similar to the passive tasks of Ada 83; Ada 
95 decided not to go that way for good reason), and have already been 
working on achieving consensus on a direction for the future. And you seem 
to be ignoring the reasons why we're going that way in favor of an unusual 
understanding of the purposes of tasks.

Anyway, this is pretty much my last attempt to discuss this. I realize that 
I've spent more than 2 hours writing messages here the last couple of days 
and I really ought to accomplish some work rather than throwing away more 
time.

                                      Randy.


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-19 20:37                             ` Dmitry A. Kazakov
@ 2014-12-20  1:05                               ` Randy Brukardt
  2014-12-20 17:36                                 ` Brad Moore
  0 siblings, 1 reply; 73+ messages in thread
From: Randy Brukardt @ 2014-12-20  1:05 UTC (permalink / raw)


"Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> wrote in message 
news:1gfkkgi7ukoj3$.1pqtchynzp9rc$.dlg@40tude.net...
> On Fri, 19 Dec 2014 11:35:05 -0700, Brad Moore wrote:
...
>> I haven't listed all the semantics but for the questions you ask,
>> each arm of the parallel block is a separate thread of execution (which
>> we have been calling a tasklet).
>>
>> Each tasklet starts off with its own local declaration of Total,
>> initialized to 0, which is the Identity value for the reduction.
>>
>> So, for the top Total, you end up with {Top_}Total := 1 + A + B;
>> for the bottom Total, you end up with {Bottom_}Total := C + D;
>>
>> Then during the reduction phase, those two results get reduced using the
>> reduction operation, which in this case is "+".
>>
>> So the end result is Total = 1 + A + B + C + D;
>
> I think that the block should have explicit parameters, e.g. Total must be
> an in-out parameter of the block. The syntax should be similar to the
> selective accept. Each arm must also have parameters, and only those and 
> of
> the block must be visible within an arm. E.g. A, B must be parameters.
> Nothing else should be visible.

Interesting. This does sound like a better approach to me. (The whole 
reduction object idea seems to me to be the worst part of the parallel 
proposals -- something needs to be available, but that doesn't seem to be 
the way to do it.)

OTOH, the syntax to specify such parameters doesn't seem natural. We surely 
don't want to force a parallel block or loop to be the only contents of a 
subprogram.

More thought required.

                                             Randy.


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-20  0:39                   ` GNAT and Tasklets Peter Chapin
@ 2014-12-20  9:03                     ` Dmitry A. Kazakov
  0 siblings, 0 replies; 73+ messages in thread
From: Dmitry A. Kazakov @ 2014-12-20  9:03 UTC (permalink / raw)


On Fri, 19 Dec 2014 19:39:22 -0500, Peter Chapin wrote:

> Tasks are 
> good for large, relatively independent chunks of logic that need to 
> execute concurrently such as different control loops in an embedded 
> system.

Not only embedded, it is any system dealing with asynchronous events, like
when doing I/O.

> However, when writing parallel code that will execute over highly 
> regular data structures such as large matrices, explicit tasks are just a 
> distraction. In such applications the existence of tasks is an 
> implementation detail, not a design element.

Yes. The question is whether the programming paradigm built around a chain
of imperative instructions is good for such code altogether. I am not
arguing for graphical or functional languages. But I think that, possibly,
we should look after decomposition in terms of special objects instead,
like we did with protected objects in Ada 95.

> Trying to use explicit tasks for parallel programming is a kind of 
> abstraction inversion: using a high level construct to implement some low 
> level functionality. Sure it is possible but the result is ugly and 
> inefficient.

It is communication which makes that complex. The body of a tasklet or of a
parallel loop etc. is usually not the problem, which is why I think that
having it in an imperative manner does not solve it.

I see it as lots of very small chains of instructions (bodies) which need
to be connected in some kind of mesh. The bodies themselves are of little
interest in this kind of problem. The connections are what should be
addressed.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-19 18:35                           ` Brad Moore
  2014-12-19 20:37                             ` Dmitry A. Kazakov
@ 2014-12-20 16:49                             ` Dennis Lee Bieber
  2014-12-20 17:58                               ` Brad Moore
  1 sibling, 1 reply; 73+ messages in thread
From: Dennis Lee Bieber @ 2014-12-20 16:49 UTC (permalink / raw)


On Fri, 19 Dec 2014 11:35:05 -0700, Brad Moore <brad.moore@shaw.ca>
declaimed the following:

>
>I haven't listed all the semantics but for the questions you ask,
>each arm of the parallel block is a separate thread of execution (which 
>we have been calling a tasklet).
>
>Each tasklet starts off with its own local declaration of Total, 
>initialized to 0, which is the Identity value for the reduction.
>
>So, for the top Total, you end up with {Top_}Total := 1 + A + B;

	There is no way I would ever interpret /that/ result... I'd actually
expect an optimizing compiler to see that the result of incrementing Total
is thrown away by the next statement in the block.

>for the bottom Total, you end up with {Bottom_}Total := C + D;
>
>Then during the reduction phase, those two results get reduced using the 
>reduction operation, which in this case is "+".
>
>So the end result is Total = 1 + A + B + C + D;
>

	So your reduction phase becomes the equivalent of a summing
operation...

	par_sum(	1,
				A + B,
				C + D	)	{of course, to do this right requires deferred
								evaluation of the parameters}

What if you want the result to be a product?
-- 
	Wulfraed                 Dennis Lee Bieber         AF6VN
    wlfraed@ix.netcom.com    HTTP://wlfraed.home.netcom.com/


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-20  1:05                               ` Randy Brukardt
@ 2014-12-20 17:36                                 ` Brad Moore
  2014-12-21 18:23                                   ` Brad Moore
  2014-12-22 23:06                                   ` Randy Brukardt
  0 siblings, 2 replies; 73+ messages in thread
From: Brad Moore @ 2014-12-20 17:36 UTC (permalink / raw)


On 14-12-19 06:05 PM, Randy Brukardt wrote:
> "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> wrote in message
> news:1gfkkgi7ukoj3$.1pqtchynzp9rc$.dlg@40tude.net...
>> On Fri, 19 Dec 2014 11:35:05 -0700, Brad Moore wrote:
> ...
>>> I haven't listed all the semantics but for the questions you ask,
>>> each arm of the parallel block is a separate thread of execution (which
>>> we have been calling a tasklet).
>>>
>>> Each tasklet starts off with its own local declaration of Total,
>>> initialized to 0, which is the Identity value for the reduction.
>>>
>>> So, for the top Total, you end up with {Top_}Total := 1 + A + B;
>>> for the bottom Total, you end up with {Bottom_}Total := C + D;
>>>
>>> Then during the reduction phase, those two results get reduced using the
>>> reduction operation, which in this case is "+".
>>>
>>> So the end result is Total = 1 + A + B + C + D;
>>
>> I think that the block should have explicit parameters, e.g. Total must be
>> an in-out parameter of the block. The syntax should be similar to the
>> selective accept. Each arm must also have parameters, and only those and
>> of
>> the block must be visible within an arm. E.g. A, B must be parameters.
>> Nothing else should be visible.
>
> Interesting. This does sound like a better approach to me. (The whole
> reduction object idea seems to me to be the worst part of the parallel
> proposals -- something needs to be available, but that doesn't seem to be
> the way to do it.)
>
> OTOH, the syntax to specify such parameters doesn't seem natural. We surely
> don't want to force a parallel block or loop to be the only contents of a
> subprogram.
>
> More thought required.

I had considered the idea of parallel block parameters as well, as the 
underlying idea has appeal, but had dismissed the idea in my mind for 
similar reasons. Having a parameter list in the middle of a section of 
code looks plain weird to me. It looks like a callable entity, but the 
call is never made. The call implicitly occurs when the parallel block 
is encountered in the thread of execution, and the parameters are 
implicitly passed from other objects having the same name. I would think 
that would be quite foreign to existing Ada programmers, and might be 
considered somewhat inelegant.

It seemed better to treat this as a simple control structure more like 
an if statement, and leave the parameter passing and exception contracts 
to the enclosing subprogram.

As for reduction. We haven't yet discussed reduction for parallel 
blocks. It's not clear that we even need that, because in a parallel 
block, it seems easy enough to declare separate variables, and then 
write the reduction yourself.

eg.

     function Fibonacci (N : Integer) return Integer
     is
        Left, Right : Integer := 0;
     begin
        if N <= 2 then
           return N;
        end if;

        parallel
            Left := Fibonacci (N - 1);
        and
            Right := Fibonacci (N - 2);
        end parallel;

        return Left + Right;  -- No automatic reduction needed
     end Fibonacci;

Brad

>
>                                               Randy.
>
>

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-20 16:49                             ` Dennis Lee Bieber
@ 2014-12-20 17:58                               ` Brad Moore
  0 siblings, 0 replies; 73+ messages in thread
From: Brad Moore @ 2014-12-20 17:58 UTC (permalink / raw)


On 14-12-20 09:49 AM, Dennis Lee Bieber wrote:
> On Fri, 19 Dec 2014 11:35:05 -0700, Brad Moore <brad.moore@shaw.ca>
> declaimed the following:
>
>>
>> I haven't listed all the semantics but for the questions you ask,
>> each arm of the parallel block is a separate thread of execution (which
>> we have been calling a tasklet).
>>
>> Each tasklet starts off with its own local declaration of Total,
>> initialized to 0, which is the Identity value for the reduction.
>>
>> So, for the top Total, you end up with {Top_}Total := 1 + A + B;
>
> 	There is no way I would ever interpret /that/ result... I'd actually
> expect an optimizing compiler to see that the result of incrementing Total
> is thrown away by the next statement in the block.

You are absolutely right. I had missed that Total wasn't being added 
back in on the second statement. I don't know if Dmitry had intended 
that in his original question, or had missed that as well.

In any case, hopefully the question has been answered. The useless 
increment of Total in that case has no effect on the result, and you end 
up with

Total = A + B + C + D

Sorry if this was confusing people.

Each branch of the parallel block is a separate thread of execution. 
Within each branch the sequence of statements is executed sequentially.
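
Spelled out with explicit per-branch copies, the semantics amount to 
something like this (a sketch of the intended behaviour only; Top_Total 
and Bottom_Total are just illustrative names, not proposed syntax):

    --  Branch 1: its own local copy of Total, initialized to the
    --  identity value 0.
    Top_Total    := Top_Total + 1;   --  overwritten by the next statement
    Top_Total    := A + B;

    --  Branch 2: likewise starts from the identity value 0.
    Bottom_Total := C + D;

    --  Reduction phase: the branch results are combined with "+".
    Total := Top_Total + Bottom_Total;   --  = A + B + C + D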

>
>> for the bottom Total, you end up with {Bottom_}Total := C + D;
>>
>> Then during the reduction phase, those two results get reduced using the
>> reduction operation, which in this case is "+".
>>
>> So the end result is Total = 1 + A + B + C + D;
>>
>
> 	So your reduction phase becomes the equivalent of a summing
> operation...
>
> 	par_sum(	1,
> 				A + B,
> 				C + D	)	{of course, to do this right requires deferred
> 								evaluation of the parameters}

As mentioned above, the reduction only involves summing the results from 
each branch, and 1 is not part of the first result.

so
      par_sum(A + B, C + D)

>
> What if you want the result to be a product?
>

That's why reductions are tricky. It depends on the operation. For 
integer addition and subtraction, the reducing operation is "+", and the 
identity value is 0.

For products and divisions, the reducing operation is "*", and the 
identity value is 1.

The identity value is the value that, when combined with any other value 
using the reducing operation, leaves that other value unchanged.

For elementary types such as integer and float, the compiler should be 
able to choose a default reducer and identity, but for user-defined 
types, the compiler is unlikely to be able to determine a suitable 
reducer function or identity value, so the programmer needs to have a 
way to specify this.
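
For example, for a user-defined type the reducer and identity might look 
like this (purely illustrative; how the programmer would actually name 
them to the compiler was still an open part of the proposal):

    type Vec is record
       X, Y : Float;
    end record;

    Zero : constant Vec := (X => 0.0, Y => 0.0);
    --  Identity: combining Zero with any V using Add leaves V unchanged.

    function Add (L, R : Vec) return Vec is
      ((X => L.X + R.X, Y => L.Y + R.Y));
    --  Add is the reducing operation; the compiler cannot infer it for
    --  Vec, so the programmer needs some way to specify it.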

As I mentioned already, for parallel blocks it is questionable whether 
you even need to have automatic reduction support. The complexity of the 
feature may not be worth adding, though it does seem to be needed for 
parallel loops.


Brad


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-20 17:36                                 ` Brad Moore
@ 2014-12-21 18:23                                   ` Brad Moore
  2014-12-21 19:21                                     ` Shark8
  2014-12-21 21:35                                     ` tmoran
  2014-12-22 23:06                                   ` Randy Brukardt
  1 sibling, 2 replies; 73+ messages in thread
From: Brad Moore @ 2014-12-21 18:23 UTC (permalink / raw)


On 2014-12-20 10:36 AM, Brad Moore wrote:
> On 14-12-19 06:05 PM, Randy Brukardt wrote:
>> "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> wrote in message
>>>
>>> I think that the block should have explicit parameters, e.g. Total
>>> must be
>>> an in-out parameter of the block. The syntax should be similar to the
>>> selective accept. Each arm must also have parameters, and only those and
>>> of
>>> the block must be visible within an arm. E.g. A, B must be parameters.
>>> Nothing else should be visible.
>>
>> Interesting. This does sound like a better approach to me. (The whole
>> reduction object idea seems to me to be the worst part of the parallel
>> proposals -- something needs to be available, but that doesn't seem to be
>> the way to do it.)
>>
>> OTOH, the syntax to specify such parameters doesn't seem natural. We
>> surely
>> don't want to force a parallel block or loop to be the only contents of a
>> subprogram.
>>
>> More thought required.
>
> I had considered the idea of parallel block parameters as well, as the
> underlying idea has appeal, but had dismissed the idea in my mind due to
> similar reasons. Having a parameter list in the middle of a section of
> code looks plain weird to me. It looks like a callable entity, but the
> call is never made. The call implicitly occurs when the parallel block
> is encountered in the thread of execution, and the parameters are
> implicitly passed from other objects having the same name. I would think
> that would be quite foreign to existing Ada programmers, and might be
> considered somewhat inelegant.
>
> It seemed better to treat this as a simple control structure more like
> an if statement, and leave the parameter passing and exception contracts
> to the enclosing subprogram.
>

Actually, the proposed parallel block syntax fits nicely into the 
language when you think of it as a control statement, like an if statement.

An if statement with an else can be viewed as a logical "or" of two or 
more sequences of statements.

eg.

     if X then
       Do_This;
     else
       Do_That;
     end if;

Here the effect is:
    Do_This *or* Do_That

A parallel block statement can be viewed as a logical "and" of two or 
more sequences of statements.

    parallel
       Do_This;
    and
       Do_That;
    end parallel;

Here the effect is:
     Do_This *and* Do_That

So another argument against having parameters on a parallel block is 
that it would be inharmonious to allow them there if they cannot also 
appear on an if statement.

I have never seen a programming language that has parameters on an if 
statement. That doesn't mean they don't exist. I'd be curious if anyone 
is aware of any such language. Adding parameters to if statements in Ada 
might be a real challenge.

Brad


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-21 18:23                                   ` Brad Moore
@ 2014-12-21 19:21                                     ` Shark8
  2014-12-21 19:45                                       ` Brad Moore
  2014-12-21 21:35                                     ` tmoran
  1 sibling, 1 reply; 73+ messages in thread
From: Shark8 @ 2014-12-21 19:21 UTC (permalink / raw)


On 21-Dec-14 11:23, Brad Moore wrote:
> I have never seen a programming language that has parameters on an if
> statement.

But all if-statements have a parameter: the conditional they test.

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-21 19:21                                     ` Shark8
@ 2014-12-21 19:45                                       ` Brad Moore
  2014-12-21 23:21                                         ` Shark8
  0 siblings, 1 reply; 73+ messages in thread
From: Brad Moore @ 2014-12-21 19:45 UTC (permalink / raw)


On 2014-12-21 12:21 PM, Shark8 wrote:
> On 21-Dec-14 11:23, Brad Moore wrote:
>> I have never seen a programming language that has parameters on an if
>> statement.
>
> But all if-statements have a parameter: the conditional they test.

Not the sort of parameters we have been discussing though. Parameters 
that identify all the external variables that are read and/or modified 
inside the if statement.


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-21 18:23                                   ` Brad Moore
  2014-12-21 19:21                                     ` Shark8
@ 2014-12-21 21:35                                     ` tmoran
  2014-12-21 22:50                                       ` Brad Moore
  1 sibling, 1 reply; 73+ messages in thread
From: tmoran @ 2014-12-21 21:35 UTC (permalink / raw)


Is it the intent that parallel blocks would be used, for instance,
for Quicksort?

    parallel
       Sort(A(A'first .. Pivot-1));
    and
       Sort(A(Pivot+1 .. A'last));
    end parallel;


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-21 21:35                                     ` tmoran
@ 2014-12-21 22:50                                       ` Brad Moore
  2014-12-21 23:34                                         ` Shark8
  0 siblings, 1 reply; 73+ messages in thread
From: Brad Moore @ 2014-12-21 22:50 UTC (permalink / raw)


On 2014-12-21 2:35 PM, tmoran@acm.org wrote:
> Is it the intent that parallel blocks would be used, for instance,
> for Quicksort?
>
>      parallel
>         Sort(A(A'first .. Pivot-1));
>      and
>         Sort(A(Pivot+1 .. A'last));
>      end parallel;
>

Yes, exactly.

A full version (adapted from the parallel blocks Paraffin example) might be

procedure Quicksort
   (Container : in out Array_Type)
is
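    --  Assumed context, not shown in the post: Array_Type, Index_Type,
    --  and Element_Type are generic formal parameters of the enclosing
    --  unit, and Parallel.Local comes from the Paraffin libraries.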

    procedure Swap (L, R : Index_Type);
    pragma Inline (Swap);

    procedure Parallel_Quicksort
      (Left, Right : Index_Type)
    is

       I : Index_Type'Base := Left;
       J : Index_Type'Base := Right;

       Pivot : constant Element_Type
         := Container
           (Index_Type'Val
                (Index_Type'Pos (Container'First) +
                 ((Index_Type'Pos (Left) +
                      Index_Type'Pos (Right)) / 2) - 1));

       use type Parallel.Local.CPU_Count;

    begin -- Parallel_Quicksort

       while I <= J loop

          while Container (I) < Pivot loop
             I := Index_Type'Succ (I);
          end loop;

          while Pivot < Container (J) loop
             J := Index_Type'Pred (J);
          end loop;

          if I <= J then

             Swap (I, J);

             I := Index_Type'Succ (I);
             J := Index_Type'Pred (J);
          end if;

       end loop;

       parallel
          if Left < J then
             Parallel_Quicksort (Left, J);
          end if;
       and
          if I < Right then
             Parallel_Quicksort (I, Right);
          end if;
       end parallel;

    end Parallel_Quicksort;

    procedure Swap (L, R : Index_Type) is
       Temp : constant Element_Type := Container (L);
    begin
       Container (L) := Container (R);
       Container (R) := Temp;
    end Swap;

begin
    Parallel_Quicksort (Container'First, Container'Last);
end Quicksort;

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-21 19:45                                       ` Brad Moore
@ 2014-12-21 23:21                                         ` Shark8
  2014-12-22 16:53                                           ` Brad Moore
  0 siblings, 1 reply; 73+ messages in thread
From: Shark8 @ 2014-12-21 23:21 UTC (permalink / raw)


On 21-Dec-14 12:45, Brad Moore wrote:
>>> I have never seen a programming language that has parameters on an if
>>> statement.
>>
>> But all if-statements have a parameter: the conditional they test.
>
> Not the sort of parameters we have been discussing though. Parameters
> that identify all the external variables that are read and/or modified
> inside the if statement.

True.
Maybe we could overload WITH or AT, or perhaps have an optional area 
[like declare] to be explicit about them?

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-21 22:50                                       ` Brad Moore
@ 2014-12-21 23:34                                         ` Shark8
  2014-12-22 16:55                                           ` Brad Moore
  0 siblings, 1 reply; 73+ messages in thread
From: Shark8 @ 2014-12-21 23:34 UTC (permalink / raw)


On 21-Dec-14 15:50, Brad Moore wrote:
> On 2014-12-21 2:35 PM, tmoran@acm.org wrote:
>> Is it the intent that parallel blocks would be used, for instance,
>> for Quicksort?
>
> Yes, exactly.

Aww.. I was hoping for a parallel ShellSort.
;)

^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-21 23:21                                         ` Shark8
@ 2014-12-22 16:53                                           ` Brad Moore
  0 siblings, 0 replies; 73+ messages in thread
From: Brad Moore @ 2014-12-22 16:53 UTC (permalink / raw)


On 14-12-21 04:21 PM, Shark8 wrote:
> On 21-Dec-14 12:45, Brad Moore wrote:
>>>> I have never seen a programming language that has parameters on an if
>>>> statement.
>>>
>>> But all if-statements have a parameter: the conditional they test.
>>
>> Not the sort of parameters we have been discussing though. Parameters
>> that identify all the external variables that are read and/or modified
>> inside the if statement.
>
> True.
> Maybe we could overload WITH or AT, or perhaps have an optional area
> [like declare] to be explicit about them?

Note also, if we start allowing this to be specified for if statements, 
you'd also want to allow for other constructs such as while loops, 
normal blocks, etc, as well as subprograms.

What we were proposing is to allow this only for subprograms. 
Subprograms already identify parameters in this way, and in addition we 
were proposing a Global aspect that can be applied to a subprogram to 
identify its side effects on external variables.

Subprograms already allow aspects, so this fits in well with existing 
syntax, whereas statements such as if statements, loop statements, and 
block statements, do not currently allow aspects in the syntax, so 
adding aspect support for those is a bigger change.

I also think it would become too tedious, and would add too much clutter 
and noise to the code, if we started specifying these effects on every 
kind of statement. Allowing it only on subprograms feels to me like the 
happier middle ground.
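
For illustration, a Global aspect on a subprogram might look something 
like this (a sketch only, borrowing SPARK-style notation; the exact 
aspect syntax and granularity for the Ada proposal were still being 
discussed):

    Counter : Natural := 0;

    procedure Increment
      with Global => (In_Out => Counter);
    --  The aspect names the external variable the subprogram reads and
    --  updates, so a compiler or analyzer can reason about side effects
    --  when deciding whether calls may safely run in parallel.

    procedure Increment is
    begin
       Counter := Counter + 1;
    end Increment;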


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-21 23:34                                         ` Shark8
@ 2014-12-22 16:55                                           ` Brad Moore
  0 siblings, 0 replies; 73+ messages in thread
From: Brad Moore @ 2014-12-22 16:55 UTC (permalink / raw)


On 14-12-21 04:34 PM, Shark8 wrote:
> On 21-Dec-14 15:50, Brad Moore wrote:
>> On 2014-12-21 2:35 PM, tmoran@acm.org wrote:
>>> Is it the intent that parallel blocks would be used, for instance,
>>> for Quicksort?
>>
>> Yes, exactly.
>
> Aww.. I was hoping for a parallel ShellSort.
> ;)
>

I guess I was thinking of a more less-exactly form of exactly. I suspect 
you'd like to do other things as well. :-)


^ permalink raw reply	[flat|nested] 73+ messages in thread

* Re: GNAT and Tasklets
  2014-12-20 17:36                                 ` Brad Moore
  2014-12-21 18:23                                   ` Brad Moore
@ 2014-12-22 23:06                                   ` Randy Brukardt
  1 sibling, 0 replies; 73+ messages in thread
From: Randy Brukardt @ 2014-12-22 23:06 UTC (permalink / raw)


"Brad Moore" <brad.moore@shaw.ca> wrote in message 
news:Wsilw.1234991$1s.181390@fx05.iad...
> On 14-12-19 06:05 PM, Randy Brukardt wrote:
...
>>> I think that the block should have explicit parameters, e.g. Total must 
>>> be
>>> an in-out parameter of the block. The syntax should be similar to the
>>> selective accept. Each arm must also have parameters, and only those and
>>> of
>>> the block must be visible within an arm. E.g. A, B must be parameters.
>>> Nothing else should be visible.
>>
>> Interesting. This does sound like a better approach to me. (The whole
>> reduction object idea seems to me to be the worst part of the parallel
>> proposals -- something needs to be available, but that doesn't seem to be
>> the way to do it.)
>>
>> OTOH, the syntax to specify such parameters doesn't seem natural. We 
>> surely
>> don't want to force a parallel block or loop to be the only contents of a
>> subprogram.
...
> As for reduction. We haven't yet discussed reduction for parallel blocks. 
> It's not clear that we even need that, because in a parallel block, it 
> seems easy enough to declare separate variables, and then write the 
> reduction yourself.

I was more reacting to the method of specifying reduction for parallel 
statements (I hadn't realized that it was only allowed for loops; that seems 
inconsistent to me, even if it is mostly necessary for the loop construct).

The problem I have is that it is really hard to wrap one's mind around the 
idea of reduction objects and reduction operations, and especially reduction 
identity values. I'd like it to be more obvious from the code as written. 
(I.e., if Total is being added in the loop, then clearly it will 
be reduced that way.) But I'm quite willing to say I don't know how to do 
that (yet), thus "more thought needed". Clearly the reduction objects have 
to be identified to the compiler; it's unclear to me that more is needed. 
After all, how the reduction is done is clearly there in the sequential code, 
so can't the compiler extract that for parallel reduction?

                                              Randy.




^ permalink raw reply	[flat|nested] 73+ messages in thread

end of thread, other threads:[~2014-12-22 23:06 UTC | newest]

Thread overview: 73+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-12-10 16:31 GNAT and Tasklets vincent.diemunsch
2014-12-11 10:02 ` Jacob Sparre Andersen
2014-12-11 16:30   ` Anh Vo
2014-12-11 18:15     ` David Botton
2014-12-11 21:45     ` Egil H H
2014-12-11 23:09   ` Randy Brukardt
2014-12-12  2:28     ` Jacob Sparre Andersen
2014-12-12  8:46   ` vincent.diemunsch
2014-12-12 23:33     ` Georg Bauhaus
2014-12-13  2:06   ` Brad Moore
2014-12-13  6:50     ` Dirk Craeynest
2014-12-14  0:18 ` Hubert
2014-12-14 21:29   ` vincent.diemunsch
2014-12-16  5:09     ` Brad Moore
2014-12-17 13:24       ` vincent.diemunsch
2014-12-16  4:42 ` Brad Moore
2014-12-17 13:06   ` vincent.diemunsch
2014-12-17 20:31     ` Niklas Holsti
2014-12-17 22:08       ` Randy Brukardt
2014-12-17 22:52         ` Björn Lundin
2014-12-17 23:58           ` Randy Brukardt
2014-12-18 10:39             ` Björn Lundin
2014-12-18 23:01               ` Randy Brukardt
2014-12-19  8:39                 ` Natasha Kerensikova
2014-12-19 23:39                   ` Randy Brukardt
2014-12-19  8:59                 ` Dmitry A. Kazakov
2014-12-19 11:56                 ` Björn Lundin
2014-12-20  0:02                   ` Randy Brukardt
2014-12-18  8:42       ` Dmitry A. Kazakov
2014-12-18  8:56         ` vincent.diemunsch
2014-12-18  9:36           ` Dmitry A. Kazakov
2014-12-18 10:32             ` vincent.diemunsch
2014-12-18 11:19               ` Dmitry A. Kazakov
2014-12-18 12:09                 ` vincent.diemunsch
2014-12-18 13:07                   ` Dmitry A. Kazakov
2014-12-19 10:40                   ` Georg Bauhaus
2014-12-19 11:01                     ` Dmitry A. Kazakov
2014-12-19 16:42                       ` Brad Moore
2014-12-19 17:28                         ` Dmitry A. Kazakov
2014-12-19 18:35                           ` Brad Moore
2014-12-19 20:37                             ` Dmitry A. Kazakov
2014-12-20  1:05                               ` Randy Brukardt
2014-12-20 17:36                                 ` Brad Moore
2014-12-21 18:23                                   ` Brad Moore
2014-12-21 19:21                                     ` Shark8
2014-12-21 19:45                                       ` Brad Moore
2014-12-21 23:21                                         ` Shark8
2014-12-22 16:53                                           ` Brad Moore
2014-12-21 21:35                                     ` tmoran
2014-12-21 22:50                                       ` Brad Moore
2014-12-21 23:34                                         ` Shark8
2014-12-22 16:55                                           ` Brad Moore
2014-12-22 23:06                                   ` Randy Brukardt
2014-12-20 16:49                             ` Dennis Lee Bieber
2014-12-20 17:58                               ` Brad Moore
2014-12-19 19:43                           ` Peter Chapin
2014-12-19 20:45                           ` Georg Bauhaus
2014-12-19 20:56                             ` Dmitry A. Kazakov
2014-12-19 23:55                           ` Randy Brukardt
2014-12-19 23:51                       ` Randy Brukardt
2014-12-18 22:33               ` Randy Brukardt
2014-12-19 13:01                 ` GNAT and Tasklets vincent.diemunsch
2014-12-19 17:46                   ` GNAT and Tasklets Brad Moore
2014-12-20  0:39                   ` GNAT and Tasklets Peter Chapin
2014-12-20  9:03                     ` Dmitry A. Kazakov
2014-12-20  0:58                   ` GNAT and Tasklets Randy Brukardt
2014-12-18  9:34         ` GNAT and Tasklets Niklas Holsti
2014-12-18  9:50           ` Dmitry A. Kazakov
2014-12-17 21:08     ` Brad Moore
2014-12-18  8:47       ` vincent.diemunsch
2014-12-18 21:58         ` Randy Brukardt
2014-12-17 22:18     ` Randy Brukardt
2014-12-18  0:56     ` Shark8
