comp.lang.ada
* RFC: Prototype for a user threading library in Ada
@ 2016-06-17  9:44 Hadrien Grasland
  2016-06-17 16:18 ` Niklas Holsti
                   ` (4 more replies)
  0 siblings, 5 replies; 72+ messages in thread
From: Hadrien Grasland @ 2016-06-17  9:44 UTC (permalink / raw)


So, a while ago, after playing with the nice user-mode threading libraries that are available these days in C++, like Intel TBB and HPX, I thought it would be nice if Ada had something similar.

Looking around for existing work, I found a number of projects with custom, relatively specialized mechanisms, and a very nice library called Paraffin whose high-level data-parallel interface I found highly interesting, but whose lower-level tasking abstractions did not match what I had in mind.

So I decided to have a go at my vision of a low-level user threading model, taking plenty of inspiration from other designs which I am fond of such as OpenCL's command queues.

Today, I am confident that the resulting design and implementation is solid enough for third party review. So if anyone here is interested, please have a go at studying it on GitHub!

https://github.com/HadrienG2/ada-async-tests


I tried to write a fairly extensive README which explains what I had in mind, what I have done, what I plan to do next if this experiment proves successful, and which design points I think require particular attention today.

I do not have a license yet, because I couldn't decide on a nice name for the project (ada-async is already taken...). But anyway, I really wouldn't recommend that anyone use this library yet, as I am still open to design and interface changes, so the missing license is not much of an issue at this stage.

Once I get the name thing sorted out, you can most likely expect GPL or LGPL licensing. These are my standard licenses for hobby projects: I don't want any money for them, but your code is of high interest to me :)

Hadrien


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: RFC: Prototype for a user threading library in Ada
  2016-06-17  9:44 RFC: Prototype for a user threading library in Ada Hadrien Grasland
@ 2016-06-17 16:18 ` Niklas Holsti
  2016-06-17 16:46   ` Dmitry A. Kazakov
  2016-06-18  7:56   ` Hadrien Grasland
  2016-06-18  8:33 ` Hadrien Grasland
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 72+ messages in thread
From: Niklas Holsti @ 2016-06-17 16:18 UTC (permalink / raw)


On 16-06-17 12:44 , Hadrien Grasland wrote:
> So, a while ago, after playing with the nice user-mode threading
> libraries that are available these days in C++, like Intel TBB and
> HPX, I thought it would be nice if Ada had something similar.
>
> Looking around for existing work, I found a number of projects with
> custom, relatively specialized mechanisms, and a very nice library
> called Paraffin whose high-level data-parallel interface I found
> highly interesting, but whose lower-level tasking abstractions did
> not match what I had in mind.
>
> So I decided to have a go at my vision of a low-level user threading
> model, taking plenty of inspiration from other designs which I am
> fond of such as OpenCL's command queues.
>
> Today, I am confident that the resulting design and implementation is
> solid enough for third party review. So if anyone here is interested,
> please have a go at studying it on GitHub!
>
> https://github.com/HadrienG2/ada-async-tests

I had a quick first look, and it seems interesting, but I have two 
suggestions to make the development more understandable to the 
Ada-oriented reader:

- First, please do not redefine the word "task", even in the qualified 
form "asynchronous task". It is quite confusing, in the Ada context.

- Second, I question the terminology of "user thread". The "events", or 
"asynchronous tasks", are not "threads" in the sense of keeping their 
own machine-level control-flow state; they are automata that are invoked 
from the "executors" through a single "Run" operation. If some 
control-flow state must be kept between invocations of "Run", the 
"asynchronous task" must keep it in Ada-level variables/components.

I don't quite know what to call your "events" / "asynchronous tasks", 
but perhaps the term "work item", which you use in a comment, is better.

--
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
       .      @       .


* Re: RFC: Prototype for a user threading library in Ada
  2016-06-17 16:18 ` Niklas Holsti
@ 2016-06-17 16:46   ` Dmitry A. Kazakov
  2016-06-18  8:16     ` Hadrien Grasland
  2016-06-21  2:40     ` rieachus
  2016-06-18  7:56   ` Hadrien Grasland
  1 sibling, 2 replies; 72+ messages in thread
From: Dmitry A. Kazakov @ 2016-06-17 16:46 UTC (permalink / raw)


On 2016-06-17 18:18, Niklas Holsti wrote:

> - First, please do not redefine the word "task", even in the qualified
> form "asynchronous task". It is quite confusing, in the Ada context.

Yes.

> - Second, I question the terminology of "user thread". The "events", or
> "asynchronous tasks", are not "threads" in the sense of keeping their
> own machine-level control-flow state; they are automata that are invoked
> from the "executors" through a single "Run" operation. If some
> control-flow state must be kept between invocations of "Run", the
> "asynchronous task" must keep it in Ada-level variables/components.
>
> I don't quite know what to call your "events" / "asynchronous tasks",
> but perhaps the term "work item", which you use in a comment, is better.

"Event" looks like a plain event (as opposed to a pulse event). 
"Asynchronous task" looks like a subprogram, not a task proper, not even 
a co-routine.

My take on this problem is that there cannot exist a solution 
implemented at the library level. All these frameworks may be fun (for 
the developer) but useless (a horror for the end user) so long as the key 
problem is not solved: preserving both the control-flow state (as you 
said) and the stack of local objects between scheduling points. This can 
be done, IMO, only at the language level, as co-routines: non-preemptive, 
cooperative, user-scheduled tasks, call them what you wish.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de



* Re: RFC: Prototype for a user threading library in Ada
  2016-06-17 16:18 ` Niklas Holsti
  2016-06-17 16:46   ` Dmitry A. Kazakov
@ 2016-06-18  7:56   ` Hadrien Grasland
  1 sibling, 0 replies; 72+ messages in thread
From: Hadrien Grasland @ 2016-06-18  7:56 UTC (permalink / raw)


On Friday, June 17, 2016 at 18:18:25 UTC+2, Niklas Holsti wrote:
> I had a quick first look, and it seems interesting, but I have two 
> suggestions to make the development more understandable to the 
> Ada-oriented reader:
> 
> - First, please do not redefine the word "task", even in the qualified 
> form "asynchronous task". It is quite confusing, in the Ada context.
> 
> - Second, I question the terminology of "user thread". The "events", or 
> "asynchronous tasks", are not "threads" in the sense of keeping their 
> own machine-level control-flow state; they are automata that are invoked 
> from the "executors" through a single "Run" operation. If some 
> control-flow state must be kept between invocations of "Run", the 
> "asynchronous task" must keep it in Ada-level variables/components.
> 
> I don't quite know what to call your "events" / "asynchronous tasks", 
> but perhaps the term "work item", which you use in a comment, is better.

Terminology has indeed been much of a pain for me, and something I can improve before freezing the interface. The main issue that I have is to find a good usability compromise between reusing familiar names when possible, and avoiding name clashes with existing concepts that have different meanings.

I thought I could be fine by focusing on terms that are known to have highly overloaded meanings across the software landscape, and thus warn the user that special attention to definitions is required, but you are right that in the Ada context "task" is definitely problematic enough to warrant trying to find something else.

What's nice about "task", in a C++ context at least, is that the Intel marketing department did quite a good job of imposing that as the official terminology for the elementary work unit of user-mode cooperative multitasking facilities. They were followed in this by other "task-parallel" libraries like HPX, so these days, when someone familiar with one of these libraries sees a "task", they get an idea which is quite close to what I mean here.

Common alternatives such as "green threads" or "user threads" are clearly inferior: the former is politically loaded, and the latter is only comprehensible to people with an OS tech background. "Coroutine" and "process" abuse established terminology that means something very different, and "generator" is only really usable for functional abstractions that yield results; it is a very clumsy term for procedural abstractions that simply return control to the caller.

As for "work-item", it is something which I played with recently, but I'm not very keen on it either, because in the OpenCL context it has a totally different meaning: I fear that users experienced with OpenCL terminology would interpret a "work-item" as something rather lightweight, and fall into the trap of too fine a task granularity. Still, it looks better than the options above, so I may converge on it unless a better idea emerges.

Again, the great thing about "task" is that it is highly descriptive and was not widely used in the C++ community before, but obviously that is not true in an Ada context.

---

On the other hand, I am quite keen on "event" because what I am proposing is a near-superset of the OpenCL event model, although the use of a higher-level programming language allows for a much more convenient programming interface. An alternative which I had in mind was "completion", but although it brings some disambiguation with the interrupt-like event model used by e.g. GUI toolkits, it is also highly nonstandard, and somewhat confusing in the sense that "manipulating a completion" is meaningless.


* Re: RFC: Prototype for a user threading library in Ada
  2016-06-17 16:46   ` Dmitry A. Kazakov
@ 2016-06-18  8:16     ` Hadrien Grasland
  2016-06-18  8:47       ` Dmitry A. Kazakov
  2016-06-23  1:42       ` Randy Brukardt
  2016-06-21  2:40     ` rieachus
  1 sibling, 2 replies; 72+ messages in thread
From: Hadrien Grasland @ 2016-06-18  8:16 UTC (permalink / raw)


On Friday, June 17, 2016 at 18:46:46 UTC+2, Dmitry A. Kazakov wrote:
> On 2016-06-17 18:18, Niklas Holsti wrote:
> 
> > - First, please do not redefine the word "task", even in the qualified
> > form "asynchronous task". It is quite confusing, in the Ada context.
> 
> Yes.
> 
> > - Second, I question the terminology of "user thread". The "events", or
> > "asynchronous tasks", are not "threads" in the sense of keeping their
> > own machine-level control-flow state; they are automata that are invoked
> > from the "executors" through a single "Run" operation. If some
> > control-flow state must be kept between invocations of "Run", the
> > "asynchronous task" must keep it in Ada-level variables/components.
> >
> > I don't quite know what to call your "events" / "asynchronous tasks",
> > but perhaps the term "work item", which you use in a comment, is better.
> 
> "Event" looks like a plain event (as opposed to a pulse event). 
> "Asynchronous task" looks like a subprogram, not a task proper, not even 
> a co-routine.
> 
> My take on this problem is that there cannot exist a solution 
> implemented at the library level. All these frameworks may be fun (for 
> the developer) but useless (a horror for the end user) so long as the key 
> problem is not solved: preserving both the control-flow state (as you 
> said) and the stack of local objects between scheduling points. This can 
> be done, IMO, only at the language level, as co-routines: non-preemptive, 
> cooperative, user-scheduled tasks, call them what you wish.

I agree that implementation support for coroutines would be extremely valuable, if it were available at the language level (as in Python, C#, Go...) or even in specific implementations (as in Visual C++).

In addition to the benefits you mention, I will also add that a language-level implementation can potentially trap blocking system calls and replace them with task switches for latency hiding. It would also get wide support from the standard library instead of requiring many custom primitives and wrappers, which means more development time to focus on the core concurrency model.

However, I think that as it stands, we are just about as likely to see it happening in Ada as we are to get lambdas and first-class function objects. So at some point, it is necessary to move on, and try to do what we can at the library level. In my case, state saving is partially available in the sense that any data member of Asynchronous_Task is preserved across iterations, so we do get something relatively straightforward to use, but I will certainly agree that this something is less flexible and pleasant to use than full language-level coroutine/generator support.
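
To make this concrete, here is a rough sketch of what keeping control-flow state in components looks like; the names (Asynchronous_Task, Run, Task_Status, Yielding, Finished) are a simplification for illustration and need not match the actual library interface:

    type Step is (First_Half, Second_Half);

    type My_Task is new Asynchronous_Task with record
       Next_Step : Step    := First_Half;  --  Explicit control-flow state
       Counter   : Natural := 0;           --  Local data, preserved across Run calls
    end record;

    overriding function Run (T : in out My_Task) return Task_Status is
    begin
       case T.Next_Step is
          when First_Half =>
             T.Counter   := T.Counter + 1;
             T.Next_Step := Second_Half;
             return Yielding;              --  Please reschedule me later
          when Second_Half =>
             T.Counter := T.Counter * 2;
             return Finished;              --  This work item is done
       end case;
    end Run;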

If this prototype works well, there is hope to take it further in the long term, though. For example, the productive interaction of HPX with the C++17 committee has shown that having a working (even if clumsy) implementation of new concurrent primitives is a very strong argument when asking a language standardization committee for new language features that make these primitives easier to implement and use.

And even if we really cannot get the language features, if I get this project far enough to motivate other developers to join, it is also possible to envision implementing architecture- and OS-specific automatic task state saving, as in Intel TBB. That is something I refuse to do for now, because it is too difficult to handle for a lone developer with a limited base of test hardware.


* Re: RFC: Prototype for a user threading library in Ada
  2016-06-17  9:44 RFC: Prototype for a user threading library in Ada Hadrien Grasland
  2016-06-17 16:18 ` Niklas Holsti
@ 2016-06-18  8:33 ` Hadrien Grasland
  2016-06-18 11:38 ` Hadrien Grasland
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 72+ messages in thread
From: Hadrien Grasland @ 2016-06-18  8:33 UTC (permalink / raw)


Just a very random naming idea: this project is mainly about a tightly bound, small group of cooperative workers in a dictatorial infrastructure. So if I were ready to deal with the consequences in terms of perceived seriousness and search engine accessibility, maybe "Kolkhoz" could work as a project codename! ;-)


* Re: RFC: Prototype for a user threading library in Ada
  2016-06-18  8:16     ` Hadrien Grasland
@ 2016-06-18  8:47       ` Dmitry A. Kazakov
  2016-06-18  9:17         ` Hadrien Grasland
  2016-06-23  1:42       ` Randy Brukardt
  1 sibling, 1 reply; 72+ messages in thread
From: Dmitry A. Kazakov @ 2016-06-18  8:47 UTC (permalink / raw)


On 2016-06-18 10:16, Hadrien Grasland wrote:

> In addition to the benefits you mention, I will also add that a
> language-level implementation can potentially trap blocking system calls
> and replace them with task switches for latency hiding. It would also
> get wide support from the standard library instead of requiring many
> custom primitives and wrappers, which means more development time to focus
> on the core concurrency model.

It looks like too much of a burden on the implementation, if it is possible at all.

But certainly, yes, entry calls must be aware of the context when made 
from a co-routine.

Regarding system calls, surely there must be a way to make non-blocking 
system calls look as if they were blocking. Otherwise the whole idea 
would make no sense at all.

I don't care much about wrappers, since they can be easily done:

    procedure Read (Buffer : in out Stream_Element_Array) is
       Last : Stream_Element_Offset;
    begin
       loop
          Read (File, Buffer, Last); -- Non-blocking
          exit when Last = Buffer'Last;
          accept Reschedule; -- Give up until next time
       end loop;
    end Read;

The co-routine body would simply call Read and get the whole buffer filled.

A bigger problem is "releasing" a co-routine waiting for an asynchronous 
system call completion without polling. One solution could be events 
(protected objects) associated with the state of the non-blocking exchange:

    procedure Read (Buffer : in out Stream_Element_Array) is
       Last : Stream_Element_Offset;
    begin
       loop
          Read (File, Buffer, Last);
          exit when Last = Buffer'Last;
          File.IO_Event.Signaled; -- "Entry call"
       end loop;
    end Read;

> However, I think that as it stands, we are just about as likely to
> see  it happening in Ada as we are to get lambdas and first-class function
> objects.

Yes, it is difficult to convince. But co-routines do not look like much 
of a change, since all the necessary syntax is basically there. It is the 
semantics of entry calls and accept statements that must be adjusted.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de



* Re: RFC: Prototype for a user threading library in Ada
  2016-06-18  8:47       ` Dmitry A. Kazakov
@ 2016-06-18  9:17         ` Hadrien Grasland
  2016-06-18 11:53           ` Dmitry A. Kazakov
  0 siblings, 1 reply; 72+ messages in thread
From: Hadrien Grasland @ 2016-06-18  9:17 UTC (permalink / raw)


On Saturday, June 18, 2016 at 10:48:00 UTC+2, Dmitry A. Kazakov wrote:
>
> Regarding system calls, surely there must be a way to make non-blocking 
> system calls look as if they were blocking. Otherwise the whole idea 
> would make no sense at all.
>
> I don't care much about wrappers, since they can be easily done:
> 
>     procedure Read (Buffer : in out Stream_Element_Array) is
>        Last : Stream_Element_Offset;
>     begin
>        loop
>           Read (File, Buffer, Last); -- Non-blocking
>           exit when Last = Buffer'Last;
>           accept Reschedule; -- Give up until next time
>        end loop;
>     end Read;
> 
> The co-routine body would simply call Read and get the whole buffer filled.
> 
> A bigger problem is "releasing" a co-routine waiting for an asynchronous 
> system call completion without polling. One solution could be events 
> (protected objects) associated with the state of the non-blocking exchange:
> 
>     procedure Read (Buffer : in out Stream_Element_Array) is
>        Last : Stream_Element_Offset;
>     begin
>        loop
>           Read (File, Buffer, Last);
>           exit when Last = Buffer'Last;
>           File.IO_Event.Signaled; -- "Entry call"
>        end loop;
>     end Read;

Yes, if you control the wrapper responsible for making the "blocking" call, handling caller release is quite easy. You simply launch the nonblocking IO asynchronously, and have the caller task wait for the corresponding event object. When the IO is done, the event is fired, and the "blocked" caller is rescheduled.

But nonblocking IO is something I want to study more during the evolution of this library, as I think it is something which stresses the limits of the event model I propose. Single-shot events are a good fit when a clear notion of task completion exists, but they are less suitable when dealing with continuous processes such as streaming IO.

I do not want to go in the direction of reusable events, as the number of ways these can go wrong is all but infinite; however, there has to be a better synchronization primitive for this kind of progressive evolution.


* Re: RFC: Prototype for a user threading library in Ada
  2016-06-17  9:44 RFC: Prototype for a user threading library in Ada Hadrien Grasland
  2016-06-17 16:18 ` Niklas Holsti
  2016-06-18  8:33 ` Hadrien Grasland
@ 2016-06-18 11:38 ` Hadrien Grasland
  2016-06-18 13:17   ` Niklas Holsti
  2016-06-18 16:27   ` Jeffrey R. Carter
  2016-06-20  8:42 ` Hadrien Grasland
  2016-07-10  0:45 ` rieachus
  4 siblings, 2 replies; 72+ messages in thread
From: Hadrien Grasland @ 2016-06-18 11:38 UTC (permalink / raw)


After going through a thesaurus a bit, what would you think about "job", perhaps with an extra "asynchronous" qualifier?

The term has a long history of being used in various task scheduling contexts. A minor issue is that it tends to be associated with batch processing, but that is not an extremely strong connection, and isn't entirely wrong either. In general, the term reflects the idea of an ongoing process without abusing that specific word. I think it could work quite well.



* Re: RFC: Prototype for a user threading library in Ada
  2016-06-18  9:17         ` Hadrien Grasland
@ 2016-06-18 11:53           ` Dmitry A. Kazakov
  2016-06-20  8:23             ` Hadrien Grasland
  0 siblings, 1 reply; 72+ messages in thread
From: Dmitry A. Kazakov @ 2016-06-18 11:53 UTC (permalink / raw)


On 2016-06-18 11:17, Hadrien Grasland wrote:

> But nonblocking IO is something I want to study more during the
> evolution of this library, as I think it is something which stresses the
> limits of the event model I propose. Single-shot events are a good fit
> when a clear notion of task completion exists, but they are less
> suitable when dealing with continuous processes such as streaming IO.

You could use a pulse event instead. The event is reset when all waiting 
tasks are released. It is not difficult to implement with protected 
objects, using the entry 'Count attribute.
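
For reference, one way to write such a pulse event with a protected 
object and the 'Count attribute (all names are illustrative):

    protected type Pulse_Event is
       entry Wait;       --  Block until the next pulse
       procedure Pulse;  --  Release everyone currently waiting
    private
       entry Holding_Pen;
       Open       : Boolean := False;
       To_Release : Natural := 0;
    end Pulse_Event;

    protected body Pulse_Event is
       entry Wait when not Open is
       begin
          requeue Holding_Pen;  --  Park until a pulse opens the pen
       end Wait;

       procedure Pulse is
       begin
          To_Release := Holding_Pen'Count;  --  Snapshot the current waiters
          Open       := To_Release > 0;
       end Pulse;

       entry Holding_Pen when Open is
       begin
          To_Release := To_Release - 1;
          Open       := To_Release > 0;  --  Auto-reset after the last release
       end Holding_Pen;
    end Pulse_Event;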

> I do not want to go in the direction of reusable events, as the
> amount of ways these can go wrong is all but infinite, however there has to be
> a better synchronization primitive for this kind of progressive evolution.

One solution is to have more states than Reset/Signaled. An event can 
traverse a larger set of states being a small state machine. As well as 
transitions may be initiated not only explicitly but also through 
scheduling events, e.g. task release in case of the pulse event.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de



* Re: RFC: Prototype for a user threading library in Ada
  2016-06-18 11:38 ` Hadrien Grasland
@ 2016-06-18 13:17   ` Niklas Holsti
  2016-06-18 16:27   ` Jeffrey R. Carter
  1 sibling, 0 replies; 72+ messages in thread
From: Niklas Holsti @ 2016-06-18 13:17 UTC (permalink / raw)


On 16-06-18 14:38 , Hadrien Grasland wrote:
> After going through a thesaurus a bit, what would you think about
> "job", perhaps with an extra "asynchronous" qualifier?

Yes, I think "job" is much better than "task", in this context. "Action" 
could work, too.

Another similar word is "chore", but perhaps that sounds too negative 
and tiresome :-)

-- 
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
       .      @       .


* Re: RFC: Prototype for a user threading library in Ada
  2016-06-18 11:38 ` Hadrien Grasland
  2016-06-18 13:17   ` Niklas Holsti
@ 2016-06-18 16:27   ` Jeffrey R. Carter
  1 sibling, 0 replies; 72+ messages in thread
From: Jeffrey R. Carter @ 2016-06-18 16:27 UTC (permalink / raw)


On Saturday, June 18, 2016 at 4:38:34 AM UTC-7, Hadrien Grasland wrote:
> After going through a thesaurus a bit, what would you think about "job", perhaps with an extra "asynchronous" qualifier?
> 
> The term has a long history of being used in various task scheduling contexts. A minor issue is that it tends to be associated with batch processing, but that is not an extremely strong connection, and isn't entirely wrong either. In general, the term reflects the idea of an ongoing process without abusing that specific word. I think it could work quite well.

Yes. Since yesterday (see my post on Eternal September) I've been trying to post:

Although your terminology is different, IIUC this seems to be a library to provide job pools: a set of tasks that can process jobs as they become available. A job, which you confusingly call an asynchronous task, is defined by a type with an associated operation. A task (in Ada terms) in the pool simply obtains a value of the type and invokes the operation on it. The library provides queuing of jobs when all tasks are busy, blocking of tasks when no jobs are available, and the creation of the tasks. There are a number of variations of the problem, mostly depending on dynamic/static creation of tasks and whether/how tasks terminate.
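
The above can be sketched directly on top of the Ada 2012 synchronized 
queue containers; everything here (Job, Worker, Pending) is my own 
reconstruction for illustration, not the actual interface of the library 
under discussion:

    with Ada.Containers.Synchronized_Queue_Interfaces;
    with Ada.Containers.Unbounded_Synchronized_Queues;

    package Job_Pools is
       type Job is interface;
       procedure Run (J : in out Job) is abstract;

       type Job_Access is access all Job'Class;

       package Job_Interfaces is
         new Ada.Containers.Synchronized_Queue_Interfaces (Job_Access);
       package Job_Queues is
         new Ada.Containers.Unbounded_Synchronized_Queues (Job_Interfaces);

       Pending : Job_Queues.Queue;  --  Shared job queue

       task type Worker;  --  One would typically create one per core
    end Job_Pools;

    package body Job_Pools is
       task body Worker is
          Current : Job_Access;
       begin
          loop
             Pending.Dequeue (Current);  --  Blocks while no jobs are available
             exit when Current = null;   --  A null job acts as a shutdown signal
             Current.Run;
          end loop;
       end Worker;
    end Job_Pools;

Clients would then derive concrete job types from Job and hand them to 
the pool with Pending.Enqueue.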


* Re: RFC: Prototype for a user threading library in Ada
  2016-06-18 11:53           ` Dmitry A. Kazakov
@ 2016-06-20  8:23             ` Hadrien Grasland
  2016-06-20  9:22               ` Dmitry A. Kazakov
  0 siblings, 1 reply; 72+ messages in thread
From: Hadrien Grasland @ 2016-06-20  8:23 UTC (permalink / raw)


On Saturday, June 18, 2016 at 13:53:13 UTC+2, Dmitry A. Kazakov wrote:
> On 2016-06-18 11:17, Hadrien Grasland wrote:
> 
> > But nonblocking IO is something I want to study more during the
> > evolution of this library, as I think it is something which stresses the
> > limits of the event model I propose. Single-shot events are a good fit
> > when a clear notion of task completion exists, but they are less
> > suitable when dealing with continuous processes such as streaming IO.
> 
> You could use a pulse event instead. The event is reset when all waiting 
> tasks are released. It is not difficult to implement with protected 
> objects using entry count attribute.

I am not very keen on this option because it is incompatible with state polling, which is useful for all kinds of non-waiting scenarios, including component testing ("what is the state of my event after performing this operation?").

 
> > I do not want to go in the direction of reusable events, as the
> > amount of ways these can go wrong is all but infinite, however there has to be
> > a better synchronization primitive for this kind of progressive evolution.
> 
> One solution is to have more states than Reset/Signaled. An event can 
> traverse a larger set of states being a small state machine. As well as 
> transitions may be initiated not only explicitly but also through 
> scheduling events, e.g. task release in case of the pulse event.

Would you mean something like, for example, a discrete or floating-point progress counter that can be programmed to fire an event when going above any arbitrary level of progress?

That could be a very interesting avenue to explore indeed!
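
To make the question concrete, here is the kind of thing I have in mind, 
with the threshold fixed per event object for simplicity (all names are 
made up, nothing here is part of the current library):

    protected type Progress_Event (Threshold : Natural) is
       procedure Advance (Amount : Positive);  --  Called as work progresses
       entry Wait;                             --  Released once past Threshold
       function Current return Natural;        --  Pollable state, e.g. for tests
    private
       Progress : Natural := 0;
    end Progress_Event;

    protected body Progress_Event is
       procedure Advance (Amount : Positive) is
       begin
          Progress := Progress + Amount;
       end Advance;

       entry Wait when Progress >= Threshold is
       begin
          null;  --  The barrier does all the synchronization work
       end Wait;

       function Current return Natural is
       begin
          return Progress;
       end Current;
    end Progress_Event;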



* Re: RFC: Prototype for a user threading library in Ada
  2016-06-17  9:44 RFC: Prototype for a user threading library in Ada Hadrien Grasland
                   ` (2 preceding siblings ...)
  2016-06-18 11:38 ` Hadrien Grasland
@ 2016-06-20  8:42 ` Hadrien Grasland
  2016-07-10  0:45 ` rieachus
  4 siblings, 0 replies; 72+ messages in thread
From: Hadrien Grasland @ 2016-06-20  8:42 UTC (permalink / raw)


So, I think it is time for an update on the terminology front:

For now, the unanimous sentiment is that "job" is a better fit than "task" for the work-items of my tasking model, and Jeffrey even found prior usage where this terminology matches exactly. I will thus switch to this terminology shortly unless a strong negative sentiment is received.

On the event/completion front, I am as of yet unconvinced that this term really needs changing. But there is plenty of time before interface freeze so I'll be continuously listening for new feedback.

As for giving the whole project a name that can at least work as a temporary placeholder until I find something better, I still think socialist terminology could be a fun choice that has not yet been overused to death by the software community, but I am a bit concerned that "kolkhoz" in particular could have unfortunate connotations for people originating from the former USSR.

Due to this, I'd tend to lean more towards "phalanstery", an obscure word rooted in the original utopian-socialist idea of voluntarily assembled cooperative communities, and thus with little chance of offending anyone. From this point of view, "kibbutz" could arguably also work, but I would be wary of naming one of my projects after a religious community which I am not part of and am not familiar with ^^


* Re: RFC: Prototype for a user threading library in Ada
  2016-06-20  8:23             ` Hadrien Grasland
@ 2016-06-20  9:22               ` Dmitry A. Kazakov
  0 siblings, 0 replies; 72+ messages in thread
From: Dmitry A. Kazakov @ 2016-06-20  9:22 UTC (permalink / raw)


On 20/06/2016 10:23, Hadrien Grasland wrote:
> On Saturday, June 18, 2016 at 13:53:13 UTC+2, Dmitry A. Kazakov wrote:
>> On 2016-06-18 11:17, Hadrien Grasland wrote:
>>
>>> But nonblocking IO is something I want to study more during the
>>> evolution of this library, as I think it is something which stresses the
>>> limits of the event model I propose. Single-shot events are a good fit
>>> when a clear notion of task completion exists, but they are less
>>> suitable when dealing with continuous processes such as streaming IO.
>>
>> You could use a pulse event instead. The event is reset when all waiting
>> tasks are released. It is not difficult to implement with protected
>> objects using entry count attribute.
>
> I am not very keen on this option because it is incompatible with
> state polling, which is useful for all kinds of non-waiting scenarios
> including component testing ("what is the state of my event after
> performing this operation?").

If you know which tasks await a pulse event, you could make it 
compatible again by polling for the event + task states. But since the 
environment is not really concurrent, there is nothing to worry about: 
you cannot have a race condition within just one task.

>>> I do not want to go in the direction of reusable events, as the
>>> amount of ways these can go wrong is all but infinite, however there has to be
>>> a better synchronization primitive for this kind of progressive evolution.
>>
>> One solution is to have more states than Reset/Signaled. An event can
>> traverse a larger set of states being a small state machine. As well as
>> transitions may be initiated not only explicitly but also through
>> scheduling events, e.g. task release in case of the pulse event.
>
> Would you mean something like, for example, a discrete or
> floating-point progress counter that can be programmed to fire an event
> when going above any arbitrary level of progress?

That too. But more generally, the client should have the possibility to 
wait for any one of the states, or for any disjunction of states.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de


* Re: RFC: Prototype for a user threading library in Ada
  2016-06-17 16:46   ` Dmitry A. Kazakov
  2016-06-18  8:16     ` Hadrien Grasland
@ 2016-06-21  2:40     ` rieachus
  2016-06-21  7:34       ` Dmitry A. Kazakov
  1 sibling, 1 reply; 72+ messages in thread
From: rieachus @ 2016-06-21  2:40 UTC (permalink / raw)


On Friday, June 17, 2016 at 12:46:46 PM UTC-4, Dmitry A. Kazakov wrote:
> On 2016-06-17 18:18, Niklas Holsti wrote:
> 
> My take on this problem is that there cannot exist a solution 
> implemented at the library level. All these frameworks may be fun (for 
> the developer) but useless (a horror for the end user) so long as the key 
> problem is not solved: preserving both the control-flow state (as you 
> said) and the stack of local objects between scheduling points. This can 
> be done, IMO, only at the language level, as co-routines: non-preemptive, 
> cooperative, user-scheduled tasks, call them what you wish.

I've been trying to understand not just the code, but the goal.  I decided to start from a different perspective.  What if I had a problem and wanted to distribute the solution across thousands of processors?  Since I tend to bang my head against NP-hard or NP-complete problems, I would want a program structure that allowed me to start up at least one (Ada) task per processor, with enough data to complete, and for the job creation software to use a very wide tree.  A similar reverse tree could be used to collect results if needed.

But to do any of this, I would head for the Distributed Systems Annex.  It might be nice to have a simple example of how to do that on top of MPI:  https://computing.llnl.gov/tutorials/mpi/#MPI2-3 (To be honest, using the C or Fortran bindings is what I have done...)

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: RFC: Prototype for a user threading library in Ada
  2016-06-21  2:40     ` rieachus
@ 2016-06-21  7:34       ` Dmitry A. Kazakov
  0 siblings, 0 replies; 72+ messages in thread
From: Dmitry A. Kazakov @ 2016-06-21  7:34 UTC (permalink / raw)


On 21/06/2016 04:40, rieachus@comcast.net wrote:
> On Friday, June 17, 2016 at 12:46:46 PM UTC-4, Dmitry A. Kazakov wrote:
>> On 2016-06-17 18:18, Niklas Holsti wrote:
>>
>> My take on this problematic is that there cannot exist a solution
>> implemented at the library level. All these frameworks maybe fun (for
>> the developer) but useless (horror for the end user) when the key
>> problem is not solved. That is, the control-flow state (as you said) and
>> the stack of the local objects both preserved between scheduling points.
>> This can be done IMO only at the language level as co-routines,
>> non-preemptive, cooperative, user-scheduled tasks, call it as you wish.
>
> I've been trying to understand not just the code, but the goal. I
> decided to start from a different perspective. What if I had a problem
> and wanted to distribute the solution across thousands of processors?
> Since I tend to bang my head against NP-hard or NP-complete problems, I
> would want a program structure that allowed me to start up at least one
> (Ada) task per processor, with enough data to complete, and for the job
> creation software to use a very wide tree. A similar reverse tree could
> be used to collect results if needed.

I have a similar problem. I implement network protocols. A classic 
implementation is to start one task per client-server connection, or two 
tasks when the exchange is full duplex. An OS can typically handle a few 
hundred tasks before it runs out of juice. So the solution is to share a 
large number of sockets between a few tasks. And that leads to the 
problem of an I/O-event-driven design, and thus to co-routines.

> But to do any of this, I head for the distributed systems annex. It
> might be nice to have a simple example of how to do that on top of MPI:
> https://computing.llnl.gov/tutorials/mpi/#MPI2-3 (To be honest, using
> the C or Fortran bindings is what I have done...)

The Distributed Systems Annex is unusable for massively parallel systems 
because it is based on RPCs, which is the same problem at its core. Such 
a system requires asynchronous communication driven by I/O events. Yet 
the application logic does not fit the logic of I/O. To bring them 
together, again, [distributed] co-routines are needed to restore the 
application's view of the control flow.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: RFC: Prototype for a user threading library in Ada
  2016-06-18  8:16     ` Hadrien Grasland
  2016-06-18  8:47       ` Dmitry A. Kazakov
@ 2016-06-23  1:42       ` Randy Brukardt
  2016-06-23  8:39         ` Dmitry A. Kazakov
  2016-06-24 21:06         ` Hadrien Grasland
  1 sibling, 2 replies; 72+ messages in thread
From: Randy Brukardt @ 2016-06-23  1:42 UTC (permalink / raw)


"Hadrien Grasland" <hadrien.grasland@gmail.com> wrote in message 
news:d9d7f8f5-8b72-450b-8152-4b6116c6ce2c@googlegroups.com...
...
>I agree that implementation support for coroutines would be extremely 
>valuable, if it
>were available at the language level (as in Python, C#, Go...) or even in 
>specific
>implementations (as in Visual C++).

Coincidentally, we just spent a quite substantial portion of the most recent 
ARG meeting discussing this. (See AI12-0197-1; there are proposed 
alternatives as well but those won't get posted for a few weeks - probably 
along with the minutes.)

The problem with such proposals is that they are quite expensive to 
implement, and they don't seem to buy that much above the existing Ada 
tasking model. [Especially as the proposal explicitly does not support any 
concurrency; one has to use POs/atomics in the normal way if concurrency is 
needed.] (After all, if you really want coroutines in Ada, just use 
Janus/Ada and regular tasks as it implements all tasks that way. :-)

The problem with the Janus/Ada implementation is the inability to use 
threads to implement that; that's fixable but I'd need a customer to help 
support the work. (I'd use a scheme internal to the task supervisor similar 
to your "events" rather than trying to assign tasks to threads.)

...
>However, I think that as it stands, we are just about as likely to see it 
>happening in Ada as we are to get lambdas ...

We also talked about limited lambdas in Pisa: see AI12-0190-1. So you're 
obviously right. ;-)

---

The problem I have with the library approach (and the coroutines and the 
like intended to support it) is that it does not seem to solve any problems. I 
understand why such approaches get used in languages that don't have a real 
tasking model, but Ada hasn't had that problem since day 1. And the reasons 
that writing tasking code in Ada is too hard aren't getting addressed by 
these schemes (that is, race conditions, deadlocks [especially data 
deadlocks], and the like).

I'd prefer to concentrate on language features that make it as easy to write 
(restricted and correct) parallel code as it is to write sequential code. I 
don't see how libraries or coroutines or lambdas are getting us any closer 
to that.

I'd like to understand better the motivations for these features, so if you 
(or anyone else) wants to try to explain them to me, feel free. (But keep in 
mind that I tend to be hard to convince of anything these days, so don't 
bother if you're going to give up easily. ;-)

                                            Randy.




^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: RFC: Prototype for a user threading library in Ada
  2016-06-23  1:42       ` Randy Brukardt
@ 2016-06-23  8:39         ` Dmitry A. Kazakov
  2016-06-23 22:12           ` Randy Brukardt
  2016-06-24  0:38           ` rieachus
  2016-06-24 21:06         ` Hadrien Grasland
  1 sibling, 2 replies; 72+ messages in thread
From: Dmitry A. Kazakov @ 2016-06-23  8:39 UTC (permalink / raw)


On 23/06/2016 03:42, Randy Brukardt wrote:

> I'd like to understand better the motivations for these features, so if you
> (or anyone else) wants to try to explain them to me, feel free.

The motivation is a two-liner. Say you have some consumer of data:

    procedure Write (Buffer : String; Last : out Integer);

It may take less than the whole string when called, and will take more 
data later; hence the parameter Last. Now you want to write a program in a 
*normal* way:

    Write ("This");
    Write ("That");

That's it.

This applies to all kinds of asynchronous communication and naturally to 
all kinds of parallel and distributed programming.

Ada tasks + blocking exchange are far too heavy-weight for many 
applications. Consider a server handling 1K connections as an example or 
a tiny embedded board running HTTP server, MQTT server, MODBUS client + 
dozens of other protocols etc.

Protected objects + non-blocking exchange are OK, but you will have to 
design your program standing on your head: saving the state between 
portions of data, running a complicated state machine. This is neither 
scalable nor maintainable.
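
To make the inversion concrete, this is roughly what the two Write calls 
from the motivation become once turned inside out into a callback-driven 
state machine. This is a sketch against the hypothetical non-blocking 
Write signature above; the names and the On_Ready callback are invented:

```ada
--  The two sequential calls Write ("This"); Write ("That"); inverted
--  into an explicit state machine that an I/O-readiness callback drives.

This_Msg : constant String := "This";
That_Msg : constant String := "That";

type Send_State is (Sending_This, Sending_That, Done);

type Session is record
   State : Send_State := Sending_This;
   Sent  : Natural    := 0;  --  index of the last character accepted
end record;

--  Called each time the transport is ready to accept more data.
procedure On_Ready (S : in out Session) is
   Last : Integer;
begin
   case S.State is
      when Sending_This =>
         Write (This_Msg (S.Sent + 1 .. This_Msg'Last), Last);
         S.Sent := Last;
         if Last = This_Msg'Last then
            S.State := Sending_That;
            S.Sent  := 0;
         end if;
      when Sending_That =>
         Write (That_Msg (S.Sent + 1 .. That_Msg'Last), Last);
         S.Sent := Last;
         if Last = That_Msg'Last then
            S.State := Done;
         end if;
      when Done =>
         null;
   end case;
end On_Ready;
```

Every local of the straight-line version has been hoisted into the 
Session record by hand, which is exactly the work co-routines would do 
for us.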

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: RFC: Prototype for a user threading library in Ada
  2016-06-23  8:39         ` Dmitry A. Kazakov
@ 2016-06-23 22:12           ` Randy Brukardt
  2016-06-24  7:34             ` Dmitry A. Kazakov
  2016-06-24  0:38           ` rieachus
  1 sibling, 1 reply; 72+ messages in thread
From: Randy Brukardt @ 2016-06-23 22:12 UTC (permalink / raw)


"Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> wrote in message 
news:nkg78j$1keu$1@gioia.aioe.org...
> On 23/06/2016 03:42, Randy Brukardt wrote:
>
>> I'd like to understand better the motivations for these features, so if 
>> you
>> (or anyone else) wants to try to explain them to me, feel free.
>
> The motivation is a two-liner. Let you have some consumer of data:
>
>    procedure Write (Buffer : String; Last : out Integer);
>
> It may take less than the whole string when called, but will take more 
> data later. So, the parameter Last. Now you want to write a program in a 
> *normal* way:
>
>    Write ("This");
>    Write ("That");
>
> That's it.

That wasn't my question; I'm wondering about the motivation for these features in 
terms of parallelization. In today's world, it's impossible to consider any 
feature in a purely sequential manner. The expressiveness gain (if any) is 
secondary. And the OP was talking about parallelism, not generators.

The generator proposal as expressed in AI12-0197-1 is just too expensive to 
consider for a purely sequential feature. For Janus/Ada on Windows, we'd 
either have to throw away 1/3 of the back-end of the compiler (and generally 
use slower instructions in some cases, impacting all code), or implement 
these *exactly* as we do tasks (with a TCB, context switching, and so on). 
[Not to mention the extensive changes needed to the front-end.] For 
something to be worth that sort of effort, it has to benefit a large 
percentage of programs. This (as a purely sequential feature) doesn't do 
that.

                                                Randy.



^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: RFC: Prototype for a user threading library in Ada
  2016-06-23  8:39         ` Dmitry A. Kazakov
  2016-06-23 22:12           ` Randy Brukardt
@ 2016-06-24  0:38           ` rieachus
  2016-06-25  6:28             ` Dmitry A. Kazakov
  1 sibling, 1 reply; 72+ messages in thread
From: rieachus @ 2016-06-24  0:38 UTC (permalink / raw)


I don't get it.  If this is your "motivation":

> The motivation is a two-liner. Let you have some consumer of data: 
>
>    procedure Write (Buffer : String; Last : out Integer); 
>
> It may take less than the whole string when called, but will take more 
> data later. So, the parameter Last. Now you want to write a program in a 
> *normal* way: 
>
>    Write ("This"); 
>    Write ("That"); 
>
> That's it. 

You may want to make your Last parameter in or in out, but that's a detail.

The common Ada idiom is:

   procedure Write (S : in String) is
      Blanks : constant Buffer := (others => ' ');
      -- I assume that there are constant-size buffers around somewhere
      -- that we are matching.
      SS : String (1 .. S'Length) := S;
   begin
      Write (SS & Blanks (SS'Length + 1 .. Blanks'Last), SS'Length);
   end Write;
   -- SS is just belt and suspenders in case S has unusual bounds.

Put this in the right place, and your calls above will call your Write procedure.  Of course, most Ada IO doesn't require padding. (That is what Unbounded_Strings are for.) But what if you have a low-level routine that needs padding, for example a protected object that has a procedure with a fixed buffer size? No big deal.

And don't worry about the constant Blanks.  Assuming it is a reasonable size, your compiler should put it in the executable somewhere.  (If it doesn't, move it to a library package.)

Little routines like this are what makes Ada.Text_IO comfortable, if you really have to use it.


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: RFC: Prototype for a user threading library in Ada
  2016-06-23 22:12           ` Randy Brukardt
@ 2016-06-24  7:34             ` Dmitry A. Kazakov
  2016-06-24 23:00               ` Randy Brukardt
  0 siblings, 1 reply; 72+ messages in thread
From: Dmitry A. Kazakov @ 2016-06-24  7:34 UTC (permalink / raw)


On 24/06/2016 00:12, Randy Brukardt wrote:

> That wasn't my question; I'm wondering the motivation for these features in
> terms of parallelization.

Because fine-grained (tightly coupled) parallelism is dead on arrival if 
you use synchronous exchange. You cannot wait for a response without 
imposing huge accumulating latencies.

> In today's world, it's impossible to consider any
> feature in a purely sequential manner. The expressiveness gain (if any) is
> secondary. And the OP was talking about parallelism, not generators.

Of course you can. Nobody is capable of writing anything reasonably 
large in a data-event-controlled way. The logic of exchanges between 
parties is always strictly sequential: you compute, publish data, get 
subscribed data, compute again.

> The generator proposal as expressed in AI12-0197-1 is just too expensive to
> consider for a purely sequential feature.

I don't see much use in the proposal. The key feature must be the points 
where the "task" yields control and enters a non-busy wait for an external 
event. Ideally it should be shaped as an accept statement or an entry call.

> For Janus/Ada on Windows, we'd
> either have to throw away 1/3 of the back-end of the compiler (and generally
> use slower instructions in some cases, impacting all code), or implement
> these *exactly* as we do tasks (with a TCB, context switching, and so on).
> [Not to mention the extensive changes needed to the front-end.] For
> something to be worth that sort of effort, it has to benefit a large
> percentage of programs. This (as a purely sequential feature) doesn't do
> that.

I don't understand the point. All tasks are sequential. That didn't 
prevent them from being used in parallel computing.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: RFC: Prototype for a user threading library in Ada
  2016-06-23  1:42       ` Randy Brukardt
  2016-06-23  8:39         ` Dmitry A. Kazakov
@ 2016-06-24 21:06         ` Hadrien Grasland
  2016-06-26  3:09           ` Randy Brukardt
  1 sibling, 1 reply; 72+ messages in thread
From: Hadrien Grasland @ 2016-06-24 21:06 UTC (permalink / raw)


Le jeudi 23 juin 2016 03:42:50 UTC+2, Randy Brukardt a écrit :
> "Hadrien Grasland" wrote :
> ...
> >I agree that implementation support for coroutines would be extremely 
> >valuable, if it
> >were available at the language level (as in Python, C#, Go...) or even in 
> >specific
> >implementations (as in Visual C++).
> 
> Coincidentally, we just spent quite substantial portion of the most recent 
> ARG meeting discussing this. (See AI12-0197-1; there are proposed 
> alternatives as well but those won't get posted for a few weeks - probably 
> along with the minutes.)

Count me pleasantly surprised!


> The problem with such proposals is that they are quite expensive to 
> implement, and they don't seem to buy that much above the existing Ada 
> tasking model. [Especially as the proposal explicitly does not support any 
> concurrency; one has to use POs/atomics in the normal way if concurrency is 
> needed.] (After all, if you really want coroutines in Ada, just use 
> Janus/Ada and regular tasks as it implements all tasks that way. :-)
> 
> The problem with the Janus/Ada implementation is the inability to use 
> threads to implement that; that's fixable but I'd need a customer to help 
> support the work. (I'd use a scheme internal to the task supervisor similar 
> to your "events" rather than trying to assign tasks to threads.)

I would be happy to beta-test that feature if you also integrated Ada 2012 support along the way! :)

That aside, let me explain what I think coroutines are good for. When people turn to threads, they usually look for some of the following things:

1. Exploiting the concurrent processing abilities of modern hardware (multicore, hyper-threading)
2. Providing the illusion of simultaneously running tasks to their users, in a fashion that extends beyond actual hardware concurrency.
3. Hide various kinds of latencies (IO, decision-making) by doing other processing in the meantime.
4. Handle IO-heavy workloads, the typical example being a web server going through millions of requests per second.

Unfortunately, no single threading implementation can be good at all of these. And outside the embedded world, the average modern OS is optimized to provide the best possible illusion of infinite multitasking, through round-robin thread scheduling and by managing threads at the kernel level, so that the kernel may quickly switch between them on clock interrupts instead of delegating that task to user processes.

Sadly, this setup is terrible for concurrent application performance, as can easily be tested by running a multithreaded computation with overcommitted CPU resources. If you allocate even just 2 times as many OS threads as you have hardware threads, you observe a huge performance drop. Why? Because instead of leaving computations alone, your OS keeps switching between threads during execution, each time doing a round trip through the kernel and thrashing the CPU cache. There is no such thing as a free concurrent lunch.

For IO-heavy applications, the situation is even worse: you will pay the aforementioned overhead not only at the scheduling rate of your round-robin algorithm (typically ~1 kHz), but every single time your application blocks for IO. This is why no web server application that allocates one OS thread per connection can scale to more than a couple thousand connections per second.

So if you don't desire the illusion of perfect multitasking, it is better to give up on the user convenience of round robin and use some batch-derived task scheduling algorithm instead. Since asking your customers to modify their OS kernel configuration is not usually acceptable, this entails allocating only as many OS threads as there are hardware threads, and managing the remainder of your concurrency in user mode. Ergo, we need user threads, which are easiest to implement on top of language-level coroutine support.
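
That last layout can be sketched in standard Ada, using the 
System.Multiprocessors package (Ada 2012) to size the pool. The job queue 
and the user-thread layer are left out, and all names here are mine:

```ada
with System.Multiprocessors; use System.Multiprocessors;

procedure Worker_Pool is

   --  One OS-level Ada task per hardware thread, each pinned to its
   --  own CPU; user-mode "threads" are then multiplexed on top.
   task type Worker (Core : CPU) with CPU => Core;

   task body Worker is
   begin
      loop
         --  Pop and run user-level jobs from a shared work queue here,
         --  blocking (on a protected entry) only when it is empty.
         exit;  --  placeholder so that this sketch terminates
      end loop;
   end Worker;

   type Worker_Access is access Worker;

   Pool : array (CPU range 1 .. Number_Of_CPUs) of Worker_Access;

begin
   for I in Pool'Range loop
      Pool (I) := new Worker (Core => I);
   end loop;
end Worker_Pool;
```

With the pool sized to the hardware, the kernel has no reason to preempt 
anything, and all remaining scheduling decisions happen in user mode.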


> ...
> >However, I think that as it stands, we are just about as likely to see it 
> >happening in Ada as we are to get lambdas ...
> 
> We also talked about limited lambdas in Pisa: see AI12-0190-1. So you're 
> obviously right. ;-)

I stand once again pleasantly corrected, then :) Though I have to admit that in an Ada context, I miss first-class functions more than I miss lambdas: you can relatively easily replace a lambda with an expression function declared at the appropriate scope, but you need an awful lot of function-specific boilerplate in order to produce a standalone function object that can be easily transmitted to an outer scope after capturing some local state.
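
To illustrate the boilerplate I mean, here is a hand-made "function 
object" capturing a single integer; everything in it is invented for the 
example:

```ada
--  The state a lambda would capture implicitly must be spelled out as
--  a record, with an explicit Apply operation.
package Adders is
   type Adder is tagged record
      Offset : Integer;  --  the "captured" state
   end record;
   function Apply (F : Adder; X : Integer) return Integer is
      (X + F.Offset);
end Adders;

with Ada.Text_IO; use Ada.Text_IO;
with Adders;      use Adders;

procedure Demo is
   F : constant Adder := (Offset => 42);  --  the "capture", by hand
begin
   --  F can now be stored, returned, or handed to an outer scope,
   --  which is the part that plain expression functions cannot do.
   Put_Line (Integer'Image (F.Apply (1)));
end Demo;
```

One record type, one operation, and a manual capture site per "closure": 
tolerable once, painful when a library wants callbacks everywhere.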


> The problem I have with the library approach (and the coroutines and the 
> like intended to support it) is that it does not seem to solve any problems. I 
> understand why such approaches get used in languages that don't have a real 
> tasking model, but Ada hasn't had that problem since day 1. And the reasons 
> that writing tasking code in Ada is too hard aren't getting addressed by 
> these schemes (that is, race conditions, deadlocks [especially data 
> deadlocks], and the like).
>
> I'd prefer to concentrate on language features that make it as easy to write 
> (restricted and correct) parallel code as it is to write sequential code. I 
> don't see how libraries or coroutines or lambdas are getting us any closer 
> to that.
>
> I'd like to understand better the motivations for these features, so if you 
> (or anyone else) wants to try to explain them to me, feel free. (But keep in 
> mind that I tend to be hard to convince of anything these days, so don't 
> bother if you're going to give up easily. ;-)

See above. Being able to easily write highly concurrent code is of limited use if said code ends up running with terrible performance because modern OSs are not at all optimized for this kind of workload. We shouldn't need to worry about how our users' OS kernels are setup, and user threading and coroutines are a solution to this problem.

Not that I am against also providing abstractions that make concurrent code easier to write, mind you. I actually have plenty of ideas in that direction. It is just that I think this is something that can largely be done at the library level, without requiring too much help from the underlying programming language.


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: RFC: Prototype for a user threading library in Ada
  2016-06-24  7:34             ` Dmitry A. Kazakov
@ 2016-06-24 23:00               ` Randy Brukardt
  2016-06-25  7:11                 ` Dmitry A. Kazakov
  0 siblings, 1 reply; 72+ messages in thread
From: Randy Brukardt @ 2016-06-24 23:00 UTC (permalink / raw)


"Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> wrote in message 
news:nkinr0$1h8t$1@gioia.aioe.org...
> On 24/06/2016 00:12, Randy Brukardt wrote:
>
>> That wasn't my question; I'm wondering the motivation for these features 
>> in
>> terms of parallelization.
>
> Because fine-grained (tightly coupled) parallelism is dead on arrival if 
> you use synchronous exchange. You cannot wait for a response without 
> imposing huge accumulating latencies.

Parallelism, in general, is DOA if you need any significant exchange at all. 
As you say, that makes the effort effectively sequential. The cases where it 
works tend to use order-independent accumulators and fairly large chunks of 
accumulation.

>> In today's world, it's impossible to consider any
>> feature in a purely sequential manner. The expressiveness gain (if any) 
>> is
>> secondary. And the OP was talking about parallelism, not generators.
>
> Of course you can. Nobody is capable to write anything more or less large 
> in a data event-controlled way. All logic of exchanges between parties is 
> always strictly sequential: you compute, publish data, get subscribed 
> data, compute again.

Right. If there's a lot of exchange, it's not a candidate for parallelism.

>> The generator proposal as expressed in AI12-0197-1 is just too expensive 
>> to
>> consider for a purely sequential feature.
>
> I don't see much use in the proposal. The key feature must be the points 
> where the "task" yields control, enters non-busy wait for an external 
> event. Ideally it must be shaped as an accept statement or an entry call.

Jean-Pierre Rosen proposed a "passive task" feature as an alternative to 
the generator proposal. It uses the existing task syntax, but gets rid of 
the thread of control.

Of course, Janus/Ada implements all tasks that way today, so I don't see a 
lot of benefit to that. But I can see where it might help other 
implementations.

>> For Janus/Ada on Windows, we'd
>> either have to throw away 1/3 of the back-end of the compiler (and 
>> generally
>> use slower instructions in some cases, impacting all code), or implement
>> these *exactly* as we do tasks (with a TCB, context switching, and so 
>> on).
>> [Not to mention the extensive changes needed to the front-end.] For
>> something to be worth that sort of effort, it has to benefit a large
>> percentage of programs. This (as a purely sequential feature) doesn't do
>> that.
>
> I don't understand the point. All tasks are sequential. That didn't 
> prevent them being used in parallel computing.

The feature as proposed can't be used in a task unless one (manually) adds 
locking, meaning you still have all of the possibilities for deadlock and 
race conditions. I believe that if we're going to advance parallelism at 
all, we have to do something to make it possible to write (sufficiently 
restricted and compile-time checked) code where one doesn't need to worry 
about those things. After all, the main reason that it's too hard to write 
parallel code is the need to carefully consider every possible deadlock and 
race condition. Otherwise, something like Paraffin would provide everything 
needed and Ada clearly would be the choice for parallel code. ;-)

So I don't see any reason to add any new sequential features that add to, 
rather than reduce, the possibilities of deadlock and races. Certainly not 
if they take a lot of effort to implement, and don't seem to have general 
utility.

                                           Randy.


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: RFC: Prototype for a user threading library in Ada
  2016-06-24  0:38           ` rieachus
@ 2016-06-25  6:28             ` Dmitry A. Kazakov
  2016-06-26  1:34               ` rieachus
  2016-06-26  3:21               ` Randy Brukardt
  0 siblings, 2 replies; 72+ messages in thread
From: Dmitry A. Kazakov @ 2016-06-25  6:28 UTC (permalink / raw)


On 2016-06-24 02:38, rieachus@comcast.net wrote:
> I don't get it.  If this is your "motivation":
>
>> The motivation is a two-liner. Let you have some consumer of data:
>>
>>    procedure Write (Buffer : String; Last : out Integer);
>>
>> It may take less than the whole string when called, but will take more
>> data later. So, the parameter Last. Now you want to write a program in a
>> *normal* way:
>>
>>    Write ("This");
>>    Write ("That");
>>
>> That's it.
>
> You may want to make your Last parameter in or in out, but that's a detail.

It is not a detail. The caller of Write does not know how much data the 
transport layer is ready to accept. That is the nature of non-blocking 
I/O. Write takes as much data as it can and tells through Last where the 
caller must continue *later*.

A blocking busy-waiting wrapper looks this way:

    procedure Write (Buffer : String) is
       First : Integer := Buffer'First;
       Last  : Integer;
    begin
       loop
          Write (Buffer (First..Buffer'Last), Last);
          exit when Last = Buffer'Last;
          First := Last + 1;
       end loop;
    end Write;

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: RFC: Prototype for a user threading library in Ada
  2016-06-24 23:00               ` Randy Brukardt
@ 2016-06-25  7:11                 ` Dmitry A. Kazakov
  2016-06-26  2:02                   ` rieachus
  0 siblings, 1 reply; 72+ messages in thread
From: Dmitry A. Kazakov @ 2016-06-25  7:11 UTC (permalink / raw)


On 2016-06-25 01:00, Randy Brukardt wrote:
> "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> wrote in message
> news:nkinr0$1h8t$1@gioia.aioe.org...
>> On 24/06/2016 00:12, Randy Brukardt wrote:
>>
>>> That wasn't my question; I'm wondering the motivation for these features
>>> in terms of parallelization.
>>
>> Because fine-grained (tightly coupled) parallelism is dead on arrival if
>> you use synchronous exchange. You cannot wait for a response without
>> imposing huge accumulating latencies.
>
> Parallelism, in general, is DOA if you need any significant exchange at all.

That is an oversimplification. There is a large area in between where it 
works. Look at how multi-core architectures blossom: a multi-core is a 
kind of parallelism with a far greater exchange rate and tighter coupling 
(all shared memory) than an architecture of piped communications.

>> I don't see much use in the proposal. The key feature must be the points
>> where the "task" yields control, enters non-busy wait for an external
>> event. Ideally it must be shaped as an accept statement or an entry call.
>
> Jean-Pierre Rosen proposed a "passive task" feature as an alternative to
> the generator proposal. It uses the existing task syntax, but gets rid of
> the thread of control.

And this is exactly what is needed, IMO. A set of "tasks" driven by 
another task, switched in a non-preemptive manner.

> Of course, Janus/Ada implements all tasks that way today, so I don't see a
> lot of benefit to that. But I can see where it might help other
> implementations.

Only losses for blocking I/O.

> The feature as proposed can't be used in a task unless one (manually) adds
> locking, meaning you still have all of the possibilities for deadlock and
> race conditions. I believe that if we're going to advance parallelism at
> all, we have to do something to make it possible to write (sufficiently
> restricted and compile-time checked) code where one doesn't need to worry
> about those things. After all, the main reason that it's too hard to write
> parallel code is the need to carefully consider every possible deadlock and
> race condition. Otherwise, something like Paraffin would provide everything
> needed and Ada clearly would be the choice for parallel code. ;-)

The problem at hand is that parallel asynchronous code cannot be 
designed in a reasonable way at all. It gets written in the form of a 
state machine, logically and technically *gotos*. It is the state of 
software engineering of the 70s wrapped in Ada constructs.

Before we can talk about deadlocks, which I am almost sure are 
impossible to eliminate statically in this case, we need to be able to 
design the software in a minimally sane way.

> So I don't see any reason to add any new sequential features that add to,
> rather than reduce, the possibilities of deadlock and races.

You cannot reduce them; they are in the nature of exchange protocols, 
application logic, and physically distributed hardware and software. 
Each time you get a timeout running some networking application, that is 
because of some deadlock somewhere.


-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: RFC: Prototype for a user threading library in Ada
  2016-06-25  6:28             ` Dmitry A. Kazakov
@ 2016-06-26  1:34               ` rieachus
  2016-06-26  3:21               ` Randy Brukardt
  1 sibling, 0 replies; 72+ messages in thread
From: rieachus @ 2016-06-26  1:34 UTC (permalink / raw)


On Saturday, June 25, 2016 at 2:29:12 AM UTC-4, Dmitry A. Kazakov wrote:
> On 2016-06-24 02:38, rieachus@comcast.net wrote:
>
> It is not a detail. The caller of Write does not know how much data the 
> transport layer is ready to accept. That is the nature of non-blocking 
> I/O. Write takes as much data it can and tells through Last where the 
> caller must continue *later*...

Thanks for the correction.  I'm used to doing that with an asynchronous pipe abstraction managed as an array of buffers.  (In Ada it would be a protected object with Put, Get and Free operations.)  Put fills a buffer: a pointer to the data and a size, rather than copying anything.  Get is called from the disk manager, which may be one of a number of storage managers, but in any case is on different hardware.  These calls can copy all the descriptors, and Free calls can free multiple buffers.  Once you have this mechanism coded or included in your massively parallel application, tuning is needed to choose the number of buffers and how often the storage manager comes by to collect output.  You can do this truly asynchronously, but pushing large amounts of data into the MPI fabric is considered bad form; pulling is much more efficient.
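
If I read that right, the protected object might look roughly like the 
sketch below. The names and the plain ring of descriptors are mine, with 
no claim to match the original design:

```ada
with System;

package Buffer_Pipes is

   --  A descriptor points at a filled buffer instead of copying data.
   type Descriptor is record
      Data   : System.Address;
      Length : Natural;
   end record;

   type Descriptor_Ring is array (Positive range <>) of Descriptor;

   protected type Pipe (Size : Positive) is
      entry Put (D : Descriptor);         --  producer publishes a buffer
      entry Get (D : out Descriptor);     --  storage manager pulls one
      procedure Free (Count : Positive);  --  consumed, reusable again
   private
      Ring        : Descriptor_Ring (1 .. Size);
      First, Last : Positive := 1;
      Filled      : Natural  := 0;  --  published, not yet pulled
      In_Flight   : Natural  := 0;  --  pulled, not yet freed
   end Pipe;

end Buffer_Pipes;

package body Buffer_Pipes is

   protected body Pipe is

      entry Put (D : Descriptor) when Filled + In_Flight < Size is
      begin
         Ring (Last) := D;
         Last   := Last mod Size + 1;
         Filled := Filled + 1;
      end Put;

      entry Get (D : out Descriptor) when Filled > 0 is
      begin
         D := Ring (First);
         First  := First mod Size + 1;
         Filled := Filled - 1;
         In_Flight := In_Flight + 1;
      end Get;

      procedure Free (Count : Positive) is
      begin
         In_Flight := In_Flight - Natural'Min (Count, In_Flight);
      end Free;

   end Pipe;

end Buffer_Pipes;
```

The Size discriminant is the main tuning knob mentioned above; the 
pulling side sets the pace simply by how often it calls Get.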

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: RFC: Prototype for a user threading library in Ada
  2016-06-25  7:11                 ` Dmitry A. Kazakov
@ 2016-06-26  2:02                   ` rieachus
  2016-06-26  6:26                     ` Dmitry A. Kazakov
  0 siblings, 1 reply; 72+ messages in thread
From: rieachus @ 2016-06-26  2:02 UTC (permalink / raw)


On Saturday, June 25, 2016 at 3:11:57 AM UTC-4, Dmitry A. Kazakov wrote:
 
> The problem at hand is that parallel asynchronous code cannot be 
> designed in a reasonable way, at all. It is written it in a form of a 
> state machine, logically and technically *gotos*. It is the state of 
> software engineering of 70s wrapped in Ada constructs.

Um, I've always thought it funny that Ada state machines and Ada tasking mix nicely, and both need to be at the library level to work efficiently. I always used LALR on Multics, which could handle LALR(k) grammars for any k, and even some non-LR grammars.  But it could also take a source file which mixed PL/I, C, Fortran, or Ada code with the grammar productions, and generate a (huge) source file which used a table-driven engine.

If, and I understand it is a big if, you could map the input to one or more independent sequential files, everything was wonderful.  However, if you wanted to handle interacting inputs you were up a creek.  This limitation is not as bad as it sounds.  To take an aircraft radio, there would be several subsystems that could be handled separately. The interrupt-causing events (from operator input) could be mixed into the data stream:  operating_state ::= (receiver_input) | (transmission) | (interrupt) (operating_state);

I can't imagine building a complex real-time system without a grammar tool and  Lui Sha & Goodenough real-time scheduling support tools.



^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: RFC: Prototype for a user threading library in Ada
  2016-06-24 21:06         ` Hadrien Grasland
@ 2016-06-26  3:09           ` Randy Brukardt
  2016-06-26  6:41             ` Dmitry A. Kazakov
  2016-06-26  9:09             ` Hadrien Grasland
  0 siblings, 2 replies; 72+ messages in thread
From: Randy Brukardt @ 2016-06-26  3:09 UTC (permalink / raw)


"Hadrien Grasland" <hadrien.grasland@gmail.com> wrote in message 
news:f4c3e42a-1aa1-4667-83d7-de6f81a8fdd2@googlegroups.com...
...
>> I'd like to understand better the motivations for these features, so if 
>> you
>> (or anyone else) wants to try to explain them to me, feel free. (But keep 
>> in
>> mind that I tend to be hard to convince of anything these days, so don't
>> bother if you're going to give up easily. ;-)
>
>See above. Being able to easily write highly concurrent code is of limited 
>use
>if said code ends up running with terrible performance because modern OSs
>are not at all optimized for this kind of workload. We shouldn't need to
>worry about how our users' OS kernels are setup, and user threading and
>coroutines are a solution to this problem.

Only if you want to make the user work even harder than ever.

It seems to me that the problem is with the "typical" Ada implementation 
more than with the expressiveness of features, when it comes to highly 
parallel implementations. Mapping tasks directly to OS threads only works if 
the number of tasks is small. So if it hurts when you do that, then DON'T DO 
THAT!! :-)

There's no reason for any particular mapping of Ada tasks to OS threads. I 
agree with you that the best plan is most likely having a number of threads 
roughly the same as the number of cores (although that could vary for a 
highly I/O intensive task). Ada already exposes ways to map tasks to cores, 
and that clearly could be extended slightly to manage the tasking system's 
mapping of threads to tasks.

I use Ada because I want it to prevent or detect all of my programming 
mistakes before I have to debug something. (I HATE debugging!). I'd like to 
extend that to parallel programming to the extent possible. I don't want to 
be adding major new features that will make that harder.

                                            Randy.



^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: RFC: Prototype for a user threading library in Ada
  2016-06-25  6:28             ` Dmitry A. Kazakov
  2016-06-26  1:34               ` rieachus
@ 2016-06-26  3:21               ` Randy Brukardt
  2016-06-26  6:15                 ` Dmitry A. Kazakov
  1 sibling, 1 reply; 72+ messages in thread
From: Randy Brukardt @ 2016-06-26  3:21 UTC (permalink / raw)


"Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> wrote in message 
news:nkl8bm$19q7$1@gioia.aioe.org...
> On 2016-06-24 02:38, rieachus@comcast.net wrote:
>> I don't get it.  If this is your "motivation":
>>
>>> The motivation is a two-liner. Let you have some consumer of data:
>>>
>>>    procedure Write (Buffer : String; Last : out Integer);
>>>
>>> It may take less than the whole string when called, but will take more
>>> data later. So, the parameter Last. Now you want to write a program in a
>>> *normal* way:
>>>
>>>    Write ("This");
>>>    Write ("That");
>>>
>>> That's it.
>>
>> You may want to make your Last parameter in or in out, but that's a 
>> detail.
>
> It is not a detail. The caller of Write does not know how much data the 
> transport layer is ready to accept. That is the nature of non-blocking 
> I/O. Write takes as much data it can and tells through Last where the 
> caller must continue *later*.
>
> A blocking busy-waiting wrapper looks this way:
>
>    procedure Write (Buffer : String) is
>       First : Integer := Buffer'First;
>       Last  : Integer;
>    begin
>       loop
>          Write (Buffer (First..Buffer'Last), Last);
>          exit when Last = Buffer'Last;
>          First := Last + 1;
>       end loop;
>    end Write;

You forgot the "delay 0.0;" (or "yield;" as Ada 2012 aliased it):

    procedure Write (Buffer : String) is
       First : Integer := Buffer'First;
       Last  : Integer;
    begin
       loop
          Write (Buffer (First..Buffer'Last), Last);
          exit when Last = Buffer'Last;
          First := Last + 1;
          delay 0.0;
       end loop;
    end Write;

The delay gives up the processor so that other tasks can run. It's how most 
I/O works in Janus/Ada. (Note that if you know something about the latencies 
of Write, you can use a better value than "0.0", so you don't try again 
until you're pretty sure the system is ready. For instance, for socket I/O 
we typically use 0.01, so the program isn't churning and doing nothing. 
[Although we usually use a system of increasing delays, so that if it is 
ready almost immediately, we're continuing to write, else we're waiting 
longer and not wasting time trying something that's unlikely to be ready.])
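A sketch of that increasing-delay scheme, reusing the non-blocking Write from above (the initial 0.01 s delay and the 0.1 s cap are illustrative guesses, not Janus/Ada's actual values):

```ada
procedure Write (Buffer : String) is
   First   : Integer  := Buffer'First;
   Last    : Integer;
   Backoff : Duration := 0.01;  --  assumed initial socket latency
begin
   loop
      Write (Buffer (First .. Buffer'Last), Last);
      exit when Last = Buffer'Last;
      First := Last + 1;
      delay Backoff;  --  give up the processor
      --  Double the wait each time, up to an assumed 0.1 s cap: retry
      --  quickly if the system recovers fast, stop churning if it doesn't.
      Backoff := Duration'Min (Backoff * 2, 0.1);
   end loop;
end Write;
```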

That, combined with proper tasking runtimes, would seem to provide better 
results than doing all one thing (task = thread) or all the other (some 
fancy coroutine system).

                                      Randy.



^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: RFC: Prototype for a user threading library in Ada
  2016-06-26  3:21               ` Randy Brukardt
@ 2016-06-26  6:15                 ` Dmitry A. Kazakov
  2016-06-28 20:44                   ` Anh Vo
  2016-07-02  4:13                   ` Randy Brukardt
  0 siblings, 2 replies; 72+ messages in thread
From: Dmitry A. Kazakov @ 2016-06-26  6:15 UTC (permalink / raw)


On 2016-06-26 05:21, Randy Brukardt wrote:
> "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> wrote in message
> news:nkl8bm$19q7$1@gioia.aioe.org...
>> On 2016-06-24 02:38, rieachus@comcast.net wrote:
>>> I don't get it.  If this is your "motivation":
>>>
>>>> The motivation is a two-liner. Let you have some consumer of data:
>>>>
>>>>    procedure Write (Buffer : String; Last : out Integer);
>>>>
>>>> It may take less than the whole string when called, but will take more
>>>> data later. So, the parameter Last. Now you want to write a program in a
>>>> *normal* way:
>>>>
>>>>    Write ("This");
>>>>    Write ("That");
>>>>
>>>> That's it.
>>>
>>> You may want to make your Last parameter in or in out, but that's a
>>> detail.
>>
>> It is not a detail. The caller of Write does not know how much data the
>> transport layer is ready to accept. That is the nature of non-blocking
>> I/O. Write takes as much data it can and tells through Last where the
>> caller must continue *later*.
>>
>> A blocking busy-waiting wrapper looks this way:
>>
>>    procedure Write (Buffer : String) is
>>       First : Integer := Buffer'First;
>>       Last  : Integer;
>>    begin
>>       loop
>>          Write (Buffer (First..Buffer'Last), Last);
>>          exit when Last = Buffer'Last;
>>          First := Last + 1;
>>       end loop;
>>    end Write;
>
> You forgot the "delay 0.0;"

I didn't. With yielding the processor it would no longer be busy-waiting.

> That, combined with proper tasking runtimes, would seem to provide better
> results than doing all one thing (task = thread) or all the other (some
> fancy coroutine system).

Not really. The whole point is that in the imaginary case under 
consideration you don't need a timer event in order to wake Write up. I 
presume that there is an I/O event that tells this:

    procedure Write (Buffer : String) is
       First : Integer := Buffer'First;
       Last  : Integer;
    begin
       loop
          Write (Buffer (First..Buffer'Last), Last);
          exit when Last = Buffer'Last;
          First := Last + 1;
          Output_Buffer.Wait_For_State (Not_Full); -- A PO's entry call
       end loop;
    end Write;

Now the code is exactly the same for a task and a co-routine. What is 
left to remove is the overhead of thread scheduling.
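The Output_Buffer object assumed above might be sketched as a protected entry family, with a Signal procedure (a hypothetical name) called from the I/O event handler:

```ada
--  Illustrative sketch only: the state names and the Signal procedure
--  are assumptions, not part of the original example.
type Buffer_State is (Not_Full, Full);

protected Output_Buffer is
   --  Block the caller until the buffer reaches the requested state;
   --  an entry family indexed by Buffer_State.
   entry Wait_For_State (Buffer_State);
   --  Called from the transport layer's I/O event handler.
   procedure Signal (New_State : Buffer_State);
private
   Current : Buffer_State := Not_Full;
end Output_Buffer;

protected body Output_Buffer is

   entry Wait_For_State (for S in Buffer_State) when Current = S is
   begin
      null;  --  nothing to do: the barrier itself did the waiting
   end Wait_For_State;

   procedure Signal (New_State : Buffer_State) is
   begin
      Current := New_State;  --  barriers are re-evaluated, releasing waiters
   end Signal;

end Output_Buffer;
```

With this, Output_Buffer.Wait_For_State (Not_Full) is an ordinary entry-family call, as in the loop above.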

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: RFC: Prototype for a user threading library in Ada
  2016-06-26  2:02                   ` rieachus
@ 2016-06-26  6:26                     ` Dmitry A. Kazakov
  0 siblings, 0 replies; 72+ messages in thread
From: Dmitry A. Kazakov @ 2016-06-26  6:26 UTC (permalink / raw)


On 2016-06-26 04:02, rieachus@comcast.net wrote:
> On Saturday, June 25, 2016 at 3:11:57 AM UTC-4, Dmitry A. Kazakov wrote:
>
>> The problem at hand is that parallel asynchronous code cannot be
>> designed in a reasonable way, at all. It is written it in a form of a
>> state machine, logically and technically *gotos*. It is the state of
>> software engineering of 70s wrapped in Ada constructs.
>
> Um, I've always thought it funny that Ada state machines and Ada
> tasking mix nicely, and both need to be at the library level to work
> efficiently. I always used LALR on Multics which could handle LALR(k)
> grammars for any k, and even some non-LR grammars. But it could also
> take a source file which mixed PL/I, C, Fortran, or Ada code with the
> grammar productions, and generate a (huge) source file which used a
> table driven engine.
>
> If, and I understand it is a big if you could map the input to one
> or  more independent sequential file, everything was wonderful. However if
> you wanted to handle interacting inputs you were up a creek. This
> limitation is not as bad as it sounds. To take am aircraft radio, there
> would be several subsystems that could be handled separately. The
> interrupt causing events (from operator input) could be mixed into the
> data stream: operating_state ::= (receiver_input) | (transmission)
> |(interrupt) (operating_state);
>
> I can't imagine building a complex real-time system without a
> grammar  tool and Lui Sha & Goodenough real-time scheduling support tools.

In my opinion formal grammars are one of the most useless CS inventions, 
but that is beside the point. Which is: yes, it would be interesting to 
try co-routines for writing parsers. It may or may not work.

From the software design POV it is a choice what is to be handled as a 
set of callbacks vs. as a control flow sequence. A co-routine is a method 
to convert a callback into a sequence. It is not always useful and not 
always possible. If callbacks are short and independent, and there is 
little volatile state to share and keep between callbacks, they had 
better remain callbacks.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: RFC: Prototype for a user threading library in Ada
  2016-06-26  3:09           ` Randy Brukardt
@ 2016-06-26  6:41             ` Dmitry A. Kazakov
  2016-07-02  4:21               ` Randy Brukardt
  2016-06-26  9:09             ` Hadrien Grasland
  1 sibling, 1 reply; 72+ messages in thread
From: Dmitry A. Kazakov @ 2016-06-26  6:41 UTC (permalink / raw)


On 2016-06-26 05:09, Randy Brukardt wrote:
> "Hadrien Grasland" <hadrien.grasland@gmail.com> wrote in message
> news:f4c3e42a-1aa1-4667-83d7-de6f81a8fdd2@googlegroups.com...
> ....
> It seems to me that the problem is with the "typical" Ada implementation
> more than with the expressiveness of features, when it comes to highly
> parallel implementations. Mapping tasks directly to OS threads only works if
> the number of tasks is small. So if it hurts when you do that, then DON'T DO
> THAT!! :-)
>
> There's no reason for any particular mapping of Ada tasks to OS threads.

Ah, but there is a reason. The OS can switch threads on I/O events. If 
the Ada RTS could do this without mapping tasks to threads, fine. Apparently 
it cannot without imposing the same or higher overhead than OS threads have. 
Co-routines could offer something in between.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: RFC: Prototype for a user threading library in Ada
  2016-06-26  3:09           ` Randy Brukardt
  2016-06-26  6:41             ` Dmitry A. Kazakov
@ 2016-06-26  9:09             ` Hadrien Grasland
  2016-07-02  4:36               ` Randy Brukardt
  1 sibling, 1 reply; 72+ messages in thread
From: Hadrien Grasland @ 2016-06-26  9:09 UTC (permalink / raw)


Le dimanche 26 juin 2016 05:09:25 UTC+2, Randy Brukardt a écrit :
> "Hadrien Grasland" wrote in message...
> ...
> >> I'd like to understand better the motivations for these features, so if 
> >> you
> >> (or anyone else) wants to try to explain them to me, feel free. (But keep 
> >> in
> >> mind that I tend to be hard to convince of anything these days, so don't
> >> bother if you're going to give up easily. ;-)
> >
> >See above. Being able to easily write highly concurrent code is of limited 
> >use
> >if said code ends up running with terrible performance because modern OSs
> >are not at all optimized for this kind of workload. We shouldn't need to
> >worry about how our users' OS kernels are setup, and user threading and
> >coroutines are a solution to this problem.
> 
> Only if you want to make the user work even harder than ever.

Not necessarily so; good higher-level abstractions can help here. However, I definitely agree with your following point:


> It seems to me that the problem is with the "typical" Ada implementation 
> more than with the expressiveness of features, when it comes to highly 
> parallel implementations. Mapping tasks directly to OS threads only works if 
> the number of tasks is small. So if it hurts when you do that, then DON'T DO 
> THAT!! :-)

Yes, the problem could be solved at the Ada implementation level. That would also help greatly with the abstraction side of things, as "natural" Ada abstractions could be made to work as expected (see below).

However, any code which relies on this implementation characteristic would then become unportable, unless the standard also imposes that all implementations follow this path. Would that really be a reasonable request to make?


> There's no reason for any particular mapping of Ada tasks to OS threads. I 
> agree with you that the best plan is most likely having a number of threads 
> roughly the same as the number of cores (although that could vary for a 
> highly I/O intensive task). Ada already exposes ways to map tasks to cores, 
> and that clearly could be extended slightly to manage the tasking system's 
> mapping of threads to tasks.
> 
> I use Ada because I want it to prevent or detect all of my programming 
> mistakes before I have to debug something. (I HATE debugging!). I'd like to 
> extend that to parallel programming to the extent possible. I don't want to 
> be adding major new features that will make that harder.

An Ada implementation which would want to make the life of concurrent programmers easier could do the following things:

1/ Keep the number of OS threads low (about the number of CPU cores, a bit more for I/O), and map tasks to threads in a 1:N fashion.
2/ Make sure that any Ada feature which blocks tasks does the right thing by switching to another task and taking care of waking up the blocked task later, instead of just blocking the underlying OS thread.
3/ Make sure that the Ada standard library implementation behaves in a similarly sensible way, by replacing blocking system calls with nonblocking alternatives.

That is essentially how the Go programming language designers designed their tasking model, so it is possible to do it in a newly created programming language/implementation. But how hard would it be to retrofit it into an existing Ada implementation? This I could not tell.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: RFC: Prototype for a user threading library in Ada
  2016-06-26  6:15                 ` Dmitry A. Kazakov
@ 2016-06-28 20:44                   ` Anh Vo
  2016-07-02  4:13                   ` Randy Brukardt
  1 sibling, 0 replies; 72+ messages in thread
From: Anh Vo @ 2016-06-28 20:44 UTC (permalink / raw)


On Saturday, June 25, 2016 at 11:15:48 PM UTC-7, Dmitry A. Kazakov wrote:
> On 2016-06-26 05:21, Randy Brukardt wrote:
> > "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> wrote in message
> > news:nkl8bm$19q7$1@gioia.aioe.org...
> >> On 2016-06-24 02:38, rieachus@comcast.net wrote:
> >>> I don't get it.  If this is your "motivation":
> >>>
> >>>> The motivation is a two-liner. Let you have some consumer of data:
> >>>>
> >>>>    procedure Write (Buffer : String; Last : out Integer);
> >>>>
> >>>> It may take less than the whole string when called, but will take more
> >>>> data later. So, the parameter Last. Now you want to write a program in a
> >>>> *normal* way:
> >>>>
> >>>>    Write ("This");
> >>>>    Write ("That");
> >>>>
> >>>> That's it.
> >>>
> >>> You may want to make your Last parameter in or in out, but that's a
> >>> detail.
> >>
> >> It is not a detail. The caller of Write does not know how much data the
> >> transport layer is ready to accept. That is the nature of non-blocking
> >> I/O. Write takes as much data it can and tells through Last where the
> >> caller must continue *later*.
> >>
> >> A blocking busy-waiting wrapper looks this way:
> >>
> >>    procedure Write (Buffer : String) is
> >>       First : Integer := Buffer'First;
> >>       Last  : Integer;
> >>    begin
> >>       loop
> >>          Write (Buffer (First..Buffer'Last), Last);
> >>          exit when Last = Buffer'Last;
> >>          First := Last + 1;
> >>       end loop;
> >>    end Write;
> >
> > You forgot the "delay 0.0;"
> 
> I didn't. With yielding processor it would no more be busy-waiting.
> 
> > That, combined with proper tasking runtimes, would seem to provide better
> > results than doing all one thing (task = thread) or all the other (some
> > fancy coroutine system).
> 
> Not really. The whole point is that in the imaginary case under 
> consideration you don't need a timer event in order to wake Write up. I 
> presume that there is an I/O event that tells this:
> 
>     procedure Write (Buffer : String) is
>        First : Integer := Buffer'First;
>        Last  : Integer;
>     begin
>        loop
>           Write (Buffer (First..Buffer'Last), Last);
>           exit when Last = Buffer'Last;
>           First := Last + 1;
>           Output_Buffer.Wait_For_State (Not_Full); -- A PO's entry call
>        end loop;
>     end Write;
 
This procedure works correctly if the entire buffer is written at once. However, it will not work as intended if a partial buffer is written first. Thus, the exit statement will never be true. One of the fixes is to replace it, after First is recalculated, with an "exit when First > Buffer'Last" statement.
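A sketch of the wrapper with that change applied (still assuming the non-blocking Write and the Output_Buffer object from upthread):

```ada
procedure Write (Buffer : String) is
   First : Integer := Buffer'First;
   Last  : Integer;
begin
   loop
      Write (Buffer (First .. Buffer'Last), Last);
      First := Last + 1;
      exit when First > Buffer'Last;  --  holds even after partial writes
      Output_Buffer.Wait_For_State (Not_Full);  --  a PO's entry call
   end loop;
end Write;
```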

Anh Vo


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: RFC: Prototype for a user threading library in Ada
  2016-06-26  6:15                 ` Dmitry A. Kazakov
  2016-06-28 20:44                   ` Anh Vo
@ 2016-07-02  4:13                   ` Randy Brukardt
  2016-07-02 10:25                     ` Dmitry A. Kazakov
  1 sibling, 1 reply; 72+ messages in thread
From: Randy Brukardt @ 2016-07-02  4:13 UTC (permalink / raw)


"Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> wrote in message 
news:nknrui$15bn$1@gioia.aioe.org...
> On 2016-06-26 05:21, Randy Brukardt wrote:
>> "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> wrote in message
>> news:nkl8bm$19q7$1@gioia.aioe.org...
>>> On 2016-06-24 02:38, rieachus@comcast.net wrote:
>>>> I don't get it.  If this is your "motivation":
>>>>
>>>>> The motivation is a two-liner. Let you have some consumer of data:
>>>>>
>>>>>    procedure Write (Buffer : String; Last : out Integer);
>>>>>
>>>>> It may take less than the whole string when called, but will take more
>>>>> data later. So, the parameter Last. Now you want to write a program in 
>>>>> a
>>>>> *normal* way:
>>>>>
>>>>>    Write ("This");
>>>>>    Write ("That");
>>>>>
>>>>> That's it.
>>>>
>>>> You may want to make your Last parameter in or in out, but that's a
>>>> detail.
>>>
>>> It is not a detail. The caller of Write does not know how much data the
>>> transport layer is ready to accept. That is the nature of non-blocking
>>> I/O. Write takes as much data it can and tells through Last where the
>>> caller must continue *later*.
>>>
>>> A blocking busy-waiting wrapper looks this way:
>>>
>>>    procedure Write (Buffer : String) is
>>>       First : Integer := Buffer'First;
>>>       Last  : Integer;
>>>    begin
>>>       loop
>>>          Write (Buffer (First..Buffer'Last), Last);
>>>          exit when Last = Buffer'Last;
>>>          First := Last + 1;
>>>       end loop;
>>>    end Write;
>>
>> You forgot the "delay 0.0;"
>
> I didn't. With yielding processor it would no more be busy-waiting.

Nothing wrong with busy-waiting; it gets a bad rep.

>> That, combined with proper tasking runtimes, would seem to provide better
>> results than doing all one thing (task = thread) or all the other (some
>> fancy coroutine system).
>
> Not really. The whole point is that in the imaginary case under 
> consideration you don't need a timer event in order to wake Write up. I 
> presume that there is an I/O event that tells this:
>
>    procedure Write (Buffer : String) is
>       First : Integer := Buffer'First;
>       Last  : Integer;
>    begin
>       loop
>          Write (Buffer (First..Buffer'Last), Last);
>          exit when Last = Buffer'Last;
>          First := Last + 1;
>          Output_Buffer.Wait_For_State (Not_Full); -- A PO's entry call
>       end loop;
>    end Write;
>
> Now the code is exactly same for a task and a co-routine. What is left is 
> the overhead of thread scheduling to remove.

I agree this is better than using Yield. If this sort of formulation is 
possible, it ought to be used. And then there is no problem -- there's no 
thread scheduling overhead with Janus/Ada, 'cause there are no threads! :-)

Seriously, I'm thinking that thread scheduling overhead could be 
reduced/eliminated by the task supervisor, and thus there's no real problem 
with writing Ada tasks this way -- except of course that existing 
implementations don't try to do this sort of optimization.

Or I could just be mad. :-) [When it comes to tasking, I know just enough to 
be dangerous. ;-)]

                               Randy.





^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: RFC: Prototype for a user threading library in Ada
  2016-06-26  6:41             ` Dmitry A. Kazakov
@ 2016-07-02  4:21               ` Randy Brukardt
  2016-07-02 10:33                 ` Dmitry A. Kazakov
  0 siblings, 1 reply; 72+ messages in thread
From: Randy Brukardt @ 2016-07-02  4:21 UTC (permalink / raw)


"Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> wrote in message 
news:nknteq$177m$1@gioia.aioe.org...
> On 2016-06-26 05:09, Randy Brukardt wrote:
>> "Hadrien Grasland" <hadrien.grasland@gmail.com> wrote in message
>> news:f4c3e42a-1aa1-4667-83d7-de6f81a8fdd2@googlegroups.com...
>> ....
>> It seems to me that the problem is with the "typical" Ada implementation
>> more than with the expressiveness of features, when it comes to highly
>> parallel implementations. Mapping tasks directly to OS threads only works 
>> if
>> the number of tasks is small. So if it hurts when you do that, then DON'T 
>> DO
>> THAT!! :-)
>>
>> There's no reason for any particular mapping of Ada tasks to OS threads.
>
> Ah, but there is the reason. The OS can switch threads on I/O events. If 
> Ada RTS could do this without mapping tasks to threads, fine. Apparently 
> it cannot without imposing same or higher overhead OS threads have.

Dunno; I doubt that it has really been tried. If that were really impossible, 
virtualization wouldn't work either, and there's lots of evidence that works 
fine.

The problem (if there is a problem at all) is fast character-at-a-time I/O - 
but that's a bad idea for lots of reasons (it doesn't work very well for a 
sequential program either), so perhaps it wouldn't matter that much.

> Co-routines could offer something in between.

Sure, but I don't see any need to expose that. Certainly not beyond an 
aspect much like "Inline", which is just a suggestion to the compiler to do 
something. It's better to leave implementation decisions (and that's all 
this is) to implementations!

                               Randy.


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: RFC: Prototype for a user threading library in Ada
  2016-06-26  9:09             ` Hadrien Grasland
@ 2016-07-02  4:36               ` Randy Brukardt
  2016-07-02  5:30                 ` Simon Wright
  2016-07-02 11:13                 ` Hadrien Grasland
  0 siblings, 2 replies; 72+ messages in thread
From: Randy Brukardt @ 2016-07-02  4:36 UTC (permalink / raw)



"Hadrien Grasland" <hadrien.grasland@gmail.com> wrote in message 
news:1e32c714-34cf-4828-81fc-6b7fd77e4532@googlegroups.com...
>Le dimanche 26 juin 2016 05:09:25 UTC+2, Randy Brukardt a écrit :
>> "Hadrien Grasland" wrote in message...
>> ...
>> It seems to me that the problem is with the "typical" Ada implementation
>> more than with the expressiveness of features, when it comes to highly
>> parallel implementations. Mapping tasks directly to OS threads only works 
>> if
>> the number of tasks is small. So if it hurts when you do that, then DON'T 
>> DO
>> THAT!! :-)
>
>Yes, the problem could be solved at the Ada implementation level. That 
>would also help
>greatly with the abstraction side of things, as "natural" Ada abstractions 
>could be made to
>work as expected (see below).
>
>However, any code which relies on this implementation characteristic would 
>then become
>unportable, unless the standard also imposes that all implementations 
>follow this path.
>Would that really be a reasonable request to make?

Performance is naturally unportable. You'll get rather different performance 
characteristics with Janus/Ada and GNAT on the same machine, even though 
they're both Ada implementations. It's the nature of the beast -- if they 
really were identical, who'd need more than one implementation?? If you're 
really depending on performance characteristics that much, you're probably 
not even portable to a different CPU or disk system anyway.

Still, the language could help with aspects or library calls to provide 
suggestions to the compiler/runtime. We already have stuff like that (Inline 
and Pack come to mind).

>> I use Ada because I want it to prevent or detect all of my programming
>> mistakes before I have to debug something. (I HATE debugging!). I'd like 
>> to
>> extend that to parallel programming to the extent possible. I don't want 
>> to
>> be adding major new features that will make that harder.

>An Ada implementation which would want to make the life of concurrent
>programmers easier could do the following things:

>1/ Keep the amount of OS threads low (about the amount of CPU cores, a bit 
>more
>for I/O), and map tasks to threads in an 1:N fashion.

Reasonably easy. (For Windows, stack limitations might be a problem; not a 
problem on a bare target.)

>2/ Make sure that any Ada feature which blocks tasks does the right thing 
>by switching
>to another task and taking care of waking up the blocked task later, 
>instead of just
>blocking the underlying OS thread.

I think this follows from (1) and the Ada semantics. Blocking the underlying 
thread wouldn't properly implement the semantics.

>3/ Make sure that the Ada standard library implementation behaves in a 
>similarly sensible
>way, by replacing blocking system calls with nonblocking alternatives.

We didn't do that because it leads to unacceptable performance for silly 
I/O, that is stuff like:
     Write (File, Char);

It would make sense to revisit that.

>That is essentially what the Go programming language designers designed 
>their tasking
>model, so it is possible to do it in a newly created programming 
>language/implementation.
>But how hard would it be to retrofit it inside an existing Ada 
>implementation? This I could
>not tell.

(1) and (2) describe how Janus/Ada works, with the obvious exception that 
the number of underlying threads is 1. We didn't do (3) because it makes 
silly (unbuffered) I/O quite slow. But we don't do much of that anymore 
(unbuffered I/O is slow by itself -- I/O calls are themselves pretty 
slow) -- sometime in the 1990s I redid the I/O system to be able to figure 
out the difference between files (which can and should be buffered) and 
other kinds of devices (for which buffering can be a nuisance or worse - 
consider a buffered keyboard!). So this wouldn't be anywhere near the 
problem it once was.

Ergo, I don't think this is a problem at all.
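As an illustration of that files-vs-devices distinction (not Janus/Ada's actual I/O system; the type names and the 4096-character buffer are assumptions):

```ada
procedure Buffered_IO_Sketch is

   Buffer_Size : constant := 4096;  --  assumed buffer size

   type Device_Kind is (Disk_File, Keyboard, Socket);

   type File_Handle (Kind : Device_Kind := Disk_File) is record
      case Kind is
         when Disk_File =>
            --  Files can and should be buffered.
            Buffer : String (1 .. Buffer_Size);
            Fill   : Natural := 0;
         when others =>
            --  Buffering a keyboard or socket can be a nuisance or worse.
            null;
      end case;
   end record;

   procedure Raw_Write (Char : Character) is null;
   --  stand-in for the (slow) OS write call

   procedure Flush (File : in out File_Handle) is
   begin
      for I in 1 .. File.Fill loop
         Raw_Write (File.Buffer (I));
      end loop;
      File.Fill := 0;
   end Flush;

   procedure Write (File : in out File_Handle; Char : Character) is
   begin
      case File.Kind is
         when Disk_File =>
            File.Fill := File.Fill + 1;
            File.Buffer (File.Fill) := Char;
            if File.Fill = Buffer_Size then
               Flush (File);  --  one OS call per Buffer_Size characters
            end if;
         when others =>
            Raw_Write (Char);  --  unbuffered: one OS call per character
      end case;
   end Write;

   F : File_Handle (Disk_File);

begin
   for I in 1 .. 10_000 loop
      Write (F, 'x');  --  silly character-at-a-time I/O, now cheap
   end loop;
   Flush (F);
end Buffered_IO_Sketch;
```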

But I don't think that really helps the race and deadlock issues that are 
the real problem with programming with Ada tasks. I'd like to find some help 
there, too.

                                    Randy.


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: RFC: Prototype for a user threading library in Ada
  2016-07-02  4:36               ` Randy Brukardt
@ 2016-07-02  5:30                 ` Simon Wright
  2016-07-05 21:29                   ` Randy Brukardt
  2016-07-02 11:13                 ` Hadrien Grasland
  1 sibling, 1 reply; 72+ messages in thread
From: Simon Wright @ 2016-07-02  5:30 UTC (permalink / raw)


"Randy Brukardt" <randy@rrsoftware.com> writes:

> "Hadrien Grasland" <hadrien.grasland@gmail.com> wrote in message 

>>An Ada implementation which would want to make the life of concurrent
>>programmers easier could do the following things:
>
>>1/ Keep the amount of OS threads low (about the amount of CPU cores, a
>>bit more for I/O), and map tasks to threads in an 1:N fashion.
>
> Reasonably easy. (For Windows, stack limitations might be a problem;
> not a problem on a bare target.)

Most things are a problem on a bare target.

Or do you mean "limitations that Windows places on the _use_ of the
stack"?


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: RFC: Prototype for a user threading library in Ada
  2016-07-02  4:13                   ` Randy Brukardt
@ 2016-07-02 10:25                     ` Dmitry A. Kazakov
  2016-07-05 21:53                       ` Randy Brukardt
  0 siblings, 1 reply; 72+ messages in thread
From: Dmitry A. Kazakov @ 2016-07-02 10:25 UTC (permalink / raw)


On 2016-07-02 06:13, Randy Brukardt wrote:

> Seriously, I'm thinking that thread scheduling overhead could be
> reduced/eliminated by the task supervisor, and thus there's no real problem
> with writing Ada tasks this way -- except of course that existing
> implementations don't try to do this sort of optimization.

There could be OS limitations on the number of threads a process may 
have. I was thinking about "user-scheduled" tasks, which are not 
scheduled at all. Some user task explicitly releases one task from the 
pool and gets control back when that task calls an entry. The 
important point, I believe, is preemption. "User-scheduled" tasks would 
not be required to be preemptive.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: RFC: Prototype for a user threading library in Ada
  2016-07-02  4:21               ` Randy Brukardt
@ 2016-07-02 10:33                 ` Dmitry A. Kazakov
  2016-07-05 21:24                   ` Randy Brukardt
  0 siblings, 1 reply; 72+ messages in thread
From: Dmitry A. Kazakov @ 2016-07-02 10:33 UTC (permalink / raw)


On 2016-07-02 06:21, Randy Brukardt wrote:
> "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> wrote in message
> news:nknteq$177m$1@gioia.aioe.org...
>> On 2016-06-26 05:09, Randy Brukardt wrote:
>>> "Hadrien Grasland" <hadrien.grasland@gmail.com> wrote in message
>>> news:f4c3e42a-1aa1-4667-83d7-de6f81a8fdd2@googlegroups.com...
>>> ....
>>> It seems to me that the problem is with the "typical" Ada implementation
>>> more than with the expressiveness of features, when it comes to highly
>>> parallel implementations. Mapping tasks directly to OS threads only works
>>> if
>>> the number of tasks is small. So if it hurts when you do that, then DON'T
>>> DO
>>> THAT!! :-)
>>>
>>> There's no reason for any particular mapping of Ada tasks to OS threads.
>>
>> Ah, but there is the reason. The OS can switch threads on I/O events. If
>> Ada RTS could do this without mapping tasks to threads, fine. Apparently
>> it cannot without imposing same or higher overhead OS threads have.
>
> Dunno; I doubt that it really has been tried. If that was really impossible,
> virtualization wouldn't work either, and there's lots of evidence that works
> fine.

But it does not, in this sense. It can handle only virtual devices. In 
order to access a physical one, you need a software layer. It could be 
quite difficult for the RTS to follow this path.

> The problem (if there is a problem at all) is fast character-at-a-time I/O -
> but that's a bad idea for lots of reasons (it doesn't work very well for a
> sequential program either) so perhaps it wouldn't matter than much.

It will make normal I/O slower and the RTS larger, with compatibility issues.

>> Co-routines could offer something in between.
>
> Sure, but I don't see any need to expose that. Certainly not beyond an
> aspect much like "Inline", which is just a suggestion to the compiler to do
> something. It's better to leave implementation decisions (and that's all
> this is) to implementations!

Well, the difference is that Inline can be safely ignored. A limitation 
on the number of tasks cannot be.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: RFC: Prototype for a user threading library in Ada
  2016-07-02  4:36               ` Randy Brukardt
  2016-07-02  5:30                 ` Simon Wright
@ 2016-07-02 11:13                 ` Hadrien Grasland
  2016-07-02 13:18                   ` Dmitry A. Kazakov
                                     ` (3 more replies)
  1 sibling, 4 replies; 72+ messages in thread
From: Hadrien Grasland @ 2016-07-02 11:13 UTC (permalink / raw)


Le samedi 2 juillet 2016 06:36:09 UTC+2, Randy Brukardt a écrit :
> But I don't think that really helps the race and deadlock issues that are 
> the real problem with programming with Ada tasks. I'd like to find some help 
> there, too.

Here's my view of this: people are heavily overusing shared mutable data in multitasking programs, and that is the source of too many data races, which in turn people attempt to fix with locks, thus killing their performance and introducing deadlocks.

In many cases, better performance and correctness can be easily achieved by moving to asynchronous tasking runtimes, where the runtime internally manages an event-based dependency graph of tasks that can be processed concurrently, and data is kept task-private but can be moved around between tasks.

But shared mutable data is necessary to the efficient implementation of data-parallel algorithms, so it should remain available to expert programmers.

What I would love in the end is some kind of static analysis tool that warns people about shared mutable data in multitasking programs, but gives a way to bypass the warning to people who actually know what they are doing.

But making this warning useful would also entail introducing a notion of "deferred constants", that is, values which are defined at runtime but are not modified subsequently. Otherwise, there would be too many false positives about program parameters that are loaded at runtime.


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: RFC: Prototype for a user threading library in Ada
  2016-07-02 11:13                 ` Hadrien Grasland
@ 2016-07-02 13:18                   ` Dmitry A. Kazakov
  2016-07-02 16:49                     ` Hadrien Grasland
  2016-07-02 17:26                   ` Niklas Holsti
                                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 72+ messages in thread
From: Dmitry A. Kazakov @ 2016-07-02 13:18 UTC (permalink / raw)


On 2016-07-02 13:13, Hadrien Grasland wrote:
> Le samedi 2 juillet 2016 06:36:09 UTC+2, Randy Brukardt a écrit :
>> But I don't think that really helps the race and deadlock issues that are
>> the real problem with programming with Ada tasks. I'd like to find some help
>> there, too.
>
> Here's my view of this: people are heavily overusing shared mutable
> data in multitasking programs,

This is motivated by the hardware architecture of multi-cores with 
shared memory.

> and that is the source of too many data
> races, which in turn people attempt to fix with locks, thus killing
> their performance and introducing deadlocks.
>
> In many cases, better performance and correctness can be easily
> achieved by moving to asynchronous tasking runtimes, where the runtime
> internally manages an event-based dependency graph of tasks that can be
> processed concurrently, and data is kept task-private but can be moved
> around between tasks.

Event-controlled architecture is exposed to generators (live-locks), 
and no less to dead-locks and race conditions.

From the software development POV it is far worse than a shared-memory 
architecture. The only advantage it has over the former is that it can 
be truly scalable, while shared memory is a bottleneck.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: RFC: Prototype for a user threading library in Ada
  2016-07-02 13:18                   ` Dmitry A. Kazakov
@ 2016-07-02 16:49                     ` Hadrien Grasland
  2016-07-02 21:33                       ` Niklas Holsti
  0 siblings, 1 reply; 72+ messages in thread
From: Hadrien Grasland @ 2016-07-02 16:49 UTC (permalink / raw)


Le samedi 2 juillet 2016 15:18:58 UTC+2, Dmitry A. Kazakov a écrit :
> On 2016-07-02 13:13, Hadrien Grasland wrote:
> > Le samedi 2 juillet 2016 06:36:09 UTC+2, Randy Brukardt a écrit :
> >> But I don't think that really helps the race and deadlock issues that are
> >> the real problem with programming with Ada tasks. I'd like to find some help
> >> there, too.
> >
> > Here's my view of this: people are heavily overusing shared mutable
> > data in multitasking programs,
> 
> This is motivated by the hardware architecture of multi-cores with 
> shared memory.

Fair enough, but I would argue that outside the realm of data-parallel algorithms, most of the benefits of shared-memory architectures reside in sharing constant data, not mutable data.

In any hardware architecture that features caches, and with any compiler whose memory model allows instruction reordering, synchronizing shared mutable data is always going to be problematic. So it is better to do without that synchronization whenever practical.


> > and that is the source of too many data
> > races, which in turn people attempt to fix with locks, thusly killing
> > their performance and introducing deadlocks.
> >
> > In many cases, better performance and correctness can be easily
> > achieved by moving to asynchronous tasking runtimes, where the runtime
> > internally manages an event-based dependency graph of tasks that can be
> > processed concurrently, and data is kept task-private but can be moved
> > around between tasks.
> 
> Event-controlled architecture is exposed to generators (live-locks), 
> and no less to dead-locks and race conditions.
> 
> From the software development POV it is far worse than a shared-memory 
> architecture. The only advantage it has over the former is that it can 
> be truly scalable, while shared memory is a bottleneck.

I would claim that if your architecture is based on a notion of data ownership and favors the model of moving data over that of concurrently accessing it, then...

- Performance does not have to suffer (if what you move around is pointers or references to data, not the data itself)
- Race conditions in the message-passing process are easy to deal with (just do a memory fence before a thread sends data and after another receives it)

As for deadlocks in event-driven programs, they do occur, but they are trivial to detect (since the task graph is explicit, one can just look for a cycle in it), and good interface design can make it hard to trigger accidentally (if the event produced by a task A cannot be easily used as a dependency of that same task).


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: RFC: Prototype for a user threading library in Ada
  2016-07-02 11:13                 ` Hadrien Grasland
  2016-07-02 13:18                   ` Dmitry A. Kazakov
@ 2016-07-02 17:26                   ` Niklas Holsti
  2016-07-02 21:14                   ` Niklas Holsti
  2016-07-05 21:38                   ` Randy Brukardt
  3 siblings, 0 replies; 72+ messages in thread
From: Niklas Holsti @ 2016-07-02 17:26 UTC (permalink / raw)


On 16-07-02 14:13 , Hadrien Grasland wrote:
>
> What I would love in the end is some kind of static analysis tool
> that warns people about shared mutable data in multitasking programs,
> but gives a way to bypass the warning to people who actually know
> what they are doing.

AdaControl can do that, and I believe CodePeer can do it too.

> But to make this warning useful would also entail introducing a
> notion of "deferred constants", that is, values which are defined at
> runtime but will not be modified subsequently. Otherwise, there would
> be too many false positives about program parameters that are loaded
> at runtime.

Static analysis tools always have ways to manually silence alarms that 
have been determined (by manual analysis) to be false positives.

-- 
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
       .      @       .

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: RFC: Prototype for a user threading library in Ada
  2016-07-02 11:13                 ` Hadrien Grasland
  2016-07-02 13:18                   ` Dmitry A. Kazakov
  2016-07-02 17:26                   ` Niklas Holsti
@ 2016-07-02 21:14                   ` Niklas Holsti
  2016-07-03  7:42                     ` Hadrien Grasland
  2016-07-05 21:38                   ` Randy Brukardt
  3 siblings, 1 reply; 72+ messages in thread
From: Niklas Holsti @ 2016-07-02 21:14 UTC (permalink / raw)


On 16-07-02 14:13 , Hadrien Grasland wrote:
> Le samedi 2 juillet 2016 06:36:09 UTC+2, Randy Brukardt a écrit :
>> But I don't think that really helps the race and deadlock issues
>> that are the real problem with programming with Ada tasks. I'd like
>> to find some help there, too.
>
> Here's my view of this: people are heavily overusing shared mutable
> data in multitasking programs, and that is the source of too many
> data races, which in turn people attempt to fix with locks, thus
> killing their performance and introducing deadlocks.
>
> In many cases, better performance and correctness can be easily
> achieved by moving to asynchronous tasking runtimes, where the
> runtime internally manages an event-based dependency graph of tasks
> that can be processed concurrently, and data is kept task-private but
> can be moved around between tasks.

Do you have real experience of such "asynchronous tasking runtimes"? In 
Ada or in other languages?

It seems to me that it would be difficult for such a run-time system to 
support the full Ada tasking features, such as conditional entry calls. 
Ada tasks have more control over their own execution than such a 
run-time system would allow.

Moreover, in present Ada it seems to me that the only way to move 
task-private data from one task to another is to send a copy of the data 
from one task to the other. Copying data is often poison for performance.

Perhaps a new, restricted tasking profile would be needed, analogous to 
the Ravenscar profile but aimed not at real-time systems but at parallel 
computation in this event-based style.

-- 
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
       .      @       .

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: RFC: Prototype for a user threading library in Ada
  2016-07-02 16:49                     ` Hadrien Grasland
@ 2016-07-02 21:33                       ` Niklas Holsti
  2016-07-03 20:56                         ` Hadrien Grasland
  0 siblings, 1 reply; 72+ messages in thread
From: Niklas Holsti @ 2016-07-02 21:33 UTC (permalink / raw)


On 16-07-02 19:49 , Hadrien Grasland wrote:

> As for deadlocks in event-driven programs, they do occur, but they
> are trivial to detect (since the task graph is explicit, one can just
> look for a cycle in it),

That is not always enough. In my experience, event-driven programs tend 
to have cycles where one task sends events or "requests" to another, and 
expects "replies" in return. To understand if this causes a deadlock, 
livelock, or works well, one must analyse the conditional control flow 
in the tasks and consider the feasible/infeasible chains of events.

> and good interface design can make it hard
> to trigger accidentally (if the event produced by a task A cannot be
> easily used as a dependency of that same task).

Good design is a cure for all kinds of deadlocks ;-)

I don't know if the request/reply event-cycle can be considered bad 
design. In the examples I've seen, it was implied and required by the 
roles assigned to the tasks. However, these were real-time, embedded 
applications, not parallel computation applications.

-- 
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
       .      @       .

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: RFC: Prototype for a user threading library in Ada
  2016-07-02 21:14                   ` Niklas Holsti
@ 2016-07-03  7:42                     ` Hadrien Grasland
  2016-07-03  8:39                       ` Dmitry A. Kazakov
  0 siblings, 1 reply; 72+ messages in thread
From: Hadrien Grasland @ 2016-07-03  7:42 UTC (permalink / raw)


Le samedi 2 juillet 2016 23:14:30 UTC+2, Niklas Holsti a écrit :
> On 16-07-02 14:13 , Hadrien Grasland wrote:
> > Le samedi 2 juillet 2016 06:36:09 UTC+2, Randy Brukardt a écrit :
> >> But I don't think that really helps the race and deadlock issues
> >> that are the real problem with programming with Ada tasks. I'd like
> >> to find some help there, too.
> >
> > Here's my view of this: people are heavily overusing shared mutable
> > data in multitasking programs, and that is the source of too many
> > data races, which in turn people attempt to fix with locks, thus
> > killing their performance and introducing deadlocks.
> >
> > In many cases, better performance and correctness can be easily
> > achieved by moving to asynchronous tasking runtimes, where the
> > runtime internally manages an event-based dependency graph of tasks
> > that can be processed concurrently, and data is kept task-private but
> > can be moved around between tasks.
> 
> Do you have real experience of such "asynchronous tasking runtimes"? In 
> Ada or in other languages?

I have quite a bit of experience with OpenCL and HPX. OpenCL is based on raw events, while HPX adds some nice syntactic sugar for moving data across tasks in the form of futures.

OpenCL is an API that was designed for offloading parallel computations to pretty much every kind of programmable hardware, so its designers had to pay quite a bit of attention to I/O latency concerns, which naturally led them to embrace an asynchronous tasking model. If you are familiar with the C++11 std::async construct, HPX is essentially that programming model on steroids (more limited in hardware scope than OpenCL, but more pleasant to use).

As for languages, OpenCL is a C API, though I ended up writing a C++ wrapper fairly quickly because the C constructs were hurting my eyes too much. HPX is a C++ API. I am not aware of any similar work in Ada, and since I really like this programming model, I thought I would give it a shot.


> It seems to me that it would be difficult for such a run-time system to 
> support the full Ada tasking features, such as conditional entry calls. 
> Ada tasks have more control over their own execution than such a 
> run-time system would allow.

Yes, to support the full Ada tasking feature set would likely entail getting support from compiler implementors, see the lengthy discussion on coroutines above for an example :)

 
> Moreover, in present Ada it seems to me that the only way to move 
> task-private data from one task to another is to send a copy of the data 
> from one task to the other. Copying data is often poison for performance.

Cheaply moving data around is possible in any language that has heap allocation and pointers. What is more difficult is to provide an easy-to-use syntax around it.


> Perhaps a new, restricted tasking profile would be needed, analogous to 
> the Ravenscar profile but aimed not at real-time systems but at parallel 
> computation in this event-based style.

It is true that for computation purposes, the full Ada tasking model may also be overkill. For example, asynchronous transfer of control can be hard to support, and is rarely needed in compute scenarios.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: RFC: Prototype for a user threading library in Ada
  2016-07-03  7:42                     ` Hadrien Grasland
@ 2016-07-03  8:39                       ` Dmitry A. Kazakov
  2016-07-03 21:15                         ` Hadrien Grasland
  0 siblings, 1 reply; 72+ messages in thread
From: Dmitry A. Kazakov @ 2016-07-03  8:39 UTC (permalink / raw)


On 2016-07-03 09:42, Hadrien Grasland wrote:
> Le samedi 2 juillet 2016 23:14:30 UTC+2, Niklas Holsti a écrit :

>> Moreover, in present Ada it seems to me that the only way to move
>> task-private data from one task to another is to send a copy of the data
>> from one task to the other. Copying data is often poison for performance.
>
> Cheaply moving data around is possible in any language that has heap
> allocation and pointers.

That is the most expensive way of doing it, and on top of that, it 
requires shared memory (a pool) and thus process-global interlocking. 
You have arrived back at the starting point.

BTW, when talking about an asynchronous model, marshaling must be 
asynchronous too. Another drawback of doing it through pointers is 
that it must be atomic = synchronous => you cannot deal with large 
objects, on-demand production models, etc. And this is where the 
event-controlled model stops working, as it separates data from 
data-related events. A proper abstraction must combine everything into 
ADT objects.

> What is more difficult is to provide an easy to
> use syntax around it.

Hmm, what is difficult about procedure call?

>> Perhaps a new, restricted tasking profile would be needed, analogous to
>> the Ravenscar profile but aimed not at real-time systems but at parallel
>> computation in this event-based style.
>
> It is true that for computation purposes, the full Ada tasking model
> may also be overkill. For example, asynchronous transfer of control can
> be hard to support, and is often overkill for compute scenarios.

Well, actually ATC is totally useless for parallel computing because it 
simply does not work when you want to abort an external blocking 
operation. There is a similarity between ATC and co-routines: ATC 
does not work because the OS is not aware of ATC requests. It is 
exactly the same as if a co-routine performed asynchronous I/O -- the 
OS does not know how to continue it. My wild guess is that an Ada RTS 
capable of handling co-routines would easily handle ATC as a 
by-product, and conversely.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: RFC: Prototype for a user threading library in Ada
  2016-07-02 21:33                       ` Niklas Holsti
@ 2016-07-03 20:56                         ` Hadrien Grasland
  0 siblings, 0 replies; 72+ messages in thread
From: Hadrien Grasland @ 2016-07-03 20:56 UTC (permalink / raw)


Le samedi 2 juillet 2016 23:33:19 UTC+2, Niklas Holsti a écrit :
> On 16-07-02 19:49 , Hadrien Grasland wrote:
> 
> > As for deadlocks in event-driven programs, they do occur, but they
> > are trivial to detect (since the task graph is explicit, one can just
> > look for a cycle in it),
> 
> That is not always enough. In my experience, event-driven programs tend 
> to have cycles where one task sends events or "requests" to another, and 
> expects "replies" in return. To understand if this causes a deadlock, 
> livelock, or works well, one must analyse the conditional control flow 
> in the tasks and consider the feasible/infeasible chains of events.
>
>
> > and good interface design can make it hard
> > to trigger accidentally (if the event produced by a task A cannot be
> > easily used as a dependency of that same task).
> 
> Good design is a cure for all kinds of deadlocks ;-)
> 
> I don't know if the request/reply event-cycle can be considered bad 
> design. In the examples I've seen, it was implied and required by the 
> roles assigned to the tasks. However, these were real-time, embedded 
> applications, not parallel computation applications.

In compute code, it is quite common to model a problem as a combination of many simple, single-purpose tasks (e.g. "blend two images together", "compute the histogram of the resulting picture").

In this kind of pipe-and-filter architecture, it is critical to have cheap task creation/scheduling and inter-task communication. But on the pro side, complex task synchronization is less frequently needed, because tasks tend to interact only indirectly, through dataflow.

You avoid modeling I/O as "task A starts an I/O task B, then sleeps until B has completed". Instead, you model it as "task A performs I/O and feeds the output to task B, which is started once all of its inputs are available".

But indeed, I could well see this architecture being less appropriate for other scenarios, such as the event loop of a real-time application or web server.


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: RFC: Prototype for a user threading library in Ada
  2016-07-03  8:39                       ` Dmitry A. Kazakov
@ 2016-07-03 21:15                         ` Hadrien Grasland
  2016-07-04  7:44                           ` Dmitry A. Kazakov
  0 siblings, 1 reply; 72+ messages in thread
From: Hadrien Grasland @ 2016-07-03 21:15 UTC (permalink / raw)


Le dimanche 3 juillet 2016 10:39:48 UTC+2, Dmitry A. Kazakov a écrit :
> On 2016-07-03 09:42, Hadrien Grasland wrote:
> > Le samedi 2 juillet 2016 23:14:30 UTC+2, Niklas Holsti a écrit :
> 
> >> Moreover, in present Ada it seems to me that the only way to move
> >> task-private data from one task to another is to send a copy of the data
> >> from one task to the other. Copying data is often poison for performance.
> >
> > Cheaply moving data around is possible in any language that has heap
> > allocation and pointers.
> 
> That is the most expensive way doing it, and on top of that, it requires 
> shared memory (pool) and thus process-global interlocking. You arrived 
> at the starting point.

Please define what you mean by expensive. Whenever the hardware architecture allows for it, passing a pointer to a heap-allocated data block is almost always faster than deep-copying that data from one thread-private area of memory to another, as long as the data is large enough.

 
> BTW, when talking about asynchronous model, marshaling must be 
> asynchronous too. Another drawback of the method of doing that through 
> pointers is that it must be atomic = synchronous => you could not deal 
> with large objects, on-demand production models etc. And this is where 
> event-controlled model stop working, as it separates data from 
> data-related events. A proper abstraction must combine everything into 
> ADT objects.

Indeed, I also agree that when an event is used to notify a listener that some piece of data has been produced, it is almost always better to combine the event and the data block being produced into a single abstraction: a future.

However, futures are less general than events. In runtimes that only have futures, like HPX, you cannot easily express the completion of procedural operations. You just end up with horrible hacks such as hpx::future<void> ("a future of nothing"). This is why I believe that events are best as a low-level layer, on top of which higher-level abstractions such as futures can be built.


> > What is more difficult is to provide an easy to
> > use syntax around it.
> 
> Hmm, what is difficult about procedure call?

Basically, it is best if producers and consumers do not actually manipulate raw pointers to the heap-allocated blocks.

Otherwise, there is always a risk of a producer retaining access to the produced block and later manipulating it in a racy fashion.

A better option is to have a system where at the point where a producer "emits" data, it loses access to the associated data block. Of course, this is impossible to achieve in a perfectly fool-proof way, but we can make it harder for developers to shoot themselves in the foot, using things like futures.


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: RFC: Prototype for a user threading library in Ada
  2016-07-03 21:15                         ` Hadrien Grasland
@ 2016-07-04  7:44                           ` Dmitry A. Kazakov
  0 siblings, 0 replies; 72+ messages in thread
From: Dmitry A. Kazakov @ 2016-07-04  7:44 UTC (permalink / raw)


On 03/07/2016 23:15, Hadrien Grasland wrote:
> Le dimanche 3 juillet 2016 10:39:48 UTC+2, Dmitry A. Kazakov a écrit :
>> On 2016-07-03 09:42, Hadrien Grasland wrote:
>>> Le samedi 2 juillet 2016 23:14:30 UTC+2, Niklas Holsti a écrit :
>>
>>>> Moreover, in present Ada it seems to me that the only way to move
>>>> task-private data from one task to another is to send a copy of the data
>>>> from one task to the other. Copying data is often poison for performance.
>>>
>>> Cheaply moving data around is possible in any language that has heap
>>> allocation and pointers.
>>
>> That is the most expensive way doing it, and on top of that, it requires
>> shared memory (pool) and thus process-global interlocking. You arrived
>> at the starting point.
>
> Please define what you mean by expensive.

Any operation that blocks all CPUs.

> Whenever the hardware
> architecture allows for it, passing a pointer to a heap-allocated data
> block is almost always faster than deep-copying that data from one
> thread-private area of memory to another, as long as the data is large
> enough.

This might be true only for architectures of few cores and shared 
memory. A massively parallel or distributed system cannot be built this way.

>> BTW, when talking about asynchronous model, marshaling must be
>> asynchronous too. Another drawback of the method of doing that through
>> pointers is that it must be atomic = synchronous => you could not deal
>> with large objects, on-demand production models etc. And this is where
>> event-controlled model stop working, as it separates data from
>> data-related events. A proper abstraction must combine everything into
>> ADT objects.
>
> Indeed, I also agree that when an event is used to notify a listener
> that some piece of data has been produced, it is almost always better to
> combine together the event and the data block being produced in a single
> abstraction, a future.
>
> However, futures are less general than events. In runtimes that only
> have futures, like HPX, you cannot easily express the completion of
> procedural operations. You just end up with horrible hacks such as
> hpx::future<void> ("a future of nothing"). This is why I believe that
> events are best as a low-level layer, on top of which higher-level
> abstractions such as futures can be built.

The idea is to make it look like normal tasks and protected objects 
doing normal blocking operations on normal objects, without exposing 
implementation details like events. We could still have events 
implemented through protected objects.

>>> What is more difficult is to provide an easy to
>>> use syntax around it.
>>
>> Hmm, what is difficult about procedure call?
>
> Basically, it is best if producers and consumers do not actually
> manipulate raw pointers to the heap-allocated blocks.

In Ada parameter passing is up to the implementation (with some 
exceptions). So if by-reference is the most efficient method of 
parameter passing the compiler will choose it. That is why I see no 
problem with the syntax.

> Otherwise, there is always a risk of a producer retaining access to
> the produced block and later manipulating it in a racey fashion.

OK, the aliasing problem is always a problem.

> A better option is to have a system where at the point where a
> producer "emits" data, it loses access to the associated data block. Of
> course, this is impossible to achieve in a perfectly fool-proof way, but
> we can make it harder for developers to shoot themselves in the foot,
> using things like futures.

One method is passing a controlled handle to the shared 
reference-counted object. The handle is invalidated upon return to the 
producer so that it will not be able to access the target object after that.

Another method is reference-counted "transactional" objects. A mutator 
operation on the object looks at the reference count; if it is greater 
than 1, it copies the object, uses the copy, and sets the handle to 
point to the copy upon return.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: RFC: Prototype for a user threading library in Ada
  2016-07-02 10:33                 ` Dmitry A. Kazakov
@ 2016-07-05 21:24                   ` Randy Brukardt
  2016-07-06 13:46                     ` Dmitry A. Kazakov
  0 siblings, 1 reply; 72+ messages in thread
From: Randy Brukardt @ 2016-07-05 21:24 UTC (permalink / raw)


"Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> wrote in message 
news:nl859c$hso$1@gioia.aioe.org...
> On 2016-07-02 06:21, Randy Brukardt wrote:
>> "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> wrote in message
>> news:nknteq$177m$1@gioia.aioe.org...
>>> On 2016-06-26 05:09, Randy Brukardt wrote:
>>>> "Hadrien Grasland" <hadrien.grasland@gmail.com> wrote in message
>>>> news:f4c3e42a-1aa1-4667-83d7-de6f81a8fdd2@googlegroups.com...
...
>>> Ah, but there is the reason. The OS can switch threads on I/O events. If
>>> Ada RTS could do this without mapping tasks to threads, fine. Apparently
>>> it cannot without imposing same or higher overhead OS threads have.
>>
>> Dunno; I doubt that it really has been tried. If that was really 
>> impossible,
>> virtualization wouldn't work either, and there's lots of evidence that 
>> works
>> fine.
>
> But it does not, in this sense. It can handle only virtual devices. In 
> order to access a physical one, you need a software layer. It could be 
> quite difficult for the RTS to follow this path.

Not sure why you say that: every RTS that I've ever seen wraps the I/O 
system in a layer, and indeed the OS itself is another such layer. Hard to 
say why thickening the existing layer a bit more would be harmful.

>> The problem (if there is a problem at all) is fast character-at-a-time 
>> I/O -
>> but that's a bad idea for lots of reasons (it doesn't work very well for 
>> a
>> sequential program either) so perhaps it wouldn't matter that much.
>
> It will make normal I/O slower and RTS larger with compatibility issues.

??? I can believe that there is *some* program that would be slower, but I'd 
rather doubt that most programs would be slower.

I'd look at replacing:
     Read (Handle, Data, Last);
with something like:
     loop
         if Ready (Handle) then
             Read_Ready_Data (Handle, Data, Last);
             exit when Last >= Data'First;
         end if;
         Yield;
     end loop;

The expensive operation would only be called if the I/O isn't ready. And in 
that case, this task has to wait anyway, so the performance of Yield really 
doesn't matter. (It is unlikely to be significant if Read takes 49 or 50 
milliseconds.) So the only possible issue is the underlying cost of "Ready". 
I'd expect that to be cheap (especially if the I/O system is buffering I/O 
anyway), but one would need to try it in real conditions to be sure.

I suppose there is a downside to this scheme, in that user-written I/O (that 
is, directly to devices) also has to be written this way, but that may be a 
small price to pay.

>>> Co-routines could offer something in between.
>>
>> Sure, but I don't see any need to expose that. Certainly not beyond an
>> aspect much like "Inline", which is just a suggestion to the compiler to 
>> do
>> something. It's better to leave implementation decisions (and that's all
>> this is) to implementations!
>
> Well, the difference is that Inline can be safely ignored. Limitation on 
> the task number cannot be.

Same either way - we're "just" talking about performance. (The program will 
work in either case if the advice is ignored.) One has to be careful about 
exposing too much of the mechanism specifically for performance, as that 
almost always harms abstraction.

And, as I pointed out before, performance is never portable. To the extent 
that one depends on performance, it has to be retested and retuned on every 
change (architecture, compiler, even major compiler version).

                               Randy.





* Re: RFC: Prototype for a user threading library in Ada
  2016-07-02  5:30                 ` Simon Wright
@ 2016-07-05 21:29                   ` Randy Brukardt
  0 siblings, 0 replies; 72+ messages in thread
From: Randy Brukardt @ 2016-07-05 21:29 UTC (permalink / raw)


"Simon Wright" <simon@pushface.org> wrote in message 
news:lyd1mwip2w.fsf@pushface.org...
> "Randy Brukardt" <randy@rrsoftware.com> writes:
>
>> "Hadrien Grasland" <hadrien.grasland@gmail.com> wrote in message
>
>>>An Ada implementation which would want to make the life of concurrent
>>>programmers easier could do the following things:
>>
>>>1/ Keep the amount of OS threads low (about the amount of CPU cores, a
>>>bit more for I/O), and map tasks to threads in an 1:N fashion.
>>
>> Reasonably easy. (For Windows, stack limitations might be a problem;
>> not a problem on a bare target.)
>
> Most things are a problem on a bare target.
>
> Or do you mean "limitations that Windows places on the _use_ of the
> stack"?

Yes. On Windows (and quite likely other modern OSes, because of the problems 
with C language program bugs causing attack vectors), one can't put any old 
memory address into the stack pointer (the SP register). If you do, you'll 
get an immediate fault. (I discovered this when developing our original Ada 
tasking system - the task stacks were not allowed, which was a problem). 
There probably are ways around this, but I don't know for sure. (If there 
isn't a way around it, it probably would be fatal, as one has to switch 
the stacks along with the underlying jobs.)

On a bare target, you can write any code you like. Of course, if you get it 
wrong, things go south in a hurry.

                                      Randy.




* Re: RFC: Prototype for a user threading library in Ada
  2016-07-02 11:13                 ` Hadrien Grasland
                                     ` (2 preceding siblings ...)
  2016-07-02 21:14                   ` Niklas Holsti
@ 2016-07-05 21:38                   ` Randy Brukardt
  3 siblings, 0 replies; 72+ messages in thread
From: Randy Brukardt @ 2016-07-05 21:38 UTC (permalink / raw)



"Hadrien Grasland" <hadrien.grasland@gmail.com> wrote in message 
news:bb07afac-5eaf-4622-b264-da6aef96e8d6@googlegroups.com...
>Le samedi 2 juillet 2016 06:36:09 UTC+2, Randy Brukardt a écrit :
>> But I don't think that really helps the race and deadlock issues that are
>> the real problem with programming with Ada tasks. I'd like to find some 
>> help
>> there, too

...
>But shared mutable data is necessary to the efficient implementation of
>data-parallel algorithms, so it should remain available to expert 
>programmers.

Right. Gotta have it.

>What I would love in the end is some kind of static analysis tool that 
>warns
>people about shared mutable data in multitasking programs, but gives a way
>to bypass the warning to people who actually know what they are doing.

I want the Ada compiler (and thus language) to do this sort of checking. 
Otherwise we have the C -- Lint sort of situation, and we know from 
experience that doesn't really help.

We've got some building blocks for this approach in the pipeline (aspect 
Nonblocking, aspect Global) which could help a lot if properly applied. I'd 
rather look at that than anything about performance details.

>But to make this warning useful would also entail introducing a notion of
>"deferred constants", that is, values which are defined at runtime but will
>not be modified subsequently. Otherwise, there would be too many false
>positives about program parameters that are loaded at runtime.

Ada of course has these, they're called "constants" :-). One needs to modify 
their programming style somewhat to use them (must initialize with 
functions, for instance), but often that's an improvement. And it can make a 
world of difference to tools (and the Global aspect, should that ever get 
fully defined).
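
A small sketch of that style (hypothetical names): the constant is computed once at elaboration time by a function, after which no code can modify it, which is exactly what a static analysis tool (or the Global aspect) needs to know:

```ada
with Ada.Command_Line;

package Config is
   --  Computed at startup, then immutable: Ada's ordinary constants
   --  already give the "deferred constant" behaviour described above,
   --  provided you initialize with a function call.
   function Compute_Worker_Count return Positive is
     (if Ada.Command_Line.Argument_Count > 0
      then Positive'Value (Ada.Command_Line.Argument (1))
      else 4);

   Worker_Count : constant Positive := Compute_Worker_Count;
end Config;
```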

                                      Randy.




* Re: RFC: Prototype for a user threading library in Ada
  2016-07-02 10:25                     ` Dmitry A. Kazakov
@ 2016-07-05 21:53                       ` Randy Brukardt
  2016-07-06  9:25                         ` Dmitry A. Kazakov
  0 siblings, 1 reply; 72+ messages in thread
From: Randy Brukardt @ 2016-07-05 21:53 UTC (permalink / raw)


"Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> wrote in message 
news:nl84rj$h70$1@gioia.aioe.org...
> On 2016-07-02 06:13, Randy Brukardt wrote:
>
>> Seriously, I'm thinking that thread scheduling overhead could be
>> reduced/eliminated by the task supervisor, and thus there's no real 
>> problem
>> with writing Ada tasks this way -- except of course that existing
>> implementations don't try to do this sort of optimization.
>
> There could be OS limitations on the number of threads a process may have.

It would be weird if one couldn't have as many threads as there are cores. 
(And if not, then one has to have more processes rather than more threads.)

> I was thinking about "user-scheduled" tasks, which are not scheduled at 
> all. Some user task explicitly releases one task from the pool and gets 
> control back when that task calls to an entry.

Yes, that's how all tasks work in Janus/Ada.

> The important point I believe is preemption. "User-scheduled" tasks would 
> not be required to be preemptive.

Nothing in the core language of Ada requires tasks to be pre-emptive. I 
believe any such requirement is a mistake, one that the runtime I'm talking 
about isn't going to make. (I.e., the Janus/Ada runtime doesn't support 
preemption, and that's very unlikely to change even when multiple threads 
get involved.) Such an implementation can't meet some of the default 
requirements of Annex D; that's another reason for an aspect to be involved 
(to say that Annex D doesn't apply). But of course there's no requirement to 
implement any SNA (such as Annex D) to implement Ada. There's a reason these 
things are separate!

                              Randy.







* Re: RFC: Prototype for a user threading library in Ada
  2016-07-05 21:53                       ` Randy Brukardt
@ 2016-07-06  9:25                         ` Dmitry A. Kazakov
  2016-07-07  0:32                           ` Randy Brukardt
  0 siblings, 1 reply; 72+ messages in thread
From: Dmitry A. Kazakov @ 2016-07-06  9:25 UTC (permalink / raw)


On 05/07/2016 23:53, Randy Brukardt wrote:
> "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> wrote in message
> news:nl84rj$h70$1@gioia.aioe.org...
>> On 2016-07-02 06:13, Randy Brukardt wrote:
>>
>>> Seriously, I'm thinking that thread scheduling overhead could be
>>> reduced/eliminated by the task supervisor, and thus there's no real
>>> problem
>>> with writing Ada tasks this way -- except of course that existing
>>> implementations don't try to do this sort of optimization.
>>
>> There could be OS limitations on the number of threads a process may have.
>
> It would be weird if one couldn't have as many threads as there are cores.
> (And if not, then one has to have more processes rather than more threads.)

I strongly disagree, especially when we are talking about tasks. A task 
is a logical entity independent of the host architecture. Then if the OS 
supports non-busy I/O waiting, and most OSes do, it makes a lot of sense 
to have many threads, most of which blocked waiting for something.

>> I was thinking about "user-scheduled" tasks, which are not scheduled at
>> all. Some user task explicitly releases one task from the pool and gets
>> control back when that task calls to an entry.
>
> Yes, that's how all tasks work in Janus/Ada.

Of course, the problem is to be able to mix both types of tasks, and to 
hint to the compiler which one it should use.

>> The important point I believe is preemption. "User-scheduled" tasks would
>> not be required to be preemptive.
>
> Nothing in the core language of Ada requires tasks to be pre-emptive.

Some applications rely on time sharing, and most rely on close-to-instant 
switching to a task of higher priority.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de


* Re: RFC: Prototype for a user threading library in Ada
  2016-07-05 21:24                   ` Randy Brukardt
@ 2016-07-06 13:46                     ` Dmitry A. Kazakov
  2016-07-07  1:00                       ` Randy Brukardt
  0 siblings, 1 reply; 72+ messages in thread
From: Dmitry A. Kazakov @ 2016-07-06 13:46 UTC (permalink / raw)


On 05/07/2016 23:24, Randy Brukardt wrote:
> "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> wrote in message
> news:nl859c$hso$1@gioia.aioe.org...
>> On 2016-07-02 06:21, Randy Brukardt wrote:
>>> "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> wrote in message
>>> news:nknteq$177m$1@gioia.aioe.org...
>>>> On 2016-06-26 05:09, Randy Brukardt wrote:
>>>>> "Hadrien Grasland" <hadrien.grasland@gmail.com> wrote in message
>>>>> news:f4c3e42a-1aa1-4667-83d7-de6f81a8fdd2@googlegroups.com...
> ...
>>>> Ah, but there is the reason. The OS can switch threads on I/O events. If
>>>> Ada RTS could do this without mapping tasks to threads, fine. Apparently
>>>> it cannot without imposing same or higher overhead OS threads have.
>>>
>>> Dunno; I doubt that it really has been tried. If that was really
>>> impossible,
>>> virtualization wouldn't work either, and there's lots of evidence that
>>> works
>>> fine.
>>
>> But it does not, in this sense. It can handle only virtual devices. In
>> order to access a physical one, you need a software layer. It could be
>> quite difficult for the RTS to follow this path.
>
> Not sure why you say that: every RTS that I've ever seen wraps the I/O
> system in a layer, and indeed the OS itself is another such layer. Hard to
> say why thickening the existing layer a bit more would be harmful.

The difference is whether this layer is user- or kernel-space. RTS is 
user-space. Virtualization I/O layer is kernel-space.

>>> The problem (if there is a problem at all) is fast character-at-a-time I/O -
>>> but that's a bad idea for lots of reasons (it doesn't work very well for
>>> a
>>> sequential program either) so perhaps it wouldn't matter that much.
>>
>> It will make normal I/O slower and RTS larger with compatibility issues.
>
> ??? I can believe that there is *some* program that would be slower, but I'd
> rather doubt that most programs would be slower.
>
> I'd look at replacing:
>      Read (Handle, Data, Last);
> with something like:
>      loop
>          if Ready (Handle) then
>              Read_Ready_Data (Handle, Data, Last);
>              exit when Last >= Data'First;
>          end if;
>          Yield;
>      end loop;

This is orders of magnitude slower, because after a yield the task gets 
control back no earlier than the next time slot, e.g. the next timer 
interrupt. It is no different from polling.

But even if the RTS had proper drivers on top of the system ones, it 
would still be greatly slower in most cases.

>>>> Co-routines could offer something in between.
>>>
>>> Sure, but I don't see any need to expose that. Certainly not beyond an
>>> aspect much like "Inline", which is just a suggestion to the compiler to
>>> do
>>> something. It's better to leave implementation decisions (and that's all
>>> this is) to implementations!
>>
>> Well, the difference is that Inline can be safely ignored. Limitation on
>> the task number cannot be.
>
> Same either way - we're "just" talking about performance.

Unless the system limit on the number of physical threads applies. And 
the situation is reversed. With inlining you may fail at compile time 
with GNAT when it runs out of memory, and that is a recoverable error, 
since you can forbid inlining. With mapping tasks to threads you fail at 
run time, and there is nothing you can do to prevent that from happening.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de



* Re: RFC: Prototype for a user threading library in Ada
  2016-07-06  9:25                         ` Dmitry A. Kazakov
@ 2016-07-07  0:32                           ` Randy Brukardt
  2016-07-07  6:08                             ` Niklas Holsti
  0 siblings, 1 reply; 72+ messages in thread
From: Randy Brukardt @ 2016-07-07  0:32 UTC (permalink / raw)


"Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> wrote in message 
news:nliir0$1a7l$1@gioia.aioe.org...
> On 05/07/2016 23:53, Randy Brukardt wrote:
...
>> Nothing in the core language of Ada requires tasks to be pre-emptive.
>
> Some applications rely on time sharing and most do on close to instant 
> switching to a task of higher priority.

Priorities are evil and almost always used poorly, that is, used to do 
something that should be accomplished explicitly with locking or the like. 
No system I'm contemplating has any priorities.

                                  Randy.




* Re: RFC: Prototype for a user threading library in Ada
  2016-07-06 13:46                     ` Dmitry A. Kazakov
@ 2016-07-07  1:00                       ` Randy Brukardt
  2016-07-07 14:23                         ` Dmitry A. Kazakov
  0 siblings, 1 reply; 72+ messages in thread
From: Randy Brukardt @ 2016-07-07  1:00 UTC (permalink / raw)


"Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> wrote in message 
news:nlj244$43i$1@gioia.aioe.org...
> On 05/07/2016 23:24, Randy Brukardt wrote:
>> "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> wrote in message
...
>> I'd look at replacing:
>>      Read (Handle, Data, Last);
>> with something like:
>>      loop
>>          if Ready (Handle) then
>>              Read_Ready_Data (Handle, Data, Last);
>>              exit when Last >= Data'First;
>>          end if;
>>          Yield;
>>      end loop;
>
> This is orders of magnitude slower, because after a yield the task gets 
> control back no earlier than the next time slot, e.g. the next timer 
> interrupt. It is no different from polling.

That's actually an advantage, as it greatly decreases the possibility that 
this task will waste time retrying repeatedly. And it is very rare that it 
would matter (assuming that I/O is properly buffered), as the vast majority 
of I/O will run at full speed, and that which doesn't is very slow 
(relatively speaking) anyway. (i.e., you're waiting for a packet to come 
across a network -- that's a long time -- usually tenths of seconds --  
compared to the slot time.)

Finally, our task supervisor doesn't use slots, and I've spent quite a bit 
of effort avoiding uses of Sleep on Windows that would take longer than 
the next delay expiration. It wouldn't be good for a low-power 
application, but I doubt any tasking system would be there.

> But even if the RTS had proper drivers on top of the system ones it would 
> still be greatly slower in most cases.

Disagree. My experience with Internet servers is that one can and should use 
relatively long delays in I/O without hurting response times. I think files 
would be similar (although the delays would be smaller). No way to tell for 
sure without building some examples.

...
>>> Well, the difference is that Inline can be safely ignored. Limitation on
>>> the task number cannot be.
>>
>> Same either way - we're "just" talking about performance.
>
> Unless the system limit on the number of physical threads apply.

That would never happen.

> And the situation is reverse. With inlining you may fail at compile time 
> with GNAT when it runts out of memory, and it is recoverable error since 
> you can forbid inlining.

Usually, compilers just don't inline in such circumstances, and no one 
really cares unless they happen to look at a machine code listing. (Since 
the inlining probably didn't make much difference in the first place.) 
Inline is a suggestion, not a promise (much like Suppress). There's never a 
reason to treat such things as commands.

[Yes, one of the reasons RR doesn't have the sort of customer base that 
AdaCore has is that I'm not willing to implement stupid stuff just because 
some customers are confused. :-)]

> With mapping tasks to threads you fail at run-time, and there is nothing 
> you can do to prevent that from happening.

Certainly not: the threads are all created at program-startup (5 threads for 
4 cores, I think), and the number doesn't change after that, unless one 
calls the library routine for that purpose. That would be a very rare thing, 
and failure wouldn't cause the program to fail, just run slower. An OS that 
didn't allow as many threads as there are cores isn't going to work for this 
system anyway, so the compiler would never exist in the first place. (So 
you'd get your compile-time detection.)

No language-defined I/O (or the sockets library, for that matter) would use 
blocking I/O. User-defined code could, of course, call blocking I/O (that 
being the main reason to allow more threads to be added manually). But it 
would be strongly discouraged.

                                        Randy.



* Re: RFC: Prototype for a user threading library in Ada
  2016-07-07  0:32                           ` Randy Brukardt
@ 2016-07-07  6:08                             ` Niklas Holsti
  2016-07-08  0:03                               ` Randy Brukardt
  0 siblings, 1 reply; 72+ messages in thread
From: Niklas Holsti @ 2016-07-07  6:08 UTC (permalink / raw)


On 16-07-07 03:32 , Randy Brukardt wrote:
> "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> wrote in message
> news:nliir0$1a7l$1@gioia.aioe.org...
>> On 05/07/2016 23:53, Randy Brukardt wrote:
> ...
>>> Nothing in the core language of Ada requires tasks to be pre-emptive.
>>
>> Some applications rely on time sharing and most do on close to instant
>> switching to a task of higher priority.
>
> Priorities are evil and almost always used poorly, that is to do something
> that should be accomplished explicitly with locking or the like. No system
> I'm contemplating has any priorities.

Are you not contemplating any real-time systems? If you are, what do you 
use instead of priorites, to ensure that urgent activities are done in time?

-- 
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
       .      @       .



* Re: RFC: Prototype for a user threading library in Ada
  2016-07-07  1:00                       ` Randy Brukardt
@ 2016-07-07 14:23                         ` Dmitry A. Kazakov
  2016-07-07 23:43                           ` Randy Brukardt
  0 siblings, 1 reply; 72+ messages in thread
From: Dmitry A. Kazakov @ 2016-07-07 14:23 UTC (permalink / raw)


On 07/07/2016 03:00, Randy Brukardt wrote:
> "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> wrote in message
> news:nlj244$43i$1@gioia.aioe.org...
>> On 05/07/2016 23:24, Randy Brukardt wrote:
>>> "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> wrote in message
> ...
>>> I'd look at replacing:
>>>      Read (Handle, Data, Last);
>>> with something like:
>>>      loop
>>>          if Ready (Handle) then
>>>              Read_Ready_Data (Handle, Data, Last);
>>>              exit when Last >= Data'First;
>>>          end if;
>>>          Yield;
>>>      end loop;
>>
>> This is in order of several magnitudes slower because after yield to
>> return back not before next time slot, e.g. timer interrupt. It is no
>> different from polling.
>
> That's actually an advantage, as it greatly decreases the possibility that
> this task will waste time retrying repeatedly.

Note that the alternative guarantees that there is I/O data ready, 
because the task is woken up by an I/O event rather than by a timer. It 
is polling which guarantees wasting time.

But the main problem is latencies. In the query-answer scenario it is 
extremely slow to work this way.

>> But even if the RTS had proper drivers on top of the system ones it would
>> still be greatly slower in most cases.
>
> Disagree. My experience with Internet servers is that one can and should use
> relatively long delays in I/O without hurting response times.

That is because there is a human user on the other end. For automatic 
systems such latencies are unacceptable. E.g. the initialization cycle 
of the EtherCAT distributed clock takes 10_000 send-packet/receive-packet 
cycles; it would never finish with 500 ms latencies.

>> And the situation is reverse. With inlining you may fail at compile time
>> with GNAT when it runts out of memory, and it is recoverable error since
>> you can forbid inlining.
>
> Usually, compilers just don't inline in such circumstances, and no one
> really cares unless they happen to look at a machine code listing.

How can the compiler predict it will run out of memory?

>> With mapping tasks to threads you fail at run-time, and there is nothing
>> you can do to prevent that from happening.
>
> Certainly not: the threads are all created at program-startup (5 threads for
> 4 cores, I think),

In that case the compiler must do all system I/O in an asynchronous way, 
presenting it as blocking to the Ada task. This is what I would gladly 
have, and what would satisfy the OP. The problem is, I suppose, that it 
is not what the compiler does.

> No language-defined I/O (or the sockets library, for that matter) would use
> blocking I/O. User-defined code could, of course, call blocking I/O (that
> being the main reason to allow more threads to be added manually). But it
> would be strongly discouraged.

Ah, but that is the point. The tasking model is not allowed to change 
the semantics. This means that if your RTS takes the liberty of using a 
single thread for multiple tasks, then it must also convert *all* 
synchronous I/O into asynchronous I/O, transparently to the Ada program.

I know that Ada RM does not require it, but I assure you that virtually 
no Ada user will accept anything else.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de



* Re: RFC: Prototype for a user threading library in Ada
  2016-07-07 14:23                         ` Dmitry A. Kazakov
@ 2016-07-07 23:43                           ` Randy Brukardt
  2016-07-08  8:23                             ` Dmitry A. Kazakov
  0 siblings, 1 reply; 72+ messages in thread
From: Randy Brukardt @ 2016-07-07 23:43 UTC (permalink / raw)


"Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> wrote in message 
news:nlloma$1vh8$1@gioia.aioe.org...
> On 07/07/2016 03:00, Randy Brukardt wrote:
...
>> Usually, compilers just don't inline in such circumstances, and no one
>> really cares unless they happen to look at a machine code listing.
>
> How can the compiler predict it will run out of memory?

Why would it have to? Inlining is just the duplication of some internal data 
structure; if you run out of memory doing that duplication, you just forget 
the operation and back out any changes. Our optimizer works that way in the 
unlikely case that it runs out of memory (at least it is supposed to; it's 
hard to test because it's hard to create the necessary conditions).

Ada programs at least have a chance of recovering from an out-of-memory 
situation, and it certainly makes sense to use that here. (One could also 
precheck that enough memory is available, but of course that's never certain 
to be true when you actually do the allocation, so you still need the 
fallback code.)
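
A sketch of that fallback pattern in Ada (illustrative only, not Janus/Ada's actual code): attempt the allocation the duplication needs, and handle Storage_Error by backing out:

```ada
with Ada.Text_IO;

procedure Duplicate_With_Fallback is
   --  Stand-in for the internal structures inlining would duplicate.
   type Work_Area is array (Positive range <>) of Long_Float;
   type Work_Area_Access is access Work_Area;
   Copy : Work_Area_Access;
begin
   Copy := new Work_Area (1 .. 200_000_000);  --  may raise Storage_Error
   Ada.Text_IO.Put_Line ("duplication succeeded; inline the call");
exception
   when Storage_Error =>
      --  Back out: drop the partial work and emit an ordinary call.
      Copy := null;
      Ada.Text_IO.Put_Line ("out of memory; fall back to a plain call");
end Duplicate_With_Fallback;
```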

>>> With mapping tasks to threads you fail at run-time, and there is nothing
>>> you can do to prevent that from happening.
>>
>> Certainly not: the threads are all created at program-startup (5 threads 
>> for
>> 4 cores, I think),
>
> In that case the compiler must do all system I/O in an asynchronous way, 
> presenting it as blocking to the Ada task. This is what I would gladly 
> have, and what would satisfy the OP. The problem is, I suppose, that it 
> is not what the compiler does.
>
>> No language-defined I/O (or the sockets library, for that matter) would 
>> use
>> blocking I/O. User-defined code could, of course, call blocking I/O (that
>> being the main reason to allow more threads to be added manually). But it
>> would be strongly discouraged.
>
> Ah, but that is the point. The tasking model is not allowed to change the 
> semantics. This means that if your RTS takes the liberty of using a single 
> thread for multiple tasks, then it must also convert *all* synchronous I/O 
> into asynchronous I/O, transparently to the Ada program.

But of course it's impossible for the compiler to change user-written code. 
If you do a direct interface to some C API, there is no way the compiler 
could change that (nor would it be a good idea to do so).

> I know that Ada RM does not require it, but I assure you that virtually no 
> Ada user will accept anything else.

That's demonstrably false, since Janus/Ada has always worked this way, and 
we've had far more than zero customers over the years.

In any case, there would be no issue unless the programmer writes their own 
I/O; using anything we provide (language-defined or implementation-defined) 
would work. I'd think the vast majority of Ada programs would use Stream_IO 
compared to something of their own design. (Sockets is the big issue, since 
the language doesn't have it, but portable libraries can easily be made to 
do the right thing.)

                             Randy.



* Re: RFC: Prototype for a user threading library in Ada
  2016-07-07  6:08                             ` Niklas Holsti
@ 2016-07-08  0:03                               ` Randy Brukardt
  2016-07-08  7:32                                 ` Dmitry A. Kazakov
  2016-07-08 20:17                                 ` Niklas Holsti
  0 siblings, 2 replies; 72+ messages in thread
From: Randy Brukardt @ 2016-07-08  0:03 UTC (permalink / raw)


"Niklas Holsti" <niklas.holsti@tidorum.invalid> wrote in message 
news:du69ugF3631U1@mid.individual.net...
> On 16-07-07 03:32 , Randy Brukardt wrote:
>> "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> wrote in message
>> news:nliir0$1a7l$1@gioia.aioe.org...
>>> On 05/07/2016 23:53, Randy Brukardt wrote:
>> ...
>>>> Nothing in the core language of Ada requires tasks to be pre-emptive.
>>>
>>> Some applications rely on time sharing and most do on close to instant
>>> switching to a task of higher priority.
>>
>> Priorities are evil and almost always used poorly, that is to do 
>> something
>> that should be accomplished explicitly with locking or the like. No 
>> system
>> I'm contemplating has any priorities.
>
> Are you not contemplating any real-time systems? If you are, what do you 
> use instead of priorites, to ensure that urgent activities are done in 
> time?

I'm not contemplating hard-real-time systems (under 10ms response time). I 
don't think it is possible to create implementation-independent code for 
those sorts of deadlines, and as such it doesn't really matter from a 
language perspective how that's done (it won't be portable in any case).

I'm unconvinced that the way to ensure that "urgent activities" are done is 
some sort of magic (and priorities are essentially magic). I'd rather make 
sure that no task is hogging the system, and avoid overcommitting. That 
usually happens naturally, and in the unusual case where it doesn't, there's 
almost always someplace where adding a synchronization point (usually a 
"Yield" aka delay 0.0) fixes the problem. [Most of the cases have involved 
I/O, and the system I've been discussing with Dmitry would pretty much have 
eliminated that. We had considered adding synchronization points to loops 
many years back, but couldn't figure out how to do that cheaply enough to be 
useful (the MS-DOS clock was much too expensive) - that could be revisited 
if necessary].
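
For illustration, the kind of synchronization point being described (a hypothetical compute loop; delay 0.0 is a task dispatching point even in a non-preemptive runtime):

```ada
procedure Crunch is
   Sum : Long_Long_Integer := 0;
begin
   for I in 1 .. 10_000_000 loop
      Sum := Sum + Long_Long_Integer (I mod 97);  --  the real work
      if I mod 10_000 = 0 then
         delay 0.0;  --  explicit task dispatching point ("Yield")
      end if;
   end loop;
end Crunch;
```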

It might actually be possible to make this system work with some form of 
priorities (we have them in our task supervisor, we just have the range of 
priorities set to 1 so that preemption doesn't come into play). That would 
certainly require the compiler to break loop and call cycles with 
synchronization points, and whether that could be done without visible 
performance impact is definitely TBD. (And I don't want to make my head hurt 
worrying about priorities when just getting task mapping to work is complex 
enough.)

                                   Randy.



^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: RFC: Prototype for a user threading library in Ada
  2016-07-08  0:03                               ` Randy Brukardt
@ 2016-07-08  7:32                                 ` Dmitry A. Kazakov
  2016-07-11 19:40                                   ` Randy Brukardt
  2016-07-08 20:17                                 ` Niklas Holsti
  1 sibling, 1 reply; 72+ messages in thread
From: Dmitry A. Kazakov @ 2016-07-08  7:32 UTC (permalink / raw)


On 08/07/2016 02:03, Randy Brukardt wrote:
> "Niklas Holsti" <niklas.holsti@tidorum.invalid> wrote in message
> news:du69ugF3631U1@mid.individual.net...
>> On 16-07-07 03:32 , Randy Brukardt wrote:
>>> "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> wrote in message
>>> news:nliir0$1a7l$1@gioia.aioe.org...
>>>> On 05/07/2016 23:53, Randy Brukardt wrote:
>>> ...
>>>>> Nothing in the core language of Ada requires tasks to be pre-emptive.
>>>>
>>>> Some applications rely on time sharing and most do on close to instant
>>>> switching to a task of higher priority.
>>>
>>> Priorities are evil and almost always used poorly, that is to do
>>> something
>>> that should be accomplished explicitly with locking or the like. No
>>> system
>>> I'm contemplating has any priorities.
>>
>> Are you not contemplating any real-time systems? If you are, what do you
>> use instead of priorities, to ensure that urgent activities are done in
>> time?
>
> I'm not contemplating hard-real-time systems (under 10ms response time).

Some of our customers have a 0.2ms response time requirement, and that is 
not just local but over the network.

> I
> don't think it is possible to create implementation-independent code for
> those sorts of deadlines, and as such it doesn't really matter from a
> language perspective how that's done (it won't be portable in any case).

I don't see why. Any reasonable implementation would do. An 
implementation that does not preempt lower priority tasks is not 
reasonable. If you wanted to push the argument you would end up with 
disabling hardware interrupts.

>>> I'm unconvinced that the way to ensure that "urgent activities" are done is
> some sort of magic (and priorities are essentially magic). I'd rather make
> sure that no task is hogging the system, and avoid overcommitting. That
> usually happens naturally, and in the unusual case where it doesn't, there's
> almost always someplace where adding a synchronization point (usually a
> "Yield" aka delay 0.0) fixes the problem.

No, it does not, and it is a *very* bad design, because it distributes the 
decision to switch into parts of the software which should know 
nothing about why switching is necessary. E.g. a task solving some 
differential equation would decide when to switch to the keyboard 
interrupt handler. That decision belongs to the keyboard driver, not to 
the equation solver.

Yield is a premature optimization of the worst kind, and I bet it is highly 
inefficient compared to preemptive scheduling on any modern hardware. 
It is just like arguing for return codes over exceptions.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: RFC: Prototype for a user threading library in Ada
  2016-07-07 23:43                           ` Randy Brukardt
@ 2016-07-08  8:23                             ` Dmitry A. Kazakov
  2016-07-11 19:44                               ` Randy Brukardt
  0 siblings, 1 reply; 72+ messages in thread
From: Dmitry A. Kazakov @ 2016-07-08  8:23 UTC (permalink / raw)


On 08/07/2016 01:43, Randy Brukardt wrote:
> "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> wrote in message
> news:nlloma$1vh8$1@gioia.aioe.org...
>> On 07/07/2016 03:00, Randy Brukardt wrote:
> ...
>>> Usually, compilers just don't inline in such circumstances, and no one
>>> really cares unless they happen to look at a machine code listing.
>>
>> How the compiler can predict it will run out of memory?
>
> Why would it have to?

No idea, but it is a huge problem with GNAT. I am not sure whether that 
happens because of inlining or because of the cross-references it builds, 
but GNAT promptly runs out of 2GB of memory when compiling large projects. 
The workaround I know is to reduce the number of with clauses per unit. It 
is not enough to move the with clauses into another package and "with" that 
one; the distance must be at least two units away. It is quite annoying.

>> Ah, but that is the point. The tasking model is not allowed to change the
>> semantics. This means that if your RTS takes the liberty to use single
>> thread for multiple tasks, then it also must convert *all* synchronous I/O
>> into asynchronous transparently to the Ada program.
>
> But of course it's impossible for the compiler to change user-written code.
> If you do a direct interface to some C API, there is no way the compiler
> could change that (nor would it be a good idea to do so).

You were talking about a virtualization layer, so theoretically it could 
be possible. But practically it is a non-starter, which is why 
user-scheduled tasks must be the programmer's choice and not a mere 
optimization, as you suggested.

>> I know that Ada RM does not require it, but I assure you that virtually no
>> Ada user will accept anything else.
>
> That's demonstrably false, since Janus/Ada has always worked this way, and
> we've had far more than zero customers over the years.

You didn't tell them? (:-))

> In any case, there would be no issue unless the programmer writes their own
> I/O; using anything we provide (language-defined or implementation-defined)
> would work. I'd think the vast majority of Ada programs would use Stream_IO
> compared to something of their own design. (Sockets is the big issue, since
> the language doesn't have it, but portable libraries can easily be made to
> do the right thing.)

Yes, a numeric application needs none of this stuff. But the new shiny 
things in Ada appear in embedded and distributed systems that move real 
hardware. These are the areas where the problems arise. And as Gnoga gains 
popularity, the same questions will appear for fat server 
applications.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: RFC: Prototype for a user threading library in Ada
  2016-07-08  0:03                               ` Randy Brukardt
  2016-07-08  7:32                                 ` Dmitry A. Kazakov
@ 2016-07-08 20:17                                 ` Niklas Holsti
  1 sibling, 0 replies; 72+ messages in thread
From: Niklas Holsti @ 2016-07-08 20:17 UTC (permalink / raw)


On 16-07-08 03:03 , Randy Brukardt wrote:
> "Niklas Holsti" <niklas.holsti@tidorum.invalid> wrote in message
> news:du69ugF3631U1@mid.individual.net...
>> On 16-07-07 03:32 , Randy Brukardt wrote:
>>> "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> wrote in message
>>> news:nliir0$1a7l$1@gioia.aioe.org...
>>>> On 05/07/2016 23:53, Randy Brukardt wrote:
>>> ...
>>>>> Nothing in the core language of Ada requires tasks to be pre-emptive.
>>>>
>>>> Some applications rely on time sharing and most do on close to instant
>>>> switching to a task of higher priority.
>>>
>>> Priorities are evil and almost always used poorly, that is to do
>>> something
>>> that should be accomplished explicitly with locking or the like. No
>>> system
>>> I'm contemplating has any priorities.
>>
>> Are you not contemplating any real-time systems? If you are, what do you
>> use instead of priorities, to ensure that urgent activities are done in
>> time?
>
> I'm not contemplating hard-real-time systems (under 10ms response time). I
> don't think it is possible to create implementation-independent code for
> those sorts of deadlines, and as such it doesn't really matter from a
> language perspective how that's done (it won't be portable in any case).

I'm currently working on a project which has one response deadline of 1 
ms and a main cyclic task that runs with 2 ms period -- but Dmitry 
bested this with 0.2 ms ...

Clearly such applications require a certain level of real-time 
performance from the computer, but at or above that performance 
threshold the proper use of priorities does allow a portable 
implementation, at least with a "bare-metal" RTS.

> I'm unconvinced that the way to ensure that "urgent activities" are done is
> some sort of magic (and priorities are essentially magic).

Magic? Priority-based scheduling and schedulability analysis have been 
studied and developed scientifically and mathematically for a long time, 
with ever more powerful methods and tools becoming available to prove 
that all deadlines are met under all circumstances. Is the Pythagorean 
Theorem magic?

> I'd rather make sure that no task is hogging the system, and avoid
> overcommitting. That usually happens naturally, and in the unusual case
> where it doesn't, there's almost always someplace where adding a
> synchronization point (usually a "Yield" aka delay 0.0) fixes the problem.

I fully share Dmitry's abhorrence of such manual, distributed 
scheduling. It would definitely be awful in my domain. To Dmitry's 
comments I add that it would make it harder to share/reuse SW components 
between projects, because the proper density of Yields in the 
shared/reused code would often depend on the real-time architecture of 
the application that uses the code. Moreover, a Yield is not allowed in 
a protected operation, but in an application with a large range of 
deadlines some protected operations, at low priorities, may be long 
enough that they must be preempted by higher-priority tasks.
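In standard Ada terms, the preemptive alternative described above is 
expressed with the Real-Time Annex priority features. A hedged sketch 
(all names are invented; task bodies are omitted):

```ada
--  Sketch of priority-based preemptive scheduling in standard Ada
--  (Annex D).  All names are invented for illustration.
with System;

package Priority_Demo is

   --  Short-deadline task: preempts anything at a lower priority.
   task Urgent_Sampler with Priority => System.Priority'Last;

   --  Long-running background work at the lowest priority.
   task Background_Solver with Priority => System.Priority'First;

   --  Protected data at a ceiling priority high enough for all
   --  callers; its operations are kept short, so they never need to
   --  be preempted.
   protected Shared_Data with Priority => System.Priority'Last is
      procedure Put (V : Float);
      function Get return Float;
   private
      Value : Float := 0.0;
   end Shared_Data;

end Priority_Demo;
```

With ceiling locking, the schedulability of such a design can be analyzed 
mathematically, which is the point made above.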

Long ago, I wrote a couple of cooperative, priority-based multi-taskers, 
one for the HP2100 16-bit minicomputer and another for the TRS-80 PC 
(Zilog Z80 processor), in which equivalents of Yield were used. These 
systems worked and were usable in small applications, with a small 
number of tasks and a small number of priority levels, but I would 
certainly not like to base my current applications on such schedulers, 
and I believe that even my customers would object.

-- 
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
       .      @       .

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: RFC: Prototype for a user threading library in Ada
  2016-06-17  9:44 RFC: Prototype for a user threading library in Ada Hadrien Grasland
                   ` (3 preceding siblings ...)
  2016-06-20  8:42 ` Hadrien Grasland
@ 2016-07-10  0:45 ` rieachus
  4 siblings, 0 replies; 72+ messages in thread
From: rieachus @ 2016-07-10  0:45 UTC (permalink / raw)


On Friday, June 17, 2016 at 5:44:18 AM UTC-4, Hadrien Grasland wrote:
> So, a while ago, after playing with the nice user-mode threading libraries that are available these days in C++, like Intel TBB and HPX, I thought it would be nice if Ada had something similar.

I've been following this thread for a while, and I keep thinking that there is a lot of barking up the wrong tree.  But then again, I am used to thinking about problems of the size that takes supercomputers.  Ada is a decent fit, but the real issues are language independent--and as you may have grown tired of seeing me say, the optimal program--in any language--will finish within a few seconds of the estimate you get from looking only at moving data.  Even if you have thousands of CPU hours of number crunching to do, your program looks like this:

Distribute the program code to the correct number of nodes.
Start the program.  The nodes will collect the data from the data storage system, interact (with hopefully only adjacent nodes) to complete the computation.
Collect and assemble the finished data for presentation.

The hard part of supercomputer programming today is not writing the code that does the work, it is distributing the program and data, and then assembling the results.

How to do this?  Let's forget about Ada tasking for a moment.  It may be great for tying together the (relatively small) number of CPUs at a node which share memory, and in some cases even cache memory.  What you need to start with, assuming you wrote your program as a thousand or more distributed annex programs, is either to use tools that are part of the supercomputer to distribute thousands of copies of the identical code to different nodes, or to create your program with a hierarchical tree structure.  Otherwise you will spend all your time just getting the code distributed from a single node.

The other end of the program presents the same issues.  Imagine you want to simulate the evolution of the universe over a space that grows to a cube a million parsecs on a side and a period of a billion years.  After each time step you want to collect the data for that epoch for later viewing.  Again, if all the nodes are hammering on a single file, your program won't run very fast.

One solution is to store result data on a per node basis, then when the program is finished, run a program that converts from slices through time to 3d images at a given point in time.  This program may run several times as long as your big number crunching program.  But you can do it on your workstation over the weekend. ;-)

Another solution, tricky but workable, is to skew time in the main program.  This can result in duplicate number crunching at the boundaries, but as I keep saying, that is not the problem.  Now you feed the data into files, but you have several hundred files corresponding to different points in time, all open and collecting data.

Why post all this here?  The Ada distributed annex is actually a good fit for the tools available on most supercomputers.  Combining the distributed annex with Ada tasking on the nodes provided by modern CPUs is a good fit.  Unfortunately--or we just have to learn to live with it--supercomputers are being taken over by GPUs.  The CPUs are still there, but they end up delegated to moving data between nodes, and doing whatever synchronization is needed.
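For concreteness, the distributed annex structure alluded to above could 
start from a Remote Call Interface package along these lines (a sketch 
under invented names; the partitioning details depend on the Annex E 
implementation used):

```ada
--  Sketch of an Annex E interface: each compute node is a partition
--  exporting this RCI package; a coordinator partition calls into it.
--  All names are invented.
package Node_Work is
   pragma Remote_Call_Interface;

   procedure Run_Slice (First, Last : Positive);
   --  Number-crunch one slice of the problem on this node.

   function Slice_Result (First, Last : Positive) return Long_Float;
   --  Return this node's contribution for later assembly.
end Node_Work;
```

The coordinator would call Run_Slice on each partition, ideally fanned out 
through a tree of intermediate partitions rather than from a single node, 
for the distribution reasons given above.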

Technically the tools are there to write code in high-level languages and run it on GPUs.  But right now, you end up with lots of structural artifacts in your program that make it non-portable.  (The biggest of these is the number of shader processors per GPU.  AMD has done a nice thing in breaking the shaders into groups of 64 on all their cards.  Helps a lot when working on a single machine, but...  Besides, right now most supercomputers use nVidia cards.  This may change if AMD gets their GFLOPS/watt down to where nVidia is.)

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: RFC: Prototype for a user threading library in Ada
  2016-07-08  7:32                                 ` Dmitry A. Kazakov
@ 2016-07-11 19:40                                   ` Randy Brukardt
  2016-07-12  8:37                                     ` Dmitry A. Kazakov
  0 siblings, 1 reply; 72+ messages in thread
From: Randy Brukardt @ 2016-07-11 19:40 UTC (permalink / raw)


"Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> wrote in message 
news:nlnl0b$93q$1@gioia.aioe.org...
> On 08/07/2016 02:03, Randy Brukardt wrote:
...
>>> Are you not contemplating any real-time systems? If you are, what do you
>>> use instead of priorities, to ensure that urgent activities are done in
>>> time?
>>
>> I'm not contemplating hard-real-time systems (under 10ms response time).
>
> Some of our customers have 0.2ms response time requirement and that not 
> just local but over the network.

Your customers wouldn't be a candidate for this implementation, then.

>> I don't think it is possible to create implementation-independent code 
>> for
>> those sorts of deadlines, and as such it doesn't really matter from a
>> language perspective how that's done (it won't be portable in any case).
>
> I don't see why. Any reasonable implementation would do. An implementation 
> that does not preempt lower priority tasks is not reasonable. If you 
> wanted to push the argument you would end up with disabling hardware 
> interrupts.

Your definition of reasonable and mine are obviously incompatible. Hardware 
interrupts are evil; possibly a necessary evil but still evil and one wants 
to minimize them as much as possible. Obviously you disagree.

>> I'm unconvinced that the way to ensure that "urgent activities" are done is
>> some sort of magic (and priorities are essentially magic). I'd rather 
>> make
>> sure that no task is hogging the system, and avoid overcommitting. That
>> usually happens naturally, and in the unusual case where it doesn't, 
>> there's
>> almost always someplace where adding a synchronization point (usually a
>> "Yield" aka delay 0.0) fixes the problem.
>
> No it does not and it is a *very* bad design, because it distributes 
> making the decision to switch to the parts of the software which must know 
> nothing about the reason when switching is necessary. E.g. that a task 
> solving some differential equation decides whether to switch to the 
> keyboard interrupt handler. That belongs to the keyboard driver, not to 
> the equation solver.

Wrong. In this scheme *all* software switches, all the time. Only the task 
supervisor is making any decisions, but it is getting the opportunity to do 
so at reasonable intervals. Neither the keyboard nor the diffy-Q solver has 
anything to do with it.

I agree that having to put in such choices manually isn't a good idea, but I 
wasn't suggesting that (outside of user-written I/O that doesn't use 
language-defined or implementation-defined libraries). The compiler would do 
it at appropriate points.

> Yield is a premature optimization of worst kind and I bet it is highly 
> inefficient comparing to preemptive scheduling on any modern hardware.

Dunno; it wasn't very efficient on MS-DOS, but the problem there was that 
MS-DOS itself wasn't re-entrant. With multiple threads running at all 
times, the equation is very different.

I don't want to make a claim that it will be better in some sense, the 
experiment has to be done before any answer is known. (Task switching 
appears to be *much* cheaper in this model, so the question remains whether 
that savings makes up for the extra cost.)

> It is just like arguing for return codes over exceptions.

Given that almost none of this is visible to the Ada user, I don't see the 
analogy. (And certainly, return codes are much more efficient than 
exceptions; the problem is the distribution of concerns. That's not 
happening here.)

                                        Randy.



^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: RFC: Prototype for a user threading library in Ada
  2016-07-08  8:23                             ` Dmitry A. Kazakov
@ 2016-07-11 19:44                               ` Randy Brukardt
  0 siblings, 0 replies; 72+ messages in thread
From: Randy Brukardt @ 2016-07-11 19:44 UTC (permalink / raw)


"Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> wrote in message 
news:nlnnvt$d3f$1@gioia.aioe.org...
> On 08/07/2016 01:43, Randy Brukardt wrote:
...
>> In any case, there would be no issue unless the programmer writes their 
>> own
>> I/O; using anything we provide (language-defined or 
>> implementation-defined)
>> would work. I'd think the vast majority of Ada programs would use 
>> Stream_IO
>> compared to something of their own design. (Sockets is the big issue, 
>> since
>> the language doesn't have it, but portable libraries can easily be made 
>> to
>> do the right thing.)
>
> Yes, a numeric application need none of this stuff. But new shiny things 
> in Ada appear in embedded and distributed systems moving some real 
> hardware. These are the areas where problems arise. And as Gnoga will gain 
> its popularity the same questions will appear for fat server applications.

I'm no expert in embedded systems, so I'll not comment there. As far as fat 
servers go, I have several of those running this way, and whatever 
problems they have aren't related to the tasking model. (Mostly they seem 
to have contention issues with shared data, which isn't surprising to me.)

                                Randy.



^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: RFC: Prototype for a user threading library in Ada
  2016-07-11 19:40                                   ` Randy Brukardt
@ 2016-07-12  8:37                                     ` Dmitry A. Kazakov
  2016-07-12 21:31                                       ` Randy Brukardt
  0 siblings, 1 reply; 72+ messages in thread
From: Dmitry A. Kazakov @ 2016-07-12  8:37 UTC (permalink / raw)


On 2016-07-11 21:40, Randy Brukardt wrote:
> "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> wrote in message
> news:nlnl0b$93q$1@gioia.aioe.org...

>>> I don't think it is possible to create implementation-independent code for
>>> those sorts of deadlines, and as such it doesn't really matter from a
>>> language perspective how that's done (it won't be portable in any case).
>>
>> I don't see why. Any reasonable implementation would do. An implementation
>> that does not preempt lower priority tasks is not reasonable. If you
>> wanted to push the argument you would end up with disabling hardware
>> interrupts.
>
> Your definition of reasonable and mine are obviously incompatible. Hardware
> interrupts are evil; possibly a necessary evil but still evil and one wants
> to minimize them as much as possible. Obviously you disagree.

What is evil in hardware interrupts? It is a piece of hardware serving a 
certain purpose. There is no moral component there from the software 
development POV, especially because no alternative solution ever existed.

>>> I'm unconvinced that the way to ensure that "urgent activities" are done is
>>> some sort of magic (and priorities are essentially magic). I'd rather make
>>> sure that no task is hogging the system, and avoid overcommitting. That
>>> usually happens naturally, and in the unusual case where it doesn't, there's
>>> almost always someplace where adding a synchronization point (usually a
>>> "Yield" aka delay 0.0) fixes the problem.
>>
>> No it does not and it is a *very* bad design, because it distributes
>> making the decision to switch to the parts of the software which must know
>> nothing about the reason when switching is necessary. E.g. that a task
>> solving some differential equation decides whether to switch to the
>> keyboard interrupt handler. That belongs to the keyboard driver, not to
>> the equation solver.
>
> Wrong. In this scheme *all* software switches, all the time. Only the task
> supervisor is making any decisions, but it is getting the opportunity to do
> so at reasonable intervals. Neither the keyboard nor the diffy-Q solver has
> anything to do with it.

The keyboard driver is at the receiving end, as a subscriber to the 
interrupt, or keyboard event. Solver and other tasks sharing the 
processor have nothing to do with that. Whether all events must be 
routed through the supervisor is an implementation detail. In any case 
the supervisor is not a part of the software, it is an OS/RTS component, 
thus non-existent from the SW design POV.

> I agree that having to put in such choices manually isn't a good idea, but I
> wasn't suggesting that (outside of user-written I/O that doesn't use
> language-defined or implementation-defined libraries). The compiler would do
> it at appropriate points.

How is that different, then? If the compiler inserts re-scheduling code 
after every few instructions, that is logically *exactly* the same as 
re-scheduling at timer interrupts, except incredibly inefficient. 
This would be nothing but poor-man's preemptive scheduling.

My point was that we need non-preemptive, user-controlled (explicit, 
cooperative) scheduling of certain tasks on top of the standard scheme. 
And this looks much simpler to implement than the code insertions you 
suggested.

>> It is just like arguing for return codes over exceptions.
>
> Given that almost none of this is visible to the Ada user, I don't see the
> analogy. (And certainly, return codes are much more efficient than
> exceptions; the problem is the distribution of concerns. That's not
> happening here.)

The analogy is that instead of signaling an event (exception, scheduling 
event) at its source, with the advantage of hardware acceleration, you 
poll for the event state all over the code. Doing that, you lose hardware 
support, and you have a huge problem with third-party libraries that do 
not conform to your scheme.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: RFC: Prototype for a user threading library in Ada
  2016-07-12  8:37                                     ` Dmitry A. Kazakov
@ 2016-07-12 21:31                                       ` Randy Brukardt
  0 siblings, 0 replies; 72+ messages in thread
From: Randy Brukardt @ 2016-07-12 21:31 UTC (permalink / raw)


"Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> wrote in message 
news:nm2a9s$pkr$1@gioia.aioe.org...
> On 2016-07-11 21:40, Randy Brukardt wrote:
>> "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> wrote in message
>> news:nlnl0b$93q$1@gioia.aioe.org...
...
>> Your definition of reasonable and mine are obviously incompatible. 
>> Hardware
>> interrupts are evil; possibly a necessary evil but still evil and one 
>> wants
>> to minimize them as much as possible. Obviously you disagree.
>
> What is evil in hardware interrupts? It is a piece of hardware serving 
> certain purpose. There is no moral component there from the software 
> developing POV, especially because no alternative solution ever existed.

Asynchronous actions are evil. (Necessary in some cases, but evil.) They 
essentially force all code to be concurrency-aware, something that neither 
programming languages nor humans have done very well.

The usual solution to that is to have handlers that do nothing but set an 
atomic object, but of course the program then is essentially polling (or 
mutex waiting, which amounts to the same thing).
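The handler-sets-a-flag pattern described here can be sketched in Ada as 
follows (all names are invented; attaching the handler to an actual 
interrupt via the Interrupt_Handler aspect is omitted):

```ada
--  Sketch: the handler does nothing but set an atomic flag, which the
--  worker task polls at its synchronization points.  Names invented.
package Flag_Demo is

   Event_Pending : Boolean := False with Atomic;

   protected Handler is
      procedure Signal;  --  would carry an Interrupt_Handler aspect
   end Handler;

end Flag_Demo;

package body Flag_Demo is
   protected body Handler is
      procedure Signal is
      begin
         Event_Pending := True;  --  the handler's only effect
      end Signal;
   end Handler;
end Flag_Demo;
```

The worker then tests Event_Pending in its main loop, which is exactly the 
polling equivalence noted above.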

...
>> I agree that having to put in such choices manually isn't a good idea, 
>> but I
>> wasn't suggesting that (outside of user-written I/O that doesn't use
>> language-defined or implementation-defined libraries). The compiler would 
>> do
>> it at appropriate points.
>
> How is that different, then? If the compiler inserts re-scheduling code 
> after every few instructions, that is logically *exactly* the same as 
> re-scheduling at timer interrupts, except incredibly inefficient. 
> This would be nothing but poor-man's preemptive scheduling.

Not "every few instructions"; just a handful of strategic places. And the 
idea of course is that it works enough like preemptive scheduling without 
the excessive costs of preemption. So I think "poor-man's preemptive 
scheduling" would be a compliment (as a "poor man" has to do more with 
less - I think that's a virtue :-).

> My point was that we need non-preemptive user-controlled (explicit, 
> cooperative) scheduling of certain tasks on top of the standard scheme. 
> And this looks much simpler to implement than the code insertions you 
> suggested.

Have you ever tried to implement this sort of stuff?

For Janus/Ada, implementing some sort of separate co-routines would 
definitely require rewriting 1/3rd of the front-end from scratch (everything 
that uses local variables and parameters would probably have to be changed 
to avoid the task stack), and potentially would require major changes in 
the back-end as well. Not to mention whatever changes are made to the task 
supervisor.

A "passive" aspect would only need changes to the task supervisor (it would 
have no effect on the generated code). The scheme I'm considering would 
require an extra code insertion into the stack check subprogram (trivial) 
and an extra call at "end loop" (also pretty trivial). The work needed is 
10% of supporting co-routines (or full pre-emption, for that matter).

>>> It is just like arguing for return codes over exceptions.
>>
>> Given that almost none of this is visible to the Ada user, I don't see 
>> the
>> analogy. (And certainly, return codes are much more efficient than
>> exceptions; the problem is the distribution of concerns. That's not
>> happening here.)
>
> The analogy is that instead of signaling an event (exception, scheduling 
> event) at its source with the advantage of hardware acceleration, you poll 
> for the event state all over the code. Doing that you lose hardware 
> support, and you have a huge problem with third-party libraries that do 
> not conform to your scheme.

I prefer to put as little trust as absolutely necessary into anything that I 
don't have direct control of. That means hardware, third-party software, 
etc. I suspect I'd be happiest on a bare RISC machine without any 
interrupts. (I'd also probably be alone on that machine, which would also 
make me happy other than in the wallet. :-)

                                        Randy.


^ permalink raw reply	[flat|nested] 72+ messages in thread

end of thread, other threads:[~2016-07-12 21:31 UTC | newest]

Thread overview: 72+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-06-17  9:44 RFC: Prototype for a user threading library in Ada Hadrien Grasland
2016-06-17 16:18 ` Niklas Holsti
2016-06-17 16:46   ` Dmitry A. Kazakov
2016-06-18  8:16     ` Hadrien Grasland
2016-06-18  8:47       ` Dmitry A. Kazakov
2016-06-18  9:17         ` Hadrien Grasland
2016-06-18 11:53           ` Dmitry A. Kazakov
2016-06-20  8:23             ` Hadrien Grasland
2016-06-20  9:22               ` Dmitry A. Kazakov
2016-06-23  1:42       ` Randy Brukardt
2016-06-23  8:39         ` Dmitry A. Kazakov
2016-06-23 22:12           ` Randy Brukardt
2016-06-24  7:34             ` Dmitry A. Kazakov
2016-06-24 23:00               ` Randy Brukardt
2016-06-25  7:11                 ` Dmitry A. Kazakov
2016-06-26  2:02                   ` rieachus
2016-06-26  6:26                     ` Dmitry A. Kazakov
2016-06-24  0:38           ` rieachus
2016-06-25  6:28             ` Dmitry A. Kazakov
2016-06-26  1:34               ` rieachus
2016-06-26  3:21               ` Randy Brukardt
2016-06-26  6:15                 ` Dmitry A. Kazakov
2016-06-28 20:44                   ` Anh Vo
2016-07-02  4:13                   ` Randy Brukardt
2016-07-02 10:25                     ` Dmitry A. Kazakov
2016-07-05 21:53                       ` Randy Brukardt
2016-07-06  9:25                         ` Dmitry A. Kazakov
2016-07-07  0:32                           ` Randy Brukardt
2016-07-07  6:08                             ` Niklas Holsti
2016-07-08  0:03                               ` Randy Brukardt
2016-07-08  7:32                                 ` Dmitry A. Kazakov
2016-07-11 19:40                                   ` Randy Brukardt
2016-07-12  8:37                                     ` Dmitry A. Kazakov
2016-07-12 21:31                                       ` Randy Brukardt
2016-07-08 20:17                                 ` Niklas Holsti
2016-06-24 21:06         ` Hadrien Grasland
2016-06-26  3:09           ` Randy Brukardt
2016-06-26  6:41             ` Dmitry A. Kazakov
2016-07-02  4:21               ` Randy Brukardt
2016-07-02 10:33                 ` Dmitry A. Kazakov
2016-07-05 21:24                   ` Randy Brukardt
2016-07-06 13:46                     ` Dmitry A. Kazakov
2016-07-07  1:00                       ` Randy Brukardt
2016-07-07 14:23                         ` Dmitry A. Kazakov
2016-07-07 23:43                           ` Randy Brukardt
2016-07-08  8:23                             ` Dmitry A. Kazakov
2016-07-11 19:44                               ` Randy Brukardt
2016-06-26  9:09             ` Hadrien Grasland
2016-07-02  4:36               ` Randy Brukardt
2016-07-02  5:30                 ` Simon Wright
2016-07-05 21:29                   ` Randy Brukardt
2016-07-02 11:13                 ` Hadrien Grasland
2016-07-02 13:18                   ` Dmitry A. Kazakov
2016-07-02 16:49                     ` Hadrien Grasland
2016-07-02 21:33                       ` Niklas Holsti
2016-07-03 20:56                         ` Hadrien Grasland
2016-07-02 17:26                   ` Niklas Holsti
2016-07-02 21:14                   ` Niklas Holsti
2016-07-03  7:42                     ` Hadrien Grasland
2016-07-03  8:39                       ` Dmitry A. Kazakov
2016-07-03 21:15                         ` Hadrien Grasland
2016-07-04  7:44                           ` Dmitry A. Kazakov
2016-07-05 21:38                   ` Randy Brukardt
2016-06-21  2:40     ` rieachus
2016-06-21  7:34       ` Dmitry A. Kazakov
2016-06-18  7:56   ` Hadrien Grasland
2016-06-18  8:33 ` Hadrien Grasland
2016-06-18 11:38 ` Hadrien Grasland
2016-06-18 13:17   ` Niklas Holsti
2016-06-18 16:27   ` Jeffrey R. Carter
2016-06-20  8:42 ` Hadrien Grasland
2016-07-10  0:45 ` rieachus

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox