From: Brad Moore
Newsgroups: comp.lang.ada
Subject: Re: GNAT and Tasklets
Date: Wed, 17 Dec 2014 14:08:31 -0700

On 14-12-17 06:06 AM, vincent.diemunsch@gmail.com wrote:
> Hello Brad,
>
>> I don't think this is accurate, as creating tasks in Ada generally serves
>> a different purpose than adding improved parallelism. Tasks are useful
>> constructs for creating independent concurrent activities. They are a way
>> of breaking an application into separate, independent logical executions
>> that separate concerns, improving the logic and understanding of a
>> program. Parallelism, on the other hand, is only about making the program
>> execute faster. If the parallelism does not do that, it fails to serve
>> its purpose.
>
> I am rather surprised that you made a distinction between creating tasks
> and parallelism. I agree that the goal of parallelism is to increase CPU
> usage and therefore make the program run faster. For me, creating tasks is
> the Ada way of implementing parallelism. And it is a sound way of doing it,
> since compilers, as far as I know, are not really able to find parallelism
> in a program automatically. Moreover, using things like state machines to
> create parallelism is too complex for a programmer and needs the use of a
> dedicated language. So tasks are fine.

I made the distinction because they are not the same. Parallelism is a form
of concurrency, but really just a subset of it. In parallelism, multiple
tasks are executing at the same time. In more general concurrency this is
not necessarily the case: time slicing might be used, for example, so that
in reality only one task is executing at any given instant.

I am not disagreeing that tasks can be useful for implementing parallelism.
The Paraffin libraries, for instance, use Ada tasks as the underlying
workers. It's just that, to make effective use of multicore and manycore
architectures, tasks are generally too coarse-grained a construct for the
application programmer to have to start from every time they want to
introduce parallelism. It's too much code to have to write each time,
especially if you also want load balancing, reductions, a variable number
of cores, prevention of oversubscribing the parallelism, the right number
of workers for the job, workers obtained from a task pool, and so on.
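To make that concrete, here is a minimal sketch of a hand-rolled parallel
sum written with plain Ada tasks. Everything in it is invented for
illustration: the worker count is hard-wired, the data is split into equal
chunks with no load balancing, and the reduction is done sequentially at
the end. A library hides all of this plumbing.

with Ada.Text_IO;

procedure Manual_Parallel_Sum is

   type Real_Array is array (Positive range <>) of Long_Float;

   Data : constant Real_Array (1 .. 100_000) := (others => 1.0);

   --  Hard-wired worker count; a library would size this from the
   --  number of available cores.
   Num_Workers : constant := 4;

   --  One result slot per worker, so the partial sums are race-free.
   Partial : array (1 .. Num_Workers) of Long_Float := (others => 0.0);

   task type Worker is
      entry Start (Id : Positive);
   end Worker;

   task body Worker is
      My_Id : Positive;
   begin
      accept Start (Id : Positive) do
         My_Id := Id;
      end Start;

      declare
         Chunk : constant Positive := Data'Length / Num_Workers;
         First : constant Positive := Data'First + (My_Id - 1) * Chunk;
         --  The last worker absorbs any leftover elements.
         Last  : constant Positive :=
           (if My_Id = Num_Workers then Data'Last else First + Chunk - 1);
         Sum   : Long_Float := 0.0;
      begin
         for I in First .. Last loop
            Sum := Sum + Data (I);
         end loop;
         Partial (My_Id) := Sum;
      end;
   end Worker;

   Total : Long_Float := 0.0;

begin
   declare
      Workers : array (1 .. Num_Workers) of Worker;
   begin
      for I in Workers'Range loop
         Workers (I).Start (I);
      end loop;
   end;  --  Leaving the block waits for all workers to terminate.

   --  Sequential reduction of the per-worker results.
   for P of Partial loop
      Total := Total + P;
   end loop;

   Ada.Text_IO.Put_Line ("Sum =" & Long_Float'Image (Total));
end Manual_Parallel_Sum;

That is roughly fifty lines to parallelize a three-line loop, and it still
does none of the adaptive things listed above.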
Having a library that does all this work for you makes sense, so that the
programmer can focus on the algorithm without having to think so much about
the parallelism.

The other alternative is to build more smarts into the compiler, so that it
can generate parallelism implicitly. Ada compilers are already able to
parallelize some things automatically, and I believe some of them (all of
them?) do. Someone posted an example a year or so ago of a loop in Ada that
GNAT could optimize to use the cores. It may be that the parallelism was
implemented as vectorization in the GCC backend, and that the compiler was
taking advantage of hardware parallelism instructions rather than a
software, thread-based approach.

But there are limits to what the compiler can safely parallelize. It needs
to ensure that it does not introduce data races, for instance, which could
cause an otherwise correct sequential program to fail disastrously. In our
HILT paper from last October, we presented the notion of a Global aspect
that can be applied to subprograms and identifies their dependencies on
global data. It is an extension of the Global aspect associated with SPARK.
We also proposed a similar aspect that identifies subprograms that are
potentially blocking. If the compiler can statically tell which subprograms
have unsafe dependencies on global data, or can tell that no such
dependencies exist, then it should be able to implicitly parallelize a loop
that includes calls to them. Without that information statically available,
the compiler cannot safely inject parallelism; it has to play it safe and
generate sequential code.
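As a rough illustration of the kind of annotation I mean, here is a sketch
that borrows the Global aspect syntax of SPARK 2014. The package and the
subprograms are invented for the example, and whether a given compiler
actually exploits the annotations for parallelization is precisely the open
question:

package Imaging is

   type Count_Array is array (0 .. 255) of Natural;

   Histogram : Count_Array := (others => 0);

   --  Global => null asserts that Filter neither reads nor writes any
   --  global state, so calls in distinct loop iterations cannot
   --  interfere through globals.
   function Filter (Pixel : Natural) return Natural
     with Global => null;

   --  This declaration admits a read-write dependence on Histogram;
   --  iterations calling Tally may conflict, so a compiler must assume
   --  the worst and keep such a loop sequential.
   procedure Tally (Pixel : Natural)
     with Global => (In_Out => Histogram);

end Imaging;

Given the Global => null on Filter, a loop such as

   for I in Image'Range loop
      Image (I) := Filter (Image (I));
   end loop;

(where Image stands for any array of pixels) writes a distinct element in
each iteration and touches no shared globals, so a compiler armed with that
information could, in principle, generate parallel code for it.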
If we get into the realm of having the compiler generate the parallelism
implicitly, then the underlying worker does not necessarily need to be a
task. It could be, but if the compiler can make use of some lighter-weight
mechanism, it is free to do so, as long as the semantic effect of the
parallelism is the same.

>> So the availability of a parallelism library shouldn't really affect the
>> way one structures their program into a collection of tasks.
>> I find such a library is useful when one wants to improve the execution
>> time of one or some of the tasks in the application where performance is
>> not adequate. Tasks and parallelism libraries can complement each other
>> to achieve the best of both worlds.
>
> I am sorry to disagree: the very existence of a parallelism library shows
> the inability of the current Ada technology to deal directly with
> parallelism inside the Ada language. I really think this is due to the
> weakness of current compilers, but if there are also problems inside the
> language, they should be addressed (like the Ravenscar restrictions that
> allowed predictable tasking, or special constructs to express parallelism,
> or "aspects" to indicate that a task should be run on a GPU...). These
> should be only a few features of Ada 202X.

I don't think we are disagreeing here. I was only mentioning that for
non-parallel concurrency, or for coarser-grained parallelism, tasks can
still be used in much the same way they are used for concurrency on a
single core.

We are hoping that some new aspects and syntax can be considered for Ada
202x, but if that doesn't get the support needed for standardization, then
one can either ask one's compiler vendor for implementation-defined,
non-portable support, or resort to libraries such as Paraffin, which
actually are portable. I have used Paraffin with both GNAT and the ICC Ada
compiler.

>> My understanding is that GNAT generally maps tasks to OS threads on a
>> one-to-one basis, but as others have pointed out, there may be
>> configurations where other mappings are also available.
>
> I could understand that a library-level task (i.e. a task declared
> immediately in a package that is at library level) be mapped to an OS
> thread, but a simple local task should definitely not.

Why not?

> And even that is a simplification since, as you pointed out, there is
> often no use in creating more kernel threads than the number of available
> CPUs.

It depends. For example, if those threads block, then while they are
blocked the cores are not being used, so for such cases it is actually
beneficial to have more threads than there are CPUs.

>> My understanding also is that at one time, GNAT had an implementation
>> built on top of FSU threads developed at Florida State University by
>> Ted Baker. This implementation ran all tasks under one OS thread.
>> [...] The FSU thread implementation gives you concurrency by allowing
>> tasks to execute independently from each other, using some preemptive
>> scheduling model to shift the processor between the multiple tasks of an
>> application.
>
> The solution of all tasks under one kernel thread is good for
> monoprocessors, and since user-level threads are lightweight compared to
> kernel threads, it was acceptable to map a task to a thread.
> But with multiple cores, we need all tasks running on a pool of kernel
> threads, one thread per core. And I suppose that when multicores came, it
> was considered easier to drop the FSU implementation and simply map one
> task to a kernel thread. But doing this is an oversimplification that
> gives poor performance for pure parallel computing, and it gave rise to
> the need for parallelism libraries! (Not to mention the problem of GPUs,
> which are commonly used for highly demanding computations and are not
> supported by GNAT...)
>
> What we need now is a new implementation of tasking in GNAT, able to
> treat local tasks as jobs.

I'm not convinced here. In Paraffin, I have several versions of the
libraries: some use task pools where worker tasks are started beforehand,
and some create local tasks on the fly. I was quite surprised at the good
performance of creating local tasks on the fly on the platforms I've tried
(Windows, Linux, and Android).

Brad

> Regards,
>
> Vincent