* Re: Threadpool with priority version 1.1 ...
  From: Georg Bauhaus @ 2010-03-24 14:55 UTC (permalink / raw)

Dmitry A. Kazakov schrieb:
> how the proposed algorithms map onto the
> Ada tasking model, especially taking into account that Ada tasking
> primitives are higher level than the ones known in other languages.

As a side note: it seems anything but easy to explain the idea of a
concurrent language -- not a library, and not CAS things either -- as
the means to support the programmer who wishes to express concurrency.
Concurrency is not seen as one of the modes of expression in language X.
Rather, concurrency is seen as an effect of interweaving concurrency
primitives and some algorithm.

What can one do about this?

^ permalink raw reply  [flat|nested] 14+ messages in thread
* Re: Threadpool with priority version 1.1 ...
  From: Warren @ 2010-03-24 16:40 UTC

Georg Bauhaus expounded in news:4baa27f2$0$6770$9b4e6d93
@newsspool3.arcor-online.net:

> Dmitry A. Kazakov schrieb:
>> how the proposed algorithms map onto the
>> Ada tasking model, especially taking into account that Ada tasking
>> primitives are higher level than the ones known in other languages.
>
> As a side note: it seems anything but easy to explain
> the idea of a concurrent language, not a library, and
> not CAS things either, as the means to support the programmer
> who wishes to express concurrency.
> Concurrency is not seen as one of the modes of expression
> in language X. Rather, concurrency is seen as an effect
> of interweaving concurrency primitives and some algorithm.
>
> What can one do about this?

I thought the Cilk project was rather interesting in its attempt to
make C (and C++) more parallel, to take advantage of multi-core CPUs.
But the language still requires that the programmer express the
parallel aspects of the code with some simple language extensions.

As cores eventually move to 128+-way designs, this obviously needs to
change if we are to take full advantage of shortened elapsed times. I
think this might require a radically new high-level language.

Another barrier I see is the high cost of starting a new thread and
allocating its stack space. I was disappointed to learn that the Cilk
compiler uses multiple stacks in the same way that any pthread
implementation would. If a single-threaded version of the program needs
S bytes of stack, a P-cpu threaded version would require P * S bytes of
stack. They do get tricky with stack frames when they perform "work
stealing" on a different cpu, but that is as close as they get to a
cactus stack.

Somehow you gotta make thread startup and shutdown cheaper. The only
other option is to create a pool of re-usable threads. But in my mind,
the optimizing compiler is probably in the best place to make parallel
optimizations for short runs of code.

Warren
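[Editorial aside: the "pool of re-usable threads" Warren mentions can be
sketched in a few lines. Python is used here only as a language-neutral
illustration, and the name `SimplePool` is made up for this sketch; it is
not from any post in the thread.]

```python
import threading
import queue

class SimplePool:
    """Workers are started once and re-used; each submitted task costs
    only a small queue item, not a fresh thread and stack."""

    def __init__(self, workers):
        self.tasks = queue.Queue()
        # Pay the thread-startup cost exactly once, up front.
        self.threads = [threading.Thread(target=self._run, daemon=True)
                        for _ in range(workers)]
        for t in self.threads:
            t.start()

    def _run(self):
        while True:
            func, args, out = self.tasks.get()
            out.append(func(*args))   # list.append is atomic in CPython
            self.tasks.task_done()

    def submit(self, func, *args):
        out = []                      # holds the result after join()
        self.tasks.put((func, args, out))
        return out

    def join(self):
        self.tasks.join()             # wait for all queued tasks

pool = SimplePool(4)
results = [pool.submit(lambda x: x * x, i) for i in range(10)]
pool.join()
```

In practice one would reach for the standard library's
`concurrent.futures.ThreadPoolExecutor`, which implements the same
worker-reuse idea with proper futures and shutdown handling.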
* Ada parallelism (was: Re: Threadpool with priority version 1.1 ...)
  From: Georg Bauhaus @ 2010-03-24 18:27 UTC

Warren schrieb:
> I thought the Cilk project was rather interesting in
> their attempt to make C (and C++) more parallel
> to take advantage of multi-core cpus. But the language
> still requires that the programmer program the parallel
> aspects of the code with some simple language enhancements.
>
> As cores eventually move to 128+-way cores, this needs
> to change to take full advantage of shortened elapsed
> times, obviously. I think this might require a radical
> new high-level language to do it.

Or efficient multicore Ada will have to go radically back to the
roots ;-)

How did they achieve efficient execution on massively parallel
processors? HPF? Occam? What do Sisal implementations do?
* Re: Ada parallelism (was: Re: Threadpool with priority version 1.1 ...)
  From: Warren @ 2010-03-24 20:04 UTC

Georg Bauhaus expounded in news:4baa5987$0$6762$9b4e6d93
@newsspool3.arcor-online.net:

>> As cores eventually move to 128+-way cores, this needs
>> to change to take full advantage of shortened elapsed
>> times, obviously. I think this might require a radical
>> new high-level language to do it.
>
> Or efficient multicore Ada will have to go radically back to
> the roots ;-)

I do believe that an Ada compiler probably has enough internal
information to manage something along this line. Some work would also
have to be done to deal with explicitly coded tasking.

> How did they achieve efficient execution on
> massively parallel processors? HPF? Occam? What do Sisal
> implementations do?

I don't know about them, but if any of them are "interpreted", then
execution-time semantics would be possible.

Warren
* Re: Ada parallelism
  From: Dmitry A. Kazakov @ 2010-03-25 8:24 UTC

On Wed, 24 Mar 2010 20:04:18 +0000 (UTC), Warren wrote:

> I do believe that an Ada compiler probably has enough
> internal info to manage something along this line. Some
> work would also have to be done to deal with explicitly
> coded tasking.
>
>> How did they achieve efficient execution on
>> massively parallel processors? HPF? Occam? What do Sisal
>> implementations do?
>
> I don't know about them, but if any of them are "interpreted",
> then there would be execution time semantics possible.

Occam was compiled, but its concurrency model was extremely low level
(communication channels) and heavy-weight compared to Ada.

I believe that Ada's tasks could really shine on massively parallel
processors if supported by the compiler, especially because Ada's
parameter passing model is flexible enough to support both memory
sharing and marshaling. Well, there is a problem with tagged types,
which are by-reference; this must be fixed (e.g. by providing "tagged"
types without tags, and thus copyable ones).

--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de
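[Editorial aside: the sharing-versus-marshaling distinction Kazakov draws
can be illustrated outside Ada. The sketch below is in Python purely for
brevity; `pickle` stands in for whatever wire format a distributed entry
call would actually use, and the function names are invented for this
example.]

```python
import pickle
import threading

def call_shared(entry, obj):
    """Same address space: pass a reference; the callee mutates
    the caller's object directly."""
    t = threading.Thread(target=entry, args=(obj,))
    t.start()
    t.join()

def call_marshalled(entry, obj):
    """No shared memory: serialize at the call site, deserialize on
    the 'remote' side. The callee works on an independent copy, as a
    remote entry call would."""
    wire = pickle.dumps(obj)      # marshal
    copy = pickle.loads(wire)     # unmarshal
    entry(copy)
    return copy

def entry(params):                # stand-in for a task entry body
    params["done"] = True
```

The point is that the same call syntax can hide either mechanism; which
one is legal depends on whether the parameter type is copyable, which is
exactly where by-reference tagged types get in the way.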
* Re: Ada parallelism
  From: Robert A Duff @ 2010-03-25 13:44 UTC

"Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes:

> I believe that Ada's tasks on massively parallel processors could really
> shine if compiler supported. Especially because Ada's parameter passing
> model is so flexible to support both memory sharing and marshaling. Well,
> there is a problem with tagged types which are by-reference, this must be
> fixed (e.g. by providing "tagged" types without tags, and thus copyable).

Tagged types are passed by copy when doing a remote procedure call. By
"by copy" I mean marshalling/unmarshalling, which of course involves
copying the data. Or did you mean something else?

- Bob
* Re: Ada parallelism
  From: Dmitry A. Kazakov @ 2010-03-25 14:09 UTC

On Thu, 25 Mar 2010 09:44:11 -0400, Robert A Duff wrote:

> Tagged types are passed by copy when doing a remote procedure call.
> By "by copy" I mean marshalling/unmarshalling, which of course
> involves copying the data. Or did you mean something else?

Yes, this is what I meant. An entry call to a task running on another
processor (with no shared memory) should marshal.

--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de
* Re: Threadpool with priority version 1.1 ...
  From: Maciej Sobczak @ 2010-03-24 21:46 UTC

On 24 Mar, 17:40, Warren <ve3...@gmail.com> wrote:

> Another barrier I see to this is the high cost of
> starting a new thread and stack space allocation.
> Somehow you gotta make thread startup and shutdown
> cheaper.

Why?

The cost of startup/shutdown and the number of cores you have are
completely orthogonal. I see no problem in starting N threads at
initialization time, using them throughout the application's lifetime,
and shutting them down at the end (or never). The cost of these
operations is irrelevant: make it 10x what it is and I will still be
fine.

If your favorite programming model involves lots of short-running
threads that have to be created and torn down repeatedly, then it has
no relation to multicore. It is just a bad resource usage pattern.

--
Maciej Sobczak * http://www.inspirel.com
YAMI4 - Messaging Solution for Distributed Systems
http://www.inspirel.com/yami4
* Re: Threadpool with priority version 1.1 ...
  From: Warren @ 2010-03-25 17:21 UTC

Maciej Sobczak expounded in news:7794a413-34e9-4340-abcc-a6568246fc38
@h18g2000yqo.googlegroups.com:

> I see no problem in starting N threads at the initialization time, use
> them throughout the application lifetime and then shut down at the end
> (or never).

Yes, I am aware of that option.

> If your favorite programming model involves lots of short-running
> threads that have to be created and torn down repeatedly, then it has
> no relation to multicore. It is just a bad resource usage pattern.

That's a rather sweeping statement to make ("bad resource usage
pattern"). Unless there are leaps in language design, I believe that is
what you will mostly get from automatic parallel thread generation. As
humans we tend to think in sequential steps, and consequently code that
way. The media seem to suggest that we shouldn't have to change our
mindset to do parallelism (i.e. the compilers should arrange it for
us). Certainly that would make a wish-list item.

I don't know much about Intel's hyper-threads, but I believe it was one
approach to doing this (presumably largely without compiler help). So I
can't buy into your conclusion on that.

Warren
* Re: Threadpool with priority version 1.1 ...
  From: Warren @ 2010-03-25 17:30 UTC

Maciej Sobczak expounded in news:7794a413-34e9-4340-abcc-a6568246fc38
@h18g2000yqo.googlegroups.com:

> I see no problem in starting N threads at the initialization time, use
> them throughout the application lifetime and then shut down at the end
> (or never)...

I forgot to mention that the disadvantage of this approach is that you
have to "pre-allocate" stack space for each thread (whether a default
amount or a specifically designed amount). If you used a true cactus
stack, this would not be an issue. But with a traditional thread, you
could choose stack requirements at the point of thread creation -- not
so if you create them all up front. So there are downsides to this
approach.

Warren
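[Editorial aside: "choose stack requirements at the point of thread
creation" corresponds to `pthread_attr_setstacksize` in POSIX; Python
exposes the same knob as `threading.stack_size()`. A minimal sketch --
the 256 KiB figure is illustrative, and platforms impose minimums
(typically 32 KiB) and alignment constraints:]

```python
import threading

def spawn_with_stack(func, stack_bytes):
    """Create one thread with an explicitly chosen stack size.
    threading.stack_size() applies to threads created *afterwards*,
    so set it, spawn, then restore the previous setting."""
    old = threading.stack_size(stack_bytes)  # returns prior value
    try:
        t = threading.Thread(target=func)
        t.start()
    finally:
        threading.stack_size(old)  # later threads get the old default
    return t

results = []
t = spawn_with_stack(lambda: results.append("ran"), 256 * 1024)
t.join()
```

Pre-starting all workers forfeits exactly this per-thread choice: every
pooled thread must be given a stack sized for the worst task it might
ever run.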
* Re: Threadpool with priority version 1.1 ...
  From: Dmitry A. Kazakov @ 2010-03-26 8:19 UTC

On Thu, 25 Mar 2010 17:30:05 +0000 (UTC), Warren wrote:

> I forgot to mention that the disadvantage of this approach is that
> you have to "pre-allocate" stack space for each thread (whether
> by default amount or by a specific designed amount).

BTW, if this approach worked for an application, it should also work
for the OS. E.g. why not start all threads for all not-yet-running
processes upon booting? If that worked, the effective observed startup
time of a thread would be 0, and thus there would be nothing to care
about.

--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de
* Re: Threadpool with priority version 1.1 ...
  From: Maciej Sobczak @ 2010-03-26 9:30 UTC

On 26 Mar, 09:19, "Dmitry A. Kazakov" <mail...@dmitry-kazakov.de>
wrote:

> BTW, if this approach worked for an application, it should also do for
> the OS,

It is true that obtaining resources up-front requires more careful
analysis of the problem being solved, and is not always possible. The
difference between an application and the OS is in the amount of
knowledge about what the software will do, and applications tend to
know more than the OS in this respect. That is why it is more realistic
to have applications allocating their resources during the
initialization phase than to see that at the OS level.

I'm not a big fan of programs that allocate and deallocate the same
resource repeatedly: this is an obvious candidate for caching and
object reuse, where the cost of allocation is amortized. Fortunately,
it is not even necessary for user code to do that -- think about a
caching memory allocator; there are analogies. And the language
standard does not prevent implementations from reusing physical
threads, if they are used as the implementation foundation for tasks.

--
Maciej Sobczak * http://www.inspirel.com
YAMI4 - Messaging Solution for Distributed Systems
http://www.inspirel.com/yami4
* Re: Threadpool with priority version 1.1 ...
  From: Warren @ 2010-03-26 19:35 UTC

Maciej Sobczak expounded in
news:7b059d0f-791b-4ac9-bf64-c50448ec99f7@b30g2000yqd.googlegroups.com:

> The difference between application and OS is in the amount of
> knowledge about what the software will do and applications tend to
> know more than OS in this aspect.

Yes.

> That is why it is more realistic to have applications allocating their
> resources during initialization phase than to see that at the OS
> level.

I would generally agree with that, unless the cost of resource
management were cleverly reduced.

> I'm not a big fan of programs that allocate and deallocate the same
> resource repeatedly - this is an obvious candidate for caching and
> object reuse, where the cost of allocation is amortized.

As a general principle this is right. But memory is another resource
that sometimes needs careful management. With only one thread, you have
a heap growing up toward the stack and a stack growing down toward the
heap. Either the stack or the heap can be huge (potentially, at least),
as long as both are not huge at the same time (overlapping).

The moment you add one additional thread, you've drawn the line in the
sand for the lowest existing stack, putting a smaller limit on it. This
disadvantage is probably acceptable for most threaded programs, but
perhaps not for a video rendering program that might hog resources on
both the heap and stack sides at differing times. In the end, the
application programmer must plan this out, but it is a limitation I
dislike about our current execution environments. I suppose just
increasing the size of your VM address space postpones the problem
until we hit limits again. ;-)

> And the language standard does not prevent implementations from
> reusing physical threads, if they are used as implementation
> foundations for tasks.

From an efficiency point of view, this is all well and good. But if you
want maximum dynamic allocation of heap+stack, then you might prefer
fewer (if any) pre-allocated threads (implying additional stacks).

Warren
* Re: Threadpool with priority version 1.1 ...
  From: Dmitry A. Kazakov @ 2010-03-25 8:39 UTC

On Wed, 24 Mar 2010 15:55:45 +0100, Georg Bauhaus wrote:

> As a side note: it seems anything but easy to explain
> the idea of a concurrent language, not a library, and
> not CAS things either, as the means to support the programmer
> who wishes to express concurrency.

This is a strange claim. A library cannot express concurrency; I mean
that procedural decomposition cannot. There is some magic added which
says that the procedure is called in the context of a thread or process
etc., for neither is a part of a non-concurrent language. So the idea
of a scheduled item with a context that is in part independent of the
rest and in part shares things with other scheduled items needs a lot
of words to explain.

> Concurrency is not seen as one of the modes of expression
> in language X.

That is a design fault of the corresponding language. You will need to
specify the semantics of shared objects in the presence of concurrency
anyway. How would you do this *outside* the language?

> Rather, concurrency is seen as an effect
> of interweaving concurrency primitives and some algorithm.

No, concurrent algorithms are quite different from sequential ones. The
same can be said about objects (in the context of OOP).

--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de
end of thread, other threads:[~2010-03-26 19:35 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <21e6697e-fd7c-4c5e-93dc-8d894449b5e6@f8g2000yqn.googlegroups.com>
     [not found] ` <ff3671a8-cf19-4cee-8b71-305bb6b1e9c1@l25g2000yqd.googlegroups.com>
     [not found]   ` <4ba9e189$0$6886$9b4e6d93@newsspool2.arcor-online.net>
     [not found]     ` <1id5xnuz0x892$.1odbic5ppiv07.dlg@40tude.net>
2010-03-24 14:55       ` Threadpool with priority version 1.1 Georg Bauhaus
2010-03-24 16:40         ` Warren
2010-03-24 18:27           ` Ada parallelism (was: Re: Threadpool with priority version 1.1 ...) Georg Bauhaus
2010-03-24 20:04             ` Warren
2010-03-25  8:24               ` Ada parallelism Dmitry A. Kazakov
2010-03-25 13:44                 ` Robert A Duff
2010-03-25 14:09                   ` Dmitry A. Kazakov
2010-03-24 21:46           ` Threadpool with priority version 1.1 Maciej Sobczak
2010-03-25 17:21             ` Warren
2010-03-25 17:30             ` Warren
2010-03-26  8:19               ` Dmitry A. Kazakov
2010-03-26  9:30                 ` Maciej Sobczak
2010-03-26 19:35                   ` Warren
2010-03-25  8:39         ` Dmitry A. Kazakov
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox