From mboxrd@z Thu Jan 1 00:00:00 1970 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00 autolearn=ham autolearn_force=no version=3.4.4 X-Google-Thread: 103376,8e11100f675ea2df X-Google-NewGroupId: yes X-Google-Attributes: gida07f3367d7,domainid0,public,usenet X-Google-Language: ENGLISH,ASCII-7-bit X-Received: by 10.180.86.201 with SMTP id r9mr12391119wiz.4.1357166640911; Wed, 02 Jan 2013 14:44:00 -0800 (PST) Path: l12ni279469wiv.1!nntp.google.com!fu-berlin.de!uni-berlin.de!individual.net!not-for-mail From: Niklas Holsti Newsgroups: comp.lang.ada Subject: Re: asynchronous task communication Date: Thu, 03 Jan 2013 00:43:58 +0200 Organization: Tidorum Ltd Message-ID: References: <1c2dnd5E6PMDR33NnZ2dnUVZ_sednZ2d@earthlink.com> <50e18094$0$6583$9b4e6d93@newsspool3.arcor-online.net> <7NednS4s2oukfXzNnZ2dnUVZ_oadnZ2d@earthlink.com> <7cudnYloBfQDw3_NnZ2dnUVZ_rKdnZ2d@earthlink.com> Mime-Version: 1.0 X-Trace: individual.net Q7WcPXTZFS+0KqGyQ+PgDg0o4MU2k5BXAVFQjcpG74LHZwlpaQLTtPSbGq/cMMPkqc Cancel-Lock: sha1:n56HLIW7E9gKMRM5c7LQxz+FMxs= User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:17.0) Gecko/17.0 Thunderbird/17.0 In-Reply-To: <7cudnYloBfQDw3_NnZ2dnUVZ_rKdnZ2d@earthlink.com> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Date: 2013-01-03T00:43:58+02:00 List-Id: On 13-01-01 05:51 , Charles Hixson wrote: > On 12/31/2012 01:09 PM, Niklas Holsti wrote: >> On 12-12-31 20:52 , Charles Hixson wrote: >>> Thank you for your answer. >>> >>> The problem with that solution is that I've been advised that I >>> shouldn't have many more tasks than available cores, >> >> IMO that advice is useful only if your application logically needs fewer >> tasks than you have cores, but you want to increase the number of tasks >> in order to have more cores running in parallel .... >> But if your application logically needs more tasks than there are cores, >> because of the way tasks communicate and interact, there is absolutely >> no reason to limit the number of tasks to the number of cores, or >> anything related to the number of cores. >> > My application logically "needs" thousands or millions of tasks. Hmm. I'm not sure that I agree with you, more below. > But this is clearly impractical, so I'm trying to work around it. A few thousands could be practical, but I agree that a million tasks is not. Even if the scheduling could work, the stacks will need a huge amount of RAM. > The application is clearly going to be "communication bound". I'm not sure about that, either... > OTOH, even > with millions of separate cores, I'd still need asynchronous communication. > > FWIW, you can think of what I'm doing is a neural network simulation. > There are reasons why that isn't quite accurate, but it's close. And > each message is likely (but not certain) to get a reply and may also > cause further messages to be sent on. That sounds more like discrete event simulation, simulating a network of interacting automata with discrete states and state transitions. It doesn't sound so much like an analog neural network. > Occasionally new nodes will be > added to the net. So logically, each "cell" should be a separate task, There is always a choice of two representations for an automaton: as an algorithm (i.e. a task body), or as a data object. The former represents the automaton's state as a point in the algorithm (i.e. program counter + call stack + local variables), the latter as a state identifier (an enumeration) and perhaps some additional variables (if the automaton model has this concept). If you have several automata, and represent them as algorithms, and the automata change states at different, scattered times, then it is almost necessary to implement them as tasks -- independently progressing threads of control. In your case, I suspect (but of course I'm not sure) that the logical states of an individual automaton (a network node) are not complex enough to merit representing as an algorithm, so the representation as a data object seems equally valid. Which is why I'm not sure that your application "needs" millions of tasks, even logically (but this logic is subjective). > but even so asynchronous communication is needed to avoid deadlock. Very likely, if the network nodes would be implemented as tasks. But this depends on how your simulation handles time. Do you have a global concept of time-step or cycle, in which all network nodes progress by one elementary step or action, or perhaps stay in the same state if they receive no messages? Or do all nodes run "in parallel" with arbitrary relative speeds? Another question: do you expect the whole network to converge to some stable state (fixed point), as for example in the iterative solution of a partial differential equation? Or do you expect the network to continue working and changing, and the interesting result is this dynamic activity rather than some stable final state? In the former case (iteration to stability), I would expect (by analogy with PDEs) that the precise relative execution speeds of the network nodes or cells is less important, and could even be pseudo-random, as long as all nodes make positive progress. In the latter case (iteration to simulate dynamic behaviour) I assume you need some global notion of time and cycles. > Since I can't model cells as tasks, I'm planning on modeling them as > protected variables. Even this may be too limiting, because of the > requirement for asynchronous communication. Particularly since > protected variables are passive, rather than active, so each one is > going to need to wait for a task to activate it. It sounds like each cell implements more or less the same automaton (state transitions), assuming that all the protected objects (variables) are instances of the same protected type. If so, it seems to me that this system is a good candidate for the "worker task pool" approach. A "job" would be the execution of one state transition (message reception?) for one cell. Would this kind of design make some sense: - Define a record type to represent the state of one cell (including its connections to other cells). This does not have to be a protected type. - Create as many worker tasks as there are processor cores. - Allocate a subset of the cells to each worker task. - Each worker task executes the state transitions and actions of all the cells allocated to this task. The task maintains a queue of messages to these cells (a "work list"). - When a cell sends a message to another cell allocated to the same task, the message is simply appended to the message queue of that task. The queue does not need to be protected against concurrent access, because this queue is accessed only by this task. - When a cell sends a message to another cell allocated to another task, the message is appended to a queue of "incoming" messages, which is a protected object associated with that other task. - Each worker tasks takes and processes messages from its "incoming" queue at suitable times, depending on the presence or absence of a global time cycle. Since all communication is internal, the only reason for a worker task to be idle is if it happens that none of its cells has received any message, which seems unlikely to happen if there are millions of cells. So this design would be computation-bound, not communication-bound. If the incoming-message queues become a contention bottleneck, there could be a separate incoming-message queue for each pair of (sender task, receiver task), or perhaps a lock-free queue implementation would help. > But this appears to be > quite likely to interfere with normal execution. The classic answer is > to have two copies of the network and to cycle through cone copying it > and state changes into the second. Then, at the end of a cycle, reverse > it. This requires the whole thing to be copied in each cycle, even the > parts that haven't changed in this cycle. You can put the two copies of the state in each record object that represents a cell, together with a per-cell indication of which copy is "current". If the cell does not change its state, the current-state indicator is unchanged; nothing need be copied. If the cell changes its state, some data are changed, but at least the old and new states are close to each other, and perhaps in the same cache lines, or at least in the same VM page, which should reduce the cost of the copy. -- Niklas Holsti Tidorum Ltd niklas holsti tidorum fi . @ .