* asynchronous task communication
From: Charles Hixson @ 2012-12-31 0:16 UTC (permalink / raw)

Is it possible to call an entry point of a task from another task and not wait for the task to finish?

I'm thinking of an entry point that would have only _in_ parameters, and which is not expected to fail, but the task called might be busy, so I'd like to just queue the call and go on to something else. This doesn't appear to be what an "asynchronous select" does, though I'll admit I'm not sure, as I can't tell why they call it asynchronous; "timed" seems more accurate than "asynchronous".

What I'm after is sort of like sending a letter from one task to another. The sender doesn't need to wait for the receiver to accept the message (though, ideally, there would also be a "return receipt requested" option).

The only alternative that I've come up with is to have each task hold an access variable to a protected type instance. This can be done, but it makes the control in other parts of the program trickier.

^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: asynchronous task communication
From: Simon Wright @ 2012-12-31 9:04 UTC (permalink / raw)

Charles Hixson <charleshixsn@earthlink.net> writes:

> Is it possible to call an entry point of a task from another task and
> not wait for the task to finish?
>
> I'm thinking of an entry point that would have only _in_ parameters,
> and which is not expected to fail, but the thread called might be
> busy, so I'd like to just queue it and go on to something else. This
> doesn't appear to be what an "asynchronous select" does, though I'll
> admit I'm not sure, as I can't tell why they call it
> asynchronous...timed seems more reasonable than asynchronous.

Requeue[1] (with abort)?

> What I'm after is sort of like sending a letter from one task to
> another. The sender doesn't need to wait for the receiver to accept
> the message (though, ideally, there would also be a "return receipt
> requested" option).

Not quite so straightforward.

> The only alternative that I've come up with is to have each task have
> an access variable to a protected type instance. This can be done,
> but it makes the control in other parts of the program trickier.

If you can use Ada 2012, you could try a synchronized queue[2, 3] just for these two tasks.

[1] http://www.ada-auth.org/standards/12rm/html/RM-9-5-4.html
[2] http://www.ada-auth.org/standards/12rm/html/RM-A-18-27.html
[3] http://www.ada-auth.org/standards/12rm/html/RM-A-18-28.html
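[Editor's sketch] An Ada 2012 synchronized queue decouples sender and receiver in exactly the way the original post asks for: Enqueue stores the message and returns without waiting for the receiver to accept it. A minimal sketch, assuming an Integer payload and invented names (Mailbox, Receiver):

```ada
with Ada.Containers.Synchronized_Queue_Interfaces;
with Ada.Containers.Unbounded_Synchronized_Queues;
with Ada.Text_IO;

procedure Mailbox_Demo is

   package Message_Interface is
     new Ada.Containers.Synchronized_Queue_Interfaces (Element_Type => Integer);

   package Message_Queues is
     new Ada.Containers.Unbounded_Synchronized_Queues
       (Queue_Interfaces => Message_Interface);

   Mailbox : Message_Queues.Queue;

   task Receiver;

   task body Receiver is
      M : Integer;
   begin
      loop
         Mailbox.Dequeue (M);  -- blocks only the receiver, never the sender
         exit when M = 0;
         Ada.Text_IO.Put_Line ("Got" & Integer'Image (M));
      end loop;
   end Receiver;

begin
   Mailbox.Enqueue (1);  -- returns as soon as the element is stored
   Mailbox.Enqueue (2);
   Mailbox.Enqueue (0);  -- sentinel telling the receiver to stop
end Mailbox_Demo;
```

With the unbounded form, Enqueue blocks only for the duration of the internal protected action; the bounded form (RM A.18.28) would instead make the sender wait when the queue is full.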
* Re: asynchronous task communication
From: Simon Wright @ 2012-12-31 11:49 UTC (permalink / raw)

Simon Wright <simon@pushface.org> writes:

> Charles Hixson <charleshixsn@earthlink.net> writes:
>
>> Is it possible to call an entry point of a task from another task and
>> not wait for the task to finish?
>
> Requeue[1] (with abort)?

Oops, not useful. The caller task will still block.
* Re: asynchronous task communication
From: J-P. Rosen @ 2012-12-31 10:50 UTC (permalink / raw)

On 31/12/2012 01:16, Charles Hixson wrote:
> What I'm after is sort of like sending a letter from one task to another.

Well, if you need a mailbox, write a mailbox! It takes only a few lines, using a task or a PO.

-- 
J-P. Rosen
Adalog
2 rue du Docteur Lombard, 92441 Issy-les-Moulineaux CEDEX
Tel: +33 1 45 29 21 52, Fax: +33 1 45 29 25 00
http://www.adalog.fr
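[Editor's sketch] The few-lines mailbox Rosen mentions can be a protected object wrapping a ring buffer: a sender queues only while the box is full, a receiver only while it is empty. All names and the Integer payload below are illustrative, not from the thread:

```ada
package Mailboxes is

   Capacity : constant := 16;
   type Index is mod Capacity;
   type Slot_Array is array (Index) of Integer;

   protected type Mailbox is
      entry Post (Item : in Integer);      -- blocks the sender only when full
      entry Collect (Item : out Integer);  -- blocks the receiver only when empty
   private
      Slots      : Slot_Array;
      Head, Tail : Index   := 0;
      Count      : Natural := 0;
   end Mailbox;

end Mailboxes;

package body Mailboxes is

   protected body Mailbox is

      entry Post (Item : in Integer) when Count < Capacity is
      begin
         Slots (Tail) := Item;
         Tail := Tail + 1;          -- wraps automatically: Index is modular
         Count := Count + 1;
      end Post;

      entry Collect (Item : out Integer) when Count > 0 is
      begin
         Item := Slots (Head);
         Head := Head + 1;
         Count := Count - 1;
      end Collect;

   end Mailbox;

end Mailboxes;
```

A "return receipt requested" option could be built on top by pairing each Post with a second, per-sender mailbox for the reply.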
* Re: asynchronous task communication
From: Georg Bauhaus @ 2012-12-31 12:09 UTC (permalink / raw)

On 31.12.12 01:16, Charles Hixson wrote:
>
> The only alternative that I've come up with is to have each task have an access variable to a protected type instance. This can be done, but it makes the control in other parts of the program trickier.

You could make the letter actively deliver itself.

Just in case you do not want to write a mail box, as suggested by J-P. Rosen.

package Letters is

   task type Receiver is
      entry Knock (Message : String);
   end Receiver;

   task type Letter (Recipient : access constant Receiver) is
      entry Write (Message : String);
   end Letter;

   task type Sender is
   end Sender;

end Letters;

package body Letters is

   type Text_Buffer is access constant String;
   type Letter_Pointer is access Letter;
   Receiver_In_Scope : aliased Receiver;

   task body Receiver is
   begin
      accept Knock (Message : String);
   end Receiver;

   task body Letter is
      Text : Text_Buffer;
   begin
      -- get written, then actively get sent, then be finished
      accept Write (Message : String) do
         Letter.Text := new String'(Message);
      end Write;
      Recipient.Knock (Text.all);
   end Letter;

   task body Sender is

      procedure Send;
      -- prepare a self-delivering letter

      procedure Send is
         Order : Letter_Pointer;
      begin
         Order := new Letter (Recipient => Receiver_In_Scope'Access);
         Order.Write ("Meet me at 10 o'clock!");
      end Send;

   begin
      null; -- ...
      Send;
      null; -- ...
   end Sender;

end Letters;
* Re: asynchronous task communication
From: Charles Hixson @ 2012-12-31 18:52 UTC (permalink / raw)

Thank you for your answer.

The problem with that solution is that I've been advised that I shouldn't have many more tasks than available cores, and that uses 2 of my available 6. True, several clients could use the same postmaster, but it's still too resource intensive.

It's looking more and more like protected objects on the heap are the best answer. I just wasn't sure that a built-in answer didn't exist.

(FWIW, I think I could get away with reducing the postmaster to only using one thread, but that's still too resource intensive. Actually, I'll probably redesign your postmaster to be a single protected object on the heap, as storing a message shouldn't block for long, and reading a message should be even less limiting. I'd go with messages to individual protected objects, but there's a feed-back interaction that would cause everything to lock up.)

OTOH, actual non-blocking (i.e. asynchronous) calls to either tasks or protected items are what I really need. These other things are make-shift kludges. They centralize control that should be distributed. But since the called item will want to send a message back (and possibly onwards), everything freezes up unless the message sending is asynchronous.

On 12/31/2012 04:09 AM, Georg Bauhaus wrote:
> On 31.12.12 01:16, Charles Hixson wrote:
>>
>> The only alternative that I've come up with is to have each task have
>> an access variable to a protected type instance. This can be done, but
>> it makes the control in other parts of the program trickier.
>
> You could make the letter actively deliver itself.
>
> Just in case you do not want to write a mail box, as
> suggested by J-P. Rosen.
* Re: asynchronous task communication
From: Shark8 @ 2012-12-31 20:18 UTC (permalink / raw)

On Monday, December 31, 2012 12:52:22 PM UTC-6, Charles Hixson wrote:
> Thank you for your answer.
>
> The problem with that solution is that I've been advised that I
> shouldn't have many more tasks than available cores, and that uses 2 of
> my available 6. True, several clients could use the same postmaster,
> but it's still too resource intensive.

I'm rather dubious of this advice -- without any reasoning it reeks of the "premature optimization" problem. Besides, unless you're making it for a specific [homogenized] computer platform (I.E. the PlayStation 2) there's no guarantee that the user will have the same set-up... further, it restricts you to your particular computer. (I recently got a new computer, which has a different number of cores than my old one... I would not like to be compelled to recompile my programs because of such a minor difference.)

> It's looking more and more like protected objects on the heap are the
> best answer. I just wasn't sure that a built-in answer didn't exist.

Protected objects are pretty nice, so are the Containers (w/ the generalized for-loop). But if it's something that logically maps well to TASK, there's no reason to avoid TASK -- that is, conform your code to the problem, not to the details of the executing computer.

> OTOH, actual non-blocking (i.e. asynchronous) calls to either threads or
> protected items is what I really need.

Ada can do it; I don't have my books available, but I believe asynchronous select is covered in both:

http://www.amazon.com/Building-Parallel-Embedded-Real-Time-Applications/dp/0521197163/ref=sr_1_2?ie=UTF8&qid=1356984746&sr=8-2&keywords=Ada+Parallel+programming

and

http://www.amazon.com/Concurrent-Real-Time-Programming-Ada-ebook/dp/B001GS6TBO/ref=sr_1_3?ie=UTF8&qid=1356984746&sr=8-3&keywords=Ada+Parallel+programming

Or just go to the RM: http://www.ada-auth.org/standards/12aarm/html/AA-9-7-4.html

> These other things are make-shift kludges. They centralize control that should be distributed.

Hm, then should you be looking at the DSA? -- Or is that a different 'distributed' than you are talking about?
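[Editor's sketch] For reference, the RM 9.7.4 construct looks like this: a triggering alternative (a delay or an entry call) races an abortable part, and the abortable part is abandoned when the trigger completes. This shows why it is called "asynchronous", though, as Simon noted earlier in the thread, it does not make the caller's own entry call non-blocking. The 0.5 s timeout and the loop body are invented for illustration:

```ada
with Ada.Text_IO;

procedure ATC_Demo is
begin
   select
      delay 0.5;  -- triggering alternative: a timeout in this sketch
      Ada.Text_IO.Put_Line ("Timed out; computation abandoned");
   then abort
      loop
         delay 0.01;  -- stand-in for one slice of a long computation
      end loop;
   end select;
end ATC_Demo;
```

The triggering alternative could equally be an entry call on another task or protected object, in which case the computation is abandoned as soon as that call is accepted.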
* Re: asynchronous task communication
From: Niklas Holsti @ 2012-12-31 21:09 UTC (permalink / raw)

On 12-12-31 20:52 , Charles Hixson wrote:
> Thank you for your answer.
>
> The problem with that solution is that I've been advised that I
> shouldn't have many more tasks than available cores,

IMO that advice is useful only if your application logically needs fewer tasks than you have cores, but you want to increase the number of tasks in order to have more cores running in parallel. It can then make sense to replicate some of your tasks (the "pool of worker tasks" idea), so that several identical tasks work in parallel on the same kind of jobs, or to split some of your tasks into smaller tasks that work in parallel on different consecutive steps of a job (the "task pipeline" idea).

As you increase the number of tasks in this way, performance increases because more tasks can be run in parallel, until you reach one of two bounds:

1. new tasks have nothing to do, and just wait for input or for some other task to do something, or

2. all cores are already busy, on average, so new tasks must wait for a core to become free.

Case (1) means that your application is somehow "I/O bound" or "communication bound", while case (2) means that your application is "computation bound". The advice you got applies to case (2).

But if your application logically needs more tasks than there are cores, because of the way tasks communicate and interact, there is absolutely no reason to limit the number of tasks to the number of cores, or anything related to the number of cores. There are applications on single-core processors with hundreds of tasks.

So if you need an extra task, or ten extra tasks, in order to decouple the timing of some other tasks from each other, go ahead.

-- 
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi . @ .
* Re: asynchronous task communication
From: Randy Brukardt @ 2012-12-31 22:15 UTC (permalink / raw)

"Niklas Holsti" <niklas.holsti@tidorum.invalid> wrote in message news:aked87Fpfc1U1@mid.individual.net...
> [...]
> But if your application logically needs more tasks than there are cores,
> because of the way tasks communicate and interact, there is absolutely
> no reason to limit the number of tasks to the number of cores, or
> anything related to the number of cores. There are applications on
> single-core processors with hundreds of tasks.
>
> So if you need an extra task, or ten extra tasks, in order to decouple
> the timing of some other tasks from each other, go ahead.

Right, I touched on this last week. The issue is the amount of CPU load for each task, not the raw number of tasks.

Most tasks spend most of their time blocked (the mailbox tasks surely are in this category). Such tasks don't contribute much to the CPU usage and surely shouldn't be counted as "using up a core". Tasks that are intended to be running most of the time do "use up a core", but even then, you may be better off structuring your program with more tasks.

Someone else touched on it, but premature optimization is a very common evil. And that extends just as much to the use of tasks as it does to anything else.

Randy.
* Re: asynchronous task communication
From: Charles Hixson @ 2013-01-01 3:58 UTC (permalink / raw)

On 12/31/2012 02:15 PM, Randy Brukardt wrote:
> [...]
> Most tasks spend most of their time blocked (the mailbox tasks surely are in
> this category). Such tasks don't contribute much to the CPU usage and surely
> shouldn't be counted as "using up a core". Tasks that are intended to be
> running most of the time do "use up a core", but even then, you may be
> better off structuring your program with more tasks.
>
> Someone else touched on it, but premature optimization is a very common
> evil. And that extends just as much to the use of tasks as it does with
> anything else.

For the application I'm designing, blocking is VERY evil, and also can't be avoided. Think of it as a neural net with feedback circuits. Also, to model it correctly I'd need millions of tasks, and I'd still need asynchronous communication to avoid deadlock.

OTOH, I think I've found a solution: for each cell, implement it as a record of two protected objects, one of which is the cell, as previously conceived, and the other of which is a mailbox. The mailbox would never block for long. I am worried a bit about the increase in RAM requirements, but perhaps that can't be avoided. At least it allows nearly non-blocking communication.
* Re: asynchronous task communication
From: tmoran @ 2013-01-01 4:48 UTC (permalink / raw)

> OTOH, I think I've found a solution:
> for each cell, implement it as a record of two protected objects, one
> of which is the cell, as previously conceived, and the other of which is
> a mailbox. The mailbox would never block for long. I am worried a bit
> about the increase in RAM requirements, but perhaps that can't be
> avoided. At least it allows nearly non-blocking communication.

So this is like an apartment building with each tenant being a cell, and a wall of mailboxes in the lobby? With blocking only occurring when somebody wants to deposit mail in a box where the tenant is in the process of extracting his mail. As opposed to having a concierge who may be busy with Smith when Jones wants something.
* Re: asynchronous task communication
From: Charles Hixson @ 2013-01-01 17:59 UTC (permalink / raw)

On 12/31/2012 08:48 PM, tmoran@acm.org wrote:
> So this is like an apartment building with each tenant being a cell,
> and a wall of mailboxes in the lobby? With blocking only occurring when
> somebody wants to deposit mail in a box where the tenant is in the process
> of extracting his mail. As opposed to having a concierge who may be busy
> with Smith when Jones wants something.

That's the idea.
* Re: asynchronous task communication
From: Charles Hixson @ 2013-01-01 3:51 UTC (permalink / raw)

On 12/31/2012 01:09 PM, Niklas Holsti wrote:
> [...]
> But if your application logically needs more tasks than there are cores,
> because of the way tasks communicate and interact, there is absolutely
> no reason to limit the number of tasks to the number of cores, or
> anything related to the number of cores. There are applications on
> single-core processors with hundreds of tasks.
>
> So if you need an extra task, or ten extra tasks, in order to decouple
> the timing of some other tasks from each other, go ahead.

My application logically "needs" thousands or millions of tasks. But this is clearly impractical, so I'm trying to work around it. The application is clearly going to be "communication bound". OTOH, even with millions of separate cores, I'd still need asynchronous communication.

FWIW, you can think of what I'm doing as a neural network simulation. There are reasons why that isn't quite accurate, but it's close. And each message is likely (but not certain) to get a reply and may also cause further messages to be sent on. Occasionally new nodes will be added to the net. So logically, each "cell" should be a separate task, but even so asynchronous communication is needed to avoid deadlock.

Since I can't model cells as tasks, I'm planning on modeling them as protected variables. Even this may be too limiting, because of the requirement for asynchronous communication. Particularly since protected variables are passive, rather than active, each one is going to need to wait for a task to activate it. But this appears to be quite likely to interfere with normal execution. The classic answer is to have two copies of the network and to cycle through one, copying it and state changes into the second. Then, at the end of a cycle, reverse it. This requires the whole thing to be copied in each cycle, even the parts that haven't changed in this cycle. I'm trying to avoid that, but I may not be able to unless I can break this requirement for blocking.

P.S.: Yes, there are slick ways to minimize the amount of copying needed. If I must take that approach, I'll be using dirty flags, deltas, changed-in-cycle-# etc. But that's NOT an approach that is at all appealing, and it doesn't model the process I'm trying to simulate at all well. Maybe I'll need to do it as a single-threaded application.

(The more I look at a central postmaster, the less desirable it looks. But perhaps I can have a postmaster attached to each cell, or maybe the cell should be attached to the same record as the mailbox. Then for each cell there'd be an item on the heap that had two protected items: 1. the mailbox and 2. the cell, so sending a message to the cell wouldn't block on the cell being busy, and also wouldn't block on another cell being sent a message at the same time. Perhaps that's the way the thing I was hoping was built-in would have been implemented.)
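[Editor's sketch] The per-cell design described above (a heap-allocated record pairing a protected mailbox with the protected cell state, so that posting never waits on a busy cell) might look like this. All type and subprogram names, the Message layout, and the buffer size are invented for illustration:

```ada
package Cells is

   type Message is record
      Source : Natural;
      Value  : Float;
   end record;

   type Message_List is array (1 .. 32) of Message;

   protected type Mailbox is
      --  A protected procedure never queues the caller; it only holds
      --  the lock for the (short) duration of the action.
      procedure Post (M : in Message; OK : out Boolean);
      procedure Drain (Msgs : out Message_List; Last : out Natural);
   private
      Pending : Message_List;
      Count   : Natural := 0;
   end Mailbox;

   protected type Cell_State is
      procedure Integrate (M : in Message);  -- quick update, no computation
      function Activation return Float;
   private
      Level : Float := 0.0;
   end Cell_State;

   type Cell is record
      Inbox : Mailbox;
      State : Cell_State;
   end record;

   type Cell_Access is access Cell;

end Cells;

package body Cells is

   protected body Mailbox is

      procedure Post (M : in Message; OK : out Boolean) is
      begin
         if Count < Pending'Length then
            Count := Count + 1;
            Pending (Count) := M;
            OK := True;
         else
            OK := False;  -- box full; the sender is told, never blocked
         end if;
      end Post;

      procedure Drain (Msgs : out Message_List; Last : out Natural) is
      begin
         Msgs (1 .. Count) := Pending (1 .. Count);
         Last := Count;
         Count := 0;
      end Drain;

   end Mailbox;

   protected body Cell_State is

      procedure Integrate (M : in Message) is
      begin
         Level := Level + M.Value;
      end Integrate;

      function Activation return Float is
      begin
         return Level;
      end Activation;

   end Cell_State;

end Cells;
```

A worker task services a cell by calling Inbox.Drain and then State.Integrate on each message; senders call Inbox.Post and continue immediately, which gives the "nearly non-blocking" behaviour described.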
* Re: asynchronous task communication
From: Dmitry A. Kazakov @ 2013-01-01 9:59 UTC (permalink / raw)

On Mon, 31 Dec 2012 19:51:36 -0800, Charles Hixson wrote:

> My application logically "needs" thousands or millions of tasks.

An OS can support no more than a few hundred tasks.

> FWIW, you can think of what I'm doing is a neural network simulation.

Well, it is a kind of structure (a data-driven machine) which is very difficult for concurrent computing based on traditional architectures. Even if you could have a task per node/cell, the performance could actually be worse due to task switching, which would eat most of the resources.

> Since I can't model cells as tasks, I'm planning on modeling them as
> protected variables.

I would not do that. There is a possibility that protected actions would use system-wide locks. And it is a very bad idea to perform anything time consuming (e.g. calculations) from a protected action. Logically a protected action is considered "instant."

For a NN, if you really wanted to experiment with concurrency, I would use a pool of tasks (there was already good advice about how many of them make sense to have on a multi-core). The tasks would service groups of nodes of the same layer. For data you would need no locking at all, since layer N+1 would not start before layer N is completed.

For a completely data-driven architecture I would again recommend a pool of tasks looking into a blackboard for requests [lock-free]. As I said, there is a fair possibility that you would actually lose more than gain, but you could try. [A lock-free FIFO is for 1-1, a blackboard for 1-n producer-consumer scenarios.]

Happy New Year to all,

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de
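[Editor's sketch] The worker-pool arrangement Dmitry suggests can be sketched with a fixed set of tasks pulling node indices from a shared dispenser, finishing layer N before layer N+1 begins. The node count, worker count, and all names are invented; the per-node work is left as a placeholder:

```ada
procedure Layered_Pool is

   Nodes_Per_Layer : constant := 1_000;
   Workers         : constant := 6;  -- roughly one per core, as advised

   protected Dispenser is
      procedure Next (Node : out Natural);  -- 0 means "this layer is done"
   private
      Current : Natural := 0;
   end Dispenser;

   protected body Dispenser is
      procedure Next (Node : out Natural) is
      begin
         if Current < Nodes_Per_Layer then
            Current := Current + 1;
            Node := Current;
         else
            Node := 0;
         end if;
      end Next;
   end Dispenser;

   task type Worker;

   task body Worker is
      Node : Natural;
   begin
      loop
         Dispenser.Next (Node);
         exit when Node = 0;
         null;  -- evaluate node number Node of the current layer here
      end loop;
   end Worker;

   Pool : array (1 .. Workers) of Worker;

begin
   null;  -- the pool drains the layer, then all workers terminate
end Layered_Pool;
```

A real multi-layer run would add a barrier between layers (e.g. an entry the workers block on while the dispenser is reset); as Dmitry notes, no per-node locking is needed because no worker touches layer N+1 data until layer N is complete.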
* Re: asynchronous task communication 2013-01-01 3:51 ` Charles Hixson 2013-01-01 9:59 ` Dmitry A. Kazakov @ 2013-01-01 10:38 ` Brian Drummond 2013-01-01 12:32 ` Jeffrey Carter 2013-01-02 22:43 ` Niklas Holsti 3 siblings, 0 replies; 56+ messages in thread From: Brian Drummond @ 2013-01-01 10:38 UTC (permalink / raw) On Mon, 31 Dec 2012 19:51:36 -0800, Charles Hixson wrote: > On 12/31/2012 01:09 PM, Niklas Holsti wrote: >> On 12-12-31 20:52 , Charles Hixson wrote: > FWIW, you can think of what I'm doing is a neural network simulation. > There are reasons why that isn't quite accurate, but it's close. And > each message is likely (but not certain) to get a reply and may also > cause further messages to be sent on. Occasionally new nodes will be > added to the net. So logically, each "cell" should be a separate task, > but even so asynchronous communication is needed to avoid deadlock. This is sounding a lot like the VHDL model of inter-process communication using (VHDL's version of) signals, with delta cycles to allow further messages causing further messages and so on... of all the logic simulation languages, is unique in managing to maintain determinism in what looks from the outside like asynchronous message passing. It also avoids the whole mess of "blocking" vs "non-blocking" assignments that plagues the other big contender... > Since I can't model cells as tasks, I'm planning on modeling them as > protected variables. Even this may be too limiting, because of the > requirement for asynchronous communication. Particularly since > protected variables are passive, rather than active, so each one is > going to need to wait for a task to activate it. But this appears to be > quite likely to interfere with normal execution. The classic answer is > to have two copies of the network and to cycle through cone copying it > and state changes into the second. VHDL has a different model, avoiding the duplication and copying. 
The basic units are the process (which resembles an Ada task or C program) and the signal. Each process can be woken either by an event on its "sensitivity list" or an event it is waiting for. Then it runs to either completion (in the former case) or an explicit "wait" such as "wait for some_event" in the latter case. During its entire run, (which takes notionally zero time) it may assign to signals, but the signal values it sees are frozen at their initial values. The new assignments don't actually happen at this stage; signal assignments are known as "postponed assignments". If a process assigns several times to the same signal, the last assignment wins. (Like Ada tasks, processes can have variables, and they follow the usual assignment rules) Only when all processes are suspended will the postponed signal assignments happen. These assignments are events; they schedule selected process to be woken in the next delta cycle. Then the simulation time advances by one delta cycle, and finally all those processes are woken - seeing new, but now frozen, signal values. This process repeats until a delta cycle occurs in which no process is scheduled. (If this does not happen, there is a "combinational loop" which is unwanted in VHDL!). Now real time can be advanced by a tick (femtosecond, nanosecond, etc) until a timed event or external input happens. A combinational loop might be a <= not a; -- loop; "<=" is signal assignment however a <= not a after 2 ns; -- no loop is not a combinational loop (because the "after" clause schedules an event in the future. (This is a model of an oscillator!) This may sound inefficient, but the delta cycle model guarantees that no matter the order of execution of processes within a delta cycle, or the order of signal assignments between them, the result is the same. You might map VHDL processes on to your cells, and signals to your "not- neurons". 
There might be a few Ada tasks (or even a 1:1 mapping) running processes until the scheduled-process-queue is empty. Then run the corresponding list (or lists: one per task) of postponed assignments. > Then, at the end of a cycle, reverse > it. This requires the whole thing to be copied in each cycle, even the > parts that haven't changed in this cycle. I'm trying to avoid that, but > I may not be able to unless I can break this requirement for blocking. I have described the VHDL model because it may offer a different perspective on your problem rather than as an immediately applicable solution to it. However, it is a successful model with benign formal properties and breaks the requirement for blocking on new signal values (admittedly, by blocking until the next delta cycle instead!) > (The more I look at a central postmaster, the less desirable it looks. > But perhaps I can have a postmaster attached to each cell, or maybe the > cell should be attached to the same record as the mailbox. The VHDL model could be seen as a central postmaster, delivering mail between each delta cycle. But it certainly doesn't need to be single-threaded! - Brian ^ permalink raw reply [flat|nested] 56+ messages in thread
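To make the "postponed assignment" idea concrete in Ada terms, here is a rough sketch of a delta-cycle signal as a protected object. This is illustrative only - the package and all names are invented here, not from the thread, and real VHDL semantics are richer than this. Readers always see the value frozen at the start of the delta cycle; writers only stage the next value, and Update is called once per delta cycle, after all "processes" have suspended.

```ada
--  Sketch of a VHDL-style signal with postponed assignment.
--  Names are invented for illustration.
generic
   type Value is private;
package Delta_Signals is
   protected type Signal is
      function  Read return Value;              --  frozen current value
      procedure Post (V : in Value);            --  postponed assignment
      procedure Update (Changed : out Boolean); --  end of delta cycle
   private
      Current, Next_Val : Value;
      Posted : Boolean := False;
   end Signal;
end Delta_Signals;

package body Delta_Signals is
   protected body Signal is
      function Read return Value is
      begin
         return Current;
      end Read;

      procedure Post (V : in Value) is
      begin
         Next_Val := V;   --  last assignment in a delta cycle wins
         Posted   := True;
      end Post;

      procedure Update (Changed : out Boolean) is
      begin
         Changed := Posted;
         if Posted then
            Current := Next_Val;
            Posted  := False;
         end if;
      end Update;
   end Signal;
end Delta_Signals;
```

A coordinating task would call Update on every signal between delta cycles; a signal reporting Changed = True corresponds to an event that wakes the sensitive processes.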
* Re: asynchronous task communication 2013-01-01 3:51 ` Charles Hixson 2013-01-01 9:59 ` Dmitry A. Kazakov 2013-01-01 10:38 ` Brian Drummond @ 2013-01-01 12:32 ` Jeffrey Carter 2013-01-01 18:21 ` Charles Hixson 2013-01-02 22:43 ` Niklas Holsti 3 siblings, 1 reply; 56+ messages in thread From: Jeffrey Carter @ 2013-01-01 12:32 UTC (permalink / raw) On 12/31/2012 08:51 PM, Charles Hixson wrote: > > FWIW, you can think of what I'm doing as a neural network simulation. There > are reasons why that isn't quite accurate, but it's close. And each message > is likely (but not certain) to get a reply and may also cause further > messages to be sent on. Occasionally new nodes will be added to the net. So > logically, each "cell" should be a separate task, but even so asynchronous > communication is needed to avoid deadlock. I actually wrote such a thing in Ada 83, a tasking implementation of a neural network with feedback connections. Each node was a task, and any 2 nodes which were connected had a bounded, blocking queue of length 1 between them. There were cycles in the network (feedback), but this structure worked well and did not deadlock. Being Ada 83, each queue was also a task. There were fewer than 100 total tasks, but then this was a 286-class DOS computer with about 1 MB of RAM, and it could have handled more such tasks; I never tested to determine the maximum it could run. On a modern, multi-core processor, and using protected objects for the queues, I doubt that a million nodes would be a problem. I don't know whether such an approach would work for your problem; we used a static network layout, for example, so adding nodes was not considered. At the time I wrote that network, most people writing concurrent S/W would have told me that such a design would not work and I would have to use a cyclic executive. My experience, and that reported in every paper by people who used Ada(-83) tasks as intended, was otherwise.
Such people were writing unnecessarily complex code because of premature optimization. Modern Ada S/W engineers like you and me would never make such an error, would we? -- Jeff Carter "We call your door-opening request a silly thing." Monty Python & the Holy Grail 17 ^ permalink raw reply [flat|nested] 56+ messages in thread
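Carter's length-1 blocking queues, done as tasks in Ada 83, map directly onto protected objects in later Ada. A minimal sketch (the Message type and all names are invented here for illustration, not taken from his program):

```ada
--  A one-slot blocking queue between two connected nodes, as a
--  protected object.  Put blocks while the slot is full; Get blocks
--  while it is empty.  Carter's Ada-83 version used a task per queue.
package Node_Links is
   type Message is record
      Source : Positive;
      Signal : Float;
   end record;

   protected type Link is
      entry Put (M : in Message);
      entry Get (M : out Message);
   private
      Slot : Message;
      Full : Boolean := False;
   end Link;
end Node_Links;

package body Node_Links is
   protected body Link is
      entry Put (M : in Message) when not Full is
      begin
         Slot := M;
         Full := True;
      end Put;

      entry Get (M : out Message) when Full is
      begin
         M := Slot;
         Full := False;
      end Get;
   end Link;
end Node_Links;
```

Each connected pair of node tasks shares one Link object; since a protected object is far cheaper than a task, this is why a million nodes on modern hardware is plausible.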
* Re: asynchronous task communication 2013-01-01 12:32 ` Jeffrey Carter @ 2013-01-01 18:21 ` Charles Hixson 2013-01-01 18:54 ` Robert A Duff 0 siblings, 1 reply; 56+ messages in thread From: Charles Hixson @ 2013-01-01 18:21 UTC (permalink / raw) On 01/01/2013 04:32 AM, Jeffrey Carter wrote: > On 12/31/2012 08:51 PM, Charles Hixson wrote:> >> FWIW, you can think of what I'm doing is a neural network simulation. >> There >> are reasons why that isn't quite accurate, but it's close. And each >> message >> is likely (but not certain) to get a reply and may also cause further >> messages to be sent on. Occasionally new nodes will be added to the >> net. So >> logically, each "cell" should be a separate task, but even so >> asynchronous >> communication is needed to avoid deadlock. > > I actually wrote such a thing in Ada 83, a tasking implementation of a > neural network with feedback connections. Each node was a task, and any > 2 nodes which were connected had a bounded, blocking queue of length 1 > between them. There were cycles in the network (feedback), but this > structure worked well and did not deadlock. Being Ada 83, each queue was > also a task. There were fewer than 100 total tasks, but then this was a > 286-class DOS computer with about 1 MB of RAM, and it could have handled > more such tasks; I never tested to determine the maximum it could run. > On a modern, multi-core processor, and using protected objects for the > queues, I doubt that a million nodes would be a problem. > > I don't know whether such an approach would work for your problem; we > used a static network layout, for example, so adding nodes was not > considered. > > At the time I wrote that network, most people writing concurrent S/W > would have told me that such a design would not work and I would have to > use a cyclic executive. My experience, and that reported in every paper > by people who used Ada(-83) tasks as intended, was otherwise. 
> Such people were writing unnecessarily complex code because of premature > optimization. Modern Ada S/W engineers like you and me would never make > such an error, would we? > It sounds like we came to the same basic answer. And your prior experience augurs well. FWIW, premature optimization is a difficulty, and hard to avoid, but if you don't evaluate efficiency, you can't be sure the basic design will work usably even if the code is correct. I'm planning on allowing multiple messages in the mailbox, though. My current thought is to split the cells by id# to allot them to tasks, and cycle through the net. The first thing each cell will need to do on being activated is handle its old mail. Mail sent on one cycle will then usually not be acted upon until the next cycle. I'm undecided as to whether or not to include a cycle count, so as to enforce that implicit regularity. It seems as if I shouldn't need to, as the order of cycling is deterministic, even if what is done isn't. P.S.: Ada is NOT the first language that I looked at when trying to solve this. I really preferred a language with a garbage collector, and where strings didn't default to an inflexible length. (And where they defaulted to UTF-8. And where binary files could have header records without a lot of effort. And ...) But I kept running into problems with the way they handled shared memory between tasks. Either they were too protective, or much too wide open. Ada is the only language that has appeared to allow me to construct the memory handling protocol I want, without simultaneously making it totally unguarded. (Or not doing huge amounts of copying of RAM between tasks, which would result in some copies holding stale data.) I must have looked at 10 different languages. Of the other languages I looked at, only Erlang appeared to handle things in the way that I need, and I have trouble wrapping my mind around the way Erlang does things.
(I started my programming life with Fortran, and I'm a bit inflexible in the direction of Haskell and Erlang.) So when I look into "How should I implement this?", I'm really evaluating what I can do in the language. And unfortunately it's got to be an evaluation made without really knowing the language well. So I'm really pleased that the way I thought this could be done in Ada is confirmed by your report of doing it successfully in an earlier version. ^ permalink raw reply [flat|nested] 56+ messages in thread
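The mailbox-per-cell scheme described above - senders deposit and move on, the owning task drains everything at the top of its cycle - can be sketched as a protected object. This is only an illustration; the Message type and all names are invented here, not from the thread:

```ada
--  A non-blocking mailbox per cell.  Deposit never blocks the sender
--  beyond the protected action itself; Drain hands the whole batch to
--  the cell's task and empties the box.
with Ada.Containers.Vectors;

package Mailboxes is
   type Message is record
      Sender   : Positive;
      Strength : Float;
   end record;

   package Message_Vectors is
     new Ada.Containers.Vectors (Positive, Message);

   protected type Mailbox is
      procedure Deposit (M : in Message);
      procedure Drain (Into : in out Message_Vectors.Vector);
   private
      Pending : Message_Vectors.Vector;
   end Mailbox;
end Mailboxes;

package body Mailboxes is
   protected body Mailbox is
      procedure Deposit (M : in Message) is
      begin
         Pending.Append (M);
      end Deposit;

      procedure Drain (Into : in out Message_Vectors.Vector) is
      begin
         --  Move transfers the elements without copying and leaves
         --  Pending empty, ready for the next cycle's mail.
         Message_Vectors.Move (Target => Into, Source => Pending);
      end Drain;
   end Mailbox;
end Mailboxes;
```

Mail deposited during cycle N is then naturally acted on in cycle N+1, which gives the implicit regularity discussed above without an explicit cycle count.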
* Re: asynchronous task communication 2013-01-01 18:21 ` Charles Hixson @ 2013-01-01 18:54 ` Robert A Duff 2013-01-02 7:36 ` Charles Hixson 0 siblings, 1 reply; 56+ messages in thread From: Robert A Duff @ 2013-01-01 18:54 UTC (permalink / raw) Charles Hixson <charleshixsn@earthlink.net> writes: >...I really preferred a language with a garbage collector, I have used the Boehm garbage collector successfully with Ada. You could also consider user-defined storage pools. I don't know enough detail of what you're doing to know if they would help. - Bob ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: asynchronous task communication 2013-01-01 18:54 ` Robert A Duff @ 2013-01-02 7:36 ` Charles Hixson 2013-01-02 9:55 ` Dmitry A. Kazakov 0 siblings, 1 reply; 56+ messages in thread From: Charles Hixson @ 2013-01-02 7:36 UTC (permalink / raw) On 01/01/2013 10:54 AM, Robert A Duff wrote: > Charles Hixson<charleshixsn@earthlink.net> writes: > >> ...I really preferred a language with a garbage collector, > > I have used the Boehm garbage collector successfully with Ada. > > You could also consider user-defined storage pools. > I don't know enough detail of what you're doing to know > if they would help. > > - Bob I don't think storage pools would help, though I admit I don't understand them. And at my level of skill at Ada, attempting to install a non-standard component (i.e. the garbage collector) is inviting disaster. Actually, for the current project the need for garbage collection is minimal...but I often do things where garbage collection really simplifies memory management. Even on this one I'm going to need to figure out how to recycle memory with Ada.Containers.Vectors attached, as different cells will connect to differing numbers of other cells. The documentation is a bit unclear as to exactly what various routines in Ada.Containers.Vectors do to memory that has been allocated. (By a bit unclear I mean they give you the name, and expect you to know how it handles memory...either that, or it isn't specified by the standard. I haven't dug into the AARM...the first couple of pages of it were too intimidating to go any further. I should try again on a copy without the changes.) ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: asynchronous task communication 2013-01-02 7:36 ` Charles Hixson @ 2013-01-02 9:55 ` Dmitry A. Kazakov 2013-01-02 19:02 ` Charles Hixson 0 siblings, 1 reply; 56+ messages in thread From: Dmitry A. Kazakov @ 2013-01-02 9:55 UTC (permalink / raw) On Tue, 01 Jan 2013 23:36:13 -0800, Charles Hixson wrote: > On 01/01/2013 10:54 AM, Robert A Duff wrote: >> Charles Hixson<charleshixsn@earthlink.net> writes: >> >>> ...I really preferred a language with a garbage collector, >> >> I have used the Boehm garbage collector successfully with Ada. >> >> You could also consider user-defined storage pools. >> I don't know enough detail of what you're doing to know >> if they would help. >> > I don't think storage pools would help, though I admit I don't > understand them. Storage pools allow implementation of arenas and stacks which are extremely useful when dealing with graphs. I don't know how many nodes you have, but in my system a decision tree might have hundreds of thousands of nodes. Merely finalization of such a structure would take a lot of time if nodes are allocated in the standard memory pool. > Actually, for the current project the need for garbage collection is > minimal...but I often do things where garbage collection really > simplifies memory management. Not really. And for graphs it seems just the opposite. Though you would need to carefully design for which graph edges the references will be strong and which ones weak. > Even on this one I'm going to need to > figure out how to recycle memory with Ada.Containers.Vectors attached, > as different cells will connect to differing numbers of other cells. Variable structures like Ada.Containers usually allocate memory in advance to speed up incremental insertions. For NN I would consider a custom fixed-size structure. You might also consider refactoring subgraphs. Structures like that tend toward combinatorial explosion when the same graph pattern keeps on repeating itself. 
Memory use and training/classification times could be reduced when repeating subgraphs are shared rather than replicated. This is another argument to consider a custom implementation. -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: asynchronous task communication 2013-01-02 9:55 ` Dmitry A. Kazakov @ 2013-01-02 19:02 ` Charles Hixson 2013-01-02 20:35 ` Dmitry A. Kazakov 0 siblings, 1 reply; 56+ messages in thread From: Charles Hixson @ 2013-01-02 19:02 UTC (permalink / raw) On 01/02/2013 01:55 AM, Dmitry A. Kazakov wrote: > On Tue, 01 Jan 2013 23:36:13 -0800, Charles Hixson wrote: > >> On 01/01/2013 10:54 AM, Robert A Duff wrote: >>> Charles Hixson<charleshixsn@earthlink.net> writes: >>> >>>> ...I really preferred a language with a garbage collector, >>> >>> I have used the Boehm garbage collector successfully with Ada. >>> >>> You could also consider use-defined storage pools. >>> I don't know enough detail of what you're doing to know >>> if they would help. >>> >> I don't think storage pools would help, though I admit I don't >> understand them. > > Storage pools allow implementation of arenas and stacks which are extremely > useful when dealing with graphs. I don't know how many nodes you have, but > in my system a decision tree might have hundreds of thousands of nodes. > Merely finalization of such a structure would take a lot of time if nodes > are allocated in the standard memory pool. > >> Actually, for the current project the need for garbage collection is >> minimal...but I often do things where garbage collection really >> simplifies memory management. > > Not really. And for graphs seems just the opposite. > > Though you would need to carefully design for which graph edges the > references will be strong and which ones weak. > >> Even on this one I'm going to need to >> figure out how to recycle memory with Ada.Collections.Vectors attached, >> as different cells will connect to differing numbers of other cells. > > Variable structures like Ada.Containers usually allocate memory in advance > to speed up incremental insertions. For NN I would consider a custom > fixed-size structure. > > You might also consider refactoring subgraphs. 
Structures like that tend to > combinatorial explosion when the same graph pattern keeps on repeating > itself. Memory use and training/classification times could be reduced when > repeating subgraphs are shared rather than replicated. This is another > argument to consider a custom implementation. > A constant size container is not reasonable...though it would certainly make things simpler if it was. Perhaps I could have a few standard sizes...but the range between the smallest and the largest will be a factor of about (uncertain) 5000, with the smallest size dominating in frequency. And cells will change in the number of their connections. At least if I do things right. But I really can't see how storage pools would help, as cells would not be freed in any predictable pattern, and not en-mass. The shape of the graph is going to be dependent on the data, as what I'm doing is essentially measuring what comes near what, as a prediction of what parts of a pattern are missing. I expect sub-graphs to develop automatically, but they can't be imposed from the start. OTOH, limiting a combinatorial explosion is certainly one of the things that is going to require work. But if I limit propagation too much, then the thing won't work. So even if I get the basic design right, it's going to require a lot of ad hoc tuning. I am currently planning, when I get things designed sufficiently, to tinker with activation levels and decay rates to control the combinatorial explosion...and also to keep signals from dying away too quickly. That's probably going to require some "evolutionary programming" where similar programs with slightly different "gosh numbers" are run against each other, as I don't see any theoretical grounds for picking particular values. Probably, when I get sufficient experience with Ada, I'll install the Boehm garbage collector. Now that looks like asking for trouble. Even GPS is giving me grief. 
P.S.: Is there any way to generate documentation for Ada, better than robodoc? I don't count turning the program files into HTML to be documentation. It doesn't add much over the bare files. Something similar to Doxygen or Javadoc? (Though Doxygen also tends to be too verbose.) ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: asynchronous task communication 2013-01-02 19:02 ` Charles Hixson @ 2013-01-02 20:35 ` Dmitry A. Kazakov 2013-01-03 0:20 ` Charles Hixson 0 siblings, 1 reply; 56+ messages in thread From: Dmitry A. Kazakov @ 2013-01-02 20:35 UTC (permalink / raw) On Wed, 02 Jan 2013 11:02:54 -0800, Charles Hixson wrote: > On 01/02/2013 01:55 AM, Dmitry A. Kazakov wrote: >> On Tue, 01 Jan 2013 23:36:13 -0800, Charles Hixson wrote: >> >>> On 01/01/2013 10:54 AM, Robert A Duff wrote: >>>> Charles Hixson<charleshixsn@earthlink.net> writes: >>>> >>>>> ...I really preferred a language with a garbage collector, >>>> >>>> I have used the Boehm garbage collector successfully with Ada. >>>> >>>> You could also consider use-defined storage pools. >>>> I don't know enough detail of what you're doing to know >>>> if they would help. >>>> >>> I don't think storage pools would help, though I admit I don't >>> understand them. >> >> Storage pools allow implementation of arenas and stacks which are extremely >> useful when dealing with graphs. I don't know how many nodes you have, but >> in my system a decision tree might have hundreds of thousands of nodes. >> Merely finalization of such a structure would take a lot of time if nodes >> are allocated in the standard memory pool. >> >>> Actually, for the current project the need for garbage collection is >>> minimal...but I often do things where garbage collection really >>> simplifies memory management. >> >> Not really. And for graphs seems just the opposite. >> >> Though you would need to carefully design for which graph edges the >> references will be strong and which ones weak. >> >>> Even on this one I'm going to need to >>> figure out how to recycle memory with Ada.Collections.Vectors attached, >>> as different cells will connect to differing numbers of other cells. >> >> Variable structures like Ada.Containers usually allocate memory in advance >> to speed up incremental insertions. 
For NN I would consider a custom >> fixed-size structure. >> >> You might also consider refactoring subgraphs. Structures like that tend toward >> combinatorial explosion when the same graph pattern keeps on repeating >> itself. Memory use and training/classification times could be reduced when >> repeating subgraphs are shared rather than replicated. This is another >> argument to consider a custom implementation. >> > A constant size container is not reasonable...though it would certainly > make things simpler if it was. You make node a discriminated record type. Usually when a node is created it is already known how many children and/or parents it will have. E.g.

   type Node;
   type Node_Ptr is access Node;
   for Node_Ptr'Storage_Pool use Arena;
   type Node_Ptr_Array is array (Positive range <>) of Node_Ptr;
   type Node (Children_Number : Positive) is record
      Successors : Node_Ptr_Array (1 .. Children_Number);
   end record;

> But I really can't see how storage pools > would help, as cells would not be freed in any predictable pattern, and > not en masse. You allocate the whole graph, or a layer of it, in the storage pool. Nodes are usually allocated in LIFO order, which makes allocation much more efficient than allocation in a standard pool. Deallocation is usually all-at-once, e.g. you drop the pool and that is it. > P.S.: Is there any way to generate documentation for Ada, better than > robodoc? By hand. I write docs manually. It is tedious, I admit. Consider it an extra code review step; it helps find inconsistencies and gaps which otherwise slip through. -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 56+ messages in thread
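The arena kind of pool Dmitry describes - bump allocation, no individual frees, drop everything at once - can be sketched against Root_Storage_Pool. This is a simplified illustration, not from the thread: the alignment handling is naive (it assumes Alignment >= 1 and ignores the alignment of the backing array itself), and there is no overflow-safe arithmetic.

```ada
with System;
with System.Storage_Pools;
with System.Storage_Elements;

package Arena_Pools is
   use System.Storage_Elements;

   --  A bump allocator: Allocate advances Top, Deallocate does
   --  nothing, Reset drops everything at once ("you drop the pool").
   type Arena (Size : Storage_Count) is
     new System.Storage_Pools.Root_Storage_Pool with record
      Data : Storage_Array (1 .. Size);
      Top  : Storage_Count := 0;
   end record;

   overriding procedure Allocate
     (Pool            : in out Arena;
      Storage_Address : out System.Address;
      Size            : Storage_Count;
      Alignment       : Storage_Count);

   overriding procedure Deallocate
     (Pool            : in out Arena;
      Storage_Address : System.Address;
      Size            : Storage_Count;
      Alignment       : Storage_Count) is null;  --  individual frees are no-ops

   overriding function Storage_Size (Pool : Arena) return Storage_Count is
     (Pool.Size);

   procedure Reset (Pool : in out Arena);
end Arena_Pools;

package body Arena_Pools is
   overriding procedure Allocate
     (Pool            : in out Arena;
      Storage_Address : out System.Address;
      Size            : Storage_Count;
      Alignment       : Storage_Count)
   is
      Misalign : constant Storage_Count := Pool.Top mod Alignment;
      Start    : constant Storage_Count :=
        (if Misalign = 0 then Pool.Top
         else Pool.Top + Alignment - Misalign);
   begin
      if Start + Size > Pool.Size then
         raise Storage_Error;
      end if;
      Storage_Address := Pool.Data (Start + 1)'Address;
      Pool.Top        := Start + Size;
   end Allocate;

   procedure Reset (Pool : in out Arena) is
   begin
      Pool.Top := 0;
   end Reset;
end Arena_Pools;
```

With `for Node_Ptr'Storage_Pool use ...` naming an Arena object, all nodes land in the arena and disappear together when it is reset or finalized, instead of being finalized one by one.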
* Re: asynchronous task communication 2013-01-02 20:35 ` Dmitry A. Kazakov @ 2013-01-03 0:20 ` Charles Hixson 2013-01-03 6:34 ` Charles Hixson ` (3 more replies) 0 siblings, 4 replies; 56+ messages in thread From: Charles Hixson @ 2013-01-03 0:20 UTC (permalink / raw) On 01/02/2013 12:35 PM, Dmitry A. Kazakov wrote: > On Wed, 02 Jan 2013 11:02:54 -0800, Charles Hixson wrote: > >> On 01/02/2013 01:55 AM, Dmitry A. Kazakov wrote: >>> On Tue, 01 Jan 2013 23:36:13 -0800, Charles Hixson wrote: >>> >>>> On 01/01/2013 10:54 AM, Robert A Duff wrote: >>>>> Charles Hixson<charleshixsn@earthlink.net> writes: >>>>> >>>>>> ...I really preferred a language with a garbage collector, >>>>> >>>>> I have used the Boehm garbage collector successfully with Ada. >>>>> >>>>> You could also consider use-defined storage pools. >>>>> I don't know enough detail of what you're doing to know >>>>> if they would help. >>>>> >>>> I don't think storage pools would help, though I admit I don't >>>> understand them. >>> >>> Storage pools allow implementation of arenas and stacks which are extremely >>> useful when dealing with graphs. I don't know how many nodes you have, but >>> in my system a decision tree might have hundreds of thousands of nodes. >>> Merely finalization of such a structure would take a lot of time if nodes >>> are allocated in the standard memory pool. >>> >>>> Actually, for the current project the need for garbage collection is >>>> minimal...but I often do things where garbage collection really >>>> simplifies memory management. >>> >>> Not really. And for graphs seems just the opposite. >>> >>> Though you would need to carefully design for which graph edges the >>> references will be strong and which ones weak. >>> >>>> Even on this one I'm going to need to >>>> figure out how to recycle memory with Ada.Collections.Vectors attached, >>>> as different cells will connect to differing numbers of other cells. 
>>> >>> Variable structures like Ada.Containers usually allocate memory in advance >>> to speed up incremental insertions. For NN I would consider a custom >>> fixed-size structure. >>> >>> You might also consider refactoring subgraphs. Structures like that tend to >>> combinatorial explosion when the same graph pattern keeps on repeating >>> itself. Memory use and training/classification times could be reduced when >>> repeating subgraphs are shared rather than replicated. This is another >>> argument to consider a custom implementation. >>> >> A constant size container is not reasonable...though it would certainly >> make things simpler if it was. > > You make node a discriminated record type. Usually when a node is created > it is already known how many children and/or parents it will have. E.g. That's not true in this case. The node adds references depending on the data that it encounters. OTOH, I suppose I could declare a series of different node types, each with a maximum number of children, and when I hit the maximum I reallocate the node with a different type. If I doubled the number of children with each type I'd need around 10-12 different types... Yes, it's doable, but I'd *really* rather avoid that. And I'm sure there must be some reasonable way to deallocate a Vector. That I don't know what it is doesn't convince me that it doesn't exist. Perhaps Clear or Delete would do it, but I haven't found this documented. I'm guessing that if I store access variables in a Vector, and then deallocate the item the access variable holds (updating the access variable to null), that will free most of the space, but deallocating the Vector itself is not so clear. I can't tell whether there are any handles to it that I don't know about. Perhaps it's somewhere in the AARM. 
>
> type Node;
> type Node_Ptr is access Node;
> for Node_Ptr'Storage_Pool use Arena;
> type Node_Ptr_Array is array (Positive range <>) of Node_Ptr;
> type Node (Children_Number : Positive) is record
>    Successors : Node_Ptr_Array (1 .. Children_Number);
> end record;
>
>> But I really can't see how storage pools >> would help, as cells would not be freed in any predictable pattern, and >> not en masse. > > You allocate the whole graph, or a layer of it, in the storage pool. Nodes are usually > allocated in LIFO order, which makes allocation much more efficient than > allocation in a standard pool. Deallocation is usually all-at-once, e.g. > you drop the pool and that is it. When you say "layer", I know I haven't explained the model of what I'm trying to do well enough. That's a part of why I explicitly denied that it was what is normally meant by a neural net. That's just the closest model in common language. Even the "base layer", such as it is, isn't allocated all at once, but only as appropriate stimuli arrive. Higher levels *try* to be a similar distance from the "base layer" (Note the quotes! Don't take this description literally.), but this isn't guaranteed, and, indeed, I expect that mixtures between layers will be common. And I'm not actually convinced that I should even try to minimize this. (There are lots of reasons why it might improve responses.) That said, I'm well aware that this may well not work out. But if I don't try, I'll never know. There are lots of places where I don't have a clear model of what I'm trying to build, but initial cell generation is one place where I'm pretty sure, except for some details. > >> P.S.: Is there any way to generate documentation for Ada, better than >> robodoc? > > By hand. I write docs manually. It is tedious, I admit. Consider it an > extra code review step; it helps find inconsistencies and gaps which > otherwise slip through. 
In my experience, code that's documented that way tends to have documentation that's incomplete, out of date, or both. Perhaps this is just me, but enough people seem to have had the same experience that I doubt it. Adabrowse can be used to produce decent end-user documentation, but I want developer documentation. That means documenting functions/procedures/types/etc. that only exist in *.adb files. I'm after something that makes it easy for me to see what routines I've written, what they're called, what they're for, and any notes to myself that aren't appropriate for compile time warnings. I want to be able to come back to the code in 6 months or a year and figure out what it's doing easily. (Not what I intended for it to do, which is what the specs give me. But what I actually did, and where I chose an approach because it was quick to do, but that needs to be updated. ... cases that aren't handled deserve a compile time warning, and an execution time error, but this is for things like an optimization that was postponed.) For this kind of thing my only complaint about Doxygen is that it's too verbose. Well, and that it doesn't handle Ada. (Actually Doxygen can produce both user and developer documentation, depending on how you set it to handle display of private variables, routines, etc. But it doesn't handle Ada.) I sure hope I can find something better than robodoc. ^ permalink raw reply [flat|nested] 56+ messages in thread
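On the earlier Vector-deallocation question: the standard containers are controlled types, so a Vector's own heap storage is reclaimed when the Vector object is finalized (goes out of scope, or the object containing it is freed); Clear only empties it. Access values stored *in* a vector are never freed automatically and need explicit Unchecked_Deallocation. A sketch, with invented types:

```ada
--  The vector's internal storage is managed by the container's own
--  finalization; only what the stored access values designate needs
--  manual freeing.
with Ada.Containers.Vectors;
with Ada.Unchecked_Deallocation;

procedure Vector_Cleanup_Demo is
   type Cell is record
      Activation : Float := 0.0;
   end record;
   type Cell_Ptr is access Cell;

   procedure Free is new Ada.Unchecked_Deallocation (Cell, Cell_Ptr);

   package Cell_Vectors is new Ada.Containers.Vectors (Positive, Cell_Ptr);
begin
   declare
      Neighbours : Cell_Vectors.Vector;
   begin
      Neighbours.Append (new Cell);
      Neighbours.Append (new Cell);

      --  Free what the access values designate...
      for P of Neighbours loop
         Free (P);   --  sets each stored access value to null
      end loop;

      --  ...then drop the elements.  The vector's own storage goes
      --  away when Neighbours is finalized at the inner "end".
      Neighbours.Clear;
   end;
end Vector_Cleanup_Demo;
```

Whether Clear by itself shrinks the container's allocated capacity is implementation-defined; finalization is what you can actually rely on.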
* Re: asynchronous task communication 2013-01-03 0:20 ` Charles Hixson @ 2013-01-03 6:34 ` Charles Hixson 2013-01-03 8:50 ` Dmitry A. Kazakov ` (2 subsequent siblings) 3 siblings, 0 replies; 56+ messages in thread From: Charles Hixson @ 2013-01-03 6:34 UTC (permalink / raw) On 01/02/2013 04:20 PM, Charles Hixson wrote: > ... > > Adabrowse can be used to produce decent end-user documentation, but I > want developer documentation. That means documenting > function/procedures/types/etc that only exist in *.adb files. I'm after > something that makes it easy for me to see what routines I've written, > what they're called, what they're for, and any notes to myself that > aren't appropriate for compile time warnings. I want to be able to come > back to the code in 6 months or a year and figure out what it's doing > easily. (Not what I intended for it to do, which is what the specs give > me. But what I actually did, and where I chose an approach because it > was quick to do, but that needs to be updated. ... cases that aren't > handled deserve a compile time warning, and an execution time error, but > this for is things like an optimization that was postponed.) For this > kind of thing my only complaint about DOxygen is that it's too verbose. > Well, and that it doesn't handle Ada. (Actually DOxygen can produce both > user and developer documentation, depending on how you set it to handle > display of private variable, routines, etc. But it doesn't handle Ada.) > > I sure hope I can find something better than robodoc. NaturalDocs seems to be that "something better". ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: asynchronous task communication 2013-01-03 0:20 ` Charles Hixson 2013-01-03 6:34 ` Charles Hixson @ 2013-01-03 8:50 ` Dmitry A. Kazakov 2013-01-03 19:01 ` Charles Hixson 2013-01-03 10:01 ` J-P. Rosen 2013-01-03 22:27 ` Randy Brukardt 3 siblings, 1 reply; 56+ messages in thread From: Dmitry A. Kazakov @ 2013-01-03 8:50 UTC (permalink / raw) On Wed, 02 Jan 2013 16:20:11 -0800, Charles Hixson wrote: > On 01/02/2013 12:35 PM, Dmitry A. Kazakov wrote: >> You make node a discriminated record type. Usually when a node is created >> it is already known how many children and/or parents it will have. E.g. > That's not true in this case. The node adds references depending on the > data that it encounters. OK, that does not look like NN. > Even the "base layer", such as it is, isn't > allocated all at once, but only as appropriate stimuli arrive. That looks like a design choice. Why should it be this way? I mean, it is not a RT system, but anyway this kind of allocation policy would inflict heavy latencies at least initially. It may make sense only if you had a huge number of nodes of which only a minor part were actually involved in actual computations. Such loosely coupled systems are not very common. >> By hand. I write docs manually. It is tedious, I admit. Consider it an >> extra code review step; it helps find inconsistencies and gaps which >> otherwise slip through. > In my experience, code that's documented that way tends to have > documentation that's incomplete, out of date, or both. That depends. I consider documentation as an integral part of the process. I never release/update anything without documenting it first. Generated documentation is a chaotic collection of comments you once wrote around declarations. If you don't write/update them no tool could fix that. -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: asynchronous task communication 2013-01-03 8:50 ` Dmitry A. Kazakov @ 2013-01-03 19:01 ` Charles Hixson 0 siblings, 0 replies; 56+ messages in thread From: Charles Hixson @ 2013-01-03 19:01 UTC (permalink / raw) On 01/03/2013 12:50 AM, Dmitry A. Kazakov wrote: > On Wed, 02 Jan 2013 16:20:11 -0800, Charles Hixson wrote: > >> On 01/02/2013 12:35 PM, Dmitry A. Kazakov wrote: > >>> You make node a discriminated record type. Usually when a node is created >>> it is already known how many children and/or parents it will have. E.g. > >> That's not true in this case. The node adds references depending on the >> data that it encounters. > > OK, that does not look like NN. > >> Even the "base layer", such as it is, isn't >> allocated all at once, but only as appropriate stimuli arrive. > > That looks like a design choice. Why should it be this way? I mean, it is > not a RT system, but anyway this kind of allocation policy would inflict > heavy latencies at least initially. It may make sense only if you had a > huge number of nodes of which only a minor part were actually involved in > actual computations. Such loosely coupled systems are not very common. Well, it's a design choice, but it's one driven by circumstance. The number of possible input stimuli is exceedingly large. Basically it's the set of space-delimited strings of characters that it could encounter...that's not quite right, and the final decision of what the input will be is still being thought about, but that's about what it is. I'm contemplating how to deal with non-ASCII chars, and with punctuation both internally and terminally, but almost all of the input is English text. Even if I used unicode chars as the basic stimulus, though (more flexible, but LOTS more processing required) there would still be more possible inputs than I would want to pre-allocate, and most of them would never show up. But some would. 
There are occasional Greek, Hindi, and Chinese words or characters, though embedded in English text. And I can't say that something I haven't thought of won't show up soon. Musical notation, perhaps (if that can be done in Unicode). > >>> By hand. I write docs manually. It is tedious, I admit. Consider it an >>> extra code review step; it helps find inconsistencies and gaps which >>> otherwise slip through. >> In my experience, code that's documented that way tends to have >> documentation that's incomplete, out of date, or both. > > That depends. I consider documentation an integral part of the process. > I never release/update anything without documenting it first. > > Generated documentation is a chaotic collection of comments you once wrote > around declarations. If you don't write/update them, no tool can fix that. > There's more than one purpose for documentation. What the documentation should be depends on who it's directed at. I've written user manuals for programs (that I've written), and they don't mention ANY of the internals, but developer documentation NEEDS those comments, yet it needs to be compact enough that it doesn't take up too much screen space. And library user documentation (which is what AdaBrowse can generate) is yet something else again. For my current needs, developer documentation, it's looking as if NaturalDocs will fill the bill. ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: asynchronous task communication 2013-01-03 0:20 ` Charles Hixson 2013-01-03 6:34 ` Charles Hixson 2013-01-03 8:50 ` Dmitry A. Kazakov @ 2013-01-03 10:01 ` J-P. Rosen 2013-01-03 19:29 ` Charles Hixson 2013-01-03 22:27 ` Randy Brukardt 3 siblings, 1 reply; 56+ messages in thread From: J-P. Rosen @ 2013-01-03 10:01 UTC (permalink / raw) Le 03/01/2013 01:20, Charles Hixson a écrit : > Adabrowse can be used to produce decent end-user documentation, but I > want developer documentation. That means documenting > function/procedures/types/etc that only exist in *.adb files. Just put comments after the specification. These will show up when you fly over any occurrence of the subprogram name in GPS - Very convenient > I'm after > something that makes it easy for me to see what routines I've written, > what they're called, what they're for, The outline view in GPS shows you this - and the flyover feature of GPS works there too > and any notes to myself that > aren't appropriate for compile time warnings. I want to be able to come > back to the code in 6 months or a year and figure out what it's doing > easily. Especially for implementation, nothing beats appropriate (repeat: "appropriate") comments in the code. -- J-P. Rosen Adalog 2 rue du Docteur Lombard, 92441 Issy-les-Moulineaux CEDEX Tel: +33 1 45 29 21 52, Fax: +33 1 45 29 25 00 http://www.adalog.fr ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: asynchronous task communication 2013-01-03 10:01 ` J-P. Rosen @ 2013-01-03 19:29 ` Charles Hixson 2013-01-04 8:17 ` J-P. Rosen 0 siblings, 1 reply; 56+ messages in thread From: Charles Hixson @ 2013-01-03 19:29 UTC (permalink / raw) On 01/03/2013 02:01 AM, J-P. Rosen wrote: > Le 03/01/2013 01:20, Charles Hixson a écrit : >> Adabrowse can be used to produce decent end-user documentation, but I >> want developer documentation. That means documenting >> function/procedures/types/etc that only exist in *.adb files. > Just put comments after the specification. These will show up when you > fly over any occurrence of the subprogram name in GPS - Very convenient I've just given up on using GPS. Perhaps it works better on MSWindows, but there are too many "silent failures". Last time it wouldn't generate a docs file because I'd forgotten a "private" declaration, but it didn't even tell me that it failed. Building the program fails, but not silently. I also don't like the way it handles tabs. (My default tab is 3 spaces, not 8, but if there's a way to customize this, I haven't found it.) Etc. Many other small annoyances. I prefer a simple editor with a command pane, like geany or kate. At least when things fail, they let you know why. However currently I'm using Eclipse, because the "simple editors" don't recognize Ada symbols. But the Eclipse plug-in seems to be highly version specific, so I may fall back to the editors yet. (The editors *do* have Ada-specific highlighting, though.) > >> I'm after >> something that makes it easy for me to see what routines I've written, >> what they're called, what they're for, > The outline view in GPS shows you this - and the flyover feature of GPS > works there too > >> and any notes to myself that >> aren't appropriate for compile time warnings. I want to be able to come >> back to the code in 6 months or a year and figure out what it's doing >> easily. 
> Especially for implementation, nothing beats appropriate (repeat: > "appropriate") comments in the code. Yes. What I've found, that looks as if it will do what I want, is called NaturalDocs. It's not very Ada-specific, so it could be a lot better, but it recognizes Ada comments, and it has a useful selection of keywords. (That it doesn't distinguish between function and procedure is the main problem I've noticed so far, but it's clear that there will be others, as Ada programs aren't structured the same way that programs in C, C++, or Java are structured.) And I wish the output format were more compact. People spend too much time (or possibly not, as it may be easy) making the format glitzy at the cost of compact readability. ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: asynchronous task communication 2013-01-03 19:29 ` Charles Hixson @ 2013-01-04 8:17 ` J-P. Rosen 2013-01-05 4:31 ` Charles Hixson 0 siblings, 1 reply; 56+ messages in thread From: J-P. Rosen @ 2013-01-04 8:17 UTC (permalink / raw) Le 03/01/2013 20:29, Charles Hixson a écrit : > I also don't like the way it handles tabs. (My default tab is 3 spaces, > not 8, but if there's a way to customize this, I haven't found it.) Etc. Huh? Edit/Preferences/Editor/Ada Personally, I never use hard tabs, but let the editor do the indentation (Ctrl+Tab, but it can be reassigned to Tab if you prefer). Note the option to automatically change tabulations to spaces (uncheck "use tabulations"). -- J-P. Rosen Adalog 2 rue du Docteur Lombard, 92441 Issy-les-Moulineaux CEDEX Tel: +33 1 45 29 21 52, Fax: +33 1 45 29 25 00 http://www.adalog.fr ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: asynchronous task communication 2013-01-04 8:17 ` J-P. Rosen @ 2013-01-05 4:31 ` Charles Hixson 2013-01-09 8:34 ` Stephen Leake 0 siblings, 1 reply; 56+ messages in thread From: Charles Hixson @ 2013-01-05 4:31 UTC (permalink / raw) On 01/04/2013 12:17 AM, J-P. Rosen wrote: > Le 03/01/2013 20:29, Charles Hixson a écrit : >> I also don't like the way it handles tabs. (My default tab is 3 spaces, >> not 8, but if there's a way to customize this, I haven't found it.) Etc. > Huh? Edit/Preferences/Editor/Ada > > Personally, I never use hard tabs, but let the editor do the > indentation (Ctrl+Tab, but it can be reassigned to Tab if you prefer). > Note the option to automatically change tabulations to spaces (uncheck > "use tabulations"). > Thank you. That's one of the minor problems that is fixed. The "silent failures" problem is a more severe one. There are generally valid reasons for the failure. (So far it's always been a syntax problem.) But I can't think of a valid reason for it to fail silently. OTOH, the editors never do the indents quite the way I desire them. (In Ada I'm sort of adapting to the editor style generally, but details matter for readability.) As for hard tabs vs. spaces, well, I prefer hard tabs, because it's easier to change indentation after it's entered, but I can live with it either way. The 8 spaces/tab rule, though, I find ridiculous. I know it's an old default from way back when, but even then it didn't make any sense to me. In Fortran (forget the version, but it was one of the fixed-format ones) I used an editor that let me set the tab stops. I think I set tabs at both 6 and 7, and every three columns after that. And of course key punch machines let you make control cards that would do the same. Editors seem to have simplified their controls in this area, though given the predominance of free-format languages that makes a bit of sense. P.S.: I just checked my settings on GPS. They are already set as desired. 
That's just not the result I'm getting. (OTOH, GPS has been maintaining hard tabs. Those tend to get lost when I pull the code into Eclipse, though. But Eclipse handles the spacing better. And I haven't gotten silent errors...though admittedly the Eclipse plug-in doesn't try to do as much as does GPS.) ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: asynchronous task communication 2013-01-05 4:31 ` Charles Hixson @ 2013-01-09 8:34 ` Stephen Leake 0 siblings, 0 replies; 56+ messages in thread From: Stephen Leake @ 2013-01-09 8:34 UTC (permalink / raw) Charles Hixson <charleshixsn@earthlink.net> writes: > Thank you. That's one of the minor problems that is fixed. The > "silent failures" is a more severe one. There are generally valid > reasons for the failure. (So far it's always been a syntax problem.) > But I can't think of a valid reason for it to fail silently. You should only run the document generator on code that compiles cleanly. I use a Makefile for things like that; probably not easy to do with GPS based tools. It is easy to do in Emacs. > OTOH, the editors never do the indents quite the way I desire them. > (In Ada I'm sort of adapting to the editor style generally, but > details matter for readability.) Have you tried Emacs Ada mode 5.0? I'm currently rewriting it; now is the time to get it to work the way you want. It's in good enough shape to use on real projects, for most editing tasks. http://stephe-leake.org/emacs/ada-mode/emacs-ada-mode.html > As for hard tabs vs. spaces, well, I prefer hard tabs, because it's > easier to change indentation after it's entered, but I can live with > it either way. You are in the minority, but Emacs supports that style. > The 8 spaces/tab rule, though, I find ridiculous. The point is that you need some standard, so you can use your editor to display my code, and vice versa. If my editor has a hard tab every 3 spaces, and yours every 4, our code will be pretty unreadable in the other editor. Or you need a standard way to store the tab settings in the source file, so the editor can automatically set the right hard tab stops. Emacs and other editors support ways to do this, but they are not portable between editors, so it doesn't solve the problem. OpenDoc has a standard way, but that's not suitable for source code. 
Although I guess you could run OpenDoc source thru a preprocessor to strip out all the meta stuff, then feed it to the compiler. > I know it's an old default from way back when, but even then it didn't > make any sense to me. Defaults never make sense; they just exist :). -- -- Stephe ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: asynchronous task communication 2013-01-03 0:20 ` Charles Hixson ` (2 preceding siblings ...) 2013-01-03 10:01 ` J-P. Rosen @ 2013-01-03 22:27 ` Randy Brukardt 2013-01-05 5:18 ` Charles Hixson 3 siblings, 1 reply; 56+ messages in thread From: Randy Brukardt @ 2013-01-03 22:27 UTC (permalink / raw) "Charles Hixson" <charleshixsn@earthlink.net> wrote in message news:MY2dnX5O5NO_TXnNnZ2dnUVZ_hudnZ2d@earthlink.com... ... > Yes, it's doable, but I'd *really* rather avoid that. And I'm sure there > must be some reasonable way to deallocate a Vector. That I don't know > what it is doesn't convince me that it doesn't exist. Perhaps Clear or > Delete would do it, but I haven't found this documented. I'm guessing > that if I store access variables in a Vector, and then deallocate the item > the access variable holds (updating the access variable to null), that > will free most of the space, but deallocating the Vector itself is not so > clear. I can't tell whether there are any handles to it that I don't know > about. Perhaps it's somewhere in the AARM. I don't understand this comment (and probably no one else has, either, as no one has commented on it). The entire reason the Ada.Containers library of packages (including Vectors) exists is to eliminate the need to manage storage. (If you don't mind managing the storage, the operations are trivial to write yourself, and probably more efficient as well.) As such, there is quite purposely no visibility into the storage management of any of the Ada.Containers. All that's required is that the memory is freed when the object is destroyed or overwritten (see A.18.2(253)). Of course, if you're putting an access to some object into a container, you still have to manage the storage for the object. It's better (but not always possible) to put the objects themselves directly into the containers. When that's not possible, sometimes it's possible to put cursors (for another container) into a container. 
In both cases, the containers will manage the storage so you don't have to. Personally, I wouldn't bother with using a Vector container for access values. You're still going to have to do most of the storage management yourself; you might as well do all of it yourself. An access to unconstrained array of access values is easy to manage and you won't have any questions about when memory is freed. The most painful case happens when you need to grow the structure, but just copying it into a larger value is not likely to be very expensive because of the small size of the elements. (After all, this is pretty much what is happening inside of an instance of Vectors.) The big value of the containers is when managing large elements directly, especially indefinite elements like class-wide objects (something Ada doesn't allow you to do directly). Just because you *can* store access values into a container doesn't mean you should. But of course YMMV. Randy. ^ permalink raw reply [flat|nested] 56+ messages in thread
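Randy's suggestion of an access to an unconstrained array of access values, grown by copying into a larger allocation, might look like this minimal sketch (all names are hypothetical, and the doubling factor is an arbitrary choice):

```ada
with Ada.Unchecked_Deallocation;

procedure Grow_Demo is
   type Item;
   type Item_Access is access Item;
   type Item is record
      Value : Integer := 0;
   end record;

   type Item_Array is array (Positive range <>) of Item_Access;
   type Item_Array_Access is access Item_Array;

   procedure Free is
     new Ada.Unchecked_Deallocation (Item_Array, Item_Array_Access);

   Data : Item_Array_Access := new Item_Array'(1 .. 8 => null);
   Last : Natural := 0;

   --  Append an element, doubling the array when it is full.
   procedure Append (X : Item_Access) is
   begin
      if Last = Data'Last then
         declare
            Bigger : constant Item_Array_Access :=
              new Item_Array'(1 .. 2 * Data'Length => null);
         begin
            Bigger (Data'Range) := Data.all;  --  copy the old contents
            Free (Data);                      --  free only the array, not the Items
            Data := Bigger;
         end;
      end if;
      Last := Last + 1;
      Data (Last) := X;
   end Append;
begin
   for I in 1 .. 20 loop
      Append (new Item'(Value => I));
   end loop;
end Grow_Demo;
```

As Randy notes, the accessed Items themselves still have to be freed by hand; only the array bookkeeping is shown here.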
* Re: asynchronous task communication 2013-01-03 22:27 ` Randy Brukardt @ 2013-01-05 5:18 ` Charles Hixson 2013-01-05 8:48 ` Niklas Holsti 2013-01-08 2:41 ` Randy Brukardt 0 siblings, 2 replies; 56+ messages in thread From: Charles Hixson @ 2013-01-05 5:18 UTC (permalink / raw) On 01/03/2013 02:27 PM, Randy Brukardt wrote: > "Charles Hixson"<charleshixsn@earthlink.net> wrote in message > news:MY2dnX5O5NO_TXnNnZ2dnUVZ_hudnZ2d@earthlink.com... > .... >> Yes, it's doable, but I'd *really* rather avoid that. And I'm sure there >> must be some reasonable way to deallocate a Vector. That I don't know >> what it is doesn't convince me that it doesn't exist. Perhaps Clear or >> Delete would do it, but I haven't found this documented. I'm guessing >> that if I store access variables in a Vector, and then deallocate the item >> the access variable holds (updating the access variable to null), that >> will free most of the space, but deallocating the Vector itself is not so >> clear. I can't tell whether there are any handles to it that I don't know >> about. Perhaps it's somewhere in the AARM. > > I don't understand this comment (and probably no one else has, either, as no > one has commented on it). Well, I've now been through the relevant section of the AARM, and there's nothing about deallocating Vectors. The contents of the vector, yes, but not the vector itself. Perhaps this means it's safe to use Unchecked_Deallocation on the vector, as long as *I* don't have any dangling pointers. > > The entire reason the Ada.Containers library of packages (including Vectors) > exists is to eliminate the need to manage storage. (If you don't mind > managing the storage, the operations are trivial to write yourself, and > probably more efficient as well.) As such, there is quite purposely no > visibility into the storage management of any of the Ada.Containers. All > that's required is that the memory is freed when the object is destroyed or > overwritten (see A.18.2(253)). 
> > Of course, if you're putting an access to some object into a container, you > still have to manage the storage for the object. It's better (but not always > possible) to put the objects themselves directly into the containers. When > that's not possible, sometimes it's possible to put cursors (for another > container) into a container. In both cases, the containers will manage the > storage so you don't have to. It would be possible to put the objects directly into the containers, I think. But they will be changing a lot, and copying them in and out would be undesirable. But my main reason for using the container is to get a variable size array. (Second thoughts: No, it wouldn't be possible to put the items into a container, if I substitute item for access variables, because there is a mutual reference and forward reference, and everything would end up as one big lump.) > > Personally, I wouldn't bother with using a Vector container for access > values. You're still going to have to do most of the storage management > yourself; you might as well do all of it yourself. An access to > unconstrained array of access values is easy to manage and you won't have > any questions about when memory is freed. The most painful case happens when > you need to grow the structure, but just copying it into a larger value is > not likely to be very expensive because of the small size of the > elements.(After all, this is pretty much what is happening inside of an > instance of Vectors.) You are probably right that a good custom implementation would be more efficient, but (as is probably obvious) I'm not very familiar with Ada. So my presumption has been that the library implementation would not only be much more likely to be correct than something that I wrote, but that it would also be more efficient. 
> > The big value of the containers is when managing large elements directly, > especially indefinite elements like class-wide objects (something Ada > doesn't allow you to do directly). Just because you *can* store access > values into a container doesn't mean you should. But of course YMMV. > > Randy. Well, the things that I'm planning on managing are themselves indefinite in size, in that they contain variable-length arrays, so an access type seems the right handle. Besides, that allows me to call routines that modify them without copying the whole thing. But all I really want is a variable size array. The fancy stuff, like cursors, etc., don't seem to have any application to my purpose. I want an array, because I want to be able to directly address individual items. OTOH, I don't even see that I need to use tagged types, much less class wide variables. So maybe containers is the wrong thing to look at. But it was the only variable size array library that I found. Please note that when I say variable size, I'm not talking about indefinite. At each instant the array is definite in size...but the size isn't a constant. This kind of thing is normally implemented by something that starts at some base size, and whenever it runs out of space it allocates a new copy on the heap that will hold perhaps twice as many items, and references to this new item replace the old item. Which is the kind of thing I thought Ada.Containers.Vectors was. But details of the implementation are important if you want this to be efficient, so I thought I'd use the Ada.Containers library version. It sounds like this was a mistake. ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: asynchronous task communication 2013-01-05 5:18 ` Charles Hixson @ 2013-01-05 8:48 ` Niklas Holsti 2013-01-06 22:55 ` Charles Hixson 2013-01-08 2:41 ` Randy Brukardt 1 sibling, 1 reply; 56+ messages in thread From: Niklas Holsti @ 2013-01-05 8:48 UTC (permalink / raw) On 13-01-05 07:18 , Charles Hixson wrote: >> "Charles Hixson"<charleshixsn@earthlink.net> wrote in message >> news:MY2dnX5O5NO_TXnNnZ2dnUVZ_hudnZ2d@earthlink.com... >> .... >>> And I'm sure there >>> must be some reasonable way to deallocate a Vector. [ snip ] > Well, I've now been through the relevant section of the AARM, and > there's nothing about deallocating Vectors. The contents of the vector, > yes, but not the vector itself. Perhaps this means it's safe to use > Unchecked_Deallocation on the vector, as long as *I* don't have any > dangling pointers. It means that Vectors handle their own storage, more or less as if there is automatic garbage collection especially for the data in Vector objects. The implementation uses the Ada "Controlled" types to ensure that when a Vector object goes out of scope and disappears, the storage it has allocated to hold the data is automatically deallocated. You should not use Unchecked_Deallocation on a Vector, unless, of course, you yourself have explicitly allocated the Vector itself on the heap, with the "new" syntax. If you do allocate a Vector on the heap, and then do Unchecked_Deallocation on the (access to the) Vector, the storage that the Vector has allocated for its data is automatically deallocated as a consequence of finalizing the Vector object as part of the Unchecked_Deallocation. The end result is that, from the storage allocation point of view, you can think of a Vector in the same way as you think of an ordinary array (of a fixed size). When the program leaves the block in which the array variable was declared, the storage for the array is automatically discarded. The same holds for Vectors (and the other standard Ada containers). 
If your ordinary array holds accesses to other objects, nothing happens to those other objects when the array is discarded (there is no automatic "deep" deallocation). The same holds for Vectors: if the Vector holds accesses to other objects, only the accesses are discarded when the Vector is discarded; nothing happens to those other accessed objects. > But all I really want is a > variable size array. .... > This kind of thing is normally implemented by an > thing that starts at some base size, and whenever it runs out of space > it allocates a new copy on the heap that will hold perhaps twice as many > items, and references to this new item replace the old item. Which is > the kind of thing I thought Ada.Containers.Vectors was. You were right (although the implementation is of course hidden). > But details of > the implementation are important if you want this to be efficient, so I > thought I'd use the Ada.Containers library version. It sounds like this > was a mistake. From the functional point of view, Ada.Containers is exactly what you want. From the performance point of view, note that the Ada standard containers come with some big-Oh complexity requirements. But the only way to find out if they are fast enough for you is to measure. If you don't need cursors, and are not worried about what may happen if some parts of your application insert or delete elements in a Vector that another part of your application is traversing in a loop, you could probably make a slightly faster implementation of variable-sized arrays. I have several such simple implementations, created before Ada.Containers existed. I don't know if they are faster than Ada Vectors, though, and I plan to retire them by and by, and switch to Ada.Containers. -- Niklas Holsti Tidorum Ltd niklas holsti tidorum fi . @ . ^ permalink raw reply [flat|nested] 56+ messages in thread
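Niklas's points can be illustrated with a small sketch (names hypothetical): a heap-allocated Vector is freed with Unchecked_Deallocation, which finalizes it and releases its internal storage, while a stack-declared Vector needs no explicit action at all:

```ada
with Ada.Containers.Vectors;
with Ada.Unchecked_Deallocation;

procedure Vector_Lifetime_Demo is
   package Int_Vectors is new Ada.Containers.Vectors
     (Index_Type => Positive, Element_Type => Integer);

   type Vector_Access is access Int_Vectors.Vector;
   procedure Free is
     new Ada.Unchecked_Deallocation (Int_Vectors.Vector, Vector_Access);

   V : Vector_Access := new Int_Vectors.Vector;  --  explicitly heap-allocated
begin
   for I in 1 .. 1_000 loop
      V.Append (I);
   end loop;
   --  Finalization of the designated Vector runs as part of
   --  Unchecked_Deallocation, releasing its internal storage.
   Free (V);

   --  A stack-declared Vector behaves like an ordinary array:
   declare
      Local : Int_Vectors.Vector;
   begin
      Local.Append (42);
   end;  --  Local's storage is reclaimed here automatically.
end Vector_Lifetime_Demo;
```

If the elements were access values, neither form would free the accessed objects, exactly as Niklas says.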
* Re: asynchronous task communication 2013-01-05 8:48 ` Niklas Holsti @ 2013-01-06 22:55 ` Charles Hixson 2013-01-07 0:38 ` tmoran ` (2 more replies) 0 siblings, 3 replies; 56+ messages in thread From: Charles Hixson @ 2013-01-06 22:55 UTC (permalink / raw) On 01/05/2013 12:48 AM, Niklas Holsti wrote: > On 13-01-05 07:18 , Charles Hixson wrote: >>> "Charles Hixson"<charleshixsn@earthlink.net> wrote in message >>> news:MY2dnX5O5NO_TXnNnZ2dnUVZ_hudnZ2d@earthlink.com... >>> .... >>>> And I'm sure there >>>> must be some reasonable way to deallocate a Vector. > [ snip ] >> Well, I've now been through the relevant section of the AARM, and >> there's nothing about deallocating Vectors. The contents of the vector, >> yes, but not the vector itself. Perhaps this means it's safe to use >> Unchecked_Deallocation on the vector, as long as *I* don't have any >> dangling pointers. > > It means that Vectors handle their own storage, more or less as if there > is automatic garbage collection especially for the data in Vector > objects. The implementation uses the Ada "Controlled" types to ensure > that when a Vector object goes out of scope and disappears, the storage > it has allocated to hold the data is automatically deallocated. > > You should not use Unchecked_Deallocation on a Vector, unless, of > course, you yourself have explicitly allocated the Vector itself on the > heap, with the "new" syntax. If you do allocate a Vector on the heap, > and then do Unchecked_Deallocation on the (access to the) Vector, the > storage that the Vector has allocated for its data is automatically > deallocated as a consequence of finalizing the Vector object as part of > the Unchecked_Deallocation. > > The end result is that, from the storage allocation point of view, you > can think of a Vector in the same way as you think of an ordinary array > (of a fixed size). 
When the program leaves the block in which the array > variable was declared, the storage for the array is automatically > discarded. The same holds for Vectors (and the other standard Ada > containers). > > If your ordinary array holds accesses to other objects, nothing happens > to those other objects when the array is discarded (there is no > automatic "deep" deallocation). The same holds for Vectors: if the > Vector holds accesses to other objects, only the accesses are discarded > when the Vector is discarded; nothing happens to those other accessed > objects. > >> But all I really want is a >> variable size array. .... >> This kind of thing is normally implemented by an >> thing that starts at some base size, and whenever it runs out of space >> it allocates a new copy on the heap that will hold perhaps twice as many >> items, and references to this new item replace the old item. Which is >> the kind of thing I thought Ada.Containers.Vectors was. > > You were right (although the implementation is of course hidden). > >> But details of >> the implementation are important if you want this to be efficient, so I >> thought I'd use the Ada.Containers library version. It sounds like this >> was a mistake. > > From the functional point of view, Ada.Containers is exactly what you > want. From the performance point of view, note that the Ada standard > containers come with some big-Oh complexity requirements. But the only > way to find out if they are fast enough for you is to measure. > > If you don't need cursors, and are not worried about what may happen if > some parts of your application insert or delete elements in a Vector > that another part of your application is traversing in a loop, you could > probably make a slightly faster implementation of variable-sized arrays. > I have several such simple implementations, created before > Ada.Containers existed. 
I don't know if they are faster than Ada > Vectors, though, and I plan to retire them by and by, and switch to > Ada.Containers. > It has been suggested that my intended use of Vectors is a mistake. That it is unreasonable to store access variables into vectors. This is a pity, as I know of no other Ada construct that would suit my needs. (I *do* need to deallocate the vectors, though. I understood that I would need to handle the deallocation of the items that the access variables referred to, but I need to be able to deallocate "objects" that contain the vectors.) Perhaps I should wait until there is a good text on using them. I don't seem to use the correct language for people to understand what I'm talking about, even when I'm not being a bit fuzzy because I'm confused. (And I must admit that this happens often when I am trying to understand some parts of Ada. If it isn't described in Barnes, Ada 95, I'm likely to not know how to talk about it.) At all events, the discussion on how to implement concurrency locking has been very valuable to me, and clarified many ideas that I had on the subject. I appreciate this a lot. ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: asynchronous task communication 2013-01-06 22:55 ` Charles Hixson @ 2013-01-07 0:38 ` tmoran 2013-01-07 6:07 ` Shark8 2013-01-07 10:49 ` Brian Drummond 2 siblings, 0 replies; 56+ messages in thread From: tmoran @ 2013-01-07 0:38 UTC (permalink / raw) > (I *do* need to deallocate the vectors, though. You can manually deallocate things you allocated with "new", but usually one finds there's a LIFO structure to the allocation and deallocation, so in Ada you don't usually "allocate" or "deallocate" but rather you declare an object (in a block or a subprogram) and let the system clean up when you leave the block/subprogram.

   procedure sub is
      x1, x2 : some_kind_of_object;
      ...
   begin
      ...
   end sub;

The system will "allocate" x1 and x2 on entry to the subprogram and automatically "deallocate" them on exit. If "some_kind_of_object" is a "controlled type" then you can write a "Finalize" procedure for that type which the system will call on the way out of the subprogram. If "some_kind_of_object" isn't itself controlled, but is, say, an array or record or something containing object(s) of controlled type, the Finalize for the individual entries will be called as part of exiting the subprogram. ^ permalink raw reply [flat|nested] 56+ messages in thread
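A minimal sketch of the scope-based finalization tmoran describes, assuming an Ada 2005 or later compiler (which permits nested controlled types); the names are invented for illustration:

```ada
with Ada.Finalization;
with Ada.Text_IO;

procedure Scope_Demo is
   type Tracer is new Ada.Finalization.Controlled with null record;

   overriding procedure Finalize (T : in out Tracer);

   overriding procedure Finalize (T : in out Tracer) is
   begin
      Ada.Text_IO.Put_Line ("Finalize called");
   end Finalize;

   procedure Sub is
      X : Tracer;  --  "allocated" on entry to Sub
   begin
      null;        --  ... do work with X ...
   end Sub;        --  Finalize (X) runs here, on exit

begin
   Sub;
end Scope_Demo;
```

The system calls Finalize automatically when Sub is left, whether normally or via an exception, which is the "clean up when you leave the block/subprogram" behaviour tmoran mentions.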
* Re: asynchronous task communication 2013-01-06 22:55 ` Charles Hixson 2013-01-07 0:38 ` tmoran @ 2013-01-07 6:07 ` Shark8 2013-01-07 10:49 ` Brian Drummond 2 siblings, 0 replies; 56+ messages in thread From: Shark8 @ 2013-01-07 6:07 UTC (permalink / raw) On Sunday, January 6, 2013 4:55:05 PM UTC-6, Charles Hixson wrote: > On 01/05/2013 12:48 AM, Niklas Holsti wrote: > > It has been suggested that my intended use of Vectors is a mistake. > That it is unreasonable to store access variables into vectors. This is > a pity, as I know of no other Ada construct that would suit my needs. I think that's some miscommunication, probably because prior to the Containers in Ada the only way to use "vectors" [read dynamically-sized arrays] for indefinite types was (a) to use access values, or (b) construction via function. (This is really still the case, though (a) is cleaned up with implicit dereferencing.) Your problem, as described, was that there could be multiple, varying-sized 'cells' which would be known at generation-time, but also might themselves need to grow. (A perfect candidate for Vector.) You *could* store the access values (I'd recommend a not-null constraint, if possible) in an array, but that's only worthwhile if you need to keep a big 'network' active (which sounds dubious with your mailbox/message analogy, though you might need it in a different part than the message-box). With Indefinite_Vectors you could have something like the following:

   type Message (Length : Positive) is record
      Data : String (1 .. Length) := (others => ' ');
   end record;

   package Message_Box_Pkg is new Ada.Containers.Indefinite_Vectors
     (Index_Type => Positive, Element_Type => Message);

   Message_Box : Message_Box_Pkg.Vector;

And that's it. 
Indeed, going with the "apartment wall of mailboxes" you could do something like:

   subtype Apt_Address is Positive;

   package Mail_Wall is new Ada.Containers.Indefinite_Vectors
     (Index_Type => Apt_Address, Element_Type => Message_Box_Pkg.Vector);

   Mail_Center : Mail_Wall.Vector;

As you can see, none of those are using (exposed) access types; which is generally a good thing. For using vectors w/ access values, you could use something like this:

   type Cell (<>);
   type Link is not null access Cell;
   type Data is array (Positive range <>) of Link;

   type Cell (References : Natural) is record
      case References is
         when 0 =>
            null;
         when others =>
            --  You might even be able to change Links to a
            --  Vector_Package.Vector
            Links : Data (1 .. References);
      end case;
   end record;

   package Vector_Package is new Ada.Containers.Vectors (Positive, Link);

   Petri_Dish : Vector_Package.Vector;

^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: asynchronous task communication 2013-01-06 22:55 ` Charles Hixson 2013-01-07 0:38 ` tmoran 2013-01-07 6:07 ` Shark8 @ 2013-01-07 10:49 ` Brian Drummond 2013-01-07 18:27 ` Jeffrey Carter 2 siblings, 1 reply; 56+ messages in thread From: Brian Drummond @ 2013-01-07 10:49 UTC (permalink / raw)

On Sun, 06 Jan 2013 14:55:05 -0800, Charles Hixson wrote:
> On 01/05/2013 12:48 AM, Niklas Holsti wrote:
>> On 13-01-05 07:18 , Charles Hixson wrote:
>>>> "Charles Hixson"<charleshixsn@earthlink.net> wrote in message
>>>> news:MY2dnX5O5NO_TXnNnZ2dnUVZ_hudnZ2d@earthlink.com...
>>>> ....
>>>>> And I'm sure there must be some reasonable way to deallocate a Vector.
>> If your ordinary array holds accesses to other objects, nothing happens to those other objects when the array is discarded (there is no automatic "deep" deallocation). The same holds for Vectors: if the Vector holds accesses to other objects, only the accesses are discarded when the Vector is discarded; nothing happens to those other accessed objects.
>>
>>> But all I really want is a variable size array.
....
> It has been suggested that my intended use of Vectors is a mistake. That it is unreasonable to store access variables into vectors. This is a pity, as I know of no other Ada construct that would suit my needs.

I think the message may have been to avoid, if possible, the use of access types, rather than to avoid vectors! For example, Ada.Containers.Indefinite_Vectors allows variable-sized objects (not just pointers to them) to be stored in the vectors. It's probably implemented internally using access types, but that's another matter...

Then how do you maintain links from one such object to others, to build a network or graph? You can access a vector element via a cursor - is it safe and reasonable to store and use cursors to maintain references to an object as the vector grows? 
As an alternative, wrap the access type in a controlled type, so that the controlled object can take care of deallocating the accessed resources when it is finalised - and store controlled objects in the vector. When you create such a controlled type, you can specify its Finalize operation (overriding a default) in which you can free its contents however is appropriate - decrement its reference count, Unchecked_Deallocate etc. > (I *do* need to deallocate the vectors, though. No ... when you are done with the contents, you need to *clear* the vectors. This will free their contents. If the contents are controlled types, these will be Finalized to free their own contents in turn - see above. The Barnes 2005 book mentions Clear in connection with Lists, and goes on to say that Lists and Vectors have a lot in common, so it's easy to miss that point. - Brian ^ permalink raw reply [flat|nested] 56+ messages in thread
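A minimal sketch of the controlled-wrapper idea Brian describes, using the reference-counting variant he mentions. All names here (Node, Node_Ref, the Integer payload) are invented for illustration, and the count updates are not task-safe as written; a concurrent program would need a protected object or atomic operations. Adjust bumps the count whenever a handle is copied (for example, when a Vector reallocates its internal storage), and Finalize frees the node when the last handle disappears:

```ada
with Ada.Finalization;
with Ada.Unchecked_Deallocation;

package Node_Refs is

   type Node is record
      Count : Natural := 0;  -- reference count
      Value : Integer := 0;  -- stand-in for the real node contents
   end record;

   type Node_Access is access Node;

   --  The controlled handle to store in a vector instead of a raw access value
   type Node_Ref is new Ada.Finalization.Controlled with record
      Ptr : Node_Access;
   end record;

   overriding procedure Adjust   (Ref : in out Node_Ref);
   overriding procedure Finalize (Ref : in out Node_Ref);

   function New_Node return Node_Ref;

end Node_Refs;

package body Node_Refs is

   procedure Free is new Ada.Unchecked_Deallocation (Node, Node_Access);

   function New_Node return Node_Ref is
   begin
      return Ref : Node_Ref do
         Ref.Ptr := new Node'(Count => 1, Value => 0);
      end return;
   end New_Node;

   overriding procedure Adjust (Ref : in out Node_Ref) is
   begin
      if Ref.Ptr /= null then
         Ref.Ptr.Count := Ref.Ptr.Count + 1;  -- one more view of the node
      end if;
   end Adjust;

   overriding procedure Finalize (Ref : in out Node_Ref) is
   begin
      if Ref.Ptr /= null then
         Ref.Ptr.Count := Ref.Ptr.Count - 1;
         if Ref.Ptr.Count = 0 then
            Free (Ref.Ptr);   -- last reference gone: deallocate the node
         else
            Ref.Ptr := null;  -- other views remain; just drop this one
         end if;
      end if;
   end Finalize;

end Node_Refs;
```

With this, a vector of Node_Ref can simply be Cleared: each element is finalized, and each node is freed exactly when its count reaches zero.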
* Re: asynchronous task communication 2013-01-07 10:49 ` Brian Drummond @ 2013-01-07 18:27 ` Jeffrey Carter 2013-01-08 12:02 ` Brian Drummond 0 siblings, 1 reply; 56+ messages in thread From: Jeffrey Carter @ 2013-01-07 18:27 UTC (permalink / raw) On 01/07/2013 03:49 AM, Brian Drummond wrote: > > Then how do you maintain links from one such object to others, to build a > network or graph? You can access a vector element via a cursor - is it > safe and reasonable to store and use cursors to maintain references to an > object as the vector grows? A "vector" is an unbounded array. You access an array component using an index, and that is also how to reference a component of an unbounded array. The index of a specific component will not change unless the user specifically changes it (by inserting a component with a smaller index, for example). -- Jeff Carter "Now look, Col. Batguano, if that really is your name." Dr. Strangelove 31 ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: asynchronous task communication 2013-01-07 18:27 ` Jeffrey Carter @ 2013-01-08 12:02 ` Brian Drummond 2013-01-08 17:12 ` Jeffrey Carter 0 siblings, 1 reply; 56+ messages in thread From: Brian Drummond @ 2013-01-08 12:02 UTC (permalink / raw) On Mon, 07 Jan 2013 11:27:20 -0700, Jeffrey Carter wrote: > On 01/07/2013 03:49 AM, Brian Drummond wrote: >> >> Then how do you maintain links from one such object to others, to build >> a network or graph? You can access a vector element via a cursor - is >> it safe and reasonable to store and use cursors to maintain references >> to an object as the vector grows? > > A "vector" is an unbounded array. You access an array component using an > index, > and that is also how to reference a component of an unbounded array. The > index of a specific component will not change unless the user > specifically changes it (by inserting a component with a smaller index, > for example). And the index is a Cursor type. So, holding onto a cursor would be unsafe (i.e. no longer refer to the original object) after an "insert" operation, possibly earlier in the vector? - Brian ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: asynchronous task communication 2013-01-08 12:02 ` Brian Drummond @ 2013-01-08 17:12 ` Jeffrey Carter 2013-01-08 18:18 ` Simon Wright 2013-01-08 22:26 ` Brian Drummond 0 siblings, 2 replies; 56+ messages in thread From: Jeffrey Carter @ 2013-01-08 17:12 UTC (permalink / raw) On 01/08/2013 05:02 AM, Brian Drummond wrote: > > And the index is a Cursor type. No. The index is a discrete type chosen by the user. Cursor is a private type defined by the package. Both serve to represent a "location" in the array. While it's clear that Cursor was included to make the package similar to the other containers, Cursor doesn't really make sense for the unbounded-array abstraction. > So, holding onto a cursor would be unsafe (i.e. no longer refer to the > original object) after an "insert" operation, possibly earlier in the > vector? If you have an item in your (unbounded) array, indexed by Positive, at index 3, it will still be at index 3 if you assign to components at other indices, or extend the array by adding new components beyond the current end of the array. However, if you insert a new component before the existing item at index 3, then all the components after the new component will shift up, and the existing item will then be at index 4. This should be pretty obvious if you think about how you would implement an unbounded array. One would think that a Cursor would behave similarly. -- Jeff Carter "Whatever it is, I'm against it." Horse Feathers 46 ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: asynchronous task communication 2013-01-08 17:12 ` Jeffrey Carter @ 2013-01-08 18:18 ` Simon Wright 2013-01-08 20:29 ` Dmitry A. Kazakov 2013-01-08 21:01 ` Jeffrey Carter 2013-01-08 22:26 ` Brian Drummond 1 sibling, 2 replies; 56+ messages in thread From: Simon Wright @ 2013-01-08 18:18 UTC (permalink / raw) Jeffrey Carter <spam.jrcarter.not@spam.not.acm.org> writes: > On 01/08/2013 05:02 AM, Brian Drummond wrote: >> So, holding onto a cursor would be unsafe (i.e. no longer refer to the >> original object) after an "insert" operation, possibly earlier in the >> vector? > > If you have an item in your (unbounded) array, indexed by Positive, at > index 3, it will still be at index 3 if you assign to components at > other indices, or extend the array by adding new components beyond the > current end of the array. However, if you insert a new component > before the existing item at index 3, then all the components after the > new component will shift up, and the existing item will then be at > index 4. > > This should be pretty obvious if you think about how you would > implement an unbounded array. > > One would think that a Cursor would behave similarly. Not so! The Cursor becomes ambiguous[1], and using it is a bounded error. GNAT GPL 2012 implements Cursor using an Index, & behaves the way you'd expect. [1] http://www.ada-auth.org/standards/12rm/html/RM-A-18-2.html#p240 ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: asynchronous task communication 2013-01-08 18:18 ` Simon Wright @ 2013-01-08 20:29 ` Dmitry A. Kazakov 2013-01-08 21:01 ` Jeffrey Carter 1 sibling, 0 replies; 56+ messages in thread From: Dmitry A. Kazakov @ 2013-01-08 20:29 UTC (permalink / raw) On Tue, 08 Jan 2013 18:18:44 +0000, Simon Wright wrote: > Jeffrey Carter <spam.jrcarter.not@spam.not.acm.org> writes: > >> On 01/08/2013 05:02 AM, Brian Drummond wrote: > >>> So, holding onto a cursor would be unsafe (i.e. no longer refer to the >>> original object) after an "insert" operation, possibly earlier in the >>> vector? >> >> If you have an item in your (unbounded) array, indexed by Positive, at >> index 3, it will still be at index 3 if you assign to components at >> other indices, or extend the array by adding new components beyond the >> current end of the array. However, if you insert a new component >> before the existing item at index 3, then all the components after the >> new component will shift up, and the existing item will then be at >> index 4. >> >> This should be pretty obvious if you think about how you would >> implement an unbounded array. >> >> One would think that a Cursor would behave similarly. > > Not so! The Cursor becomes ambiguous[1], and using it is a bounded > error. GNAT GPL 2012 implements Cursor using an Index, & behaves the way > you'd expect. > > [1] http://www.ada-auth.org/standards/12rm/html/RM-A-18-2.html#p240 Yep, cursor is a bad abstraction, with the semantics sometimes ill-defined. Pointers fall into same category. On the contrary, index is a good abstraction for all ordered containers like arrays are. The semantics of index is well defined for all operations on the container. For a graph or a tree, vector is an inappropriate interface because edges are usually enumerated in a different manner than vector does. So, if vectors should be used then rather privately. 
And if nodes of the graph need to be referenced, then cursors or indices should again be used only privately. Usually graphs require some reference-counting scheme when nodes are accessed, with read and read-write accesses differentiated when the graph is to be updated. Furthermore, read-write access may require cloning of nodes on demand, preserving consistency of other views, etc. It is not a simple issue, and it is highly dependent on the problem space. -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: asynchronous task communication 2013-01-08 18:18 ` Simon Wright 2013-01-08 20:29 ` Dmitry A. Kazakov @ 2013-01-08 21:01 ` Jeffrey Carter 2013-01-08 21:14 ` Simon Wright 1 sibling, 1 reply; 56+ messages in thread From: Jeffrey Carter @ 2013-01-08 21:01 UTC (permalink / raw) On 01/08/2013 11:18 AM, Simon Wright wrote: > Jeffrey Carter <spam.jrcarter.not@spam.not.acm.org> writes: > >> If you have an item in your (unbounded) array, indexed by Positive, at >> index 3, it will still be at index 3 if you assign to components at >> other indices, or extend the array by adding new components beyond the >> current end of the array. However, if you insert a new component >> before the existing item at index 3, then all the components after the >> new component will shift up, and the existing item will then be at >> index 4. >> >> One would think that a Cursor would behave similarly. > > Not so! The Cursor becomes ambiguous[1], and using it is a bounded > error. GNAT GPL 2012 implements Cursor using an Index, & behaves the way > you'd expect. To my mind, that is Cursor behaving similarly. Just as the index 3 is no longer the index of the item in question, so the Cursor should not be relied on. -- Jeff Carter "Whatever it is, I'm against it." Horse Feathers 46 ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: asynchronous task communication 2013-01-08 21:01 ` Jeffrey Carter @ 2013-01-08 21:14 ` Simon Wright 2013-01-08 22:11 ` Randy Brukardt 2013-01-08 22:52 ` Jeffrey Carter 0 siblings, 2 replies; 56+ messages in thread From: Simon Wright @ 2013-01-08 21:14 UTC (permalink / raw) Jeffrey Carter <spam.jrcarter.not@spam.not.acm.org> writes: > On 01/08/2013 11:18 AM, Simon Wright wrote: >> Jeffrey Carter <spam.jrcarter.not@spam.not.acm.org> writes: >> >>> If you have an item in your (unbounded) array, indexed by Positive, at >>> index 3, it will still be at index 3 if you assign to components at >>> other indices, or extend the array by adding new components beyond the >>> current end of the array. However, if you insert a new component >>> before the existing item at index 3, then all the components after the >>> new component will shift up, and the existing item will then be at >>> index 4. >>> >>> One would think that a Cursor would behave similarly. >> >> Not so! The Cursor becomes ambiguous[1], and using it is a bounded >> error. GNAT GPL 2012 implements Cursor using an Index, & behaves the way >> you'd expect. > > To my mind, that is Cursor behaving similarly. Just as the index 3 is > no longer the index of the item in question, so the Cursor should not > be relied on. In AdaCore's implementation, yes. But another implementation might validly raise PE. To my mind, that'd be preferable. ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: asynchronous task communication 2013-01-08 21:14 ` Simon Wright @ 2013-01-08 22:11 ` Randy Brukardt 2013-01-08 22:52 ` Jeffrey Carter 1 sibling, 0 replies; 56+ messages in thread From: Randy Brukardt @ 2013-01-08 22:11 UTC (permalink / raw) "Simon Wright" <simon@pushface.org> wrote in message news:lyvcb7v0lk.fsf@pushface.org... > Jeffrey Carter <spam.jrcarter.not@spam.not.acm.org> writes: > >> On 01/08/2013 11:18 AM, Simon Wright wrote: >>> Jeffrey Carter <spam.jrcarter.not@spam.not.acm.org> writes: >>> >>>> If you have an item in your (unbounded) array, indexed by Positive, at >>>> index 3, it will still be at index 3 if you assign to components at >>>> other indices, or extend the array by adding new components beyond the >>>> current end of the array. However, if you insert a new component >>>> before the existing item at index 3, then all the components after the >>>> new component will shift up, and the existing item will then be at >>>> index 4. >>>> >>>> One would think that a Cursor would behave similarly. >>> >>> Not so! The Cursor becomes ambiguous[1], and using it is a bounded >>> error. GNAT GPL 2012 implements Cursor using an Index, & behaves the way >>> you'd expect. >> >> To my mind, that is Cursor behaving similarly. Just as the index 3 is >> no longer the index of the item in question, so the Cursor should not >> be relied on. > > In AdaCore's implementation, yes. But another implementation might > validly raise PE. To my mind, that'd be preferable. That's what Janus/Ada will do (whenever I get the containers written). It's actually fairly cheap to check, and it's silly to allow a bug to go undetected if it can be detected without too much expense. It's not possible to check indexes, of course, so this will be an advantage to using cursors. (But usually you use a vector rather than a list because you need computability of indexes, which of course you can't do with cursors.) Randy. ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: asynchronous task communication 2013-01-08 21:14 ` Simon Wright 2013-01-08 22:11 ` Randy Brukardt @ 2013-01-08 22:52 ` Jeffrey Carter 1 sibling, 0 replies; 56+ messages in thread From: Jeffrey Carter @ 2013-01-08 22:52 UTC (permalink / raw) On 01/08/2013 02:14 PM, Simon Wright wrote: > Jeffrey Carter <spam.jrcarter.not@spam.not.acm.org> writes: > >> To my mind, that is Cursor behaving similarly. Just as the index 3 is >> no longer the index of the item in question, so the Cursor should not >> be relied on. > > In AdaCore's implementation, yes. But another implementation might > validly raise PE. To my mind, that'd be preferable. Whether it gives you a different component, or raises an exception, it can't be relied on. -- Jeff Carter "Whatever it is, I'm against it." Horse Feathers 46 ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: asynchronous task communication 2013-01-08 17:12 ` Jeffrey Carter 2013-01-08 18:18 ` Simon Wright @ 2013-01-08 22:26 ` Brian Drummond 1 sibling, 0 replies; 56+ messages in thread From: Brian Drummond @ 2013-01-08 22:26 UTC (permalink / raw) On Tue, 08 Jan 2013 10:12:30 -0700, Jeffrey Carter wrote: > On 01/08/2013 05:02 AM, Brian Drummond wrote: >> >> And the index is a Cursor type. > > No. The index is a discrete type chosen by the user. Cursor is a private > type defined by the package. Both serve to represent a "location" in the > array. The context was Ada.Containers, where other forms of container use Cursor for indexing; I've been using List recently so it seemed odd that Vector has both index and cursor - but I stand corrected, they do appear to be separate, with overlapping functionality. >> So, holding onto a cursor would be unsafe (i.e. no longer refer to the >> original object) after an "insert" operation, possibly earlier in the >> vector? > > ... if you insert a new component before > the existing item at index 3, then all the components after the new > component will shift up, and the existing item will then be at index 4. > > One would think that a Cursor would behave similarly. Question answered; whether it raises PE or follows Index in losing its place, Cursor cannot be used as a link to an object in the presence of Insert or Remove operations. There is quite a lot of magic already in Ada.Containers (in the sense of making storage management easy and reliable) - that would be a bit too much to ask. (Not fundamentally impossible, but unlikely to be efficient) Underlying point : as Dmitry says, Vector is the wrong abstraction for some (most?) graphs. - Brian ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: asynchronous task communication 2013-01-05 5:18 ` Charles Hixson 2013-01-05 8:48 ` Niklas Holsti @ 2013-01-08 2:41 ` Randy Brukardt 1 sibling, 0 replies; 56+ messages in thread From: Randy Brukardt @ 2013-01-08 2:41 UTC (permalink / raw)

"Charles Hixson" <charleshixsn@earthlink.net> wrote in message news:E_6dnZMOnLGnJHrNnZ2dnUVZ_rednZ2d@earthlink.com...
...
> You are probably right that a good custom implementation would be more efficient, but (as is probably obvious) I'm not very familiar with Ada. So my presumption has been that the library implementation would not only be much more likely to be correct than something that I wrote, but that it would also be more efficient.

It definitely would be more correct, but it's unlikely to be more efficient. The issue being that the containers have to support fairly general facilities, while a custom implementation only needs to support what you actually need. Typically, it makes sense to start by using the containers, but then planning to switch to a custom structure if they are not fast enough. (It's usually very hard to tell what *really* is going to be the bottleneck until you at least have a prototype implementation to test.)

But if you need access types, it's pretty trivial to make a dynamic array type in Ada. Something like:

   type My_Vect is array (Positive range <>) of Some_Access_Type;
   type My_Vect_Access is access My_Vect;

Then, you just allocate the size needed with new:

   Obj := new My_Vect (1..10);

If you need to make the object bigger, just allocate a new one, copy the old one's elements into it, swap the access values, and deallocate the old one. Pretty simple.

> Well, the things that I'm planning on managing are themselves indefinite in size, in that they contain variable length arrays, so an access type seems the right handle. Besides, that allows me to call routines that modify them without copying the whole thing. But all I really want is a variable size array. 
> The fancy stuff, like cursors, etc., don't seem to have any application to my purpose.

Here is where you're confused. A cursor serves the same purpose for a container that an access type serves for one of the built-in datatypes. In addition, you get some (depends on the implementation exactly how much) dangling reference detection when you use cursors. (For some implementations, it's nearly complete.) And, when you use a container to store the objects, and cursors to provide the references to them, you don't have to worry about storage management at all. The container will manage it for you. So I would probably put your objects into some container, and then use the cursors rather than access values in your Vectors. Then you can totally forget about storage management (unless and until you start running out of memory).

Randy.

> I want an array, because I want to be able to directly address individual items. OTOH, I don't even see that I need to use tagged types, much less class wide variables. So maybe containers is the wrong thing to look at. But it was the only variable size array library that I found.
> Please note that when I say variable size, I'm not talking about indefinite. At each instant the array is definite in size...but the size isn't a constant. This kind of thing is normally implemented by a thing that starts at some base size, and whenever it runs out of space it allocates a new copy on the heap that will hold perhaps twice as many items, and references to this new item replace the old item. Which is the kind of thing I thought Ada.Containers.Vectors was. But details of the implementation are important if you want this to be efficient, so I thought I'd use the Ada.Containers library version. It sounds like this was a mistake.

^ permalink raw reply [flat|nested] 56+ messages in thread
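Randy's allocate-copy-swap-deallocate growth step, spelled out as a sketch (reusing his hypothetical type names, with an Integer stand-in for the element type; note that Free releases only the array object, not the objects its elements designate):

```ada
with Ada.Unchecked_Deallocation;

procedure Grow_Sketch is

   type Some_Access_Type is access Integer;  -- stand-in for the real element

   type My_Vect is array (Positive range <>) of Some_Access_Type;
   type My_Vect_Access is access My_Vect;

   procedure Free is
     new Ada.Unchecked_Deallocation (My_Vect, My_Vect_Access);

   --  Double the array: allocate a bigger one, copy the elements,
   --  swap the access values, deallocate the old array.
   procedure Grow (Obj : in out My_Vect_Access) is
      Bigger : My_Vect_Access := new My_Vect (1 .. 2 * Obj'Length);
   begin
      Bigger (Obj'Range) := Obj.all;   -- copy existing elements
      Free (Obj);                      -- old array only; elements untouched
      Obj := Bigger;
   end Grow;

   Obj : My_Vect_Access := new My_Vect'(1 .. 10 => null);

begin
   Grow (Obj);
   pragma Assert (Obj'Length = 20);
end Grow_Sketch;
```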
* Re: asynchronous task communication 2013-01-01 3:51 ` Charles Hixson ` (2 preceding siblings ...) 2013-01-01 12:32 ` Jeffrey Carter @ 2013-01-02 22:43 ` Niklas Holsti 2013-01-03 1:30 ` Charles Hixson 3 siblings, 1 reply; 56+ messages in thread From: Niklas Holsti @ 2013-01-02 22:43 UTC (permalink / raw) On 13-01-01 05:51 , Charles Hixson wrote: > On 12/31/2012 01:09 PM, Niklas Holsti wrote: >> On 12-12-31 20:52 , Charles Hixson wrote: >>> Thank you for your answer. >>> >>> The problem with that solution is that I've been advised that I >>> shouldn't have many more tasks than available cores, >> >> IMO that advice is useful only if your application logically needs fewer >> tasks than you have cores, but you want to increase the number of tasks >> in order to have more cores running in parallel .... >> But if your application logically needs more tasks than there are cores, >> because of the way tasks communicate and interact, there is absolutely >> no reason to limit the number of tasks to the number of cores, or >> anything related to the number of cores. >> > My application logically "needs" thousands or millions of tasks. Hmm. I'm not sure that I agree with you, more below. > But this is clearly impractical, so I'm trying to work around it. A few thousands could be practical, but I agree that a million tasks is not. Even if the scheduling could work, the stacks will need a huge amount of RAM. > The application is clearly going to be "communication bound". I'm not sure about that, either... > OTOH, even > with millions of separate cores, I'd still need asynchronous communication. > > FWIW, you can think of what I'm doing is a neural network simulation. > There are reasons why that isn't quite accurate, but it's close. And > each message is likely (but not certain) to get a reply and may also > cause further messages to be sent on. 
That sounds more like discrete event simulation, simulating a network of interacting automata with discrete states and state transitions. It doesn't sound so much like an analog neural network. > Occasionally new nodes will be > added to the net. So logically, each "cell" should be a separate task, There is always a choice of two representations for an automaton: as an algorithm (i.e. a task body), or as a data object. The former represents the automaton's state as a point in the algorithm (i.e. program counter + call stack + local variables), the latter as a state identifier (an enumeration) and perhaps some additional variables (if the automaton model has this concept). If you have several automata, and represent them as algorithms, and the automata change states at different, scattered times, then it is almost necessary to implement them as tasks -- independently progressing threads of control. In your case, I suspect (but of course I'm not sure) that the logical states of an individual automaton (a network node) are not complex enough to merit representing as an algorithm, so the representation as a data object seems equally valid. Which is why I'm not sure that your application "needs" millions of tasks, even logically (but this logic is subjective). > but even so asynchronous communication is needed to avoid deadlock. Very likely, if the network nodes would be implemented as tasks. But this depends on how your simulation handles time. Do you have a global concept of time-step or cycle, in which all network nodes progress by one elementary step or action, or perhaps stay in the same state if they receive no messages? Or do all nodes run "in parallel" with arbitrary relative speeds? Another question: do you expect the whole network to converge to some stable state (fixed point), as for example in the iterative solution of a partial differential equation? 
Or do you expect the network to continue working and changing, and the interesting result is this dynamic activity rather than some stable final state? In the former case (iteration to stability), I would expect (by analogy with PDEs) that the precise relative execution speeds of the network nodes or cells is less important, and could even be pseudo-random, as long as all nodes make positive progress. In the latter case (iteration to simulate dynamic behaviour) I assume you need some global notion of time and cycles. > Since I can't model cells as tasks, I'm planning on modeling them as > protected variables. Even this may be too limiting, because of the > requirement for asynchronous communication. Particularly since > protected variables are passive, rather than active, so each one is > going to need to wait for a task to activate it. It sounds like each cell implements more or less the same automaton (state transitions), assuming that all the protected objects (variables) are instances of the same protected type. If so, it seems to me that this system is a good candidate for the "worker task pool" approach. A "job" would be the execution of one state transition (message reception?) for one cell. Would this kind of design make some sense: - Define a record type to represent the state of one cell (including its connections to other cells). This does not have to be a protected type. - Create as many worker tasks as there are processor cores. - Allocate a subset of the cells to each worker task. - Each worker task executes the state transitions and actions of all the cells allocated to this task. The task maintains a queue of messages to these cells (a "work list"). - When a cell sends a message to another cell allocated to the same task, the message is simply appended to the message queue of that task. The queue does not need to be protected against concurrent access, because this queue is accessed only by this task. 
- When a cell sends a message to another cell allocated to another task, the message is appended to a queue of "incoming" messages, which is a protected object associated with that other task.

- Each worker task takes and processes messages from its "incoming" queue at suitable times, depending on the presence or absence of a global time cycle.

Since all communication is internal, the only reason for a worker task to be idle is if it happens that none of its cells has received any message, which seems unlikely to happen if there are millions of cells. So this design would be computation-bound, not communication-bound. If the incoming-message queues become a contention bottleneck, there could be a separate incoming-message queue for each pair of (sender task, receiver task), or perhaps a lock-free queue implementation would help.

> But this appears to be quite likely to interfere with normal execution. The classic answer is to have two copies of the network and to cycle through one, copying it and state changes into the second. Then, at the end of a cycle, reverse it. This requires the whole thing to be copied in each cycle, even the parts that haven't changed in this cycle.

You can put the two copies of the state in each record object that represents a cell, together with a per-cell indication of which copy is "current". If the cell does not change its state, the current-state indicator is unchanged; nothing need be copied. If the cell changes its state, some data are changed, but at least the old and new states are close to each other, and perhaps in the same cache lines, or at least in the same VM page, which should reduce the cost of the copy.

-- Niklas Holsti Tidorum Ltd niklas holsti tidorum fi . @ .

^ permalink raw reply [flat|nested] 56+ messages in thread
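The per-task "incoming" queue in the design above might be sketched as a protected type like this (the types and names are invented for illustration, not part of any proposed design):

```ada
with Ada.Containers.Doubly_Linked_Lists;

package Worker_Queues is

   type Message is record
      To_Cell : Positive;  -- which cell the message is for
      Value   : Integer;   -- stand-in for the real payload
   end record;

   package Message_Lists is new Ada.Containers.Doubly_Linked_Lists (Message);

   --  One Incoming_Queue per worker task: any task may Put, but only the
   --  owning task calls Drain, taking all pending messages in one
   --  protected action and then processing them outside the lock.
   protected type Incoming_Queue is
      procedure Put (M : in Message);
      procedure Drain (Into : in out Message_Lists.List);
   private
      Pending : Message_Lists.List;
   end Incoming_Queue;

end Worker_Queues;

package body Worker_Queues is

   protected body Incoming_Queue is

      procedure Put (M : in Message) is
      begin
         Pending.Append (M);
      end Put;

      procedure Drain (Into : in out Message_Lists.List) is
      begin
         --  Move empties Pending into Into without copying elements
         Message_Lists.Move (Target => Into, Source => Pending);
      end Drain;

   end Incoming_Queue;

end Worker_Queues;
```

A worker keeps a local Message_Lists.List, calls Drain on its own queue once per cycle, and then walks the local list with no further locking.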
* Re: asynchronous task communication 2013-01-02 22:43 ` Niklas Holsti @ 2013-01-03 1:30 ` Charles Hixson 2013-01-03 12:11 ` Georg Bauhaus 0 siblings, 1 reply; 56+ messages in thread From: Charles Hixson @ 2013-01-03 1:30 UTC (permalink / raw) On 01/02/2013 02:43 PM, Niklas Holsti wrote: > On 13-01-01 05:51 , Charles Hixson wrote: >> On 12/31/2012 01:09 PM, Niklas Holsti wrote: >>> On 12-12-31 20:52 , Charles Hixson wrote: >>>> Thank you for your answer. >>>> >>>> The problem with that solution is that I've been advised that I >>>> shouldn't have many more tasks than available cores, >>> >>> IMO that advice is useful only if your application logically needs fewer >>> tasks than you have cores, but you want to increase the number of tasks >>> in order to have more cores running in parallel > .... >>> But if your application logically needs more tasks than there are cores, >>> because of the way tasks communicate and interact, there is absolutely >>> no reason to limit the number of tasks to the number of cores, or >>> anything related to the number of cores. >>> >> My application logically "needs" thousands or millions of tasks. > > Hmm. I'm not sure that I agree with you, more below. > >> But this is clearly impractical, so I'm trying to work around it. > > A few thousands could be practical, but I agree that a million tasks is > not. Even if the scheduling could work, the stacks will need a huge > amount of RAM. > >> The application is clearly going to be "communication bound". > > I'm not sure about that, either... > >> OTOH, even >> with millions of separate cores, I'd still need asynchronous communication. >> >> FWIW, you can think of what I'm doing is a neural network simulation. >> There are reasons why that isn't quite accurate, but it's close. And >> each message is likely (but not certain) to get a reply and may also >> cause further messages to be sent on. 
> > That sounds more like discrete event simulation, simulating a network of > interacting automata with discrete states and state transitions. It > doesn't sound so much like an analog neural network. It's clearly discrete simulation, but what is being simulated are cells that are acting vaguely like neurons. (But the accent *is* on vaguely). > >> Occasionally new nodes will be >> added to the net. So logically, each "cell" should be a separate task, > > There is always a choice of two representations for an automaton: as an > algorithm (i.e. a task body), or as a data object. The former represents > the automaton's state as a point in the algorithm (i.e. program counter > + call stack + local variables), the latter as a state identifier (an > enumeration) and perhaps some additional variables (if the automaton > model has this concept). Yes, but what I'm doing is modeling the interactions of the cells. I expect them to act in a certain way. (Though there are enough "gosh numbers" that I'm going to be doing a whole lot of "try and find out". Probably enough that I'll need to set up an environment where different versions can compete for the best solution, and breed the best to get a better answer.) OTOH, each cell is going to be maintaining a complex state, so that may not work out either. > > If you have several automata, and represent them as algorithms, and the > automata change states at different, scattered times, then it is almost > necessary to implement them as tasks -- independently progressing > threads of control. > > In your case, I suspect (but of course I'm not sure) that the logical > states of an individual automaton (a network node) are not complex > enough to merit representing as an algorithm, so the representation as a > data object seems equally valid. Which is why I'm not sure that your > application "needs" millions of tasks, even logically (but this logic is > subjective). They probably aren't that complex. Close, though. 
I expect them to use a common algorithm, but they'll have different input histories, which will create a complex state. Actually, one set of them will be "directly connected to external stimuli", and the goal is to predict what the next stimulus will be. > >> but even so asynchronous communication is needed to avoid deadlock. > > Very likely, if the network nodes would be implemented as tasks. But > this depends on how your simulation handles time. Do you have a global > concept of time-step or cycle, in which all network nodes progress by > one elementary step or action, or perhaps stay in the same state if they > receive no messages? Or do all nodes run "in parallel" with arbitrary > relative speeds? The mailbox design sort of coerces a semi-global concept of time. Each thread will run at whatever speed it can manage, but it's going to be cycling through a subset of the cells, and different cells will process different inputs at different speeds. So with the current design there will be a thread level concept of time. I don't see any advantage in slowing down the fast threads to synchronize with the slow ones, though I may try it by implementing a "cycle number", where everyone needs to agree on the cycle before they can start a new one. That would produce a global synchronization as a sort of time. > > Another question: do you expect the whole network to converge to some > stable state (fixed point), as for example in the iterative solution of > a partial differential equation? Or do you expect the network to > continue working and changing, and the interesting result is this > dynamic activity rather than some stable final state? The intent is to provide it with a continuing set of inputs that it will continually try to predict. It will be basically impossible to achieve real accuracy, but only probabilistic accuracy. (I'm planning on feeding it strings of text segmented into "words". [Note the quotes. 
They'll be wordish sort of things, but I may merge trailing punctuation onto the words, I haven't decided.]) > > In the former case (iteration to stability), I would expect (by analogy > with PDEs) that the precise relative execution speeds of the network > nodes or cells is less important, and could even be pseudo-random, as > long as all nodes make positive progress. In the latter case (iteration > to simulate dynamic behaviour) I assume you need some global notion of > time and cycles. I think I may need a direction of time, but I don't think I *need* anything more explicit. The implementation I'm contemplating, however, provides that automatically. > >> Since I can't model cells as tasks, I'm planning on modeling them as >> protected variables. Even this may be too limiting, because of the >> requirement for asynchronous communication. Particularly since >> protected variables are passive, rather than active, so each one is >> going to need to wait for a task to activate it. > > It sounds like each cell implements more or less the same automaton > (state transitions), assuming that all the protected objects (variables) > are instances of the same protected type. > > If so, it seems to me that this system is a good candidate for the > "worker task pool" approach. A "job" would be the execution of one state > transition (message reception?) for one cell. > > Would this kind of design make some sense: > > - Define a record type to represent the state of one cell (including its > connections to other cells). This does not have to be a protected type. That's interesting. I had supposed that it did need to be protected. But now that I think about it, all interactions with other cells are going to be via its mailbox (given my current design), so I guess that it really wouldn't need to be protected. That's probably just a hang-over from an earlier design that didn't work out. > > - Create as many worker tasks as there are processor cores. Check. 
> > - Allocate a subset of the cells to each worker task. Check. > > - Each worker task executes the state transitions and actions of all the > cells allocated to this task. The task maintains a queue of messages to > these cells (a "work list"). Is there some reason the mailbox shouldn't be protected, and attached to each cell? I want each cell to have its own mailbox, which is a protected structure. Cells from other tasks, as well as from this task, will be posting messages there. > > - When a cell sends a message to another cell allocated to the same > task, the message is simply appended to the message queue of that task. > The queue does not need to be protected against concurrent access, > because this queue is accessed only by this task. Check. But cells from the same task can still send messages to the cell. > > - When a cell sends a message to another cell allocated to another task, > the message is appended to a queue of "incoming" messages, which is a > protected object associated with that other task. The problem is that the task that contains the cell may be busy, at which point a task-attached mailbox would block, even though posting the message would be a very quick action. And if the thread were busy for very long, the entire system might slow down considerably. > > - Each worker task takes and processes messages from its "incoming" > queue at suitable times, depending on the presence or absence of a > global time cycle. It's my intention that when the thread activates a cell, the first thing the cell should do is read its incoming mail. > > Since all communication is internal, the only reason for a worker task > to be idle is if it happens that none of its cells has received any > message, which seems unlikely to happen if there are millions of cells. > So this design would be computation-bound, not communication-bound. Even if a cell hasn't received any input, there is a possibility (random chance) that it might spontaneously activate. 
This will be influenced by its current activation level. > > If the incoming-message queues become a contention bottleneck, there > could be a separate incoming-message queue for each pair of (sender > task, receiver task), or perhaps a lock-free queue implementation would > help. > >> But this appears to be >> quite likely to interfere with normal execution. The classic answer is >> to have two copies of the network and to cycle through one, copying it >> and state changes into the second. Then, at the end of a cycle, reverse >> it. This requires the whole thing to be copied in each cycle, even the >> parts that haven't changed in this cycle. > > You can put the two copies of the state in each record object that > represents a cell, together with a per-cell indication of which copy is > "current". If the cell does not change its state, the current-state > indicator is unchanged; nothing need be copied. If the cell changes its > state, some data are changed, but at least the old and new states are > close to each other, and perhaps in the same cache lines, or at least in > the same VM page, which should reduce the cost of the copy. > The mailbox solution occurred to me after that was written. If it would work, as it seems to me that it should, then this isn't needed, as the mailbox would contain all "change the state" information. (Now I just need to design a message format that will handle all cases.) OTOH, if the mailbox needs to be attached per thread rather than per cell, then this may well still be necessary. (But the state is most of what *is* the cell, so that's almost like copying everything...except, of course, for the ones that didn't change. But perhaps a thread attached mailbox could also only hold the state deltas.) I guess a part of the question is "What is the overhead on a protected variable?". 
If it's small, then I'd prefer to have one attached to every cell, as it eliminates one centralized spot of contention, and means that slightly simpler messages can be passed. (They don't need to hold the intended recipient. It's only one access variable and one integer/message, but that would add up.) Also, if I decide to limit the number of messages that a cell can receive in a cycle, then the cell attached mailbox would make this easier. ^ permalink raw reply [flat|nested] 56+ messages in thread
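For concreteness, the per-cell mailbox being discussed here might look roughly like this as an Ada protected type. This is only a sketch under assumed names and a fixed, illustrative capacity, not the poster's actual code: any task may post, and the owning worker empties the whole box in one protected operation, which matches the "read all incoming mail on activation" intent.

```ada
--  Sketch of a per-cell mailbox as a protected type.  Message content,
--  Max_Pending, and all names are illustrative assumptions.
package Cell_Mailboxes is

   type Message is record
      Sender  : Natural := 0;   --  assumed: a cell identifier
      Content : Integer := 0;   --  assumed: whatever a cell sends
   end record;

   Max_Pending : constant := 32;  --  assumed per-cell capacity

   type Message_List is array (Positive range <>) of Message;
   subtype Message_Buffer is Message_List (1 .. Max_Pending);

   protected type Mailbox is
      --  Blocks only while another task briefly holds the lock,
      --  or while the box is full.
      entry Post (M : Message);
      --  Copies out everything pending and leaves the box empty,
      --  so the reader "takes possession" in one operation.
      procedure Take_All (Msgs : out Message_Buffer; Last : out Natural);
   private
      Pending : Message_Buffer;
      Count   : Natural := 0;
   end Mailbox;

end Cell_Mailboxes;

package body Cell_Mailboxes is

   protected body Mailbox is

      entry Post (M : Message) when Count < Max_Pending is
      begin
         Count := Count + 1;
         Pending (Count) := M;
      end Post;

      procedure Take_All (Msgs : out Message_Buffer; Last : out Natural) is
      begin
         Last := Count;
         Msgs (1 .. Count) := Pending (1 .. Count);
         Count := 0;
      end Take_All;

   end Mailbox;

end Cell_Mailboxes;
```

The per-operation cost is one lock/unlock of the protected object (often an OS mutex or futex in a desktop run-time), which is the overhead being asked about.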
* Re: asynchronous task communication 2013-01-03 1:30 ` Charles Hixson @ 2013-01-03 12:11 ` Georg Bauhaus 2013-01-03 13:17 ` Dmitry A. Kazakov 2013-01-05 20:19 ` Charles Hixson 0 siblings, 2 replies; 56+ messages in thread From: Georg Bauhaus @ 2013-01-03 12:11 UTC (permalink / raw) On 03.01.13 02:30, Charles Hixson wrote: > I guess a part of the question is "What is the overhead on a protected variable?". If it's small, then I'd prefer to have one attached to every cell, as it eliminates one centralized spot of contention, and means that slightly simpler messages can be passed. Chances are that a protected object employs a locking mechanism of the OS (at least in a "PC run-time"). This is probably not the fastest, leanest thing to have. However, I guess you could use CAS based access to the letter boxes, in particular if the number of messages is fixed. Pat Rogers has illustrated an example, also referred to in Ada Gems #93 and #98, http://benchmarksgame.alioth.debian.org/u32q/program.php?test=chameneosredux&lang=gnat&id=2 http://www.adacore.com/adaanswers/gems/gem-93-high-performance-multi-core-programming-part-1/ ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: asynchronous task communication 2013-01-03 12:11 ` Georg Bauhaus @ 2013-01-03 13:17 ` Dmitry A. Kazakov 2013-01-05 20:19 ` Charles Hixson 1 sibling, 0 replies; 56+ messages in thread From: Dmitry A. Kazakov @ 2013-01-03 13:17 UTC (permalink / raw) On Thu, 03 Jan 2013 13:11:21 +0100, Georg Bauhaus wrote: > However, I guess you could use > CAS based access to the letter boxes, in particular if the number > of messages is fixed. If there is strictly one producer, which is normally the case for such structures, you do not need anything but simple atomic (RM C.6(3)) read and update in order to implement a lock-free queue or blackboard. No intrinsic CAS. -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 56+ messages in thread
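The single-producer observation above corresponds to the classic bounded single-producer/single-consumer ring: each index is written by exactly one task, so plain atomic loads and stores (RM C.6) suffice and no compare-and-swap is needed. A sketch, with illustrative names and capacity; it relies on Ada's atomic semantics making the index store publish the preceding data write, which holds for pragma/aspect Atomic in GNAT:

```ada
--  Sketch of a lock-free bounded queue for exactly one producer task
--  and one consumer task.  One slot is kept free to distinguish full
--  from empty.  All names and the capacity are illustrative.
package SPSC_Queues is

   Size : constant := 16;  --  assumed capacity (power of two)

   type Index is mod Size;
   type Element_Array is array (Index) of Integer;

   type Queue is limited record
      Head : Index := 0 with Atomic;  --  advanced only by the consumer
      Tail : Index := 0 with Atomic;  --  advanced only by the producer
      Data : Element_Array;
   end record;

   --  Producer side: returns False when the queue is full.
   function Try_Put (Q : in out Queue; X : Integer) return Boolean;
   --  Consumer side: returns False when the queue is empty.
   function Try_Get (Q : in out Queue; X : out Integer) return Boolean;

end SPSC_Queues;

package body SPSC_Queues is

   function Try_Put (Q : in out Queue; X : Integer) return Boolean is
   begin
      if Q.Tail + 1 = Q.Head then     --  one slot kept free
         return False;
      end if;
      Q.Data (Q.Tail) := X;
      Q.Tail := Q.Tail + 1;           --  atomic store publishes X
      return True;
   end Try_Put;

   function Try_Get (Q : in out Queue; X : out Integer) return Boolean is
   begin
      if Q.Head = Q.Tail then
         X := 0;
         return False;
      end if;
      X := Q.Data (Q.Head);
      Q.Head := Q.Head + 1;
      return True;
   end Try_Get;

end SPSC_Queues;
```

With one such queue per (sender task, receiver task) pair, as Niklas suggested earlier in the thread, every queue has a single producer and this pattern applies directly.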
* Re: asynchronous task communication 2013-01-03 12:11 ` Georg Bauhaus 2013-01-03 13:17 ` Dmitry A. Kazakov @ 2013-01-05 20:19 ` Charles Hixson 2013-01-07 4:01 ` Shark8 1 sibling, 1 reply; 56+ messages in thread From: Charles Hixson @ 2013-01-05 20:19 UTC (permalink / raw) On 01/03/2013 04:11 AM, Georg Bauhaus wrote: > On 03.01.13 02:30, Charles Hixson wrote: >> I guess a part of the question is "What is the overhead on a protected >> variable?". If it's small, then I'd prefer to have one attached to >> every cell, as it eliminates one centralized spot of contention, and >> means that slightly simpler messages can be passed. > > Chances are that a protected object employs a locking mechanism > of the OS (at least in a "PC run-time"). This is probably not the > fastest, leanest thing to have. However, I guess you could use > CAS based access to the letter boxes, in particular if the number > of messages is fixed. > > Pat Rogers has illustrated an example, also referred to in Ada > Gems #93 and #98, > http://benchmarksgame.alioth.debian.org/u32q/program.php?test=chameneosredux&lang=gnat&id=2 > > http://www.adacore.com/adaanswers/gems/gem-93-high-performance-multi-core-programming-part-1/ > > Unfortunately, the size is not fixed. I can say that there is one mailbox per cell, but the number of cells is not fixed either. And while I'm planning on limiting the size (capacity) of the mailbox, the largest size is expected to be far larger than the mean size. OTOH, simplifying things is that the mailbox has only one reader, and writers, except the reader, can only append. Also, the reader "takes possession" of the contents of the mailbox, and replaces it with an empty mailbox. At least that's my current design. I've been contemplating a modified design where the mailbox is "attached" to the cell, and holds both out-going and in-coming messages. There would also need to be a "letter-carrier" that collected and distributed the messages. 
This would reduce the sync requirements to a boolean flag set and cleared by the mailbox accessors that said whether the mailbox was busy. If it was busy, the accessing task could decide whether to wait a bit or to do something else and then try again. This would require a "clear and set" on the flag, but nowhere else. This modified design appears clearly superior if there are enough tasks that one can be assigned to be the letter-carrier, and I guess that if I arrange it to be inactive most of the time it might work even on my limited system. (Actually the real design would then make the cell logically dependent on the mailbox, as in normal operation the mailbox could control the rolling in of the cell from secondary storage. Perhaps only occasionally would the entire roster of cells be sequentially activated.) This idea for a design modification is because an investigation of sizing constraints shows that the original design is totally impossible on my system. Even this redesigned system may need to be modified so that mailboxes spend much of their time on disk, if I can devise a way to enable this. Perhaps some sort of rolling activation. Virtual memory isn't a plausible answer because the program needs to be able to save its state quickly and shut down...and then pick up where it left off. (But this is getting far away from the original problem of concurrent design, which now looks solved...I've got in hand a design for a CAS based stack, which is all that the mailbox needs, though I need to modify it a bit to "popall" rather than simply "pop".) ^ permalink raw reply [flat|nested] 56+ messages in thread
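A "popall" stack of the kind described at the end of that message can be built from a Treiber-style CAS push plus a single atomic exchange that swaps the whole list for null, which is exactly the "take possession of the contents and replace with an empty mailbox" operation. The sketch below uses Ada 2022's System.Atomic_Operations.Exchange, which postdates this 2013 thread (the contemporary route was GNAT's intrinsic __sync builtins); names and the payload type are illustrative assumptions, not the poster's design.

```ada
--  Sketch of a lock-free post / take-all mailbox as a linked stack.
--  Many writers push with a CAS loop; the single reader detaches the
--  entire list in one atomic exchange.  Messages come back newest-first.
with System.Atomic_Operations.Exchange;

package Popall_Mailboxes is

   type Message;
   type Message_Access is access all Message with Atomic;
   type Message is record
      Payload : Integer;          --  assumed message content
      Next    : Message_Access;
   end record;

   type Mailbox is limited record
      Top : aliased Message_Access := null;
   end record;

   procedure Post (Box : in out Mailbox; M : Message_Access);
   function Take_All (Box : in out Mailbox) return Message_Access;

end Popall_Mailboxes;

package body Popall_Mailboxes is

   package Ops is new System.Atomic_Operations.Exchange (Message_Access);

   procedure Post (Box : in out Mailbox; M : Message_Access) is
      Old : aliased Message_Access := Box.Top;
   begin
      loop
         M.Next := Old;
         exit when Ops.Atomic_Compare_And_Exchange (Box.Top, Old, M);
         --  On failure, Old has been updated to the current top; retry.
      end loop;
   end Post;

   function Take_All (Box : in out Mailbox) return Message_Access is
   begin
      --  The reader "takes possession" of the whole list in one step,
      --  leaving an empty mailbox behind.
      return Ops.Atomic_Exchange (Box.Top, null);
   end Take_All;

end Popall_Mailboxes;
```

Because nodes are only ever detached all at once by the single reader, the ABA hazard of a lock-free pop of one element does not arise here, which is what makes "popall" simpler than "pop".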
* Re: asynchronous task communication 2013-01-05 20:19 ` Charles Hixson @ 2013-01-07 4:01 ` Shark8 0 siblings, 0 replies; 56+ messages in thread From: Shark8 @ 2013-01-07 4:01 UTC (permalink / raw) On Saturday, January 5, 2013 2:19:02 PM UTC-6, Charles Hixson wrote: > On 01/03/2013 04:11 AM, Georg Bauhaus wrote: > > Perhaps some sort of rolling activation. Virtual memory isn't a plausible > answer because the program needs to be able to save its state quickly and > shutdown...and then to pick-up where it left off. I'm somewhat reminded of the Firebird (aka Interbase) database; for some reason this brings to mind the method they use for handling resolution of conflicting updates and how it handles crashes. My DB class was a few years ago, so I don't really recall these in detail, but they sound like solutions for the problem you're presenting WRT HD save/load. ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: asynchronous task communication 2012-12-31 18:52 ` Charles Hixson 2012-12-31 20:18 ` Shark8 2012-12-31 21:09 ` Niklas Holsti @ 2013-01-01 19:59 ` J-P. Rosen 2 siblings, 0 replies; 56+ messages in thread From: J-P. Rosen @ 2013-01-01 19:59 UTC (permalink / raw) Le 31/12/2012 19:52, Charles Hixson a écrit : > The problem with that solution is that I've been advised that I > shouldn't have many more tasks than available cores, and that uses 2 of > my available 6. True, several clients could use the same postmaster, > but it's still too resource intensive. This might be justified for (almost) permanently active tasks, but don't count those tasks (like a mailbox) that are almost always blocked. -- J-P. Rosen Adalog 2 rue du Docteur Lombard, 92441 Issy-les-Moulineaux CEDEX Tel: +33 1 45 29 21 52, Fax: +33 1 45 29 25 00 http://www.adalog.fr ^ permalink raw reply [flat|nested] 56+ messages in thread