Re: Distributed Ada, robustness etc.

comp.lang.ada

From: "Dr. Adrian Wrigley" <amtw@linuxchip.demon.co.uk.uk.uk>
Subject: Re: Distributed Ada, robustness etc.
Date: Wed, 31 May 2006 12:40:40 GMT
Message-ID: <pan.2006.05.31.12.45.29.104165@linuxchip.demon.co.uk.uk.uk>
In-Reply-To: 87odxeg2vz.fsf@ludovic-brenta.org

On Wed, 31 May 2006 07:49:20 +0200, Ludovic Brenta wrote:

> Dr. Adrian Wrigley writes:
>> On Mon, 29 May 2006 00:55:11 +0000, Dr. Adrian Wrigley wrote:
>>
>>> On Thu, 25 May 2006 01:12:08 +0000, Dr. Adrian Wrigley wrote:
>>> 
>>>> <snip>
>>>> 
>>>> Hmm.  Seems to have gone quiet round here!
>>> 
>>> perhaps it's the long weekend...
>>> (...continuing the monolog)
>>
>> ...sometimes it feels lonely as an Ada programmer ;-|
> 
> [...]
> 
>> I'm trying to make the code resiliant to unexpected partition termination,
>> bugs, perhaps reboots.  But the gremlins keep thwarting the attempts!
> 
> Hi Adrian,
> 
> I don't have anything useful to tell you, but please keep posting
> here; I find this quite interesting.  Indeed, you seem to be at the
> forefront of distributed Ada technology :)

Thanks for the encouragement!

I had a suspicion that the silence was because c.l.a readers hadn't
met these problems before, rather than my descent into *everybody's*
kill-file :o

Anyway, continuing the story:

I realised that one of the problem areas of the design was that
calls are often attempted into terminated partitions.
The "nameserver" registers the partitions as they are elaborated,
but only unregisters them when communication with them has been
lost.

So I decided to add an unregister RCI call to the nameserver.
When the package that registered itself is in a terminating
partition, it makes the call to unregister itself.  Seemed
like a really sensible idea...
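
The registration scheme described above can be modelled roughly as
follows.  This is only a sketch, written in Python as a language-neutral
model rather than as Ada Annex E code; NameServer, register, unregister,
call and PartitionUnreachable are hypothetical names, not taken from the
actual system or from any PCS.

```python
class PartitionUnreachable(Exception):
    """Stand-in for the error seen when a call into a dead partition fails."""


class NameServer:
    """Toy registry mapping partition names to callables (hypothetical API)."""

    def __init__(self):
        self._partitions = {}

    def register(self, name, entry_point):
        # Called by a partition as it is elaborated.
        self._partitions[name] = entry_point

    def unregister(self, name):
        # The new explicit call: a terminating partition removes itself.
        # Returns True if an entry was actually removed.
        return self._partitions.pop(name, None) is not None

    def call(self, name, *args):
        # Original behaviour: an entry is dropped only after a call
        # into it has failed, i.e. once communication has been lost.
        entry_point = self._partitions[name]
        try:
            return entry_point(*args)
        except PartitionUnreachable:
            self.unregister(name)
            raise

    def registered(self):
        return sorted(self._partitions)
```

With only the lazy path, a terminated partition stays registered until
somebody tries to call it; the explicit unregister is meant to close
that window.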

Unfortunately, the unregister call *always* fails.  I had
implemented the call through the finalization of a Controlled
type.  An instance of the type is in the package, and the
Finalize procedure is called when the partition terminates.
This call, however, seems to take place *after* the PCS for
the partition is brought down, and so the unregister
immediately fails.
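
The ordering problem can be illustrated with a small model.  Again this
is a Python sketch under stated assumptions, not Ada and not a statement
of what Annex E requires: Partition, pcs_up, remote_unregister and
terminate are hypothetical names, and a plain set stands in for the
nameserver.  It only reproduces the behaviour observed here, namely that
an unregister attempted after the PCS has gone down fails.

```python
class CommunicationError(Exception):
    """Stand-in for the failure of a remote call once the PCS is down."""


class Partition:
    def __init__(self, name, registry):
        self.name = name
        self.registry = registry   # a set standing in for the nameserver
        self.pcs_up = True
        self.registry.add(name)    # register during "elaboration"

    def remote_unregister(self):
        # A remote call is only possible while the PCS is still running.
        if not self.pcs_up:
            raise CommunicationError("PCS already shut down")
        self.registry.discard(self.name)

    def _finalize(self):
        # Models the Finalize procedure: try to unregister, report success.
        try:
            self.remote_unregister()
            return True
        except CommunicationError:
            return False

    def terminate(self, pcs_shut_down_first):
        """Run the teardown in the given order; return True if the
        unregister made during finalization succeeded."""
        if pcs_shut_down_first:
            self.pcs_up = False
            return self._finalize()
        succeeded = self._finalize()
        self.pcs_up = False
        return succeeded
```

In this model, terminate(pcs_shut_down_first=True) leaves a stale entry
in the registry, while the reverse order removes it cleanly.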

I do now try to call unregister when a partition terminates,
using another mechanism.  But this doesn't quite match the
needs, and can't be used in all circumstances.  And it's
more complicated :(   If a partition is aborted, the only
way the code can find out is by attempting a call and watching
it fail.  This may be a weakness in Annex E.

Is it legitimate to make RCI calls as library-level
units are finalized?  Shouldn't the PCS be brought up very early,
and be shut down late in a partition's life cycle, so that this
can be done?  Is there another way to implement package
finalization code?

Overall, the system seems to be working OK at the moment, with
overnight testing showing no anomalies.  But I'd like the whole
system to stay up for several months+, with thousands of client
partitions being invoked (serially, not concurrently!).
It's also important that substitution or failure of third-party
library code can happen while the system runs.
I may have achieved this already - only time will tell!
--
Adrian

Thread overview: 11+ messages
2006-05-23 12:14 Distributed Ada, robustness etc Dr. Adrian Wrigley
2006-05-25  1:12 ` Dr. Adrian Wrigley
2006-05-25 10:34   ` Dmitry A. Kazakov
2006-05-29  0:55   ` Dr. Adrian Wrigley
2006-05-30 15:11     ` Dr. Adrian Wrigley
2006-05-31  5:49       ` Ludovic Brenta
2006-05-31 12:40         ` Dr. Adrian Wrigley [this message]
2006-05-31 13:21           ` Jean-Pierre Rosen
2006-05-31 14:38             ` Dr. Adrian Wrigley
2006-05-31 15:38               ` Jean-Pierre Rosen
2006-06-02 10:27           ` Stephen Leake
