From: "Dr. Adrian Wrigley"
Subject: Re: Distributed Ada, robustness etc.
Newsgroups: comp.lang.ada
Date: Tue, 30 May 2006 15:11:29 GMT

On Mon, 29 May 2006 00:55:11 +0000, Dr. Adrian Wrigley wrote:

> On Thu, 25 May 2006 01:12:08 +0000, Dr. Adrian Wrigley wrote:
>
>> Hmm.  Seems to have gone quiet round here!
>
> perhaps it's the long weekend...
> (...continuing the monolog)

...sometimes it feels lonely as an Ada programmer ;-|

I thought I'd put in some code to check if a server partition
is still alive (these functions are in an RCI unit):

-----------------
function PartitionIsLive1 (SDK : SDK_T) return Boolean is
begin
   declare
      Status : String := SDKStatus (SDK.all); -- call another partition
   begin
      return True; -- If we got a result... good!
   end;
exception -- normally a SYSTEM.RPC comms exception
   when others =>
      return False; -- If we got any exception... Bad :(
end PartitionIsLive1;
---------------------

and this works about 99% of the time.
The other 1%, it gets stuck forever on the SDKStatus call :(  (why?)

So, thinking I'd be clever, I put in a select/delay timeout:

-----------------
function PartitionIsLive2 (SDK : SDK_T) return Boolean is
begin
   select
      delay 30.0; -- give the partition adequate time to reply
   then abort
      begin
         declare
            Status : String := SDKStatus (SDK.all); -- call partition
         begin
            return True; -- If we got a result... good!
         end;
      exception -- normally a SYSTEM.RPC comms exception
         when others =>
            return False; -- If we got any exception... Bad :(
      end;
   end select;
   return False; -- Couldn't get reply in time :(
end PartitionIsLive2;
---------------------

This works virtually all of the time.  But not quite.  Sometimes
it still jams.  And all subsequent calls (from other tasks) then jam
on an SDKStatus call to the absent partition.  The whole system
gets "gummed up".

Why doesn't the select/delay method guarantee a timely return
from PartitionIsLive2?

I'm trying to make the code resilient to unexpected partition
termination, bugs, perhaps reboots.  But the gremlins keep thwarting
the attempts!
--
Adrian