From mboxrd@z Thu Jan  1 00:00:00 1970
From: rav@goanna.cs.rmit.edu.au (++           robin)
Subject: Re: Ariane 5 - not an exception?
Date: 1996/08/01
Message-ID: <4totv7$o9f@goanna.cs.rmit.edu.au>
X-Deja-AN: 171314166
expires: 1 November 1996 00:00:00 GMT
references: <Dv45EJ.8r@fsa.bris.ac.uk> <4t9vdg$jfb@goanna.cs.rmit.edu.au>
 <31FE35BC.1A0D@sanders.lockheed.com>
organization: Comp Sci, RMIT, Melbourne, Australia
newsgroups: comp.software-eng,comp.lang.ada,comp.lang.pl1
nntp-posting-user: rav
Date: 1996-08-01T00:00:00+00:00
List-Id: <comp.lang.ada>


	Steve O'Neill <smoneill@sanders.lockheed.com> writes:

	>++ robin wrote:
	>> ---I think the real lessons are that
	>> 1. real-time programming requires special expertise.

	>Agreed wholeheartedly

	>> 2. the choice of language is suspect.  A better-established
	>>    language such as PL/I -- specifically designed for
	>>    real-time programming -- with robust compilers, and
	>>    with its base of experienced programming
	>>    staff could well have prevented this disaster.

	>I disagree completely!  The language was not the
	>problem; the design decisions in how the language
	>was used were.

---The choice of language is indeed very relevant.
What I wrote in an earlier posting on this topic is highly
apt:

"A PL/I programmer experienced with real-time systems
would have CHALLENGED such a stupid requirement that the
computer be shut down by the error-handler in the event of
a fixed-point overflow.  He would have had it changed.

"I'd go further to say that no experienced PL/I programmer
would have shut down the system as a result of a fixed-point
overflow.

"Furthermore, he would have included a check that the value
did not go out of range;"

	>Ada is completely capable the realm [sic]
	>of real-time programming, has robust 
	>compilers and tools, and has quite a few experienced
	>software engineers capable of implementing 
	>just about any requirements thrown their way (been there, done that).  

	>Had the designers of the system allowed the
	>implementors to use Ada exception mechanisms fully 
	>and properly they could have localized the failure
	>to, at worst, the alignment function

---But all it needed was a check that the value was in range.
Such checks had been included on other similar conversions in
the vicinity!
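
A minimal sketch of the kind of range check meant here, written in
Python rather than Ada or PL/I.  It assumes the conversion is from a
floating-point value to a 16-bit signed integer; the function name and
the saturate-and-flag policy are illustrative assumptions, not the
flight code:

```python
import math

INT16_MIN, INT16_MAX = -32768, 32767


def to_int16_checked(value):
    """Convert a float to a 16-bit signed integer without overflowing.

    Returns (converted, in_range).  Out-of-range inputs are clamped to
    the nearest limit and flagged, so the caller can decide what to do
    instead of the conversion raising an overflow.
    """
    if math.isnan(value):
        return 0, False              # no meaningful value to convert
    if value >= INT16_MAX + 1:
        return INT16_MAX, False      # too large (includes +infinity)
    if value <= INT16_MIN - 1:
        return INT16_MIN, False      # too small (includes -infinity)
    return math.trunc(value), True   # fits after truncation toward zero


converted, ok = to_int16_checked(40000.0)
if not ok:
    print("value out of range, clamped to", converted)
```

The point is only that the check costs a few comparisons and leaves the
decision with the caller; whether to clamp, substitute a default, or
report the fault is a design choice.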

	>(which 
	>was not necessary at the time of the failure anyway)

---What?  The OBC was using the attitude information to
direct the nozzles.  It was their [the nozzles'] sudden change
that caused the space vehicle to break up, thereby forcing
the vehicle to self-destruct automatically [that sudden
change was the result of the OBC interpreting the error
readout from the shut-down SRI computer as attitude data.]
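
One way to keep a consumer from reading an error readout as data is to
tag every message with its validity.  The sketch below is in Python;
the names SriMessage and attitude_to_use, the three-component attitude
and the hold-last-good-value policy are all hypothetical, chosen only
to illustrate the idea:

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Attitude = Tuple[float, float, float]


@dataclass(frozen=True)
class SriMessage:
    """Hypothetical message format: the payload is explicitly tagged."""
    valid: bool
    attitude: Optional[Attitude] = None
    diagnostic: Optional[str] = None


def attitude_to_use(msg, last_good):
    """Return the attitude to steer by.

    A message that is not marked valid is never treated as attitude
    data; the last good value is held instead (an assumed policy).
    """
    if msg.valid and msg.attitude is not None:
        return msg.attitude
    return last_good


good = SriMessage(valid=True, attitude=(0.1, 0.2, 0.3))
failed = SriMessage(valid=False, diagnostic="operand error")

current = attitude_to_use(good, (0.0, 0.0, 0.0))
current = attitude_to_use(failed, current)   # still (0.1, 0.2, 0.3)
```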

	> without shutting down the entire device.  
	>Instead, as is common practice in the safety-
	>critical world, local exception handlers are 
	>frequently banned and a global 'shut it all
	>down' handler is the only stop gap measure.  
	>Unbelievably the rationale for disallowing local
	>handlers is because they make it difficult to 
	>verify complete code coverage since they are
	>only executed in the case of exceptional conditions 

---As I wrote in an earlier post:

"This project might well have been written in PL/I, which
has excellent real-time facilities, including error
handling, error simulation and validation facilities.
The language has robust compilers, and experts with many
years of PL/I programming experience.

"As to PL/I facilities, I refer to the SIGNAL statement,
with which given conditions (errors such as fixed-point
overflow) can be signalled as if the condition (error)
actually occurred.

"This alone would have shown up the deficiency of the
overall design (that the system would shut itself down for
fixed-point overflow)."

	>(i.e. given the expected data (Ariane 4 profile)
	>the handlers are not executed and therefore we 
	>can't prove that all of our code has been
	>exercised at least once).

---But they can be, and shown to be, in PL/I -- the language
with the right tools -- with the SIGNAL statement.  That
statement leaves an indisputable footprint!
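
The idea of raising a condition on demand, so that a handler which
never runs on expected data is still exercised and leaves a record, can
be sketched as follows.  This is Python, where "raise" is only a loose
analogue of PL/I's SIGNAL (which invokes the ON-unit established for
the condition); the align function, its log and the injected fault are
hypothetical:

```python
def align(read_sensor, log):
    """Hypothetical alignment step with a local overflow handler."""
    try:
        value = read_sensor()
        return ("ok", value)
    except OverflowError as exc:
        # The handler records that it ran, then degrades gracefully
        # instead of shutting the whole unit down.
        log.append("overflow handled: %s" % exc)
        return ("degraded", None)


def signal_overflow():
    """Stand-in for SIGNAL: raise the condition as if it had occurred."""
    raise OverflowError("simulated fixed-point overflow")


log = []
result = align(signal_overflow, log)
print(result)   # ('degraded', None)
print(log)      # ['overflow handled: simulated fixed-point overflow']
```

Run as a test, this covers the handler even though nominal inputs would
never reach it, and the log entry is the evidence that it executed.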

	>I find this logic suspect in 
	>the extreme!  As somebody once said "expect the
	>unexpected".  In addition to trying for fault 
	>avoidance through analysis we should also be
	>planning for fault resiliency in the presence of 
	>reality.

---Exactly what I wrote in an earlier posting.

	>Your other conclusions are right on target
	>though - you should never shut a system down
	>(unless its presence is impacting system performance
	>as in the case of babbling nodes et al.) but
	>do indicate its distress to a higher authority
	>who then can take this into account in using the
	>information provided.

	>Steve O'Neill                      | "No,no,no, don't tug on that!
	>Sanders, A Lockheed Martin Company |  You never know what it might
	>smoneill@sanders.lockheed.com      |  be attached to."