comp.lang.ada
From: rgilbert@unconfigured.xvnews.domain (Bob Gilbert)
Subject: Re: Ariane Crash (Was: Adriane crash)
Date: 1996/07/31
Message-ID: <4tnip9$k0s@zeus.orl.mmc.com>
In-Reply-To: 4tkfe5$did@goanna.cs.rmit.edu.au


In article <4tkfe5$did@goanna.cs.rmit.edu.au>, rav@goanna.cs.rmit.edu.au (++           robin) writes:
> 	rgilbert@unconfigured.xvnews.domain (Bob Gilbert) writes:
> 
> 	>The error was assuming that the Ariane 4 design would be adequate
> 	>for the Ariane 5 system.
> 
> 	>> The specific error was that a conversion of a double-precision
> 	>> floating-point value (~58 significant bits) to 15 significant
> 	>> bits caused fixed-point overflow.  The conversion was not
> 	>> checked for overflow.  It should have been.
> 
> 	>It was checked, hence the exception and an exception handler to
> 	>take corrective action.
> 
> ---The SRI computer (& its backup) had an exception
> handler, to be sure, but it did not have an exception
> handler to take corrective action.  The exception handler
> shut the computer down.

Which was the specified corrective action.

> 	> Unfortunately the corrective action was
> 	>to assume that the SRI had failed and to shut it down.  The
> 	>software performed exactly as designed.
> 
> ---The software did not perform as designed.  It was
> intended to shut down the computer only in the event of
> a hardware error.

The out-of-bounds data was considered to be indicative of a random hardware
fault, at least for the Ariane 4.  Perhaps this was not a valid method
of determining a hardware fault, but it was the design decision.
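
To make "checked" concrete, here is a minimal Python sketch of a
range-checked conversion to a 16-bit signed value.  It is only an
illustration, not the flight code: the names `to_int16` and
`OperandError` are hypothetical, and truncation toward zero is an
assumption made for the example.

```python
import math

INT16_MIN = -32768
INT16_MAX = 32767


class OperandError(Exception):
    """Hypothetical stand-in for the out-of-range conversion exception."""


def to_int16(value: float) -> int:
    """Convert a float to a 16-bit signed integer, raising on overflow.

    Assumption for this sketch: the fractional part is truncated.
    """
    # NaN and infinities can never be represented, so reject them first.
    if not math.isfinite(value):
        raise OperandError(f"{value!r} is not a finite number")
    truncated = int(value)  # int() truncates toward zero
    if not INT16_MIN <= truncated <= INT16_MAX:
        raise OperandError(f"{value!r} does not fit in 16 signed bits")
    return truncated
```

The check itself only detects the out-of-bounds value.  What the
system does once the exception is raised is a separate decision, and
that decision is the one under discussion here.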

>  The software shut down the computer
> because of a programming error.  The software performed
> only as written!
> 
> 	>>  This is, after all,
> 	>> a real-time system.  It's a fundamental check that a programmer
> 	>> experienced in real-time systems should have carried out.
> 	>> 
> 	>>    Control was then passed to the interrupt handler, which
> 	>> shut down the system.
> 
> 	>Exactly as designed.
> 
> ---Again, not as designed.  It was designed to shut down only
> in the event that the SRI computer failed.  Then the backup
> would be used.

Again, the (wrongly assumed) SRI failure was determined by the detection
of out-of-bounds data.  It was a requirements oversight, not a programming
oversight, and most certainly not influenced by the programming language used.

To quote the report:

  Although the source of the Operand Error has been identified, this in 
  itself did not cause the mission to fail. The specification of the 
  exception-handling mechanism also contributed to the failure. In the
  event of any kind of exception, the system specification stated that:
  the failure should be indicated on the databus, the failure context 
  should be stored in an EEPROM memory (which was recovered and read out
  for Ariane 501), and finally, the SRI processor should be shut down.

The last part of the above is what the requirements stated, and it is
exactly what the software did, exactly as designed.
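
The three steps quoted from the report can be sketched roughly as
follows.  This is a hedged Python illustration under stated
assumptions, not the SRI software: the class `Sri`, its attributes,
and the `"FAILURE"` marker are all hypothetical stand-ins for the
databus, the EEPROM, and the processor state.

```python
class Sri:
    """Hypothetical model of one SRI unit and its specified handler."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.databus = []    # stand-in for failure indications on the databus
        self.eeprom = []     # stand-in for failure context stored in EEPROM
        self.running = True  # stand-in for the processor being up

    def handle_exception(self, exc: Exception) -> None:
        # The three actions the quoted specification asks for,
        # applied to any kind of exception.
        self.databus.append("FAILURE")  # 1. indicate failure on the databus
        self.eeprom.append(repr(exc))   # 2. store the failure context
        self.running = False            # 3. shut the processor down

    def step(self, task) -> None:
        """Run one task; any exception triggers the specified shutdown."""
        if not self.running:
            return
        try:
            task()
        except Exception as exc:
            self.handle_exception(exc)


def out_of_range_conversion() -> None:
    # Hypothetical task that fails identically on every unit running it.
    raise OverflowError("value does not fit in 16 signed bits")
```

Running `step(out_of_range_conversion)` on two `Sri` instances, one
standing in for the primary and one for the backup, leaves both with
`running` set to `False`.  A handler written this way meets the quoted
requirement, yet it cannot tell a random hardware fault from a fault
that every unit running the same software will reproduce.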


-Bob