comp.lang.ada
From: William Clodius <wclodius@lanl.gov>
Subject: Re: Ariane 5 failure (Was: Size code Ada and C)
Date: 1998/07/02
Message-ID: <359BFC60.446B@lanl.gov>
In-Reply-To: 6ng8ua$1jp$1@goanna.cs.rmit.edu.au


robin wrote:
> <snip>
> No, it was the unchecked conversion.  If the conversion
> had undergone a magnitude check, the OS would have never
> shut down the SRI.  Any kind of error would cause the
> SRI computer to shut down.  Thus, the programmer should
> have undertaken every precaution to ensure that each and
> every possible cause of an interrupt could not occur.
> <snip>

Your reasoning might be valid if the programmers had been unaware that
Ada's default semantics would cause an exception to be raised and that
this would shut down the computer. The papers indicate that the
programmers did not include an explicit check because they were aware of
the semantics and their consequences, and decided that if the "error"
occurred it was cause for shutting down the computer. Quoting the report:

"To determine the vulnerability of unprotected code, an analysis was
performed on every operation which could give rise to an exception,
including an Operand Error. In particular, the conversion of floating
point values to integers was analysed and operations involving seven
variables were at risk of leading to an Operand Error. This led to
protection being added to four of the variables, evidence of which
appears in the Ada code. However, three of the variables were left
unprotected. No reference to justification of this decision was found
directly in the source code. Given the large amount of documentation
associated with any industrial application, the assumption, although
agreed, was essentially obscured, though not deliberately, from any
external review.

The reason for the three remaining variables, including the one denoting
horizontal bias, being unprotected was that further reasoning indicated
that they were either physically limited or that there was a large
margin of safety, a reasoning which in the case of the variable BH
turned out to be faulty. It is important to note that the decision to
protect certain variables but not others was taken jointly by project
partners at several contractual levels."
...
"The specification of the exception-handling mechanism also contributed
to the failure. In the event of any kind of exception, the system
specification stated that: the failure should be indicated on the
databus, the failure context should be stored in an EEPROM memory (which
was recovered and read out for Ariane 501), and finally, the SRI
processor should be shut down.

It was the decision to cease the processor operation which finally
proved fatal. Restart is not feasible since attitude is too difficult to
re-calculate after a processor shutdown; therefore the Inertial
Reference System becomes useless. The reason behind this drastic action
lies in the culture within the Ariane programme of only addressing
random hardware failures. From this point of view exception - or error -
handling mechanisms are designed for a random hardware failure which can
quite rationally be handled by a backup system."

The assumptions given in the second paragraph were valid for Ariane 4
but not for Ariane 5. The resulting shutdown of the computer was not
required by Ada exception handling; it was an explicit decision of the
Ariane team, driven by the culture of that team.

To emphasize, the team made an explicit decision that any unhandled
exception that occurred was evidence of a hardware error and
justification for turning off the computer. They examined the code for
possible overflows that could trigger such exceptions, found this
specific part of the code, and determined that overflows for this
quantity were not physically possible and hence indicative (for the
Ariane 4) of a hardware failure. They apparently made the decision
several times not to handle this specific exception AND WERE AWARE OF
THE CONSEQUENCES OF NOT HANDLING AN EXCEPTION. Even in the absence of
language-defined exception handling, their reasoning (culture) likely
would have caused them to explicitly insert the check and have it set a
flag which would turn off the computer at a lower level.
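
To make the distinction concrete, here is a minimal sketch in Python (not
the Ariane Ada code). Everything in it is a hypothetical illustration: the
names OperandError, to_int16_unprotected, to_int16_protected and run_cycle
are invented, the 16-bit target range is an assumption of the sketch, and
clamping is just one possible form of "protection". The point is only that
the range check and the shutdown policy are separate decisions: the
conversion raises, and a system-level rule decides what an unhandled
exception means.

```python
INT16_MIN, INT16_MAX = -32768, 32767  # assumed target range for this sketch


class OperandError(Exception):
    """Hypothetical stand-in for the Operand Error named in the report."""


def to_int16_unprotected(value: float) -> int:
    # Range-checked conversion with no local handler:
    # an out-of-range value raises and propagates to the caller.
    result = int(value)
    if not INT16_MIN <= result <= INT16_MAX:
        raise OperandError(f"{value} does not fit in 16 bits")
    return result


def to_int16_protected(value: float) -> int:
    # Explicitly protected conversion: clamp to the representable range
    # instead of raising (one possible protection, assumed here).
    return max(INT16_MIN, min(INT16_MAX, int(value)))


def run_cycle(convert, value: float) -> str:
    # System-level policy, separate from the language semantics:
    # any exception that reaches this level is treated as evidence of a
    # hardware failure, and the unit is shut down.
    try:
        convert(value)
    except OperandError:
        return "shutdown"
    return "running"
```

With these definitions, run_cycle(to_int16_unprotected, 40000.0) reports a
shutdown, while the same value through to_int16_protected keeps the unit
running. Replacing the exception with a hand-written check that sets a
shutdown flag would give the same outcome under the same policy.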

-- 

William B. Clodius		Phone: (505)-665-9370
Los Alamos Nat. Lab., NIS-2     FAX: (505)-667-3815
PO Box 1663, MS-C323    	Group office: (505)-667-5776
Los Alamos, NM 87545            Email: wclodius@lanl.gov



