comp.lang.ada
From: rav@goanna.cs.rmit.edu.au (++           robin)
Subject: Re: Ariane 5 Failure - Summary Report
Date: 1996/07/26
Message-ID: <4t9tcp$gvo@goanna.cs.rmit.edu.au>
In-Reply-To: 4t6opg$4cp@goanna.cs.rmit.edu.au


	>Ken Garlington <garlingtonke@lmtas.lmco.com> writes:

	>Don't know what happened there, but I was just going to point out
	>that the Ariane 5 report is at:

	>  http://www.esrin.esa.it/htdocs/tidc/Press/Press96/press33.html

	>Be sure to read the full report, which is linked to this page. It
	>goes into some length about the sequence of events (which includes
	>an Ada exception I never heard of before, Operand Error?

---That's fixed-point overflow: a 64-bit floating-point
value was being converted to a 16-bit signed integer.
The conversion was unchecked (a programming error; other
conversions in the same module were checked, but this
value was assumed to stay within range), so the error
condition was raised when the value went out of range.
The exception-handling routine recorded the status of
the error and then shut down the system.
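
   To make the arithmetic concrete, here is a minimal sketch in
Python (not the Ada of the flight software).  A 16-bit signed
integer holds only -32768 to 32767, so any larger value cannot be
converted.  The names convert_to_int16 and OperandError are
hypothetical, chosen for illustration only.

```python
INT16_MIN, INT16_MAX = -2**15, 2**15 - 1   # -32768 .. 32767


class OperandError(ArithmeticError):
    """Hypothetical stand-in for the error named in the report."""


def convert_to_int16(value: float) -> int:
    # Truncate toward zero, then fail if the result needs more than 16 bits.
    result = int(value)
    if not INT16_MIN <= result <= INT16_MAX:
        raise OperandError(f"{value!r} does not fit in a 16-bit signed integer")
    return result
```

   Here convert_to_int16(32767.0) succeeds, while
convert_to_int16(32768.0) raises the error; what happens next is
entirely up to whatever handler is in place.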

	>Maybe it's user
	>defined, or there's a language difference at work).

---A user-defined data conversion that went unchecked.  Three
programming mistakes were made here:

1.  The size of the variable to hold the value (16 bits) was
    inadequate; and

2.  It was assumed that the value would not be large enough
    to overflow; therefore, it was not checked; and

3.  It was folly to suppose that a floating-point value of
    some 58 significant bits could be converted "safely" to
    16 bits.

  The problem then went to the error-handler, which was
designed to shut down the system.  This was a major
blunder.

   An error-handler for overflow should have been included,
but it should have returned control directly to the program
(and this only as an emergency resort).  The code should have
included a check for data out of range (or, better, storage
of adequate size).
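
   A range check of this kind might look like the sketch below
(Python, for illustration only).  Clamping to the nearest
representable value is my assumption about what "appropriate
action" could be; the function name to_int16_checked is
hypothetical, and a real system would choose its own policy.

```python
from typing import Tuple

INT16_MIN, INT16_MAX = -32768, 32767


def to_int16_checked(value: float) -> Tuple[int, bool]:
    """Return (result, in_range); out-of-range input is clamped, never raised."""
    if value != value:
        # NaN has no meaningful integer value: report it as out of range.
        return 0, False
    if value > INT16_MAX:
        return INT16_MAX, False
    if value < INT16_MIN:
        return INT16_MIN, False
    return int(value), True
```

   The caller gets a usable value every time, plus a flag saying
whether the input was in range, so no error handler is involved
at all.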

   This project might well have been written in PL/I, which
has excellent real-time facilities, including error
handling, error simulation and validation facilities.
The language has robust compilers, and there are experts
with many years of PL/I programming experience.

   As to PL/I facilities, I refer to the SIGNAL statement,
with which given conditions (errors such as fixed-point
overflow) can be signalled as if the condition (error)
actually occurred.

   This alone would have shown up the deficiency of the
overall design (that the system would shut itself down for
fixed-point overflow).
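
   The idea of signalling a condition on purpose can be sketched
in Python (not PL/I) as follows.  The class FixedOverflow, the
functions signal and run_step, and the "shutdown" policy are all
hypothetical, there only to show how a simulated error exposes
what the handler does.

```python
class FixedOverflow(ArithmeticError):
    """Hypothetical condition standing in for a fixed-point overflow."""


def signal(condition: type) -> None:
    # Raise the condition on purpose, as if the error had really occurred.
    raise condition("simulated for testing")


def run_step(step) -> str:
    """Run one processing step and report what the handler decided."""
    try:
        step()
        return "ok"
    except FixedOverflow:
        # The handler under test; this one shuts the unit down.
        return "shutdown"
```

   A test that calls run_step(lambda: signal(FixedOverflow)) sees
"shutdown" come back without needing real out-of-range data, which
is the kind of result that ought to prompt a design review.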

   Further, an ON unit can return control simply and easily
to some re-start point, or another convenient point in the
program, or even pass control to the following statement.
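
   Python has no ON units, but try/except gives a rough analogue
of a handler that passes control to the following statement.  In
this sketch the names convert and process are hypothetical, and
reusing the last good value is just one assumed recovery policy.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def convert(value: float) -> int:
    # Truncate toward zero; raise if the result does not fit in 16 bits.
    result = int(value)
    if not INT16_MIN <= result <= INT16_MAX:
        raise OverflowError(value)
    return result


def process(samples):
    """Convert each sample; on overflow, keep the last good value and carry on."""
    outputs, last_good = [], 0
    for sample in samples:
        try:
            last_good = convert(sample)
        except OverflowError:
            pass  # handler returns control to the following statement
        outputs.append(last_good)
    return outputs
```

   With this, process([10.0, 50000.0, 20.0]) gives [10, 10, 20]:
the bad sample is absorbed and the loop keeps running.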

        >With Definitely good "lessons learned" about:

	>1. The limits of exceptions (they are only as good as what you can do
	>when they are raised).

---There's a lot you can do with an exception.  Shutting down
the computer isn't one of them.  I've already itemized
what can be done with an exception.  But in this case,
the proper course is to ensure that values are within
range and to take appropriate action, rather than
to let things get as far as the error handler, which should
be a last resort for catching something overlooked (and,
one hopes, there are none of those).

	>2. The problems with reusing items outside their original environment.

	>3. The need to check inputs and outputs aggressively.

	>4. The pitfalls of assuming that testing all of the components of a
	>system equates to testing the system, as well as the need to use
	>realistic test scenarios.

	>5. The problems with isolating the safety-critical components of a
	>system.

	>So, anyway, we now have another software package written in Ada that
	>caused the loss of a system, and again specification and design issues
	>outside Ada's control are the culprit.

---No, this is a clear programming error.  A PL/I programmer
experienced with real-time systems would have CHALLENGED
such a stupid requirement that the computer be shut down by the
error-handler in the event of a fixed-point overflow.  He would
have had it changed.

   I'd go further to say that no experienced PL/I programmer
would have shut down the system as a result of a fixed-point
overflow.

   Furthermore, he would have included a check that the value
did not go out of range.

   Skills in PL/I and real-time systems would not have gone
astray here.  And probably skills in Ada too.

___________________________________________________________

Extract from full report:

"  * The internal SRI software exception was caused during execution of a
     data conversion from 64-bit floating point to 16-bit signed integer
     value. The floating point number which was converted had a value
     greater than what could be represented by a 16-bit signed integer.
     This resulted in an Operand Error. The data conversion instructions
     (in Ada code) were not protected from causing an Operand Error,
     although other conversions of comparable variables in the same place
     in the code were protected."



