comp.lang.ada
From: rav@goanna.cs.rmit.edu.au (++           robin)
Subject: Re: Ariane 5 - not an exception?
Date: 1996/07/26
Message-ID: <4t9vdg$jfb@goanna.cs.rmit.edu.au>
In-Reply-To: Dv45EJ.8r@fsa.bris.ac.uk


	simonb@pact.srf.ac.uk (Simon Bluck) writes:

	>The Ariane 501 flight failure was due to the raising of an unexpected
	>Ada exception,

---An exception, yes, but not unexpected.

   The programming mistake was the assumption that a
64-bit floating-point value would somehow "fit" into a
16-bit signed integer (15 bits of magnitude).

   There was no check that the data conversion would not
result in overflow, so the problem went to the error
handler, which shut down the system.

	>which was handled by switching off the computer.  The
	>report on this:

	>   http://www.esrin.esa.it/htdocs/tidc/Press/Press96/ariane5rep.html

	>is clear and hard-hitting: it will result in much improved software.
	>But does it get right to the bottom of the issues, and does the
	>software community appreciate that there are fundamental software
	>control problems which can directly give rise to such enormous
	>failures, in this particular case thankfully without loss of life?

	>It is most unfortunate, but must be accepted as true, that if the
	>Ariane software had been written in a less powerful language the
	>numeric overflow might have gone unnoticed, the computers would have
	>remained switched on, and the rocket would have continued its upward
	>flight.

	>Exceptions and assertions are both used, in Ada and C/C++,

---and PL/I

	>to detect
	>software/hardware anomalies.  When one of these trips, it is
	>frequently very difficult for the designer to know how best to handle
	>the problem.

---Not in the case of a simple fixed-point overflow -- as was the
case with Ariane.  It is a fact that real-time programming
has been available in PL/I for some 30 years, and recovery
from errors is standard established practice.

	> To continue may result in corrupt data;

---To continue in this case, the value would probably need to
be set to the maximum, and that would not be corrupt data.

	>to abort is
	>drastic but eliminates the possibility that further processing will
	>compound the problem.

---What?  Here, the lack of further processing resulted in
destruction of the project!

	>The more checks you have, the more likely it is that one of them will
	>trip.  If you can't think of good ways of handling these checks, the
	>end result, for the user, may well be very much worse than if the
	>check had never been performed in the first place.

	>Of the two handling options, neither is really acceptable.  However,
	>there is a third option which ought to be considered: to continue but
	>mark the processed data as suspect.

---There are other, better approaches.  One is to continue
with the maximum value; another is to avoid the 16-bit
variable and use a variable of the same size and type as
the source (here floating-point storage), thus avoiding
the problem altogether.
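
A minimal sketch of the first approach (continue with the maximum
value), written in Python purely for illustration; it is not the
Ariane code, and the function name and the treatment of NaN are
assumptions made for this example.  The conversion clamps to the
16-bit signed range and reports whether clamping took place.

```python
from typing import Tuple

INT16_MIN, INT16_MAX = -32768, 32767

def to_int16_saturating(value: float) -> Tuple[int, bool]:
    """Convert a float to a 16-bit signed integer, clamping on overflow.

    Returns (result, saturated).  Hypothetical helper for illustration.
    """
    if value != value:
        # NaN: mapping it to 0 is an arbitrary choice for this sketch.
        return 0, True
    if value >= INT16_MAX + 1:
        # Too large (including +infinity): continue with the maximum.
        return INT16_MAX, True
    if value <= INT16_MIN - 1:
        # Too small (including -infinity): continue with the minimum.
        return INT16_MIN, True
    # In range: truncate toward zero, as a plain conversion would.
    return int(value), False
```

The second approach needs no code at all: if the result stays in
floating-point storage, there is no narrowing conversion to overflow.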

	>I.e. each data item would have a truth value of 1.0 for good data,
	>0.0 for absolutely rotten data, utilising values in between if you
	>have some idea how good the data is.  If you have numeric overflow,
	>you could set the data to the largest value available, and mark it as
	>suspect.

	>Any data further derived from suspect data must also be marked as
	>suspect.

	>Taking a probabilistic attitude to data would bring a lot of software
	>into the real world where failures can happen at all levels.  Using
	>this approach would make complex mission-critical software like the
	>failing Ariane software much easier to understand and control.  Data
	>would be processed along the same path regardless of whether it is
	>suspect or entirely valid.  Only the end-users of the data would be
	>affected, and where duplication of systems provides redundancy, the
	>algorithm would be to switch to the backup on receiving suspect data,
	>and switch back to the main source if the backup was suspect.

---In Ariane, both the active processor and the backup failed
together, because the same *programming* error was encountered
in both processors, and both were shut down by their respective
error handlers.

	>  If both sources are suspect, then take the least suspect source.  This
	>is simple and you don't lose your vital input data.  The data truth
	>values would be passed on from system to system along with the data.

	>You _never_ switch off a computer, but you may have cause to mark all
	>data emanating from it as suspect.  Leave it up to the users of the
	>data to decide if they want to use it or not - they may have no
	>choice.

---Indeed.
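
The scheme quoted above (a truth value carried with each data item,
suspicion passed on to derived data, and selection of the least
suspect source) might be sketched as follows.  This is an
illustration in Python, not anything from the Ariane software; the
names Tagged, add and pick_source are hypothetical, and taking the
minimum of the input trust values for derived data is an assumption,
since the proposal only says derived data must be marked as suspect.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Tagged:
    value: float
    trust: float  # 1.0 = good data, 0.0 = absolutely rotten data

def add(a: Tagged, b: Tagged) -> Tagged:
    # Derived data is no more trustworthy than its least trusted input
    # (assumed rule for this sketch).
    return Tagged(a.value + b.value, min(a.trust, b.trust))

def pick_source(primary: Tagged, backup: Tagged,
                threshold: float = 1.0) -> Tagged:
    # Prefer the primary while its data is fully trusted.
    if primary.trust >= threshold:
        return primary
    # Otherwise switch to the backup if that is trusted.
    if backup.trust >= threshold:
        return backup
    # Both suspect: take the least suspect source, primary on a tie.
    return primary if primary.trust >= backup.trust else backup
```

Note that neither computer is ever switched off here; a failing
source merely loses the selection until its trust value recovers.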

	>Along with the data truth attribute, you need a data type attribute.
	>This is tending to be relatively standard stuff now that objects are
	>around and need to know what kind of object they are.  But adding a
	>data type field is still something that designers skimp on if not
	>supplied by the language, relying instead on implicit coding of type
	>information in the senders and receivers of data.

	>Lack of type information accounts for why the Ariane flight control
	>was able to interpret diagnostic data as attitude data, virtually
	>guaranteeing catastrophic failure.  At least if attitude data had
	>been cut short it could have continued in a straight line.

---This is more of a lack of communication between the two
programs.  Another design error.
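
One way to make that communication explicit is to tag every message
with its kind and have the receiver check the tag before use.  The
sketch below is illustrative Python, not the Ariane interface; Kind,
Message and next_attitude are hypothetical names, and holding the
last good attitude when a non-attitude message arrives is an assumed
policy chosen for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple

class Kind(Enum):
    ATTITUDE = "attitude"
    DIAGNOSTIC = "diagnostic"

@dataclass(frozen=True)
class Message:
    kind: Kind
    payload: Tuple[float, ...]

def next_attitude(msg: Message,
                  last_good: Tuple[float, ...]) -> Tuple[float, ...]:
    # Only attitude messages may update the attitude estimate.
    if msg.kind is Kind.ATTITUDE:
        return msg.payload
    # Anything else (e.g. a diagnostic dump) is never interpreted as
    # flight data; keep the last good attitude instead.
    return last_good
```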

	>Well, those are what I think are the important lessons to be learned.

---I think the real lessons are that
1. real-time programming requires special expertise.
2. the choice of language is suspect.  A better-established
   language such as PL/I -- specifically designed for
   real-time programming -- with robust compilers, and
   with its base of experienced programming
   staff could well have prevented this disaster.



