From: rav@goanna.cs.rmit.edu.au (++ robin)
Subject: Re: Ariane Crash (Was: Adriane crash)
Date: 1996/08/01
Message-ID: <4torim$ku8@goanna.cs.rmit.edu.au>
In-Reply-To: 4tnip9$k0s@zeus.orl.mmc.com
rgilbert@unconfigured.xvnews.domain (Bob Gilbert) writes:
>In article <4tkfe5$did@goanna.cs.rmit.edu.au>, rav@goanna.cs.rmit.edu.au (++ robin) writes:
>> rgilbert@unconfigured.xvnews.domain (Bob Gilbert) writes:
>>
>> >The error was assuming that the Ariane 4 design would be adequate
>> >for the Ariane 5 system.
>>
>> >> The specific error was that a conversion of a double-precision
>> >> floating-point value (~58 significant bits) to 15 significant
>> >> bits caused fixed-point overflow. The conversion was not
>> >> checked for overflow. It should have been.
>>
>> >It was checked, hence the exception and an exception handler to
>> >take corrective action.
>>
>> ---The SRI computer (& its backup) had an exception
>> handler, to be sure, but it did not have an exception
>> handler to take corrective action. The exception handler
>> shut the computer down.
>Which was the specified corrective action.
---Calling it "corrective" action is stretching the English
language a bit. In no way, shape, or form was the
action "corrective".
>> > Unfortunately the corrective action was
>> >to assume that the SRI had failed and to shut it down. The
>> >software performed exactly as designed.
>>
>> ---The software did not perform as designed. It was
>> intended to shut down the computer only in the event of
>> a hardware error.
>The out of bounds data was considered to be indicative of a random hardware
>fault, at least for the Ariane 4. Perhaps this was not a valid method
>of determining a hardware fault, but it was the design decision.
---Please read what I wrote. The overflow was not a hardware
fault. It was a programming error that should not have occurred,
bearing in mind the "sudden death" nature of the shutdown in the
event of any kind of interrupt.
>> The software shut down the computer
>> because of a programming error. The software performed
>> only as written!
>>
>> >> This is, after all,
>> >> a real-time system. It's a fundamental check that a programmer
>> >> experienced in real-time systems should have carried out.
>> >>
>> >> Control was then passed to the interrupt handler, which
>> >> shut down the system.
>>
>> >Exactly as designed.
>>
>> ---Again, not as designed. It was designed to shut down only
>> in the event that the SRI computer failed. Then the backup
>> would be used.
>Again, the (wrongly assumed) SRI failure was determined by the detection
>of out of bounds data. It was a requirements oversight, not a programming
>oversight, and most certainly not influenced by the programming language used.
---If you make an assumption about the range of data,
and you are wrong, it is a programming error.
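The range assumption can be made explicit in code. What follows is a
minimal sketch in Python, not the SRI flight code (which was written in
Ada). The names to_int16_checked, to_int16_saturating and OperandError
are hypothetical, and the 16-bit signed target (15 significant bits plus
sign) is assumed from the description of the conversion quoted earlier.

```python
# Illustrative only: two ways to guard a float -> 16-bit signed conversion.
# All names here are hypothetical; none come from the SRI software.

INT16_MIN, INT16_MAX = -32768, 32767


class OperandError(Exception):
    """Raised when a value does not fit the 16-bit signed target."""


def to_int16_checked(x: float) -> int:
    """Truncate toward zero and refuse values outside the 16-bit range."""
    n = int(x)
    if not INT16_MIN <= n <= INT16_MAX:
        raise OperandError(f"{x!r} does not fit in 16 signed bits")
    return n


def to_int16_saturating(x: float) -> int:
    """Truncate toward zero and clamp to the nearest representable limit."""
    return max(INT16_MIN, min(INT16_MAX, int(x)))
```

Which of the two is appropriate depends on what the caller can do with
an out-of-range value; the point is only that the range is tested at the
conversion instead of being assumed.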
>To quote the report:
> Although the source of the Operand Error has been identified, this in
> itself did not cause the mission to fail. The specification of the
> exception-handling mechanism also contributed to the failure. In the
> event of any kind of exception, the system specification stated that:
> the failure should be indicated on the databus, the failure context
> should be stored in an EEPROM memory (which was recovered and read out
> for Ariane 501), and finally, the SRI processor should be shut down.
>The last sentence of the above is what the requirements stated, and
>exactly what the software did, exactly as designed.
---Again, the interrupt for fixed-point overflow was
not expected to happen. The software DID NOT OPERATE
AS DESIGNED. It failed. You're placing too literal an
interpretation on the first sentence.
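The difference between the two readings can be sketched as follows. This
is a hypothetical Python illustration, not the SRI software: the Fault
enum, both handler names, and the "continue" outcome are assumptions
made for the example. The first handler follows the quoted specification
(indicate on the databus, store context, shut down on any exception);
the second shuts down only for a hardware fault.

```python
# Hypothetical sketch of two exception-handling policies.
# Nothing here is taken from the actual SRI code.
from enum import Enum, auto


class Fault(Enum):
    HARDWARE = auto()
    OPERAND_ERROR = auto()  # e.g. overflow in a numeric conversion


def handle_as_specified(fault: Fault, log: list) -> str:
    """Any exception: report it, store the context, shut down."""
    log.append(("databus", fault.name))
    log.append(("eeprom", fault.name))
    return "shutdown"


def handle_by_kind(fault: Fault, log: list) -> str:
    """Report and store as before, but shut down only for hardware faults."""
    log.append(("databus", fault.name))
    log.append(("eeprom", fault.name))
    if fault is Fault.HARDWARE:
        return "shutdown"
    return "continue"
```

Under the first policy a software overflow and a hardware failure are
indistinguishable to the rest of the system; under the second they are
not.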