From: Karel Thönissen
Subject: Re: Ariane-5: can you clarify? (Re: Please do not start a language war)
Date: 1997/03/19
Message-ID: <3330541E.61AB@hello.nl>
References: <332B5495.167EB0E7@eiffel.com>
Organization: Hello Technologies
Newsgroups: comp.lang.eiffel,comp.object,comp.software-eng,comp.lang.ada,comp.lang.java.tech

nouser@nohost.nodomain wrote:
>
> There are multiple ways of reducing the probability of a catastrophic
> software defect like Ariane 5. Two of them are the following.
> (1) You can test more, enable runtime checks, and implement multiple
> levels of exception handling and recovery. (2) You can adopt a Design
> by Contract methodology.
>
> (1) strikes me as more fundamental and powerful than (2). Design
> by Contract is clearly useful, but secondary if a system isn't built
> on a solid and safe substrate in the first place. Of course, (1)
> costs more money and may also reduce payloads, but that additional
> expense is paid back in lowered risk.
I disagree:

ASSERTIONS COULD HAVE DISQUALIFIED THE SOFTWARE EVEN BEFORE TESTING, THE USE OF ASSERTIONS DURING TESTING DEPENDS ON THE QUALITY OF THE TEST SET, AND IN FLIGHT THERE IS NO USE WHATSOEVER FOR ASSERTIONS

The problem is: no amount of runtime checks, tests, contracts or assertions can gracefully handle in flight the type of error that occurred. It is simply impossible to convert a 64-bit number with at least 17 significant bits into a 16-bit representation. This 16-bit software (as far as the representation of flight data is concerned) should not have been taken into space. Truncation or the provision of best guesses instead of exact data would not have stalled the system, but the error in the trajectory calculations would have accumulated to such heights that self-destruction would have become inevitable, albeit perhaps a few seconds later than it did.

THERE COULD NOT POSSIBLY HAVE BEEN A ROLE FOR ASSERTIONS IN FLIGHT

a) Leaving the assertions unchecked will either cause exceptions somewhere later in the program (probably with the same disastrous effect) or give erroneous results. Very likely, either case will give rise to the destruction of the launcher. The only thing we do know for sure is that program behaviour then becomes hard to predict. As the software was not performing an essential task (oh irony), the latter may not have caused problems for this launch (by luck rather than science), but from a software engineering point of view that makes no difference. It was known that the calculations were, in fact, superfluous, but the software was believed to be reliable and harmless, and that was proven wrong. This time it was an exception gone astray in a superfluous subsystem; next time it will be in an essential part.

b) Having the assertions checked in flight does not bring us much further either, because there is no way to handle the conversion problem, except by having a similar routine with, say, a 32-bit representation.
But if we expected that 32-bit routine to be used, we would not use the 16-bit software in the first place. And even then, there is always a risk that more than 32 bits may be needed. At this point an assumption about the maximum horizontal bias has to be made, and any assumption about the outside world can prove wrong.

It was assumed that 16 bits were sufficient, and under that assumption the software was correct. For the programming team there was no reason to believe that the 16-bit representation would cause problems, as the software was developed for Ariane 4. Unfortunately, that assumption was specified neither as a clear ex-ante nor as a clear ex-post specification for the software. This allowed for reuse outside the applicability range of the software.

!!! THIS SOFTWARE SHOULD NOT HAVE BEEN TAKEN INTO FLIGHT FOR ARIANE 5, NO MATTER HOW MANY SAFEGUARDS !!!

However, assertions and the like can be very useful on the ground. Proper tools can identify the assumptions and report them as ex-post specifications of the software module as a whole. Assumptions are all those assertions that raise exceptions that are not properly caught and handled by exception handlers. Then, during assembly for Ariane 5, one would have seen that one of the assumptions regarding the SRI was not met. Assertions in this respect can be viewed as a reporting aid from the programmer who made an essential design decision somewhere deep down in the software (it was documented somewhere, so the programmers were aware of this design decision) up to those who are going to test, verify or use the software. Any tester or verifier worth his income would have noticed the invalid assumption (for Ariane 5) that emerged from the implementation.
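
The idea of using assertions on the ground as a machine-readable statement of a module's assumptions can be sketched roughly as follows. This is only an illustration in Python, not the SRI code: the names (Assumption, to_int16, unmet_assumptions, horizontal_bias as a dictionary key) are hypothetical, and the envelope numbers are made up rather than taken from any flight data.

```python
from dataclasses import dataclass

INT16_MIN, INT16_MAX = -2**15, 2**15 - 1


@dataclass(frozen=True)
class Assumption:
    """A range the module relies on but does not handle itself."""
    name: str
    low: float
    high: float


# Hypothetical ex-post specification: the 16-bit conversion is only
# valid while the input stays inside this range.
HORIZONTAL_BIAS_FITS_INT16 = Assumption("horizontal_bias", INT16_MIN, INT16_MAX)


def to_int16(value: float) -> int:
    """Convert to a 16-bit signed integer; the assertion states the assumption."""
    a = HORIZONTAL_BIAS_FITS_INT16
    assert a.low <= value <= a.high, f"{a.name}={value} outside [{a.low}, {a.high}]"
    return int(value)


def unmet_assumptions(assumptions, envelope):
    """Ground-side check: name the assumptions a flight envelope violates.

    `envelope` maps a quantity name to the (min, max) expected on the vehicle.
    """
    unmet = []
    for a in assumptions:
        lo, hi = envelope[a.name]
        if lo < a.low or hi > a.high:
            unmet.append(a.name)
    return unmet


# Illustrative envelopes only; these numbers are invented.
old_vehicle = {"horizontal_bias": (-20000.0, 20000.0)}
new_vehicle = {"horizontal_bias": (-70000.0, 70000.0)}

print(unmet_assumptions([HORIZONTAL_BIAS_FITS_INT16], old_vehicle))
print(unmet_assumptions([HORIZONTAL_BIAS_FITS_INT16], new_vehicle))
```

In this sketch the reuse check needs no launch and no test set: comparing the declared assumptions against the envelope of the new vehicle is enough to flag the mismatch.
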
Of course, there is still the possibility that these ex-post specifications would not be used, but that at least presupposes an extra level of incompetence or negligence (by this I do not want to insult the people on the Ariane project by suggesting that there currently is incompetence or negligence). Notice that hand-crafted documentation or free-format in-code comments are too unreliable for this purpose, and that ordinary condition tests in the software do not necessarily signify assumptions. Therefore, assertions as a means of documentation add a layer of security that could not easily and reliably be obtained by other coding practices.

Of course, there are a lot of spots in the entire SRI project where different decisions at the time might have saved the launcher. But most decisions seemed perfectly reasonable, both the decisions during the development of the system and those made later for its reuse.

Testing might have revealed the software fault this time, but next time a fault will go uncaught. Surely, now that we know the bug, it is simple to think of nice tests that would have revealed the bug more cheaply. Testing can reveal the presence of bugs, not their absence. At some point one must stop testing, and the team thought it had arrived at that point. Testing was not forgotten, but argued to be obsolete. Maybe this is not entirely true:

FLIGHT 501 WAS A TEST FLIGHT, SO THE SOFTWARE WAS TESTED, AND THE PRESENCE OF BUGS WAS PROVEN, THEREFORE TEST FLIGHT 501 WAS A SUCCESS.

Any piece of software has an applicability scope. Outside this scope the behaviour of the software becomes indeterminate. Assertions are an important (and automatic!) aid in the documentation of this applicability scope. NEVER DO WITHOUT THEM.

Regards,
Karel