From: Ken Garlington
Subject: Re: Ariane-5: can you clarify? (Re: Please do not start a language war)
Date: 1997/03/24
Message-ID: <3336C20D.5C96@lmtas.lmco.com>
References: <332B5495.167EB0E7@eiffel.com>
Organization: Lockheed Martin Tactical Aircraft Systems
Newsgroups: comp.lang.eiffel,comp.object,comp.software-eng,comp.lang.ada

Karel Thönissen wrote:
>
> The problem is: no amount of runtime checks, tests, contracts or
> assertions can gracefully handle in flight the type of error that
> occurred. It is just impossible to convert a 64-bit number with at
> least 17 significant bits into a 16-bit representation.

Actually, it's quite trivial - so long as you are willing to give up
precision. We do it all the time with our flight software.
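For concreteness, here is a minimal sketch in Ada of the kind of lossy
conversion I mean (the type and subprogram names are hypothetical, not the
actual SRI code): the value is clamped to the limits of the 16-bit type
instead of being allowed to overflow.

   with Ada.Text_IO;

   --  Hypothetical sketch: saturating conversion of a 64-bit float to a
   --  16-bit signed integer. Out-of-range inputs are clamped to the
   --  nearest representable value (losing precision) rather than raising
   --  Constraint_Error.
   procedure Saturate_Demo is

      type Int_16 is range -2**15 .. 2**15 - 1;

      function To_Int_16 (Value : Long_Float) return Int_16 is
      begin
         if Value >= Long_Float (Int_16'Last) then
            return Int_16'Last;      --  clamp high
         elsif Value <= Long_Float (Int_16'First) then
            return Int_16'First;     --  clamp low
         else
            return Int_16 (Value);   --  in range: ordinary conversion
         end if;
      end To_Int_16;

   begin
      --  An Ariane-5-sized horizontal bias simply saturates at 32767.
      Ada.Text_IO.Put_Line (Int_16'Image (To_Int_16 (123_456.0)));
   end Saturate_Demo;

Whether the loss of precision is acceptable depends entirely on how the
value is used downstream; the point is only that the conversion is not
"impossible," it is merely lossy.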
> This 16-bit software (as far as the representation of flight data is
> concerned) should not have been taken into space. Truncation or
> provision of best guesses instead of exact data would not have stalled
> the system, but the error in trajectory calculations would accumulate
> to such heights that self-destruction becomes inevitable, albeit
> perhaps a few seconds later than it actually happened.

Many times, sensors (particularly those not purpose-built for a
particular airframe) generate precision far in excess of what is needed
for a given environment. I don't know for sure that was the case here,
but given my experience with IRS/flight control coupling, I would be
willing to bet it was so.

> b) Having the assertions checked in flight does not bring us much
> further either, because there is no way to handle the conversion
> problem, except by having a similar routine with, say, a 32-bit
> representation.

Actually, in this particular case, it may have been sufficient to
terminate the routine that was doing the alignment, since you can't
generally do alignments while the platform is in motion (one obvious
exception is aligning aircraft on a Navy carrier, which can get alignment
data from the carrier). However, I do agree that there are cases where
in-flight exceptions have no obvious resolution.

> But if we expect that 32-bit routine to be used, we would not use the
> 16-bit software in the first place. And even then, there is always a
> risk that more than 32 bits may be needed. At this point an assumption
> about the maximum horizontal bias has to be made, and any assumption
> about the outside world can prove wrong. It was assumed that 16 bits
> were sufficient, and under that assumption the software was correct.
> And for the programming team there was no reason to believe that the
> 16-bit representation would cause problems, as the software was
> developed for Ariane 4. Unfortunately, that assumption was not captured
> as a clear ex-ante or ex-post specification for the software. This
> allowed for reuse outside the applicability range of the software.
>
> !!! THIS SOFTWARE SHOULD NOT HAVE BEEN TAKEN INTO FLIGHT FOR ARIANE 5,
> NO MATTER HOW MANY SAFEGUARDS !!!
>
> However, assertions and the like can be very useful on the ground.
> Proper tools can flag the assumptions and report them as ex-post
> specifications of the software module as a whole. Assumptions are all
> those assertions that raise exceptions that are not properly caught and
> handled by exception handlers. Then, during assembly for Ariane 5, one
> would have seen that one of the assumptions regarding the SRI was not
> met.

Only if the assertion was exercised on the ground, of course (assuming
the assertion is not static, as was the case here). There was no attempt
to provide realistic flight data to the IRS, so any non-static assertions
in the code would have been left unexercised.

> Assertions in this respect can be viewed as a reporting aid from the
> programmer who made an essential design decision somewhere deep down in
> the software (it was documented somewhere, so the programmer(s) were
> aware of this design decision), up to those who are going to test,
> verify or use the software. Any tester or verifier worth his income
> would have noticed the invalid assumption (for Ariane 5) that emerged
> from the implementation.

If the assertion were exercised, this would be the case. Of course, if
the assertion were left out, but the test were executed as if the
assertion were there, the test team would have seen the IRS go off-line
(as it did in flight) and taken action.

> Of course, there is still the possibility that these ex-post
> specifications would not be used, but that at least supposes an extra
> level of incompetence or negligence (by this I do not want to insult
> the people on the Ariane project by suggesting that there actually was
> incompetence or negligence).
>
> Notice that hand-crafted documentation or free-format in-code comments
> are too unreliable for this purpose, and that normal condition tests in
> the software do not necessarily signify assumptions. Therefore,
> assertions as a means of documentation add an additional layer of
> security that could not easily and reliably be obtained by other coding
> practices.
>
> Of course, there are a lot of spots in the entire SRI project where
> different decisions at the time might have saved the launcher. But most
> decisions seemed perfectly reasonable, both the decisions during the
> development of the system and later for its reuse.
>
> Testing might have revealed the software fault this time, but next time
> a fault will go uncaught.

I don't understand this argument at all. I can see only three ways in
which assertions provide benefit:

1. Documentation for the human reader. However, if "free-format in-code
comments are too unreliable for this purpose", then how can this be the
primary benefit?

2. Static analysis by automated means. This is certainly useful (in fact,
it's one of the primary reasons why I use Ada). However, in this case,
such analysis would have had to either reject a working case (Ariane 4)
or miss a failure case (Ariane 5).

3. Dynamic analysis by automated means. In order for dynamic analysis to
work, useful input data must be provided (at least, as far as I can
tell). This was not done for Ariane 5.

Moreover, given that #2 and #3 are examples of testing, it's clear to me
that testing is an essential part of making assertions useful.
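As an aside, here is a hypothetical sketch (in Ada, and emphatically not
the actual SRI code) of what such an assertion might look like: the
Ariane 4 assumption "the horizontal bias fits in 16 bits" written down as
a constrained subtype. All names and limits below are illustrative.

   with Ada.Text_IO;

   procedure Bias_Assumption_Demo is

      type Int_16 is range -2**15 .. 2**15 - 1;

      --  The ex-ante specification of the assumption: the converted
      --  horizontal bias is expected to stay within the 16-bit range.
      subtype Horizontal_Bias_Domain is Long_Float
        range Long_Float (Int_16'First) .. Long_Float (Int_16'Last);

      function Convert_Bias (Raw : Long_Float) return Int_16 is
         Checked : constant Horizontal_Bias_Domain := Raw;
         --  Constraint_Error here is exactly the "assertion that raises
         --  an exception which is not properly caught": it documents the
         --  assumption, but it only helps if ground testing actually
         --  drives Raw this high.
      begin
         return Int_16 (Checked);
      end Convert_Bias;

   begin
      --  An Ariane-5-like trajectory value violates the assumption at
      --  once - but only if someone runs the code with such a value.
      Ada.Text_IO.Put_Line (Int_16'Image (Convert_Bias (100_000.0)));
   exception
      when Constraint_Error =>
         Ada.Text_IO.Put_Line
           ("Assumption violated: bias exceeds 16-bit range");
   end Bias_Assumption_Demo;

Whether this counts as documentation, static analysis or dynamic analysis
depends entirely on what the reviewers, the tools, and the test program
do with it - which is the crux of items 1-3 above.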
The claim that testing does not detect all errors is certainly true, but
since this claim is also quite true for assertions (Design By Contract or
otherwise), what does it prove?

> Surely, now that we know the bug, it is simple to think of nice tests
> that would have revealed the bug more cheaply. Testing can reveal the
> presence of bugs, not their absence. At some point one must stop
> testing, and the team thought it had arrived at that point.

Note that the exact same claim is true for assertions: at some point, you
have to stop writing them, for several reasons:

a. There is a finite amount of time to do the analysis to generate the
assertions. For executable assertions, there is a finite amount of time
to exercise them on the ground.

b. Executable assertions (on the ground or in the air) affect timing,
both gross throughput margin and interprocess dependencies, which may
affect the system's success.

c. Particularly for highly critical software, there is a desire to keep
things as simple as possible. Adding code provides opportunities to
generate errors via compiler bugs, for example. This is particularly true
for the case where testing is done with executable assertions active, and
then they are turned off just prior to delivery in order to mitigate item
"b". If the code is correct with assertions active, but incorrect with
assertions suppressed (and I have seen cases of this!), then a serious
problem will result.

d. From a documentation standpoint, excessive assertions may cause
"reader fatigue." As a result, a human being must decide which assertions
are meaningful, and which aren't. Which leads us right back to the Ariane
5 disaster.

With respect to testing, it is important to keep in mind that the testing
that was not done is considered a normal and routine test for such a
system. It is not hindsight to say that it was a bad idea to skip the
test. In fact, I have been part of a program where this idea was
suggested, and I helped shoot it down personally - BEFORE the Ariane 5
accident!

> Testing was not forgotten, but argued obsolete. Maybe this is not
> entirely true:
>
> FLIGHT 501 WAS A TEST FLIGHT, SO THE SOFTWARE WAS TESTED, AND THE
> PRESENCE OF BUGS WAS PROVEN, THEREFORE TEST FLIGHT 501 WAS A SUCCESS.
>
> Any piece of software has an applicability scope. Outside this scope
> the software becomes indeterminate. Assertions are an important (and
> automatic!) aid in the documentation of this applicability scope. NEVER
> DO WITHOUT THEM.

Never say "never"! :)

> Regards, Karel

--
LMTAS - The Fighter Enterprise - "Our Brand Means Quality"
For job listings, other info: http://www.lmtas.com or http://www.lmco.com