I'm releasing a draft of this paper for people to review prior to sending it on to the Eiffel web site. Please send your comments to kennieg@flash.net. I still need to review the whole Ariane thread from the last go-round (as well as the latest one) in order to get the footnotes completed, and to see if I missed anything (particularly in section 6). I hope to send the paper to Jézéquel and co. before the end of July. Any help would be much appreciated. Thanks. (Note: the paper appears to run about 13 pages on my printer.)

-----

Critique of "Put it in the contract: The lessons of Ariane"

Ken Garlington (kennieg@flash.net)
July 20, 1997

FIRST DRAFT

1. Introduction

On June 4, 1996, the first flight of the Ariane 5 rocket ended in failure. Approximately 40 seconds after initiation of the flight sequence, Ariane 5 veered off its flight path, broke up, and exploded. The European Space Agency (ESA) and the French National Center for Space Studies (CNES), in cooperation with other agencies, established an inquiry board to determine the cause of the accident, to establish whether the system had been adequately tested before use, and to recommend corrective actions for the Ariane 5 system. This inquiry determined that the actions of the software inside the on-board Inertial Reference System (IRS) caused the catastrophe [1]. (For the sake of brevity, the findings of the inquiry board are referred to as "the inquiry" elsewhere in this critique.)

The following year, Jean-Marc Jézéquel of IRISA and Bertrand Meyer of ISE co-wrote a paper [2] titled "Put it in the contract: The lessons of Ariane." That paper attempts to make the case for the use of the Design by Contract (DBC) methodology and the Eiffel language as a corrective action for the Ariane 5 IRS fault. (It will subsequently be referred to as "the Eiffel paper.")

This critique analyzes the Eiffel paper, principally to challenge two fundamental points:

a. That DBC/Eiffel would have been _sufficient_ to prevent the destruction of the Ariane 5, and

b. That DBC/Eiffel would have been _necessary_ to prevent its destruction.

The conclusion is that neither condition is adequately established in the Eiffel paper. In addition, some other concerns with the paper are expressed, and finally some "non-issues" which have been raised in the past against these arguments are addressed.

2. The central claim

What, exactly, does the Eiffel paper claim? The central issue is stated in the paper as follows:

"Does this mean that the [Ariane 5] crash would automatically have been avoided had the mission used a language and method supporting built-in assertions and Design by Contract?
Although it is always risky to draw such after-the-fact conclusions, the answer is probably yes..."

Note three important factors with respect to this statement:

a. It does not say that DBC/Eiffel might have helped. It makes a more aggressive claim: that these techniques would probably have saved Ariane 5. As a result, it seems fair to hold the Eiffel paper to a reasonably strong standard of proof regarding this claim. Conversely, this critique does not need to show that it is impossible for DBC/Eiffel to have saved Ariane 5; only that there are significant doubts about the outcome.

b. Although some have suggested that the Eiffel paper's recommendations can be implemented in other languages [3], it is clear that the paper is based on a specific "language and method". The Eiffel-centric aspect of this claim is reinforced elsewhere in the Eiffel paper [4] and in other statements by the authors [5]. Thus, it appears reasonable to discuss the proposed solution in terms of Eiffel assertions, rather than the broader topic of "good methodology."

c. The inquiry established several causes for the IRS software fault. However, nothing in the Eiffel paper suggests that any action beyond the use of DBC/Eiffel was required to avoid the fault. Thus, it seems logical to assume that all other aspects of the Ariane 5 development effort (in particular, decisions regarding the specification and the associated testing approach) would have remained the same.

3. Would DBC/Eiffel have been sufficient to avoid the failure?

The Eiffel paper states three ways in which DBC/Eiffel would have significantly altered the results of the Ariane 5 development effort:

a. The use of Eiffel assertions as part of a manual review of the code in order to detect errors (the "documentation effect"),

b. The use of such assertions during the testing stage (to be turned off after testing, but before operational use) to detect errors (the "testing effect"), and

c. The use of such assertions during execution to detect (and presumably compensate for) errors (the "execution effect").

Each of these claims is analyzed in detail in the following subsections. This critique asserts that, for each claim, there is insufficient evidence to conclude that the Ariane 5 incident would have been avoided. Furthermore, for each type of use, there are associated risks that have the potential to actually make such an error more probable.

3.1 The documentation effect

The Eiffel paper states:

"...most importantly the assertions are a prime component of the software and its documentation ('short form', produced automatically by tools). In an environment such as that of Ariane where there is so much emphasis on quality control and thorough validation of everything, they would be the QA team's primary focus of attention. Any team worth its salt would have checked systematically that every call satisfies the precondition. That would have immediately revealed that the Ariane 5 calling software did not meet the expectation of the Ariane 4 routines that it called."

There is a chain of events that would all have to occur in order for this statement to be true:

a. The critical assertion would have to be documented (correctly) in the code.

b. The review team would need the proper expertise to detect the problem.

c. The review team would have to review the code, as opposed to other documentation.

d. The review team would have to select the critical assertion from the total set of assertions, and
e. The review team would need the proper source information regarding Ariane 5 to realize that a problem existed.

If any of these events failed to materialize, the documentation effect of the assertion would be nullified. For all of these events, the case can be made that factors specific to the Ariane development effort could have caused such a failure in the process.

3.1.1 Would the critical assertion have been documented (correctly) in the original code?

First, would the critical assertion have been documented in the code at all? Although the Eiffel paper (with hindsight) describes this assertion as a "fundamental constraint", the inquiry notes that the Ariane development team considered this particular condition improbable [6]. Thus, there is a reasonable expectation that the development team might not have included such a constraint in their system.

Some Eiffel advocates have claimed that the use of DBC/Eiffel causes even "improbable" assertions to be included in the code [7]. However, this claim should be viewed with some skepticism. Consider, for example, attempting to show that an object did not modify any data it was not supposed to modify. In theory, this would require each object to specify assertions for all "unused" data within the system, showing that the data had not been affected during the execution of the object. Such a strategy becomes especially difficult when concurrency is introduced (common in real-time systems such as the Ariane IRS), since these "unused" data elements may well be modified during the course of the object's execution -- by another thread which interrupted it. It is more likely that Eiffel assertions are written in terms of conditions that the developer reasonably expects might occur. Note also that documenting all "improbable" events, even if possible, would cause other problems (described later in 3.1.4).

In addition to documenting the assertion, the programmer must document it correctly. In other words, the same individual who has likely introduced a software design fault within the system may be called upon to write the protection for that fault. In this context, an incorrect assertion seems to be a possibility that cannot be discounted. (Presumably, the process could be changed to have the assertions written by someone independent of the developer, but this is not discussed in the Eiffel paper. Even in this case, common-mode errors are quite possible [8].)

Interestingly, the specific assertion proposed in the Eiffel paper as the solution to the Ariane IRS design fault is neither correct nor complete. Here is the exact proposal:

"...any software element that has such a fundamental constraint should state it explicitly, as part of a mechanism present in the programming language, as in the Eiffel construct

   convert (horizontal_bias: INTEGER): INTEGER is
      require
         horizontal_bias <= Maximum_bias
      do
         ...
      ensure
         ...
      end

where the precondition states clearly and precisely what the input must satisfy to be acceptable."

First, note that this routine converts the horizontal bias (BH) variable from an integer value to another integer value. However, the IRS error occurred when the computed BH was converted from a floating-point value to an integer [9]. Thus, the fault would have occurred before this precondition was evaluated (whether by a reviewer or during execution of the code). As a result, the precondition as worded would not have detected the design fault.
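For the precondition even to be evaluated before the dangerous conversion, the routine would presumably have to accept the 64-bit floating-point value itself. A hypothetical revision (a sketch only; the use of the DOUBLE type is an assumption, since the actual IRS code is not public):

   convert (horizontal_bias: DOUBLE): INTEGER is
         -- Convert the computed bias to its 16-bit output form.
         -- Taking the floating-point value as the argument means the
         -- precondition is checked before any conversion is attempted.
      require
         horizontal_bias <= Maximum_bias
      do
         ...
      ensure
         ...
      end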
Assuming that horizontal_bias is changed to a floating-point value, the next question has to do with the specification of Maximum_bias. A likely candidate for this value is 32,767.0, since this is the maximum positive value that can be stored in a 16-bit signed integer. However, if the "convert" routine scales the value (i.e., uses the integer value 32,767 to represent the largest expected value for horizontal_bias), then the actual Maximum_bias would need to be computed as 32,767.0 multiplied by an appropriate scaling factor. Additional confusion could occur if Maximum_bias is derived from the system specification, and is not written in terms of the actual scaling being performed. If the scaling is in error (does not cover all possible values), then Maximum_bias may be too broad and thus miss the potential conversion error.

Assuming that both horizontal_bias and Maximum_bias are specified correctly, there is still the question of completeness. As noted, Maximum_bias would be written in terms of the largest desired positive value. What about negative values? If the conversion of horizontal_bias would yield a 16-bit integer smaller than -32,768 (on some architectures, -32,767), Ariane 5 would have experienced the same failure as if horizontal_bias were too big. However, the proposed precondition would permit any negative value to be treated as valid. Therefore, it is not complete with respect to conversion failures. Note that, given the use of horizontal_bias as an error term [10], it should always be greater than or equal to zero. If the precondition were written just in terms of avoiding values less than -32,768, it would still permit invalid horizontal biases to be converted. However, it is unclear whether the assertion in the conversion routine should be limited to protecting against invalid conversions, or should also consider the meaning of horizontal_bias within the system.

In summary, the Eiffel paper does not provide compelling evidence to support its claim that this specific assertion would have been included by the Ariane team. It is easy to identify the need for such an assertion after the fact, but circumstances specific to the Ariane development effort imply that the inclusion of the proper assertion was far from certain. Furthermore, as can be seen from the paper's own example, there is also the possibility that the assertion would have been improperly specified (although there is no way to quantify this possibility).

3.1.2 Would the review team have had the expertise to detect the assertion violation?

Even if the assertion were properly defined, it would not have provided any benefit if the right personnel did not review it for correctness. The Eiffel paper speaks of a "QA team" performing this function. However, it is the author's experience that QA teams, while uniformly diligent in their efforts, do not always have the experience and knowledge base needed to settle difficult technical questions. This is due, in part, to the requirement in some methodologies that the QA team be composed of individuals who are not also responsible for development of the system [11]. This independence is useful, but it can make the QA team more effective at catching process failures than technical faults.

More critically, it is unclear that any team within the subcontractor responsible for the IRS development had the necessary expertise to identify the fault that lay within their system. This fault was not an internal contradiction within the IRS design.
It was a fault in the operation of the IRS within the larger Ariane 5 system [12]. The total system was the responsibility of a different company [13]. As a result, the IRS vendor might not have had the total expertise base available to properly analyze such an assertion. The inquiry suggests that the relationship between the companies was not as efficient as it could have been [14], and also suggests that experts from outside the project should have been brought in to provide adequate analysis capabilities [15].

Overall, despite the Dilbert-esque assertion that all that was needed was a review team "worth its salt," it is far from clear that circumstances specific to the Ariane 5 program were conducive to having the right people on the right team available to review the IRS software design for the Ariane 5. Thus, the evidence presented in the paper is insufficient to show that a proper team would have been available.

3.1.3 Would the review team have read the code?

The Eiffel paper asserts:

"The requirement that the horizontal bias should fit on 16 bits was in fact stated in an obscure part of a document. But in the code itself it was nowhere to be found!"

This statement implies that the inquiry found documenting such assertions in the code preferable to documenting them elsewhere. Although it is true that the decision was not specified directly in the code, the inquiry's statement is ambiguous as to whether it was actually clearly stated in the documentation, either [16]. Furthermore, the inquiry's statement appears to address documentation of the justification for not providing a handler for an overflow, rather than a "requirement that the horizontal bias should fit on 16 bits." Nonetheless, the inquiry does make it clear in its recommendations that documentation should be included in such reviews, and no mention is made of the source code being the preferred location for such data [17].

Is there any reason to believe that the code is not the preferred location for this information? It is the author's experience that non-programmers, particularly senior systems architects of the type described in section 3.1.2 above, often prefer to read English documentation over source code. (For some reason, test engineers do not always share this bias.) Of course, this preference can be mitigated somewhat by the ability of the source language to be easily read by non-programmers, but nothing in the Eiffel paper addresses this issue.

Overall, the Eiffel paper assumes that source code is the best place to document such assertions, but does not provide justification for this in the context of the Ariane 5 review team described in the previous section. On the other hand, there is at least anecdotal evidence that the code may not be the preferred place to document such assertions. While there may be justifications for including assertions in source code in general (better for programmers to review, less chance of documentation "drift" from the source), these advantages may not have been as useful for the specific issue under discussion.

3.1.4 Would the review team have fully reviewed the critical assertion?

Assuming that the right team is looking at the code, and that the code contains the right assertion, it is still possible that the team would not have adequately reviewed the critical assertion.
Putting aside the team's likely bias toward examining the reaction of the software to hardware failures rather than software design faults [18], there is also the issue of the inherent difficulty of performing manual reviews of complex systems, as the inquiry acknowledges [19]. There are at least two types of complexity involved here:

a. The number of assertions to be reviewed. When there are "thousands of problems and potential failures" [19] to be considered, it is very easy to miss a critical issue. Ironically, adding assertions to cover "improbable" cases, in order to answer the criticism described in 3.1.1, makes this problem much worse.

b. The complexity of analyzing individual assertions. The specific assertion required in the Ariane case is a good example of such complexity. The goal of the reviewer is to ensure that the horizontal bias will not exceed the maximum bias. However, the actual range of the horizontal bias for the Ariane 5 environment is not necessarily directly specified in the requirements. It is a calculated value, whose actual range would have to be determined in part from the flight profile [12] and possibly other factors. Although the inquiry does not state the complexity of this calculation, even a fairly simple calculation may be error-prone when done manually. Furthermore, the necessary calculation is not stated in the assertion as presented in the paper; the reviewer would have to trace back through all calculations involving the horizontal bias (and its ancestor values) to ensure that the appropriate actual range was derived.

Given that non-trivial work may be required for each assertion, and that there are many assertions to be reviewed, the chance of a reviewer error or oversight cannot be discounted. No empirical evidence is provided in the Eiffel paper as to the accuracy of reviews for a system of this size and complexity. Therefore, the case is not made that a manual review would have detected the fault.

3.1.5 Would the review team have had the necessary information to detect the error?

Finally, as noted in section 3.1.4 above, the actual range of the horizontal bias would have to be calculated based in part on the Ariane 5 flight profile. However, this profile was deliberately left out of the IRS specification [20]. Therefore, the data necessary to evaluate this assertion was not available to the IRS developer.

More to the point, the deliberate omission of this data reflects the (incorrect) assumption that the Ariane 4 and Ariane 5 profiles were not significantly different. Would a reviewer, having to analyze a large set of assertions, have spent much time questioning a mutually-accepted decision of this type? There is nothing in the Eiffel paper which suggests that he or she would in fact have pressed the issue. As a result, it is reasonable to assume that this assertion would not have been a focus area of the review, although again in hindsight it was obviously a "fundamental constraint." Therefore, the incorrect assumption would likely not have been challenged, and the Ariane disaster would still have occurred.

This effect might be mitigated somewhat if the reviewer were from the prime contractor and had direct access to the Ariane 5 flight profile, as described in 3.1.2. However, keep in mind that the prime contractor participated in the original decision to leave out the flight profile, and thus may already have been biased against considering it as a source of error. In fact, it is possible that the same senior engineer or manager who left out the relevant data is the one performing the review of the IRS code!
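The difficulty described in the last two subsections can be made concrete with a single declaration. A hedged sketch (the value shown assumes no scaling; the name is taken from the Eiffel paper's own example):

   Maximum_bias: DOUBLE is 32767.0
         -- Hypothetical constant: the largest BH value that fits the
         -- 16-bit output format. If "convert" scales its input, the
         -- correct constant is 32767.0 times the scale factor; and
         -- confirming that flight values of BH stay below it requires
         -- the Ariane 5 trajectory data, which was deliberately
         -- excluded from the IRS specification [20].

The declaration itself is trivial to review; everything that determines whether it is correct lies outside the code, in the scaling design and in trajectory data the reviewer may never have seen.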
3.1.6 Adverse effects of documenting assertions

There is a line of reasoning that says, "Even if DBC/assertions would have been of minimal use, why not include them anyway, just in case they do some good?" Such statements assume there are no adverse impacts to including assertions in code for documentation purposes. However, there are always adverse effects to including additional code:

a. As with any other project, there are management pressures to meet cost and schedule drivers. Additional effort, therefore, is discouraged unless justified.

b. More importantly, all projects have finite time and other resources available. Time spent on writing and analyzing assertions is time not spent elsewhere. If the work that is not performed as a result would have been more valuable (in terms of its effect on the safety and integrity of the system) than the time spent on assertions, then the resulting system may be less safe.

There is a growing consensus in the safety-critical software field that simpler software tends to be safer software [21]. With this philosophy, additional code such as assertions should only be added where there is a clear benefit to the overall safety of the particular system being developed.

3.2 The testing effect

The Eiffel paper states:

"Assertions (preconditions and post-conditions in particular) can be automatically turned on during testing, through a simple compiler option. The error might have been caught then."

However, it is unlikely that, in the context of the Ariane 5 program, the use of assertions during testing would have provided much benefit. Even the Eiffel paper itself tends to be less enthusiastic about the value of testing [22] than about the manual reviews described earlier. (Note that this is not to say that, with the appropriate changes in the Ariane process, a reasonable test could not have been conducted; see section 4.)

3.2.1 Ariane 5 testing limitations

There were two key problems in the Ariane 5 test program:

a. As noted earlier, the Ariane 5 flight profile was not included in the IRS system specification. Therefore, the IRS contractor did not have the proper data with which to perform a test. Testing with the Ariane 4 flight data would not have violated the assertion [23], and thus the error would not have been detected.

b. Integrated system testing at the prime contractor, using the proper test environment and data, would have been able to detect the error. However, the prime contractor decided against performing these tests, preferring instead to assume that its experience with this IRS on Ariane 4, together with the vendor's qualification tests, was sufficient to qualify the IRS in the total system.

Thus, without other changes in the Ariane test philosophy (which are not called for in the Eiffel paper), the assertions would not have been exercised under the proper conditions and thus would not have detected the fault.

3.2.2 Adverse effects of testing with assertions

Assume for a moment that the proper testing environment and data had been available. Putting aside for the moment the question of whether assertions would have been necessary to detect the fault (see section 4), are there any disadvantages to using assertions during testing, then disabling them for the operational system? In the author's experience, there are some concerns about using this approach for safety-critical systems:

a. The addition of code at the object level obviously affects the time it takes for an object to complete execution.
Particularly for real-time systems (such as the Ariane IRS), differences in timing between the system being tested and the operational system may cause timing faults, such as race conditions or deadlock, to go undetected. Such timing faults are serious failures in real-time systems, and a test which is hindered from detecting them loses some of its effectiveness.

b. In addition, the differences in object code between the tested and operational systems raise the issue of errors in the object code of the operational system. Such errors are most likely to occur due to an error in the compilation environment, although it is possible that other factors, such as human error (e.g. specifying the wrong version of a file when the code is recompiled), can be involved. For example, the author has documented cases where Ada compilers generate the correct code when exceptions are not suppressed, but generate incorrect code (beyond the language's definition of "erroneous") when they are suppressed. This is not entirely unexpected; given the number of user-selectable options present in most compilation environments, it is difficult if not impossible to perform adequate testing over all combinations of options. Nothing in the Eiffel paper indicates that Eiffel compilers are any better (or worse) than other compilers in this area. Although this is a fault of the implementation, not the language or methodology, it is nonetheless a practical limitation for safety-critical systems, where one object code error can have devastating results.

One possible approach to resolving this issue is to test the system completely twice: once with assertions on, and again with assertions suppressed. However, the adverse factors described in section 3.1.6 then come into play: by adding to the test time in this manner, other useful test techniques (which might have greater value) are not performed. Generally, it is difficult to completely test such systems once, never mind twice! This effect is worse for safety-critical systems that perform object-code branch coverage testing, since this testing is completely invalidated when the object code changes [25].

Overall, there are significant limitations to using this technique for safety-critical systems, and in particular real-time systems such as the Ariane 5 IRS.

3.3 The execution effect

The final option is not to disable the assertions after testing, but to leave them on during execution. The Eiffel paper describes this option as follows:

"Assertions can remain turned on during execution, triggering an exception if violated. Given the performance constraints on such a mission, however, this would probably not have been the case."

There are three basic objections to this approach in safety-critical systems, and in particular real-time systems:

a. As Meyer notes, throughput is an issue. This was particularly true of the Ariane IRS [26].

b. As described in section 3.1.6, an argument can be made that adding code adds the risk of introducing the very design errors one is attempting to protect against.

c. Most importantly, nothing in the Eiffel paper discusses the very difficult issue of the proper way to react if an assertion is violated. Although the Eiffel paper implies that no handlers were present within the system [27], the inquiry makes it clear that an exception handler was properly invoked per the system specification, but that the system specification was based on an incorrect view of the nature of the fault, and so the handler took inappropriate actions [28].
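To make the difficulty concrete, consider a sketch of the behavior the specification appears to have mandated (the feature names are hypothetical; per the inquiry [28], on any exception the failure was to be reported on the databus, the context stored in EEPROM, and the processor shut down):

   calculate_alignment is
         -- Hypothetical sketch of the specified response to any
         -- exception raised during the alignment computation.
      do
         ...
      rescue
         report_failure_on_databus
         store_context_in_eeprom
         shut_down_processor
            -- A rescue clause that does not retry causes the routine
            -- to fail; here, the specified "handling" is itself the
            -- shutdown that doomed the mission.
      end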
It is the author's experience that, particularly for real-time systems that can neither turn to a human operator nor shut down in the face of faults, writing the proper actions into such an exception handler can be very difficult. The inquiry makes recommendations about how to handle such a fault [29], although its solution could have obscured the random faults with which the original IRS design was overly concerned. It should also be noted that turning to a human operator is not always a safe option, as in the recent case of the corrupted Internet address database [30]. Thus, the Eiffel paper is correct in dismissing this approach as a valid option for the Ariane 5 case.

4. Was DBC/Eiffel necessary to avoid the failure?

Assuming that the reader concurs that a compelling case for DBC/Eiffel is not made in the context of the Ariane 5 incident, should the conclusion be drawn that nothing could have been done? Not at all! For example, the inquiry makes it clear that, even with no reviews and with the code exactly as implemented, it was quite reasonable to perform an integrated system test that would have clearly exposed the problem [24]. It is the author's experience, as confirmed by the inquiry, that such integrated system testing should be a mandatory part of any safety-critical qualification. Although the Eiffel paper is quite correct that testing is not a panacea [22], in this particular case there is high confidence that the problem would have been caught:

a. The fault occurred in the normal processing mode, within the normal flight profile, without any previous unusual events (e.g. other failures) preceding it. Thus, it is very likely that this scenario would have been tested.

b. The response of the system to the test case (complete shutdown) is sufficiently obvious to be observed by test personnel.

In addition to the testing effect, the documentation effect could have been achieved without the introduction of DBC/Eiffel. As described earlier, the inquiry recommends the use of "justification documents" to capture the necessary review information [17]. Also, the Ada-based Design Approach for Real-Time Systems (ADARTS) is an example of a competing methodology that also incorporates the specification of assertions (and, more generally, assumptions), although comments are used rather than executable code [31]. As to language, the Eiffel paper itself notes that the IRS implementation language, Ada, would have been adequate for handling the Ariane case [32].

Thus, it can be concluded that, for this particular case, DBC/Eiffel was not necessary. This does not mean that DBC/Eiffel is unnecessary for the detection of any software fault, of course; just the fault described in the Eiffel paper.

5. Other issues

There are other aspects of the Eiffel paper that, although not related directly to the core argument, detract from its effectiveness:

a. Lack of peer review. The Eiffel paper would have been much stronger if it could have cited evidence that the inquiry board, the Ariane developers, or others in the general safety-critical community had read the paper and agreed with its findings. It should be noted that, since the publication of the paper, comments the author has read from persons identified as familiar with the IRS domain have not been positive [33].

b. Lack of empirical data. There does not appear to be any evidence that DBC/Eiffel has been successfully used in a system like the Ariane IRS, much less that it produced significantly improved safety or reliability.
By itself, this is not particularly damaging: all new ideas have to have a first user. However, the approach used in the development of Ada, where the language team funded the development of prototypes in relevant domains to test its effectiveness, should be considered by the Eiffel team. More to the point, no relevant evidence from any other domain is presented in the Eiffel paper. There should be some results from controlled studies showing the advantage of DBC/Eiffel over the methodology and tools used for the Ariane project, even if not related to the Ariane domain. In particular, there should at a minimum be some data with respect to the reliability of the Eiffel compilation system. No data of this type is used to support the Eiffel paper's assertions.

c. Appeals to authority. Quoting DeMarco (particularly when his assessment that "almost no other defense would have been likely to catch" the problem is contradicted by the inquiry) gives the Eiffel paper a "sales" feel, rather than that of a scholarly work. The quoting of one of the Eiffel paper's authors (Meyer) in support of his own paper seems particularly odd. Again, quotes from a relevant source would have been more powerful, in the author's opinion.

6. Non-issues

Elements of this criticism have been aired in various Internet forums, and at times objections have been raised that cannot be dealt with in the context of the previous sections. These objections are addressed here.

a. "Casuistry." It would not be appropriate to use the criticisms presented here to claim, in the general case, that assertions have no value anywhere ("casuistry"), but this critique does not attempt to do so. It focuses on the specific claim made in the Eiffel paper in the context of the Ariane IRS fault.

b. Utility of DBC/Eiffel in other domains. Several Eiffel advocates will attest that they like the use of Eiffel for their domain. Eiffel may be very useful in some domains; however, Ariane is a real-time, embedded, safety-critical system and has unique properties (as described above). Again, this is a specific, not a general, criticism of DBC/Eiffel.

c. Ada vs. Eiffel. Some Eiffel advocates have attempted to cast these arguments as a "language war." They note that many of these criticisms apply equally well to Ada as to Eiffel. They are correct. The author has, in fact, publicly criticized the use of Ada exceptions on the same grounds as Eiffel assertions, within the same context (real-time safety-critical systems). However, the statement "Ada has these problems, too" does not in any way contradict the arguments against the use of Eiffel assertions in the Ariane context. If nothing else, such statements indicate that the criticisms are indeed valid for Eiffel.

7. Conclusions

In the specific case of the Ariane IRS design fault, there is no clear and compelling evidence that DBC/Eiffel assertions would have been likely to uncover the fault prior to operational use, whether through their documentation, test, or execution value. Furthermore, alternative means were available to the Ariane team to isolate this particular fault, even without the use of DBC/Eiffel. Therefore, although there may be a compelling claim for the use of DBC/Eiffel in real-time safety-critical systems, the Ariane case (and the Eiffel paper describing it) does not support such a claim.
ABOUT THE AUTHOR

Ken Garlington has been developing software for safety-critical real-time flight control systems since 1984, including the F-16, F-16 Ada flight controls ground demonstrator, F-111, A-12, YF-22A, F-22, and Joint Strike Fighter systems. More than one of these projects has included the integration of an inertial reference system with the flight controls system, as was done for Ariane. The opinions in this paper are those of the author, and should not be construed as an official position of his employer, Lockheed Martin.

FOOTNOTES

[1] The public version of the Ariane 5 inquiry report is located on the World Wide Web at http://www.esrin.esa.it/htdocs/tidc/Press/Press96/ariane5rep.html . Subsequent footnotes reference this report as "(Inquiry)".

[2] The version of the paper being critiqued is located on the World Wide Web at http://www.eiffel.com/doc/manuals/technology/contract/ariane/index.html . Subsequent footnotes reference this paper as "(Eiffel)".

[3]

[4] (Eiffel) See the two paragraphs that begin "From CORBA to C++...", particularly the reference to "Eiffel-like assertions."

[5]

[6] (Inquiry) "The reason for the three remaining variables, including the one denoting horizontal bias, being unprotected was that further reasoning indicated that they were either physically limited or that there was a large margin of safety, a reasoning which in the case of the variable BH turned out to be faulty. It is important to note that the decision to protect certain variables but not others was taken jointly by project partners at several contractual levels."

[7]

[8] See, for example, the work of Leveson and Knight on common-mode errors in N-version programs.

[9] (Inquiry) "The internal SRI software exception was caused during execution of a data conversion from 64-bit floating point to 16-bit signed integer value. The floating point number which was converted had a value greater than what could be represented by a 16-bit signed integer." The inquiry report does not provide further explanation as to the reason for this conversion. However, such a conversion is commonly used to convert internally calculated values to a format suitable for output on a 16-bit data bus. Given the use of the horizontal bias as a measurement of the error in the horizontal measurement, it would seem to be a likely value to output to other systems.

[10] (Inquiry) "The Operand Error occurred due to an unexpected high value of an internal alignment function result called BH, Horizontal Bias, related to the horizontal velocity sensed by the platform. This value is calculated as an indicator for alignment precision over time."

[11] See, for example, the US Department of Defense software standards DoD-STD-2167 and DoD-STD-2168, which were used in the same time-frame as the original Ariane software development.

[12] (Inquiry) "Ariane 5 has a high initial acceleration and a trajectory which leads to a build-up of horizontal velocity which is five times more rapid than for Ariane 4. The higher horizontal velocity of Ariane 5 generated, within the 40-second time-frame, the excessive value [BH] which caused the inertial system computers to cease operation." It is important to note from this statement that BH is computed from the measured horizontal velocities; it is not itself a measured value.

[13] The inquiry refers to the prime contractor as the "Industrial Architecture", and does not refer to any of the parties by name.
However, Aviation Week and Space Technology ran articles contemporaneously with the incident, and (if memory serves) identified the IRS supplier as Sextant and the prime contractor as Arianespace, along with a description of their respective roles.

[14] (Inquiry) "A more transparent organisation of the cooperation among the partners in the Ariane 5 programme must be considered. Close engineering cooperation, with clear cut authority and responsibility, is needed to achieve system coherence, with simple and clear interfaces between partners."

[15] (Inquiry) "the overriding means of preventing failures are the reviews which are an integral part of the design and qualification process, and which are carried out at all levels and involve all major partners in the project (as well as external experts).... Solutions to potential problems in the on-board computer software, paying particular attention to on-board computer switch over, shall be proposed by the project team and reviewed by a group of external experts."

[16] (Inquiry) "...three of the variables were left unprotected. No reference to justification of this decision was found directly in the source code. Given the large amount of documentation associated with any industrial application, the assumption, although agreed, was essentially obscured, though not deliberately, from any external review."

[17] (Inquiry) "Give the justification documents the same attention as code. Improve the technique for keeping code and its justifications consistent.... The Board has also noted that the systems specification [i.e. the Software Requirements Specification document] of the SRI does not indicate operational restrictions that emerge from the chosen implementation. Such a declaration of limitation, which should be mandatory for every mission-critical device, would have served to identify any non-compliance with the trajectory of Ariane 5.... Include external (to the project) participants when reviewing specifications, code and justification documents."

[18] (Inquiry) "An underlying theme in the development of Ariane 5 is the bias towards the mitigation of random failure. The supplier of the SRI was only following the specification given to it, which stipulated that in the event of any detected exception the processor was to be stopped. The exception which occurred was not due to random failure but a design error. The exception was detected, but inappropriately handled because the view had been taken that software should be considered correct until it is shown to be at fault. The Board has reason to believe that this view is also accepted in other areas of Ariane 5 software design."

[19] (Inquiry) "In a programme of this size, literally thousands of problems and potential failures are successfully handled in the review process and it is obviously not easy to detect software design errors of the type which were the primary technical cause of the 501 failure."

[20] (Inquiry) "There is no evidence that any trajectory data were used to analyse the behaviour of the unprotected variables, and it is even more important to note that it was jointly agreed not to include the Ariane 5 trajectory data in the SRI requirements and specification."

[21] See, for example, Leveson's excellent book "Safeware". The US Air Force software safety handbook AFISC SSH 1-1 has a similar philosophy. Various software reliability models (e.g. Musa's) also relate the probability of residual errors to the size of the software.

[22] (Eiffel) "But if one can test more one cannot test all.
Testing, we all know, can show the presence of errors, not their absence."

[23] (Inquiry) "In Ariane 4 flights using the same type of inertial reference system there has been no such failure because the trajectory during the first 40 seconds of flight is such that the particular variable related to horizontal velocity cannot reach, with an adequate operational margin, a value beyond the limit present in the software."

[24] (Inquiry) See the several paragraphs that begin: "In order to understand the explanations given for the decision not to have the SRIs in the closed-loop simulation..." It is also interesting to note that "Post-flight simulations have been carried out on a computer with software of the inertial reference system and with a simulated environment, including the actual trajectory data from the Ariane 501 flight. These simulations have faithfully reproduced the chain of events leading to the failure of the inertial reference systems."

[25] See, for example, the US Federal Aviation Administration's software standard DO-178B, or AFISC SSH 1-1. Beizer provides justification for the utility of structural testing.

[26] (Inquiry) "It has been stated to the Board that not all the conversions were protected because a maximum workload target of 80% had been set for the SRI computer."

[27] (Eiffel) "...in the Ariane 5 flight it caused an exception, which was not caught... There was no explicit exception handler to catch the exception, so it followed the usual fate of uncaught exceptions and crashed the entire software, hence the on-board computers, hence the mission."

[28] (Inquiry) "The specification of the exception-handling mechanism also contributed to the failure. In the event of any kind of exception, the system specification stated that: the failure should be indicated on the databus, the failure context should be stored in an EEPROM memory (which was recovered and read out for Ariane 501), and finally, the SRI processor should be shut down.... From this point of view exception - or error - handling mechanisms are designed for a random hardware failure which can quite rationally be handled by a backup system.... An underlying theme in the development of Ariane 5 is the bias towards the mitigation of random failure. The supplier of the SRI was only following the specification given to it, which stipulated that in the event of any detected exception the processor was to be stopped. The exception which occurred was not due to random failure but a design error. The exception was detected, but inappropriately handled because the view had been taken that software should be considered correct until it is shown to be at fault."

[29] (Inquiry) "For example the computers within the SRIs could have continued to provide their best estimates of the required attitude information." The inquiry does not state what algorithm should be used to do this.

[30] This story was covered in several newspapers as well as in the RISKS journal (either July 18 or July 19, 1997). A program which failed to properly convert an Ingres database generated alarms to the operator, who for some reason failed to see them. The resulting corrupted database caused widespread Internet disruptions for several hours.

[31] See the Software Productivity Consortium web site at http://www.software.org for more information on ADARTS.

[32] (Eiffel) "Although one may criticize the Ada exception mechanism, it could have been used here to catch the exception."

[33]