From: Ken Garlington
Subject: Re: Trust but verify (was Re: Papers on the Ariane-5 crash and Design by Contract)
Date: 1997/03/28
Message-ID: <333C4396.795@lmtas.lmco.com>
References: <332B5495.167EB0E7@eiffel.com> <5giu3p$beb$1@news.irisa.fr> <332ED8AB.21E7@lmtas.lmco.com> <5go8nk$ngf$1@news.irisa.fr> <3332C049.2A0F@lmtas.lmco.com> <5gue93$2md$3@news.irisa.fr> <33383A6F.3A5A@lmtas.lmco.com> <5ha171$dck@flood.weeg.uiowa.edu> <3338AA0A.7A79CB24@eiffel.com> <333AB895.1459@lmtas.lmco.com> <333B1DF9.18FE@hello.nl>
Organization: Lockheed Martin Tactical Aircraft Systems
Newsgroups: comp.lang.eiffel,comp.object,comp.software-eng,comp.lang.ada

Karel Thönissen wrote:
>
> Ken Garlington wrote:
>
> > 1. "Assertions (preconditions and postconditions in particular) can be automatically turned on during testing, through a simple compiler option. The error might have been caught then."
> >
> > Catching this error via automated testing, in the case of Ariane 5, depends upon two key aspects:
> > a. That the proper data is available to trigger the exception during test. This was not available in the case of the Ariane 5 (see the final report).
> > b. That the test would have been executed in the proper manner. The final report states quite clearly that proper testing was not performed.
> > Thus, *in the context of the Ariane-5 development*, this argument fails.
>
> I agree.
>
> > 2. "Assertions can remain turned on during execution, triggering an exception if violated. Given the performance constraints on such a mission, however, this would probably not have been the case."
> >
> > There are many counter-arguments to this point. However, since you acknowledge explicitly that it is inapplicable to this case, it fails inherently.
>
> I agree.
>
> > 3. "But most importantly the assertions are a prime component of the software and its documentation ("short form", produced automatically by tools). In an environment such as that of Ariane where there is so much emphasis on quality control and thorough validation of everything, they would be the QA team's primary focus of attention. Any team worth its salt would have checked systematically that every call satisfies the precondition. That would have immediately revealed that the Ariane 5 calling software did not meet the expectation of the Ariane 4 routines that it called."
> >
> > a. This implies that the QA team would have had proper information to do such checking. As noted above, the final report clearly states that they would not have such information. The most thorough manual validation will fail in the face of incomplete data. This alone invalidates this claim.
>
> Yes, one needs the data.
> But if the integration team had had the documentation generated from the DBC-like assertions, it would have been clear to them that the correctness/applicability of the software depends on the maximum horizontal speed of Ariane 5 (or of the actual flight path, to be more specific). Isn't it likely, then, that they would have asked for it? Would higher levels of management have refused to provide these data, if the integration team could show the importance of these data?

Here is the suggested assertion from the web page:

   convert (horizontal_bias: INTEGER): INTEGER is
      require
         horizontal_bias <= Maximum_bias

There are two issues with manual verification of this statement from the flight path data:

1. Based on my experience, the transformation of flight path information to the horizontal bias, in the form used at this point in the program, may be a complex process involving several non-trivial equations, scalings, etc. It is not clear that the inspection team would have had sufficient resources, knowledge, etc. to do this computation to generate the parameter's maximum real value at this exact point.

2. If the integration team had asked for data, they would have been given data. The data from the specification. The data for the Ariane *IV*. The integration team would have had to ask a much more important question: "Is the data that we have available sufficient for the new environment we are using?" That decision had already been made (incorrectly) by the development team: YES. As noted in the final report, this was a deliberate decision, based on the faulty premise that the IRS would work equally well in either environment.

So, if someone had known that the environments were sufficiently different to cause a problem, then they might have had a chance. However, reading the assertion above is insufficient to know that this is the case, particularly when you are working to a specification that does not even acknowledge multiple environments!

> So there is no certainty here that assertions would prevent the disaster, but it would make it less likely, as was claimed by several advocates of the use of DBC.

Again, I think that this claim is based on an unstated assumption: that the *right* data is always available to the development team. See how easy it is to assume the "obvious"? :)

> > b. It also implies that the QA team would have done the check. Putting aside the argument as to whether the Ariane 5 team was "worth its salt", the report makes it clear that there was NOT an emphasis on "thorough validation of everything." Therefore, the assumption made in this point is in error, further invalidating the claim.
>
> What is your point? I feel you are exercising casuistry here. BTW, DBC is not only about writing assertions in the code; it also involves using the analytical results provided by the system. For this reason the support by the language can be so important! Meyer's claim that the crash was avoidable was about the use of DBC ('language and method') and not about the sole use of assertions in the code. So your argument cannot fairly be used as an argument against Meyer's exposition.

My point is that the "integration team" you described earlier did not exist on this program, at least in the form required to take the actions you suggest.

I am completely confused: the three statements in the paper used in support of Meyer's claim are all in terms of assertions in the code. These are the statements I am refuting.

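For concreteness, here is roughly what that precondition amounts to when written by hand in Ada, the language the SRI was coded in. The sketch below is purely illustrative -- the package, names, types and limit are invented for this post, not taken from the actual SRI source:

   --  A purely illustrative sketch: the package, names and types are
   --  invented for this post, not the actual SRI source.
   package Alignment is

      type Bias_Word is range -32_768 .. 32_767;   --  16-bit signed target

      Maximum_Bias : constant := 32_767.0;

      Precondition_Error : exception;

      --  The hand-written analogue of "require horizontal_bias <= Maximum_bias".
      function Convert (Horizontal_Bias : Long_Float) return Bias_Word;

   end Alignment;

   package body Alignment is

      function Convert (Horizontal_Bias : Long_Float) return Bias_Word is
      begin
         --  Explicit precondition check before the narrowing conversion.
         if Horizontal_Bias < Long_Float (Bias_Word'First)
           or else Horizontal_Bias > Maximum_Bias
         then
            raise Precondition_Error;
         end if;
         return Bias_Word (Horizontal_Bias);
      end Convert;

   end Alignment;

Writing the guard is the easy part. The hard part is issue 1 above: demonstrating from the flight path data what the largest value of horizontal_bias can actually be at this exact point in the computation, and therefore whether the guard can ever fire.
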
Quoting further from the paper:

   "For reuse to be effective, Design by Contract is a requirement. Without a precise specification attached to each reusable component -- precondition, postcondition, invariant -- no one can trust a supposedly reusable component."

I assume from this quote that the critical part of DBC is the specification of preconditions, postconditions, and invariants -- which I assume all fall under the umbrella of assertions -- in a meaningful manner. I certainly do not assume they are done randomly, but intelligently. However, this does not address the issues I raise. If you are saying that DBC would have caused the integration team to look beyond the specification and challenge management's decisions in this area, I would appreciate a citation to support this claim. I cannot find any justification for the statement in the paper provided.

> > c. There is, of course, the issue of scalability. Can a QA team check that every call satisfies every assertion manually (remember that automated testing is described in a separate argument), for a system of this size? No evidence is given to support this claim. This further weakens an already dead claim.
>
> Firstly, your argument depends on claims already refuted.

Even if the integration team had the appropriate data (claim a), and was tasked to examine the module specifications with this data (claim b), it seems to me that an independent claim can be made that they would not have had the resources to manually check all assertions in a complete manner for a program of moderate complexity (claim c). At least, to me, this claim is an independent challenge. Perhaps you can explain the dependencies further.

> Secondly, verification of the code is not a two-valued process. Assertions can sometimes prove correctness, sometimes incorrectness, and sometimes neither. Even if we cannot prove the absence of contract breaking, we may be able to show the presence of it. In either case, the use of DBC was useful. And in the case of assumptions that regard the operating environment of the software: it is not the calls that need to be checked, but the input to the system.

This is interesting, but I don't see the relevance to claim c. My point is that, using DBC, there would have been more than one assertion in the system. In fact, there may have been quite a few. As the number of assertions grows, analyzing them (through any n-valued process you choose) requires more and more resources. Mr. Jézéquel has acknowledged in a separate post that the scalability of DBC is not proven, although his experience in telco applications suggests it *may* scale.

> > There is also a general argument for all of these points. Given that Design by Contract/Eiffel makes it *possible* to specify the assertion, would the development team have, in fact, specified the assertion? The final report clearly indicates that the team felt the assertion could not be violated in practice, and that this analysis was agreed to at several management levels. No evidence is given that, in this context, the assertion would have been written by *this* team. I suspect most development teams do not introduce assertions they consider impossible to trigger! As the paper itself notes: "If you have proved that a condition cannot happen, you are entitled not to check for it. If every program checked for all possible and impossible events, no useful instruction would ever get executed!"
> > By extension, no useful program would ever get *coded*, even if the extraneous assertions would be suppressed at run-time, if some decisions were not made about what assertions to include.
>
> It seems you do not properly understand the use of assertions and DBC. Apart from their role during testing and execution, DBC and assertions are part of the documentation. It is not necessary to execute them during testing (I would rather do it, though) or during execution. So there does not have to be a performance penalty (again, this switching capability is a good argument for having these checks as part of the language).

This point is not related to execution. As with the analysis of assertions, the writing of assertions takes resources (schedule and cost). These "people" resources are finite, and serve as a limit on the number of assertions. Therefore, I would expect that assertions are not written in the code for what the coder considers wildly improbable. If the assertion is never written down, then it cannot serve as documentation, never mind testing or execution.

> Even if one deems something impossible, it is still an assumption and should therefore be documented with an assertion. What you are saying is something like: 'this is so obvious that there is no need to document it'. Firstly, in mission critical systems things are never obvious and, secondly, leaving out the obvious is not DBC!!!

Using this criterion, I would have to write a large number of assertions to handle cases such as:

   * compiler errors
   * CPU microcode software failures
   * hardware failures
   * physics failures (moving from one location to another at faster than the speed of light, for example)
   * etc.

The reality is: there are always assumptions so basic that they are never coded as assertions in any methodology. At some point, you have to say, "This is so brutally obvious that writing it down would waste the analyst's time." And, with that decision, the door is opened to not writing something that isn't so obvious.

As noted in the final report: "The reason for the three remaining variables, including the one denoting horizontal bias, being unprotected was that further reasoning indicated that they were either physically limited or that there was a large margin of safety, a reasoning which in the case of the variable BH turned out to be faulty. It is important to note that the decision to protect certain variables but not others was taken jointly by project partners at several contractual levels."

Note especially the phrase "physically limited". In the Ada sense, it would be much like writing:

   if Integer'First not in Integer'Range then
      raise Impossible_Error;
   end if;

Generally, this is considered a tautology, and would not be written. My reading of the final report is that the development team considered the BH variable range to be in somewhat the same category (although probably not the *same*).

> Then your claim that the assumption is not documented is not supported by the Lions report; to the contrary, it was 'agreed', but obscured by the 'large amount of documentation'. Therefore, it is absurd to suppose that the programming team would not have made use of assertions in an environment of DBC supported with the proper tools or languages.

In fact, the programming team did *not* use assertions for this variable, although they had a language which the paper (at one point) asserts was sufficient to write the assertion.

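To make the report's "protected"/"unprotected" distinction concrete in Ada terms: the sketch below is illustrative only, reusing the invented Alignment package from the earlier sketch rather than the actual SRI code, but it shows the kind of per-variable decision that was being made.

   with Alignment;   --  the hypothetical package sketched earlier

   --  Illustrative only; not the actual SRI code.
   function Protected_Bias (Horizontal_Bias : Long_Float) return Alignment.Bias_Word is
      use Alignment;
   begin
      --  "Unprotected", in the report's sense, would be the bare narrowing
      --  conversion below with no check: an out-of-range value then raises
      --  a run-time exception with no local handler, which is the failure
      --  mode the report describes for BH.
      --
      --  "Protected" means the range is checked explicitly, so the
      --  out-of-range case is detected and handled deliberately:
      if Horizontal_Bias in Long_Float (Bias_Word'First) .. Long_Float (Bias_Word'Last) then
         return Bias_Word (Horizontal_Bias);
      else
         return Bias_Word'Last;   --  or raise, or flag the failure to the caller
      end if;
   end Protected_Bias;

The check itself is a few lines; the expensive part is the analysis that decides, variable by variable, whether it is needed -- which is the decision the report says was taken jointly at several contractual levels.
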
Furthermore, they did use assertions in other areas, so they clearly had a methodology which included such techniques.

I would also quibble with your reading of the final report. I believe they are saying that the decision to leave certain variables unprotected was documented, not the specific assertion in the case of BH. See the paragraph above from the final report, which is part of the context of the sentence you quoted. Note further the complete sentence from the final report: "Given the large amount of documentation associated with any industrial application, the assumption, although agreed, was essentially obscured, though not deliberately, from any external review." It does not say that the *development* team did not have access to the "assumption" (again, the assumption that it was not worth writing an assertion, IMHO), but rather that it was obscured from any *external* review. It is unclear to me whether this means an independent integration team, a safety review team, or some other body.

> Yes, you are entitled not to *check* an assertion considered impossible to trigger, but you are not entitled to leave it out! Just do not check it in the production system, but do check it during verification and during testing/simulation. Note that this is exactly what Meyer and Jezequel wrote: not _checking_. They did not suggest refraining from writing or verifying the assertion.

Again, this is not my point. See above.

> > (There are several other general arguments related to the use of assertions, of course, but advancing such arguments seems unnecessary at this point.)
>
> So do not mention this at all.

> > So much for the explicit arguments. Now, for what the paper does NOT say. You state that "...there should... be simulation, testing, inspections, independent reviews etc...." and go on to note that the paper does not *deny* this fact. This, however, is not the issue.
>
> If it is not the issue, why bother?

Because I wished to bring to Mr. Meyer's attention (it was his post to which I was responding, after all!) that he did not understand my position.

> > The issue is, does Design by Contract *require* such activities, which everyone seems to agree are in fact required? Note, again, that the statement is made that:
>
> Mr Meyer has always been very clear about this. Nowhere is the suggestion made that testing etc. can be skipped or weakened. The point of DBC is that it provides an additional technique for obtaining quality software, not a replacement for testing.

I disagree. I believe the paper strongly implies that DBC was *sufficient* for solving this problem. The following statements support that claim. It is true that Mr. Meyer has made many statements *outside* of the paper that support your statement. I would like to see some of those qualifiers *inside* the paper. I am, after all, critiquing the paper, not an individual!

> BTW, what did you not mention? It is far too simple to use this kind of default reasoning. Are you a racist just because you did not explicitly deny it? I would not think so :-).

No. However, if I had said "The world would be a better place if there were no [racial group members]," I hardly think you would require me to explicitly state that I was a racist!

> > This says that the crash would probably have been avoided if Design by Contract/Eiffel (DBC) had been used.
> > It does *not* say that the crash would have been avoided if Design by Contract/Eiffel had been used, AND IN ADDITION:
>
> I do not want to put words in other people's mouths, so _I_ will state it on my own behalf: DBC (language and method) indeed would have avoided the crash. Remember what I have written above: DBC is incomplete without the analysis part.

You're braver than the paper, which only goes so far as "probably yes." :) Do you have experience in flight controls/IRS integration?

> > a. That the proper data had been provided in the *system* specification. Although this is extensively described in the final report, the DBC paper makes no mention of it. As noted above, without such data, all arguments in support of DBC fail.
>
> DBC would have triggered the integration team to ask for that data.

See the analysis above, starting with: "a. This implies that the QA team would have had proper information to do such checking." They would ask for *data*, and they would get *data* - Ariane IV data.

> > b. That DBC would have caused the introduction of the critical assertion. As quoted above, it in fact implies the opposite: that "impossible" assertions may not in fact be introduced.
>
> ???

See the analysis above, starting with: "There is also a general argument for all of these points." COMPLETELY IGNORING EXECUTION AND MEMORY, there are finite resources for CODING assertions. Those which the coder considers "impossible" will likely not get coded.

> > c. That sufficient manual QA would have been done. The paper does say that proper manual QA should be done. However, it does not say how DBC supports this activity, either in terms of educating management to the need for this activity (a key issue in the Ariane 5 case), or in terms of proper performance of this activity. (Apparently, all that is needed is "salt" :)
> >
> > d. That sufficient automated verification would have been done. Here is the paper's statement on testing:
>
> Mixing up testing and verification, hmmmm? These may be identical in systems engineering, but they are not in software engineering.

I've always assumed that verification is "The process of evaluating the products of a given software development activity to determine correctness and consistency with respect to the products and standards provided as input to that activity," and that methods to do verification included testing, either as "demonstration, analysis, or inspection." However, you can certainly choose different terminology if you wish. I will be pleased to rephrase the point as: "d. That sufficient testing would have been done. Here is the paper's statement on testing."

> > "Is it a testing error? Not really... if one can test more one cannot test all. Testing, we all know, can show the presence of errors, not their absence. And the only fully 'realistic' test is to launch..."
> >
> > This, to me, is not a particularly strong endorsement of testing!
> >
> > So, clearly, DBC would not have been *sufficient* in this case. You emphasized this point later in your post (although, unfortunately, not in the paper):
> >
> > So, if there are people who do not consider this a "platitude" (as was obviously true for Ariane 5), and such considerations are essential to the use of DBC, why does the paper not state this explicitly?
>
> Are you joking? Default reasoning.

However, this "default reasoning", according to Mr. Meyer, is often not considered by others.
This seems to contradict my understanding of "default".

> > So, the only argument remaining is: is DBC *necessary* for the Ariane 5 case? I would say, no. If the proper flight trajectory had been provided, and the proper test had been performed, then the SRI error would have been detected before launch.
>
> Yes, that is true for flight 501. This again is casuistry. Testing would indeed have remedied it, but so would verification with DBC. It seems you rely only on testing. But verification can catch bugs that are very difficult to catch with testing.

I can absolutely agree with this statement. Unfortunately, the paper is written about the specific case of flight 501, not as a general treatise on the advantages of testing + "verification" vs. testing alone.

I am glad that we now agree on the two central statements of my response:

a. DBC, by itself, would not have avoided the Ariane V catastrophe. It is "default reasoning" that many other process errors would also have to be avoided.

b. DBC was not *necessary* in the specific case of Ariane 501. For the *particular* problem identified, a commonly-accepted test would have detected the problem on the ground and avoided the crash.

Again, feel free to state that DBC is a useful adjunct to other processes, and that DBC is a useful methodology. However, in the *specific* case of the Ariane V, which is the subject of the paper, it was not the only answer, nor was it the full answer.

Compare this conclusion with the statement:

   "Does this mean that the crash would automatically have been avoided had the mission used a language and method supporting built-in assertions and Design by Contract? Although it is always risky to draw such after-the-fact conclusions, the answer is probably yes:"

This statement could lead the casual reader to believe that DBC was both necessary and sufficient, as DBC is described in the paper, for the specific error in the Ariane V crash. Perhaps a better statement would have been:

   "Does this mean that the crash would automatically have been avoided had the mission used a language and method supporting built-in assertions and Design by Contract? By itself, possibly not. However, in the context of a well-managed system and software engineering process, DBC provides additional capabilities to detect problems of this class before they cause a serious impact."

This statement could be debated, but it isn't provably false.

> The paper by M&J was one from a software engineering point of view, and an interesting one from that perspective. And 501 is a good illustration of a software engineering pitfall that can be avoided with DBC. This is noted in the final report.

Actually, the phrase "software engineering" does not appear in the final report. The report does state:

   "The failure of the Ariane 501 was caused by the complete loss of guidance and attitude information 37 seconds after start of the main engine ignition sequence (30 seconds after lift-off). This loss of information was due to specification and design errors in the software of the inertial reference system.

   "The extensive reviews and tests carried out during the Ariane 5 Development Programme did not include adequate analysis and testing of the inertial reference system or of the complete flight control system, which could have detected the potential failure."

Based on my reading of the final report, and my knowledge of this field, I do not interpret "specification" in the sense of module-level DBC specifications, but in the sense of a much higher-level system/software specification done well before specific modules are identified. Similarly, "testing" is described in the final report as being more than just the type of testing that DBC/assertions support. In fact, of the fourteen report recommendations, only R5 seems to be related to DBC.

> > Note that this is not an issue of 20/20 hindsight; as several people familiar with such systems have noted, such a test is standard practice. One could argue that the proper use of assertions would have been able to more easily isolate and correct the problem, or that some other system fault might have been detected through the use of DBC, but both arguments are irrelevant to the subject of the paper: that DBC would have saved the Ariane 5 from the *specific* fault which destroyed it. So, if the paper had emphasized that proper system engineering and verification testing had been done, it would have further weakened its own case. It is not surprising, then, that no such statement is in the paper.
>
> ???? should one comment???

> > Unfortunately, continuing to claim that the paper answers these criticisms, without providing any details as to *how* they are answered, does not provide me with any confidence that your paper is valid. Perhaps your paper is not directed at practitioners in the flight controls or inertial reference fields. If so, then I could understand why you would ignore the arguments made by those practitioners, such as myself and Mr. White.
>
> (Speaking only for myself) I think the paper is directed at a wider audience than only practitioners in flight controls and IR fields. It will help them and others to understand the problems of software reuse. It is not the case that there is a software engineering for flight control and a different software engineering for other disciplines.

No, but it is the case that, when analyzing a *specific* application of software engineering practices, it helps to have experience in that field. This is the topic of this paper: the use of a software engineering practice in a *specific* field.

This seems to be the most difficult point in my argument to grasp: the topic of this paper is the exception raised during the first flight of the Ariane V. It is not a general treatise on software engineering. Therefore, it should be evaluated in terms of the field it attempts to explore, at least I would think!

> I appreciate your replies, because they illustrate the complexities of feedback systems, complex closed-loop simulations, fail behaviour, etc., but I do not think that they invalidate M&J's claims.

We will have to agree to disagree, then. They make three claims in support of their statement. I have addressed all three, as well as your rebuttals.

> > In any case, I have posted these arguments several times without response. If you do not respond to this post, I will assume that no response is possible.
>
> This is a silly and childish way of exchanging ideas. Maybe you are the one that needs to be convinced :-).

??? My point is that Mr. Meyer and I are not exchanging ideas. I have posted my concerns three different times, without direct response. I have sent one private e-mail, without response.
I have also been told by someone else, a developer of IRS software, that he has also sent similar concerns to Mr. Meyer without response. The only public comment Mr. Meyer wishes to make is that he is misunderstood, and that the responses to these criticisms are "obviously" in the paper. I can't really consider this an exchange.

In any case, I appreciate your specific and thoughtful comments on my criticisms. Please consider a further response.

--
LMTAS - The Fighter Enterprise - "Our Brand Means Quality"
For job listings, other info: http://www.lmtas.com or http://www.lmco.com