From: rav@goanna.cs.rmit.edu.au (++ robin)
Subject: Re: Ariane 5 - not an exception?
Date: 1996/08/01
Message-ID: <4totv7$o9f@goanna.cs.rmit.edu.au>
References: <4t9vdg$jfb@goanna.cs.rmit.edu.au> <31FE35BC.1A0D@sanders.lockheed.com>
Organization: Comp Sci, RMIT, Melbourne, Australia
Newsgroups: comp.software-eng,comp.lang.ada,comp.lang.pl1

Steve O'Neill writes:

>++ robin wrote:
>> ---I think the real lessons are that
>> 1. real-time programming requires special expertise.

>Agreed wholeheartedly

>> 2. the choice of language is suspect.  A better-established
>> language such as PL/I -- specifically designed for
>> real-time programming -- with robust compilers, and
>> with its base of experienced programming
>> staff could well have prevented this disaster.

>I disagree completely!  The language was not the
>problem; the design decisions in how the language
>was used were.

---The choice of language is indeed very relevant.
What I wrote in an earlier posting on this topic is highly apt:

"A PL/I programmer experienced with real-time systems
would have CHALLENGED such a stupid requirement that the
computer be shut down by the error-handler in the event of
a fixed-point overflow.  He would have had it changed.

"I'd go further to say that no experienced PL/I programmer
would have shut down the system as a result of a fixed-point
overflow.
"Furthermore, he would have included a check that the value did not go out of range;" >Ada is completely capable the realm [sic] >of real-time programming, has robust >compilers and tools, and has quite a few experienced >software engineers capable of implementing >just about any requirements thrown their way (been there, done that). >Had the designers of the system allowed the >implementors to use Ada exception mechanisms fully >and properly they could have localized the failure >to, at worst, the alignment function ---But all it needed was a check that the value was in range. Such checks had been included on other similar conversions in the vicinity! >(which >was not necessary at the time of the failure anyway) ---what? The OBC was using the attitude information to direct the nozzles. It was their [the nozzles] sudden change that caused the space vehicle to break up, thereby forcing the vehicle to self-destruct automatically [that sudden change was the result of the OBC interpreting the error readout from the shut-down SRI computer as attitide data.] > without shutting down the entire device. >Instead, as is common practice in the safety- >critical world, local exception handlers are >frequently banned and a global 'shut it all >down' handler is the only stop gap measure. >Unbelievably the rationale for disallowing local >handlers is because they make it difficult to >verify complete code coverage since they are >only executed in the case of exceptional conditions ---As I wrote in an earlier post: "This project might well have been written in PL/I, which has excellent real-time facilities, including error handling, error simulation and validation facilities. The language has robust compilers, and experts with many years of PL/I programming experience. "As to PL/I facilities, I refer to the SIGNAL statement, with which given conditions (errors such as fixed-point overflow) can be signalled as if the condition (error) actually occurred. 
"This alone would have showed up the deficiency of the overall design (that the system would shut itself down for fixed-point overflow)." >(i.e. given the expected data (Ariane 4 profile) >the handlers are not executed and therefore we >can't prove that all of our code has been >exercised at least once). ---But they can be, and shown to be, in PL/I -- the language with the right tools -- with the SIGNAL statement. That statement leaves an indisputable footprint! >I find this logic suspect in >the extreme! As somebody once said "expect the >unexpected". In addition to trying for fault >avoidance through analysis we should also be >planning for fault resiliency in the presence of >reality. ---Exactly what I wrote in an earlier posting. >You're other conclusions are right on target >though - you should never shut a system down >(unless its presence is impacting system performance >as in the case of babbling nodes et.al.) but >do indicate its distress to a higher authority >who then can take this into account in using the >information provided. >Steve O'Neill | "No,no,no, don't tug on that! >Sanders, A Lockheed Martin Company | You never know what it might >smoneill@sanders.lockheed.com | be attached to."