From: Ken Garlington
Subject: Re: Safety-critical development in Ada and Eiffel
Date: 1997/07/21
Message-ID: <33D3FA82.6EA6@flash.net>#1/1
References: <97072111025086@psavax.pwfl.com>
Organization: Flashnet Communications, http://www.flash.net
Reply-To: kennieg@flash.net
Newsgroups: comp.lang.ada

Marin David Condic, 561.796.8997, M/S 731-96 wrote:
>
> Ken Garlington writes:
> >> Depends on the application. Generally they print an error report, dump
> >> stack information to a file, and ask the user to phone the vendor. Note
> >> that I am not proposing this for real-time embedded applications. However
> >> there are safety-critical applications which are not real-time.
> >
> >True. Furthermore, there are safety-critical real-time applications that
> >are not required to be fail-operational. In both cases, I can at least see
> >the glimmer of hope that assertions might have some value. (However, even
> >a non-real-time system monitoring a nuclear power plant, for example, might
> >not want to print out a message saying "phone the vendor, and I hope the
> >reactor doesn't go supercritical while you're on hold :)
> >
> >However, for _at least_ certain classes of safety-critical systems, this
> >behavior is completely unacceptable. Unfortunately, most people who advocate
> >liberal use of exceptions are working on systems where it is quite
> >acceptable.
> >
>
> Well, here's one way of dealing with exceptions in a real-time
> safety critical application:
>
> If you have a control loop executing code, say, every 5mSec,
> sensing some inputs and doing some loop closure, you know by the
> rules of Ada that there are some exception possibilities you can't
> disable.

Realistically, you can disable all of them (and we have in the past).

> Hence they could be raised by code beyond your control.
> You insert an exception handler in the loop to catch any of these,
> possibly logging them for telemetry (or at least ticking off a
> counter somewhere so you know it happened in lab testing!) then
> allow the loop to restart.

Yes, we do this with interrupt handlers (although we resume where we
left off, rather than restart). The problem with restart is blowing off
a frame of data. For high-gain data, you might see a significant
transient, which could have very bad effects structurally,
operationally, etc.

The bottom line is, there is no intrinsically "safe" general-purpose
approach to handling exceptions. For the ones you can't suppress (or
figure out how to handle otherwise), you end up basically making the
best of a bad situation.

>
> What you're saying is this: "On pass N everything was fine. On
> pass N+1, something went haywire and interrupted normal execution.
> Because quitting operation is not an acceptable alternative, what
> I'm betting on is that on pass N+2, the problem will clear
> itself."

OK for transient input problems (we use input filtering to handle
those, however), or for transient hardware problems (and you should
read the beating Ariane took for assuming that!), but there's
absolutely no reason to assume a software design fault will act this
way. That's not to say that your approach is wrong, but if it fails...
what will your inquiry board's report look like?

>
> This would potentially give you a viable use for raising
> exceptions on the fly.
> Granted, you wouldn't do this for any sort
> of expected conditions with planned for accommodations, but
> strictly for those sorts of errors that should never occur, but
> might just do so anyway. Your accommodation at that point might be
> something like resetting all of memory to its initial state and
> hoping that the next batch of inputs gets you back to where you
> should be.

We actually have a top-level handler on some programs that does a warm
start if a really serious event happens, which is similar to what you
describe. However, it's more wishful thinking than anything else to say
this will save the system. It's the last line of defense, not the
first, and certainly not something you want to depend on to say your
system is safe!

>
> MDC
>
> Marin David Condic, Senior Computer Engineer    ATT:        561.796.8997
> Pratt & Whitney GESP, M/S 731-96, P.O.B. 109600 Fax:        561.796.4669
> West Palm Beach, FL, 33410-9600                 Internet:   CONDICMA@PWFL.COM
> ===============================================================================
> "You spend a billion here and a billion there. Sooner or later it
> adds up to real money."
> -- Everett Dirksen
> ===============================================================================
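
For readers who want the two mechanisms in this exchange side by side
(a per-frame handler that counts the fault and lets the loop carry on,
plus a warm start as the last line of defense), here is a rough sketch.
It is an illustration only, not code from either poster's system. In
Ada the handler would sit in a block inside the loop; Python is used
here purely for brevity. Every name, the control law, and the
three-frame threshold are hypothetical. The sketch models the "restart"
variant: a faulted frame is dropped and the previous output is held,
which is exactly the lost frame of data objected to above.

```python
import math
from dataclasses import dataclass


@dataclass
class LoopState:
    integrator: float = 0.0   # control-law state carried between frames
    last_output: float = 0.0  # command held when a frame is lost
    fault_count: int = 0      # telemetry counter of faulted frames
    warm_starts: int = 0      # number of last-resort state resets


# Hypothetical: consecutive faulted frames tolerated before a warm start.
WARM_START_THRESHOLD = 3


def control_law(state: LoopState, sample: float) -> float:
    """Hypothetical control law: integrate the input with a gain of 0.5."""
    if math.isnan(sample):
        # Stand-in for an error that "should never occur".
        raise ValueError("invalid sample")
    state.integrator += 0.5 * sample
    return state.integrator


def run_frames(samples, state=None):
    """Run one frame per sample; return the list of outputs and the state."""
    if state is None:
        state = LoopState()
    consecutive_faults = 0
    outputs = []
    for sample in samples:
        try:
            state.last_output = control_law(state, sample)
            consecutive_faults = 0
        except Exception:
            # Per-frame handler: count it, drop the frame, hold the output.
            state.fault_count += 1
            consecutive_faults += 1
            if consecutive_faults >= WARM_START_THRESHOLD:
                # Last line of defense: reset state and hope the next
                # inputs bring the loop back to where it should be.
                state.integrator = 0.0
                state.last_output = 0.0
                state.warm_starts += 1
                consecutive_faults = 0
        outputs.append(state.last_output)
    return outputs, state
```

Nothing in the sketch makes the loop safe. A transient bad input costs
one held frame, but a design fault inside control_law would raise on
every frame, and the handler would do no more than count faults and
warm-start repeatedly.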