From: JP Thornley
Subject: Re: Need help with PowerPC/Ada and realtime tasking
Date: 1996/05/25
Message-ID: <355912560wnr@diphi.demon.co.uk>
References: <1026696wnr@diphi.demon.co.uk>
Reply-To: jpt@diphi.demon.co.uk
Organization: None
Newsgroups: comp.lang.ada

Richard Riehle writes, in a follow-up on safety-critical software using
interrupts and tasking:-

> The main requirement of safety-critical code is that it be "safe."

My view is that code can never be judged as safe or unsafe - only correct
or incorrect. However my usage of the words "safe" and "safety-critical"
carries a lot of additional baggage, and it is possible that we are
differing over the meaning of these words rather than over anything
fundamental. So here are my meanings (this could get quite lengthy and
it's rather off-topic for comp.lang.ada, so bail out now if not really
interested).

Software *on its own* is incapable of causing harm. For harm to occur,
the software must be part of a larger system that translates its outputs
into actions in the real world - e.g. moving actuators or displaying
information. So safety is an attribute of a system.

In assessing the safety of a system, the process starts with hazard
identification. A hazard is an event that has a reasonable chance of
resulting in a serious outcome (e.g. death or serious injury to a person,
major financial loss or widespread environmental damage). For example, a
traffic-light controlled road junction is a system; a hazard (possibly
the only one) could be 'collision between vehicles using the junction'.
[Note - other people use 'hazard' with a different meaning; here I'm
giving the meaning I use, and I'm *not* arguing that it's the only
correct one.]

Hazard analysis then identifies the mechanisms that could give rise to
the hazard. For example:-

1. 'vehicle crosses junction when lights are on red' or
2. 'lights indicate green in conflicting directions'

The first of these could be further analysed as:-

1a. 'driver ignores red light'
1b. 'weather conditions make light difficult to see'
1c. 'failure of the vehicle's braking mechanism'

etc. This process continues until specific failures of individual
components of the system have been identified. [Time for more caveats -
system safety isn't really my area, and this is only one of a number of
different ways of doing hazard analysis - it's still very much a
developing technology (see "Safeware" by Nancy Leveson).]

Based upon this analysis, each component of the system can be given a
required integrity rating. In many cases, failure of a single component
does not lead to the hazard unless there is an independent failure of
one or more other components - so the required integrity level of each
such component can be reduced. A _safety-critical_ rating is given to
any component whose failure can lead to the hazard without the need for
any independent failure occurring.

Clearly any safety-critical component must have a very low failure rate,
as the overall failure rate for the system cannot be less than the sum
of the failure rates of the safety-critical components. Following this
process, and the prediction of failure rates for the components, the
system can be judged as _safe_ or unsafe on a calculated probability of
the hazard occurring. This is often expressed as the rate of the hazard
occurring over a defined period of operation - typical figures might be
10^-6 to 10^-9 per hour, depending on the perceived severity of the
hazard, rates of exposure, etc.
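As a rough sketch of how this might be mechanised - a toy illustration
of my own, with invented component names and failure rates, not any real
analysis tool - the decomposition and the 'no independent failure
needed' test could look like this:

    # Toy fault tree for the junction hazard (illustrative only).
    from dataclasses import dataclass

    @dataclass
    class Event:            # a basic component failure
        name: str
        rate: float         # failures per hour (made-up figures)

    @dataclass
    class Gate:
        kind: str           # "OR": any input fails; "AND": all must fail
        inputs: list

    # Hazard: 'collision between vehicles using the junction'
    hazard = Gate("OR", [
        Gate("AND", [Event("driver ignores red light", 1e-4),
                     Event("junction interlock fails", 1e-5)]),
        Event("lights green in conflicting directions", 1e-9),
    ])

    def safety_critical(node, behind_and=False):
        """Basic events reachable through OR gates only - components
        whose failure alone can lead to the hazard."""
        if isinstance(node, Event):
            return [] if behind_and else [node]
        behind = behind_and or node.kind == "AND"
        return [e for c in node.inputs
                  for e in safety_critical(c, behind)]

    def hazard_prob(node, hours=1.0):
        """Rare-event approximation of the hazard probability."""
        if isinstance(node, Event):
            return node.rate * hours
        ps = [hazard_prob(c, hours) for c in node.inputs]
        if node.kind == "AND":              # all inputs must fail
            result = 1.0
            for p in ps:
                result *= p
            return result
        return min(1.0, sum(ps))            # OR: union (upper) bound

    for e in safety_critical(hazard):
        print("safety-critical:", e.name)
    print("hazard rate ~", hazard_prob(hazard), "per hour")

Here the 'conflicting greens' failure comes out as safety-critical
because it sits behind OR gates only, while 'driver ignores red light'
does not, because the interlock must fail independently as well.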
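To put some invented numbers on that budget: if the target for the
hazard is 10^-6 per hour and the analysis finds three safety-critical
components, then - since the system rate is at least the sum of their
rates - each component must achieve roughly 3 x 10^-7 failures per hour,
i.e. about one failure per 340 years of continuous operation. At the
10^-9 end of the scale the per-component budget shrinks to a few parts
in 10^10 per hour.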
So why do I say that software cannot be considered safe? There are no
meaningful failure modes for a software component, since a software
failure can rarely be contained to only part of that component - it
either works without failure or fails completely. The effects of a
software failure are therefore assumed to be whatever is worst possible
in the situation currently under analysis. Given that we cannot measure
software to the rates quoted above, any software component rated as
safety-critical has to be given a failure rate of zero in the system
safety assessment. (This places quite severe requirements on the
software development team and their process ;-).

So safety is measured by (usually) small but definitely non-zero
numbers; software is either correct or not, with no numeric scale.

Sorry to take so long to get there, but I thought it worthwhile trying
to get my meanings as clear as possible.

Phil Thornley

-- 
------------------------------------------------------------------------
| JP Thornley                     EMail jpt@diphi.demon.co.uk          |
------------------------------------------------------------------------