comp.lang.ada
* Orders of Fault Management
From: Marc A. Criley @ 2004-07-27 20:12 UTC


There is a hierarchy of ignorance, which has been summarized into Orders of
Ignorance (see http://www.corvusintl.com/CACM_Oct_2000.htm).

It strikes me that there's an analogous hierarchy of Fault Management for
software systems, which I summarize as follows:

Fault Management Order 0: Nothing can go wrong.
   - Short of hardware failure, proper verification of the software ensures
no faults. (See www.sparkada.com).

Fault Management Order 1: I know what can go wrong.
   - And then plan for it. Timeouts with appropriate retry or other recovery
processing, exception handling (such as End_Error being raised when reading
a file), and validation (with 'Valid and other checks) of bad data received
via an external interface are in place and ready to handle the faults that
are known to occur, no matter how egregious they may be.

Fault Management Order 2: I don't know what can go wrong.
   - Assuming Fault Management Order 1 is properly addressed, this
predominantly involves bugs: e.g., an Order 0 or 1 interface violates what
was thought to be known about it, or a bug in the system manifests itself.
Recovery from such situations could involve restarting the system, or the
individual component in which the problem occurred.

Fault Management Order 3: I wouldn't know if something went wrong.
   - This can involve not checking return codes or the results of resource
requests, or blithely using "when others => null" exception handlers. The
system will continue to run, with its users ignorant of the degradation and
errors that may be accumulating.
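
As a rough illustration of Order 1, here is a minimal sketch (in Python
rather than Ada, purely for brevity) of planning for faults that are known
in advance: a timeout handled by bounded retry, and validation of a value
arriving over an external interface. The names TransientTimeout,
call_with_retry and parse_heading are hypothetical, and the 0..359 range is
an assumed interface contract, not something taken from a real system.

```python
import time


class TransientTimeout(Exception):
    """Hypothetical: raised when an external request times out."""


def call_with_retry(request, attempts=3, delay_s=0.0):
    """Retry a request that is known to time out occasionally."""
    for attempt in range(1, attempts + 1):
        try:
            return request()
        except TransientTimeout:
            if attempt == attempts:
                raise              # known fault, retries exhausted: report upward
            time.sleep(delay_s)    # wait before the next attempt


def parse_heading(raw):
    """Validate a value from an external interface before using it."""
    value = int(raw)               # ValueError on garbage is an anticipated fault
    if not 0 <= value < 360:       # assumed contract: whole degrees, 0..359
        raise ValueError(f"heading out of range: {value}")
    return value
```

Only the anticipated exception is caught; anything else propagates, which
is what keeps this Order 1 rather than Order 3.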


FMO-3 is unacceptable, of course. You shouldn't even be programming if your
code "handles" faults this way. You can at least turn FMO-3 into FMO-2 with
techniques like asserting return codes and resource request results, and
removing all exception handlers whose purpose is not explicit. This means
not just "when others =>", but probably also most "when Constraint_Error =>"
and "when Program_Error =>" appearances. Even if you don't know why
something somewhere may have gone wrong, at least know when it's gone
_right_.
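
To make the FMO-3 to FMO-2 step concrete, here is a small sketch in Python
(chosen only for brevity; the blanket handler plays the role of
"when others => null"). The write callable and its convention of returning
0 on success are assumptions made for the example.

```python
def store_fmo3(write, record):
    """FMO-3: the status is ignored and every exception is swallowed."""
    try:
        write(record)              # status code silently discarded
    except Exception:
        pass                       # nobody will ever know this happened


def store_fmo2(write, record):
    """FMO-2: no blanket handler, and the status code is checked."""
    status = write(record)         # unexpected exceptions propagate to the caller
    if status != 0:                # assumed convention: 0 means success
        raise RuntimeError(f"write failed with status {status}")
```

The second version still does not know why a write might fail, but a
failure can no longer pass unnoticed. An explicit raise is used instead of
a Python assert statement because assert statements are skipped when the
interpreter runs with -O.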

FMO-2 is what I always find problematic. The statement "All software has
bugs" gets thrown around, and through gritted teeth I have to agree, but too
often I hear that used as an excuse for lack of development rigor. And just
today I discovered a new term, "software rejuvenation", that addresses FMO-2
by preemptively and regularly restarting a system
(http://www.stsc.hill.af.mil/crosstalk/2004/08/0408Bernstein.html). The
authors' research shows that it's been used and is effective, but I just
want to sigh "You're giving up! Fix the bugs!"

FMO-2 can be attacked with well-defined engineering practices, a good
software-engineering-oriented language (Ada), and liberal use of [pragma]
assert, but what do you do about the effects of those bugs that slip by? Or
when a trusted interface unexpectedly misbehaves? You can't anticipate these
specific occurrences, so how do you deal with them?

My predilection is to let the system fail (definitely so during the
development and integration phases); at least then you can fix the bug, or
turn an FMO-2 instance into FMO-1. But the counter-scenario is, of course,
"What if that system has been delivered and is flying your plane?"
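
One possible middle ground for a delivered system is the component restart
mentioned under Order 2. The Python sketch below is only an illustration
under assumed names (supervise, make_component, max_restarts): a catch-all
handler sits at the component boundary, but its purpose is explicit. It
records every unanticipated failure so the bug can be fixed later, restarts
the component with fresh state, and gives up loudly after a bounded number
of attempts.

```python
import logging

log = logging.getLogger("supervisor")


def supervise(make_component, max_restarts=3):
    """Run a component; on an unanticipated failure, record it and restart."""
    failures = []
    for _ in range(max_restarts + 1):
        component = make_component()       # fresh state on every (re)start
        try:
            return component(), failures
        except Exception as exc:           # deliberate catch-all at the boundary
            log.error("component failed: %r", exc)
            failures.append(exc)           # keep the evidence for later analysis
    # Restarts exhausted: fail visibly rather than carry on degraded.
    raise RuntimeError(
        f"component failed {len(failures)} times"
    ) from failures[-1]
```

The returned list of failures is what separates this from FMO-3: the
system keeps running, but somebody can still find out that something went
wrong.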

I bring this up because I often see discussions of exception handling that
span both FMO-1 and FMO-2 scenarios, with the participants not clearly aware
of the distinction between the two, or that exceptions serve a different
purpose in each.

Something to ponder...

Marc A. Criley
McKae Technologies
www.mckae.com





* Re: Orders of Fault Management
From: Marin David Condic @ 2004-07-28 12:06 UTC


The claim "All software has bugs" is not always true. However, the absence
of bugs is not really something you can demonstrate. I just delivered a
significant control program (done in Ada) that - after rigorous
verification - is operating with no *known* bugs. There may be some in
there - but we can't demonstrate that there are from what we know.

Also, a "bug" may not truly stop a software application from 
accomplishing its purpose. In that case, one might debate the economics 
of trying to remove "all bugs". It kind of leads to the question "How 
good is 'good enough'?"

MDC


Marc A. Criley wrote:
> 
> FMO-2 is what I always find problematic. The statement "All software has
> bugs" gets thrown around, and through gritted teeth I have to agree, but too
> often I hear that used as an excuse for lack of development rigor. And just
> today I discovered a new term, "software rejuvenation", that addresses FMO-2
> by preemptively and regularly restarting a system
> (http://www.stsc.hill.af.mil/crosstalk/2004/08/0408Bernstein.html). The
> authors' research shows that it's been used and is effective, but I just
> want to sigh "You're giving up! Fix the bugs!"

-- 
======================================================================
Marin David Condic
I work for: http://www.belcan.com/
My project is: http://www.jsf.mil/NSFrames.htm

Send Replies To: m   o   d   c @ a   m   o   g
                    c   n   i       c   .   r

     "All reformers, however strict their social conscience,
      live in houses just as big as they can pay for."

          --Logan Pearsall Smith
======================================================================




* Re: Orders of Fault Management
From: Dmitry A. Kazakov @ 2004-07-28 13:11 UTC


On Wed, 28 Jul 2004 12:06:13 GMT, Marin David Condic wrote:

> Also, a "bug" may not truly stop a software application from 
> accomplishing its purpose.

Which is probably not a bug then. (:-))

> In that case, one might debate the economics 
> of trying to remove "all bugs". It kind of leads to the question "How 
> good is 'good enough'?"

Perhaps one should rather talk about fulfilling requirements...

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de



* Re: Orders of Fault Management
From: Puckdropper @ 2004-07-28 14:14 UTC


Dmitry A. Kazakov wrote:
> On Wed, 28 Jul 2004 12:06:13 GMT, Marin David Condic wrote:
> 
> 
>>Also, a "bug" may not truly stop a software application from 
>>accomplishing its purpose.
> 
> 
> Which is probably not a bug then. (:-))

There's a saying:  "A *bug* becomes a *feature* when you document it."

>>In that case, one might debate the economics 
>>of trying to remove "all bugs". It kind of leads to the question "How 
>>good is 'good enough'?"
> 
> 
> Perhaps one should rather talk about fulfilling requirements...

How good is good enough?  Hm... My hobbies and interests always seem to 
intersect somehow.  That's a question model railroaders ask. :-)

Puckdropper



* Re: Orders of Fault Management
From: Marin David Condic @ 2004-07-29 12:46 UTC


Dmitry A. Kazakov wrote:
> On Wed, 28 Jul 2004 12:06:13 GMT, Marin David Condic wrote:
> 
> 
>>Also, a "bug" may not truly stop a software application from 
>>accomplishing its purpose.
> 
> 
> Which is probably not a bug then. (:-))

Kind of a gray area. A common type of bug in the world I live in might 
be one where you fail to detect that an actuator is out of control 
from a mismatch between the current you ask for and the current you 
get. That is a bug. But sooner or later, because of position feedback, 
you see that the actuator is out of control and take the right 
accommodation. So you have a "bug", but due to redundancy in detection 
techniques, you can still use the software to operate the device safely.
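
A toy sketch of that kind of redundancy, written in Python only for
illustration: two independent detectors, either of which is enough to
trigger accommodation. All names, units and tolerances here are made up for
the example and do not describe any real control system.

```python
def current_fault(commanded_ma, measured_ma, tolerance_ma=5.0):
    """Detector 1: drive current differs from what was commanded."""
    return abs(commanded_ma - measured_ma) > tolerance_ma


def position_fault(commanded_pos, feedback_pos, tolerance=0.02):
    """Detector 2: position feedback differs from the commanded position."""
    return abs(commanded_pos - feedback_pos) > tolerance


def actuator_out_of_control(commanded_ma, measured_ma,
                            commanded_pos, feedback_pos):
    """Either detector alone is sufficient to declare the actuator failed."""
    return (current_fault(commanded_ma, measured_ma)
            or position_fault(commanded_pos, feedback_pos))
```

If the current check misses the fault, whether through a bug or because
the currents happen to agree, the position check still catches it, only
later. For example, actuator_out_of_control(20.0, 20.0, 0.50, 0.10) is True
even though the current detector sees nothing wrong.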

> 
> 
>>In that case, one might debate the economics 
>>of trying to remove "all bugs". It kind of leads to the question "How 
>>good is 'good enough'?"
> 
> 
> Perhaps one should rather talk about fulfilling requirements...

See above. You fulfil the important system level requirement, but you 
fail a lower level requirement that might have lesser importance. Do you 
change your mind and say "well, that wasn't really a requirement - I can 
live without that feature for now." Or do you insist the software is 
unacceptable and delay everything until it is fixed? As is often the 
case, the answer is "It Depends..." There is no clear cut right and 
wrong here - you have to look at what the actual situation is and 
determine if you can live with it or not.

MDC






* Re: Orders of Fault Management
From: Mark A. Biggar @ 2004-08-11  4:56 UTC


Dmitry A. Kazakov wrote:

> On Wed, 28 Jul 2004 12:06:13 GMT, Marin David Condic wrote:
> 
> 
>>Also, a "bug" may not truly stop a software application from 
>>accomplishing its purpose.
> 
> 
> Which is probably not a bug then. (:-))

No, there are things that everyone would agree are bugs, but just don't 
affect the usability of an application.  For example, if it has an
error message with a misspelled word ("Fiel not found"), that's a real 
bug, but it may not be economically worth the effort to fix it.

-- 
mark@biggar.org
mark.a.biggar@comcast.net



* Re: Orders of Fault Management
From: Dmitry A. Kazakov @ 2004-08-11  8:38 UTC


On Wed, 11 Aug 2004 04:56:21 GMT, Mark A. Biggar wrote:

> Dmitry A. Kazakov wrote:
> 
>> On Wed, 28 Jul 2004 12:06:13 GMT, Marin David Condic wrote:
>> 
>>>Also, a "bug" may not truly stop a software application from 
>>>accomplishing its purpose.
>> 
>> Which is probably not a bug then. (:-))
> 
> No, there are things that everyone would agree are bugs, but just don't 
> affect the usability of an application.  For example, if it has an
> error message with a misspelled word ("Fiel not found"), that's a real 
> bug, but it may not be economically worth the effort to fix it.

It depends on whether the requirement was to bring up *an* or *the* error
message. One of our programmers customarily uses obscene language in all
message boxes that should never appear. Unfortunately, they do! (:-))

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de



* Re: Orders of Fault Management
From: Marin David Condic @ 2004-08-11 11:49 UTC


One can state anything in the requirements, but the ultimate question is 
"Can I use this software to get my job done?" and a misspelled or 
slightly erroneous error message may be incorrect, but doesn't stop you 
from using the software to accomplish the job. Even in more serious 
issues of functionality, one can decide that a bug isn't stopping the 
product from accomplishing its mission. Does that mean the requirements 
were overstated? Possibly. Or it possibly means that you get what you 
settle for. You may want X but you'll settle for X - 1.

MDC

Dmitry A. Kazakov wrote:
> 
> It depends on whether the requirement was to bring up *an* or *the* error
> message. One of our programmers customarily uses obscene language in all
> message boxes that should never appear. Unfortunately, they do! (:-))
> 





