Space Station S/W in Ada -- No Tasking?

comp.lang.ada

* Space Station S/W in Ada -- No Tasking?
@ 1998-05-03  0:00 Robert Munck
  1998-05-03  0:00 ` Robert Dewar
                   ` (3 more replies)
  0 siblings, 4 replies; 14+ messages in thread
From: Robert Munck @ 1998-05-03  0:00 UTC (permalink / raw)



A paragraph in Popular Science notes that the software for
the International Space Station is being written in Ada,
about 3M lines worth.  However, it goes on to say:

   "To make troubleshooting easier, the software that runs
   the trio of computer networks aboard the space station is
   written to operate in synchronous, or serial, fashion 
   rather than the faster but more complex asynchronous."

Does this mean that they're not using tasking, but rather the
old "crystal clock" architecture where you organize your
processing into major and minor cycles, disable interrupts, and
poll for events "just in time" at various places in the cycles?

In my experience, large systems built that way tended to be
complete disasters: nightmares to debug ("troubleshoot!"),
horror shows to maintain and enhance.  They often had
interdependencies that were handled purely by the positions
of pieces of code in the cycles and the processing times of
the other (unrelated) functions between those positions. 
Adding a tiny fix in one place could break code half a major
cycle and 1 million lines of code away from it.
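
For readers unfamiliar with the pattern, a cyclic executive of the kind
described above can be sketched roughly as follows.  This is only an
illustration in Python, not anything taken from the station software;
the frame table, the function names, and the state dictionary are all
hypothetical.  It shows how one function can come to depend purely on
its position after another within a minor cycle.

```python
from typing import Callable, Dict, List, Tuple

Job = Tuple[str, Callable[[], None]]


def run_major_cycle(frame_table: List[List[Job]]) -> List[Tuple[int, str]]:
    """Run every minor cycle once, in order, and return an execution trace."""
    trace = []
    for minor_index, jobs in enumerate(frame_table):
        for name, job in jobs:
            job()  # runs to completion; nothing can preempt it
            trace.append((minor_index, name))
        # A real executive would now idle until the next clock tick.
    return trace


def build_demo_table() -> Tuple[Dict[str, int], List[List[Job]]]:
    """Build a hypothetical two-minor-cycle schedule sharing one state dict."""
    state = {"sensor": 0, "filtered": 0}

    def read_sensor() -> None:
        state["sensor"] += 10

    def filter_input() -> None:
        # Silently relies on read_sensor having run earlier in this cycle.
        state["filtered"] = state["sensor"] // 2

    def guidance() -> None:
        pass

    table = [
        [("read_sensor", read_sensor), ("filter_input", filter_input)],
        [("read_sensor", read_sensor), ("guidance", guidance)],
    ]
    return state, table
```

Swapping the two entries in the first minor cycle changes what
filter_input computes, even though neither function's own code changed.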

Could we possibly be using this approach for a life-critical
system that will run in an incompletely-understood
environment, be subject to extensive and rapid change, and
have a lifetime of decades?

Bob Munck
Mill Creek Systems LC




^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Space Station S/W in Ada -- No Tasking?
  1998-05-03  0:00 Space Station S/W in Ada -- No Tasking? Robert Munck
@ 1998-05-03  0:00 ` Robert Dewar
  1998-05-07  0:00   ` JP Thornley
  1998-05-05  0:00 ` LarryButts
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 14+ messages in thread
From: Robert Dewar @ 1998-05-03  0:00 UTC (permalink / raw)



Bob says

<<Could we possibly be using this approach for a life-critical
system that will run in an incompletely-understood
environment, be subject to extensive and rapid change, and
have a lifetime of decades?
>>

This is of course an old argument, and proponents of synchronous
cyclic scheduling will be glad to produce similar rhetoric denouncing
the use of the asynchronous approach.

Of course one has to decide this on a case-by-case basis, but there are
plenty of examples of disasters and successes created using both approaches.

At least some parts of the space station software definitely use cyclic
scheduling (I remember this because I implemented the necessary CIFO
primitives to support cyclic scheduling for Alsys, who was supplying
Ada compilers for the Space Station effort).








* Re: Space Station S/W in Ada -- No Tasking?
  1998-05-03  0:00 ` Robert Dewar
@ 1998-05-07  0:00   ` JP Thornley
  0 siblings, 0 replies; 14+ messages in thread
From: JP Thornley @ 1998-05-07  0:00 UTC (permalink / raw)



It might be worth mentioning, in this discussion, the existence of the 
Ravenscar Profile - a subset of Ada tasking designed explicitly to meet 
the certification requirements of high integrity Ada programs.

The profile was defined at the 8th International Real-Time Ada Workshop 
in April 1997 (held at the Ravenscar Hotel, North Yorkshire, UK). 
Details are in the proceedings, published as the September/October issue 
of Ada Letters (Volume XVII Number 5).

At least one vendor has announced plans to provide a 'certifiable'
run-time that supports this subset.

Use of the profile is recommended in the HRG Guidance document (an ISO
technical report on the use of Ada in high integrity software currently
in preparation).

Phil Thornley.

-- 
------------------------------------------------------------------------
| JP Thornley    EMail jpt@diphi.demon.co.uk                           |
|                      phil.thornley@acm.org                           |
------------------------------------------------------------------------







* Re: Space Station S/W in Ada -- No Tasking?
  1998-05-03  0:00 Space Station S/W in Ada -- No Tasking? Robert Munck
  1998-05-03  0:00 ` Robert Dewar
@ 1998-05-05  0:00 ` LarryButts
  1998-05-05  0:00 ` Roger Racine
  1998-05-06  0:00 ` Robert I. Eachus
  3 siblings, 0 replies; 14+ messages in thread
From: LarryButts @ 1998-05-05  0:00 UTC (permalink / raw)



I believe you are 100% correct. However, on the trainer for the ISS we
are using Ada, about 1.5M lines worth, and we are using tasking and a lot
of it. We are using rate monotonic scheduling for our real-time hard
deadline simulations. Ada tasking works great and this thing is
integrating a whole lot better than if we were using an old cyclic exec.
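
As background on rate monotonic scheduling, here is a minimal Python
sketch of the classic Liu and Layland utilization test.  The task names
and timing figures in the usage note are invented for illustration and
have nothing to do with the actual trainer.  The test is sufficient but
not necessary: a task set that fails it may still be schedulable, which
a more exact response-time analysis would have to confirm.

```python
from typing import Dict, List, Tuple

# Each task is (worst-case execution time, period), in the same time unit.
TaskSet = Dict[str, Tuple[float, float]]


def liu_layland_bound(n: int) -> float:
    """Utilization bound n * (2^(1/n) - 1) for n periodic tasks."""
    return n * (2 ** (1 / n) - 1)


def rate_monotonic_priorities(tasks: TaskSet) -> List[str]:
    """Order tasks from highest to lowest priority: shortest period first."""
    return sorted(tasks, key=lambda name: tasks[name][1])


def passes_utilization_test(tasks: TaskSet) -> bool:
    """True if total utilization is within the Liu and Layland bound."""
    utilization = sum(wcet / period for wcet, period in tasks.values())
    return utilization <= liu_layland_bound(len(tasks))
```

For example, a hypothetical set {"display": (20, 100), "dynamics": (5, 20),
"io": (10, 50)} has utilization 0.65, which is under the three-task bound
of roughly 0.78, and "dynamics" gets the highest priority.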








* Re: Space Station S/W in Ada -- No Tasking?
  1998-05-03  0:00 Space Station S/W in Ada -- No Tasking? Robert Munck
  1998-05-03  0:00 ` Robert Dewar
  1998-05-05  0:00 ` LarryButts
@ 1998-05-05  0:00 ` Roger Racine
  1998-05-05  0:00   ` Robert Munck
  1998-05-06  0:00   ` William D. Ghrist
  1998-05-06  0:00 ` Robert I. Eachus
  3 siblings, 2 replies; 14+ messages in thread
From: Roger Racine @ 1998-05-05  0:00 UTC (permalink / raw)



In article <354dadfd.2883074@news.mindspring.com> munck@Mill-Creek-Systems.com (Robert Munck) writes:

>A paragraph in Popular Science notes that the software for
>the International Space Station is being written in Ada,
>about 3M lines worth.  However, it goes on to say:

>   "To make troubleshooting easier, the software that runs
>   the trio of computer networks aboard the space station is
>   written to operate in synchronous, or serial, fashion 
>   rather than the faster but more complex asynchronous."

>Does this mean that they're not using tasking, but rather the
>old "crystal clock" architecture where you organize your
>processing into major and minor cycles, disable interrupts, and
>poll for events "just in time" at various places in the cycles?

>In my experience, large systems built that way tended to be
>complete disasters: nightmares to debug ("troubleshoot!"),
>horror shows to maintain and enhance.  They often had
>interdependencies that were handled purely by the positions
>of pieces of code in the cycles and the processing times of
>the other (unrelated) functions between those positions. 
>Adding a tiny fix in one place could break code half a major
>cycle and 1 million lines of code away from it.

>Could we possibly be using this approach for a life-critical
>system that will run in an incompletely-understood
>environment, be subject to extensive and rapid change, and
>have a lifetime of decades?

>Bob Munck
>Mill Creek Systems LC

The article is misleading; there is tasking being used for the ISS.  I was one 
of the people who convinced the Boeing management to allow it, and helped 
develop the tasking structure.  

Robert Dewar pointed out the development of the CIFO constructs for tasking 
within the Alsys compiler.  This was not used.  It was going to be used within 
the Space Station Freedom program, but was not allowed to be used within the 
re-designed computers in the International Space Station software (I have 
forgotten the reason).





* Re: Space Station S/W in Ada -- No Tasking?
  1998-05-05  0:00 ` Roger Racine
@ 1998-05-05  0:00   ` Robert Munck
  1998-05-12  0:00     ` Carla Taylor
  1998-05-06  0:00   ` William D. Ghrist
  1 sibling, 1 reply; 14+ messages in thread
From: Robert Munck @ 1998-05-05  0:00 UTC (permalink / raw)

On Tue, 5 May 1998 15:21:41 GMT, rracine@draper.com (Roger Racine)
wrote:

> ... there is tasking being used for the ISS.  I was one 
>of the people who convinced the Boeing management to allow it

You did good.  Robert Dewar's experience may be different, but
in 32-odd years in the business and a great deal of DoD, NASA,
and ESA involvement, I've never seen a large cyclic-executive-
architecture system that was in any way successful.

The trouble is that cyclic-exec projects are easier for bad
managers to manage.  They don't have to understand tough
concepts like deadlock, critical sections, rate monotonic
scheduling, etc.

Boeing management had to be convinced?  I hesitate to ask,
but how is the 777 avionics s/w structured?

Bob Munck
Mill Creek Systems LC


* Re: Space Station S/W in Ada -- No Tasking?
  1998-05-05  0:00   ` Robert Munck
@ 1998-05-12  0:00     ` Carla Taylor
  0 siblings, 0 replies; 14+ messages in thread
From: Carla Taylor @ 1998-05-12  0:00 UTC (permalink / raw)





> Boeing management had to be convinced?  I hesitate to ask,
> but how is the 777 avionics s/w structured?
> 
> Bob Munck
> Mill Creek Systems LC
> 
> 

In 777, tasking was not allowed.  Each "task" is written as a main
procedure, and proprietary hardware/software is used to schedule the
tasks, guaranteeing that each task will complete in its allotted time
or have the processor forcibly taken away from it.  I don't know if
this design was a Boeing decision or not.
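
The general idea of a fixed time budget per "task" can be sketched as
below.  This is a toy Python model under stated assumptions, not a
description of the proprietary 777 mechanism: time is counted in
abstract ticks, each call to a task's step function costs one tick, and
all names are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# (name, budget in ticks, step function returning True when work is done)
Partition = Tuple[str, int, Callable[[], bool]]


def make_task(ticks_needed: int) -> Callable[[], bool]:
    """Build a hypothetical task that needs a fixed number of ticks."""
    remaining = [ticks_needed]

    def step() -> bool:
        remaining[0] -= 1
        return remaining[0] <= 0

    return step


def run_frame(partitions: List[Partition]) -> Dict[str, Tuple[int, bool]]:
    """Run each partition for at most its budget; report (ticks used, finished)."""
    report = {}
    for name, budget, step in partitions:
        used, done = 0, False
        while used < budget and not done:
            done = step()
            used += 1
        # If done is still False here, the processor is taken away regardless.
        report[name] = (used, done)
    return report
```

A task that overruns cannot delay the partitions scheduled after it,
which is the property the posting describes.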

Kevin Tucker






* Re: Space Station S/W in Ada -- No Tasking?
  1998-05-05  0:00 ` Roger Racine
  1998-05-05  0:00   ` Robert Munck
@ 1998-05-06  0:00   ` William D. Ghrist
  1 sibling, 0 replies; 14+ messages in thread
From: William D. Ghrist @ 1998-05-06  0:00 UTC (permalink / raw)

Roger Racine wrote:
> 
> In article <354dadfd.2883074@news.mindspring.com> munck@Mill-Creek-Systems.com (Robert Munck) writes:
> 
> >A paragraph in Popular Science notes that the software for
> >the International Space Station is being written in Ada,
> >about 3M lines worth.  However, it goes on to say:
> 
> >   "To make troubleshooting easier, the software that runs
> >   the trio of computer networks aboard the space station is
> >   written to operate in synchronous, or serial, fashion
> >   rather than the faster but more complex asynchronous."
> 
> >Does this mean that they're not using tasking, but rather the
> >old "crystal clock" architecture where you organize your
> >processing into major and minor cycles, disable interrupts, and
> >poll for events "just in time" at various places in the cycles?
> 
> >In my experience, large systems built that way tended to be
> >complete disasters: nightmares to debug ("troubleshoot!"),
> >horror shows to maintain and enhance.  They often had
> >interdependencies that were handled purely by the positions
> >of pieces of code in the cycles and the processing times of
> >the other (unrelated) functions between those positions.
> >Adding a tiny fix in one place could break code half a major
> >cycle and 1 million lines of code away from it.
> 
> >Could we possibly be using this approach for a life-critical
> >system that will run in an incompletely-understood
> >environment, be subject to extensive and rapid change, and
> >have a lifetime of decades?
> 
> >Bob Munck
> >Mill Creek Systems LC
> 
> The article is misleading; there is tasking being used for the ISS.  I was one
> of the people who convinced the Boeing management to allow it, and helped
> develop the tasking structure.
> 
> Robert Dewar pointed out the development of the CIFO constructs for tasking
> within the Alsys compiler.  This was not used.  It was going to be used within
> the Space Station Freedom program, but was not allowed to be used within the
> re-designed computers in the International Space Station software (I have
> forgotten the reason).

I'm not familiar with the term "'crystal clock' architecture" and I also
don't know what is in the Space Station software, but I would like to
point out that using a non-tasking, non-interrupt architecture does not
necessarily result in the complex "major and minor cycles" structure
that is described.  I agree that such an approach is likely to be a
problem if you are attempting to break up the main flow of processing
with explicit polling for events at some faster rate.  What this is
really doing is attempting to emulate multi-tasking, but results in very
tight coupling of functions that should be unrelated.  There is another
approach, however -- that is to replace multi-tasking with
multi-processing.  It is typical in process control and protection
applications that most of the main application functions of a given
processing subsystem can be done in a single loop repeating at a fixed
interval.  Functions that require faster response, such as input
filtering and serial communications, can be done by additional ("slave")
processors, which then exchange data with the main processor via
access-controlled structures in shared memory.  Different main
processing subsystems can be networked together as well, and they can
operate at different cycle times.  
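
A rough Python sketch of this arrangement follows.  It is an
illustration only, with an ordinary lock standing in for whatever access
control the real shared memory uses; the class, the function names, and
the trip setpoint are hypothetical.  The slave side publishes its latest
filtered value, and the main loop takes one consistent snapshot per
cycle and otherwise runs straight through without interruption.

```python
import threading
import time
from typing import Callable, Dict, List, Tuple


class Mailbox:
    """Access-controlled shared structure between a slave and the main loop."""

    def __init__(self, initial: float) -> None:
        self._lock = threading.Lock()
        self._value = initial
        self._sequence = 0

    def publish(self, value: float) -> None:
        """Called by the slave processor when a new filtered value is ready."""
        with self._lock:
            self._value = value
            self._sequence += 1

    def snapshot(self) -> Tuple[int, float]:
        """Called once per main cycle; returns (sequence number, value)."""
        with self._lock:
            return self._sequence, self._value


def main_cycle(mailbox: Mailbox, setpoint: float) -> Dict[str, object]:
    """One pass of the main loop: read inputs once, then compute outputs."""
    sequence, reading = mailbox.snapshot()
    return {"sequence": sequence, "trip": reading > setpoint}


def run_fixed_interval(cycle: Callable[[], object], period_s: float,
                       cycles: int) -> List[object]:
    """Repeat cycle() at a fixed interval for a given number of cycles."""
    results = []
    next_release = time.monotonic()
    for _ in range(cycles):
        results.append(cycle())
        next_release += period_s
        delay = next_release - time.monotonic()
        if delay > 0:
            time.sleep(delay)
    return results
```

Because the main loop only ever sees whole snapshots, the verifier can
reason about one cycle at a time.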

We have been using this approach successfully for many years in the area
of nuclear safety systems.  The main benefits of this approach are that
it simplifies the task of software verification (the verifier doesn't
have to analyze what might happen if the software is interrupted at any
point in the program) and simplifies the ability to analyze worst case
response times.  The main drawback is that it is more costly in terms of
hardware.  But for low volume, large scope applications where the
highest level of software integrity is required, the benefits for the
software development and verification can outweigh the additional
hardware costs.  And, when presenting the safety case for licensing, it
is simply easier to demonstrate that the exact response of the system is
clearly known for all circumstances. 

One significant example of the success of this approach is in the
Sizewell B plant in the U.K.  The entire primary protection system and
the reactor control system were implemented in this manner.  This system
has been operating quite successfully for over three years now.  There
have been no "disasters", no "nightmares to debug", no "horror shows to
maintain and enhance". 

As for nightmares to debug, some of the worst in my experience (when I
worked on non-nuclear systems) have been related to the use of
interrupts and multi-tasking operating systems.  I will concede,
however, that operating system design has advanced considerably since
those days. 

Certainly, this approach is not suitable in many situations, but
properly applied, it can prevent rather than cause software
"nightmares". 

Regards,
Bill Ghrist


* Re: Space Station S/W in Ada -- No Tasking?
  1998-05-03  0:00 Space Station S/W in Ada -- No Tasking? Robert Munck
                   ` (2 preceding siblings ...)
  1998-05-05  0:00 ` Roger Racine
@ 1998-05-06  0:00 ` Robert I. Eachus
  1998-05-07  0:00   ` Joe Gwinn
  1998-05-08  0:00   ` Chris Warwick
  3 siblings, 2 replies; 14+ messages in thread
From: Robert I. Eachus @ 1998-05-06  0:00 UTC (permalink / raw)



In article <354dadfd.2883074@news.mindspring.com> munck@Mill-Creek-Systems.com (Robert Munck) writes:

  >  "To make troubleshooting easier, the software that runs
  >  the trio of computer networks aboard the space station is
  >  written to operate in synchronous, or serial, fashion 
  >  rather than the faster but more complex asynchronous."

    While the rest of the discussion on this sounds correct, I think
that what was being implicitly rejected here is the way that the Space
Shuttle computers do voting.  In the Space Shuttle, voting is based on
whether three different computer systems come up with about the same
answer at about the same time.  If no two agree, the results of a
fourth are arbitrarily accepted.  (Is that both right and concise?)
Since the computers do not get their data synchronously, the actual
data values, and the control inputs computed from them, will be
slightly different.

    In the ISS, where voting is required, two out of three computers
will have to agree, but based on identical data, and bit for bit
compares.  The Space Shuttle approach does provide more reliability
where the algorithms are not known to be stable, but is a maintenance
nightmare.  (All computers getting the same overflow is no help, and
the SS flight guidance software does go through about 20 different
flight regimes during landing.  At the boundary between some of those
modes, the flight control algorithms are known to be unstable.  So
that approach is not only appropriate to the shuttle, it seems to be
necessary.)
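
To illustrate what a two-out-of-three, bit-for-bit compare means in
practice, here is a small Python sketch.  It assumes each computer
produces one 64-bit floating-point output; the function names are
hypothetical and this is not the ISS voting logic itself.

```python
import struct
from collections import Counter
from typing import Optional


def bit_pattern(value: float) -> bytes:
    """Return the exact 64-bit IEEE 754 encoding of a float."""
    return struct.pack(">d", value)


def vote_bit_for_bit(a: float, b: float, c: float) -> Optional[float]:
    """Return the value at least two channels agree on exactly, else None."""
    counts = Counter(bit_pattern(v) for v in (a, b, c))
    pattern, votes = counts.most_common(1)[0]
    if votes >= 2:
        return struct.unpack(">d", pattern)[0]
    return None
```

The compare is only meaningful when all three channels compute from
identical inputs: 0.1 + 0.2 and 0.3 differ in the last bit, so a channel
that arrived at the "same" answer by a slightly different route would be
outvoted.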
--

					Robert I. Eachus

with Standard_Disclaimer;
use  Standard_Disclaimer;
function Message (Text: in Clever_Ideas) return Better_Ideas is...





* Re: Space Station S/W in Ada -- No Tasking?
  1998-05-06  0:00 ` Robert I. Eachus
@ 1998-05-07  0:00   ` Joe Gwinn
  1998-05-08  0:00     ` Roger Racine
  1998-05-08  0:00     ` Dale Pontius
  1998-05-08  0:00   ` Chris Warwick
  1 sibling, 2 replies; 14+ messages in thread
From: Joe Gwinn @ 1998-05-07  0:00 UTC (permalink / raw)

In article <EACHUS.98May6171227@spectre.mitre.org>,
eachus@spectre.mitre.org (Robert I. Eachus) wrote:

> In article <354dadfd.2883074@news.mindspring.com> munck@Mill-Creek-Systems.com (Robert Munck) writes:
> 
>   >  "To make troubleshooting easier, the software that runs
>   >  the trio of computer networks aboard the space station is
>   >  written to operate in synchronous, or serial, fashion 
>   >  rather than the faster but more complex asynchronous."
> 
>     While the rest of the discussion on this sounds correct, I think
> that what was being implicitly rejected here is the way that the Space
> Shuttle computers do voting.  In the Space Shuttle, voting is based on
> whether three different computer systems come up with about the same
> answer at about the same time.  If no two agree, the results of a
> fourth are arbitrarily accepted.  (Is that both right and concise?)
> Since the computers do not get their data synchronously, the actual
> data values, and the control inputs computed from them, will be
> slightly different.

This is my understanding as well.  Three of the computers are identical,
IBM 4pi units if I recall, while the fourth unit is hardwired analog, the
theory being to protect against common-mode hardware failures.

However, there is one added issue to be addressed: common-mode failure in
the software.  A classic solution is N-version programming, where two or
three completely independent and isolated teams develop the software for
the digital computers. The theory of this is that the teams, being
isolated, will not make the same mistakes, so they can cross-check each
other, both during system integration, and operationally.  

It's a pretty good theory, but falls down if for instance the control law
requirements are not correct.  The Swedes lost a prototype fighter
aircraft at the Paris Air Show to just such a problem a few years ago. 
Fortunately, nobody was hurt, although the airplane was destroyed.

My recollection is that NASA used two teams, so two of three computers
will contain the same software.

Anyway, one cannot expect the outputs of these slightly different programs
to match to the bit, nor is it important in practice that they be that
close, so the voting unit compares the absolute value of the algebraic
difference to a threshold.  I would guess that the tolerance is no more
than a few percent of full scale.
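
The threshold comparison described above might look something like the
following Python sketch.  The choice to output the median of the
agreeing channels is an assumption made for the example, as are the
function name and the sample values; the threshold would be whatever
fraction of full scale the designers settle on.

```python
from statistics import median
from typing import Optional, Sequence


def vote_with_tolerance(values: Sequence[float],
                        threshold: float) -> Optional[float]:
    """Median of the channels within threshold of at least one other channel.

    Returns None when no two channels agree.
    """
    agreeing = [
        v for i, v in enumerate(values)
        if any(abs(v - w) <= threshold
               for j, w in enumerate(values) if j != i)
    ]
    return median(agreeing) if agreeing else None
```

With channels reading 10.0, 10.2 and 15.0 and a threshold of 0.5, the
third channel is ignored and the result is about 10.1.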

>     In the ISS, where voting is required, two out of three computers
> will have to agree, but based on identical data, and bit for bit
> compares.  The Space Shuttle approach does provide more reliability
> where the algorithms are not known to be stable, but is a maintenance
> nightmare.  (All computers getting the same overflow is no help, and
> the SS flight guidance software does go through about 20 different
> flight regimes during landing.  At the boundary between some of those
> modes, the flight control algorithms are known to be unstable.  So
> that approach is not only appropriate to the shuttle, it seems to be
> necessary.)

One could wonder if ISS will really use bit comparison, because they too
may wish to have multiple versions, for exactly the same reasons.

Joe Gwinn


* Re: Space Station S/W in Ada -- No Tasking?
  1998-05-07  0:00   ` Joe Gwinn
@ 1998-05-08  0:00     ` Roger Racine
  1998-05-08  0:00       ` Joe Gwinn
  1998-05-08  0:00     ` Dale Pontius
  1 sibling, 1 reply; 14+ messages in thread
From: Roger Racine @ 1998-05-08  0:00 UTC (permalink / raw)

In article <gwinn-0705982150240001@d195.dial-5.cmb.ma.ultra.net> gwinn@ma.ultranet.com (Joe Gwinn) writes:

>In article <EACHUS.98May6171227@spectre.mitre.org>,
>eachus@spectre.mitre.org (Robert I. Eachus) wrote:

>> In article <354dadfd.2883074@news.mindspring.com> munck@Mill-Creek-Systems.com (Robert Munck) writes:
>> 
>>   >  "To make troubleshooting easier, the software that runs
>>   >  the trio of computer networks aboard the space station is
>>   >  written to operate in synchronous, or serial, fashion 
>>   >  rather than the faster but more complex asynchronous."
>> 
>>     While the rest of the discussion on this sounds correct, I think
>> that what was being implicitly rejected here is the way that the Space
>> Shuttle computers do voting.  In the Space Shuttle, voting is based on
>> whether three different computer systems come up with about the same
>> answer at about the same time.  If no two agree, the results of a
>> fourth are arbitrarily accepted.  (Is that both right and concise?)
>> Since the computers do not get their data synchronously, the actual
>> data values, and the control inputs computed from them, will be
>> slightly different.

>This is my understanding as well.  Three of the computers are identical,
>IBM 4pi units if I recall, while the fourth unit is hardwired analog, the
>theory being to protect against common-mode hardware failures.

This is really getting off the subject of Ada, but it is difficult to allow 
misconceptions to propagate.  There are 5 main computers (IBM 4pi AP-101s) on 
the Shuttle.  Four work together during critical flight phases (ascent and 
entry).  This is the Primary Avionics SubSystem (PASS).  They each get data 
from the same sensors, and they each send data to the same effectors.  The 
effectors have a means to throw away data from a computer if the value 
disagrees with the data from the others. The 4 computers simply send a 
synchronization message to each other periodically.   If a computer fails to 
send the message at the appropriate time (with a little leeway), they tell the 
crew, but keep going.  The crew can turn the power off a computer if they 
decide to.  There is more to the synchronization, but that is the concise 
version.  The software on all 4 of these computers is identical, and contains 
a priority-based pre-emptive executive.  
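
The periodic synchronization check can be pictured with a short Python
sketch.  It is deliberately simplified and makes its own assumptions:
arrival times are plain numbers on a shared time scale, a missing
message is represented by None, and the computer labels are made up.
It only flags which computers the crew would be told about.

```python
from typing import Dict, List, Optional


def late_or_missing(expected_time: float,
                    arrivals: Dict[str, Optional[float]],
                    leeway: float) -> List[str]:
    """Names of computers whose sync message was absent or outside the window."""
    return sorted(
        name for name, arrived in arrivals.items()
        if arrived is None or abs(arrived - expected_time) > leeway
    )
```

For instance, with an expected time of 100.0 and a leeway of 0.05, a
message arriving at 100.02 is accepted, while one at 100.5 or one that
never arrives is reported.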

The 5th computer is the Backup Flight System (also an AP-101).  It can only 
take control if a crew member presses a button (this has not happened to date, 
except during simulations).  It has software developed "independently".  The 
quotes are there because the algorithms within the guidance, navigation 
and control software are the same for both systems, so there could be common 
errors.  The operating system on this computer is a cyclic executive 
(i.e. not priority-based pre-emptive tasking).

The Shuttle is completely digital, by the way.  There is no analog backup.  
The 5 computers get their data from the same types of sensors, and use the 
same effectors.

>However, there is one added issue to be addressed: common-mode failure in
>the software.  A classic solution is N-version programming, where two or
>three completely independent and isolated teams develop the software for
>the digital computers. The theory of this is that the teams, being
>isolated, will not make the same mistakes, so they can cross-check each
>other, both during system integration, and operationally.  

>It's a pretty good theory, but falls down if for instance the control law
>requirements are not correct.  The Swedes lost a prototype fighter
>aircraft at the Paris Air Show to just such a problem a few years ago. 
>Fortunately, nobody was hurt, although the airplane was destroyed.

>My recollection is that NASA used two teams, so two of three computers
>will contain the same software.

As I mentioned above, 4 of the 5 have the same software; the 5th was developed 
by a different team (in fact, different companies).

>Anyway, one cannot expect the outputs of these slightly different programs
>to match to the bit, nor is it important in practice that they be that
>close, so the voting unit compares the absolute value of the algebraic
>difference to a threshold.  I would guess that the tolerance is no more
>than a few percent of full scale.

The voting of outputs is done at the actuators, not by the computers.

>>     In the ISS, where voting is required, two out of three computers
>> will have to agree, but based on identical data, and bit for bit
>> compares.  The Space Shuttle approach does provide more reliability
>> where the algorithms are not known to be stable, but is a maintenance
>> nightmare.  (All computers getting the same overflow is no help, and
>> the SS flight guidance software does go through about 20 different
>> flight regimes during landing.  At the boundary between some of those
>> modes, the flight control algorithms are known to be unstable.  So
>> that approach is not only appropriate to the shuttle, it seems to be
>> necessary.)

>One could wonder if ISS will really use bit comparison, because they too
>may wish to have multiple versions, for exactly the same reasons.

The ISS software is not considered to be of the same criticality as the Space 
Shuttle software, since problems can not happen nearly as fast (one gets 
extremely bored watching a simulation of the Space Station maneuvering).  
There is no backup software.

Roger Racine


* Re: Space Station S/W in Ada -- No Tasking?
  1998-05-08  0:00     ` Roger Racine
@ 1998-05-08  0:00       ` Joe Gwinn
  0 siblings, 0 replies; 14+ messages in thread
From: Joe Gwinn @ 1998-05-08  0:00 UTC (permalink / raw)



It appears that Roger Racine has more recent and detailed data than I do;
I am reporting on my recollection of a talk by some NASA people many years
ago.  I would not be in the least surprised if the control system had been
upgraded since then, either.

Joe Gwinn


In article <rracine.2.000E0315@draper.com>, rracine@draper.com (Roger
Racine) wrote:

> In article <gwinn-0705982150240001@d195.dial-5.cmb.ma.ultra.net> gwinn@ma.ultranet.com (Joe Gwinn) writes:
> 
> >In article <EACHUS.98May6171227@spectre.mitre.org>,
> >eachus@spectre.mitre.org (Robert I. Eachus) wrote:
> 
> >> In article <354dadfd.2883074@news.mindspring.com> munck@Mill-Creek-Systems.com (Robert Munck) writes:
> >> 
> >>   >  "To make troubleshooting easier, the software that runs
> >>   >  the trio of computer networks aboard the space station is
> >>   >  written to operate in synchronous, or serial, fashion 
> >>   >  rather than the faster but more complex asynchronous."
> >> 
> >>     While the rest of the discussion on this sounds correct, I think
> >> that what was being implicitly rejected here is the way that the Space
> >> Shuttle computers do voting.  In the Space Shuttle, voting is based on
> >> whether three different computer systems come up with about the same
> >> answer at about the same time.  If no two agree, the results of a
> >> fourth are arbitrarily accepted.  (Is that both right and concise?)
> >> Since the computers do not get their data synchronously, the actual
> >> data values, and the control inputs computed from them, will be
> >> slightly different.
> 
> >This is my understanding as well.  Three of the computers are identical,
> >IBM 4pi units if I recall, while the fourth unit is hardwired analog, the
> >theory being to protect against common-mode hardware failures.
> 
> This is really getting off the subject of Ada, but it is difficult to allow 
> misconceptions to propagate.  There are 5 main computers (IBM 4pi AP-101s) on 
> the Shuttle.  Four work together during critical flight phases (ascent and 
> entry).  This is the Primary Avionics SubSystem (PASS).  They each get data 
> from the same sensors, and they each send data to the same effectors.  The 
> effectors have a means to throw away data from a computer if the value 
> disagrees with the data from the others. The 4 computers simply send a 
> synchronization message to each other periodically.   If a computer fails to 
> send the message at the appropriate time (with a little leeway), they tell the
> crew, but keep going.  The crew can turn the power off a computer if they
> decide to.  There is more to the synchronization, but that is the concise
> version.  The software on all 4 of these computers is identical, and contains 
> a priority-based pre-emptive executive.  
> 
> The 5th computer is the Backup Flight System (also an AP-101).  It can only 
> take control if a crew member presses a button (this has not happened to date,
> except during simulations).  It has software developed "independently".  The
> quotes are there because the algorithms within the guidance, navigation 
> and control software are the same for both systems, so there could be common 
> errors.  The operating system on this computer is a cyclic executive 
> (i.e. not priority-based pre-emptive tasking).
> 
> The Shuttle is completely digital, by the way.  There is no analog backup.  
> The 5 computers get their data from the same types of sensors, and use the 
> same effectors.
> 
> >However, there is one added issue to be addressed: common-mode failure in
> >the software.  A classic solution is N-version programming, where two or
> >three completely independent and isolated teams develop the software for
> >the digital computers. The theory of this is that the teams, being
> >isolated, will not make the same mistakes, so they can cross-check each
> >other, both during system integration, and operationally.  
> 
> >It's a pretty good theory, but falls down if for instance the control law
> >requirements are not correct.  The Swedes lost a prototype fighter
> >aircraft at the Paris Air Show to just such a problem a few years ago. 
> >Fortunately, nobody was hurt, although the airplane was destroyed.
> 
> >My recollection is that NASA used two teams, so two of three computers
> >will contain the same software.
> 
> As I mentioned above, 4 of the 5 have the same software; the 5th was developed
> by a different team (in fact, different companies).
> 
> >Anyway, one cannot expect the outputs of these slightly different programs
> >to match to the bit, nor is it important in practice that they be that
> >close, so the voting unit compares the absolute value of the algebraic
> >difference to a threshold.  I would guess that the tolerance is no more
> >than a few percent of full scale.
> 
> The voting of outputs is done at the actuators, not by the computers.
> 
> >>     In the ISS, where voting is required, two out of three computers
> >> will have to agree, but based on identical data, and bit for bit
> >> compares.  The Space Shuttle approach does provide more reliability
> >> where the algorithms are not known to be stable, but is a maintenance
> >> nightmare.  (All computers getting the same overflow is no help, and
> >> the SS flight guidance software does go through about 20 different
> >> flight regimes during landing.  At the boundary between some of those
> >> modes, the flight control algorithms are known to be unstable.  So
> >> that approach is not only appropriate to the shuttle, it seems to be
> >> necessary.)
> 
> >One could wonder if ISS will really use bit comparison, because they too
> >may wish to have multiple versions, for exactly the same reasons.
> 
> The ISS software is not considered to be of the same criticality as the Space 
> Shuttle software, since problems can not happen nearly as fast (one gets 
> extremely bored watching a simulation of the Space Station maneuvering).  
> There is no backup software.
> 
> Roger Racine
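
The two comparison styles discussed above (comparing the absolute
difference of outputs to a threshold, versus bit-for-bit 2-of-3
agreement) can be sketched as follows.  This is only an illustration in
Python, not the Shuttle or ISS logic: the function names, the tolerance
value, and the "disagrees with more than half of the others" rule are
all assumptions made for the example.

```python
def outvoted_channels(outputs, tolerance):
    """Threshold voting (hypothetical rule): return the indices of channels
    whose output differs by more than `tolerance` from more than half of
    the other channels."""
    bad = []
    for i, a in enumerate(outputs):
        others = [b for j, b in enumerate(outputs) if j != i]
        disagreements = sum(1 for b in others if abs(a - b) > tolerance)
        if disagreements * 2 > len(others):
            bad.append(i)
    return bad


def two_of_three_exact(outputs):
    """Bit-for-bit 2-of-3 voting: return the value that at least two of the
    three channels produced exactly, or None if no two agree."""
    a, b, c = outputs
    if a == b or a == c:
        return a
    if b == c:
        return b
    return None
```

With a tolerance of 0.5, outvoted_channels([10.0, 10.2, 10.1, 14.0], 0.5)
flags only the last channel, while two_of_three_exact applied to the
slightly different values 10.0, 10.2, 10.1 finds no agreement at all.
That is the practical difference between the two approaches: exact
comparison only works when every computer starts from identical data.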





* Re: Space Station S/W in Ada -- No Tasking?
  1998-05-07  0:00   ` Joe Gwinn
  1998-05-08  0:00     ` Roger Racine
@ 1998-05-08  0:00     ` Dale Pontius
  1 sibling, 0 replies; 14+ messages in thread
From: Dale Pontius @ 1998-05-08  0:00 UTC (permalink / raw)



In article <gwinn-0705982150240001@d195.dial-5.cmb.ma.ultra.net>,
        gwinn@ma.ultranet.com (Joe Gwinn) writes:
> In article <EACHUS.98May6171227@spectre.mitre.org>,
>>     While the rest of the discussion on this sounds correct, I think
>> that what was being implicitly rejected here is the way that the Space
>> Shuttle computers do voting.  In the Space Shuttle, voting is based on
>> whether three different computer systems come up with about the same
>> answer at about the same time.  If no two agree, the results of a
>> fourth are arbitrarily accepted.  (Is that both right and concise?)
>> Since the computers do not get their data synchronously, the actual
>> data values, and the control inputs computed from them, will be
>> slightly different.
> This is my understanding as well.  Three of the computers are identical,
> IBM 4pi units if I recall, while the fourth unit is hardwired analog, the
> theory being to protect against common-mode hardware failures.
> However, there is one added issue to be addressed: common-mode failure in
> the software.  A classic solution is N-version programming, where two or
> three completely independent and isolated teams develop the software for
> the digital computers. The theory of this is that the teams, being
> isolated, will not make the same mistakes, so they can cross-check each
> other, both during system integration, and operationally.
>
IIRC, there are five IDENTICAL computers on the shuttle. Four of them
are running the same software, in sync. Three of them are continually
voting to deliver results. If there is a non-unanimous vote, the loser
is taken offline and the fourth computer is made active. If there is
another non-unanimous vote, the whole cluster is brought down and the
fifth computer is made active. The fifth computer hardware is identical,
but the software was programmed by an entirely different group of
people in a different programming language. This is an attempt to
avoid 'deeply systemic' software errors. (The first four were
programmed with a language called HAL/S, I believe.)

This is long-ago hearsay, from listening on an internal IBM newsgroup to
one of the people who was on the hot seat when Columbia's first liftoff
was scuttled. Of course he's since probably been sold to Loral and then
Lockheed Martin with the rest of that division.
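
The switchover scheme as recalled above (three computers voting, a
fourth held as a spare, a fifth as the independently programmed backup)
can be sketched as a small state machine.  This is a hypothetical Python
illustration of that recollection only.  Elsewhere in this thread it is
stated that output voting is done at the actuators and that the backup
takes over only on a crew action, so the names (c1..c5, surviving_set)
and the exact-equality vote are assumptions, not the flight design.

```python
from collections import Counter


def surviving_set(rounds):
    """Each element of `rounds` maps a computer name to its output for one
    cycle.  Returns the list of computers in control after all rounds."""
    active = ["c1", "c2", "c3"]   # the three voting computers (assumed names)
    spare = "c4"                  # fourth computer, same software, standing by
    for outputs in rounds:
        values = [outputs[c] for c in active]
        majority, count = Counter(values).most_common(1)[0]
        if count == len(active):
            continue              # unanimous vote, nothing to do
        if spare is None:
            return ["c5"]         # second disagreement: hand over to the backup
        # First disagreement: swap the loser for the spare.  If all three
        # differ, the choice of "loser" here is arbitrary.
        loser = next(c for c in active if outputs[c] != majority)
        active[active.index(loser)] = spare
        spare = None
    return active
```

A single disagreement by c2 leaves c1, c4 and c3 in control; a second
disagreement in a later cycle returns only c5.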

Dale Pontius
(NOT speaking for IBM)





* Re: Space Station S/W in Ada -- No Tasking?
  1998-05-06  0:00 ` Robert I. Eachus
  1998-05-07  0:00   ` Joe Gwinn
@ 1998-05-08  0:00   ` Chris Warwick
  1 sibling, 0 replies; 14+ messages in thread
From: Chris Warwick @ 1998-05-08  0:00 UTC (permalink / raw)



The design for the chunk of flight software that I saw certainly had Ada 
tasks... So, I presume there is no restriction to prevent the use of Ada 
tasks... The problem we had was that the Alsys Ada83 compiler was taking too
long for a task context switch, and thus we were unable to determine the
response time for an interrupt.

The other struggle we had was that, even though we were trying to use static
memory definitions, the compiler still insisted on pre-loading memory as part
of its startup operation. Thus the code was taking so long to start that the
watchdog timer would keep restarting the processor, i.e., our keep-alive
interrupt handler was never getting started.

This is item number 2 in my list of why I hate some Ada83 compilers. Item 
number 1 was with the Alsys DOS compiler that thought it was reasonable for 
DOS interrupts to halt all Ada processing in all tasks. It has been pointed 
out to me that this is 100% compliant with the LRM, and, to use Mr. Dewar's 
words, makes the compiler 100% useless...
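
The watchdog problem described above (startup takes longer than the
watchdog timeout, so the keep-alive handler never gets installed) can be
illustrated with a toy simulation.  This is a sketch in Python under
assumed numbers; simulate_boot and its millisecond parameters are
hypothetical and do not come from the actual flight hardware or the
Alsys runtime.

```python
def simulate_boot(startup_ms, watchdog_ms, horizon_ms):
    """Step a 1 ms clock for `horizon_ms` milliseconds.

    The processor needs `startup_ms` of uninterrupted run time before the
    keep-alive handler is installed.  Until then nothing services the
    watchdog, which resets the processor every `watchdog_ms`.

    Returns (number_of_resets, handler_started)."""
    resets = 0
    since_reset = 0
    for _ in range(horizon_ms):
        since_reset += 1
        if since_reset >= startup_ms:
            return resets, True    # startup finished, handler now running
        if since_reset >= watchdog_ms:
            resets += 1            # watchdog fires, startup begins again
            since_reset = 0
    return resets, False
```

With a 300 ms startup and a 200 ms watchdog, one simulated second gives
five resets and the handler never starts; with a 150 ms startup it
starts on the first attempt with no resets.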





end of thread, other threads:[~1998-05-12  0:00 UTC | newest]

Thread overview: 14+ messages
1998-05-03  0:00 Space Station S/W in Ada -- No Tasking? Robert Munck
1998-05-03  0:00 ` Robert Dewar
1998-05-07  0:00   ` JP Thornley
1998-05-05  0:00 ` LarryButts
1998-05-05  0:00 ` Roger Racine
1998-05-05  0:00   ` Robert Munck
1998-05-12  0:00     ` Carla Taylor
1998-05-06  0:00   ` William D. Ghrist
1998-05-06  0:00 ` Robert I. Eachus
1998-05-07  0:00   ` Joe Gwinn
1998-05-08  0:00     ` Roger Racine
1998-05-08  0:00       ` Joe Gwinn
1998-05-08  0:00     ` Dale Pontius
1998-05-08  0:00   ` Chris Warwick
