* Space Station S/W in Ada -- No Tasking? @ 1998-05-03 0:00 Robert Munck 1998-05-03 0:00 ` Robert Dewar ` (3 more replies) 0 siblings, 4 replies; 14+ messages in thread From: Robert Munck @ 1998-05-03 0:00 UTC (permalink / raw) A paragraph in Popular Science notes that the software for the International Space Station is being written in Ada, about 3M lines worth. However, it goes on to say: "To make troubleshooting easier, the software that runs the trio of computer networks aboard the space station is written to operate in synchronous, or serial, fashion rather than the faster but more complex asynchronous." Does this mean that they're not using tasking, but rather the old "crystal clock" architecture where you organize your processing into major and minor cycles, disable interrupts, and poll for events "just in time" at various places in the cycles? In my experience, large systems built that way tended to be complete disasters: nightmares to debug ("troubleshoot!"), horror shows to maintain and enhance. They often had interdependencies that were handled purely by the positions of pieces of code in the cycles and the processing times of the other (unrelated) functions between those positions. Adding a tiny fix in one place could break code half a major cycle and 1 million lines of code away from it. Could we possibly be using this approach for a life-critical system that will run in an incompletely-understood environment, be subject to extensive and rapid change, and have a lifetime of decades? Bob Munck Mill Creek Systems LC ^ permalink raw reply [flat|nested] 14+ messages in thread
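For readers who have not met the "major and minor cycles" architecture described in this post, here is a minimal sketch of a cyclic executive. Everything in it is an assumption made for illustration: the job names, the four-frame table, and the use of a frame counter in place of a hardware clock tick. Nothing here is taken from the Space Station software.

```python
# Minimal sketch of a cyclic executive ("major/minor cycle" scheduling).
# All names, rates, and the frame table are hypothetical.

def make_job(name, log):
    """Return a job that records the frame number and its own name when run."""
    def job(frame):
        log.append((frame, name))
    return job


def run_cyclic_executive(frame_table, major_cycles):
    """Run every minor frame in order, for the given number of major cycles.

    frame_table: list of minor frames, each a list of jobs (callables).
    A real executive would wait for a clock tick at each frame boundary;
    here the frame counter stands in for the clock.
    """
    frame = 0
    for _ in range(major_cycles):
        for jobs in frame_table:      # one minor cycle
            for job in jobs:          # fixed order, each job runs to completion
                job(frame)
            frame += 1


log = []
poll_sensors = make_job("poll_sensors", log)
control_law = make_job("control_law", log)
telemetry = make_job("telemetry", log)

# Four minor frames per major cycle: polling runs in every frame,
# the control law in every second frame, telemetry once per major cycle.
FRAME_TABLE = [
    [poll_sensors, control_law],
    [poll_sensors],
    [poll_sensors, control_law],
    [poll_sensors, telemetry],
]

run_cyclic_executive(FRAME_TABLE, major_cycles=2)
```

The only thing that sequences the jobs is their position in the table, so adding a job to one frame shifts the timing of everything after it in that frame. That is the position-dependent coupling the post complains about.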
* Re: Space Station S/W in Ada -- No Tasking?
From: Robert Dewar @ 1998-05-03 0:00 UTC

Bob says

<<Could we possibly be using this approach for a life-critical system that will run in an incompletely-understood environment, be subject to extensive and rapid change, and have a lifetime of decades?>>

This is of course an old argument, and proponents of synchronous cyclic scheduling will be glad to produce similar rhetoric denouncing the use of the asynchronous approach. Of course one has to decide this on a case-by-case basis, but there are plenty of examples of disasters and successes created using both approaches.

At least some parts of the space station software definitely use cyclic scheduling (I remember this because I implemented the necessary CIFO primitives to support cyclic scheduling for Alsys, who was supplying Ada compilers for the Space Station effort).
* Re: Space Station S/W in Ada -- No Tasking?
From: JP Thornley @ 1998-05-07 0:00 UTC

It might be worth mentioning, in this discussion, the existence of the Ravenscar Profile - a subset of Ada tasking designed explicitly to meet the certification requirements of high integrity Ada programs.

The profile was defined at the 8th International Real-Time Ada Workshop in April 1997 (held at the Ravenscar Hotel, North Yorkshire, UK). Details are in the proceedings, published as the September/October issue of Ada Letters (Volume XVII Number 5).

At least one vendor has announced plans to provide a 'certifiable' run-time that supports this subset.

Use of the profile is recommended in the HRG Guidance document (an ISO technical report on the use of Ada in high integrity software, currently in preparation).

Phil Thornley.

--
JP Thornley
EMail jpt@diphi.demon.co.uk
      phil.thornley@acm.org
* Re: Space Station S/W in Ada -- No Tasking?
From: LarryButts @ 1998-05-05 0:00 UTC

I believe you are 100% correct. However, on the trainer for the ISS we are using Ada, about 1.5M lines worth, and we are using tasking, and a lot of it. We are using rate monotonic scheduling for our real-time hard-deadline simulations. Ada tasking works great, and this thing is integrating a whole lot better than if we were using an old cyclic exec.
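Rate monotonic scheduling gives the shortest-period task the highest priority. A common first check is the classic utilization bound: n independent periodic tasks with deadlines equal to their periods are schedulable if total utilization does not exceed n(2^(1/n) - 1). The sketch below applies that test. The task sets are made-up numbers, not figures from the ISS trainer, and the test is sufficient but not necessary, so a task set that fails it may still be schedulable.

```python
# Sketch of the rate monotonic utilization-bound test.
# The task sets below are hypothetical.

def rm_utilization_bound(n):
    """Least upper bound on utilization for n tasks under rate monotonic priorities."""
    return n * (2 ** (1.0 / n) - 1)


def rm_sufficient_test(tasks):
    """tasks: list of (worst_case_execution_time, period) pairs in the same time unit.

    Returns (utilization, passes), where passes is True when the task set
    is guaranteed schedulable by the utilization bound.
    """
    utilization = sum(c / t for c, t in tasks)
    return utilization, utilization <= rm_utilization_bound(len(tasks))


light_load = [(1, 4), (1, 5), (2, 10)]   # utilization 0.65
heavy_load = [(2, 4), (2, 5), (2, 10)]   # utilization 1.10

light_u, light_ok = rm_sufficient_test(light_load)
heavy_u, heavy_ok = rm_sufficient_test(heavy_load)
```

With three tasks the bound is roughly 0.78, so the first set passes and the second, which asks for more than 100% of the processor, does not.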
* Re: Space Station S/W in Ada -- No Tasking?
From: Roger Racine @ 1998-05-05 0:00 UTC

In article <354dadfd.2883074@news.mindspring.com> munck@Mill-Creek-Systems.com (Robert Munck) writes:

>A paragraph in Popular Science notes that the software for
>the International Space Station is being written in Ada,
>about 3M lines worth. However, it goes on to say:

> "To make troubleshooting easier, the software that runs
> the trio of computer networks aboard the space station is
> written to operate in synchronous, or serial, fashion
> rather than the faster but more complex asynchronous."

>Does this mean that they're not using tasking, but rather the
>old "crystal clock" architecture where you organize your
>processing into major and minor cycles, disable interrupts, and
>poll for events "just in time" at various places in the cycles?

>In my experience, large systems built that way tended to be
>complete disasters: nightmares to debug ("troubleshoot!"),
>horror shows to maintain and enhance. They often had
>interdependencies that were handled purely by the positions
>of pieces of code in the cycles and the processing times of
>the other (unrelated) functions between those positions.
>Adding a tiny fix in one place could break code half a major
>cycle and 1 million lines of code away from it.

>Could we possibly be using this approach for a life-critical
>system that will run in an incompletely-understood
>environment, be subject to extensive and rapid change, and
>have a lifetime of decades?

>Bob Munck
>Mill Creek Systems LC

The article is misleading; there is tasking being used for the ISS. I was one of the people who convinced the Boeing management to allow it, and helped develop the tasking structure.

Robert Dewar pointed out the development of the CIFO constructs for tasking within the Alsys compiler. This was not used. It was going to be used within the Space Station Freedom program, but was not allowed to be used within the re-designed computers in the International Space Station software (I have forgotten the reason).
* Re: Space Station S/W in Ada -- No Tasking?
From: Robert Munck @ 1998-05-05 0:00 UTC

On Tue, 5 May 1998 15:21:41 GMT, rracine@draper.com (Roger Racine) wrote:

> ... there is tasking being used for the ISS. I was one
>of the people who convinced the Boeing management to allow it

You did good. Robert Dewar's experience may be different, but in 32-odd years in the business and a great deal of DoD, NASA, and ESA involvement, I've never seen a large cyclic-executive-architecture system that was in any way successful. The trouble is that cyclic-exec projects are easier for bad managers to manage. They don't have to understand tough concepts like deadlock, critical sections, rate monotonic scheduling, etc.

Boeing management had to be convinced? I hesitate to ask, but how is the 777 avionics s/w structured?

Bob Munck
Mill Creek Systems LC
* Re: Space Station S/W in Ada -- No Tasking?
From: Carla Taylor @ 1998-05-12 0:00 UTC

> Boeing management had to be convinced? I hesitate to ask,
> but how is the 777 avionics s/w structured?
>
> Bob Munck
> Mill Creek Systems LC

In 777, tasking was not allowed. Each "task" is written as a main procedure, and proprietary hardware/software is used to schedule the tasks, guaranteeing that each task will complete in its allotted time or have the processor forcibly taken away from it. I don't know if this design was a Boeing decision or not.

Kevin Tucker
* Re: Space Station S/W in Ada -- No Tasking?
From: William D. Ghrist @ 1998-05-06 0:00 UTC

Roger Racine wrote:
>
> In article <354dadfd.2883074@news.mindspring.com> munck@Mill-Creek-Systems.com (Robert Munck) writes:
>
> >A paragraph in Popular Science notes that the software for
> >the International Space Station is being written in Ada,
> >about 3M lines worth. However, it goes on to say:
>
> > "To make troubleshooting easier, the software that runs
> > the trio of computer networks aboard the space station is
> > written to operate in synchronous, or serial, fashion
> > rather than the faster but more complex asynchronous."
>
> >Does this mean that they're not using tasking, but rather the
> >old "crystal clock" architecture where you organize your
> >processing into major and minor cycles, disable interrupts, and
> >poll for events "just in time" at various places in the cycles?
>
> >In my experience, large systems built that way tended to be
> >complete disasters: nightmares to debug ("troubleshoot!"),
> >horror shows to maintain and enhance. They often had
> >interdependencies that were handled purely by the positions
> >of pieces of code in the cycles and the processing times of
> >the other (unrelated) functions between those positions.
> >Adding a tiny fix in one place could break code half a major
> >cycle and 1 million lines of code away from it.
>
> >Could we possibly be using this approach for a life-critical
> >system that will run in an incompletely-understood
> >environment, be subject to extensive and rapid change, and
> >have a lifetime of decades?
>
> >Bob Munck
> >Mill Creek Systems LC
>
> The article is misleading; there is tasking being used for the ISS. I was one
> of the people who convinced the Boeing management to allow it, and helped
> develop the tasking structure.
>
> Robert Dewar pointed out the development of the CIFO constructs for tasking
> within the Alsys compiler. This was not used. It was going to be used within
> the Space Station Freedom program, but was not allowed to be used within the
> re-designed computers in the International Space Station software (I have
> forgotten the reason).

I'm not familiar with the term "'crystal clock' architecture" and I also don't know what is in the Space Station software, but I would like to point out that using a non-tasking, non-interrupt architecture does not necessarily result in the complex "major and minor cycles" structure that is described. I agree that such an approach is likely to be a problem if you are attempting to break up the main flow of processing with explicit polling for events at some faster rate. What this is really doing is attempting to emulate multi-tasking, but it results in very tight coupling of functions that should be unrelated.

There is another approach, however -- that is to replace multi-tasking with multi-processing. It is typical in process control and protection applications that most of the main application functions of a given processing subsystem can be done in a single loop repeating at a fixed interval. Functions that require faster response, such as input filtering and serial communications, can be done by additional ("slave") processors, which then exchange data with the main processor via access-controlled structures in shared memory. Different main processing subsystems can be networked together as well, and they can operate at different cycle times.

We have been using this approach successfully for many years in the area of nuclear safety systems. The main benefits of this approach are that it simplifies the task of software verification (the verifier doesn't have to analyze what might happen if the software is interrupted at any point in the program) and simplifies the analysis of worst-case response times. The main drawback is that it is more costly in terms of hardware. But for low-volume, large-scope applications where the highest level of software integrity is required, the benefits for software development and verification can outweigh the additional hardware costs. And, when presenting the safety case for licensing, it is simply easier to demonstrate that the exact response of the system is clearly known for all circumstances.

One significant example of the success of this approach is the Sizewell B plant in the U.K. The entire primary protection system and the reactor control system were implemented in this manner. This system has been operating quite successfully for over three years now. There have been no "disasters", no "nightmares to debug", no "horror shows to maintain and enhance".

As for nightmares to debug, some of the worst in my experience (when I worked on non-nuclear systems) have been related to the use of interrupts and multi-tasking operating systems. I will concede, however, that operating system design has advanced considerably since those days.

Certainly, this approach is not suitable in many situations, but properly applied, it can prevent rather than cause software "nightmares".

Regards,
Bill Ghrist
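The point about worst-case response times can be made concrete with a small sketch. It rests on assumptions that are not stated in the post: inputs are sampled once at the start of each cycle, outputs are written when the loop body finishes, and the body always fits within its cycle. Under those assumptions an input that changes just after sampling waits nearly one full cycle to be seen, and then one loop execution to take effect. The function name and the numbers are hypothetical.

```python
# Sketch: latency bound for a single fixed-interval loop with no interrupts.
# Assumes sampling at the start of each cycle and output at the end of the
# loop body; both are illustrative assumptions.

def worst_case_response(cycle_time, loop_execution_time):
    """Upper bound on input-to-output latency, in the same unit as the arguments."""
    if loop_execution_time > cycle_time:
        raise ValueError("loop body does not fit in its cycle")
    # Worst case: the input changes just after being sampled, so it waits
    # (almost) one full cycle, then needs one loop execution to be acted on.
    return cycle_time + loop_execution_time


# Hypothetical figures: a 50 ms cycle whose body takes at most 30 ms.
bound_ms = worst_case_response(50, 30)
```

Because nothing can preempt the loop, the bound depends on two measured numbers only, which is the kind of simple argument a safety case favors.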
* Re: Space Station S/W in Ada -- No Tasking? 1998-05-03 0:00 Space Station S/W in Ada -- No Tasking? Robert Munck ` (2 preceding siblings ...) 1998-05-05 0:00 ` Roger Racine @ 1998-05-06 0:00 ` Robert I. Eachus 1998-05-07 0:00 ` Joe Gwinn 1998-05-08 0:00 ` Chris Warwick 3 siblings, 2 replies; 14+ messages in thread From: Robert I. Eachus @ 1998-05-06 0:00 UTC (permalink / raw) In article <354dadfd.2883074@news.mindspring.com> munck@Mill-Creek-Systems.com (Robert Munck) writes: > "To make troubleshooting easier, the software that runs > the trio of computer networks aboard the space station is > written to operate in synchronous, or serial, fashion > rather than the faster but more complex asynchronous." While the rest of the discussion on this sounds correct, I think that what was being implicitly rejected here is the way that the Space Shuttle computers do voting. In the Space Shuttle, voting is based on whether three different computer systems come up with about the same answer at about the same time. If no two agree, the results of a fourth are arbitrarily accepted. (Is that both right and concise?) Since the computers do not get their data synchronously, the actual data values, and the control inputs computed from them, will be slightly different. In the ISS, where voting is required, two out of three computers will have to agree, but based on identical data, and bit for bit compares. The Space Shuttle approach does provide more reliability where the algorithms are not known to be stable, but is a maintenance nightmare. (All computers getting the same overflow is no help, and the SS flight guidance software does go through about 20 different flight regimes during landing. At the boundary between some of those modes, the flight control algorithms are known to be unstable. So that approach is not only appropriate to the shuttle, it seems to be necessary.) -- Robert I. 
Eachus with Standard_Disclaimer; use Standard_Disclaimer; function Message (Text: in Clever_Ideas) return Better_Ideas is... ^ permalink raw reply [flat|nested] 14+ messages in thread
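The "two out of three, bit for bit" scheme described for the ISS can be sketched as an exact majority vote over the raw bytes of each channel's output. This is only an illustration of the idea: the function names and the choice to encode values as big-endian doubles are assumptions, not details of the actual flight software.

```python
# Sketch of exact ("bit for bit") majority voting across redundant channels.
# Names and the float encoding are hypothetical.
import struct
from collections import Counter


def to_bits(value):
    """Encode a float as its 8-byte IEEE 754 representation."""
    return struct.pack(">d", value)


def vote_exact(outputs):
    """Return the output produced by a strict majority of channels, or None."""
    winner, count = Counter(outputs).most_common(1)[0]
    return winner if count * 2 > len(outputs) else None


# Two channels agree exactly; the third differs in the low-order bits.
agreed = vote_exact([to_bits(1.5), to_bits(1.5), to_bits(1.5000001)])

# No two channels agree, so there is no majority.
no_majority = vote_exact([to_bits(1.0), to_bits(2.0), to_bits(3.0)])
```

An exact comparison only works when every channel computes from identical inputs, which is why the post ties it to synchronous operation.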
* Re: Space Station S/W in Ada -- No Tasking? 1998-05-06 0:00 ` Robert I. Eachus @ 1998-05-07 0:00 ` Joe Gwinn 1998-05-08 0:00 ` Roger Racine 1998-05-08 0:00 ` Dale Pontius 1998-05-08 0:00 ` Chris Warwick 1 sibling, 2 replies; 14+ messages in thread From: Joe Gwinn @ 1998-05-07 0:00 UTC (permalink / raw) In article <EACHUS.98May6171227@spectre.mitre.org>, eachus@spectre.mitre.org (Robert I. Eachus) wrote: > In article <354dadfd.2883074@news.mindspring.com> munck@Mill-Creek-Systems.com (Robert Munck) writes: > > > "To make troubleshooting easier, the software that runs > > the trio of computer networks aboard the space station is > > written to operate in synchronous, or serial, fashion > > rather than the faster but more complex asynchronous." > > While the rest of the discussion on this sounds correct, I think > that what was being implicitly rejected here is the way that the Space > Shuttle computers do voting. In the Space Shuttle, voting is based on > whether three different computer systems come up with about the same > answer at about the same time. If no two agree, the results of a > fourth are arbitrarily accepted. (Is that both right and concise?) > Since the computers do not get their data synchronously, the actual > data values, and the control inputs computed from them, will be > slightly different. This is my understanding as well. Three of the computers are identical, IBM 4pi units if I recall, while the fourth unit is hardwired analog, the theory being to protect against common-mode hardware failures. However, there is one added issue to be addressed: common-mode failure in the software. A classic solution is N-version programming, where two or three completely independent and isolated teams develop the software for the digital computers. The theory of this is that the teams, being isolated, will not make the same mistakes, so they can cross-check each other, both during system integration, and operationally. 
It's a pretty good theory, but falls down if for instance the control law requirements are not correct. The Swedes lost a prototype fighter aircraft at the Paris Air Show to just such a problem a few years ago. Fortunately, nobody was hurt, although the airplane was destroyed. My recollection is that NASA used two teams, so two of three computers will contain the same software. Anyway, one cannot expect the outputs of these slightly different programs to match to the bit, nor is it important in practice that they be that close, so the voting unit compares the absolute value of the algebraic difference to a threshold. I would guess that the tolerance is no more than a few percent of full scale. > In the ISS, where voting is required, two out of three computers > will have to agree, but based on identical data, and bit for bit > compares. The Space Shuttle approach does provide more reliability > where the algorithms are not known to be stable, but is a maintenance > nightmare. (All computers getting the same overflow is no help, and > the SS flight guidance software does go through about 20 different > flight regimes during landing. At the boundary between some of those > modes, the flight control algorithms are known to be unstable. So > that approach is not only appropriate to the shuttle, it seems to be > necessary.) One could wonder if ISS will really use bit comparison, because they too may wish to have multiple versions, for exactly the same reasons. Joe Gwinn ^ permalink raw reply [flat|nested] 14+ messages in thread
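The threshold comparison described above, where outputs are accepted if the absolute difference between them is within a tolerance, can be sketched as follows. The selection rule used here (take the middle of the agreeing values) and all the numbers are assumptions for illustration; the post only describes the comparison itself.

```python
# Sketch of tolerance-based voting between channels whose outputs are
# close but not bit-identical. Selection rule and values are hypothetical.

def agree(a, b, tolerance):
    """Two outputs agree when their absolute difference is within tolerance."""
    return abs(a - b) <= tolerance


def vote_with_tolerance(outputs, tolerance):
    """Return (selected_value, outlier_indices).

    A channel is an outlier if it agrees with no other channel. The selected
    value is the middle (upper-middle for an even count) of the agreeing
    outputs, or None if no two channels agree.
    """
    good, outliers = [], []
    for i, value in enumerate(outputs):
        others = outputs[:i] + outputs[i + 1:]
        if any(agree(value, other, tolerance) for other in others):
            good.append(value)
        else:
            outliers.append(i)
    if not good:
        return None, outliers
    good.sort()
    return good[len(good) // 2], outliers


selected, rejected = vote_with_tolerance([10.0, 10.2, 14.0], tolerance=0.5)
none_selected, all_rejected = vote_with_tolerance([1.0, 5.0, 9.0], tolerance=0.5)
```

Unlike an exact vote, this tolerates channels that sample their inputs at slightly different times, at the cost of having to choose and justify a tolerance.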
* Re: Space Station S/W in Ada -- No Tasking?
From: Roger Racine @ 1998-05-08 0:00 UTC

In article <gwinn-0705982150240001@d195.dial-5.cmb.ma.ultra.net> gwinn@ma.ultranet.com (Joe Gwinn) writes:

>In article <EACHUS.98May6171227@spectre.mitre.org>,
>eachus@spectre.mitre.org (Robert I. Eachus) wrote:

>> In article <354dadfd.2883074@news.mindspring.com>
>munck@Mill-Creek-Systems.com (Robert Munck) writes:
>>
>> > "To make troubleshooting easier, the software that runs
>> > the trio of computer networks aboard the space station is
>> > written to operate in synchronous, or serial, fashion
>> > rather than the faster but more complex asynchronous."
>>
>> While the rest of the discussion on this sounds correct, I think
>> that what was being implicitly rejected here is the way that the Space
>> Shuttle computers do voting. In the Space Shuttle, voting is based on
>> whether three different computer systems come up with about the same
>> answer at about the same time. If no two agree, the results of a
>> fourth are arbitrarily accepted. (Is that both right and concise?)
>> Since the computers do not get their data synchronously, the actual
>> data values, and the control inputs computed from them, will be
>> slightly different.

>This is my understanding as well. Three of the computers are identical,
>IBM 4pi units if I recall, while the fourth unit is hardwired analog, the
>theory being to protect against common-mode hardware failures.

This is really getting off the subject of Ada, but it is difficult to allow misconceptions to propagate. There are 5 main computers (IBM 4pi AP-101s) on the Shuttle. Four work together during critical flight phases (ascent and entry). This is the Primary Avionics SubSystem (PASS). They each get data from the same sensors, and they each send data to the same effectors. The effectors have a means to throw away data from a computer if the value disagrees with the data from the others. The 4 computers simply send a synchronization message to each other periodically. If a computer fails to send the message at the appropriate time (with a little leeway), they tell the crew, but keep going. The crew can power off a computer if they decide to. There is more to the synchronization, but that is the concise version. The software on all 4 of these computers is identical, and contains a priority-based pre-emptive executive.

The 5th computer is the Backup Flight System (also an AP-101). It can only take control if a crew member presses a button (this has not happened to date, except during simulations). It has software developed "independently". The quotes are there because the algorithms within the guidance, navigation and control software are the same for both systems, so there could be common errors. The operating system on this computer is a cyclic executive (i.e. not priority-based pre-emptive tasking).

The Shuttle is completely digital, by the way. There is no analog backup. The 5 computers get their data from the same types of sensors, and use the same effectors.

>However, there is one added issue to be addressed: common-mode failure in
>the software. A classic solution is N-version programming, where two or
>three completely independent and isolated teams develop the software for
>the digital computers. The theory of this is that the teams, being
>isolated, will not make the same mistakes, so they can cross-check each
>other, both during system integration, and operationally.

>It's a pretty good theory, but falls down if for instance the control law
>requirements are not correct. The Swedes lost a prototype fighter
>aircraft at the Paris Air Show to just such a problem a few years ago.
>Fortunately, nobody was hurt, although the airplane was destroyed.

>My recollection is that NASA used two teams, so two of three computers
>will contain the same software.

As I mentioned above, 4 of the 5 have the same software; the 5th was developed by a different team (in fact, different companies).

>Anyway, one cannot expect the outputs of these slightly different programs
>to match to the bit, nor is it important in practice that they be that
>close, so the voting unit compares the absolute value of the algebraic
>difference to a threshold. I would guess that the tolerance is no more
>than a few percent of full scale.

The voting of outputs is done at the actuators, not by the computers.

>> In the ISS, where voting is required, two out of three computers
>> will have to agree, but based on identical data, and bit for bit
>> compares. The Space Shuttle approach does provide more reliability
>> where the algorithms are not known to be stable, but is a maintenance
>> nightmare. (All computers getting the same overflow is no help, and
>> the SS flight guidance software does go through about 20 different
>> flight regimes during landing. At the boundary between some of those
>> modes, the flight control algorithms are known to be unstable. So
>> that approach is not only appropriate to the shuttle, it seems to be
>> necessary.)

>One could wonder if ISS will really use bit comparison, because they too
>may wish to have multiple versions, for exactly the same reasons.

The ISS software is not considered to be of the same criticality as the Space Shuttle software, since problems cannot happen nearly as fast (one gets extremely bored watching a simulation of the Space Station maneuvering). There is no backup software.

Roger Racine
* Re: Space Station S/W in Ada -- No Tasking? 1998-05-08 0:00 ` Roger Racine @ 1998-05-08 0:00 ` Joe Gwinn 0 siblings, 0 replies; 14+ messages in thread From: Joe Gwinn @ 1998-05-08 0:00 UTC (permalink / raw) It appears that Roger Racine has more recent and detailed data than I do; I am reporting on my recollection of a talk by some NASA people many years ago. I would not be in the least surprised if the control system had been upgraded since then, either. Joe Gwinn In article <rracine.2.000E0315@draper.com>, rracine@draper.com (Roger Racine) wrote: > In article <gwinn-0705982150240001@d195.dial-5.cmb.ma.ultra.net> gwinn@ma.ultranet.com (Joe Gwinn) writes: > > >In article <EACHUS.98May6171227@spectre.mitre.org>, > >eachus@spectre.mitre.org (Robert I. Eachus) wrote: > > >> In article <354dadfd.2883074@news.mindspring.com> > >munck@Mill-Creek-Systems.com (Robert Munck) writes: > >> > >> > "To make troubleshooting easier, the software that runs > >> > the trio of computer networks aboard the space station is > >> > written to operate in synchronous, or serial, fashion > >> > rather than the faster but more complex asynchronous." > >> > >> While the rest of the discussion on this sounds correct, I think > >> that what was being implicitly rejected here is the way that the Space > >> Shuttle computers do voting. In the Space Shuttle, voting is based on > >> whether three different computer systems come up with about the same > >> answer at about the same time. If no two agree, the results of a > >> fourth are arbitrarily accepted. (Is that both right and concise?) > >> Since the computers do not get their data synchronously, the actual > >> data values, and the control inputs computed from them, will be > >> slightly different. > > >This is my understanding as well. Three of the computers are identical, > >IBM 4pi units if I recall, while the fourth unit is hardwired analog, the > >theory being to protect against common-mode hardware failures. 
> This is really getting off the subject of Ada, but it is difficult to allow
> misconceptions to propagate. There are 5 main computers (IBM 4pi AP-101s) on
> the Shuttle. Four work together during critical flight phases (ascent and
> entry). This is the Primary Avionics SubSystem (PASS). They each get data
> from the same sensors, and they each send data to the same effectors. The
> effectors have a means to throw away data from a computer if the value
> disagrees with the data from the others. The 4 computers simply send a
> synchronization message to each other periodically. If a computer fails to
> send the message at the appropriate time (with a little leeway), they tell the
> crew, but keep going. The crew can turn the power off on a computer if they
> decide to. There is more to the synchronization, but that is the concise
> version. The software on all 4 of these computers is identical, and contains
> a priority-based pre-emptive executive.
>
> The 5th computer is the Backup Flight System (also an AP-101). It can only
> take control if a crew member presses a button (this has not happened to date,
> except during simulations). It has software developed "independently". The
> quotes are there because the algorithms within the guidance, navigation
> and control software are the same for both systems, so there could be common
> errors. The operating system on this computer is a cyclic executive
> (i.e. not priority-based pre-emptive tasking).
>
> The Shuttle is completely digital, by the way. There is no analog backup.
> The 5 computers get their data from the same types of sensors, and use the
> same effectors.
>
> >However, there is one added issue to be addressed: common-mode failure in
> >the software. A classic solution is N-version programming, where two or
> >three completely independent and isolated teams develop the software for
> >the digital computers. The theory of this is that the teams, being
> >isolated, will not make the same mistakes, so they can cross-check each
> >other, both during system integration, and operationally.
>
> >It's a pretty good theory, but falls down if for instance the control law
> >requirements are not correct. The Swedes lost a prototype fighter
> >aircraft at the Paris Air Show to just such a problem a few years ago.
> >Fortunately, nobody was hurt, although the airplane was destroyed.
>
> >My recollection is that NASA used two teams, so two of three computers
> >will contain the same software.
>
> As I mentioned above, 4 of the 5 have the same software; the 5th was developed
> by a different team (in fact, different companies).
>
> >Anyway, one cannot expect the outputs of these slightly different programs
> >to match to the bit, nor is it important in practice that they be that
> >close, so the voting unit compares the absolute value of the algebraic
> >difference to a threshold. I would guess that the tolerance is no more
> >than a few percent of full scale.
>
> The voting of outputs is done at the actuators, not by the computers.
>
> >> In the ISS, where voting is required, two out of three computers
> >> will have to agree, but based on identical data, and bit for bit
> >> compares. The Space Shuttle approach does provide more reliability
> >> where the algorithms are not known to be stable, but is a maintenance
> >> nightmare. (All computers getting the same overflow is no help, and
> >> the SS flight guidance software does go through about 20 different
> >> flight regimes during landing. At the boundary between some of those
> >> modes, the flight control algorithms are known to be unstable. So
> >> that approach is not only appropriate to the shuttle, it seems to be
> >> necessary.)
>
> >One could wonder if ISS will really use bit comparison, because they too
> >may wish to have multiple versions, for exactly the same reasons.
>
> The ISS software is not considered to be of the same criticality as the Space
> Shuttle software, since problems cannot happen nearly as fast (one gets
> extremely bored watching a simulation of the Space Station maneuvering).
> There is no backup software.
>
> Roger Racine
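
The two voting styles contrasted in the quoted exchange above (comparing slightly different outputs against a threshold, versus bit-for-bit two-out-of-three agreement on identical data) can be sketched roughly as follows. This is only an illustration in Python, not Shuttle or ISS code: the function names, the sample values, the tolerance, and the choice of the median as the reference value are all assumptions made for the sketch.

```python
from collections import Counter
from statistics import median


def tolerance_vote(values, tolerance):
    """Threshold voting for channels whose outputs differ slightly.

    Selects the median (an assumption of this sketch) and reports the index
    of every channel whose absolute difference from it exceeds `tolerance`.
    """
    selected = median(values)
    outliers = [i for i, v in enumerate(values) if abs(v - selected) > tolerance]
    return selected, outliers


def exact_vote(words):
    """Bit-for-bit voting: return the value a strict majority agrees on, else None."""
    value, count = Counter(words).most_common(1)[0]
    return value if 2 * count > len(words) else None


# Hypothetical outputs from three channels that sampled their inputs at
# slightly different times: two are close, one is far off.
commands = [10.0, 10.2, 14.0]
print(tolerance_vote(commands, tolerance=0.5))  # third channel is flagged
print(exact_vote(commands))                     # no two values are identical
print(exact_vote([0x1F40, 0x1F40, 0x2F40]))     # identical inputs: majority exists
```

With unsynchronized inputs the exact comparison finds no majority even though two channels are healthy, which is the point made above about why bit-for-bit voting needs identical data.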
* Re: Space Station S/W in Ada -- No Tasking?
From: Dale Pontius @ 1998-05-08 0:00 UTC

In article <gwinn-0705982150240001@d195.dial-5.cmb.ma.ultra.net>,
gwinn@ma.ultranet.com (Joe Gwinn) writes:
> In article <EACHUS.98May6171227@spectre.mitre.org>,
>> While the rest of the discussion on this sounds correct, I think
>> that what was being implicitly rejected here is the way that the Space
>> Shuttle computers do voting. In the Space Shuttle, voting is based on
>> whether three different computer systems come up with about the same
>> answer at about the same time. If no two agree, the results of a
>> fourth are arbitrarily accepted. (Is that both right and concise?)
>> Since the computers do not get their data synchronously, the actual
>> data values, and the control inputs computed from them, will be
>> slightly different.
> This is my understanding as well. Three of the computers are identical,
> IBM 4pi units if I recall, while the fourth unit is hardwired analog, the
> theory being to protect against common-mode hardware failures.
> However, there is one added issue to be addressed: common-mode failure in
> the software. A classic solution is N-version programming, where two or
> three completely independent and isolated teams develop the software for
> the digital computers. The theory of this is that the teams, being
> isolated, will not make the same mistakes, so they can cross-check each
> other, both during system integration, and operationally.
>
IIRC, there are five IDENTICAL computers on the shuttle. Four of them
are running the same software, in sync. Three of them are continually
voting to deliver results. If there is a non-unanimous vote, the loser
is taken offline and the fourth computer is made active. If there is
another non-unanimous vote, the whole cluster is brought down and the
fifth computer is made active. The fifth computer hardware is identical,
but the software was programmed by an entirely different group of people
in a different programming language. This is an attempt to avoid 'deeply
systemic' software errors. (The first four were programmed with a
language called HAL/S, I believe.)

This is long-ago hearsay, from listening on an internal IBM newsgroup to
one of the people who was on the hot seat when Columbia's first liftoff
was scrubbed. Of course he's since probably been sold to Loral and then
Lockheed Martin with the rest of that division.

Dale Pontius
(NOT speaking for IBM)
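
The failover sequence recalled in the message above (three voters, a fourth promoted after the first disagreement, the fifth taking over after the second) can be sketched as a small state machine. This is only a model of that recollection, which the poster labels as hearsay and which differs from the account quoted earlier in the thread (voting at the actuators, backup engaged by the crew). It is not a description of the real Shuttle logic. The class name, the method name, and the computer labels are hypothetical.

```python
from collections import Counter


class RedundantSet:
    """Three voting computers, one in-sync spare, one independent backup."""

    def __init__(self, voters, spares, backup):
        self.voters = list(voters)    # computers currently voting (three assumed)
        self.spares = list(spares)    # running in sync, not yet voting
        self.backup = backup          # independently programmed computer
        self.on_backup = False

    def step(self, outputs):
        """Vote on one cycle of outputs (dict: computer name -> value)."""
        if self.on_backup:
            return outputs[self.backup]
        tally = Counter(outputs[name] for name in self.voters)
        winner, count = tally.most_common(1)[0]
        if count == len(self.voters):
            return winner  # unanimous vote, nothing to do
        if 2 * count > len(self.voters) and self.spares:
            # Majority with one dissenter and a spare still available:
            # take the loser offline and promote the spare in its place.
            loser = next(n for n in self.voters if outputs[n] != winner)
            self.voters[self.voters.index(loser)] = self.spares.pop(0)
            return winner
        # No spare left (second disagreement) or no majority at all:
        # bring the whole set down and hand over to the backup.
        self.on_backup = True
        return outputs[self.backup]


# Hypothetical labels for the five computers.
cluster = RedundantSet(["C1", "C2", "C3"], ["C4"], "C5")
print(cluster.step({"C1": 1, "C2": 1, "C3": 9, "C4": 1, "C5": 1}), cluster.voters)
```

After the first disagreement the dissenting computer is replaced by the spare; a later disagreement leaves no spare to promote, so control passes to the backup and stays there.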
* Re: Space Station S/W in Ada -- No Tasking?
From: Chris Warwick @ 1998-05-08 0:00 UTC

The design for the chunk of flight software that I saw certainly had Ada
tasks... So, I presume there is no restriction to prevent the use of Ada
tasks...

The problem we had was that the Alsys Ada83 compiler was taking too long
for a task context switch, and thus we were unable to determine the
response time for an interrupt.

The other struggle we had was that, although we were trying to use static
memory definitions, the compiler still insisted on pre-loading memory as
part of its startup operation. Thus the code was taking so long to start
that the watchdog timer would keep restarting the processor, i.e., our
keep-alive interrupt handler was never getting started. This is item
number 2 in my list of why I hate some Ada83 compilers.

Item number 1 was with the Alsys DOS compiler, which thought it was
reasonable for DOS interrupts to halt all Ada processing in all tasks. It
has been pointed out to me that this is 100% compliant with the LRM, and,
to use Mr. Dewar's words, makes the compiler 100% useless...
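
The watchdog problem described above comes down to simple timing: if nothing resets the watchdog until start-up has finished, and start-up takes longer than the watchdog timeout, the processor restarts every time and never reaches the keep-alive handler. A rough Python sketch follows. The millisecond figures are invented, and the idea of resetting the watchdog between initialization steps is a common workaround assumed here for illustration; the post does not say what was actually done.

```python
def survives_startup(init_steps_ms, watchdog_timeout_ms, kick_between_steps):
    """Return True if start-up completes before the watchdog fires.

    init_steps_ms: durations of consecutive initialization steps (hypothetical).
    kick_between_steps: whether the watchdog is reset after each step.
    """
    since_last_kick = 0
    for step_ms in init_steps_ms:
        since_last_kick += step_ms
        if since_last_kick >= watchdog_timeout_ms:
            return False  # watchdog fires and the processor restarts
        if kick_between_steps:
            since_last_kick = 0
    return True


# Hypothetical numbers: 1.2 s of memory pre-loading against a 1 s watchdog.
steps = [400, 400, 400]
print(survives_startup(steps, 1000, kick_between_steps=False))
print(survives_startup(steps, 1000, kick_between_steps=True))
```

Because every restart repeats the same initialization, a start-up that fails once in this model fails on every attempt, which matches the restart loop described in the post.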
Thread overview: 14+ messages
1998-05-03 0:00 Space Station S/W in Ada -- No Tasking? Robert Munck
1998-05-03 0:00 ` Robert Dewar
1998-05-07 0:00 ` JP Thornley
1998-05-05 0:00 ` LarryButts
1998-05-05 0:00 ` Roger Racine
1998-05-05 0:00 ` Robert Munck
1998-05-12 0:00 ` Carla Taylor
1998-05-06 0:00 ` William D. Ghrist
1998-05-06 0:00 ` Robert I. Eachus
1998-05-07 0:00 ` Joe Gwinn
1998-05-08 0:00 ` Roger Racine
1998-05-08 0:00 ` Joe Gwinn
1998-05-08 0:00 ` Dale Pontius
1998-05-08 0:00 ` Chris Warwick