From: "Marin David Condic"
Newsgroups: comp.lang.ada,comp.lang.c,comp.lang.c++,comp.lang.functional
Subject: Re: How Ada could have prevented the Red Code distributed denial of service attack.
Date: Fri, 3 Aug 2001 10:41:15 -0400
Organization: Posted on a server owned by Pace Micro Technology plc
Message-ID: <9ked6d$mtr$1@nh.pace.co.uk>
References: <3B687EDF.9359F3FC@mediaone.net> <3B6A588C.B67A9CF8@isltd.insignia.com>

"Christian Bau" wrote in message news:3B6A588C.B67A9CF8@isltd.insignia.com...
>
> If it is true that this value was indeed never used then the decision to
> blow up the rocket was quite unfortunate.
> But if the value was used,
> then it is obvious that this wrong value could cause very bad things to
> happen; so blowing up the rocket was indeed correct.
>

The problem was that the IRS was not needed after the initial launch, so it should have been shut down. The error was at the hardware level: an overflow triggered an interrupt, and the ISR's designed behavior was to assume the overflow was the result of flaky hardware, shut down the computer and transfer control to the other channel. The rocket was blown up because it had become unstable in flight - not because the IRS decided to shut down per se.

> Why was there no "protection against operand errors"? In other words,
> why was there no code that would detect the error, take appropriate
> action against the error, and continue flying the rocket? There was of
> course a global "protection against unanticipated operand errors": Any
> overflow was indeed detected, and anything that comes unanticipated
> means that the software doesn't work as planned. Whether this is a
> hardware fault or a fault in some programmer's logic doesn't really
> matter. All you know is that something is wrong, you cannot be sure that
> the rocket is doing what it is supposed to do, and this is a very
> dangerous situation, so you blow it up. I assume that someone determined
> that blowing it up is the least risky thing to do, at least once it is
> up in the air.
>

There was code to detect and react to errors. It worked exactly as it had been planned to work. There wasn't even a "logic error" - the logic was perfect given the anticipated use of the device. (Having built similar systems, I can attest to the fact that when certain errors come up, you have to make your best guess as to what the cause is likely to be and take some kind of action that would make sense. This is what the engineers did.) Remember, the IRS was designed to accommodate the flight envelope of the Ariane 4 rocket.
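
To make the fault-handling behavior described above concrete, here is a minimal Python sketch of "treat an overflow as broken hardware, shut the channel down, and switch to the other one". It is an illustration only, not the flight software: the `Channel` class, the `read_with_failover` function, the 16-bit target range and all numeric values are assumptions chosen for the example.

```python
INT16_MIN, INT16_MAX = -32768, 32767  # assumed 16-bit signed target range


class Channel:
    """One redundant channel; the name and fields are hypothetical."""

    def __init__(self, name):
        self.name = name
        self.healthy = True

    def convert(self, reading):
        """Convert a float reading to a 16-bit integer, trapping overflow."""
        value = int(reading)
        if not INT16_MIN <= value <= INT16_MAX:
            # Design assumption: an overflow can only mean broken hardware,
            # so the channel shuts itself down.
            self.healthy = False
            raise OverflowError(f"{self.name}: {reading} does not fit in 16 bits")
        return value


def read_with_failover(channels, readings):
    """Return (name, value) from the first healthy channel that converts."""
    for channel, reading in zip(channels, readings):
        if not channel.healthy:
            continue
        try:
            return channel.name, channel.convert(reading)
        except OverflowError:
            continue  # channel is now marked unhealthy; try the next one
    raise RuntimeError("no healthy channel left")


# One bad sensor: the fault is accommodated by the second channel.
a, b = Channel("A"), Channel("B")
print(read_with_failover([a, b], [1.0e6, 1200.0]))  # ('B', 1200)
```

With a single bad sensor the scheme does what it was designed to do. If the out-of-range value is genuine rather than a sensor fault, every channel sees it, each one shuts itself down, and `read_with_failover` raises `RuntimeError`, so the accommodation only makes sense while the assumption "overflow means bad hardware" holds.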
The situation was such that the normal Ada constraint checks were removed to gain needed speed. This was done *after* an analysis indicated that any numbers big enough to trigger the constraint checks (or the hardware overflow) would have to be wildly out of the possible range of the Ariane 4 flight envelope. The engineers who designed it basically said "If a hardware overflow occurs at this point, that's O.K. It means a sensor has gone bad and when we trap the interrupt, we will shut down the bad channel and switch to the good one." Having Ada constraint checks probably wouldn't have changed this any since the likely decision would be "If we got a Constraint_Error in this routine, it means a bad sensor so shut down the channel and transfer to the other side."

The problem was that basically, the FDA was correct for the Ariane 4 - a failure of a sensor would be detected and accommodated by a transfer of control to the other channel. It was just the *wrong* FDA for the Ariane 5 since an overflow of that computation would be an expected condition given the flight envelope. Hence, the software would probably have been designed to do the calculations differently, allowing for the larger values. They *never* tested the IRS against the Ariane 5 flight envelope. If they had done so, it would have triggered the error and they would have known they had a problem.

> I think any explicit check for this overflow and trying to handle it
> would have been inappropriate. It was (incorrectly) determined that an
> overflow could not happen, so there was no appropriate action possible.
> (This is assuming that the results were indeed used. If there is
> functionality in a rocket that is not related to its performance, like
> sending data to the ground, and a malfunction in this is detected, then
> ignoring that malfunction might be the better action).

No, it was *correctly* determined that an overflow could never happen - within the flight path of the Ariane 4 and so long as the sensors were functioning correctly. If the overflow *did* occur, it meant you had a problem with the computer or the sensors. Go into FDA because something is broken. It was an *incorrect* design for the Ariane 5. Nobody ever determined that because nobody ever looked. They just took the IRS off the shelf, bolted it to the Ariane 5 and assumed it would work as flawlessly as it did in the Ariane 4. The problem wasn't a language issue or even a design issue - it was a management issue: nobody determined whether a specific part was or was not suited for a new application.

I hope this clears things up a little. There always seems to be a lot of misunderstanding of exactly what went on in this disaster and where the problems came up.

MDC
--
Marin David Condic
Senior Software Engineer
Pace Micro Technology Americas    www.pacemicro.com
Enabling the digital revolution
e-Mail:    marin.condic@pacemicro.com
Web:      http://www.mcondic.com/