From: Ken Garlington
Subject: Re: Ariane 5 failure
Date: 1996/10/04
Message-ID: <3255593B.19B7@lmtas.lmco.com>
References: <96100112290401@psavax.pwfl.com> <32531A6F.6EDB@dynamite.com.au> <3252B46C.5E9D@lmtas.lmco.com> <32546A8F.75D8@dynamite.com.au>
Content-Type: text/plain; charset=us-ascii
Organization: Lockheed Martin Tactical Aircraft Systems
MIME-Version: 1.0
Newsgroups: comp.lang.ada
X-Mailer: Mozilla 2.02 (Macintosh; I; 68K)

Alan Brain wrote:
>
> Ken Garlington wrote:
>
> > So what did you do when you needed to build a system that was bigger
> > than the torpedo hatch? Re-design the submarine?
>
> Nope, we re-designed the system so it fit anyway.

Tsk, tsk! You violated your own design constraint of "always provide enough
margin for growth." Just think how much money you would have saved if you
had built it bigger to begin with!

> Actually, we designed the thing in the first place so that the risk of it
> physically growing too big and needing re-design was tolerable (i.e.,
> contingency money was allocated for doing this, if we couldn't accurately
> estimate the risk as being small).

I'm sure the Arianespace folks had the same contingency funding. In fact,
they're spending it right now. :)

> > Oh for the luxury of a diesel generator! We have to be able to operate
> > on basic battery power (and we share that bus with emergency lighting,
> > etc.)
>
> Well ours had a generator connected to a hamster wheel with a piece of
> cheese as backup ;-) .... but seriously folks, yes we have a diesel.
> Why? To charge the batteries.

Batteries, plural? Wow!

> I'd be very, very suspicious of a slack like "15%". This implies you know
> to within 2 significant figures what the load is going to be. Which in my
> experience is not the case. "About a Seventh" is more accurate, as it
> implies more imprecision. And I'd be surprised if any Bungee-Jumper would
> tolerate that small amount of safety margin using new equipment. Then
> again, slack is supposed to be used up. It's for the unforeseen. When you
> come across a problem during development, you shouldn't be afraid of
> using up that slack; that's what it's there for!

Actually, no. For most military programs, slack is for a combination of
growth _after_ the initial development and unforeseen variations in the
production system (e.g., a processor that's a little slower than spec). And
15% is a common number for such slack. I think you're confusing "slack" with
"management reserve," which is usually a number set by the development
organization and used up (if needed) during development. The 15% number is
usually imposed by a prime on a subcontractor for the reasons described
above.

> > What if your brand new CPU requires more power than your diesel
> > generator can generate?
> > What if your brand new CPU requires a technology that doesn't let you
> > meet your heat dissipation?
>
> But it doesn't. When you did your initial systems engineering, you made
> sure there was enough slack - OR had enough contingency money so that you
> could get custom-built stuff.

How much money is required to violate the laws of physics?
_That's_ the kind of limitation we're talking about when you get into power,
cooling, heat dissipation, etc.

> I see your zero-cooling situations, and I raise you H2, CO2, CO, Cl, H3O
> conditions etc. The constraints on a sub are different, but the same in
> scope. Until such time as you do work on a sub, or I do more than just a
> little work on aerospace, we may have to leave it at that.

But we _already_ have these same restrictions, since we have to operate in
Naval environments. We also have _extra_ requirements. Considering that the
topic of this thread is an aerospace system, I think it's not enough to
"leave it at that."

> > > Usually such ridiculously extreme measures are not necessary. The
> > > Hardware guys bitch about the cost-per-CPU going through the roof.
> > > Heck, it could cost $10 million. But if it saves 2 years of Software
> > > effort, that's a net saving of $90 million.
> >
> > What do maintenance costs have to do with this discussion?
>
> Sorry I didn't make myself clear: I was talking development costs, not
> maintenance.

Then you're not talking about inertial nav systems. On most of the projects
I've seen, the total software development time is two years or less. You're
not going to save 2 years of software effort for a new system!

> > If you're used to developing systems with those kinds of constraints,
> > you know how to make those decisions. Occasionally, you make the wrong
> > decision, as the Ariane designers discovered. Welcome to engineering.
>
> My work has only killed 2 people (Iraqi pilots - that particular system
> worked as advertised in the Gulf). There might be as many as 5000 people
> whose lives depend on my work at any time, more if War breaks out. I
> guess we have a different view of "acceptable losses" here, and your view
> may well be more correct.

You're missing the point. It's not a question of whether it's OK for the
system to fail. It's a question of humans having to make decisions that
don't include "well, if we throw enough money at it, we'll get everything we
want." You cannot optimize software development time and ignore all other
factors! In some cases, you have to compromise software
development/maintenance efficiencies to meet other requirements. Sometimes,
you make the wrong decision. Anyone who says they've always made the right
call is a lawyer, not an engineer.

> Why? Because such a conservative view as my own may mean I just can't
> attempt some risky things. Things which your team (sometimes at least)
> gets working, thereby saving more lives.

However, if you build a system with the latest and greatest CPU, thereby
having the maximum amount of horsepower to permit the software engineers to
avoid turning off certain checks, etc., you _have_ attempted a risky thing.
The latest hardware technology is the least used (and so the least proven).

> Yet I don't think so.
>
> > And, if you had only got 20MB per second after all that, you would have
> > done...?
>
> 20 MB? First, re-check all calculations. Examine hardware options. Then
> (probably) set up a "get-well" program using 5-6 different tracks and
> pick the best. Most probably though, we'd give up: it's not doable within
> the budget.

That's the difference. We would not go to our management and say, "The only
solutions we have require us to make compromises in our software approach,
therefore it can't be done. Take your multi-billion project and go home."
We'd work with the other engineering disciplines to come up with the best
compromise.
It's the difference, in my mind, between a computer scientist and a software
engineer. The software engineer is paid to find a way to make it work --
even if (horrors) he has to write it in assembly, or use
Unchecked_Conversion, or whatever.

> The difficult case is 150 MB. In this case, assembler coding might just
> make the difference - I do get your point, BTW.

> > Certainly, if you just throw out range checking without knowing its
> > cost, you're an idiot. However, no one has shown that the Ariane team
> > did this. I guarantee you (and am willing to post object code to prove
> > it) that range checking is not always zero cost, and in the right
> > circumstances can cause you to bust your budget.
>
> Agree. There are always pathological cases where general rules don't
> apply. Being fair, I didn't say "zero cost", I said "typically 5%
> measured". In doing the initial Systems work, I'd usually budget for 10%,
> as I'm paranoid.

I've seen checks in just the wrong place cause differences of 30% or more in
a high-rate process. It's just not that trivial.

> You get what you pay for, IF you're lucky. My point though is that many
> of the hacks, kludges, etc. in software are caused by insufficient
> foresight in systems design.

And I wouldn't argue that. However, it's a _big_ leap to say ALL hacks are
caused by such problems. Also, having gone through the system design process
a few times, I've never had "sufficient foresight." There's always been at
least one choice I made then that I would have made differently today. (Why
didn't I see the obvious answer in 1985: HTML for my documentation! :)

That's why reuse is always so tricky in safety-critical systems. It's very
easy to make reasonable decisions then that don't make sense now. That's why
I laugh at people who say, "reused code is safer; you don't have to test it
once you get it working once!"

> Case in point: RAN Collins class submarine. Now many years late due to
> software problems. Last time I heard, they're still trying to get that
> last 10% performance out of the 68020s on the cards. Which were
> leading-edge when the systems work was done. Putting in 68040s a few
> years ago would have meant the Software would have been complete by now,
> as the hacks wouldn't have been necessary.

68040s? I didn't think you could get mil-screened 68040s anymore. They're
already obsolete. Not easy to make those foresighted decisions, is it? :)

-- 
LMTAS - "Our Brand Means Quality"
For more info, see http://www.lmtas.com or http://www.lmco.com
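
To make the range-check tangent concrete, here is a minimal Ada sketch of
the kind of placement being argued about above. The unit, types, and figures
are hypothetical (they come from no system discussed in this thread); it
only illustrates suppressing a check in one measured hot spot rather than
throwing checking out wholesale.

   --  Hypothetical high-rate processing step.  The assignment in the loop
   --  would ordinarily carry an implicit range check on every iteration;
   --  pragma Suppress removes it for this unit only, after (and only
   --  after) measurement shows the check busting the timing budget.
   procedure Filter_Step is
      type Sample is range -2_048 .. 2_047;
      type Frame  is array (1 .. 1_024) of Sample;

      Raw      : Frame := (others => 1_000);
      Filtered : Frame := (others => 0);

      --  Scoped to this declarative region; checks stay on everywhere else.
      pragma Suppress (Range_Check);
   begin
      for I in Frame'Range loop
         Filtered (I) := Raw (I) / 2;   --  the formerly checked assignment
      end loop;
   end Filter_Step;

Whether such a check costs the "typically 5% measured" figure or the 30%
seen in a bad spot depends on how little other work the loop does, which is
why the argument above keeps coming back to measuring before deciding.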