From: dewar@cs.nyu.edu (Robert Dewar)
Subject: Re: Ada Core Technologies and Ada95 Standards
Date: 1996/04/04
Organization: Courant Institute of Mathematical Sciences
Newsgroups: comp.lang.ada
References: <00001a73+00002c20@msn.com> <828038680.5631@assen.demon.co.uk>
 <828127251.85@assen.demon.co.uk> <315FD5C9.342F@lfwc.lockheed.com>
 <3160EFBF.BF9@lfwc.lockheed.com> <3162B080.490F@lfwc.lockheed.com>

> In my domain, a reverse mapping is always done by the test
> developers, to demonstrate effective test coverage. Anyway, since
> ACVC isn't intended to measure product quality, I don't know that it
> matters...

That's misleading. There are many aspects of quality for Ada
compilers. Validation helps to measure and assess some of these
aspects:

  (a) full feature coverage
  (b) accurate interpretation of tricky semantic rules
  (c) lack of improper extensions

The point is not that the ACVC suite does not measure product quality,
it is that it does not guarantee product quality, because it addresses
only certain quality aspects. You can still have a 100% conformant
compiler with no extensions that is of low quality because, for
example, it has unacceptable capacity limits, or poor compile-time or
run-time performance.

P.S. It is perfectly possible to trace ACVC tests back to the test
objectives in the implementors guide.

> Unfortunately, I can never tell the criteria by which "reasonable"
> coverage was established. However, as I noted above, I'll take it on
> faith that the current ACVC is as good as it can be.

The reason you cannot tell the criteria is that you have, as you noted
in your previous message, not made the effort to find out. A lot has
been written down to answer these questions. As I mentioned before, a
good starting point is to look at the papers John Goodenough has
written on the subject. You could also contact Nelson Weiderman for
material relating to the development of ACVC version 2 (this is a
large SAIC contract involving tens of person-years of work, and as you
would hope, this work does not take place in an information vacuum!)
You might be particularly interested to study the coverage matrices
that have been developed for Ada 95.

> How many attempts were made in the last three years to add a
> regression test to the ACVC? How does that compare to the list of
> known bugs in Ada compilers? I guess I'm still operating from
> ignorance: Dr. Dewar seemed to think that it wasn't possible to try
> to put in regression tests for every bug, but you're saying this is
> what is attempted? Perhaps we're talking about different types of
> bugs?

"Attempt" seems to imply something you tried and failed at, so I am
not sure I would apply the word. It is normal procedure in the ACVC
development process to add tests to the suite as gaps are found. In
the case of GNAT, if I find a bug that seems like it should be caught
by the ACVC suite, I send along a message to the ACVC development
group. If appropriate it is discussed by the ACVC review group, but
more often than not, it is simply incorporated.
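Since not everyone has seen the suite, here is the flavor of what such
a test looks like. Class B tests in the ACVC consist of deliberately
illegal code, with each construct that a conforming compiler must
reject flagged by an ERROR comment. The sketch below is NOT an actual
ACVC test (the unit name and the code are invented for illustration),
but it follows that convention, exercising the rule that the choices
of a case statement must cover every value of the selected subtype:

   --  In the style of an ACVC class B (error detection) test.
   --  A conforming compiler must reject the construct marked
   --  ERROR and accept everything else.

   procedure B_Sketch is
      type Color is (Red, Green, Blue);
      C : Color := Red;
   begin
      case C is                  -- ERROR: choices do not cover Blue
         when Red   => null;
         when Green => null;
      end case;
   end B_Sketch;

Executable tests (class C) work the other way around: they must
compile cleanly, run, and report their own pass or fail result.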
I do not send along all bug reports, for many reasons. Some involve
proprietary code, and the bug cannot easily be abstracted into a
simple example; some depend on implementation-specific issues. For
example, the fact that some Ada program triggers an obscure bug in the
assembler for some obscure machine does not necessarily translate into
a useful test. Some are quite GNAT-specific. Some are untestable
issues (nice error messages, for instance), etc.

> Do we believe that there is something in the works within the Ada
> community to keep reducing the defect rate? Where is this effort
> documented? If there is such a plan, discussing it would be a
> welcome change from the "can't be done" answer I've been given so
> far in this thread!

If you are really interested in finding out what is going on in detail
with validation, informal discussion on CLA is not the place to find
it out. Why not spend some of the effort you put into posting these
messages on finding out more about the ACVC process. Obtain and read
the ACVC validation procedures. Obtain and read some sample VSRs.
Obtain and read John Goodenough's papers. Etc.

> Do you believe the Ada compiler vendor market looks more like "some"
> markets, or like a "safety critical" market? I think that question
> is pretty much at the heart of my grousing, and I suspect at the
> heart of Mr. McCabe's statements as well.

The Ada compiler vendor market certainly does NOT look like a
safety-critical market (even by Ken's standards, which seem a bit weak
to me, since they are very testing oriented, and I would never
consider testing alone to be sufficient for safety-critical
applications). It is simply a fact that today we do NOT have the
technology to develop compilers for modern languages that meet the
criteria of safety-critical programs. This is why all safety-critical
programs are tested, validated and certified at the object code level
(I would certainly not fly in a plane whose flight software took on
faith the 100% accuracy of all generated code).

Is it possible to develop such technology? Yes, I think so, but it
involves substantial work at both the engineering level and the
research level. The NYU group is actually very interested in this
area. We recently made a proposal to NSF (not funded :-() precisely to
do work in this area. If anyone is interested, they can contact me and
I will send a copy of the proposal. We are currently exploring other
funding sources for this work. But that's definitely in the future.

Compilers are not large programs, but they are extremely complex. They
involve very complex data structures and algorithms, far more complex
than appear in typical application programs. Furthermore, we do not
begin to have complete formal specifications of the problem. Unlike
typical safety-critical applications, at least those that I am
familiar with, where formal specifications are developed, we have not
yet arrived at a practical technology for generating formal
specifications of large languages.

That last sentence is controversial. There are those who would
disagree (see for example the VDM definition of Chill), but the
empirical fact is that nothing approaching a full formal specification
exists for Ada 95, Fortran 90, COBOL, C++ or even, as far as I am
aware, C. Furthermore, I just don't think that formal definition
methodology is good enough yet to solve this problem fully.
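To give a rough idea of what is involved, consider just one
comparatively simple legality rule, the case-coverage rule exercised
in the test sketch earlier. In an invented notation (NOT drawn from
any actual formal definition of Ada), it might be rendered as an
inference rule roughly like this, written here in LaTeX form:

   \frac{\Gamma \vdash e : S \qquad
         \mathit{vals}(c_1) \cup \cdots \cup \mathit{vals}(c_n)
            = \mathit{vals}(S)}
        {\Gamma \vdash \mathbf{case}\ e\ \mathbf{is}\
            c_1 \Rightarrow s_1 \mid \cdots \mid c_n \Rightarrow s_n
            \quad \mathbf{legal}}

Even this toy rule glosses over the real one (the RM distinguishes
static, constrained nominal subtypes from the general case, requires
the choices to be disjoint, and the selecting expression is subject to
overload resolution). Multiply by the thousands of rules in the RM,
add the dynamic semantics and the interactions among all of these, and
the scale of the problem becomes clear.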
Could we do more now, with current technology? Possibly. We could, for
example, do full coverage and path testing for compilers. There are,
however, several problems with this.

It would be VERY expensive. Look for example at the TSP CSMART
product. Now this is an interesting example of a chunk of Ada compiler
technology that does meet Ken's expectations as a creator of
safety-critical programs. BUT this runtime is TINY compared to the
full Ada 95 runtime, perhaps less than one tenth the size, and yet it
costs two orders of magnitude more than a normal full runtime. I was
working with Alsys during some of the time they worked on CSMART, and
I can assure you that this price accurately reflects the enormous
effort that went into certifying this relatively small piece of code
(the 386 version was based in part on code that I had written for the
full Ada 83 runtime for Alsys).

It would tend to make it much harder to fix problems and improve the
compiler. Compilers are not static, but tend to improve constantly. If
the cost of improvement is to repeat a very expensive certification
and testing process, then improvements will not happen so often, or at
all.

It would divert energy from other important areas. Even if we verified
100% proved conformance to the Ada 95 spec, or even to some imaginary
formal version of this spec, it would still be only one aspect of
quality. There is no point in spending too much effort strengthening
the strong link of a chain. What we want to aim at is a situation
where problems of compiler bugs and conformance are no more of a
problem in the total development process than other factors, and
where, taken as a whole, these bugs and other problems do not
significantly impact project success or timescales.

I can certainly understand Ken's frustration, but these are tough
problems. I am sure that uninformed members of the public, or even
simply technical people not familiar with avionics, look at yet
another report of a military aircraft crash and ask "Why can't those
guys build planes that don't crash?" Certainly Congress likes to ask
such questions :-) Of course, they simply don't understand the
engineering problems and realities involved. Our technology is not
perfect, and this is true of many areas. About the best we can hope
for is to ensure that compiler technology is not the weak link in the
chain, and that when a plane does crash, it is not because of a bug in
an Ada compiler! In practice I think we do a pretty good job. I am not
aware of any major failure of safety-critical software that can be
traced to a compiler bug. Could we do better? Will we do better in the
future? I would certainly hope that the answer is yes and yes.

P.S. I find it a bit amazing that John McCabe is so unaware of the
validation status of the compiler he is using. One important piece of
advice for any user of validated Ada compilers is to obtain the VSR
(Validation Summary Report) and read it carefully. VSRs are public
documents, available from the AVO, so even if your vendor does not
supply a copy (they should), you can obtain one. John: along with a
lot of other data, the VSR lists the expiration date, or points to the
documents that define the expiration date.