From: Ken Garlington
Subject: ACVC and Compiler Quality (was: Ada Core Technologies blah blah blah)
Date: 1996/04/10
Organization: Lockheed Martin Tactical Aircraft Systems
Newsgroups: comp.lang.ada
Message-ID: <316B9234.35AA@lfwc.lockheed.com>

(I don't know if this got posted before, so I'm re-posting...)

Robert Dewar wrote:
>
> The point is not that the ACVC suite does not measure product quality,
> it is that it does not guarantee product quality, because it only
> addresses certain quality aspects.

OK, let's say that ACVC measures some aspects of product quality, and that
I misunderstood the statement in the old AJPO guide about not using ACVC
to determine "suitability for any particular purpose," or words to that
effect.

Is what it measures meaningful to the end user? In other words, does the
presence of the ACVC have a _measurable_ effect on the portability of code
between compilers? How is this _measured_?

It's clear you can respond by saying, "Well, without the ACVC, a
reasonable person would conclude that these attributes would be worse."
However, you've implied that the ACVC suite _measures_ relevant aspects of
product quality. From this, one could assume that changes in the ACVC test
suite, say from 1.x to 2.x, might change this measure. How did it change,
or how is it expected to change, such measures as portability, etc.?

> You can still have a 100% conformant compiler with no extensions that
> is of low quality because it has unacceptable capacity limits, or
> compile/runtime performance for example.

I'm not sure what a "100% conformant compiler" has to do with this thread,
since we've already agreed there's no way to know if a compiler truly
conforms to the language standard. If by "conformant" you mean "passes all
ACVC tests," then you can also add to your list problems due to compilers
not generating correct code, or allowing illegal constructs (I've found
"relaxed" rules in two validated compilers so far), or providing
non-standard (but "allowable," as far as I know) extensions to the
language (some of which are now in the Ada standard), or even providing
arguably unallowable extensions, as in the TLD case. Given all this, what
is the expected benefit of the ACVC to the end user?

Let me say it another way. If I worked on a non-DoD project, and a vendor
offered me a discount if they didn't have to run the ACVC on future
versions of their compiler, under what circumstances should I take it?

> P.S. It is perfectly possible to trace ACVC tests back to the test
> objectives in the implementors guide.

Again, a difference of philosophy. The benefit is accrued in my domain
when the traceability matrix is generated during development. Is this what
you are saying happened? Or are you saying something similar to: "We
didn't actually test the product, but it is perfectly possible to do so."
(As to whether the test objectives can be traced back to the ARM, see
below.)

> I do not send along all bug reports for many reasons.
> Some involve proprietary code, and the bug cannot easily be abstracted
> into a simple example, some depend on implementation specific issues.
> For example, the fact that some Ada program triggers an obscure bug in
> the assembler for some obscure machine does not necessarily translate
> into a useful test. Some are quite GNAT specific. Some are untestable
> issues (nice error messages for instance), etc.

Do you send all bug reports that don't meet these (reasonable) exclusions?
Do other vendors? Should they? If so, what document implies that they
should?

> Obtain and read the ACVC validation procedures. Obtain and read some
> sample VSR's. Obtain and read John Goodenough's paper. etc.

1) ACVC validation procedures

OK, let me recheck the AdaIC server and see what I find:

  (ACVC) VERSION 2.0.1 USER'S GUIDE

  "1.3 ACVC Purpose

  "The purpose of the ACVC is to check whether an Ada compilation system
  is a conforming implementation, i.e., whether it produces an acceptable
  result for every applicable test. A fundamental goal of validation is
  to promote Ada software portability by ensuring consistent processing
  of Ada language features as prescribed by [Ada95]."

This is a fundamental goal; however, nowhere have I been able to find how
the effectiveness of the ACVC in meeting this goal has been measured. Note
also that the purpose, essentially, is to pass the tests!

  "ACVC tests use language features in accord with contexts and idioms
  normally found in production software."

Presumably, there is some mechanism used to compare the ACVC tests to the
current use of Ada in production systems. Nowhere can I find how this is
done.

The rest of this section painstakingly points out what the ACVC tests
cannot do, which is probably a good idea, since this area is where
supporters of the ACVC seem to feel most secure in discussing the ACVC.

So, how are the tests designed? The following section would seem to be the
right place for the answer:

  "3.6 General Standards

  Tests for 9XBasic and for Ada95 were developed to a general set of
  standards. To promote a variety of code styles and usage idioms in the
  tests, standards were not necessarily rigorously enforced but were used
  as guidelines for test writers."

Well, that doesn't sound very encouraging for a standard. In fact, the
section quickly moves into fine naming conventions, etc. I do discover
that:

  "In 9XBasic, tests use as few Ada features as necessary to write a
  self-checking executable test that can be read and modified. Tests for
  Ada95 exhibit a usage oriented style, employing a rich assortment and
  interaction of features and exemplifying the kind of code styles and
  idioms that compilers may encounter in practice."

Presumably, this emphasis on a "user-oriented" style (which I've heard
repeatedly stressed as a new thing for Ada 95 tests) will accrue some
benefit. This implies that (1) some deficiency was measured before, and
(2) there is some effort to take Ada 95 measures to see if the deficiency
is corrected. What are these deficiencies, and these measures? Not in this
document.

How about:

  THE ADA COMPILER VALIDATION CAPABILITY (ACVC) VERSION 2.0.1
  TEST OBJECTIVES DOCUMENT

This immediately goes into brief paragraphs for each test, e.g.

  "B360001 Check that, within the definition of a nonlimited composite
  type or a limited composite type that..."

Presumably, the writer of this test contributed to a bi-directional
traceability matrix, but it must have been at the bottom of this document,
since my browser died midway through the download.
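(As a point of reference for anyone who hasn't looked at the suite: a
"self-checking executable test" is, as I understand it, roughly shaped
like the sketch below. This is a made-up illustration only, not an actual
ACVC test; it assumes the suite's usual Report support package
(Test/Failed/Result), and the test name, objective wording, and types are
all invented.)

  -- Hypothetical sketch of an ACVC-style self-checking test.
  -- Assumes the ACVC "Report" support package; all identifiers and the
  -- objective text below are invented for illustration.
  with Report;
  procedure CX_Example is
     type Int_Pair is record
        A, B : Integer := 0;   -- components with default expressions
     end record;
     P : Int_Pair;             -- should be default-initialized
  begin
     Report.Test ("CX_EXAMPLE",
                  "Check that default expressions of record components " &
                  "are evaluated when an object is declared");
     if P.A /= 0 or P.B /= 0 then
        Report.Failed ("Default initialization not performed");
     end if;
     Report.Result;
  end CX_Example;

(The interesting question, of course, is not what one such test looks
like, but how anyone measures whether the collection of them buys the
portability the User's Guide promises.)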
Such a traceability matrix would probably help reduce the redundancy found
in the ACVC 1.x tests, as mentioned in the User's Guide.

Both of these documents are March 1996 documents, and are referenced in
other documents which I read previously, such as the Ada 9X Transition
Planning Guide and the Ada 95 Adoption Handbook. I couldn't find the
answer to my questions in those documents, either.

2) VSRs - Read one. Not a plan. Doesn't answer any of the questions I've
raised.

3) John Goodenough's paper - I found several references to the Quality and
Style guide on AdaIC when I searched on "goodenough". Was this what you
had in mind, or was there some other paper? If so, how would the average
end user get it? Furthermore, if this paper is not incorporated into the
official AJPO documentation, how useful is it?

> The Ada compiler vendor market certainly does NOT look like a safety
> critical market (even by Ken's standards, which seem a bit weak to me
> since they seem very testing oriented, and I would never consider
> testing alone to be sufficient for safety critical applications).

Me neither. Why would you think otherwise? What's more, the Ada community
(well, OK, those members in this thread) doesn't even seem willing to
accept 50% of the weakened standards I've raised, so what difference does
it make where I set the bar?

> It is definitely the fact that today we do NOT have the technology
> to develop compilers for modern languages that meet the criteria
> of safety critical programs.

What about the criteria for SEI Level II? Level III? ISO 9000? _Any_
explicit criteria related to end-use quality, other than "to check whether
an Ada compilation system ... produces an acceptable result for every
applicable test"?

> This is why all safety critical programs are tested, validated and
> certified at the object code level (I will certainly not fly a plane
> where the flight software took on faith the 100% accuracy of all
> generated code).

Of course, since testing can't prove 100% accuracy... :) Did you
understand the Musa reference? Should I go into more detail about what I
mean? As an aside, would you fly on an aircraft where formal methods were
not used in the development of the safety-critical software?

> Is it possible to develop such technology? Yes, I think so, but it
> involves some substantial work at both the engineering level and
> research level. The NYU group is actually very interested in this
> area. We recently made a proposal to NSF, not funded :-( precisely
> to do work in this area.

Why should it have been funded, given that you claim below that Ada
compiler quality is adequate? If you can answer that, I suspect you will
end up agreeing with Mr. McCabe and myself!

> If anyone is interested, they can contact me and I will send a copy of
> the proposal. We are currently exploring other funding sources for this
> work.

I would be very interested. Please send me a copy. If you don't have my
address, let me know.

> Could we do more now with current technology? Possibly.

Yes! The part of the thread _I_ want to talk about! :)

> We could do for example full coverage and path testing for compilers.
> There are however several problems with this.
>
> It would be VERY expensive....
>
> It would tend to make it much harder to fix problems and improve the
> compiler....

These first two I cannot argue, although I can point out that automation
can at least reduce the impact. On the other hand, it can be argued that
going from an SEI I to an SEI III process will also have these effects, at
least initially.
In particular, it may not be a bad idea to slow down the rate of change of
a software item, if that item is experiencing significant regressions with
each release. Also, you may find that the initial up-front cost is offset
by fewer bug fixes downstream (and less time running that enormous
regression test suite :)

Are you saying that, in order to keep Ada toolset prices low, you (and the
Ada community in general) are unable to invest in such technologies? That
Ada toolsets should be thought of as a commodity product, where you make
your money by volume selling of products with small profit margins?

> It would divert energy from other important areas. ...
>
> What we want to aim at is a situation where problems of compiler bugs
> and conformance are no more of a problem in the total development
> process than other factors, and taken as a whole, that these bugs and
> other problems do not significantly impact project success or
> timescales.

THAT'S IT! THAT'S WHAT I WANT!

Now that we agree as to where "we" (the Ada community, hopefully) want to
aim, I'll re-ask some questions:

1. Does the ACVC help in this aim? If so, how can we tell? If not,
   should it?

2. What other activities should be happening to meet this aim?

3. Where is the plan to make these other activities happen?

(I think it's in Humphrey's book:

 - If you don't have a destination in mind, any road is a good one, BUT

 - If you don't have a road available, no destination is reachable!)

> I can certainly understand Ken's frustration, but these are tough
> problems. I am sure that when uninformed members of the public, or even
> simply technical people not familiar with avionics, look at yet another
> report of a military aircraft crash, and ask "Why can't those guys
> build planes that don't crash?" Certainly congress likes to ask such
> questions :-)

This is an outstanding example. In Dr. Leveson's book "Safeware" (which is
missing from my bookcase, or I'd quote it directly), she talks about the
exemplary safety record in military aviation in recent years. However,
this wasn't always the case. In the past, the record was dismal. It got so
bad that, in desperation, the military put together a _plan_. They
developed a safety standard, made safety a top priority with commanders,
etc., and turned it around. Whenever a plane does crash, there is an
investigation; the root cause is determined and made public to help avoid
repetition.

The same is true in the engineering world. As we've gone into the business
of computer-based flight controls (analog, and later digital), we've put
together industry-wide standards on the development and evaluation of
these systems. When there is an incident, the root cause is established
and published, and the standards are updated. And, as a result, the
reliability of those systems has continued to improve, in a measurable
sense.

This is also true for software in general. I've mentioned SEI and ISO 9000
already. You can argue about the effectiveness of these, but they exist.
There were issues about software quality, and the industry put together a
plan to respond.

And now, you're hearing complaints from some users of Ada tools that the
quality overall - not just that of a single vendor - is not as good as it
should be. Generally, the response has _not_ been, "Yes, here's what we
need to do." It's been:

> Of course, they simply don't understand the engineering problems and
> realities involved. Our technology is not perfect, and this is true of
> many areas.

Well, OK, but so what?
You've already made the heinous mistake of admitting that more can be
done, so why isn't it? Why don't Ada vendors use modern software
engineering and management techniques? Or do they? And how would anyone
know?

> About the best we can hope for is to ensure that compiler technology is
> not the weak link in the chain, and that when a plane does crash, it is
> not because of a bug in an Ada compiler!

I am here to tell you that, today, I believe Ada compiler technology _is_
the weak link after requirements definition. I see too many papers
(including mine) that say, "Ada is great, but the compiler was buggy" to
believe otherwise. We can (and do) patch that weak link with object code
analysis and other techniques. I am worried (and Musa predicts) that,
particularly as Ada compilers grow more complex, our patch may not hold.

By the way, anyone who tries to use this claim as an argument not to use
Ada is misrepresenting my position entirely. I'm talking about improving a
good product, not denouncing a poor one.

Note that you don't have to be building safety-critical software for this
to be an important issue. How many converts to Ada have we lost due to the
quality of compilers? How many schedules have been impacted? How much
money has been spent? I can spin anecdotes on this, but I would think that
someone, somewhere, should actually be finding out the answers to these
questions. More importantly, they should be working on _improving_ the
answers to these questions.

If Ada is intended for systems that need to be more reliable than average,
then Ada compilers should be more reliable than the average language
compiler. Not 100% reliable, and perhaps not even at the reliability of a
safety-critical system. Just better than average. This means two things:
(1) you have to continuously measure the reliability, and (2) you have to
have an explicit plan to make that measure improve.

I was encouraged by the TRI-Ada '95 presentation on the Verdix Ada vs. C++
compiler error rates. I would have been more impressed by a presentation
showing a steadily decreasing error rate measured across several vendor
products. (Like _that's_ ever going to happen, based on this thread!)

> In practice I think we do a pretty good job. I am not aware of any
> major failure of safety-critical software that can be traced to a
> compiler bug.

I don't think we have enough operational hours to know. I also don't know
if there is a reporting mechanism available to find out if/when this does
occur. Most importantly, I don't want to wait for the software equivalent
of an O-ring failure to find out that we should have been addressing this
issue. (I keep looking at those curves in Musa's book, and crossing my
fingers...)

> Could we do better? Will we do better in the future?
> I would certainly hope that the answer is yes and yes.

Sorry, I work for an SEI III contractor. We are not permitted to "hope"
for a goal. ;)
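(A footnote on the Musa reference, for readers without the book: the
curves I keep mentioning come from software reliability growth models. As
a rough illustration only - my notation, not a quotation from Musa - the
basic execution-time model takes the failure intensity after tau hours of
execution to be

    \lambda(\tau) = \lambda_0 \, \exp\!\left(-\frac{\lambda_0}{\nu_0}\,\tau\right)

where \lambda_0 is the initial failure intensity and \nu_0 is the total
number of failures expected over the life of the product. The point of
items (1) and (2) above is simply that nobody can plot such a curve for a
compiler, or plan to bend it downward, unless the failure data is being
collected and published in the first place.)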