comp.lang.ada
From: "Randy Brukardt" <randy@rrsoftware.com>
Subject: Re: Formal Subprogram Access
Date: Thu, 15 Feb 2018 17:03:12 -0600
Message-ID: <p653jh$dbq$1@franka.jacob-sparre.dk>
In-Reply-To: <p633tm$cur$1@franka.jacob-sparre.dk>

I wrote:
...
> One of the big things learned from these automatic grading tools is that 
> it is really easy for junk results to creep into typical ACATS grading 
> setups (which usually depend on comparing against known-good results). I 
> found 4 ACATS tests that were marked as passing for Janus/Ada that 
> actually failed. Two of those actually reflected compiler bugs introduced 
> in recent years (both easy to fix, thank goodness), one was a batch file 
> problem, and one probably was just left off of the to-do list (but of 
> course if it isn't on the to-do list, it isn't very likely to ever be 
> worked on). Thus I'm not too surprised to find similar things for GNAT.
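
(To make that failure mode concrete: a typical known-good setup reduces to 
a line-by-line comparison of the current run against a saved baseline, 
roughly like the hypothetical Ada sketch below. The program and file names 
are invented for illustration - this is not the actual ACATS grading tool. 
Nothing in such a comparison can distinguish a correct baseline from one 
that silently captured a failure, which is exactly how junk results creep 
in.)

   with Ada.Text_IO; use Ada.Text_IO;

   --  Hypothetical sketch, not the real grading tool: compare a test
   --  run against a saved "known-good" baseline, line by line.
   procedure Naive_Compare is
      Actual, Expected : File_Type;
      Line_No          : Natural := 0;
   begin
      Open (Actual,   In_File, "test_run.log");    -- invented file names
      Open (Expected, In_File, "known_good.log");
      while not (End_Of_File (Actual) or End_Of_File (Expected)) loop
         Line_No := Line_No + 1;
         if Get_Line (Actual) /= Get_Line (Expected) then
            Put_Line ("Mismatch at line" & Natural'Image (Line_No));
         end if;
      end loop;
      if End_Of_File (Actual) /= End_Of_File (Expected) then
         Put_Line ("Output lengths differ");
      end if;
      Close (Actual);
      Close (Expected);
   end Naive_Compare;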

I should clarify a bit about this. Just because the automated tools report 
GNAT as failing on some particular target, that doesn't mean that GNAT 
actually would fail a formal conformity assessment. There are a number of 
other factors in play.

First, the fully automated testing tool can only handle "usual" tests; those 
that require special handling have to be run specially. For my GNAT tools, 
this includes things like the tests that contain foreign-language code (the 
ACATS grading tools ignore such code, as it isn't relevant to grading - the 
only thing that matters about it is that it was compiled, and it isn't worth 
anyone's time to try to automate detection of non-Ada compilers and non-Ada 
source code). It also includes all of the Annex E tests that require actual 
partitioning (I have no personal interest in figuring out how to configure 
those).
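
(To picture the foreign-language case, here is a hypothetical fragment in 
the spirit of an Annex B test; the C function and its name are invented. 
The Ada side compiles and grades as usual, while the C half lives in a 
separate file that some non-Ada compiler must build - precisely the part 
that isn't worth automating.)

   with Ada.Text_IO;
   with Interfaces.C; use Interfaces.C;

   procedure Mixed_Language_Sketch is
      --  "twice" would be implemented in a separate .c file, compiled
      --  outside the Ada toolchain; the grading tool ignores that file.
      function Twice (X : int) return int
        with Import, Convention => C, External_Name => "twice";
   begin
      Ada.Text_IO.Put_Line (int'Image (Twice (21)));  -- expect " 42"
   end Mixed_Language_Sketch;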

For a formal conformity assessment, many of these special-handling cases 
would be run using custom scripts (rather than the scripts generated 
automatically by the tools); the results of running those scripts could still 
be graded with the usual grading tool. This would handle examples like the 
cases above, as well as any other tests that need special options to be 
processed correctly.

Moreover, an implementer doing a formal test would have the opportunity to 
challenge the grading of any test, potentially to explain why it should be 
considered "Passed", to suggest that it be run specially, or even to argue 
that it does not appropriately reflect the rules of Ada. These test disputes 
would be discussed with the ACAL (the formal tester) and possibly with the 
ACAA Technical Agent (that's me). This process can result in modified 
grading requirements for that implementer or for all ACATS users, or even 
the permanent removal of a test from the test suite.

Additionally, the ACATS grading tool enforces a rather strict view of the 
ACATS grading standards. It's quite likely that a B-Test it reports as failed 
would actually be graded as passed by a human, because the error message is 
"close enough" to the required location. Moreover, a human can read the 
contents of an error message while the grading tool makes no attempt to do 
that. (I've spent some time improving existing ACATS tests so that the 
grading tools are more likely to be able to grade them successfully; but 
doing that for the entire test suite is not going to be a good use of 
limited resources.)
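
(To show what "close enough" is about, here is a hypothetical fragment in 
the general style of an ACATS B-Test; the name and the error comment wording 
are invented, and real tests are more elaborate. The expected-error location 
is marked in the source, and the grading question is whether the compiler's 
diagnostic lands on the marked line.)

   procedure B_Sketch is                      -- invented test name
      X : constant Integer := "oops";         -- ERROR: string given for Integer
   begin
      null;
   end B_Sketch;

A compiler that flags the marked line clearly passes; one whose message 
points a line or two away may read as obviously correct to a human grader 
yet fall outside what a strict tool will accept.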

To summarize, just because an automatic test run grades some tests as failed 
doesn't necessarily mean that those tests would be graded as failed in a 
formal conformity assessment. More simply, some failures in an automatic 
test run don't mean that the compiler can't pass conformity assessment.

                                       Randy.

P.S. It should be noted that I did most of this GNAT tools work on my own 
time, and not in an official capacity as ACAA Technical Agent. If I had done 
it officially, I wouldn't be allowed to talk about it (which would have 
defeated the purpose of building the tools).




