From: bobduff@world.std.com (Robert A Duff)
Subject: Re: 'size attribute inheritance
Date: 1997/09/03
References: <33ECF679.4B5D@lmco.com>
Organization: The World Public Access UNIX, Brookline, MA
Newsgroups: comp.lang.ada

In article , Robert Dewar wrote:

I said:

><<
> with Text_IO;
> procedure Main is
>    subtype S is Integer range 1..10;
>    X, Y: S; -- uninitialized
> begin
>    X := Y;
>    Text_IO.Put(Integer'Image(X));
> end Main;
>
>The above program must either print out a value in the range 1 to 10,
>or else raise C_E.  The above program must *not* print out the number
>11, for example.>>

Robert replied:

>I disagree with this analysis for two reasons.

I must admit that Robert is correct to disagree.  I was wrong about
what the RM says, and what it was intended to say.

>1. Requiring range checks for the assignment of identical subtypes would
>   damage code quality severely.

That part, I don't buy, and I won't buy it unless I see measurements
of real code.  My wild guess is that the damage would be not much
worse than the damage due to any of the other range checks and
whatnot defined by the language.  Robert's wild guess is that the
damage would be much worse.  The only way to know who's right would
be to implement a compiler that does the "extra" checks, and measure
the speed of typical programs with and without those checks.

>... In the most common cases (array elements
>   and procedure parameters), it is impossible for a compiler to prove
>   that data is initialized, so these checks would be all over the place.

I'm not sure what "most common" means here.  People don't usually
copy arrays one element at a time; they use whole-array assignment,
which happily copies uninitialized components just fine (and it must
-- the RM is clear on that point, and that's good).  As for
parameters, the compiler ought to do the checks at the call site, so
that inside the subprogram body, it can assume that the variable is
within its subtype.  At the call site, more information is usually
available, so the check can often be omitted.

>   While it is clear that damage can be done with some uses of uninitialized
>   variables, e.g. using them in case statements, so that checks are needed
>   there, the damage of a simple copy seems minimal, and the design idea
>   behind the Ada 95 changes was to avoid extreme damage from erroneous
>   constructs in Ada 83 where practical, without introducing unacceptable
>   overhead.

Agreed (now).

>   If the language did require checks in such cases, then to me, it would
>   be a clear mistake in the language requirements (we have found a few
>   so far, the RM is not infallible), and would have to be fixed.
>
>2. The RM, in section 13.9.1, seems clear enough:

I disagree that it's *clear*, but I can agree with Robert's reading
below.  Note that the key thing is when the representation of the
object represents a value of its *type*, but not its *subtype*.  The
RM says that the values of an integer type are the *infinite* set of
integers (the ones we all learned about in grade school, before being
polluted by computers ;-)).

So if we have "type T is range 1..10;", the values of subtype T are
1..10, but the values of the *type* are all the integers.  So if a
given bit pattern represents 11, then it's a value of the type, but
not the subtype, and must raise C_E when assigned to a variable of
subtype T.  However, the compiler is free to claim that this pattern
does not represent 11 (even though you might think it does).  In that
case, no C_E need be raised, which is Robert's point.  It gets more
interesting with floating point, because an uninitialized variable
might contain a NaN.
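(To make the type-versus-subtype distinction concrete, here's a
little sketch -- not from the original discussion -- using T'Base,
whose range is the implementation-defined base range of the type:

    with Text_IO;
    procedure Type_Vs_Subtype is
       type T is range 1..10;
       X : T'Base := 11;  -- legal: 11 is a value of the *type*,
                          -- assuming the base range includes 11,
                          -- as it will on any practical machine
       Y : T;
    begin
       Y := X;  -- range check fails: 11 is not in the *subtype*
       Text_IO.Put(T'Image(Y));
    exception
       when Constraint_Error =>
          Text_IO.Put("C_E, as required");
    end Type_Vs_Subtype;

Here the compiler can't pretend the bit pattern doesn't represent 11
-- we stored 11 on purpose, so X is valid -- and therefore the range
check on "Y := X;" must fail.)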
> Bounded (Run-Time) Errors
>
>9  If the representation of a scalar object does not represent a value
>   of the object's subtype (perhaps because the object was not
>   initialized), the object is said to have an invalid representation.
>   It is a bounded error to evaluate the value of such an object.  If
>   the error is detected, either Constraint_Error or Program_Error is
>   raised.  Otherwise, execution continues using the invalid
>   representation.  The rules of the language outside this subclause
>   assume that all objects have valid representations.  The semantics
>   of operations on invalid representations are as follows:
>
>10   If the representation of the object represents a value of the
>     object's type, the value of the type is used.
>
>11   If the representation of the object does not represent a value of
>     the object's type, the semantics of operations on such
>     representations is implementation-defined, but does not by itself
>     lead to erroneous or unpredictable execution, or to other objects
>     becoming abnormal.
>
>If we take the test program, it seems quite reasonable for an
>implementation to define the following behavior:
>
>   An assignment of an uninitialized value (that may be out of range)
>   copies the (possibly out of range) value into the target.
>
>   The conversion of an out of range value to a subtype that
>   legitimately includes the value simply gives the expected value,
>   which is now in range and hence cannot "lead to erroneous or
>   unpredictable execution".

For the above reasoning to work, one must claim that all bit patterns
(other than the ones that represent 1..10) do not represent integer
values at all.  That is, the bit pattern (on a 32-bit machine):

    "0000 0000 0000 0000 0000 0000 0000 1011"

does *not* represent 11.  It represents some non-Integer thing.  I
find that interpretation slightly surprising, but Tucker assured me
(in private e-mail) that the *intent* of the RM agrees with what
Robert says here.

>One could I suppose argue that the output is unpredictable, but I would
>disagree.  The predictable outcome of the above program is that some
>value in the range of Integer will be output.  Yes, the output is
>non-deterministic, as are many legitimate Ada 95 programs, but
>non-deterministic is not the same as unpredictable!

Agreed.

>Bob Duff and I have always disagreed in this area.  His balance of
>thinking is far over on the side of checking everything and to heck
>with the efficiency consequences.  I take a more balanced (:-) view
>which worries about the efficiency of generated code more.

We've always disagreed on what the rule *should* say -- but I would
hope we could agree on what it *does* say.

(FWIW, I think the rule should be that any read of an uninit scalar
raises an exception.  Yes, this would be a big efficiency hit.  But
it would catch bugs, and if you don't like it for efficiency reasons,
you could pragma-Suppress it.  You'd also want to avoid such checks
when interfacing to the outside world.  Clearly, this is not what the
RM says, and I certainly wouldn't claim it does.  I was hoping that
my opinions about what it *ought* to say did not color what it *does*
say.)
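(Ada 95 does let you ask for this kind of check by hand, via the
Valid attribute of RM 13.9.2 -- evaluating Y'Valid is defined even
when Y is uninitialized, and is not itself a bounded error.  A small
sketch, again not from the original discussion:

    with Text_IO;
    procedure Check_Valid is
       subtype S is Integer range 1..10;
       X, Y : S;  -- Y deliberately left uninitialized
    begin
       if Y'Valid then
          X := Y;  -- known to be in range; no check needed
          Text_IO.Put(Integer'Image(X));
       else
          Text_IO.Put("Y has an invalid representation");
       end if;
    end Check_Valid;

Of course, for subtype S of Integer, every 32-bit pattern represents
some Integer, so Y'Valid here just tests whether the junk in Y
happens to lie in 1..10.)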
>That being said, I think there are good arguments for making
>Normalize_Scalars push a compiler into a more aggressive mode when it
>comes to detecting out of range values.

I agree (there are good arguments), but none of those arguments are
supported by the RM, which is unfortunate.

- Bob