From mboxrd@z Thu Jan 1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level: 
X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00 autolearn=ham autolearn_force=no version=3.4.4
X-Google-Language: ENGLISH,ASCII-7-bit
X-Google-Thread: 103376,c1131ea1fcd630a
X-Google-Attributes: gid103376,public
From: Ken Garlington
Subject: Re: To Initialise or not
Date: 1996/05/13
Message-ID: <3196FBD9.6AA1@lmtas.lmco.com>
X-Deja-AN: 154585187
references: <318508FE.204B@sanders.lockheed.com> <318E3A26.3565@lmtas.lmco.com> <3190A8D3.3D53@lmtas.lmco.com>
content-type: text/plain; charset=us-ascii
organization: Lockheed Martin Tactical Aircraft Systems
mime-version: 1.0
newsgroups: comp.lang.ada
x-mailer: Mozilla 2.01 (Macintosh; I; 68K)
Date: 1996-05-13T00:00:00+00:00
List-Id: 

Robert A Duff wrote:
> 
> OK, this explains our disagreement. I would *not* be happy with such a
> language. You would, and would rely on default-initialization to
> T'First. IMHO, that just masks real bugs in the program.

The part that perplexes me about this statement (other than not being
able to figure out what kind of "bugs" are "masked") is that it only
seems to apply to the implicit action that occurs when an access type
is declared. For example, do you also avoid the following constructs?
If not, why not?

1. Use of "others" in an aggregate expression, case statement, etc.

2. Controlled types

3. Dispatching

4. Use of default values for components of a record type.

You did say:

> Good question. Ada doesn't do detection of uninit vars, so it's not
> surprising that aggregates have to be complete, even though some parts
> will never be used. Yes, I *do* think it would be valuable to have that
> information in the source code, but Ada doesn't provide any way to say
> that, except as a comment.

Why not have a coding convention that says, "you can only use others to
complete an aggregate"? Thus, if the value is meaningful to the
algorithm, it has to be explicitly identified in the aggregate. Isn't
this the same approach as you're taking for null?

How about requiring that others can only be used for case statements if
there is no action to be taken for the case selector value -- that is,
the "others" branch is not important to the algorithm?

All dispatching would have to be replaced by case statements, of
course, and you'd have to always put the initial value on the object
declaration, never the type.

All of these can be expressed directly in the language, so it's in the
source code, as you desire. If you are consistent in your approach,
then I can at least agree that you are expressing a philosophy
regarding the use of the language ("no implicit actions, since they
hide bugs"). If you're not consistent, then I don't understand why the
philosophy makes sense in some cases, but not others.

> IYHO, it is
> helpful, in that it avoids extra "useless" code. This is a
> philosophical difference that explains all the differences in detail
> (which I argue about endlessly below;-)).

However, I think it's "philosophy" that's backed up by experience in
maintaining code. Have you done experiments that showed the value of
adding extra code in this case?

> Let me be quite clear about what I would *like* in a language: Any type
> that's built in to the language, and is treated as a non-composite
> entity, should have run-time detection of uninit vars.

OK. Why do you feel that you can't achieve this in Ada?
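For scalars, at least, something like the following sketch seems within
reach today. (This is only a sketch, with names of my own invention; it
assumes a compiler that supports pragma Normalize_Scalars from the
Safety and Security annex, and whether an out-of-range "marker" bit
pattern exists for a given subtype is up to the implementation.)

   pragma Normalize_Scalars;
   --  Configuration pragma (Annex H): uninitialized scalars get a
   --  predictable, preferably invalid, initial value where possible.

   with Ada.Text_IO;
   procedure Detect_Uninit is
      subtype Percent is Integer range 0 .. 100;
      X : Percent;  --  deliberately never assigned
   begin
      --  Reading X'Valid is safe even when X is uninitialized
      --  (RM 13.9.2), so the check itself can't go wrong:
      if not X'Valid then
         Ada.Text_IO.Put_Line ("X holds an invalid (uninitialized) value");
      end if;
   end Detect_Uninit;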
With Ada 95, you have the ability to detect uninitialized scalars at
any point in the program, so long as the compiler can generate a
"marker" value for the object (and it supports the Safety and Security
annex, of course).

> If this run-time detection causes efficiency problems, or problems in
> interfacing to C, or to some hardware, or to anything else, the
> *programmer* (not the language designer) can turn it off.

Which you can do in Ada, for everything except access values. If your
argument is that _requiring_ default initialization of access values
can sometimes be inefficient, I guess I'd agree. I don't see how your
coding convention affects that situation, however. (In fact, if using a
dumb compiler, it could lead to _double_ initialization, worsening the
situation.)

> If I'm coding in a language that doesn't support the above ideal, then
> I'll probably code in a way *as* *if* the above were true, assuming
> that's safe (i.e. assuming that my assumptions are a subset of what's
> actually required by the language).

However, to do this consistently in Ada, you need to make a data
abstraction for each type for which you want such protection. Just
providing a redundant initialization fails to meet your goal.

> Come on Ken,
> don't you have *any* project-wide information that needs to be
> understood by all programmers who wish to modify the code?

With respect to coding standards? I would expect that a maintainer can
read and understand my code without reference to coding standards.
Furthermore, I would expect that my code would still be readable even
if a different set of coding standards were used.

> Do you
> really expect a programmer to jump in to the middle of the code, grab
> some random module, and start hacking on it without knowing anything
> about the complete product of which it is a part?

Absolutely not. They would need to understand the design and the
requirements from which the code was generated. However, they don't
need to know the coding standards. It may make the code more
maintainable in the future if they do maintain those standards, but
the goal should be the use of coding standards that make it easy to
read the code, not code that makes it necessary to read the coding
standards!

> It seems like your
> arguments here could be applied to *any* coding conventions -- do you
> really want to say that coding conventions are evil, and that any code
> that obeys the language standard should be OK?

Nope - I want to say what I just said: "the goal should be the use of
coding standards that make it easy to read the code, not code that
makes it necessary to read the coding standards!"

> E.g., "This project obeys the Ada Quality and Style Guide, which may be
> found at ."? Clearly, anybody meddling with the code
> *anywhere* in this project needs to know this fact.

Would that be the 83 standard, or the 95 standard? I would hope that
someone could use the 95 standard to maintain my code, whether or not I
used the 95 standard (or the 83 standard) to develop it. More
importantly, I would expect that anyone could read (and maintain) code
developed with either AQ&S, without knowledge of the AQ&S!

> But still, before meddling with the code on *my*
> project, you will be required to read the document that talks about
> project-wide conventions, and you will be required to obey those
> conventions (or else, if you think they're stupid, get them changed at
> the project level, rather than going off on your own and hacking however
> you like).
What happens if someone on another project reuses your code?

> Yawn. Yes, there's a possibility that somebody will type "nell" when
> they meant "null". This seems like a minor concern, since misspelling
> names is *always* a concern. For integers, if I write "X: Some_Int :=
> O;", I might have meant 0 instead of O. Big deal. So don't declare
> integer variables called "O", and don't declare pointers called "nell".

1. If explicit initialization to null provides a minor _benefit_, then
it should be a concern if this benefit is outweighed by a disadvantage,
even if it is a minor one. The fallacy in your integer initialization
comparison is that the minor disadvantage is outweighed by a _major_
benefit - namely, that the code may fail if the variable is
uninitialized! (Of course, if you're initializing integers to zero for
no good reason, then it's _also_ a big deal.)

2. Do you have something in your coding standards regarding the naming
of access objects, such that they don't have names similar to "null"?
If not, do you intend to add such a restriction based on this
discussion? What about tools to check for violations of this new
standard?

3. Several little disadvantages (assignment to the wrong value, reader
confusion, etc.) can equal a significant disadvantage.

> Yes, it does make sense to ask *that*. If the convention is truly
> useless, then it's bad to require that extra code.

Now we're getting somewhere! You just said that there _would_ be a
condition where it would be bad to require extra code. I think there
are several conditions where this is bad; I've given a couple of
examples:

1. Explicit initialization to null.

2. Forcing integer values to look like aggregates (on the spurious
grounds that it makes integer and array initial values "look more
consistent," as you have indicated you desire for integers and access
values).

So, it seems to me we're left with the argument, "is explicit
initialization to null one of those conditions?" Perhaps it would help
if you identified cases where you would legitimately object to adding
extra code, and then let's see if I can convince you that your coding
convention matches one of your cases...

> I did -- the doctor/patient thing. You didn't buy it.

Right, because Ada provides a way to do what you _really_ want to do,
and it doesn't involve use of your coding convention. If you really
want to plan for changing the representation of an access value to an
integer, then Ada gives you everything you need. If you want to have
two reserved values for a pointer (either represented as Ada access
values or integers), then you can do that, too. In fact, if you
re-write my example as a generic, you can get both features very
easily. And, with just a couple of comments on the generic explaining
what it's for (to permit the internal representation of the pointer to
change), the reader can easily grasp what's happening, without
reference to _any_ coding standard. Furthermore, the maintainer is
encouraged to continue to use the convention, since it's more work to
rewrite the code to remove it.

Your current coding convention, on the other hand, seems to be a fairly
"weak" (e.g., no support from the toolset) and unreliable way to
not-quite-achieve what you want. Why not go all the way?

(By the way, I misspoke in my example. Since you're only using the
integer value to access an array, you probably don't need the 'Valid
check. You just need to map Constraint_Error into whatever you're
doing for uninitialized values.)
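To be concrete, here is a minimal sketch of the kind of generic I have
in mind (all names here are hypothetical, not from any real library).
The default initial value of the record's Kind component gives every
pointer a detectable "uninitialized" state, and a second reserved state
stands for "no particular value" (null):

   generic
      type Designated is private;
   package Checked_Pointers is
      --  A pointer abstraction with two reserved states:
      --  "uninitialized" (the default, caught on every read) and
      --  "null" (which must be set explicitly).
      type Pointer is private;
      type Designated_Access is access Designated;

      Uninitialized_Error : exception;

      procedure Set_Null (P : in out Pointer);
      procedure Set      (P : in out Pointer; To : in Designated_Access);
      function  Is_Null  (P : Pointer) return Boolean;
      function  Deref    (P : Pointer) return Designated_Access;
   private
      type State is (Uninitialized, Null_State, Value_State);
      type Pointer is record
         Kind  : State := Uninitialized;  --  default = detectably uninit
         Value : Designated_Access := null;
      end record;
   end Checked_Pointers;

   package body Checked_Pointers is

      procedure Check (P : in Pointer) is
      begin
         if P.Kind = Uninitialized then
            raise Uninitialized_Error;  --  every read is checked, not
                                        --  just dereferences
         end if;
      end Check;

      procedure Set_Null (P : in out Pointer) is
      begin
         P := (Kind => Null_State, Value => null);
      end Set_Null;

      procedure Set (P : in out Pointer; To : in Designated_Access) is
      begin
         P := (Kind => Value_State, Value => To);
      end Set;

      function Is_Null (P : Pointer) return Boolean is
      begin
         Check (P);
         return P.Kind = Null_State;
      end Is_Null;

      function Deref (P : Pointer) return Designated_Access is
      begin
         Check (P);
         return P.Value;
      end Deref;

   end Checked_Pointers;

Since the representation is private, it could later be changed to an
integer index (with 'Valid or Constraint_Error doing the checking)
without touching any client code.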
If you had a coding (or, more likely, a design) convention that said,
"Use generic XXX to create pointer types, so that (a) two reserved
values are available for each pointer and (b) pointer representations
can be readily changed between Ada access types and other types," I'd
say that was a reasonable statement, if you believed either outcome was
reasonably likely to happen. In fact, my design standard (ADARTS)
supports thinking in these terms.

> This explains why you like my hypothetical language that initializes all
> integers to T'First.

No, I was referring to the disadvantages of using explicit nulls to
overwrite implicit nulls. If something is implicitly set in all cases,
then it's safer just to assume this behavior will occur in the source.
Less confusion, less chance for a coding error.

It might be safer if all scalars had out-of-range initial values.
However, my applications also have to run in real-time, so I would not
claim that this is a requirement for a safety-critical system. Useful,
perhaps.

> >What about loops? When I write "for J in Buffer'Range loop", there's
> >an implicit initialization of J to Buffer'First. Is there an issue here?
>
> No, there's no issue here. It is quite clear (and explicit) what J is
> initialized to.

What is Buffer'First?

> >It just seems like a strange way to represent data flow information.
>
> You prefer comments, I guess.

Certainly, comments could explicitly say what you meant to say, and
they could be placed at the point in the code where you need them.
However, I was thinking more in terms of declaration/use information
generated by a tool, or better yet, a graphical representation.

> Pragma Normalize_Scalars is nice, but it doesn't go far enough, for my
> taste, because (1) it doesn't work when there are no "extra bits", and
> (2) it doesn't require every read of an uninit scalar to be detected --
> the program has to actually do something that trips over that value, and
> causes some other constraint check to be violated. And, of course, it
> can't detect uninitialized access values, because any access value that
> is conceptually uninitialized is actually set to null by the
> implementation, thus masking any such bugs.

(1) You should be able to generate extra bits for any scalar object you
want to check, AFAIK.

(2) Note that, with an appropriate abstraction, you _can_ require every
read of an uninit scalar to be detected. You can also have this
abstraction cover access values.

(3) Also, note that the use of "null" for access values only detects
_dereferences_ of uninit pointers. You can still read the access object
in other ways, without implicit detection. To cover _all_ cases, you
need an abstraction.

(4) Most importantly, how does your coding convention help this
situation in any way?

> >> I claim that the coding convention we're arguing about is actually
> >> *more* KISS, since it treats integers and pointers alike.
> >
> >I disagree that consistency necessarily equates to simplicity.
>
> This is a huge difference in our philosophies. I won't quite say
> "necessarily", but "usually", consistency = simplicity. This is
> certainly one such case.

And you stated for another case of "consistency":

> Seems like a bogus argument, to me.
> I said integers and pointers should behave the same with respect to
> initialization. Composites should not.

Why not? Usually, consistency = simplicity, right?
Note also that "consistency" in one dimension (making access values
look like integers) can cause inconsistency in another dimension (see
my discussion of extending your philosophy to other areas of the coding
standards).

> >Well, assuming you don't resort to Unchecked_Conversion or something nasty
> >like that, any cases I can think of where you can use integers to reference
> >a memory location will automatically have bounds (e.g., as an array reference).
> >Pointers are unbounded in Ada 95, as far as I can tell -- there is no easy
> >way to check for "in range" in a machine-independent fashion. Therefore,
> >to avoid illegal references, you have to do something special, right?
>
> Ada 95 had to *add* some rules to achieve this, for integers. In Ada
> 83, "A(I) := 3;" will overwrite random memory if I is uninitialized.
> In Ada 95, it will either raise an exception, or overwrite some random
> component of A. Better than nothing, I suppose. The rule for "Ptr.all
> := 3;" could be the same, except that Ada 83 already said you can rely
> on Ptr being initialized to null.

Right - the expression should either write to a defined, allocated
subset of memory (the range of A) or it should raise an exception. (Is
there a case where an uninitialized integer even gets into the class of
a bounded error, or is it always run-time detectable?) However, in Ada
95, an access value can't have any such subset defined. (In Ada 83, you
could at least argue that it's restricted to the pool.) Therefore, I
can see where access values should be treated differently. (If null
were not a default initial value, I would assume access to an
uninitialized access value would be erroneous, and not even bounded,
right?)

> I still don't see how you translate a requirement to initialize, into a
> requirement to initialize with a weird aggregate with Hundreds
> components and so forth.

I translate a requirement to treat access values like integers on
initialization into a requirement to treat integers like arrays on
initialization. Seems very consistent to me!

> My point is that for one particular type, we want *some* things
> initialized to null, and we want *some* things to be initially undefined
> (and we wish our compiler would detect any errors).

However, earlier you said something a little different: that you wanted
two reserved values for each access type: one that represented
"uninitialized" and one that represented "no particular value."
Furthermore, you wanted all access types to have a default value of
"uninitialized", so that it could later be detected.

Given the way Ada is designed, you can certainly do that by mapping
"uninitialized" to "null" and "no particular value" to some other
(non-null) allocated value used only for that purpose. You can't do it
by using null in two different contexts, as far as I can tell.
Furthermore, to get your behavior for "uninitialized", you want the
language to do default initialization to "uninitialized." Ada does that
for access values.

> >> >can you identify a case where null _is_ a
> >> >reasonable initial value, but not a reasonable initial _default_ value?
>
> >1. When I read the Patient data structure, how important to me is it to know
> >that Doctor should never be null? Wouldn't this be better as a comment (or even an
> >assertion) within the code which creates Patients, since that's probably
> >where I need to know this information?
>
> As a comment? Surely you're not going to claim that comments are more
> reliable than coding conventions?!
I'm not sure how reliability got into this discussion, since your
convention isn't reliable (it isn't checked in any way), right? I would
certainly claim that an assertion is more reliable than your coding
convention (particularly in the context of the abstraction I showed),
since it _does_ involve an explicit check. If you meant "readable,"
then yes - if you gave me the choice of Hungarian notation or comments,
I'd probably choose comments.

> As an assertion when the Patient is created? No, it's an *invariant*,
> which needs to be understood by the person writing the code to create
> one, and also by the person writing the code to look at one.

Great! _Check_ it as an invariant, then.

> When writing code, you need to know whether
> "Sue_Doctor(Some_Patient.My_Doctor)" is valid -- you need to know
> whether you have to write:
>
>    if Some_Patient.My_Doctor = null then
>       Dont_Bother; -- Patient was self-medicating; no one to blame.
>    else
>       Sue_Doctor(Some_Patient.My_Doctor);
>    end if;
>
> instead.

Would it be better to express this knowledge:

1. As an explicit initialization on the data structure?

2. As a comment in the specification of Sue_Doctor?

3. As a run-time check at the beginning of Sue_Doctor? (There's a
sketch of this near the end of this message.)

I can see #2 or #3. I have a hard time seeing how the maintainer would
think to review the data structure for Some_Patient in order to know
whether or not to do a check for null on My_Doctor.

For that matter, what about:

   Contact_Doctor(Some_Patient.My_Doctor);

Doesn't it have to have the same behavior as Sue_Doctor, for your
coding convention to work? What if Contact_Doctor should contact the
patient if there is no doctor?

> >2. Suppose I change the application slightly, such that a Patient is created
> >when they enter the waiting room. In this case, it may be quite valid for a
> >Patient to exist, but not have an associated Doctor. In order to maintain
> >your coding convention, the maintainer must go back to the Patient record and
> >add an explicit initialization. Is this likely to happen, particularly if the
> >code is separated from the data structure, and nothing will happen functionally
> >if the maintainer fails to do this? Or will your coding convention "drift" over
> >time, causing more confusion?
>
> If the application is different, then yes, you have to change the code.
> If you used comments, instead, then you'd have to change the comments.
> Either way, there's a danger that the programmer will forget to do it.
> This is the nature of comments, which is the same nature as un-enforced
> coding conventions. I don't see any way around that, except to outlaw
> both comments and coding conventions.

See my example. You can use Ada to enforce the pointer abstraction,
consolidate comments in one useful location, and still have coding
conventions!

> Agreed. What's the alternative? Comments? Same maintenance problem.
> Don't bother? Well, there is some useful information here, do you
> really want to hide it from the users of this type?

Third alternative: Abstraction.

> I guess we'll have to agree to disagree. If you work on *my* project,
> you'll have to obey *my* conventions. If I work on *your* project I
> will, of course, obey your conventions.

Again, what happens when your code hits my project (or the other way
around)? Wouldn't it be better if the code were readily maintained
without reference to coding conventions? What happens if you decide to
change your coding conventions? Aren't you restricted, since some of
your code may no longer make sense if you change a convention?
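Incidentally, by the "run-time check" in option #3 above, I mean
something like this sketch (No_Doctor is a hypothetical exception, and
the Doctor_Access type is assumed from your Patient example):

   No_Doctor : exception;

   procedure Sue_Doctor (Whom : in Doctor_Access) is
   begin
      --  The invariant ("this patient has a doctor") is checked here,
      --  at the point of use, instead of being implied by an
      --  initialization convention back at the record declaration.
      if Whom = null then
         raise No_Doctor;
      end if;
      --  ... proceed, knowing Whom designates a real Doctor.
   end Sue_Doctor;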
> Either way, the maintainer can
> benefit from knowing what the coding conventions are.

True, but you go further: the maintainer is _required_ to know the
coding conventions to understand what your code is trying to say. I
would think that this should not be required.

-- 
LMTAS - "Our Brand Means Quality"