From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level: 
X-Spam-Status: No, score=-0.8 required=5.0 tests=BAYES_00,INVALID_DATE,
	MSGID_SHORT autolearn=no autolearn_force=no version=3.4.4
Path: utzoo!mnetor!uunet!husc6!bbn!rochester!PT.CS.CMU.EDU!sei!sei.cmu.edu!firth
From: firth@sei.cmu.edu (Robert Firth)
Newsgroups: comp.lang.ada
Subject: Ada Equality
Message-ID: <3689@aw.sei.cmu.edu>
Date: 4 Jan 88 22:38:11 GMT
Sender: netnews@sei.cmu.edu
List-Id: <comp.lang.ada>


	Ada and "="
	-----------

The discussions on User Defined Equality, and its friend User
Defined Assignment, were long, repetitious, and inconclusive.
It would be difficult indeed to give an accurate summary of
this fragment of Ada design history, and impossible for a
vociferous participant to give an unbiased one, so please
regard this as an incomplete and possibly distorted view.

Having warned you at the start, I shall feel free to present
both fact and opinion in a continuous narrative.

----

The basic idea of user-defined equality seems easy:

	function "=" (Left,Right : T) return Boolean is ...

Indeed, this much IS easy - there is no technical reason for
any restrictions on the use of "=" as a mere function designator.

The difficulties arise elsewhere.  Consider first this argument:

(A) We expect equality to have certain properties.  For instance,
    it is either True or False.  It is symmetric, so if X=Y then
    Y=X.  It is transitive, so if X=Y and Y=Z then X=Z.  If we
    simply allow the user to redefine "=" to mean anything at all,
    eg string concatenation, stack overflow check, or chipmunk
    feeding, then chaos will result.

One vestige of this argument remains in current Ada: the rule that
any definition of equality must return a Boolean result.  The other
problems disappeared, not so much because of a rush of good sense as
because most other proposed restrictions proved unenforceable. 

This argument carries no weight with me.  One reason is that many of
these supposed expectations are specious.  It is certainly legitimate
for an equality test to return "Unknown" as a result.  Some numerical
analysts like to define equality as "fuzzy" equality, which violates
transitivity but is nevertheless a reasonable thing to do.

Finally, the argument rests ultimately on magical thinking: that "="
is the True Name for equality, and misusing it will evoke demons.  Now
I admit that a programmer who uses "=" to name an operation other than
abstract equality, or who uses SQRT to name something other than
square root, is perhaps unwise.  But it is not the purpose of a
programming language to distinguish wisdom from folly.

----

A second argument is related to the first:

(B) There is a relationship between equality and assignment, such that,
    after the assignment X:=Y, it must be true that X=Y.

This continues in two ways: either we forbid user-defined equality, or we
permit user-defined assignment; in the former case the language can enforce
the alleged relationship, and in the latter case the user can choose to
preserve it.

This argument is one reason why Ada allows you to define equality only on
limited types, for which there is no visible assignment.  It is a slight
relaxation, therefore, of the restrictive rule: user-defined equality is
permitted only where assignment is forbidden.

Another relic of this kind of thinking is the rule that, when you define
"=", you also get "/=", with the converse meaning.  Discussion on this
issue ('But what if you redefine "not"?') threatened to lead to further
restrictions (eg on the ability to redefine "not"), so were prudently
abandoned by proponents of the permissive cause.

The permissive rule - allow also user-defined assignment - was, as you
might expect, widely urged.  We failed for several reasons.  One of
course was the natural inertia of language design.  A second, and rather
more important, was that we got into difficulty explaining how you would
pass the parameters to an assignment procedure.  If you pass them by value,
does not that create a vicious circle, since by-value implies assignment
of actual to formal.  But if you don't pass them by value, either you
have a special rule for ":=", or you allow the user to specify the
mechanism, which resurrects another long discussion that we thought
safely buried. (We were wrong, alas!)

A further point is that we do not attain full data-type abstraction by
adding ":=" to the list of things the user can specify.  We would need
also a user-defined declarator (":"), and user-definied initialisation
and finalisation procedures.  The problems Red had encountered while
trying to do all that were a real deterrent to our ambition, and on the
whole I think rightly so.

----

A third problem was the fact that some Ada language structures carried
implicit semantic assumptions, especially about equality.  Consider a
type GENDER with user-defined equality.  We now argue:

(C)	if Thing.Sex = Male then
	    blue_code;
	else
	    pink_code;
	end if;

    is quite clear, and can invoke the user-defined "=".  But we could
    also write the code as

	case Thing.Sex is
	    when Male   => blue_code;
	    when Female => pink_code;
	end case;

    Is this equivalent to the former code?  If not, why not; and if so,
    by what mechanism?

Clearly, we have problems.  Does the case statement use the user-defined
equality?  If so, we must be sure to define what happens if the function
is badly-behaved, eg returning True for everything or False for everything.
And in the latter event, what would a "when others" mean?  It seemed hard
to define these structures in such a way that the underlying semantic
operations could freely be redefined.

And here is a worse situation.  We have defined a type Person with a
discriminant component Sex:

	...
	case Sex is
	    when Male   => Bearded  : Boolean;
	    when Female => Pregnant : Boolean;
	end case;

Now, if we say

	if P.Sex = Male then...

do we guarantee the existence of P.Bearded?  If not, then the compiler's
constraint checking cannot be explained in Ada terms - it uses some hidden
"true" equality we can no longer see.  Otherwise, if we have defined "=" to
return always True, then we have the famous Variant Record Anomaly, "that
leaves its hapless victims both bearded and pregnant".

This argument was part of the motivation for permitting "=" to be defined
only for private types.  My own suggestion was equally ad-hoc but less
radical: rule that a type with redefined "=" is no longer discrete.  This
seemed to me to disallow use of that type in syntactic contexts that
carried assumptions about equality.

But that is a patch.  The underlying issue is that any programming language
assumes some immutable primitive concepts, and builds structures that in
some way embody those concepts.  Hence, we can drive abstraction - in
language terms, the ability of the user to define the primitive attributes
of an object - down only so far into the underlying semantic model, before
the language constructs begin to disintegrate.  One answer is to restrict
ourselves to constructs robust enough and well enough understood to support
full abstraction, which I believe today implies an applicative language.
[Yes, indeed, but would the customer have bought one?]

----

The final problem remains unresolved in Ada:

(D) Consider the declarations

	type COMPLEX is record
	    re,im : REAL;
	end record;

	Z1,Z2 : COMPLEX;

    Suppose the user has defined REAL with his own "fuzzy" equality.
    What is the meaning of

	Z1 = Z2;
    ?

This is the issue of the composability of equality.  I submit that the
answer the user expects is

	Z1 = Z2  means  Z1.re = Z2.re and then Z1.im = Z2.im

invoking, therefore, the user-defined equality between REALs.  But this
raises a lot of very controversial issues, including

. user redefinition of AND (and OR) - that is, of the composition mechanism

. "=" with result type other than Boolean

. order of evaluation of expressions

. partial versus total evaluation

I believe that "=" and ":=" indeed ought to compose properly, ie that the
inherited meaning for composite types ought to be composed from the meaning
for the component types.  But it is hard indeed to define that in a robust
manner.  In the event, we abandoned the attempt, so in Ada equality does
not compose.  If the user defines "=" for a component type, he must define
it again, explicitly, for all types composed from it.

----

Is there a short conclusion?  Perhaps this: the current situation is a
compromise that few liked, but nobody could prove infeasible.  Moreover,
any major improvement must solve the hard problems, and they were too
hard for us.  And there were many more important things to discuss than
"=", so we tended not to disturb a consensus, even one achieved by exhaustion.

Robert Firth