From mboxrd@z Thu Jan 1 00:00:00 1970 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me X-Spam-Level: X-Spam-Status: No, score=-0.8 required=5.0 tests=BAYES_00,INVALID_DATE, MSGID_SHORT autolearn=no autolearn_force=no version=3.4.4 Path: utzoo!mnetor!uunet!husc6!bbn!rochester!PT.CS.CMU.EDU!sei!sei.cmu.edu!firth From: firth@sei.cmu.edu (Robert Firth) Newsgroups: comp.lang.ada Subject: Ada Equality Message-ID: <3689@aw.sei.cmu.edu> Date: 4 Jan 88 22:38:11 GMT Sender: netnews@sei.cmu.edu List-Id: Ada and "=" ----------- The discussions on User Defined Equality, and its friend User Defined Assignment, were long, repetitious, and inconclusive. It would be difficult indeed to give an accurate summary of this fragment of Ada design history, and impossible for a vociferous participant to give an unbiased one, so please regard this as an incomplete and possibly distorted view. Having warned you at the start, I shall feel free to present both fact and opinion in a continuous narrative. ---- The basic idea of user-defined equality seems easy: function "=" (Left,Right : T) return Boolean is ... Indeed, this much IS easy - there is no technical reason for any restrictions on the use of "=" as a mere function designator. The difficulties arise elsewhere. Consider first this argument: (A) We expect equality to have certain properties. For instance, it is either True or False. It is symmetric, so if X=Y then Y=X. It is transitive, so if X=Y and Y=Z then X=Z. If we simply allow the user to redefine "=" to mean anything at all, eg string concatenation, stack overflow check, or chipmunk feeding, then chaos will result. One vestige of this argument remains in current Ada: the rule that any definition of equality must return a Boolean result. The other problems disappeared, not so much because of a rush of good sense as because most other proposed restrictions proved unenforceable. This argument carries no weight with me. One reason is that many of these supposed expectations are specious. It is certainly legitimate for an equality test to return "Unknown" as a result. Some numerical analysts like to define equality as "fuzzy" equality, which violates transitivity but is nevertheless a reasonable thing to do. Finally, the argument rests ultimately on magical thinking: that "=" is the True Name for equality, and misusing it will evoke demons. Now I admit that a programmer who uses "=" to name an operation other than abstract equality, or who uses SQRT to name something other than square root, is perhaps unwise. But it is not the purpose of a programming language to distinguish wisdom from folly. ---- A second argument is related to the first: (B) There is a relationship between equality and assignment, such that, after the assignment X:=Y, it must be true that X=Y. This continues in two ways: either we forbid user-defined equality, or we permit user-defined assignment; in the former case the language can enforce the alleged relationship, and in the latter case the user can choose to preserve it. This argument is one reason why Ada allows you to define equality only on limited types, for which there is no visible assignment. It is a slight relaxation, therefore, of the restrictive rule: user-defined equality is permitted only where assignment is forbidden. Another relic of this kind of thinking is the rule that, when you define "=", you also get "/=", with the converse meaning. Discussion on this issue ('But what if you redefine "not"?') threatened to lead to further restrictions (eg on the ability to redefine "not"), so were prudently abandoned by proponents of the permissive cause. The permissive rule - allow also user-defined assignment - was, as you might expect, widely urged. We failed for several reasons. One of course was the natural inertia of language design. A second, and rather more important, was that we got into difficulty explaining how you would pass the parameters to an assignment procedure. If you pass them by value, does not that create a vicious circle, since by-value implies assignment of actual to formal. But if you don't pass them by value, either you have a special rule for ":=", or you allow the user to specify the mechanism, which resurrects another long discussion that we thought safely buried. (We were wrong, alas!) A further point is that we do not attain full data-type abstraction by adding ":=" to the list of things the user can specify. We would need also a user-defined declarator (":"), and user-definied initialisation and finalisation procedures. The problems Red had encountered while trying to do all that were a real deterrent to our ambition, and on the whole I think rightly so. ---- A third problem was the fact that some Ada language structures carried implicit semantic assumptions, especially about equality. Consider a type GENDER with user-defined equality. We now argue: (C) if Thing.Sex = Male then blue_code; else pink_code; end if; is quite clear, and can invoke the user-defined "=". But we could also write the code as case Thing.Sex is when Male => blue_code; when Female => pink_code; end case; Is this equivalent to the former code? If not, why not; and if so, by what mechanism? Clearly, we have problems. Does the case statement use the user-defined equality? If so, we must be sure to define what happens if the function is badly-behaved, eg returning True for everything or False for everything. And in the latter event, what would a "when others" mean? It seemed hard to define these structures in such a way that the underlying semantic operations could freely be redefined. And here is a worse situation. We have defined a type Person with a discriminant component Sex: ... case Sex is when Male => Bearded : Boolean; when Female => Pregnant : Boolean; end case; Now, if we say if P.Sex = Male then... do we guarantee the existence of P.Bearded? If not, then the compiler's constraint checking cannot be explained in Ada terms - it uses some hidden "true" equality we can no longer see. Otherwise, if we have defined "=" to return always True, then we have the famous Variant Record Anomaly, "that leaves its hapless victims both bearded and pregnant". This argument was part of the motivation for permitting "=" to be defined only for private types. My own suggestion was equally ad-hoc but less radical: rule that a type with redefined "=" is no longer discrete. This seemed to me to disallow use of that type in syntactic contexts that carried assumptions about equality. But that is a patch. The underlying issue is that any programming language assumes some immutable primitive concepts, and builds structures that in some way embody those concepts. Hence, we can drive abstraction - in language terms, the ability of the user to define the primitive attributes of an object - down only so far into the underlying semantic model, before the language constructs begin to disintegrate. One answer is to restrict ourselves to constructs robust enough and well enough understood to support full abstraction, which I believe today implies an applicative language. [Yes, indeed, but would the customer have bought one?] ---- The final problem remains unresolved in Ada: (D) Consider the declarations type COMPLEX is record re,im : REAL; end record; Z1,Z2 : COMPLEX; Suppose the user has defined REAL with his own "fuzzy" equality. What is the meaning of Z1 = Z2; ? This is the issue of the composability of equality. I submit that the answer the user expects is Z1 = Z2 means Z1.re = Z2.re and then Z1.im = Z2.im invoking, therefore, the user-defined equality between REALs. But this raises a lot of very controversial issues, including . user redefinition of AND (and OR) - that is, of the composition mechanism . "=" with result type other than Boolean . order of evaluation of expressions . partial versus total evaluation I believe that "=" and ":=" indeed ought to compose properly, ie that the inherited meaning for composite types ought to be composed from the meaning for the component types. But it is hard indeed to define that in a robust manner. In the event, we abandoned the attempt, so in Ada equality does not compose. If the user defines "=" for a component type, he must define it again, explicitly, for all types composed from it. ---- Is there a short conclusion? Perhaps this: the current situation is a compromise that few liked, but nobody could prove infeasible. Moreover, any major improvement must solve the hard problems, and they were too hard for us. And there were many more important things to discuss than "=", so we tended not to disturb a consensus, even one achieved by exhaustion. Robert Firth