From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level: 
X-Spam-Status: No, score=-0.3 required=5.0 tests=BAYES_00,
	REPLYTO_WITHOUT_TO_CC autolearn=no autolearn_force=no version=3.4.4
X-Google-Language: ENGLISH,ASCII-7-bit
X-Google-Thread: 1108a1,c52c30d32b866eae
X-Google-Attributes: gid1108a1,public
X-Google-Thread: fac41,c52c30d32b866eae
X-Google-Attributes: gidfac41,public
X-Google-Thread: 103376,2ea02452876a15e1
X-Google-Attributes: gid103376,public
From: ncohen@watson.ibm.com (Norman H. Cohen)
Subject: Re: Real OO
Date: 1996/03/29
Message-ID: <4jh5cr$101s@watnews1.watson.ibm.com>
X-Deja-AN: 144870843
distribution: world
references: <4id031$cf9@dayuc.dayton.saic.com> <DonJqn.755@assip.csasyd.oz>
  <4iuvmi$113p@watnews1.watson.ibm.com> <65O34ib-3RB@herold.franken.de>
  <4j9p3a$uo5@watnews1.watson.ibm.com> <65h8jGx-3RB@herold.franken.de>
organization: IBM T.J. Watson Research Center
reply-to: ncohen@watson.ibm.com
newsgroups: comp.lang.eiffel,comp.lang.ada,comp.object
Date: 1996-03-29T00:00:00+00:00
List-Id: <comp.lang.ada>

In article <65h8jGx-3RB@herold.franken.de>, jhd@herold.franken.de
(Joachim Durchholz) writes: 

|> Eiffel references ( = pointers for this discussion) always have a static
|> type that can be inferred by the compiler. A pointer might be more
|> restricted, but then it should be declared to be of the more restricted
|> type in the beginning.
|> Pointer-induced aliasing is possible in Eiffel (and in fact creates
|> problems in other areas), but even then the referred-to objects is 100%
|> guaranteed to conform to the declared type.
|> No data-flow analysis necessary.

The data-flow analysis is required to perform optimizations based on
knowledge that an objects dynamic type is precisely its static type AND
NOT some more restricted type.  There is no way to express this knowledge
with an Eiffel declaration.

|> > The analysis required is expensive, requiring
|> > interprocedural analysis to be effective,
|>
|> True. And one of the places where a typical Eiffel compiler would spend
|> time. But I don't think this is a real problem - the Eiffel compiler will
|> spend less time than the programmer thinking about wether he can really
|> assume what he's telling the compiler. (And even less time if the
|> programmer goes bug-hunting if the programmer's assumption was wrong.)

Typically, a programmer comes upon this knowledge (that the dynamic type
is exactly the specific type) because it is an obvious consequence of the
algorithm, not because he has performed a sophsiticated data flow
analysis by hand!  If an Ada programmer is mistaken, he will be told so
by the failure of a tag check, at precisely the site of the attempt to
assign or pass a value with one dynamic type to a variable declared to be
of some different specific type.  No hunting necessary.

|>
|> > and the payoff
|> > is rare, so it is not clear that it is worthwhile for an optimizing
|> > compiler to perform this analysis routinely.
|>
|> My Booch says that it is usually possible to replace 80% of all calls by
|> static (non-polymorphic) ones. If I can guarantee that I catch all cases,
|> this is worthwile. (And Eiffel allows me to statically determine the set
|> of possible dynamic types, so no problem there.)

Well that depends largely on your Booch's coding style.  The results
presented by Calder and Grunwald in their 1994 POPL paper are different.
Among other things, they measured: 

   1. the percentage of indirect function calls executed during their
      experiments that invoked methods that were never overridden
      (a condition that it is, in principle, straightforward to detect
      at link time, but not sooner)

   2. the percentage of indirect function calls executed during their
      experiments at call sites from which only one target was invoked
      during THAT execution of the program

(Notice that in each case they counted executions of calls, not the
number of call sites in the text of the program.) Some of the calls
counted in Measurement 2 may have been from call sites where a
sophisticated interprocedural data flow analysis would have detected that
only one method version was reachable; but some of these calls may have
been from call sites that would have invoked different method versions in
other executions of the program, with different input data.  Calder and
Grunwald describe Measurement 2 as an upper bound on the number of
dynamic calls that could be turned into static calls by powerful
interprocedural optimizations, and Measurement 1 as the lower bound we
would expect of such optimizations, since that much improvement can be
achieved even by a naive interprocedural optimization.  The mean values,
over the programs they measured, were 31.6% for Measurement 1 and 66.3%
for Measurement 2, a far cry from your Booch's 80%.

Of course in Eiffel programs, even call sites where there is obviously
one target are necessarily programmed with indirect function calls, so we
would expect these percentages to be higher.  In Ada programs, call sites
where there is obviously one target are more likely to be programmed as
direct calls in the first place, so these percentages (in which the
denominator is the number of calls at the remaining dispatching call
sites) are likely to be lower.  In other words, the optimization will
seem more "worthwhile" to Eiffel programmers than to Ada programmers
because Eiffel programs need it more badly.

|> > In Ada, the programmer
|> > conveys, not through any special hint-giving pragma, but just through the
|> > natural course of declaring an object to belong to a specific type, that
|> > a call should not be dispatching.
|>
|> In Eiffel, the programmer "declares" an object to belong to a specific
|> type by not assigning any object of a descendant type to it. I think this
|> is less error-prone and requires less thinking about an otherwise
|> uninteresting detail.

Of course this is not a declaration at all.  It is a run-time property
that may or may not be determinable at compile time; and if determinable
at all, may be determinable only by sophisticated analysis.

I wrote: 

|> > |> > Making it classwide simply documents the fact that the algorithm does
|> > |> > not depend on the tag of the parameter, and causes the compiler to
|> > |> > catch any attempt to override.

Someone (Joachim?) responded: 

|> > |> This assumes that the writer of a module knows that his routines never
|> > |> need to be overridden.

I responded: 

|> > No, it assumes that the writer of a module knows that a particular
|> > routine in the module accomplishes its purpose without directly
|> > exploiting knowledge of a given object's dynamic type, or tag.  (The
|> > routine may invoke other routines that dispatch based on the dynamic
|> > type, but it does not itself differentiate among different dyanamic
|> > types.)

Joachim responded: 

|> Sorry to object, but it *does* assume that the routine will never be
|> overridden. This includes cases where the algorithm in the routine isn't
|> appropriate for some descendant type - if the routine can't be changed
|> (overridden) then, the whole class will have to be implemented.

But the routine was written in the first place not to rely on distinctive
properties of different descendants, so the algorithm must be
appropriate for any new descendant type.  In a previous post, I wrote

|> > A classwide routine is, by definition, one whose
|> > correctness does not depend on the dynamic types of the objects it is
|> > dealing with.

and Joachim responded: 

|> True. But then, how do I know that? I think cases where I'm sure that this
|> will hold are very rare. And there is always the possibility that somebody
|> will extend the class in a way that I never anticipated, and in such a
|> case, it is quite likely that routines that I considered classwide are
|> anything but classwide in the new design.

Formally, you know that because the correctness of the classwide routine
depends only on the specifications (preconditions and postconditions) of
the (root versions of) the methods it invokes.  Proper OO methodology,
explicitly codified in Eiffel, requires that any overriding versions of
these methods preserve, and perhaps strengthen, these specifications.

Opportunities for classwide subprograms are not rare at all.  Here is an
example: There is an abstract type (in Eiffel terms, a deferred class)
for bank transactions.  It has descendant types for different kinds of
transactions (deposits, checks, withdrawals, fees, interest, etc.).
Among the methods of the root type are classwide functions returning the
date and amount of the transaction (classwide because they simply examine
components present in all descendants of the type) and a dispatching
function returning a string that describes the transaction (dispatching
because the information in the string, and the algorithm for formatting
it, depend on the kind of transaction).  By calling these three methods,
one can write a classwide routine that takes a transaction of unknown
dynamic type and a previous balance, prints a line of a monthly statement
corresponding to that transaction, and updates the running balance.
(This routine can be called in a loop to print a monthly statement, given
an array of transactions.)  The classwide routine's correctness
clearly does not depend on the dynamic type of the transaction with which
it is invoked.  The addition of a new kind of transaction cannot possibly
break the classwide routine (unless, of course, the type for the new
kind of transaction overrides the description-string function with a
broken version that returns, say, the lyrics of Waltzing Matilda instead
of a description of the transaction).

In a previous post, I suggested that, given a type Shape_Type meant to
represent abstract Euclidean entities, the best way to represent graphic
representations of shapes was by composition rather than inheritance: 

|> >    type Graphic_Type is tagged
|> >       record
|> >          Shape : Shape_Type;
|> >          Fill  : Fill_Type;
|> >       end record;

(Actually, that should have been: 

   type Shape_Pointer_Type is access Shape_Type'Class;

   type Graphic_Type is tagged
      record
         Shape : Shape_Pointer_Type;
         Fill  : Fill_Type;
      end record;

so that a Graphic_Type instance can contain--or, as we must make explicit
in Ada, point to--any kind of shape.)

I wrote: 

|> > If one is interested in congruence of Graphic_Type instances, that's a
|> > very different notion from congruence of abstract geometric figures, but
|> > it's easily implemented as a classwide operation: 
|> >
|> >    function Congruent
|> >       (Graphic_1, Graphic_2: Graphic_Type'Class) return Boolean is
|> >    begin
|> >       return Congruent (Graphic_1.Shape, Graphic_2.Shape);
|> >    end Congruent;

Joachim repsonded: 

|> Plus redefining all routines of Shape_Type for Graphic_Type, with similar
|> one-liners. And the Graphic_Type class would have to be revisited whenever
|> the Shape_Type class is extended with new functionality, to include the
|> new features.

You're right, if the application requires Graphic_Type to reflect all the
features of Shape_Type, not just the Congruent function, this solution is
inappropriate.

|> I'd think this is a classical example for when to use inheritance <g>.

No, the proper solution is to add the classwide function

   function Geometry_Of
      (Graphic: Graphic_Type'Class) return Shape_Pointer_Type is
   begin
      return Graphic.Shape;
   end Geometry_Of;

(and possibly a Set_Geometry_Of procedure).  Then, to test whether
Graphic_Type instances G1 and G2 have congruent polygons, we write

   Congruent (Geometry_Of(G1).all, Geometry_Of(G2).all)

--
Norman H. Cohen    ncohen@watson.ibm.com