From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level: 
X-Spam-Status: No, score=-0.3 required=5.0 tests=BAYES_00,
	REPLYTO_WITHOUT_TO_CC autolearn=no autolearn_force=no version=3.4.4
X-Google-Language: ENGLISH,ASCII-7-bit
X-Google-Thread: 103376,18f7f6e041b3e0bf
X-Google-Attributes: gid103376,public
X-Google-ArrivalTime: 2002-08-01 06:09:10 PST
Path: 
 archiver1.google.com!news1.google.com!newsfeed.stanford.edu!headwall.stanford.edu!fu-berlin.de!uni-berlin.de!dialin-145-254-040-041.arcor-ip.NET!not-for-mail
From: Dmitry A.Kazakov <mailbox@dmitry-kazakov.de>
Newsgroups: comp.lang.ada
Subject: Re: Dispatching and generics - language lawyer question
Date: Fri, 2 Aug 2002 03:15:07 +0200
Message-ID: <aibbth$133ln5$1@ID-77047.news.dfncis.de>
References: <mailman.1027489082.2047.comp.lang.ada@ada.eu.org>
 <wccvg6yqng5.fsf@shell01.TheWorld.com>
 <ai8f69$126d5v$1@ID-77047.news.dfncis.de>
 <wccsn1z1wjj.fsf@shell01.TheWorld.com>
Reply-To: mailbox@dmitry-kazakov.de
NNTP-Posting-Host: dialin-145-254-040-041.arcor-ip.net (145.254.40.41)
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7Bit
X-Trace: fu-berlin.de 1028207349 36820709 145.254.40.41 (16 [77047])
User-Agent: KNode/0.4
Xref: archiver1.google.com comp.lang.ada:27566
Date: 2002-08-02T03:15:07+02:00
List-Id: <comp.lang.ada>

Robert A Duff wrote:

> Dmitry A.Kazakov <mailbox@dmitry-kazakov.de> writes:
> 
>> [away from Ada 95] There should be only tagged types and no untagged
>> [ones.
>> Then I suppose the formal derived types will do the whole work, because
>> the set of all available operations will be defined by the root type. All
>> types tagged means only that T'Class exists for all but class-wide types.
>> The tag should be kept in the instances of T'Class and class-wide
>> pointers, but not in the values of T, so no performance hit when objects
>> are specific.
> 
> This is a reasonable idea (not new -- I and probably others have thought
> of it before).
> 
> As you imply, there are two ways to implement tags: (1) Store the tag
> with each object.  Then an access value (or a pass-by-reference
> parameter) can be represented as the address of the object.  Or (2),
> do not store the tag with each object, but gin it up when necessary.
> This requires access values and parameters to be a pair -- the tag, plus
> the address of the object.  When you do X'Access, the compiler creates
> the pair.
> 
> Neither method is uniformly better: there are some programs that will be
> more efficient with method (1), and some more efficient with method (2).
> This indicates that the user should have the option.  I would suggest
> the default (in this hypothetical language) should be (2), and a
> per-type pragma could request (1) for that type and its descendants.
> This preserves the Bauer Principle -- if you never say 'Class, you never
> have any overhead for storing or manipulating tags at run time.
> 
> Of course, having options like that complicates compilers.

I think that (1) would cause many problems with MI. When tag is a part of 
the value and then view conversions are problematic.

> For most of the Ada 9X design, either (1) or (2) were feasible (for
> tagged types).  However, a last-minute change to the language made (2)
> infeasible.  Namely, the fact that all tagged parameters are aliased.

There is also redispatch which is IMO incompatible with (1). I think 
redispatch must be removed, because when object identification is really 
necessary (to redispatch) one could use a solution similar to J.-P. Rosen 
trick.

> As far as I know, all existing compilers use (1).
> 
> Note that the distinction between (1) and (2) is analogous to the
> compiler storing array bounds with the array, or with pointers to the
> array.  Neither one is uniformly better.  I believe GNAT allows a user
> choice, and I think all other compilers store bounds with the array
> (analogous to (1)).

Yes. Same with discriminants. I have only a vague idea, that there should 
be way to separate all things like that from values. The goal is of course 
to make arrays, strings etc user-defined types without additional overhead. 
Then of course you should be able to derive from an abstract array and 
provide an implementation which is internally not an array at all. So the 
client will never know.

>> > I'm disappointed that nobody answered my previous question -- what
>> > should the rules about predefined operators in generics be?  I don't
>> > like the current Ada rules (which require reemergence of the "wrong"
>> > operator), but the alternatives are uncomfortable in various ways.
>> > Christoph Grein, or anybody else want to comment?
>> > 
>> > Here's an interesting case: a generic Sort routine, that takes "<" as a
>> > parameter.  (Or it could be called "Less_Than" -- it doesn't matter
>> > what
>> > it's called, or whether it's an operator symbol.)  The Sort routine
>> > might require that the "<" behave in certain ways (like if A<B and B<C
>> > both return True, then A<C should always return True).  Is there any
>> > way
>> > to specify that (other than via comments)?  Is there some way to design
>> > the language so that the generic Sort could formally require such
>> > properties of "<"?
>> 
>> [far away form Ada 95] You should simply have different root types.
>> Transitive_Ordered should be a[!] root type for "all" types with
>> transitive "<". "All" means the types you want to pass the actual
>> parameter. Of course then the language should allow creating supertypes
>> on-fly, because one cannot foresee all possible root types like Ada tries
>> with its classification of formal types. Let Integer is not a descendant
>> of Transitive_Ordered. Then one could create a "proxy" type which is
>> simultaneously a supertype of Integer and a subtype of
>> Transitive_Ordered.
>> [The compiler should check that "<" be implemented.] So Integer would
>> become a subtype Transitive_Ordered in the given context and thus could
>> be used as an actual parameter of the generic. This of course would
>> require MI and dynamically maintained dispatch tables.
> 
> Again, good ideas.  But they don't solve the problem I was thinking of.
> 
> By the way, I don't see why you need dynamically maintained dispatch
> tables.  To me, "on the fly" means "somewhere later in the code" --
> i.e. in some module other than where the type is first declared.
> That need not imply "at run time", I think.
> 
> The notion of dynamically maintained dispatch tables scares me: I hope
> that "<" on type T does not dispatch to different subprogram bodies
> at different times (for the same type tag)!
>
> It seems like adding new operations to a type should only be allowed at
> the same nesting level as the original type declaration.  I shudder to
> think of the run-time overhead of adding an operation in a recursive
> procedure.  ("Adding new op" is essentially the same thing as "adding a
> new parent type", here.)

Yes, I also do not think that one could override in any context without a 
heavy penalty. I think if that would be allowed then no expression would be 
static when used in a subprogram. No, I meant another thing. If a derived 
type is created in some nested context and overrides, say "<", then the 
dispatching table of "<" should be modified once when the type is 
elaborated and restored when the context is finalized.

> But anyway, as you explain below, none of this solves the problem.  The
> compiler checks that "<" exists with the right parameter types, but that
> doesn't *prove* its transitive (whether or not the Transitive_Ordered
> idea is used).
> 
>> > This is similar to the Text_IO.Integer_IO example, where the code wants
>> > to assume that "mod" behaves according to some mathematical rules.
>> > (I'm not even sure how to state those rules precisely.)
>> 
>> Well, this is about LSP. No one can enforce semantics of an operation.
> 
> Exactly my point.  For *this* issue, all of the above good ideas are
> equivalent to simply changing the language so that predefined operators
> do not reemerge.

One should distinguish overloading and overriding. If ">" is overridden 
then a reemergence would clearly violate LSP. On contrary in the case of 
overloading reemergence would enforce LSP. So logically, the reemergency 
rule is OK so far the type is untagged [but there should be no such (:-))]. 
But look at this:

package Some_Type is
   type X is tagged null record;
   function "<" (A, B : X) return Boolean;
end Some_Type;

package body Some_Type is
   function "<" (A, B : X) return Boolean is
   begin
      return True;
   end "<";
end Some_Type;

with Ada.Text_IO; use Ada.Text_IO;
with Some_Type;   use Some_Type;

procedure Test is
   XX : X;

   generic procedure Compare (A : X);

   procedure Compare (A : X) is
   begin
      if A < A then
         Put_Line ("Reemerged");
      else
         Put_Line ("Overloaded");
      end if;
   end Compare;

   procedure G1 is new Compare;
   procedure N1 (A : X) is
   begin
      if A < A then
         Put_Line ("Reemerged");
      else
         Put_Line ("Overloaded");
      end if;
   end N1;

   function "<" (A, B : X) return Boolean is
   begin
      return False;
   end "<";

   procedure G2 is new Compare;
   procedure N2 (A : X) is
   begin
      if A < A then
         Put_Line ("Reemerged");
      else
         Put_Line ("Overloaded");
      end if;
   end N2;

begin
   G1 (XX); -- Reemerged
   N1 (XX); -- Reemerged
   G2 (XX); -- Reemerged
   N2 (XX); -- Overloaded
end Test;

Is THIS OK? I do not think so. The first thought is to prohibit overloading 
of primitive operations.

>> However, the difference between what we have in Ada now and what I would
>> like to have, is that presently operations of a private formal type are
>> matched by their profiles. This warranties absolutely nothing.
> 
> Right, it warranties nothing.
> 
>>... With generic
>> derived types, you would have a subtype relation which assumes a
>> meaningfull implementation of the root type operations.
> 
> But this also warranties nothing.  One must "assume" that operations do
> what they're supposed to do; the compiler can't prove it.

To some extent it could. Let we introduce, as many proposed, pre-, 
post-conditions and invariants. Then some of requirements on operations of 
Transitive_Order could be expressed in their terms. Then (do not ask me 
how) the compiler could check that a pre-condition of a derived type 
declared to be a "strong subtype" is not stronger than it was for the base 
type etc.

This will still warranty nothing, but it is much like 
Unchecked_Deallocation. How do you warranty that it is not called twice? 
One thing is when the compiler silently uses ">", just because it has a 
right name and profile. Another thing is when the compiler says: sorry guy, 
">" is abstract, please, override it, even if the overriding would be a 
simple renaming, even if some idiot would call a TRAP instruction from the 
overriding.

> That's no
> different from the current case, where "mod" could be redefined to do
> something weird, thus breaking Text_IO.Integer_IO (I mean, if we changed
> the rules to avoid the reemergence).
> 
> You still haven't answered my question: What should the RM say that
> Integer_IO.Put does?  (Or if it were user-defined, what should the
> comment on its spec say?)  I don't care whether the language removes
> reemergence, or all your nice ideas above, including about formal
> derived types, are adopted -- either way, it's difficult to define the
> semantics of Put if it might be calling some user-defined version of
> "mod".

The precondition of Integer_IO is that the generic parameter is of some 
integer type. It means that all what is said about integer types in RM 
should be true for it. RM could say that for an integer type for any x and 
any y/=0 there are d and r of the type that x=y*d+r and etc (*). There is a 
related question: how to make preconditions as weak as possible? 
Definitely, Put does not need *all* integer operations. How to express 
that? To refine further the type hierarchy like with Transitive_Order would 
be a heavy burden for a programmer, because then he/she should instantiate 
each procedure of the package separately depending on what subset of 
operations (subtype) the procedure require.

(*) You can always write in RM: "it is a bounded error if" and a draconian 
requirement follows (:-))

Also any polymorphic solution (generic or class-wide) may face 
substitutability problems because of overriding. Thus for simple things 
like Integer_IO it could be better to have a non-polymorphic one. I mean 
user-defined type conversions and subtyping through conversions. In fact we 
do it manually on daily basis when write something like:

type X is range ...;
A : X;

Put (Integer'Image (Integer (A)));

Let we have some integer type Int_Out which cannot have instances 
(variables). Then let's declare Put on it. Then if I would like to make an 
output for X, I would declare X a subtype of IO_Int by providing a 
conversion X->Int_Out. The compiler would apply the conversion should I 
write Put (X). This suffers no substitutability problems (**), because Put 
will always deal with an Int_Out.

(**) If Int_Out is capable to represent all values of the derived type.

> The current RM doesn't even mention that Put calls "mod".  In fact,
> there are various ways of implementing Put, some of which call "mod" and
> some of which don't.  Surely, Put calling "mod" is an implementation
> detail?

Yes. Yet it is consistent, because what is not an implementation detail is 
that Put *may* call "mod" expecting a meaningfull behaviour from it. 

-- 
Regards,
Dmitry Kazakov
www.dmitry-kazakov.de