From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level: 
X-Spam-Status: No, score=0.5 required=5.0 tests=BAYES_00,FREEMAIL_FROM,
	TO_NO_BRKTS_PCNT,XPRIO autolearn=no autolearn_force=no version=3.4.4
X-Google-Language: ENGLISH,ASCII-7-bit
X-Google-Thread: 103376,fd63afa4dc364b7e,start
X-Google-Attributes: gid103376,public
X-Google-ArrivalTime: 2001-03-13 15:06:08 PST
Path: 
 supernews.google.com!sn-xit-03!supernews.com!freenix!fr.usenet-edu.net!usenet-edu.net!fu-berlin.de!uni-berlin.de!ppp-5-151.5800-2.access.uk.worldonline.COM!not-for-mail
From: "Nick Roberts" <nickroberts@callnetuk.com>
Newsgroups: comp.lang.ada
Subject: Better support for garbage collection
Date: Tue, 13 Mar 2001 18:37:41 -0000
Message-ID: <98m938$2iod0$1@ID-25716.news.dfncis.de>
NNTP-Posting-Host: ppp-5-151.5800-2.access.uk.worldonline.com (62.64.137.151)
X-Trace: fu-berlin.de 984524715 2711968 62.64.137.151 (16 [25716])
X-Priority: 3
X-MSMail-Priority: Normal
X-Newsreader: Microsoft Outlook Express 5.50.4133.2400
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4133.2400
Xref: supernews.google.com comp.lang.ada:5703
Date: 2001-03-13T18:37:41+00:00
List-Id: <comp.lang.ada>

There follows a proposal that I intend to make to the ARG in the next few
days. I would appreciate comments and criticisms before doing so.

This is the first of many proposals that I shall be likely to make. The fact
that this one comes first is of no significance.

----------

I would like to make a proposal for the next revision of the language which
would provide standardised support for partially user-defined garbage
collection, and a standard mechanism for disposability.

I think these facilities should be defined in an optional annex.

RM95-13.11 defines a mechanism permitting a degree of user control over the
dynamic allocation and deallocation of storage for objects. This mechanism
is based on the abstract type Root_Storage_Pool, and defines the interface
for this type. However this interface is not sufficient for the provision
of a user-defined storage pool which supports garbage collection.

The following proposal defines a new abstract type, derived from
Root_Storage_Pool, which adds sufficient extra functionality to this
interface to facilitate user-defined garbage-collecting storage pools
(UDGCSPs). The new type is called Repositioning_Storage_Pool, from the
assumption that garbage collection will (generally) require the
repositioning of objects in memory.

The language should define the following package:

   package System.Storage_Pools.Repositioning is

      pragma Preelaborate(System.Storage_Pools.Repositioning);

      type Repositioning_Storage_Pool is
         abstract new Root_Storage_Pool with private;

      type Object_Token is new Address;

      procedure Allocate(
         Pool : in out Repositioning_Storage_Pool;
         Token : out Object_Token) is abstract;

      procedure Deallocate(
         Pool : in out Repositioning_Storage_Pool;
         Token : in Object_Token) is abstract;

      procedure Lock(
         Pool : in out Repositioning_Storage_Pool;
         Token : in Object_Token;
         Storage_Address : out Address) is abstract;

      procedure Unlock(
         Pool : in out Repositioning_Storage_Pool;
         Token : in Object_Token) is abstract;

      procedure Tidy(
         Pool : in out Repositioning_Storage_Pool) is abstract;

      function Max_Lock_Count(Pool : Repositioning_Storage_Pool)
         return Natural is abstract;

      Locking_Error: exception;

   private
      ... -- not specified by the language
   end System.Storage_Pools.Repositioning;

A type derived from Repositioning_Storage_Pool is a 'repositioning pool' (a
UDGCSP), whose operations are defined below.

These operations all work in a way which is 100% upwards-compatible with
Root_Storage_Pool, so that a compiler which does not support the extra
functionality implied by a repositioning pool will nevertheless work
correctly with the pool (although no garbage collection will be performed).

Each pool element is identified by a unique (within the pool) value of the
type Object_Token, called the element's 'token'. This value will never
change throughout the life (from its allocation to its deallocation) of the
element. The type Object_Token is the same size as System.Address (and will
typically actually be an address, hence the declaration which makes this
conversion convenient).

Each pool element has a 'lock count', of an unspecified integer (or
modular) subtype, whose range does not exceed that of Standard.Natural. The
element is said to be 'locked' if its lock count is a value other than 0.

The overloading of the Allocate procedure with a Storage_Address parameter
sets the lock count of the pool element it allocates to 1, reserves a token
for the element, and reserves a block of memory, whose address is passed
out to the Storage_Address parameter.

The overloading of the Allocate procedure with a Token parameter sets the
lock count of the pool element it allocates to 0, reserves a block of
memory, and reserves a token for the element, whose value is passed out to
the Token parameter.

Either Allocate procedure may raise Storage_Error as a result of having
insufficient memory for a block or for a token.

The Lock procedure increments (by 1) the lock count of a pool element. The
current address of the block of memory reserved for the element is passed
out in Storage_Address.

For any repositioning pool, there is a maximum value the lock count of any
of its elements can reach. The Max_Lock_Count function returns this
maximum. If a call is made to Lock for an element whose lock count equals
this value, the Locking_Error exception is raised (and the lock count is
not changed).

I suggest a recommendation (or perhaps a stipulation) that the maximum lock
count is never less than 15. (Although this recommendation will apply to
the user rather than the implementation, except for implementation-supplied
repositioning pools.)

The Unlock procedure decrements (by 1) the lock count of a pool element,
unless the element's lock count is already 0, in which case it raises the
Locking_Error exception instead.

The overloading of the Deallocate procedure which takes a Storage_Address
parameter raises the Locking_Error exception if the pool element's lock
count is not 1 (and does not deallocate the element).

The overloading of the Deallocate procedure which takes a Token parameter
raises the Locking_Error exception if the pool element's locking count is
not 0 (and does not deallocate the element).

The Tidy procedure carries out what is called 'compaction' or
'defragmentation' (the second stage of the garbage collection process) on a
pool. Blocks of memory reserved for objects are moved (their contents
copied to other locations in memory) so as to reduce (to a minimum) the
amount of memory which is effectively unusable (typically because many of
the gaps between the blocks have become too small). The block of memory
reserved for a pool element must not be moved if the element is locked (its
lock count is not 0).

The first stage of the garbage collection process is 'scavenging', the
recognition and deallocation of allocated objects which have become
'unreachable' (inaccessible by any degree of indirection). This stage needs
to be carried out by code specially generated by the compiler.

A 'disposable object' is a dynamically allocated object which an
implementation (of Ada) is permitted to deallocate at any time, if it needs
to do so in order to successfully complete the allocation of another
object.

The language should define a subtype in the package System:

   subtype Disposability_Level is
      Integer range 0..implementation_defined;

and an attribute Disposability, which applies to any pool-specific access
type, and has a static value of subtype Disposability_Level. A value other
than 0 indicates that objects referenced by access values of this access
type are disposable. The disposability level of the objects is the same as
the disposability level of the access type. The default value is 0, but may
be set by a representation clause. An implementation must never attempt to
dispose of an object with a certain disposability level if there exists
another object with a higher disposability level.

Disposal of a controlled object will cause finalization of the object as
normal.

The principles behind the above interface are explained briefly below.

Whenever an implementation makes an attempt to allocate an object in a
repositioning pool, and that attempt fails (Storage_Error being raised), it
can intervene by carrying out garbage collection (both stages) and then
attempting the allocation again. If the second attempt fails, the
implementation may attempt to deallocate a disposable object, and then try
once more. Eventually, on repeated failures, defeat must be conceded (and
the Storage_Error exception propagated as normal).

Additional calls to Tidy may be made, by the implementation or the user,
e.g. as part of an ongoing garbage collection strategy.

The imlementation must always lock a pool element before reading from it or
writing into it, in order to prevent it being moved at an inopportune
moment. On the other hand, the implementation should unlock elements as
frequently as possible (without overly reducing execution speed), so that
the Tidy procedure is not stymied by locked elements to the point of being
made ineffective.

I would suggest that an implementation should employ a strategy where it
unlocks any elements of a particular pool it has locked at every 'control
point', where a control point with respect to a pool is defined as being:

(1) just before any call to an allocator for an access value associated
with that pool;

(2) just before any call to a subprogram which may (or does) contain a
control point for that pool;

(3) any instance of pragma Inspection_Point (see RM95-H.3.2).

The same comments made in RM95-13.11(27-30), regarding tasking, would apply
to UDGCSPs. Locking, unlocking, and moving operations would need to be
atomic, for a task-safe pool.

The use of access values associated with UDGCSPs will, in general, be
inefficient compared to non-repositioning pools; this is a compromise
inherent in (full) garbage collection. It may be possible for compilers to
remove a great deal of locking and unlocking in the absence of (potential)
parallel task usage of a UDGCSP. Only calls to Tidy should ever move pool
elements; elements should remain statically positioned in memory otherwise
(regardless of their lock counts).

Garbage collection facilities for Ada tend to be rare at the moment. I
believe there are many situations where the availability of garbage
collection would be of genuine value to Ada programmers.

For example: students beginning to learn Ada, and wanting to be relieved of
the burden of storage management until later in their studies; anyone
writing a program which is likely to have a very short useful life, or who
for any reason requires speed of development more than speed of execution;
whenever an 'engine' of something else that requires garbage collection
(e.g. a Java Virtual Machine) is being implemented in Ada.

I think that a language definition of user-defined garbage-collecting
storage pools (UDGCSPs) would help in two ways towards making garbage
collection facilities more available to Ada programmers: first, it would
lessen the burden on the compiler writer, who would need only to be
concerned with the first stage of garbage collection (the second stage
being the province of UDGCSPs); second, it would open up a 'market' in
UDGCSPs, offering hopefully many variations with regard to price,
functionality, reliability, and efficiency.

Disposability is a facility provided by a variety of other languages, and
its use is an important technique for certain kinds of software (e.g. many
'artifical intelligence' applications). In addition, it is a facility that
can be very helpful for some kinds of commercial software, especially
programs which need to be able to cope gracefully with environments that
provide widely differing amounts of memory.

Even if an implementation supports repositioning pools, it should not be
required for any of its standard storage pools to be repositioning pools.
Every implementation must continue to support non-repositioning pools.

Finally, I believe the design I propose above would have the benefit that
UDGCSPs could still be used with implementations that did not specifically
support them (albeit without the garbage collection functionality), and
that it does not impose any difficulty on the programmer requiring -- e.g.
for speed reasons -- a non-repositioning pool.

----------

One question in my mind is whether the second parameter of the Unlock
procedure should be a token or an address. Might the latter enable compilers
to generate slightly better code?

Apologies for the length of the post!

--
Nick Roberts
http://www.AdaOS.org