From mboxrd@z Thu Jan 1 00:00:00 1970 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me X-Spam-Level: X-Spam-Status: No, score=0.5 required=5.0 tests=BAYES_00,FREEMAIL_FROM, TO_NO_BRKTS_PCNT,XPRIO autolearn=no autolearn_force=no version=3.4.4 X-Google-Language: ENGLISH,ASCII-7-bit X-Google-Thread: 103376,fd63afa4dc364b7e,start X-Google-Attributes: gid103376,public X-Google-ArrivalTime: 2001-03-13 15:06:08 PST Path: supernews.google.com!sn-xit-03!supernews.com!freenix!fr.usenet-edu.net!usenet-edu.net!fu-berlin.de!uni-berlin.de!ppp-5-151.5800-2.access.uk.worldonline.COM!not-for-mail From: "Nick Roberts" Newsgroups: comp.lang.ada Subject: Better support for garbage collection Date: Tue, 13 Mar 2001 18:37:41 -0000 Message-ID: <98m938$2iod0$1@ID-25716.news.dfncis.de> NNTP-Posting-Host: ppp-5-151.5800-2.access.uk.worldonline.com (62.64.137.151) X-Trace: fu-berlin.de 984524715 2711968 62.64.137.151 (16 [25716]) X-Priority: 3 X-MSMail-Priority: Normal X-Newsreader: Microsoft Outlook Express 5.50.4133.2400 X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4133.2400 Xref: supernews.google.com comp.lang.ada:5703 Date: 2001-03-13T18:37:41+00:00 List-Id: There follows a proposal that I intend to make to the ARG in the next few days. I would appreciate comments and criticisms before doing so. This is the first of many proposals that I shall be likely to make. The fact that this one comes first is of no significance. ---------- I would like to make a proposal for the next revision of the language which would provide standardised support for partially user-defined garbage collection, and a standard mechanism for disposability. I think these facilities should be defined in an optional annex. RM95-13.11 defines a mechanism permitting a degree of user control over the dynamic allocation and deallocation of storage for objects. This mechanism is based on the abstract type Root_Storage_Pool, and defines the interface for this type. However this interface is not sufficient for the provision of a user-defined storage pool which supports garbage collection. The following proposal defines a new abstract type, derived from Root_Storage_Pool, which adds sufficient extra functionality to this interface to facilitate user-defined garbage-collecting storage pools (UDGCSPs). The new type is called Repositioning_Storage_Pool, from the assumption that garbage collection will (generally) require the repositioning of objects in memory. The language should define the following package: package System.Storage_Pools.Repositioning is pragma Preelaborate(System.Storage_Pools.Repositioning); type Repositioning_Storage_Pool is abstract new Root_Storage_Pool with private; type Object_Token is new Address; procedure Allocate( Pool : in out Repositioning_Storage_Pool; Token : out Object_Token) is abstract; procedure Deallocate( Pool : in out Repositioning_Storage_Pool; Token : in Object_Token) is abstract; procedure Lock( Pool : in out Repositioning_Storage_Pool; Token : in Object_Token; Storage_Address : out Address) is abstract; procedure Unlock( Pool : in out Repositioning_Storage_Pool; Token : in Object_Token) is abstract; procedure Tidy( Pool : in out Repositioning_Storage_Pool) is abstract; function Max_Lock_Count(Pool : Repositioning_Storage_Pool) return Natural is abstract; Locking_Error: exception; private ... -- not specified by the language end System.Storage_Pools.Repositioning; A type derived from Repositioning_Storage_Pool is a 'repositioning pool' (a UDGCSP), whose operations are defined below. These operations all work in a way which is 100% upwards-compatible with Root_Storage_Pool, so that a compiler which does not support the extra functionality implied by a repositioning pool will nevertheless work correctly with the pool (although no garbage collection will be performed). Each pool element is identified by a unique (within the pool) value of the type Object_Token, called the element's 'token'. This value will never change throughout the life (from its allocation to its deallocation) of the element. The type Object_Token is the same size as System.Address (and will typically actually be an address, hence the declaration which makes this conversion convenient). Each pool element has a 'lock count', of an unspecified integer (or modular) subtype, whose range does not exceed that of Standard.Natural. The element is said to be 'locked' if its lock count is a value other than 0. The overloading of the Allocate procedure with a Storage_Address parameter sets the lock count of the pool element it allocates to 1, reserves a token for the element, and reserves a block of memory, whose address is passed out to the Storage_Address parameter. The overloading of the Allocate procedure with a Token parameter sets the lock count of the pool element it allocates to 0, reserves a block of memory, and reserves a token for the element, whose value is passed out to the Token parameter. Either Allocate procedure may raise Storage_Error as a result of having insufficient memory for a block or for a token. The Lock procedure increments (by 1) the lock count of a pool element. The current address of the block of memory reserved for the element is passed out in Storage_Address. For any repositioning pool, there is a maximum value the lock count of any of its elements can reach. The Max_Lock_Count function returns this maximum. If a call is made to Lock for an element whose lock count equals this value, the Locking_Error exception is raised (and the lock count is not changed). I suggest a recommendation (or perhaps a stipulation) that the maximum lock count is never less than 15. (Although this recommendation will apply to the user rather than the implementation, except for implementation-supplied repositioning pools.) The Unlock procedure decrements (by 1) the lock count of a pool element, unless the element's lock count is already 0, in which case it raises the Locking_Error exception instead. The overloading of the Deallocate procedure which takes a Storage_Address parameter raises the Locking_Error exception if the pool element's lock count is not 1 (and does not deallocate the element). The overloading of the Deallocate procedure which takes a Token parameter raises the Locking_Error exception if the pool element's locking count is not 0 (and does not deallocate the element). The Tidy procedure carries out what is called 'compaction' or 'defragmentation' (the second stage of the garbage collection process) on a pool. Blocks of memory reserved for objects are moved (their contents copied to other locations in memory) so as to reduce (to a minimum) the amount of memory which is effectively unusable (typically because many of the gaps between the blocks have become too small). The block of memory reserved for a pool element must not be moved if the element is locked (its lock count is not 0). The first stage of the garbage collection process is 'scavenging', the recognition and deallocation of allocated objects which have become 'unreachable' (inaccessible by any degree of indirection). This stage needs to be carried out by code specially generated by the compiler. A 'disposable object' is a dynamically allocated object which an implementation (of Ada) is permitted to deallocate at any time, if it needs to do so in order to successfully complete the allocation of another object. The language should define a subtype in the package System: subtype Disposability_Level is Integer range 0..implementation_defined; and an attribute Disposability, which applies to any pool-specific access type, and has a static value of subtype Disposability_Level. A value other than 0 indicates that objects referenced by access values of this access type are disposable. The disposability level of the objects is the same as the disposability level of the access type. The default value is 0, but may be set by a representation clause. An implementation must never attempt to dispose of an object with a certain disposability level if there exists another object with a higher disposability level. Disposal of a controlled object will cause finalization of the object as normal. The principles behind the above interface are explained briefly below. Whenever an implementation makes an attempt to allocate an object in a repositioning pool, and that attempt fails (Storage_Error being raised), it can intervene by carrying out garbage collection (both stages) and then attempting the allocation again. If the second attempt fails, the implementation may attempt to deallocate a disposable object, and then try once more. Eventually, on repeated failures, defeat must be conceded (and the Storage_Error exception propagated as normal). Additional calls to Tidy may be made, by the implementation or the user, e.g. as part of an ongoing garbage collection strategy. The imlementation must always lock a pool element before reading from it or writing into it, in order to prevent it being moved at an inopportune moment. On the other hand, the implementation should unlock elements as frequently as possible (without overly reducing execution speed), so that the Tidy procedure is not stymied by locked elements to the point of being made ineffective. I would suggest that an implementation should employ a strategy where it unlocks any elements of a particular pool it has locked at every 'control point', where a control point with respect to a pool is defined as being: (1) just before any call to an allocator for an access value associated with that pool; (2) just before any call to a subprogram which may (or does) contain a control point for that pool; (3) any instance of pragma Inspection_Point (see RM95-H.3.2). The same comments made in RM95-13.11(27-30), regarding tasking, would apply to UDGCSPs. Locking, unlocking, and moving operations would need to be atomic, for a task-safe pool. The use of access values associated with UDGCSPs will, in general, be inefficient compared to non-repositioning pools; this is a compromise inherent in (full) garbage collection. It may be possible for compilers to remove a great deal of locking and unlocking in the absence of (potential) parallel task usage of a UDGCSP. Only calls to Tidy should ever move pool elements; elements should remain statically positioned in memory otherwise (regardless of their lock counts). Garbage collection facilities for Ada tend to be rare at the moment. I believe there are many situations where the availability of garbage collection would be of genuine value to Ada programmers. For example: students beginning to learn Ada, and wanting to be relieved of the burden of storage management until later in their studies; anyone writing a program which is likely to have a very short useful life, or who for any reason requires speed of development more than speed of execution; whenever an 'engine' of something else that requires garbage collection (e.g. a Java Virtual Machine) is being implemented in Ada. I think that a language definition of user-defined garbage-collecting storage pools (UDGCSPs) would help in two ways towards making garbage collection facilities more available to Ada programmers: first, it would lessen the burden on the compiler writer, who would need only to be concerned with the first stage of garbage collection (the second stage being the province of UDGCSPs); second, it would open up a 'market' in UDGCSPs, offering hopefully many variations with regard to price, functionality, reliability, and efficiency. Disposability is a facility provided by a variety of other languages, and its use is an important technique for certain kinds of software (e.g. many 'artifical intelligence' applications). In addition, it is a facility that can be very helpful for some kinds of commercial software, especially programs which need to be able to cope gracefully with environments that provide widely differing amounts of memory. Even if an implementation supports repositioning pools, it should not be required for any of its standard storage pools to be repositioning pools. Every implementation must continue to support non-repositioning pools. Finally, I believe the design I propose above would have the benefit that UDGCSPs could still be used with implementations that did not specifically support them (albeit without the garbage collection functionality), and that it does not impose any difficulty on the programmer requiring -- e.g. for speed reasons -- a non-repositioning pool. ---------- One question in my mind is whether the second parameter of the Unlock procedure should be a token or an address. Might the latter enable compilers to generate slightly better code? Apologies for the length of the post! -- Nick Roberts http://www.AdaOS.org