From mboxrd@z Thu Jan 1 00:00:00 1970 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00 autolearn=ham autolearn_force=no version=3.4.4 X-Google-Language: ENGLISH,ASCII-7-bit X-Google-Thread: 103376,fd63afa4dc364b7e X-Google-Attributes: gid103376,public X-Google-ArrivalTime: 2001-03-14 11:32:12 PST Newsgroups: comp.lang.ada Path: supernews.google.com!sn-xit-03!supernews.com!news-feed.riddles.org.uk!newsfeed.direct.ca!look.ca!newsfeed.mesh.ad.jp!uunet!osa.uu.net!sac.uu.net!ash.uu.net!world!bobduff From: Robert A Duff Subject: Re: Better support for garbage collection Sender: bobduff@world.std.com (Robert A Duff) Message-ID: Date: Wed, 14 Mar 2001 19:29:48 GMT References: <98m938$2iod0$1@ID-25716.news.dfncis.de> Organization: The World Public Access UNIX, Brookline, MA X-Newsreader: Gnus v5.3/Emacs 19.34 Xref: supernews.google.com comp.lang.ada:5729 Date: 2001-03-14T19:29:48+00:00 List-Id: "Nick Roberts" writes: > I would like to make a proposal for the next revision of the language which > would provide standardised support for partially user-defined garbage > collection, and a standard mechanism for disposability. If you're interested in garbage collection, you should get on the gc mailing list (gclist-digest@iecc.com). You should also read Jones and Lins' excellent book called "Garbage Collection", if you haven't already. I too think it would be cool if GC for Ada were widely available. But I think youll have a hard time convincing others. After all, any implementation is allowed to provide GC, but nobody has done so, except when the GC is unavoidable (the Ada 83 implementation on top of the Symbolics Lisp Machine, and the Ada 95 implementations on top of the JVM). Why not? Robert Dewar would argue that it's because nobody wants it (at least, nobody wants it enough to pay for it). A more pessimistic take on this is that all the folks who understand the glorious benefits of GC have long since abandoned Ada in favor of "better" languages. It is also possible to use Boehm's conservative collector with Ada. Does anybody do so? (I find the idea of conservative collection somewhat distasteful, but it does seem to work in many cases.) You seem to be saying that the user provides the compaction phase, and the implementation provides the tracing phase. But the tracing phase is the hard part. So how does this make GC so easy to implement that implementers will do it when they haven't already? Now to the technical points: I'm completely confused about what this proposal means. I don't see how the user can write the Tidy routine. (Why not call it Compact?) The Tidy routine needs to modify all the pointers to objects in the repositioning pool -- how is it supposed to find them? It doesn't know the layout of all the records in the pool. Also, it can't find all the pointers from outside the pool to objects in the pool. A sample implementation of package Repositioning would help me understand. Eg, what is the intended implementation of tokens? (Or at least one possible implementation?) Note that you can have pointers into the middle of objects -- these also need to be adjusted when objects move around in memory. (Pointers to aliased components. Renamings of components. Components passed by reference.) Note that the tracing phase needs to trace the whole world (all stacks and storage pools) whenever *any* repositioning pool is collected, because you haven't proposed to restrict pointers from the outside. You didn't say when the compiler is supposed to generate calls to the Allocate and Deallocate routines (the ones with token parameter), so I don't understand the point of them. Note that it is not always known at compile time whether a given access type's pool is a repositioning pool, so the code generated for "new" and "Unchecked_Deallocation" must be as in the RM (eg, these can't call the new Allocate). > procedure Allocate( > Pool : in out Repositioning_Storage_Pool; > Token : out Object_Token) is abstract; > Each pool element has a 'lock count', of an unspecified integer (or > modular) subtype, whose range does not exceed that of Standard.Natural. The > element is said to be 'locked' if its lock count is a value other than 0. Nonnegative, right? > The overloading of the Allocate procedure with a Token parameter sets the > lock count of the pool element it allocates to 0, reserves a block of > memory, and reserves a token for the element, whose value is passed out to > the Token parameter. I don't see how it can reserve a block of memory, since it doesn't take size and alignment parameters. I'm confused as to what this routine is for. > I suggest a recommendation (or perhaps a stipulation) that the maximum lock > count is never less than 15. (Although this recommendation will apply to > the user rather than the implementation, except for implementation-supplied > repositioning pools.) I don't see the point of such a recommendation. > The first stage of the garbage collection process is 'scavenging', the > recognition and deallocation of allocated objects which have become > 'unreachable' (inaccessible by any degree of indirection). This stage needs > to be carried out by code specially generated by the compiler. But that's not really how many (most?) GC's work. The don't recognize unreachable objects, and they don't deallocate them individually. They find all the *reachable* objects, and whatever's left over is garbage. In a copying collector, the garbage need not be touched. So the cost of GC is proportional to the amount of reachable data. If you have to individually deallocate all the garbage objects, then it's proportional to the size of the heap, which seems bad. > A 'disposable object' is a dynamically allocated object which an > implementation (of Ada) is permitted to deallocate at any time, if it needs > to do so in order to successfully complete the allocation of another > object. > > The language should define a subtype in the package System: > > subtype Disposability_Level is > Integer range 0..implementation_defined; Why "implementation_defined"? It seems like a useless non-portability. > and an attribute Disposability, which applies to any pool-specific access > type, and has a static value of subtype Disposability_Level. This is the first attribute of a non-scalar type that is static. >... A value other > than 0 indicates that objects referenced by access values of this access > type are disposable. It makes no sense to talk about the objects "referenced by access values of this access type", because there can be many access values, of many different types, all pointing to the same object. Perhaps you mean "allocated by allocators for this access type"? (BTW, "designated" is the Ada jargon -- not "referenced".) > Whenever an implementation makes an attempt to allocate an object in a > repositioning pool, and that attempt fails (Storage_Error being raised), it > can intervene by carrying out garbage collection (both stages) and then > attempting the allocation again. If the second attempt fails, the > implementation may attempt to deallocate a disposable object, and then try > once more. Eventually, on repeated failures, defeat must be conceded (and > the Storage_Error exception propagated as normal). Deallocating a disposable object can leave dangling pointers (which is the whole point of GC to avoid). Are you assuming the Finalize routine will track them all down and do something about them (eg null them out)? What happens if it doesn't. You didn't mention anything about finalization of normal (non-disposable) objects. There are nasty interactions there. Eg, can the finalization routine resurrect the object by creating a new pointer to it? Java has an answer to this (not a very good one, IMHO). > Additional calls to Tidy may be made, by the implementation or the user, > e.g. as part of an ongoing garbage collection strategy. > > The imlementation must always lock a pool element before reading from it or > writing into it, in order to prevent it being moved at an inopportune > moment. On the other hand, the implementation should unlock elements as > frequently as possible (without overly reducing execution speed), so that > the Tidy procedure is not stymied by locked elements to the point of being > made ineffective. It seems like the compiler must generate a Lock before each call where a movable object is passed by reference (or a part of such an object). And it must remain locked throughout that call, because the called procedure doesn't know whether it has a pointer into a repositioning pool. Doesn't sound very "frequent". For that matter, it seems like it has to Lock even if the thing is passed by copy, because the address of the actual is presumably saved at the call site (at least in some cases). - Bob