From mboxrd@z Thu Jan 1 00:00:00 1970 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me X-Spam-Level: X-Spam-Status: No, score=-0.8 required=5.0 tests=BAYES_00,INVALID_DATE autolearn=no autolearn_force=no version=3.4.4 X-Google-Language: ENGLISH,ASCII-7-bit X-Google-Thread: f4fd2,8e3b3db66f3b0061 X-Google-Attributes: gidf4fd2,public X-Google-Thread: fd443,8e3b3db66f3b0061 X-Google-Attributes: gidfd443,public X-Google-Thread: 109fba,8e3b3db66f3b0061 X-Google-Attributes: gid109fba,public X-Google-Thread: 103376,474d28ddf953b3c1 X-Google-Attributes: gid103376,public X-Google-Thread: 1014db,8e3b3db66f3b0061 X-Google-Attributes: gid1014db,public X-Google-ArrivalTime: 1994-12-02 23:50:51 PST Path: bga.com!news.sprintlink.net!hookup!swrinde!gatech!bloom-beacon.mit.edu!biosci!parc!boehm From: boehm@parc.xerox.com (Hans Boehm) Newsgroups: alt.lang.design,comp.lang.c,comp.lang.c++,comp.lang.lisp,comp.lang.ada Subject: Re: Reference Counting (was Re: Searching Method for Incremental Garbage Collection) Date: 2 Dec 1994 21:58:55 GMT Organization: Xerox Palo Alto Research Center Message-ID: <3bo5av$683@news.parc.xerox.com> References: <3be0mh$1p7@ga <3bii2g$kn2@news.parc.xerox.com> <3bjl64$1q8@scipio.cyberstore.ca> NNTP-Posting-Host: siria.parc.xerox.com Xref: bga.com alt.lang.design:154 comp.lang.c:33158 comp.lang.c++:39383 comp.lang.lisp:4256 comp.lang.ada:8221 Date: 1994-12-02T21:58:55+00:00 List-Id: kevinw@whistler.com (Kevin Warne) writes: >In article <3bii2g$kn2@news.parc.xerox.com>, boehm@parc.xerox.com (Hans Boehm) says: >>You do need at least a write barrier on the heap for incremental collection. >>(In the best case, the dirty bits in the MMU are a sufficient >>write barrier, though not for that algorithm. But using them is likely >Can you think of a realistic incremental collection algorithm >that uses MMU dirty bits? At best I suppose a collector could perform >a collection and then freeze the application while it rescanned >pointers in areas covered by dirty bits. Ours :-). You essentially just described the algorithm. The rescanning takes some work, but that's typically on the order of 10% of the work required by a full mark phase. It also has the big advantage that with probability nearly one, this mark phase can look only at pages that were recently modified, and hence are nearly certain to be in physical memory. There are some measurements of this scheme in the 1991 SIGPLAN PLDI paper by Alan Demers, Scott Shenker, and myself. I would argue that on modern workstation/PC processors this is almost always good enough for interactive (100msec) response. With good OS support, it's probably barely adequate for animation on a fast machine. If you have stronger real-time requirements, you need dirty information with better granularity than 4K pages. >And don't forget all those OSs who keep the dirty bits to themselves. >Just imagine the hell involved to access those bits under the likes >of Windows NT and OS/2 (and other operating systems). The OS vendors haven't let me forget, unfortunately. Actually I'm told that it's possible to fake dirty bits under NT by write- protecting pages and catching faults. But the NT exception model makes that a bit clumsy, and I haven't implemented it. (This is actually the strategy we currently use under SunOS4.X, Irix, and OSF/1. Solaris 2.X supports direct retrieval of kernel maintained dirty bits, though the current implementation is slower than it should be.) >>The crucial issue in my mind is whether you can safely avoid counting >>pointers from the stack. >Any incremental collector (RC or tracing) is going to need a stack >barrier unless either A) stacks are scanned atomically at the end of a >collection period along with the objects that they reference or B) >stack objects have restrictions during collection about what kind of >heap objects they may point to. Otherwise stack-based references can >circumvent heap barriers. Agreed. But scanning the stack(s) atomically at the end shouldn't be a big deal for most applications. If you need better than 10msec response time, then it's a problem and you have to pay for the stack barrier. (You can also use MMU dirty bits on the stack, though the write protect/trap emulation of dirty bits has problems in this context.) Note that simple RC algorithms are also unlikely to give you 10 msec response. I suspect that many applications often drop 10K objects at once. Hans-J. Boehm (boehm@parc.xerox.com) Standard disclaimer ...