From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level: 
X-Spam-Status: No, score=-0.8 required=5.0 tests=BAYES_00,INVALID_DATE
	autolearn=no autolearn_force=no version=3.4.4
X-Google-Language: ENGLISH,ASCII-7-bit
X-Google-Thread: f4fd2,8e3b3db66f3b0061
X-Google-Attributes: gidf4fd2,public
X-Google-Thread: fd443,8e3b3db66f3b0061
X-Google-Attributes: gidfd443,public
X-Google-Thread: 109fba,8e3b3db66f3b0061
X-Google-Attributes: gid109fba,public
X-Google-Thread: 103376,474d28ddf953b3c1
X-Google-Attributes: gid103376,public
X-Google-Thread: 1014db,8e3b3db66f3b0061
X-Google-Attributes: gid1014db,public
X-Google-ArrivalTime: 1994-12-02 23:50:51 PST
Path: 
 bga.com!news.sprintlink.net!hookup!swrinde!gatech!bloom-beacon.mit.edu!biosci!parc!boehm
From: boehm@parc.xerox.com (Hans Boehm)
Newsgroups: 
 alt.lang.design,comp.lang.c,comp.lang.c++,comp.lang.lisp,comp.lang.ada
Subject: Re: Reference Counting (was Re: Searching Method for Incremental
 Garbage Collection)
Date: 2 Dec 1994 21:58:55 GMT
Organization: Xerox Palo Alto Research Center
Message-ID: <3bo5av$683@news.parc.xerox.com>
References: <CzHCvp.9rM@rheged.dircon.co.uk>
 <TFB.94Nov21091959@burns.cogsci.ed.ac.uk>
 <RFB.94Nov27213114@cfdevx1.lehman.com> <3be0mh$1p7@ga
 <3bii2g$kn2@news.parc.xerox.com> <3bjl64$1q8@scipio.cyberstore.ca>
NNTP-Posting-Host: siria.parc.xerox.com
Xref: bga.com alt.lang.design:154 comp.lang.c:33158 comp.lang.c++:39383
 comp.lang.lisp:4256 comp.lang.ada:8221
Date: 1994-12-02T21:58:55+00:00
List-Id: <comp.lang.ada>

kevinw@whistler.com (Kevin Warne) writes:

>In article <3bii2g$kn2@news.parc.xerox.com>, boehm@parc.xerox.com (Hans Boehm) says:
>>You do need at least a write barrier on the heap for incremental collection.
>>(In the best case, the dirty bits in the MMU are a sufficient
>>write barrier, though not for that algorithm.  But using them is likely

>Can you think of a realistic incremental collection algorithm 
>that uses MMU dirty bits?  At best I suppose a collector could perform
>a collection and then freeze the application while it rescanned
>pointers in areas covered by dirty bits.

Ours :-).  You essentially just described the algorithm.  The rescanning
takes some work, but that's typically on the order of 10% of the work
required by a full mark phase.  It also has the big advantage that with
probability nearly one, this mark phase can look only at pages that were
recently modified, and hence are nearly certain to be in physical
memory.  There are some measurements of this scheme in the 1991 SIGPLAN
PLDI paper by Alan Demers, Scott Shenker, and myself.

I would argue that on modern workstation/PC processors this is almost
always good enough for interactive (100msec) response.  With good
OS support, it's probably barely adequate for animation on a fast machine.
If you have stronger real-time requirements, you need dirty information
with better granularity than 4K pages.

>And don't forget all those OSs who keep the dirty bits to themselves.
>Just imagine the hell involved to access those bits under the likes
>of Windows NT and OS/2 (and other operating systems).

The OS vendors haven't let me forget, unfortunately.  Actually
I'm told that it's possible to fake dirty bits under NT by write-
protecting pages and catching faults.  But the NT exception model
makes that a bit clumsy, and I haven't implemented it.  (This is
actually the strategy we currently use under SunOS4.X, Irix, and OSF/1.
Solaris 2.X supports direct retrieval of kernel maintained dirty bits,
though the current implementation is slower than it should be.)

>>The crucial issue in my mind is whether you can safely avoid counting
>>pointers from the stack.

>Any incremental collector (RC or tracing) is going to need a stack
>barrier unless either A) stacks are scanned atomically at the end of a 
>collection period along with the objects that they reference or B) 
>stack objects have restrictions during collection about what kind of
>heap objects they may point to.  Otherwise stack-based references can 
>circumvent heap barriers.

Agreed.  But scanning the stack(s) atomically at the end shouldn't
be a big deal for most applications.  If you need better than 10msec
response time, then it's a problem and you have to pay for the stack
barrier.  (You can also use MMU dirty bits on the stack, though the
write protect/trap emulation of dirty bits has problems in this context.)

Note that simple RC algorithms are also unlikely to give you
10 msec response.  I suspect that many applications often drop 10K objects
at once.

Hans-J. Boehm
(boehm@parc.xerox.com)
Standard disclaimer ...