GC FAQ -- algorithms
- root set
The data that is immediately available to a program, without following
any pointers. Typically this would include local variables from the
activation stack, values in machine registers, and global, static, or
- reachable data
Data that is accessible by following pointers
(references) from the root set. Reachability is a conservative
approximation of liveness and is used by most garbage collectors.
- live data
Data that is reachable, and that the program will actually make use
of in the future. Garbage collectors typically cannot tell the
difference between live and reachable data, but programmers and
compilers sometimes can.
Data that is not reachable.
- semantic garbage
Data that is reachable, but not live.
A garbage collector that can determine unambiguously whether a given
value is a pointer or not. This can be seen as a requirement on the
run-time the collector operates within that it is possible to identify
whether some datum is a pointer.
A garbage collector which can scan data without needing
to determine with certainty whether an object is a pointer.
A collector which permits a mixture of precisely and conservatively
identified values. For example, values in registers and on the stack
might be treated conservatively, but objects on the heap would be fully
described and precisely scanned.
An incremental collector interleaves collection activity with the
actual work of the program. The interleaving is usually orderly;
that is, the collector and mutator do not simultaneously access
and modify data. If the collector and mutator are viewed as separate
threads, then they are scheduled as coroutines.
In a concurrent collector, the collector and mutator may simultaneously
access and modify data. The interleaving is disorderly, and preemption
may occur at any time.
A real-time garbage collector (yes, it is possible) guarantees that the
garbage collection costs associated with any single operation are
bounded by a small time constant. The cost of this guarantee is
typically reduced throughput, higher memory overheads, and custom
code generation. When hard real-time scheduling is not required, an
incremental or concurrent collector is probably a better choice.
In a collector which moves objects, a forwarding pointer is a reference
installed by the garbage collector from an old location to the new one.
Some systems (notably the Lisp machines) have hardware support for
forwarding pointers, which allow both the old and new address to be used
interchangably without explicit checks for forwarding pointers.
- In a copying collector, when an arena has had all active
objects removed from it, and is eligible for refilling, the sense
of the arenas is changed -- that is, an "old" arena is now a "new"
arena, and some "new" arena becomes "old". This is referred to
as a flip.
- weak reference
- weak pointer
A pointer to an object which does not prevent the object from being
reclaimed. If the only pointers to an object are from weak references,
the object may disappear, in which case the reference is replaced with
some distinguished value, typically a language's equivalent of a NULL
pointer. Weak references are often used for implementing caches.
- write barrier
- read barrier
A write barrier is a mechanism for executing some memory management
code when a write to some object takes place (that object is then
"behind the write barrier", or - informally - "write barrier-ed", or -
sloppily - "write-protected"). It can take the form of inlined code
(if memory management is integral to the compiler), or a
memory-protection fault which is handled by the memory management
code. There are also "read barriers", the nature of which is obvious.
The roles a write barrier can play in GC are a little trickier to
explain to a novice, but I'll give it a stab.
- Consider a simple generational stop-and-collect collector, such as
the one which SML/NJ used to have. "Generational" means that data is
partitioned into old and new. This partition is useful to the GC for
two reasons: (a) because data tends to die young, collecting just new
data will probably free a lot of space, and (b) because pointers tend
to point from new objects to old objects, and not vice versa, it is
cheap to find all the pointers to new objects.
Property (b) is only true if you can tell when a pointer to a new
object has been written into an old object. Otherwise you have to scan
all the old objects to find pointers to new objects, which loses one
of the main advantages of generational GC. So you put the old data
behind a write barrier, and record those writes. When you come to GC
the new data, you know the only pointers from old to new are those
which you have recorded.
- Consider a tracing GC (see note below) which is incremental or concurrent,
i.e. the user's program (the 'mutator') can run before the GC is
complete. Now there is an invariant: black objects do not point to
white objects. If the mutator writes a white pointer into a black
object, this invariant is broken and the GC can fail. There are two
basic solutions: prevent the mutator from seeing white objects ("read
barriers") or prevent the mutator from writing white pointers into
black objects ("write barriers"). The write barrier solution puts the
black objects behind a write barrier. When a white-on-black write
takes place there are various fixes: incrementally grey the white
object, regrey the black object, &c.
(note) For a tracing collector (marking or copying), one conceptually
colours the data white (not yet seen by the collector), black (alive
and scanned by the collector) and grey (alive but not yet scanned by
the collector). The collector proceeds by scanning grey objects for
pointers to white objects. The white objects found are turned grey,
and the grey objects scanned are turned black. When there are no more
grey objects, the collection is complete and all the white objects can
- remembered set
A list of all the interesting references between two sets of objects
maintained separately, so you don't have to find them by scanning. In
generational garbage collection, this technique is used for references
from an older generation to a younger one. Typically, remembered sets
are maintained by write barriers (q.v.).
- register set partitioning
Garbage-collected languages often partition the set of registers a
priori into those always traced and updated (if necessary) by the
collector and those ignored by it. The language always maintains the
former in a format understood by the collector; and never uses the
latter to hold pointers to collectable objects. More complicated
schemes are also possible.
- big bag of pages
Big Bag of Pages, or BIBOP, is the technique of using an object's address
(page number) to encode information about the object. For example, all objects
allocated on a particular page might be the same size, or the same type. This
technique can be used to avoided allocating object header bits or header tags.
For each "object" managed by the collector, maintain
a count of pointers referencing that object. Whenever
a pointer is assigned to, as in "P := Q", the referent
of Q has its reference count incremented, and the former
referent of P has its reference count decremented. If
the reference count ever falls to zero, then it is known
that there are no pointers to the object, and it may be
reclaimed. The converse is not true -- some unreferenced
objects may not have a reference count of zero, because
of cycles. Furthermore, the fields in which reference
counts are stored are typically small, so it is also possible
that they will overflow, in which case they "stick" at
their maximum value -- again, the object cannot be reclaimed.
However, in practice this is rare (unless the reference
count field is very small) and it is also possible to confine
reference-counting to data structures that will not participate
in cycles (for examples, strings, or other pointer-free data).
In a multi-threaded system, if threads are preemptively
scheduled, reference counting requires additional care to
ensure that the updates to reference counts are atomic.
This is straightforward using hardware primitives such as
fetch-and-add, compare-and-swap, or load-linked/store-conditional,
but it is definitely necessary to pay attention to this
The garbage collector has some way of finding all "root"
pointers (e.g., global variables and automatic variables),
and for each referenced object, it has some way of finding
all the pointers contained within it. Via some iterative
process (it could be recursive, but it need not be, see
...) all objects reachable from the root set are found.
As objects are found, they are marked as "found". All other
objects not marked as "found", must be free, and are returned
to the free pool for future reallocation.
Copying collection attempts to collect memory by removing all
active memory from a storage arena, and then reusing the arena.
This has the theoretical advantages of compacting the working
set of an application, and of being asymptotically more efficient
(because the cost of the collection is only proportional to the
size of the reachable data, and is unrelated to the size of the
arena itself. Mark-and-sweep collection must scan the arena).
However, theory and practice disagree sometimes (in practice). See
ftp://parcftp.xerox.com/pub/gc/complexity.html for an opinion
on this subject. Jargon ("flip"): When an arena has had all active
objects removed from it, and is eligible for refilling, the sense
of the arenas is changed -- that is, and "old" arena is now a "new"
arena, and some "new" arena becomes "old". This is referred to
as a flip.
Henry Baker has requested that people make a distinction
between `moving' collection and `copying' collection.
Conservative garbage collection makes use of the observation
that if you are not relocating (copying) objects, then you
need not be quite so certain about exactly what is a pointer.
It suffices to scan the root set (and objects) for any pointer-like
bit patterns, and treat those pointer-like bit patterns as
actual pointers while marking. The result is that the "true"
reachable objects are all found, along with a few others.
This works surprisingly well in practice (if a few additional
tricks are added) and often permits the use of garbage collection
with oblivious source code and compilers.
Conservative collection is very important because it permits
the use of garbage collection with programs that were not written
with garbage collection in mind. That is, it simpifies the use
of garbage collection with existing (non-garbage-collected)
libraries of code.
Based on the observation that most objects have short lifetimes, it is
useful to restrict garbage collection to the most recently allocated objects.
A generational collector maintains several ``generations'' of objects.
Newly created objects are all put in the ``youngest'' generation, and when
the space allocated for that generation is full, the collector will use the
root set (and any pointers from older generations to the youngest
generation -- see below) to reclaim dead objects from the youngest
generation only, leaving the ``older'' generations untouched. Objects that
survive after several (perhaps just one) collection of the youngest
generation are ``promoted'' to the next older generation, and when that
generation becomes full, it and all the generations younger than it are
The difficulty with generational collectors is the identification of
pointers from older generations into younger ones. An observation is that
such references are uncommon in most programs: new objects typically point
to older objects; this is clearly true in pure functional languages where
assignment is prohibited, but is common for many programs and languages.
Nonetheless, such references do exist, and a collector must handle them.
When a pointer to a younger generation object is written into an old
object, the pointer (or just the old object) is recorded so it can be
scanned when the younger generation is collected.
In deferred reference counts, local updates to the reference
count (that is, those arising from assignments to and from
local variables) are "deferred", as long as it is known that
the reference count will remain positive when it should be positive.
For example, the code to swap the (reference-counted) values
contained in two variables
naively performs the following operations:
After all the dust has cleared, after six reference count
operations, the reference counts of the objects referred
incrc(a); tmp := a;
incrc(b); decrc(a); a := b;
incrc(tmp); decrc(b); b := tmp
decrc(tmp); (tmp goes out of scope)
b (now by
a) are unchanged. It is more efficient to
recognize this and leave the reference counts unchanged.
Note that naive reference counting can take arbitrarily long
to return from a decrement, because freeing one object may
recursively trigger the freeing of other objects whose counts
drop to zero. Instead, an object whose reference count falls
to zero may be placed on a queue/list/stack of objects that
are no longer in use, but not yet processed onto the free list.
Alternatively, when an object is allocated off the free list,
the pointers to objects within it may be scanned.
This is also known as "Weizenbauming", at least in some circles.
William Stoye's trick in his combinator machine. Reference counts
are frequently one, and noting this in the pointer, rather than
the object, can save space, cut memory references, and allow
the easy recycling of vast amounts of garbage.
In a combinator
machine, the actual primitives can be parameterized by the
reference counts of their operands to recycle memory in place.
Note, however, that a combinator machine generates vast amounts
of garbage to begin with; this technique is unlikely to work
as well in general use.
This might also be regarded as "lazy sweep" or "incremental sweep".
Whenever memory is needed, the allocator scans objects until it
discovers one that is marked "unused" and is also large enough for the
request. All objects scanned, whether large enough or not, are marked
"used". When the scan pointer reaches the end of memory, the sense of
the mark bits is reversed (that is, if the current assignment is
0==used and 1==unused, then the new assignment is 0==unused, 1==used).
Temporarily, all objects on the heap are now marked "unused".
Next, a mark phase is run, which tags all reachable objects as "used"
so that they will not be reallocated later.
Snapshot mark-and-sweep uses the observation that the set
of unreachable objects does not shrink. It is possible
(using, for instance, copy-on-write page mapping) to quickly make
a copy of the address space, process it concurrently to
determine what is garbage, and send that information back
to the running process. More garbage may have been generated
in the interim, but that is ok, because it will be found in
the next collection.
Baker's algorithm works by deferring collection, but maintaining
a "mutator invariant". The basic algorithm is a modification of
copying-compacting collecting. The "mutator invariant" is that
the mutator (the program performing useful work) can only observe
pointers to new space. The deferred collection means that in fact,
new space itself may contain pointers to old space, so the mutator
invariant must be checked and restored (through incremental forwarding)
whenever a pointer is loaded from memory.
This is better-explained with pictures.
It isn't really real time (see Jeff Barth's question from the audience at
1988 SIGPLAN PLDI) but that's probably ok, unless you are feeling
pedantic. Use page protection to enforce the mutator invariant,
and move objects on the entire page when one of them is loaded.
Joel Bartlett's conservative-compacting collector fragments
the "new" and "old" spaces of a typical copying-compacting
collector into "new" and "old" pages. To add conservatism,
his collector classifies pointers into definite and indefinite.
A definite pointer is definitely a pointer -- if the object is
relocated, the pointer to it can be changed, because it is definitely
NOT an integer, floating point value, or portion of a bitmap.
An indefinite pointer (typically one found on the stack, where
an uncooperative compiler has placed variables willy-nilly)
cannot be changed, because it might not really be a pointer.
Thus, any object referenced by an indefinite pointer cannot be
relocated. In the simplest form, this collector scans from
all the indefinite pointers first (if only the stack is indefinite,
this is not hard), and all objects so referenced cannot be moved.
Finishing the collection, from definite pointers, any newly touched
objects (i.e., those not directly referenced by indefinite pointers
may be moved (this is not a good explanation, is it?)
A treadmill collector is based on the observation that the regions
used in a copying-compacting collector (real-time or not) need not
be identified with address ranges. It is also possible to use
doubly-linked lists (unit-time insertion and deletion) together with
a few membership bits to indicate which list an arbitrary object is on.
"Flips" are accomplished by changing the meaning of the tag bits,
instead of changing the tag bit. For small objects, the space
overhead is relatively high (two list pointers).
(Appel's work predates this, but I've got the conference proceedings
for Goldberg's work. I'll try to get a full collection of references).
Astoundingly, in a statically typed language with a type system
as complex as ML's, it is not necessary to tag pointers to make
a garbage collector work.
Goldberg showed something much more interesting --
that you could not reconstruct the types of all reachable objects
(without auxiliary information maintained at run time). However,
and this is the fascinating part, these objects were guaranteed
to be "semantic" garbage -- i.e., even though the objects were
reachable, they would never be accessed.
There are other tagless collectors for such a language:
Aditya & Caro's type reconstructor for Id
Tolmach's tag-free collection for ML
Greg Morrisett's tag-free implementation in the TIL/ML compiler
In a conservative collector, it is often useful to look
for bit patterns that aren't yet valid pointers, but that
are likely to become valid if the size of the heap is increased.
The pages (objects) that they reference should be "blacklisted"
so that they are never used, because if the bit pattern doesn't
change, it will cause spurious retention of any object allocated
at that address.
GC FAQ table of contents
Techniques and algorithms