GC FAQ -- algorithms

Basic Algorithms

Jargon

root set
The data that is immediately available to a program, without following any pointers. Typically this would include local variables from the activation stack, values in machine registers, and global, static, or module variables.
reachable data
Data that is accessible by following pointers (references) from the root set. Reachability is a conservative approximation of liveness and is used by most garbage collectors.
live data
Data that is reachable, and that the program will actually make use of in the future. Garbage collectors typically cannot tell the difference between live and reachable data, but programmers and compilers sometimes can.
garbage
Data that is not reachable.
semantic garbage
Data that is reachable, but not live.
precise
A garbage collector that can determine unambiguously whether a given value is a pointer. Equivalently, this is a requirement on the run-time system within which the collector operates: it must be possible to identify whether any given datum is a pointer.
conservative
A garbage collector which can scan data without needing to determine with certainty whether an object is a pointer.
semi-conservative
mostly-precise
A collector which permits a mixture of precisely and conservatively identified values. For example, values in registers and on the stack might be treated conservatively, but objects on the heap would be fully described and precisely scanned.
incremental
An incremental collector interleaves collection activity with the actual work of the program. The interleaving is usually orderly; that is, the collector and mutator do not simultaneously access and modify data. If the collector and mutator are viewed as separate threads, then they are scheduled as coroutines.
concurrent
In a concurrent collector, the collector and mutator may simultaneously access and modify data. The interleaving is disorderly, and preemption may occur at any time.
real-time
A real-time garbage collector (yes, it is possible) guarantees that the garbage collection costs associated with any single operation are bounded by a small time constant. The cost of this guarantee is typically reduced throughput, higher memory overheads, and custom code generation. When hard real-time scheduling is not required, an incremental or concurrent collector is probably a better choice.
forwarding-pointer
In a collector which moves objects, a forwarding pointer is a reference installed by the garbage collector from an old location to the new one. Some systems (notably the Lisp machines) have hardware support for forwarding pointers, which allows both the old and new addresses to be used interchangeably without explicit checks for forwarding pointers.
flip
In a copying collector, when an arena has had all active objects removed from it, and is eligible for refilling, the sense of the arenas is changed -- that is, an "old" arena is now a "new" arena, and some "new" arena becomes "old". This is referred to as a flip.
weak reference
weak pointer
A pointer to an object which does not prevent the object from being reclaimed. If the only pointers to an object are from weak references, the object may disappear, in which case the reference is replaced with some distinguished value, typically a language's equivalent of a NULL pointer. Weak references are often used for implementing caches.
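As an illustration, Python's standard weakref module behaves exactly as described: a weak reference does not keep its referent alive, and once the referent is reclaimed the reference yields the distinguished value None. (The immediate reclamation below relies on CPython's reference counting.)

```python
import weakref

class Entry:
    """A toy cache entry; weakref cannot reference all built-in types."""
    def __init__(self, data):
        self.data = data

obj = Entry("payload")
ref = weakref.ref(obj)        # weak reference: does not keep obj alive
assert ref() is obj           # object still reachable via a strong reference

del obj                       # drop the only strong reference
# In CPython, reference counting reclaims the object immediately;
# the weak reference now yields the distinguished value None.
assert ref() is None
```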
write barrier
read barrier
A write barrier is a mechanism for executing some memory management code when a write to some object takes place (that object is then "behind the write barrier", or - informally - "write barrier-ed", or - sloppily - "write-protected"). It can take the form of inlined code (if memory management is integral to the compiler), or a memory-protection fault which is handled by the memory management code. There are also "read barriers", the nature of which is obvious.

The roles a write barrier can play in GC are a little trickier to explain to a novice, but I'll give it a stab.

  1. Consider a simple generational stop-and-collect collector, such as the one which SML/NJ used to have. "Generational" means that data is partitioned into old and new. This partition is useful to the GC for two reasons: (a) because data tends to die young, collecting just new data will probably free a lot of space, and (b) because pointers tend to point from new objects to old objects, and not vice versa, it is cheap to find all the pointers to new objects.

    Property (b) is only true if you can tell when a pointer to a new object has been written into an old object. Otherwise you have to scan all the old objects to find pointers to new objects, which loses one of the main advantages of generational GC. So you put the old data behind a write barrier, and record those writes. When you come to GC the new data, you know the only pointers from old to new are those which you have recorded.

  2. Consider a tracing GC (see note below) which is incremental or concurrent, i.e. the user's program (the 'mutator') can run before the GC is complete. Now there is an invariant: black objects do not point to white objects. If the mutator writes a white pointer into a black object, this invariant is broken and the GC can fail. There are two basic solutions: prevent the mutator from seeing white objects ("read barriers") or prevent the mutator from writing white pointers into black objects ("write barriers"). The write barrier solution puts the black objects behind a write barrier. When a white-on-black write takes place there are various fixes: incrementally grey the white object, regrey the black object, &c.

    (note) For a tracing collector (marking or copying), one conceptually colours the data white (not yet seen by the collector), black (alive and scanned by the collector) and grey (alive but not yet scanned by the collector). The collector proceeds by scanning grey objects for pointers to white objects. The white objects found are turned grey, and the grey objects scanned are turned black. When there are no more grey objects, the collection is complete and all the white objects can be recycled.
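The generational case (1) above can be sketched in a few lines of Python. All the names here are invented for illustration; a real collector inlines the barrier into compiled code or uses page protection.

```python
# Sketch of a generational write barrier maintaining a remembered set
# (hypothetical names; real collectors inline this in compiled code).

class Obj:
    def __init__(self, generation):
        self.generation = generation   # 0 = young, 1 = old
        self.fields = {}

remembered_set = set()   # old objects that may contain pointers to young ones

def write_field(obj, name, value):
    """All pointer stores go through this barrier."""
    if isinstance(value, Obj) and obj.generation > value.generation:
        # An old-to-young pointer was just created: record the old object
        # so a minor GC need only scan the remembered set, not all old data.
        remembered_set.add(obj)
    obj.fields[name] = value

old = Obj(generation=1)
young = Obj(generation=0)
write_field(old, "child", young)
assert old in remembered_set
```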

remembered set
A separately maintained list of all the interesting references from one set of objects to another, so that they need not be found by scanning. In generational garbage collection, this technique is used to track references from an older generation to a younger one. Typically, remembered sets are maintained by write barriers (q.v.).
register set partitioning
Garbage-collected languages often partition the set of registers a priori into those always traced and updated (if necessary) by the collector and those ignored by it. The language always maintains the former in a format understood by the collector; and never uses the latter to hold pointers to collectable objects. More complicated schemes are also possible.
big bag of pages
Big Bag of Pages, or BIBOP, is the technique of using an object's address (page number) to encode information about the object. For example, all objects allocated on a particular page might be the same size, or the same type. This technique can be used to avoid allocating object header bits or header tags.
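A minimal sketch of the idea (toy page size and type table, not any particular system): the page number derived from an address indexes a side table, so the object itself carries no header.

```python
# BIBOP sketch: an object's page number indexes a side table of types,
# so the object itself needs no header tag (toy 256-byte pages).

PAGE_SIZE = 256
page_type = {0: "cons", 1: "float", 2: "string"}   # one object type per page

def type_of(address):
    """Recover an object's type from its address alone."""
    return page_type[address // PAGE_SIZE]

assert type_of(0x00) == "cons"
assert type_of(0x1fe) == "float"     # any address on page 1
```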

Reference counting

For each "object" managed by the collector, maintain a count of pointers referencing that object. Whenever a pointer is assigned to, as in "P := Q", the referent of Q has its reference count incremented, and the former referent of P has its reference count decremented. If the reference count ever falls to zero, then it is known that there are no pointers to the object, and it may be reclaimed. The converse is not true -- some unreferenced objects may not have a reference count of zero, because of cycles. Furthermore, the fields in which reference counts are stored are typically small, so it is also possible that they will overflow, in which case they "stick" at their maximum value -- again, the object cannot be reclaimed. In practice, however, overflow is rare (unless the reference count field is very small), and it is also possible to confine reference counting to data structures that will not participate in cycles (for example, strings or other pointer-free data).
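The "P := Q" rule can be sketched directly (invented names; a real implementation operates on raw memory and a free pool):

```python
# Minimal reference-counting sketch (illustrative names, not a real allocator).

class RCObject:
    def __init__(self):
        self.refcount = 0

FREED = []   # stands in for the free pool

def incrc(obj):
    obj.refcount += 1

def decrc(obj):
    obj.refcount -= 1
    if obj.refcount == 0:
        FREED.append(obj)       # no pointers remain: reclaim the object

def assign(frame, name, value):
    """Models 'P := Q': increment Q's referent, decrement P's old referent."""
    if value is not None:
        incrc(value)
    old = frame.get(name)
    if old is not None:
        decrc(old)
    frame[name] = value

frame = {}
a = RCObject()
assign(frame, "p", a)     # refcount rises to 1
assign(frame, "p", None)  # refcount falls to 0 -> reclaimed
assert a in FREED and a.refcount == 0
```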

In a multi-threaded system, if threads are preemptively scheduled, reference counting requires additional care to ensure that the updates to reference counts are atomic. This is straightforward using hardware primitives such as fetch-and-add, compare-and-swap, or load-linked/store-conditional, but it is definitely necessary to pay attention to this detail.

See also....

Mark-and-sweep

The garbage collector has some way of finding all "root" pointers (e.g., global variables and automatic variables), and for each referenced object, it has some way of finding all the pointers contained within it. Via some iterative process (it could be recursive, but it need not be, see ...) all objects reachable from the root set are found. As objects are found, they are marked as "found". All other objects not marked as "found", must be free, and are returned to the free pool for future reallocation.
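The iterative (non-recursive) process can be sketched with an explicit work-list over a toy object graph; all names here are invented for illustration:

```python
# Iterative mark-and-sweep sketch over a toy object graph
# (illustrative names; a real collector works on raw memory).

class Node:
    def __init__(self):
        self.refs = []
        self.marked = False

heap = [Node() for _ in range(4)]
heap[0].refs.append(heap[1])      # 0 -> 1; nodes 2 and 3 are garbage
roots = [heap[0]]

# Mark phase: an explicit work-list avoids unbounded recursion.
worklist = list(roots)
while worklist:
    obj = worklist.pop()
    if not obj.marked:
        obj.marked = True
        worklist.extend(obj.refs)

# Sweep phase: anything unmarked is free; clear marks for the next cycle.
free_pool = [o for o in heap if not o.marked]
heap = [o for o in heap if o.marked]
for o in heap:
    o.marked = False

assert len(free_pool) == 2
```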

Copying

Copying collection attempts to collect memory by removing all active memory from a storage arena, and then reusing the arena. This has the theoretical advantages of compacting the working set of an application, and of being asymptotically more efficient (because the cost of the collection is proportional only to the size of the reachable data and is unrelated to the size of the arena itself, whereas mark-and-sweep collection must scan the entire arena). However, theory and practice disagree sometimes (in practice). See ftp://parcftp.xerox.com/pub/gc/complexity.html for an opinion on this subject. When an arena has had all active objects removed from it and is eligible for refilling, the sense of the arenas is changed; this is referred to as a "flip" (see the jargon list above).

Henry Baker has requested that people make a distinction between `moving' collection and `copying' collection.
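A Cheney-style breadth-first evacuation is the classic way to implement copying collection; the following is a sketch with invented names, using forwarding pointers (q.v.) to avoid copying an object twice:

```python
# Cheney-style copying sketch: live objects are evacuated from from-space
# to to-space with a breadth-first scan; forwarding pointers prevent
# duplicate copies.

class Cell:
    def __init__(self, refs=()):
        self.refs = list(refs)
        self.forward = None      # forwarding pointer, set once copied

def evacuate(obj, to_space):
    if obj.forward is None:          # not yet copied
        clone = Cell(obj.refs)       # refs fixed up during the scan below
        obj.forward = clone
        to_space.append(clone)
    return obj.forward

def collect(roots):
    to_space = []
    roots = [evacuate(r, to_space) for r in roots]
    scan = 0
    while scan < len(to_space):      # objects copied but not yet scanned
        obj = to_space[scan]
        obj.refs = [evacuate(c, to_space) for c in obj.refs]
        scan += 1
    return roots, to_space

garbage = Cell()
live = Cell()
root = Cell([live])
new_roots, new_space = collect([root])
assert len(new_space) == 2           # root and live copied; garbage is not
```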

Variations

Conservative collection

Conservative garbage collection makes use of the observation that if you are not relocating (copying) objects, then you need not be quite so certain about exactly what is a pointer. It suffices to scan the root set (and objects) for any pointer-like bit patterns, and treat those pointer-like bit patterns as actual pointers while marking. The result is that the "true" reachable objects are all found, along with a few others. This works surprisingly well in practice (if a few additional tricks are added) and often permits the use of garbage collection with oblivious source code and compilers.
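The core test ("does this word look like a heap address?") can be sketched with toy addresses; real collectors like the Boehm-Demers-Weiser collector check candidate words against the actual heap layout:

```python
# Sketch of conservative root scanning: any word whose value looks like
# the address of a heap object is treated as a pointer (toy addresses).

heap_objects = {0x1000: "obj A", 0x2000: "obj B", 0x3000: "obj C"}

# A "root set" of raw words: some are real addresses, some are just
# integers that happen not to look like heap addresses.
root_words = [0x1000, 42, 0x2000, 0xdeadbeef]

marked = {word for word in root_words if word in heap_objects}

# Both true pointers are found; 42 and 0xdeadbeef are rejected.
# An integer that happened to equal 0x3000 would spuriously retain obj C.
assert marked == {0x1000, 0x2000}
```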

Conservative collection is very important because it permits the use of garbage collection with programs that were not written with garbage collection in mind. That is, it simplifies the use of garbage collection with existing (non-garbage-collected) libraries of code.

Generational collection

Based on the observation that most objects have short lifetimes, it is useful to restrict garbage collection to the most recently allocated objects.

A generational collector maintains several ``generations'' of objects. Newly created objects are all put in the ``youngest'' generation, and when the space allocated for that generation is full, the collector will use the root set (and any pointers from older generations to the youngest generation -- see below) to reclaim dead objects from the youngest generation only, leaving the ``older'' generations untouched. Objects that survive after several (perhaps just one) collection of the youngest generation are ``promoted'' to the next older generation, and when that generation becomes full, it and all the generations younger than it are collected.

The difficulty with generational collectors is the identification of pointers from older generations into younger ones. An observation is that such references are uncommon in most programs: new objects typically point to older objects; this is clearly true in pure functional languages where assignment is prohibited, but is common for many programs and languages. Nonetheless, such references do exist, and a collector must handle them. When a pointer to a younger generation object is written into an old object, the pointer (or just the old object) is recorded so it can be scanned when the younger generation is collected.
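A minor (youngest-generation) collection can then be sketched as tracing from the roots plus the recorded old-to-young pointers, touching only young objects (invented names, toy graph):

```python
# Sketch of a minor collection: only young objects are scanned, using the
# roots plus the remembered set of old->young pointers.

class Obj:
    def __init__(self, gen):
        self.gen = gen           # 0 = young, 1 = old
        self.refs = []

young = [Obj(0) for _ in range(3)]
old = Obj(1)
old.refs.append(young[1])        # this store was recorded by the write barrier
remembered = [old]
roots = [young[0]]               # young[2] is unreachable

survivors = set()
worklist = roots + [r for o in remembered for r in o.refs if r.gen == 0]
while worklist:
    obj = worklist.pop()
    if obj.gen == 0 and obj not in survivors:
        survivors.add(obj)       # survivors would be promoted or kept
        worklist.extend(obj.refs)

assert survivors == {young[0], young[1]}   # young[2] is reclaimed
```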

Deferred reference counting

In deferred reference counting, local updates to the reference count (that is, those arising from assignments to and from local variables) are "deferred", as long as it is known that the reference count will remain positive whenever it should be positive. For example, the code to swap the (reference-counted) values contained in two variables a and b naively performs the following operations:
incrc(a); tmp := a;
incrc(b); decrc(a); a := b;
incrc(tmp); decrc(b); b := tmp
decrc(tmp); (tmp goes out of scope)
After all the dust has cleared, after six reference count operations, the reference counts of the objects referred to by a and b (now by b and a) are unchanged. It is more efficient to recognize this and leave the reference counts unchanged.

Lazy Freeing

Note that naive reference counting can take arbitrarily long to return from a decrement, because freeing one object may recursively trigger the freeing of other objects whose counts drop to zero. Instead, an object whose reference count falls to zero may be placed on a queue/list/stack of objects that are no longer in use, but not yet processed onto the free list. Alternatively, when an object is allocated off the free list, the pointers to objects within it may be scanned. This is also known as "Weizenbauming", at least in some circles.
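The queue-based variant can be sketched as follows (invented names): decrementing a count to zero is constant time, and the recursive decrements are paid for incrementally when the dead object is reused.

```python
# Sketch of lazy freeing: a zero-count object goes on a pending list,
# and its children are only decremented when the object is actually reused.

class Obj:
    def __init__(self, refs=()):
        self.refcount = 1
        self.refs = list(refs)

pending = []      # objects with refcount 0, not yet processed

def decrc(obj):
    obj.refcount -= 1
    if obj.refcount == 0:
        pending.append(obj)   # constant time: no recursive freeing here

def allocate():
    """Reuse one dead object; scan its children now (bounded work per call)."""
    obj = pending.pop() if pending else None
    if obj is not None:
        for child in obj.refs:
            decrc(child)      # children may in turn join the pending list
    return obj

leaf = Obj()
parent = Obj(refs=[leaf])
decrc(parent)                 # parent dies; leaf's count is untouched so far
assert leaf.refcount == 1
allocate()                    # reusing parent finally decrements leaf
assert leaf.refcount == 0 and leaf in pending
```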

One-bit reference counting

William Stoye's trick in his combinator machine. Reference counts are frequently one, and noting this in the pointer, rather than the object, can save space, cut memory references, and allow the easy recycling of vast amounts of garbage. In a combinator machine, the actual primitives can be parameterized by the reference counts of their operands to recycle memory in place. Note, however, that a combinator machine generates vast amounts of garbage to begin with; this technique is unlikely to work as well in general use.

Mark-and-don't-sweep

This might also be regarded as "lazy sweep" or "incremental sweep". Whenever memory is needed, the allocator scans objects until it discovers one that is marked "unused" and is also large enough for the request. All objects scanned, whether large enough or not, are marked "used". When the scan pointer reaches the end of memory, the sense of the mark bits is reversed (that is, if the current assignment is 0==used and 1==unused, then the new assignment is 0==unused, 1==used). Temporarily, all objects on the heap are now marked "unused". Next, a mark phase is run, which tags all reachable objects as "used" so that they will not be reallocated later.
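A much-simplified sketch (fixed-size cells, invented names, and with the re-marking phase reduced to a comment) shows the allocation scan and the sense reversal:

```python
# Sketch of mark-and-don't-sweep allocation with a reversible mark sense.

heap = [{"used": False} for _ in range(4)]   # toy fixed-size cells
sense = True       # the bit value that currently means "used"
scan = 0           # allocation scan pointer

def allocate():
    global scan, sense
    while True:
        if scan == len(heap):
            # End of memory: reverse the mark sense, so every object is
            # instantly "unused". A real collector would now run a mark
            # phase to re-tag reachable objects as "used".
            sense = not sense
            scan = 0
        cell = heap[scan]
        scan += 1
        if cell["used"] != sense:     # found an unused cell
            cell["used"] = sense      # mark it used and hand it out
            return cell

a = allocate()
b = allocate()
assert a is heap[0] and b is heap[1]
```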

Snapshot mark-and-sweep

Snapshot mark-and-sweep uses the observation that the set of unreachable objects does not shrink. It is possible (using, for instance, copy-on-write page mapping) to quickly make a copy of the address space, process it concurrently to determine what is garbage, and send that information back to the running process. More garbage may have been generated in the interim, but that is ok, because it will be found in the next collection.

Baker's real time copying collection

Baker's algorithm works by deferring collection, but maintaining a "mutator invariant". The basic algorithm is a modification of copying-compacting collecting. The "mutator invariant" is that the mutator (the program performing useful work) can only observe pointers to new space. The deferred collection means that in fact, new space itself may contain pointers to old space, so the mutator invariant must be checked and restored (through incremental forwarding) whenever a pointer is loaded from memory.

This is better-explained with pictures.
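Absent pictures, here is a rough sketch of the read-barrier half of the idea (invented names; Baker's actual algorithm also interleaves incremental copying with allocation): every pointer load checks for a forwarding pointer and evacuates old-space objects on demand, so the mutator never observes an old-space pointer.

```python
# Sketch of Baker's read barrier: loads check for forwarding pointers and
# incrementally evacuate old-space objects into new space.

class Cell:
    def __init__(self, value):
        self.value = value
        self.forward = None       # set once the object moves to new space

new_space = []

def evacuate(obj):
    if obj.forward is None:
        clone = Cell(obj.value)
        obj.forward = clone
        new_space.append(clone)
    return obj.forward

def load(slot, name):
    """Read barrier: the mutator never observes an old-space pointer."""
    obj = slot[name]
    if obj not in new_space:
        obj = evacuate(obj)
        slot[name] = obj          # restore the mutator invariant in place
    return obj

frame = {"x": Cell(7)}            # an old-space object
seen = load(frame, "x")
assert seen in new_space and seen.value == 7
assert load(frame, "x") is seen   # later loads see the new copy, no re-copy
```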

Appel-Ellis-Li "real time" concurrent collection

It isn't really real time (see Jeff Barth's question from the audience at 1988 SIGPLAN PLDI) but that's probably ok, unless you are feeling pedantic. Use page protection to enforce the mutator invariant, and move objects on the entire page when one of them is loaded. (...)

Bartlett's conservative-compacting collection

Joel Bartlett's conservative-compacting collector fragments the "new" and "old" spaces of a typical copying-compacting collector into "new" and "old" pages. To add conservatism, his collector classifies pointers into definite and indefinite. A definite pointer is definitely a pointer -- if the object it references is relocated, the pointer can be updated, because it is definitely NOT an integer, floating point value, or portion of a bitmap. An indefinite pointer (typically one found on the stack, where an uncooperative compiler has placed variables willy-nilly) cannot be changed, because it might not really be a pointer. Thus, any object referenced by an indefinite pointer cannot be relocated. In the simplest form, this collector scans from all the indefinite pointers first (if only the stack is indefinite, this is not hard) and pins every object so referenced in place. The collection then finishes by tracing from the definite pointers; any object reached only through definite pointers (that is, not directly referenced by any indefinite pointer) may be relocated.

Treadmill collection

A treadmill collector is based on the observation that the regions used in a copying-compacting collector (real-time or not) need not be identified with address ranges. It is also possible to use doubly-linked lists (unit-time insertion and deletion) together with a few membership bits to indicate which list an arbitrary object is on. "Flips" are accomplished by changing the interpretation of the membership bits, rather than by updating the bits in every object. For small objects, the space overhead is relatively high (two list pointers per object).

Goldberg's tagless collection

(Appel's work predates this, but I've got the conference proceedings for Goldberg's work. I'll try to get a full collection of references.) Astoundingly, in a statically typed language with a type system as complex as ML's, it is not necessary to tag pointers to make a garbage collector work. Goldberg showed something much more interesting -- that it is not always possible to reconstruct the types of all reachable objects (without auxiliary information maintained at run time). However, and this is the fascinating part, such objects are guaranteed to be "semantic" garbage -- i.e., even though they are reachable, they will never be accessed. There are other tagless collectors for such languages: Aditya & Caro's type reconstructor for Id, Tolmach's tag-free collection for ML, and Greg Morrisett's tag-free implementation in the TIL/ML compiler.

Blacklisting in a conservative collector

In a conservative collector, it is often useful to look for bit patterns that aren't yet valid pointers, but that are likely to become valid if the size of the heap is increased. The pages (objects) that they reference should be "blacklisted" so that they are never used, because if the bit pattern doesn't change, it will cause spurious retention of any object allocated at that address.