Garbage Collection Without Unsafe Code
Many people, including myself, have implemented garbage collection (GC)
libraries for Rust. Manish Goregaokar wrote up a fantastic survey of this
space a few years ago. These libraries aim to provide a safe API for their users
to consume: an unsafe
-free interface which soundly encapsulates and hides the
library’s internal unsafe
code. The one exception is their mechanism to
enumerate the outgoing GC edges of user-defined GC types, since failure to
enumerate all edges can lead the collector to believe that an object is
unreachable and collect it, despite the fact that the user still has a reference
to the reclaimed object, leading to use-after-free bugs.1 This
functionality is generally exposed as an unsafe
trait for the user to
implement because it is the user’s responsibility, not the library’s, to uphold
this particular critical safety invariant.
However, despite providing safe interfaces, all of these libraries make
extensive use of unsafe
code in their internal implementations. I’ve always
believed it was possible to write a garbage collection library without any
unsafe
code, and no one I’ve asserted this to has disagreed, but there has
never been a proof by construction.
So, finally, I created the safe-gc
crate: a garbage collection library for
Rust with zero unsafe
code. No unsafe
in the API. No unsafe
in the
implementation. It even has a forbid(unsafe_code)
pragma at the top.
That said, safe-gc
is not a particularly high-performance garbage collector.
Using safe-gc
To use safe-gc
, first we define our GC-managed types, using Gc<T>
to define
references to other GC-managed objects, and implement the Trace
trait to
report each of those GC edges to the collector:
This looks pretty similar to other GC libraries in Rust, although it could
definitely benefit from an implementation of Trace
for Option<T>
and a
derive(Trace)
macro. The big difference from existing GC libraries is that
Trace
is safe to implement; more on this later.
Next, we create one or more Heap
s to allocate our objects within. Each heap is
independently garbage collected.
And with a Heap
in hand, we can allocate objects:
The heap will automatically trigger garbage collections, as necessary, but we can also force a collection if we want:
Rather than deref’ing Gc<T>
pointers directly, we must index into the Heap
to access the referenced T
object. This contrasts with other GC libraries and
is the key that unlocks safe-gc
’s lack of unsafe
code, allowing the
implementation to abide by Rust’s ownership and borrowing discipline.2
Finally, there are actually two types for indexing into Heap
s to access GC
objects:
Gc<T>
, which we have seen already, andRoot<T>
, which we have also seen in action, but which was hidden from us by type inference.
The Gc<T>
type is Copy
and should be used when referencing other GC-managed
objects from within a GC-managed object’s type definition, or when you can prove
that a garbage collection will not happen (i.e. you have a shared borrow of its
heap). A Gc<T>
does not root its referenced T
, keeping it alive across
garbage collections, and therefore Gc<T>
should not be used to hold onto GC
references across any operation that can trigger a garbage collection.
A Root<T>
, on the other hand, does indeed root its associated T
object,
preventing the object from being reclaimed during garbage collection. This makes
Root<T>
suitable for holding references to GC-managed objects across
operations that can trigger garbage collections. Root<T>
is not Copy
because
dropping it must remove its entry from the heap’s root set. Allocation returns
rooted references; all the heap.alloc(...)
calls from our earlier examples
returned Root<T>
s.
Peeking Under the Hood
A safe_gc::Heap
is more similar to an arena newtype over a
Vec
than an engineered heap with hierarchies of
regions like
Immix. Its
main storage is a hash map from std::any::TypeId
to uniform arenas of the
associated type. This lets us ultimately use Vec
as the storage for
heap-allocated objects, and we don’t need to do any unsafe pointer arithmetic or
worry about splitting large blocks in our free lists. In fact, the free lists
only manage indices, not blocks of raw memory.
To allocate a new T
in the heap, we first get the T
object arena out of the
heap’s hash map, or create it if it doesn’t exist yet. Then, we check if the
arena has capacity to allocate our new T
. If it does, we push the object onto
the arena and return a rooted reference. If it does not, we fall back to an
out-of-line slow path where we trigger a garbage collection to ensure that we
have space for the new object, and then try again.
Arena<T>
allocation bottoms out in allocating from a FreeList<T>
, which will
attempt to use existing capacity by popping off its internal list of empty
entries when possible, or otherwise fall back to reserving additional capacity.
Accessing objects in the heap is straightforward: look up the arena for T
and
index into it.
Before we get into how safe-gc
actually performs garbage collection, we need
to look at how it implements the root set. The root set are the set of things
that are definitely alive; things that the application is actively using right
now or planning to use in the future. The goal of the collector is to identify
all objects transitively referenced by these roots, since these are the objects
that can still be used in the future, and recycle all others.
Each Arena<T>
has its own RootSet<T>
. For simplicity a RootSet<T>
is a
wrapper around a FreeList<Gc<T>>
. When we add new roots, we insert them into
the FreeList
, and when we drop a root, we remove it from the FreeList
. This
does mean that the root set can contain duplicates and is therefore not a proper
set. The root set’s FreeList
is additionally wrapped in an Rc<RefCell<...>>
so that we can implement Clone
for Root<T>
, which adds another entry in the
root set, and don’t need to explicitly pass around a Heap
to hold additional
references to a rooted object.
Finally, I took care to design Root<T>
and RootSet<T>
such that Root<T>
doesn’t directly hold a Gc<T>
. This allows for updating rooted GC pointers
after a collection, which is necessary for moving GC algorithms like
generational GC and compaction. In fact, I originally intended to implement a
copying collector, which is a moving GC algorithm, for safe-gc
but ran into
some issues. More on those later. For now, we retain the possibility of
introducing moving GC at a later date.
With all that out of the way, we can finally look at the core garbage collection algorithm.
safe-gc
implements simple mark-and-sweep garbage collection. We begin by
resetting the mark bits for each arena, and making sure that there are enough
bits for all of our allocated objects, since we keep the mark bits in an
out-of-line compact bitset rather than in each object’s header word or something
like that.
Next we begin the mark phase. This starts by iterating over each root and then
setting its mark bit and enqueuing it in the mark stack by calling
collector.edge(root)
.
The mark phase continues by marking everything transitively reachable from those roots in a fixed-point loop. If we discover an unmarked object, we mark it and enqueue it for tracing. Whenever we see an already-marked object, we ignore it.
What is kind of unusual is that we don’t have a single mark stack. The Heap
has no T
type parameter, and contains many different types of objects, so the
heap itself doesn’t know how to trace any particular object. However, each of
the heap’s Arena<T>
s holds only a single type of object, and an arena does
know how to trace its objects. So we have a mark stack for each T
, or
equivalently, each arena. This means that our fixed-point loop has two levels:
an outer loop that continues while any mark stack has work enqueued, and an
inner loop to drain a particular mark stack.
While the driver loop for marking is inside the Heap::gc
method, the actual
edge tracing and mark bit setting happens inside Collector
and the arena
which, because it has a T
type parameter, can call the correct Trace
implementation for each object.
Once our mark stacks are all empty, we’ve reached our fixed point, and that means we’ve finished marking all objects reachable from the root set. Now we transition to the sweep phase.
Sweeping iterates over each object in each arena. If that object’s mark bit is not set, then it is unreachable from the GC roots, i.e. it is not a member of the live set, i.e. it is garbage. We drop such objects and push their slots into their arena’s free list, making the slot available for future allocations.
After sweeping each arena we check whether the arena is still close to running out of capacity and, if so, reserve additional space for the arena. This amortizes the cost of garbage collection and avoids a scenario that could otherwise trigger a full GC on every object allocation:
- The arena has zero available capacity.
- The user tries to allocate, triggering a GC.
- The GC is able to reclaim only one slot in the arena.
- The user’s pending allocation fills the reclaimed slot.
- Now the arena is out of capacity again, and the process repeats from the top.
By reserving additional space in the arena after sweeping, we avoid this failure mode.
We could also compact the arena and release excess space back to the global
allocator if there was too much available capacity. This would additionally
require a method for updating incoming edges to the compacted objects, and
safe-gc
does not implement compaction at this time.
After every arena is swept, garbage collection is complete!
Preventing Classic Footguns
Now that we know how safe-gc
is implemented, we can explore a couple classic
GC footguns and analyze how safe-gc
either completely nullifies them or
downgrades them from critical security vulnerabilities to plain old bugs.
Often an object might represent some external resource that should be cleaned up
when the object is no longer in use, like an open file descriptor. This
functionality is typically supported with finalizers, the GC-equivalent of C++
destructors and Rust’s Drop
trait. Finalization of GC objects is usually
tricky because of the risks of either accessing objects that have already been
reclaimed by the collector (which is a use-after-free bug) or accidentally
entrenching objects and making them live again (which leads to memory
leaks). Because of these risks, Rust GC libraries often make finalization an
unsafe
trait and even forbid allocating types that implement Drop
in their
heaps.
However, safe-gc
doesn’t need an unsafe
finalizer trait, or even any
additional finalizer trait: it can just use Drop
. Drop
implementations
simply do not have access to a Heap
, which is required to deref GC pointers,
so they cannot suffer from those finalization footguns.
Next up: why isn’t Trace
an unsafe
trait? And what happens if you don’t root
a Gc<T>
and then index into a Heap
with it after a garbage collection? These
are actually the same question: what happens if I use a dangling Gc<T>
? As
mentioned at the start, if a Trace
implementation fails to report all edges to
the collector, the collector may believe an object is unreachable and reclaim
it, and now the unreported edge is dangling. Similarly, if the user holds an
unrooted Gc<T>
, rather than a Root<T>
, across a garbage collection then the
collector might believe that the referenced object is garbage and reclaim it,
leaving the unrooted reference dangling.
Indexing into a Heap
with a potentially-dangling Gc<T>
will result in one of
three possibilities:
-
We got “lucky” and something else happened to keep the object alive. The access succeeds as it otherwise would have and the potentially-dangling bug is hidden.
-
The associated slot in the arena’s free list is empty and contains a
FreeListEntry::Free
variant. This scenario will raise a panic. -
A new object has since been allocated in the same arena slot. The access will succeed, but it will be to the wrong object. This is an instance of the ABA problem. We could, at the cost of some runtime overhead, turn this into a loud panic instead of silent action at a distance by adding a generation counter to our arenas.
Of course, it would be best if users always rooted GC references they held
across collections and correctly implemented the Trace
trait but, should they
fail to do that, all three potential outcomes are 100% memory
safe.3 These failures can’t lead to memory corruption or
use-after-free bugs, which would be the typical results of this kind of thing
with an unsafe GC implementation.
Copying Collector False Start
I initially intended to implement a copying collector rather than mark-and-sweep, but ultimately the borrowing and ownership didn’t pan out. That isn’t to say it is impossible to implement a copying collector in safe Rust, but it ended up feeling like more of a headache than it was worth. I spent several hours trying to jiggle things around to experiment with different ownership hierarchies and didn’t get anything satisfactory. When I decided to try mark-and-sweep, it only took me about half an hour to get an initial prototype working. I found this really surprising, since I had a strong intuition that a copying collector, with its separate from- and to-spaces, should play well with Rust’s ownership and borrowing.
Briefly, the algorithm works as follows:
-
We equally divide the heap into two semi-spaces.
-
At any given time in between collections, all objects live in one semi-space and the other is sitting idle.
-
We bump allocate within the active semi-space, slowly filling it up, and when the bump pointer reaches the end of the semi-space, we trigger a collection.
-
During collection, as we trace the live set, we copy objects from the old semi-space that has been active, to the other new semi-space that has been idle. At the same time, we maintain a map from the live objects’ location in the old semi-space to their location in the new semi-space. When we trace an object’s edges, we also update those edges to point to their new locations. Once tracing reaches a fixed-point, we’ve copied the whole live set to the new semi-space, it becomes the active semi-space, and the previously-active semi-space now sits idle until the next collection.
Copying collection has a number of desirable properties:
-
The algorithm is relatively simple and easy to understand.
-
Allocating new objects is fast: just bumping a pointer and checking that space isn’t exhausted yet.
-
The act of copying objects to the new semi-space compacts the heap, defeating fragmentation.
-
It also eliminates the need for a sweep phase, since the whole of the old semi-space is garbage after the live set has been moved to the new semi-space.
Copying collection’s primary disadvantage is the memory overhead it imposes: we can only ever use at most half of the heap to store objects.
When I think about a copying collector, I tend to imagine Lisp cons cells, as I was first introduced to this algorithm in that context by SICP. Here is what a very naive implementation of the core copying collection algorithm might look like in safe Rust:
As written, this works and is 100% safe!4 So where do things start to break down? We’ll get there, but first…
The old-to-new-location map needn’t be an additional, separate allocation. We don’t need that hash map. Instead, we can reuse the from-space’s storage and write the address of each copied object’s new location inline into its old location. These are referred to as forwarding pointers. This is a super standard optimization for copying collection; so much so that it’s rare to see a copying collector without it.
Let’s implement inline forwarding pointers for our safe copying
collector. Because we are mutating the from-space to write the forwarding
pointers, we will need to change it from a shared borrow into an exclusive
borrow. Additionally, to differentiate between forwarding pointers and actual
cons cells, our semi-spaces must become slices of an enum
rather than slices
of cons cells directly.
Again, this copying collector with forwarding pointers works and is still 100% safe code.
Things break down when we move away from a homogeneously-typed heap that only contains cons cells towards a heterogeneously-typed heap that can contain any type of GC object.
Recall how safe_gc::Heap
organizes its underlying storage with a hash map
keyed by type id to get the Arena<T>
storage for that associated type:
My idea was that a whole Heap
would be a semi-space, and if it was the active
semi-space, the heap would additionally have an owning handle to the idle
semi-space:
Given that, we would collect the heap by essentially (papering over some
details) calling copy_collect
on each of its internal arenas:
Note that we pass the whole to_heap
into copy_collect
, not from_arena
’s
corresponding Arena<T>
in the to-space, because there can be cross-type
edges. A Cat
object can have a reference to a Salami
object as a little
treat, and we need access to the whole to-space, not just its Arena<Cat>
, to
copy that Salami
over when tracing Cat
s.
But here’s where things break down: we also need mutable access the whole
from-space when tracing Arena<Cat>
s because we need to write the forwarding
pointer in the from-space’s Arena<Salami>
for the Salami
’s new location in
the to-space. But we can’t have mutable access to the whole from-space because
we’ve already projected into one of its arenas. Yeah, I guess we could use
something like take the Arena<Cat>
out of the from-space, and then pass both
the Arena<Cat>
and the whole from-space into copy_collect
. But then what do
we do for Cat
-to-Cat
edges? Have some kind of check to test for whether we
need to follow a given edge into the from-space Heap
or the Arena
we are
currently tracing?
Like I said, I don’t think it is impossible to overcome these hurdles, the
question is: is overcoming them worth it? Everything I could think up got pretty
inelegant pretty quickly and/or would have laughably poor performance.5
When compared with how easy it was to implement mark-and-sweep, I just don’t
think a 100% unsafe
-free copying collector that supports arbitrary,
heterogeneous types is worth the headache.
Why safe-gc
?
safe-gc
is certainly a point in the design space of garbage-collection
libraries in Rust. One could even argue it is an interesting — and maybe
even useful? — point in the design space!
Also, it was fun!
At the very least, you don’t have to wonder about the correctness of any
unsafe
code in there, because there isn’t any. As long as the Rust language
and its standard library are sound, safe-gc
is too.
Conclusion
The safe-gc
crate implements garbage-collection-as-library for Rust with
zero unsafe
code. It was fun to implement!
Thanks to Trevor Elliott and Jamey Sharp for brainstorming with me and thanks to Manish Goregaokar and again to Trevor Elliott for reading early drafts of this blog post.
-
In the garbage collection literature, we think about the heap of GC-managed objects as a graph where each object is a node in that graph and the graph’s edges are the references from one object to another. ↩
-
The one exception to this statement that I’m aware of is the
gc-arena
crate, although it is only half an exception. Similar tosafe-gc
, it also requires threading through a heap context (that it calls aMutation
) to access GC objects, although only for allocation and mutable access to GC objects. Getting shared, immutable borrows of GC objects doesn’t require threading in a heap context. ↩ -
I do have sympathy for users writing these bugs! I’ve written them myself. Remembering to root GC references across operations that can trigger collections isn’t always easy! It can be difficult to determine which things can trigger collections or whether some reference you’re holding has a pointer to another structure which internally holds onto a GC reference. The SpiderMonkey GC folks had to resort to implementing a GCC static analysis plugin to find unrooted references held across GC-triggering function calls. This analysis runs in Firefox’s CI because even the seasoned systems engineers who work on SpiderMonkey and Firefox routinely make these mistakes and the resulting bugs are so disastrous! ↩
-
Well, this collector works in principle; I haven’t actually compiled it. I wrote it inside this text file, so it probably has some typos and minor compilation errors or whatever. But the point stands: you could use this collector for your next toy lisp. ↩
-
I’m not claiming that
safe-gc
has incredible performance, I haven’t benchmarked anything and it almost assuredly does not. But its performance shouldn’t be laughably bad, and I’d like to think that with a bit of tuning it would be competitive with just about any otherunsafe
-free Rust implementation. ↩