A Compact Representation of Captured Stack Frames for SpiderMonkey
Last year, I implemented a new, compact representation for captured JavaScript
stacks in SpiderMonkey1. This was done to support the allocation
tracking I built into SpiderMonkey's Debugger
API2 3, which
saves the stack for potentially every Object
allocation in the observed
globals. That's a lot of stacks, and so it was important to make sure that we
incurred as little overhead to memory usage as we could. Even more so, because
we'd like to affect memory usage as little as possible while simultaneously
instrumenting and measuring it.
The key observation is that while there may be many allocations, there are
many fewer allocation sites. Most allocations happen at a few places. Thus,
much information is duplicated across stacks. For example, if we run the esprima
JavaScript parser on its own JavaScript source, there are approximately 54,700
total Object
allocations, but just ~1,400 unique JS stacks at allocation
time. There are only ~200 allocation sites if you consider only the youngest
stack frame.
Consider the example below:
Disregarding compiler optimizations removing allocations, arguments
objects,
as well as engine internal allocations, calling the a
function allocates three
Object
s. With arrows representing the "called by" relationship from a younger
frame to its older, parent frame, this is the set of stacks that existed during
an Object
allocation:
c -> b -> a
d -> b -> a
e -> b -> a
Instead of duplicating all these saved stack frames, we use a technique called hash consing. Hash consing lets us share older tail stack frames between many unique stacks, similar to the way a trie shares string prefixes. With hash consing, SpiderMonkey internally represents those stacks like this:
c -> b -> a
^
d ---|
|
e ---'
Each frame is stored in a hash table. When saving new stacks, we use this table to find pre-existing saved frames. If such an object is already extant, it is reused; otherwise a new saved frame is allocated and inserted into the table. During the garbage collector's sweep phase, we remove old entries from this hash table that are no longer used.
I just landed a series of patches to make Error
objects use these hash cons'd
stacks internally, and lazily stringify it when the stack
property is
accessed4. Previously, every time an Error
object was created,
the stack string was eagerly computed. This change can result in significant
memory savings for JS programs that allocate many Error
objects. In one Real
World™ test case5, we dropped memory usage from 1.2GB down
to 167MB. Additionally, we use almost half of the memory that V8 uses on this
same test case.
Many thanks to Jim Blandy for guidance and reviews. Thanks also to
Boris Zbarsky, Bobby Holley, and Jason Orendorff
for insight into implementing the security wrappers needed for passing stack
frame objects between privileged and un-privileged callers. Thanks to
Paolo Amadini for adding support for tracing stacks across asynchronous
operations (more about this in the future). Thanks again to Boris Zbarsky for
making DOMException
s use these hash cons'd stacks. Another thanks to Jason
Orendorff for reviewing my patches to make the builtin JavaScript Error
objects use hash cons'd stacks behind the scenese.
In the future, I would like to make capturing stacks even faster by avoiding re-walking stack frames that we already previously walked capturing another stack6. The cheaper capturing a stack is, the more ubiquitous it can be throughout the the platform, and that puts more context and debugging information into developers' hands.