Combining Coverage-Guided and Generation-Based Fuzzing
Coverage-guided fuzzing and generation-based fuzzing are two powerful approaches to fuzzing. It can be tempting to think that you must either use one approach or the other at a time, and that they can’t be combined. However, this is not the case. In this blog post I’ll describe a method for combining coverage-guided fuzzing with structure-aware generators that I’ve found to be both effective and practical.
What is Generation-Based Fuzzing?
Generation-based fuzzing leverages a generator to create random instances of
the fuzz target’s input type. The csmith
program, which generates
random C source text, is one example generator. Another example is any
implementation of the Arbitrary
trait from the quickcheck
property-testing framework. The fuzzing engine repeatedly uses the generator to
create new inputs and feeds them into the fuzz target.
The generator can be made structure aware, leveraging knowledge about the fuzz target to generate inputs that are more likely to be interesting. They can generate valid C programs for fuzzing a C compiler. They can make inputs with the correct checksums and length prefixes. They can create instances of typed structures in memory, not just byte buffers or strings. But naïve generation-based fuzzing can’t learn from the fuzz target’s execution to evolve its inputs. The generator starts from square one each time it is invoked.
What is Coverage-Guided Fuzzing?
Rather than throwing purely random inputs at the fuzz target, coverage-guided fuzzers instrument the fuzz target to collect code coverage. The fuzzer then leverages this coverage information as feedback to mutate existing inputs into new ones, and tries to maximize the amount of code covered by the total input corpus. Two popular coverage-guided fuzzers are libFuzzer and AFL.
The coverage-guided approach is great at improving a fuzzer’s efficiency at creating new inputs that poke at new parts of the program, rather than testing the same code paths repeatedly. However, in its naïve form, it is easily tripped up by the presence of things like checksums in the input format: mutating a byte here means that a checksum elsewhere must be updated as well, or else execution will never proceed beyond validating the checksum to reach more interesting parts of the program.
The Problem
Coverage-based fuzzing lacks a generator’s understanding of which inputs are well-formed, while generation-based fuzzing lacks coverage-guided fuzzing’s ability to mutate inputs strategically. We would like to combine coverage-guided and generation-based fuzzing to get the benefits of both without the weaknesses of either.
So how do we implement a fuzz target?
When writing a fuzz target for use with coverage-guided fuzzers, you’re typically given some byte buffer, and you feed that into your parser or process it in whatever way is relevant.
However, when writing a fuzz target for use with a generation-based fuzzer, instead of getting a byte buffer, you’re given a structure-aware input that was created by your generator.
The signatures for coverage-guided and generation-based fuzz targets have different input types, and it isn’t obvious how we can reuse a single fuzz target with each approach to fuzzing separately, let alone combine both approaches at the same time.
A Solution
As a running example, let’s imagine we are authoring a crate that converts back and forth between RGB and HSL colors.
Now, let’s start by writing a couple structure-aware test case generators by
implementing quickcheck::Arbitrary
for our color types, and then using
quickcheck
’s default test runner to get generation-based fuzzing up and
running.
Here are our Arbitrary
implementations:
Now we need to define what kinds of properties we want to check and what oracles we want to implement to check those properties for us. One of the simplest thigns we can assert is “doing operation X does not panic or crash”, but we might also assert that converting RGB into HSL and back into RGB again is the identity function.
Now we can run cargo test
and quickcheck
will do some generation-based
fuzzing for our RGB and HSL conversion crate — great!
However, so far we have just done plain old generation-based fuzzing. We also want to get the advantages of coverage-guided fuzzing without giving up our nice structure-aware generators.
An easy and practical — yet still effective — way to add coverage-guided fuzzing into the mix, is to treat the raw byte buffer input provided by the coverage-guided fuzzer as a sequence of random values and implement a “random” number generator around it. If our generators only use “random” values from this RNG, never from any other source, then the coverage-guided fuzzer can mutate and evolve its inputs to directly control the structure-aware fuzzer and its generated test cases!
The BufRng
type from my bufrng
crate is exactly this “random”
number generator implementation:
BufRng
will generate “random” values, which are copied from its given
buffer. Once the buffer is exhausted, it will keep yielding zero.
Because there is a blanket implementation of quickcheck::Gen
for all types
that implement rand_core::RngCore
, we can use BufRng
with any
quickcheck::Arbitrary
implementation.
Finally, here is a fuzz target definition for a coverage-guided fuzzer that uses
BufRng
to reinterpret the input into something that our Arbitrary
implementations can use:
With BufRng
, going from quickcheck
property tests to combined
structure-aware test case generation and coverage-guided fuzzing only required
minimal changes!
Conclusion
We can get the benefits of both structure-aware test case generators and
coverage-guided fuzzing in an easy, practical way by interpreting the fuzzer’s
raw input as a sequence of random values that our generator uses. Because
BufRng
can be used with quickcheck::Arbitrary
implementations directly, it
is easy to make the leap from generation-based fuzzing with quickcheck
property tests to structure-aware, coverage-guided fuzzing with libFuzzer or
AFL. With this setup, the fuzzer can both learn from program execution feedback
to create new inputs that are more effective, and it can avoid checksum-style
pitfalls.
If you’d like to learn more, check out these resources:
-
Structure-Aware Fuzzing with libFuzzer. A great tutorial describing a few different ways to do exactly what it says on the tin.
-
Write Fuzzable Code by John Regehr. A nice collection of things you can do to make your code easier to fuzz and get the most out of your fuzzing.
-
cargo fuzz
. A tool that makes using libFuzzer with Rust a breeze. -
quickcheck
. A port of the popular property-testing framework to Rust. -
bufrng
. A “random” number generator that yields pre-determined values from a buffer (e.g. the raw input generated by libFuzzer or AFL).
Finally, thanks to Jim Blandy and Jason Orendorff for reading an early draft of this blog post. This text is better because of their feedback, and any errors that remain are my own.
Post Script
After I originally shared this blog post, a few people made a few good points that I think are worth preserving and signal boosting here!
Rohan Padhye et al implemented this same idea for Java, and they wrote a paper about it: JQF: Coverage-Guided Property-Based Testing in Java
Manish Goregaokar pointed out that cargo-fuzz
uses the arbitrary
crate,
which has its own Arbitrary
trait that is different from
quickcheck::Arbitrary
.
Notably, its paradigm is “generate an instance of Self
from this buffer of
bytes” rather than “generate an instance of Self
from this RNG” like
quickcheck::Arbitrary
does. If you are starting from scratch, rather than
reusing existing quickcheck
tests, it likely makes sense to implement the
arbitrary::Arbitrary
trait instead of (or in addition to!) the
quickcheck::Arbitrary
trait, since using it with a coverage-guided fuzzer
doesn’t require the BufRng
trick.
John Regehr pointed out that this pattern works well when small mutations to the input string result in small changes to the generated structure. It works less well when a small change (e.g. at the start of the string) guides the generator to a completely different path, generating a completely different structure, which leads the fuzz target down an unrelated code path, and ultimately we aren’t really doing “proper” coverage-guided, mutation-based fuzzing anymore. Rohan Padhye reported that he experimented with ways to avoid this pitfall, but found that the medicine was worse than the affliction: the overhead of avoiding small changes in the prefix vastly changing the generated structure was greater than just running the fuzz target on the potentially very different generated structure.
One thing that I think would be neat to explore would be a way to avoid this
problem by design: design an Arbitrary
-style trait that instead of generating
an arbitrary instance of Self
, takes &mut self
and makes an arbitrary
mutation to itself as guided by an RNG. Maybe we would allow the caller to
specify if the mutation should grow or shrink self
, and then you could build
test case reduction “for free” as well. Here is a quick sketch of what this
might look like: