Generating Rust FFI Bindings to C/C++ Libraries at cargo build time in build.rs with bindgen
TL;DR? If you just want to see the full working example crate, generating
bzip2 bindings on the fly, it is available on GitHub here.
Table of Contents
- Motivation
- Step 1: Adding bindgen as a Build Dependency
- Step 2: Create a wrapper.h Header
- Step 3: Create a build.rs File
- Step 4: Include the Generated Bindings in src/lib.rs
- Step 5: Write a Sanity Test
- Step 6: Publish Your Crate!
Motivation
Oftentimes C and C++ headers will have platform- and architecture-specific
#ifdefs that affect the shape of the Rust FFI bindings we need to create to
interface Rust code with the outside world. The state-of-the-art solution so far
has been to maintain a different set of bindings for each of our supported
platforms. This might be a manual process if we’re writing our FFI bindings by
hand, or slightly (and only slightly) less manual if we’re running bindgen
once on each supported platform and checking in the generated bindings.
The result has been that maintaining Rust FFI bindings to C and C++ libraries
has been tedious, even with bindgen to help automate some bits.
Recently, we exposed bindgen as a library, which lets us put bindgen in the
[build-dependencies] section of a crate’s Cargo.toml file and generate bindings
for the current platform on the fly from inside a build.rs file.
No more need to manually generate a different set of bindings for each supported platform and check them into our repository!
What follows is a whirlwind introductory tutorial to this brave new
bindgen + build.rs world. We’ll generate bindings to bzip2 (which is
available on most systems) on the fly.
Note: we won’t be publishing these bindings on crates.io because there is
already a bzip2-sys raw FFI crate and a bzip2 crate providing a nice Rust-y
API built on top of that. This tutorial is only for exposition!
Step 1: Adding bindgen as a Build Dependency
Declare a build-time dependency on bindgen by adding it to the
[build-dependencies] section of our crate’s Cargo.toml metadata file:
[build-dependencies]
bindgen = "0.20.0"
Step 2: Create a wrapper.h Header
The wrapper.h file will include all the various headers containing
declarations of structs and functions we would like bindings for. In the
particular case of bzip2, this is pretty easy since the entire public API is
contained in a single header. For a project like SpiderMonkey,
where the public API is split across multiple header files and grouped by
functionality, we’d want to include all those headers we want to bind to in this
single wrapper.h entry point for bindgen.
Here is our wrapper.h:
#include <bzlib.h>
Step 3: Create a build.rs File
First, we have to tell cargo that we have a build.rs script by adding
another line to the Cargo.toml:
[package]
build = "build.rs"
Second, we create the build.rs file in our crate’s root. This file is compiled
and executed before the rest of the crate is built, and can be used to generate
code at compile time. In our case, we will be generating Rust FFI
bindings to bzip2 at compile time. The resulting bindings will be written to
$OUT_DIR/bindings.rs, where $OUT_DIR is chosen by cargo and is something
like ./target/debug/build/libbindgen-tutorial-bzip2-sys-afc7747d7eafd720/out/.
extern crate bindgen;

use std::env;
use std::path::PathBuf;

fn main() {
    // Tell cargo to tell rustc to link the system bzip2
    // shared library.
    println!("cargo:rustc-link-lib=bz2");

    // The bindgen::Builder is the main entry point
    // to bindgen, and lets you build up options for
    // the resulting bindings.
    let bindings = bindgen::Builder::default()
        // Do not generate unstable Rust code that
        // requires a nightly rustc and enabling
        // unstable features.
        .no_unstable_rust()
        // The input header we would like to generate
        // bindings for.
        .header("wrapper.h")
        // Finish the builder and generate the bindings.
        .generate()
        // Unwrap the Result and panic on failure.
        .expect("Unable to generate bindings");

    // Write the bindings to the $OUT_DIR/bindings.rs file.
    let out_path = PathBuf::from(env::var("OUT_DIR").unwrap());
    bindings
        .write_to_file(out_path.join("bindings.rs"))
        .expect("Couldn't write bindings!");
}
Now, when we run cargo build, our bindings to bzip2 are generated on the
fly!
There’s more info about build.rs files in the crates.io documentation.
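The build script above never tells cargo when it needs to be re-run. As a minimal sketch that is not part of the original example crate, a build script can also print a `cargo:rerun-if-changed=<path>` directive so the bindings are regenerated whenever wrapper.h changes. The helper name `cargo_directives` is hypothetical; it only builds the strings that a build.rs would print to stdout.

```rust
/// Build the directive lines a `build.rs` would print to stdout.
/// `cargo_directives` is a hypothetical helper used only for illustration.
fn cargo_directives(lib: &str, header: &str) -> Vec<String> {
    vec![
        // Ask rustc to link against the named system library.
        format!("cargo:rustc-link-lib={}", lib),
        // Ask cargo to re-run the build script when the header changes.
        format!("cargo:rerun-if-changed={}", header),
    ]
}
```

Inside build.rs, each string returned by `cargo_directives("bz2", "wrapper.h")` would be passed to `println!` before calling bindgen.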
Step 4: Include the Generated Bindings in src/lib.rs
We can use the include! macro to dump our generated bindings right into our
crate’s main entry point, src/lib.rs:
#![allow(non_upper_case_globals)]
#![allow(non_camel_case_types)]
#![allow(non_snake_case)]
include!(concat!(env!("OUT_DIR"), "/bindings.rs"));
Because bzip2’s symbols do not follow Rust’s style conventions, we suppress a
bunch of warnings with a few #![allow(...)] attributes.
We can run cargo build again to check that the bindings themselves compile:
$ cargo build
Compiling libbindgen-tutorial-bzip2-sys v0.1.0
Finished debug [unoptimized + debuginfo] target(s) in 62.8 secs
And we can run cargo test to verify that the layout, size, and alignment of
our generated Rust FFI structs match what bindgen thinks they should be:
$ cargo test
Compiling libbindgen-tutorial-bzip2-sys v0.1.0
Finished debug [unoptimized + debuginfo] target(s) in 0.0 secs
Running target/debug/deps/bzip2_sys-10413fc2af207810
running 14 tests
test bindgen_test_layout___darwin_pthread_handler_rec ... ok
test bindgen_test_layout___sFILE ... ok
test bindgen_test_layout___sbuf ... ok
test bindgen_test_layout__bindgen_ty_1 ... ok
test bindgen_test_layout__bindgen_ty_2 ... ok
test bindgen_test_layout__opaque_pthread_attr_t ... ok
test bindgen_test_layout__opaque_pthread_cond_t ... ok
test bindgen_test_layout__opaque_pthread_mutex_t ... ok
test bindgen_test_layout__opaque_pthread_condattr_t ... ok
test bindgen_test_layout__opaque_pthread_mutexattr_t ... ok
test bindgen_test_layout__opaque_pthread_once_t ... ok
test bindgen_test_layout__opaque_pthread_rwlock_t ... ok
test bindgen_test_layout__opaque_pthread_rwlockattr_t ... ok
test bindgen_test_layout__opaque_pthread_t ... ok
test result: ok. 14 passed; 0 failed; 0 ignored; 0 measured
Doc-tests libbindgen-tutorial-bzip2-sys
running 0 tests
test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured
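To give a feel for what those bindgen_test_layout_* tests check, here is a rough sketch of the idea, not bindgen’s actual generated output. The struct `Example` and the function `example_layout` are made up for illustration, and the numbers in the comments assume a typical platform where `u32` is 4 bytes with 4-byte alignment.

```rust
use std::mem::{align_of, size_of};

/// A hypothetical C-compatible struct standing in for a generated binding.
#[repr(C)]
pub struct Example {
    pub tag: u8,    // offset 0, followed by 3 bytes of padding
    pub value: u32, // offset 4 on typical platforms
}

/// Returns (size, alignment) of `Example` as laid out by rustc.
/// A layout test compares these against the values the C compiler reported.
pub fn example_layout() -> (usize, usize) {
    (size_of::<Example>(), align_of::<Example>())
}
```

If rustc and the C compiler ever disagreed on these values, reads and writes through the FFI boundary would touch the wrong bytes, which is why it is useful to have the check run as part of cargo test.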
Step 5: Write a Sanity Test
Finally, to tie everything together, let’s write a sanity test that round trips some text through compression and decompression, and then asserts that it came back out the same as it went in. This is a little wordy using the raw FFI bindings, but hopefully we wouldn’t usually ask people to do this; we’d provide a nice Rust-y API on top of the raw FFI bindings for them. However, since this test exercises the bindings themselves, it uses them directly.
The test data I’m round tripping are some Futurama quotes I got off the internet
and put in the futurama-quotes.txt file, which is read into a &'static str
at compile time via the include_str!("../futurama-quotes.txt") macro
invocation.
Without further ado, here is the test, which should be appended to the bottom of
our src/lib.rs file:
#[cfg(test)]
mod tests {
    use super::*;
    use std::mem;

    #[test]
    fn round_trip_compression_decompression() {
        unsafe {
            let input = include_str!("../futurama-quotes.txt").as_bytes();
            let mut compressed_output: Vec<u8> = vec![0; input.len()];
            let mut decompressed_output: Vec<u8> = vec![0; input.len()];

            // Construct a compression stream.
            let mut stream: bz_stream = mem::zeroed();
            let result = BZ2_bzCompressInit(&mut stream as *mut _,
                                            1, // 1 x 100000 block size
                                            4, // verbosity (4 = most verbose)
                                            0); // default work factor
            match result {
                r if r == (BZ_CONFIG_ERROR as _) => panic!("BZ_CONFIG_ERROR"),
                r if r == (BZ_PARAM_ERROR as _) => panic!("BZ_PARAM_ERROR"),
                r if r == (BZ_MEM_ERROR as _) => panic!("BZ_MEM_ERROR"),
                r if r == (BZ_OK as _) => {},
                r => panic!("Unknown return value = {}", r),
            }

            // Compress `input` into `compressed_output`.
            stream.next_in = input.as_ptr() as *mut _;
            stream.avail_in = input.len() as _;
            stream.next_out = compressed_output.as_mut_ptr() as *mut _;
            stream.avail_out = compressed_output.len() as _;
            let result = BZ2_bzCompress(&mut stream as *mut _, BZ_FINISH as _);
            match result {
                r if r == (BZ_RUN_OK as _) => panic!("BZ_RUN_OK"),
                r if r == (BZ_FLUSH_OK as _) => panic!("BZ_FLUSH_OK"),
                r if r == (BZ_FINISH_OK as _) => panic!("BZ_FINISH_OK"),
                r if r == (BZ_SEQUENCE_ERROR as _) => panic!("BZ_SEQUENCE_ERROR"),
                r if r == (BZ_STREAM_END as _) => {},
                r => panic!("Unknown return value = {}", r),
            }

            // Finish the compression stream.
            let result = BZ2_bzCompressEnd(&mut stream as *mut _);
            match result {
                r if r == (BZ_PARAM_ERROR as _) => panic!("BZ_PARAM_ERROR"),
                r if r == (BZ_OK as _) => {},
                r => panic!("Unknown return value = {}", r),
            }

            // Construct a decompression stream.
            let mut stream: bz_stream = mem::zeroed();
            let result = BZ2_bzDecompressInit(&mut stream as *mut _,
                                              4, // verbosity (4 = most verbose)
                                              0); // default small factor
            match result {
                r if r == (BZ_CONFIG_ERROR as _) => panic!("BZ_CONFIG_ERROR"),
                r if r == (BZ_PARAM_ERROR as _) => panic!("BZ_PARAM_ERROR"),
                r if r == (BZ_MEM_ERROR as _) => panic!("BZ_MEM_ERROR"),
                r if r == (BZ_OK as _) => {},
                r => panic!("Unknown return value = {}", r),
            }

            // Decompress `compressed_output` into `decompressed_output`.
            stream.next_in = compressed_output.as_ptr() as *mut _;
            stream.avail_in = compressed_output.len() as _;
            stream.next_out = decompressed_output.as_mut_ptr() as *mut _;
            stream.avail_out = decompressed_output.len() as _;
            let result = BZ2_bzDecompress(&mut stream as *mut _);
            match result {
                r if r == (BZ_PARAM_ERROR as _) => panic!("BZ_PARAM_ERROR"),
                r if r == (BZ_DATA_ERROR as _) => panic!("BZ_DATA_ERROR"),
                r if r == (BZ_DATA_ERROR_MAGIC as _) => panic!("BZ_DATA_ERROR_MAGIC"),
                r if r == (BZ_MEM_ERROR as _) => panic!("BZ_MEM_ERROR"),
                r if r == (BZ_OK as _) => panic!("BZ_OK"),
                r if r == (BZ_STREAM_END as _) => {},
                r => panic!("Unknown return value = {}", r),
            }

            // Close the decompression stream.
            let result = BZ2_bzDecompressEnd(&mut stream as *mut _);
            match result {
                r if r == (BZ_PARAM_ERROR as _) => panic!("BZ_PARAM_ERROR"),
                r if r == (BZ_OK as _) => {},
                r => panic!("Unknown return value = {}", r),
            }

            assert_eq!(input, &decompressed_output[..]);
        }
    }
}
Now let’s run cargo test again and verify that everything is linking and binding
properly!
$ cargo test
Compiling libbindgen-tutorial-bzip2-sys v0.1.0
Finished debug [unoptimized + debuginfo] target(s) in 0.54 secs
Running target/debug/deps/libbindgen_tutorial_bzip2_sys-1c5626bbc4401c3a
running 15 tests
test bindgen_test_layout___darwin_pthread_handler_rec ... ok
test bindgen_test_layout___sFILE ... ok
test bindgen_test_layout___sbuf ... ok
test bindgen_test_layout__bindgen_ty_1 ... ok
test bindgen_test_layout__bindgen_ty_2 ... ok
test bindgen_test_layout__opaque_pthread_attr_t ... ok
test bindgen_test_layout__opaque_pthread_cond_t ... ok
test bindgen_test_layout__opaque_pthread_condattr_t ... ok
test bindgen_test_layout__opaque_pthread_mutex_t ... ok
test bindgen_test_layout__opaque_pthread_mutexattr_t ... ok
test bindgen_test_layout__opaque_pthread_once_t ... ok
test bindgen_test_layout__opaque_pthread_rwlock_t ... ok
test bindgen_test_layout__opaque_pthread_rwlockattr_t ... ok
test bindgen_test_layout__opaque_pthread_t ... ok
block 1: crc = 0x47bfca17, combined CRC = 0x47bfca17, size = 2857
bucket sorting ...
depth 1 has 2849 unresolved strings
depth 2 has 2702 unresolved strings
depth 4 has 1508 unresolved strings
depth 8 has 538 unresolved strings
depth 16 has 148 unresolved strings
depth 32 has 0 unresolved strings
reconstructing block ...
2857 in block, 2221 after MTF & 1-2 coding, 61+2 syms in use
initial group 5, [0 .. 1], has 570 syms (25.7%)
initial group 4, [2 .. 2], has 256 syms (11.5%)
initial group 3, [3 .. 6], has 554 syms (24.9%)
initial group 2, [7 .. 12], has 372 syms (16.7%)
initial group 1, [13 .. 62], has 469 syms (21.1%)
pass 1: size is 2743, grp uses are 13 6 15 0 11
pass 2: size is 1216, grp uses are 13 7 15 0 10
pass 3: size is 1214, grp uses are 13 8 14 0 10
pass 4: size is 1213, grp uses are 13 9 13 0 10
bytes: mapping 19, selectors 17, code lengths 79, codes 1213
final combined CRC = 0x47bfca17
[1: huff+mtf rt+rld {0x47bfca17, 0x47bfca17}]
combined CRCs: stored = 0x47bfca17, computed = 0x47bfca17
test tests::round_trip_compression_decompression ... ok
test result: ok. 15 passed; 0 failed; 0 ignored; 0 measured
Doc-tests libbindgen-tutorial-bzip2-sys
running 0 tests
test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured
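The repeated match blocks in the sanity test could be condensed with a small helper that turns a bzip2 return code into its name. This is only a sketch: `bz_result_name` and `expect_code` are hypothetical helpers, and the numeric literals mirror the `BZ_*` definitions in bzlib.h so the snippet stands alone. With the generated bindings in scope, you would likely compare against the generated constants instead.

```rust
/// Map a bzip2 return code to the name it has in `bzlib.h`.
/// Literal values are used here only to keep the sketch standalone.
fn bz_result_name(code: i32) -> &'static str {
    match code {
        0 => "BZ_OK",
        1 => "BZ_RUN_OK",
        2 => "BZ_FLUSH_OK",
        3 => "BZ_FINISH_OK",
        4 => "BZ_STREAM_END",
        -1 => "BZ_SEQUENCE_ERROR",
        -2 => "BZ_PARAM_ERROR",
        -3 => "BZ_MEM_ERROR",
        -4 => "BZ_DATA_ERROR",
        -5 => "BZ_DATA_ERROR_MAGIC",
        -6 => "BZ_IO_ERROR",
        -7 => "BZ_UNEXPECTED_EOF",
        -8 => "BZ_OUTBUFF_FULL",
        -9 => "BZ_CONFIG_ERROR",
        _ => "unknown",
    }
}

/// Return `Ok(())` when `code` equals `expected`; otherwise return the
/// name of the code we actually received, suitable for a panic message.
fn expect_code(code: i32, expected: i32) -> Result<(), &'static str> {
    if code == expected {
        Ok(())
    } else {
        Err(bz_result_name(code))
    }
}
```

Each match in the test could then shrink to something like `expect_code(result as i32, 4).unwrap()`, where 4 is the value of BZ_STREAM_END.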
Step 6: Publish Your Crate!
That’s it! Now we can publish our crate on crates.io and we can write a nice,
Rust-y API wrapping the raw FFI bindings in a safe interface. However, there is
already a bzip2-sys crate providing raw FFI bindings, and there is
already a bzip2 crate providing a nice, safe, Rust-y API on top of the
bindings, so we have nothing left to do here!
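For a rough idea of what “wrapping the raw FFI bindings in a safe interface” can look like, here is a minimal RAII sketch. It is not how the bzip2 crate is implemented. `RawStream`, `raw_init`, and `raw_end` are stand-ins written in Rust so the snippet compiles without libbz2; in a real wrapper they would be `bz_stream`, `BZ2_bzCompressInit`, and `BZ2_bzCompressEnd` from the generated bindings.

```rust
/// Stand-in for the generated `bz_stream` struct (hypothetical).
#[repr(C)]
#[derive(Default)]
pub struct RawStream {
    open: bool,
}

/// Stand-in for `BZ2_bzCompressInit`; returns 0 to mimic BZ_OK.
unsafe fn raw_init(stream: *mut RawStream) -> i32 {
    unsafe { (*stream).open = true };
    0
}

/// Stand-in for `BZ2_bzCompressEnd`; returns 0 to mimic BZ_OK.
unsafe fn raw_end(stream: *mut RawStream) -> i32 {
    unsafe { (*stream).open = false };
    0
}

/// Safe owner of a raw stream: initialized in `new`, released in `Drop`.
pub struct Compressor {
    // Boxed so the stream's address stays stable when `Compressor` moves,
    // which matters for C libraries that keep a pointer back to the struct.
    raw: Box<RawStream>,
}

impl Compressor {
    /// Initialize the stream, surfacing a non-zero return code as an error.
    pub fn new() -> Result<Compressor, i32> {
        let mut raw = Box::new(RawStream::default());
        // Safety: the pointer comes from a live, exclusively owned Box.
        let rc = unsafe { raw_init(&mut *raw) };
        if rc == 0 {
            Ok(Compressor { raw })
        } else {
            Err(rc)
        }
    }

    /// Whether the underlying stream is currently initialized.
    pub fn is_open(&self) -> bool {
        self.raw.open
    }
}

impl Drop for Compressor {
    fn drop(&mut self) {
        // Safety: `raw` was initialized in `new` and is released exactly once.
        unsafe {
            raw_end(&mut *self.raw);
        }
    }
}
```

The design choice is that callers never see a raw pointer or an `unsafe` block: construction either yields an initialized stream or an error code, and cleanup happens automatically when the value goes out of scope.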
Check out the full code on GitHub!