Fuzz testing in Rust with Cargo-fuzz
Hi there, this is Johan from Seasoned Software, a new start-up building a continuous fuzzing service. In this post, I go through how I added the first automated fuzz test for my hobby project Hat, a snapshotting backup system written in Rust. I will briefly explain what a fuzz test is and how it works. In a follow-up post, I will share how I made the test more effective by running it through Seasoned Software.
Why fuzz testing?
Looking back, one of the reasons I got interested in personal backup systems in the first place was their ability to merge various interesting technologies into a single piece of software. In Hat, there is a hash-tree with built-in deduplication to prevent storing the same data twice, encryption to secure the data while stored, a FUSE file-system for recovering files, and a garbage collector for expiring old data.
As Hat grows in complexity, it gets harder to make a reasonable argument that the program actually works. That is a fairly important property of a backup system! :-)
I do what I can to add unit tests and quick-check tests for the particularly tricky parts of the code, as well as some high-level tests that attempt to exercise the program end-to-end. I test the edge cases that I can think of. But what about the ones I am unaware of?
Let’s look at a small quick-check test from Hat. It verifies that random input data (generated by the quick-check testing suite) is not accepted as valid file-contents when retrieved from the external storage:
#[test]
fn random_input_fails() {
    fn prop(data: Vec<u8>) -> bool {
        let keys = crypto::keys::Keeper::new_for_testing();
        BlobReader::new(keys, crypto::CipherTextRef::new(&data[..]))
            .is_err()
    }
    quickcheck::quickcheck(prop as fn(Vec<u8>) -> bool);
}
Hat uses cryptographic authentication to check that its data is not changed by someone else. In the test, the random data is rejected by the system because it fails the check (verified by “is_err()”). If it somehow passed, Hat’s verification logic would be faulty. When the test runs, the quick-check system feeds the prop function random data through 100 iterations.
This test protects against a known pitfall by verifying a specific part of the system. It is meant to run locally after every code change to quickly check if something broke.
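Since it is an ordinary Rust test, it runs as part of the normal test suite; a name filter narrows the run down to just this one test:
cargo test random_input_fails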
Fuzz tests serve a different purpose. To quote Andrew Gallant (aka BurntSushi), the main author of the Rust quick-check testing library:
Quickcheck uses random input to test, so it won’t always find bugs that could be uncovered with a particular property. You can improve your odds of finding these latent bugs by spending more CPU cycles asking quickcheck to find them for you. There are a few different ways to do this, and which one you choose is mostly a matter of taste.
If you are finding yourself doing this sort of thing a lot, you might also be interested in trying out cargo fuzz, which runs in a loop by default.
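One of those ways is simply raising the iteration count. As a rough sketch (assuming the quickcheck crate's QuickCheck builder and its tests setting; adjust to the version in use), the earlier property could be run far more than the default 100 times:
// Sketch only: ask quickcheck for many more iterations.
quickcheck::QuickCheck::new()
    .tests(1_000_000)
    .quickcheck(prop as fn(Vec<u8>) -> bool);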
Fuzz tests do not optimize for short “time to failure”. While they have some overlap with quick-check tests (both try random inputs), fuzz tests are different in that they track the coverage of each input, remember the interesting ones, and slowly mutate these to increase coverage over time. This process adds overhead, so fuzz tests are usually run for millions (or billions) of iterations instead of hundreds.
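To make those mechanics concrete, here is a self-contained toy version of that loop. It is purely illustrative: real fuzzers use compiler instrumentation and far smarter mutation strategies, and the target function here is a stand-in I made up.
use std::collections::HashSet;

// Stand-in for the program under test. It reports a "coverage point"
// describing how deep the input got into some nested checks.
fn target(input: &[u8]) -> usize {
    match input {
        [b'H', b'a', b't', ..] => 3,
        [b'H', b'a', ..] => 2,
        [b'H', ..] => 1,
        _ => 0,
    }
}

fn main() {
    let mut corpus: Vec<Vec<u8>> = vec![vec![0]];
    let mut seen: HashSet<usize> = HashSet::new();
    seen.insert(target(&corpus[0]));
    let mut rng: u64 = 0x2545F4914F6CDD1D; // fixed seed for xorshift64
    for i in 0..100_000u64 {
        rng ^= rng << 13;
        rng ^= rng >> 7;
        rng ^= rng << 17;
        // Pick a known-interesting input and mutate it slightly.
        let mut input = corpus[(rng as usize) % corpus.len()].clone();
        let idx = ((rng >> 16) as usize) % input.len();
        input[idx] = (rng >> 32) as u8;
        if rng & 1 == 0 {
            input.push((rng >> 40) as u8);
        }
        // Keep the mutant only if it reached a new coverage point.
        if seen.insert(target(&input)) {
            println!("NEW cov at iteration {}: {:?}", i, input);
            corpus.push(input);
        }
    }
}
Each kept input makes the next check reachable, which is how a real fuzz test climbs into ever deeper code paths over time.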
To me, unit tests are great for checking specific cases (e.g. 1+2 = 3). Quick-check tests are a great alternative that checks an invariant or property of any random test case (e.g. x+2 > x).
Fuzz tests are like quick-check tests, but different: they are better at finding interesting inputs, yet require a more complicated setup. Fuzz tests check millions of random inputs and use coverage-based feedback to get smarter over time.
I personally like fuzz tests for cases where I can write small tests that will exercise many pieces of interesting code. I will then sit back and wait, while the fuzz test figures out how to exercise it all.
Fuzz testing with Cargo fuzz
First things first: I am using Cargo-fuzz. To get started, I installed it, went to my code repository, switched to nightly Rust, and ran the Cargo-fuzz init command:
cargo install cargo-fuzz
cd hat-backup
rustup override set nightly
cargo fuzz init
That init command creates a new Rust project in a sub-directory named fuzz which has a dependency on its parent:
> tree fuzz
fuzz
├── Cargo.toml
└── fuzz_targets
└── fuzz_target_1.rs
(I renamed fuzz_target_1.rs to something more appropriate and modified the Cargo.toml file to match that change.)
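For illustration, the relevant parts of fuzz/Cargo.toml end up looking roughly like this after such a rename (the exact contents vary with the cargo-fuzz version, and the target name here simply mirrors the corpus paths shown later in this post):
[dependencies.hat-backup]
path = ".."

[[bin]]
name = "metadata_test_bincode"
path = "fuzz_targets/metadata_test_bincode.rs"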
Fuzz testing the Hat backup system
For Hat’s first fuzz test, I wanted to test something interesting and with a large scope. Something I could not expect quick-check to figure out in a few seconds, but where a fuzz test might have a chance if run over many days.
I went with a high-level end-to-end test of Hat’s snapshot mechanism. I am testing the code that stores a file’s metadata inside the backup system, as well as the code that recovers it. Here is the test:
fn metadata_test(info: models::FileInfo) {
    if !info.name.is_empty() {
        // Convert fuzzer-input to insertable file entry.
        // The entry contains metadata like modified timestamp.
        let entry = key::Entry::new_from_model(None, key::Data::FilePlaceholder, info);

        // Setup a testing Hat.
        let (_backend, mut hat, mut fam) = setup_family();

        // Backup the file entry with no data contents.
        fam.snapshot_direct(entry.clone(), false, None).unwrap();

        // Complete a full snapshot.
        hat.commit(&mut fam, None).unwrap();
        hat.meta_commit().unwrap();
        hat.data_flush().unwrap();

        // Setup virtual file-system and verify the snapshot.
        let mut fs = Filesystem::new(hat);
        if let vfs::fs::List::Dir(files) = fs
            .ls(&path::PathBuf::from("familyname/1"))
            .unwrap()
            .expect("no files found")
        {
            assert_eq!(files.len(), 1);
            let mut want = entry.info;
            want.snapshot_ts_utc = files[0].0.info.snapshot_ts_utc;
            assert_eq!(want, files[0].0.info);
        } else {
            panic!("familyname/1 is not a directory");
        }
    }
}

fn metadata_test_bincode(data: &[u8]) {
    bincode::deserialize(data).ok().map(metadata_test);
}

fuzz_target!(|data: &[u8]| { metadata_test_bincode(data) });
I can build and start it with Cargo-fuzz like so:
cargo fuzz run insert_file_bincode
That gives an output roughly like this:
INFO: Seed: 3527004481
INFO: Loaded 1 modules (588119 guards): 588119
INFO: A corpus is not provided, starting from an empty corpus
#2 INITED cov: 881 ft: 877 corp: 1/1b exec/s: 0
#7 NEW cov: 1400 ft: 1472 corp: 2/54b exec/s: 0
#8 NEW cov: 2255 ft: 2743 corp: 3/89b exec/s: 0
#9 NEW cov: 2773 ft: 3749 corp: 4/159b exec/s: 0
#10 NEW cov: 2899 ft: 4179 corp: 5/195b exec/s: 0
#11 NEW cov: 4201 ft: 5638 corp: 6/4291b exec/s: 0
#19 REDUCE cov: 4201 ft: 5638 corp: 6/3226b exec/s: 0
#20 REDUCE cov: 4201 ft: 5638 corp: 6/3208b exec/s: 0
#27 REDUCE cov: 4201 ft: 5638 corp: 6/2696b exec/s: 0
#33 NEW cov: 4242 ft: 5842 corp: 7/2752b exec/s: 0
#40 REDUCE cov: 103425 ft: 103773 corp: 8/5257b exec/s: 0
#47 NEW cov: 103591 ft: 104604 corp: 9/7762b exec/s: 47
#53 NEW cov: 103619 ft: 104634 corp: 10/7808b exec/s: 53
(For longer runs, Rust’s release mode should be used for performance.)
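Cargo-fuzz exposes this as a flag on the run command, if I read its help output correctly (the flag name is an assumption on my part; double-check with cargo fuzz run --help):
cargo fuzz run --release insert_file_bincode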
In the first iteration of this test, I did not check that the filename was non-empty. As a result, the fuzz test succeeded in inserting a file with an empty name. It turns out that I had not added any checks for this in Hat’s internal APIs, and doing so is now on the TODO list.
As it turns out, Hat can insert files with empty names just fine. The difficult part would be restoring them on a file-system later :-)
So what is going on in this fuzz test?
In each iteration, the fuzz test produces an input vector with some bytes in it. The test then tries to parse it as a Rust struct representing file metadata. It does so using the Rust serialization library Serde and the data format bincode. Serde’s flexibility lets me easily choose the serialization format, so I am using bincode for the fuzz test even though Hat uses cbor internally. Bincode seems less restrictive to me, and I am guessing the fuzz test will find it easier to produce valid inputs in bincode than in cbor. (To verify that assumption, I added cbor and JSON variants; I will go over the results in a later post.)
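To make that decoding step concrete, here is a minimal sketch of it. The field names come from the FileInfo examples shown later in this post, but the exact field types (in particular owner and permissions) are placeholders rather than Hat's real definitions:
use serde::Deserialize;

// Placeholder mirror of Hat's file metadata struct; the field names match
// the debug output shown below, while the types are my guesses.
#[derive(Debug, Deserialize)]
struct FileInfo {
    name: String,
    created_ts: i64,
    modified_ts: i64,
    accessed_ts: i64,
    byte_length: u64,
    owner: Option<u64>,       // placeholder type
    permissions: Option<u32>, // placeholder type
    snapshot_ts_utc: i64,
}

fn parse(data: &[u8]) -> Option<FileInfo> {
    // bincode 1.x maps the raw fuzzer bytes directly onto the struct's
    // fields, so most random inputs fail to decode and are discarded.
    bincode::deserialize(data).ok()
}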
If the fuzz test’s input data can be parsed as the wanted struct, the test goes on to check whether the filename is non-empty. If so, the metadata is valid enough: the test simulates a snapshot of a virtual file with the given metadata, then does a basic checkout of the file and verifies that the metadata matches my expectations.
I am hoping this test will eventually find some strange combination of file metadata that somehow breaks the system in an interesting way :-)
What to expect
To start with, the inputs will be random, like those of a quick-check test. Once the fuzz test finds something that parses correctly, that will trigger new coverage, and the fuzz test will remember that input and use it for future guesses. When the running fuzz test outputs NEW, the input tested in that iteration has reached a previously unseen location in the code.
The space of all possible metadata objects is large. And while the most interesting values for something like the modification time are likely few (min, 0 and max), there could be interesting combinations of values that might break something. Or there could be interesting individual values I have not yet thought of.
I will check back in a couple of weeks to see which parts of the code this test was able to exercise and whether it found something.
Example inputs
To give you an idea of what the fuzz test is doing, I want to share some examples of the inputs it produced.
This is the very first input that the fuzz test chose to keep:
> hexdump fuzz/corpus/metadata_test_bincode/483ceba1...
0000000 ffff ffff ffff ffff ffff ffff ffff ffff
*
0000020 ffff 0a0a
0000024
This does not deserialize into a valid struct, but from the fuzz test’s perspective, getting rejected is interesting too.
The first input that parsed correctly into a struct gave this result:
FileInfo {
    name: "",
    created_ts: 0,
    modified_ts: 0,
    accessed_ts: 0,
    byte_length: 0,
    owner: None,
    permissions: None,
    snapshot_ts_utc: 0
}
This struct has an empty name, so it is only one step away from exercising the actual test. I let the test run for a couple hundred more iterations and found an input that does exercise the test as intended:
FileInfo {
    name: "\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}"
          "\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}"
          "\u{4}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{4}\u{0}"
          "z\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}",
    created_ts: 0,
    modified_ts: 0,
    accessed_ts: 0,
    byte_length: 0,
    owner: None,
    permissions: None,
    snapshot_ts_utc: 0
}
This is definitely not an everyday filename, but it passed the check, and this input brought the fuzz test from a coverage of 4,857 code-points to 107,869 code-points.
After running a bit longer, the fuzz test has found 177 interesting inputs. It has not yet managed to flip one of those None values to a Some value, but it has found non-trivial inputs like:
FileInfo {
    name: "\u{0}\u{0}\u{0}\u{0}\u{0}#...",
    created_ts: 7307217257065611264,
    modified_ts: 7310874267742461811,
    accessed_ts: 1933205832,
    byte_length: 3439329280,
    owner: None,
    permissions: None,
    snapshot_ts_utc: 0
}
This input provides an incorrect file size hint (byte_length) of 3,439,329,280 bytes. The file size hint does not have to be accurate, since the file could change while reading it anyway. That is an interesting case to verify :-)
That is all for now.
I hope you enjoyed reading this post. I sure enjoyed writing it :-)
In future blog posts, I want to write about the results of running this fuzz test, explore other areas for fuzz testing in a backup system, and compare cbor, JSON and bincode as fuzz-test-friendly serialization formats.
I will be on the lookout for feedback and comments. If you have something you would like to share, you can reach me directly at johan@seasoned.software or twitter.com/brinchj.
We will also be at RustFest.eu this weekend :-)
Have a nice day!