Bug Injection Misbeliefs and Lessons

Yuchen Lin
Published in TrueFi Engineering
Sep 10, 2022

I had a great time two weeks ago at the DeFi Security Summit! My colleague Maciej and I listened to interesting talks, traded banter with colorful characters, and gave a talk that (to my surprise!) sparked a broader discussion.

Many thanks to the organizers of this conference!

Bug Injection Misbeliefs

In our numerous follow-up conversations, Maciej and I noticed that several conference attendees drew inaccurate conclusions about our processes, specifically about what we called bug injection and how it relates to auditors. So we would like to clarify several misbeliefs for the record.

Misbelief 1: Injecting artificial bugs

We don’t inject artificial bugs. Instead, our injected bugs come from code review omission.

Bugs come up naturally when writing code. If a code reviewer notices a natural bug, we encourage them not to comment on the PR, but instead to record the bug in a private repo.

This approach ensures our injected bugs reflect mistakes that could happen to any programmer. Another way to put it: what bugs would slip by if our code reviewers happened to be sleepy?

In retrospect, we regret naming this “bug injection”, as it suggests our bugs were artificial. “Bugs that were omitted during code review” isn’t as snappy, so we welcome better names!

Misbelief 2: Unethical experiments

Our team cares deeply about ethics and trust in this industry. In our conversations with other teams, we often bring up the negative 2021 precedent in which University of Minnesota researchers submitted patches designed to inject vulnerabilities into the Linux kernel.

In this negative precedent,

  1. Researchers did not obtain informed consent from Linux kernel devs.
  2. Many Linux kernel devs worked as unpaid open source volunteers.
  3. Researchers did not account for all known bugs with an independently verifiable mechanism.
  4. Researchers had no skin in the game to ensure bugs would be fixed before release.

In contrast, for our bug injection program,

  1. We make it very clear and transparent when we’ve injected bugs, and reviewers have the option to decline working with us or negotiate other arrangements.
  2. Our internal formal verification team is paid a salary, white hats get bounty rewards for first reports, and external auditors are compensated for their time and effort.
  3. For transparency, we publish hashes of injected bug descriptions in our bounty programs and audit scopes (see the sketch after this list).
  4. This is our own product and reputation that we’re putting on the line.
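
To make point 3 concrete, here is a minimal sketch of how such a hash commitment could work, written in Python. This is our illustration rather than our actual tooling: the function names, the salt format, and the choice of SHA-256 are assumptions for the example.

    import hashlib
    import secrets

    def commit_bug(description: str) -> tuple[str, str]:
        # Commit to an injected-bug description without revealing it.
        # The random salt prevents dictionary attacks on short descriptions.
        salt = secrets.token_hex(16)
        digest = hashlib.sha256(f"{salt}:{description}".encode()).hexdigest()
        return salt, digest

    def verify_bug(description: str, salt: str, digest: str) -> bool:
        # After the review, reveal (description, salt) so that anyone can
        # recompute the hash and confirm the bug list was fixed up front.
        return hashlib.sha256(f"{salt}:{description}".encode()).hexdigest() == digest

The digests go out with the bounty program or audit scope; the salted descriptions are revealed once the review ends, proving that the injected-bug list was fixed in advance.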

Misbelief 3: Extra work for reviewers

In our early attempts, we injected 7–8 bugs per release. As I’ll discuss below, this proved not worth the trouble, so we adapted our process.

Now we leave only 2–3 easy-to-fix bugs per release (if any). This approach ensures we don’t add significant extra work for reviewers.

Misbelief 4: Easy-to-fix means easy-to-find

In discussions we’ve seen, several commenters conflate easy-to-fix with easy-to-find, and use that to call this process a waste of reviewer time.

But these are not the same. Many hard-to-find bugs come from single-line diffs where, for instance, an engineer accidentally put parentheses around the wrong expression. It’s worth measuring how often these hard-to-find bugs get caught. As long as they are also easy-to-fix, they should not present an undue burden to reviewers.
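
As a hypothetical illustration (made-up numbers and names, not code from our repo), here is the kind of single-line slip we mean, sketched in Python:

    FEE_RATE = 0.003   # hypothetical 0.3% fee
    THRESHOLD = 1_000  # hypothetical fee-free allowance

    def fee_intended(amount: float) -> float:
        # Intended: the fee applies only to the amount above the threshold.
        return (amount - THRESHOLD) * FEE_RATE

    def fee_buggy(amount: float) -> float:
        # Parentheses around the wrong expression: it parses, it runs,
        # and it is easy to skim past in a one-line diff.
        return amount - (THRESHOLD * FEE_RATE)

    print(fee_intended(5_000))  # 12.0
    print(fee_buggy(5_000))     # 4997.0

The fix is one line, but catching the bug requires actually reading the arithmetic, which is exactly the reviewer behavior we want to measure.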

Misbelief 5: Easy-to-find bugs are useless

Finding easy-to-find bugs is a weak positive signal of reviewer skill. If your deposit function burns the user’s tokens, you don’t need an auditor to tell you this code is “stupid”.

On the other hand, missing easy-to-find bugs is a strong negative signal. If a reviewer doesn’t notice that your code is unable to run, then you need to start questioning how much value they bring to your security process.

Misbelief 6: Injecting bugs means reviewers can’t trust engineers

We believe bug injection has a valuable place in high-trust partnerships. This belief is supported by the rich history of testing the testers in collaborative settings.

High assurance software firms have used mutation testing for decades. The SAT college entrance exam has an experimental section to calibrate new test questions. The NORAD missile alert system introduced artificial radar blips to keep its operators attentive.

High-stakes situations demand both close collaboration and the testing of collaborators. Bug injection is just another tool for giving low-stakes feedback in a high-stakes setting. Used well, it’s a powerful way to promote continuous improvement.

Misbelief 7: Our process is a crusade against auditors

The bug injection part of our talk drew a lot of attention. However, our talk was mostly not about bug injection (only 6 out of 35 slides), and, except as a salient example, mostly not about auditors.

We wanted to talk about the measurement process our team has developed. We looked back at decades-old engineering practices for writing high assurance code, adapted them as best we could for our current needs, and shared our findings in the hope of improving the industry’s best practices.

This measurement process applies just as well to our internal teams (code reviewers + formal verification) as it does to external reviewers (auditors + bug bounties). If something isn’t working well, then we need to be able to gather hard feedback, figure out how to improve, and make changes.

Lessons Learned

Lesson 1: Don’t inject too many bugs

In early iterations, we believed that collecting more data would give more accurate measurements.

However, we quickly realized that:

  • The more bugs we inject, the higher the odds we introduce bugs while fixing injected ones.
  • Limited auditor and engineer time is best spent finding new bugs instead of fixing injected ones.

Lesson learned: gather only a pragmatic amount of data, 2–3 bugs per release.

Lesson 2: Don’t inject hard-to-fix bugs

Not all bugs found in code review can be fixed in a few lines. Some may require significant refactoring, or even tossing out the whole design and starting from scratch.

At one point, we injected a bug that required a (minor) architectural change to fix. Reviewing the fix wasted both our engineers’ and our auditors’ time, which we’ve determined is an unacceptably high cost for the benefits of benchmarking. So when hard-to-fix bugs like this show up in code review, we address them immediately rather than leaving them injected.

Lesson 3: Don’t inject placeholder bugs

Until now, we’ve included placeholder bugs when we’ve done bug injection. An example is the following:

HASH('Placeholder bug #1 <SALT here to prevent dictionary attacks>')

After further discussion at the DeFi Security Summit, we’ve come to realize these are not worth the trouble.

  • The marginal benefit is small. The purpose of including placeholder bugs is to motivate reviewers to keep searching, since they don’t know a priori how many bugs have been injected.
  • But the marginal cost is large. These placeholder bugs are highly unpopular, and they reduce trust and goodwill.

Lesson 4: Expect low recall

We have always known that no individual engineer, code reviewer, auditor, or bounty hunter is perfect. Finding bugs is uncertain enough that a reviewer’s performance can vary significantly over time.

However, we hadn’t realized how much a good reviewer typically misses. We’ve had to recalibrate our expectations: a reviewer who finds half of the injected bugs is doing a very good job. And for reviewers who find few or no bugs, only consistent underperformance leads us to question their value.
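
For clarity about the metric behind these expectations: per reviewer, we look at recall over the injected set. Here is a minimal sketch in Python; the identifiers and the flat-set representation are ours, for illustration only.

    def injected_bug_recall(injected: set[str], reported: set[str]) -> float:
        # Recall = fraction of injected bugs the reviewer actually reported.
        if not injected:
            return float("nan")  # nothing injected, nothing to measure
        return len(injected & reported) / len(injected)

    # With only 2-3 injected bugs per release, one release tells you little;
    # aggregate across releases before drawing conclusions about a reviewer.
    print(injected_bug_recall({"bug-1", "bug-2", "bug-3"}, {"bug-2", "bug-7"}))  # ~0.33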

So What’s the Point of Bug Injection?

Protocol engineers really love Swiss cheese models, almost as much as aviators like chain-of-events analyses or reliability engineers like bathtub curves.

But when every layer of cheese only removes bugs, we lose reliable benchmarks of how well each layer is independently doing: we can count what a layer catches, but we never learn what it missed. Injected bugs with known answers give each layer a measurable catch rate.

And if you’re trying to decide whether it’s okay to launch, simple Fermi estimates still need some data from each layer to work.
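
To show what we mean, here is a back-of-the-envelope Fermi estimate sketched in Python. Every number is invented for illustration, and the multiplication assumes layers miss bugs independently, which is the usual Swiss cheese idealization rather than a fact about any real pipeline.

    # Made-up inputs: bugs entering review, plus each layer's catch rate,
    # which is exactly the quantity that injected-bug recall estimates.
    initial_bugs = 20.0
    catch_rates = {
        "code review": 0.5,
        "formal verification": 0.6,
        "external audit": 0.5,
        "bug bounty": 0.3,
    }

    remaining = initial_bugs
    for layer, rate in catch_rates.items():
        remaining *= 1.0 - rate  # assumes independent misses across layers
        print(f"after {layer}: ~{remaining:.1f} bugs expected to remain")

Without a measured catch rate per layer, every entry in that estimate is a guess; injected bugs give each layer an empirical denominator.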

Concluding Remarks on Bug Injection and Auditors

Reactions to our talk keep returning to bug injection and auditors, so we’ll conclude with some thoughts on this topic. It’s clear from the community response that we sparked a heated conversation and really split the DeFi Security Summit crowd. We’ve gotten strong feedback, both positive and negative, both public and private.

We find some of the negative reactions to be unfortunate because we see the tremendous potential for everyone — protocol engineers, auditors, tool vendors, and bug bounties — to make a concerted push to improve our industry’s processes.

Smart contract auditing is a market for lemons. Protocol engineers and auditors each hold private information about their own skill, and each side has high expectations of the other.

This creates widespread adverse selection, wherein auditors feel compelled to:

  • deliver high quality security reviews on very tight schedules
  • downplay customers’ poor code quality in public reports
  • take undue responsibility for customers’ security

Meanwhile, despite the tremendous cost, protocol engineers still have difficulty finding auditors who:

  • are bookable in reasonable timelines
  • cut the BS of rubber stamping and preserving reputation over security
  • have real skin in the game

While we don’t have all the right answers, we think it’s worth having a serious discussion about the state of the industry and how to improve it. We hope our talk and this blog post can move the conversation forward.
