Brownie: Evaluating Solidity Code Coverage via Trace Analysis

The Python development and testing framework for Ethereum smart contracts

iamdefinitelyahuman
Published in Coinmonks
6 min read · Oct 1, 2019

Introduction

An important tool in every developer’s arsenal is the ability to evaluate the code coverage of their tests. A coverage report provides a high-level overview that can be used to find gaps in your test suite, and while a high coverage percentage by no means guarantees quality tests, it does provide a better sense of where undiscovered bugs may be lurking. Given the immutable nature of smart contracts and the vast sums of value they secure, we should welcome and utilize every tool available during the development process.

The following article discusses how Brownie handles code coverage via trace analysis. It explores the motivations, gives a summary of the implementation, discusses the benefits and challenges, and talks about where we’re going next.

Why is this Useful?

Before we explore the technical details… Behold, a sample Brownie branch coverage report!

Branch coverage report as displayed by the Brownie GUI

The color of each highlight indicates how that branch evaluated during testing:

  • Green branches were evaluated both truthfully and falsely
  • Yellow branches were only evaluated truthfully
  • Orange branches were only evaluated falsely
  • Red branches were not evaluated
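
The legend above amounts to a small lookup on two facts per branch. The following is a minimal sketch of that lookup; the function name and the boolean inputs are illustrative and not taken from Brownie itself.

```python
def branch_color(saw_true: bool, saw_false: bool) -> str:
    """Return the highlight color for a branch, given the outcomes seen during testing."""
    if saw_true and saw_false:
        return "green"   # evaluated both truthfully and falsely
    if saw_true:
        return "yellow"  # only evaluated truthfully
    if saw_false:
        return "orange"  # only evaluated falsely
    return "red"         # never evaluated
```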

With this report you can quickly see how your tests are interacting with your contract. It can help you determine which areas of your project need more testing, as well as locate sections of unreachable code.

Techniques for Coverage Evaluation

Broadly speaking there are two approaches to evaluating coverage:

  • Instrumentation involves injecting data collectors throughout the code that are used to monitor exactly which lines and branches are executed. This is how the popular tool solidity-coverage works.
  • Tracing involves monitoring the program counter of the compiled code as it executes, and then mapping the executed instructions back to specific lines of code.

Instrumentation is far simpler to implement and is the more commonly used approach. Its main downside when compared to tracing is that it is invasive: the functions added to monitor execution mean that the code being evaluated is not the same code that will be used in production.

Additionally, the EVM brings a whole new set of challenges in the form of gas costs. Every operation uses gas, and each block has a finite supply of gas available. Instrumenting a contract means adding more operations, which in turn means increased deployment and execution costs. Staying within the block gas limit then sometimes requires running on a modified EVM ruleset. So now we are testing a modified set of code within a modified virtual machine!

For these reasons, Brownie uses trace analysis rather than instrumentation in order to perform coverage analysis. I feel the benefits of testing the real code justify the challenges in building such a system. So, where to begin?

Implementation Basics

The key to implementing coverage via tracing lies in two data structures returned by the Solidity compiler:

  • The abstract syntax tree, which is a standardized representation of the source code syntax. Brownie uses py-solc-ast to traverse the AST.
  • The deployed source mapping, where compiled opcodes are mapped to the original source code. Brownie expands this into its own program counter map which it uses extensively in coverage analysis.
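
To make the second item more concrete, here is a rough sketch of how a compressed solc source map could be expanded into a map keyed by program counter. It relies on two documented properties: source map entries are separated by ";" with empty fields inheriting the previous value, and PUSH1 through PUSH32 carry immediate data that shifts the program counter. The function names are my own, and Brownie's actual program counter map holds more information than this.

```python
from typing import Dict, List, Tuple

# (start offset, length, source file index, jump type)
Entry = Tuple[int, int, int, str]


def expand_source_map(source_map: str) -> List[Entry]:
    """Expand a compressed source map into one entry per instruction."""
    expanded = []
    current = ["0", "0", "0", "-"]
    for item in source_map.split(";"):
        for i, value in enumerate(item.split(":")[:4]):
            if value:  # an empty field inherits the previous instruction's value
                current[i] = value
        expanded.append((int(current[0]), int(current[1]), int(current[2]), current[3]))
    return expanded


def instruction_pcs(runtime_bytecode: bytes) -> List[int]:
    """Return the program counter of every instruction in the bytecode."""
    pcs = []
    pc = 0
    while pc < len(runtime_bytecode):
        pcs.append(pc)
        opcode = runtime_bytecode[pc]
        # PUSH1 (0x60) through PUSH32 (0x7f) are followed by 1 to 32 data bytes
        push_width = opcode - 0x5F if 0x60 <= opcode <= 0x7F else 0
        pc += 1 + push_width
    return pcs


def build_pc_map(runtime_bytecode: bytes, source_map: str) -> Dict[int, Entry]:
    """Pair each instruction's program counter with its source map entry."""
    # zip() stops at the shorter input, so trailing bytes without a
    # source map entry (such as contract metadata) are simply left out.
    return dict(zip(instruction_pcs(runtime_bytecode), expand_source_map(source_map)))
```

For example, the bytecode 0x6080604052 (PUSH1 0x80, PUSH1 0x40, MSTORE) has instructions at program counters 0, 2 and 4, so a three-entry source map lines up with exactly those keys.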

By analyzing the AST we can locate statements and branches, and then use the source map to associate them with opcodes. We then query the debug_traceTransaction RPC endpoint on each transaction that runs during unit tests, and analyze the returned data to find out which code was hit. For contract calls, we instead broadcast them as transactions to get the trace, then immediately rewind the chain to ensure the state was not changed. Easy, right?
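
As a rough illustration of the tracing step, the sketch below builds a debug_traceTransaction JSON-RPC request and pulls the executed program counters out of the response. It assumes the geth-style response layout, where the result holds a "structLogs" list with one entry per executed instruction; the helper names are hypothetical, and no network call is made here.

```python
import json
from typing import Any, Dict, List


def trace_request(tx_hash: str, request_id: int = 1) -> str:
    """Build the JSON-RPC payload asking a node for a transaction trace."""
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "debug_traceTransaction",
        "params": [tx_hash, {}],
    }
    return json.dumps(payload)


def executed_pcs(trace_result: Dict[str, Any]) -> List[int]:
    """Return the program counter of every step, in execution order."""
    # Assumption: each step is a dict with at least "pc" and "op" keys.
    # Steps from internal calls to other contracts appear in the same list,
    # so a real implementation must also track which contract each step
    # belongs to; that bookkeeping is omitted here.
    return [step["pc"] for step in trace_result["structLogs"]]
```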

Let’s get into the nuts and bolts!

Statement Coverage

Statements are syntax units that express an action to be performed. They are self-contained and linear, having a single point of entry and exit within the code.

Mapping statement coverage within Solidity is relatively straightforward. First, we search the AST for the deepest statement nodes (those which are not the parent of another statement). Then we iterate through the program counter map looking for opcodes which have a source offset contained within the source offset of the statement. Whenever one is found, that opcode is associated with the statement. We know that the presence of that opcode within a trace means that this statement was executed.
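
The steps above can be sketched as follows. This is a simplified illustration, not Brownie's code: statements and opcodes are reduced to (start, stop) source offsets, and offset containment stands in for the parent/child relationship in the AST.

```python
from typing import Dict, Iterable, List, Set, Tuple

Offset = Tuple[int, int]  # (start, stop) character offsets into the source


def deepest_statements(statements: List[Offset]) -> List[Offset]:
    """Keep only the statements that do not contain another statement."""
    def contains(outer: Offset, inner: Offset) -> bool:
        return outer != inner and outer[0] <= inner[0] and inner[1] <= outer[1]

    return [s for s in statements if not any(contains(s, other) for other in statements)]


def map_statements(pc_offsets: Dict[int, Offset], statements: List[Offset]) -> Dict[Offset, Set[int]]:
    """Associate each deepest statement with the opcodes that fall inside it."""
    mapping: Dict[Offset, Set[int]] = {s: set() for s in deepest_statements(statements)}
    for pc, (start, stop) in pc_offsets.items():
        for statement in mapping:
            if statement[0] <= start and stop <= statement[1]:
                mapping[statement].add(pc)
    return mapping


def statement_coverage(mapping: Dict[Offset, Set[int]], trace_pcs: Iterable[int]) -> Dict[Offset, bool]:
    """A statement counts as executed if any of its opcodes appears in the trace."""
    hit = set(trace_pcs)
    return {statement: bool(pcs & hit) for statement, pcs in mapping.items()}
```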

Branch Coverage

Branch coverage is where things get interesting.

A branch is an instruction that can cause a program to execute different code. In the EVM, branches are denoted by the JUMPI opcode. Within Solidity, these occur during if statements, require statements, and ternary operations.

To map branch coverage, we must first search the AST for the following nodes:

  • IfStatement (if (x) {} else {})
  • Conditional (a = x > y ? 1 : 2;)
  • FunctionCall containing a require expression (require(x, "oopsie"); )

Next, we search the children of these nodes for BinaryOperation expressions (operations that evaluate in the branch, such as x > y or returnsBoolFn()). Because we must also account for nested operations such as ((x > y) || (x - 4 < y)), we ignore any BinaryOperation nodes that contain children that are also a BinaryOperation.
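
Here is a minimal sketch of that filtering step, written against the JSON form of the AST, where every node is a dict with a "nodeType" key. It walks plain dicts rather than using py-solc-ast, and the function names are my own. Note that solc also represents arithmetic such as x - 4 as a BinaryOperation, so a real implementation needs further filtering that this sketch leaves out.

```python
from typing import Any, Dict, Iterator, List


def walk(node: Any) -> Iterator[Dict[str, Any]]:
    """Yield every AST node at or below the given node, depth first."""
    if isinstance(node, dict):
        if "nodeType" in node:
            yield node
        for value in node.values():
            yield from walk(value)
    elif isinstance(node, list):
        for item in node:
            yield from walk(item)


def innermost_binary_operations(branch_node: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return BinaryOperation nodes that contain no other BinaryOperation."""
    result = []
    for node in walk(branch_node):
        if node["nodeType"] != "BinaryOperation":
            continue
        descendants = (n for n in walk(node) if n is not node)
        if not any(n["nodeType"] == "BinaryOperation" for n in descendants):
            result.append(node)
    return result
```

For an if statement whose condition is (x > y) || (x < z), this returns the two comparison nodes and skips the enclosing || node.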

Once this list of nodes is generated, we associate opcodes in a similar manner to statement mapping, looking for opcodes that have a source offset contained within the offset of the node. We must also map these opcodes to JUMPI instructions, which we use to determine how the branch evaluated. To do so, we find the last opcode with a source offset contained inside the node’s source offset, and associate it with the next JUMPI instruction. The reason we must use the last opcode has to do with the way Solidity handles jump instructions within nested binary operations.
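
A simplified sketch of that association is shown below. Instructions are given as (program counter, opcode name, source offset) tuples in bytecode order, and "next" is assumed to mean strictly after the last matching opcode; both the data layout and the function name are illustrative rather than Brownie's own.

```python
from typing import List, Optional, Tuple

Offset = Tuple[int, int]               # (start, stop) source offsets
Instruction = Tuple[int, str, Offset]  # (pc, opcode name, source offset)


def jumpi_for_node(instructions: List[Instruction], node_offset: Offset) -> Optional[int]:
    """Return the pc of the JUMPI that decides the branch for an AST node."""
    start, stop = node_offset

    # Find the last opcode whose source offset lies inside the node.
    last_index = None
    for index, (_pc, _op, (op_start, op_stop)) in enumerate(instructions):
        if start <= op_start and op_stop <= stop:
            last_index = index
    if last_index is None:
        return None  # no opcode maps to this node

    # The branch is decided by the first JUMPI after that opcode.
    for pc, op, _offset in instructions[last_index + 1:]:
        if op == "JUMPI":
            return pc
    return None
```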

Determining the relationship between the outcome of the JUMPI and whether the branch evaluated true or false depends on the type of node and its location relative to other nodes within the AST. There are many rules, and many exceptions to them. If you’re still with me and interested in how this is handled, I invite you to view the relevant source code.

The end result of all this is a map of opcodes associated with both source offsets and jump instructions, which can be used to determine whether a branch has executed and whether it evaluated truthfully or falsely!
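
On the trace side, whether a JUMPI actually jumped can be read from the step that follows it. The sketch below records, for every JUMPI in a trace, which outcomes were observed. It assumes geth-style steps with "pc" and "op" keys and deliberately stops at "jumped" versus "fell through", since translating that into true or false depends on the node-specific rules mentioned above.

```python
from typing import Any, Dict, List, Set


def jumpi_outcomes(struct_logs: List[Dict[str, Any]]) -> Dict[int, Set[bool]]:
    """Map each executed JUMPI's pc to the set of outcomes seen (True = jumped)."""
    outcomes: Dict[int, Set[bool]] = {}
    for step, following in zip(struct_logs, struct_logs[1:]):
        if step["op"] != "JUMPI":
            continue
        # If the jump is not taken, execution continues at pc + 1.
        # Edge case ignored here: a taken jump whose target is pc + 1.
        jumped = following["pc"] != step["pc"] + 1
        outcomes.setdefault(step["pc"], set()).add(jumped)
    return outcomes
```

A JUMPI whose set contains both True and False was exercised in both directions, one with a single value went only one way, and one missing from the result was never reached.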

Execution Times

This technique is not without limitations, the biggest of which is execution time. Queries to debug_traceTransaction are slow! Brownie attempts to mitigate this in several ways:

  1. Coverage data is tracked on a per-transaction basis. Whenever a transaction is broadcast that is identical to one that was already evaluated, the results are taken from a cache instead of being evaluated again. With a well-designed test suite this can lead to significantly faster execution.
  2. Brownie is compatible with the pytest-xdist plugin, allowing for tests to be executed in parallel. Again, proper design principles combined with the use of xdist can greatly reduce execution times.
  3. The Brownie pytest plugin includes an --update flag which allows you to only run tests involving source files that have changed. Brownie decides which files have changed based on the compiled bytecode, so adjusting comments or renaming variables will not require you to repeat your tests.
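
The caching idea in the first item can be sketched as a small memoizing wrapper. What counts as an "identical" transaction is an assumption here: the sketch hashes a plain dict describing the transaction, whereas a real key must capture everything that can affect execution. The class and its methods are hypothetical and are not part of Brownie's API.

```python
import hashlib
import json
from typing import Any, Callable, Dict


class CoverageCache:
    """Reuse coverage results for transactions that are considered identical."""

    def __init__(self, analyze: Callable[[Dict[str, Any]], Any]) -> None:
        self._analyze = analyze  # the slow step: trace the transaction and analyze it
        self._results: Dict[str, Any] = {}

    @staticmethod
    def key(tx: Dict[str, Any]) -> str:
        # Hypothetical notion of "identical": every field of the description matches.
        canonical = json.dumps(tx, sort_keys=True)
        return hashlib.sha1(canonical.encode()).hexdigest()

    def evaluate(self, tx: Dict[str, Any]) -> Any:
        cache_key = self.key(tx)
        if cache_key not in self._results:
            self._results[cache_key] = self._analyze(tx)
        return self._results[cache_key]
```

With this in place, a test suite that repeats the same setup transactions pays the tracing cost once per distinct transaction rather than once per test.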

Try It Yourself!

If you’d like to see Brownie’s coverage evaluation in action you can use the following commands to install Brownie, download the Brownie Mix token template, run the tests, and open the GUI:

pip install eth-brownie
brownie bake token
cd token
brownie test -C
brownie gui
