How formal verification can ensure flawless smart contracts

Bernhard Mueller
ConsenSys Media
Published in
8 min readJan 29, 2018

Bernhard Mueller, the creator of the Mythril smart contract analyzer, lays out how to eradicate errors in the Ethereum bytecode.

Smart contract security is serious business. Previously, I wrote about detecting common security issues in Ethereum bytecode, but generic tests like these only scratch the surface. Ideally, we want to be sure that our contract behaves correctly 100% of the time. Formal verification allows us to prove conclusively that certain error states can never occur.

Note: This article is outdated and the code won’t work with recent versions of Mythril. You can now reproduce the same examples more easily with MythX.

There is a considerable amount of academic research into semantic modeling of the Ethereum Virtual Machine (EVM) and formal verification of smart contracts using various frameworks (links below). In this article, I’ll describe an approach based on symbolic execution and the Z3 theorem prover, and demonstrate that this method is capable of detecting subtle bugs, such as those resulting from Solidity memory addressing mechanics.

Our specimen of choice is Ownable, a commonly used base contract that defines the owner state variable as well as a constructor, a modifier, and a function for transferring ownership.

Computing the state space

The first step is to symbolically execute the code and produce a representation of all possible program states. Each state consists of a set of concrete and/or symbolic values associated with the contract account (storage, balance,… ) and virtual machine environment (program counter, call data, etc.), as well as a set of path constraints, i.e. the requirements that must be fulfilled to reach that particular state.

Mythril’s symbolic execution engine computes the state space in a fully automated way. To visualize the possible paths through the program, we can render a control flow graph with the myth command line tool.

$ myth -g Ownable.html Ownable.sol

The result is a graph that provides a high-level overview of the program flow. Each node in the graph represents a basic block of code, while the edges denote the path conditions.

Control flow graph for Ownable.sol (view large version)

Fully mapping the state space is not always that trivial: Constructs such as unbounded loops and recursive inter-contract calls can lead to an exponential increase in the number of states. Thanks to Ethereum’s gas concept however, we can be sure that execution always halts. I will write more on this topic in subsequent posts — for now, we’ll stick with our simplistic example.

Specification

Given the compiled bytecode of the Ownable contract, we’ll attempt to prove that Ownable behaves correctly with respect to the following safety property:

“Ownership can be transferred by the contract owner only.”

We will formalize this property using the language of the Ethereum yellow paper: The global state is denoted as σ, the machine state as μ, et cetera. Some of the symbols are explained in the image below, but if you’re not familiar with them I recommend studying the paper first.

Following Solidity compiler conventions, the variable owner is represented by the value stored in storage slot 0 (this can be confirmed by inspecting the bytecode). We are therefore interested in the subset of state transitions σ => σ′ in which the contents of storage slot 0 are altered:

According to the EVM language definition only the SSTORE instruction modifies global storage. We can therefore more easily obtain the set of relevant initial states σ by selecting from all pairs of global state and machine state (σ, μ) where:

An error state exists if it is possible to satisfy the logical formula:

In plain English, we attempt to prove that it is impossible to reach a point in the program where the variable owner is changed, but msg.sender does not match the current owner.

Translating the specification into a Python program

The program below performs an exhaustive search for a solution to the above formula. If the formula can be satisfied within the state space of Ownable, we obtain a set of concrete values (for state and input variables) that violate the specification. Conversely, if the formula is proven unsatisfiable, we conclude that the contract is safe with respect to the specification.

Running the analysis on Ownable returns the following result:

$ ./analysis.py Ownable.sol
Analysis completed.

Since no counterexample has been found, we can conclude that Ownable is safe with respect to modifications of the owner state variable, with a few caveats:

  • We assume that the Ethereum virtual machine always behaves correctly;
  • The result only applies to the Ownable contract in isolation, not necessarily to contracts derived from it.

Detecting a not-so-obvious bug

The above result is hardly surprising: Looking at the Solidity code, it is pretty obvious that the specification always holds. Interestingly, however, our analysis tool will detect any kind of violation, including subtle bugs that might not be immediately noticeable to an auditor. Consider Pwnable.sol, a modified Ownable.sol that adds a dynamic array:

At a quick glance, this doesn’t look so bad: After all, transferOwnership() is still the only function that writes to the owner variable, and it is protected by the onlyOwner() modifier. This time however, the analysis tool complains:

$ python analysis.py Pwnable.solViolation found! Function: _function_0x7ca5dcce, address: 410
storage_1 = 0x4ef1d2ad89edf8c4d91132028e8195cdf30bb4b5053d4f8cd260341d48080000
caller = 0x0
storage_0 = 0xffffffffffffffffffffffffffffffffffffffff00
calldata__0 = 0x7ca5dcce00000000000000000000000000000000000000000000000000000000
calldatasize_ = 0x4
calldata__4 = 0x4ef1d2ad89edf8c4d91132028e8195cdf30bb4b5053d4f8cd260341d4805f30a

callvalue = 0x0
Analysis completed.

It appears that certain inputs result in overwriting the owner variable. As it turns out, this is due to the way dynamically sized arrays are laid out in storage:

  • The length of the array is stored at storage slot 1;
  • The data is stored at the storage address keccak(1) + offset.

When offset is set to (MAX_UINT — keccak(1)), this adds up to zero and we’re addressing storage slot 0. Note that this only works because a1.length is intentionally underflowed in the constructor. However, similar issues do occasionally occur in practice: In Solidity, the array length variable is often managed explicitly, and array.length-- is a common pattern (to learn more about this type of issue, check out Doug Hoyte’s USCC submission and watch my upcoming HITB talk).

Exploitation

To trigger the issue, all we need to do is to create a function call using the output of the analysis tool.

  • calldata_0: Function signature hash of fun(uint256,address)
  • calldata_4: Offset found by the solver, should match 2**256 — keccak256(1))

The offset value produced by the solver can be used as is. For clarity however, here is how it can be calculated in Python.

>>> from ethereum import utils
>>> hash = utils.sha3("\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01")
>>> hash_num = int.from_bytes(hash, byteorder="big")
>>> 2**256 — hash_num
35707666377435648211887908874984608119992236509074197713628505308453184860938
>>> hex(2**256-hash_num)
'0x4ef1d2ad89edf8c4d91132028e8195cdf30bb4b5053d4f8cd260341d4805f30a'

We end up with the following function call:

fun(35707666377435648211887908874984608119992236509074197713628505308453184860938,”0x11223344556677889900112233445566deadbeef”)

To confirm that this actually works, you can deploy Pwnable.sol on remix and perform the function call through the UI (note that the Integer argument must be put in quotes). After performing the call, reading the contents of owner should return the new address.

TL;DR

In this article, I showed how to formally verify the correctness of a smart contract with respect to well-defined properties using a symbolic execution engine and SMT solver. This type of analysis can prove certain properties of the code and is capable of detecting subtle bugs.

Further Reading

As mentioned above, a lot of awesome research is available in this field. Here are a few resources that deal with formal analysis of EVM code as well as program verification in general:

About Mythril and MythX

Mythril is a free and open-source smart contract security analyzer. It uses symbolic execution to detect a variety of security vulnerabilities.

MythX is a cloud-based smart contract security service that seamlessly integrates into smart contract development environments and build pipelines. It bundles multiple bleeding-edge security analysis processes into an easy-to-use API that allows anyone to create purpose-built smart contract security tools. MythX is compatible with Ethereum, Tron, Vechain, Quorum, Roostock and other EVM-based platforms.

Disclaimer: The views expressed by the author above do not necessarily represent the views of Consensys AG. ConsenSys is a decentralized community with ConsenSys Media being a platform for members to freely express their diverse ideas and perspectives. To learn more about ConsenSys and Ethereum, please visit our website.

Disclaimer: The views expressed by the author above do not necessarily represent the views of ConsenSys AG. ConsenSys is a decentralized community with ConsenSys Media being a platform for members to freely express their diverse ideas and perspectives. To learn more about ConsenSys and Ethereum, please visit our website.

--

--

Bernhard Mueller
ConsenSys Media

Hackers (1995) fan • “Best Research” Pwnie Awardee • Retired degen • G≡¬Prov(num(G))