An Ethereum Contract Analyzer

Dr. Y
Published in slock.it Blog
Jan 12, 2016

When you use an Ethereum smart-contract, you are totally sure… of what? The trust-less consensus algorithm makes sure that nobody can modify the contract. You are totally sure that your funds are managed by the contract on the blockchain, which is written in bytecode that looks like

0x60606040523615600a575b606e60015460009011801560275750620151806001600050544203115b1560705773ffffffffffffffffffffffffffffffffffffffff3316600034606082818181858883f15050905473ffffffffffffffffffffffffffffffffffffffff16915050ff5b005b670de0b6b3a7640000341060a2576000805473ffffffffffffffffffffffffffffffffffffffff191633179055426001555b56

(example from the blockchain). Before entrusting your valuables to it, you need to understand what it does. Of course, you can ask for an explanation, but then you trust the explainer. What you lose is exactly the charm: the trust-less part.

With some luck, you might get the Solidity source code. With more luck, somebody tells you which version of the compiler to run on it. If your own compilation matches the bytecode on the blockchain, at least you know how the bytecode was generated. Still, you need to read the source code and you have to trust the compiler.

You can also run the bytecode in EVM implementations. On your own private chain, it’s free to test a contract. The only problem is time. It’s impossible to try all combinations of input data, caller, and value (it’s already impossible to enumerate all possible values of a single uint256 input). Meanwhile a clever attacker can compose a special input that makes the contract misbehave. Against such an attacker, you want to know every possible behavior of the contract. Is that possible?

Yes, in many cases! Try Dr. Y’s Ethereum Contract Analyzer. It tells you that the contract has twelve possible behaviors. It tells you the precise conditions under which each behavior occurs. In Behaviors 0, 3, 6 and 9, message calls happen and some funds might be transferred. And the analyzer proudly declares “12 behaviors cover the possibilities (assuming enough gas).”

I’m totally aware that this sounds like a scam; why would you trust such-and-such analyzer? You should not, at least not yet. The analyzer is actually wrong in many cases: it doesn’t consider integer overflows and underflows, or stack overflows (simply because of hobby-time constraints). However, it does analyze all possible cases.

Behind the scenes is a rather simple trick called symbolic execution. The analyzer executes the bytecode like other EVM implementations, but without knowing the input. In place of a concrete input value, the analyzer keeps a symbol representing “the unknown input,” hence the name symbolic execution. If the contract returns the fourth byte of the unknown input, the analyzer says that the contract returns “the fourth byte of the unknown input.” If the contract has an if-condition about the unknown input, the analyzer says there are two possible behaviors, under conditions “the fourth byte of the unknown input being {zero/non-zero}.” In this way, all possibilities are covered.
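To make the idea concrete, here is a minimal sketch in Python. It is not the analyzer's code: the toy program format (`"return"` and `"if_zero"` tuples), the `InputByte` class and the `explore` function are all made up for illustration, and real EVM bytecode is far richer. The sketch only shows the two moves described above: a symbol is carried around in place of a concrete value, and an if-condition on a symbol splits the analysis into two behaviors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InputByte:
    """Symbol standing for byte number `index` of the unknown input."""
    index: int

    def __str__(self):
        return f"byte {self.index} of the unknown input"


# A toy program is a nested tuple (hypothetical format, not EVM bytecode):
#   ("return", value)
#   ("if_zero", symbol, program_if_zero, program_if_nonzero)
def explore(program, conditions=()):
    """Return one (conditions, returned value) pair per possible behavior."""
    kind = program[0]
    if kind == "return":
        return [(list(conditions), program[1])]
    if kind == "if_zero":
        _, symbol, if_zero, if_nonzero = program
        # The symbol has no concrete value, so both branches are followed,
        # each remembering the condition under which it is taken.
        zero = explore(if_zero, conditions + (f"{symbol} is zero",))
        nonzero = explore(if_nonzero, conditions + (f"{symbol} is not zero",))
        return zero + nonzero
    raise ValueError(f"unknown instruction: {kind}")


fourth = InputByte(4)
# "If the fourth byte is zero, return 0; otherwise return the fourth byte."
PROGRAM = ("if_zero", fourth, ("return", 0), ("return", fourth))

behaviors = explore(PROGRAM)
for number, (conds, result) in enumerate(behaviors):
    print(f"Behavior {number}: under conditions {conds}, returns {result}")
```

The second behavior returns the symbol itself, which is the sketch's way of saying that the contract returns “the fourth byte of the unknown input.”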

For instance, Slock’s token contract at the time of writing looks like this.

The Analyzer in action (visualisation of the Slock.it contract)

You can specify the number of steps to analyze. When you specify 0, you see the initial state

still running with state {
stack: [](size 0)
memory: (empty)
storage: (initial storage)
}.

After one step, you see something on the stack

still running with state {
stack: [(0x60)](size 1)
memory: (empty)
storage: (initial storage)
}.
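Where does that 0x60 come from? The sketch below decodes the first few bytes of the example bytecode quoted near the top of this post. It is an assumption that the state dumps here come from bytecode with the same opening bytes; the opcode table is a tiny hand-picked subset of the EVM instruction set, and `disassemble` is a name invented for this illustration.

```python
# Tiny subset of EVM opcodes, just enough for the opening bytes of the example.
OPCODES = {
    0x15: "ISZERO",
    0x36: "CALLDATASIZE",
    0x52: "MSTORE",
    0x57: "JUMPI",
    0x5B: "JUMPDEST",
}


def disassemble(code: bytes, limit: int):
    """Decode at most `limit` instructions; return (offset, text) pairs."""
    out = []
    pc = 0
    while pc < len(code) and len(out) < limit:
        op = code[pc]
        if 0x60 <= op <= 0x7F:
            # PUSH1..PUSH32 carry 1..32 bytes of immediate data.
            n = op - 0x5F
            arg = code[pc + 1 : pc + 1 + n]
            out.append((pc, f"PUSH{n} 0x{arg.hex()}"))
            pc += 1 + n
        else:
            out.append((pc, OPCODES.get(op, f"UNKNOWN_0x{op:02x}")))
            pc += 1
    return out


# Opening bytes of the example bytecode shown earlier in this post.
PREFIX = bytes.fromhex("60606040523615600a575b")

listing = disassemble(PREFIX, 7)
for offset, text in listing:
    print(f"{offset:04x}  {text}")
```

The first decoded instruction is a PUSH1 of 0x60, which is consistent with the single stack entry shown after one step. The seventh is a JUMPI whose condition is derived from CALLDATASIZE, the size of the input.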

After seven steps, you start to see different behaviors depending on the input

Behavior 0
under conditions:
1. (size of input) is not zero.
still running with state {…}.
Behavior 1
under conditions:
1. (size of input) is zero.
still running with state {…}.

because there is an if-condition about the size of the input. When the contract is actually used, the size of the input is either zero or not zero, so exactly one of the two behaviors is chosen. Nevertheless, the analyzer shows you both possibilities, and after more steps, it keeps showing you all possibilities.
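The step-by-step output above can be imitated with a very small symbolic stack machine. The following is a sketch under stated assumptions, not the analyzer's implementation: it supports only the five opcodes that appear in the opening bytes of the example bytecode quoted earlier, it represents the unknown input size as the plain string "(size of input)", and the names `step` and `run` are invented here.

```python
# Opening bytes of the example bytecode (assumed to match the dumps above).
PREFIX = bytes.fromhex("60606040523615600a575b")
INPUT_SIZE = "(size of input)"  # symbol standing for the unknown input size


def step(code, state):
    """Execute one instruction symbolically; return all successor states."""
    pc = state["pc"]
    stack = list(state["stack"])
    memory = dict(state["memory"])
    conds = list(state["conditions"])
    op = code[pc]

    def successor(new_pc, extra=None):
        new_conds = conds + ([extra] if extra else [])
        return {"pc": new_pc, "stack": list(stack),
                "memory": dict(memory), "conditions": new_conds}

    if op == 0x60:  # PUSH1: push the next byte
        stack.append(code[pc + 1])
        return [successor(pc + 2)]
    if op == 0x52:  # MSTORE: pops the offset, then the value
        offset, value = stack.pop(), stack.pop()
        memory[offset] = value
        return [successor(pc + 1)]
    if op == 0x36:  # CALLDATASIZE: unknown, so push a symbol
        stack.append(INPUT_SIZE)
        return [successor(pc + 1)]
    if op == 0x15:  # ISZERO: wrap the operand in a symbolic expression
        stack.append(("iszero", stack.pop()))
        return [successor(pc + 1)]
    if op == 0x57:  # JUMPI: pops the destination, then the condition
        dest, cond = stack.pop(), stack.pop()
        if isinstance(cond, tuple) and cond[0] == "iszero":
            subject = cond[1]
            # The jump is taken when iszero(subject) is non-zero, i.e. when
            # the subject is zero. Both outcomes are kept as behaviors.
            return [successor(pc + 1, f"{subject} is not zero"),
                    successor(dest, f"{subject} is zero")]
    raise NotImplementedError(f"opcode 0x{op:02x} at offset {pc}")


def run(code, steps):
    """Return every state reachable after exactly `steps` instructions."""
    states = [{"pc": 0, "stack": [], "memory": {}, "conditions": []}]
    for _ in range(steps):
        states = [nxt for state in states for nxt in step(code, state)]
    return states


for number, state in enumerate(run(PREFIX, 7)):
    print(f"Behavior {number} under conditions: {state['conditions']}")
```

With 0 steps the sketch reports an empty stack, with 1 step a stack holding 0x60, and with 7 steps two states that differ only in the recorded condition on the input size.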

How can the analyzer be made trustworthy? There are two sources of trust to pursue:

  • The Yellow Paper defines the EVM. I want to translate the Yellow Paper to a theorem prover (like ACL2, HOL4, or Coq) and prove that the analyzer matches the Yellow Paper.
  • The blockchain is a source of trust. I want to connect the analyzer to the blockchain as a client, and make sure that the analyzer understands everything that happens on the blockchain.

The analyzer presented here helps us to understand contracts on the blockchain, to build contracts that materialize our ambitions with precision, and to keep attackers from exploiting unknown features of our contracts.

And don’t forget: symbolic execution is only the baby step of the whole discipline of formal methods. We are getting closer to smart-contracts with mathematical guarantees.

About the Author

Dr. Y is a hobby-time entity, whose identity will be revealed when the hobby is no longer a hobby.

Contact: dr-y@hushmail.com
