Extending the EVM

Published in Clearmatics, Mar 13, 2018

The Ethereum Virtual Machine (EVM) provides a rich, Turing-complete environment in which to execute smart contracts. That said, being able to execute any program does not mean it can do so efficiently. Extending the EVM so that it executes certain operations more efficiently is desirable on the public chain and on consortium or private chains alike.

This article briefly examines how we can extend the EVM across many different implementations.

The EVM can be extended in different ways

Adding opcodes

Extra instructions that support the new operations, such as additional cryptographic primitives, can be added to the EVM in the same way as they would be to a conventional CPU.

The same disadvantages as adding instructions to a traditional CPU apply: to take full advantage of the new operations, the compiler must be modified to emit those instructions (otherwise hand-written assembly must be used).

In addition, the instruction encoding has limited space: an EVM opcode is a single byte, so there can be at most 256 instructions. As the remaining opcodes are used up, public and private chains are likely to attach different operations to the same opcode, so the same bytecode would behave differently on different networks.
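The opcode clash described above can be sketched with a toy stack machine. This is an illustration under stated assumptions, not the real EVM: only STOP, ADD and PUSH1 use their real opcode values, while the choice of 0x0c as the extension opcode and the helpers `run`, `double_top` and `square_top` are hypothetical.

```python
# Toy stack machine, not the real EVM: it only illustrates one-byte opcode dispatch.
STOP, ADD, PUSH1 = 0x00, 0x01, 0x60  # real EVM opcode values
CUSTOM = 0x0C  # assumed to be unassigned; picked purely for illustration
WORD = 2**256  # EVM words are 256 bits wide


def run(code: bytes, extensions: dict) -> list:
    """Execute `code` and return the final stack.

    `extensions` maps an opcode byte to a function that mutates the stack,
    standing in for the extra instructions a given network has added.
    """
    stack, pc = [], 0
    while pc < len(code):
        op = code[pc]
        pc += 1
        if op == STOP:
            break
        elif op == PUSH1:
            stack.append(code[pc])  # the next byte is the immediate operand
            pc += 1
        elif op == ADD:
            stack.append((stack.pop() + stack.pop()) % WORD)
        elif op in extensions:
            extensions[op](stack)
        else:
            raise ValueError(f"invalid opcode 0x{op:02x}")
    return stack


def double_top(stack):
    """Hypothetical extension chosen by network A."""
    stack.append((stack.pop() * 2) % WORD)


def square_top(stack):
    """Hypothetical extension chosen by network B."""
    stack.append((stack.pop() ** 2) % WORD)


network_a = {CUSTOM: double_top}
network_b = {CUSTOM: square_top}

# PUSH1 3, PUSH1 4, ADD, CUSTOM, STOP
program = bytes([PUSH1, 3, PUSH1, 4, ADD, CUSTOM, STOP])
result_a = run(program, network_a)  # [14]
result_b = run(program, network_b)  # [49]
```

The same seven bytes give 14 on one network and 49 on the other, and a network that has not adopted the opcode rejects the program outright.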

Precompiled contracts

A second option is to add the operations as contracts written in the native language of the Ethereum client (instead of EVM bytecode). As the operations are implemented as contracts, no updates to the compilers are required: they can be called in the same fashion as a contract written in EVM bytecode. While they suffer from the same address space overloading issues as opcodes, the address space is far larger, so more operations will fit.

This approach still carries overhead, however: the client must trap calls to the contract address, unpack the parameters, call the native code and repack the results for the EVM. The precompiled contract must also still be deterministic, as it is part of the consensus process.
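A minimal sketch of how a client might trap calls to precompile addresses is shown below. It assumes a simplified call interface of bytes in and bytes out, and it ignores gas accounting. Addresses 0x02 (SHA-256) and 0x04 (identity) match the public chain; the names `PRECOMPILES`, `call` and `run_bytecode` are hypothetical and not taken from any particular client.

```python
import hashlib


def sha256_precompile(data: bytes) -> bytes:
    """Native SHA-256: deterministic, so every node computes the same result."""
    return hashlib.sha256(data).digest()


def identity_precompile(data: bytes) -> bytes:
    """Return the input unchanged."""
    return data


# Address -> native implementation. A private chain could add its own entries,
# which is where address clashes between networks can arise.
PRECOMPILES = {
    0x02: sha256_precompile,
    0x04: identity_precompile,
}


def run_bytecode(address: int, data: bytes) -> bytes:
    """Placeholder for the ordinary bytecode interpreter (hypothetical)."""
    raise NotImplementedError("bytecode execution is outside this sketch")


def call(address: int, data: bytes) -> bytes:
    """Dispatch a contract call, trapping precompile addresses first."""
    native = PRECOMPILES.get(address)
    if native is not None:
        # Input is handed to native code and the result is returned to the
        # caller as raw bytes, exactly as for any other contract call.
        return native(data)
    return run_bytecode(address, data)


digest = call(0x02, b"abc")  # 32-byte SHA-256 digest of b"abc"
echoed = call(0x04, b"abc")  # the identity precompile returns its input
```

From the caller's point of view the precompile is just another contract address, which is why no compiler changes are needed.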

Where each approach is useful

Both methods do seem to have a place in the future of the EVM.

Precompiles seem especially suitable for consortium or private chains, where a small number of accelerated operations can target the key operations for those verticals.

Opcodes are more suitable for generally applicable extensions.

It is worth noting that the recently added BN256 operations were implemented as precompiles.

Managing precompiles as a community

Once precompiles are implemented, managing which address is allocated to which precompile across a number of different public, consortium and private chains quickly becomes a risky and time-consuming process.

OpenGL (a cross-platform, low-level graphics library controlled by the Khronos Group) offers a useful precedent for how this can be managed. OpenGL has included an extension mechanism since the mid-1990s: a registry is managed by the Architecture Review Board (ARB), and any member of the ARB can propose an extension which, if accepted, is allocated a number in the registry. Different classes of extension exist: single vendor; EXT for older generic or multiple vendor extensions; and ARB for more modern extensions supported by a number of vendors.

Many extensions start as single vendor extensions, are then adopted as ARB extensions, and finally make their way into a later version of the core specification.

For the EVM, a group such as the Enterprise Ethereum Alliance (EEA) could manage a registry of precompiled contracts across all chains, assigning a consistent address to each contract regardless of the implementation and so mitigating the risk of overloading the address space. Over time this would lead to a set of standard precompiles for different use cases.
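The core of such a registry could be as simple as the sketch below, which hands out each address once and always returns the same address for the same name. Everything here is hypothetical: the `PrecompileRegistry` class, the starting address 0x100 and the extension names are illustrative only and do not describe any existing EEA process.

```python
class PrecompileRegistry:
    """Allocate one stable address per named precompile."""

    def __init__(self, first_address: int):
        self._next = first_address
        self._by_name = {}
        self._by_address = {}

    def register(self, name: str) -> int:
        """Return the address for `name`, allocating a new one if needed."""
        if name in self._by_name:
            # Registering twice is harmless: the original address is kept,
            # so every chain sees the same mapping.
            return self._by_name[name]
        address = self._next
        self._next += 1
        self._by_name[name] = address
        self._by_address[address] = name
        return address

    def lookup(self, address: int):
        """Return the name registered at `address`, or None if it is free."""
        return self._by_address.get(address)


registry = PrecompileRegistry(first_address=0x100)
addr_hash = registry.register("EXT_example_hash")      # hypothetical name
addr_proof = registry.register("VENDOR_example_proof")  # hypothetical name
addr_again = registry.register("EXT_example_hash")      # same address as before
```

Because addresses are never reused, a client on any chain can either implement the precompile at its registered address or leave that address empty, but never repurpose it.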

Zoe Nolan, Senior Developer, Clearmatics

Tweet us @Clearmatics