Py-EVM Part 1: Origins

PyEthereum is a bit of a mess.

For those that are not familiar, PyEthereum is the implementation of the EVM in Python, originally authored by Vitalik.

The PyEthereum source code was written quickly (and skillfully) to serve a purpose. It is one of the key tools that Vitalik uses in his protocol research and development. As far as I am aware, every feature and protocol modification Vitalik has created was prototyped using PyEthereum.

PyEthereum is an amazing library. It has good test coverage and is reasonably fast (for a language like Python). It has a wealth of useful utility functions for things like generating addresses from private keys. It comes with an extremely useful test EVM which has been key to most Python development of Ethereum applications. Many of us in the Python community have Vitalik to thank for creating the library that has been at the core of our other Python code for the last 3 years.

But it is also a bit of a mess.


Software ecosystems with solid foundations grow and thrive at a much greater rate than their counterparts. Django is a prime example of this. Django has done exceptionally well as a stable and predictable foundation.

The Django ecosystem has a wide range of 3rd party libraries and tools to solve common web development problems. These allow developers to focus on their products rather than spending time futzing with implementing their own AWS integration or task queue.

The Python/Ethereum ecosystem is no different. Those of us who choose Python need a sturdy foundation for the various applications, tools and libraries we create. Personally, I believe that such a foundation should have the following things:

1. Documentation

It should have excellent documentation. This should cover API-style documentation, but more importantly, it needs to contain narrative-style documentation which holds your hand and walks you through the various architecture and abstractions. This documentation should read like an easy-to-follow guidebook which takes you through everything you could ever need to know when working with the library.

2. Clearly Defined API

There should be a clearly defined distinction between public and private APIs.
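To make the distinction concrete, here is a minimal sketch of the usual Python conventions for marking it: a leading underscore for private helpers and `__all__` for the names a module commits to. The module contents and function names are hypothetical and are not taken from PyEthereum or Py-EVM.

```python
# Hypothetical utility module; the names are invented for illustration.

# Names listed here are the ones the module promises to keep stable.
__all__ = ["normalize_hex"]


def _strip_prefix(value):
    """Private helper (leading underscore): may change without notice."""
    return value[2:] if value[:2] in ("0x", "0X") else value


def normalize_hex(value):
    """Public: return a lowercase hex string with a single 0x prefix."""
    return "0x" + _strip_prefix(value).lower()
```

With this in place, users know they can depend on `normalize_hex`, while `_strip_prefix` stays free to change.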

3. Clearly Defined Deprecation Strategy

Users need to know they can rely on the API they are using not changing out from under them. And on the other side, library maintainers need to know what APIs they are allowed to change as needed and which ones need to be taken through a deprecation process before being changed or removed.
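One common way to run such a deprecation process in Python is to keep the old name working for a release cycle while emitting a `DeprecationWarning` from the standard `warnings` module. The sketch below assumes a hypothetical `deprecated` decorator and two invented functions; it illustrates the idea and is not how PyEthereum or Py-EVM handle it.

```python
import functools
import warnings


def deprecated(replacement):
    """Hypothetical decorator: warn on every call, then defer to the old code."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            warnings.warn(
                f"{func.__name__} is deprecated; use {replacement} instead",
                DeprecationWarning,
                stacklevel=2,  # point the warning at the caller's line
            )
            return func(*args, **kwargs)
        return wrapper
    return decorator


def encode_address(raw):
    """The new public API (name invented for this example)."""
    return "0x" + raw.hex()


@deprecated("encode_address")
def address_to_hex(raw):
    """The old name keeps working until it is removed in a later release."""
    return encode_address(raw)
```

Existing callers of `address_to_hex` keep working and get a warning that names the replacement, which gives maintainers a defined window before removal.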

4. Modularity and Extensibility as an API

Things like new opcodes should not require code changes to the core codebase. In fact, I would posit that every protocol change to date since the Frontier network went live should have been implementable without making core changes to the codebase. We need an EVM implementation that not only allows, but encourages extensibility and modularity as first class features.
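As a rough illustration of what "new opcodes without core changes" could look like, here is a toy stack machine whose opcode table is plain data that a subclass can extend. Everything here (`ToyVM`, `ToyVMWithSub`, the handler functions) is hypothetical and greatly simplified; it is not the Py-EVM or PyEthereum API. The opcode numbers match the real ADD, MUL and SUB, but the split into a base and an extended ruleset is invented purely for the example.

```python
from typing import Callable, Dict, List

Stack = List[int]
OpcodeTable = Dict[int, Callable[[Stack], None]]

UINT256_MOD = 2 ** 256  # EVM words are 256 bits, so arithmetic wraps


def op_add(stack):
    a, b = stack.pop(), stack.pop()
    stack.append((a + b) % UINT256_MOD)


def op_mul(stack):
    a, b = stack.pop(), stack.pop()
    stack.append((a * b) % UINT256_MOD)


class ToyVM:
    """The 'core': it only knows how to dispatch through the opcode table."""
    opcodes: OpcodeTable = {0x01: op_add, 0x02: op_mul}

    def execute(self, program, stack=None):
        stack = list(stack or [])
        for opcode in program:
            try:
                handler = self.opcodes[opcode]
            except KeyError:
                raise ValueError(f"invalid opcode: {opcode:#04x}")
            handler(stack)
        return stack


def op_sub(stack):
    a, b = stack.pop(), stack.pop()  # a is the top of the stack
    stack.append((a - b) % UINT256_MOD)


class ToyVMWithSub(ToyVM):
    """A later ruleset: adds an opcode without editing ToyVM itself."""
    opcodes = {**ToyVM.opcodes, 0x03: op_sub}
```

Running `ToyVMWithSub().execute([0x03], [2, 5])` leaves `[3]` on the stack, while the same program raises `ValueError` on the base `ToyVM`. The point of the design is that each set of protocol rules is an extension layered on the previous one rather than an edit to shared code.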


In its current state, PyEthereum is not this library.

  1. PyEthereum has roughly zero documentation.
  2. PyEthereum has no distinction between public and private APIs.
  3. There is no formalized strategy for deprecation and backwards incompatible changes.
  4. PyEthereum can be configured and made modular in some limited ways, but there is zero documentation on how to do this.

So if PyEthereum falls short, how do we get ourselves the library we need?

Option 1: Fix PyEthereum

The most obvious option is to fix PyEthereum. Given the magnitude of changes needed, this is not something that could be done iteratively on any reasonable timeframe, which means it would need to be a significant backwards incompatible change.

This approach would leverage the work and mindshare that already exists within the PyEthereum codebase and development community. This is normally the route that I would take as rewrites often appear simple on their surface while in reality they are difficult and involved.

The primary argument against this option is the significant number of backwards incompatible breaking changes which would be required. This type of change would be disruptive and costly for all of the teams and products which are currently built on top of PyEthereum.

In a similar vein, such a broad change would likely be disruptive to Vitalik’s research.

Option 2: Start Fresh (the one I chose)

A completely new implementation written in Python.

The major benefit of this approach is a new library which can take cues from the existing implementation but start with a clean slate. This library can be developed in parallel to PyEthereum without causing any disruption to existing users.

The drawbacks to this approach are somewhat nuanced. In the short term, it means a lot of work creating a new implementation, but this is lessened by the wonderful ethereum/tests suite of test cases.

Introducing Py-EVM

Py-EVM is a new implementation of the Ethereum Virtual Machine written in Python. It is currently in active development but is quickly progressing through the test suite provided by ethereum/tests. I have Vitalik and the existing PyEthereum code to thank for the quick progress I’ve made, as many design decisions were inspired by, or even directly ported from, the PyEthereum codebase.

Py-EVM aims to eventually become the de facto Python implementation of the EVM, enabling a wide array of use cases for both public and private chains. Development will focus on creating an EVM which has a well defined API and friendly, easy to digest documentation, and which can be run as a fully functional mainnet node.

Step 1: Alpha Release

The plan is to begin with an MVP, alpha-level release that is suitable for testing purposes. We’ll be looking for early adopters to provide feedback on our architecture and API choices as well as general feedback and bug finding.

We expect to enter this phase within the next few months, if not weeks.

Step 2: Beta Release

Once we are satisfied that the basic design decisions are correct, we will enter a cycle of beta releases. During this time we plan to develop an adapter as well as documentation for people migrating from PyEthereum.

During this time we will also begin development of the networking components to allow Py-EVM to serve as a full Ethereum node on the live network.

This phase is likely to last for multiple months. We hope to be at or near the end by the time that Metropolis is released but ultimately, we will be done when it’s ready.

Step 3: Initial Stable Release

Once Py-EVM is able to run both the mainnet and public test networks it will be ready for a stable 1.0 release.

Development

If you’d like to follow along with the ongoing development of Py-EVM you can do so on GitHub.