Why is EVM-on-Plasma hard?

Thanks to Dan Robinson for very useful discussions around this stuff.


Most of the current work around Plasma basically just makes use of very simple UTXOs or non-fungible tokens. Lots of people want to know if it’s possible to create a Plasma chain that allows users to deploy EVM smart contracts. Unfortunately, this is a lot more complicated than it might seem for reasons that probably aren’t obvious if you don’t spend a lot of time working with Plasma. I wrote this to quickly outline why full EVM support on Plasma is non-trivial and then give some suggestions for what could be done to create a Plasma chain capable of running more general smart contracts.

Let’s talk a little bit more about Plasma before we start picking apart exactly why EVM-enabled Plasma chains are hard to build. One fundamental property of Plasma is that it must be possible to withdraw any state represented on a Plasma chain to the root chain (e.g. Ethereum) in a way that maintains the integrity of that state. You should be able to freely move assets from the Plasma chain to the root chain, and vice versa. This functionality is particularly important when a Plasma consensus mechanism goes “bad” and users are forced to withdraw their assets from the Plasma chain.

To understand what this means in practice, let’s imagine a simple Plasma chain where users have accounts with which they can transfer or receive funds. You’re one of the users on this Plasma chain and you’d like to withdraw your funds. How do you do that? Well, you effectively tell the contract sitting on Ethereum that you have some funds on the Plasma chain and that you’d like to withdraw those funds. Of course there’s a catch — you shouldn’t be able to lie about how much is in your account. This is why we introduce something called a “challenge period” during which an invalid withdrawal can be blocked.
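To make the withdrawal flow a bit more concrete, here’s a minimal Python sketch of a root chain contract with a challenge period. Everything in it is an assumption made for illustration: the names `RootContract`, `start_exit`, `challenge`, and `finalize`, the one-week `CHALLENGE_PERIOD`, and especially the `proven_balance` argument, which stands in for whatever proof a real Plasma design would require.

```python
from dataclasses import dataclass

CHALLENGE_PERIOD = 7 * 24 * 60 * 60  # hypothetical window length, in seconds


@dataclass
class Exit:
    owner: str
    amount: int
    started_at: int
    challenged: bool = False


class RootContract:
    """Toy model of the contract sitting on the root chain."""

    def __init__(self):
        self.exits = {}

    def start_exit(self, owner, amount, now):
        # The user claims a balance; nothing is paid out yet.
        self.exits[owner] = Exit(owner, amount, now)

    def challenge(self, owner, proven_balance, now):
        # Block the exit if, within the window, someone shows the real
        # balance is lower than the claimed amount.
        pending = self.exits[owner]
        in_window = now < pending.started_at + CHALLENGE_PERIOD
        if in_window and proven_balance < pending.amount:
            pending.challenged = True
        return pending.challenged

    def finalize(self, owner, now):
        # Pay out only after the window closes with no successful challenge.
        pending = self.exits[owner]
        if pending.challenged or now < pending.started_at + CHALLENGE_PERIOD:
            return 0
        del self.exits[owner]
        return pending.amount
```

An honest exit pays out once the window has passed, while an exit that overstates a balance can be blocked by anyone who challenges in time.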

Now let’s extend this idea to the type of EVM smart contracts you might be familiar with, like a simple multisig account. If this account currently sits on the Plasma chain, then we need to provide some way for the wallet to be “withdrawn” to the root chain. However, we still want to keep the funds in a multisig, so we effectively want to move the entire smart contract back onto the root chain. Remember that each EVM contract has state, balance, and code. So what we’re really doing is moving the state, balance, and code of this multisig contract from the Plasma chain to the root chain. The contract should then be able to operate normally on the root chain.

So who actually gets to decide to move stuff from the Plasma chain to the root chain? Well, if we’re talking about something simple like an account, it makes a lot of sense to say that the owner of the account should be able to withdraw the balance at any time. If we’re talking about a multisig account, then we could design a few different mechanisms to determine when the multisig is moved to the root chain. Maybe every user in the multisig needs to sign off, maybe n-of-m users need to sign off, or maybe only one user needs to sign off. All of these are potentially valid mechanisms — we start to see that it makes sense for the designers of the multisig to decide which mechanism is most appropriate.
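The three options above are really the same rule with a different threshold. Here’s a rough Python sketch of that idea. The helper `make_n_of_m_policy` and the signer names are hypothetical, and a real design would verify signatures rather than compare names.

```python
def make_n_of_m_policy(signers, threshold):
    """Return a check that approves an exit once `threshold` signers agree."""
    signers = frozenset(signers)

    def can_exit(approvals):
        # Only approvals from known signers count, and each counts once.
        return len(signers & set(approvals)) >= threshold

    return can_exit


owners = ["alice", "bob", "carol"]
unanimous = make_n_of_m_policy(owners, 3)     # every user signs off
two_of_three = make_n_of_m_policy(owners, 2)  # n-of-m users sign off
any_one = make_n_of_m_policy(owners, 1)       # a single user is enough
```

The multisig’s designers would pick the threshold, which is exactly the kind of decision that has no obvious owner for more general contracts.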

But here’s the problem — it’s not always clear who gets to move a contract from the Plasma chain to the root chain. Imagine we have a smart contract (on some Plasma chain) that represents lots of virtual kitties. Let’s say the consensus mechanism goes “bad” and everyone needs to leave the Plasma chain to keep their funds safe. What do we do about our virtual kitty contract? As we just discussed, we need to move the contract back onto the root chain. Unfortunately, if we allowed just anyone to withdraw the contract, then (rude) users would almost definitely just withdraw as many contracts as possible to make the Plasma chain unusable. We have to come up with a better mechanism.

We could probably think of a few mechanisms, but it’s likely that they’d be too centralized or too expensive. One seemingly obvious mechanism might be a voting mechanism that decides when the contract can be withdrawn. A small set of eligible voters would highly centralize control over the contract. However, the more users eligible to vote, the more expensive the voting mechanism.

Let’s quickly go back to the process for withdrawing stuff from the Plasma chain. Remember that we can block an exit if we show that the state being withdrawn is somehow invalid. Imagine our virtual kitty contract says that I’m the owner of kitty 123. Now we want to withdraw the contract, so we need to specify the current state of the contract. Part of that current state is “Kelvin owns kitty 123.” What happens if I make someone else the owner of the kitty right in the middle of the contract’s challenge period? The “real” state is now “user X owns kitty 123.” The state being withdrawn is now invalid (and can therefore be challenged). This is our second big problem — if anyone can modify the state of the contract, then anyone can block an exit.
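Here’s a tiny Python sketch of that second problem. The `KittyContract` class, the `exit_is_stale` check, and the extra owner “Alice” are all made up for illustration. The point is only that one transfer by any kitty owner invalidates the state claimed by the whole contract’s exit.

```python
class KittyContract:
    """Toy stand-in for one shared contract that tracks every kitty."""

    def __init__(self, owners):
        self.owners = dict(owners)  # kitty id -> owner

    def transfer(self, sender, kitty_id, new_owner):
        # Any kitty owner can change the shared state by moving their kitty.
        if self.owners[kitty_id] != sender:
            raise PermissionError("only the owner can transfer a kitty")
        self.owners[kitty_id] = new_owner


def exit_is_stale(claimed_owners, contract):
    """An exit becomes challengeable once its claimed state is out of date."""
    return claimed_owners != contract.owners


contract = KittyContract({123: "Kelvin", 456: "Alice"})
claimed = dict(contract.owners)  # state named when the exit starts
fresh = not exit_is_stale(claimed, contract)

contract.transfer("Kelvin", 123, "user X")  # happens mid challenge period
stale = exit_is_stale(claimed, contract)
```

Before the transfer the claimed state matches, and afterwards it doesn’t, so the exit can be challenged even though nobody did anything invalid.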

This leads us to our final problem: we need to check that the state change presented in a challenge is actually valid, and validating EVM state changes inside the EVM is hard. For basic contracts like accounts, a valid state change just requires a signature from the owner of the account and is simple enough to check inside the EVM. However, for complex EVM contracts, things get much more expensive. One way to validate EVM executions is to use something like TrueBit. This might be the simplest option, but it basically kills the security properties of Plasma by making it dependent on an external system. Ideally, if we want to validate single EVM steps in a trustless way, then we need to implement the EVM inside the EVM 🤯. People have been thinking about this for a while and Vitalik even created an EIP to start discussing this. I really recommend checking out the EIP to understand why it’s so difficult (gas limits inside gas limits, hmm…).

All of this is why it’s so hard to create a Plasma chain that operates a lot like Ethereum does today. Let’s quickly recap:

  1. It’s not always clear who gets to move a contract from the Plasma chain to the root chain.
  2. If anyone can modify the state of the contract, then anyone can block an exit.
  3. Validating EVM state changes inside the EVM is hard.

So where do we go from here? Well, luckily, we’ve started to come up with a few concepts that basically involve breaking typical smart contracts down to a level where these questions don’t matter as much. To illustrate, let’s go back to our virtual kitty contract. It might be clearer who’s responsible for moving stuff to the root chain if everyone were moving their own kitty instead of moving the entire contract. We can do this if we represent each kitty as its own mini smart contract that can only be modified by the kitty’s owner. This simple shift in design makes the first and second problems almost entirely irrelevant.
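As a rough sketch of what “one mini contract per kitty” could look like, here’s some Python. The names `KittyCell`, `may_exit`, and `claim_still_valid` are hypothetical, and this isn’t meant to describe any actual Plasma implementation.

```python
from dataclasses import dataclass


@dataclass
class KittyCell:
    """One kitty as its own small piece of state with a single owner."""
    kitty_id: int
    owner: str

    def transfer(self, sender, new_owner):
        # Only the current owner can modify this kitty's state.
        if sender != self.owner:
            raise PermissionError("only the owner can modify this kitty")
        self.owner = new_owner


def may_exit(cell, sender):
    # Problem 1: it's now clear who gets to move this state to the root chain.
    return sender == cell.owner


def claim_still_valid(claimed_owner, cell):
    # Problem 2: only the owner's own actions can invalidate this claim.
    return claimed_owner == cell.owner


mine = KittyCell(123, "Kelvin")
theirs = KittyCell(456, "Alice")

claimed_owner = mine.owner       # state named when my exit starts
theirs.transfer("Alice", "Bob")  # someone else moves their own kitty
```

Activity on kitty 456 doesn’t touch kitty 123, so nobody else can block my exit just by using their own kitty.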

I’ll be publishing some initial thoughts that came out of discussions with Dan Robinson about this sort of design and how to make it usable for developers. The gist is that it’s probably easiest to use something like TXVM and a high-level language that feels like writing EVM smart contracts but automatically breaks things down in the way I just discussed. We’re calling this “Plasma VM” because it makes sense and it’s also MVP backwards.

Of course, feedback/questions/comments are always more than welcome. Let me know if anything is still confusing and I’ll try to make it a little more clear!