Ethereum Scaling Solutions and Tradeoffs

A walk down scaling lane.

2018 is the year of scaling for Ethereum, so here are some solutions and the tradeoffs they each come with.

Firstly, if you’re interested in an in-depth but reader-friendly discussion of various scaling tech, read Josh Stark’s quintessential piece on layer-2 scaling solutions.

I’ll summarize them here as well, but you should be generally aware of Transaction Sharding, State Sharding, State Channels (of which Payment Channels are a subset), Plasma, and Truebit.


Are you a Busy Person™ with Many Things to Do?

TL;DR: layer-2 isn’t here yet, and it won’t be for another 6–12 months. The best thing we can do right now to fix the user-experience of blockchain networks is to “trust but verify” optimistic state transitions.

At XLNT we’re working on this problem with gnarly. Come contribute to the discussion in the #gnarly channel at XLNT.chat 🚀.

First, where do you want to scale?

We have the concepts of layer-1, layer-2, and layer-n scaling solutions.

Layer-1 solutions are anything that is core-protocol-level scaling, like the various sharding approaches—they must be part of the consensus protocol to function correctly.

Layer-2 protocols are one degree away—they operate by leveraging a layer-1 protocol (like Ethereum) and allow users to transact in an objectively less secure environment but one that is backed by the security of the layer-1 solution. For example, if there’s fraud in a state channel, a user can submit a fraud proof that’s validated by the layer-1 network. Likewise in Truebit, if there’s a disagreement over the off-chain-computed solution proposed by a solver, the challengers can “bring the solver to court” by playing a verification game on Ethereum. In the optimistic case, we’ve computed information using a subset of the network that can be trusted by the whole, and in the pessimistic case, we’ve reverted back to the security of the main layer-1 consensus and still end up with a correct answer.

Layer-n solutions are an extension of those ideas—as we create, for example, a Plasma chain that branches off of Ethereum, we can create more Plasma chains that treat the first Plasma chain as a base chain. If an n-level Plasma chain evicts, it can revert to the security of its root chain, which, if also attacked, could also revert to the security of its base chain, all the way back to the Ethereum main chain.
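As a rough mental model of that nesting, here is a minimal sketch. The `Chain` type and the chain names are made up for illustration; real Plasma chains are tied to their parents by smart contracts, not Python objects.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Chain:
    """A chain that may anchor its security in a parent (base) chain."""
    name: str
    parent: Optional["Chain"] = None  # None means this is the layer-1 root


def fallback_path(chain: Chain) -> List[str]:
    """List the chains a dispute could escalate through, nearest first."""
    path = []
    node: Optional[Chain] = chain
    while node is not None:
        path.append(node.name)
        node = node.parent
    return path
```

Calling `fallback_path` on a Plasma chain two levels above Ethereum walks through its parent Plasma chain and ends at the main chain, which is the "all the way back" escalation described above.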


The primary difference between layer-1 and layer-n solutions is the different security properties provided. The optimal solution is layer-1 because it provides guaranteed (in the eyes of the miners) security for all state transfers, with no compromises. That is, everything you compute will be computed within this security context, so your state transition either fails or succeeds with relative finality (because your state transition is a part of the consensus mechanism).

With a layer-2 solution, state transfers are not core to the consensus mechanism of the layer-1 blockchain, so you now rely on alternative security guarantees. For example, a second-layer Plasma chain could operate on Proof of Stake or Proof of Authority, allowing higher transactional throughput at the cost of security within this context. The primary upside to layer-2 solutions is that once we’ve established that falling back to a lower-level security context is possible, attacks become illogical: the only attacks we’ll see are those with malicious, irrational intent.

The primary downside of a layer-2 solution is that, when an attack or fraud occurs, the process of reverting to the security of the main chain is long, expensive, and generally an incredibly poor user-experience. In Truebit, for example, verification games (currently) take O(log(n)) steps to convict a malicious solver, and those steps play out over the course of O(log(n)) Ethereum blocks. At roughly 15 seconds per block, it could take literal hours and thousands of dollars to challenge someone to a verification game (in the worst case).

In general, attacking a layer-n solution will be very expensive, but malicious users may be willing to pay that cost in order to disrupt the network. Likewise, many layer-2 solutions rely on economic security (the idea that it’s not worth the money to attack the system), which could certainly be a downside if the economics don’t work out perfectly.
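To put rough numbers on the verification-game worst case, here is a back-of-the-envelope sketch. It assumes a binary bisection game (each round halves the disputed range) and a 15-second block time. The `blocks_per_round` parameter is a hypothetical knob for how long each round may wait before a player times out; it is not a value taken from Truebit.

```python
BLOCK_TIME_SECONDS = 15  # rough Ethereum block time used in this post


def verification_game_seconds(computation_steps: int, blocks_per_round: int = 1) -> int:
    """Estimate wall-clock time for a bisection-style verification game.

    Assumes ceil(log2(n)) rounds for an n-step computation.
    blocks_per_round is a hypothetical timeout allowance; 1 is the best case.
    """
    if computation_steps < 1 or blocks_per_round < 1:
        raise ValueError("steps and blocks_per_round must be positive")
    # (n - 1).bit_length() equals ceil(log2(n)) for every integer n >= 1.
    rounds = (computation_steps - 1).bit_length()
    return rounds * blocks_per_round * BLOCK_TIME_SECONDS
```

Under these assumptions a 2^40-step computation needs 40 rounds: 600 seconds if every round lands in a single block, or 12,000 seconds (a bit over three hours) if each round is allowed 20 blocks. The log factor is small; it’s the waiting per round that stretches a dispute into hours.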

Second, what do you want to scale?

“Scaling” covers a lot of ground—there are different aspects of a distributed blockchain that can be scaled, each with different techniques, architectures, and trade-offs.

Scaling Transactions

One popular approach to this is Transaction Sharding, à la Zilliqa. By parallelizing transactions, we allow higher throughput. Of note, this technique doesn’t shard the state of the blockchain, which is what Ethereum’s state sharding proposals (planned alongside Casper) attempt to do. Off the top of my head I don’t know of a production network that’s successfully implemented transaction sharding.
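A minimal sketch of the idea, assuming transactions are assigned to shards by sender address so that one account’s transactions stay ordered within a single shard. The hashing rule and the transaction shape here are illustrative assumptions, not Zilliqa’s actual assignment scheme.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List


def shard_for(sender: str, num_shards: int) -> int:
    """Deterministically map a sender address to a shard (hypothetical rule)."""
    digest = hashlib.sha256(sender.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def partition(transactions: List[dict], num_shards: int) -> Dict[int, List[dict]]:
    """Group transactions by shard, preserving each sender's original order."""
    shards: Dict[int, List[dict]] = defaultdict(list)
    for tx in transactions:
        shards[shard_for(tx["from"], num_shards)].append(tx)
    return dict(shards)
```

Each shard’s list can then be processed by a different group of nodes in parallel. Every node still sees the same global state, which is the difference from state sharding.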

Scaling Blockchain Size

Blockchains get pretty big. Here’s a graph.

https://twitter.com/alistairmilne/status/963098653668904963

That’s a lot of bytes, y’all. Ethereum’s state sharding proposals (the “friends” travelling alongside Casper on the roadmap) attempt to shard the state of the blockchain so individual nodes can be full, validating nodes without storing hella gigabytes.

Scaling Reductive State Transitions

State Channels

Say we have some sort of state—the classic example is the balance of two different users—that undergoes a lot of frequent, small changes. It’d be cool if we could negotiate all of these quick transfers off-chain and then only commit the final state once each party is happy. This is State Channels (and Payment Channels, for a payment-specific use-case).
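Here’s a toy sketch of that flow for a two-party payment channel. Everything in it is an assumption for illustration: HMAC with a shared secret stands in for real ECDSA signatures, and `ChannelState`, `pay`, `sign`, and `verify` are made-up names, not any project’s API.

```python
import hashlib
import hmac
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ChannelState:
    nonce: int                 # increases with every off-chain update
    balances: Tuple[int, int]  # (party 0, party 1)

    def digest(self) -> bytes:
        return hashlib.sha256(f"{self.nonce}:{self.balances}".encode()).digest()


def sign(secret: bytes, state: ChannelState) -> bytes:
    """Stand-in for a real signature: an HMAC over the state digest."""
    return hmac.new(secret, state.digest(), hashlib.sha256).digest()


def verify(secret: bytes, state: ChannelState, signature: bytes) -> bool:
    return hmac.compare_digest(sign(secret, state), signature)


def pay(state: ChannelState, sender: int, receiver: int, amount: int) -> ChannelState:
    """Produce the next off-chain state; nothing touches the chain here."""
    balances = list(state.balances)
    if amount <= 0 or balances[sender] < amount:
        raise ValueError("invalid payment")
    balances[sender] -= amount
    balances[receiver] += amount
    return ChannelState(state.nonce + 1, (balances[0], balances[1]))
```

Starting from balances of (100, 100), fifty 1-unit payments one way and one 20-unit payment back end at nonce 51 with balances (70, 130). Only that final state, signed by both parties, would need to go on-chain.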

The primary downside of state channels is that they require you to “open a channel” with everyone you want to do state-transitional business with. So if I want to pay you money, we have to open a state channel, which requires staking money (again, cryptoeconomic security), and later close the channel (more expense on our part). In simple cases, the extra work here doesn’t provide any benefits, but it definitely provides efficiency when producing multiple state transitions that can reduce into a single state, like micropayments or peer-to-peer game logic. You can also, similar to Raiden or the Lightning Network, route state channels through intermediaries, which may mitigate the majority of this downside.

In general, state transitions are deterministically finalized, but this finalization comes at the cost of a “challenge period” (but only in the case of a unilateral exit). The challenge period provides time for one participant to submit a fraud proof to convict another, invalidating their proposed state change. This challenge period will be very annoying from a UX perspective, since the final state from a state channel closing cannot realistically be used until the challenge period has elapsed.
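A sketch of how a unilateral exit with a challenge period might behave, modelled as a plain Python class instead of a real contract. The 100-block period and all names are hypothetical, signature checks are left out, and the “fraud proof” is simplified to “show a state with a higher nonce”.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

CHALLENGE_PERIOD_BLOCKS = 100  # hypothetical value, chosen for illustration


@dataclass
class PendingExit:
    nonce: int
    balances: Tuple[int, int]
    deadline_block: int


class ChannelContract:
    """Toy model of the on-chain side of a two-party channel."""

    def __init__(self) -> None:
        self.pending: Optional[PendingExit] = None
        self.settled: Optional[Tuple[int, int]] = None

    def start_exit(self, nonce: int, balances: Tuple[int, int], current_block: int) -> None:
        if self.pending is not None or self.settled is not None:
            raise RuntimeError("exit already in progress or channel settled")
        self.pending = PendingExit(nonce, balances, current_block + CHALLENGE_PERIOD_BLOCKS)

    def challenge(self, nonce: int, balances: Tuple[int, int], current_block: int) -> bool:
        """Replace the proposed state if a newer one is shown before the deadline."""
        p = self.pending
        if p is None or current_block >= p.deadline_block or nonce <= p.nonce:
            return False
        self.pending = PendingExit(nonce, balances, p.deadline_block)
        return True

    def finalize(self, current_block: int) -> Tuple[int, int]:
        p = self.pending
        if p is None:
            raise RuntimeError("no exit in progress")
        if current_block < p.deadline_block:
            raise RuntimeError("challenge period still running")
        self.settled, self.pending = p.balances, None
        return self.settled
```

The UX pain shows up in `finalize`: even an honest exit has to sit out the whole period before the balances are usable.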

Lightning and Raiden are working on Payment Channels. Counterfactual is working on a general State Channel implementation for Ethereum. Watch their recent talk at ETHDenver 2018 for some great info.

State Channels aren’t there yet—the timeline is probably looking like 6–12 months, given the ecosystem’s history of not realizing this stuff is harder than it sounds.

Plasma

At a high level, Plasma is a classic implementation of a blockchain, but one that has the ability to, in the case of attack, replay its state transitions in the context of a “base chain” when the higher level of security is needed (primarily if there’s an invalid state transition published). It allows different actors to join and leave the Plasma chain, meaning it’s much more reasonable for multiparty communication. It has similar downsides to State Channels in that the eviction process to a base chain is compute- and time-intensive and would highly disrupt the network and the user’s experience of the network.

Ideally, once we’ve proved that evictions work, no rational bad actor would cause one, but malicious users can still attack the network. The defense against this attack is making it too expensive to justify, but this also increases the usage burden for normal users. There are many other attack surfaces and mitigations covered in L4’s post.

The primary downside of Plasma is that it also requires a “challenge period” to finalize state managed by a Plasma chain. Unlike State Channels, where the challenge period only applies to a unilateral exit, Plasma is designed with multiple parties in mind, so state transitions cannot be considered finalized until after the challenge period, resulting in additional uncertainty.

In terms of timeline, Plasma is much further away than State Channels and realistically isn’t production viable for another 10–16 months. OmiseGo and the Ethereum Foundation are the stewards of Plasma and are developing it for usage in a decentralized exchange. A “Minimum Viable Plasma” has also been proposed, with a few teams working on it, but timelines are uncertain.

Scaling General Computation

Truebit scales the “size” of a transaction in the sense that it allows some computation to be computed by a very small subset of the network, but—due to the cryptoeconomic properties of the protocol—be trusted by the entire network. This is primarily useful for cases of Very Expensive Transactions™ like Proof of Work verification, probabilistic video encoding verification, verification of Bulletproofs, validating Plasma state transitions, and more.

Like Plasma, Truebit can fall back to the security context of a root chain, allowing it to optimistically perform general computation off-chain but verify results on-chain. This process is slow and expensive, and has the same downsides as the previous layer-2 solutions.
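To make the fallback concrete, here is a simplified sketch of a bisection-style verification game. It assumes both players hold a full trace of intermediate states and compare them directly; a real protocol would exchange commitments (hashes) on-chain instead, and the function names are made up.

```python
from typing import Callable, List, Tuple


def find_disputed_step(solver_trace: List[int], challenger_trace: List[int]) -> Tuple[int, int]:
    """Bisect to the last state both sides agree on.

    Returns (index of the agreed state, number of bisection rounds played).
    """
    if len(solver_trace) != len(challenger_trace) or len(solver_trace) < 2:
        raise ValueError("traces must have equal length of at least 2")
    if solver_trace[0] != challenger_trace[0]:
        raise ValueError("players must agree on the initial state")
    if solver_trace[-1] == challenger_trace[-1]:
        raise ValueError("nothing to dispute: final states match")
    lo, hi = 0, len(solver_trace) - 1  # invariant: agree at lo, disagree at hi
    rounds = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if solver_trace[mid] == challenger_trace[mid]:
            lo = mid
        else:
            hi = mid
        rounds += 1
    return lo, rounds


def solver_cheated(step_fn: Callable[[int], int],
                   solver_trace: List[int],
                   challenger_trace: List[int]) -> bool:
    """The 'court' re-executes only the single disputed step."""
    step, _ = find_disputed_step(solver_trace, challenger_trace)
    return solver_trace[step + 1] != step_fn(solver_trace[step])
```

The point of the design is that the chain never re-runs the whole computation: it referees a logarithmic number of rounds and then executes exactly one step. For a 1,024-step trace that’s 10 rounds.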

What about fixing the experience of layer-1?

So scaling distributed blockchain networks is pretty hard (as we’ve seen over the last ~2 years when scaling became a hot topic). A fix that we can implement now, instead of later, is simply providing a better experience over existing layer-1 solutions. This is the aim of gnarly, which provides optimistic-state transitions for use in user-facing applications.

You can read more about the technical details of gnarly below, but we’ll talk mostly about the properties it provides for creating user-facing applications.

Gnarly:

  • enables declarative, reactive clients,
  • provides “instant” updates with confidence intervals, i.e. Optimistic State Transitions,
  • follows the optimistic-UI pattern: apply expected changes immediately but revert to the source of truth as soon as it’s known (the client need only know what the state is right this second, with appropriate confidence, in order to do its business),
  • offers friendly error management, so that (i) developers get reasonable error contexts and (ii) consumers get explanations about errors,
  • lets anyone know (i) that something occurred, (ii) why it occurred, and (iii) how it affects their actions,
  • supports replay from arbitrary blocks to (i) bootstrap the steady state and (ii) resume after failures,
  • caters its default output towards a GraphQL-consuming client.
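The replay property is the easiest one to picture in code. This is a minimal sketch assuming state is derived by folding blocks through a pure reducer; the block shape and function names are hypothetical and are not gnarly’s actual API.

```python
from typing import Dict, Iterable

Balances = Dict[str, int]


def apply_block(balances: Balances, block: dict) -> Balances:
    """Pure reducer: fold one block's transfers into a copy of the state."""
    new_state = dict(balances)
    for sender, receiver, amount in block["transfers"]:
        new_state[sender] = new_state.get(sender, 0) - amount
        new_state[receiver] = new_state.get(receiver, 0) + amount
    return new_state


def replay(blocks: Iterable[dict], from_block: int, checkpoint: Balances) -> Balances:
    """Rebuild state by re-applying every block numbered from_block or later."""
    state = dict(checkpoint)
    for block in blocks:
        if block["number"] >= from_block:
            state = apply_block(state, block)
    return state
```

Because the reducer is pure, bootstrapping from block 0 with an empty checkpoint and resuming from a saved checkpoint at a later block land on the same state.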

The primary upside of the gnarly approach is that users get instant updates on state transitions. They don’t have to worry about the transaction, and they don’t need to stop what they’re doing and wait for the confirmation before they move on. They can immediately start operating on the new information and generally just use the app like they normally would.

(it also doesn’t require breakthroughs in computer science and protocol design 😜)

The primary downside of the gnarly approach is that it’s optimistic. Gnarly cannot know, when a transaction is submitted to the network, if it will be finalized. At best, it can model its confidence in that transaction succeeding and convey that information to the user. Any future transactions that depend on some previous state must also wait until that state has been finalized—queuing transactions in Ethereum isn’t as easy as it should be, and could create cascading failures, resulting in awful UX. For example, if a user trades an item in a game to another, they will instantly see their account populated with coins, and the item debited from their account, but they won’t be able to go and spend those coins (with confidence) until the state transition has been finalized by the network as it traditionally would be.
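The trade-off can be sketched with a tiny optimistic store: pending transactions are layered over the confirmed state for display, and a failure rolls back the failed transaction plus anything that depended on it. This is an illustrative model with made-up names, not how gnarly is implemented.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

State = Dict[str, int]


@dataclass
class PendingTx:
    tx_id: str
    apply: Callable[[State], State]   # pure function: old state -> new state
    depends_on: Optional[str] = None  # id of a pending tx this one builds on


class OptimisticStore:
    def __init__(self, confirmed: State) -> None:
        self.confirmed: State = dict(confirmed)
        self.pending: List[PendingTx] = []

    def submit(self, tx: PendingTx) -> None:
        self.pending.append(tx)

    def view(self) -> State:
        """What the UI shows: confirmed state plus every pending change."""
        state = dict(self.confirmed)
        for tx in self.pending:
            state = tx.apply(state)
        return state

    def confirm(self, tx_id: str) -> None:
        for i, tx in enumerate(self.pending):
            if tx.tx_id == tx_id:
                self.confirmed = tx.apply(self.confirmed)
                del self.pending[i]
                return
        raise KeyError(tx_id)

    def fail(self, tx_id: str) -> List[str]:
        """Drop a failed tx and, in submission order, everything built on it."""
        dropped: List[str] = []
        for tx in self.pending:
            if tx.tx_id == tx_id or tx.depends_on in dropped:
                dropped.append(tx.tx_id)
        self.pending = [tx for tx in self.pending if tx.tx_id not in dropped]
        return dropped


def trade_item_for_coins(state: State) -> State:
    return {**state, "coins": state["coins"] + 10, "items": state["items"] - 1}


def spend_coins(state: State) -> State:
    return {**state, "coins": state["coins"] - 5}
```

In the game example, the user sees the 10 coins right away via `view()`. If they spend 5 of them before the trade is finalized and the trade then fails, `fail` removes both transactions and the view snaps back to the confirmed state, which is exactly the cascading failure to worry about.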


Anyway, that’s the state of scaling for Ethereum as I understand it.

I enjoy 👏 and internet points. Also follow me on twitter/mattgcondon for intelligent stuff like this, crypto memes, and general tomfoolery.