Scaling Solutions on Ethereum explained

Johannes Hagemann
Coinmonks
May 12, 2018

In this post we will talk about the scaling solutions the Ethereum Foundation and others are currently working on.

First I will explain the general problem the Ethereum Blockchain (and in general all public decentralised blockchains) faces at the moment. Then I will describe some solutions for this problem, especially Sharding, Raiden and Plasma.

This post is based on the handout of my presentation about this topic, which is why the sentences are short and to the point. You can find the presentation here.

Problem with Ethereum Blockchain

Currently, Ethereum can only process about 15 transactions per second (tps), while Visa, for example, processes approximately 45,000 tps.

In the last year, some applications (like Cryptokitties, or big ICOs) have been popular enough to “slow down” the network and raise gas prices, the fees you have to pay if you want to make a transaction.

And we will need a lot more transactions per second if we want to build decentralised applications (like a decentralised social network) with millions of users.

The problem at the moment is that every operation that takes place on the Ethereum blockchain, like a payment, the birth of a Cryptokitty or the deployment of a smart contract, must be performed by every single node in the network in parallel. That means the blockchain can’t process more transactions than a single node can.

This is why we need scaling solutions!

Why is that important to us?

By “us” I mean people who are interested in this technology and want to build stuff on it: everybody who wants to disrupt whole industries.

We only have limited time and resources, and that is why thinking about this topic is important. With 15 transactions per second there is no chance that we will disrupt any industry. We want to see whether there are good solutions to this problem and whether they have a chance of working out; otherwise we don’t need to spend our limited time on it.

Scaling on Ethereum

At the moment there are, in general, two approaches that might solve this problem:

1. We build a blockchain where not every node has to process every operation.

→ called Layer 1 solutions (or On-Chain solutions), e.g. Sharding

2. We squeeze more useful operations out of Ethereum’s existing capacity.

→ called Layer 2 solutions (or Off-Chain solutions), e.g. State Channels, Plasma

Layer 1 solutions typically require a hard fork of the blockchain. Layer 2 solutions, on the other hand, typically do not require a hard fork, because they can be implemented as smart contracts.

Sharding

Sharding is not a new concept. It has been used for decades in database engineering.

The approach is to build a blockchain where not every node has to process every operation. How do we do that?

We split the entire state of the network into partitions called shards, each containing its own independent piece of state and transaction history. These shards can also have sub-shards. Every node only has to validate the transactions in its own shard.
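To make the partitioning idea concrete, here is a minimal sketch in Python. It is not how Ethereum assigns state to shards: the shard count, the hash-based `shard_of` rule and the transaction format are all assumptions chosen for illustration. It only shows that a node following one shard sees a fraction of the total transactions.

```python
import hashlib
from collections import defaultdict

NUM_SHARDS = 4  # hypothetical shard count, chosen only for illustration


def shard_of(address: str, num_shards: int = NUM_SHARDS) -> int:
    """Map an address to a shard by hashing it (an assumed rule)."""
    digest = hashlib.sha256(address.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def partition(transactions, num_shards: int = NUM_SHARDS):
    """Group transactions by the shard of their sender."""
    shards = defaultdict(list)
    for tx in transactions:
        shards[shard_of(tx["sender"], num_shards)].append(tx)
    return shards


# 1000 made-up transactions from made-up addresses.
txs = [{"sender": f"0x{i:04x}", "amount": i} for i in range(1000)]
by_shard = partition(txs)

for shard_id in sorted(by_shard):
    # A single-shard node only validates this many transactions.
    print(f"shard {shard_id}: {len(by_shard[shard_id])} transactions")
```

Each transaction lands in exactly one shard, so the validation work per single-shard node shrinks roughly in proportion to the number of shards.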

Several “levels” of nodes can now exist, for example a Super-full node (fully downloads every collation of every shard), Top-level node, Single-shard node and Light node.

Now we will look at some challenges of Sharding:

Single-shard takeover attacks:

What if an attacker takes over the majority of the collators in one single shard?

In a PoW blockchain you cannot stop miners from applying their work to a given shard. With 100 shards, for example, you would only need about 1% of the total hash power to take over one shard → That’s why PoS is very important for Sharding.

Ethereum wants to solve this problem with something called random sampling. This basically means that validators cannot choose the shard they work on and cannot know ahead of time which shard they will be assigned to. This is achieved by regularly reshuffling validators across the shards.
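The reshuffling idea can be sketched as follows. This is a toy model, not the protocol's actual sampling scheme: the function name `assign_validators`, the round-robin split and the integer `seed` are assumptions, and Python's `random` module is not cryptographically secure. A real protocol would need randomness that validators cannot predict or influence.

```python
import random


def assign_validators(validators, num_shards, seed):
    """Shuffle validators with a per-epoch seed and deal them out to shards."""
    rng = random.Random(seed)  # stand-in for an unpredictable randomness source
    shuffled = list(validators)
    rng.shuffle(shuffled)
    assignment = {shard: [] for shard in range(num_shards)}
    for index, validator in enumerate(shuffled):
        assignment[index % num_shards].append(validator)
    return assignment


validators = [f"validator-{i}" for i in range(12)]

# A new seed each epoch produces a fresh assignment (the reshuffle).
epoch_1 = assign_validators(validators, num_shards=3, seed=1)
epoch_2 = assign_validators(validators, num_shards=3, seed=2)

for shard, members in epoch_1.items():
    print(f"epoch 1, shard {shard}: {members}")
```

Because the assignment depends only on the seed, an attacker who controls a few validators cannot steer them all into the same shard, as long as the seed is not known in advance.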

Cross shard communication:

If we only do transactions within one shard, there is no problem. But how can we make transactions between different shards? For example, you have an address in one shard and want to send a transaction to someone whose address is in a different shard. Solving that requires a new and complicated cross-shard protocol.

State Channels

The approach of state channels is that not every payment needs to be agreed to globally. We only need to have both participating parties to agree to it and to have a proof of the transfer of the assets between them. This brings us a different perspective on consensus. → Instead of global consensus we can use local consensus between participants.

How does it work?

To transact Off-Chain you have to create a Payment Channel. The opening and closing of that Payment Channel have to happen On-Chain, which costs the normal fee and takes the normal time (for Ethereum ~20 sec). In order to ensure that participants pay their debts, tokens have to be locked up as security in a smart contract for the lifetime of the payment channel. A transfer can’t be higher than the on-chain deposit. When both parties finish transacting, the final account balances are recorded on the blockchain.
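The lifecycle described above (lock a deposit, update balances off-chain, settle the final balances on-chain) can be sketched like this. The `PaymentChannel` class is a hypothetical toy: it leaves out the signatures, timeouts and dispute handling that a real channel needs, and the party names "A" and "B" are placeholders.

```python
class PaymentChannel:
    """Toy two-party channel; signatures and disputes are omitted."""

    def __init__(self, deposit_a: int, deposit_b: int):
        # Opening: one on-chain transaction locks both deposits.
        self.balances = {"A": deposit_a, "B": deposit_b}
        self.nonce = 0  # counts off-chain updates; the latest state wins
        self.is_open = True

    def pay(self, sender: str, receiver: str, amount: int) -> None:
        """Off-chain update: no gas and no waiting for a block."""
        if not self.is_open:
            raise RuntimeError("channel is closed")
        if amount <= 0 or amount > self.balances[sender]:
            raise ValueError("transfer exceeds the sender's locked balance")
        self.balances[sender] -= amount
        self.balances[receiver] += amount
        self.nonce += 1

    def close(self) -> dict:
        """Closing: one on-chain transaction records the final balances."""
        self.is_open = False
        return dict(self.balances)


channel = PaymentChannel(deposit_a=10, deposit_b=0)
channel.pay("A", "B", 3)
channel.pay("A", "B", 4)
channel.pay("B", "A", 2)
final = channel.close()
print(final, "after", channel.nonce, "off-chain updates")
```

Three payments happened, but only the opening and the closing would touch the blockchain.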

Rather than having to open a channel with the specific person you want to transact with, you can open a single channel with a node connected to a much larger network of channels. That enables you to make payments to anyone else connected to the same network with very low fees. It means every participant only has to open a few channels, but will still be able to transfer to any other participant.
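Finding a path through such a network of channels is a graph search. The sketch below uses a plain breadth-first search over made-up participants. It is only an illustration: the `find_route` helper is hypothetical, it ignores channel capacity and fees, and it is not the routing algorithm of any real network.

```python
from collections import deque


def find_route(channels, source, target):
    """Return the shortest chain of participants linking source to target."""
    graph = {}
    for a, b in channels:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)

    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            return path
        for neighbour in sorted(graph.get(node, ())):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(path + [neighbour])
    return None  # no chain of channels connects the two parties


# Made-up channel network: most participants only opened one channel.
channels = [
    ("alice", "hub"),
    ("bob", "hub"),
    ("carol", "hub"),
    ("carol", "dave"),
]

route = find_route(channels, "alice", "dave")
print(route)
```

Here alice has a single channel, to the hub, and can still reach dave through hub and carol.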

The Pros of state channels:

Your Off-Chain transactions are faster, not visible on the shared ledger and do not cost gas. That makes state channels very useful for two parties that are going to exchange many state updates over a long period of time: it’s “expensive” to create a channel, but once the channel is deployed, the cost of a transaction is very low. State Channels also have strong privacy properties, because everything that happens ‘inside’ a channel is not recorded on the Ethereum main-chain. Only the opening and closing transactions must be public.

The Cons:

State channels rely on availability. If, for example, a node that is connected to a big network of nodes loses its internet connection, a lot of people can’t transact until they open a new Payment Channel. Furthermore, it can lead to centralisation, because some nodes would have a lot of payment channels and others only a few (we can see that on the Bitcoin Lightning Network at the moment).

Raiden Network

The Raiden Network is a State Channel implementation for Ethereum. It’s very similar to the Lightning Network that the Bitcoin community is currently working on.

Raiden is an off-chain transfer network but in general only for Ethereum ERC20 tokens (that is the ERC Standard Token most of the ICOs are using). It’s developed by a company called brainbot.

They are also working on something called μRaiden. It’s intended for many-to-one payment setups, like users interacting with a Dapp, though it is unidirectional only. μRaiden is already live on the Ethereum Mainnet and you can start building stuff with it today!

They are also developing something called Raidos (Raiden 2.0), which is not only limited to ERC20 Tokens and can transact every possible smart contract off-chain.

Plasma

The idea of Plasma is that you can create “child” blockchains attached to the “main” Ethereum blockchain. These child-chains can have their own child-chains, which can create their own child-chains, and so on. This lets us perform many complex operations at the child-chain level → We can run entire applications with many thousands of users, with only minimal interaction with the Ethereum main-chain.

Say, for example, we want to build a decentralised gambling game. We would first create some smart contracts with the basic rules and put them on the Ethereum Main-Chain, here called the Root Chain. Then we create our child-chain, which can even have its own consensus algorithm (also one other than PoW or PoS). Now we deploy the actual game smart contracts, which contain all of the game logic and rules, on the child-chain. The important thing is that the user of our dapp only interacts with the child-chain, because we want a good user experience. The block producer of the child-chain (for example a miner) has to publish a commitment to the Plasma root chain for each block it has mined.
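One common way to build such a commitment is a Merkle root over the block's transactions, so that only 32 bytes go to the root chain regardless of how many transactions the child-chain block holds. The sketch below assumes that approach. The `RootChainContract` class and its `submit_block` method are hypothetical stand-ins for a smart contract, and the transactions are made-up byte strings.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaves):
    """Hash the leaves pairwise, level by level, until one root remains."""
    if not leaves:
        return sha256(b"")
    level = [sha256(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd levels
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


class RootChainContract:
    """Hypothetical stand-in for the contract on the root chain."""

    def __init__(self):
        self.commitments = []

    def submit_block(self, root: bytes) -> int:
        """Store one child-chain block commitment and return its number."""
        self.commitments.append(root)
        return len(self.commitments) - 1


# Made-up child-chain transactions for one block of the game.
child_block = [b"alice bets 5", b"bob bets 3", b"alice wins 8"]

root_chain = RootChainContract()
root = merkle_root(child_block)
block_number = root_chain.submit_block(root)
print(f"child block {block_number} committed as {root.hex()}")
```

Changing any transaction in the child block changes the root, which is what lets users later prove to the root chain what a block did or did not contain.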

By moving more operations off the main-chain and onto a child-chain, it’s clear we can perform more operations. But now the question is how secure is it?

The short answer is that even in a scenario where we create a child-chain with a PoW algorithm and one entity has 100% of the hashing power (which would be very centralised), Plasma gives you a basic guarantee that you can always withdraw your funds and assets back onto the main-chain. So if this one entity starts to misbehave, the worst thing that can happen is that you are forced to leave the child-chain.

Other Scaling Solutions

Another scaling solution would be to create larger blocks. For example, Bitcoin has 1MB blocks approximately every 10 minutes. Then there was a fork to Bitcoin Cash, because its supporters wanted to use 8MB blocks. It’s obvious that if you create larger blocks you can put more transactions in one block, and this is some kind of scaling solution.

The downside of this method is that if we double the block size, each node has to do double the amount of work processing each block. That’s bad for decentralisation, because less powerful computers can no longer be part of the network. And if the block size gets too big, the network will start to rely exclusively on a very small number of supercomputers running the blockchain. The size of the blockchain would also increase much faster, which would be bad for devices with little storage. I personally think that this solution cannot be the only answer to the problem. But maybe somebody has another opinion and we can have a discussion about it.
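To put the block-size trade-off into numbers, here is a back-of-the-envelope calculation. The block sizes and the 10-minute interval come from the text above. The average transaction size of 250 bytes is an assumption picked as a round number, so the results are rough upper bounds, not measured throughput.

```python
def max_tps(block_size_bytes: int, avg_tx_bytes: int, block_interval_s: int) -> float:
    """Upper bound on transactions per second for a given block size."""
    return block_size_bytes / avg_tx_bytes / block_interval_s


AVG_TX_BYTES = 250      # assumed average transaction size
BLOCK_INTERVAL_S = 600  # one block roughly every 10 minutes

for size_mb in (1, 8):
    tps = max_tps(size_mb * 1_000_000, AVG_TX_BYTES, BLOCK_INTERVAL_S)
    print(f"{size_mb} MB blocks: about {tps:.1f} tps")
```

Under these assumptions 1MB blocks give about 6.7 tps and 8MB blocks about 53.3 tps. Throughput grows linearly with block size, and so does the work every node has to do.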

“Split ether into 100 altcoins”

This is another scaling approach, called multiple chains. The idea is that we don’t scale one blockchain, e.g. Ethereum, but instead create hundreds of other cryptocurrencies. The downsides are obvious: we would split the security (measured in either hash power or stake) across all of these chains, and we would have friction for cross-chain swaps.

I only presented a few scaling solutions, the ones I think are most important; there are a lot more. Some that sound very interesting, but that I didn’t have time to prepare, are for example TrueBit and Plasma Cash. If there is enough interest I will cover these two in another post. :)

About the Author

I am Johannes Hagemann, a Computer Science student from Germany, blockchain enthusiast and software programmer.

You can find/contact me here:

This text is based on my presentation about Proof of Stake and Scaling Solutions. You can find the slides here:

For the Proof of Stake part of this presentation, check out my post here.

Sources

Here are the sources I used (you can check them out to get more information):

https://raiden.network/101.html
