Scaling Ethereum for Global Adoption
Over the past few months, Ethereum has seen a massive influx of users and capital into its platform. However, as Ethereum gains popularity, the big question on everyone’s lips is how Ethereum can scale to support large, commercial-ready decentralized applications (dApps).
In its current state, Ethereum can only support about 13 transactions per second, which works out to roughly 1.1 million transactions per day across the entire network. This means a single dApp with a million daily users each making one transaction on the blockchain would by itself consume nearly all of that capacity, leaving almost no room for anything else. We’ve already seen the effects of the looming scaling issues with the release of CryptoKitties, a game built on the Ethereum blockchain where players breed and trade cartoon cats. Following CryptoKitties’ release, the number of unprocessed transactions rose from an average of around 2,000 to over 10,000, causing significant delays in all transaction confirmations.
While Ethereum may not be able to fulfill its grand initial vision of being a globally adopted computing infrastructure in its current form, there are already a variety of different projects and proposals to help optimize and scale various bottlenecks within the Ethereum network.
A general problem that most cryptocurrencies need to solve is how the network can reach distributed consensus when individual nodes may have an economic incentive to falsify or alter data. Ethereum, whose primary currency is Ether, uses a proof of work (PoW) system, meaning that miners need to solve complex cryptographic puzzles to receive payment for a block. Requiring miners to perform this task means that, to be able to reliably attack or spam the network, a group of attackers would need to control more than 50% of the total computing power within the network.
While the PoW system provides a simple solution to the issue, solving these cryptographic puzzles requires a large amount of electrical power. The mining of Ether alone consumes about 1.2 billion USD in electricity per year. Ethereum aims to address this potentially unsustainable and unscalable design by moving to a proof of stake (PoS) protocol dubbed Casper. In contrast to PoW, a PoS protocol consumes significantly less power and achieves consensus by having participants (called validators rather than miners) “stake” their coins, locking them in specialized wallets, to bet on the confirmation of a block of transactions. These coins are lost if a validator bets on a block that isn’t chosen, which deters people from maliciously betting on bad blocks. This means that a failed attempt to attack a PoS network results in direct economic loss to the attackers, whereas a failed attack on a PoW network carries no repercussions beyond the lost time and power.
Taking this a step further, Casper should be able to provide economic finality at a level that a PoW system cannot (as described in multiple places, including https://ethereum.stackexchange.com/questions/16121/understanding-economic-finality-in-pow-and-pos). In Bitcoin, for example, one could invest $10 billion in ASICs and other hardware (maybe more, as it is a logistical nightmare to collude at this size) to temporarily attack the PoW chain and force a hard fork of the protocol that migrates it from ASIC mining to GPU mining. At that point, a malicious actor could spend another $10 billion on GPU rigs to permanently attack the chain, because the chain has already moved from specialized to general-purpose mining hardware and has nowhere else to turn on a hardware security basis. The chain may hard fork again and again, but the attacker can keep using the same hardware (a fixed cost at this point, with little incremental spend) to perpetually compromise it. Vitalik calls this “spawn camping”.
In Casper, even an initially well-resourced attacker can’t spawn camp without acquiring new ETH every time, as the staked ETH is slashed after each attack. The cost of buying ETH should theoretically rise with every attempt, since the large sums slashed in previous attacks reduce the overall supply (although the network’s market cap may fall, at least temporarily, as the security of the system erodes, which might offset the supply crunch).
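The asymmetry between the two attack models can be illustrated with a rough cost sketch. The $10 billion hardware figure comes from the example above; the per-attack running cost and the size of the stake are hypothetical placeholders chosen only for illustration, and the model deliberately ignores any change in the price of ETH between attacks.

```python
# Toy model of cumulative attack cost. All figures are in millions of USD.
HARDWARE_COST = 10_000   # one-off hardware purchase, from the example above
RUNNING_COST = 100       # hypothetical power/operations cost per attack
STAKE_COST = 10_000      # hypothetical cost of the ETH staked for one attack


def pow_attack_cost(attacks: int) -> int:
    """Hardware is bought once and reused; only the running cost recurs."""
    if attacks <= 0:
        return 0
    return HARDWARE_COST + RUNNING_COST * attacks


def pos_attack_cost(attacks: int) -> int:
    """Slashed stake is destroyed, so it has to be bought again every time."""
    if attacks <= 0:
        return 0
    return STAKE_COST * attacks


for n in (1, 5, 10):
    print(f"{n:>2} attacks: PoW {pow_attack_cost(n):>7,}  PoS {pos_attack_cost(n):>7,}")
```

Under these assumed numbers, the PoW attacker’s total cost barely grows after the first attack, while the PoS attacker’s cost grows in proportion to the number of attacks.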
In addition to decreased power costs, a PoS system has reduced centralization risks as well. In PoS, having the capital to stake ten times as many coins as someone else gives you ten times the reward, but in PoW, having additional capital to spend on mining power gives you disproportionate access to mass-produced and specialized hardware.
Specific to Ethereum, the proposed implementation of Casper may reduce the block time from the current 15 seconds to 2–7.5 seconds, providing a direct increase in transaction capacity on the network. Additionally, Casper will lay the groundwork for sharding, another major scaling improvement that is discussed below.
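As a back-of-envelope illustration of why a shorter block time raises capacity, the sketch below scales the roughly 13 transactions per second cited earlier by the ratio of block times. It assumes, purely for illustration, that the number of transactions per block stays the same; the function name is hypothetical and the result is not a forecast.

```python
CURRENT_TPS = 13.0          # approximate throughput cited earlier in this article
CURRENT_BLOCK_TIME = 15.0   # seconds, the current block time


def projected_tps(new_block_time: float) -> float:
    """Throughput if each block holds as many transactions as it does today."""
    if new_block_time <= 0:
        raise ValueError("block time must be positive")
    txs_per_block = CURRENT_TPS * CURRENT_BLOCK_TIME
    return txs_per_block / new_block_time


for block_time in (15.0, 7.5, 2.0):
    print(f"{block_time:>4} s blocks -> about {projected_tps(block_time):.1f} tx/s")
```

Under this assumption, the proposed 2–7.5 second range would correspond to somewhere between about 26 and 97.5 transactions per second.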
While Casper brings a variety of benefits to the table, there are some associated challenges as well. Casper relies on more than 2/3 of validators behaving honestly to achieve a non-conflicting canonical chain, and the chain risks getting stuck without consensus when the share of malfunctioning or misbehaving validators rises above 1/3. By comparison, an attacker needs more than 1/2 of the computing power to maliciously affect a PoW chain, so Casper has a lower threshold for fault tolerance. Nonetheless, this can be considered a somewhat marginal risk in practice: a successful attack would require the perpetrators to hold a very large amount of Ether, and the attack itself would likely drive the price of that Ether down, which works against the attackers’ own incentives. Thus, the energy, scaling and decentralization advantages that Casper provides will be a step in the right direction for the Ethereum platform. Casper is slated to be introduced in 2018 with Ethereum’s next hard fork, Constantinople. This hard fork will introduce a hybrid PoS/PoW system, which will gradually be transitioned into a fully PoS system over time.
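The threshold described above can be written down as a simple check. This is a minimal sketch, not Casper’s actual logic: it assumes validators are weighted by some integer amount (for example, their staked deposit) and applies the “more than 2/3 honest” condition from the text; the function name is hypothetical.

```python
def can_reach_consensus(honest_weight: int, total_weight: int) -> bool:
    """True when honest validators hold strictly more than 2/3 of the weight."""
    if total_weight <= 0 or not 0 <= honest_weight <= total_weight:
        raise ValueError("weights must satisfy 0 <= honest <= total and total > 0")
    # Integer arithmetic avoids rounding problems right at the 2/3 boundary.
    return 3 * honest_weight > 2 * total_weight


print(can_reach_consensus(67, 100))  # True: 67% of the weight is honest
print(can_reach_consensus(66, 100))  # False: more than 1/3 is faulty
```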
In order to validate transactions on the blockchain, Ethereum’s nodes need to store a copy of all past transactions and process all new transactions in each block. Although this is a serviceable and secure approach, it has some unfortunate scaling properties. Currently, the most direct way to allow Ethereum to process more transactions is simply to make blocks larger to allow for more transactions per block, but this would mean that each node would have to do that much more work in the same amount of time and may eventually preclude consumer grade hardware from running full nodes. Adding more nodes into the network also doesn’t decrease the work any other node would have to do.
Ethereum’s plan to address this particular property of the network is called sharding. The basic premise of sharding is to split Ethereum’s nodes and transactions into smaller groups or “shards”, all running in parallel, with each shard acting similarly to how the entire Ethereum network acts today. This way, nodes in each shard only need to keep the small fraction of the blockchain that is relevant to their shard, and only need to process the transactions for that shard. Interacting across shards will be possible, but will likely require a longer and more complex method of confirmation. Isolated shards may have to go through additional legwork to communicate with each other, putting an additional onus on developers to think about cross-shard interoperability. That said, the Foundation is trying to abstract this mechanism away from developers so they don’t have to worry about the back-end nuances of inter-shard communication. An effective sharding solution in Ethereum could boost transaction throughput by many orders of magnitude without the sacrifices in security, safety or decentralization that other solutions entail, such as using altcoins to move transactions off-chain or relying on powerful masternodes.
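To make the idea concrete, here is a toy sketch of splitting transactions across shards. Since Ethereum’s sharding design has not been finalized, everything here is an assumption made for illustration: the shard count, the rule of assigning an account to a shard by its address modulo the shard count, and all function names are hypothetical.

```python
NUM_SHARDS = 4  # hypothetical shard count


def shard_of(address: str) -> int:
    """Assign a hex account address to a shard (illustrative rule only)."""
    return int(address, 16) % NUM_SHARDS


def is_cross_shard(sender: str, receiver: str) -> bool:
    """A transfer between accounts on different shards needs extra coordination."""
    return shard_of(sender) != shard_of(receiver)


def group_by_shard(transactions):
    """Bucket (sender, receiver) pairs by the sender's shard."""
    shards = {index: [] for index in range(NUM_SHARDS)}
    for sender, receiver in transactions:
        shards[shard_of(sender)].append((sender, receiver))
    return shards


txs = [("0x05", "0x09"), ("0x08", "0x05"), ("0x06", "0x0a")]
for index, bucket in group_by_shard(txs).items():
    print(index, bucket)
```

In this toy scheme each shard only sees the transactions in its own bucket, while a transfer such as the one from `0x08` to `0x05` spans two shards and would need the slower cross-shard confirmation path mentioned above.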
Unlike some of the other Ethereum projects discussed in this article, plans for sharding are still being developed, and we’re still many months away from seeing any meaningful implementation. Because a specific approach to sharding in Ethereum hasn’t been finalized, we don’t know exactly how large a scaling increase sharding will provide, or what specific disadvantages it will bring. The Ethereum Foundation is currently working on a formal specification for sharding, after which it will look for teams to implement it and deploy it onto the testnet. The Foundation recently announced that it would be providing grants ranging from $50,000 to $1 million to various groups in order to facilitate the development of protocols such as sharding.
Currently, every transaction on Ethereum’s platform is saved onto the blockchain. Each transaction incentivizes miners to include it in their block by offering a reward in the form of a transaction fee. As the value of the currency and the number of transactions grow, not only will transactions take longer to confirm, but transaction fees will increase drastically as well. However, it may be feasible to perform quick and cheap transactions without immediately recording them onto the primary Ethereum blockchain. The Raiden Network is one such off-chain scaling solution. For those following Bitcoin’s tech, Raiden is similar to Bitcoin’s Lightning Network, but for ETH and other ERC-20 compliant tokens.
Raiden introduces the concept of payment channels, backed by a smart contract, to enable secure off-chain transfers that don’t require blockchain consensus. A sender can open a payment channel to a receiver and place an initial deposit of tokens into escrow. By giving the receiver a digitally signed message that the Raiden smart contract can later verify, the sender can transfer some or all of the deposited tokens within the channel to the receiver. Either party may submit a signed message at any time to close the channel and have the results written to the blockchain. Setting up a pair of channels, one in each direction, allows tokens to be sent back and forth up to the deposit limits held by the smart contract.
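The life cycle of a one-way payment channel can be sketched in a few lines. This is a simplified model and not Raiden’s actual API: the class and method names are hypothetical, digital signatures are left out entirely, and the channel only tracks the running total promised to the receiver.

```python
from dataclasses import dataclass


@dataclass
class PaymentChannel:
    """Toy one-way channel: the sender's deposit is held in escrow."""
    deposit: int            # tokens locked when the channel is opened (on-chain)
    transferred: int = 0    # running total promised to the receiver (off-chain)
    updates: int = 0        # number of off-chain payments made so far
    closed: bool = False

    def pay(self, amount: int) -> None:
        """Off-chain payment: only updates the balance both parties agree on."""
        if self.closed:
            raise RuntimeError("channel is closed")
        if amount <= 0 or self.transferred + amount > self.deposit:
            raise ValueError("amount must be positive and within the deposit")
        self.transferred += amount
        self.updates += 1

    def close(self) -> dict:
        """On-chain settlement: only the final totals are written out."""
        self.closed = True
        return {"receiver": self.transferred, "sender": self.deposit - self.transferred}


channel = PaymentChannel(deposit=100)
for amount in (30, 20, 5):
    channel.pay(amount)
print(channel.updates, channel.close())
```

In this model, only opening and closing the channel would touch the blockchain, no matter how many payments are made in between.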
The requirement to place tokens into escrow means that it may become expensive to open a new payment channel to each address you want to pay. However, Raiden allows you to route payments through the network with multihop transfers. For example, if Alice wants to pay Bob, she doesn’t necessarily need to open a new payment channel to Bob if they both already have a channel open to a common intermediary.
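Finding a multihop path is essentially a graph search. The sketch below uses a plain breadth-first search over open channels to find the route with the fewest hops; it assumes, for simplicity, that every channel can carry payments in both directions and has enough capacity, and it is not Raiden’s actual routing algorithm. The function name and the participants other than Alice and Bob are hypothetical.

```python
from collections import deque


def find_route(channels, source, target):
    """Return the shortest list of hops from source to target, or None."""
    # Build an adjacency map, treating each channel as usable both ways.
    neighbours = {}
    for a, b in channels:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            return path
        for nxt in sorted(neighbours.get(node, ())):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None


open_channels = [("Alice", "Carol"), ("Bob", "Carol"), ("Carol", "Dave")]
print(find_route(open_channels, "Alice", "Bob"))
```

Here Alice reaches Bob through Carol without opening a new channel, which is the situation described above.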
Transactions sent through Raiden’s payment channels are designed to be fully confirmed in less than a second. On the blockchain, by contrast, Ethereum transactions take seconds to confirm, and it is generally recommended to wait for multiple blocks to pass before considering a transaction finalized, which can take minutes. Additionally, these payment channel transactions do not themselves carry any fees. However, it is very likely that intermediaries providing the channels that facilitate multihop transfers will charge their own fee for providing access to the network. Nevertheless, these intermediary fees are expected to be an order of magnitude smaller than the cost of an on-chain transaction.
Raiden Network’s design of using a graph of off-chain payment channels also has some interesting scaling and privacy properties. Transactions are only saved onto the blockchain when a payment channel is closed, so only the single net resulting payment is visible on-chain. More importantly, most of the payment channels that consumers open are likely to be with intermediaries, meaning that most of the time, the ultimate receiver of a transaction can be hidden as well. In terms of scaling, there should be a linear relationship between the number of people using Raiden and opening payment channels, and the network capacity.
Although Raiden has the potential to address many of the issues currently affecting Ethereum, it has a few challenges of its own to resolve as well. Raiden is not well suited for larger transactions, since each channel is secured through escrow and a channel may only hold a small deposit. This is not too significant an issue, since larger transactions can be made directly on the blockchain, where the transaction fee will still be a very small percentage of the amount transferred. The main hurdle that Raiden will have to clear is community adoption. If many people are on Raiden, then connecting yourself to the network will give you easy access to fast and cheap payments to many parties, but if few people are on the network, the time and monetary cost of setting up Raiden may turn some users away.
Ethereum Virtual Machine Improvements
A quick look at Ethereum’s repository for Ethereum Improvement Proposals (EIPs) and at the core developers’ meetings will show you that while they are indeed working on many of these massive long-term scaling improvements, they are also continuously working on smaller, more immediate upgrades to the Ethereum Virtual Machine (EVM).
The EVM is a Turing-complete virtual machine that executes EVM bytecode; smart contracts run on the EVM after being compiled down into this bytecode. Some of the improvements to the EVM are proposals to modify the set of opcodes that the EVM supports, or how they function, in order to either reduce the size of compiled bytecode or make it easier to optimize. This would decrease the space smart contracts take up in a block and reduce the amount of gas they need as well. While such improvements to the EVM may only provide fractional gains in scalability compared to the order-of-magnitude solutions discussed above, their total impact may be quite significant as Ethereum processes more and more transactions.
Vyper is a programming language being developed for writing smart contracts. Currently, most Ethereum smart contracts are written in another programming language, Solidity. While Solidity is capable of expressing any smart contract a developer would want to create, Vyper aims to improve smart contract development in a few key aspects.