Why was Casper FFG — a partial Proof-of-Stake algorithm on top of Proof-of-Work — deprecated?
Mechanics of the deprecated Casper FFG — a brief overview
Casper the Friendly Finality Gadget was Ethereum’s first approach towards a Proof-of-Stake(PoS) based consensus system and was formally specified in EIP1011. Casper FFG was designed as a smart contract existing on top of the Proof-of-Work(PoW) chain. The following chapter describes the process flow of the deprecated Casper FFG solution. It is an optional read, so please feel free to go straight to the chapter “What is the problem?”.
[optional] Process Flow of the Casper FFG
Once implemented, the contract accepts a minimum deposit of 1,500 Ether sent along with a so called “validator contract” — a contract which would allow you to abstract out the signature validation scheme into something of your choice — not just limited to the ECDSA anymore. Once successfully staked, with 1,500 Ether (or more), one would have been officially considered a “validator”. Validators vote on the validity of “proposed” blocks by the independent PoW chain every epoch (50 blocks) while earning interest for voting in a ‘trustworthy’ fashion.
The contract proposed slashing conditions, meaning your whole deposit would be slashed if violated, to dis-incentivize untrustworthy behavior. They were:
1.) You shouldn’t vote for two or more distinct checkpoints during one epoch
2.) You shouldn’t vote within the span of your other votes
Resulting in the byzantine fault tolerant properties (the mathematical proof can be found in the official Casper FFG paper):
1.) Accountable Safety: two conflicting checkpoints could not both be finalized unless 1/3 of validators violated a slashing condition (meaning at least 1/3 of the total deposit would have been lost).
2.) Plausible Liveness: regardless of any previous events (slashing events, delayed blocks, censorship attacks, etc.), if at least 2/3 of validators followed the protocol, then it would be always possible to finalize a new checkpoint without any validator violations.
Whenever the supermajority of the validators agreed (2/3 of the staked ETH) on an checkpoint, that block would be registered as “justified” on the Casper-Contract.
In the case that the protocol goes as expected, the validators would vote after the 49 blocks in between the next votable “checkpoint”-block are being mined — for the next checkpoint. If the new justified checkpoint points back to the old justified checkpoint, the old justified checkpoint is now considered “finalized” in the Casper-Contract. Once broadcasted, all nodes would then fetch the last finalized block and never revert any fork no more, adding additional economic security to the Blockchain-System while paving the way towards full PoS. This new layer of economic security diminishes the reliance on PoW miners to secure the network. It was therefore planned to reduce the block reward by 80%, decreasing from 3 ETH every 3 months by 0.6 ETH down to 0.6 ETH.
- Deprecated Casper FFG was a hybrid PoW/PoS solution — a smart contract on top of the PoW Chain — funded with 1.25ETH out of ‘thin air’ (hard forked into existence).
- The new added economic finality ‘unburdened’ the PoW responsibility to secure the network. The PoW block reward would decrease to 0.6ETH.
- High minimum stake of 1,500ETH in order to become a validator due to on-chain overhead.
What is the problem?
The whole point of Blockchain is the ability to be fully decentralized. Why is this a problem for the above mentioned version?
All votes are collected and verified within the Casper Smart Contract on-chain. Because Casper FFG still relies on the normal PoW-chain, meaning similar block creation time/block size → constraint: 15tx/s. This constraint led to the high
MIN_DEPOSIT_SIZEof 1,500 Ether in the EIP1011 — restricting the total number of validators to approximately 900–1000 participants so that the network can deal with the overhead due to the new on-chain voting and verification system.
This system is not fully decentralized, obviously. Also, suffice it to say, 1,500 is a lot of ETH for many private individuals.
Signature Verification Bottleneck of the Layer 2 Smart Contract Solution
Let’s assume the
MIN_DEPOSIT_SIZEwould be reduced 32 ETH so that regular individuals can participate in securing the network. Let’s further assume that 10,000,000 ETH are staked in total = 312,500 validators. Each valid voting cycle requires 2/3 votes, meaning 208,334 votes collected by PoW miners. Further, assuming that the PoW chain is modified with a dedicated slot for vote-TX’s, resulting in a throughput of 4 vote-TX/s. That would result in 52,083 seconds or 14.4h to vote for a single epoch-block. Not feasible!
Another reason why it makes sense to merge the approaches is the overlapping infrastructure of Sharding and PoS(necessity of validators for both, stakes, messages, gossip channels, signatures, aggregation points, etc.). A merge would reduce the redundancies as well as later merging/synchronization issues.
How to collect and verify >100,000 Votes (Attestations) for every new block?
So, how do we solve the above mentioned problem of collecting several hundred thousand votes to ensure real decentralization?
Simple answer: Off-Chain.
This can be done by utilizing the so-called “BLS Signature Aggregation”, where instead of verifying each signature one by one like in the deprecated Casper FFG, signatures can be ‘paired’ together resulting in an ‘aggregated signature’, which is then verifiable with one corresponding public key. The BLS Signature Aggregation process can be described as follows (simplified):
A group of N members hold a logic private key S (messages signed by S are marked as SIG, and the public key corresponding to S is marked as P). Each member i in the group holds only a part of the key si, and no one has the entire S so that no one can calculate SIG directly. During the signing process, each member signs a message with their key si resulting in sigi. Only after having received k (a predefined number) number of signatures will the SIG be calculated.
While this scheme was not in the main focus because it would need n+1 pairings in order to verify n signatures on a message m, thanks to the recently published paper Compact Multi-Signatures for Smaller Blockchains, it is now possible to verify n signatures with just 2 pairings.
In Justin Drake’s post, he did the math for assuming the BLS signature scheme for 312,500 validators: The aggregation of all messages would be done in about 9 minutes, assuming the same throughput of 4TX/s as well as “sharded aggregation”, where beacon chain leaders are only responsible for the aggregating signature in the shard for which they are a proposer (So, 312,500/100 = 3,125 infrequently shuffled validators).
While the old Casper FFG would collect and verify every vote on-chain one-by-one, Ethereum 2.0 would aggregate the signatures of each shard off-chain and verify the whole aggregation of signatures at once, leading to a significantly faster verification process.
Ethereum 2.0 — SHASPER: Casper + Sharding
The current specification of Shasper is found in the Work in Progress Paper (v 2.1 as of writing this article) by Vitalik Buterin. The Shasper specifications are highly dynamic with major changes departing from the original publication, most notably by Core Dev Meeting on the 15th June 2018. The current Shasper version is still very far away from finalizing major mechanics like cross-shard communication (current discussion), proposal mechanisms, RANDAO reveal, etc. I recommend to keep track of the WIP-paper and ethresear.ch in order to stay updated.
Ethereum 2.0 Architecture
- Initially, the current PoW Chain was the bridge to the Beacon Chain. In order to become a validator in the new system, one must send — (down from 1,500) — 32 ETH along with the shard where you want to deposit to (since there is most likely no cross-shard communication in the beginning) in a special contract. The stake on your main chain gets burned and you become a validator in the beacon chain.
- Beacon Chain keeps track of the attestations of the respective shards and is therefore the managing ‘Beacon’ for the independent Shard Chains. An attestation is a signed vote from a validator for a block and the blocks’ ancestors in a shard chain. The Beacon Chain is also supposed to provide random numbers as a seed for reshuffling and allocating the validator set.
- Every Shard is itself a PoS chain (up to 1024 shard chains), where the transactions and accounts will be stored.
Ethereum 2.0 Casper Proof-of-Stake
The crux of the new system is still a WIP, and the theory and specifics behind many major components are still being researched. The current state-of-the-art ideas can be found in the official WIP-paper. However, Vitalik proposed to adjust the Casper FFG as follows:
- Time is broken up into “slots” of 8 seconds.
- The validator set is randomly (see below) shuffled every
CYCLE_LENGTH(64 slots = 64*8s = 512s = 8min 32s) and divided ahead of time into N equal-sized slices, which are repeated. (eg. slice 3 of the validator set is called to send messages during slots 3, N+3, 2N+3…).
In the case of the above mentioned 312,500 Validators:
- Committee Size per slot: active validators /
CYCLE_LENGTH= 312,500/64 = 4882
- Number of validators per shard and slot: Committee Size per slot /
SHARD_COUNT= 4882/1024 = 4
It ends up with breaking down validators set to a set of slot-shard committees, each one of size
- During each slot in each shard, a single validator can propose a block, and the slice of validators corresponding to that slot can vote for it.
- A vote votes both for a block (its “target”) and for that block’s N-1 nearest ancestors (ie. N blocks in total).
- A block is justified if 2/3 of the validator set votes for it (in any of the N slices that include or follow its slot). Note that any block can be justified, not just epoch-transition blocks. The chain keeps track internally of what the “last justified block” is, and votes use this as their “source”. Note also that a chain only accepts votes if their source is the source specified in the chain, which itself is guaranteed to be an ancestor of the head of the chain.
- If a sequence of N+1 blocks that are all part of the same chain is justified, then the earliest block in the sequence becomes finalized.
- The two slashing conditions, though slightly different from Casper FFG lead to the same properties:
- A validator cannot make two distinct votes in the same slot; and,
- A validator cannot make two votes
(s2, t2), where
slot(s1) < slot(s2) < slot(t2) < slot(t1), where
slot(x)is the “slot number of x” function.
Fork Choice Rule
The fork choice rule will be “favor the chain containing the highest-slot-number justified block”, to choose between chains that are all descended from the same justified block, the chain uses “immediate message driven GHOST” best explained in Vitalik’s post on ethresear.ch.
How to create non-exploitable randomness?
The development of how to implement a fully decentralized, non-exploitable, scalable PoS algorithm has been an open, actively researched topic since 2014. Vitalik speaks to a history of this work here. This section focuses on RANDAO scheme and Verifiable Delay Functions (VDFs).
The old version of Casper FFG did not have a full PoS functionality since it seemed to be hard to get real and fault-proof randomness in a decentralized, but deterministic system. So instead of letting randomly selected validators propose/create blocks out of thin air, the validators were supposed to vote on every epoch proposed by the underlying PoW chain.
The advent of RANDAO (Random and Decentralized Autonomous Organization) technology seem to overcome the issue of lack of centralism and verification. RANDAO is an on-chain (smart contract) random number service. Instead of having a central source generating the seed, randomness can be generated by the community in three phases (simplified):
Phase I: Every participant willing to take part in the creation process sends a hash of a randomly selected number s (sha3(s)) along with a certain amount of Ether as stake and within a specific time frame (e.g. 6 blocks = 72s) to the RANDAO contract .
Phase II: Now the participants should reveal the selected number s to the contract by sending it within another time window. The contract then verifies that s matches the sha3(s) submitted in Phase I, then save s into the seed of the function that will generate random numbers.
Phase III: After having collected all random numbers s by the participants, apply function f(s1, s2, …, sN) on the final random numbers, return the result to all contracts that required this random sequence.
This system has its weaknesses though…
The Commit Reveal Program has the disadvantage of low production in random number generation. It takes at least 10 block time from receiving the production request to the output of a random number. In the case of the Ethereum blockchain at current block time, it takes more than 3 minutes. In addition, participants need to send transactions and submit data several times, thus resulting in a relatively high cost of use.
It is also vulnerable to collusion or DoS attacks, where attackers would not reveal their s, thereby delaying random seed generation.
Verifyable Delay Functions
Imagine now one of the validators in the Ethereum 2.0 chain is about to propose a new block and therefore knows the seed for randomness first. He then has a short time window — ahead of anyone else — to calculate the possible outcome of anything based on the random seed (validator allocation, for example) or even influence the result (depending on the implementation).
Let’s assume he can not influence the result, but still has two options:
1.) make a block, reveal the random number and receive block reward; or,
2.) not make a block, reshuffle the random number, but loose the block reward.
If the proposer knows ahead of time how many slots they would get selected for if he makes a block would mean he might have room to exploit the system.
The example above underlines the necessity of a system where nobody is able to calculate possible outcomes upfront. Here is where Verifiable Delay Functions (VDFs) come into play.
A VDF is a function that takes non-parallelizable (e.g.
f(x) = g(g(g(g(....g(x)....)))), where the outer g(x) depends on the result of the inner g(x) ) work to compute, but can be verified very quickly. In other words, VDFs have the properties of clock-time asymmetry. It takes long to compute in one direction and the outcome is impossible to predict , but easy to verify once computed [Further Readings on VDFs].
Suppose the block-proposer is expected to submit data that determines the source data for a VDF computation, together with other source data that was revealed at time T, and the submission must appear before time T+D (e.g. D=8 seconds time slot), while it takes even under conservative circumstances min W time to execute the VDF on the source data (W>>D).
This approach prohibits the block-proposer of trying different commits before releasing them, because any effect of manipulation takes too long and is therefore hidden behind the delay function.
[VDFs have] relative to RANDAO similar schemes, but they cannot be manipulated — Vitalik Buterin
Utilizing VDFs as a random beacon in Ethereum 2.0 seems promising. However, how to implement a robust VDF system, as well as solutions to possible vulnerabilities from using VDFs, remain questions yet unanswered by ongoing research.