(Partially) Explained Casper CBC specs – Part two: Protocols from the Abstract

From abstract specs to practical protocols

In my previous post, I gave a brief visual interpretation to the latest paper by Vlad Zamfir, Nate Rush, Aditya Asgaonkar and Georgios Piliouras. We saw there that Casper CBC defines an abstract protocol specification, made up of validators sending messages to reach consensus. Once we prove some very nice properties on this general family of protocols, we can define a bit further some components of the mechanisms to obtain specific protocol instances. This is the magic of CBC: Do the hard work on the foundations and reap the profits downstream!

CBC Casper can be refined to yield various specific consensus protocols, starting from its abstract specifications.

This article will build on the previous one, with the explicit intent of presenting examples of specific protocol instances to illustrate the general framework. In terms of the paper, this post covers Section 4, while the previous dealt with Sections 2 and 3. Before diving in, let’s take a minute to remember what happened there. It’s also a good idea to read again Aditya’s post on the topic, as well as part one of this post and to keep the paper handy at your side while reading. Vlad’s latest Devcon talk can be helpful to understand the general philosophy behind CBC.

The good news however is that we have done a lot of hard work in the previous post, and this part will make some of the more difficult ideas very practical with examples.

Aditya’s post

Part one

The abstract CBC Casper (part one recap)

A group of validators attempts to reach consensus over some object. They can talk to other validators by sending messages. Each time a validator sends a message, it has to give an estimate, i.e. state the value it believes should be the consensus, and justify its message by referring to some previous protocol state. This justification is none other than a set of previous messages, acknowledging “I have seen these messages, and my new message is a continuation of that”. Meanwhile, nodes observe the messages and the protocol state transitions, in an asynchronous fashion: not everyone lives in the same state, but all states are made of messages referencing earlier states.

Now, if I am a validator, I can derail the protocol by doing the following: I send two different messages m and m’, none of which references the other. Here I am basically saying “I am continuing history with m. But m’ belongs to a different world, where history continues with m’ instead, and vice versa”. This is an equivocation fault. The protocol now runs two different versions (or threads, in the paper), none of which is compatible with the other.

The problem with equivocation is that if validators start saying everything and its opposite (tout et son contraire), a consensus failure may occur. Nodes decide on something if they hold it to be true from a certain state onwards. How do we guarantee that these decisions will not be inconsistent?

The answer lies with the safety guarantee. We give ourselves some weight t, and assume that among all states observed by our nodes, the total weight of validators that have equivocated there is less than t. Then, by the safety guarantee, we know that there is a common future for the nodes and thus the nodes cannot make contradictory decisions.

Two nodes A and B are respectively at states σ1 and σ2. They have a common future if from their current states they can reach a common state σ1 u σ2. The safety property guarantees that if less than t validators equivocate, there is such a common future where decided properties are consistent and still less than t validators equivocate. Every other state in the intersection of the futures cone σ’ or σ’’ is a superset of σ1 u σ2.

Forming consensus

We stand on solid ground with the protocol specs and its safety guarantee, it is time to make good use of it all. The protocol is developed to reach consensus and this is what we turn our attention to now.

Latest messages

When validators send messages, they are actively vouching for some value to be decided as consensus. Dishonest validators can try to derail the protocol by vouching for two different values, or justifying the same value from two different states that do not include either messages. This entails that validators that do not equivocate will always have at most one latest message – they can have none if they have never spoken, or if their messages are not observed in a particular state. If they are observed in the state however, there is only one such latest message. The consensus is argued over this set of honest observed validators’ messages.

We have three validators, A, B and C. 1) A and C send a message (states σ1 and σ2). 2) B equivocates, sends two messages m and m’. 3) At state σ4, we have one latest message from A (observed honest validator), two latest messages from B (observed equivocating validator) and zero message from C (unobserved honest validator).

Messages as input to the estimator

After we collect the latest messages of observed honest validators, we can move from the protocol state to the consensus values. Definition 4.13 gives us a function mapping validators to their own estimate, or to the empty set if the validator is not observed or if it is equivocating. The estimate is the consensus value they have included in their latest message.

Definition 4.14 returns a property of the specified protocol. That property is set to True if we are only using estimates from honest validators, i.e., we are simply discarding any information (or lack thereof) from equivocating or unobserved validators when forming the consensus. Thus, an estimator is “latest honest estimate-driven” if it only looks at estimates of honest validators.

Minimal Casper consensus protocols

What do we have left to do? We proved that nodes cannot decide on contradictory consensus values unless the weight of equivocating validators is too high. We saw that validators who play nice and are included at some state of the protocol have sent only one latest message each. Finally, we mapped these latest messages to the consensus value they are estimating.

It is now time to specify what exactly is the set of values we arguing over, and how consensus is reached from the estimates of observed honest validators.

The friendly binary consensus

The simplest of the family of minimal CBC Casper protocols, the binary consensus, asks of validators to choose between two values, say, 0 and 1. The consensus value is obtained very naturally: validator weights are tallied up for those who estimated value 0 vs. those who estimated value 1, and the heaviest set wins. Note that in case of a tie, both values are returned: remember that the estimator returns a subset of the set of consensus values, and this subset can have multiple elements.

4 validators estimate the consensus value. 2 are out of the final decision: B has equivocated and C has not been observed. D estimates 0 but its weight of 2 is inferior to that of A, who estimated 1. Thus, the consensus yields 1.

The friendly integer consensus

The logical next step is to reach consensus over a set that contains more than 2 elements. In this version, validators estimate the value of an integer. We could again take the value returned by the heaviest set of validators, as we did for the binary consensus protocol. However, this might not best represent the idea of consensus, as the mode of a distribution (its highest peak) can be very different from its mean or its median.

The paper chooses to estimate the median. Each integer is weighted by the total weight of the validators which returned that integer. The estimator then returns the median over these weighted integers. The weighted median is roughly obtained by returning an integer such that less than half of the total weight of all validators is below it, and less than half of the total weight of validators is above it. Not clear? A drawing may help:

The first array gives the estimates of six observed honest validators, with the validators’ weights. The second array gives the total score for each of the estimated integers. In the chart, we show in pink the cumulative score distribution, with the dashed line representing the median line, where half of the validators’ weight is. Since 2 is the only integer such that the total score of smaller integers is less than 7 (= 6, blue arrow) and the total score of larger integers is also less than 7 (= 4, second blue arrow), we get that 2 is the weighted median, and thus the consensus value. You could check as an exercise that 1 is not the weighted median. Can you find an example where the weighted median returns a set of multiple integers?

The friendly GHOST

Seemingly, we have strayed rather far from the blockchain issues that motivated all this in the first place. Rest assured, however, that with all this power in our hands, we can very quickly adapt Casper to fit in a more blockchain-oriented framework. Its first application is to the fork choice rule nicknamed GHOST, for Greedy Heaviest Observed Sub-Tree (can you see the connection with Casper? 👻)

GHOST is particularly helpful for blockchains that tend to add blocks very rapidly. Unlike Bitcoin which adds a block on average every 10 minutes, Ethereum does so every 10 to 20 seconds (around 15 seconds, at the time of writing). This naturally increases the uncle rate, i.e., the rate at which valid blocks are produced which are not included in the main chain. Under the GHOST rule, parent blocks may have several children, all included in the canonical main chain. Since I am a decidedly visual person, I refer you to this great animation done by the folks over at RChain.

What is the set of consensus values this time? The set of all possible blocks, starting from some genesis block g! This seems like a step-up from a few integers, and yet, armed with the GHOST rule and the few mechanisms we developed just previously, it is quite natural. When block b belongs to some state, one can compute b’s score by looking at the sum of weights of validators who return one of b’s descendants as estimate. This is OK, since we want the heaviest subtree.

Now, for the greedy part, we start from the genesis block g and get its best children, i.e. the children with the highest score, and recursively their own best children, etc., until we arrive at a leaf. At this point, GHOST returns a set of best leaves, which is exactly what we asked of our consensus protocol: tell us where to build next!

Coming next

The paper goes on to provide another flavour of Casper: the Finality Gadget. If you have already heard of it before, it is because it is actively discussed and researched on at the moment by the Ethereum community. It started as a consensus gadget on top of PoW, but was recently redesigned to allow for full PoS on top of sharding. It is a super interesting topic, and a very recent paper by Vitalik Buterin, Daniël Reijsbergen and Georgios Piliouras makes the case for the cryptoeconomics of deposits and rewards in the finality gadget.

Since we mentioned sharding, it is in fact entirely possible to apply the CBC Casper framework directly to it. Vlad Zamfir has been doing some work towards this, including another cool visual for parent and child chains working together, along with Alexander Skidanov. Section 4.6 expands on this topic.

Finally, a specification of CBC Casper for light clients would guarantee the safety of a blockchain where validators may not have access to the full history of protocol states. This is an extremely powerful technique to reduce the storage requirements of a node as well as associated computation overheads and an active research question today.


Any mistake? Question? I am still learning and would love your comments!

More thanks to the CBC team for their useful comments and discussion.