Blockchain Research Newsletter #1: Avalanche

Published in

Blockchain Research Newsletter

6 min readFeb 25, 2019

In this edition, we will be focusing on a novel family of leaderless consensus protocols introduced by an anonymous group called Team Rocket. Avalanche has received significant attention recently, when distributed systems researcher and associate professor Emin Gun Sirer praised the paper and co-founded the cryptocurrency Ava, that uses and improves on Avalanche as its consensus protocol.

The main innovations brought over Nakamoto consensus were 1) the use of random sampling and 2) the introduction of the metastable property to cryptocurrency consensus protocols.

Each protocol builds on top of the previously introduced protocol. We first start with a non-Byzantine fault tolerant consensus mechanism, Slush, that demonstrates the metastable property. Then, we build the Byzantine fault tolerant protocol Snowflake using Slush. Snowball is extended from Snowflake by adding memory in the form of a confidence score. Finally, Snowball is augmented using a DAG structure to store and order transactions to create Avalanche.

Protocols

Slush

Slush is a simple, non-Byzantine Fault Tolerant consensus algorithm that illustrates the metastable property. Assume that nodes in a network want to agree on some binary choice e.g., 0/1 or white/black. For the sake of illustration and ease of visualization, assume the binary choice is between white and black. A node starts out initially with no color. When the node receives a transaction, it takes the color of the color that appears in the transaction and starts a querying process to query other nodes in the network. To initiate a query, the node randomly chooses a fixed number k of nodes in the network and sends each of them a message. When other nodes in the network receive the query they either 1) respond with their color if they are a colored node or 2) color themselves with the color contained in the query if they are an uncolored node. If the node doesn’t get enough queries, then it’ll query the network again until it gets k queries back. The node does this process until is reaches a time m. At the end of each k sampling, the node decides whether to switch colors or stay with its current color. It makes this decision by computing a threshold, dependent on k, and for each color, checks if the fraction of nodes of that color is greater than the threshold. If the threshold is met, it will changes it color. Otherwise, it keeps its current color.

There are several interesting properties that emerge from Slush. First, the node doesn’t have any record of its previous colors and interactions with other nodes between rounds. This property is known as memoryless. Second, nodes repeatedly sample subsets of the network at a time instead of seeking messages from all of the nodes in the network at once. Finally, if the network starts in a 50/50 colored state, i.e., 50% of the nodes are colored black and the other half is colored white, then after repeated sampling the network will eventually favor one color over the other. If nodes do this for long enough, eventually the network will be of the same color.

Given these properties, we can see that Slush is non-Byzantine Fault Tolerant. A malicious node can easily influence the final outcome of the network by changing its own color according to its own self-interest. For example, if the network starts in an equal state and the malicious node wants the network to be color 0, it can change its color to make sure that all the nodes in the network are eventually colored 0. A set of malicious nodes can also keep the network in an undecided state. In other words, the honest nodes in the network never come to consensus on what color to decide on.

Snowflake

So, now we are ready to improve upon Slush and design a Byzantine-Fault Tolerant consensus algorithm. Surprising enough, in order to get Snowflake from Slush, all we need to do is add a counter to Slush. Each node gets a counter. The counter keeps track of the number of queries that return the same color as the node. Now, each node executes Slush as usual. But, instead of simply accepting a color once the threshold is met at the end of a query, we increase the counter by 1. The nodes keep on increasing the counter by 1 until a query returns a different color than its current color. Then, the node resets the counter to 0 and changes its color to the opposite color. Otherwise, the node continues querying until time m. It then compares its counter to another threshold. If this threshold is met, then the node accept the color the counter was accounting for.

One major property that we see in Snowflake that is not present in Slush is that now, Snowflake is no longer memoryless. The counter acts as a form of short-term memory since it keeps track of how many times a particular color has been returned from a query. However, once the opposite color gets returned, the counter gets reset to 0 and there is no memory of the previous instances of the counter.

Another major property that emerges is that of Byzantine Fault Tolerance. However, in order to show this would require some fancy math which is outside the scope of this post. The gist is to correctly parameterize the number of BFT nodes and the probability of a reversion in order to guarantee both liveness and safety.

Snowball

While Snowflake is safe in the presence of Byzantine adversaries, it can be improved as Snowball. A proof that Snowball has strictly better safety properties than Snowflake is included in the paper, but is outside the scope of this post.

Note that Snowflake only has short-term memory. This makes it particularly sensitive to small random perturbations and variance when performing the random sampling. Snowball extends Snowflake by adding long-term memory in the form of confidence counters.

Each node now retains a counter for every color, representing its confidence that the network is consistently agreeing with each color. After every query, the appropriate counter is incremented. The color that each node decides on at any given round is the one with the highest confidence count.

In other words, the network must present a particular color consistently over several rounds to influence the final decision. At a high level, this makes it more difficult for an adversary to overpower the rest of the honest nodes through random perturbations.

Avalanche

Now that we have a reasonably robust consensus protocol in the form of Snowball, we can use it to agree on serialized data (the same useful thing blockchains provide, and upon which we can potentially build executions engines). Note that Avalanche does not impose any requirements on a particular form of Sybil resistance (e.g., PoW or PoS). Rather, it only proposes a novel consensus protocol that assumes one or more such rate limits are used, if it is to be used in a permissionless setting.

As alluded to earlier, Avalanche is leaderless i.e., there is no block proposer. With no leader to collect transactions into blocks to form a chain of blocks (i.e., a blockchain), a different data structure must be used: a Directed Acyclic Graph (DAG). Nodes in a DAG can refer to one or more parents, with the important property that there are no cycles i.e., this isn’t Game of Thrones, and a node cannot be its own parent.

In Avalanche, a transaction DAG is used. Each transaction can refer to one of more other transactions, its parents in the graph. In order for a transaction to be valid, its parents must be valid recursively. In the event of conflicting transactions (e.g., an attempt to double-spend), the network can vote on the two or more conflicting transactions using the Snowball protocol to decide on one of them to include in the canonical DAG.

Through these means, a single agreed-upon view of a DAG will form. As with any DAG, it can be (possibly non-uniquely) topologically ordered, giving us serialized data. Note: the use of the DAG is solely for the leaderless property; it does not proffer any scalability improvements, as each node on the network must still verify every transaction. See this previous post for an overview of the scalability problem.

Conclusions

Avalanche is a novel family of consensus protocols that uses random sampling of a subset of the network. Unlike blockchains, it is completely leaderless through the use of a DAG whereby users build up nodes in the graph.

If you like what you have read, you can subscribe to the newsletter.