Harmony Protocol has launched a public sale via Binance Launchpad. Check it out here.
In the light of the event I suggested Stephen and RJ, two co-founders of Harmony, to do a review of their technology, since a) I have known Harmony since their inception, following the evolution of their technology, and b) I also work on a sharded blockchain protocol, and thus have very deep domain knowledge. Stephen and RJ agreed, and here is the review.
I will use three sources of information for this review: the whitepaper that is published on the website, the source code, and a 1hr whiteboard session that we did with Harmony back in February, diving very deep into the tech with RJ. You can find the whitepaper here, the source code here, and the whiteboard session here.
I will only cover aspects of the protocol related to the consensus and sharding, since that is my area of expertise. I will not cover staking, economics and many other aspects of the protocol, where Harmony might also have some major breakthroughs.
Note that I work on Near Protocol, which is also working on a sharded blockchain protocol, and thus I am biased. Make sure to consider this review in a bigger context and consult other sources as well.
Harmony uses a variation of a PBFT consensus that they call FBFT.
As a background, PBFT is a consensus that allows a predefined group of people agree on some outcome for as long as more than ⅔ of them follow the protocol. No matter what malicious behavior the remaining <⅓ of actors exercise, PBFT is guaranteed to result in all honest actors agreeing on the same outcome, and do it in a finite amount of time.
PBFT works by electing a leader, and having the leader orchestrate the consensus. In the optimistic case the leader will do two rounds of communication with all the remaining participants, each incurring O(N²) network communication. However, if the leader is malicious, a very expensive view-change procedure is invoked, that involves O(N³) network communication. The view-change is the most complex and expensive part of PBFT.
The whitepaper only touches on the optimistic case, in which Harmony reduces the network overhead by using BLS aggregated signatures. The second round of communication of original PBFT involves the leader sending N signatures to each of the N participants, thus resulting in O(N²) communication. Sending only one aggregated signature reduces it to O(N).
What Harmony paper doesn’t touch at all on is the view-change step.
It is worth mentioning that there’s a consensus algorithm called Tendermint that is specifically designed to avoid the view-change. It is significantly simpler than PBFT, and has been running live in Cosmos testnet for more than a year now. It would have been simpler for Harmony to use Tendermint and avoid dealing with complex view-change altogether.
Since the whitepaper doesn’t touch on the view-change at all, we will refer to the code. Before we do, let’s quickly review he NEW-VIEW message in the original PBFT paper described on page 6. It includes 2f+1 VIEW-CHANGE messages, each of which includes 2f+1 checkpoint messages and 2f+1 prepare messages, and such NEW-VIEW message is sent to all the participants, thus immediately incurring O(N³) network overhead.
In the Harmony code there’s an attempt to compress it to O(N) by first aggregating the 2f+1 prepared messages in the VIEW-CHANGE message into a single BLS signature, and then aggregating the 2f+1 VIEW-CHANGE messages in the NEW-VIEW message into a single BLS signature.
However, in an in-depth discussion with Chao Ma, one of the core developers of Harmony, we showed that if the current leader is malicious they can send different aggregated prepared messages (i.e. with different subsets of signatures aggregated) to different validators, and the new leader would not be able to aggregate signatures on such aggregated prepared messages, thus making the size of the NEW-VIEW message linear, and making a single view change incurring O(N²) network overhead, which exceeds the declared O(N). While Chao argues that due to the exact way the messages propagate through the gossip network in practice O(N²) is unlikely to be achievable, it is still the case that the worst case complexity in the paper is underestimated.
In theory it is also worth noting that in the absolutely worst case there will be O(N) view changes, making the total complexity O(N³), but having more than constant view-changes in practice is almost infeasible. For example, in Cosmos only on the order of 1% of all the Tendermint invocations go into the second round (rounds in Tendermint roughly correspond to views in PBFT).
Besides underestimated complexity, there’s another concern. FBFT makes some simplifications compared to PBFT, especially related to the view changes. The view changes are extremely complex, and the careful handling of corner cases there is crucial for safety and liveness of the consensus. While I was not able to find any issues in FBFT immediately, I would encourage Harmony to publish more details on the exact protocol for the view change, as well as formal proofs of FBFT liveness and safety. Without such details and the proofs it is extremely hard for the outside community to review the consensus.
Linearly scalable provably secure Sharding
Harmony implements a relatively standard approach to sharding. The idea of such an approach is to have multiple small blockchains called shards, each of which is responsible for a part of the state, and one blockchain that orchestrates them all, called the beacon chain in Harmony.
The validators of particular shards are assigned randomly from a common pool of validators, and are reshuffled regularly. Assigning validators randomly doesn’t allow an adversary that controls a major percentage (say 10–25%) of all the validators to assign all their validators to a single shard, while regular reshuffling doesn’t allow an adversary that can slowly adaptively bribe validators to take over a shard after the validators are assigned.
This approach is relatively common, for example both Elrond network and Multivac, two projects from different geographies and backgrounds, describe very similar approaches. Ethereum 2.0 is also based on this idea. You can read more in-depth overview of such sharding approach in my blog post from last year, when Near Protocol was also considering using it.
The biggest problem with such approach is that it relies on the fact than an adaptive adversary is slow. I do not believe that the slowly adaptive adversary is a meaningful model. With the design Harmony uses the validators cannot shuffle more frequently that once every day, since they need to download the state of a shard everytime they get shuffled. Thus, an adversary will have a full day to corrupt the validators. If Harmony has say 100 shards, it effectively relies on the assumption that it is impossible to corrupt 0.34% (⅓ of 1%) of the validators in one day, which in a very dangerous assumption. Read some insights into why it is an unreasonable assumption here.
Once we assume that a shard can get corrupted adaptively, the validators can corrupt a shard and apply an invalid state transition. It is a very dangerous attack with disastrous consequences. Read more about it here.
We touch on adaptive adversaries in the whiteboard session at 17:00.
Fast State Synchronization
When a validator joins a shard, they need to retrieve the state, and make sure that the state is valid. Since Harmony is assuming slowly adaptive adversaries, and thus works under the assumption that any block that has more than ⅔ of the signatures must be valid, the new validator that joins a shard doesn’t need to validate all the transactions in the entire history of the chain, and instead just needs to confirm the validity of the signatures on the blocks.
Harmony provides a further optimization, described in the Fast State Synchronization section of the whitepaper. The optimization only validates signatures on one block in each epoch, speeding up the process of the synchronization. It is implemented by the means of having first block of each epoch have a “skip-link” to the first block of the previous epoch. Since the validators within the epoch do not change, only validating such “skip-blocks” headers is no less secure than validating all the blocks headers.
In the whitepaper this optimization is described in the context of the shard chains. However, this optimization is not needed in the shard chains. All the blocks of all the shard chains get cross-linked to the beacon chain (see section “Hash Link from Shard Chain”), so the beacon chain always knows the last hash of the last block in each chain. It is sufficient for a validator to check that last hash, and verify that the hash corresponds to the state the validator downloaded. There’s no need to validate any blocks or block headers that precede the last block that was cross-linked.
The described optimization is also not applicable to the Beacon Chain, since the Beacon Chain must have higher security that the shards, it is paramount that the validators actually validate all the state transitions in the Beacon Chain when they sync.
Thus, in my opinion the existence of the skip-links introduced in the section is not justified. RJ Lan, one of the core developers of Harmony, suggests that having skip-links as another option for the validator who joins the shard to synchronize, in addition to syncing from the last cross-linked hash, provides extra means of security, which I personally disagree with: unless a shard was corrupted at some point in the past, such fast sync would not provide any extra security, and if the shard was corrupted and an invalid state transition was applied, such fast sync would not detect it since it doesn’t validate transactions.
Strongest aspects of Harmony
Harmony uses state of the art randomness beacon based on Verifiable Delay Functions (VDFs). The same design is used by Ethereum 2.0.
It is important to note, however, that the design relies on the fact that no malicious actor can execute the verifiable delay function orders of magnitude faster than the honest actors. Ethereum addresses it by investing into ASICs for VDFs. Assuming Harmony is planning to launch before such ASICs are available, it is unclear how they plan to defend against attacks that involve faster hardware.
It would be unfair to do a review of Harmony and not to mention their research on networking. Two blog posts Harmony published on their overall design and the peer discovery are very insightful, and the section on networking in the whitepaper is very strong.
A smart use of erasure codes allows Harmony distribute the block very quickly, which allows the consensus algorithm to start voting on the block earlier. Generally, Harmony has a very good team of networking engineers with previous experience at Amazon Web Services, and networking is one of the strongest components of the entire protocol.
Harmony builds a sharded blockchain and is based on many state of the art techniques, such as using VDFs for unbiasable randomness and leveraging RaptorQ for faster block propagation.
However, lack of details and formal proofs on the FBFT consensus, and the susceptibility of their sharding to adaptive adversaries are concerns, and must be accounted for during evaluation of the security of the network. Let’s see how the technology behind Harmony evolves over time, and how these issues get addressed.
I’m on Twitter: https://twitter.com/AlexSkidanov