Trustless Bridging

Web3 Philosopher
Composable Finance
Published in
12 min readMar 17, 2022

--

A common belief is that a multi-chain future is destined to be a multisig future, where the relayer is controlled by multiple insiders to monitor and facilitate a lock-and-release mechanism; moreover, it is ‘impossible’ to build trustless bridges, apart from a third party to resolve conflicts between networks through hard forks. Industry luminaries such as Vitalik Buterin and Rune Christensen have both made vocal arguments about this.

At Composable, we agree that trusted bridges operated via multisigs are not desirable long-term solutions. However, we think it is too early to draw the conclusion that trustless bridging is impossible. Theoretically, it is possible to relay the canonical true state of one network to another, if we can agree on the true states of two networks on-chain and resolve the different forks that may arise. Getting two networks to communicate in a trustless way may be a bold ambition — intra-network communications such as XCM in Polkadot or IBC in Cosmos are still being rolled out and optimized; but this is an area we need to solve to bring about a seamless cross-chain future. Without trustless bridging, we are left with silos in liquidity, information, and technical development.

We are excited to share our latest development on this front. Pending reviews from the community, the gadget will enable Cosmos chains to track the finality of DotSama parachains, a first step in setting up the Centauri bridge that connects Cosmos and Dotsama. In this article, we give more context and we introduce key pieces of technology that allowed us to build a trustless bridge — light client protocols, Merkle Mountain Range trees, and a novel finality gadget.

Light Client Protocols

Highly efficient light client protocols are crucial in enabling the decentralization and mainstream adoption of blockchain protocols. It does this by allowing environments with computation and memory resource constraints (e.g. mobile, on-chain contracts) to verify the latest blockchain state without the need to execute & store the full block data and state. Light clients instead track block headers instead of tracking the full blocks and executing transactions to arrive at the latest state. It is important to note that blocks are really just the header & transactions:

The size of the transactions in a block might vary, but headers have a fixed size, usually no more than 1kb. Fun fact: bitcoin header size is actually 80bytes, while ethereum is 508bytes.

Zoom into a block header

Light client protocols consist of a combination of transaction merkle proofs, state proofs, consensus proofs and finality proofs all of which are usually included in the block headers except for finality proofs. This is because finality proofs may differ from the consensus proof and require data that is an extension of the header. Let’s look into each one of them in more detail.

Transactions Root

This is the merkle root of all transactions that have been executed in this block. This merkle root can be seen as a kind of cryptographic compression that allows us to trustlessly verify if some data was part of the original data that was compressed by way of a merkle proof.

Such that the merkle proof required to check if some element was included in the root would be log2(n) hashes which are usually 32bytes. In the diagram above, we only need 4 hashes outlined in blue to prove or otherwise reconstruct the original root hash that K was indeed part of the original merkle tree. Where: n = 16, log2(16) = 4. This enables light clients to efficiently check which transactions were executed in a block without needing to download the full block that may potentially contain thousands of transactions and have to scan this list linearly.

State Root

The state root of a blockchain is similar to the transactions root in that it is the output of the cryptographic compression of data. However, where the transaction root is a compression of a list of transaction data, the state root can be seen as the compression of a list of keys & values.

Ethereum state trie architecture

Hence, the keys and values are the data stored on the blockchain by contracts or core blockchain subsystems like the consensus protocol storing the list of authorities and their stakes. By compressing this data into a kind of merkle root, we retain the ability to check the existence of some key, value against the root hash and a merkle proof, without needing to maintain the full blockchain state but still have the same trussless guarantees as a full node.

Consensus Proofs

The consensus proof that a block is valid is usually included in its header and its format is entirely dependent on the consensus protocol of the blockchain. For proof of work (PoW) systems, the consensus proof is a nonce that satisfies the equation:

As you can see above, finding a value that satisfies this equation would require a significant amount of computation as the hash functions cannot be brute-forced, but verifying this consensus proof is cheap and quick.

Meanwhile, the consensus proof for a proof of stake (PoS) protocol is usually the output of a verifiable random function where:

For most blockchain protocols, Their consensus mechanisms usually only guarantee liveness¹, hence verifying these consensus proofs only tell us if this block is valid. It does not, however, tell us if this block should be considered as final. In the case of PoS, blocks that are not signed by a public key in the known authority set are not considered to be valid. Consensus proofs provide trust guarantees about a block to the nodes on the network pending finalization, as competing blocks for the same block height may arise in a byzantine system. It is entirely up to the finalization protocol to provide safety² guarantees.

(Fun fact Polkadot uses a hybrid consensus protocol where Babe³ guarantees liveness¹, while Grandpa⁴ guarantees safety²)

Finality Proofs

For light clients to verify that events are happening on chain, they need finality proof, aka a cryptographic proof that the transactions in a block are final and that the block can never be reverted. These proofs could be a part of the headers or could be requested alongside the header from full nodes.

For proof of work blockchains, this “finality proof” is actually embedded in the headers, as the proof of work nonce. But this by itself is not enough to guarantee finality. Rather, we need a certain threshold of also valid headers that reference this header as its parent in order to be truly convinced of its finality. As you can imagine this increases the storage requirements for light clients that want to be convinced of finality because they have to store n headers to verify the finality of n-1 header.

Even then the finality is entirely probabilistic and dependent on the security of the hash functions used in deriving the proof of work nonce.

For proof of stake blockchains, the finality proof is usually not included in the header but is optionally available to light clients to request alongside the headers. The finality proof will usually consist of the signatures of the known authority set, who have staked their tokens on the network and sign essentially what they think is the latest block in the network. The finality proof in this system are signatures of this data from 2/3 + 1 of the known authority set, under security assumptions that less than a third of the authority set is malicious.

Ancestry Proofs

Unfortunately, most finality proofs require light clients to be aware of the full chain of headers that have been finalized in order to follow the finality protocol. To enable trustless bridging between two networks via light clients which run in smart contract environments and have even worse computation & memory constraints than mobile devices we need smaller sized ancestry proofs which do not require us to be aware of every header in the blockchain.

We can attempt to achieve this using merkle trees, where a light client simply tracks the merkle root of all block headers seen so far. This would enable us to use merkle proofs to prove finality about any header to a light client that only knows the merkle root of all finalized blocks. However, because of the structure of merkle trees, we would have to recalculate the full tree structure from the genesis block all the way to the latest block, for every new block!

Some quick math:

Tree height = log₂(1,000,000) // a million blocks

Tree height = 20

Nodes in the tree = 2^(20 + 1) — 1

Nodes in the tree = 2,097,151

A mind-blowing 2,097,151 nodes would need to be recalculated for every new block for blockchain that already has a million blocks. What we need is a tree structure that preserves the log2(n) proof sizes of a merkle tree but also re-uses in some way previous hash computation on older nodes in the tree whenever new data is added to the tree. Let’s take a quick detour into merkle-mountain-range trees and how they enable highly efficient ancestry proofs.

Merkle Mountain Ranges

Merkle mountain ranges(‘MMR’) are a special kind of merkle tree that is itself, composed of perfectly sized binary subtrees in descending order of height. For example, an MMR tree with 1,000,000 leaves will be composed of 8 perfect subtrees of heights: 19, 18, 17, 16, 14, 9 & 6

it actually lives up to its name and looks sort of like a mountain range

A key feature of MMR is that we can re-use the previous computation (root hashes) whenever we add new leaves to the tree. The rules for adding new leaves to an existing MMR tree require us to merge any two subtrees of the same height (pink section and blue section merged to the hash in blue), so we only ever have one subtree per height level.

Here we only have 2 subtrees
we merge subtrees of the same height

Merkle mountain ranges are also very efficient with proofs where the tree itself is composed of subtrees. Since MMR subtrees come in descending height, the first subtree is typically the heaviest to compute. This also means that as the list grows, new leaves are actually cheaper to insert and prove.

In our tree, the subtree that requires the most proof items is the first subtree, 4 + 2 (peak nodes of the other 2 subtrees) = 6 proof nodes. The beauty of the MMR is that when adding new leaves, we never need to recalculate the hashes of the first subtree, only the more recent ones.

MMR putting the ‘light’ in Beefy light client

With the Beefy protocol, the authority set produces an extra finality proof for light clients which consists of the MMR root hash of all blocks finalized by grandpa at a given height. With the introduction of this protocol, light clients no longer need to be aware of all the headers in a chain for them to be convinced about finality. This drastically reduces the size of the data needed to be stored by light clients to follow the chain’s consensus to exactly 124bytes!

A preliminary specification for Beefy is already available and is mainly implemented, barring a few kinks that need ironing out. But at a high level, this is a new protocol that will be added to Polkadot without the need for a hard fork. Thanks to the WASM runtime and the on-chain governance protocol, this new protocol will produce aggressively lighter finality proofs for light clients for both on-chain & off-chain uses. It will achieve this by having the existing Grandpa authority set periodically to vote on the merkle-mountain-range root hash of all blocks that have been considered final by the network.

pub struct BeefyNextAuthoritySet {
/// Id of the next set.
///
/// Id is required to correlate BEEFY signed commitments with the validator set.
/// Light Client can easily verify that the commitment witness it is getting is
/// produced by the latest validator set.
pub id: u64, // 8bytes
/// Number of validators in the set.
///
/// Some BEEFY Light Clients may use an interactive protocol to verify only subset
/// of signatures. We put set length here, so that these clients can verify the minimal
/// number of required signatures.
pub len: u32, // 4bytes
/// Merkle Root Hash build from BEEFY AuthorityIds.
///
/// This is used by Light Clients to confirm that the commitments are signed by the correct
/// validator set. Light Clients using interactive protocol, might verify only subset of
/// signatures, hence don't require the full list here (will receive inclusion proofs).
pub root: H256, // 32 bytes
}


// Data that light clients need to follow relay chain consensus
pub struct BeefyLightClient {
pub latest_beefy_height: u32, // 4bytes
pub mmr_root_hash: H256, // 32bytes
pub current_authorities: BeefyNextAuthoritySet<H256>, // 44bytes
pub next_authorities: BeefyNextAuthoritySet<H256>, // 44bytes

Contributing to the Beefy Protocol

At Composable Finance, we are fully committed to realizing a cross-chain future with our aggressive drive for delivering trustless cross-chain infrastructure, protocols, and applications for the ecosystem. We are doing this with a total of 8 PRs to core Beefy subsystems in both the runtime and substrate client, pending further review by the Substrate bridges team.

11-BEEFY COSMOS-IBC light client

As part of our efforts to bring the Cosmos ecosystem to DotSama, we are excited to announce that we have completed the development of a beefy light client in golang for the Cosmos ecosystem! Pending further audits, this light client will be merged upstream into the IBC-Go repo which hosts the canonical implementation of the tendermint light client.

Our intent is that our light client will serve as the canonical light client for Cosmos chains to communicate directly with Kusama/Polkadot parachains. A single instance of the light client can track either the Kusama or Polkadot relay chain finality and be used to prove the finality of any of the connected parachain’s state. In the spirit of trustlessness, we have replaced a demo with instructions for everyone to run a test to verify the operation of the light client. You can find the draft spec here.

The Beefy light client is a step towards building out the Centauri bridge which connects DotSama and Cosmos. At a high level, the bridge consists of three parts: light clients on both networks to prove finality on the other network, and a relayer that relays ibc packets and light client proofs. We have completed the development of the Beefy light client to be merged on the Cosmos side (pending reviews), and the other two components are in development.

Summary

We are building Centauri, a trustless bridge between Picasso, our parachain on Kusama and the Cosmos ecosystem. The key challenge is to reach consensus on the finality of the two networks in a trustless manner, which guarantees the correctness and security of cross-chain bridging. This is achieved with several pieces of new technologies: firstly, the light clients allow nodes to verify blockchain state without downloading the full block data, thus ensuring light computing requirements and decentralization; secondly, Merkle Mountain Range allows new block headers to be hashed in an ‘append-only’ manner, which is more suited for use in consensus subsystems than other Merkle trees; finally, the Beefy finality gadget makes use of Merkle Mountain Range to produce finality proofs for parachains in a more efficient way. The Composable team is currently contributing to the core Beefy systems as well as reviewing an IBC Light Client for Cosmos — once merged with IBC upon further audits, this will allow Cosmos chains to directly communicate with Substrate chains on Dotsama.

This is only the first step in our cross-chain bridging effort. Reach out to me on Twitter if you want to discuss the cutting edge of trustless bridging.

¹ & ² Lamport, Leslie (2005). “Generalized Consensus and Paxos”.

³ Blind Assignment for Blockchain Extension

GHOST-based Recursive ANcestor Deriving Prefix Agreement

--

--

Web3 Philosopher
Composable Finance

Mad scientist at Polytope Labs.