Comparing and Contrasting Different Blockchain Systems

By Immad Naseer on The Capital


Photo by Eric Prouzet on Unsplash

There are myriad blockchain systems out there, from a few large public blockchains such as the public Bitcoin and public Ethereum networks, to a number of smaller blockchain systems, such as private Ethereum networks, Hyperledger Fabric, Corda, among others.

They all have similarities but also differ from each other in interesting ways. In this article, we will study how all these systems contrast and relate to each other. We will assume that you have a passing familiarity with these blockchain systems. If you don’t, there are excellent introductions available elsewhere on the internet. We won’t delve into the specifics of any of these blockchains but will study them at a level of abstraction which hides the irrelevant details and allows us to focus on the essential differences. This will help us understand the overall design space these systems occupy with greater clarity.

Before we begin, we’ll make a subtle but very useful distinction — the distinction between the blockchain data structure and the blockchain network.

A blockchain data structure is a collection of authenticated transactions, strung together in a tamper-proof list data structure. A blockchain network evolves the blockchain data structure in a shared and consistent manner following agreed-upon protocols such that all participants have the same view of the blockchain data structure. We’ll refer to the data structure as well as the network simply as ‘blockchain’ when the meaning is clear from context.

Blockchain data structure

A blockchain data structure is a collection of authenticated transactions, strung together in a tamper-proof list data structure.

Let’s discuss each of these terms in more detail.

Authenticated

Blockchains consist of transactions which are signed by entities using their private keys. The signature links each transaction to a unique entity and also ensures that any tampering with the contents of the transaction can be detected.
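To make the idea of an authenticated transaction concrete, here is a minimal sketch in Python. Bitcoin and Ethereum sign transactions with ECDSA, which the Python standard library does not provide, so this sketch swaps in a Lamport one-time signature built from SHA-256 instead. The function names and the sample transaction are made up for illustration, and a Lamport key must never sign more than one message.

```python
import hashlib
import secrets

def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def generate_keypair():
    # Private key: 256 pairs of random secrets. Public key: their hashes.
    private = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(256)]
    public = [(_h(a), _h(b)) for a, b in private]
    return private, public

def _bits(message: bytes):
    # The 256 bits of the message digest decide which secrets get revealed.
    digest = int.from_bytes(_h(message), "big")
    return [(digest >> i) & 1 for i in range(256)]

def sign(private, message: bytes):
    # Reveal one secret per digest bit (this is why the key is one-time only).
    return [private[i][bit] for i, bit in enumerate(_bits(message))]

def verify(public, message: bytes, signature) -> bool:
    # Each revealed secret must hash to the public value selected by that bit.
    return len(signature) == 256 and all(
        _h(sig) == public[i][bit]
        for i, (sig, bit) in enumerate(zip(signature, _bits(message)))
    )

private_key, public_key = generate_keypair()
tx = b"alice pays bob 5"  # hypothetical transaction payload
signature = sign(private_key, tx)

valid_original = verify(public_key, tx, signature)
valid_tampered = verify(public_key, b"alice pays bob 500", signature)
```

Anyone holding the public key can check the signature, and changing even one character of the transaction makes verification fail, which is the property the rest of this article relies on.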

Blockchains like Bitcoin and Ethereum use plain private/public key pairs. Other blockchains like Hyperledger Fabric (HLF) and Corda use certificates (which bind a public key to an identity along with additional metadata, while the matching private key stays with its owner). Public blockchains don’t restrict who can include transactions in the blockchain while private blockchains restrict the set of entities which can include transactions in the blockchain through various mechanisms.

While authenticated transactions by participants belonging to the same trust domain (say, the same organization) would technically qualify as a blockchain, blockchains are often used in situations where collaboration is needed across trust domains. Blockchains thus often contain authenticated transactions by participants belonging to multiple different trust domains (different organizations, jurisdictions, etc.).

Transactions

We mentioned that blockchains contain authenticated transactions. But what exactly is a ‘transaction’ in the context of a blockchain? While all blockchains contain data signed by a participant's private key, the nature of that data differs depending on the blockchain system.

An Ethereum blockchain, for example, consists of a series of signed ‘instructions’. Some instructions deploy new contracts. Other instructions call methods on already deployed contracts passing in some parameters. But it’s all just instructions. A node in an Ethereum network is supposed to read all these instructions in the blockchain data structure and execute them locally to figure out the state of this ‘blockchain computer’.
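The "replay the instructions to get the state" idea can be sketched in a few lines of Python. The instruction format below (`op`, `address`, `method`, `amount`) and the single counter contract are invented for illustration; real Ethereum transactions carry bytecode and encoded call data and are executed by the EVM.

```python
def replay(instructions):
    """Rebuild contract state by executing every instruction in order."""
    contracts = {}
    for ins in instructions:
        if ins["op"] == "deploy":
            # Deploying creates a fresh contract with empty state.
            contracts[ins["address"]] = {"count": 0}
        elif ins["op"] == "call" and ins["method"] == "add":
            # Calling a method mutates the state of a deployed contract.
            contracts[ins["address"]]["count"] += ins["amount"]
    return contracts

log = [
    {"op": "deploy", "address": "counter"},
    {"op": "call", "address": "counter", "method": "add", "amount": 2},
    {"op": "call", "address": "counter", "method": "add", "amount": 3},
]

state = replay(log)
```

Because the replay is deterministic, every node that reads the same list of instructions in the same order arrives at the same state.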

HLF blockchains, on the other hand, consist of a series of read/write sets. Each signed transaction is a set of key/value pair writes, along with the set of versioned key/value pairs which were read to produce the write set. A node in the HLF blockchain reads this data to construct a local key/value database.
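Here is a rough sketch of how a node could apply such read/write sets to its local key/value database. It assumes plain integer versions and a dictionary-based transaction format, both simplifications chosen for this example rather than Fabric's actual data model.

```python
def apply_rw_set(db, tx):
    """db maps key -> (value, version). Returns True if tx was applied."""
    # The transaction is valid only if every key it read still has the
    # version that was observed when the write set was produced.
    for key, version_read in tx["reads"].items():
        current_version = db.get(key, (None, 0))[1]
        if current_version != version_read:
            return False
    # Apply the writes, bumping the version of each written key.
    for key, value in tx["writes"].items():
        current_version = db.get(key, (None, 0))[1]
        db[key] = (value, current_version + 1)
    return True

db = {"balance": (100, 1)}
# Both transactions were produced against the same snapshot (version 1).
tx_a = {"reads": {"balance": 1}, "writes": {"balance": 90}}
tx_b = {"reads": {"balance": 1}, "writes": {"balance": 50}}

applied_a = apply_rw_set(db, tx_a)
applied_b = apply_rw_set(db, tx_b)
```

The second transaction is rejected because the value it read has since changed, so no write is ever based on stale data.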

A Corda ‘transaction’, by contrast, consumes a set of input states and produces a set of output states. A Corda node maintains the list of these state objects, marking older state versions as invalid as they transition to new state versions.
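The input/output state model can be sketched as follows. The `(transaction id, index)` references and the dictionary layout are assumptions made for this example, not Corda's API.

```python
def apply_state_transition(unconsumed, tx):
    """unconsumed maps state reference -> state. Returns True if tx was applied."""
    # Every input must refer to a state that has not been consumed yet.
    if not all(ref in unconsumed for ref in tx["inputs"]):
        return False
    # Inputs become historic (removed from the unconsumed set) ...
    for ref in tx["inputs"]:
        unconsumed.pop(ref, None)
    # ... and outputs become the new current states.
    for index, state in enumerate(tx["outputs"]):
        unconsumed[(tx["id"], index)] = state
    return True

vault = {("tx0", 0): {"owner": "alice", "amount": 10}}
tx1 = {"id": "tx1", "inputs": [("tx0", 0)],
       "outputs": [{"owner": "bob", "amount": 10}]}
tx2 = {"id": "tx2", "inputs": [("tx0", 0)],
       "outputs": [{"owner": "carol", "amount": 10}]}

first = apply_state_transition(vault, tx1)
second = apply_state_transition(vault, tx2)  # tries to reuse a consumed state
```

The second transaction fails on this node because its input was already consumed by the first.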

On the face of it, there doesn’t seem to be a lot of similarity between the kind of data the different systems are recording on the blockchain. But each of these is referred to as a ‘transaction’ because it allows developers to write logic (executing either inside or outside the blockchain) which can update the state of the world in a transactional way, i.e. update the state of the world after inspecting the existing state of the world without having to worry about any intervening operation changing things in the middle. Databases provide similar transactional guarantees to developers. It’s the exact same idea. For those of you familiar with transaction consistency levels, all the blockchain systems mentioned above, in fact, provide serializable transaction guarantees.

Tamper-proof chain data structure

Authenticated transactions alone aren’t enough. We need to be able to string a collection of authenticated transactions by participants belonging to multiple trust domains in a specific order to do something interesting. The precise ordering of these authenticated transactions is very important and shouldn’t be tampered with as otherwise it might lead to a different outcome. It is thus important to string these transactions in a tamper-proof chain data structure.

An interesting consequence of stringing together a list of authenticated transactions, each of which modifies the state of the world in some way, is that we get a full audit trail of how the state of the world came to be a certain way. This property is often very useful to have.
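One common way to build such a tamper-proof chain is hash linking: every block stores the hash of the block before it. The sketch below assumes SHA-256 over a JSON encoding and an all-zero genesis hash, which are choices made for this example rather than the format of any particular blockchain.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder for "no previous block"

def block_hash(prev_hash, transactions):
    payload = json.dumps({"prev": prev_hash, "txs": transactions}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

def append_block(chain, transactions):
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({"prev": prev_hash, "txs": transactions,
                  "hash": block_hash(prev_hash, transactions)})

def verify_chain(chain):
    # Walk the chain, checking each link and recomputing each hash.
    prev_hash = GENESIS
    for block in chain:
        if block["prev"] != prev_hash:
            return False
        if block["hash"] != block_hash(block["prev"], block["txs"]):
            return False
        prev_hash = block["hash"]
    return True

chain = []
append_block(chain, ["alice pays bob 5"])
append_block(chain, ["bob pays carol 2"])
intact = verify_chain(chain)

chain[0]["txs"] = ["alice pays bob 500"]  # rewrite history
tampered_detected = not verify_chain(chain)
```

Changing or reordering an earlier block changes its hash, which breaks every later link. Anyone who remembers the hash of the latest block can therefore detect a rewrite of anything before it.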

Quick aside

As a quick aside, sometimes it’s helpful to contrast a system with something similar but not the same to better understand its boundaries. Consider a git repository as a data structure. If you read up on the internals of how git repositories work, you’ll learn that the commits are organized in a tamper-proof tree-like data structure (strictly speaking, a directed acyclic graph, since merge commits have more than one parent), hash linked in a way reminiscent of a blockchain. It differs from blockchains in multiple ways, however. Its history can branch, as opposed to forming a single list. It also doesn’t contain authenticated/signed data by default, though it can certainly support signed commits if desired. What about transactionality? I’ll leave that as an exercise for you to ponder.

With this aside out of the way, let’s turn our attention to blockchain networks.

Blockchain Networks

A blockchain data structure is only so useful without a means to share it with other participants and evolve it collectively in a consistent manner. Blockchain networks are what make a blockchain tick and grow. Let’s study them in more detail.

A blockchain network consists of various actors:

  • Participants which propose new transactions to be added to the blockchain data structure, signed with their private keys
  • Node(s) which evolve the blockchain data structure by adding new transactions to it
  • Node(s) which observe the additions to the blockchain, and optionally, but crucially, check whether those additions are valid

The blockchain network as a whole should satisfy the following crucial safety property:

The blockchain data structure should only ever grow linearly and never fork into a tree
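This safety property is easy to state as a check. The sketch below represents each block as a hypothetical `(block_id, parent_id)` pair and only tests that no block has more than one child. It does not check that the blocks are actually connected to each other.

```python
from collections import Counter

def is_linear(blocks):
    """blocks: list of (block_id, parent_id); the first block has parent None."""
    # A fork exists exactly when some parent has two or more children.
    children_per_parent = Counter(parent for _, parent in blocks)
    return all(count == 1 for count in children_per_parent.values())

linear = [("b0", None), ("b1", "b0"), ("b2", "b1")]
forked = linear + [("b2x", "b1")]  # a second block extending b1

linear_ok = is_linear(linear)
fork_found = not is_linear(forked)
```

Each network described below is, in effect, a different mechanism for making sure this check never fails.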

Let’s study a few different blockchain networks and see how they operate.

Public Ethereum Network

  • Anyone can propose a new transaction to be added to the Ethereum blockchain, signed with their private key (we are intentionally eliding some details here, such as the use of gas)
  • The Ethereum network consists of a set of nodes gossiping with each other, all trying to solve a cryptographic puzzle to get a chance to evolve the blockchain data structure. On average, one node solves the puzzle every 10–20 seconds, adds a new block of transactions to the blockchain data structure and disseminates the information to the rest of the network
  • A subset of nodes in the network observe changes, validate them and update their local copy of the blockchain data structure if the additions are valid. Other nodes, such as Ethereum light clients, only observe the changes without validating them

The public Ethereum network (using proof-of-work style consensus) keeps the blockchain evolving linearly, rather than devolving into a tree through forks, by making it likely that only one node (out of many) earns the right to add the next block of transactions in any 10–20 second window, so two nodes rarely try to extend the chain in different ways at the same time. Furthermore, any attempt by a node to maliciously fork the blockchain (say, by appending the next block of transactions to an older block) is rejected by the rest of the network as information about the faulty addition disseminates through it.
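The "cryptographic puzzle" can be sketched as a search for a nonce whose hash falls below a target. Ethereum's proof-of-work used the Ethash algorithm; this sketch swaps in plain SHA-256 and a tiny, made-up difficulty so that it finishes quickly.

```python
import hashlib

def puzzle_value(block_data: bytes, nonce: int) -> int:
    digest = hashlib.sha256(block_data + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big")

def mine(block_data: bytes, difficulty_bits: int) -> int:
    # Try nonces until the hash has `difficulty_bits` leading zero bits.
    target = 1 << (256 - difficulty_bits)
    nonce = 0
    while puzzle_value(block_data, nonce) >= target:
        nonce += 1
    return nonce

def check(block_data: bytes, nonce: int, difficulty_bits: int) -> bool:
    # Verifying a solution takes one hash; finding it takes many.
    return puzzle_value(block_data, nonce) < (1 << (256 - difficulty_bits))

block = b"block 1: alice pays bob 5"  # hypothetical block contents
nonce = mine(block, 12)               # a few thousand attempts on average
solved = check(block, nonce, 12)
```

Raising the difficulty makes solutions rarer, which is how a network can tune the average time between blocks.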

Private Ethereum Networks

Private Ethereum networks often contain far fewer nodes than the public Ethereum network.

  • A smaller number of authorized participants propose transactions to be added to the blockchain
  • A small number of nodes, typically on the order of 9 or so, select a node among themselves, using Paxos/Raft/PBFT-style consensus, to add the next set of transactions (block) to the blockchain
  • A larger number of nodes observe and validate the transactions added to the blockchain; they can reject the addition as invalid if the additions don’t pass validation or they notice a fork of the chain

Private Ethereum networks rely on Paxos/Raft/PBFT-style consensus protocols to ensure that only one node is able to extend the blockchain in each round of the protocol, thus preventing forks of the blockchain resulting from two nodes extending the same chain in different ways in the same round.
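The core reason two conflicting blocks cannot both win the same round is that any two majorities overlap in at least one node. The toy sketch below shows just that piece. It assumes honest voters that never crash, and it leaves out leader election, retries and Byzantine behaviour, so it is not an implementation of Paxos, Raft or PBFT. The `Voter` class and `try_commit` function are hypothetical names.

```python
class Voter:
    def __init__(self):
        self.voted = {}  # round number -> block id this node voted for

    def vote(self, round_no, block_id):
        # A node votes for at most one block per round.
        return self.voted.setdefault(round_no, block_id) == block_id

def try_commit(voters, round_no, block_id):
    # A block is committed only if a strict majority votes for it.
    votes = sum(voter.vote(round_no, block_id) for voter in voters)
    return votes > len(voters) // 2

voters = [Voter() for _ in range(5)]
first = try_commit(voters, 1, "block-A")
second = try_commit(voters, 1, "block-B")      # same round, conflicting block
next_round = try_commit(voters, 2, "block-B")  # a later round is fine
```

Once a majority has voted for one block in a round, a conflicting block can no longer gather a majority in that round.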

Hyperledger Fabric

HLF networks typically consist of multiple blockchains, each blockchain called a ‘channel’, shared between a subset of the overall participants in the system. We’ll focus on the operation of just one channel/blockchain in this article.

  • A small number of authorized participants can add transactions to the blockchain
  • Typical HLF deployments delegate the responsibility of adding new transactions to the blockchain to node(s) in a single trust domain, usually by using a Kafka queue operated by one organization. This service is called the ‘orderer’ and HLF deployments have a single orderer
  • A larger number of validating ‘peer’ nodes observe and validate the new transactions added to the blockchain by the orderer. Non-validating peer nodes simply observe the changes validated by the validating peer nodes

HLF networks ensure that the blockchain only ever grows linearly by delegating the responsibility of extending the blockchain to a single service called the ‘orderer’.

On the face of it, this level of centralization sounds alarming. But let’s think deeper about it. What damage can a (centrally managed) orderer inflict on the network? An orderer cannot tamper with the transactions proposed by the participants as they are signed with the participants’ private keys and any tampering will be detected. Similarly, it cannot change the history of the blockchain already shared with the rest of the network as all the authenticated transactions are strung together in a tamper-proof list data structure. It can, however, decide to censor certain transactions, delay adding them to the blockchain data structure and decide the exact order in which new transactions are added to the blockchain.

When looked at this way, we see that while an orderer clearly has power over when and whether new transactions are added to the blockchain, it can’t tamper with the transactions themselves nor change the history of the blockchain. A misbehaving orderer can attempt to fork the blockchain by extending the same block in multiple different ways, but such an attempt will be easily detected by the rest of the system, so the orderer has little incentive to try. Having a single orderer in the system leads to greater efficiency and throughput but with the aforementioned trade-offs.
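To see why a fork by the orderer is easy to spot, consider peers that compare the block hashes they received at each height. The helper below is a hypothetical illustration of that comparison, not part of Fabric.

```python
def detect_equivocation(peer_views):
    """peer_views: peer name -> {height: block hash} as received from the orderer.

    Returns the heights at which peers were handed different blocks.
    """
    seen = {}
    conflicts = set()
    for view in peer_views.values():
        for height, block_hash in view.items():
            # The first hash seen at a height is the reference; any
            # different hash at the same height reveals a fork.
            if seen.setdefault(height, block_hash) != block_hash:
                conflicts.add(height)
    return sorted(conflicts)

honest = {"peer1": {1: "aa", 2: "bb"}, "peer2": {1: "aa", 2: "bb"}}
forked = {"peer1": {1: "aa", 2: "bb"}, "peer2": {1: "aa", 2: "cc"}}

no_conflicts = detect_equivocation(honest)
conflicts = detect_equivocation(forked)
```

Because every peer on the channel receives the same chain, a single disagreement at any height is enough to expose the orderer.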

The above trade-off is clearly not acceptable in all situations but some systems might be perfectly okay with this kind of a trade-off. Do you agree?

Corda

Corda is a distributed ledger with a heavy focus on privacy. While on the face of it, it looks fairly different from other blockchain systems, if we look deep enough, we realize that a Corda network consists of nodes sharing many different blockchains, each blockchain typically much smaller in size and only dealing with the evolution of one or a few states.

We’ll refer to these smaller chains as ‘state-chains’ and focus on how the network evolves a particular state-chain. Note that Corda doesn’t use the terminology of state-chain itself; we’re calling them state-chains in this article for the sake of discussion and comparison with other systems.

  • Like private Ethereum and HLF networks, only a smaller number of authorized participants can propose new transactions to be added to the state-chain
  • Unlike Ethereum and HLF, where the entity which proposes the transaction and the entity which adds it to the chain are typically different, in Corda, they are one and the same. The node which proposes the new transaction is the node which ends up evolving the state-chain as well
  • The node which evolves the state-chain shares it with other nodes (typically a small subset of all nodes in the system) on a need-to-know basis and they all observe and validate the complete state-chain upon receiving it

This is an interesting design and raises a few questions.

If the node which proposes the transaction is the same as the node which extends the state-chain, what’s stopping that node from extending the state-chain in conflicting ways, resulting in a fork? This is particularly problematic as the node doesn’t share the extended state-chain with the complete network but only with a few other nodes, on a need-to-know basis. It’s thus possible to imagine that it gives one extended version of the state-chain to one set of participants, and a different, forked version of the state-chain to another set of participants. Since the two sets don’t talk to each other, they won’t be able to detect this fork.

Corda solves this problem by requiring the node to get the blessing of a ‘notary’ (in the form of the notary’s signature) for its extension to the state-chain to be considered valid. A notary is responsible for ensuring that a link in the state-chain is only ever extended once. It’s tasked with remembering that a link has been extended and is supposed to refuse to sign requests to extend it a second time.
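An honest notary's job boils down to remembering which links have already been extended. The class below is a minimal sketch of that bookkeeping. The method name, the state references and the placeholder signature string are all invented for this example and are not Corda's API.

```python
class Notary:
    def __init__(self):
        self.consumed = set()  # references of links that were already extended

    def notarise(self, tx_id, input_refs):
        # Refuse to sign if any input link has been extended before.
        if any(ref in self.consumed for ref in input_refs):
            return None
        self.consumed.update(input_refs)
        # Placeholder: a real notary would return a cryptographic signature.
        return f"notary-signature-over-{tx_id}"

notary = Notary()
first = notary.notarise("tx1", [("tx0", 0)])
second = notary.notarise("tx2", [("tx0", 0)])  # tries to extend the same link
```

The first request is signed and the second is refused, so only one extension of the link can ever be considered valid.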

This solves the above problem but raises a few challenges of its own.

Just like orderers in HLF, a notary can refuse to sign the request to extend a state-chain, leading to a censorship attack.

More problematic, however, a malicious notary can lie and sign requests to extend the same state-chain twice, thus leading to a state-chain fork. Unlike HLF, where any such attempt by the orderer can immediately be detected by the rest of the network as the blockchain data structure is broadcast to all participants, similar behavior by the notary won’t be detected by the rest of the system as the notary doesn’t share its internal information with anyone. The forking attempt, therefore, might never be detected if the two different forks of the state-chain are shared with disjoint sets of nodes.
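The sketch below illustrates why such a fork can go unnoticed. The record layout (`id`, `consumes`, `notary_signed`) is a simplification made up for this example. Each group can only validate the state-chain it was given, and both forks pass that local check.

```python
def locally_valid(state_chain):
    # Each later link must be notary-signed and must consume the link before it.
    return all(tx["notary_signed"] and tx["consumes"] == prev["id"]
               for prev, tx in zip(state_chain, state_chain[1:]))

def consumers_of(link_id, *state_chains):
    # Which transactions, across all the given chains, extend this link?
    return {tx["id"] for chain in state_chains for tx in chain
            if tx["consumes"] == link_id}

issue = {"id": "s0", "consumes": None, "notary_signed": True}
to_bob = {"id": "s1-bob", "consumes": "s0", "notary_signed": True}
to_carol = {"id": "s1-carol", "consumes": "s0", "notary_signed": True}

chain_seen_by_group_1 = [issue, to_bob]
chain_seen_by_group_2 = [issue, to_carol]

group_1_ok = locally_valid(chain_seen_by_group_1)
group_2_ok = locally_valid(chain_seen_by_group_2)
extensions = consumers_of("s0", chain_seen_by_group_1, chain_seen_by_group_2)
```

Only a comparison across both groups shows that the link `s0` was extended twice, and that comparison is exactly what need-to-know sharing prevents.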

Notaries thus require a higher level of trust than orderers in HLF.

One approach to mitigate this risk is to run a notary cluster where the nodes running the notary cluster belong to multiple trust domains, all collaboratively running the service. This mitigates the risk but doesn’t eliminate it as the actors in the different trust domains can collude with each other. Disseminating this information to the entire network would eliminate this risk factor but would sacrifice the privacy guarantees which Corda provides. There are, however, techniques which can prevent malice while still respecting privacy, such as execution inside a trusted execution environment (TEE) or zero-knowledge proofs.

Note that while this comparison might lead you to consider Corda as being less attractive compared to other ledgers due to the higher trust requirements for notaries, do keep in mind that Corda has other desirable properties which we’ll touch upon in a future article. The above analysis should inform your decision when choosing ledgers but shouldn’t be the sole criterion for choosing between them.

Recap

This was a fairly quick and high-level tour of a few popular blockchain systems which attempted to abstract out the details so we can see the similarities and also appreciate the ways in which they differ from each other.

From a data structure point of view, all these blockchain systems are fairly similar to each other. They mostly differ in the kind of data they store on the blockchain and how they achieve transactionality.

From a network point of view, they differ from each other in more interesting ways. Public and private Ethereum networks are (unsurprisingly) very similar to each other and mostly differ in scale and choice of consensus protocols. HLF is fairly similar to Ethereum but centralizes the responsibility of evolution of the blockchain data structure. Corda differs the most in its overall approach to privacy (which we did not discuss in this article) but is similar if we just focus on a single state-chain. Similar to HLF, Corda also centralizes the responsibility of safely evolving a state-chain in a linear fashion to a notary (or notary cluster). Notaries however require a higher level of trust compared to orderers due to privacy considerations.
