Thoughts on the taxonomy of blockchains & distributed ledger technologies
The recent discussion of R3’s public disclosure that Corda is a distributed ledger, not a blockchain, as well as Hyperledger fabric’s move from v0.6 to v1.0, has spurred me to go much deeper into my own understanding of the technical foundations of distributed ledger technologies, and to reframe some of my hypotheses about these technologies. This has led me down the road of three minor projects. The first is a general framework to evaluate the suitability of use cases on blockchains and distributed ledger technologies, which I hope to share in the coming weeks. The second is a fun, but admittedly impractical and insecure, implementation of a rudimentary proof-of-work blockchain client using Microsoft Excel. The third is an attempt to begin collating my thoughts on the categorisation of some of the more widely followed blockchains and distributed ledger technologies by a few of their architectural design choices. Dave Acton and I plan to expand on this massively over the coming weeks and months as part of our own drive to help answer some of our clients’ (and our own) questions about which technologies and use cases may, or may not, be suitable on a DLT, or indeed on a blockchain.
So, first question: why should one care about taxonomy? In other fields, for instance biology, our ability to classify and group plants and animals led to significant breakthroughs in our collective understanding of the world around us. Carl Linnaeus (known as Carl von Linné after his ennoblement by King Adolf Frederick of Sweden) first published his pivotal Systema Naturae in 1735, popularising what would later become known as the Linnaean taxonomy, which categorises different life forms into a rank system of orders, families, genera, and species. His work and system helped give Charles Darwin the tools that eventually led to his theory of evolution.
Taxonomy of software is a fairly recent phenomenon, with aims to classify and better understand what has become such an important part of our everyday lives. Vegas, Juristo and Basili (2009), put it simply in their “Maturing Software Engineering Knowledge through Classifications: A Case Study on Unit Testing Techniques”:
Classification makes a significant contribution to advancing knowledge in both science and engineering. It is a way of investigating the relationships between the objects to be classified and identifies gaps in knowledge. Classification in engineering also has a practical application; it supports object selection. They can help mature software engineering knowledge, as classifications constitute an organized structure of knowledge items.
This ability to judge the strengths and weaknesses of technology in an objective manner is perhaps doubly important in an environment where many participants have taken an almost religious devotion to their technology of choice, occasionally to the point of completely losing their objectivity about what the technology may be capable of. This is hardly surprising given the millions, or even billions, of dollars on the line.
And while there are a lot of factors that one could eventually include in a full-fledged study of the taxonomy of DLTs, two major factors were particularly interesting to me. The first is the data diffusion model of DLTs, effectively separating the blockchain species from the DLT genus. The second factor is the functionality of each ledger system, and specifically at what level in the stack that functionality is included. I finish by sharing a few initial thoughts on some of the advantages and disadvantages of each type of system, which I will further explore in the coming weeks.
Exploring the first factor, data diffusion, it would be remiss of me not to include a brief hierarchy of the families and genera of technology into which distributed ledgers and blockchains themselves fall.
To date, the most complete deep dive into this topic that I have come across was published by Sebastien Meunier in December 2016. In his article, Sebastien separates centralised relational databases (RDBMS) from distributed databases (DDBMS). From the user’s point of view, both systems offer a similar experience, in that data is organised (at least conceptually) into tables. One of the major differences between these two technologies comes down to a choice that is hidden from the average user: infrastructure. Where a centralised database relies (ultimately) on a single authoritative copy of the database held on a single device (though redundant copies are often kept in archives), a distributed database system uses multiple devices connected to a common network (e.g., the internet), often physically separated from one another, and presents the end user with a single experience (hopefully). There are lots of reasons for and against distributed and centralised infrastructures, including speed, redundancy/data back-up, access to multiple data sources, complexity, etc.
After having examined where one’s data needs to live, the next taxonomical task is to look at the dichotomy that splits the database and ledger genera. The first point I will include here is the relatively obvious separation: whilst a database could theoretically hold any type of data (think of a giant Excel sheet, or a list of phone numbers), a ledger holds data pertaining to balances, linked to accounts. This can be something much larger than pure finance; indeed, many applications outside of finance utilise ledgers. For instance, mobile phone companies register prepaid phone credit (minutes) against accounts (phone numbers), adding or subtracting that credit based on usage and top-ups. Early internet service providers (ISPs) had a very similar operational process (remember AOL discs with 1000 minutes?). Now that we have established that the subset of ledgers does in fact exist, the question is why distributed ledgers do not operate in the same way as distributed databases. Richard Gendal Brown shared an excellent answer to this. Paraphrasing, and at the risk of oversimplifying his excellent post (please go read it!): users of distributed databases don’t usually have a reason to cheat each other, while distributed ledger users often do. A quick side note here: much of the functionality of distributed ledgers could (returning to the family-level separation) utilise a centralised architecture to build a platform of “shared ledgers”, effectively demonstrated by multiple organisations using their own logins to update a single copy of a centralised ledger (i.e. Xero or Google Sheets). But the problem one is looking to solve there is vastly different, and it leaves a significant reliance on the operator of that single service.
So now for the contentious bit: what species of distributed ledger technology is a blockchain, and what the heck is Corda if not a blockchain? Antony Lewis recently shared a concise view, on which I wanted to elaborate. We know that distributed ledgers must have 1) a ledger, which 2) multiple parties (entities) use, and 3) is stored across multiple locations. A blockchain is all three of these things, but where blockchains differ from Corda is in how much data they share, and with whom they share it. The first implementation recognised as a blockchain was Bitcoin; part of its resiliency and ability to function is borne out of the fact that anyone with access to the internet has access to every transaction that is taking, or has ever taken, place. I call this model universal data diffusion. Universal data diffusion in many implementations gives perfect transparency, and allows Bitcoin to operate without a central validator. A by-product of this perfect transparency is that the system becomes extremely resilient: if nodes and miners are sufficiently decentralised AND properly incentivised, it becomes infeasible to shut down the network or make unauthorised changes. This is further bolstered by the ability for anyone to join without permissioning; if a concerted effort were made to shut down a public blockchain like Bitcoin, there would likely still be at least a few copies hidden somewhere that could be used to reconstruct the network. Think of those apocalypse movies where people fight their way out of the city through zombies, find a group holed up in a mountain compound via a radio broadcast, and then recreate human civilisation (it rarely goes that smoothly, so let’s not assume that recreating Bitcoin this way would be either).
Of course, it nearly goes without saying that in order to keep everyone honest and make sure that we all play nicely (save the occasional Twitter/Reddit/Bitcoin Uncensored fight over design specs), these things need a native token with a real-world value in order to function.
There are, however, drawbacks to these systems. First, every transaction is public. In addition, many copies need to be kept, so the amount of data and data processing that the network can handle needs to be managed. Functionally, these systems take new candidate transactions, which are broadcast to other parties in the network, and group them into blocks at semi-fixed intervals, to be validated by a nominated validator; each block is then incorporated into the transaction record that all parties can hold. The process then starts anew, taking further candidate transactions, new or yet-to-be-confirmed, over another interval of time, and selecting a new validator to update the record that everyone shares. To recap: transactions are grouped into blocks, and everyone can see everything.
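A minimal sketch of that grouping-and-chaining process (hypothetical Python, with illustrative names; no proof-of-work or signatures): candidate transactions are batched into a block, and each block commits to its predecessor by hash, which is what lets every participant verify the shared record.

```python
import hashlib
import json

def block_hash(block):
    """Deterministic hash of a block's contents (illustration only)."""
    payload = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()

def make_block(prev_hash, transactions):
    """Group candidate transactions into a block linked to its predecessor."""
    return {"prev_hash": prev_hash, "transactions": transactions}

def chain_is_valid(chain):
    """Every block must commit to the hash of the block before it."""
    for prev, curr in zip(chain, chain[1:]):
        if curr["prev_hash"] != block_hash(prev):
            return False
    return True

# Build a tiny three-block chain from batches of candidate transactions.
genesis = make_block("0" * 64, [])
block1 = make_block(block_hash(genesis), ["alice->bob: 5"])
block2 = make_block(block_hash(block1), ["bob->carol: 2"])
chain = [genesis, block1, block2]
```

Tampering with any earlier block changes its hash, so every later block’s `prev_hash` link breaks, which is what makes unauthorised changes detectable by anyone holding a copy.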
Other implementations of blockchains take some of the features of a Bitcoin-style blockchain and restrict visibility. The first is the simplest: rather than letting everyone in the world access the blockchain, we restrict who can access the network, see transactions, and initiate transactions. This is firmly the domain of the much-hyped “blockchain, but without Bitcoin”. This model still operates by universal data diffusion, but does not allow every Tom, Dick and Harry to become involved (Monax’s eris db fits in here). While these systems offer some level of privacy from the outside world, they still require the participants to show everything to everyone that has access to their blockchain. And while these usually have fewer participants, and hence data is replicated less, scalability is still a consideration. Likewise, the cockroach-like resiliency of having many hidden and incentivised actors maintaining data integrity may no longer exist. On the plus side, by cordoning off access, we can know who is on the network, and are thus less susceptible to Sybil attacks (attackers pretending to be multiple parties on a pseudonymous network), and thus may be able to use weaker consensus rules (i.e. get rid of proof-of-work and native tokens).
Still other implementations of blockchains seek to obscure the data flowing around in the network by mixing or encrypting transactions. This can be done on public blockchains (e.g., Monero, ZCash) or permissioned blockchains (generally through encryption rather than mixing, e.g., Hyperledger fabric v0.6). These features, though big improvements for privacy, do not fix, and may even work against, the scalability issue of massive data replication inherent in blockchains, by creating more complex transactions.
Taking a step back, another approach to tackle both privacy and scalability is to 1) limit who receives each transaction by splitting the network into channels or sub-ledgers (data segregation), and 2) require consensus on the state of a sub-ledger only from the parties within that channel (channel independence).
Now this model is clearly something different from blockchains, and while it is not a heretical rejection of all that is good and holy and blockchain, it does require that we relax some of the assumptions about transparency, security and assumed immutability that blockchains strive for. This new class of DLT, which I like to think of as “distributed multi-channel ledgers” (DMCLs), shares some traits with Open Transactions. Given the multi-channel nature of these systems, both could be configured in a way that allowed them to be a “true” permissioned blockchain according to my classification, where all participants in the network could view everything about transactions in a channel (conceptually, it is a bit like a #general channel on Slack). In fact, Corda actually uses a function analogous to this to create a transaction notary service, whereby smaller channels can send transaction records to ensure independent auditability and uniqueness (remember, we are in a permissioned space, so double-spend is less of an issue).
Now, while channels are great for keeping unwanted parties out, which improves confidentiality from those parties as well as scalability for the system as a whole, this construction doesn’t come without a few potential complications. Effectively, much like a traditional blockchain without some form of transaction encryption, each channel works a bit like a party line for all channel users, meaning that channels need to fit a specific purpose and not have extra users. But as one segregates users into groups, either in a channel or out of it, the number of total channels in a system needs to have the potential to cover all eventualities and all possible combinations of connections. As an example, let’s say that in a DMCL network with participants A, B, C, D, E:
- for all transactions in which A, B, C or D are involved, E should also be a party; and
- all transactions take place between at least two parties (not including E).
This means that one’s system needs C(N-1, 2) + C(N-1, 3) + C(N-1, 4) = 6 + 4 + 1 = 11 channels to cover all of our bases:
- A, B, E
- A, C, E
- A, D, E
- B, C, E
- B, D, E
- C, D, E
- A, B, C, E
- A, B, D, E
- A, C, D, E
- B, C, D, E
- A, B, C, D, E
That’s all well and good for a system with a small number of participants, but if we do the same for a network of, say, 75 participants, we need to handle the potential for ~3.778 × 10^22 channels to be created (to put that in perspective, that is roughly 4,000 times the estimated number of calls that Google needed to find a SHA-1 collision). Limiting that to only combinations of between 2 and 10 individual participants (fairly common for capital markets transactions such as book-running an IPO or new bond issuance, or an allocation of an FX spot trade by a mutual fund manager), that number drops to a mere 5,871,831,778,845 (~5.9 trillion).
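The channel-counting argument above is just a sum of binomial coefficients, which is easy to check (a hypothetical Python sketch; the function name is mine). With 4 ordinary participants plus observer E, summing the subsets of size 2 to 4 reproduces the 11 channels listed; letting subset size run free shows how the count explodes towards 2^N as the network grows.

```python
from math import comb

def channel_count(n, min_size=2, max_size=None):
    """Number of distinct participant subsets (i.e. channels) whose size
    lies between min_size and max_size, out of n participants. In the
    A-E example above, n counts the non-observer parties (A-D) and the
    observer E is added to every channel."""
    if max_size is None:
        max_size = n
    return sum(comb(n, k) for k in range(min_size, max_size + 1))

# The five-party example: subsets of {A, B, C, D} of size 2-4, each plus E.
print(channel_count(4))  # -> 11

# Unrestricted subset sizes grow like 2^n: for 75 parties this is
# 2**75 - 76, on the order of 3.8 x 10^22 possible channels.
print(channel_count(75))
```

The closed form for the unrestricted case is 2^n minus the subsets of size 0 and 1, i.e. 2^n - n - 1, which is why capping channel size (as in the 2-to-10 participant restriction above) is what keeps the number merely in the trillions.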
So, to recap: DMCLs are excellent for scalability and confidentiality, but potentially operationally complex. They will likely thrive in environments where lots of transactions need to be conducted between multiple entities which don’t fully trust one another (but know each other’s identity). Unsurprisingly (given that R3 and their 70-odd banking participants helped shape the architecture), this is a pretty good description of the real-life environment for a lot of ‘core banking’ operations: things like loans, mortgages, deposits, and payments. It also fits quite well for a lot of capital markets (investment banking) business, things that are highly bilateral. However, given the necessity for multiple independent entities to coordinate (read: reconcile) the output of ledgers for more complex, multilateral transactions (including use cases like clearing, fund administration, fund distribution, primary issuance, repo, etc.), more functionality may need to be added to these systems (perhaps something akin to Hyperledger fabric v0.6’s transaction-level encryption…).
Because I’m a big fan of visualisation (and because it will make my point in the next section) I attempt to place a range of better known DLTs on a spectrum to show where they fit into my model:
Logical question at this point: “why is ZCash a bit to the right of Bitcoin, Ethereum and Monero?” Well, given that ZCash uses zero-knowledge proofs (don’t ask me to explain them), it can share unreadable data with other participants over the public network. So while the data is shared, it isn’t really shared. A similar set of logic explains why Hyperledger fabric v0.6 (transaction encryption) is slightly to the right of eris db (unless they have enabled transaction encryption without my knowing, which is entirely possible). With regard to Hyperledger fabric v1.0, it is multi-channel, but less multi-channel than Corda. This is due to the fact that Hyperledger fabric v1.0 uses orderers, who have full visibility of sub-ledgers but don’t otherwise participate, whereas Corda is more independent, allowing ledgers to operate without any outside party, only occasionally invoking notary services. This distinction, again, doesn’t mean that you couldn’t set up either Hyperledger fabric v1.0 or Corda to have universal data diffusion (i.e. become a “blockchain”, albeit with a liberal interpretation of what a “block” is; to be more pedantic about it, a “distributed single-channel ledger”, but let’s agree to settle on calling it a “blockchain” here). It is also worth mentioning that all of the public blockchain technologies on the left could be built in a way that allowed them to operate as permissioned blockchains.
The second factor that I wanted to touch upon (I promise that this won’t be nearly as long as the first) is the functionality of the ledger system. Broadly speaking, there are two types: stateless systems with limited ledger functionality (typically, but not always, taking the UTXO form) and stateful systems which allow for greater on-chain functionality (i.e. smart contracts, or chaincode). Again, both designs have advantages and drawbacks.
The stateless DLT system is, again, best represented by its initial form, that is to say Bitcoin. If we look at what one can do with the original tools of Bitcoin, one would have to concede that the functionality is limited to a pretty basic set of functions: 1) generate new coins (mining block rewards), or 2) send coins to another public address (Pay-to-PubkeyHash). Later functions were included to send coins to a script, allowing the creation of more complicated functions, including multi-signature addresses (Pay-to-Script-Hash). The major advantages of this design are simplicity and fewer attack surfaces. The relative simplicity ensures that fewer things can happen on the ledger, which in turn means that there is less data on the ledger, ceteris paribus, than would otherwise occur, which means better scalability. A drawback, however, is that because of this relative lack of functionality, adding more complex logic needs to be done externally (or by using cryptographic primitives, but let’s set that aside for now).
There are multiple ways to do this, which I won’t get into in this post, but what is key to remember is that this logic is neither guaranteed by the DLT environment nor automatically visible to the same parties as the transactions. This logic can thus be said to be an entirely independent process. This is good for some reasons: a) it means that not everyone has to process everyone else’s business logic, b) we can keep our business reasons for doing a transaction hidden from the rest of the DLT participants, and c) it is easier to fix mistakes before they hit the ledger. It also has downsides, including: a) business processes can be changed, potentially without notifying the counterparty to the transaction, b) auditing transactions must include external data sources, and c) adding additional functionality to the DLT system requires acceptance by all parties to the DLT. Oversimplifying things greatly, all business logic can be said to be done at the application level (i.e. bolted on to a process that reads and writes to a DLT). Aside from Bitcoin, most major public blockchains employ a stateless architecture, including Monero and ZCash. In the permissioned DLT space, Corda also uses this model, and it is worth noting that Hyperledger fabric v0.6 can be configured to mimic this as well, but is not stateless by default.
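To make the stateless, UTXO-style ledger concrete, here is a heavily simplified and hypothetical Python sketch (no signatures, scripts, or networking, and the helper names are mine): the ledger holds no account state at all, only a set of unspent outputs, and a transaction atomically consumes some outputs and creates new ones.

```python
def apply_tx(utxos, tx_id, spent_ids, new_outputs):
    """Consume the unspent outputs named in spent_ids and create
    new_outputs. utxos maps output_id -> (owner, amount); new output
    ids are (tx_id, index). Raises if value is created from nothing."""
    consumed = sum(utxos[i][1] for i in spent_ids)  # KeyError if already spent
    created = sum(amount for _, amount in new_outputs)
    if created > consumed:
        raise ValueError("outputs exceed inputs")
    next_utxos = {i: out for i, out in utxos.items() if i not in spent_ids}
    for index, out in enumerate(new_outputs):
        next_utxos[(tx_id, index)] = out
    return next_utxos

def balance(utxos, owner):
    """An account 'balance' is just the sum of outputs one can spend."""
    return sum(amount for own, amount in utxos.values() if own == owner)

# A coinbase-style output, then a payment with change back to the sender.
utxos = {("coinbase", 0): ("alice", 50)}
utxos = apply_tx(utxos, "tx1", [("coinbase", 0)],
                 [("bob", 30), ("alice", 20)])
```

Note that nothing here models the business reason for the payment: as the paragraph above describes, any such logic lives in a separate application-level process that merely reads and writes outputs like these.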
On the other end of the spectrum is the stateful model. The best known of these is Ethereum, with its quasi-Turing-complete Ethereum Virtual Machine (EVM). This system allows participants to create nearly any imaginable functionality directly on a DLT, receiving inputs from the real world via an oracle service. The benefits and drawbacks are the inverse of the stateless system’s. Good: a) business logic is (assumed to be) immutable, b) auditing can become easier (assuming all required information is included in the code), and c) there is less need to include new functionality. Bad: a) everyone has to process everything (speed and data scalability), b) anyone looking at anyone else’s chaincode can guess what they will do next, which means they can front-run those moves, and c) there is more attack surface (see “The DAO”).
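The “everyone has to process everything” point can be sketched as follows (hypothetical Python; the toy contract and function names are mine, not any real chaincode API): a contract is just a deterministic function from (state, transaction) to a new state, and every validator replays the same ordered transaction log, so each node independently arrives at the same state while each also bears the cost of executing everyone else’s logic.

```python
def token_contract(state, tx):
    """Minimal on-ledger logic: move `amount` between account balances.
    Deterministic, so every node computes the same result."""
    sender, receiver, amount = tx
    if state.get(sender, 0) < amount:
        return state  # insufficient funds: reject silently in this sketch
    new_state = dict(state)
    new_state[sender] -= amount
    new_state[receiver] = new_state.get(receiver, 0) + amount
    return new_state

def replay(contract, genesis, txs):
    """What each validator does: apply the shared log to the shared
    genesis state, one transaction at a time."""
    state = dict(genesis)
    for tx in txs:
        state = contract(state, tx)
    return state

# Two independent nodes replaying the same agreed log converge.
log = [("alice", "bob", 10), ("bob", "carol", 4)]
node_a = replay(token_contract, {"alice": 25}, log)
node_b = replay(token_contract, {"alice": 25}, log)
```

Determinism is the crux of the design: any non-deterministic contract (a random number, a local clock) would make honest nodes diverge, which is why real systems gatekeep what on-chain code may do.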
In practice, given the cost of having everything done on-ledger, many processes will still be done off-ledger, and only the things that require the greater certainty of being published will be. Those processes which are published to the ledger for validators (miners) to process can be said to be kept in the validation layer. In addition to Ethereum, eris db, Hyperledger fabric v0.6 and Hyperledger fabric v1.0 are based on stateful architectures. Here’s a helpful diagram that I made earlier:
There is a key point to note: even if a DLT is built on a stateless model, with limited on-chain functionality, there is absolutely no reason that the DLT couldn’t be packaged with a corresponding logic layer. In fact, this is exactly what Corda does: its “CorDapps” include lots of complex business processes which are able to read and write to the ledger but handle the computing outside the DLT environment. This model could even make sense for the more stateful ledgers, limiting the diffusion of information about intentions and reducing the amount of data that needs to be handled by the DLT network. Simon de la Rouviere explores the potential of this model in Interplanetary Linked Computing: Separating Merkle Computing from Blockchain Computational Courts.
Now, combining my visual from earlier, with the dichotomy of ledger functionality models, we arrive at my first attempt to classify DLT species:
And with my final piece of running commentary, I will mention that the degree to which each implementation moves away from zero on the y-axis represents a slight (and admittedly perceived) degree to which functionality is increased or decreased. While Ethereum requires gas to prevent a miner from becoming entangled in executing an endless loop, Hyperledger fabric v0.6 and v1.0 do not do this. ZCash offers a slight increase in functionality versus Bitcoin or Monero by allowing multiple types of transactions, ranging from public to completely private.
So with that, I will finish up by saying that, like many technologies in the world, there are different varieties. Some are more suitable in certain circumstances than others. While some may deviate (greatly) from their ancestors, that does not make them any less useful, though perhaps they may not be useful in solving the problem that the original was designed to solve. I, for one, wouldn’t spend my time trying to use Hyperledger fabric v1.0 to build a “purely peer-to-peer version of electronic cash [that] would allow online payments to be sent directly from one party to another without going through a financial institution”. But by the same token, I wouldn’t envisage placing my medical history or the deed to my house on Bitcoin’s blockchain either.
[UPDATE]: If you aren’t convinced of this and still think that there should only be one type of DLT, perhaps consider the NOSQL ecosystem: http://nosql-database.org/
Codicem meum pactum. (My code is my bond.)