Blockchain Myth 2: Blockchains are good for reliably sharing data

By Jan Grabski

9 min read · Jan 10, 2019

This is the second post in a six-part series about prevailing myths in blockchain implementations. The first myth was about blockchain vs. Bitcoin.

Sharing data between organizations and across markets has always been a source of friction. It seems obvious that a system should be created to consolidate business functionality and data across verticals, whether it be for invoice financing, retail currency transfers, interbank settlement, trade finance, or any other multi-sided ecosystem. Isn’t it strange that with the level of cloud-based solutions today such systems aren’t yet omnipresent? After all, it has been feasible to share data using secure online databases for quite some time. Why haven’t the world’s institutions bought into such systems? Could solving these inefficiencies be as simple as putting the data on the blockchain?

Blockchain makes it possible to share data across regions by acting as a central-like trusted database in which to store and from which to fetch that data. But does it offer an improvement? The cloud giants already make secure global data sharing possible with ever-growing ease and blinding scale. Why would a slower type of system, like blockchain, be a better choice? Is the challenge of sharing data securely over today’s Internet really technological? Do we actually still have trouble trusting technology? You probably trust your bank’s website and emailed bills, and you probably also order stuff online. Do you need anything more trustworthy? Odds are, you don’t stand to gain a whole lot from an incrementally more trustworthy data sharing platform.

Am I making a case against blockchain? No, but I am saying that blockchain is no better at sharing data than a cloud-based solution, and it can be argued that it is actually far worse if used in the same way as we would use a cloud-based database.

“Blockchain doesn’t replace databases, cloud infrastructure, rule engines, message queues, caches, and anything else that may be used to build an application, but it can serve to link or bond them across industries and markets.”

For example, the idea of a blockchain platform being used as a trusted document or business process data repository is perhaps the most unfortunate misuse of blockchains. Can a data repository be implemented on blockchain technically? Yes. Should it be done? Here, I encourage you to draw your own conclusions.

Blockchain is like mortar between bricks

Remember how I said in the article on Myth 1 that blockchain is a bonding medium? Let’s ponder for a moment what “bonding medium” might mean. We could say that mortar is a bonding medium. How would that analogy work? Well, if blockchain is mortar, then it doesn’t do bricks, and data are the bricks. Bear with me.

The first thing you might have learned about building walls with bricks is that there is this thing called mortar used to bond them together. The role of mortar is to fill gaps between the bricks, to spread load evenly, and to bond the bricks together, so that the wall doesn’t fall over. If you’re anything like me, you might have thought “if this mortar thing is so pliable and strong, why use bricks at all?”¹

Well, one reason not to build with mortar is that mortar is bad at maintaining shape before it sets. Another reason is that mortar is heavier and more expensive than bricks. Yet another is that bricks are better insulators. In short, even though you might be tempted to build your house with only strong and pliable mortar, I and any sane contractor will strongly advise you to reconsider. So it is with blockchain. It doesn’t replace databases, cloud infrastructure, rule engines, message queues, caches, and anything else that may be used to build an application, but it can serve to link or bond them across industries and markets.

Can blockchains scale well enough?

Blockchain cannot be better than traditional technology at some things that traditional technology already does well. What are these things? Vitalik Buterin’s scalability trilemma observation might suggest some answers.

To do this, let’s broaden the scope of the trilemma and think of it as not just related to blockchains, but to computer systems in general.

Above is the scalability trilemma plotted out on a sort of flat three-axis graph.² As a fun example, the Voyager Golden Record is extremely secure (hard to modify), totally centralized, and not scalable at all.

At the bottom of the graph we have private systems, which are the traditional types of systems that we can own and operate. We can make them quite safe by putting them inside a military bunker, for example, and we can make them scale by filling the room with as much hardware as we care to buy. These systems are not very decentralised, so if we care about the information stored on them, we need to trust their custodian and keep off-site backups. We can also make these systems physically quite private by limiting access to viewing what is stored on them. Even without encryption, these systems naturally lend themselves to security and privacy using limited physical access and rights management. This is why banks and government agencies use these systems extensively.

Next, on the right hand side of the graph are systems that are native to the Internet, which could be classified as public cloud and serverless cloud. These systems are multi-computer and multi-location. They can scale very well, since an algorithm could run across as many machines as are available. These systems are also decentralized, because the computing can be done across many geographical locations. A good example of a decentralized and reasonably scaled out system is BitTorrent. Some of the feats Google and Amazon do with their serverless code execution are both decentralized (across locations) and scaled to an impressive degree. These systems are very fast, but we have limited control over where our data goes and who can see or modify it, so in principle they are neither private nor secure and do not naturally lend themselves to privacy and security. These types of systems are loved by small startups that aren’t too concerned with privacy and security.

“Blockchains are slow-moving steamrollers of the Internet, and everything they roll over gets irreversibly impacted into the Internet’s substrate.”

Lastly, blockchains lie on the left hand side of the graph. They are a new type of computer system that is very trustworthy and as decentralized as we care to make them. This type of system uses a mix of data replication, cryptographic validation, and economic incentives to ensure immutability and it spreads itself across many locations to provide resilience. It has one problem though, which is that making and validating all those copies necessarily make it slower and less efficient than the other types of systems, since both private and cloud platforms can minimize the number of places that their data are stored and processed. As a new type of system, blockchains achieve their function by giving up the freedom to minimize the number of copies and by revealing relevant data to as many validators as possible. Blockchains are the darling of decentralized trans-jurisdictional marketplaces. Blockchains are slow-moving steamrollers of the Internet, and everything they roll over gets irreversibly impacted into the Internet’s substrate.

Beware of dragons

Looking at these three types of systems, we can observe another progression. Initially, general purpose computing was best done on private systems, while cloud was limited to application-specific uses, like Google’s web search box or email. As the Internet matured, cloud computing was made available for general purpose computing through public cloud platforms. Initially, blockchains were also application-specific, but Ethereum changed that by exposing the blockchain as a general purpose computing platform: it implemented a Turing-complete virtual machine (a practical stand-in for the Turing machine, the theoretical model for a general purpose computer) inside it.

But just as general purpose cloud isn’t great at privacy, general purpose blockchain computing isn’t great at scaled distributed computing.

And so, the addition of a Turing machine to the blockchain opened the floodgates to misinterpretation of its benefits. If you ever wondered about why Bitcoin maximalists sigh at Ethereum, it is this. They’re not correct about Ethereum being inferior to Bitcoin, of course.³ But they have a valid point in that a blockchain isn’t a dumping ground for code and data. Because blockchains look so much like other general purpose computers, there is now widespread belief that the value of modern blockchain 2.0 technology for organizations is as a new general purpose storage with a general purpose CPU to replace P2P data sharing, workflows, document storage, and complex business logic and anything else that we would want to run in the cloud. On that path there be dragons.

“Blockchains are not designed to be all-purpose databases, file sharing systems, or personal computers, and that’s a good thing.”

Every digital technology that we use these days contains storage and a CPU, from a smartphone to a wireless computer keyboard, yet we don’t try to make the keyboard crunch workloads. Why is everyone all of a sudden asking blockchains to do everything?

Blockchains are not designed to be all-purpose databases, file sharing systems, or personal computers, and that’s a good thing. They are optimized to share unalterable facts that are validated against a fact history. We can even do some logic on these facts, but not that much (yet). Even though we could now think of blockchains as general purpose computers, we will be quickly disappointed with their performance and storage capacity. Because blockchains are massively redundant by design, they will never be as fast as systems that can minimize redundancy to achieve speed, and that’s ok. We can achieve these general storage and processing tasks on traditional private or public cloud systems very well already.
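To make "facts that are validated against a fact history" a little more concrete, here is a minimal sketch of a hash-linked, append-only log in Python. It is an illustration only: the names `entry_hash`, `append_fact`, and `is_valid` are made up for this example, and it leaves out everything that makes a real blockchain a blockchain (replication, consensus, and economic incentives). It only shows why altering an old fact is detectable.

```python
import hashlib
import json

# Hypothetical placeholder for "no previous entry"; not taken from any real chain.
GENESIS = "0" * 64


def entry_hash(prev_hash: str, fact: str) -> str:
    """Hash a fact together with the hash of the entry that precedes it."""
    payload = json.dumps([prev_hash, fact]).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_fact(chain: list, fact: str) -> None:
    """Append a fact, linking it to the current tip of the history."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"fact": fact, "prev": prev, "hash": entry_hash(prev, fact)})


def is_valid(chain: list) -> bool:
    """Recompute every link; any edited fact breaks the chain from that point on."""
    prev = GENESIS
    for entry in chain:
        if entry["prev"] != prev:
            return False
        if entry["hash"] != entry_hash(prev, entry["fact"]):
            return False
        prev = entry["hash"]
    return True


history = []
append_fact(history, "invoice 17 issued")
append_fact(history, "invoice 17 financed")
append_fact(history, "invoice 17 settled")
print(is_valid(history))

# Quietly rewriting an old fact is caught on the next validation pass.
history[1]["fact"] = "invoice 17 cancelled"
print(is_valid(history))
```

Note that every validator has to hold and re-check the whole history, which is the redundancy that keeps this kind of system slower than a database that stores one copy.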

Blockchains are great at storing proofs

Before we collectively shed a tear for blockchain scalability, it’s important to understand that blockchains will get faster and gain capacity with time, but this will most likely be to a much smaller extent than traditional private and cloud systems. So what can we do with this insight? Quite a bit.

A useful mechanism (design pattern) for extracting value out of a blockchain is to use it as a store of truth about information objects that do not reside on the blockchain. An example could be a bill of lading in a trade finance use case, a representation of a physical asset, like a house, or an abstract financial instrument, like an MBS (Mortgage Backed Security).

As a simple example, using a data structure called a Merkle tree, we can store a single Merkle root value on a blockchain and have it serve as a proof of existence for an almost arbitrary number of bills of lading at a given point in time. At any moment that we need to use the bill of lading, we can first validate it by verifying that it exists within a Merkle tree whose root is registered on the blockchain and then proceed to process it as verified. After concluding our processing, we can store the new information in a similar Merkle tree that is registered on the blockchain and send the document on its way to the next process step. While at first glance this mechanism might seem limited, because it doesn’t allow us to process the document on-chain, it is in fact not necessary to have the data on-chain to gain most of the value out of the use case. There’s more to this example and it deserves a separate paper, but suffice it to say that blockchains are best for sharing proofs, not information. How do I verify that my copy of data is the same as the official copy? I use a blockchain. How do I share the data itself? Not via blockchain.

¹ Before you stop me and say that walls can also be built of concrete or drywall, I’ll have to say that building walls out of other materials that are as good as brick walls is no picnic, so let’s stick to the brick and mortar analogy for now.

² The astute Computer Science major in you might see a parallel between the above diagram and the CAP Theorem, and you would be right. The scalability trilemma is another take on that theorem and offers Brewer’s observation with a vocabulary of concepts that are more aligned with how we think about systems in practice.

³ Ethereum is a general purpose computer next to Bitcoin’s accounting calculator functionality. You might quip that a computer is just a glorified calculator, and you would be right, but the concept of the Turing machine implemented on top of a calculator takes the calculator to another dimension of capability that prompts the creation of a new word to describe it, which we now colloquially call the computer. And so, Ethereum is a new type of blockchain that takes concepts pioneered in Bitcoin to create a new concept that could be called the blockchain computer or world computer, as some would call it.

Note: The views presented in my articles are my own.

Accreditation: The material for Blockchain Myths (of which this post is a part) was developed at ConsenSys with the fantastic feedback and help of many individuals, including: Tee Ganbold, Zunaira Arshad, Arielle Schnaidman, Brett Li, Micah Dameron, Chris Leishman, Van Sedita, John Wolpert, Jeff Gillis, Jérôme de Tychey, Ray Valdes, Igor Lilic, and other great people roaming cryptoland.
