Blockchain Myth 3: Public blockchains do not offer privacy

brrabski
Feb 9, 2019

This is the third post in a six-part series about prevailing myths in blockchain implementations. The previous myths are:
1. blockchain vs bitcoin,
2. blockchains and data sharing.

“Everything on a public blockchain is seen by all participants, so privacy must be specially engineered into the protocol.” This sounds reasonable, but it is so grossly misleading that it is hard to know where to begin.

It might be prudent to start by reminding ourselves that the whole idea of a blockchain is to provide a common record of facts that is validated by all participants. In a sense, blockchains must be able to prove to their users that they are honest. This means that some things in a blockchain need to stay open for the blockchain to function at all.

What are these necessary open elements?

Figure: Realms of information that blockchains and classic databases deal with well.

In the first myth article we argued that cryptocurrencies and blockchains cannot be separated without making the blockchain a pointless device, much like a pendulum clock without its pendulum, which therefore cannot tell time.

Blockchains and cryptocurrencies are part of a singular invention that combines an open mechanism and an economic phenomenon to provide a new type of service, an economic medium. This new medium allows us to rely on an internet-native permanence that we do not need to pay to maintain as a user.

Blockchains become an economic medium by making the information stored on them internally consistent and possible to validate with open standards, and then by securing this validation and rewarding the security with a native token.

If we look at the data and validation handling features built into the most successful blockchains, like Bitcoin or Ethereum, we can see that they necessarily need to be openly verifiable in some way. This open verification makes keeping private data on a blockchain particularly tricky, and perhaps even pointless. After all, if your data is meant to be private, what is the purpose of running it through a public verification mechanism that shouldn’t have access to it?

To be sure, there exist techniques for getting around the privacy problem for on-chain information, like zero-knowledge proofs, but these solutions process aspects of the information rather than the information itself. For example, a zero-knowledge proof can verify that a balance after a transaction is equal to or above zero without knowing the actual balance or the amount of the debit transaction. This happens without showing a validator the balance even in encrypted form.

Although these solutions allow us to weave around on-chain privacy, they do not actually store the information on-chain. They require the user to keep crucial pieces of the puzzle off the chain anyway, for each piece of data we care about. That is, the chain logic knows your balance isn’t negative, but only the user has the data to prove what the balance actually is.
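To make the split between on-chain checks and off-chain data concrete, here is a minimal sketch of a related building block, a Pedersen-style commitment. It is not a zero-knowledge range proof and it does not prove that a balance is non-negative. It only shows that a validator can confirm "balance before = debit + balance after" from public commitments while the amounts and blinding factors stay with the user. The parameters P, G and H and the function names are assumptions chosen for illustration and are not secure.

```python
import secrets

# Toy parameters, for illustration only and NOT secure. A real scheme uses a
# vetted group and a generator H whose discrete log relative to G is unknown.
P = 2**127 - 1  # a Mersenne prime used as the modulus
G = 3
H = 5


def commit(value: int, blinding: int) -> int:
    """Return the commitment G^value * H^blinding mod P, which hides `value`."""
    return (pow(G, value, P) * pow(H, blinding, P)) % P


def validator_accepts(c_before: int, c_debit: int, c_after: int) -> bool:
    """Check before == debit + after using only the public commitments."""
    return c_before == (c_debit * c_after) % P


# The user keeps the amounts and the blinding factors off-chain.
debit, after = 30, 70
r_debit = secrets.randbelow(2**64)
r_after = secrets.randbelow(2**64)
before, r_before = debit + after, r_debit + r_after

# Only these three numbers would be published.
c_debit = commit(debit, r_debit)
c_after = commit(after, r_after)
c_before = commit(before, r_before)

print(validator_accepts(c_before, c_debit, c_after))
```

The validator never sees 30, 70 or 100. If the user loses the values and blinding factors, nobody can recover the balance from the chain, which is the off-chain dependency described above.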

“How to store private data is not a problem that should be solved by blockchains, and that’s ok.”

In addition to the need to store private data off-chain, privacy chains also carry limitations and computational overheads that make them even slower than blockchains already have to be. Furthermore, it is very unlikely that such zero-knowledge schemes will ever match simpler blockchains in speed (they’re objectively more complex to compute), and we end up with somewhat of a dead end. So what are we to do?

The engineering approach to the openness problem of blockchain is to patch it to handle private data and/or to limit access to the chain itself using firewalls and access control mechanisms. These techniques are a reflex that we have from database management, and they force the blockchain into a contortion that runs against its fundamental design (open validation) without solving our privacy problem all that well.

Let’s consider on-chain encryption for a moment. Blockchains are permanent records, so on-chain encryption means storing encrypted data permanently in a shared and distributed location. Assuming we actually care about the privacy of our data, we will need to consider that:

  1. Any shared location over which we do not have sufficient control will eventually get shared with parties that we can’t control. This means that we must accept that encrypted data that we put in the shared location will never be deleted AND there is a high chance that it will eventually leak. In fact, the only way to ensure it never leaks is to find all the copies and destroy them. And so, once we put encrypted data on a blockchain, it will hang there in cyberspace over our heads, like the Sword of Damocles.
  2. Any encryption will eventually be broken, whether through leaked keys or more advanced technology. In particular, widely deployed public-key schemes such as RSA and elliptic-curve cryptography are known to be breakable by a sufficiently large quantum computer, so their protection should be treated as temporary.

If the above two points are true, then by storing encrypted data on a blockchain, we are basically accepting that the encrypted data will eventually be publicly shared and decrypted. Who will agree to this? Things only get more difficult when we inevitably have to bring laws like GDPR and the right to be forgotten into the picture.

“Great solutions are not engineered; they are finessed.”

It therefore seems that encryption on a blockchain is a big red no-go zone. This leaves encryption on private channels (like on Fabric and Quorum) as an interesting attempt to solve the privacy concern. Both of these solutions use private channels to transmit data and they record proofs on a chain. There’s just one concern with that approach: If we are going to share private and encrypted data, then why not use traditional P2P and other data sharing technology?

After all, we have already established that blockchains are not data sharing infrastructure, so the functionality could be decoupled from the blockchain technology altogether. How should private data be shared? We could do it with anything from robust encrypted APIs to simple email. Suffice it to say that how to store private data is not a problem that should be solved by blockchains, and that’s ok.

The same argument goes for document storage, workflows, and complex rules. Should organizations give up their Document Management Systems (DMSes), Business Process Management Suites (BPMSes), Business Rule Management Systems (BRMSes)? Giving up mature systems is surely premature.

Fortunately, there is an alternative way to share proofs without sharing any encrypted data. It has some limitations, but they’re not as tragic or as hopeless as one might assume. They’re not much different from the early limitations of the Internet, which at first couldn’t be used to stream movies, hold voice conversations, or process secure credit card transactions. These awful limitations did not stop Amazon from selling books and challenging bookstores, and they did not stop Netflix from mailing DVDs and challenging movie rental shops. It turned out that a lot could be done with very limited functionality. Amazon did not need an e-book format, and Netflix managed to disrupt video rentals on the internet without the need to stream movies. They just mailed their product to their customers by post. This is because great solutions are not engineered; they are finessed.

One tool that allows for strong privacy on a public blockchain is the well-known cryptographic hash function, used as a fingerprint for data that is private. In many valuable blockchain use cases, it is not arbitrary logic in a smart contract that provides value, but a timestamped proof of truth, and a hash posted to a blockchain does that.

The hash function is in principle one-way, which means that its output does not contain information that can be decrypted into the original value. (Data that is easy to guess should be combined with a random salt before hashing, so that the fingerprint cannot be matched by trial and error.) Therefore, we can use hashes to validate sensitive data in the open on a publicly accessible blockchain, passing the data around off-chain using standard protocols and APIs, and using validation software to make sure that it cross-references with hashes that are on the blockchain.
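As a rough illustration of this flow, the sketch below fingerprints a private document with SHA-256, publishes only the digest, and later checks a copy received off-chain against it. The `public_chain` list is a hypothetical stand-in for a real blockchain, and the function names are assumptions made for this example. On a real chain the block containing the digest would also supply the timestamp.

```python
import hashlib
import secrets

# Hypothetical stand-in for a public blockchain: an append-only list of digests.
public_chain: list[str] = []


def fingerprint(document: bytes, salt: bytes) -> str:
    """Return the SHA-256 hex digest of the salt followed by the document."""
    return hashlib.sha256(salt + document).hexdigest()


def publish(document: bytes, salt: bytes) -> str:
    """Post only the fingerprint; the document and salt never leave the owner."""
    digest = fingerprint(document, salt)
    public_chain.append(digest)
    return digest


def verify(document: bytes, salt: bytes) -> bool:
    """Check a document received off-chain against the published fingerprints."""
    return fingerprint(document, salt) in public_chain


contract = b"Alice pays Bob 100 EUR on delivery."
salt = secrets.token_bytes(16)  # random salt so the digest cannot be guessed
publish(contract, salt)

print(verify(contract, salt))
print(verify(b"Alice pays Bob 900 EUR on delivery.", salt))
```

The document and its salt travel between the parties over whatever private channel they already use. Anyone holding both can confirm that the copy matches what was anchored, and an altered copy fails the check.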

Mechanisms to accomplish this were developed long ago in the OpenTimestamps standard, and projects like Tierion are building platforms around the idea. Blockchain is already at the stage where its limited functionality can be used to successfully disrupt a sleepy industry that might be feeling unjustifiably safe due to the inability of public blockchains to handle private data.
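Timestamping services of this kind typically batch many document hashes into a Merkle tree, so that a single on-chain value anchors all of them. The sketch below shows the general technique only. It is not the OpenTimestamps or Tierion proof format, the function names are assumptions, and it expects a non-empty list of documents.

```python
import hashlib


def h(data: bytes) -> bytes:
    """SHA-256 of the given bytes."""
    return hashlib.sha256(data).digest()


def merkle_root_and_proof(leaves: list[bytes], index: int):
    """Return (root, proof) for leaves[index].

    The proof is a list of (sibling_hash, sibling_is_left) pairs, one per level.
    """
    level = [h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd levels
        sibling = index ^ 1  # the neighbour that shares the same parent
        proof.append((level[sibling], sibling < index))
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return level[0], proof


def verify_proof(leaf: bytes, proof, root: bytes) -> bool:
    """Recompute the path from a leaf to the root and compare with `root`."""
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


documents = [b"invoice 1", b"invoice 2", b"invoice 3"]
root, proof = merkle_root_and_proof(documents, 2)  # only `root` goes on-chain

print(verify_proof(b"invoice 3", proof, root))
print(verify_proof(b"invoice 3 (edited)", proof, root))
```

Each document owner keeps their own document and its short proof off-chain. The chain carries one root hash regardless of how many documents were batched.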

Note: The views presented in my articles are my own.

Accreditation: The material for Blockchain Myths (of which this post is a part) was developed at ConsenSys with the fantastic feedback and help of many individuals, including: Tee Ganbold, Zunaira Arshad, Micah Dameron, Chris Leishman, Van Sedita, John Wolpert, Jeff Gillis, Jérôme de Tychey, Ray Valdes, Igor Lilic, Joseph Khalife, and other great people roaming cryptoland.
