Blockchain, Data Privacy, and ZKP

Does blockchain ensure data privacy?

Mehmet Gürevin
Octabase
Aug 19, 2019

You’re right, actually: ostriches don’t bury their heads in the sand to hide. But it’s a useful analogy for describing the privacy issue in blockchain technology.

Does blockchain ensure data privacy?

In a nutshell, no. Recently I’ve noticed that people interested in blockchain technology usually assume that data on a blockchain is stored privately. Many blog posts and videos claim that the blockchain encrypts all of the data on it. This is a big misunderstanding: even the most popular blockchain platforms store data as plain text. Everyone who has access to the blockchain can see all the data and analyze it.

I think this confusion has roots in two of the cryptographic basics of blockchain: hashes and public-key cryptography. Don’t worry; we won’t be diving into the depths of cryptography. Let’s talk a little about these two simple concepts.

Hashes:

In blockchain technology, from a logical perspective, every transaction represents a bit of data. A bunch of transactions together makes a block in the chain. These blocks are then connected to each other through their fingerprints: every block keeps the fingerprint of the previous one, and eventually the entire structure becomes a blockchain. There are other similar structures, such as DAGs, but they are out of scope for this article since our focus is on the privacy issue.

The fingerprint of each block is calculated using a hash function, which is a one-way function. With a cryptographic hash function in particular, it is not feasible to recover the original content from the hash value, so you cannot determine a block’s content by looking only at its hash. However, in a regular blockchain every block has two main parts: the content of the block and the hash of that content. No, it’s not a joke: block data is stored as plain text side by side with its hash. Hashing is not an encryption method and does not obfuscate the data on the blockchain.
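To make this concrete, here is a minimal sketch of a hash-linked chain. It is not how any particular platform stores blocks; the field names (`content`, `prev_hash`, `transactions`) and the helper functions are assumptions chosen for illustration. It only shows that the hash protects integrity while the content stays readable.

```python
import hashlib
import json


def block_hash(content: dict) -> str:
    """Return the SHA-256 fingerprint of a block's content."""
    encoded = json.dumps(content, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def build_chain(batches: list) -> list:
    """Link batches of transactions into a chain of blocks."""
    chain = []
    prev = "0" * 64  # placeholder fingerprint for the first block
    for transactions in batches:
        content = {"prev_hash": prev, "transactions": transactions}
        # Each block stores its plain-text content next to its hash.
        block = {"content": content, "hash": block_hash(content)}
        chain.append(block)
        prev = block["hash"]
    return chain


def is_valid(chain: list) -> bool:
    """Check every fingerprint and every link to the previous block."""
    prev = "0" * 64
    for block in chain:
        if block["content"]["prev_hash"] != prev:
            return False
        if block_hash(block["content"]) != block["hash"]:
            return False
        prev = block["hash"]
    return True


chain = build_chain([
    ["alice pays bob 5"],
    ["bob pays carol 2", "carol pays dave 1"],
])

# Anyone with access to the chain can read the transactions directly.
print(chain[1]["content"]["transactions"])
print(is_valid(chain))
```

Changing any transaction afterwards makes `is_valid` fail, which is the tamper-evidence that hashing provides. Nothing in this structure hides the data.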

Public-Key Cryptography:

One of the most significant differences between blockchains and databases is that in a blockchain every piece of data has an owner. The owner of the data proves ownership via digital signatures. Data owners have a private key and an associated public key. You can share your public key with everyone without hesitation. When you want to prove that you own a certain piece of data, you create a signature with your private key. Others can then validate your ownership using that signature and your public key. The whole process is about proving data ownership, not about hiding the data.
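The sign-and-verify flow can be sketched with the Python standard library alone. Note the swap: Bitcoin and Ethereum use ECDSA signatures, whereas this sketch uses a Lamport one-time signature purely because it needs nothing beyond a hash function. The function names are illustrative, and a Lamport key must never sign more than one message.

```python
import hashlib
import secrets


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def generate_keys():
    """Create a Lamport one-time key pair: 256 pairs of random secrets."""
    private_key = [
        (secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(256)
    ]
    # The public key is the hash of every secret; it is safe to publish.
    public_key = [(sha256(a), sha256(b)) for a, b in private_key]
    return private_key, public_key


def message_bits(message: bytes) -> list:
    """Return the 256 bits of the message's SHA-256 digest."""
    digest = sha256(message)
    return [(digest[i // 8] >> (7 - (i % 8))) & 1 for i in range(256)]


def sign(private_key, message: bytes) -> list:
    """Reveal one secret per digest bit; only the key owner can do this."""
    return [private_key[i][bit] for i, bit in enumerate(message_bits(message))]


def verify(public_key, message: bytes, signature) -> bool:
    """Check the revealed secrets against the published hashes."""
    if len(signature) != 256:
        return False
    return all(
        sha256(signature[i]) == public_key[i][bit]
        for i, bit in enumerate(message_bits(message))
    )


private_key, public_key = generate_keys()
message = b"asset 42 belongs to me"  # the data itself stays in plain text
signature = sign(private_key, message)

print(verify(public_key, message, signature))
print(verify(public_key, b"asset 42 belongs to someone else", signature))
```

The message is never encrypted at any point. The signature only proves who authorized it.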

These two concepts, hashing and digital signatures, are often confused with data encryption. So it’s common for people without practical knowledge of cryptography to assume that blockchain technology has some sort of inherent data privacy feature. Well, as I said at the beginning, the truth is far from that.

At this point, let’s consider the technology behind a very famous cryptocurrency as an example. The idea of a chain of blocks was introduced in the Bitcoin whitepaper, and Bitcoin is a great example of a decentralized blockchain. In Bitcoin, you have a wallet, and you can generate as many addresses as you like in this wallet. You don’t need permission from anyone to create a wallet or an address. Indeed, you can receive and send any amount of bitcoin without any authorization. There is no authoritative party, thanks to the Proof of Work consensus model.

Every bitcoin address is derived from a hash of your public key. You can receive bitcoins at your address from anyone, and you can send your bitcoins to anyone. At first, nobody knows the relationship between you and your address, and that relationship is the critical point of your privacy. The most direct way for your privacy to be compromised is for your real identity to be linked to your address. For example, if you want to exchange your bitcoin for a local currency on an online exchange, the exchange will ask for your passport or national identity card. Once your real identity is linked to an address, all of your transactions are traceable back to the first one. That is a strong reason not to want your salary paid in Bitcoin, because it might not be a good idea to let everyone know your wealth. So we can’t say that the Bitcoin network is capable of respecting your privacy.
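The traceability problem can be illustrated with a toy public ledger. The addresses, amounts, and the `KNOWN_IDENTITIES` mapping below are invented for the example; the sketch assumes only that every transaction is public and that one address has been linked to a person, for instance by an exchange.

```python
# Hypothetical public ledger: (sender address, receiver address, amount).
LEDGER = [
    ("addr_employer", "addr_a", 300),
    ("addr_a", "addr_b", 120),
    ("addr_a", "addr_exchange", 50),
    ("addr_c", "addr_d", 900),
]

# A single link between an address and a real identity is enough.
KNOWN_IDENTITIES = {"addr_a": "Alice"}


def history(ledger, address):
    """Return every transaction that the address sent or received."""
    return [tx for tx in ledger if address in (tx[0], tx[1])]


def balance(ledger, address):
    """Compute the current balance of an address from public data only."""
    received = sum(amount for _, receiver, amount in ledger if receiver == address)
    sent = sum(amount for sender, _, amount in ledger if sender == address)
    return received - sent


for address, person in KNOWN_IDENTITIES.items():
    print(person, "balance:", balance(LEDGER, address))
    for sender, receiver, amount in history(LEDGER, address):
        print(" ", sender, "->", receiver, amount)
```

No key or permission is needed to run this kind of analysis. The observer needs only the public ledger and one identity link.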

So the ostrich analogy is useful at this point: be aware that anonymity never means privacy.

Well, what about enterprise blockchains?

There are a few blockchain infrastructure options in the enterprise blockchain area. Unlike decentralized blockchains, enterprise blockchains don’t use the Proof of Work consensus model. Usually, we prefer PBFT (Practical Byzantine Fault Tolerance) or its variants to reach consensus. In this model, we know exactly who the participants of the network are. Every transaction must be seen by the participants and then validated. Thus, we can reach fully distributed trust; there is no single point of failure anymore. That is the fundamental value proposition of blockchain technology for enterprises.
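As a rough sketch of what PBFT-style consensus asks of a known participant set: the classic result is that a network of n participants tolerates f faulty ones when n ≥ 3f + 1, and decisions need 2f + 1 matching votes. The helper names below are illustrative, and the quorum formula assumes the textbook n = 3f + 1 configuration.

```python
def max_faulty(n: int) -> int:
    """Largest number of faulty participants tolerated when n >= 3f + 1."""
    return (n - 1) // 3


def quorum(n: int) -> int:
    """Matching votes needed to decide, assuming the classic 2f + 1 rule."""
    return 2 * max_faulty(n) + 1


for n in (4, 7, 10):
    print(f"participants={n} tolerated_faults={max_faulty(n)} quorum={quorum(n)}")
```

Because every participant has to see and vote on every transaction, the data is visible network-wide by design.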

One of the most popular enterprise blockchain platforms, Hyperledger Fabric, makes use of encrypted communication methods, named channels, among the relevant parties to provide privacy. This approach is a simple and very common technique to introduce some privacy. Think of your HTTPS connections: if you have trusted authority certificates on your computer, your emails on Gmail can’t be read in transit by anybody else, because the connection between your computer and the Gmail servers is encrypted. In Hyperledger Fabric, channels work like the HTTPS protocol. For example, if there is a transaction between you and an insurance company, and another party such as a bank handles part of this transaction, you open a channel among these three parties. Any traffic on this channel will be encrypted, so all of you will have transaction privacy. Although it looks like a viable solution, by doing that we lose network-wide consensus. The primary value proposition of blockchain is total trust, and it comes from network-wide consensus. Another popular enterprise blockchain platform, Quorum, a fork of Ethereum developed by JP Morgan, follows a similar approach to provide privacy. It uses a tool named Constellation to share private data between the relevant parties.

If we want to preserve network-wide consensus and reach real network trust while still keeping our data private, we need more advanced systems. Numerous regulations, such as GDPR, don’t permit us to share sensitive data with other parties. Also, in most cases we don’t want to share our data with others, because it’s against the nature of business. At the same time, we don’t want to use blockchain technology without network-wide consensus, because if we do, our blockchain will not be any better than conventional systems like databases and API connections.

As you have seen, the critical requirement for enterprise blockchains is network-wide consensus combined with privacy. So we need an almost magical solution to our problem, and we have just that: zero-knowledge proofs.

A zero-knowledge proof, or ZKP for short, is a cryptographic protocol that lets us prove that we know a value without revealing any part of that value.
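One classic instance of this idea is the Schnorr proof of knowledge, made non-interactive with the Fiat-Shamir transform. The sketch below proves knowledge of a secret exponent without revealing it. It is an illustration only: the group parameters `P`, `Q`, and `G` are tiny toy values chosen here so the code runs with the standard library, and they offer no security. Production systems use large elliptic-curve groups or other proof systems.

```python
import hashlib
import secrets

# Toy parameters (assumption, insecure): P = 2 * Q + 1 with P and Q prime.
P = 2039
Q = 1019  # prime order of the subgroup generated by G
G = 4     # a square modulo P, so it generates the subgroup of order Q


def challenge(public_value: int, commitment: int) -> int:
    """Fiat-Shamir: derive the challenge by hashing the public transcript."""
    data = f"{G}|{public_value}|{commitment}".encode("utf-8")
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q


def prove(secret: int):
    """Prove knowledge of `secret` such that public_value = G^secret mod P."""
    public_value = pow(G, secret, P)
    nonce = secrets.randbelow(Q - 1) + 1      # fresh randomness per proof
    commitment = pow(G, nonce, P)
    c = challenge(public_value, commitment)
    response = (nonce + c * secret) % Q       # the nonce masks the secret
    return public_value, (commitment, response)


def verify(public_value: int, proof) -> bool:
    """Check G^response == commitment * public_value^c (mod P)."""
    commitment, response = proof
    c = challenge(public_value, commitment)
    return pow(G, response, P) == (commitment * pow(public_value, c, P)) % P


secret = secrets.randbelow(Q - 1) + 1
public_value, proof = prove(secret)

# The verifier sees only public_value and the proof, never the secret.
print(verify(public_value, proof))
```

The verifier learns that the prover knows the exponent behind `public_value`, and with realistic parameters learns nothing else about it.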

Although it might sound like magic at first, it’s a revolutionary technology for many business models. Imagine that the government collects exactly fair and legitimate taxes in real time without any knowledge of the underlying transactions. If this is not enough for you, imagine that the government can also prove that it spends those taxes only on planned expenses and legitimate resources. Thanks to ZKP, it is possible to build systems where citizens won’t know where their taxes are spent but can still be confident that all of it is fair.

Using ZKP solutions on the blockchain, it is possible to introduce much-needed privacy. However, once we have perfect privacy using ZKP, we will also need auditing capabilities, because after we hide all of the data, depending on the business model, some authority usually must audit the hidden data. Let’s call this party an authorized auditor. The authorized auditor of a blockchain network should have full transparency over the transactions on the blockchain while the data is still kept secret from the other parties.
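One common building block for hiding amounts while keeping them checkable is a Pedersen commitment. The sketch below is an illustration under stated assumptions, not a description of any specific product: the parameters are toy values with no security, the names `commit` and `audit` are invented here, and in a real system nobody may know the discrete logarithm relating `H` to `G`. The network sees only commitments, while an auditor who is handed the amount and blinding factor can confirm them.

```python
import secrets

# Toy parameters (assumption, insecure): P = 2 * Q + 1 with P and Q prime.
P = 2039
Q = 1019
G = 4  # generator of the order-Q subgroup
H = 9  # second generator; its relation to G must be unknown in practice


def commit(amount: int, blinding: int) -> int:
    """Pedersen commitment: hides `amount` behind a random blinding factor."""
    return (pow(G, amount, P) * pow(H, blinding, P)) % P


def audit(commitment: int, amount: int, blinding: int) -> bool:
    """An authorized auditor checks a disclosed opening against the ledger."""
    return commit(amount, blinding) == commitment


# Two confidential payments; amounts must stay below Q in this toy group.
r1, r2 = secrets.randbelow(Q), secrets.randbelow(Q)
c1 = commit(250, r1)
c2 = commit(100, r2)

# Everyone can combine commitments without learning the amounts.
total = (c1 * c2) % P

# Only a party given the openings can confirm what the commitments hide.
print(audit(c1, 250, r1))
print(audit(total, 350, (r1 + r2) % Q))
```

The commitments can be validated and combined by every participant, which keeps the checks network-wide, while the openings are shared only with the auditor.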

To build an enterprise-grade blockchain application that keeps its powerful consensus abilities, the blockchain needs to be strengthened with ZKP-based privacy solutions and auditing capabilities.

The silver bullet: Blockchain + ZKP + Auditing

At Octabase, we have been developing zero-knowledge proof techniques and blockchain solutions for the last three years. We have developed a state-of-the-art ZKP solution called Octa Privacy Framework. Using Octa Privacy Framework, we can empower any blockchain platform, such as Ethereum or Hyperledger Fabric, with privacy and auditing capabilities. We also provide an enterprise blockchain platform that has built-in auditable confidential transactions with network-wide consensus. We have partnered with Takasbank, the clearing and settlement bank of Turkey, on Octa Blockchain based solutions since 2018.

You are welcome to reach out to us about this article or our products and services. Your feedback would be highly valued.
