With an increasing number of blockchain applications going into production, privacy is a concern that, in most cases, remains unaddressed. A number of solutions are now emerging that aim to keep data on blockchains private.
Why is privacy a concern when using blockchain technology? Unlike centralized systems where one centralized party can view all the data and control the access to it, the premise of a decentralized system is that the transaction data is shared with all participants. This sharing is required to enable all parties to agree on the validity of the transactions, a process called consensus. Disclosing information to multiple parties in order to have them validate the transactions introduces a problem if that data also needs to remain private.
For instance, if I buy a pizza and pay with Bitcoin, the whole world (well, everyone with access to a Bitcoin node) can see my payment transaction, because in public blockchain networks like Bitcoin and Ethereum all transactions are shared with all nodes. And although it is hard, it is not impossible to reveal the real-life identities of the persons involved in a transaction. In permissioned blockchain networks, where only identified parties can participate, all identities are known by definition.
The problem of disclosing private information in order to have multiple parties validate the transactions can be addressed in various ways. This article describes the 3 most common ones:
- Trusted computing
- Cryptography
- Selective multi-casts
1. Trusted computing
The concept of trusted computing originates from gaming, where a gaming console can have more information about a game than the player who owns the console. For instance, in a first-person shooter the fact that an adversary is hiding behind a wall may be hidden from the player even though this information is present on the console.
Intel Software Guard Extensions (SGX), a feature found in modern Intel CPUs, provides a trusted execution environment (TEE): an isolated part of the memory that can only be accessed through specific APIs. The TEE cannot be accessed by any other process, not even by the operating system or BIOS, which is why it is sometimes nicknamed a secure enclave. SGX CPUs also feature a built-in private key that remains unknown to the owner/operator of the machine. This private key is used to decrypt data in the TEE in order to execute logic on that data while the data remains secret.
This feature can be used in a blockchain context: transactions are encrypted and sent to all SGX-equipped machines involved in the validation. These machines decrypt the transactions and validate them in the TEE. If a transaction is valid, it is signed in the TEE with the built-in private key. This way multiple machines operated by multiple organizations can validate blockchain transactions while the content of the transactions stays secret, because no one, not even the owner or operator of an SGX-equipped machine, is able to access the confidential data.
Besides being useful for privacy, trusted computing can also make the consensus protocol of a blockchain more secure and faster, as is done in Hyperledger Sawtooth, which uses Proof of Elapsed Time (PoET) to reach consensus.
Though SGX is a very promising technology, flaws in its security are still discovered regularly: it is an arms race between Intel and attackers. Some examples of successful attacks on SGX:
- A Spectre-related vulnerability (March 2018)
- Foreshadow (August 2018)
- A Return-Oriented Programming (ROP) technique (February 2019)
2. Cryptography
Another option to address privacy on blockchain is the use of cryptography. Simply encrypting or hashing data that should remain private has significant downsides, as it makes it harder and often impossible to validate the transactions that use that data.
There are, however, more advanced cryptographic solutions that hide information while still allowing for validation. One example is Monero, which uses:
- Ring signatures to hide the sender of a transaction
- Homomorphic commitments (Pedersen commitments) to verify that the sum of the inputs of a (UTXO) transaction equals the sum of its outputs without revealing the amounts
- Zero-knowledge range proofs to validate that a secret number (the secret amount of XMR paid in a transaction) lies in a known range (between 0 and a very high number) in order to prove that the paid amount is positive
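The balance check in the list above can be illustrated with a small sketch. It is a toy under stated assumptions, not Monero's implementation: Monero works on an elliptic curve, whereas this sketch uses plain integers modulo a prime, and the parameters `P`, `G`, `H` and the helper names are made up for illustration. The point it shows is that a validator can multiply commitments and compare the results without ever seeing an amount.

```python
import secrets

# Toy parameters (assumptions for illustration only, not suitable for real use).
P = 2**127 - 1   # a Mersenne prime; all arithmetic is done modulo P
ORDER = P - 1    # exponents can be reduced modulo P - 1 (Fermat's little theorem)
G = 3            # base that carries the hidden amount
H = 5            # base that carries the blinding factor; in a real scheme nobody
                 # may know the discrete logarithm of H with respect to G


def commit(amount: int, blinding: int) -> int:
    """Pedersen-style commitment: G^amount * H^blinding mod P."""
    return (pow(G, amount, P) * pow(H, blinding, P)) % P


def combine(commitments) -> int:
    """Multiplying commitments adds the hidden amounts and blinding factors."""
    result = 1
    for c in commitments:
        result = (result * c) % P
    return result


def balanced(input_commitments, output_commitments) -> bool:
    """Validator's check: uses only commitments, never the amounts themselves."""
    return combine(input_commitments) == combine(output_commitments)


def build_transaction(input_amounts, output_amounts):
    """Sender's side: commit to every amount with a random blinding factor."""
    in_blinds = [secrets.randbelow(ORDER) for _ in input_amounts]
    out_blinds = [secrets.randbelow(ORDER) for _ in output_amounts[:-1]]
    # The last output blinding is chosen so that the blinding factors also balance.
    out_blinds.append((sum(in_blinds) - sum(out_blinds)) % ORDER)
    inputs = [commit(a, r) for a, r in zip(input_amounts, in_blinds)]
    outputs = [commit(a, r) for a, r in zip(output_amounts, out_blinds)]
    return inputs, outputs


inputs, outputs = build_transaction([7, 5], [10, 2])
print(balanced(inputs, outputs))   # True: 7 + 5 == 10 + 2

inputs, outputs = build_transaction([7, 5], [10, 3])
print(balanced(inputs, outputs))   # False: 7 + 5 != 10 + 3
```

Because the hidden amounts are only compared modulo the group order, a very large "wrap-around" amount could make an unbalanced transaction look balanced. That is the gap the range proof in the list above closes.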
Another example is Zcash, which uses zk-SNARKs, a generic zero-knowledge proof that allows the user to create a cryptographic proof of any kind of statement. Because zk-SNARKs are generic, they are a good fit for smart contracts that can model any kind of business logic (trade finance, international payments, identity management etc.). A downside of zk-SNARKs is the trusted set-up phase: during bootstrapping of the network, private key material is generated that needs to be destroyed. Bulletproofs, however, also allow generic zero-knowledge proofs but without the trusted set-up.
Zero-knowledge proofs like the ones used in Monero and Zcash have been used in production on public networks and have therefore been battle-tested for many years; contrary to SGX, no major vulnerabilities have been found in them. This is why ING is actively contributing to the integration of zero-knowledge proofs in blockchain technology.
3. Selective multi-casts
A good example of selective multi-cast is Corda, a blockchain-inspired distributed ledger that sends transaction data only to the parties involved in a transaction and optionally to a consensus service (called the notary service).
For example, if one person pays another person using a token on Corda that represents 10 GBP, this transaction is not shared with the entire network but only with the sender and recipient of that token (and optionally with the Notary service). The recipient of the 10 GBP token will also receive the transaction history of that token to verify that the sender legitimately became the owner of it.
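The difference with broadcasting can be made concrete with a minimal sketch. This is a simplified model, not Corda's API: the `Transaction`, `Node` and `Network` classes, the party names and the `history` parameter are all assumptions introduced for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Transaction:
    tx_id: str
    sender: str
    recipient: str
    amount_gbp: int


@dataclass
class Node:
    name: str
    vault: dict = field(default_factory=dict)   # transactions this node has seen

    def receive(self, tx: Transaction) -> None:
        self.vault[tx.tx_id] = tx


class Network:
    def __init__(self, names):
        self.nodes = {name: Node(name) for name in names}

    def broadcast(self, tx: Transaction) -> None:
        """Public-blockchain style: every node receives every transaction."""
        for node in self.nodes.values():
            node.receive(tx)

    def multicast(self, tx: Transaction, history=(), notary=None) -> None:
        """Selective multi-cast: only the involved parties (and optionally a notary)."""
        recipients = {tx.sender, tx.recipient}
        if notary is not None:
            recipients.add(notary)
        for name in recipients:
            self.nodes[name].receive(tx)
        # The recipient also gets the token's history to check the sender's ownership.
        for earlier in history:
            self.nodes[tx.recipient].receive(earlier)


net = Network(["Bank", "Alice", "Bob", "Carol", "Notary"])
issue = Transaction("tx1", sender="Bank", recipient="Alice", amount_gbp=10)
payment = Transaction("tx2", sender="Alice", recipient="Bob", amount_gbp=10)

net.multicast(issue, notary="Notary")
net.multicast(payment, history=[issue], notary="Notary")

print(sorted(net.nodes["Bob"].vault))     # ['tx1', 'tx2']
print(sorted(net.nodes["Carol"].vault))   # []
```

Carol is on the same network but never learns that the payment took place, whereas with `broadcast` every vault would contain both transactions.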
The notary service receives either the full transaction data (a validating notary) or just a hash of the transaction (a non-validating notary). This introduces a trade-off. A validating notary makes the network tamper-resistant (only valid transactions are recorded on the distributed ledger) but requires the participants to share their private data with the notary service. A non-validating notary keeps data private between the participants, but the ledger is then only tamper-evident: the participants of a transaction can see that an invalid transaction has been submitted, and it is clear who submitted it, but its recording is not prevented. Arguably, tamper-evidence is robust enough on a network with known participants.
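The trade-off between the two notary types can be sketched as follows. This models only the distinction described above and is not how Corda implements notaries (for instance, it ignores double-spend checks); the class names, the `is_valid` rule and the transaction fields are hypothetical.

```python
import hashlib
import json


def is_valid(tx: dict) -> bool:
    """Hypothetical contract rule: value in must equal value out."""
    return tx["amount_in"] == tx["amount_out"] and tx["amount_out"] >= 0


def tx_hash(tx: dict) -> str:
    """Deterministic SHA-256 hash of the transaction contents."""
    return hashlib.sha256(json.dumps(tx, sort_keys=True).encode()).hexdigest()


class ValidatingNotary:
    """Sees the full transaction, so it can refuse invalid ones (tamper-resistant)."""

    def __init__(self):
        self.ledger = {}      # tx hash -> submitter
        self.seen_data = []   # private data disclosed to the notary

    def notarise(self, submitter: str, tx: dict) -> bool:
        self.seen_data.append(tx)
        if not is_valid(tx):
            return False
        self.ledger[tx_hash(tx)] = submitter
        return True


class NonValidatingNotary:
    """Sees only a hash: it cannot judge validity, but it records who submitted what."""

    def __init__(self):
        self.ledger = {}      # tx hash -> submitter

    def notarise(self, submitter: str, digest: str) -> bool:
        self.ledger[digest] = submitter
        return True


bad_tx = {"sender": "Alice", "recipient": "Bob", "amount_in": 10, "amount_out": 12}

validating = ValidatingNotary()
print(validating.notarise("Alice", bad_tx))   # False: rejected, but the notary saw the data

non_validating = NonValidatingNotary()
print(non_validating.notarise("Alice", tx_hash(bad_tx)))   # True: recorded unseen

# Bob holds the full transaction, so he can detect the problem and see who caused it.
print(is_valid(bad_tx), non_validating.ledger[tx_hash(bad_tx)])   # False Alice
```

In the non-validating case the invalid transaction is recorded, but Bob can both detect it and point to the submitter, which is what tamper-evident means here.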
Selective multi-cast is for instance used in HQLAx on Corda.
We discussed 3 possible solutions that allow multiple parties to validate data while keeping confidential data secret. With SGX being very promising but, at the time of writing, not yet ready, this leaves us with 2 viable solutions today:
1. Cryptographic solutions such as Zero Knowledge Proof.
2. Selective multi-casts.
Which one you choose depends on your use case. So far we see in practice that cryptography, particularly zero-knowledge proofs, works well for public cryptocurrencies (like Zcash and Monero) and that selective multi-cast works well for production systems on permissioned networks.