When starting to design our Fully Anonymous Identity Resolution (FAIR) network protocol, one obvious direction was to look at blockchain technology.
After all, blockchain provides a solid decentralized framework that facilitates simple data exchange and consensus building between multiple parties. Blockchain has evolved in the last few years, moving from the marginal to the mainstream, with the promise to truly revolutionize the way we store, share and authenticate transactional data.
As a product-first company, we started by analyzing and identifying what it is that we want the solution to achieve. Our key requirements can be summarized as:
- Create trust between parties - all member companies need to know that their business secrets and customer data are kept secure
- No centralized database - if there is no centralized database, there is nothing for hackers to attack
- Complete privacy - without sharing or exposing any consumer data, not even in a hashed or pseudo-anonymised fashion
- Full anonymity - no one can learn who is asking to validate an identity, which identity is being validated, or who is vouching for the user
- Clear data control - each company holds the data about its own users, and does not replicate or duplicate it anywhere else
- Fully compliant with GDPR, CCPA, and all other privacy regulations - this includes the user’s right to be forgotten, and the right to correct any inaccurate data
We then started looking for the right technology to help us meet all of the above requirements. Blockchain was at the top of the list. However, as we dug into blockchain technology on one hand, and the privacy requirements on the other, we realized that blockchain may not be the right technology for what we are trying to build.
“An open, distributed, immutable ledger that can record transactions between parties efficiently, verifiably and permanently.”
In a nutshell, blockchain technology provides transparency, and the ability to generate consensus on the state of the stored data without the need for a central authority. It was designed to create agreement on transactions between two or more parties that need to be recorded with a specific timestamp. Some common use cases include financial transactions, contracts, and asset ownership records.
In all these cases the users of the blockchain gain visibility and trust by knowing who owned a certain asset at a certain time without relying on a centralized authority.
What about using blockchain for Identity?
We realized that identity data is neither transactional, nor is it tied to an exact point in time. Therefore, there is very little benefit in hosting identity data on a blockchain.
However, there is much to lose.
Storing identity data on a blockchain means that all data is public and replicated multiple times. This introduces many unnecessary risks. First, this database becomes an obvious target for hackers, who are constantly seeking to obtain accurate identity data. Second, the fact that the data is replicated makes it even easier for would-be hackers to get a copy of their own.
Moreover, data on the blockchain lives forever, so curious parties have all the time they want to extract information from the blockchain, its structure, or the metadata associated with it, even if such extraction seems infeasible today.
Last but not least, history shows that all software has bugs. Blockchain is no exception, and there have been many reported cases where blockchain or crypto-currency data was exposed as a result. Putting identity data on a blockchain puts it at risk of being exposed to the world — and, especially, to the malicious actors in the world — if and when the underlying security protocols are broken.
Can’t We Solve That?
These are very serious risks, and they have not gone unnoticed. Some companies that build blockchain-based identity solutions try to mitigate the problems with one or more of the following workarounds:
- Encrypt the data on the blockchain
- Only store a digital signature of the data on the blockchain, where the real data is stored on a separate private database
- Make the blockchain private — AKA permissioned blockchain
Let’s take a deeper look at each of these approaches.
- Encryption is not a panacea for security — encryption protocols frequently get cracked or are compromised due to bugs. Given the immutable nature of the blockchain it is not enough to patch a broken protocol, since the old data is still available for anyone to exploit. And given the high value of identity data, the likelihood of repeated attempts at hacking is equally high. In addition, the encryption method itself is known to everyone who can access the blockchain, therefore making brute-force attacks significantly easier.
Finally, even though the data is stored in encrypted form on the blockchain, all members can read and decrypt it, which makes all data about all users accessible to every member of the blockchain.
- If we only store signatures on the blockchain, then the data itself must be stored on a separate private database. In that case, the only value of the blockchain is to provide an immutable record of a person’s data. This may both expose the identity of the signer and prevent users from being forgotten. Moreover, we are still creating a centralized database that includes all the data, which is exactly what we would like to avoid.
- Finally, permissioned blockchains limit the number of companies that can access the data, but this does not change the fundamental privacy exposure. In addition, the more successful a solution is, the more players have access to the data on the blockchain, making it ever less secure as the solution grows more popular. We cannot assume privacy or security solely by limiting access to the network. Security must be guaranteed at any scale if it is to be meaningful.
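To make the "right to be forgotten" problem from the second workaround concrete, here is a minimal sketch of an append-only ledger in which each block commits to the hash of the previous one. It is a simplified illustration rather than any real blockchain implementation: the block layout, the function names (`block_hash`, `build_chain`, `is_valid`) and the placeholder payloads are all assumptions made for the example.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(index, payload, prev_hash):
    """Hash a block's contents together with the previous block's hash."""
    data = json.dumps(
        {"index": index, "payload": payload, "prev": prev_hash},
        sort_keys=True,
    )
    return hashlib.sha256(data.encode("utf-8")).hexdigest()


def build_chain(payloads):
    """Build a toy ledger where every block is linked to its predecessor."""
    chain = []
    prev = GENESIS
    for index, payload in enumerate(payloads):
        current = block_hash(index, payload, prev)
        chain.append(
            {"index": index, "payload": payload, "prev": prev, "hash": current}
        )
        prev = current
    return chain


def is_valid(chain):
    """Recompute every hash and check every link."""
    prev = GENESIS
    for block in chain:
        if block["prev"] != prev:
            return False
        expected = block_hash(block["index"], block["payload"], block["prev"])
        if expected != block["hash"]:
            return False
        prev = block["hash"]
    return True


# Hypothetical payloads standing in for signatures of user records.
chain = build_chain(["sig-user-a", "sig-user-b", "sig-user-c"])
intact = is_valid(chain)

# Try to "forget" the second user by blanking that entry.
erased = [dict(block) for block in chain]
erased[1]["payload"] = ""
valid_after_erasure = is_valid(erased)

print(intact, valid_after_erasure)
```

In this sketch the untouched ledger verifies, while the ledger with one entry blanked out does not. Honoring an erasure request therefore means either breaking verification or rewriting history from that block onward, which is precisely what an immutable ledger is designed to prevent.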
Blockchain and GDPR
The GDPR working party considers user data as “Private Data” (personal data, in the regulation’s own terms, and therefore in need of protection) if any party can connect the data to that specific person. This means that any encrypted data or hash of private data is still considered private, because anyone who knows how to calculate the hash can re-connect the user to the hashed value. Consequently, all existing blockchain solutions are considered to hold private data and need to be managed as such under GDPR.
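A minimal sketch shows why a hash offers so little protection here. It assumes, purely for illustration, that a ledger stores unsalted SHA-256 hashes of normalized email addresses; the `pseudonymize` helper and the example addresses are hypothetical.

```python
import hashlib


def pseudonymize(email):
    """Hash a normalized email address (assumed scheme: unsalted SHA-256)."""
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


# What a ledger that stores only hashes might contain.
ledger = {
    pseudonymize("alice@example.com"),
    pseudonymize("bob@example.com"),
}

# Anyone holding a list of candidate addresses (a marketing list, a leaked
# dump) can hash each candidate and look it up in the ledger.
candidates = ["carol@example.com", "Alice@Example.com", "dave@example.com"]
reidentified = [
    email.strip().lower()
    for email in candidates
    if pseudonymize(email) in ledger
]

print(reidentified)
```

No decryption or key is needed: knowing the hashing method and having a list of plausible values is enough to link a hashed entry back to a person. Identifiers such as email addresses and phone numbers come from a small, guessable space, which makes this kind of lookup cheap.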
This means that any company working within the EU and hosting a copy of the blockchain is considered a data controller and processor for all the data in it, and will have to comply with all the requirements of the law. This is true whether the data belongs to their own customers or is a copy of data obtained from other participants in the blockchain solution.
Moreover, with many copies in existence, any individual company’s data is replicated many times over, and is effectively being sent outside the control of the data controller. Each company is also holding other information about other companies’ users on its own ledger.
These stand in direct violation of the GDPR requirement to keep data storage and replication to the absolute minimum, potentially exposing these companies to fines of up to 4% of a company’s annual global revenue. We wonder how many blockchain companies have fully realized their vulnerability here, and the legal exposure they impose on users of their solution.
Data leaks and privacy exposure
Finally, this data can then be copied off the blockchain, whether it is private or public, and sent to other unknown parties, similar to the way that Facebook data was copied and distributed by third parties. With the escalation we’ve seen in consumers’ concern about their privacy, and the increase in data breaches and leaks in recent years (twice as many in 2018 as in 2017, which was itself a record year for breaches), companies wishing to protect their business reputation today have to be fully aware of any such risks before committing to any technological solution.
The alternative approach
We concluded that blockchain is not an appropriate solution for managing identity and private data. It introduces unnecessary risk by creating both additional complexity and a highly replicated centralized database of all identities.
For us, the alternative is clear. Let each company hold its own data and keep it secure, while allowing companies to provide each other with proofs of identity in a fully anonymous way. Then, if one company is breached, only the data that this company holds becomes exposed. The rest of the network remains protected. Moreover, there is no need for a cumbersome central repository that adds no value while exposing its holders to increased privacy risk and GDPR liability.
Blockchain is an up-and-coming technology that is getting a lot of industry attention. It is hailed for its inherent trustworthiness, without the need for a centralized authority. This truly is valuable and has some excellent transactional use cases. Our analysis showed us that identity verification and validation between a network of companies is simply not one of them.
Blockchain is poorly suited to manage identity data (or any other sensitive private data, for that matter).
Using blockchain for identity leads to added complexity and increased exposure to data leaks, without providing real value. It’s a poor fit, and a serious risk. Therefore, we looked in another direction, and were happy to find a true alternative that meets all our requirements.
We believe that a truly decentralized network protocol, based on zero-knowledge techniques and secure multi-party computation, is the right way to go. This is why we developed FAIR (Fully Anonymous Identity Resolution), a network protocol that allows its members to validate the identity of new users, and vouch for ones they already know, without sharing any user data whatsoever.
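To give a feel for how two parties can compare data without revealing it, here is a toy private equality test based on commutative blinding (a Diffie-Hellman-style construction). To be clear, this is not the FAIR protocol, whose details are not described in this post. It is a generic textbook-style sketch; the `Party` class, the `same_value` helper and the small 127-bit modulus are assumptions chosen for readability, not for security.

```python
import hashlib
import secrets

# Toy modulus: the Mersenne prime 2**127 - 1. Far too small for real use.
P = 2**127 - 1


def hash_to_group(value):
    """Map a string to a non-zero integer modulo P."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    element = int.from_bytes(digest, "big") % P
    return element if element != 0 else 1


class Party:
    """A company holding one private value and one secret exponent."""

    def __init__(self, value):
        self._value = value
        # Secret exponent in the range [2, P - 2]; it never leaves the party.
        self._key = secrets.randbelow(P - 3) + 2

    def blind_own(self):
        """Blind our own value before sending it to the other party."""
        return pow(hash_to_group(self._value), self._key, P)

    def blind_other(self, blinded):
        """Apply our secret exponent to a value the other party blinded."""
        return pow(blinded, self._key, P)


def same_value(a, b):
    """Check whether two parties hold the same value.

    Only blinded values cross between the parties. Exponentiation commutes,
    so H(x)^(a*b) equals H(y)^(b*a) exactly when H(x) equals H(y), up to a
    negligible chance of collision.
    """
    a_once = a.blind_own()            # sent from A to B
    b_once = b.blind_own()            # sent from B to A
    a_twice = b.blind_other(a_once)   # B blinds A's value and returns it
    b_twice = a.blind_other(b_once)   # A blinds B's value locally
    return a_twice == b_twice


match = same_value(Party("alice@example.com"), Party("alice@example.com"))
mismatch = same_value(Party("alice@example.com"), Party("mallory@example.com"))
print(match, mismatch)
```

Neither side sees the other's raw value or an unblinded hash; each learns only whether the two values match. A production protocol would need properly chosen group parameters and protections against a misbehaving participant, which this sketch leaves out.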
If you want to find out more about Identiq, FAIR technology, and what we do, take a look at our previous post - What is Identiq?
If you’d like to see it in action, feel free to get in touch.
About the Author
Uri Arad is the Co-founder and VP Product & Research at Identiq. An expert in risk, fraud, and data solutions, Uri was previously the Senior Director of Risk and Data Science at PayPal. He has over 25 years of experience building technology and product teams, an M.Sc. in Computer Science from Tel Aviv University (summa cum laude), and served as an officer in Unit 8200 in the Israeli army.