Forget erasure: why blockchain is really incompatible with the GDPR
Whether blockchain-based projects can comply with the GDPR is a question of much debate and controversy at present. Many projects make bold claims that they are “GDPR compliant” or that the GDPR does not apply in the first place because they “don’t put personal data on the ledger.” At the same time, these projects often use the pseudonymous identifiers of individuals to write transactions to the ledger. Such pseudonymous identifiers are personal data,¹ so those claims are questionable.
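To see why a pseudonymous identifier does not take a project outside the GDPR, consider the minimal sketch below. It assumes, purely for illustration, that a ledger address is a truncated hash of a public key; the `pseudonymous_id` helper, the key material, and the sample ledger are hypothetical and not drawn from any particular network. Because the same key always yields the same address, every transaction signed with it can be linked back to one individual.

```python
import hashlib


def pseudonymous_id(public_key: bytes) -> str:
    """Derive a hypothetical ledger address by hashing a public key."""
    return hashlib.sha256(public_key).hexdigest()[:40]


# Hypothetical key material for two individuals (illustrative only).
alice_key = b"alice-public-key-bytes"
bob_key = b"bob-public-key-bytes"

# A toy ledger: no names are recorded, only pseudonymous addresses.
ledger = [
    {"from": pseudonymous_id(alice_key), "payload": "pay rent"},
    {"from": pseudonymous_id(bob_key), "payload": "buy coffee"},
    {"from": pseudonymous_id(alice_key), "payload": "pharmacy purchase"},
]


def transactions_for(address: str, entries: list) -> list:
    """Collect every payload written under a given address."""
    return [entry["payload"] for entry in entries if entry["from"] == address]


# Anyone who learns one address can pull the holder's full history.
linked = transactions_for(pseudonymous_id(alice_key), ledger)
print(linked)
```

Once a single transaction is tied to a real person, the rest of that person's history follows from the shared identifier, which is the sense in which such identifiers remain personal data.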
Other projects claim to be compliant on the basis that they have solved the question of erasure, i.e. how to give effect to the data subject’s “right to be forgotten” in the context of an immutable, append-only ledger. This narrow focus on erasure loses sight of other core GDPR challenges in respect of distributed ledgers, including how to identify the relevant data controller(s) and processor(s) in a network, how to (reversibly) restrict processing, how to explain and honor objections to automated processing, and how to achieve compliant cross-border data transfers, among others.
Moreover, participants tend to dive headfirst into debating nuanced technical details about the implementation of specific features or functionality in a given network, often losing sight of the bigger picture. As a result, solving one discrete issue often makes another tension harder to resolve, in a never-ending game of compliance whack-a-mole. By abstracting to a higher-level discussion based on the core GDPR principles, we can see how blockchain is, at least as presently conceived, fundamentally at odds with the Regulation.
Taking each core principle in turn:
1. Lawfulness, fairness, and transparency — Lawfulness means having a lawful basis to process personal data, i.e. what is the “lawful basis” for writing data to the ledger in the first place? Most existing projects rely on “consent” but do not effectively address the mechanism for obtaining adequate informed consent or its revocable nature. Some could argue (though few have) that there is a “legitimate interest,” but that legitimate interest has to be assessed on a case-by-case basis weighing the interests of the controller against the rights and interests of the individual,² which is at odds with the very automated nature of processing in these networks. Maybe there is an argument that the data is processed in furtherance of a contract? But what is that “contract” and where would it have actual legal status? What happens if the contract is invalidated? If a ledger-based project cannot answer this question — “what is our lawful basis for putting this data on the ledger in the first place?” — it should stop there.
2. Purpose limitation — Personal data must be collected for specified, explicit, and legitimate purposes and not further processed in a manner incompatible with those purposes. Given the automatic replication of data across all nodes in a network, it is hard to argue that data is not “further processed” beyond actually writing a given transaction to the ledger. Surely the individual transacting is interested in completing that transaction and not in having their transaction data broadcast to an indeterminate number of nodes across an unspecified geographic scope and stored indefinitely (see also storage limitation below). In a permissioned network, there might be some limitations on the scope of this replication but that does not solve the disconnect between the individual’s intent to do something discrete and the excessive means used to achieve that purpose. Incidentally, this also demonstrates the existence of a controller, separate and apart from the individual transacting, who is instrumental in determining at least the “means” for processing (see also accountability below).
3. Data minimization — This principle requires that you only collect data that you actually need for your specific purpose, that the data collected has a rational link to that specific purpose, and (in line with the storage limitation principle) that it is only held for as long as necessary to fulfill that purpose. In my view, this is where most blockchain projects fail straight out of the starting blocks. Sure, there may be limited data minimization in terms of the data collected where zero knowledge proofs or other technical measures are implemented in services that rely upon an underlying ledger. But the data minimization principle applies to the entire data processing lifecycle, not just to collection, and the automatic replication of data across all nodes in a ledger is an automatic violation of the data minimization principle (see also storage limitation below).
4. Accuracy — Reasonable steps must be taken to ensure that personal data processed is accurate and up-to-date, not incorrect or misleading, and that inaccuracies are corrected, erased, or rectified without delay. In some respects, blockchain does hold some promise of providing better data integrity or so-called “verified data,” but most projects stop there. The biggest hurdle in respect of accuracy is linking information that lives “off-ledger” in the real world to data recorded on the ledger. Take the example of “tokenized” real estate where the accuracy of ledger data depends on the real-world state of a given piece of real property, or the case of identity where proving ownership and possession of a mobile device managing an individual’s digital identity is not something that can be addressed through the integrity of data on a ledger. The challenges of correcting, erasing, or rectifying inaccuracies on an immutable, append-only ledger are self-evident.
5. Storage limitation — This principle holds that personal data should not be kept in a form which permits identification of data subjects for longer than necessary for the purposes for which it is processed. This means having a clear retention period, a logical justification for that retention period, periodically reviewing data held, and deleting or anonymizing data past a valid and justifiable retention period. Techniques like pruning aside, where a blockchain is meant to be a permanent and immutable digital record, it is inherently at odds with the storage limitation principle (see also data minimization above).
6. Integrity and confidentiality — Personal data must be processed in a manner that ensures appropriate security of the personal data, including protection against unauthorized or unlawful processing and against accidental loss, destruction or damage, using appropriate technical or organizational measures. Notwithstanding that many ledger-based projects are working diligently on data security measures, they are failing to grasp the nature of the integrity and confidentiality principle, which goes well beyond data security as conventionally defined. Integrity means that what you have recorded on the ledger is an accurate representation of what it is meant to represent, i.e. that the digital asset is an accurate proxy for its real-world equivalent (see also accuracy above). Moreover, confidentiality is hard to achieve on a publicly accessible, transparent ledger.
7. Accountability — Finally, a core and often overlooked principle when it comes to blockchain or distributed ledger technology is the accountability principle. The GDPR requires that parties handling personal data accept responsibility for, and are able to demonstrate compliance with, these core principles. This means taking responsibility for their role in controlling the systems processing personal data and decisions regarding that processing, as well as having appropriate measures and records in place to demonstrate such compliance. Many blockchain or ledger-based projects argue that they are too “decentralized” to identify data controller(s) or take responsibility for giving effect to data subject rights, inadvertently shooting themselves in the foot from a compliance perspective. To the extent that a ledger-based project insists that no one is accountable, it cannot satisfy this core accountability principle and therefore cannot comply with the GDPR.
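The tension that the accuracy and storage limitation principles create for an append-only ledger can be illustrated with a toy hash chain. This is a simplified sketch under stated assumptions, not a model of any real network: the `block_hash`, `append`, and `is_valid` helpers and the sample payloads are hypothetical, and consensus, signatures, and replication are left out entirely. Each block commits to the hash of the one before it, so erasing or correcting an earlier record invalidates the chain.

```python
import hashlib
import json

GENESIS_PREV = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(index: int, payload: str, prev_hash: str) -> str:
    """Hash a block's contents together with its predecessor's hash."""
    body = json.dumps(
        {"index": index, "payload": payload, "prev": prev_hash},
        sort_keys=True,
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append(chain: list, payload: str) -> None:
    """Append a new block that commits to the current chain tip."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_PREV
    index = len(chain)
    chain.append({
        "index": index,
        "payload": payload,
        "prev": prev_hash,
        "hash": block_hash(index, payload, prev_hash),
    })


def is_valid(chain: list) -> bool:
    """Check every block's stored hash and its link to the previous block."""
    prev_hash = GENESIS_PREV
    for block in chain:
        if block["prev"] != prev_hash:
            return False
        expected = block_hash(block["index"], block["payload"], block["prev"])
        if block["hash"] != expected:
            return False
        prev_hash = block["hash"]
    return True


chain = []
for payload in ["alice pays bob", "bob pays carol", "carol pays dave"]:
    append(chain, payload)

valid_before = is_valid(chain)

# Attempt to honor an erasure request by blanking one record in place.
chain[1]["payload"] = ""
valid_after = is_valid(chain)

print(valid_before, valid_after)
```

In this sketch the chain verifies before the erasure attempt and fails afterwards. Recomputing the altered block's hash would only move the break to the next block, so a participant would have to rewrite every subsequent block on every node, which is precisely what such ledgers are designed to prevent.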
The above is not meant as a commentary on the suitability of blockchain or the GDPR, taking either in isolation. Rather, it is meant as an assessment of blockchain against the GDPR’s core principles. In this way, it is intended to provide a higher-level entry point into the conversation about the compatibility (or incompatibility) of blockchain and the GDPR, as well as a tool for reconsidering bold, and often unfounded, compliance claims.
— — — — — — — — — — — — — — — — — — — — — — — — —
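¹ See, e.g., Article 29 Working Party, Opinion 05/2014 on Anonymisation Techniques, 0829/14/EN WP216.
² See, e.g., Article 29 Working Party, Opinion 06/2014 on the notion of legitimate interests of the data controller under Article 7 of Directive 95/46/EC, 844/14/EN WP217.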