The EU General Data Protection Regulation and the Blockchain

A Declaration of the Rights of the Digital Subject

The European General Data Protection Regulation (GDPR) is set to take effect on May 25, 2018. It was created in response to the rapidly evolving challenges that the 21st-century information economy poses to individual privacy and autonomy. These challenges have entered the popular lexicon under the terms “surveillance capitalism” and “platform capitalism,” both of which describe an economy in which information (specifically, information about people, stewarded by centralized infrastructure providers) increasingly forms the raw material for profit.

It is not inherently bad for corporations to use information about actors within their markets to better target their products and services. Particularly in an economy where personalization and niche industries proliferate in the absence of geographic colocation, the ability to reach the right audience and deepen customer relationships is crucial for the survival of many companies, not only large ones but especially small and medium-sized businesses. Danger arises, however, when individuals no longer have control over how their information is collected and used. Under those conditions, the good-faith practices of matching content to audience and knowing your customers give way to an intensifying regime of control and unending, predatory solicitation. The consumer becomes a de facto victim.

To remedy this, the EU has crafted the most robust and far-reaching privacy legislation the world has yet seen. Any company that does business in Europe will need to comply or face steep fines (up to €20 million or 4% of global annual turnover, whichever is higher). The GDPR mandates that the rights of the “data subject,” that is, the individual to whom the personal data relates, be protected. These rights include (in summary):

Article 12: The right to have questions about use of personal data answered, and to seek redress if these questions are not answered in a clear, concise, timely manner.

Articles 13 & 14: The right to know how personal data is being used at the time of collection, as well as the length of time for which it will be stored and contact information for the collecting party.

Article 15: The right to access the personal data that is being processed.

Article 16: The right to have incorrect personal data rectified.

Article 17: The right to have personal data erased when it is no longer necessary for the purposes for which it was collected and there is no legal ground for retaining it.

Article 18: The right to restrict data processing where the data is inaccurate, its collection unlawful, or its processing no longer required.

Article 19: The data collecting party must notify every recipient with whom it has shared personal data when that data has been rectified, erased, or restricted from processing.

Article 20: The right to receive one's personal data in a structured, commonly used, machine-readable format that can be freely transferred to other data controllers.

Article 21: The right to object to personal data being used to profile or market to them.

Article 22: The right not to be subject to decisions with legal or similarly significant effects that are based solely on automated data processing.

In addition to the explicit declaration of the rights of the data subject (data access, data portability, right to erasure, etc.), the GDPR also mandates that data controllers and processors abide by the principle of “data protection by design and default.” This means architecting solutions with privacy as a foundational consideration rather than as an afterthought or add-on. It includes, wherever possible, employing techniques such as pseudonymization (decoupling data from individual identity) and data minimization (sharing only absolutely necessary data points) to protect privacy. Finally, by requiring that data controllers and processors use common technology standards, the GDPR aims to create an environment maximally friendly to individuals needing to transfer data between different vendors, governments, and institutions at their own discretion.
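As a rough illustration of the two techniques named above, the following Python sketch pseudonymizes a direct identifier with a keyed hash and strips a record down to the fields a processor actually needs. The record, the field names, the secret key, and the helper functions `pseudonymize` and `minimize` are all hypothetical; this is one possible approach under those assumptions, not a method prescribed by the GDPR.

```python
import hashlib
import hmac

# Hypothetical secret held only by the data controller. Without it, the
# pseudonym cannot easily be linked back to the original identifier.
SECRET_KEY = b"example-key-kept-separately-from-the-data"


def pseudonymize(identifier: str, key: bytes = SECRET_KEY) -> str:
    """Replace a direct identifier with a keyed hash (HMAC-SHA256)."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def minimize(record: dict, needed_fields: set) -> dict:
    """Keep only the fields required for the stated purpose."""
    return {k: v for k, v in record.items() if k in needed_fields}


# Hypothetical customer record held by the controller.
record = {
    "email": "alice@example.com",
    "name": "Alice Example",
    "birth_date": "1990-01-01",
    "country": "DE",
    "purchase_total": 42.50,
}

# Share only what an analytics processor needs, keyed by a pseudonym
# instead of the email address.
shared = minimize(record, {"country", "purchase_total"})
shared["subject_id"] = pseudonymize(record["email"])
print(shared)
```

The same email always maps to the same pseudonym, so the processor can still group records by subject, while the name, birth date, and email itself never leave the controller.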

The Blockchain: Data Protection by Design and Default

Taken together, the provisions of the GDPR could be considered a “Digital Declaration of Rights.” In listing the detailed requirements by which any institution or individual that processes personal data must abide, it places limits on the power of “digital states” (software platforms) and those who make use of them. By doing so, it reflects a commitment to many of the principles of digital self-sovereignty articulated by Christopher Allen in an influential 2016 essay. However, the GDPR also takes for granted centralized models of digital data storage and transmission that are now in the process of being replaced by newer ones based on distributed ledger technologies, most prominently blockchains. Blockchains get us closer than ever before to modes of digital identity in which the user is the primary owner of their data.

Centralized models of data storage rely on the implicit premise that custodians of information are trustworthy actors with a mandate to steward personal data. Blockchains, however, were designed in light of the frequent failure of even the best-intentioned centralized authorities to live up to their promise as stewards of the public trust. Accordingly, blockchains were built to function in a “trustless” environment, that is, one in which people can transact directly with one another without needing to trust any other actor in the ecosystem. This is why blockchains are not only decentralized but distributed: none of the nodes in the network running a blockchain protocol acts as an authority over others. A structure of incentives, enforced by one-way cryptographic functions, ensures the integrity of a ledger of transactions that is shared by all the nodes, without relying on human beings to come to consensus. In short, math, executed and validated by a network of computers, functions as a substitute for middlemen.

Not only does the blockchain remove the need to trust a centralized authority in order to keep an accurate record of activity, but it also makes surveillance of activity extremely difficult. There are many different types of blockchains today, but the Bitcoin blockchain, the world’s largest and most secure blockchain, was designed with pseudonymity and data minimization built in. Each transaction records only the following pieces of data:

  • The public key of the transaction sender
  • The public key of the transaction recipient
  • A cryptographic hash of the transaction content
    (This could be anything: a land title, a birth certificate, an academic diploma, a copyright, an article of clothing, currency, a quantity of precious metal, etc.)
  • The date and time of the transaction

This is “data protection by design and default.” It is computationally infeasible to reconstruct the content of a transaction from the one-way cryptographic hash. And unless one of the parties to the transaction links a public key to a known identity, there is no straightforward way to map transactions to individuals or organizations. What this means is that even though the Bitcoin blockchain is “public” (that is, anyone can see all the transactions on it), no personal information is made public. This is by design: it allows anyone to validate the integrity of the transaction ledger without violating the privacy of the parties transacting on it.
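The four-field record described above can be sketched in a few lines of Python. This follows the simplified description in this article rather than Bitcoin's actual transaction format, and the key strings, the document bytes, and the helpers `make_entry` and `matches` are hypothetical placeholders chosen for illustration.

```python
import hashlib
import json
from datetime import datetime, timezone


def sha256_hex(data: bytes) -> str:
    """One-way SHA-256 digest of the given bytes, as a hex string."""
    return hashlib.sha256(data).hexdigest()


def make_entry(sender_key: str, recipient_key: str, content: bytes) -> dict:
    """Build a ledger entry that stores a hash of the content, not the content."""
    return {
        "sender": sender_key,
        "recipient": recipient_key,
        "content_hash": sha256_hex(content),
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }


def matches(entry: dict, content: bytes) -> bool:
    """Check whether a presented document is the one the entry refers to."""
    return entry["content_hash"] == sha256_hex(content)


# Hypothetical document; it stays with its owner and is never published.
diploma = b"Bachelor of Science, awarded 2018"

entry = make_entry("issuer-public-key-placeholder",
                   "recipient-public-key-placeholder",
                   diploma)
print(json.dumps(entry, indent=2))

print(matches(entry, diploma))         # True: the original document verifies
print(matches(entry, diploma + b"!"))  # False: any alteration changes the hash
```

Anyone holding the original document can confirm that it matches the public entry, while someone who sees only the entry learns nothing about the document's content.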

Blockchain Applications: Centralization All Over Again?

The Bitcoin blockchain was built with the intention that each individual transacting on it would own their own blockchain “address” — that is, their public and private keys. This makes individuals the true owners of their personal data, which they can transport and present for verification wherever they need to.

However, management of one’s own keys is far from an intuitive process. And the stakes are high: if a private key is lost or stolen, control over all of the data it owns is lost as well. For this reason, many applications that write transactions to the blockchain don’t give users their own keys. Rather, they function as new centralized authorities; they write transactions to the blockchain using their own keys. Although this appears convenient for the user, the strategy has three broad implications:

  1. Uncertain identity: Without proof of ownership, identity verification (that is, making sure the person presenting a blockchain transaction is the one who really received it) continues to be insecure.
  2. No longevity: If these centralized blockchain applications go away, the individual has the same problem they would have if a traditional centralized software platform (like Facebook or a university account) disappeared: their data is gone.
  3. Vendor/Issuer Dependence: The recipient of blockchain transactions is in a new relationship of dependence: now with the application provider (usually a software vendor, but it could also be an issuing institution).

In other words, many of the applications that use the blockchain are still operating under the centralized authority model whose overreach the EU General Data Protection Regulation was designed to rein in. Some applications, however, hew very closely to the original promise of the blockchain for self-sovereign digital identity, a promise whose values are echoed in the GDPR legislation.

Blockcerts: Beyond GDPR Compliance

In order to achieve wide adoption as a consumer-facing technology, key management must be radically simplified and mediated through an intuitive user experience. The Blockcerts Wallet (iOS and Android) is an example of an application that manages users’ public and private keys in a way that allows them to independently receive official records anchored to the blockchain. It doesn’t rely on centralization to achieve this: the app and the data it stores are not owned by any vendor or issuing institution, which means there is no centralized honeypot for attackers. It is also fully open source, so anyone can inspect the code and confirm that it meets the highest security standards, or build their own versions of the app.

Blockcerts allows official records to be issued to the blockchain in a common, machine-readable format (JSON-LD). By using an open technology standard, Blockcerts also maximizes the portability and interoperability of official records: recipients can take them anywhere in the world, and any organization can verify their authenticity and their ownership by the individual presenting them.
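To give a sense of why a machine-readable format makes independent verification possible, here is a minimal Python sketch. It is not the Blockcerts verification procedure or library API: the certificate fields and context URL are hypothetical, and sorted-key JSON serialization stands in for the more involved canonicalization a real JSON-LD verifier would perform.

```python
import hashlib
import json


def fingerprint(document: dict) -> str:
    """SHA-256 over a deterministic serialization of the document."""
    canonical = json.dumps(document, sort_keys=True,
                           separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical certificate; field names are illustrative only.
certificate = {
    "@context": "https://example.org/hypothetical-context",
    "type": "Diploma",
    "issuer": "Example University",
    "recipientPublicKey": "recipient-public-key-placeholder",
    "title": "Bachelor of Science",
}

# Value the issuer is assumed to have anchored on the blockchain at issuance.
anchored_hash = fingerprint(certificate)

# A verifier that receives the same fields in a different order still matches.
reordered = dict(reversed(list(certificate.items())))
print(fingerprint(reordered) == anchored_hash)  # True

# Changing any field produces a different fingerprint.
tampered = {**certificate, "title": "Doctor of Philosophy"}
print(fingerprint(tampered) == anchored_hash)   # False
```

Because the serialization is deterministic, any organization can recompute the fingerprint from the file the recipient presents and compare it with the anchored value, without contacting the issuer or a vendor.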

The issuing institution may choose to retain copies of issued certificate files in its own database, in which case it will need to protect them in accordance with the GDPR, along with any other recipient data it chooses to store. But Blockcerts recipients can rest assured that, for the first time, they have vendor-independent, authentic digital documents which they own directly. And because these documents are written to a highly secure blockchain using an open standard, recipients have them for life and can verify them anywhere in the world.

The blockchain represents an opportunity not only to fulfill but to go beyond the promise of the European General Data Protection Regulation. While not all applications that write transactions to the blockchain achieve this promise, some, like the Blockcerts open standard, do, and those will be at the forefront of digital privacy and self-sovereignty for years to come.