Digital Identity Management in the Context of GDPR & Sovrin

Why Data Privacy Matters & How to Protect It

Jimmy J.P. Snoek

Published in

Tykn

Mar 20, 2018

Private Information as Big Data Food

In the developed world, the battle for privacy is arguably one of the most important ones to win within our lifetime. Our personal and (ought-to-be) private information is being collected en masse by companies large and small, sold for a handsome profit, and subsequently used to feed the “big data machine”, where all these data points are correlated and further used to generate leads for third parties.

This may sound relatively innocent, but it can become (and in many cases has become) a democracy- and equality-threatening problem when used to prey upon the already disadvantaged subsets of our population. Imagine algorithms being used to offer “attractive” payday loans to people already in dire financial straits, who have been estimated not to be financially educated enough to know that it will put them in debt forever; or algorithms filtering out university/job applications from specific postal codes (of relatively poor neighbourhoods) for the sake of “efficiency”, severely limiting the opportunities of those who need them the most; or, worse still, the LSI-R models for estimating risk of recidivism establishing a dangerous feedback loop which effectively contributes to the average prison sentence of African-American and Hispanic inmates in the US being 20% longer than that of their Caucasian counterparts (credit for these examples goes to Cathy O’Neil’s brilliant, yet shocking book: “Weapons of Math Destruction”).

Without meaning to sound like a regionalist, in terms of data privacy laws, Europe does a lot better than the U.S. (although The Netherlands is attempting to tarnish that reputation by introducing a “dragnet law” within the new Intelligence and Security Services Act, granting its intelligence services NSA-like surveillance powers). One of the latest big steps towards increased data privacy within the EU is the introduction of the General Data Protection Regulation (or GDPR) on May 25th of this year, with the “Right to be Forgotten” and “Privacy by Design” provisions being two of the most impactful changes to the previous directive.

I would like to touch on this within the context of Distributed Ledger Technology (what the cool kids call “Blockchain”), the Sovrin Network, and identity management.

General Data Protection Regulation (GDPR)

This upcoming regulation has shaken many businesses, NGOs, and even countries to their core, now that they suddenly find themselves at risk of facing colossal fines as a result of non-compliance after May 2018. Well, perhaps not “suddenly”, since word about the GDPR has been out for a good while now, giving all affected parties plenty of time to get their operations in order before the regulation goes into full effect. However, a substantial share is still playing catch-up, trying to figure out how to be compliant.

In essence, the GDPR (Regulation (EU) 2016/679) establishes a single set of rules across the EU to enable individuals to better control their personal data, and repeals Directive 95/46/EC when it becomes enforceable on May 25, 2018.

Most interesting for us as a company focussing on digital identity management are the revisions to the requirements of “implementing appropriate technical and organisational measures” concerning security actions “appropriate to risk”, such as:

Pseudonymisation and/or encryption of personal data;
Ability to ensure ongoing confidentiality, integrity, availability and resilience of systems and services that process personal data, and:
Ability to restore the availability and access to data in a timely manner in the event of a physical or technical incident.
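As a rough illustration of the first measure, pseudonymisation can be sketched as replacing a direct identifier with a keyed hash. This is a minimal sketch under my own assumptions, not something prescribed by the GDPR or by Sovrin: the function name `pseudonymise`, the per-context keys and the example e-mail address are all hypothetical.

```python
import hashlib
import hmac
import secrets


def pseudonymise(identifier: str, key: bytes) -> str:
    """Return a stable pseudonym for `identifier` under a secret key (HMAC-SHA256)."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical per-context keys; they would have to be stored separately
# from the pseudonymised records for the measure to mean anything.
key_billing = secrets.token_bytes(32)
key_analytics = secrets.token_bytes(32)

email = "alice@example.org"  # made-up example identifier
p_billing = pseudonymise(email, key_billing)
p_analytics = pseudonymise(email, key_analytics)

# Same person, different contexts: the two pseudonyms cannot be linked
# by anyone who does not hold both keys.
print(p_billing != p_analytics)
```

Using a different key per context is what keeps records from being correlated across datasets. Note that pseudonymised data still counts as personal data under the GDPR, since whoever holds the key can re-identify the person.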

The Sovrin Approach

Since Tykn is a Founding Steward of Sovrin, we find it important to contribute to the education of regulators, decision makers and the general public about the merits of using Distributed Ledger Technology and particularly the Sovrin Network for identity management.

As required by the spirit and letter of the law under the GDPR, Sovrin was built with “privacy by default” and “privacy by design” at its core. In my previous post I briefly touched on this, in that Sovrin’s approach to distributed ledgers strays somewhat from the conventional. In doing so, Sovrin might actually prove to be more effective, efficient, and above all privacy compliant, because, through the use of the Sovrin Network, personal data never has to be stored on the ledger, not even in encrypted form. This is done because:

(1) Distributed ledgers are still among the least efficient databases around for storing data.

//The main advantages of using Blockchain/DLT over centralised databases lie in resolving issues of “trust” and “robustness”; if your application requires neither of these, then centralised databases are still a better option. The primary reason for this is that centralised databases will likely always be faster than blockchains, since blockchains carry the additional burdens of signature verification, consensus mechanisms and redundancy, which heavily slow down processes within the network.//

(2) As set by The Article 29 Data Protection Working Party, hashed/encrypted private data is still to be considered private data regardless, as it could always be brute-forced (if not now, probably at some point in the future), and:

(3) This way, privacy can be ensured through non-correlation principles via pseudonymisation. So, instead of storing actual private information, the only things stored on the ledger (for the purpose of verification) are:

Decentralised Identifiers (DIDs) and associated DID Descriptor Objects (DDOs) with verification keys and endpoints;
Schemas;
Credential definitions;
Revocation registries, and:
Proofs of consent for data sharing.
(See: Sovrin: What Goes on The Ledger?)
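Point (2) above is easy to demonstrate for low-entropy attributes. The following is a minimal sketch, assuming a hypothetical record in which a date of birth was “protected” by an unsalted SHA-256 hash; the function names and the example date are mine, chosen for illustration only.

```python
import hashlib
from datetime import date, timedelta


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def brute_force_dob(target_hash: str, start: date, end: date):
    """Try every date from start to end (inclusive); return the match or None."""
    day = start
    while day <= end:
        if sha256_hex(day.isoformat()) == target_hash:
            return day
        day += timedelta(days=1)
    return None


# Hypothetical "anonymised" record: an unsalted hash of a date of birth.
stored_hash = sha256_hex("1987-06-14")

# Roughly 43,000 candidate dates, which takes a fraction of a second to try.
recovered = brute_force_dob(stored_hash, date(1900, 1, 1), date(2018, 12, 31))
print(recovered)  # 1987-06-14
```

Salting and stronger encryption raise the cost of such an attack, but data written to an immutable ledger stays there while attacks keep improving, which is the reason for keeping it off the ledger altogether.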

We need pseudonymisation and a “right to be encrypted” to prevent these things from turning our lives into a Black Mirror episode by flying around with facial recognition software and access to our entire personal history under the guise of “surveillance” (Photo by: João Rocha)

Now, unless you work in the identity space, these words probably mean very little to you. So here’s a short breakdown, which also showcases how no private information is ever stored on the ledger:

DIDs are a new type of identifier for verifying digital identities, and are entirely controlled by the identity owner, independent of centralised registries, authorities or identity providers. In line with the concept of “Privacy by Design”, identity owners may choose to issue as many DIDs as deemed necessary to obtain a sufficient degree of separation between identities, contexts or personas.
DIDs resolve to DDOs, lightweight JSON documents containing metadata which proves the ownership and control of a DID.
Schemas are basically the formal description of the structure of a database. For example, a simple proof-request schema for an NGO such as The Netherlands Red Cross could look like this:

{
  "proof-requests": [{
     "name": "Beneficiary-Registry",
     "version": "0.2",
     "attributes": {
        "first_name": "string",
        "last_name": "string",
        "date_of_birth": "string",
        "phone_number": "string",
        "address": "string",
        "household_representative": "boolean",
        "house_owner": "boolean"
     },
     "verifiableAttributes": ["address", "house_owner"]
  }]
}

Generally, we call “credentials” the different (often tangible) proofs of identity or qualification issued by authorities, such as driver’s licenses, passports, identification cards, credit cards, etc. Hence, credential definitions are, as the name suggests, merely the definitions of these different credentials to be stored on the ledger.
In the case that a claim is wrongly issued, or if the privilege associated with the claim is lost, there should be an option for issuers to be able to revoke the claim. The revocation registry is what tells the rest of the world how the issuer will publish the revocation information.
Proofs of consent (i.e. consent receipts) let people prove that consent was given or that data was received (basically saying the data has been received and checks have been executed on it).
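To make the first two items of this breakdown more concrete, here is a minimal sketch of an identity owner creating a separate DID, with a DDO-like record, for each relationship. It rests on several assumptions: 32 random bytes stand in for a real Ed25519 verification key, the identifier is derived as the Base58 encoding of the first 16 key bytes (my understanding of the Indy convention), and the field names `id`, `verkey` and `endpoint` as well as the endpoint URLs are hypothetical.

```python
import json
import secrets

B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def b58encode(data: bytes) -> str:
    """Minimal Base58 encoder (Bitcoin alphabet)."""
    num = int.from_bytes(data, "big")
    out = ""
    while num > 0:
        num, rem = divmod(num, 58)
        out = B58_ALPHABET[rem] + out
    # Each leading zero byte is encoded as a leading '1'.
    pad = len(data) - len(data.lstrip(b"\x00"))
    return "1" * pad + out


def new_pairwise_did(endpoint: str) -> dict:
    """Create a fresh DID with a DDO-like record for one relationship."""
    # Assumption: random bytes stand in for an Ed25519 public key;
    # no real key pair is generated in this sketch.
    verkey = secrets.token_bytes(32)
    return {
        "id": "did:sov:" + b58encode(verkey[:16]),
        "verkey": b58encode(verkey),
        "endpoint": endpoint,
    }


# One DID per relationship, so the bank and the NGO cannot correlate them.
did_for_bank = new_pairwise_did("https://agent.example/bank")
did_for_ngo = new_pairwise_did("https://agent.example/ngo")
print(json.dumps(did_for_ngo, indent=2))
```

The record contains only an identifier, a verification key and an endpoint. Nothing in it reveals a name, a date of birth or any other attribute from the schema above.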

As you can hopefully start to see, there is no inherent need to store private information on the ledger (as opposed to keeping it locally). Not doing so actually makes the network more efficient and privacy-compliant, as retrieving private data from a ledger would slow the whole system down and create privacy and compliance risks (increased risks of correlation and theft, as well as of future brute-forcing of presently trusted forms of encryption).

By not allowing users to relinquish personal data to an external ledger/database by design, Sovrin also enables the more effective and efficient exercise of data subject access rights, whilst promoting fundamental principles of data protection by reinforcing purpose limitations and promoting data minimisation and accuracy, ultimately providing a lawful basis for processing in the form of the data subject’s consent.

Another important aspect of Sovrin is that, though permissioned, the network is still built on a public ledger (Hyperledger Indy). So, whilst disallowing correlation of private data points, it still provides an auditable record of data processing-related rights, requests, and activities, promoting the transparency principles of the GDPR.

Evernym, the original founders of the Sovrin Network, who are currently designing enterprise solutions leveraging the network, have so far been of great help to Tykn in exploring Sovrin’s capabilities to help us make current identity infrastructures more resilient. (A special thanks here to Jason Law (CTO), who came to our office in The Netherlands on a Sunday last month to go over Sovrin’s architecture with us, and Andrew Tobin (Managing Director) for providing in-depth showcases of Evernym’s tools!) Hence, in looking to get some confirmation of my hypothesis that Sovrin is GDPR-compliant by default, I contacted Andrew, who responded as follows:

Yes absolutely. In fact we designed Sovrin to have the highest possible levels of privacy, which actually exceed what GDPR asks for. For example, selective disclosure is built in (only sharing a small set of attributes from a larger set), and zero-knowledge proofs are also built in and come for free.

Source: “Sovrin™: A Protocol and Token for Self-Sovereign Identity and Decentralized Trust”

Exactly the answer I was looking for! But, what exactly is “selective disclosure”? Well, verifiable claims and proofs can be issued using “Zero-Knowledge Proofs” (ZKPs). Using ZKPs, one could offer proofs (of identity, qualification, solvency, etc.) without ever having to actually show any private information. Relating this further to GDPR, this concept of selective disclosure using ZKPs already fulfils the criterion of “ongoing confidentiality of systems and services that process personal data”.

An example of an application for selective disclosure would be no longer having to send anyone your passport details for the purpose of verification (which, of course, include sensitive private information such as your national ID number, nationality, DOB and full name); instead, you send a mathematical (ZK-)proof that those details are indeed valid, without the other party ever having to see the details themselves (and thus having “zero knowledge” of your private information). Had such a system been in place last year, the private data of 145.5 million U.S. individuals, 860,000 U.K. individuals and 8,000 Canadian individuals would not have been exposed in the Equifax breach, which placed these people at a high risk of identity theft and credit card fraud. (And as a bonus, had this happened after the implementation of the GDPR, maybe Equifax would have actually ended up being held accountable for its negligence.)
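The idea of revealing only part of a credential can be sketched with salted hash commitments. To be clear about the assumptions: this is a deliberately simplified, hash-based stand-in, not a zero-knowledge proof and not the cryptography Sovrin actually uses; the function names and the passport attributes are hypothetical, and the issuer’s signature over the commitments is omitted.

```python
import hashlib
import secrets


def commit(name: str, value: str, salt: bytes) -> str:
    """Salted SHA-256 commitment to a single attribute."""
    return hashlib.sha256(salt + name.encode() + b"=" + value.encode()).hexdigest()


def issue(attributes: dict):
    """Issuer side: one random salt and one commitment per attribute."""
    salts = {name: secrets.token_bytes(16) for name in attributes}
    commitments = {commit(n, v, salts[n]) for n, v in attributes.items()}
    # In a real system the issuer would sign `commitments`; omitted here.
    return salts, commitments


def disclose(attributes: dict, salts: dict, names: list) -> list:
    """Holder side: reveal only the chosen attributes, with their salts."""
    return [(n, attributes[n], salts[n]) for n in names]


def verify(disclosed: list, commitments: set) -> bool:
    """Verifier side: every revealed attribute must match an issued commitment."""
    return all(commit(n, v, s) in commitments for n, v, s in disclosed)


passport = {"full_name": "Alice Example", "nationality": "NL", "dob": "1987-06-14"}
salts, commitments = issue(passport)

shown = disclose(passport, salts, ["nationality"])  # name and DOB stay private
print(verify(shown, commitments))                   # True
forged = [("nationality", "BE", salts["nationality"])]
print(verify(forged, commitments))                  # False
```

The verifier learns the nationality and nothing else, while the salts keep the undisclosed attributes from being guessed from their hashes. A true ZKP goes further: it can prove a statement about an attribute (for example, “over 18”) without revealing the attribute itself.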

Whilst ZKPs have been around for a long time (~30 years), so far, they have never been brought to scale because current infrastructures cannot efficiently employ ZKPs, leading to relatively simple statements taking literally minutes to prove (or on a mobile phone: hours). Through adoption of identity management systems using the Sovrin Network, implementing ZKP cryptography for issuing proofs and verifying claims might finally become a reality; a GDPR-compliant reality.

With all of this in mind, it is most satisfying to know that Sovrin’s novel architecture for identity management systems goes even beyond the new requirements set by the GDPR. Since fines for non-compliance will substantially increase as of May 25th (including penalties of up to the greater of €20 million or 4% of total worldwide annual turnover), we argue that the tools we develop will become even more valuable to our clients once the GDPR is enforced!
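For a sense of scale, the penalty ceiling quoted above is a simple “greater of” rule. The sketch below only illustrates that arithmetic; the function name and the example turnover figures are made up, and an actual fine is set case by case and may be far lower than the ceiling.

```python
def max_gdpr_fine(annual_turnover_eur: float) -> float:
    """Ceiling of the higher fine tier: the greater of EUR 20 million or 4% of turnover."""
    return max(20_000_000.0, 0.04 * annual_turnover_eur)


# Hypothetical companies with EUR 50 million, 500 million and 3 billion turnover.
for turnover in (50_000_000, 500_000_000, 3_000_000_000):
    print(f"{turnover:>13,} -> {max_gdpr_fine(turnover):>13,.0f}")
```

The two limits meet at a turnover of €500 million: below that, the flat €20 million ceiling applies, and above it the 4% figure takes over.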


Written by Jimmy J.P. Snoek