Personally Identifiable Information and GDPR compliance

Many organisations seem to advise starting a compliance journey towards the General Data Protection Regulation (GDPR) by evaluating the “Personally Identifiable Information” held by the company seeking compliance. The term “Personally Identifiable Information” does not appear in the GDPR, but it has a specific meaning in US privacy law. Hence, the term itself is very likely to cause massive heartache to anyone seeking compliance with the GDPR, and one can definitely question the wisdom of introducing it in GDPR training material.

Indeed, the US interpretation of that term is completely at odds with what is actually relevant for a GDPR assessment, because it pre-selects a set of identifying attributes. The discrepancy has been neatly explained before.

Instead, the GDPR defines personal data as any information relating to an identified or identifiable natural person, whether that person can be identified directly or indirectly. Recital 26 reads:

To determine whether a natural person is identifiable, account should be taken of all the means reasonably likely to be used, such as singling out, either by the controller or by another person to identify the natural person directly or indirectly. To ascertain whether means are reasonably likely to be used to identify the natural person, account should be taken of all objective factors, such as the costs of and the amount of time required for identification, taking into consideration the available technology at the time of the processing and technological developments.

A similar recital was already in the EU Data Protection Directive (also Recital 26, coincidentally), but it was not carried over into all Member State laws (which might explain some of the misunderstandings). This recital has served as the basis for a few recent decisions at the CJEU.

In short, when assessing the personal data held by your company, you cannot treat that data as if it were in a closed system (see “either by the controller or by another person”). Individuals now interact with interconnected companies, and the GDPR reflects that. First, focus outward and understand who in your ecosystem can connect which attribute to which other; from this you should be able to deduce many different paths to re-identification within your ecosystem. Second, work out which of the identifiers you hold or pass on for processing are also held by others, i.e. how you connect yourself to those re-identification paths. Third, turn inward and assess which of the data you hold within your systems or pass on for further processing is personal data (as a controller passing data on to processors, you remain partly responsible for protecting some of the rights the GDPR grants to individuals).
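The three steps above can be sketched as a small graph exercise. This is only an illustration, not a legal test: the attribute names, the parties and the LINKS table below are all made up for the example, and a real assessment would also have to weigh the cost and effort factors from Recital 26. Attributes are nodes, each link records which party is able to connect two attributes, and a breadth-first search looks for a path from something you hold to a direct identifier.

```python
from collections import deque

# Hypothetical ecosystem map (step 1): which party can connect which two attributes.
LINKS = [
    ("analytics_vendor", "cookie_id", "ip_address"),
    ("analytics_vendor", "cookie_id", "account_email"),
    ("isp", "ip_address", "subscriber_name"),
    ("our_company", "order_id", "cookie_id"),
]

# Attributes assumed, for this example only, to identify a person directly.
DIRECT_IDENTIFIERS = {"account_email", "subscriber_name"}


def build_graph(links):
    """Turn the link list into an undirected adjacency map."""
    graph = {}
    for party, a, b in links:
        graph.setdefault(a, []).append((b, party))
        graph.setdefault(b, []).append((a, party))
    return graph


def reidentification_path(graph, start, targets):
    """Return the shortest list of (attribute, linking party) hops, or None."""
    if start in targets:
        return []
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        attribute, path = queue.popleft()
        for neighbour, party in graph.get(attribute, []):
            if neighbour in seen:
                continue
            new_path = path + [(neighbour, party)]
            if neighbour in targets:
                return new_path
            seen.add(neighbour)
            queue.append((neighbour, new_path))
    return None


graph = build_graph(LINKS)

# Steps 2 and 3: for each attribute we hold, check whether it connects
# to a re-identification path somewhere in the ecosystem.
held_attributes = ["order_id", "page_load_ms"]
for attribute in held_attributes:
    print(attribute, "->", reidentification_path(graph, attribute, DIRECT_IDENTIFIERS))
```

In this toy map, order_id reaches account_email in two hops through the analytics vendor's cookie, so it would need to be treated as personal data, while page_load_ms has no known link. The value of the exercise lies in filling in the link table honestly, including links that only other parties can make.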

In practice, there are some very obvious re-identification paths that cut across a great many companies. To give concrete examples, social sharing buttons or Google Analytics cookies on your websites will, as currently implemented, definitely constitute a re-identification path. Since the companies offering those services to you consider themselves only processors, it is really the controller’s (your) responsibility, when contracting with them or installing their code snippets, to make sure their services can help you fulfil your obligations. If they cannot, you should ask them to change the service. If they don’t comply, you should not use them. In general, the adtech space is rife with such re-identification paths, but they are sometimes difficult to find due to the opacity of that ecosystem (nor could they escape Recital 26 on the grounds that identification is not reasonably likely, since the point of some of those paths is precisely to make identification easy).

I have tested this point of view in Switzerland and transatlantically (Safe Harbor), and it has been validated every time.