All information can be represented as strings of bits, recorded as 1s and 0s by a computer.

A Tale of Two Strings: An Information Economics Thought Experiment

It was the best of times for data scientists, it was the worst of times for data subjects.

People create information all day every day. Neurons fire in our brains, and our fingers react by tapping against keyboards or our feet respond by walking along the road. In both cases, a string of bits is generated with the help of an electronic device. Both strings of 1s and 0s get traded, exchanged, bought, and sold in markets.

But while markets for the handmade strings are functioning smoothly, as they have more or less for several hundred years, the nascent market for footmade information is a live dumpster fire of extraordinary proportions. The failure of the personal data market to function is now causing widespread social damage, undermining trust and even disrupting our democracies.

What makes handmade information so different from footmade information? Well, as the thought experiment above shows, nothing. The only difference is the rules we, as a society, choose to apply to those strings of 1s and 0s.

We protect handmade strings with civil rights and call it “intellectual property” supposedly because we intended to produce it and want it shared. At the same time we protect footmade information with human rights and call it “personal data” supposedly because we didn’t and don’t.

The reason for this is a historical accident: the technology that facilitated the mass production of handmade information was invented first. Gutenberg unleashed his printing press around 1439, while the smartphone and IoT dropped only this millennium. “Personal data” of course has existed for nearly as long as people, but industrial-scale production and (ab)use was impossible until recent technologies enabled the marketplace.

Regardless of whether information is handmade or footmade, it shares identical economic characteristics. There are two interrelated economic features of information that dominate its market architecture.

The first is the cost structure of production. Information is expensive to create, but cheap to recreate. In economic jargon, it has high fixed costs but near-zero variable costs. Hollywood blockbusters cost hundreds of millions to produce, but burning a DVD costs a few cents. Writing a novel takes years, but printing another paperback takes seconds.
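The cost structure described above can be sketched as simple arithmetic. This is a minimal illustration, not a model of any real market: the function names and the example figures are assumptions chosen to mirror the blockbuster-and-DVD example.

```python
def creator_cost_per_copy(fixed_cost: float, variable_cost: float, copies: int) -> float:
    """Average cost per copy for the party that paid to create the information."""
    if copies <= 0:
        raise ValueError("copies must be a positive integer")
    # The fixed cost is spread across every copy sold; the variable cost is paid per copy.
    return fixed_cost / copies + variable_cost


def copier_cost_per_copy(variable_cost: float) -> float:
    """Cost per copy for someone who duplicates the work without paying to create it."""
    # A copier never incurs the fixed cost, only the cost of reproduction.
    return variable_cost


# Hypothetical figures: a 200 million production budget and 5 cents per disc.
creator = creator_cost_per_copy(200_000_000, 0.05, 10_000_000)
copier = copier_cost_per_copy(0.05)
print(f"creator: {creator:.2f} per copy, copier: {copier:.2f} per copy")
```

However many copies the creator sells, the copier's cost per copy stays lower, and that gap is the profit incentive the rest of the argument turns on.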

The second is that information is non-excludable. It is extremely difficult to stop people from consuming your information once it has been created, precisely because making copies is cheap, easy, and nearly impossible to monitor. Just ask any author or musician. This is why no one “owns” any information in the traditional sense; instead you own rights to that information. In information markets, ownership and rights mean the same thing.

This is unusual and causes easy-to-understand problems. There is an ever-present profit incentive for someone else to copy your information. Whether that is a book or some data, if the information has value, you can be sure some bad actors will eventually try to steal it. Unsurprisingly, IP markets to this day are plagued by content piracy and require a range of complementary monitoring and enforcement mechanisms to minimize theft.

This is our current privacy problem in a nutshell: personal data is a book or a newspaper about you written by third-party machines. You might call it your robo-biography.

And the market for robo-biographies is failing disastrously because your rights to that information are being widely infringed.

Luckily, the rights in the GDPR taken together are similar in practice to a modern day, god-given, non-transferable copyright over your robo-biography.

You have edit rights and deletion rights over what goes in that book. You can ask your co-publishers to limit its reproduction, and ask for details about its audience, its distribution, and the purposes for which it’s being consumed.

Most valuable of all, you can request a copy of all the pages from that book about you for free and do what you like with it. It’s your robo-biography after all.

Rights to an editable copy of your robo-biography. Copy. Rights. Upgraded for humans living in the 21st century.

How can we be certain that copyright is an appropriate analogy despite the concept originating in commercial law rather than human rights law? Well, because both copy rights and data rights attempt to solve the exact same economic problem: fighting the destructive incentive to copy someone else’s valuable 1s and 0s. That similar rules emerged from an entirely different body of law is another historical accident.

The disparate tale of our two strings tells all. The personal data market is a dumpster fire because of too much illicit copying. This is a choice and we have the power to stop it. The end.

We know how to solve this problem because we do it all the time; what is missing is an adversarial incentive structure that fights the theft. Regulatory penalties alone are insufficiently effective. Royalties serve this incentive-realigning function in the IP markets, but nothing similar currently exists in the market for personal data. This pattern is normal if history is any guide: copying rights come first, and the penalties for their violation follow.

It’s time to broaden the definition of “identity theft” to include the unauthorized reproduction of any and all personal data. This is well grounded, because all personal data, even anonymized data, has entropy. Entropy, measured in bits, quantifies how far a string of 1s and 0s narrows down who a person is, and it is easily calculated: roughly 33 bits are enough to single out one individual among the world’s population. Damages of this kind are feasible under the GDPR’s non-pecuniary provisions.
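The entropy claim can be made concrete with a short calculation. The following is a minimal sketch, assuming that the attributes are statistically independent and that the population shares are known. The function names and example shares are hypothetical, chosen only for illustration.

```python
import math


def bits_to_identify(population: int) -> float:
    """Bits of information needed to single out one person in a population."""
    if population < 1:
        raise ValueError("population must be at least 1")
    return math.log2(population)


def surprisal_bits(share: float) -> float:
    """Identifying information, in bits, revealed by one attribute value.

    `share` is the fraction of the population that has this value.
    """
    if not 0 < share <= 1:
        raise ValueError("share must be in the interval (0, 1]")
    return -math.log2(share)


def combined_bits(shares: list[float]) -> float:
    """Total bits revealed by several attributes (assumes they are independent)."""
    return sum(surprisal_bits(s) for s in shares)


def expected_anonymity_set(population: int, shares: list[float]) -> float:
    """Expected number of people matching every attribute (same independence assumption)."""
    return population * math.prod(shares)


# Hypothetical attribute shares: 1 in 2, 1 in 365, and 1 in 10,000 of the population.
shares = [1 / 2, 1 / 365, 1 / 10_000]
needed = bits_to_identify(8_000_000_000)
revealed = combined_bits(shares)
print(f"needed: {needed:.1f} bits, revealed: {revealed:.1f} bits")
print(f"expected matches: {expected_anonymity_set(8_000_000_000, shares):.0f}")
```

The closer the revealed bits come to the bits needed, the smaller the crowd a person can hide in, which is why supposedly anonymized records can still identify someone once enough attributes are combined. Real attributes are often correlated, so the independence assumption tends to overstate the bits revealed.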

But that doesn’t go far enough. We also need to clearly link damages for illicitly copying personal data to the economic value generated by its reproduction. There is scope for this already within the GDPR’s pecuniary damage provisions. Only this can stop commercial actors from exploiting people’s information without their permission. 2018 will prove a turning point, and data pirates ignore the economics and history of broken information markets at their peril.

