Using The Blockchain To Reform Data Protection: Coding Privacy Into The System

Some solutions are good, some are so great that we should use them to solve problems they weren’t even designed for

Pauline Kuss
Feb 25, 2017

Much has been written about the shortcomings of our current approach to protecting personal data and privacy, and I have pointed to some of them myself in a previous article. I have always been a friend of constructive problem solving, so I find myself repeatedly annoyed and disappointed that I seem unable to propose any real solution to the problems I can so easily point out. Although I could effortlessly write essays about my doubts concerning the effectiveness and long-term practicability of our current way of (legally) handling data protection issues, I don’t really have an alternative to present that would strengthen my argument and give people something to work with. So I sat down and I thought about it, and I went for a walk and I thought about it, and I tried to do something creative to let all my thoughts flow around freely, but nothing really came.

Then, when I had somewhat given up, or at least allowed myself a break from all that thinking, I was reading some articles about the Bitcoin system for a totally unrelated reason. I had been talking with a friend of mine about it and wanted to understand how exactly it worked. I thus dove into explanations of the underlying blockchain technology, a system that allows for trust and thus secure (financial) transactions, enabled through a decentralized register which records every single transaction and whose information can be accessed by all parties operating in the system. In fact, all nodes (computers) that are part of the blockchain store a constantly updated copy of this register, so that no single node can make unauthorized changes or corrupt the record in any way. An attempt to falsify the stored record would be detected as soon as the affected copy is compared with the other copies in the system. It would have no effect since, put simply, a transaction is only regarded as valid if a majority of the network carries it in its digital ledger. That way, the blockchain secures transactions by building trust on a decentralized verification network that no single participant can game. Although originally used for financial services, the idea behind the blockchain has recently captured the interest of various parties, who come up with ever more scenarios in which such a decentralized transaction or storage system could overcome frequently faced challenges in their field of expertise.
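For the more hands-on reader, here is a minimal Python sketch of why a falsified copy stands out. It is a toy model under loud assumptions: the names `block_hash`, `build_chain`, `is_consistent` and `majority_tip` are made up for illustration, and the "majority of copies" comparison mirrors the simplified description above. Real networks such as Bitcoin reach agreement through proof-of-work rather than a head count of copies.

```python
import copy
import hashlib
import json
from collections import Counter

GENESIS = "0" * 64  # placeholder "previous hash" for the very first block


def block_hash(index, prev_hash, transactions):
    """Hash a block's contents together with the previous block's hash."""
    payload = json.dumps(
        {"index": index, "prev": prev_hash, "tx": transactions},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_chain(batches):
    """Build a list of blocks, each linked to its predecessor by hash."""
    chain, prev = [], GENESIS
    for index, transactions in enumerate(batches):
        digest = block_hash(index, prev, transactions)
        chain.append(
            {"index": index, "prev": prev, "tx": transactions, "hash": digest}
        )
        prev = digest
    return chain


def is_consistent(chain):
    """Recompute every hash; an edited block no longer matches its hash."""
    prev = GENESIS
    for block in chain:
        if block["prev"] != prev:
            return False
        expected = block_hash(block["index"], block["prev"], block["tx"])
        if block["hash"] != expected:
            return False
        prev = block["hash"]
    return True


def majority_tip(copies):
    """Return the final block hash that most node copies agree on."""
    tips = Counter(chain[-1]["hash"] for chain in copies)
    return tips.most_common(1)[0][0]


honest = build_chain([["alice pays bob 5"], ["bob pays carol 2"]])

# A careless edit of one stored block no longer matches its recorded hash.
edited = copy.deepcopy(honest)
edited[0]["tx"] = ["alice pays mallory 500"]
print(is_consistent(edited))  # False

# A careful forger rebuilds all hashes, but then disagrees with everyone else.
forged = build_chain([["alice pays mallory 500"], ["bob pays carol 2"]])
agreed = majority_tip([forged, honest, copy.deepcopy(honest)])
print(agreed == honest[-1]["hash"])  # True
```

Because every block's hash depends on the block before it, changing one old transaction changes every later hash, which is what makes the forged copy visibly different from the rest.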

That’s where they got me. We are talking about a system that facilitates trust among the parties of a transaction or, better, a system that replaces trust between those parties with a highly secure network, making such trust unnecessary because it technically prevents the parties from cheating each other.

Now, think of some of the problems related to data protection: those arising when uninformed citizens download a bunch of mobile apps and accept all kinds of terms and conditions without spending a minute to read through them or to reconsider whether the weather app really needs to access their phone’s address book. Think about the missing transparency: what happens to such data after it has been accessed and possibly retrieved from your phone, and stored somewhere in the depths of some company’s computing center? Could you list which access permissions you have granted to which application on your phone? Maybe you’d like to take some of these permissions back, but how do you do that? By simply deleting the app? And what happens to the data they already got? Data protection is complicated, and although data subjects do have quite a few rights concerning access to and correction of their personal information stored by third parties, it is rather unlikely that the average citizen is aware of any of them or knows how to use them.

We need something easier: something which gives a feeling of personal control back to individuals who have simply given up on caring about their privacy because of the complexity and seeming hopelessness of the entire topic.

Besides missing knowledge and awareness, the temptation of short-term benefits and simple convenience quickly wins out over careful consideration of the possible consequences of generous data sharing. Personal control and informed consent are central themes in the (at least European) data protection regime: the consent of the data subject is one of the grounds that permit a lot of things which would otherwise be prohibited by regulation (such as the processing of certain types of data). Although I like the sound of freedom that comes with the notions of personal control and overriding individual consent, I do worry that data subjects’ lack of competence to make truly informed decisions about their data may often lead to lower levels of personal protection for these individuals, levels below the standards that the legislators who crafted the general data protection rules would have liked to see. Maybe the rapid development of technology and its possibilities is too complex for the average citizen, and the benefits they can receive from carelessly signing personal data processing agreements are too tempting, for notions such as “informed consent” and the related concept of free and deliberate choice to hold such a dominant place in our data protection regimes.

So okay, we got some problems here but why did you tell us about all that blockchain stuff? That doesn’t have to do with anything!

Well, it kind of does. Or at least, I started to wonder whether it was possible to use the blockchain system in such a way that its possibilities align perfectly with the current challenges in the protection of personal data identified above.

Let’s leave the purpose of financial transactions behind for a moment and think about a scenario in which I wish to download a certain app and its developer in return asks to access certain kinds of information, such as my contacts and location details. This is basically nothing other than a transaction! I transfer the right to access particular chunks of data to the app provider. However, seeing how many apps the average person runs on their phone, and how many of us subscribe to some kind of customer program or use other online services, it is quite easy to lose sight of the abundance of rights you have transferred to various parties, each asking to access a different chunk of your data. Besides this lack of transparency, the multitude of redundant data copies sitting around somewhere (if you have five apps that all require access to your location data, for example) obviously increases the chance that some of it could be leaked or otherwise find its way into the hands of people you didn’t authorize. The multitude of copies and the varying degrees of access rights are thus clearly an issue, and while reading about all the other ideas based on a re-utilization of the blockchain technology, I started to wonder: wouldn’t it be possible to use the whole thing as one big and secure data storage in which technical means ensure that agreements on data access cannot be misused? An extensive Google search left me surprised that not much seems to have been written about such a possibility yet. However, I was able to find one paper that takes up the idea, “Decentralizing Privacy”, which basically follows my line of reasoning up to here and then makes some suggestions as to what an actual implementation could look like. The paper goes into great detail, and especially the nerdy reader might enjoy diving into its pseudocode and technical elaborations. I suggest anyone interested in the specifics have a look at it.

But just to draw a rough picture for the less tech-savvy reader: the basic idea is to think of every agreement between an individual and another party (such as an app developer) who asks to access certain (personal) data of that individual as a transaction. Thus, every time you grant someone permission to use your location data, a new transaction is created which specifies the parties involved (you and the app developer) as well as the granted permission (access to location data, but maybe not to the contacts stored on your phone), and this transaction is spread throughout the entire network of blockchain nodes. One additional feature of the blockchain that I haven’t touched upon so far relates to pairs of public and (secret) private keys, which secure the authenticity of transactions through some nifty cryptography. I won’t go into details here, but the important thing is that such a key system allows us to restrict who can access which transaction (or information) stored in the network. Using these digital keys, both partners are subsequently able to view the details of their transaction, which gives data subjects great transparency and oversight over all the permissions they have given, and furthermore the possibility to easily revoke them at any time. The party interested in the data (think app developer), on the other hand, can use their authentication key to prove their identity and prove their permission to access a certain chunk of data (such as your location data). All this information is stored in the transaction written to the digital register that can be found in all nodes of the network. Thus, it is impossible for anyone (at least ignoring the hypothetical case in which one entity would control more than half of the network) to tamper with their rights to access certain data as written in the transaction log. In such a system, accessing data outside the scope of a given permission would simply be technically impossible.
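To make the "permission as a transaction" idea a bit more tangible, here is a small Python sketch. It is only loosely in the spirit of the paper and not its actual protocol: the class and method names (`AccessTransaction`, `PermissionLedger`, `may_access`, `overview`) are hypothetical, the log lives in a single process instead of a network of nodes, and the public/private key signatures described above are left out entirely.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AccessTransaction:
    """One log entry: which user granted which permissions to which service."""
    user: str
    service: str
    permissions: frozenset  # an empty set means "everything revoked"


@dataclass
class PermissionLedger:
    """Append-only log; entries are added but never edited or deleted."""
    log: list = field(default_factory=list)

    def grant(self, user, service, permissions):
        self.log.append(AccessTransaction(user, service, frozenset(permissions)))

    def revoke(self, user, service):
        # Revoking is itself a new transaction with an empty permission set.
        self.grant(user, service, ())

    def current_permissions(self, user, service):
        # The most recent transaction for this user/service pair wins.
        for tx in reversed(self.log):
            if tx.user == user and tx.service == service:
                return tx.permissions
        return frozenset()

    def may_access(self, user, service, data_type):
        return data_type in self.current_permissions(user, service)

    def overview(self, user):
        """Everything a user has currently granted, listed per service."""
        services = sorted({tx.service for tx in self.log if tx.user == user})
        result = {}
        for service in services:
            permissions = self.current_permissions(user, service)
            if permissions:
                result[service] = sorted(permissions)
        return result


ledger = PermissionLedger()
ledger.grant("alice", "weather_app", {"location"})
ledger.grant("alice", "chat_app", {"contacts", "location"})
ledger.revoke("alice", "chat_app")

print(ledger.may_access("alice", "weather_app", "location"))  # True
print(ledger.may_access("alice", "weather_app", "contacts"))  # False
print(ledger.overview("alice"))  # {'weather_app': ['location']}
```

Treating a revocation as just another transaction keeps the full history intact, while the `overview` method gives the data subject the kind of at-a-glance list of granted permissions that is so hard to get today.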

The attentive reader might argue that storing large amounts of data in the blockchain would be infeasible, which is a really good point that I got stuck on as well. But it seems that the data itself might not even have to be stored in the blockchain. It could be kept in off-blockchain storage, while the transaction stores merely a digital pointer (a hash, that is, some kind of code which indicates the particular data it refers to and where to find it).
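A rough Python sketch of such a pointer, assuming the pointer is simply the SHA-256 hash of the data. The two containers `off_chain_store` and `on_chain_log` and the functions `store_data` and `fetch_data` are hypothetical stand-ins for an external storage service and the ledger.

```python
import hashlib

off_chain_store = {}  # stand-in for storage outside the blockchain
on_chain_log = []     # stand-in for the ledger; holds pointers, never the data


def store_data(owner, data_type, payload):
    """Keep the payload off-chain and record only its hash on-chain."""
    pointer = hashlib.sha256(payload).hexdigest()
    off_chain_store[pointer] = payload
    on_chain_log.append({"owner": owner, "type": data_type, "pointer": pointer})
    return pointer


def fetch_data(pointer):
    """Look up the payload and check that it still matches its pointer."""
    payload = off_chain_store[pointer]
    if hashlib.sha256(payload).hexdigest() != pointer:
        raise ValueError("off-chain data does not match its on-chain pointer")
    return payload


pointer = store_data("alice", "location", b"52.52,13.40")
print(len(on_chain_log[0]["pointer"]))  # 64 hex characters, whatever the data size
print(fetch_data(pointer) == b"52.52,13.40")  # True
```

The ledger entry stays the same small size no matter how large the data is, and since the pointer is derived from the content, anyone fetching the data can notice if it was altered in the off-chain storage.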

So on the one hand we have the individual, who has a clearly arranged and transparent overview of the access rights they have granted to which parties. Based on that, they can take more conscious action to restrict such permissions at any point in time, which strengthens the position of individuals as owners of their own data. On the other hand we have the parties interested in accessing and processing (personal) data from individuals, who can do so by using their private key to identify themselves in the network, which knows which particular datasets the party in question is allowed to access. To prevent such a party from accessing, downloading and subsequently storing a dataset in their own system, and thus undermining the regulatory power of the blockchain, the authors of the above-mentioned paper note that it would be possible to restrict access to the raw data files and instead ask permitted parties to run computations on the data directly on the network and access only the final results.
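As a toy illustration of "results only, no raw data", consider the following Python sketch. Everything in it is an assumption made for the example: the `DataVault` class, the `ALLOWED_COMPUTATIONS` whitelist and the sample step counts are invented, and a leading underscore in Python is a convention, not a real security boundary. In an actual system the separation would have to be enforced by the network itself.

```python
from statistics import mean

# Hypothetical whitelist: only these aggregate computations may be requested.
ALLOWED_COMPUTATIONS = {
    "count": len,
    "mean": mean,
    "max": max,
}


class DataVault:
    """Holds raw records and hands out computed results, never the records."""

    def __init__(self):
        self._records = {}    # (user, data_type) -> list of numbers
        self._grants = set()  # (user, service, data_type)

    def add_records(self, user, data_type, values):
        self._records.setdefault((user, data_type), []).extend(values)

    def grant(self, user, service, data_type):
        self._grants.add((user, service, data_type))

    def compute(self, service, user, data_type, computation):
        if (user, service, data_type) not in self._grants:
            raise PermissionError(f"{service} may not use {user}'s {data_type}")
        if computation not in ALLOWED_COMPUTATIONS:
            raise ValueError(f"computation {computation!r} is not allowed")
        return ALLOWED_COMPUTATIONS[computation](self._records[(user, data_type)])


vault = DataVault()
vault.add_records("alice", "daily_steps", [4000, 6000, 8000])
vault.grant("alice", "fitness_app", "daily_steps")

print(vault.compute("fitness_app", "alice", "daily_steps", "mean"))  # 6000
```

One caveat worth keeping in mind: even aggregate results can reveal something about the underlying data, so which computations end up on such a whitelist is a design decision in its own right.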

Besides facilitating transparency, a clear overview and easy-to-manage access control for data subjects, a blockchain-inspired approach to data access management would also strengthen regulatory bodies. Through the system, they could intervene more directly in the agreements made between companies and individuals concerning the accessing and processing of the latter’s personal data: regulations or standards concerning the storage, access, processing or sharing of specific kinds of data could simply be implemented in the code of the blockchain!
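What "implemented in the code" could mean in practice is sketched below in Python. The two rules, the constants `SENSITIVE_TYPES` and `MAX_RETENTION_DAYS`, and the transaction fields are purely illustrative assumptions and not a statement of what any regulation actually requires. The point is only that a transaction violating a rule never makes it into the log.

```python
# Hypothetical rule parameters, chosen only for illustration.
SENSITIVE_TYPES = {"health", "religion"}
MAX_RETENTION_DAYS = 365


def rule_violations(tx):
    """Return a list of rule violations for a proposed access transaction."""
    problems = []
    if tx["data_type"] in SENSITIVE_TYPES and not tx.get("explicit_consent", False):
        problems.append("sensitive data requires explicit consent")
    if tx.get("retention_days", 0) > MAX_RETENTION_DAYS:
        problems.append("retention period exceeds the allowed maximum")
    return problems


def append_if_valid(log, tx):
    """Add the transaction to the log only if it violates no rule."""
    problems = rule_violations(tx)
    if not problems:
        log.append(tx)
    return problems


log = []
ok = append_if_valid(
    log, {"service": "weather_app", "data_type": "location", "retention_days": 30}
)
rejected = append_if_valid(
    log, {"service": "quiz_app", "data_type": "health", "retention_days": 900}
)

print(ok)        # []
print(rejected)  # both hypothetical rules are violated
print(len(log))  # 1
```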

I don’t pretend that the idea presented here is flawless or able to solve all challenges surrounding the protection of personal data and privacy. Rather, I ask you to read this article as an attempt to offer some constructive suggestions to a discussion which is often shaped by cynical criticism, and as an incentive to change your perspective on the process of solving a problem, any problem! Maybe it is not always necessary to come up with a new solution from scratch. Maybe sometimes it is all about observing your problem from multiple angles and identifying which of its aspects might be solvable by a solution that already exists somewhere else.

Pauline Kuss

Data-Driven Innovation, Artificial Intelligence and Bigger Picture Perspectives – Writings of a DataScientist-slash-TechLawyer