Data protection, open source and APIs. How do they match?

Ilari Mikkonen
Published in APInf
3 min read · Mar 26, 2018

We are generating more and more data every day. IoT devices spew out data in breathtaking amounts. Devices that we carry around send sensor data to God knows who. On top of that, we voluntarily generate and give out data. How many times do you tweet per day?

Usually a single data source is not that interesting on its own; the combination of data sources, and the profiling it enables, is. That combination interests various parties, and their motives are not always benign.
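To make the point about combining sources concrete, here is a minimal sketch. Everything in it is made up for illustration: the records, the `device_id` field and the `build_profiles` helper are assumptions, not taken from any real service. Each list says little on its own, but joining them on a shared identifier starts to look like a profile.

```python
from collections import defaultdict

# Hypothetical, made-up records. Neither source says much on its own.
location_pings = [
    {"device_id": "d1", "hour": 8, "place": "clinic"},
    {"device_id": "d1", "hour": 18, "place": "gym"},
    {"device_id": "d2", "hour": 9, "place": "office"},
]
purchases = [
    {"device_id": "d1", "item": "pain relief gel"},
    {"device_id": "d2", "item": "coffee"},
]


def build_profiles(pings, purchase_records):
    """Merge two independent sources on a shared identifier (illustrative only)."""
    profiles = defaultdict(lambda: {"places": [], "items": []})
    for ping in pings:
        profiles[ping["device_id"]]["places"].append(ping["place"])
    for purchase in purchase_records:
        profiles[purchase["device_id"]]["items"].append(purchase["item"])
    return dict(profiles)


profiles = build_profiles(location_pings, purchases)
print(profiles["d1"])
```

A clinic visit plus a pain relief purchase hints at something neither record reveals alone, and that is roughly why the combination is the interesting part.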

Data privacy is a hot topic right now. Legislation is changing around the world; take GDPR, for example. Transparency in general is a big thing. Many public sector organisations prefer open source components in their systems and try to steer away from vendor lock-in. This is an EU-wide direction. For example, an HSL report from 2016 explicitly states how they used “open data and open source code” in their Journey Planner.

Open or closed source? Let’s look at this through an example from another area: security. Take OpenSSL, for example. Ask yourself this: which crypto library would you rather use, open source or closed? With open source software you can (there are people doing this, I’m not one of them) go and inspect the source code to see what is going to happen when you use the library. Remember Heartbleed? The good thing is that the security holes in major, mature open source components are usually difficult enough to exploit that they rarely matter in practice.

Open source also has its issues. Anyone can inspect the code, so an attacker can look for holes in the source and exploit them. Popular components are fine, but if you use obscure open source components, you are asking for trouble. Also, there is usually no guaranteed support, so you might end up maintaining the component alone or dropping it. So choose wisely. On the other hand, if you choose, say, Microsoft, the probability of losing support is minor. But do you trust Microsoft? How do they treat your data?

So what does open or closed source have to do with data privacy? REST APIs can be seen as a sort of mixer into which you pour (your) data and processing. If everything is closed, you cannot tell where the end result is being distributed. Unfortunately, even if the API were open source, that would not prevent misuse of the data, but it at least adds another layer of transparency.

Should all REST APIs be open source also?

Well, like many things in life, that depends. If you have a business and you are exposing your data via an API, there is no particular point in exposing the internal workings. Even though it may look like there is no problem in open-sourcing your APIs today, this may change in the future, when someone combines that information with information from other sources. You’d need to keep configs and such locked away anyway, so why not hide everything?

Where would I like to see open source APIs? In the public sector, where I give my data. For example, if I choose to give my location data to the local bus company, I’d like to see that they don’t share it with someone I do not like. Or if I go to a health centre to see a nurse. I’d also like to see what Google and Facebook are doing with my data, but I think that is not going to happen. I might not want to know if the NSA gets my data on request. Or where all the backdoors are. There are reasons why many companies in Europe do not want their data travelling to the USA. Why many people use Slack for company-secret stuff, I do not understand, but maybe I’m just paranoid.

Is the open source approach really a be-all and end-all solution? Nope, because people are idiots or may be coerced. Whoever handles your data and has access to it can violate laws and your rights. The police in Eastern Finland in particular have excelled at violating people’s rights by snooping on data.

Understand how your data is used. I’m giving my data to Facebook, Google knows where I go and K-group knows what I’m buying. I’m trading my information for some stuff, like discounts or entertainment. For the really sensitive scenarios, I want to see open source and open source APIs.

Image from Flickr under CC BY-SA 2.0
