Continuous delivery for data protection by design and by default — Quality assurance and approval of tools and third party software

Johan Sydseter
May 27, 2019 · 10 min read

In a series of articles, I will present a way of continuously delivering software while protecting the privacy of natural persons by design and by default.

Just because your colleague, David, said it was OK doesn’t mean that you can use whatever technology you want without approval or review. Data protection by design and by default requires that all tools are approved and reviewed before and during the processing of personal data.

(Image attribution: Mahnazy Azdani)

Data protection by design and by default

As a developer or architect, you should be encouraged by the data controller to “take into account the data-subject’s right to protection when developing and designing” applications and services (GDPR Article 25).
Let’s say you need to select a primary database that will store health data, a so-called high-risk scenario. As you review various database solutions, you find two options that would suit your needs. Option 1 costs $60,000 per year while option 2 costs $90,000, but option 2 has all of the security and privacy protection features and security certifications you are looking for, while option 1 is rather lacking in both. What if the less expensive option could help you win a contract with a potential customer? Should you choose option 1 because it is cheaper? Of course not; it would be highly unethical.
Put yourself in the customer’s seat: you are presented with two offers from two different contractors. One is 50% more expensive but has taken into account proper data protection according to the risk, nature, scope, and context of the data processing, while the other presents a rather shallow security and privacy solution. It would be highly unethical for the customer to knowingly choose the less secure and confidential solution. When selecting a data processor or data sub-processor, price should not be the deciding factor. The decision should be based on a proper analysis of the risk related to the processing of personal data, “taking into account the state of the art, the cost of implementation and the nature, scope, context, and purposes of processing”. If the decision is made without taking these factors into consideration, then you’re not doing data protection by design and by default.
To help your customer or manager in the decision-making process, you could set up a small table comparing the cheaper option with the more expensive one to show how they compare in relation to privacy and security.
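For example, a comparison for the database scenario above might look like this (the features listed are purely illustrative):

  • Yearly cost: option 1 $60,000, option 2 $90,000
  • Encryption at rest: option 1 no, option 2 yes
  • Audit logging: option 1 partial, option 2 yes
  • Role-based access control: option 1 no, option 2 yes
  • Security certifications held by the vendor: option 1 none, option 2 ISO 27001

Presented like this, the decision maker can see exactly which protections they are trading away for the $30,000 difference.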

Showing due regard to the state of the art

According to GDPR article 25, clarified through recital 78, processors and controllers should «take into account the right to data protection…with due regard to the state of the art», meaning that you should always take into account that the environment in which your applications and services run is changing, and that the applications therefore need to change with it in order to ensure secure and confidential processing of personal data.
A typical example would be that a security vulnerability like Heartbleed gets disclosed and a patch immediately becomes available. If you knowingly or recklessly fail to patch your systems with the latest patch, you’re not showing «due regard to the state of the art». So let’s say you start patching your software whenever a new patch is available. Would that be enough to secure your software? You would always install the latest security update that way, right? Yes, but you would also be exposing your software to threats coming from the latest patch versions of some of your favorite open source libraries. Sometimes new patches introduce critical vulnerabilities, so being able to continuously discover new threats and roll back quickly is a critical capability too.
You should continuously scan your dependencies and compare them against national and international databases that publish the latest publicly disclosed vulnerabilities. There are a lot of new tools on the market to help you automate and keep track of your dependencies. JFrog is a company that specializes in DevOps solutions for dependency management. From experience, I can say that their solution more than pays for the manual labor you would otherwise have to do to manage your dependencies and stay on the right side of the law. A microservice architecture can have thousands of dependencies spread across hundreds of services; keeping track of them all is impossible without a central management solution, and continuous delivery becomes impossible without one when developing microservices.
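To make “continuously scan your dependencies” concrete, here is a minimal sketch of such a check against the public OSV.dev vulnerability database. The dependency list, ecosystems and versions are illustrative, and a commercial solution like JFrog Xray does the same thing continuously and at much larger scale:

```python
# Check a short dependency list against the OSV.dev vulnerability database.
# The dependencies below are illustrative; generate the list from your own
# build tool or artifact repository instead.
import requests

DEPENDENCIES = [
    # (ecosystem, package name, version)
    ("Maven", "com.fasterxml.jackson.core:jackson-databind", "2.9.8"),
    ("npm", "lodash", "4.17.11"),
]

def known_vulnerabilities(ecosystem, name, version):
    """Ask OSV.dev for publicly disclosed vulnerabilities affecting one dependency."""
    response = requests.post(
        "https://api.osv.dev/v1/query",
        json={"version": version, "package": {"name": name, "ecosystem": ecosystem}},
        timeout=10,
    )
    response.raise_for_status()
    return response.json().get("vulns", [])

if __name__ == "__main__":
    for ecosystem, name, version in DEPENDENCIES:
        for vuln in known_vulnerabilities(ecosystem, name, version):
            print(f"{name} {version}: {vuln['id']} {vuln.get('summary', '')}")
```

Run as part of your delivery pipeline, a check like this flags a dependency the moment a vulnerability is publicly disclosed for it, which is exactly the kind of continuous «due regard to the state of the art» the regulation asks for.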

Approving new tools and third party software

A sure recipe for failure when implementing data protection by design and by default is to allow your developers, meaning me, the freedom to choose whatever open source solutions I want to use in your project without any formal approval or review process in place. What will happen is that I will run off, ask my cool developer friend, David, for some tips, and find a lot of nice code libraries and databases I want to use. Some of them will surely be GPL licensed, but what does GPL stand for anyway? Then I will go ahead and use them when developing the 100–200 microservices that the architect team wants me to develop, and once the test manager says that the testing went OK, I will deploy all of it to production and hope that the project manager will set aside some time for maintenance and manual patching of the 100–200 microservices and 3000 dependencies sometime in the distant future.
If you thought microservices were a great idea before you started such a project, you will never ask a developer or architect to design and develop microservices ever again.

Before things get out of hand, preferably before you start the project, you should sit down with your DPO, security officer or security architect and write some criteria for the approval and use of third party software. The list will look something like this:

In order to get approval for any new third party software that will be used in production, the following criteria have to be met. Third party software must:

  • have a compatible license.
  • undergo monthly patching.
  • follow semantic versioning.
  • undergo security patching by the vendor on a regular basis.
  • have no publicly disclosed critical vulnerabilities reported for the software without there being a patch that can immediately be applied.
  • have no deprecated dependencies included in the software.

You will then need to continuously monitor and review the dependencies you use to make sure the criteria are followed. The Enterprise solution from JFrog can cover most of your needs, but you will still need a change management process for approving third party software. That means you need to keep an up-to-date list of all your direct dependencies and the software that is part of, or used by, your system, where the approval for the use of the software has been documented by a security officer, manager or security architect. The developers can keep the list up to date themselves as long as you have a central management solution for dependencies that alerts management of any deviations. I would really advise against trying to build such a solution yourself or relying on manual processes. It‘s too error-prone, time-intensive and demotivating to do it manually. Ignoring the risk altogether means you’re deliberately not showing „due regard“, which means you’re not doing data protection by design and by default.
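As a minimal sketch of what such a change management check could look like, the snippet below compares the dependencies actually in use against an approved list where each approval is documented. The entries, names and approvers are hypothetical; a platform like JFrog Artifactory and Xray can provide the inventory and the alerting for you:

```python
# Compare the dependencies in use against the documented approvals and report
# any deviation. Entries and approvers below are hypothetical examples.
APPROVED_THIRD_PARTY = {
    "org.apache.commons:commons-lang3": {"approved_by": "security officer", "date": "2019-03-12"},
    "com.fasterxml.jackson.core:jackson-databind": {"approved_by": "security architect", "date": "2019-04-02"},
}

def deviations(dependencies_in_use):
    """Return every dependency in use that has no documented approval."""
    return sorted(dep for dep in dependencies_in_use if dep not in APPROVED_THIRD_PARTY)

if __name__ == "__main__":
    # In practice this set would be exported from your build tool or artifact repository.
    in_use = {"org.apache.commons:commons-lang3", "log4j:log4j"}
    for dep in deviations(in_use):
        print(f"Not approved for production use: {dep}")
```

Keeping the approved list in version control gives you the documentation trail, and running the check in the pipeline gives you the alerting.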

You still need to review your dependency management system on a regular basis to make sure it works. This is what I call „the tools and libraries review“. You will have to review it once every sprint if your central dependency management system is not automated and complete according to the points described below; otherwise, you can probably reduce the reviews to twice a year.

The tools and libraries review

The Norwegian supervisory authority for privacy-related concerns created an example checklist some time ago to ensure that processors implement appropriate measures for data protection. Part of the checklist goes something like this:

Use approved tools and frameworks

  • Establish a list of
    - approved tools with associated security features that will help to automate and enforce security procedures in the coding
    - approved supporting components
    - permitted third party components and development tools. Avoid sharing of personal data through third party libraries — use synthetic data instead
  • Describe in the lists what the various tools and supporting components will be used for, including security analysis, functionality, and protection (an illustrative example entry follows this list). Tools and supporting components should undergo a risk assessment and be analyzed for privacy and security vulnerabilities.
  • Keep the lists updated according to organization guidelines. This means that new tools and versions must be reviewed and used wherever possible.
  • Use only approved tools and supporting components from the list. Any exceptions should be documented and approved by the security officer.
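An entry in the list of approved supporting components could, for example, look like this (the component and all details are purely illustrative):

  • Component: PostgreSQL 11
    - Used for: primary data store for the patient journal service
    - Security features relied upon: TLS, encryption at rest, role-based access control
    - Risk assessment: completed April 2019, no open findings
    - Approved by: the security officer, to be re-reviewed at the next major version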

In order to cover the points above and review your dependency management system, create three documents. The first is a one-pager explaining the purpose behind doing the tools and libraries review (the why). This is the document you show to your data controller. The second describes the step-by-step process (the recipe). This is the document the team continuously updates as their process changes, and it is used to teach new team members the process. The last document is used for keeping track of the approved software (the contract).
Without a central dependency management system, the review process will become too time-intensive and heavy to complete, and the consequences for your company can be severe. Equifax, an American credit reporting company, got hacked in 2017 because they didn’t patch their third party software. Estimated cost: $1.4 billion. That is 5,000 times the cost of a 1-year subscription for the enterprise version of the central dependency management system I talked about. To me, implementing a central dependency management system sounds like cheap insurance.

Dealing with false positives

“The tools and libraries review” is needed in order to evaluate the effectiveness of your system for running “software composition analysis” (SCA). These systems tend to report a lot of false positives and unimportant findings. False positives and non-applicable findings need to be actively labeled as such to ensure that the SCA system stays relevant and is used and heeded during the development and testing processes, hence the review. According to Gartner, several vendors of application security testing (AST) systems have started to look into using AI to help their customers avoid these false positives (see “Artificial Intelligence and Application Security Vendors: Marketing Hype or Genuine Hope?”). You will have to rely on some manual processes if AI is not an option for you.
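A lightweight way to keep that labeling auditable is to store the reviewed findings next to the code, so the next scan only surfaces what still needs a human decision. This is a minimal sketch; the report format and file names are illustrative and need to be mapped to whatever your SCA tool exports:

```python
# Filter an SCA report against findings the team has already reviewed and labeled
# as false positives. Keeping the labels in version control documents who decided
# what and why, which is what "the tools and libraries review" checks.
import json

FALSE_POSITIVES = {
    # finding id: justification and reviewer (illustrative)
    "CVE-2019-00001": "vulnerable code path is not reachable; reviewed by the security officer, May 2019",
}

def needs_triage(findings):
    """Return only the findings that still need a human decision."""
    return [finding for finding in findings if finding["id"] not in FALSE_POSITIVES]

if __name__ == "__main__":
    with open("sca-report.json") as report:
        open_findings = needs_triage(json.load(report))
    print(f"{len(open_findings)} findings left to review")
```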

Selecting and approving cloud providers and development partners

You will probably not do all the development yourself. As infrastructure and database management might not be part of your core business model, electing to use cloud providers or development partners might be a good idea, but if you do, you have to make sure they are doing data protection according to Article 32. The data processor has to prove it by showing that they adhere „to an approved code of conduct as referred to in Article 40 or an approved certification mechanism as referred to in Article 42“ (Article 32.3).

A common way for data processors to do this is to point to their ISO 27001 certification. I have a couple of issues with this. First of all, ISO 27001 is an international standard for an information security management system (ISMS). Being certified means that the information security management system the data processor candidate is using follows the ISO standard, but it does not mean that the ISMS takes into consideration the risk, nature, scope, context or purposes of processing that are relevant for your project. Secondly, the fact that they are ISO 27001 certified means that the way the processor manages security risks is compatible with how you handle security risks if you also are ISO certified, but they may have elected to use entirely different security control mechanisms. You may choose to encrypt data at rest while they don‘t, but that doesn‘t mean they aren‘t following the ISO 27001 standard. It‘s the same with ISO date and time standards: sometimes you need to provide timezone information and other times you don‘t, and either way you’re still following the ISO standard. This is why many companies write the following in their ISO 27001 documentation for every security control they elect to use:

TBA: To Be Agreed: security controls need to be agreed between the client and <data processor xxx> where the controls in this standard must be considered to protect the respective.

If the company you have elected as your data processor has a statement like this under each and every one of their security controls in their ISO 27001 documentation, it means that if the data controller, or you as a processor, haven‘t specified any security controls, then no security controls need to be implemented by the data processor according to the data processor’s ISMS. You may think that most project managers read through the ISO documentation of their data processor candidates, but I have been surprised again and again by how few actually do. You may need to break your contract with your data processor if you find that one of them isn‘t properly doing data protection by design and by default. If the reason is that you didn’t read their ISO 27001 documentation or privacy policy, then it's your problem, not theirs.

Johan Sydseter

Co-leader for OWASP Cornucopia and co-creator of Cornucopia Mobile App Edition, an application security engineer, developer, architect and DevOps practitioner.