A Pathway to GDPR Compliance through Data Lake

Nazanin Gifani
Published in data-science.be
6 min read · Nov 16, 2018

In this article, we introduce some of the changes that every organisation has to go through in the post-GDPR era. We explain what the seven key principles of the GDPR mean. We also briefly introduce some of the requirements of an effective data governance tool that helps you embed privacy into your everyday work. Finally, we show how the choice of an appropriate data architecture model can ease your way towards compliance and make your success sustainable.

How GDPR Will Change the Way you Manage Your Data

The General Data Protection Regulation (GDPR) is the biggest change in the regulatory landscape of data privacy in the past 20 years. With data being the oil of the 21st century, almost all EU companies now use personal data in one way or another. They collect, share, and store personal details in the course of their daily activities and are therefore subject to the GDPR. The regulation, which became applicable on 25 May 2018, aims to harmonise the Member States' data protection laws. It affords extensive rights to people in the EU and imposes hefty fines on data controllers and data processors that do not comply with its provisions. The penalties can go as high as EUR 20 million or 4 per cent of annual global turnover, whichever is higher. The regulation has a broad scope: it also affects companies based outside the EU when they offer goods or services to people in the EU or monitor the behaviour of individuals in the EU.
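The "whichever is higher" rule for the maximum penalty can be written as a one-line calculation. The sketch below is only an illustration of that arithmetic; the function name and the example turnover figures are our own, and the result is a ceiling, not the fine a regulator would actually impose.

```python
def max_fine_eur(annual_global_turnover_eur: float) -> float:
    """Ceiling of the fine: the greater of EUR 20 million or 4% of turnover."""
    fixed_cap = 20_000_000.0
    turnover_cap = annual_global_turnover_eur * 4 / 100
    return max(fixed_cap, turnover_cap)


# Hypothetical companies: for the smaller one the fixed amount applies,
# for the larger one the 4% share of turnover is higher.
small_company_cap = max_fine_eur(100_000_000)      # turnover of EUR 100 million
large_company_cap = max_fine_eur(1_000_000_000)    # turnover of EUR 1 billion
```

For any company with an annual global turnover above EUR 500 million, the 4 per cent share exceeds the fixed amount and becomes the ceiling.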

In practice, compliance with the GDPR can be complicated and costly. The holistic approach of the GDPR requires that personal data be protected throughout their lifecycle. Compliance is a continuous project that starts with the collection of personal data and extends to their storage and deletion. The seven principles relevant to the processing of personal data are listed in Article 5 of the GDPR. They are the foundation for the fundamental obligations of the regulation. These principles include:

  • Data processing should be lawful, fair and transparent:

To be lawful, companies should rely on an appropriate legal ground for data collection. To be transparent, they should communicate the necessary information to the data subject at the time of the data collection. To be fair, they should enable users to make informed decisions regarding the sharing of their data.

  • Purpose limitation principle:

This principle requires companies to ensure that personal data is collected for specified, explicit and legitimate purposes. Also, companies are required to process personal data only for the purposes for which it was collected originally and to avoid processing beyond the original purpose.

  • Data minimisation principle:

This principle requires companies to ensure that the personal data that they process is adequate to fulfil the stated purpose, is relevant and has a logical link with the purpose, and is limited to what is necessary. Data minimisation also means that data should be periodically reviewed and deleted if it is no longer needed.

  • Accuracy principle:

This requires companies to take reasonable steps to make sure the personal data they hold is correct. Some categories of personal data need to be kept up-to-date. When a company detects incorrect data, it should rectify it.

  • Storage limitation principle:

This refers to the obligation of companies to avoid keeping personal data for longer than they need it. Companies should decide how long they keep the data and have a well-thought-out retention policy. They should regularly review the data they hold and anonymise or delete it if it is no longer needed.

  • Integrity and confidentiality principle:

This demands that companies take the appropriate security measures including anonymisation or pseudonymisation. To protect the confidentiality of personal data, companies should track how they share the data internally and with other organisations, and whether they transfer the personal data outside the European Union. Having a full picture of the security and confidentiality status of personal data allows companies to detect data breaches and produce the necessary notifications to data protection authorities promptly.

  • Accountability principle:

This requires organisations to take responsibility for their processing activities. The GDPR also puts a great emphasis on documentation, which means that companies should maintain records of their processing activities to be able to demonstrate that they abide by the regulation.
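To make two of the principles above more concrete, here is a minimal sketch of a periodic review that applies a retention policy (storage limitation) and pseudonymises an identifier (integrity and confidentiality). Everything in it is an assumption made for illustration: the `Record` fields, the retention periods in `RETENTION_DAYS`, and the choice of a keyed SHA-256 hash as the pseudonymisation technique. A real policy would come from your legal and security teams.

```python
import hashlib
import hmac
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical retention periods, in days, per processing purpose.
RETENTION_DAYS = {"newsletter": 365, "billing": 3650}


@dataclass
class Record:
    email: str
    purpose: str
    collected_on: date


def is_expired(record: Record, today: date) -> bool:
    """True when the record is older than the retention period of its purpose."""
    limit = timedelta(days=RETENTION_DAYS[record.purpose])
    return today - record.collected_on > limit


def pseudonymise(value: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed hash; the key must be stored separately."""
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def review(records: list, today: date, secret_key: bytes) -> list:
    """Drop expired records and pseudonymise the e-mail of the ones that remain."""
    kept = []
    for record in records:
        if is_expired(record, today):
            continue  # no longer needed for its purpose: do not keep it
        kept.append(
            Record(pseudonymise(record.email, secret_key), record.purpose, record.collected_on)
        )
    return kept
```

Note that pseudonymised data is still personal data under the GDPR, because whoever holds the key can link the hash back to a person. Only anonymised data falls outside the scope of the regulation.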

Implementing these principles is no easy task. It requires fundamental changes to the organisational, technical, and legal procedures of personal data processing within an organisation. That is why, four months after the regulation took effect, many companies have not yet fully met the challenges of compliance.

Understanding your data is the key

Since compliance is not a one-time project, sustaining it requires a reliable data governance framework. An effective data governance model gives your organisation a deep insight into all the data that you collect, hold, and share. It also enables you to detect personal data and distinguish sensitive data from the rest. The framework should allow you to understand where data comes from, how it is used internally, and with whom it is shared.

Having insight into your data assets enables you to put in place appropriate security measures, restrict access to data, and stay within the limits of your data collection purpose. A good data governance framework includes a metadata management system, which helps you enrich your data, trace meaningful lineage between datasets, and ensure access to accurate, low-risk data. Data governance frameworks allow you to respond swiftly to audits and data subject access requests, and to define policies for data retention, legal ground management, and consent management.
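As a rough illustration of what such metadata can look like, the sketch below keeps a small in-memory catalogue in which every dataset records whether it contains personal data, its legal ground, and the datasets it was derived from. The class, the field names, and the example datasets are all hypothetical; a real metadata management tool would provide far richer features.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetMeta:
    name: str
    contains_personal_data: bool
    legal_ground: str
    derived_from: list = field(default_factory=list)


def lineage(catalog: dict, name: str) -> list:
    """Return every upstream dataset of `name`, nearest ancestors first."""
    seen = []
    queue = list(catalog[name].derived_from)
    while queue:
        parent = queue.pop(0)
        if parent not in seen:
            seen.append(parent)
            queue.extend(catalog[parent].derived_from)
    return seen


def personal_datasets(catalog: dict) -> list:
    """List the datasets flagged as containing personal data."""
    return sorted(n for n, meta in catalog.items() if meta.contains_personal_data)


# Hypothetical catalogue with three levels of lineage.
catalog = {
    "crm_raw": DatasetMeta("crm_raw", True, "contract"),
    "web_logs": DatasetMeta("web_logs", False, "legitimate interest"),
    "customer_360": DatasetMeta("customer_360", True, "contract", ["crm_raw", "web_logs"]),
    "churn_scores": DatasetMeta("churn_scores", True, "consent", ["customer_360"]),
}
```

With this in place, a data subject access request starts from `personal_datasets(catalog)`, and `lineage(catalog, "churn_scores")` shows which sources would also have to be checked or corrected.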

Fast-track your Data Governance with an Appropriate Data Architecture Model

A good data governance model requires an appropriate data architecture. One of the issues that every company faces is organising the flows of data that arrive from diverse means of data collection. A data lake is a storage repository for all enterprise data. It allows you to store and organise massive amounts of diverse data, and it provides you with a single view of that data. A data lake can act as a single point of data transfer and help you overcome many governance challenges by organising your data. Using such a data-centric architecture and integrating it with a metadata management tool can lead to a sustainable approach to data governance.

With this model, you can ensure that only authorised people have access to the most recent and accurate version of the data. With the help of metadata, you can also see who has access to the data and review the purpose for which they access it. Having a central repository also enables you to protect the confidentiality of data more effectively. To put it simply, this architecture helps you organise your data and avoid messy data flows.
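A single point of access makes it possible to check and record every request in one place. The sketch below assumes a hypothetical policy table (`ACCESS_POLICY`) that maps a dataset and a role to the purposes allowed for them, and an in-memory `access_log`. It is not how any particular product works, only an illustration of purpose-based access with an audit trail.

```python
from datetime import datetime, timezone

# Hypothetical policy: dataset -> role -> purposes the role may invoke.
ACCESS_POLICY = {
    "customer_360": {
        "analyst": {"churn_analysis"},
        "dpo": {"audit", "subject_access_request"},
    },
}

# Every request is recorded, whether it was granted or not.
access_log = []


def request_access(user: str, role: str, dataset: str, purpose: str) -> bool:
    """Grant access only if the role may use the dataset for the stated purpose."""
    allowed_purposes = ACCESS_POLICY.get(dataset, {}).get(role, set())
    granted = purpose in allowed_purposes
    access_log.append(
        {
            "when": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "role": role,
            "dataset": dataset,
            "purpose": purpose,
            "granted": granted,
        }
    )
    return granted
```

Because refused requests are logged as well, the same log supports both the purpose limitation review and the documentation duty of the accountability principle.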

digazu’s governed data lake is a one-stop shop for data supply. It acts as a single point of data transfer, allowing you to organise your data and make it readily available. With digazu, removing or replacing applications has a minimal impact on the enterprise information architecture. The data lake can be integrated with most data management tools, paving your way toward compliance.

Learn more at https://digazu.com/
