Your second step to GDPR

Make Data Better

Have you ever come across a contact saved in your phone and been unable to recall who the person is or how you got the number in the first place? Data can be tricky like that. We tend to store it, and over time it becomes difficult to keep track of its source or purpose. Now imagine a company that is processing the data of thousands of people while also trying to reach more customers and collect their data too. The amount of data being collected can be mind-boggling. When the law demands that every individual's data be protected and that its processing be transparent, it becomes imperative to have an inventory in which the data is catalogued by type, category, purpose and so on. Further, GDPR itself requires data controllers to maintain a record of their processing activities (Article 30), which makes it all the more important for a company to set up a data inventory.

While many GDPR experts advise that a company should have a data inventory, preparing one is not an easy task. Data may be scattered across different files in different parts of the system, and tracking it manually is nearly impossible. The bigger the company, the harder it gets. For instance, if a company with 200 employees is processing the data of 50,000 citizens, the database it holds on those citizens will be vast. Manually looking for the email id of a particular individual is like looking for a needle in a haystack. In our previous article, we dealt with the operational and technical approaches a company might adopt to become GDPR compliant, and explained that as the volume of data being processed grows, it becomes difficult to categorise the data manually and form an effective inventory. This is where technology comes to your rescue. What a company needs is software that can scan its database and pull out all email ids, phone numbers, addresses and other personal data far faster and more reliably than a person could.
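To make the idea of scanning for email ids and phone numbers concrete, here is a minimal sketch in Python. It is an illustration only, not a description of any particular product: the two regular expressions are deliberately simple, the `scan_records` function and the location labels such as `crm.notes:17` are hypothetical, and the phone pattern will also match other long digit runs (invoice numbers, for example), so a real tool would need further checks.

```python
import re

# Deliberately simple patterns for illustration; real email and phone
# formats vary far more than these expressions cover.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s-]{7,}\d")


def scan_records(records):
    """Return (location, kind, value) for every email or phone-like string.

    `records` maps a hypothetical location label to the raw text stored there.
    """
    findings = []
    for location, text in records.items():
        for match in EMAIL_RE.findall(text):
            findings.append((location, "email", match))
        for match in PHONE_RE.findall(text):
            findings.append((location, "phone", match))
    return findings


sample = {
    "crm.notes:17": "Call Jane on +44 20 7946 0958 or mail jane.doe@example.com",
    "billing.memo:3": "Invoice sent, no contact details recorded",
}
findings = scan_records(sample)
for row in findings:
    print(row)
```

Recording the location alongside each value matters as much as finding the value: the inventory needs to say where personal data lives, not just that it exists.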

The key task of the software should be to help mitigate risk based on an assessment. Simply put, to be GDPR compliant, a company must reduce its risk of breaching the law so as to avoid the harsh penalties. To do that, it has to assess its processing activities, and a detailed assessment is only possible once all the data it processes has been classified and categorised. That is why an inventory is required: an inventory is the key to assessment. Preparing for GDPR compliance therefore means setting up an inventory of the company’s personal data processing activities, from collection and use to storage, retention and deletion. Good software should enable a company to create an inventory of its existing data and to automatically classify and categorise new incoming data.

Ideally, the software should create a data inventory in which sensitive data is classified, so that the company is alerted to where compliance work is needed. It should also help keep the database clean by deleting or encrypting obsolete data, so that the company is not in violation of the law. The purpose of technology is to assist the company, so a good inventory should at least be able to answer the following:

  • Who are the data owners?
  • Where is the data located in your database?
  • Who is the data subject?
  • How was the data collected?
  • What is the purpose of processing the data?
  • How long will the data be stored?
  • Who has access to the data?
  • Has the data been transferred to a third country or an international organisation?
  • Is the data subject aware of the processing, and have they consented to it?
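One way to picture an inventory that answers these questions is as a list of records with one field per question. The sketch below is an assumption about how such a record might look, not a prescribed schema: the `InventoryEntry` class, its field names and the example values are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class InventoryEntry:
    """One row of a hypothetical data inventory; every field name is illustrative."""
    data_owner: str
    location: str
    data_subject_category: str
    collection_method: str
    purpose: str
    retention_days: Optional[int] = None          # None means "not yet decided"
    access_roles: List[str] = field(default_factory=list)
    transferred_to_third_country: Optional[bool] = None
    consent_recorded: Optional[bool] = None

    def open_questions(self) -> List[str]:
        """List the fields still unanswered (None, empty string or empty list)."""
        return [name for name, value in vars(self).items()
                if value in (None, "", [])]


entry = InventoryEntry(
    data_owner="Sales",
    location="crm.customers.contact_number",
    data_subject_category="customer",
    collection_method="web sign-up form",
    purpose="order delivery updates",
    access_roles=["sales", "support"],
)
print(entry.open_questions())
```

A helper like `open_questions` shows why the structure is useful: gaps in the inventory become something the team can list and chase, rather than something discovered during an audit.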

Imagine that your database is full of phone numbers, which may be saved as “phone number”, “contact number”, “office number”, “personal number” and so on. With so many differently named columns all representing a phone number, and a high volume of data, it is hard to give a conclusive answer to the question “how many phone numbers are there in the system?”. Software should, ideally, be able to give you this answer, with a little manual help of course. Most of this manual help is provided by the Information Architects (IA) and the Data Stewards (DS). Simply put, the IA is the owner of the data architecture of the system and the DS is the owner of the actual data in the system. Usually there are multiple IAs and DSs in an organisation, each responsible for a part of the data architecture or of the actual data respectively. An IA decides which concepts or domains should be created. For example, should the concept be called “Phone Number” or “Contact Number”, or should there be separate concepts for “Home Phone Number” and “Work Phone Number”? For a DS, the key objective is to know all the data that exists in the system in terms of the concepts created by the IA. Assuming the IA creates a concept called “Phone Number”, the DS would want to know, for the data under their purview, how many phone numbers there are and where they are stored.
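The phone number question can be sketched in a few lines of Python once the column names have been mapped to a concept. Everything here is assumed for illustration: the `CONCEPTS` mapping stands in for what an IA would maintain, and the tables are tiny in-memory lists rather than a real database. Note that the function counts stored values, so the same number saved in two places is counted twice.

```python
# Hypothetical mapping maintained by the Information Architect:
# concept name -> column names known to hold that concept.
CONCEPTS = {
    "Phone Number": {"phone number", "contact number",
                     "office number", "personal number"},
}


def count_concept(tables, concept):
    """Count non-empty values in any column mapped to `concept`.

    Returns the total and a per-column breakdown keyed by "table.column".
    """
    aliases = CONCEPTS[concept]
    total = 0
    locations = {}
    for table_name, rows in tables.items():
        for row in rows:
            for column, value in row.items():
                if column.strip().lower() in aliases and value:
                    key = f"{table_name}.{column}"
                    locations[key] = locations.get(key, 0) + 1
                    total += 1
    return total, locations


tables = {
    "customers": [
        {"name": "A", "Contact Number": "555-0100", "Office Number": ""},
        {"name": "B", "Contact Number": "555-0101", "Office Number": "555-0199"},
    ],
    "suppliers": [
        {"name": "C", "phone number": "555-0142"},
    ],
}
total, locations = count_concept(tables, "Phone Number")
print(total)       # 4
print(locations)
```

The per-column breakdown is what a DS would care about most, since it answers both "how many" and "where" in one pass.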

Given the above, the software should enable an IA to create concepts and provide initial training to the system. It should enable a DS to take those concepts and ensure that all the data in their purview is classified in terms of existing concepts with sufficient accuracy. This doesn’t mean that the DS should have to mark each column manually. They should only have to mark the minimum number of columns needed to train the system to identify the remaining columns as existing concepts with sufficient accuracy. As the DS is closest to the data, the system should also allow the DS to propose new concepts to the IA, while allowing the IA to accept or reject each proposal.
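As a rough illustration of how a few marked columns can be used to classify the rest, the sketch below suggests a concept for an unmarked column by comparing its name with the names the DS has already labelled. This is a deliberately naive stand-in for real training: it uses plain string similarity from Python's standard `difflib` module, looks only at column names and not at the values, and the `suggest_concept` function and the 0.6 threshold are assumptions made for the example.

```python
from difflib import SequenceMatcher


def suggest_concept(column, labelled, threshold=0.6):
    """Suggest the concept of the most similar labelled column name.

    `labelled` maps column names already marked by a Data Steward to
    concepts. Returns None when nothing is similar enough, which is the
    point where a DS might propose a new concept to the IA.
    """
    best_concept, best_score = None, 0.0
    for known_column, concept in labelled.items():
        score = SequenceMatcher(None, column.lower(),
                                known_column.lower()).ratio()
        if score > best_score:
            best_concept, best_score = concept, score
    return best_concept if best_score >= threshold else None


# Columns a Data Steward has marked by hand (hypothetical examples).
labelled = {
    "phone number": "Phone Number",
    "email id": "Email Address",
}

print(suggest_concept("Phone_Number", labelled))
print(suggest_concept("invoice total", labelled))
```

The threshold is the lever that trades manual effort against accuracy: set it high and more columns fall back to the DS for review, set it low and more columns are classified automatically but with a greater chance of error.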

The software should also help the team catalogue the data into categories so that it can respond to citizens’ requests swiftly. GDPR requires a company to respond to such a request without undue delay, and in any event within one month of receiving it. Imagine Ms. Jane Doe has sent a request to update her phone number in the company’s database. Wouldn’t it be easier to process such a request if you knew every place her phone number was stored? This is how the software helps the company: requests take less time and are handled more reliably.
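To see how an inventory shortens a request like Ms. Doe's, consider this minimal sketch. It assumes the inventory already maps each concept to the table and column pairs that hold it, and that every row carries a `subject_id` identifying the person; both are assumptions for the example, as are the `rectify` function and the sample data.

```python
def rectify(tables, inventory, concept, subject_id, new_value):
    """Update every stored copy of `concept` for one data subject.

    `inventory` maps a concept to the (table, column) pairs that hold it.
    Returns the locations that were changed, as a simple audit trail.
    """
    changed = []
    for table_name, column in inventory.get(concept, []):
        for row in tables[table_name]:
            if row.get("subject_id") == subject_id and column in row:
                row[column] = new_value
                changed.append(f"{table_name}.{column}")
    return changed


tables = {
    "customers": [
        {"subject_id": 42, "name": "Jane Doe", "contact number": "555-0100"},
        {"subject_id": 43, "name": "John Roe", "contact number": "555-0111"},
    ],
    "newsletter": [
        {"subject_id": 42, "personal number": "555-0100"},
    ],
}
inventory = {
    "Phone Number": [("customers", "contact number"),
                     ("newsletter", "personal number")],
}

changed = rectify(tables, inventory, "Phone Number", 42, "555-0177")
print(changed)
```

Without the inventory, the copy in the `newsletter` table is exactly the kind of record that gets missed, leaving the company holding inaccurate data after it believes the request is closed.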

It is clear that to ensure compliance with GDPR, a company needs to have a data inventory. Depending upon the size and resources of a company, it may choose to do it manually or with the help of technology. There are a number of software tools available in the market that facilitate the creation of an inventory.