When Data Goes Bad: How To Improve Data Quality?

*instinctools
7 min read · Jul 29, 2022

The correlation between data quality and decision-making is obvious. Garbage in, garbage out, remember? When organizations don’t care about data quality (DQ), it can play a cruel trick on them. Handling issues caused by bad data can cost a company from 15% to 25% of its annual revenue. Not to mention that poor data quality hinders the organization’s digital transformation efforts.

A data warehouse isn’t a trash can. It should contain only meaningful data that is valuable to your business. If you turn your warehouse into a dump, you pay to store dead-weight data that you can’t use to grow your business.

How do you turn this loss into a profit and make data quality a competitive advantage over your rivals? We’ve listed common issues you may face while dealing with data and outlined ways to improve data quality.

Six possible issues you may face on your way to improve data quality

Data quality has six characteristics: completeness, validity, uniqueness, consistency, timeliness, and accuracy. Each of them can be compromised. Bad DQ typically stems from the following issues:

  • Data silos. According to McKinsey, multiple data lakes and warehouses with no common data model are one of the top challenges at the enterprise level. Even if you have only one warehouse, running an analysis becomes troublesome when your data is scattered across multiple enterprise systems.
  • Human errors. When customers or employees make typos such as writing “Minesota” instead of “Minnesota” when entering information manually, you get data that doesn’t represent reality.
  • Duplicated data. When one employee enters customer data into your CRM, and another records the same customer data into another system, you end up with duplicates. If they are not completely identical, then there is a problem: which one is reliable?
  • Invalid data. The analysis doesn’t make sense if you get just any data instead of the data you need. An example of this error is when the name field is filled with surnames. Imagine yourself having a whole table of Smiths when you need to determine which of your regulars deserves a personal discount.
  • Missing values. Missing data undermines statistical procedures. If obligatory fields aren’t filled out, you can’t analyze the data and act on it. For instance, if a customer satisfaction survey asks buyers for their age and gender, some respondents may skip the gender question when only “female” and “male” options are offered, for example because they identify as non-binary.
  • Inconsistent data formats. Dates entered in both European and US styles are painful to handle: 03/04/2022 means 3 April in the European style and March 4 in the US style.
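Two of the issues above, typos and near-identical duplicates, can be caught with very little code. The following is a minimal sketch in Python, not a production approach: the `VALID_STATES` list, the `suggest_state` and `find_duplicates` helpers, the 0.8 similarity cutoff, and the sample customers are all assumptions made up for illustration.

```python
from difflib import get_close_matches

# Hypothetical reference list; a real one would cover every allowed value.
VALID_STATES = ["Minnesota", "Michigan", "Mississippi", "Missouri"]


def suggest_state(value, cutoff=0.8):
    """Return the closest valid state name, or None if nothing is close enough."""
    matches = get_close_matches(value.strip().title(), VALID_STATES, n=1, cutoff=cutoff)
    return matches[0] if matches else None


def find_duplicates(records, key_fields=("name", "email")):
    """Group records whose key fields match after trimming and lowercasing."""
    groups = {}
    for record in records:
        key = tuple(str(record.get(field, "")).strip().lower() for field in key_fields)
        groups.setdefault(key, []).append(record)
    # Only groups with more than one record are duplicates.
    return [group for group in groups.values() if len(group) > 1]


# Made-up sample data: the first two rows describe the same customer.
customers = [
    {"name": "Ann Lee", "email": "ann@example.com", "state": "Minesota"},
    {"name": "ann lee ", "email": "ANN@example.com", "state": "Minnesota"},
    {"name": "Bob Ray", "email": "bob@example.com", "state": "Michigan"},
]

print(suggest_state("Minesota"))
print(len(find_duplicates(customers)))
```

Note that the duplicate check only finds the conflicting records. Deciding which copy is the reliable one still takes a business rule or a human reviewer.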

High-quality data makes data governance easier. And if you can confidently manage data, you can confidently manage the whole company. That’s why raising DQ is one of the top priorities for the next 6–12 months for 91% of organizations. If you are still undecided about how soon you should start fixing your DQ, this is your sign to not put it off until tomorrow.

How to mitigate data quality issues: embrace state-of-the-art technologies

Before asking how to improve data quality, figure out how to improve data management. Focus your attention and budget on adopting new technologies. There are at least two ways to ease your data quality enhancement journey:

  • Take advantage of automation to reduce human errors. For instance, adopting robotic process automation (RPA) frees your employees from monotonous, repetitive operations, greatly reduces the chance of human error, and lowers the cost of processing data by up to 80%. With RPA, you can convert all dates into one format and check whether data is present and up to date, as all these actions can be reduced to a clear algorithm performed by a bot. Besides, in highly regulated industries such as healthcare, automation improves compliance with numerous regulations (HIPAA, PSQIA, GDPR, etc.) and, thus, helps to create a better patient experience.
  • Leverage Business Intelligence (BI) to have a comprehensive view of the quality of your data. You have to regularly evaluate your data to ensure that the information is still reliable.
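To give a feel for the kind of "clear algorithm" a bot can follow, here is a minimal Python sketch that converts dates into one format and flags missing or unparseable values. It assumes each source system is known to use either the European or the US layout. The `SOURCE_FORMATS` mapping and the `to_iso_date` helper are hypothetical names, not part of any RPA product.

```python
from datetime import datetime

# Assumed mapping from a source system's style to its date layout.
SOURCE_FORMATS = {
    "eu": "%d/%m/%Y",  # day first, e.g. 29/07/2022
    "us": "%m/%d/%Y",  # month first, e.g. 07/29/2022
}


def to_iso_date(raw, source):
    """Convert a raw date string to ISO 8601 (YYYY-MM-DD).

    Returns None for blank or unparseable values so they can be flagged
    for review rather than silently guessed.
    """
    if raw is None or not raw.strip():
        return None
    try:
        parsed = datetime.strptime(raw.strip(), SOURCE_FORMATS[source])
    except ValueError:
        return None
    return parsed.date().isoformat()


print(to_iso_date("29/07/2022", "eu"))  # 2022-07-29
print(to_iso_date("07/29/2022", "us"))  # 2022-07-29
print(to_iso_date("29/07/2022", "us"))  # None: there is no month 29
```

The sketch relies on knowing where each record came from. A date like 03/04/2022 is valid in both styles, so the format cannot be guessed reliably from the value alone.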

Cooperation with experienced BI analysts is key. They help you figure out which questions you need to answer, what story you want to tell with your data, and create a custom dashboard based on that information.

— Ivan Dubouski, Business Intelligence Team Lead, *instinctools

A generic dashboard can show the extent to which the data meets data quality requirements. According to Gartner, tracking data quality metrics helps improve them by 60%.

You can also provide your data scientists and engineers with more granular dashboards that visualize the stories of issues underlying major data quality problems.
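As a rough illustration of the numbers such a dashboard could be fed with, the Python sketch below scores a data set on three of the characteristics mentioned earlier: completeness, uniqueness, and validity. The `quality_metrics` function, the field names, the simplified email pattern, and the sample rows are assumptions for the example. Real metric definitions should come from your own DQ requirements.

```python
import re

# Deliberately simple pattern for illustration, not a full email validator.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def is_filled(value):
    """Treat None and blank strings as missing; 0 counts as a real value."""
    return value is not None and str(value).strip() != ""


def quality_metrics(records, required_fields, unique_field, email_field="email"):
    """Return completeness, uniqueness, and validity as percentages."""
    total = len(records)
    if total == 0:
        return {"completeness": 0.0, "uniqueness": 0.0, "validity": 0.0}

    # Completeness: share of records with every required field filled in.
    complete = sum(
        all(is_filled(r.get(f)) for f in required_fields) for r in records
    )
    # Uniqueness: share of distinct values in the identifying field.
    distinct = len({str(r.get(unique_field, "")).strip().lower() for r in records})
    # Validity: share of records whose email matches the expected shape.
    valid = sum(
        bool(EMAIL_PATTERN.match(str(r.get(email_field, "")))) for r in records
    )

    return {
        "completeness": round(100 * complete / total, 1),
        "uniqueness": round(100 * distinct / total, 1),
        "validity": round(100 * valid / total, 1),
    }


# Made-up sample rows.
rows = [
    {"name": "Ann Lee", "email": "ann@example.com", "age": 34},
    {"name": "Ann Lee", "email": "ann@example.com", "age": 34},
    {"name": "Bob Ray", "email": "bob-at-example.com", "age": None},
    {"name": "", "email": "cy@example.com", "age": 51},
]

print(quality_metrics(rows, ("name", "email", "age"), "email"))
```

For these four rows, the sketch reports 50% completeness and 75% for both uniqueness and validity. Tracked over time, figures like these show whether data quality is moving in the right direction.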

Use BI consulting services to decide where to start your data quality improvement journey and identify appropriate technologies to help you along the way.

How to develop a robust data quality improvement strategy

One-off initiatives and ad-hoc actions treat the symptoms, not the disease. You need long-term strategic adjustments to empower your staff with advanced analytics at all levels of the organization. That’s why, before jumping into a DQ initiative, you should create a data quality strategy (DQS). We’ve listed six vital elements of it.

1. Do an inventory of your data and describe the issues

Developing a common vision of data quality for employees from different departments is essential. To achieve it, answer basic questions such as: How much data do you have? What types of data do you collect and store? How many errors are there in the data? What kind of errors are these?

2. Develop your requirements and objectives

At this stage, you should identify the stakeholders of the future data quality improvement process. The more experts that can evaluate the data from different perspectives, the more accurately you can define the DQ requirements and aspirations for your organization and the ways to improve data quality.

It may turn out that your company needs a dedicated employee who will assess the quality of data according to key parameters — a data steward. They are responsible for what data you keep in your organization, enforce internal rules on how data can be used and track the movement of the data inside the company. A data steward’s mission is to coordinate all the processes and decisions that arise from your DQS.

Don’t forget to set an approximate timeline for implementing the data quality improvement plan. How long it takes depends on the scale of your organization.

3. Set priorities for different data sets

Working on the quality of customer data and the company’s internal data simultaneously is great. But if your budget is limited, you need to decide which data matters most for your business success and growth and improve it first. By enhancing the quality of the data related to the customers’ personal information, you can personalize their experience and increase customer satisfaction. However, revamping the organization’s internal data can bring you just as much benefit. With high-quality data about your staff, you can fully reveal the potential and talents of your employees and uncover how to optimize processes within the company.

4. Select technologies and tools to improve data quality

Given the sheer number of offerings on the market, comparing their features, licensing costs, payment options, etc. is time-consuming and tricky. If you are burdened with outdated software, the task gets more complicated, as you may need to modernize it first.

Adopting new technologies and tools may require deeper knowledge than initially expected, so choose tech partners who are old hands at handling data issues.

5. Identify the roles and responsibilities for stakeholders

At this stage, you settle on the tasks assigned to a data steward, data engineer, business analyst, executives, etc. For the boat of your data quality improvement strategy to sail smoothly, you need many hands rowing in the same direction. A data steward can track data quality standards across the organization and in particular projects, business analysts prioritize tasks from the perspective of business benefits, and C-suite members make final decisions about what actions should be taken.

6. Set KPIs to evaluate the progress

What degree of data quality do you want to achieve in six months or in a year? How much time should it take your employees to correct errors of different types? To what extent do you expect to reduce them? An experienced business analyst can help you determine realistic KPIs for your organization.

When the time period you’ve designated as a benchmark has passed, analyze achieved results, review your data quality improvement strategy, and modify it if necessary.
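One simple way to run that review is to measure how much of the gap between the starting point and the target each KPI has closed. The Python sketch below assumes every KPI is tracked as a baseline, a current value, and a target. The `kpi_progress` helper, the KPI names, and all the numbers are hypothetical.

```python
def kpi_progress(baseline, current, target):
    """Return the share of the baseline-to-target gap closed so far, in percent.

    Works for KPIs that should go up (completeness) and for KPIs that
    should go down (duplicate rate), because the sign of the gap cancels out.
    """
    gap = target - baseline
    if gap == 0:
        return 100.0  # the target was already met at the baseline
    return round(100 * (current - baseline) / gap, 1)


# Hypothetical KPIs as (baseline, current, target); the values are made up.
kpis = {
    "completeness_pct": (82.0, 91.0, 97.0),
    "duplicate_rate_pct": (6.0, 3.5, 1.0),
}

for name, (baseline, current, target) in kpis.items():
    print(name, kpi_progress(baseline, current, target))
```

With these made-up figures, completeness is 60% of the way to its target and the duplicate rate is 50% of the way. A negative result would mean the KPI has moved away from the target since the baseline, which is a signal to revisit the strategy.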

The draft of your data quality improvement plan may look like this.

Clear the way for accurate data analysis and genuine insights

The quality of the data you process determines how valuable the insights will be. In a sense, an organization without advanced analytics is deprived of a future, or at least of a bright and prosperous one.

You can partially and temporarily solve burning data quality issues by adopting modern technologies. But it’s like putting out a fire in one room when an entire building is engulfed in flames. Creating a data quality improvement plan is a surefire way to pinpoint what to do with your data to enhance its quality, how to do it, and who is in charge of the process, and to track progress so you can tell when to expect the outcome you’re aiming for.

Originally published on instinctools.com

*instinctools is a software product development and consulting company with a proven track record of over 20 years.