When is a Potential Data Issue Small Enough to Ignore?

Spencer Essenpreis
3 min read · Jul 8, 2023


This is the fifth in a series of posts about moving beyond data quality to data trustworthiness. You can start from the first post here.

Sometimes if you pull a loose thread it stops and can be cut off; other times it completely unravels the sweater. We build data trustworthiness by investigating every potential issue and prioritizing solutions.

I know this seems counterintuitive. There’s more to fix than there’s time for fixing. We’ve got to prioritize. We’re helped by a natural tendency to downplay minor issues so that we aren’t always confusing mountains and molehills.

It’s not always easy to tell the difference, though. A small crack can be nothing, or it can be the beginning of a bridge failure. A few forgotten words could be normal, or they could be the beginning of Alzheimer’s. Significant problems often present with insignificant symptoms, and that is exactly when our natural tendency fails us.

This happens all the time with data. Someone thinks a standard metric looks a little off, and we find out there’s a mistake in the SQL we’ve been using for five years. A trend deviates unexpectedly in a dashboard, and we discover an unknown change in the source data. The scale and complexity of modern data environments, combined with our own imperfection, mean issues are always happening and we’re always missing them.

In this kind of environment we build data trustworthiness by pulling every loose thread, investigating any issue no matter how small. Often the thread stops — we discover there’s no real issue, and we build trustworthiness by proving the data is accurate. Other times the thread keeps going and going until the whole sweater is unraveled, and we build trustworthiness by making things right. We can’t dismiss a data issue because it seems small; we need to investigate the data issue to prove it’s small.

I know what you’re thinking: if we do this, we’ll spend all our time investigating data issues. Maybe in the short term, but the long-term payoff is worth it. The frequency of people’s data quality concerns tends to fall as their trust in the data rises; as we prove the data to be trustworthy, those concerns drop off sharply. In an environment with high data trustworthiness, almost any time someone raises a data quality concern there’s a real and significant issue behind it. They’ll also trust us enough to bring us those concerns, giving us the opportunity to continually improve data quality.

If instead we just ignore people’s seemingly smaller concerns, we end up in a much worse place. Yes, we have more time up front for other priorities, but we pay for it with our stakeholders’ trust and the quality of our data. They won’t trust that we’ll take their concerns seriously, so they’ll only come to us when something is obviously broken. That leaves all kinds of hidden issues degrading data quality, growing a technical debt that will be painful to pay down.

Think about that new hire who spends all day seemingly on a mission to find everything that could be wrong in the data, hitting members of the data team with endless questions. They are on a mission: to determine whether they can trust the data. Downplaying or ignoring their questions may quiet them down, but it won’t earn their trust. Answer their questions and they’ll quiet down too, this time because you have earned their trust.

That doesn’t mean we have to actually fix everything. Solutions do need to be prioritized. As I said before, there’s more to fix than there’s time for fixing. Some of those smaller issues may not be worth the effort required for a solution. There are good reasons George Bailey never got around to fixing that loose finial on the staircase. However much it bothered him at times, in the end he realized the surpassing worth of everything else he had invested his life in.

What’s most critical for building data trustworthiness is investigating potential issues — taking all concerns seriously so that we can build trust and catch significant problems presenting with insignificant symptoms. It’s a high short-term investment with an unmatchable ROI from long-term trust and quality.

In the next post, we’ll look at the importance of partnering with others to achieve trustworthy data.

Spencer Essenpreis

Strategic Analytics Leader & People-Centric Culture Builder