Data is the new oil — start digging!

Carl Fransman
Published in dScribe data, Aug 11, 2021

Imagine your backyard were located right above an oil field. And you knew about it. Somehow though, you never got around to installing a drilling rig. Can’t complain about missing the pot of gold at the end of the rainbow then, can you?

Well, this is exactly the kind of situation we encounter all the time: companies are awash with data, the C-suite keeps repeating the mantra “data is the new oil”, yet, somehow, they never started drilling. Sure, shovels and pickaxes, yes. But really drilling? With the right equipment? Nah.

So much for the similarities, though. Unlike drilling for petroleum, drilling for and exposing data is not very capital intensive. Nor does it require physical labor. And that may be part of the problem. Companies are using their data on a daily basis. Some of them actually do quite a decent job given the tools they employ. Very few, though, actually know whether and how well they’re using their data resources or what state those resources are in.

One of the challenges facing corporations is the disparity of data sources. Data may originate in different systems, be maintained by different people or even have started life in different companies altogether (as in the case of post-M&A). Also, whereas the 1990s saw a push towards centralization and unification around massive ERP deployments, the 2020s are seeing an uptake of backbone plus best-of-breed, and the corporate IT landscape is becoming more heterogeneous. It’s as if a company had different warehouses, some storing similar parts, some storing different parts. In the end, in order to manage and deploy the inventory efficiently, one needs a good inventory management system. By now, this is a given. Why then, are data not yet viewed as inventory?

Data have long been part of the technical domain. But as business users increasingly demand to use data, e.g. for self-service BI, companies have discovered that users don’t always access the right data, don’t always understand the data correctly or simply can’t find the data at all. In which case they need to ask IT again. And IT often discovers there may not be a single source for the requested data. Thus, we must put in place data inventory management (for both technical, i.e. IT, and business data). Enter the Data Catalog.

This Data Catalog is nothing more than a layer above all your data systems that tells you the “what, where, how to access” of all data. For end users, this means that through a single access point they can find any data, get a clear understanding of what a data point represents and learn how to access it from a technical point of view, so it can easily be incorporated in a report, for instance. Similarly, the Data Catalog will allow the user to find reports that incorporate the searched data point.
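To make the “what, where, how to access” idea a bit more tangible, here is a minimal sketch of what a single catalog entry could look like. Everything in it (the `CatalogEntry` and `DataCatalog` classes, the field names, the sample revenue data point) is invented for illustration and does not describe any particular product.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CatalogEntry:
    """One data point as a catalog might describe it (all fields are illustrative)."""
    name: str            # what: business name of the data point
    description: str     # what: plain-language meaning
    source_system: str   # where: the system that holds the data
    access_path: str     # how to access: e.g. schema, table and column
    used_in_reports: list[str] = field(default_factory=list)


class DataCatalog:
    """A single access point sitting above the underlying data systems."""

    def __init__(self) -> None:
        self._entries: dict[str, CatalogEntry] = {}

    def register(self, entry: CatalogEntry) -> None:
        # Store entries under a lowercase key so lookups ignore case.
        self._entries[entry.name.lower()] = entry

    def find(self, name: str) -> Optional[CatalogEntry]:
        return self._entries.get(name.lower())

    def reports_using(self, name: str) -> list[str]:
        # Reports that incorporate the data point, or an empty list if unknown.
        entry = self.find(name)
        return list(entry.used_in_reports) if entry else []


catalog = DataCatalog()
catalog.register(CatalogEntry(
    name="Net revenue",
    description="Invoiced revenue minus returns and discounts",
    source_system="ERP",
    access_path="finance.invoices.net_amount",
    used_in_reports=["Monthly sales dashboard"],
))

entry = catalog.find("net revenue")
print(entry.source_system, entry.access_path)
print(catalog.reports_using("Net revenue"))
```

A real catalog would fill such entries automatically and hold far richer metadata; the point here is only that one lookup answers all three questions at once.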

The Data Catalog allows people, whether they’re technical or business users, to track down data. The current process often involves asking data specialists. This creates avoidable, non-value-adding overhead for a team whose time is better spent analysing the data than merely helping end users access it.

A good Data Catalog, though, has to do more than just be the atlas of data. It should allow data to be enriched so that it becomes searchable the way we search the web. And the data found should be actionable. All this leads to finding data according to relevance, not just by exactly matching search terms.
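As a rough illustration of searching by relevance rather than by exact match, the sketch below scores each entry by how many query words appear in its name, its tags and its description. The weights, the field names and the sample entries are all assumptions made up for this example; real catalogs use considerably more sophisticated ranking.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercase a string and split it into a set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def score(query: str, entry: dict) -> int:
    """Count query words matched in each field, weighting the name highest."""
    q = tokens(query)
    return (
        3 * len(q & tokens(entry["name"]))
        + 2 * len(q & tokens(" ".join(entry["tags"])))
        + 1 * len(q & tokens(entry["description"]))
    )


def search(query: str, entries: list[dict]) -> list[str]:
    """Return the names of entries with a non-zero score, most relevant first."""
    scored = [(score(query, e), e["name"]) for e in entries]
    ranked = sorted(scored, key=lambda pair: -pair[0])
    return [name for s, name in ranked if s > 0]


# Hypothetical entries; the tags stand in for enrichment added by data stewards.
entries = [
    {"name": "net_amount", "tags": ["revenue", "sales"],
     "description": "Invoiced amount after returns and discounts"},
    {"name": "cust_id", "tags": ["customer"],
     "description": "Unique identifier of a customer account"},
    {"name": "hire_date", "tags": ["employee", "hr"],
     "description": "Date an employee joined the company"},
]

results = search("customer revenue", entries)
print(results)
```

Neither technical column name contains the words “customer” or “revenue”, yet both entries are found because of the tags and descriptions attached to them. That is the practical value of enrichment: users can search in business terms and still land on the right technical field.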

The success of a Data Catalog implementation can be measured through adoption: do your users connect to and employ the catalog? Population of the catalog is automated, as the tool crawls corporate systems in search of data and reports. The Data Catalog is inherently dynamic, brings down data silos and reduces data inconsistencies. Appointing data stewards lets the best-placed people validate and enrich the catalog and, more often than not, several people must contribute. This means that the Data Catalog must be a true collaboration tool. A good tool thus leads to more people (effectively) using data, and increases both data literacy and data quality.

Most companies report “rapid ROI” just from reducing the time spent searching for data. Most avoid reporting how much they saved by avoiding data inconsistencies, wrong interpretation and misuse of data, etc., but the general consensus is that Data Catalogs lead to:

- improved efficiency

- better decision-making through improved data comprehension

- better data consistency / quality

- better compliance

In this day and age of digitization, employees’ effectiveness is enhanced by smart data usage. So let’s empower our employees by equipping them with the right tools. Now more than ever, speed of action is important, and we want to make sure our employees can access data rapidly while also preventing them from using the wrong data or using data wrongly. The Data Catalog is the tool of the trade.

Carl Fransman
dScribe data

Passionate about introducing pragmatic solutions to everyday business challenges