Navigating the challenges of Metadata Management

Rik Meter
Sogeti Data | Netherlands
5 min read · Jun 15, 2022

Metadata. We’ve all heard about it. We’ve all been told it’s important. And yet we still struggle to manage it. To help you navigate the many challenges of metadata management, this article addresses four common pitfalls you can avoid when implementing it, and offers some common practices that can help you mitigate these challenges.

Metadata 101

Even though the literature on metadata can easily fill a bookcase, we’re often left none the wiser about it. To this day, metadata and metadata management remain some of the more vague and obscure topics surrounding data management. Typically, when inquiring about metadata you’ll be treated to the ultimate clincher: ‘it’s data about data […stupid]’.

Though this definition works great in a sales pitch, it fails to provide any meaningful clarification. I believe metadata is better explained as any and all descriptive data used to enhance our understanding of data and its usability.

We can typically distinguish between three main types of metadata:

· Business Metadata, which describes the context in which data is used, such as terms and definitions, business rules or business policies;

· Operational Metadata, which describes the operational use of data, such as access logs, job execution logs or audit reports;

· Technical Metadata, which describes the actual (handling of) physical data, such as data sets and attributes, data quality rules or data transformation rules.
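To make the three types a little more tangible, here is a minimal Python sketch of how they might sit side by side for a single data asset. Every class name, field and value below is a hypothetical illustration, not a standard model or the layout of any particular metadata tool.

```python
from dataclasses import dataclass, field


@dataclass
class BusinessMetadata:
    definition: str                                   # what the data means in business terms
    business_rules: list[str] = field(default_factory=list)


@dataclass
class OperationalMetadata:
    last_job_run: str                                 # e.g. timestamp of the latest load job
    access_log_location: str                          # where access logs are kept


@dataclass
class TechnicalMetadata:
    dataset_name: str
    attributes: dict[str, str]                        # attribute name -> data type
    quality_rules: list[str] = field(default_factory=list)


@dataclass
class DataAssetMetadata:
    business: BusinessMetadata
    operational: OperationalMetadata
    technical: TechnicalMetadata


# Illustrative values only; none of these come from a real system.
customer = DataAssetMetadata(
    business=BusinessMetadata(
        definition="A party that has purchased at least one product.",
        business_rules=["A customer must have a billing address."],
    ),
    operational=OperationalMetadata(
        last_job_run="2022-06-14T02:00:00",
        access_log_location="logs/customer_access.log",
    ),
    technical=TechnicalMetadata(
        dataset_name="CUSTOMER",
        attributes={"customer_id": "integer", "name": "string"},
        quality_rules=["customer_id is unique"],
    ),
)
```

Keeping the three types as separate parts of one record reflects that they usually come from different sources (glossaries, logs, schemas) while still describing the same asset.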

Typical applications of metadata would include:

· to catalog your data assets or data products;

· to facilitate the exchange and distribution of data;

· to analyze or audit data processing.

So why is metadata a big deal?

Now that we’ve got that all cleared up, all our metadata endeavors should be careening to the finish line, right? Well, unfortunately no… Acknowledging metadata is a first step, but implementing metadata management is still a tough challenge for most organizations.

And that’s a problem, as metadata is quickly becoming one of the key enablers of digital innovation. Metadata is a necessary component of just about any data-related project. Anything from data democratization and regulatory compliance to data automation will rely heavily on easily accessible and reliable metadata. Organizations will therefore need to master metadata management before they can become truly data driven.

For this article I will highlight four challenges to metadata management, relating to scope, quality, governance and distribution.

Scope, scope, and scope…

First and foremost is scope: the need for metadata often stems from management’s desire to have more control over data assets and data usage. After all, how can you ever be in control of your data if you don’t even know what data consumers might be using?

Management will often prompt a process of data discovery without being able to prioritize just yet (‘just show me everything about everything’). While data discovery is a necessary step in metadata management, a failure to scope and prioritize can quickly cause projects to derail into a high-effort, low-yield endeavor. Instead, focus your efforts on identifying and cataloging your primary data assets. These might include master & reference data, secure & private data or multi-use data products.
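As a rough sketch of what such scoping could look like in practice, the Python snippet below keeps only the assets that fall into one of the priority groups mentioned above and puts the ones that tick the most boxes first. The category labels, asset names and ranking rule are all assumptions made for the sake of illustration.

```python
# Priority groups taken from the text; the exact labels are hypothetical.
PRIORITY_CATEGORIES = {"master", "reference", "secure", "private", "multi-use"}

# Made-up inventory, as it might come out of a first data discovery round.
assets = [
    {"name": "CUSTOMER", "categories": {"master", "private"}},
    {"name": "COUNTRY_CODES", "categories": {"reference"}},
    {"name": "TMP_EXPORT_2019", "categories": set()},
    {"name": "SALES_MART", "categories": {"multi-use"}},
]


def priority_score(asset):
    """Count how many priority groups an asset belongs to."""
    return len(asset["categories"] & PRIORITY_CATEGORIES)


def prioritize(inventory):
    """Drop out-of-scope assets and rank the rest, highest score first."""
    scoped = [a for a in inventory if priority_score(a) > 0]
    return sorted(scoped, key=priority_score, reverse=True)


first_wave = [a["name"] for a in prioritize(assets)]
print(first_wave)
```

The temporary export never makes the list, which is the point: it can still be discovered later, but it should not compete for attention with your primary assets.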

Consider metadata quality

Assuming you’ve identified some of your key data assets, you’ll eventually start metadating these assets. The main goal here is to ensure that the metadata is fit for purpose. It will need to answer some of the most prominent questions that your data consumers have, such as:

· Who produced this data asset?

· Am I allowed to use this data asset?

· What does this data mean?

To answer these questions, you’ll have to consolidate metadata from various sources or maybe even document the metadata from scratch. Metadating large swaths of data can be quite the chore. Therefore, you’ll want to make sure that you provide your data owners and data product owners with the proper incentive, support, and instructions to properly develop the metadata.

Failure to do so will leave you with metadata that is at best inaccurate or incomplete, or at worst just plain gibberish. For example, having your definition for CUSTOMER data read ‘data on customers’ likely won’t help anyone. A better way to go about metadata would be to provide an explanation of the intended use of the data. Or better yet, to link the data to your business glossary to help explain the definition of a customer. Focus on describing those characteristics that will help add value to your metadata.
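A simple automated check can catch the worst offenders before they reach your consumers. The Python sketch below flags definitions that are missing, very short, or that merely restate the asset name. The function name, the five-word threshold and the crude plural handling are assumptions for illustration; a real check would need tuning against your own glossary.

```python
import re


def definition_issues(asset_name, definition, min_words=5):
    """Return a list of problems found in a definition (empty list = no flags)."""
    text = (definition or "").strip()
    if not text:
        return ["missing definition"]

    issues = []
    words = re.findall(r"[a-z]+", text.lower())
    if len(words) < min_words:
        issues.append("definition is very short")

    # Crude circularity check: does the definition reuse the asset's own
    # name, ignoring a trailing plural "s"? This is only a heuristic.
    stem = asset_name.lower().rstrip("s")
    if any(word.rstrip("s") == stem for word in words):
        issues.append("definition restates the asset name")

    return issues


poor = definition_issues("CUSTOMER", "data on customers")
better = definition_issues(
    "CUSTOMER", "A party that has purchased at least one product or service."
)
print(poor)
print(better)
```

The ‘data on customers’ example trips both the length and the circularity check, while the longer definition passes. Passing such a check does not make a definition good, of course; it only filters out the plain gibberish so that reviewers can spend their time on the rest.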

No data governance, no metadata

With Metadata Management, the word ‘management’ is often forgotten. Companies tend to invest so much time and effort in data discovery and metadating that little thought is spent on what comes after. When no concerted effort is made to manage one’s metadata assets, the quality of the metadata tends to decline rapidly. And of course, once your consumers find that metadata is outdated or inaccurate, they’ll quickly abandon it and search for more accurate information elsewhere.

In short, metadata is a data asset in and of itself and it needs to be managed just like any other data asset. Ideally, metadata management should be a logical and integral part of other data management processes. Common practices include linking the responsibility for physical data assets to that of the corresponding metadata assets, or having various consumer groups actively participate in the review and refinement of metadata. Nowadays we also see consumers rating data based on its usability. In that way, the adoption rate of the data shared by data owners can even serve as a KPI to measure success. In parallel, data that is rated poorly will be abandoned. This can be seen as a self-cleaning system for sharing and using data. A strong metadata practice directly supports this KPI and drives trust in the use of data.
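To illustrate how consumer ratings and adoption could be turned into something measurable, here is a small Python sketch. It assumes, purely for illustration, that adoption rate means the share of potential consumers who actually use an asset, and that ratings are given on a 1 to 5 scale; the field names, sample figures and the review threshold are hypothetical as well.

```python
from statistics import mean


def adoption_rate(active_consumers, potential_consumers):
    """Share of potential consumers that actually use the asset (0.0 to 1.0)."""
    if potential_consumers == 0:
        return 0.0
    return active_consumers / potential_consumers


def review_candidates(assets, min_rating=3.0):
    """Names of rated assets whose average rating falls below the threshold."""
    return [
        a["name"]
        for a in assets
        if a["ratings"] and mean(a["ratings"]) < min_rating
    ]


# Made-up figures for three assets.
assets = [
    {"name": "CUSTOMER", "ratings": [5, 4, 4], "active": 40, "potential": 50},
    {"name": "LEGACY_ORDERS", "ratings": [2, 1, 3], "active": 3, "potential": 50},
    {"name": "NEW_FEED", "ratings": [], "active": 0, "potential": 50},
]

for a in assets:
    print(a["name"], adoption_rate(a["active"], a["potential"]))
print(review_candidates(assets))
```

Assets without any ratings are deliberately left out of the review list: no feedback yet is not the same as poor feedback.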

No such thing as ‘private’ metadata

And finally, you’ll want to enable metadata to be shared. Often metadata is managed decentrally by data architects, administrators, or BI developers. Management might even try to limit access to metadata when it relates to private or secure data.

As a result, metadata tends to be poorly accessible. ‘Hidden’ metadata, however, has limited use outside a narrow group of stakeholders. After all, consumers won’t be able to use data assets that they’re unaware of. Not sharing metadata is therefore a surefire way to promote data silos and to encourage the development of duplicate data assets by teams that are unaware of what is already available. Needless to say, you’ll want to make sure that you share metadata with all current and potential future data consumers. Typically, you’ll want to use a commonly accessible platform for distributing metadata. In the early days this might have been your BI portal or even a lowly Excel file, but contemporary solutions like data catalogues or data marketplaces are great for distributing metadata and promoting collaboration.

To conclude

It is my sincere hope that I’ve been able to provide you with insights into some of the many practices for metadata management. Hopefully this blog will leave you better equipped to deal with some of the challenges that might arise.

Looking for more than just helpful advice? Sogeti has extensive experience in helping clients get the best out of metadata and metadata management tools. If you’re interested in learning what metadata can do for you, be sure to read about our Data & Business Analytics services.
