The importance of broad discovery in metadata governance

Pieter Delaere
Published in dScribe data
3 min read · May 13, 2022

When thinking about managing authorizations to access actual data, caution is warranted. You might not wish for everyone in the organization to be able to consult your monthly turnover figures. Specific margins per product are valuable information that should be protected from leaking outside the organization. And especially in a GDPR world, not just anyone should be allowed to consult personal customer information at will.

When talking about metadata authorizations, however, things are different. There are two main reasons why you should think differently about your authorization model for metadata than about the one for actual data.

Reason number 1: broad access comes with more rewards

Although a wealth of information can be gained from metadata alone, the fact that no actual data is exposed offers the opportunity to give significantly broader read access to metadata assets. Doing so comes with significant benefits for three user groups.

Data consumers, who are making use of data for decision making, analyses or various other reasons, waste less time looking for the reports or datasets they need. Rather than going through an often frustrating search (see an example of classic data discovery here), a single metadata portal gives them an easy place to start. The better metadata is exposed to them, the broader they can search and the higher the chance they will find something of value to them.

Data producers, who create reports and data models for themselves or for others, become more productive. Before starting their work, they can have a quick look at which assets already exist, often reducing the need to start from scratch or even leading to the conclusion that the exact asset they intended to create is already available. Additionally, these data experts, who are typically in high demand, carry a lighter ‘support’ burden. They no longer need to deal with common questions such as ‘Does a report on X exist?’ or ‘Where can I find a dataset with Y and Z?’. Users can now answer these questions themselves via the discovery portal.

IT managers, who need to decide which tools to invest in and then wish to maximize the return on those investments, can use a metadata-based discovery portal to boost adoption. Even users who are not yet aware of a tool at their disposal might discover valuable data assets in it via the portal and subsequently start to make use of it. Another result is less stress over managing a complex, distributed data landscape with various systems. Without physically integrating each system, the metadata can at least offer a single access point that easily points users in the right direction.

Reason number 2: broad access incurs fewer risks

When granting access to data, important questions need to be answered. For example: can people cause harm by sharing information with unauthorized users? Are there regulations that need to be complied with when consulting this data (e.g. GDPR)?

On the other hand, when exposing metadata, such as the name and description of various reports and datasets, only one question matters: is there a reason why the existence of this asset should only be known to a select group of users? In practice, the answer is rarely “yes”. One rare example where access to metadata might lead to undesired consequences is the exposure of highly sensitive content such as ‘salary reward reporting’ or ‘employee performance ranking’.

In essence, a key use case for a metadata discovery portal is maximizing the discoverability of assets throughout your organization and across various systems. Deviate from that openness as little as possible in order to reap the maximum rewards discussed earlier.
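To make the "open by default, restricted by exception" idea concrete, here is a minimal sketch in Python. It is purely illustrative and does not describe how any particular discovery portal works: the `MetadataAsset` class, the `restricted_to` field, the group names and the tiny `CATALOG` are all hypothetical. The point it shows is that every asset's metadata is discoverable by everyone unless it has been explicitly restricted to certain groups.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class MetadataAsset:
    """Hypothetical metadata record: describes an asset, holds no actual data."""
    name: str
    description: str
    source_system: str
    # Empty means "discoverable by everyone"; otherwise only these groups.
    restricted_to: frozenset[str] = frozenset()


def can_discover(asset: MetadataAsset, user_groups: set[str]) -> bool:
    """Open by default: an asset is hidden only when explicitly restricted."""
    if not asset.restricted_to:
        return True
    return bool(asset.restricted_to & user_groups)


def search(catalog: list[MetadataAsset], query: str,
           user_groups: set[str]) -> list[MetadataAsset]:
    """Return the assets the user may discover whose name or description matches."""
    q = query.lower()
    return [
        asset for asset in catalog
        if can_discover(asset, user_groups)
        and (q in asset.name.lower() or q in asset.description.lower())
    ]


# Hypothetical catalog: one open asset and one rare, restricted exception.
CATALOG = [
    MetadataAsset("Monthly turnover report",
                  "Turnover per month and region", "BI tool"),
    MetadataAsset("Salary reward reporting",
                  "Salary and bonus overview per employee", "HR system",
                  restricted_to=frozenset({"hr"})),
]

if __name__ == "__main__":
    for groups in ({"sales"}, {"hr"}):
        names = [a.name for a in search(CATALOG, "report", groups)]
        print(sorted(groups), "->", names)
```

In this sketch a user in the hypothetical `sales` group can find out that the turnover report exists without being able to read its figures, while the salary asset stays visible only to `hr`. Keeping the restriction as an explicit exception on the asset, rather than granting visibility asset by asset, mirrors the advice to deviate from openness as little as possible.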

Conclusion

Offering your employees access to a data discovery platform (powered by metadata) is an incredible gift. Via a single access point, you enable them to find and understand any data resources available in the organization. The result? Increased productivity, data usage and cross-team collaboration, without compromising the security of your actual data.

Now that we have clarity on how to define metadata discovery policies, let’s talk about defining metadata contribution policies in the next blog post. Who should be allowed to further document various metadata assets? Should you assign metadata ownership and if so, to whom? Follow-up blog post coming soon!


Pieter Delaere
dScribe data

Fascinated by the possibilities of data. On a mission to enable everyone to generate value through data as CEO of dScribe.