Master Data Management

Mirko Schedlbauer
May 1 · 8 min read

A practical guide to the 10 steps toward a successful MDM project

Seeing a Master Data Management (MDM) system implemented on a company-wide scale is still very rare. But why is that? In this article, we’ll look at the most common obstacles and how to overcome or avoid them.

Generally speaking, MDM is about creating one single reference source for important business data. It is critical for providing real-time digital solutions to customers — be they internal or external. Thus, from a strategic point of view, Master Data Management is critical for a company’s ability to innovate. The 10 steps described here will hopefully make your life easier when setting up an MDM project.

The 10 steps towards a successful Master Data Management project

So why is managing Master Data so hard anyway? There are several reasons, but most MDM projects fail due to the complexity and volatility of the corporate systems landscape. Analyzing all existing data sources is a huge and quite unpredictable amount of work. Evaluating the data quality and the relationships to other sources doesn’t make it easier. Not to mention database structures that slowly change because of new business requirements. An iterative approach to building an MDM system, with an agile mindset of failing and learning fast, is therefore inevitable.

And always keep in mind: it is about solving business problems, so involve the business side when planning MDM. They are the ones who profit from it and should therefore lead the prioritization — involve them early and often to ensure management support.

1. Define Business Goals

Everything starts with a business decision to release budget and/or allocate valuable internal resources to developing an MDM solution. So where are the current issues that require Master Data to be clean and always available? But also: what is the big picture, the vision for the company? Where does the company see itself in 3–5 years? What business models will be predominant in the industry?

We can already see two different views on MDM here:

  • the efficiency perspective of generating short-term ROI through data availability, common data definitions and the resulting reduction of inconsistencies
  • the infrastructure perspective of having MDM as a basis for being able to execute business strategies in the future

Make both paths visible and measurable to get a full scope of the potential added value and to be able to manage expectations effectively.

2. Identify Master Data

Not all data is Master Data. As we’ve learned before, it is not easy to manage Master Data — so we should narrow the existing data pool down to a relevant subset. Here are some selection criteria that apply in most MDM projects:

  • High business value
  • Low volatility
  • Complexity involved
  • Re-usability

And just to point it out once: transaction data is not Master Data. To translate that into a Data Warehousing (DWH) perspective: slowly changing dimensions may be considered Master Data, but facts never are.

We recommend creating a matrix to evaluate potential Master Data in order to establish a common view for all stakeholders. The criteria for measuring business value, volatility, complexity and re-usability have to be defined transparently in advance. This is how the evaluation for a specific product dimension could look:

Exemplary evaluation for the product entity

Obviously, having a DWH in place already makes this process quite a bit easier. But don’t let the old architecture fool you: The database structure from 10 years ago might not represent the business of today perfectly.
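Such an evaluation matrix can also be sketched in code. The entities, the direction of each criterion, and the scores below are purely illustrative assumptions, not values from any real project:

```python
# Sketch of an evaluation matrix for Master Data candidates.
# Scores run from 1 (low) to 5 (high); 'low_volatility' scores high
# when the entity changes rarely. All values are made-up examples.

CRITERIA = ["business_value", "low_volatility", "complexity", "reusability"]

candidates = {
    "product":    {"business_value": 5, "low_volatility": 4, "complexity": 3, "reusability": 5},
    "customer":   {"business_value": 5, "low_volatility": 3, "complexity": 4, "reusability": 5},
    "order_line": {"business_value": 4, "low_volatility": 1, "complexity": 2, "reusability": 2},  # transaction data
}

def total_score(scores: dict) -> int:
    """Unweighted sum; a real project would agree on weights in advance."""
    return sum(scores[c] for c in CRITERIA)

# Rank entities by score; volatile transactional entities drop to the bottom.
ranking = sorted(candidates, key=lambda e: total_score(candidates[e]), reverse=True)
print(ranking)
```

The point of the exercise is less the arithmetic than forcing stakeholders to agree on the criteria and their weighting before the debate about individual entities starts.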

3. Identify & Evaluate Data Sources

Now that the relevant entities to manage as Master Data are identified, they are evaluated in detail. The first goal is to figure out which data sources are involved — which sounds quite easy, but there are usually more shadow systems and manual steps in historically grown processes than anyone would expect.

Then the entities must become canonical — meaning that attributes are mapped across sources to eliminate duplicates. Sometimes it is already possible at this point to identify the one authoritative data source for an attribute. But make sure not to miss any role-playing dimensions, as those are often a valuable source for analysis from the perspective of different business units.

Creating such an overview is especially important when dealing with complex and decentralized IT-landscapes in order to get the right people together for analyzing the data in-depth.
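A minimal sketch of such a canonical attribute mapping, with hypothetical source systems and field names, could look like this:

```python
# Map source-specific field names onto one canonical vocabulary for the
# 'customer' entity. Source systems and field names are invented examples.

SOURCE_TO_CANONICAL = {
    "crm":     {"cust_no": "customer_id", "e_mail": "email", "name1": "full_name"},
    "erp":     {"kunde_id": "customer_id", "mail_addr": "email"},
    "webshop": {"userId": "customer_id", "email": "email", "displayName": "full_name"},
}

def to_canonical(source: str, record: dict) -> dict:
    """Translate one source record into the canonical vocabulary,
    dropping attributes that have no agreed canonical name yet."""
    mapping = SOURCE_TO_CANONICAL[source]
    return {mapping[k]: v for k, v in record.items() if k in mapping}

record = to_canonical("erp", {"kunde_id": "4711", "mail_addr": "a@b.de", "legacy_flag": "X"})
print(record)  # {'customer_id': '4711', 'email': 'a@b.de'}
```

An overview like this immediately shows which attributes exist in which system — here, for instance, that the ERP carries no name attribute at all.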

4. Analyze Metadata

We usually speak of a company-wide, central data store as the ‘single point of truth’. Obviously, it is vital to agree on a common data vocabulary and taxonomy. But sometimes data can have different meanings to different stakeholders, in which case it is hardly possible to find a universal definition.

Here, utilizing metadata is the key. Use as much metadata as possible, make it (or the information derived from it) available to the business users and let them decide what to do with it — instead of desperately trying to define one truth beforehand.
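One lightweight way to implement this idea is a glossary that keeps one definition per stakeholder instead of forcing a single truth. The terms and definitions below are made-up examples:

```python
# A metadata glossary that stores several stakeholder definitions per term,
# so that consumers can pick the meaning that fits their context.

glossary: dict = {}

def define(term: str, stakeholder: str, definition: str) -> None:
    """Record one stakeholder's definition of a business term."""
    glossary.setdefault(term, {})[stakeholder] = definition

define("active_customer", "marketing", "opened a newsletter in the last 90 days")
define("active_customer", "finance",   "has an open contract with revenue > 0")

# Every perspective stays visible instead of being argued away up front.
print(glossary["active_customer"])
```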

5. Analyze Data Lifecycles

Create, Read, Update, Delete. The fairly common CRUD data lifecycle can be used to get a deeper understanding of the processes currently used to work with the identified Master Data entities:

  • Create: Who generates the data? When and how?
  • Read: Who consumes and uses the data? When and how?
  • Update: Who changes the data? When and how?
  • Delete: Who removes or disables the data? When and how?

In most cases it will become evident that the producers of data are not the same as the consumers, and that they have different requirements concerning the data. This gap has to be closed to build a seamless process for all stakeholders.
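The CRUD questions above can be captured in a simple lifecycle matrix and then queried for exactly this producer/consumer gap. Roles, systems and assignments below are hypothetical:

```python
# A CRUD lifecycle matrix: which role performs which operation on which
# entity, in which system. All entries are illustrative assumptions.

lifecycle = [
    # (entity, operation, role, system)
    ("product", "create", "product_management", "PIM"),
    ("product", "read",   "sales",              "webshop"),
    ("product", "read",   "controlling",        "DWH"),
    ("product", "update", "product_management", "PIM"),
    ("product", "delete", "master_data_team",   "PIM"),
]

def roles_for(entity: str, operation: str) -> set:
    """All roles that perform a given CRUD operation on an entity."""
    return {role for e, op, role, _ in lifecycle if e == entity and op == operation}

producers = roles_for("product", "create")
consumers = roles_for("product", "read")
# Consumers who never produce the data mark where requirements must be aligned.
print(consumers - producers)
```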

6. Appoint Data Stewards

Only when you own the problem can you own the solution. In the previous step, the employees responsible for processing each of the Master Data entities were identified. Out of that pool, one person from the business side and one from the technical side should be appointed as Data Stewards.

Enlist people to help so they feel engaged and become part of the solution. Of course, you actually have to include them and give them the chance to contribute on an ongoing basis, instead of just naming them early on and then presenting a finished solution when the project is over.

7. Choose Architecture & Data Model

Choosing the ‘right’ data architecture and developing a data model is super important — but each would be a topic for at least one blog post of its own. And is there really the one and only data model that fits your requirements? I would argue that there are usually several solutions that can do the job. It is far more dangerous to create immense additional effort by blindly going with buzzword technologies because they sound fancy and could, in some scenario, provide amazing performance.

I am not arguing for only using established technology — it is all about the actual value the new system can add to the business. This article on picking the right data model can give you an overview of how to approach that decision in close alignment with the business requirements.

8. Choose Infrastructure & Toolset

This is another complex issue that will require at least one full article, as there are various technical details as well as business implications to consider. And obviously there is the omnipresent topic of cloud vs. on-premises as well. Just to get an overview of the currently well-recognized solutions on the market, here is the Gartner Magic Quadrant for 2017 and 2018:

Gartner Magic Quadrant for Master Data Management Solutions (2017 & 2018)

Of course the main BI vendors IBM, Informatica, Microsoft (represented, in a way, by Profisee) and SAP are there, and each has its edge over the others in specific domains.

So it is definitely not an easy task to choose one. And yes, there are vendor guides out there, but I urge you to get a solution-agnostic partner so you don’t get hooked on the first fancy pitch by one of the vendors — according to their sales teams, the sky is the limit.

9. Evaluate System Modifications

Effective Master Data Management is not a one-way street. It requires communication between the database storing the Master Data and all other systems using it. From the prior analysis of data lifecycles (see step 5) you already know which data sources are affected: all systems in which at least one of the CRUD steps is executed.

Whether it is about collecting customer data in your CRM, regularly buying external data on market developments, or gathering data about technical products: as soon as the criterion of re-usability is met, you’ll want to access this Master Data from other applications as well. Those systems should either use an API to the MDM database to perform lookups or have the Master Data integrated as soon as it is processed.
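As a sketch of the lookup variant, here is a minimal interface a downstream system could call. The in-memory dictionary stands in for the actual MDM database, and all names and records are hypothetical:

```python
# Minimal lookup interface for golden records. In practice the dictionary
# would be replaced by a REST call or database query behind this same
# function, so callers never keep their own copy of the Master Data.

from typing import Optional

GOLDEN_RECORDS = {
    ("customer", "4711"): {"customer_id": "4711", "email": "a@b.de", "full_name": "Ada B."},
}

def lookup(entity: str, key: str) -> Optional[dict]:
    """Return the golden record for an entity key, or None if unknown."""
    return GOLDEN_RECORDS.get((entity, key))

# A CRM or webshop resolves records through the MDM store:
print(lookup("customer", "4711"))
print(lookup("customer", "9999"))  # None: not yet mastered
```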

Make sure to address potential alterations early and include the responsible stakeholders in the decision processes. This way, unpleasant surprises, e.g. capacity shortages or non-expandable software, can be avoided.

10. Prototyping

Now it is time for the fun part: actually delivering a prototype for the users. But is this a small version of the fully implemented MDM solution? No, it is just a sketched process of how one or more specific use cases are approached.

It can involve a click dummy for end users if the solution contains a new interface. Or it can demonstrate how a specific piece of data flows through the newly arranged APIs. This part is all about onboarding the team on how the solution will look and, of course, getting their feedback on the overall vision.

You keep your internal or external customers engaged by giving them a look-and-feel experience of the solution. They are the ones working with the system in the end — their continuous input is invaluable.

Summary

  • make short-term and long-term benefits visible and measurable
  • involve key stakeholders early and often
  • define a narrow scope to start off with
  • identify relevant Master Data & get to know it in detail
  • appoint data stewards to create accountability
  • develop your model and infrastructure iteratively
  • involve a solution-independent partner
  • keep all affected systems in scope, they distribute the added value


At Appanion, we develop strategies, prototypes and tangible action plans to work successfully with emerging key technologies such as artificial intelligence.

Mirko Schedlbauer

Digital Strategy, Data Analytics & AI Specialist | Founder @ Appanion | Passionate about digital, disruptive technologies


