Master Data Belongs on the Blockchain

Steve Moore
Inside Machine learning
4 min read · May 8, 2018
  • Distributed ledger techniques can be optimal for managing master data — cheaper, more efficient, and more trustworthy than a centralized approach.
  • The missing piece is matching and linking records reliably.
  • By building permissioned networks of nodes, master data management can get cheaper, more efficient, more transparent, and more trustworthy.

___

It’s easy to think of large organizations as centralized and monolithic, but the reality is often the opposite. For banking, healthcare, transportation, energy, manufacturing, and other sectors, the trend is decentralized locations and teams managing local data. But it’s a trend that comes with the potential for chaos — especially for master data, where accuracy, security, and conformity are essential.

Building Master Data Management capabilities on the blockchain offers the benefits of traditional MDM while also taking advantage of powerful new paradigms for flexibility, consensus, and embedded analytics.

Let’s look at the details.

Master Data Management (MDM) depends on creating consensus truth for the enterprise. If Hospital A merges with Hospital B, their big stores of master data need to merge as well. It’s critical that the process reliably matches patient records when it should, while carefully avoiding false matches. Real lives can depend on the accuracy of the master data matching process.

Traditionally, matching has meant linking the records within the two different databases, based on identifiers like Social Security Number, date of birth, driver’s license information, and so on. The MDM system could write the linkage information to a central database accessible from different locations. But keeping a single copy of the linkage data in a single location means that admins need to take special care to ensure that the data is highly available and secure. Private blockchain networks (also called ‘permissioned networks’) offer an intriguing alternative.
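To make the traditional approach concrete, here is a minimal sketch of identifier-based matching in Python. The `PatientRecord` fields, the normalization rule, and the "at least two agreeing identifiers, no conflicts" threshold are all assumptions chosen for illustration, not a description of any particular MDM product.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class PatientRecord:
    """Hypothetical record shape; real master data carries many more fields."""
    record_id: str
    ssn: Optional[str]
    date_of_birth: Optional[str]  # ISO format, e.g. "1980-02-14"
    license_number: Optional[str]


IDENTIFIER_FIELDS = ("ssn", "date_of_birth", "license_number")


def normalize(value: Optional[str]) -> Optional[str]:
    """Drop punctuation and case so '123-45-6789' compares equal to '123456789'."""
    if value is None:
        return None
    cleaned = "".join(ch for ch in value if ch.isalnum()).upper()
    return cleaned or None


def count_agreements(a: PatientRecord, b: PatientRecord) -> Tuple[int, int]:
    """Return (agreeing, conflicting) identifier counts, skipping missing values."""
    agree = disagree = 0
    for field in IDENTIFIER_FIELDS:
        left = normalize(getattr(a, field))
        right = normalize(getattr(b, field))
        if left is None or right is None:
            continue  # a missing identifier is neither evidence for nor against
        if left == right:
            agree += 1
        else:
            disagree += 1
    return agree, disagree


def is_match(a: PatientRecord, b: PatientRecord, min_agreements: int = 2) -> bool:
    """Conservative rule (an assumption): any conflict blocks the link."""
    agree, disagree = count_agreements(a, b)
    return disagree == 0 and agree >= min_agreements
```

The rule is deliberately strict: a single conflicting identifier prevents a link, which trades some missed matches for fewer false ones.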

A new role for blockchain

What started as a digital ledger for assets and currency is expanding into new realms. Over time, large enterprises will adopt distributed ledger models to record and manage biographic and biometric data. For example, imagine hospitals, banks, and governments all wanting to maintain their master data on the blockchain. But those organizations will need ways to match and link that data across private networks.

Consider Hospital A and Hospital B. If they each maintain their patients’ records, how will they combine those records in the event of a merger?

The hospitals could first create a business network using blockchain technology. That offers an advantage because data sharing then happens on the blockchain network as opposed to being centralized. Once the teams create the network and begin sharing data on it, matching algorithms run against the shared records to link them, and the linking information is also stored natively on the blockchain.
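One way to picture linkage information stored on a ledger is as an append-only chain of entries, each carrying the hash of the entry before it. The sketch below is a toy, single-process illustration using only the Python standard library; the `LinkageLedger` class and its field names are hypothetical, and a real permissioned network would add identity, networking, and consensus on top.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


class LinkageLedger:
    """Toy append-only ledger of record links (illustrative only)."""

    def __init__(self):
        self.blocks = []

    @staticmethod
    def _digest(index, previous_hash, payload):
        # Serialize deterministically so the same content always hashes the same.
        body = json.dumps(
            {"index": index, "previous_hash": previous_hash, "payload": payload},
            sort_keys=True,
        )
        return hashlib.sha256(body.encode("utf-8")).hexdigest()

    def append_link(self, record_a, record_b, score):
        """Record that record_a and record_b refer to the same entity."""
        index = len(self.blocks)
        previous_hash = self.blocks[-1]["hash"] if self.blocks else GENESIS_HASH
        payload = {"record_a": record_a, "record_b": record_b, "score": score}
        block = {
            "index": index,
            "previous_hash": previous_hash,
            "payload": payload,
            "hash": self._digest(index, previous_hash, payload),
        }
        self.blocks.append(block)
        return block

    def is_valid(self):
        """Recompute every hash; any edit to an earlier entry breaks the chain."""
        previous_hash = GENESIS_HASH
        for index, block in enumerate(self.blocks):
            if block["index"] != index or block["previous_hash"] != previous_hash:
                return False
            if block["hash"] != self._digest(index, previous_hash, block["payload"]):
                return False
            previous_hash = block["hash"]
        return True
```

Because each entry commits to its predecessor, quietly rewriting an old link changes its hash and invalidates everything after it, which is the property that makes the linkage history auditable.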

Teams could also choose whether each node should maintain its own copy of the linkage information on the ledger. If not, the node can simply consume the linkage information that’s maintained elsewhere on the network. That option keeps transaction activity from swamping any nodes that might have less compute power or connectivity, while helping to ensure that the linkage data is stored redundantly across multiple nodes.
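The choice between keeping a local copy and consuming linkage data from elsewhere can be sketched as two kinds of node. Everything here is an assumption made for illustration: the `Node` class, the `keeps_copy` flag, and the in-memory `broadcast` helper stand in for what would be network calls in a real deployment.

```python
class Node:
    """Hypothetical network participant that may or may not store linkage data."""

    def __init__(self, name, keeps_copy):
        self.name = name
        self.keeps_copy = keeps_copy
        # Only nodes that opt in hold their own copy of the links.
        self.links = [] if keeps_copy else None
        self.peers = []

    def receive_link(self, link):
        """Store a new (record_a, record_b) link if this node keeps a copy."""
        if self.keeps_copy:
            self.links.append(link)

    def lookup(self, record_id):
        """Return all links that mention record_id, asking a peer if necessary."""
        if self.keeps_copy:
            source = self.links
        else:
            full_peer = next((p for p in self.peers if p.keeps_copy), None)
            if full_peer is None:
                raise LookupError(f"{self.name} has no peer holding linkage data")
            source = full_peer.links
        return [link for link in source if record_id in link]


def broadcast(nodes, link):
    """Deliver a new link to every node; light nodes simply ignore it."""
    for node in nodes:
        node.receive_link(link)
```

A small clinic with limited connectivity could run as a light node (`keeps_copy=False`) and still resolve links through the two hospitals, which each hold a redundant copy.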

Hopefully, the hospital example helps paint a compelling picture of the potential advantages of MDM on the blockchain, but the gains don’t stop there. Consider…

  • Data reconciliation: When every participating business unit is part of the blockchain network, there’s no longer a need to move data between the business units. With traditional MDM, data movement can consume an enormous amount of time and energy.
  • Cost and trust: Maintaining a central infrastructure is expensive, and a single central store is a single point of compromise. On a blockchain network, a transaction isn’t committed until the participating nodes reach consensus on it.
  • Organizational efficiency: The blockchain eliminates the need for complex reconciliation between different nodes, whether the nodes are branch banks, health clinics, distribution centers, or other peers in the system.
  • Disintermediation: Eliminates central intermediaries and reduces the fear of arbitrage within the ecosystem.
  • Transparency: Enables audit trails to be established for assets and transactions, minimizing disputes.
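The trust and transparency points above can be sketched together: a proposed link is committed only if every participant approves it, and the votes are recorded alongside it as an audit trail. This is a simplified illustration under stated assumptions. The unanimous-vote rule and the two validator functions are hypothetical, and production networks rely on fault-tolerant consensus protocols rather than a simple loop.

```python
from typing import Callable, Dict, List

Validator = Callable[[dict], bool]


def commit_if_unanimous(
    transaction: dict,
    validators: Dict[str, Validator],
    ledger: List[dict],
) -> bool:
    """Append the transaction to the ledger only if every validator approves."""
    votes = {name: bool(validate(transaction)) for name, validate in validators.items()}
    approved = bool(votes) and all(votes.values())
    if approved:
        # Keeping the votes with the entry gives auditors a record of who agreed.
        ledger.append({"transaction": transaction, "votes": votes})
    return approved


def has_both_record_ids(transaction: dict) -> bool:
    """Example policy: a link must name a record on each side."""
    return bool(transaction.get("record_a")) and bool(transaction.get("record_b"))


def score_is_high_enough(transaction: dict) -> bool:
    """Example policy: reject low-confidence links (the threshold is an assumption)."""
    return transaction.get("score", 0.0) >= 0.9
```

Each participant can apply its own policy, so a link that one hospital considers too weak never reaches the shared ledger.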

Looking forward

Like all big data, master data offers important opportunities for machine learning analytics. Embedded analytics of anonymized master data can yield powerful insights, but machine learning can also play a role further upstream. Innovative firms will find ways to apply machine learning to the matching process itself to ensure even higher confidence for the linkages between records.
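As one illustration of attaching a confidence to a link, the sketch below scores a candidate pair in the style of the classical Fellegi-Sunter record linkage model, which is a swapped-in example rather than a method the article prescribes. The per-field probabilities are made-up placeholders; in practice they would be estimated from data, which is where a learning step would come in.

```python
import math

# Hypothetical per-field probabilities (placeholders, not measured values):
#   m = P(field agrees | records are the same person)
#   u = P(field agrees | records are different people)
FIELD_PROBS = {
    "ssn": (0.95, 0.001),
    "date_of_birth": (0.97, 0.01),
    "license_number": (0.90, 0.005),
}


def match_weight(agreements):
    """Sum log-likelihood ratios over fields; higher means stronger evidence of a match.

    `agreements` maps a field name to True (values agree) or False (values differ).
    """
    total = 0.0
    for field, agrees in agreements.items():
        m, u = FIELD_PROBS[field]
        if agrees:
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total
```

Agreement on a rarely coincident identifier such as a Social Security Number adds far more weight than agreement on date of birth, so pairs can be ranked and only those above a chosen threshold linked automatically.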

Ultimately, the goal is to make Master Data Management as easy and intuitive as possible. New tools will give non-technical users across industries the ability to manage master data flexibly, efficiently, securely — and with perfect confidence.

Steve Moore

IBM Story Strategist. Machine Learning researcher. Speaker. Teacher. Opinions are my own.