Data centralisation: Joining dots for trustworthy predictions

Nikki Miles
Published in MPB Tech
Jul 12, 2023 (6 min read)
A light source illuminates strings to create strands of light, giving the impression of data pointing towards a centralised location
Photo by Joshua Sortino

It’s often said that data is the new oil. But, like oil, data isn’t very useful in its raw state. Only after careful processing is its true value revealed.

Organisations today can call upon more data from more sources than at any time in history, but that has its downside too. If it isn’t maintained, understood and thoughtfully queried, its value can be lost.

Data centralisation — creating a common, accessible single source of truth instead of disparate data silos — can give your people the kind of understanding that enables strategic decision-making.

At MPB, we’re on a journey to unlock the value of our data. We want to provide our business units with analysis and predictions they can work with. In this post, I’ll share how we’re going about it and what we’ve learned along the way.

Making the connection

Combining data from various sources to reveal the bigger picture is immensely powerful.

Of course, isolated data silos — maintained and queried by super-users with detailed business knowledge — can be incredibly valuable. But data centralisation brings a host of benefits that siloed data can’t match.

Bringing the disparate sources into a central repository — a data lake, data warehouse or both, usually cloud-hosted — enables some big wins:

  • Centralised governance: Minimises risk from regulatory and compliance issues, maintains data quality and ensures appropriate access.
  • Single source of truth: Data is transformed into key aggregations, which then form the cornerstone of reporting, analytics or data science. This creates consistency and removes ‘on the fly’ calculations from the end user.
  • Consistent, efficient, automated: Prevents the same data transformations from being repeated by many different people, which can introduce human error and reduce data store efficiency.
  • Minimises data redundancy: Consolidation avoids the same data being stored in multiple places, often telling different stories.
  • Happier teams: Engineers can better understand the wider business value of their work. Data users can access the clean, reliable information they need to ‘actually do their job’. And building a central data team can create a supportive network that learns and grows collaboratively.
  • Happier stakeholders and enhanced decision making: Data democratisation; analysis sourced quickly from reconciled and trusted data.

All of this results in an output far more powerful than the sum of its parts.

Empowering the people

For data to be easily accessible, it must be transformed into key aggregations. One classic example is the Single Customer View, which brings everything known about a customer into one record. Knowing your customers’ behaviour patterns enables enhanced interactions and communications which are trusted and informative, relevant and timely.
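
As a rough illustration of what a Single Customer View aggregation involves, here is a minimal Python sketch. Everything in it is an assumption made for the example: the two source systems (orders and support tickets), the field names and the `build_single_customer_view` helper are hypothetical and do not describe MPB's actual pipeline.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class CustomerView:
    """One record per customer, combining facts from several sources."""
    customer_id: str
    order_count: int = 0
    total_spend: float = 0.0
    last_order: Optional[date] = None
    open_tickets: int = 0


def build_single_customer_view(orders, tickets):
    """Aggregate hypothetical order and ticket rows into one view per customer."""
    views = {}
    for order in orders:
        cid = order["customer_id"]
        view = views.setdefault(cid, CustomerView(cid))
        view.order_count += 1
        view.total_spend += order["amount"]
        if view.last_order is None or order["date"] > view.last_order:
            view.last_order = order["date"]
    for ticket in tickets:
        cid = ticket["customer_id"]
        view = views.setdefault(cid, CustomerView(cid))
        if ticket["status"] == "open":
            view.open_tickets += 1
    return views


# Illustrative rows standing in for two separate data silos.
ORDERS = [
    {"customer_id": "c1", "amount": 120.0, "date": date(2023, 1, 5)},
    {"customer_id": "c1", "amount": 80.0, "date": date(2023, 3, 9)},
    {"customer_id": "c2", "amount": 40.0, "date": date(2023, 2, 1)},
]
TICKETS = [
    {"customer_id": "c1", "status": "open"},
    {"customer_id": "c3", "status": "closed"},
]

views = build_single_customer_view(ORDERS, TICKETS)
```

The point of doing this once, centrally, is that every report and model then reads the same `CustomerView` record instead of recalculating spend or ticket counts in its own way.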

Key data aggregations can be brought to life with a dashboard — an accessible window into the data for business users, designed with a clear use case in mind, supporting data-led decisions.

Business dashboards can encompass many different areas of the business and associated data, enabling analysts and business users to drill down into top-level metrics in a safe, structured and repeatable way.

A dashboard doesn’t have to be one-size-fits-all, but it should be future-proofed where possible:

  • Its value comes from supporting business-wide understanding and surfacing key metrics in a centralised and simplified way.
  • It should encourage the ‘so what?’ and ‘why?’ questions that lead to focused analytics. With the use of filters and parameter-driven flexibility, it can do a good job of covering the top-level drill-downs, allowing end users to investigate drivers of trends.
  • It provides a tool for analysts to reconcile to the truth and bring confidence to any deeper analysis.

Of course, the more powerful the dashboard, the more daunting it can be for those less comfortable with data. As data specialists, we need to support these users and ensure their requirements are built into our solutions.

For example, having just a few key centralised dashboards with increased functionality drives frequent use. Sharing the same views instils familiarity and confidence — people can easily understand where screenshots or KPIs came from (and there’s no risk of losing a hidden Google Sheet full of vital data if its owner leaves the business).

Getting the whole truth

To achieve a single source of truth, skills across multiple data disciplines are required.

  • Data engineers are needed to build robust automated pipelines, owning the data products that support business success.
  • Reporting and Analytics use this well-structured, maintained and trusted data, surfacing key insights to the business in digestible formats to enable quick understanding and drive strategic decisions.
  • Data scientists, utilising the transformed data alongside less structured/aggregated data, apply statistics and data science techniques to identify groups or predict patterns.

Data ownership

Ownership of data can be a grey area. After all, you have a team of people with ‘data’ in their title who work with data. So they own data … right?

Not necessarily. Business owners should own their own data. They are the subject matter experts. They understand what they want from the data, the key processes involved in collecting it and the caveats to use when interpreting it.

Data and analytics teams are custodians of the data but, of course, this doesn’t absolve them of all responsibility. Data professionals need to work closely with stakeholders to understand their business roles and pain points.

This could be done by formally embedding data team members whilst maintaining a common data Centre of Excellence, or through a formal and centralised data team that works closely with business units.

A centralised data team can learn and grow together, sharing knowledge and peer reviews. These are ever-changing specialist roles, and this continual learning can be lost when data professionals are embedded with non-data peers.

A centralised data team can also mitigate single points of failure, with cross-focus peer reviews alongside formal and informal discussions on approaches.

Keep everything. It might be useful … ?

MPB has a long history as a data-driven operation. We have over a decade of proprietary data, covering millions of data points, which we can use to make accurate predictions about how customers will respond to a proposed change.

Of course, too much data has the potential to cause analysis paralysis. Data centralisation means taking stock of what’s useful and what’s redundant.

There is now so much available for e-commerce businesses, from platform metrics to customer metrics to third-party integrations and so on. Given low data storage costs and ever-changing business requirements, it can be hard to decide what to load and what to leave. The reality is, it’s easier to bring in everything and sift later.

As we said at the start of this article, data is only valuable if it is usable. It is at this point that clear use cases, hypotheses and goals need to be established. We must understand not only the expected value of the data but also the transformations we can apply to make it fit for purpose.

For example, do you require historical data or simply a current view? Thought needs to be given as to how to store historical data to enable easy access — snapshots are generally much more difficult to work with than a log of data with start and end dates.
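
To make the "log with start and end dates" idea concrete, here is a minimal Python sketch. The entity (a product price), the field names and both helper functions are assumptions chosen for illustration, not a description of how MPB stores history. Each record carries the period for which it was valid, so the same log answers both "what is true now?" and "what was true on a given date?".

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class PriceRecord:
    product_id: str
    price_pence: int
    valid_from: date            # inclusive
    valid_to: Optional[date]    # exclusive; None means "still current"


def value_as_of(log, product_id, as_of):
    """Return the record that was valid for product_id on as_of, or None."""
    for record in log:
        if record.product_id != product_id:
            continue
        started = record.valid_from <= as_of
        not_ended = record.valid_to is None or as_of < record.valid_to
        if started and not_ended:
            return record
    return None


def current_view(log):
    """Return the current record per product (those with no end date)."""
    return {r.product_id: r for r in log if r.valid_to is None}


# Hypothetical history: the price of one product changed on 1 June 2023.
LOG = [
    PriceRecord("cam-1", 50000, date(2023, 1, 1), date(2023, 6, 1)),
    PriceRecord("cam-1", 45000, date(2023, 6, 1), None),
    PriceRecord("lens-7", 20000, date(2023, 2, 1), None),
]

march_price = value_as_of(LOG, "cam-1", date(2023, 3, 15))
latest = current_view(LOG)
```

With periodic snapshots, the same point-in-time question means locating the right snapshot and hoping one was taken close enough to the date of interest. The dated log stores one row per change instead.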

Often, transformations are only considered after loading, which means the storage of redundant data frequently goes unreviewed. With so much data now available, data catalogues and data dictionaries have never been more important.
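
One lightweight way to keep loaded data under review is to compare what has been loaded against what the data dictionary describes. The sketch below is a hypothetical example only: the dictionary layout, the table and column names and the `undocumented_columns` helper are all assumptions made for illustration.

```python
# Hypothetical data dictionary: table -> column -> plain-language description.
DATA_DICTIONARY = {
    "orders": {
        "order_id": "Unique identifier for the order",
        "customer_id": "Identifier of the purchasing customer",
    },
}

# Hypothetical schema of what has actually been loaded into the warehouse.
LOADED_SCHEMA = {
    "orders": ["order_id", "customer_id", "legacy_flag"],
    "clicks": ["session_id"],
}


def undocumented_columns(loaded_schema, dictionary):
    """Return sorted (table, column) pairs that are loaded but not described."""
    missing = []
    for table, columns in loaded_schema.items():
        documented = dictionary.get(table, {})
        for column in columns:
            if column not in documented:
                missing.append((table, column))
    return sorted(missing)


gaps = undocumented_columns(LOADED_SCHEMA, DATA_DICTIONARY)
```

Anything that appears in `gaps` is a prompt for a conversation with the business owner: either the column has a use case and deserves a dictionary entry, or it is a candidate for removal.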

The changing landscape

Centralising data is a job that will never be finished. The key is to build foundations that enable scalability and changing requirements.

The world of data moves fast — how we use it, what kind and how much, all are moving targets. It is important to be flexible, to build iteratively and not to be afraid to change direction.

Amid all this change, data catalogues, data dictionaries and entity relationship diagrams are key to success. Because garbage in, garbage out.

It’s about the journey

You still want to know where you are headed, right? I know I don’t like surprises!

At MPB, we have a wealth of first-party and third-party data sources, supported by a data-literate senior team and a huge appetite for data across business functions. Over the past 18 months, we have gone through a data transformation to enable a scalable, consistent and governed approach supported by our data warehouse and expanding Data and Analytics team.

We have a clear destination in mind, and some key milestones to reach along the way. Together, we’ve overcome roadblocks and pivots. In less than two years, we have enhanced our approach to data storage, accessibility and business self-serve, and there is so much more to come.

With the foundations in place, we’re already better able to share insights, tell meaningful stories with our data, and support our community of visual storytellers — keeping our customers at the heart of everything we do.

Nikki Miles is Head of Data and Analytics at MPB, the world’s largest platform to buy, sell and trade used photography and videography kit. https://www.mpb.com
