7 Step Strategy to Build Data Products and Platforms

everill.peter@googlemail.com
Dec 1, 2021


7 steps with typical ownership

Building data products and platforms is hard. The 7-step method is a practical strategy for building them, and a framework against which you can hang the enabling components of a data operating model. By that I simply mean the people, skills, technology, governance, delivery methodology, measures and culture that are needed to achieve these steps.

By products I mean two things:

  1. Analytical products, typically classified as self-serve performance insights, automated alerting, automated decisioning and monetised insights.
  2. Data products, typically data marts and feature stores within a data platform, which the analytical products above consume.

Note: Analytical services such as performance deep dives, experimentation and support for one-off decision making all feed into and use these data product builds.

The 7 Steps

Step 1: Problem & Data Product solution is defined

Outcome: The problem you are looking to solve is clearly defined. Success in solving the problem can be measured. The data product to solve this problem is defined, along with the ownership and organisational readiness for execution.

Key principles:

This step is all about ensuring you are solving a valuable problem. In some way, every data product is trying to improve the performance of a commercial metric. Often people will define the problem but overlook a clear commercial metric for measuring how well that problem has been solved. You’ll also want to put a benefit figure against solving the problem; it is the job of the analytics teams to support the business owner in sizing that opportunity.

The second part of this step is to define what the data product is. This will cover both a business-friendly overview of how it works and the more detailed technical version that allows the build teams to estimate the dependencies and effort involved. Depending on how experimental the data product is, you may need to stay flexible on both the technical design and the estimate.

A big part of the role of analytics & data science professionals is working with business owners to translate business problems into data products and credibly selling the art of the possible.

The third part is to jump ahead and consider the organisational readiness. If you build this data product — will it be used? In the execution step who will use this data product? What process or decision does it impact? What roles will be involved? How will that role be impacted? How will they consume the data product?

If you haven’t got organisational buy-in from the stakeholders impacted at the execution step, you shouldn’t proceed any further.

Step 2: The Data & Tech dependencies are defined

Outcome: The data & tech dependencies required to build the data product are defined, and there is organisational commitment to build it.

Key principles:

The data product solution is defined in the previous step so that solution thinking is not limited to the data you currently have. However, now that you have defined the data product, you need to think through the practical data & tech dependencies. You might not currently have the data needed to solve this business problem, in which case you will need to go through the subsequent steps of capturing it and building it. As your data platform matures, this data might already have been built and be ready to consume.

Whilst the first step considers the value opportunity, the second step is more grounded in the feasibility of the tech and data effort and the associated costs. By the end of this step, you need to be able to put some sort of estimate and cost against solving the business problem through building the identified data product.

Example considerations include: What data is needed? Which source systems does it come from? How difficult is it for us to capture? What tech do we need to capture, store and process the data? What tech will the data product be consumed in, and what associated development work is needed?

The output of step 2 is an appropriately robust value versus feasibility assessment, which you’ll want your business owner to be comfortable with and engaged in. Your business owner really is relying on your expertise here. It’s perfectly okay to decide that some data products shouldn’t proceed past this stage, or to defer them until you’ve got a number of data products which require the same data & tech dependencies and whose collective value makes it worthwhile proceeding. Data products that seemed a great idea in step 1 might be less attractive at the end of step 2. This assessment will enable you to prioritise which data products to take into the subsequent build steps.
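As a rough illustration of the value versus feasibility assessment, the sketch below compares one data product assessed alone with two products pooled on a shared dependency. All names, figures and the simple value-to-cost ratio are hypothetical assumptions for illustration, not a prescribed scoring method.

```python
from dataclasses import dataclass

# Hypothetical cost of building each data & tech dependency (steps 3 to 5).
DEPENDENCY_COST = {"orders_feed": 80_000, "web_events": 120_000}


@dataclass(frozen=True)
class Candidate:
    name: str
    annual_value: float       # benefit estimate sized in step 1
    dependencies: frozenset   # dependency names identified in step 2


def value_to_cost(candidates):
    """Collective value divided by the cost of the union of dependencies.

    A shared dependency is only paid for once, which is why pooling
    products that need the same data can change the assessment.
    """
    needed = set().union(*(c.dependencies for c in candidates))
    cost = sum(DEPENDENCY_COST[d] for d in needed)
    value = sum(c.annual_value for c in candidates)
    return value / cost


churn_alert = Candidate("churn_alert", 90_000, frozenset({"web_events"}))
basket_insight = Candidate("basket_insight", 70_000, frozenset({"web_events"}))

alone = value_to_cost([churn_alert])                    # value below cost
pooled = value_to_cost([churn_alert, basket_insight])   # value above cost
```

In this made-up case the first product does not cover the cost of its dependency on its own, but the pair does, which is the situation where deferring and then pooling makes sense.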

Step 3: Data is captured by Source Systems

Outcome: The data needed to build this data product exists and is captured by a source system.

Key principles:

This is where the cultural mindset of thinking of data as a product needs to take hold across the organisation. As we increasingly move towards data driven or even algorithmic businesses, the quality of source system data capture, and any changes to it, can’t be disconnected from an assessment of the impact on downstream decision making or algorithm effectiveness.

Data quality governance and metrics around data capture need to be made the responsibility of the source system owners at step 3. Not spotting data issues until steps 6 and 7 is inefficient, as time is then spent tracing them back to the root cause.
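One simple data capture metric a source system owner could track is field completeness. The sketch below is a minimal example under assumed conditions: the record layout, field names and the 95% threshold are all hypothetical.

```python
def completeness(records, required_fields):
    """Share of records in which each required field is present and non-empty."""
    total = len(records)
    scores = {}
    for field in required_fields:
        filled = sum(1 for r in records if r.get(field) not in (None, ""))
        scores[field] = filled / total if total else 0.0
    return scores


def failing_fields(records, required_fields, threshold=0.95):
    """Fields whose completeness falls below the (assumed) threshold."""
    scores = completeness(records, required_fields)
    return sorted(f for f, s in scores.items() if s < threshold)


# Hypothetical records as captured by a source system.
sample = [
    {"customer_id": "c1", "postcode": "AB1 2CD"},
    {"customer_id": "c2", "postcode": ""},
    {"customer_id": "c3", "postcode": None},
    {"customer_id": "c4", "postcode": "EF3 4GH"},
]

scores = completeness(sample, ["customer_id", "postcode"])
problems = failing_fields(sample, ["customer_id", "postcode"])
```

Running a check like this at the point of capture surfaces the missing postcodes at step 3 rather than during the analytical build.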

Step 4: Data is published to the Data Platform

Outcome: The source data is published to the data platform

Key principles:

Depending on the size and scale of the organisation, this step can be a bottleneck, particularly for larger organisations with a centralised team. This is the problem the concept of the Data Mesh is looking to address by decentralising the responsibility for publishing data to the source system owners. The benefit is that, as the source system data changes and evolves, the domain experts closest to it can ensure the data pipeline is updated. To enable this decentralisation, strong architectural and infrastructure patterns need to be implemented so that there is consistency in the way data is published; otherwise this just creates additional work at step 5: Data build.

Step 5: Data build

Outcome: The data is engineered and made available for consumption

Key principles:

Getting the source data onto the platform is one thing; making sure it’s in a consumable state is another. Time needs to be spent modelling, curating and quality checking to ensure the data is fit for use by analysts, data scientists and reporting teams. This can be through a presentation layer for the underlying data warehouse, or it can be the feature store for your machine learning teams.

A focus for your data platform team is promoting what is available to the analytics / data science teams and citizen analysts. Data engineering might not get the limelight, but it can have massive impact by producing a data platform on which the same data can be reused across a number of analytical products. This helps the data platform achieve economies of scale.

One concept to explore is thinking of your data platform as having a similar business model to a social media platform: using engagement metrics to measure the effectiveness of the content being published in the data platform.
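To make the engagement idea concrete, here is a minimal sketch that counts queries and distinct users per dataset. It assumes a hypothetical query log of (user, dataset) pairs; a real platform would source this from its own audit or query history.

```python
from collections import defaultdict


def engagement(query_log):
    """Count queries and distinct users per dataset from (user, dataset) pairs."""
    users = defaultdict(set)
    queries = defaultdict(int)
    for user, dataset in query_log:
        users[dataset].add(user)
        queries[dataset] += 1
    return {
        dataset: {"queries": queries[dataset], "distinct_users": len(users[dataset])}
        for dataset in queries
    }


# Hypothetical log entries; user and dataset names are made up.
log = [
    ("ana", "sales_mart"),
    ("ben", "sales_mart"),
    ("ana", "sales_mart"),
    ("cat", "churn_features"),
]

metrics = engagement(log)
```

Datasets with few queries or few distinct users are candidates for better promotion, rework or retirement.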

Step 6: Analytical build

Outcome: The data product is built and ready for deployment

Key principles:

Once the foundations have been laid in steps one through five, step six is where analysts turn the data into a consumable analytical product. As discussed, the products could be self-serve performance insights, automated alerting, automated decisioning, or monetised insights.

Each might have its own tailored delivery methodology. Algorithm development, for example, is not linear: it moves back and forth between the steps as the teams experiment with what is possible and what data is needed.

Step 6 is where the business becomes more heavily engaged in the typical end user activities of the development cycle — notably testing and preparing to deploy.

Step 7: Execution

Outcome: The data product is live in a customer journey or business process and delivering the intended value

Key principles:

Put simply, this is where the data product is used, and value is delivered. The data product is embedded into customer journeys or business workflows / decision making.

Business change will be required to ensure the data product is used by the business, and this is something that should have been considered and planned for in step 1.

Post-delivery you should be factoring in the resource and cost of on-going product support, monitoring, feedback and iterative build.

The value of adopting the 7-step method

  • It uses business-friendly language. Much of the language and many of the acronyms used in data don’t, which creates barriers to organisational adoption.
  • Within the data world there are many skillsets and types of work. Individuals and teams, along with business stakeholders, need to be united by a common language and framework to foster collaboration.
  • This in turn promotes professional mastery: different disciplines combine into high-performing teams because each has clarity on its role and responsibilities.
  • It enables you to visually show the flow of delivery for stakeholders and teams, demystifying the data/analytics/data science black box or “black hole” into steps with clarity on: what they are, the intended outcome (definition of done), which step the work is at, and who is responsible for progressing it. This drives engagement and throughput.
  • It enables individuals to understand how their role ladders up to solving business/customer problems and ultimately the business value they are helping to achieve.
  • While presented as a linear process analogised as a production line, the steps are intended to be iterative and completed in parallel to one another where possible.
  • It can be a framework against which you evaluate your organisational capability & capacity to understand which steps need developing. Value is only delivered on completion of all 7 steps.
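The capability evaluation in the last bullet can be sketched as a simple scorecard. The 1 to 5 scoring scale, the minimum of 3 and the example scores are hypothetical assumptions; only the step names come from the method itself.

```python
# The seven steps, keyed by step number.
STEPS = {
    1: "Problem & Data Product solution is defined",
    2: "The Data & Tech dependencies are defined",
    3: "Data is captured by Source Systems",
    4: "Data is published to the Data Platform",
    5: "Data build",
    6: "Analytical build",
    7: "Execution",
}


def steps_needing_development(scores, minimum=3):
    """Return (number, name) for steps scored below `minimum`, in step order.

    `scores` maps step number to a capability score on an assumed 1-5 scale;
    unscored steps count as 0.
    """
    return [(n, name) for n, name in STEPS.items() if scores.get(n, 0) < minimum]


def value_delivered(completed_steps):
    """Value is only delivered once all 7 steps are complete."""
    return all(n in completed_steps for n in STEPS)


# Hypothetical self-assessment for one organisation.
capability = {1: 4, 2: 4, 3: 2, 4: 3, 5: 4, 6: 5, 7: 2}
weak = steps_needing_development(capability)
```

Here the weakest steps would be data capture and execution, even though the analytical build scores highly.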

What makes it product led?

  • It starts with a customer or business problem, and its end goal is to deliver value. Everything should be grounded in achieving this.
  • It’s geared towards building data products (which are enabled by a data platform, engineering, and culture). These data products embed data decisioning directly into customer journeys or business processes, which in turn builds the need for the associated analytics and insight work.
  • It focuses on building the right data products in the right way, addressing the 4 risks of value, usability, feasibility, and viability up front.
  • Each step is outcome driven.
  • It embraces the principles of experimentation, prototyping, and iterative product build.

Final thoughts

The steps show how cross-functional and cross-departmental the building of data products and platforms is. This is something that often surprises an organisation, particularly the non-data departments.

The skills, people, technology, data and organisational change needed to be data driven are complex, costly and difficult.

If you are failing to be data driven as an organisation, the root cause will be failure in one or more of these 7 steps. This strategy is intended to be a high-level plan to build organisational data driven capability.

Disclaimer: This article represents my opinions based on 15 years’ experience in analytics working across 17 companies as a consultant and permanent staff member. Across those companies I have led teams to design and build data, analytics & machine learning products, data platforms and data driven organisational capability initiatives.
