Data Product Canvas — A practical framework for building high-performance data products

Leandro Carvalho
15 min read · Aug 15, 2022


How to avoid creating the right solution to the wrong problem?

The Data Product Canvas (DPC) framework.

Introduction

Do you know why most data products fail? Often it is because they are not aligned with the real objectives of the company.

So if you really want to build a good data product, start with the mission. Make sure you have a real problem to solve and a well-defined scope from the start, that is, something that can adjust to the companies’ business challenges. After all, the value generated by data products is not related to the use of technology, but to its use and practical application.

So, whenever you think about products and projects related to Data Science, remember to define the focus, the objectives and the strategies first. Because if you know WHAT to do, HOW to do it won’t be a problem.

The Data Product Canvas (DPC), presented below, was created precisely to answer these questions. I hope you enjoy reading about it; comments are always welcome.

What is the Data Product Canvas?

It is a framework for the development of data products, based on a Canvas model, which follows the principles of the Agile/Lean methodology. Its main objective is to serve as a practical tool for generating a data product roadmap, bringing together in a single document the view that everyone involved has of the real purpose of the project.

The Data Product Canvas is divided into 10 blocks (problem, solution, data, hypotheses, actors, actions, KPIs, values, risks and performance/impact), grouped into 3 domain areas: the product vision (which encompasses the blocks problem, solution, data and hypotheses); the strategy vision (actors, actions and KPIs); and finally, the business vision (values, risks and performance/impact).

In each block, all the Discovery needed to reach a shared understanding of that part of the data product is explored in detail. Each domain deals with a key area for the correct planning and development of the product, providing a 360-degree view that goes from problem definition to strategic execution, including KPI monitoring and risk mapping.
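The structure described above (10 blocks grouped into 3 domains) can be sketched as a small data structure. This is only an illustration: the class `DataProductCanvas` and its methods are hypothetical names, not part of the framework, and each block simply holds a list of notes (the post-its).

```python
from dataclasses import dataclass, field

# Domain -> blocks, in the order given in the article.
DOMAINS = {
    "product vision": ["problem", "solution", "data", "hypotheses"],
    "strategy vision": ["actors", "actions", "KPIs"],
    "business vision": ["values", "risks", "performance/impact"],
}


@dataclass
class DataProductCanvas:
    """Hypothetical container: one list of notes (post-its) per block."""

    notes: dict = field(default_factory=lambda: {
        block: [] for blocks in DOMAINS.values() for block in blocks
    })

    def add_note(self, block: str, text: str) -> None:
        if block not in self.notes:
            raise KeyError(f"unknown block: {block}")
        self.notes[block].append(text)

    def empty_blocks(self) -> list:
        """Blocks that still have no notes, in canvas order."""
        return [block for block, items in self.notes.items() if not items]


canvas = DataProductCanvas()
canvas.add_note("problem", "Revenue is falling because customers are leaving")
print(len(canvas.notes))          # 10
print(canvas.empty_blocks()[:3])  # ['solution', 'data', 'hypotheses']
```

Listing the blocks that are still empty is one simple way to see which part of the Discovery has not been discussed yet.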

What is it for?

The idea of the DPC is to ensure a practical and objective planning of the data product based on a common understanding between the technical and business areas. In this way, it is possible to map the entire project from end to end, thus allowing a single view of what was planned and what will be executed. In addition, it also helps in creating actions and monitoring the main metrics to follow the solution that will be developed throughout its life cycle.

Thus, it is possible to clearly identify and map the following items:

The Data Product Canvas is divided into 10 blocks and separated by 3 domain areas.

  1. Problem definition;
  2. The solution that will be adopted;
  3. Data mapping;
  4. The hypotheses that will be tested;
  5. All actors (customers and stakeholders) involved;
  6. The strategic actions that will be developed;
  7. The KPIs that should be monitored;
  8. The values (the size of the problem);
  9. The risks; and
  10. The performance and/or impacts of the product on the business (values generated or saved).

Why use?

The success of a data-driven culture depends on the definition and implementation of strategies, not technologies. That’s why it’s important to make it clear that data products are a business domain problem, not a technology one.

So, before starting a project, think about what you really need. Don’t get lost in technology; use the most accessible one. Focus on the business, not the technical solution. You need to know what is expected of your data product and what to do with it once it is implemented, that is, what strategic actions will be generated after the product is delivered. In addition, monitoring performance and its impact throughout the lifecycle is critical to the long-term success of the product.

This is very important, because without these understandings, you will probably fail.

How to use?

1. Start with the problem

How to define the problem.

As a good Data Product Manager you will have to guide Product Discovery through the most important part: the problem. Otherwise, there’s a good chance you’re creating the right solution to the wrong problem. Believe it!

So, regardless of how the demand arrives (it very often comes from a stakeholder who already has a “ready-made” solution in mind), always start with the problem. After all, a well-defined problem is a solved problem. This is a crucial and non-negotiable point. It is even possible to start from the idea of a solution or from the test of a hypothesis. However, filling in the Canvas, and all the initial effort, should focus on this point: the clear and specific definition of the problem.

It is from this block that the rest of the Canvas will be developed. So much so that, in many cases, it will be possible to fill in other parts of the Canvas from what is discussed here. Sometimes a project proposal is changed, or its execution discontinued, at this point. This happens when there is no consensus on what the problem actually is, or when it becomes clear that the problem identified is not the one that really needs solving.

So, remember: always focus on the first block of the Canvas, that is, establish clearly and objectively the problem you want to solve. And for that, always ask:

  • What’s the problem?
  • Why is it a problem?
  • Whose problem is it?

You’ll be surprised how these 3 simple questions can change the direction of a project. But don’t stop here. For each of them, after your client/stakeholders answer, ask for the reason behind the answer at least 3 more times.

As an example, I will share an experience of mine with the CEO of an e-commerce group. Follow the dialogue between the CEO and the Data Product Manager (DPM), and notice how things change:

CEO: I would like you to come up with a solution that could bring more customers to the company. It could be a project using customer segmentation and A/B testing with our data, then making a recommendation. What do you think? (Note that the customer has already imposed that the problem was the search for more customers and has already defined the technical solution!!!)

DPM: Of course, we can do that. After all, we are talking about a problem at the top of the funnel. But before we go any further, could you explain to me why you need more customers? (At this point the DPM prays not to be sent away.)

CEO: It’s just that we need to sell more, and for that I’d like to have more customers.

DPM: Hmm… I get it, but you can sell more by focusing on current customers. This seems to me to be a problem in the middle of the funnel, that is, we can solve this with recurring purchases and an increase in the market basket, working on top of the current customer base. In this case, I believe that a solution with up-selling and cross-selling techniques would be more appropriate. What do you think?

CEO: Exactly, that’s why I always say that Data Science is the solution to every problem. Go ahead!

DPM: Sure, let’s go! But before that, answer me: why do you need more sales?

CEO: Look, our revenues are actually going down. That’s why I need more customers, who buy more, to generate more revenue. Do you understand? (At this point the CEO is already getting a little impatient.)

DPM: Now we have a real problem! If revenue is falling, it is because we have a customer retention problem (bottom of the funnel), and this can be addressed with a solution to prevent churn (customer loss) and reactivate customers who have already left. What do you think?

CEO: That’s why I always say that Data Science is the solution to every problem. Go ahead!

See in this example how unproductive it would be for the ultimate business objective to create the initially proposed solution. It would undoubtedly be a solution that would not address the real problem of the project, which was a problem of customer retention and not of attracting new customers.

So the right thing to do is ask:

  • What’s the problem?
  • Why is it a problem?
  • Whose problem is it?

Then, for each question, save the answer: write it down and put it on a post-it on the Canvas. Then ask: why? Save that answer too, write it down and put it on a post-it on the Canvas. Then ask again: why? Save the answer once more. And then keep asking: why, why???…

At the end, you can still reinforce by asking: why?

2. Try to identify the solution that will be adopted

How to identify the solution.

Here it is worth reinforcing a point already highlighted earlier: a data product is not a technical problem, but a business problem. So stop falling in love with fashionable terms (Deep Learning, neural networks, Big Data, AI) and start focusing on the practical solution of the problem. Why use deep learning when logistic regression works? Why use Big Data if you don’t even compute the basic statistics of your data? Sometimes a data product is simply the search for the data, because if you don’t have data, you should start there.

Therefore, focus on identifying the simplest and most objective solution possible for solving the problem. Remember the purpose of the Lean methodology and the famous phrase:

“Test fast, fail fast and adjust fast” (Tom Peters)

So always ask:

  • What kind of solution will be adopted? (Ex.: Analytics, Machine Learning, AI, etc.).
  • What will be the solution? For example:
    If the adopted solution is Machine Learning, we must take into account that: for each problem, we have different approaches. For each approach, several algorithms. And for each algorithm, several parameterizations. That is, there is not and never will be a “best algorithm” for a given problem. But in any case, having the mapping of what you want will provide guidance for the development of the project.
  • What is expected of the solution? What would the outputs of the product be?
    E.g.: a report with the final result of an analysis? A specific prediction for a given type of data?

Oh, and don’t forget: at the end of each of these questions, always ask: why, why, why???

3. Map all Data

How to map everything you need to know about the data.

If you are a Data Product Manager and intend to work with data projects, remember:

1) The main challenge lies in the origin of the data and its sources;

2) The result will depend much more on the quality of the data than on the analyses and models;

3) Always prioritize the creation of an internal environment that favors a Data-Driven culture.

So keep in mind that data discovery and management are applicable at every point in the process. Therefore, to ensure that your project is successful, you will need to follow some steps that will provide you with a smoother path during the journey. So always ask:

  • What is the source of the data, i.e., where does it come from? (Ex.: Is it in a system? Is it a set of files? Is it in a structured format?)
  • What is the quality of the data? Is it sufficient for analysis?
  • Accesses vs. Availability — Do you have access to the data? Are they available?
  • Process / Transformation — Is it necessary to establish a process for reading the data? Will there be any transformation process?
  • Outputs — What are the output formats?
  • Test / Training / Validation — Are there any strategies or assumptions about test, training and validation data?
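Two of the questions above (data quality, and the test/training/validation strategy) can be made concrete with a few lines of code. The sketch below uses only the standard library and invented records; the helper names `missing_rate` and `split_rows`, the column name and the 70/15/15 proportions are assumptions for illustration, not a recommendation from the framework.

```python
import random


def missing_rate(rows, column):
    """Share of rows where `column` is absent or None."""
    if not rows:
        return 0.0
    missing = sum(1 for row in rows if row.get(column) is None)
    return missing / len(rows)


def split_rows(rows, train=0.7, validation=0.15, seed=42):
    """Shuffle a copy of the rows and cut it into train/validation/test."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * validation)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])


# Made-up customer records: every fifth one lacks the field.
customers = [{"id": i, "last_purchase_days": None if i % 5 == 0 else i}
             for i in range(100)]

print(missing_rate(customers, "last_purchase_days"))  # 0.2
train_set, val_set, test_set = split_rows(customers)
print(len(train_set), len(val_set), len(test_set))    # 70 15 15
```

A random split is only one possible strategy; for a problem that evolves over time, the team may prefer to split by date, which is exactly the kind of assumption this block of the Canvas should record.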

Remember, often the data product is the data capture itself. After all, if you don’t have the data, or access to it, you won’t be able to develop your product.

4. Which hypotheses will be tested

What hypotheses will be tested?

This is a point that is often overlooked by DPMs. Sometimes, we are fooled by the problem definition and the solution identification, believing that these topics, together with the data mapping, are enough to move forward with the product development.

However, to be sure that the proposed solution addresses the real problem identified, we must keep in mind a set of hypotheses that we want to test. These hypotheses check, from a business point of view, whether what was suggested as a solution is in fact bringing real value to the company. So always ask:

  • What are the hypotheses we want to test?
  • What are the expected responses for each of them?
  • What to do from each answer? In other words, what strategy should we follow?

As an example, we can use the previous case of the CEO of the e-commerce area. Remember that in the end we came to the conclusion that we had a churn problem (bottom of the funnel) and not a customer attraction problem (top of the funnel). Therefore, we should ask:

E.g.: Will the proposed solution reduce the churn rate? With the proposed solution, is it possible to predict which customers are thinking of leaving?

With the hypotheses defined, we reach the end of the first domain area, the product vision, that is, the basic definition of what our product will be.

5. Identify all actors (customers and stakeholders)

How to identify the actors?

Now it’s time to move on to the second domain area of the Data Product Canvas, which looks at the product from the perspective of the strategy vision.

This is the most strategic point, which will guide us in the elaboration of the actions that must be implemented when the final solution is ready. It ensures that the solution does not end up without a use after its development, as we will already know in advance what to do with the created product. In this case, it is important to identify all actors that will have some involvement with the product. It is worth mentioning that, with each identified actor, we must validate what was understood in the first domain area, that is, the problem definition, the identification of the solution, the mapping of the data and the hypotheses that will be tested.

A caveat: the mapping of actors must be done throughout the product roadmap, that is, it occurs in parallel with the activities described previously.

Thus, we must ask at any time:

  • Who is the sponsor?
  • Who is the final customer of the product?
  • Who are the interested parties and stakeholders?
  • Who will use the solution?
  • Who will consume the solution?
  • Who will be impacted by the solution?

Points of attention: if you cannot find a sponsor for the product, you need to assess whether the project should go ahead. This should be treated as a warning sign. However, if you cannot find the product’s customer, you should stop the project immediately. After all, there is no product if there is no customer.

6. Plan the strategic actions that will be implemented using the solution

How to plan the actions?

Now it’s time to bring your product to life. At this stage, the DPM and the stakeholders must begin to define the strategic actions that will be built on the use of the product. Here the roles are reversed: it is the stakeholders who guide. However, it is the DPM who records the actions to ensure that the product will be used properly by the business area.

This prevents the developed product from sitting idle, unused by the business area, because no one thought about what to do after implementing the proposed solution.

An example: imagine that the e-commerce company has built a predictive model for churn prevention. Once the product has been delivered and is ready for use, what will the business do with it? What actions will be created? Given the signal that a customer is about to leave the company, what can be done to keep them? Will it be an automatic action (email marketing) or a human action (someone will contact the customer)?

Therefore, to ensure that your product is not forgotten in a “drawer” or rather, on a cloud server costing the company money, we must ensure, even before the start of product implementation, that the business area will be able to map the strategic actions that will be used. So ask:

  • What actions will be used?
  • Which campaigns should be created?
  • How to generate value for the business from the use of the data product developed?
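For the churn example, one way to record an agreed action is as an explicit rule that maps the model’s output to what the business will do. The sketch below is hypothetical: the function name `choose_action`, the thresholds and the idea of splitting by customer value are illustrative assumptions that the stakeholders would have to define themselves.

```python
def choose_action(churn_probability, customer_value,
                  risk_threshold=0.6, high_value=1000.0):
    """Map a churn score to a retention action (thresholds are placeholders)."""
    if not 0.0 <= churn_probability <= 1.0:
        raise ValueError("churn_probability must be between 0 and 1")
    if churn_probability < risk_threshold:
        return "no action"
    if customer_value >= high_value:
        return "human contact"   # someone from the team contacts the customer
    return "email campaign"      # automatic action


print(choose_action(0.2, 5000.0))  # no action
print(choose_action(0.8, 5000.0))  # human contact
print(choose_action(0.8, 120.0))   # email campaign
```

Writing the rule down this way, even on a post-it, makes it clear before development starts who acts on the prediction and how.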

7. Create the KPIs that will be used to monitor the entire product

How to create the KPIs to monitor the product along the journey?

This Canvas block is the bridge between two other blocks: Solution and Actions. This is where you will create your product’s quality and monitoring indicators.

On the solution side, you can use technical indicators to monitor the quality of the developed product. For example, if it is a Machine Learning model, you can state here which metric will be used to measure the quality of your model (Ex.: F1, recall, etc.). Here are some questions that should be asked during this step:

  • How to evaluate the quality of the finished product? (Ex.: if it is a Machine Learning model, we can use its accuracy rate.);
  • What metrics should be used?
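The choice of metric matters because different metrics can tell different stories about the same model. The sketch below computes accuracy, precision, recall and F1 from the four counts of a confusion matrix, using invented numbers for a churn model; the function name `classification_metrics` is an assumption for illustration.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up case: 1,000 customers, 50 of whom churned.
# Model A flags 40 customers and is right about 20 of them.
model_a = classification_metrics(tp=20, fp=20, fn=30, tn=930)
# Model B never flags anyone.
model_b = classification_metrics(tp=0, fp=0, fn=50, tn=950)

for name, metrics in (("A", model_a), ("B", model_b)):
    print(name, {key: round(value, 3) for key, value in metrics.items()})
```

In this made-up case both models reach 95% accuracy, yet model B never identifies a single churner (recall of 0). When the event of interest is rare, accuracy alone can hide exactly the failure the business cares about.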

In addition, as you have already mapped the actors and the actions that will be used, it is important to create indicators that monitor the strategic effectiveness of implementing and using the product. That is, we will monitor whether the data product is actually generating value for the business: for example, whether the churn rate has fallen and by what percentage, whether customers have started to buy more and how often, and so on. So we can ask something like:

  • How to measure the results of actions?
  • If it’s with A/B testing, how?
  • How much uncertainty can we deal with?

Another point to keep in mind is that no data product is 100% effective/accurate. After all, the past is not a mirror to the future. Therefore, every Data Product Manager must identify, together with its main stakeholders, how much uncertainty the company can deal with based on the results that will be produced by the data product in question.
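To make the discussion about uncertainty concrete, the result of an A/B test on a retention action can be reported as a difference in churn rates together with a confidence interval, rather than as a single number. The sketch below is a minimal example under stated assumptions: it uses the normal approximation for the difference between two proportions, the function name `churn_rate_difference` is hypothetical, and the group sizes and counts are invented.

```python
import math


def churn_rate_difference(churned_control, n_control,
                          churned_treated, n_treated, z=1.96):
    """Churn rate difference (treated minus control) with a
    normal-approximation interval; z=1.96 is roughly 95% confidence."""
    p_control = churned_control / n_control
    p_treated = churned_treated / n_treated
    diff = p_treated - p_control
    std_err = math.sqrt(p_control * (1 - p_control) / n_control
                        + p_treated * (1 - p_treated) / n_treated)
    return diff, (diff - z * std_err, diff + z * std_err)


# Made-up experiment: 1,000 customers per group.
diff, (low, high) = churn_rate_difference(120, 1000, 90, 1000)
print(f"difference: {diff:+.3f}, interval: [{low:+.3f}, {high:+.3f}]")
if low < 0 < high:
    print("The interval includes zero: the data do not settle the question.")
else:
    print("The interval excludes zero at this confidence level.")
```

With these invented numbers the whole interval lies below zero, which suggests that churn fell in the treated group. How wide an interval the company can live with before acting is a business decision, which is why it belongs in this block.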

Thus, we finish the second domain (the strategy vision), and move on to the third and final domain of the Data Product Canvas.

8. Estimate project values

How to estimate values.

Measuring the size of the problem is essential for prioritizing the various development fronts. In addition, it is here that we will assess the feasibility of the project, that is, whether the gains or savings generated will be sufficient to proceed with the execution of what was planned. In this case, we must ask:

  • How big is your problem?
  • What is the baseline?
  • What is the expected gain or savings from using the product?

Remember: at the end of each question, after listening to each answer, always ask again: why?
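The three questions above (size of the problem, baseline, expected gain) come down to simple arithmetic once the inputs are agreed. The sketch below shows the calculation for the churn example; every figure in it is invented, and the linear formula in `annual_churn_loss` (customers × churn rate × revenue per customer) is a simplifying assumption, not a method prescribed by the Canvas.

```python
def annual_churn_loss(customers, churn_rate, revenue_per_customer):
    """Revenue lost per year to churn under a simple, linear assumption."""
    return customers * churn_rate * revenue_per_customer


# All figures below are invented for illustration.
baseline = annual_churn_loss(10_000, 0.20, 300.0)  # today's situation
target = annual_churn_loss(10_000, 0.15, 300.0)    # hoped-for churn rate
expected_saving = baseline - target
product_cost = 60_000.0                            # assumed yearly cost
net_gain = expected_saving - product_cost

print(f"baseline loss:   {baseline:,.0f}")
print(f"expected saving: {expected_saving:,.0f}")
print(f"net gain:        {net_gain:,.0f}")
```

If the net gain were negative, that would be a signal, at planning time, that the project may not be worth executing as scoped.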

9. Map the risks

How to map the risks.

It is not the purpose of this Canvas to manage risks. However, knowing that they exist and mapping them properly is critical to ensuring complete and healthy planning throughout the development journey. Therefore, we must always keep in mind, and close by, a list of the main identified risks. So always ask:

  • What are the risks?
  • What could these risks block during product development?

10. Identify the performance and impacts that the product will generate for the business

How to identify performance and impacts.

The best way to justify a data product to company executives is to show the impact and performance the product will bring to the company. Of course, at this stage of the product roadmap, this is only an estimate, based on a business scenario defined by the product team. In any case, this estimate will also help to monitor the value for the business while the product is in use throughout its life cycle. So ask:

  • What is the impact for the business?
  • How to measure it?
  • Where and how can we see this improvement or impact/performance?

Conclusion

Thus, after presenting the 10 blocks that define the roadmap of a data product using the Data Product Canvas framework, we have reached the end. I hope this description makes you feel more comfortable and confident in creating high-performance data products. And remember: avoid creating the right solution to the wrong problem.

After all, effective projects must always start with a strategy; building a data product is not a technical issue, but a business issue.

Oh, and if you liked it, please feel free to clap and share.

Thanks for reading!


Leandro Carvalho

Data Science Manager | Data Scientist | Machine Learning Specialist | Professor | IT Manager. LinkedIn: http://br.linkedin.com/in/leandroscarvalho