How can PMs become better at analyzing data, visualizing data and convincing others using data
If we have data, let’s look at data. If all we have are opinions, let’s go with mine — Jim Barksdale, former CEO, Netscape.
If you search the internet for quotes on the importance of data, you will find tons of them.
Most companies and leaders claim to be ‘data driven’ in their decision making and expect their employees to be the same.
Yet, being data driven is easier said than done.
Often, numbers are an afterthought. They are used to back a decision you are already leaning towards, rather than to actively help you form a hypothesis or make a decision.
However, once you start your professional career, you are expected to have data backing most of your decisions. Having data-backed arguments helps you stand apart from the crowd of fluffy, opinionated people.
In this article, we will learn everything it takes to become more data driven — starting from challenges, moving to various techniques for analyzing & visualizing data and finally concluding with strategies to inform & convince others using data.
Let’s dive in.
Challenges in becoming data driven
Following are some challenges PMs face in their journey of becoming more data driven.
1. Not Comfortable with Data Analysis
You have never really analyzed data, or been trained to analyze it, to find insights. Given a problem, you are not sure what data to analyze, how to analyze it or how to extract insights from it.
2. Missing Data
Very little data logging has been done for your product. Whenever you seek data for analyzing product usage, you find crucial data missing.
3. Lack of DIY (Do it Yourself) Systems
You know the data that you need. However, you are not familiar with the tools used to extract data. Thus, you are dependent on someone else for providing you the data. This leads to inordinate delays and makes it impossible to be data driven.
Next, we look at how to overcome each of the above points.
Mathematical concepts to understand
You don’t need to be a math whiz to gather deep insights. But there are some basic concepts you need to be clear about. These include:
Percentage
Percentages are a way to express something as a part of a whole. For example, if 5 out of 100 users click on an advertisement, then the conversion rate is (5/100) = 5%.
As a PM, most of your measures are expressed as percentages. For example, the percentage of users who converted from free to paid, the percentage of users who clicked the ad, the percentage of users who liked the ice-cream.
Changes over time are also often expressed as percentages. This helps identify the magnitude of the change. For example, our user base has grown by 70% since last year.
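To make the two uses of percentages concrete, here is a minimal Python sketch. The function names and the user counts in the second example are hypothetical and chosen only for illustration.

```python
def percentage(part: float, whole: float) -> float:
    """Return `part` expressed as a percentage of `whole`."""
    if whole == 0:
        raise ValueError("whole must be non-zero")
    return part * 100 / whole


def percent_change(old: float, new: float) -> float:
    """Return the relative change from `old` to `new`, in percent."""
    if old == 0:
        raise ValueError("old must be non-zero")
    return (new - old) * 100 / old


# 5 clicks out of 100 users who saw the advertisement
print(f"Conversion rate: {percentage(5, 100):.1f}%")

# Hypothetical user counts: 10,000 last year and 17,000 this year
print(f"User growth: {percent_change(10_000, 17_000):.1f}%")
```

Note that a percentage change always needs a base value, so it is worth stating the base ("since last year") whenever you quote one.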
Percentiles
Percentiles divide sorted data into 100 equal parts, with each part representing a percentile. So, if we say that a value is at the 95th percentile, it means 95% of the values are less than that.
There are certain metrics where just measuring the average is not sufficient and you want to focus on outliers (so that their experience can be improved).
For example, in the case of website load time, you would want to look at the 90th percentile. Say it's 1 second. This means that for 90% of users, load time is less than 1 second. For the remaining 10%, it's more than 1 second.
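As an illustration, the sketch below computes a percentile with the nearest-rank method, which is only one of several common definitions (analytics tools may interpolate and return slightly different values). The load times are made-up sample data.

```python
import math


def percentile(values, p):
    """Nearest-rank percentile: the smallest value such that at least
    p% of the data is less than or equal to it."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Hypothetical page load times in seconds for ten visits
load_times_s = [0.4, 0.5, 0.5, 0.6, 0.6, 0.7, 0.8, 0.9, 1.0, 4.2]

average = sum(load_times_s) / len(load_times_s)
print(f"Average load time: {average:.2f}s")
print(f"90th percentile:   {percentile(load_times_s, 90):.2f}s")
print(f"Slowest visit:     {max(load_times_s):.2f}s")
```

In this sample a single slow visit pulls the average up, while the 90th percentile describes what nine out of ten visits experienced.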
Mean
It is the average of all values in the dataset. It is calculated as (sum of observations)/(number of observations).
You can use it when the data is fairly evenly distributed and there are not many outliers.
Median
It is the middle value of a dataset when it is arranged from smallest to largest. Use medians when your data has outliers (i.e. there are certain data points which are significantly smaller or larger than the rest of the data).
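A small sketch of why the choice between mean and median matters, using Python's standard `statistics` module. The session durations are hypothetical and include one deliberate outlier.

```python
import statistics

# Hypothetical session durations in minutes; the last value is an outlier
# (for example, a user who left a tab open).
session_minutes = [3, 4, 4, 5, 6, 5, 4, 120]

mean_minutes = statistics.mean(session_minutes)
median_minutes = statistics.median(session_minutes)

print(f"Mean session length:   {mean_minutes:.2f} min")
print(f"Median session length: {median_minutes:.2f} min")
```

With these numbers the single long session drags the mean far above what a typical user does, while the median stays close to the typical session.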
Correlation and causation
Correlation means that two variables seem to exhibit a similar trend. For example, when variable A moves up, variable B also moves up.
Causation means that a change in one variable is influencing another variable. For example, when variable A increases, it causes variable B to increase.
Correlation does not imply causation.
For example, both variables A and B might be moving up because of a third variable C.
Use for PMs: say you want to understand which features lead to a higher retention rate. Understanding causation between a feature’s usage and retention would be critical. This would help you focus on features that are actually moving the needle.
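Correlation can be quantified. The sketch below computes the Pearson correlation coefficient by hand on made-up numbers; the variable names and data are hypothetical. A value near 1 only says the two series move together. It does not tell you whether feature usage drives retention, or whether something else (say, account age) drives both.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length (at least 2 points)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-account data
feature_sessions = [2, 4, 5, 8, 10, 12]   # weekly uses of a feature
retained_days = [10, 19, 31, 42, 50, 61]  # days the account stayed active

r = pearson(feature_sessions, retained_days)
print(f"Correlation between feature usage and retention: {r:.2f}")
```

To move from correlation towards causation you would typically need a controlled experiment, such as an A/B test, rather than a stronger correlation.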
Analysis Techniques for Drawing Insights
Here are three popular ways to analyze data and extract insights:
Trend analysis
It involves collecting data over a period of time and then analyzing it to identify patterns or predict the future.
For example: i) Looking at revenue or customer growth over the last few quarters to understand the change and forecast future revenue or customer growth. ii) Looking at feature usage data to understand the pace of a new feature’s adoption.
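A minimal sketch of a trend analysis on quarterly revenue follows. The revenue figures are hypothetical, and the forecast is deliberately naive: it assumes the average past growth rate simply continues, which real forecasts should not take for granted.

```python
def growth_rates(series):
    """Percent change between each pair of consecutive periods."""
    rates = []
    for prev, curr in zip(series, series[1:]):
        if prev == 0:
            raise ValueError("cannot compute growth from a zero base")
        rates.append((curr - prev) * 100 / prev)
    return rates


def naive_forecast(series):
    """Project the next period by applying the average past growth rate."""
    rates = growth_rates(series)
    if not rates:
        raise ValueError("need at least two periods")
    average_rate = sum(rates) / len(rates)
    return series[-1] * (1 + average_rate / 100)


# Hypothetical revenue for the last four quarters, in thousands of dollars
quarterly_revenue = [100, 120, 150, 180]

for quarter, rate in enumerate(growth_rates(quarterly_revenue), start=2):
    print(f"Q{quarter} growth vs previous quarter: {rate:.1f}%")
print(f"Naive next-quarter forecast: {naive_forecast(quarterly_revenue):.1f}")
```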
Funnel analysis
This method is used when users are moving through a multi-step journey and you want to understand what % converted and what % dropped off at each stage. The user journey acts like a funnel because out of all the users entering the top, only a few reach the bottom.
It is useful when you are trying to understand your onboarding flow, buyer journey (imagine all the steps that a buyer on an e-commerce site goes through), booking flow (imagine the steps a user goes through while booking a trip), etc.
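The funnel calculation itself is simple arithmetic, as this sketch shows. The step names and user counts are hypothetical; in practice they would come from your analytics tool.

```python
def funnel_report(steps):
    """Compute conversion for an ordered list of (step name, users reached)."""
    if not steps or steps[0][1] <= 0:
        raise ValueError("the first step must have at least one user")
    top = steps[0][1]
    previous = top
    report = []
    for name, users in steps:
        step_pct = users * 100 / previous if previous else 0.0
        report.append({
            "step": name,
            "users": users,
            "step_conversion_pct": step_pct,      # vs the previous step
            "drop_off_pct": 100 - step_pct,       # lost at this step
            "overall_conversion_pct": users * 100 / top,  # vs the top
        })
        previous = users
    return report


# Hypothetical e-commerce buyer journey
checkout_funnel = [
    ("Viewed product", 1000),
    ("Added to cart", 400),
    ("Started checkout", 200),
    ("Purchased", 100),
]

for row in funnel_report(checkout_funnel):
    print(f"{row['step']:<18} {row['users']:>5} users  "
          f"step {row['step_conversion_pct']:.0f}%  "
          f"overall {row['overall_conversion_pct']:.0f}%")
```

The step with the largest drop-off is usually the first place to look for improvements.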
Cohort analysis
Cohort refers to a group of users with similar characteristics or usage patterns.
In this analysis, you take a ‘cohort’ of users and analyze their usage pattern or interaction with the product.
There are primarily three types of cohorts: acquisition cohorts (group users based on when they signed up), behavioral cohorts (group users based on past behaviors or user profile) and predictive cohorts (group users based on how they are expected to behave in the future).
Behavioral cohorts: the ability to group users across fields such as pricing plan, device type, geography, company size, age etc. can be a powerful way to identify insights.
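As a sketch of an acquisition cohort analysis, the code below groups users by signup month and computes what share of each cohort was active in later months. The data layout (a signup month per user plus a list of activity records) and all values are hypothetical.

```python
from collections import defaultdict


def retention_by_cohort(signups, activity):
    """Return {signup_month: {months_since_signup: % of cohort active}}.

    signups:  dict mapping user id -> signup month index
    activity: iterable of (user id, month index in which the user was active)
    """
    cohort_users = defaultdict(set)
    for user, month in signups.items():
        cohort_users[month].add(user)

    active = defaultdict(set)  # (cohort, offset) -> users active then
    for user, month in activity:
        if user not in signups:
            continue  # ignore activity from unknown users
        offset = month - signups[user]
        if offset >= 0:
            active[(signups[user], offset)].add(user)

    table = {}
    for (cohort, offset), users in active.items():
        size = len(cohort_users[cohort])
        table.setdefault(cohort, {})[offset] = len(users) * 100 / size
    return table


# Hypothetical data: months are numbered 0, 1, 2, ...
signups = {"u1": 0, "u2": 0, "u3": 0, "u4": 0, "u5": 1, "u6": 1}
activity = [
    ("u1", 0), ("u2", 0), ("u3", 0), ("u4", 0),
    ("u1", 1), ("u2", 1), ("u1", 2),
    ("u5", 1), ("u6", 1), ("u5", 2),
]

retention = retention_by_cohort(signups, activity)
for cohort in sorted(retention):
    row = ", ".join(f"month +{offset}: {pct:.0f}%"
                    for offset, pct in sorted(retention[cohort].items()))
    print(f"Signup month {cohort}: {row}")
```

A behavioral cohort works the same way, except that users are grouped by an attribute such as pricing plan or device type instead of signup month.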
Deciding the data points to track when a feature is shipped
One common mistake PMs make is forgetting to ship the analytics with the feature.
To prevent this, do the following:
- Add an ‘analytics’ section to your PRD template: having this in the template ensures that you don’t forget about analytics.
- Make ‘checking for analytics’ part of your testing process: the QA team is explicitly mandated to call out if analytics is missing from the PRD.
To identify what analytics to ship, follow this 3-step process:
- Identify the questions you want to answer: the goal here is to list all the questions you would want answered once the feature is released.
- Identify the metrics which need to be tracked to answer the above questions.
- Identify the event signals that need to be sent to track the above metrics.
To learn more on the above process with the help of an example, read this article.
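One way to make the question, metric and event chain explicit is to write it down as data, so that gaps are easy to spot before the feature ships. The sketch below is only an illustration: the ‘saved search’ feature, the metric names and the event names are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Metric:
    name: str
    events: list[str] = field(default_factory=list)


@dataclass
class Question:
    text: str
    metrics: list[Metric] = field(default_factory=list)


def required_events(questions):
    """All distinct events that must be logged to answer the questions."""
    return sorted({e for q in questions for m in q.metrics for e in m.events})


def gaps(questions):
    """Questions without metrics, and metrics without events."""
    problems = []
    for q in questions:
        if not q.metrics:
            problems.append(f"No metric for question: {q.text}")
        for m in q.metrics:
            if not m.events:
                problems.append(f"No event for metric: {m.name}")
    return problems


# Hypothetical analytics plan for a 'saved search' feature
plan = [
    Question("Are users adopting saved searches?", [
        Metric("% of active users who save a search",
               ["search_saved", "session_started"]),
    ]),
    Question("Do saved searches bring users back?", [
        Metric("Saved searches re-run per user per week",
               ["saved_search_opened"]),
    ]),
]

print("Events to instrument:", required_events(plan))
print("Gaps:", gaps(plan) or "none")
```

The resulting list of events can go straight into the ‘analytics’ section of the PRD.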
Visualizing the data and Communicating Insights
Here are the various formats in which you can share data with stakeholders:
- Tables: useful when you want to summarize multiple metrics relevant to a specific aspect of the business in a single view.
- Line chart: this connects a series of data points using straight line segments. It represents a sequential progression of values (usually over time). For example, change in stock prices, website visitors etc.
- Bar chart: it helps compare numerical values such as counts or percentages across categories. The value is indicated by the length of the bar.
- Stacked bar chart: it extends the bar chart to look at numerical values across two categories simultaneously. For example, when you draw a bar chart of monthly revenue, you could split each bar into the revenue change due to churn, downgrades, upgrades and new revenue.
While it's easy to go down the rabbit hole of various visualization techniques, the above charts should suffice for most cases you will encounter.
Tips to remember while sharing your analyses with a colleague or a boss
- Before sharing data, think about who your audience is: your audience determines the format in which you present the data.
- Start with the key insight you want to share: keep the numbers or graphs for later slides. This convinces the audience that it's worth their time to try to understand the data.
Additional Points to Remember
- Remember to log data for every feature you ship: otherwise, you will never be able to understand or demonstrate the impact of your features.
- If numbers look too good or too bad to be true: get a data sanity check done on them. Discovering during a review meeting that you have been analyzing incorrect data can do a lot of damage to your reputation.
- Invest in setting up Do-It-Yourself (DIY) dashboards: that way you don’t have to spend days waiting for an analyst to respond to your request. This also allows other stakeholders to access these dashboards and deep dive into the data.
- Have periodic meetings with leadership for reviewing data: having a forum such as a weekly business review or a feature usage review ensures that everyone is on top of the metrics they own. It becomes a place where people can discuss the insights from data, identify anomalies early and plan actions to take.
Conclusion
Being data driven is essential if you want to become a better leader.
It helps you improve your decision making process, lends credibility to your arguments and helps you learn faster from mistakes.
But looking at the right data in the right way matters. Otherwise, you might end up chasing vanity data points, which could do more harm than good.
So, it's time to put the above strategies into practice and start the journey towards becoming a data driven leader.
Further Reading
- https://gopractice.io/data/arithmetic-mean-and-median-for-product-managers/
- https://gopractice.io/data/correlation-and-causation/
- https://amplitude.com/blog/causation-correlation
- https://amplitude.com/blog/cohorts-to-improve-your-retention
- https://www.atlassian.com/data/charts/essential-chart-types-for-data-visualization
- https://equals.com/guides/saas-metrics/appendix/operating-model/
- https://productlessons.substack.com/p/were-using-numbers-wrong