Retention Series
Blended retention vs Cohort retention
Never use the blended calculation to figure out user retention
In the previous post, I introduced you to the user cohort and how to use it to learn about user retention.
Today I’m going to move forward and show you:
- two ways to visualize cohorts
- what blended retention is
- why you don’t need to use blended retention
Two ways to visualize cohorts
There are two ways to visualize cohorts:
- Calendar-based approach
- Lifetime-based approach
In the Calendar-based approach, we visualize cohorts the way they appear in calendar time.
It’s the simplest form of user cohort visual, and it is easy to build in Excel with Pivot Tables.
In the Lifetime-based approach, the horizontal axis shows the user’s lifetime instead of the date when the user did some action: [lifetime] = [date_of_action] - [date_of_signup].
This simple transformation gives us two big advantages:
- it’s much more convenient for analyzing cohorts over time
- it’s much more convenient for comparing different cohorts at specific periods
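To make the transformation concrete, here is a minimal Python sketch. It assumes monthly periods (like the Jan-23, Feb-23 labels used in this post); the `lifetime_in_months` helper and the sample events are hypothetical and only illustrate the idea.

```python
from datetime import date


def lifetime_in_months(signup: date, action: date) -> int:
    """Number of calendar months between the signup month and the action month."""
    return (action.year - signup.year) * 12 + (action.month - signup.month)


# Hypothetical events: (user_id, date_of_signup, date_of_action)
events = [
    ("u1", date(2023, 1, 15), date(2023, 1, 20)),
    ("u1", date(2023, 1, 15), date(2023, 3, 2)),
    ("u2", date(2023, 2, 3), date(2023, 3, 9)),
]

for user_id, signup, action in events:
    # Lifetime 0 is the signup month itself, 1 is the month after, and so on.
    print(user_id, lifetime_in_months(signup, action))
```

Events with the same lifetime land in the same column regardless of their calendar date, which is what makes cohorts directly comparable.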
By the way, attentive readers have probably noticed that all user cohorts behave the same way:
- Survival rate S at t1 = 0.67
- Survival rate S at t2 = 0.50
- etc.
You can find the full set of survival rates for a cohort in the first row of the Lifetime-based cohort visual.
What is blended retention?
As I mentioned earlier, a Calendar-based cohort visual is easier to build, so analysts and business leaders often use it.
One of the questions business leaders ask is what share of the previous period’s users the product retained.
Let’s calculate the blended retention rate for Mar-23. To do that, we need three figures:
- the total number of users in that period (R15)
- the number of new users acquired in that period (R5)
- the number of users who were active in the previous period (Q3+Q4)
The formula for the blended retention rate looks like this: [Blended retention rate] = (R15-R5) / SUM(Q3:Q4)
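The same calculation can be sketched in Python. The spreadsheet cells are replaced by a small calendar-based table; the cohort size of 1,000 users is an assumption, and the survival rates (0.67 at t1, 0.50 at t2) are the ones listed earlier. The names `calendar` and `blended_retention` are hypothetical.

```python
def blended_retention(total_active: int, new_users: int, prev_active: int) -> float:
    """(total users in period - new users in period) / users active in previous period."""
    if prev_active <= 0:
        raise ValueError("previous period must have at least one active user")
    return (total_active - new_users) / prev_active


# Calendar-based table: cohort -> {calendar period: active users}.
# Assumption: every cohort starts with 1,000 users.
calendar = {
    "Jan-23": {"Jan-23": 1000, "Feb-23": 670, "Mar-23": 500},
    "Feb-23": {"Feb-23": 1000, "Mar-23": 670},
    "Mar-23": {"Mar-23": 1000},
}

# Total users active in Mar-23 across all cohorts.
total_mar = sum(row.get("Mar-23", 0) for row in calendar.values())
# New users acquired in Mar-23 (the Mar-23 cohort in its first period).
new_mar = calendar["Mar-23"]["Mar-23"]
# Users active in the previous period, Feb-23.
prev_active = sum(row.get("Feb-23", 0) for row in calendar.values())

rate = blended_retention(total_mar, new_mar, prev_active)
print(f"{rate:.2f}")  # 0.70
```

Under these assumptions, the numerator mixes 500 users from the Jan-23 cohort (lifetime 2) with 670 users from the Feb-23 cohort (lifetime 1).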
Why is this calculation called blended?
The answer is straightforward: to calculate the measure, we mix users with different lifetimes.
If that does not sound like a big problem, let’s move on to the next section.
Why you don’t need to use blended retention
As I mentioned earlier, all cohorts behave the same, so we know how each cohort behaves over time.
Let’s compare the blended retention rate and the cohort-based one:
The difference is probably even more prominent when we compare the blended survival rate with the cohort-based one:
What can we learn from the chart above?
If we try to calculate the survival rate from a Calendar-based cohort visual, we can easily overestimate the survival rate per cohort.
Why does this happen?
To answer this question, let’s try to replicate the blended survival rate for the last period: Dec-23.
For Dec-23, the blended survival rate = 0.31.
When we calculate the blended survival rate for a specific period, we mix a set of old cohorts with a new one:
- Jan-23 cohort in Dec-23 has 162 retained users, so the survival rate = 0.16
- Feb-23 cohort in Dec-23 has 178 retained users, so the survival rate = 0.18
- …
- Nov-23 cohort in Dec-23 has 670 retained users, so the survival rate = 0.67
To get the blended survival rate, we average all these survival rates, weighting each one by its cohort’s share of users.
The blended survival rate of 0.31 is almost twice the real survival rate of a cohort after 11 periods, which is 0.16.
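The weighted average can be sketched in a few lines of Python. The example uses only the three cohorts whose Dec-23 figures are listed above and assumes every cohort started with 1,000 users; the cohorts behind the ellipsis are left out, so the result illustrates the mechanism and is not expected to reproduce 0.31. The function name `blended_survival` is hypothetical.

```python
def blended_survival(retained: dict, cohort_sizes: dict) -> float:
    """Blended survival rate: all retained users divided by all signed-up users."""
    total_signed_up = sum(cohort_sizes[cohort] for cohort in retained)
    return sum(retained.values()) / total_signed_up


# Retained users in Dec-23 for the cohorts listed in the post.
retained_dec = {"Jan-23": 162, "Feb-23": 178, "Nov-23": 670}
# Assumption: each cohort started with 1,000 users.
cohort_sizes = {cohort: 1000 for cohort in retained_dec}

# Per-cohort survival rates: each cohort is at a different lifetime in Dec-23.
per_cohort = {c: retained_dec[c] / cohort_sizes[c] for c in retained_dec}

# Averaging the per-cohort rates, weighted by each cohort's share of users,
# gives the same number as pooling all users together.
total_users = sum(cohort_sizes.values())
weighted = sum(per_cohort[c] * cohort_sizes[c] / total_users for c in per_cohort)

blended = blended_survival(retained_dec, cohort_sizes)
print(f"{per_cohort['Jan-23']:.2f}")  # 0.16
print(f"{blended:.2f}")               # 0.34
```

Even in this subset, the blended figure sits well above the 0.16 that the oldest cohort actually reached, because the young Nov-23 cohort pulls the average up.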
That’s why it does not make sense to calculate blended retention for a specific period. Blended retention figures are very misleading, so don’t use them!
In the next post, I will show how to build user cohorts properly.