Retention Series

All user cohorts are noisy, but some can tell you a story

Use a statistical approach to tell a story

Paul Levchuk
5 min read · Apr 1, 2024

In the previous post, I proposed to stop using classic color gradients to show user cohorts. The main reason for this recommendation was the natural variability of user cohorts.

Let me remind you what the original user cohort chart looks like:

User cohort chart: color gradient approach

From the chart above we can learn a few things:

  • user cohorts differ in size
  • the variation among user cohorts in the first few periods is hard to distinguish because the color gradient is almost the same
  • the variation in the last few periods is more noticeable, but the figures are small, so there is some doubt whether the variation is signal rather than noise

To improve our understanding of user cohorts, remove some noise, and focus on the most important signals, let’s color each period based on statistics from the previous 7 days for the corresponding period.
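To make that coloring rule concrete, here is a minimal sketch in Python. It assumes the retention matrix is a pandas DataFrame with cohort dates as rows and period offsets (D1, D2, …) as columns; the function name flag_unusual_cells and its parameters are illustrative, not the exact implementation behind the charts in this post.

import numpy as np
import pandas as pd

def flag_unusual_cells(retention: pd.DataFrame, window: int = 7, k: float = 2.0) -> pd.DataFrame:
    """Flag cells outside mean ± k*std of the previous `window` cohorts
    for the same period (column). Returns a DataFrame of {-1, 0, +1}:
    -1 = unusually low, +1 = unusually high, 0 = within the expected band."""
    flags = pd.DataFrame(0, index=retention.index, columns=retention.columns)
    for col in retention.columns:
        series = retention[col]
        # statistics over the previous `window` cohorts, excluding the current one
        rolling = series.shift(1).rolling(window, min_periods=window)
        lower = rolling.mean() - k * rolling.std()
        upper = rolling.mean() + k * rolling.std()
        flags[col] = np.where(series < lower, -1, np.where(series > upper, 1, 0))
    return flags

Cells with a -1 or +1 flag get colored; everything inside the expected band stays neutral, which is what removes the visual noise.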

As a result, our user cohort chart will look like this:

User cohort chart: statistics approach, 7-day window

Is the new approach better?

The short answer is: yes, it is (but we need to interpret data with caution).

Let’s start with the following general rules:

  1. Any conversion metric (like a user cohort’s retention rate) depends on the company’s User Acquisition scaling efforts. As a rule, the larger the number of acquired users, the lower the conversion rate will be.
  2. The lower the figures in the cohort’s long tail, the higher the error.
  3. Changes in user behavior in the middle stage of a user cohort are, as a rule, product-related.
  4. Many statistical approaches use some kind of time window. If you change it, the results will change as well.

Keeping these rules in mind, let’s try to tell a story from the chart above.

User Acquisition Scaling

The User Acquisition team started scaling on 3/5:

  • Did the next user cohorts (3/6, 3/7) show a drop in user retention as well? No, the next cohorts have average D1 retention.
  • Did user cohort 3/5 stabilize over the next 1–2 days? Yes, within 2 days the cohort started behaving like an average user cohort.

There is no serious issue with scaling on 3/5.

The next period of scaling was seen on 3/10:

  • Did the next user cohorts (3/11, 3/12) show a drop in user retention as well? No, the next cohorts have average D1 retention.
  • Did user cohort 3/10 stabilize over the next few days? Actually, no.

There is something wrong with user cohort 3/10, and it’s worth digging deeper into it with the User Acquisition team.

So, of the 14 cohorts on the chart above, there is only one worth investigating from a User Acquisition perspective.

Error

User retention is, by definition, a conversion metric. A conversion rate is what we get by measuring a user cohort once. If we could acquire an unlimited number of user cohorts, we would see the spread of all possible conversion rates and could figure out our measurement error.

To calculate the error of a conversion rate we can use the following formulas from probability theory:

Standard Error = SQRT( [retention rate t] * (1 - [retention rate t]) / [n_users_t0] )

Relative Error = [Standard Error] / [retention rate t]
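In code, both formulas are one-liners. The sketch below is illustrative; the cohort size of 1,000 users is an assumption, chosen only because it yields a standard error close to the figures discussed below.

import math

def standard_error(retention_rate_t: float, n_users_t0: int) -> float:
    # SQRT( p_t * (1 - p_t) / n_users_t0 ), where p_t is the retention rate at day t
    return math.sqrt(retention_rate_t * (1 - retention_rate_t) / n_users_t0)

def relative_error(retention_rate_t: float, n_users_t0: int) -> float:
    return standard_error(retention_rate_t, n_users_t0) / retention_rate_t

# Assumed example: 1,000 users at t0, 16% of them retained on day t
se = standard_error(0.16, 1000)   # ≈ 0.0116
re = relative_error(0.16, 1000)   # ≈ 0.072, i.e. about 7% of the measured rate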

Let’s calculate the Standard and Relative errors.

Standard error and relative error for the above user retention cohorts

From the chart above we can learn:

  • The average user cohort D11 retention rate is 0.16. However, the true D11 retention rate lies within 0.16 ± 2 × 0.011, i.e. roughly [0.138, 0.182].
  • The Standard Error of user cohort retention decreases slowly. At the same time, the Relative Error of the retention rate grows quickly and linearly. It means that the further we move along the cohort’s long tail, the less reliable our retention rate measurement becomes.
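Here is a quick numeric illustration of that second point, reusing the hypothetical helpers above with a made-up decaying retention curve for a 1,000-user cohort:

# Illustrative data only: retention decays over the first days of the cohort.
# The standard error shrinks slowly, while the relative error roughly doubles.
for day, rate in enumerate([0.40, 0.30, 0.24, 0.20, 0.18, 0.16], start=1):
    print(f"D{day}: SE={standard_error(rate, 1000):.4f}, RE={relative_error(rate, 1000):.3f}")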

User Behavior

We can also see some cases that are not related to User Acquisition scaling:

  • cohort 3/3 user retention improved starting from D2
  • cohort 3/13 user retention deteriorated starting from D5

These two cases reflect user needs that were satisfied differently by specific product features:

  • cohort 3/3 had improved user retention because of a bug in the onboarding test on that day
  • cohort 3/13 had deteriorated user retention because of some issues with the product feature related to sharing data with other users

Researching such cases is useful for learning what can be improved in the product. The most important point, however, is to check whether these patterns can be found in the next cohorts.

Time Window

My initial recommendation is to use a 7-day time window when checking ±2 standard deviations for each cohort period. The idea is to align the coloring’s sensitivity with recent User Acquisition activities. User Acquisition teams often work on weekly plans, so a 7-day time window is just a good starting point.

Let’s color our user cohorts one more time using a 3-day time window:

User cohort chart: statistics approach, 3-day window

The user cohort retention chart has changed: some new cells have been colored, and a few old ones have been cleared.

Which time window is better?

There is no right answer here.

As a rule, the shorter the time window, the more sensitive the coloring is to recent changes, and the more false positives there can be.
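Using the hypothetical flag_unusual_cells sketch from earlier, comparing windows is a one-parameter change; counting the cells that only the shorter window flags gives a rough feel for how many extra (potentially false-positive) signals it produces:

flags_7d = flag_unusual_cells(retention, window=7)
flags_3d = flag_unusual_cells(retention, window=3)

# Cells flagged by the 3-day window but not by the 7-day one
extra_flags = ((flags_3d != 0) & (flags_7d == 0)).sum().sum()
print(f"{extra_flags} cells are flagged only under the 3-day window")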

SUMMARY:

  1. Always consider 4 factors: User Acquisition scaling, measurement error, user behavior, and the time window.
  2. The average user retention rate becomes less reliable in the long tail of the user cohort.
  3. Adjust the time window to be in sync with your User Acquisition team.

