Engagement Series

D1 retention — a tricky metric to predict

What are early signals of good D1 retention?

Paul Levchuk
5 min read · Dec 6, 2023

The definition of D1 retention is quite straightforward: it’s the percentage of users who return to the product between 24 and 48 hours after signup.

As usual, D1 churn = [1 - D1 retention] is the biggest daily drop-off your product will ever experience. In the previous post, I proposed decomposing D1 retention into 2 metrics:

  • % of users who returned for a 2nd session
  • % of users who returned on D1 after the 2nd session

It’s a useful approach in product analytics indeed.
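
One way to read that decomposition is multiplicatively, assuming every user counted as returning on D1 first had a 2nd session on D0 (a simplification). A minimal sketch with made-up cohort numbers:

```python
# Hypothetical daily-cohort counts -- illustration only, not real data.
signups = 1000
users_with_2nd_session = 450          # returned for a 2nd session on D0
users_back_on_d1_after_2nd = 270      # of those, also seen on D1

pct_returned_2nd_session = users_with_2nd_session / signups
pct_returned_d1_after_2nd = users_back_on_d1_after_2nd / users_with_2nd_session

# Under this reading, D1 retention is the product of the two rates.
d1_retention = pct_returned_2nd_session * pct_returned_d1_after_2nd
print(f"D1 retention = {d1_retention:.2f}")  # 0.45 * 0.60 = 0.27
```

Improving either factor (getting more users to a 2nd session, or getting more 2nd-session users back the next day) lifts D1 retention.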

But product analytics is not only about metric decomposition. It’s also about learning from user behavior.

Let’s draw a schematic user flow and think for a moment about what else we know about users on D0:

Schematic user flow. Image by the author.

Based on the scheme above, we can assess the following things on D0:

  • time the user spent in the product during the 1st session
  • absence time between the 1st and 2nd sessions on D0
  • % of users who returned for a 2nd session on D0
  • total time the user spent in the product on D0
  • total number of events the user generated during D0
  • total number of sessions the user had during D0
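
These per-user signals can be derived from a raw event log. Here is a sketch with pandas on a toy log; the (user_id, session_id, ts) schema is an assumption, not the author's actual data model:

```python
import pandas as pd

# Toy event log -- the schema is assumed for illustration.
events = pd.DataFrame({
    "user_id":    [1, 1, 1, 1, 2, 2],
    "session_id": [10, 10, 11, 11, 20, 20],
    "ts": pd.to_datetime([
        "2023-12-06 09:00", "2023-12-06 09:05",
        "2023-12-06 13:00", "2023-12-06 13:20",
        "2023-12-06 10:00", "2023-12-06 10:02",
    ]),
})

# Session boundaries on D0, ordered per user.
sessions = (events.groupby(["user_id", "session_id"])["ts"]
                  .agg(start="min", end="max")
                  .reset_index()
                  .sort_values(["user_id", "start"]))
sessions["rank"] = sessions.groupby("user_id").cumcount()

first = sessions[sessions["rank"] == 0].set_index("user_id")
second = sessions[sessions["rank"] == 1].set_index("user_id")

d0 = pd.DataFrame({
    "first_session_sec": (first["end"] - first["start"]).dt.total_seconds(),
    "n_sessions_d0": sessions.groupby("user_id").size(),
    "n_events_d0": events.groupby("user_id").size(),
    "total_time_d0_sec": (sessions
        .assign(dur=(sessions["end"] - sessions["start"]).dt.total_seconds())
        .groupby("user_id")["dur"].sum()),
})
# Absence time between sessions 1 and 2 (NaN if there was no 2nd session).
d0["time_away_sec"] = (second["start"] - first["end"]).dt.total_seconds()
d0["returned_2nd_session"] = d0["n_sessions_d0"] >= 2
print(d0)
```

Averaging `returned_2nd_session` over a signup-day cohort then gives the [% of users who returned for a 2nd session on D0] signal.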

Let’s calculate these metrics for daily cohorts and build linear regressions to figure out how well each of them predicts the target metric [% users returned on D1].

Regression analysis. Image by the author.
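
The fit-quality check behind the chart can be sketched as follows; the cohort numbers are made up, and the ordinary least-squares fit via `np.polyfit` stands in for whatever tooling the author used:

```python
import numpy as np

def linear_r2(x, y):
    """Fit y ~ a*x + b by least squares and return the R2 score."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    a, b = np.polyfit(x, y, 1)
    ss_res = np.sum((y - (a * x + b)) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1 - ss_res / ss_tot

# Hypothetical daily cohorts: signal value vs. [% users returned on D1].
pct_2nd_session = [0.30, 0.35, 0.40, 0.45, 0.50]
pct_returned_d1 = [0.20, 0.22, 0.27, 0.29, 0.33]

print(round(linear_r2(pct_2nd_session, pct_returned_d1), 2))
```

The same helper can be run over each candidate signal to rank them by R2, as the analysis below does.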

There are a few things we need to consider:

  1. R2: a measure of how well the linear model fits the data (the closer R2 is to 1.0, the better)
  2. Variance of the points around the regression line (the lower the variance around the red line, the better)
  3. Whether some points look like outliers (remove them and check R2 again)
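
Point 3 can be sketched like this: drop the worst-fitting point and refit. The data is synthetic, and removing only the single largest residual is a deliberate simplification of proper outlier diagnostics:

```python
import numpy as np

def linear_r2(x, y):
    """R2 of a least-squares linear fit y ~ a*x + b."""
    a, b = np.polyfit(x, y, 1)
    ss_res = np.sum((y - (a * x + b)) ** 2)
    return 1 - ss_res / np.sum((y - np.mean(y)) ** 2)

# Hypothetical cohort points; the last one is an artificial outlier.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 3.0])
y = np.array([0.20, 0.24, 0.27, 0.31, 0.35, 0.80])

r2_all = linear_r2(x, y)

# Drop the point with the largest absolute residual, then refit.
resid = np.abs(y - np.polyval(np.polyfit(x, y, 1), x))
keep = resid < resid.max()
r2_clean = linear_r2(x[keep], y[keep])

print(round(r2_all, 2), round(r2_clean, 2))
```

A single far-away point can mask an otherwise strong linear relationship, which is exactly what the outlier analyses below illustrate.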

R2 score and Variance

As we can see, the signals

  • [1st session time]
  • [Time away between sessions 1 & 2]
  • [Total time on D0], and
  • [Total events on D0]

have quite a low R2 (in the range [0.0, 0.05]).

This means that our linear regression does not explain the variance of the target metric [% users returned on D1] well. The main reason for this is the very high variance of these signals.

At the same time, two other signals demonstrate a much better fit:

  • [% users returned in 2nd session on D0]
  • [Total number sessions on D0]

R2 score is still much less than 1.0 but at least it’s around 0.3.

What’s more important is that these two signals are 10x stronger than the previous group of 4.

Outliers

Outliers are points that are unusually far from the other points. I marked them in red.

[Time away between sessions 1 & 2]

If we remove the red point, the resulting fit will improve from 0.03 up to 0.23. It’s a significant improvement.

Signal [Time away between sessions 1 & 2] — outliers analysis. Image by the author.

Now it makes sense to think about what this signal is about.

The signal [Time away between sessions 1 & 2] measures the time that users spent on something else after they finished their 1st session in the product. According to the chart, this time varies from 10K seconds (~2.8 hours) up to 20K seconds (~5.5 hours).

My initial guess was that the shorter the period between the 1st and 2nd sessions, the better. But the data demonstrates the opposite.

Probably, when users take their time, their next session is less stressful (is it evening time?), and as a result they get more value from the product and return the next day.

It’s interesting that the signal [Time away between sessions 1 & 2] has more predictive power than the signal [1st session time]. Think about this.

[% users returned in 2nd session on D0]

If we remove the red points, the resulting fit will improve from 0.31 up to 0.44. It’s a great improvement.

Signal [% users returned in 2nd session on D0] — outliers analysis. Image by the author.

The signal [% users returned in 2nd session on D0] is the percentage of users who returned for a 2nd session on signup day.

From the chart above we can learn that the higher the percentage of users who returned to 2nd session on D0, the higher the percentage of users who returned the next day.

It makes a lot of sense to me.

If we expect users to return on D1, they should demonstrate similar behavior on D0. This reminds me of the recommendation on choosing covariates from the CUPED paper:

Across a large class of metrics, our results consistently showed that using the same variable from the preexperiment period as the covariate tends to give the best variance reduction.

[Total number sessions on D0]

If we remove the red points, the resulting fit will improve from 0.29 up to 0.52. It’s a significant improvement.

Signal [Total number sessions on D0] — outliers analysis. Image by the author.

The signal [Total number sessions on D0] is the average number of times users returned to the product on signup day.

The more times a user returns to the product in a day, the higher the chance that they return the next day. In this sense, the metric [Total number sessions on D0] is an extension of the metric [% users returned in 2nd session on D0].

Both of them share similar pieces of information about users’ behavior and therefore have a strong positive correlation:

Correlation between signal [% users returned in 2nd session on D0] and [Total number sessions on D0]. Image by the author.
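
A quick way to quantify that overlap is the Pearson correlation between the two signals across cohorts; the per-cohort values below are made up for illustration:

```python
import numpy as np

# Hypothetical per-cohort values of the two session-based signals.
pct_2nd_session = np.array([0.30, 0.35, 0.40, 0.45, 0.50])
avg_sessions_d0 = np.array([1.4, 1.5, 1.7, 1.8, 2.0])

# Pearson correlation coefficient between the two signals.
r = np.corrcoef(pct_2nd_session, avg_sessions_d0)[0, 1]
print(round(r, 2))
```

A correlation close to 1.0 means the two signals carry largely redundant information, so including both in a model adds little.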

To summarize:

  1. Not all behavior signals are equally useful. Always check the relationship between signals and the target metric.
  2. The strongest signals are usually the same signals from previous periods.
  3. Not all signals contain unique information.

