Optimizing push frequency for new users

Devon Bancroft
SmartNews, Inc
Oct 4, 2022

Push notifications are a powerful tool for a news app. Unlike in-app features, notifications show up on a user’s phone unprompted, directly asking them to open the application. Optimizing them has been one of the most important strategies for increasing SmartNews user engagement. In general, sending more notifications can increase short-term engagement, but it can also drive up churn, uninstall, and notification unsubscribe rates.

Over the past few years, SmartNews has performed extensive testing to determine the best frequency for push notifications. In this blog post, I will go through our general findings for our push frequency strategy, and how we applied an ML approach to the problem.

Negative impact can take a long time to show up in top-level engagement metrics

When we first started running AB tests on push notification frequency, we naively used typical short-term engagement metrics (for example, daily active users) to measure success. After running one particular test that increased the number of notifications, we saw huge gains in our short-term KPIs. We decided to release the change after monitoring the test for two weeks, leaving a small percentage of users behind in a holdout.

What we found next was a big surprise: in the short and even medium term, the results stayed consistent with what we had seen during the test, with the released variant beating the holdout group. However, when we checked the results six months later, we noticed a gradual but sustained degradation in performance. And after a year, the holdout was actually beating the released variant.

We learned that push notifications increased the frequency with which users came back to the app, but also significantly increased churn, uninstalls, and other negative impacts. This dual effect is easy to miss with typical engagement metrics like daily active users (DAU). DAU conflates two things: how many users still use the app at all, and how often those remaining users come back. So while the increase in engagement frequency is apparent immediately, the negative impact on DAU can take a long time to become noticeable.

Figure: DAU for the users in a long-term AB test (DAU naturally decays over time for a fixed set of users). After a very long period, control starts to beat test.
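To make this dynamic concrete, here is a toy simulation (all rates are hypothetical, not SmartNews data). A higher-frequency variant lifts the chance that a retained user opens the app on a given day, but also raises daily churn, so its DAU advantage erodes and eventually flips:

```python
# Illustrative simulation with made-up rates. DAU for a fixed cohort is
# (users still retained) x (probability a retained user is active today),
# so a small churn penalty compounds daily while the activity lift stays
# constant, and the crossover takes a long time to appear.

def dau(cohort_size, daily_churn, p_active, day):
    """DAU of a fixed cohort on a given day since install."""
    retained = cohort_size * (1 - daily_churn) ** day
    return retained * p_active

# control: fewer pushes -> lower activity, lower churn
# test: more pushes -> higher activity, higher churn
control = [dau(100_000, 0.010, 0.30, d) for d in range(365)]
test = [dau(100_000, 0.011, 0.36, d) for d in range(365)]

crossover = next(d for d in range(365) if test[d] < control[d])
print(f"test beats control on DAU until day {crossover}")
```

With these particular rates the crossover lands around the six-month mark, mirroring what we saw in the holdout: the short-term metric looks like a clear win long after the long-term damage has already begun.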

New users are particularly sensitive to too many notifications

When we broke down the results by new users vs existing users, we noticed a clear difference in behavior. Increasing push notification frequency dramatically increased negative impacts from push for new users, while the increase for existing users was more subtle and gradual. We found two reasons that explain these findings:

1. Survivorship bias

Existing users are, by definition, ones who have been using the app for a long period of time without churning. There is a self-selection bias for users who find our pushes engaging and are comfortable with the baseline frequency at which we send them.

2. Push notifications require trust, which takes time to build

Beyond survivorship bias, new users are also inherently more sensitive to too many notifications. We found a strong relationship between the number of days since a user installed the app and the probability they will uninstall, churn, or unsubscribe when sent too many push notifications.

Push notifications can be intrusive and noisy — allowing an app to send them requires a certain amount of trust from the user. If the user is just getting to know your product and doesn’t yet have a full understanding of the value proposition, they’ll quickly get turned off or annoyed when you send too many.

First approach — the slow boil

Given this, our approach was to focus on new users, paying close attention to how many days it has been since the user installed the app. After multiple iterations, we found the best approach was to slowly increase the number of notifications over time. When the user first installs the app, we wait a few days before sending our first notification. Then we slowly increase the frequency over the first month after installing.

This results in a huge decrease in engagement for the first week after install, but then a dramatic increase in the weeks after, and ultimately a big increase in overall lifetime engagement.
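A minimal sketch of such a schedule, with the delays and budget values as illustrative placeholders rather than our actual numbers:

```python
# "Slow boil" schedule sketch: the daily push budget ramps up over the
# first month after install. All thresholds and budgets are hypothetical.

def slow_boil_budget(days_since_install: int) -> int:
    """Daily push budget as a function of account age."""
    if days_since_install < 3:      # quiet period right after install
        return 0
    elif days_since_install < 7:    # gentle start in the first week
        return 1
    elif days_since_install < 14:
        return 2
    elif days_since_install < 30:
        return 3
    return 5                        # steady-state budget after onboarding

schedule = [slow_boil_budget(d) for d in range(35)]
```

The key property is that the budget is non-decreasing in account age: engagement is deliberately sacrificed in the first week to earn the trust that makes the later, higher frequency sustainable.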

However, this was a one-size-fits-all approach. There are some users who would want to get notifications starting much earlier, and others where even this approach is too aggressive.

Second approach — applying machine learning to the problem

Business logic

Our goal is to create an algorithm that finds the best push notification frequency for each user, and adjusts it over time as the user ages.

For this first iteration, we focused specifically on modeling the first week after install. After waiting a couple of days to gather some in-app usage data, we decide what each user’s daily push budget will be for the following few days. The daily push budget is defined as the maximum number of pushes a user can receive in one day. A heuristic then adjusts each user’s budget based on the model’s initial decision for the first week.

The model would make a decision for each user between weeks 0 and 1. Then, based on which budget was decided for the first week, a heuristic would gradually increase it until the end of week 4.

Modeling and performance metric

What should the goal metric of the model be? As we learned above, we can’t use a simple short-term engagement metric — it needs to consider both engagement and the negative impact from push.

This led us to two initiatives:

  1. Create models to predict each user’s short-term engagement as well as churn or uninstall probability given different push budgets.
  2. Create a value function that trades off the two predictions, with the ultimate goal being long-term engagement.

Model creation

For each user, we can predict both their short-term engagement and the probability for a negative impact for the first week after installing, given different push budgets and a vector of features.

We used XGBoost for the short-term engagement prediction, and a random forest classifier for the negative impact probability. To generate an unbiased training set, we randomly assigned new users to different push budgets.
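A rough sketch of this training setup, using synthetic stand-in data and scikit-learn’s GradientBoostingRegressor in place of XGBoost so the example has a single dependency; the features and labels here are assumptions for illustration only:

```python
# Sketch of the two week-1 predictors, trained on users who were randomly
# assigned a push budget (randomization keeps the training set unbiased,
# since budget is then independent of user features). Data is synthetic.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestClassifier

rng = np.random.default_rng(0)
n = 5_000
X = np.column_stack([
    rng.integers(0, 6, n),    # randomly assigned daily push budget
    rng.poisson(3, n),        # in-app sessions in the first days (made up)
    rng.random(n),            # placeholder for other usage features
])
engagement = rng.poisson(2, n).astype(float)  # week-1 opens (synthetic)
churned = rng.integers(0, 2, n)               # churn/uninstall label

engagement_model = GradientBoostingRegressor().fit(X, engagement)
negative_model = RandomForestClassifier(n_estimators=50).fit(X, churned)

# At decision time, score the same user under every candidate budget.
user_under_budgets = np.array([[b, 4, 0.7] for b in range(6)])
e_short = engagement_model.predict(user_under_budgets)
p_neg = negative_model.predict_proba(user_under_budgets)[:, 1]
```

Scoring one user under every candidate budget is what turns the two predictors into inputs for a per-user decision: each budget gets its own predicted engagement and predicted negative-impact probability.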

Value function

We then implemented a function that compares the negative and positive impacts and makes a decision on which budget to assign to each user. The function is ideally one that optimizes towards long-term engagement.

For this, we added one more prediction — the predicted long-term engagement of the user. The long-term engagement model was trained on historic data, and does not take the push budget into consideration.

This resulted in three pieces:

  1. Predicted short-term engagement, given user and push budget
  2. Predicted short-term negative impact, given user and push budget
  3. Predicted long-term engagement, given user

The value function then looks like this:

V(u, b) = P(engagement | u, b) + (1 − W · P(negativeImpact | u, b)) · P(longTermEngagement | u)

where:

  • u = user
  • b = push budget
  • P(engagement | u, b) = the predicted short-term engagement for the user and push budget combination
  • P(longTermEngagement | u) = the predicted future long-term engagement of the user
  • W = free parameter that increases or decreases the weight of the predicted negative impact
  • P(negativeImpact | u, b) = the predicted short-term negative impact for the user and push budget combination
With this function, we convert the probability that the user churns or uninstalls into lost future engagement, and add it to the predicted short-term engagement to get a final estimate of total long-term engagement. We multiply the predicted long-term engagement term by one minus the (weighted) probability of negative impact: if we are sure a user will churn or uninstall, the long-term term drops to zero, and their total lifetime engagement is simply their predicted short-term engagement. The free parameter W lets us weigh the predicted negative impact higher or lower, and we can tune it through online testing.

And so the final algorithm is simply the argmax of the value function, with user and budget as inputs:

budget(u) = argmax over b of V(u, b)
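Sketched in code, with the predict_* callables standing in for the three trained models, and W and the candidate budgets as illustrative placeholders:

```python
# Sketch of the final decision rule. The predict_* callables stand in for
# the three trained models; W and the candidate budgets are hypothetical.
W = 1.0                          # weight on the predicted negative impact
CANDIDATE_BUDGETS = [0, 1, 2, 3, 5]

def value(e_short: float, p_neg: float, e_long: float, w: float = W) -> float:
    """V(u, b) = shortTerm + (1 - w * P(negativeImpact)) * longTerm."""
    return e_short + (1.0 - w * p_neg) * e_long

def choose_budget(user, predict_short, predict_neg, predict_long):
    """Pick the candidate budget that maximizes the value function."""
    e_long = predict_long(user)  # the long-term model ignores the budget
    return max(
        CANDIDATE_BUDGETS,
        key=lambda b: value(predict_short(user, b), predict_neg(user, b), e_long),
    )
```

For example, with toy predictors where short-term engagement grows linearly in the budget but negative impact grows quadratically, the argmax lands on a small nonzero budget: the linear gains stop paying for the compounding expected loss of future engagement.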

How budget decision changes over the lifetime of the user

As we saw before, users’ tolerance and affinity for push notifications changes over time. The budget algorithm above is only deciding the budget for the first week after the user installs the app.

For the following weeks, we simply used a heuristic that slowly increases the budget over time. Users initially assigned lower push budgets would have their budgets increase gradually, while users assigned a more aggressive budget would ramp up more quickly. Finally, a few weeks after install, all users exit this onboarding period and have roughly the same push budget going forward.
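One possible shape for this heuristic is a multiplicative ramp capped at a common steady-state budget, so aggressive starters hit the cap quickly while cautious starters climb in small steps; every number here is an illustrative assumption:

```python
# Hypothetical post-decision ramp: the model's week-1 budget seeds a
# schedule that converges to a shared steady state by week 4.
STEADY_STATE_BUDGET = 5   # illustrative common budget after onboarding
ONBOARDING_WEEKS = 4

def weekly_budget(week1_budget: int, week: int, growth: float = 1.6) -> int:
    """Daily push budget for the given week since install (week 1 = first)."""
    if week >= ONBOARDING_WEEKS:
        return STEADY_STATE_BUDGET          # everyone exits onboarding here
    ramped = week1_budget * growth ** (week - 1)
    return min(STEADY_STATE_BUDGET, round(ramped))
```

A user who starts at a budget of 1 climbs one notch per week, while a user who starts at 3 reaches the cap almost immediately, matching the idea that users the model judged more push-tolerant can be ramped faster.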

Performance

We went through multiple rounds of AB testing to find the best new-user push frequency strategy. First, we tested various setups for the one-size-fits-all slow budget increase. Then, for the personalized budget model, we tested different values of the W (negative impact weight) parameter, different designs for the heuristic that increases the push budget over time, and improvements to the engagement and negative impact model predictions.

Overall, the push budget onboarding projects increased lifetime engagement by more than 15%.

Conclusion

Push notifications are an important tool in driving engagement for a news app. But they can oftentimes be a double-edged sword. It’s imperative to consider the upside and downside carefully, with an eye towards the long-term impact of any change you make. At SmartNews, through many rounds of experimentation and development, we were able to successfully roll out an algorithm for handling push frequency to new users — one that substantially increased our users’ total lifetime engagement.
