How to calculate LTV for mobile games

Alejandro Paz
8 min read · Dec 18, 2023


LTV = ARPDAU x Retention is the standard answer to how to calculate a mobile user’s lifetime value. It’s simple and easy, good enough for sketching a guesstimate into a P&L, but not really effective for active campaign management.
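As a quick illustration, here is that back-of-envelope formula as a minimal Python sketch. All the numbers are invented, and “lifetime” is approximated as the area under the retention curve:

```
# Naive LTV = ARPDAU x lifetime, where lifetime is approximated as the
# area under the retention curve. All figures below are invented.
arpdau = 0.12                                     # avg revenue per daily active user
retention = [1.00, 0.40, 0.25, 0.18, 0.12, 0.08]  # D0-D5 retention rates
lifetime_days = sum(retention)                    # expected active days per install
naive_ltv = arpdau * lifetime_days
print(f"Naive LTV ~ ${naive_ltv:.2f}")            # ~$0.24
```

Good enough for a P&L sketch, but it says nothing about the payer dynamics that actually drive campaign payback.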

The following is an initial attempt to create an LTV model that can help UA teams manage their active campaigns on a cost-effective basis.

This model is meant to be simple enough that you don’t have to rely on a data science team to decipher the results (though you may need some help in setting up the initial data pull).

Also, this model is built to analyze in-app purchase (iAP) revenue for free-to-play titles. It will still be somewhat relevant for subscription revenue, but ad-revenue-based LTV analysis requires a different model, which will be discussed in another writeup.

Dataset

We’ll be using formulas and randomized data from the attached Excel workbook. You can download the workbook at the end of the article to review and follow along (you may need to enable macros for the calculations to work correctly).

The raw example data is located within the red tabs. Feel free to plug in your own data to test it out.

Concept

Before getting into the calculation, I’d like to first take a step back and set up some basic concepts that will form the basis for our analysis.

  1. Payers bring revenue, not Users — If your goal is to predict revenue, then there is no point in analyzing the behavior of users who never make a payment. Payers are typically such a small percentage of a user base that removing non-payer behavior from your analysis helps cut out a lot of noise. We’ll first calculate the LTV of Paying Users, and then back the result into an install-based LTV.
  2. Payers develop over time — Most LTV calculations that I’ve seen only project from revenue that has already been delivered. Depending on the timeframe you are analyzing, this may cause you to ignore payers who have yet to make their first payment. Our method will require a rough prediction of the total payer base within an install cohort, including payers who convert outside of our dataset’s timeframe.
  3. You can’t predict extreme user behavior — As mentioned, payers are likely going to be a small percentage of your user base. The average payer rate for mobile games is ~2%. On top of that, most revenue for iAP-based games tends to be generated by the top 10% of the payer base. Hence, most of the revenue you will be trying to estimate will come from only ~0.2% of your users (see the quick sketch after this list). So, our LTV calculation will need to account for the wild revenue swings that depend on whether we happen to have whales within our sample.
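To put rough numbers on point 3, a quick sketch using the ballpark figures above (illustrative, not data from the workbook):

```
# Rough arithmetic behind the ~0.2% figure (ballpark inputs, not real data).
installs = 100_000
payer_rate = 0.02      # ~2% of installs ever pay
whale_share = 0.10     # top 10% of payers drive most of the revenue
whales = installs * payer_rate * whale_share
print(int(whales), f"{whales / installs:.1%}")   # 200 whales, 0.2% of installs
```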

Predicting the PU count

To begin with our analysis, first we’ll need to predict the expected count of purchasing users within an install cohort.

We’ll do this by building a reference table of how many DaysfromInstall it takes for a known purchaser to make their first purchase.

The 1stPurchaseData tab in our workbook has an example of how you will want to format your raw data.

Once we calculate the DaysfromInstall per individual paying user, we can use a pivot table to calculate the running % of total PU by day. We will want to segment the data by OS, as spend behavior typically varies between Android and iOS users.

Our dataset only has 100 paying-user samples per OS, which won’t fill out every day in our table. So, we will need to fill in the missing DaysfromInstall values to get a complete dataset of 365 days. This work is done in the Age tab of the workbook.
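If you prefer to do this step outside Excel, here is a minimal pandas sketch of the same pivot. The toy frame and its column names (user_id, os, days_from_install) are assumptions standing in for the 1stPurchaseData tab:

```
import pandas as pd

# Toy frame in the shape of the 1stPurchaseData tab (column names assumed).
df = pd.DataFrame({
    "user_id": range(8),
    "os": ["android", "android", "android", "android", "ios", "ios", "ios", "ios"],
    "days_from_install": [0, 0, 2, 6, 0, 1, 3, 9],  # first purchase day per payer
})

# Pivot: first purchases per day by OS, reindexed so every day 0-365 has a
# row (the gap-filling done on the Age tab in the workbook).
pivot = (df.pivot_table(index="days_from_install", columns="os",
                        values="user_id", aggfunc="count")
           .reindex(range(366))
           .fillna(0))

# Running % of total PU by day: the share of eventual payers who have made
# their first purchase by day x. This is the lookup used in the next step.
cumulative_share = pivot.cumsum() / pivot.sum()
print(cumulative_share.loc[[0, 2, 6, 9]])
```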

Our table only segments by OS, but you will want to extend the segmentation to Network and Geo due to broad differences in purchase behavior.

We can now use the resulting table to cross-calculate a predicted PU count from our existing data.

For example:

Assume you have 1,000 Android installs with 10 PU, and a weighted PU cohort age of 6 days from install at the time of analysis.

Based on our table, 10 PU at 6 days old should account for ~70% of the total PU that we can expect over the course of 365 days.

10 / 0.70 comes out to ~14 PU expected over 365 days.
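The same example in a couple of lines of code (the 0.70 is the Age-table lookup for Android at a weighted age of 6 days):

```
# The worked example above: predicted total PU from observed PU and the
# cumulative first-purchase share at the cohort's weighted age.
observed_pu = 10
share_paid_by_day_6 = 0.70
predicted_pu = observed_pu / share_paid_by_day_6
print(round(predicted_pu, 1))   # ~14.3 PU expected over 365 days
```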

Drawing the Historic Curve

The next step in our analysis will be to draw an ARPPU curve using our projected PU count.

ARPPU stands for Average Revenue Per Paying User. Our goal here is to plot aggregate ARPPU for existing revenue over time.

We plot ARPPU on the Y-axis against the number of days that have passed since the cohort’s install date on the X-axis. This will be the start of our LTV curve, which we will then extend.

This type of chart usually only looks at the data for existing Paying Users. That would ignore the long tail of users who make their first purchase very late in their lifespan and would therefore inflate the ARPPU value. In our case, we will account for these not-yet-payers using the PU prediction from the previous section.

First, we’ll have to cohort our existing PU and revenue data by the original date of install, and track aggregate revenue by days from install. You can see an example format in the Data tab of the workbook.

The Rev(x) columns are aggregate revenue based on days (x) from install. So, Rev 0 is the total revenue delivered by the user group on their day of install, Rev 1 is the total revenue delivered by the day after install, etc.

At the end of our sample table we also calculate the projected payer count, based on our previous calculations, plus an Age Helper column that lets us produce a properly weighted age when we pivot the data.
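Here is a minimal sketch of how one point on the historic curve is computed. The cohort frame below is invented, shaped roughly like the Data tab:

```
import pandas as pd

# Toy cohort table in the shape of the Data tab (all figures invented).
cohorts = pd.DataFrame({
    "age_days":     [30, 20, 10],      # days since the cohort installed
    "projected_pu": [14.3, 9.8, 6.1],  # from the PU prediction step
    "rev_0":        [50, 30, 20],      # cumulative revenue by day 0
    "rev_7":        [120, 80, 55],     # cumulative revenue by day 7
})

# Aggregate ARPPU at day x: total Rev(x) across cohorts that have already
# reached day x, divided by their combined projected PU count.
def arppu_at(df: pd.DataFrame, x: int) -> float:
    eligible = df[df["age_days"] >= x]
    return eligible[f"rev_{x}"].sum() / eligible["projected_pu"].sum()

print(arppu_at(cohorts, 7))   # one point on the historic ARPPU curve
```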

We can now use the data in our pivot table to chart out our historic ARPPU curve, which you can see in the first table of the puLTV tab.

You will typically have more users that have completed 1 day from install than users that have completed 30 days; hence the drop in the PU base over time. This can lead to volatility in the leading data points, as on day 45, which we will exclude from our analysis until we collect more data.

Projecting ARPPU

The previous calculation gave us historic ARPPU based on our projected payer counts. By extending the initial trend of this dataset, we will be able to track projected ARPPU over time, which will be the basis for our LTV calculation.

Using our previous chart as an example, we’ll draw two trend lines to project the potential paths of ARPPU over time.

  1. A log curve — commonly used in LTV calculations based on user retention, since retention follows a similar pattern of rapid decay followed by a leveling-off.
  2. A power curve — not as common, probably because it tends to give a very high projection in most LTV calculations.
The Log and Power curves will serve as Min and Max boundaries for our LTV calculation.

The log trend serves as a tracker for low-quality payers, whose ARPPU is largely determined by their retention. Low-quality payers typically show rapid revenue growth early in their lifespan, driven by a large number of payers willing to make low-value purchases. But their spend rate drops quickly, as they lack the retention to sustain the revenue trend.

The power trend serves as a tracker for high-quality payers. ARPPU for higher-quality payers tends to be more consistent over time, as they retain and spend regularly on a weekly/monthly/quarterly basis, depending on your product.

In Excel we can simply extend these trendlines to the target day to get the LTV for our paying users. We also calculate the results in the table at A23:

As we can see, the Min and Max trends produce two very different puLTV results, and the gap widens over longer time horizons. This reflects the level of uncertainty in our calculation.

As we collect more data over time, this difference will shrink (think of a zipper narrowing as you get closer to the end). In the meantime, we will use the average of these two calculations as our “official” puLTV.
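For anyone doing this step outside the workbook, here is a minimal sketch of the fit-and-extend logic. The ARPPU points, the fitted coefficients, and the 365-day target are all invented, standing in for the chart on the puLTV tab:

```
import numpy as np

# Toy ARPPU points standing in for the puLTV tab's chart (invented).
days = np.array([1, 3, 7, 14, 21, 30])
arppu = np.array([4.0, 6.5, 8.8, 10.5, 11.6, 12.5])

# Log curve y = a*ln(x) + b: a linear fit against ln(x).
a_log, b_log = np.polyfit(np.log(days), arppu, 1)

# Power curve y = a*x^b: a linear fit in log-log space.
b_pow, log_a = np.polyfit(np.log(days), np.log(arppu), 1)
a_pow = np.exp(log_a)

def project(day: int):
    """Min/Max boundaries at a target day, plus their average ('official' puLTV)."""
    lo = a_log * np.log(day) + b_log   # log trend = Min
    hi = a_pow * day ** b_pow          # power trend = Max
    return lo, hi, (lo + hi) / 2

print(project(365))   # (min, max, average) puLTV at day 365
```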

Converting puLTV to install-LTV

We now have our LTV estimate. The issue is that it is based on Paying Users, while most use cases require an install-based LTV. So, we’ll need to convert our calculation.

To do this, we simply multiply the projected puLTV by the projected PU count for the sample we are analyzing. That gives us the total expected revenue for our cohort, which you can see in the table at A30.

Revenue projections are based on puLTV * projected Paying Users

We can then divide our Revenue Projections by Total Installs. The result is the LTV for our install cohort. You can see the resulting table in B34 of the workbook.

Projected Revenue (over 365 days) / Total Installs = LTV
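Putting the conversion together with the running example’s numbers (the $25 puLTV is invented for illustration):

```
# Illustrative end-to-end conversion; the puLTV figure is made up.
pu_ltv = 25.0          # averaged Min/Max projection at day 365, per payer
projected_pu = 14.3    # from the PU prediction step
installs = 1_000       # the cohort's total installs

projected_revenue = pu_ltv * projected_pu    # ~$357.50 expected from the cohort
install_ltv = projected_revenue / installs   # ~$0.36 per install
print(f"Install LTV ~ ${install_ltv:.2f}")
```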

And now we have our LTV.

You can download the workbook here to review the above analysis and input your own data. Feel free to reach out if you need some advice on implementation: xj5elemental@gmail.com

Conclusion

So, we have been able to calculate an LTV estimate. But how accurate can this estimate really be when our Min/Max calculations produce such different results?

As mentioned, the difference between our Min and Max is indicative of the uncertainty in our calculation. We are taking relatively few data points and expanding them out to 365 days of revenue. Also, the majority of revenue in iAP products is usually driven by a very small number of payers, so we need to account for the possibility that a late-paying whale shows up and dramatically changes our results. We can see an example of this in the Cumulative ARPU calculation for day 45 in our sample, where a small user base is able to dramatically change the overall trend:

One way to deal with this issue is to keep adding behavioral data points to further segment and track our users’ spending patterns. The issue is that every segment we add will require even more data to analyze (since we will be splitting our user base into smaller samples). Adding more data points will also require an increasing number of calculations to normalize our results.

Adding further data analysis to our projection may be effective if you have the resources to build and maintain such a model. But a simpler, much cleaner solution would be to use our model to build a risk matrix for our campaigns.

The higher your Cost per Paying User is, the less likely you are to make your money back
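As a sketch of what that risk matrix could look like in code, here is one hypothetical way to bucket a campaign by comparing its cost per paying user (CPPU) against the Min/Max puLTV boundaries. The function and its thresholds are assumptions, not the workbook’s logic:

```
# Hypothetical risk bucketing: compare a campaign's cost per paying user
# (CPPU) against the Min/Max puLTV boundaries. Thresholds are assumptions.
def risk_bucket(cppu: float, pu_ltv_min: float, pu_ltv_max: float) -> str:
    if cppu <= pu_ltv_min:
        return "low risk: pays back even on the pessimistic (log) trend"
    if cppu <= pu_ltv_max:
        return "medium risk: payback depends on which trend materializes"
    return "high risk: underwater even on the optimistic (power) trend"

print(risk_bucket(cppu=18.0, pu_ltv_min=15.0, pu_ltv_max=32.0))  # medium risk
```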

In a future writeup I will show how you can use this risk method in conjunction with your media mix model to maximize the performance of your campaigns.
