Predicting Players’ Lifetime Value in Free-to-Play Mobile Games

Funmetric
Aug 2, 2021

Hello everyone, I am Kutay Akalın. I am a Data Scientist and Co-Founder of Funmetric. I did my master's degree in Big Data and Analytics. At Funmetric, we believe knowledge grows when shared. I hope you will enjoy our article :)

In this article, we are going to talk about what Lifetime Value is, how to calculate current players’ LTV, and how to predict it from new players’ behavior. We will explain different methods such as Retention Prediction, Probabilistic Models, and Machine Learning. Each model has its own advantages, disadvantages, and use cases. Calculating and predicting LTV is one of the keys to building a sustainable growth strategy, yet there is no single, globally accepted way to calculate it. Let's dive into the details!

What is Lifetime Value (LTV)?

Lifetime Value is one of the most important metrics for understanding your game’s value. According to El-Nasr et al. (2013), Lifetime Value is the total amount of money a user will contribute to a game’s bottom line while they are engaged [1]. For example, in paid games a player’s lifetime value is equal to the initial purchase price. In subscription-based games, a player’s LTV accumulates each month for as long as they remain active. In freemium games, there is no generally accepted approach to calculating players’ LTV, so most developers use their own method.

What are the main variables of LTV?

To calculate LTV, we need to consider two main variables:

  • Monetization
  • Lifetime (Retention)

In free-to-play games, monetization can come from different revenue sources such as ad revenue, in-app purchases, and subscriptions. The LTV calculation can therefore be adjusted according to the share of income from each source, but all of these sources need to be considered together with future goals. On the retention side, most developers treat the player session as the most valuable time window at the player level, yet they calculate retention in daily cohorts. As a result, measuring daily activity and engagement from these player sessions becomes problematic.

Necessities for LTV Calculations

First, developers should decide the period of time over which they want to calculate players’ value, such as:

  • Day 7 LTV
  • Day 30 LTV
  • Day 180 LTV

Many factors affect the choice of this period, such as genre, business model, stage of the game, and ROI [2]. In addition, developers should select an appropriate monetization metric for calculating and predicting LTV.

Models

Lifetime Value is the sum of all revenue generated by a specific player or a specific cohort. The choice of LTV model depends on the prediction level (cohort or player) and on the availability of historical data.

Historical Industry Benchmarks

If you have just launched your game and don’t have any data yet, you can analyze previously published similar games along with sub-genre and industry-average benchmarks. In this analysis, you can linearly project the future of the game, but this approach may not give very accurate results [2].

Projecting Retention Curve

Predicting the lifetime of a user is one of the most popular approaches to predicting LTV. In this approach, you can simply use regression analysis: the aim is to fit a curve that minimizes the error between the observed points and the curve. For example, assume that you are in the soft-launch phase and want to predict 60-day retention for a cohort. On the 14th day, you measure retention of 45% on Day 1, 13% on Day 7, and 8% on Day 14. Based on these values, you can fit a nonlinear curve to see how retention changes over time:

Retention Curve Fit
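A minimal sketch of such a fit, using only the three retention points from the example. The article does not say which curve family was used for the figure, so the power-law form `retention = a * day**b` and the log-log least-squares fit are assumptions here, as are the function names. The projected Day 60 value will therefore not necessarily match the figure.

```python
import math

def fit_power_law(days, retention):
    """Least-squares fit of retention = a * day**b, done in log-log space.

    Taking logs turns the power law into a straight line:
    log(retention) = log(a) + b * log(day).
    """
    xs = [math.log(d) for d in days]
    ys = [math.log(r) for r in retention]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                      # slope = decay exponent
    a = math.exp(mean_y - b * mean_x)  # intercept = modeled Day 1 retention
    return a, b

def predict_retention(a, b, day):
    """Retention predicted by the fitted curve for day >= 1."""
    return a * day ** b

# Observed cohort retention from the example (Day 1, Day 7, Day 14).
days = [1, 7, 14]
retention = [0.45, 0.13, 0.08]

a, b = fit_power_law(days, retention)
print(f"a={a:.3f}, b={b:.3f}")
print(f"Projected Day 60 retention: {predict_retention(a, b, 60):.2%}")
```

Fitting in log space weights relative errors rather than absolute ones, which is one reasonable choice among several. Other curve families (for example exponential or logarithmic) can be swapped in the same way.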

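As the graph shows, the projected Day 60 retention rate is 3.69%. With our retention curve, we can calculate the LTV for a given period using the formula: LTV = ARPDAU × Lifetime, where Lifetime is the expected number of days a player is active in that period, that is, the sum of the daily retention rates (the area under the retention curve).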

If our ARPDAU is $0.30 and our CPI is $1.50, the break-even point can be seen in the figure below:

Break-Even Analysis
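The break-even calculation can be sketched as follows. The sketch assumes that LTV up to day N is ARPDAU times the sum of daily retention rates up to day N, that ARPDAU stays constant, and that the install day (Day 0) counts as 100% retention. The power-law coefficients are rounded, illustrative values close to what a log-log fit of the three example points gives. All function and variable names are hypothetical.

```python
def cumulative_ltv(arpdau, retention_at, horizon):
    """Return a list where item N is the LTV accumulated from Day 0 to Day N.

    LTV(N) = ARPDAU * sum of retention_at(d) for d = 0..N
    """
    ltv_by_day = []
    expected_active_days = 0.0
    for day in range(horizon + 1):
        expected_active_days += retention_at(day)
        ltv_by_day.append(arpdau * expected_active_days)
    return ltv_by_day

def break_even_day(ltv_by_day, cpi):
    """First day on which cumulative LTV covers the cost per install, or None."""
    for day, value in enumerate(ltv_by_day):
        if value >= cpi:
            return day
    return None

# Assumed retention curve: illustrative power-law coefficients, not fitted here.
A, B = 0.45, -0.65

def retention_at(day):
    # Assumption: every installed player is active on Day 0.
    return 1.0 if day == 0 else A * day ** B

ARPDAU = 0.30  # average revenue per daily active user, from the example
CPI = 1.50     # cost per install, from the example

ltv = cumulative_ltv(ARPDAU, retention_at, horizon=180)
day = break_even_day(ltv, CPI)
print(f"Day 30 LTV: ${ltv[30]:.2f}, Day 180 LTV: ${ltv[180]:.2f}")
print(f"Break-even day: {day}")
```

Whether Day 0 is counted, and whether ARPDAU is held constant, both shift the break-even day noticeably, so these conventions are worth fixing before comparing cohorts.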

Of course, this is a simple example. You can extend this analysis by generating models for different audiences, different businesses, different retention curves, and so on. Also, you can even use regression analysis to analyze the relationship between spending and CPI. You may want to use this analysis to optimize your ROI for your marketing strategy.

Try our FREE tool to project your LTV using optimized retention curves and analyze your break-even point!

Probabilistic Models

In Probabilistic Models, we try to fit a probability distribution to the RFM (recency, frequency, and monetary) values in order to predict players’ purchase behavior. Note that in this approach, the variables of the LTV formula cannot be used in a single model. Thus, we must build two models to predict the two main variables separately: transaction variables (purchase frequency and churn) and monetary variables (average order value) [3]. Some of the Probabilistic Models are displayed below:

Probabilistic Models [3]

Among these models, Pareto/NBD and BG/NBD models are commonly used.

Pareto/NBD

Schmittlein et al. developed the Pareto/NBD model in 1987. The Pareto/NBD model attempts to forecast individual players’ purchasing behavior based on their previous purchases. To predict transactions, this approach assumes that the time until a player churns follows a Pareto distribution and that the purchase count of an active player follows a Negative Binomial Distribution (NBD). Hence, this approach is called the Pareto/NBD model. These distributions are controlled by four main parameters: r and α for the NBD, s and β for the Pareto [4].

BG/NBD

Although the Pareto/NBD model is widely used, it is difficult to apply because parameter estimation requires multiple evaluations of the Gauss hypergeometric function. Fader et al. suggested modeling player behavior with the beta-geometric (BG/NBD) model instead, whose parameters are easier to optimize and which runs faster. Like the Pareto/NBD model, BG/NBD uses four parameters and assumes that players can make purchases at any time in the period [4].
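To illustrate how the four BG/NBD parameters are used once they have been estimated, here is a small sketch of the closed-form "probability alive" expression published for the model by Fader, Hardie, and Lee. The parameter values below are purely illustrative and not fitted to any game data, the function name is hypothetical, and recency and T must be in the same time unit used when fitting.

```python
def bgnbd_p_alive(frequency, recency, T, r, alpha, a, b):
    """Probability that a player is still active under the BG/NBD model.

    frequency: number of repeat purchases (x)
    recency:   time of the last purchase, measured from the first one (t_x)
    T:         time from the first purchase to the end of observation
    r, alpha:  NBD (purchase process) parameters
    a, b:      beta-geometric (churn process) parameters
    """
    if frequency == 0:
        # In BG/NBD, churn can only happen right after a purchase, so a
        # player with no repeat purchases is treated as still active.
        return 1.0
    ratio = (alpha + T) / (alpha + recency)
    odds_churned = (a / (b + frequency - 1)) * ratio ** (r + frequency)
    return 1.0 / (1.0 + odds_churned)

# Illustrative parameter values only (assumed, not estimated from data).
params = {"r": 0.25, "alpha": 4.5, "a": 0.8, "b": 2.5}

# Two players with 5 repeat purchases over a 30-week observation window:
recent_buyer = bgnbd_p_alive(frequency=5, recency=28, T=30, **params)
lapsed_buyer = bgnbd_p_alive(frequency=5, recency=10, T=30, **params)
print(f"recent buyer: {recent_buyer:.2f}, lapsed buyer: {lapsed_buyer:.2f}")
```

The player whose last purchase was long ago gets a lower probability of being active, which is the behavior the recency variable is meant to capture.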

To calculate LTV using the probabilistic model we need to perform 4 main steps:

  1. Process the raw data to extract RFM values.
  2. Tune the parameters of the transactional model (Pareto/NBD or BG/NBD).
  3. Create a model for the monetary variables (Gamma-Gamma distribution) and fit it.
  4. Predict LTV for each player.
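The first step above can be sketched as follows. The input format (a list of `(player_id, purchase_date, amount)` tuples), the sample data, and the function name are all hypothetical. The sketch also assumes the "repeat purchase" convention commonly used with BG/NBD-style models: the first purchase day starts the clock, and only later purchase days count toward frequency and monetary value.

```python
from collections import defaultdict
from datetime import date

def rfm_summary(transactions, observation_end):
    """Build per-player RFM inputs from raw purchase records.

    Returns, per player:
      frequency      - number of repeat purchase days (first day excluded)
      recency        - days between first and last purchase
      T              - days between first purchase and observation_end
      monetary_value - average daily spend on repeat purchase days
    """
    # Aggregate purchases made on the same day into one transaction.
    daily_spend = defaultdict(lambda: defaultdict(float))
    for player_id, day, amount in transactions:
        daily_spend[player_id][day] += amount

    summary = {}
    for player_id, spend_by_day in daily_spend.items():
        days = sorted(spend_by_day)
        first, last = days[0], days[-1]
        repeat_days = days[1:]
        frequency = len(repeat_days)
        repeat_total = sum(spend_by_day[d] for d in repeat_days)
        summary[player_id] = {
            "frequency": frequency,
            "recency": (last - first).days,
            "T": (observation_end - first).days,
            "monetary_value": repeat_total / frequency if frequency else 0.0,
        }
    return summary

# Hypothetical purchase log.
transactions = [
    ("p1", date(2021, 1, 1), 4.99),
    ("p1", date(2021, 1, 3), 0.99),
    ("p1", date(2021, 1, 3), 1.99),
    ("p1", date(2021, 1, 10), 9.99),
    ("p2", date(2021, 1, 5), 2.99),
]
summary = rfm_summary(transactions, observation_end=date(2021, 1, 31))
for player_id, row in summary.items():
    print(player_id, row)
```

The resulting table is the input for steps 2 and 3. Players who never purchased do not appear in it and need to be handled separately.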

Machine Learning Models

Machine Learning and Deep Learning models are becoming more popular every day, and these algorithms often perform well compared to statistical models. Since lifetime value is a numerical target, we can use supervised regression models if we have historical data. Machine learning models require a large amount of data, so we need cleaned, processed data for the model-building phase. If we don’t have enough data, we can use synthetic data generation techniques and transfer learning to improve model performance. We can also build separate models for different revenue sources if needed.

One of the main advantages of machine learning models is that we can use many different types of player features to predict LTV for the selected period. In probabilistic models, we can use only three variables (Recency, Frequency, and Monetary). In ML models, however, we can generate additional valuable features from raw data using Feature Engineering techniques. Of course, which features are significant varies by business model, game genre, and data availability.
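As a small illustration of feature engineering from raw data, the sketch below turns an event log into per-player features over the first days after install. The event format, event type names, chosen features, and the three-day window are all assumptions made for the example. Real feature sets depend on the game and the data available.

```python
from collections import defaultdict

def early_features(events, window_days=3):
    """Aggregate raw events into per-player features for the first days.

    events: iterable of (player_id, days_since_install, event_type, value)
      - "session":  value is the session length in minutes
      - "purchase": value is the purchase amount
      - "ad":       value is ignored
    """
    raw = defaultdict(lambda: {
        "sessions": 0, "play_minutes": 0.0, "purchases": 0,
        "iap_revenue": 0.0, "ads_watched": 0, "days": set(),
    })
    for player_id, day, event_type, value in events:
        if day >= window_days:
            continue  # ignore events outside the observation window
        row = raw[player_id]
        row["days"].add(day)
        if event_type == "session":
            row["sessions"] += 1
            row["play_minutes"] += value
        elif event_type == "purchase":
            row["purchases"] += 1
            row["iap_revenue"] += value
        elif event_type == "ad":
            row["ads_watched"] += 1

    features = {}
    for player_id, row in raw.items():
        sessions = row["sessions"]
        features[player_id] = {
            "sessions": sessions,
            "active_days": len(row["days"]),
            "minutes_per_session": row["play_minutes"] / sessions if sessions else 0.0,
            "purchases": row["purchases"],
            "iap_revenue": row["iap_revenue"],
            "ads_watched": row["ads_watched"],
        }
    return features

# Hypothetical event log.
events = [
    ("p1", 0, "session", 12.0),
    ("p1", 0, "ad", 0.0),
    ("p1", 1, "session", 8.0),
    ("p1", 1, "purchase", 4.99),
    ("p1", 5, "purchase", 9.99),  # outside the 3-day window, ignored
    ("p2", 0, "session", 3.0),
]
features = early_features(events)
for player_id, row in features.items():
    print(player_id, row)
```

Each row can then be paired with the player's observed revenue over the target period (for example Day 30) to form the training set for a regression model.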

Building a machine learning model also requires many steps, from building the data pipeline to deployment and production. Funmetric’s ready-to-use AI Pack builds this pipeline automatically, processes and explores the data, generates meaningful features, and returns the results with high accuracy even if you do not have enough data.

Summary

In this article, we briefly explained the models for predicting players’ lifetime value in mobile games. Predicting new players’ LTV is essential for tasks such as optimizing marketing budget allocation, managing monetization strategy, and calculating ROI. In short, we can select the appropriate model by checking historical data availability and the monetization model. If you have enough data, machine learning models can achieve higher accuracy than the other models.

Please feel free to ask any questions via info@funmetric.io or book a free call. Let’s maximize your profit together!

References

[1] El-Nasr, M.S., Drachen, A., & Canossa, A. (2013). Game Analytics: Maximizing the Value of Player Data. Springer. DOI: 10.1007/978-1-4471-4769-5

[2] Monereo, I. (n.d.). Insights for evaluating lifetime value for game developers. Google Play Developer Communications.

[3] Analytics Vidhya. A Definitive Guide for predicting Customer Lifetime Value (CLV). Retrieved from https://www.analyticsvidhya.com/blog/2020/10/a-definitive-guide-for-predicting-customer-lifetime-value-clv/

[4] Cloud Architecture Center. Predicting Customer Lifetime Value with AI Platform: introduction. Retrieved from https://cloud.google.com/architecture/clv-prediction-with-offline-training-intro

Funmetric

We are a group of analysts, data scientists, designers, and marketers developing data-driven technologies to help publishers and game studios achieve their goals.