Diving into LTV

Sean Billings
OutPost: OutPoint Growth Blog
Mar 10, 2022

The Lifetime Value of customers is one of the most important metrics for measuring the health of a business.

At one point we were heavily invested in LTV research; one of the original ideas behind OutPoint was basically an LTV-Co. As we evolved, we focused more on the primary levers of marketing capital allocation and on key driving factors like marginal ROAS and diminishing returns based on media mix. We still love the nuances of most modelling problems in this space.

In this post we explore Customer Lifetime Value (CLV), also known as Lifetime Value (LTV) or long-term value. We outline and derive several basic LTV estimates, propose several regression models for LTV, and investigate important levers for managing LTV.

Simplistic LTV

The simple LTV equation decomposes LTV into acquisition cost c, customer profit p, and expected lifetime l. By taking expectations over each of these quantities, businesses derive the following simplistic LTV equation.
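Written out with the symbols just defined, and assuming the three expectations combine as profit per year times lifetime in years minus acquisition cost (a reconstruction from the definitions, not a quoted formula), the simplistic estimate reads:

```latex
\mathrm{LTV} = E[p] \cdot E[l] - E[c]
```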

Note that E[X] is the expected value of a variable X, which is essentially the sample mean of X given enough samples. Here E[p] is the expected profit averaged over customers and over each time period (in this case, the year), i.e. yearly profit averaged over customers.

Also note, many people estimating LTV would actually omit the acquisition costs in this formula. We will leave it in for now, due to some nuances in the formulations later, but it can be mentally omitted from the following equations if you like. Additionally, many businesses would focus on the revenue generated from customers instead of profit. Again, a mental replacement can be done here.

Unfortunately, expected profit, lifetime, and acquisition cost are not very precise or independent levers. For example, it is quite easy to say "let's reduce our acquisition cost", but it is more difficult to act on that without reducing the number of customers you actually acquire.

Retention vs. Acquisition

Rule of thumb: retention is cheaper than acquisition. Is this true?

Compounding Lifetime Value

In the early stages of a business, estimating the expected lifetime of a customer is understandably inaccurate. One proxy for expected lifetime is the retention rate r (the churn rate is then 1 - r). Rather than ask "how long will a customer stay with us?", we can ask how many customers churn month-over-month or year-over-year.

Assuming customers stick around for about n years, our updated LTV equation is then
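One form consistent with the definitions above, assuming a customer is still active in year t with probability r^t (year 0 being the acquisition year) and keeping the acquisition cost term, is:

```latex
\mathrm{LTV} = \sum_{t=0}^{n-1} E[p] \, r^{t} - E[c]
```

Indexing from t = 0 is an assumption; starting at t = 1 would simply scale every yearly term by one extra factor of r. For r < 1, the sum approaches E[p] / (1 - r) as n grows.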

Expected profit can also be thrown off, as many customers will not yet have completed their lifetime spending, and annual spending itself can vary over time. There are several spending curve patterns that customers may follow, including flat, sub-linear, linear, and super-linear.

So, taking profit as a function of year, and assuming customers stick around for up to n years, our second LTV equation is
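Keeping the retention weighting r^t from the retention-based estimate (an assumption on our part) and writing p(t) for profit in year t, a consistent form is:

```latex
\mathrm{LTV} = \sum_{t=0}^{n-1} E[p(t)] \, r^{t} - E[c]
```

A flat spending curve corresponds to constant p(t), which recovers the retention-only estimate.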

Now, we can inject our first lever into our LTV equation. Let's assume that retention r actually varies with how much we spend, s, on retention in some way.

Our new equation is
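Replacing the constant r with a function r(s) of retention spend, and otherwise assuming the same yearly-profit sum as before, one way to write this is:

```latex
\mathrm{LTV}(s) = \sum_{t=0}^{n-1} E[p(t)] \, r(s)^{t} - E[c]
```

Whether s is also charged against LTV as a cost is not specified in the text, so it is left out of this sketch.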

In this breakdown, understanding churn is the most important part of our equation. Our most important levers are then retention as a function of spend r(s) and spend itself.
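To make the two levers concrete, here is a minimal Python sketch of LTV with a yearly profit curve and spend-dependent retention. The saturating shape of `retention_from_spend` and all of its parameters are hypothetical, chosen only to illustrate diminishing returns; a real r(s) would have to be estimated from data.

```python
import math


def retention_from_spend(spend, base=0.6, ceiling=0.9, rate=0.05):
    """Hypothetical r(s): rises from `base` toward `ceiling` with diminishing returns."""
    return ceiling - (ceiling - base) * math.exp(-rate * spend)


def ltv_with_spend(yearly_profit, spend, acquisition_cost=0.0):
    """Sum of p(t) * r(s)**t over the years in `yearly_profit`, minus acquisition cost.

    Retention spend itself is NOT subtracted here; that is a modelling choice.
    """
    r = retention_from_spend(spend)
    return sum(p * r**t for t, p in enumerate(yearly_profit)) - acquisition_cost


# Illustrative spending curves over five years (made-up numbers).
flat = [100.0] * 5
linear = [100.0, 120.0, 140.0, 160.0, 180.0]

for spend in (0.0, 10.0, 20.0, 40.0):
    r = retention_from_spend(spend)
    print(
        f"spend={spend:5.1f}  r={r:.3f}  "
        f"flat LTV={ltv_with_spend(flat, spend):7.2f}  "
        f"linear LTV={ltv_with_spend(linear, spend):7.2f}"
    )
```

Under these assumptions each extra unit of spend buys less additional retention, so the gain in LTV flattens out as s grows.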

One interesting examination is how to balance churn and annual profitability of a customer. To do this, we need to explore the value of

The idea in the above equation is basically this. Left part: I make my customer more profitable and see how that propagates over the years. Right part: I improve my retention rate and see how the static yearly profit propagates. If I can improve retention by 10% or improve profit by 10% for equal cost, it generally follows that I should improve retention, since a retention gain compounds across years while a profit gain scales LTV linearly. However, if ε, the percentage change, is not the same for both terms, e.g. if it is a lot easier or cheaper to improve profitability, it can make sense to improve profitability or return instead.
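The trade-off described above can be checked numerically. The sketch below assumes the geometric-retention form of LTV (yearly profit times r^t, summed over t = 0..n-1) and applies the same relative change ε to either profit or retention; the input numbers are illustrative only.

```python
def ltv(profit, retention, years, acquisition_cost=0.0):
    """Geometric-retention LTV: sum of profit * retention**t for t = 0..years-1."""
    return sum(profit * retention**t for t in range(years)) - acquisition_cost


def uplift_comparison(profit, retention, years, eps):
    """Return (gain from raising profit by eps, gain from raising retention by eps)."""
    base = ltv(profit, retention, years)
    profit_gain = ltv(profit * (1 + eps), retention, years) - base
    # Retention is a probability, so cap the improved value at 1.0.
    improved_retention = min(retention * (1 + eps), 1.0)
    retention_gain = ltv(profit, improved_retention, years) - base
    return profit_gain, retention_gain


# Same 10% relative change, applied at high and at low baseline retention.
high = uplift_comparison(profit=100.0, retention=0.8, years=10, eps=0.10)
low = uplift_comparison(profit=100.0, retention=0.3, years=10, eps=0.10)

print(f"r=0.8: profit gain={high[0]:.2f}, retention gain={high[1]:.2f}")
print(f"r=0.3: profit gain={low[0]:.2f}, retention gain={low[1]:.2f}")
```

Under this model the retention lever wins when baseline retention is high and the horizon is long, while at low baseline retention the same relative change in profit is worth more. To first order over a long horizon, the crossover sits at roughly r = 0.5.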

Actionable LTV

Now what? Focus on actionable LTV.

Decomposition of LTV into meaningful and statistically driven levers is so important for a business.

One such route to gaming LTV is to understand that LTV can vary quite significantly between different customer segments.

We can compute the LTV for each of these segments and then take the expectation, weighting each segment's LTV by the probability P(segment) that a customer falls in that segment. This looks like the following expectation.
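Indexing the segments by k (our notation), the segment-weighted expectation can be written as:

```latex
E[\mathrm{LTV}] = \sum_{k} P(\mathrm{segment}_k) \cdot \mathrm{LTV}_k
```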

Assuming the number of customers and the LTV for each segment remain constant, the best way to game this equation is to maximize the probability P(segment) for the customer segment with the highest LTV. This is perhaps too strong an assumption, but we will make do with it for now. Under this model, to improve the LTV of customers overall, one should (1) segment customers in such a way that LTV is non-uniform, and then (2) acquire proportionally more customers in the highest-LTV segment. It sounds simple, but it is not trivial to segment customers in (1) so that acquisition in (2) is feasible or actionable.
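As a small worked example of this argument, the Python sketch below computes the segment-weighted expectation for two acquisition mixes. The segment names, LTV values, and probabilities are made up for illustration, and per-segment LTV is held fixed as assumed above.

```python
def expected_ltv(segment_probs, segment_ltv):
    """E[LTV] = sum over segments of P(segment) * LTV(segment)."""
    total = sum(segment_probs.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError(f"segment probabilities must sum to 1, got {total}")
    return sum(p * segment_ltv[name] for name, p in segment_probs.items())


# Hypothetical per-segment LTVs, held constant across both scenarios.
segment_ltv = {"casual": 120.0, "regular": 300.0, "power": 900.0}

current_mix = {"casual": 0.6, "regular": 0.3, "power": 0.1}
# Shift 10 points of acquisition from the lowest- to the highest-LTV segment.
shifted_mix = {"casual": 0.5, "regular": 0.3, "power": 0.2}

print(f"current mix: {expected_ltv(current_mix, segment_ltv):.2f}")
print(f"shifted mix: {expected_ltv(shifted_mix, segment_ltv):.2f}")
```

The overall expectation rises purely because the mix changed, which is why step (2) only pays off if the segmentation in step (1) separates customers with genuinely different LTV.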

Acquisition Channel contribution to LTV

One interesting observation is that all acquisition channels are in fact their own natural segmentation of the population. Therefore, if customers from some acquisition channels have greater LTV than others, then it might make sense to invest proportionally more into those acquisition channels.

One caveat of the above, however, is that we have no explanation for why a customer from a given acquisition channel is more valuable than another, or why one acquisition channel is better than another. Without extracting features that help us understand our customers, LTV based on acquisition channels can essentially be a black-box system for the business's growth.

Let us address this by building a causal inference model for the effect of acquisition channel on LTV. We propose a linear regression model with an indicator variable 1(channel) for acquisition channel and a set of covariates xi.
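One standard way to write such a model is shown below. The intercept β_0, the covariate coefficients β_i, and the noise term η are notation we introduce here; only α, 1(channel), and xi come from the text.

```latex
\mathrm{LTV} = \beta_0 + \alpha \cdot \mathbb{1}(\mathrm{channel}) + \sum_{i} \beta_i x_i + \eta
```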

Now, we expect LTV across the entire customer population to be non-linear. So, to address this, we assume some segmentation of our model based on the covariates (xi). This can be done via a clustering algorithm like k-means, a decision tree, or other learning algorithms.

Now, within each segment, we perform a set of regressions, swapping in acquisition channels, to produce a coefficient α for each channel that linearly approximates the otherwise unexplained contribution of that acquisition channel to lifetime value.

Why might this be valuable? Well, intuitively, if in a given segment αi > αj and model i has a lower or similar error compared to model j, then we may want to invest more heavily in acquisition channel i. Additionally, we can get a rough sense of the relative profitability of investing in channel i compared to channel j, after adjusting for the number of acquired customers by channel.

Additional technical concerns here include cross-validation, sample size, and spurious correlation, which should be addressed systematically.
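The per-channel regressions can be sketched for a single segment as follows. This is a minimal illustration on synthetic data using ordinary least squares via NumPy, not a specific implementation from this post; the channel offsets, covariate weights, and noise level are all invented, and the segmentation step (k-means, decision tree, etc.) is assumed to have happened already.

```python
import numpy as np


def channel_alpha(ltv_obs, channels, covariates, channel):
    """Fit ltv ~ b0 + alpha * 1(channel) + covariates @ beta by least squares.

    Returns (alpha, rmse) for one channel, using all customers in the segment.
    """
    indicator = (channels == channel).astype(float)
    design = np.column_stack([np.ones(len(ltv_obs)), indicator, covariates])
    coef, *_ = np.linalg.lstsq(design, ltv_obs, rcond=None)
    residuals = ltv_obs - design @ coef
    rmse = float(np.sqrt(np.mean(residuals**2)))
    return float(coef[1]), rmse


# Synthetic customers from one segment (all numbers are illustrative).
rng = np.random.default_rng(0)
n = 600
channels = rng.integers(0, 3, size=n)          # three hypothetical channels: 0, 1, 2
covariates = rng.normal(size=(n, 2))           # two hypothetical customer features
true_offset = np.array([0.0, 10.0, 30.0])      # channel 2 is built to be the best
ltv_obs = (
    50.0
    + true_offset[channels]
    + covariates @ np.array([2.0, -1.0])
    + rng.normal(scale=5.0, size=n)
)

# One regression per channel, swapping in that channel's indicator.
results = {c: channel_alpha(ltv_obs, channels, covariates, c) for c in range(3)}
for c, (alpha, rmse) in results.items():
    print(f"channel {c}: alpha={alpha:7.2f}  rmse={rmse:5.2f}")
```

Because each indicator is one-versus-rest, α here measures how a channel's customers compare with everyone else in the segment after adjusting for the covariates, so it is the ordering of the α values, read alongside the errors, that guides where to invest.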

If you are a fellow curious mind, please reach out or check out our other articles on Bayesian MMM and time series modelling.


Technical cofounder of OutPoint: looking to help bridge the gap between research and production.