Diligence at Social Capital Part 4: Cohorts and (engagement) LTV
In the first two parts of this series we discussed growth accounting and applied it to both usage and revenue. In the third part we discussed lifetime value (LTV) in businesses that generate revenue. In this post we will take the LTV approach from the last post and apply it to the engagement and retention aspects of a business.
In the revenue LTV case, the main lesson was to observe actual monetization over the lifetimes of existing users and to track the trends of those LTVs. For businesses that intend to grow users first and monetize them later, the focus shifts from monetization to retention over long time periods. Imagine a consumer app that delivers a social/content-style experience. Analysis usually starts with a retention graph like the following:
The parenthetical numbers are the sizes of the cohorts. For this app, weekly retention four weeks in ranges from roughly 15% to 40%. One of the cohorts (2014–03–03, in green) exhibits particularly poor retention, and there are some odd dips in some of the cohorts as retention continues to decline over time. This is a useful view of the data because it shows us the shape of the retention falloff. As with many views of this type, it gets hard to read with too many lines, so I only included roughly one weekly cohort per month. It's also hard to see trends in the falloff in this view, so it's often useful to look at the same data as a retention heatmap.
This says, for instance, that the cohort of customers who started on 2014–02–03 had 144 new users and was 41% weekly retained four weeks after their first visit. The oval at the top of the figure shows several cohorts all decaying at roughly the same rate. The second oval shows three cohorts that are doing worse than the others. Looking at the cohort sizes on the left axis, you can see that this coincided with a big spike in user acquisition. One of these cohorts was visible as the green line in the retention curves above. This is often what happens when a company significantly increases paid acquisition: it ends up acquiring users with a lower propensity to engage, who thus retain much worse than other cohorts. This company stopped its paid acquisition experiments after a couple of weeks, and subsequent cohorts went back to the same retention pattern as before the bump.

As time passes, data accrues on the diagonal edge, so events that occur on a fixed calendar date appear as diagonal features (see arrow). This one was probably due to some outage in the app during the week of 6/2/2014, which affected all cohorts (at a different age for each cohort). This fixed-calendar-date effect was the cause of the strange dips in the original retention curves. Luckily, after the outage, retention continued apace without any apparent long-term negative effects. Over multi-year timespans, seasonal effects (Christmas, summer break, Easter, etc.) appear as repeating diagonal features.
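For concreteness, a retention heatmap like this can be computed from raw activity data. Here is a minimal pandas sketch, assuming a hypothetical event log with one row per (user_id, week) of activity; the column names are my own, not from the post:

```python
import pandas as pd

def retention_table(events: pd.DataFrame) -> pd.DataFrame:
    """events: one row per (user_id, week) in which the user was active."""
    # each user's cohort is the week of their first visit
    first = events.groupby("user_id")["week"].min().rename("cohort_week")
    df = events.join(first, on="user_id")
    df["age"] = (df["week"] - df["cohort_week"]).dt.days // 7
    # distinct users active at each (cohort, age), divided by cohort size
    active = df.groupby(["cohort_week", "age"])["user_id"].nunique().unstack(fill_value=0)
    sizes = active[0]  # at age 0 the whole cohort is active, by construction
    return active.div(sizes, axis=0)  # rows: cohorts; columns: weeks since first visit
```

Each row of the result is one cohort's retention curve; plotting the table as colors gives the heatmap, and the ragged lower-right corner is the diagonal edge where data is still accruing.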
The retention curve graph above is similar to the LTV graphs described in the last post. The main difference is that the LTV view computes a cumulative figure whereas the retention curves above compute an incremental figure. In the revenue case, incremental revenue accumulates up to an LTV figure. In the usage case, incremental retention accumulates up to a cumulative activity LTV figure. In this example we are accruing weekly retention up to a weekly active user (WAU) LTV figure. For example, say that a cohort is all active in week one and 50% active in week two. In an LTV sense, the cohort has accumulated an average of 1.5 active weeks within the first two weeks since first visit, i.e. "the 2-week WAU LTV was 1.5".
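The accumulation is just a running sum of the incremental retention series. A tiny sketch of the worked example (100% active in week one, 50% in week two):

```python
# Incremental weekly retention accumulates into a cumulative WAU LTV
# via a running sum. Numbers are the worked example from the text.
weekly_retention = [1.0, 0.5]   # fraction of the cohort active each week
wau_ltv = []
total = 0.0
for r in weekly_retention:
    total += r
    wau_ltv.append(total)       # average cumulative active weeks so far
print(wau_ltv)                  # the 2-week WAU LTV is 1.5
```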
Here’s the cumulative active weeks LTV view of the sample retention data above.
This says that the 110 users who started in the first week of 2014 had an average cumulative 5.5 active weeks on the app after about six months. The outlier cohort with weak retention appears here as having a much lower WAU LTV than the other cohorts. Similar to the revenue case, there are four qualitatively different shapes of activity LTV curves.
- Flat: No incremental visitation past a certain date.
- Sub-linear: Non-zero but decreasing visitation. This is what we’re seeing in the example. All users appear to be gradually losing interest.
- Linear: Consistent retention through the lifetime of the cohort. A core of users is consistently using the product indefinitely. The slope is determined by how much of the cohort is made up of core users vs. non-core users.
- Super-linear: Increasing retention as cohorts age. The propensity to use the product goes up as the user ages.
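These four shapes are easy to see by accumulating a few illustrative incremental retention series (the numbers below are invented for illustration, not taken from the sample data):

```python
from itertools import accumulate

# Illustrative incremental weekly retention series and the shape of the
# cumulative activity LTV curve each one produces:
flat      = [1.0, 0.4, 0.0, 0.0, 0.0]   # no visits past week 2 -> curve flattens
sublinear = [1.0, 0.5, 0.4, 0.3, 0.2]   # decaying retention -> curve bends down
linear    = [1.0, 0.3, 0.3, 0.3, 0.3]   # a steady core -> constant slope
superlin  = [1.0, 0.3, 0.35, 0.4, 0.45] # rising retention -> slope increases

for name, series in [("flat", flat), ("sub-linear", sublinear),
                     ("linear", linear), ("super-linear", superlin)]:
    print(name, list(accumulate(series)))
```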
Since this quantity is non-decreasing, we can view its trends just like we did for revenue-based LTV.
As before, the bars show the cohort size for each week. Look at the 2014–04–07 cohort, which consisted of ~205 users. In their first week they had a WAU LTV of 1 (i.e. all users were active in their first week of visitation). In the first month after first visit this cohort averaged 2.7 total active weeks. This cohort’s 12-week WAU LTV was just observed to be ~4.5. We don’t yet know the 12-week WAU LTV of later cohorts because they are not yet old enough. The effect of the paid acquisition experiments is clear in this view. The 2014–03–03 and 2014–03–10 cohorts were unusually large, and their WAU LTVs suffered accordingly.
Needless to say, for engagement-based consumer businesses, we are interested in situations with high retention that either holds steady or improves as cohorts age and larger cohorts of new users are acquired. This is equivalent to saying that we want linear or super-linear cumulative activity LTVs (whether DAU, WAU, or MAU).
As before, we can look at the WAU LTV trends in a heatmap. In contrast to the retention heatmap above, this one shows cumulative active weeks rather than weekly retention.
This says, for instance, that after four weeks cohorts are averaging ~2.3 total active weeks, except for the cohorts that were unusually large and accumulated fewer active weeks. The colors show the cumulative WAU attained by each cohort at a specified age. The two heatmap views are complementary. The retention heatmap shown earlier is useful because it gives you a sense of what ongoing retention looks like once users have aged a bit. Its shortcoming is that it makes small differences between cohorts hard to detect. For instance, after a few months, one cohort might have weekly retention in the 9–10% range and another cohort might be more in the 8.5–10.5% range. It would be hard to tell whether one is retaining better than the other because of the noise in the weekly retention figures. By comparing cumulative activity between the cohorts, we get a better sense of whether the slightly different week-to-week retention amounts to materially different total activity with the product over long timespans. The downside of the cumulative views (whether the heatmap or the trends) is that it’s harder to see small effects such as the outage that caused a short-term dip in retention. Said another way, the one week of outage didn’t cause any lasting effects, so it barely registers in cumulative activity over a cohort’s lifetime. This is a particular case of the general statement that integrating a time series of any sort smooths it out.
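The smoothing effect is easy to demonstrate with a toy example (the retention rates and noise pattern below are invented): two cohorts whose true weekly retention differs by only half a point, with week-to-week noise larger than the gap. On individual weeks the comparison flips back and forth, but the cumulative gap grows steadily:

```python
from itertools import accumulate

weeks = 52
# deterministic "noise" that alternates sign, larger than the true gap
noise = [0.01 if w % 2 == 0 else -0.01 for w in range(weeks)]
cohort_a = [0.100 + n for n in noise]   # true weekly retention 10.0%
cohort_b = [0.095 - n for n in noise]   # true weekly retention  9.5%

# On half the individual weeks, B actually looks *better* than A...
flips = sum(b > a for a, b in zip(cohort_a, cohort_b))
# ...but integrating cancels the noise and the cumulative gap keeps growing.
gap = list(accumulate(cohort_a))[-1] - list(accumulate(cohort_b))[-1]
print(flips, gap)   # B wins 26 of 52 weekly comparisons, yet A leads by ~0.26 active weeks
```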
Lifetime Value of Everything Else
The framework of cohorts and LTV, then, is useful for understanding both incremental and cumulative revenue generation, and it generalizes easily to incremental and cumulative retention (measured in active weeks/days/months). It should be clear at this point that this framework can be used to understand any sort of behavior that customers are exhibiting in your business. To give an example outside of retention and monetization, consider applying this framework to understand the dynamics of referrals. Say that you have an app that tries to get users to refer a friend. Each time a user sends a referral, it is not that dissimilar from when a user decides to spend money — in both cases they are signaling some incremental bit of product-market fit. One can analyze the cumulative and incremental referral-sending behavior as cohorts age and come up with an “N-week referral LTV”. Here’s some sample data showing referral LTV (really the same data as above but with different titles and meant to be interpreted differently).
The cohorts in this case are determined by the week in which users sent their first referral (although you could certainly create cohorts by week of first visit, registration, etc.). In this case, the 2014–04–07 cohort consisted of 200 people sending their first referral; they went on to send an average of 4.5 referrals per user within the 12 weeks after sending their first referral. There were a couple of weeks with a lot of users sending their first referral (possibly through some very aggressive referral up-selling), which caused those cohorts’ referral LTVs to suffer. And so on.
Such an approach is complementary to comparing bulk referral sending (whether before/after a product change or as a daily difference in a controlled A/B test) because it allows the difference to accumulate over a longer time period, which might be necessary if the changes are small. For instance, on a daily or weekly basis, referral sending may only change by a small amount, but such a difference would be much more noticeable when comparing, say, one-month referral LTV between two groups. The overall shape of the referral LTV curve would be useful in understanding whether referral sending happens early in the life of a customer followed by a sharp drop-off vs. being sustained through the life of the customer.
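The cumulative machinery from the retention case carries over directly. A pandas sketch with a small hypothetical referral log (the schema and numbers are my own, chosen for illustration):

```python
import pandas as pd

# hypothetical log: one row per referral sent, with the week it was sent
referrals = pd.DataFrame({
    "user_id": ["a", "a", "b", "b", "b", "c"],
    "week":    pd.to_datetime(["2014-04-07", "2014-04-14",
                               "2014-04-07", "2014-04-21", "2014-04-28",
                               "2014-04-14"]),
})

# cohort = week of each user's *first* referral
first = referrals.groupby("user_id")["week"].min().rename("cohort")
df = referrals.join(first, on="user_id")
df["age"] = (df["week"] - df["cohort"]).dt.days // 7

# referrals sent at each (cohort, age), cumulated and divided by cohort size
sent = df.groupby(["cohort", "age"]).size().unstack(fill_value=0)
sizes = df.groupby("cohort")["user_id"].nunique()
ltv = sent.cumsum(axis=1).div(sizes, axis=0)  # N-week referral LTV per cohort
print(ltv)
```

Each row is one cohort's referral LTV curve; column N is the average cumulative referrals per user N weeks after the first referral, directly analogous to the N-week WAU LTV above.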
Together with the growth accounting framework, the LTV framework described here forms a big part of our quantitative diligence process. The last area we focus on is the distribution of value (either revenue or engagement) across customers. From the visitation views alone, we can’t tell whether engagement over longer time spans comes from a small core of highly engaged users mixed with a large group of disengaged users or from a large group of users of more moderate intensity. We will explore this in the next post.
Edit: For reference, here’s the full table of contents.
Published in #SWLH (Startups, Wanderlust, and Life Hacking)