What is a cohort and why should I care?

Travis Giggy, published in epiclabs, Sep 1, 2020 (6 min read)

This is part one of two. The second part is here:

Don’t be average! Follow this advice for success in business analytics

A manager of a company or business line is like the head chef in a restaurant.

The chef knows all the ingredients, steps, and techniques to cook every plate on the menu.

If you’re a guest at the chef’s restaurant, it’s perfectly fine to look at a finished plate of beautiful food and think how wonderful it will taste without knowing a thing about how it was made. But the chef must know the recipe in detail — all the elements that make a great plate. If something is wrong with the plate, the chef immediately knows which ingredient is causing the problem.

Managers are like the chef of their company, yet they often look at data without understanding the recipe! They see the finished plate but they don’t know what makes it great or what is failing. Most managers look at reports and see time series of rolled-up data, e.g., a total number of “something” over time.

In this article, I’ll cover the difference between time series and cohorts, from simple to complex. If you already know the basics, stick around and prepare to become a chef!

To skip the blah blah blah and go straight to the sample cohort spreadsheet, jump to the “Sample cohort spreadsheet” section at the end of this article.

Time series of customer data like this are “flat” and they hide important information. This time series is like the finished plate, after all the results have been cooked. It tells you nothing valuable. I call this a “vanity metric” and many managers rely on flat data like this to do their job.

E.g., when looking at a report like “Active Users by Month”, a time series shows you the total number of customers who transacted with you during the month. You don’t see how many of those are new users and how many are existing users. You also don’t see what percentage of your users came from each earlier period, how well you retain users, or how fast you are growing.

A cohort triangle, on the other hand, separates out the new customers (growth) from the existing customers (retention). I like to think of a 1-D time series getting blown up, like this:

How to read a cohort triangle. First, the simple stuff

A cohort is a 2-D view into a time series:

  1. The cohort “period” is in the left column and the top row.
  2. The new customers for the period match up to the appropriate coordinate.
  3. The existing repeat customers for a cohort flow to the right for each subsequent period. In the “Apr 2018” cohort there were 4,260 new customers, with 2,449 of them transacting again in the “Jul 2018” period.
  4. The cohort’s leading edge always shows the new group of customers who came in during this time period, with the previous cohorts’ repeat customers stacked above it.
  5. Sum all the values in a column to get the aggregate at the bottom.

Boom! You just gave yourself lots of clarity on growth and retention! Simply scan the leading edge of the triangle to get growth, and across any row to see retention.
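If you want to see the recipe in code, here is a minimal sketch of how a cohort triangle could be built from a raw transaction log. The `transactions` list, the integer period indexes, and the function names are all made up for illustration; your own data will look different.

```python
from collections import defaultdict

# Hypothetical transaction log: (customer_id, period index), where
# period 0 is the first month in the report, 1 the next, and so on.
transactions = [
    ("a", 0), ("b", 0), ("c", 0),
    ("a", 1), ("b", 1), ("d", 1), ("e", 1),
    ("a", 2), ("d", 2), ("f", 2),
]

def build_cohort_triangle(transactions):
    """Return {cohort_period: {period: active_customers}}."""
    # A customer's cohort is the first period in which they transacted.
    first_seen = {}
    for customer, period in sorted(transactions, key=lambda t: t[1]):
        first_seen.setdefault(customer, period)

    # Collect distinct customers per (cohort, period) cell.
    cells = defaultdict(set)
    for customer, period in transactions:
        cells[(first_seen[customer], period)].add(customer)

    triangle = defaultdict(dict)
    for (cohort, period), customers in cells.items():
        triangle[cohort][period] = len(customers)
    return {cohort: dict(row) for cohort, row in sorted(triangle.items())}

def column_totals(triangle):
    """Sum each column to recover the flat time series."""
    totals = defaultdict(int)
    for row in triangle.values():
        for period, count in row.items():
            totals[period] += count
    return dict(sorted(totals.items()))

triangle = build_cohort_triangle(transactions)
# The leading edge is the cell where the cohort and the period match.
leading_edge = {cohort: row[cohort] for cohort, row in triangle.items()}
totals = column_totals(triangle)
```

In this toy data the column totals are the flat “Active Users by Month” series, while `leading_edge` and each row of `triangle` hold the growth and retention detail that the flat series hides.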

Now some extra power

Still keeping it simple though…

There are two ways to add value to a cohort triangle:

  1. Add more relevant information using the existing cohort data
  2. Make the cohort easier to read

I’ll build up a cohort step-by-step and show you why it’s so powerful:

Add more relevant information using the existing cohort data

  1. Add summary rows for New and Existing users
  2. The New user values will be the same as the leading edge of the triangle. They’re just repeated for ease of reading
  3. The Existing users are the sum of all the existing users from previous cohorts
  4. Calculate period-over-period growth
  5. Compare the New/Existing user ratio
  6. Calculate gross churn
  7. Calculate CGR(3) and CGR(6)
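The summary rows above can be sketched in a few lines. Treat the definitions as assumptions: here gross churn is taken as the share of last period’s active users who did not return, and CGR(n) is read as a compound growth rate per period over the last n periods. The numbers in `triangle` are invented.

```python
# Hypothetical right-aligned triangle: triangle[i][j] is the number of
# customers from cohort i who were active in period j (j >= i).
triangle = {
    0: {0: 100, 1: 60, 2: 50, 3: 45},
    1: {1: 120, 2: 80, 3: 70},
    2: {2: 150, 3: 90},
    3: {3: 200},
}
periods = sorted(triangle)

# New users: the leading edge of the triangle, repeated as a summary row.
new = {p: triangle[p][p] for p in periods}

# Existing users: everyone in the column who came from an earlier cohort.
existing = {p: sum(triangle[c][p] for c in periods if c < p) for p in periods}

total = {p: new[p] + existing[p] for p in periods}

def growth(series, p, n=1):
    """Compound growth rate per period over the last n periods (assumed CGR(n))."""
    return (series[p] / series[p - n]) ** (1 / n) - 1

# Period-over-period growth is the n=1 case.
pop_growth = {p: growth(total, p) for p in periods[1:]}

# New/Existing ratio, skipping the first period (it has no existing users).
new_to_existing = {p: new[p] / existing[p] for p in periods[1:]}

# Gross churn (assumed definition): share of last period's active users
# who did not come back this period.
gross_churn = {p: 1 - existing[p] / total[p - 1] for p in periods[1:]}

cgr3 = growth(total, 3, n=3)
```

With more history you would compute CGR(6) the same way, e.g. `growth(total, p, n=6)` for any period `p` that has six earlier periods behind it.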

Make the cohort easier to read

  1. Put the cohort size on the left with a spark chart
  2. Make the growth and retention data different colors
  3. Graph revenue growth by New/Existing over time
  4. Graph New vs. Existing customer percentage over time

Create cohorts with different metrics

Revenue, Total Number of Transactions, AOV, ARPU, LTV, and other relevant metrics give you new insights. All the same methods and rules apply.

Note: Always keep the same cohort size and spark charts in the left columns. This will make it easy for you to reference across many different metrics of cohorts and understand, for example, why the Oct 2018 cohort had much higher initial revenue than other cohorts (it was the biggest cohort).
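As a rough sketch of “same method, different metric”, the snippet below fills the same (cohort, period) cells with revenue and with ARPU instead of customer counts. The `orders` data is invented, and ARPU is assumed here to mean revenue divided by the customers active in that cell; some teams divide by the original cohort size instead.

```python
from collections import defaultdict

# Hypothetical orders: (customer_id, period index, order amount).
orders = [
    ("a", 0, 30.0), ("b", 0, 20.0),
    ("a", 1, 30.0), ("c", 1, 50.0), ("c", 1, 10.0),
    ("a", 2, 40.0), ("c", 2, 20.0),
]

# Cohort = first period in which the customer ordered.
first_seen = {}
for customer, period, _ in sorted(orders, key=lambda o: o[1]):
    first_seen.setdefault(customer, period)

revenue = defaultdict(float)   # (cohort, period) -> total revenue
buyers = defaultdict(set)      # (cohort, period) -> distinct customers
for customer, period, amount in orders:
    cell = (first_seen[customer], period)
    revenue[cell] += amount
    buyers[cell].add(customer)

# Same triangle shape, different metric in each cell.
active = {cell: len(customers) for cell, customers in buyers.items()}
arpu = {cell: revenue[cell] / active[cell] for cell in revenue}
```

Because every metric shares the same cells, the cohort-size column stays valid next to each of these triangles.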

Left-aligned cohorts are used to compare progress over time

To transform a regular cohort triangle into a left-aligned cohort, simply slide every row all the way to the left and change the header row to indicate generic periods instead of specific periods of time.

Note: A right-aligned cohort is the preferred initial view, because of the extra data which can be summed and compared across time (as demonstrated above). When you left-align a cohort, most of the summary rows from your right-aligned cohort no longer make sense, so remove them.
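The “slide every row to the left” step could look like this in code. It is only a sketch: the cohort labels and counts are invented, and it assumes every row has a value for each period from the cohort’s start onward, with no gaps.

```python
# Hypothetical right-aligned triangle keyed by calendar period labels.
right_aligned = {
    "Apr 2018": {"Apr 2018": 100, "May 2018": 60, "Jun 2018": 50},
    "May 2018": {"May 2018": 120, "Jun 2018": 80},
    "Jun 2018": {"Jun 2018": 150},
}

def left_align(triangle):
    """Slide each row left so position i means 'i periods after the cohort started'."""
    # Relies on each row's values being stored in chronological order.
    return {cohort: list(row.values()) for cohort, row in triangle.items()}

left_aligned = left_align(right_aligned)

# Generic header row that replaces the specific calendar periods.
width = max(len(row) for row in left_aligned.values())
header = [f"Period {i}" for i in range(width)]
```

After this step a column no longer means a calendar month, which is why column sums such as New and Existing totals stop making sense.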

Once you have a left-aligned cohort it is easy to apply conditional formatting in Excel or Sheets in order to visually see trends both across rows and down columns. Are cohorts getting better or worse over time?

The graph of this view is called a spaghetti graph. Each cohort is one line, so you can see trends over time.

Now turn the numbers in your left-aligned cohort into percentages. The initial cohort size is the denominator and the current period is the numerator. Just like before, apply conditional formatting and create a spaghetti graph to get clarity on cohort retention over time. Is performance getting better or worse? Why?
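A minimal sketch of the percentage step, using invented numbers: each cell is divided by the first cell of its row, which is the initial cohort size.

```python
# Hypothetical left-aligned cohort: position i is i periods after the cohort started.
left_aligned = {
    "Apr 2018": [100, 60, 50],
    "May 2018": [120, 80],
    "Jun 2018": [150],
}

def to_retention(left_aligned):
    """Divide every cell by the cohort's initial size (its first cell), as a percentage."""
    return {
        cohort: [round(100 * value / row[0], 1) for value in row]
        for cohort, row in left_aligned.items()
    }

retention = to_retention(left_aligned)

# Read down a column to compare cohorts at the same age.
period_1 = [row[1] for row in retention.values() if len(row) > 1]
```

Reading `period_1` from top to bottom answers the “better or worse?” question for one-period retention: in this toy data the newer cohort retains a larger share than the older one.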

Now you should have 3 cohort triangles for each metric (Active Customers, Revenue, LTV, etc.). One is right-aligned and two are left-aligned: one with absolute values and one with percentages.

Next up: Segmentation

You’re well on the way toward being a chef, but one critical piece is missing: Segmentation. In the next article, I cover the way to think about segmenting your customer audience. Not all customers are created equal!

Read Part II here: Don’t be average! Follow this advice for success in business analytics

Sample cohort spreadsheet

Here’s a sample spreadsheet you can use to get an idea of what a detailed cohort analysis looks like. This is a fictitious company in the “box of the month” space, growing fast, with segmentation, and all the chef-y goodness from this article.

RocBox Sample Cohort Analysis

How to get help

Cohort analysis is part of the Acquitention strategy at Epic Labs. Acquitention is our “acquisition + retention” strategy.

We’ve analyzed billions of dollars of customer revenue and helped improve companies all over the world. A simple Acquitention Heuristic can be done in days, and follow-on strategy implementation is available as well. We often increase retention by 5% or more and improve marketing conversion by 10%, and we’re willing to structure our fee based on your success.

Get in touch at https://epic.so to schedule a conversation about your company.


Hello, I am a co-founder of Epic Labs (https://epic.so). I was an early technical architect at two unicorns and a founder of multiple startups with exits.