# How to measure statistical significance in retention cohorts?

When we measure the retention rate (or churn rate) of customer cohorts at TouchNote, it’s important for us to ensure that any detected differences are statistically significant, and that the comparison has sufficient statistical power. (For a quick explanation of why you have to validate statistical power as well as significance, read here.)

Luckily, there’s an easy web tool built by Evan Miller that lets you do just that. Here’s how to test for power and significance in a jiffy.

# Test for statistical power

First, you want to test that the sample size (i.e. the size of each cohort) is large enough. There’s no point testing for statistical significance if your results do not have sufficient power.

Assume your data for the two cohorts are these:

Your screen should look like this when you’re done:

This tells you that each cohort needs to contain at least 7,562 users. Luckily, we have more than that in each cohort (7,875 in week 1 and 8,181 in week 2), so we’re OK to proceed.

If, for example, you increase the statistical power to 95%, you’ll see the minimum sample size increases to 9,165 users, which is more than either of our cohorts contains.
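To see why asking for more power demands larger cohorts, here is a minimal sketch of a minimum-sample-size calculation for comparing two proportions. It uses a standard normal-approximation formula, which is an assumption on our part: the web tool may use a different variant, so the numbers need not match its output. The function name `min_cohort_size` and the retention rates passed to it are hypothetical, chosen only for illustration.

```python
from math import ceil, sqrt
from statistics import NormalDist


def min_cohort_size(p1, p2, alpha=0.05, power=0.80):
    """Approximate users needed per cohort to tell p1 from p2 (two-sided test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for significance
    z_power = NormalDist().inv_cdf(power)          # critical value for power
    p_bar = (p1 + p2) / 2                          # average of the two rates
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_power * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(numerator / (p1 - p2) ** 2)


# Hypothetical retention rates, for illustration only (not the article's inputs).
n_80 = min_cohort_size(0.58, 0.61, power=0.80)
n_95 = min_cohort_size(0.58, 0.61, power=0.95)
print(n_80, n_95)
```

Whatever rates you plug in, raising `power` from 0.80 to 0.95 increases the required cohort size, which is the same effect you see in the tool.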

We can now test whether the difference detected in the retention rate between our two cohorts is statistically significant.

# Test for statistical significance

We’ll now use another statistical test, Chi-squared (χ²), to check whether the difference in retention rate between the two cohorts is statistically significant.

Your screen should look like this when you’re done:
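If you prefer to check the result in code, the sketch below runs a Pearson Chi-squared test on a 2x2 table of retained versus churned users, without a continuity correction. That choice is an assumption and may differ from what the web tool does. The cohort sizes (7,875 and 8,181) come from this article; the retained counts (4,619 and 5,048) are hypothetical placeholders, and `chi_squared_2x2` is an illustrative name.

```python
from math import erfc, sqrt


def chi_squared_2x2(retained_a, size_a, retained_b, size_b):
    """Pearson chi-squared test (no continuity correction) on a 2x2 table."""
    observed = [
        [retained_a, size_a - retained_a],  # cohort A: retained, churned
        [retained_b, size_b - retained_b],  # cohort B: retained, churned
    ]
    total = size_a + size_b
    row_totals = [size_a, size_b]
    col_totals = [retained_a + retained_b, total - retained_a - retained_b]

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Count expected in this cell if both cohorts shared one rate.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected

    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


# Cohort sizes are from the article; retained counts are hypothetical.
chi2, p_value = chi_squared_2x2(4619, 7875, 5048, 8181)
print(f"chi2 = {chi2:.2f}, p = {p_value:.6f}")
print("significant at 99%" if p_value < 0.01 else "not significant at 99%")
```

A p-value below 0.01 corresponds to significance at the 99% confidence level.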

# Conclusion

The difference we detected between the retention rates of the two cohorts is indeed statistically significant. We can be 99% confident that the actual retention rate of the first cohort lies between 57.2% and 60.1%, and that of the second cohort between 60.3% and 63.1%. As these two ranges do not overlap, we can say with 99% confidence that the improvement we detected is real.
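The interval check can also be sketched in a few lines. The example below computes a normal-approximation (Wald) confidence interval for each cohort and tests whether the two intervals overlap. The Wald method is an assumption, since the tool's exact interval method is not stated here. The retained counts are hypothetical, back-derived from the midpoints of the intervals quoted above, so the output should land close to those ranges.

```python
from math import sqrt
from statistics import NormalDist


def retention_interval(retained, size, confidence=0.99):
    """Normal-approximation (Wald) confidence interval for a retention rate."""
    rate = retained / size
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided critical value
    margin = z * sqrt(rate * (1 - rate) / size)
    return rate - margin, rate + margin


# Cohort sizes are from the article; retained counts are hypothetical.
week_1 = retention_interval(4619, 7875)
week_2 = retention_interval(5048, 8181)

# The intervals overlap if the lower cohort's upper bound reaches
# the higher cohort's lower bound.
overlap = week_1[1] >= week_2[0]

print(f"week 1: {week_1[0]:.1%} to {week_1[1]:.1%}")
print(f"week 2: {week_2[0]:.1%} to {week_2[1]:.1%}")
print("intervals overlap" if overlap else "intervals do not overlap")
```

Non-overlapping intervals are a conservative signal: two intervals can overlap slightly while the difference is still significant, so the Chi-squared result remains the primary test.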