Product feature retention analysis — MCC coefficient

Paul Levchuk
2 min read · Nov 23, 2022


To make a reliable assessment of how product features impact retention we need to consider:

  • the popularity of the product feature (how many users use it)
  • the retention of users who use the product feature
  • the retention of users who do not use it, as the baseline to adjust for

One possible solution is to use Information Gain.

The main drawback of Information Gain is that it does not distinguish the direction of impact: positive or negative. That’s because Information Gain is never negative: it measures how much knowing about feature usage reduces uncertainty about retention, not which way retention moves.
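To make the point concrete, here is a minimal sketch of Information Gain for a binary "used feature" split against a binary "retained" outcome. The function names and the user counts are hypothetical, chosen only for illustration; they are not taken from the dataset in this article. The two scenarios are mirror images of each other (feature users retain worse, then better), and Information Gain cannot tell them apart.

```python
import math


def entropy(p):
    """Binary entropy in bits of an outcome with probability p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def information_gain(used_ret, used_churn, not_used_ret, not_used_churn):
    """Information Gain of the 'used feature' split with respect to retention."""
    n_used = used_ret + used_churn
    n_not = not_used_ret + not_used_churn
    n = n_used + n_not
    # Entropy of retention before splitting on feature usage
    h_parent = entropy((used_ret + not_used_ret) / n)
    # Weighted entropy of retention after splitting on feature usage
    h_children = (
        (n_used / n) * entropy(used_ret / n_used)
        + (n_not / n) * entropy(not_used_ret / n_not)
    )
    return h_parent - h_children


# Hypothetical counts: feature users retain worse (10% vs 30%)
ig_negative = information_gain(100, 900, 300, 700)
# Mirror image: feature users retain better (30% vs 10%)
ig_positive = information_gain(300, 700, 100, 900)

print(f"{ig_negative:.4f} {ig_positive:.4f}")
```

Both calls return the same positive value, even though the feature hurts retention in the first scenario and helps it in the second.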

Let me show you an example with product feature18.

Information Gain for product feature18. Image by the author.

feature18 has the 3rd-highest Information Gain (0.0140), but if we look carefully we will see:

  • users who used feature18 have user retention = 7.9%
  • users who didn’t use feature18 have user retention = 20.7%

The data tells us that using this product feature is associated with much lower retention: its impact is negative, which Information Gain alone does not reveal.

From a Machine Learning perspective that’s completely OK, because a feature is informative whichever way it moves the outcome. From a product analytics perspective it’s not: we need to know the direction.

To overcome this issue, I recommend using the MCC coefficient.

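The MCC (Matthews correlation coefficient) is a correlation coefficient for two binary variables, here "used the feature" and "returned". It ranges from -1 to +1, so its sign gives the direction of the relationship. There are several variants of how to calculate it, but I prefer to use this one: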

MCC coefficient formula. Image by the author.
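For reference, one widely used form of MCC is written in terms of a 2x2 table of feature usage versus retention. The cell names are labels assumed here for illustration and may not match the notation in the image: TP = used the feature and returned, FP = used the feature and did not return, FN = did not use the feature and returned, TN = did not use the feature and did not return.

```latex
\mathrm{MCC} = \frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP + FP)\,(TP + FN)\,(TN + FP)\,(TN + FN)}}
```

The numerator carries the sign: it is positive when feature users return more often than non-users and negative when they return less often.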

Let’s calculate the MCC coefficient for the product feature list and visualize it.
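A minimal sketch of this calculation, assuming each user is a record with the set of features they used and a flag for whether they returned. The record layout, the feature names (`setup`, `export`), and the toy data are hypothetical, and the formula is the standard confusion-matrix form of MCC, which may differ in notation from the variant pictured above.

```python
import math


def mcc(tp, fp, fn, tn):
    """Matthews correlation coefficient from 2x2 counts (0.0 if a margin is empty)."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


def feature_mcc(users, feature):
    """MCC between 'used this feature' and 'returned' over a list of user records."""
    tp = fp = fn = tn = 0
    for user in users:
        used = feature in user["features"]
        if used and user["returned"]:
            tp += 1  # used the feature and returned
        elif used:
            fp += 1  # used the feature and did not return
        elif user["returned"]:
            fn += 1  # did not use the feature and returned
        else:
            tn += 1  # did not use the feature and did not return
    return mcc(tp, fp, fn, tn)


# Hypothetical toy data, for illustration only
users = [
    {"features": {"setup"}, "returned": False},
    {"features": {"setup"}, "returned": False},
    {"features": {"setup", "export"}, "returned": True},
    {"features": {"export"}, "returned": True},
    {"features": set(), "returned": True},
    {"features": set(), "returned": False},
]

scores = {name: feature_mcc(users, name) for name in ("setup", "export")}

# Print features from most negative to most positive MCC
for name, score in sorted(scores.items(), key=lambda kv: kv[1]):
    print(f"{name}: {score:+.3f}")
```

In this toy data `setup` gets a negative score and `export` a positive one, so the sorted list separates features associated with lower retention from those associated with higher retention.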

Probably the most useful thing we can do is to plot the MCC coefficient against Information Gain in one chart.

MCC coefficient vs Information Gain. Image by the author.

There is one important insight here:

Almost all popular product features have a negative impact on retention (see below).

As a rule, the most popular product features are `setup` features.

These product features appear at the top of the funnel, where user intent is still low, so the retention of the users who touch them is also low.

A list of product features with calculated MCC coefficient. Image by the author.

Bonus fact:

If we look carefully we can spot that the MCC coefficient is negative when the metric [% returned users prd] is lower than the weighted average.
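This follows from the standard confusion-matrix form of MCC, under the assumption that the weighted average is the overall retention rate, that is, the user-count-weighted average of the two groups. The symbols are introduced here for illustration: n is the total number of users, u the number who used the feature, r_used and r_not the retention rates of users and non-users, and TP, FP, FN, TN the counts of users who used and returned, used and did not return, did not use and returned, and did not use and did not return.

```latex
\begin{aligned}
TP \cdot TN - FP \cdot FN
  &= u\,(n-u)\,\bigl[\,r_{\text{used}}\,(1 - r_{\text{not}}) - (1 - r_{\text{used}})\,r_{\text{not}}\,\bigr]
   = u\,(n-u)\,(r_{\text{used}} - r_{\text{not}}) \\
\bar{r} &= \frac{u\,r_{\text{used}} + (n-u)\,r_{\text{not}}}{n} \\
r_{\text{used}} - \bar{r} &= \frac{n-u}{n}\,(r_{\text{used}} - r_{\text{not}})
\end{aligned}
```

The first line is the numerator of MCC, which sets its sign because the denominator is positive. As long as the feature has both users and non-users, that numerator and the gap to the weighted average have the same sign as r_used - r_not. For feature18, 7.9% is below 20.7%, so its retention is below the weighted average and its MCC is negative.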
