Modeling Customer Churn With Survival Analysis

Jan 23, 2019

All code related to the article below can be found here.

‘Customer Churn’ is the loss of clients or customers. In order to avoid losing customers, a company needs to examine why its customers have left in the past.

How can we use data science to better understand customer churn?

The Tool: Survival Analysis

To do so, we’re going to borrow a tool from an unlikely place: survival analysis. Survival analysis was first developed by actuaries and medical professionals to predict survival rates.

Survival analysis works well in situations where we can define:

- A ‘Birth’ event: for our application, this will be a customer entering a contract with a company
- A ‘Death’ event: for us, ‘death’ is a customer ending a relationship with a company

What sets survival analysis apart from ordinary regression models is its ability to handle censoring in the data.

In the traditional setting, an observation is censored when we lose track of an individual, or when the individual has not died by the end of the observation period. The data is ‘censored’ because everyone dies eventually; we are simply missing the time of death.

Similarly, we would expect to lose all customers eventually. Just because we haven’t observed a customer canceling their contract doesn’t mean they never will.
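To make censoring concrete, here is a minimal sketch of how raw contract records might be turned into the two values survival analysis needs: a duration and a flag saying whether the ‘death’ was actually observed. The records, the cutoff date, and the helper names are all hypothetical and are not taken from the telecom data used later.

```python
from datetime import date

# Hypothetical records: (customer_id, contract_start, contract_end).
# contract_end is None when the customer is still active.
records = [
    ("A", date(2017, 1, 1), date(2017, 7, 1)),  # canceled after 6 months
    ("B", date(2017, 3, 1), None),              # still a customer at the cutoff
    ("C", date(2018, 6, 1), date(2018, 9, 1)),  # canceled after 3 months
]

# Assumed date on which the data was pulled.
OBSERVATION_END = date(2019, 1, 1)


def months_between(start, end):
    """Whole calendar months from start to end."""
    return (end.year - start.year) * 12 + (end.month - start.month)


def to_survival_row(start, end, cutoff):
    """Return (duration_in_months, event_observed) for one customer."""
    observed = end is not None and end <= cutoff
    # For censored customers we only know they survived up to the cutoff.
    last_seen = end if observed else cutoff
    return months_between(start, last_seen), observed


rows = {cid: to_survival_row(start, end, OBSERVATION_END)
        for cid, start, end in records}
```

Customer B is the censored case: we record the 22 months we did observe, and mark the event as not observed rather than dropping the customer or pretending they churned.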

The Problem: Customer Churn in Telecom

Treselle Systems, a data consulting service, analyzed customer churn data using logistic regression.

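This approach works for a binary classification of whether or not a customer has left, but survival analysis is more appropriate here because it also uses how long each customer stayed and accounts for customers who have not left yet.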

The data can be found here.

Our goal is to identify ways for the telecom company to reduce customer churn.

The Analysis: Lifelines Library in Python

For our analysis, we will use the lifelines library in Python. Our first step will be to install and import the library, along with some of the classics.

!pip install lifelines
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import lifelines

Next, we will import the data and perform some basic cleaning.

For each customer, we will need two important data points for survival analysis:

- ‘Tenure’: how long they had been a customer (in months) when the data was observed
- ‘Churn’: whether or not the customer left when the data was observed

We will first identify these features and ensure the data type is correct. Note that many customers in our data have not left yet; these are the censored observations.

churn_data = pd.read_csv(
    'https://raw.githubusercontent.com/treselle-systems/'
    'customer_churn_analysis/master/WA_Fn-UseC_-Telco-Customer-Churn.csv')
# transform tenure and churn features
churn_data['tenure'] = churn_data['tenure'].astype(float)
churn_data['Churn'] = churn_data['Churn'] == 'Yes'
churn_data.head()

Before going into any further analysis, let’s look at the survival rate for the average customer using a Kaplan-Meier survival curve.

Using the code below, we can fit a KM survival curve to the customer churn data, and plot our survival curve with a confidence interval.

The survival curve is cumulative: in the graph below, the chance that a customer has not canceled service after 20 months is just above 80%.

# fitting kmf to churn data
t = churn_data['tenure'].values
churn = churn_data['Churn'].values
kmf = lifelines.KaplanMeierFitter()
kmf.fit(t, event_observed=churn, label='Estimate for Average Customer')
# plotting kmf curve
fig, ax = plt.subplots(figsize=(10,7))
kmf.plot(ax=ax)
ax.set_title('Kaplan-Meier Survival Curve: All Customers')
ax.set_xlabel('Customer Tenure (Months)')
ax.set_ylabel('Customer Survival Chance (%)')
plt.show()

The above should give us some basic intuition about the customers.

As we would expect for telecom, churn is relatively low. Even after 72 months, the company retains 60% or more of its customers.
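To see why the curve is cumulative, it can help to compute a Kaplan-Meier estimate by hand. The sketch below is a dependency-free illustration of the standard product-limit formula, not the lifelines implementation; the function name and the toy data are made up for the example.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return [(t, S(t))] at every time with at least one observed event.

    S(t) is the running product of (1 - d_i / n_i), where d_i is the number
    of events at time t_i and n_i is the number of customers still at risk.
    """
    deaths = Counter(t for t, e in zip(durations, events) if e)
    exits = Counter(durations)  # events and censored customers both leave the risk set
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Toy data: tenure in months and whether the customer churned (True)
# or was still active when observed (False, i.e. censored).
durations = [1, 2, 2, 3, 4, 5]
events = [True, True, False, True, False, False]

curve = kaplan_meier(durations, events)
for t, s in curve:
    print(f"month {t}: survival = {s:.3f}")
```

Censored customers never pull the curve down directly. They only shrink the at-risk count, which is how the estimator uses their partial information.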

To examine the effects of different features, we will use the Cox Proportional Hazards Model. We can think of this as a Survival Regression model.

The ‘hazard’ is the risk of the death event (here, cancellation) occurring at a given moment, and each feature in the model can raise or lower it. In our business problem, for example, the type of contract a customer has may affect the hazard. Customers with multi-year contracts probably cancel less frequently than those with month-to-month contracts.

One restriction is that the model assumes the ratio of hazards between groups stays constant over time. The lifelines library offers a built-in check_assumptions method on the CoxPHFitter object to test this.
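The constant-ratio restriction follows from the form of the model. In the usual textbook notation (the symbols here are not from the original analysis), the hazard for a customer with features x is a shared baseline hazard scaled by an exponential term, so the ratio between any two customers does not depend on time:

```latex
h(t \mid x) = h_0(t)\,\exp\!\left(\beta_1 x_1 + \dots + \beta_p x_p\right)

\frac{h(t \mid x_a)}{h(t \mid x_b)}
  = \exp\!\left(\sum_{j=1}^{p} \beta_j \,(x_{a,j} - x_{b,j})\right)
```

The baseline hazard h_0(t) cancels in the ratio, which is why the model cannot capture an effect that grows or shrinks over a customer's tenure.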

After some data cleaning, including encoding categorical variables (k-1 dummies), we can fit a survival regression model to the data.
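The k-1 dummy encoding mentioned above can be sketched without any libraries. The helper below is hypothetical and only illustrates the idea; in practice `pd.get_dummies(..., drop_first=True)` does the same job. The three sample rows are invented, and the column name and levels are assumptions about how the contract feature is labeled.

```python
def one_hot_drop_first(rows, column):
    """Replace `column` with k-1 indicator columns, dropping the first level.

    The dropped level becomes the baseline that the other levels are
    compared against in the regression.
    """
    levels = sorted({row[column] for row in rows})
    baseline, kept = levels[0], levels[1:]
    encoded = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k != column}
        for level in kept:
            new_row[f"{column}_{level}"] = int(row[column] == level)
        encoded.append(new_row)
    return encoded, baseline


# Invented sample rows; column name and levels are assumptions.
customers = [
    {"tenure": 1.0, "Churn": True, "Contract": "Month-to-month"},
    {"tenure": 34.0, "Churn": False, "Contract": "One year"},
    {"tenure": 70.0, "Churn": False, "Contract": "Two year"},
]

encoded, baseline = one_hot_drop_first(customers, "Contract")
print("baseline level:", baseline)
for row in encoded:
    print(row)
```

Dropping one level avoids perfectly collinear columns, and it means every coefficient for that feature is read relative to the dropped baseline.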

cph = lifelines.CoxPHFitter()
cph.fit(churn_hazard, duration_col='tenure', event_col='Churn', show_progress=False)
cph.print_summary()

In the above regression, the key output is exp(coef), the hazard ratio. It is interpreted as the multiplicative change in hazard for each additional unit of the variable, with 1.00 being neutral.

For example, the last exp(coef), corresponding to PaymentMethod_Mailed check, means that a customer who pays by mailing a check is 1.68 times as likely to cancel their service at any given time, holding the other variables fixed.

For the company, an exp(coef) below 1.0 is good: it means the customer is less likely to cancel.
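Because the model is multiplicative, hazard ratios for several features combine by multiplication. The sketch below uses two ratios quoted in this article (1.68 for mailed checks and 0.25 for a one-year contract). The column name `Contract_One year` and the helper function are assumptions made for illustration; nothing here is read from the fitted model.

```python
import math

# Hazard ratios, exp(coef), as quoted in the article.
# "Contract_One year" is an assumed column name.
hazard_ratios = {
    "PaymentMethod_Mailed check": 1.68,
    "Contract_One year": 0.25,
}

# The underlying coefficients are the natural logs of the ratios.
coefs = {name: math.log(hr) for name, hr in hazard_ratios.items()}


def relative_hazard(features, coefs):
    """exp(sum(coef * x)): hazard relative to a baseline customer."""
    return math.exp(sum(coefs[name] * value for name, value in features.items()))


mailed_check_only = relative_hazard(
    {"PaymentMethod_Mailed check": 1, "Contract_One year": 0}, coefs)
mailed_check_one_year = relative_hazard(
    {"PaymentMethod_Mailed check": 1, "Contract_One year": 1}, coefs)

print(f"mailed check, no contract: {mailed_check_only:.2f}x baseline hazard")
print(f"mailed check, one-year contract: {mailed_check_one_year:.2f}x baseline hazard")
```

Under these assumptions, a one-year contract more than offsets the extra risk associated with paying by mailed check (1.68 × 0.25 = 0.42).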

To better visualize the above, we can plot the coefficient outputs and their confidence intervals.

# plotting coefficients
fig_coef, ax_coef = plt.subplots(figsize=(12,7))
ax_coef.set_title('Survival Regression: Coefficients and Confidence Intervals')
cph.plot(ax=ax_coef);

Visualizing Coefficient Confidence Intervals

The Conclusion

How can our telecom company reduce customer churn?

We can make recommendations along three dimensions: contract specification, customer selection, and payment systems.

To visualize some of our findings, we will fit a separate Kaplan-Meier curve for each customer category and plot them together, allowing us to see the difference in churn rate between categories.

Contract Specification

The most important feature, by far, is the presence of a one- or two-year contract. Compared with customers who are not under contract, customers on a one-year contract are 0.25 times as likely to cancel their service, and customers on a two-year contract are 0.02 times as likely.

Cancellation fees are a possible underlying cause. As long as these fees do not prohibit new sales, we would recommend continuing to put them into as many contracts as possible.

Kaplan-Meier Curves Segmented by Contract Type

Customer Selection

Customers with a partner or with dependents are 0.82 and 0.91 times as likely to cancel, respectively, as customers without them. Families and other large households seem to be less likely to change providers.

This could be due to higher incomes, less time to consider options, or another combination of factors.

Kaplan-Meier Curves Segmented by Dependents

Payment Systems

There is a reason companies now default to opting employees into 401(k) plans. It takes effort for people to make a change, even if it is beneficial.

Make sure your customer’s default is an automatic payment made monthly. This requires little effort from the customer to remain subscribed.

Conversely, sending a check, in the mail or electronically, is a pain. It requires effort to remain subscribed.

Kaplan-Meier Survival Curve Segmented by Payment Method

That’s all for now! Please let me know if you have comments, questions, or other topics you would like covered.