Telecom Customer Churn Prediction

Semih Izinli
4 min read · Sep 12, 2021

What is the churn rate?
- Wikipedia states that the churn rate (also called attrition rate) measures the number of individuals or items moving out of a collective group over a specific period.
- It applies in many contexts, but the mainstream understanding of churn rate is related to the business case of customers that stop buying from you.

Importance of customer churn prediction:
- The impact of the churn rate is clear, so we need strategies to reduce it.
- Predicting churn is a good way to create proactive marketing campaigns targeted at the customers that are about to churn.
- Forecasting customer churn with the help of machine learning is possible.
- Machine learning and data analysis are powerful ways to identify and predict churn.
- Churn is one of the biggest problems in the telecom industry.
- Research has shown that the average monthly churn rate among the top 4 wireless carriers in the US is 1.9%-2%.

About the data

Telecom customer churn prediction:

This data set consists of 100 variables and approximately 100 thousand records. The variables describe customer attributes and other factors that telecom companies consider important when dealing with their customers. The target variable is churn, which indicates whether the customer churned or not. We can use this data set to predict which customers will churn and which will stay, based on the available variables.

You can find this entire project on my Kaggle.

Expected outcome

Our job is to bring the telecom customer data we were given into a form that can be fed to the model. The company expects to learn which of its current customers are likely to churn next year and which are not, so that it can plan and set its strategy accordingly.

Exploratory data analysis (EDA)

Let’s critically explore the data to discover patterns and visualize how the features interact with the label (Churn or not).

From this table we can see the features in our data, how many non-null values each one has, and their data types. 10 of the features are integers, 69 are floats and 21 are object values.

Let’s explore the target variable:

We’re trying to predict users that left the company in the previous year. It’s a binary classification problem with a balanced target.

  • number of people who stay : 48401
  • number of people who churn : 47647
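
To make "balanced" concrete, here is a quick check of the class shares using only the two counts listed above. It is plain arithmetic, not code from the project, and the variable names are my own.

```python
# Class counts quoted in the article.
stay = 48401
churn = 47647

total = stay + churn
churn_share = churn / total  # fraction of customers labelled as churned

print(f"total customers: {total}")
print(f"churn share: {churn_share:.1%}")
```

The two classes are almost the same size (churners are roughly 49.6% of the records), so a model cannot score well here just by always predicting the larger class.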

Let’s see the top 40 features with the highest correlation with Churn.

Correlation measures the linear relationship between two variables. Features that are highly correlated with each other are close to linearly dependent and carry almost the same information about the dependent variable. So, when two features are highly correlated, we can drop one of them.
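
As a rough illustration of that rule, here is a minimal sketch in plain Python. The column names, the toy values, the `drop_correlated` helper and the 0.9 threshold are all assumptions made up for this example; none of them come from the project. With pandas, `DataFrame.corr()` would give you the same coefficients.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences.

    Assumes neither sequence is constant (a constant column has zero variance).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def drop_correlated(features, threshold=0.9):
    """For each pair with |correlation| above the threshold, keep the first feature."""
    dropped = set()
    for a, b in combinations(features, 2):
        if a in dropped or b in dropped:
            continue
        if abs(pearson(features[a], features[b])) > threshold:
            dropped.add(b)
    return [name for name in features if name not in dropped]


# Hypothetical toy columns, not taken from the real data set.
toy = {
    "total_minutes": [100, 200, 300, 400, 500],
    "total_seconds": [6000, 12000, 18000, 24000, 30000],  # same information as minutes
    "months_active": [12, 3, 25, 7, 18],
}

kept = drop_correlated(toy)
print(kept)
```

Here `total_seconds` is just `total_minutes` in another unit, so it gets dropped, while `months_active` is only weakly correlated with the minutes column and stays.
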
Churn prediction is a binary classification problem, as customers either churn or are retained in a given period. Two questions need answering to guide model building:

Which features make customers churn or stay?
What are the most important features to train a model with high performance?

Let’s start creating a baseline model with a LightGBM classifier

We will build our model with the LightGBM classifier and then look at a tuned version of it. In the end, we will see what percentage of customers we can predict correctly. We will also see our recall and F1 score in the report.
To decide which model is the best one in a data science project, we need to consider the demands coming from the business units. If we choose a model based on accuracy alone, the business results may teach us, through bitter experience, that this is not the only metric we should look at.
How is accuracy calculated in a model and what does it mean?
The Confusion Matrix table we see below shows the actual and predicted values in a classification problem.

Accuracy is the share of all predictions that are correct. It is widely used to measure the success of a model, but it is not sufficient on its own.
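
To see why accuracy alone can mislead, here is a small made-up example. It assumes a hypothetical month with 1,000 customers of whom 2% churn (the order of magnitude quoted earlier for US carriers) and a "model" that simply predicts that nobody churns. The numbers are invented for illustration and are not results from this project.

```python
# Hypothetical month: 1,000 customers, 20 of whom churn (2%).
actual = [1] * 20 + [0] * 980   # 1 = churned, 0 = stayed
predicted = [0] * 1000          # a "model" that always predicts "stays"

# Accuracy: share of all predictions that match the actual label.
correct = sum(a == p for a, p in zip(actual, predicted))
accuracy = correct / len(actual)

# Share of the actual churners that the model managed to flag.
churners_found = sum(a == 1 and p == 1 for a, p in zip(actual, predicted))
share_of_churners_found = churners_found / sum(actual)

print(f"accuracy: {accuracy:.2f}")
print(f"share of churners found: {share_of_churners_found:.2f}")
```

The accuracy comes out at 0.98 even though the model does not catch a single churner, which is exactly the kind of customer the business wants to find.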

Precision shows how many of the values we predicted as positive are actually positive.

Recall, on the other hand, shows how many of the cases that are actually positive we predicted as positive.

The F1 score is the harmonic mean of precision and recall.
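
The four metrics can be computed directly from the cells of a confusion matrix. Below is a minimal sketch in plain Python; the `classification_metrics` helper and the counts passed to it are assumptions for illustration, not figures from the project.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts.

    tp: churners predicted as churners      fp: stayers predicted as churners
    fn: churners predicted as stayers       tn: stayers predicted as stayers
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # Harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts for 200 customers.
metrics = classification_metrics(tp=60, fp=20, fn=40, tn=80)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts the model is right 70% of the time, 75% of the customers it flags really churn, but it only catches 60% of the actual churners. Which of these numbers matters most depends on what the business is trying to achieve.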

You can find the code on my Kaggle:

https://www.kaggle.com/semihizinli/churn-telecom-project
