Powerful EDA With Tableau: Creating a Dashboard and Story, and a Fully Tuned XGBoost Model on IBM Watson Telecom Churn Data

Shubham Pundir
Published in Analytics Vidhya
12 min read · Sep 14, 2020

Life starts when you solve problems. As a data scientist, I love solving business problems.

First of All, What Is Churn:

Churn is the number of subscribers to a service who discontinue their subscription in a given time period. For a company to expand its client base, its growth rate (i.e. its number of new customers) must exceed its churn. Churn is an important consideration in the telephone and cell phone services industry.

Why Companies Worry So Much About Churn:

Photo by LYCS Architecture on Unsplash

Churn is used to indicate the strength of a company’s customer division and its overall growth prospects. The lower the churn, the more revenue the company can make from its customers.

High churn means the company needs to spend money again to acquire a new customer base.

That is why companies worry so much about churn: acquiring new customers is always difficult, while retaining existing ones is usually easier. The important question is how we know who will churn.

That's where we find our Business Problem.

Business Problem:

Photo by Daria Nepriakhina on Unsplash

Competition in the telecom market gets tougher every day. Losing customers from its customer base costs the company a lot, and acquiring new customers is difficult and costly. The telecom company wants to know which customers are going to churn, and wants a model that classifies them so that the company can take measures to retain them.

Classification Description:

I will classify the customers based on the various features collected from the telecom company. The output is whether the customer will churn or not.

Data Description:

Photo by Carson Masterson on Unsplash

The data is the IBM Watson Telecom Churn dataset. Thanks to IBM for providing real-life scenario data, so that aspiring data scientists like me can learn and practice tasks that can later be replicated in real industry.

Attribute Information:

customerID : Customer Identification

Gender : whether the customer is male or female

SeniorCitizen : whether the customer is a senior citizen or not (1, 0)

Partner : whether the customer has a partner or not (Yes, No)

Dependents : whether the customer has dependents or not (Yes, No)

Tenure : Number of months the customer has stayed with the company

PhoneService : whether the customer has a phone service or not (Yes, No)

MultipleLines : whether the customer has multiple lines or not (Yes, No, No phone service)

InternetService : Customer’s internet service provider (DSL, Fiber optic, No)

OnlineSecurity : whether the customer has online security or not (Yes, No, No internet service)

OnlineBackup : whether the customer has online backup or not (Yes, No, No internet service)

DeviceProtection : whether the customer has device protection or not (Yes, No, No internet service)

TechSupport : whether the customer has tech support or not (Yes, No, No internet service)

StreamingTV : whether the customer has streaming TV or not (Yes, No, No internet service)

StreamingMovies : whether the customer has streaming movies or not (Yes, No, No internet service)

Contract : The contract term of the customer (Month-to-month, One year, Two years)

PaperlessBilling : whether the customer has paperless billing or not (Yes, No)

PaymentMethod : The customer’s payment method (Electronic check, Mailed check, Bank transfer (automatic), Credit card (automatic))

MonthlyCharges : The amount charged to the customer monthly

TotalCharges : The total amount charged to the customer

Churn : Whether the customer churned or not (Yes or No)
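To make the attribute list concrete, here is a minimal pandas sketch of how a few of these columns could be prepared for analysis. The three sample rows are hypothetical, the `prepare` helper is my own illustration, and the idea that TotalCharges may arrive as text with blank entries is an assumption you should check against your own copy of the file.

```python
import pandas as pd

# Hypothetical three-row sample using a few of the columns listed above.
sample = pd.DataFrame({
    "customerID": ["0001-A", "0002-B", "0003-C"],
    "SeniorCitizen": [0, 1, 0],
    "Partner": ["Yes", "No", "No"],
    "MonthlyCharges": [29.85, 56.95, 53.85],
    "TotalCharges": ["29.85", "1889.5", " "],  # assumption: stored as text
    "Churn": ["No", "No", "Yes"],
})


def prepare(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with numeric TotalCharges and 0/1 Yes/No columns."""
    out = df.copy()
    # Blank or malformed amounts become NaN instead of raising an error.
    out["TotalCharges"] = pd.to_numeric(out["TotalCharges"], errors="coerce")
    # Map plain Yes/No columns to 1/0; all other columns stay untouched.
    for col in ("Partner", "Churn"):
        out[col] = out[col].map({"Yes": 1, "No": 0})
    return out


prepared = prepare(sample)
```

Columns with a third value such as "No internet service" would need their own mapping, so they are left out of this sketch.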

We have successfully defined our business problem. Now we will solve it using our business understanding: first through Exploratory Data Analysis, and then by using machine learning to classify the customers who churn.

Exploratory Data Analysis:

Tableau Link File

This time I will be doing the EDA in Tableau, as it is a very powerful tool. I have created a dedicated dashboard for it and have built a powerful data story of telecom churn.

Tableau Dashboard Link:

https://public.tableau.com/profile/shubham.pundir#!/vizhome/TelecomChurnEDAAndInsightStory/TelecomChurnEDAAndInsightStory?publish=yes

I will include snapshots from the dashboard here for better understanding.

In our data, about 26% of the customers churned and about 73% did not.
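The class split can be checked outside Tableau as well. A minimal sketch, assuming the labels sit in a pandas Series named like the Churn column; the toy labels below are hypothetical and are not the real data.

```python
import pandas as pd


def churn_share(churn: pd.Series) -> pd.Series:
    """Percentage of customers per Churn label, rounded to one decimal."""
    return (churn.value_counts(normalize=True) * 100).round(1)


# Hypothetical toy labels: 1 churner out of 4 customers.
toy = pd.Series(["No", "Yes", "No", "No"], name="Churn")
share = churn_share(toy)
```

Running the same function on the full Churn column would give the split quoted above.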

Insights: Contracts

In a single view, we will look at the gender ratio among the customers who churned, by contract, and their charging pattern.

The above figures show the combination of Contract and the average total and monthly charges, along with the tenure.

The Green is: Female

The Gold is: Male

The values in the bars are the average charges and the tenure in months and years.

We can see that the customers who churned generated more money on the two-year and one-year contracts, which means that when these customers leave, the company takes a huge loss.

Most of the revenue comes from the long contracts, so the company should concentrate more on retaining customers on longer contracts.
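For readers without Tableau, the view described above (average charges and tenure per contract and gender, for churned customers only) can be approximated in pandas. This is only a sketch: the rows are hypothetical, the column names follow the attribute list, and `average_by` is a helper name I made up.

```python
import pandas as pd

METRICS = ["TotalCharges", "MonthlyCharges", "Tenure"]

# Hypothetical rows; column names follow the attribute list.
df = pd.DataFrame({
    "Contract": ["Month-to-month", "Month-to-month", "One year", "One year"],
    "Gender": ["Female", "Male", "Female", "Female"],
    "Churn": ["Yes", "Yes", "Yes", "No"],
    "MonthlyCharges": [70.0, 80.0, 90.0, 60.0],
    "TotalCharges": [700.0, 400.0, 3600.0, 1800.0],
    "Tenure": [10, 5, 40, 30],
})


def average_by(frame: pd.DataFrame, feature: str) -> pd.DataFrame:
    """Average charges and tenure per (feature, Gender) for churned customers."""
    churned = frame[frame["Churn"] == "Yes"]
    return churned.groupby([feature, "Gender"])[METRICS].mean()


summary = average_by(df, "Contract")
```

The same helper could be pointed at TechSupport, StreamingTV, or any other categorical column to reproduce the remaining views.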

Insights: Tech Support

In a single view, we will look at the gender ratio among the customers who churned, by Tech Support, and their charging pattern.

The above figures show the combination of Tech Support and the average total and monthly charges, along with the tenure.

The Green is: Female

The Gold is: Male

The values in the bars are the average charges and the tenure in months and years.

We can see that the customers who churned have lower charges in Tech Support, so we can conclude that people are leaving the company mostly because Tech Support is unavailable.

We can also see that on a monthly basis the average charge is the same, which suggests that people are not very satisfied with the service, hence Churn YES.

The tenure section confirms this: the average tenure of the people with Tech Support is low, and the company should take this into consideration.

Insights: Streaming TV

In a single view, we will look at the gender ratio among the customers who churned, by Streaming TV, and their charging pattern.

The above figures show the combination of Streaming TV and the average total and monthly charges, along with the tenure.

The Green is: Female

The Gold is: Male

The values in the bars are the average charges and the tenure in months and years.

We can see that the customers who churned generate less revenue even though they have subscribed to Streaming TV. There should be more customer-centric plans to increase the revenue and to retain them.

The average tenure is also very low, which means they do not even use the service for a full year before dropping it. More yearly plans should be designed with a customer-centric mindset.

Insights: Streaming Movies

In a single view, we will look at the gender ratio among the customers who churned, by Streaming Movies, and their charging pattern.

The above figures show the combination of Streaming Movies and the average total and monthly charges, along with the tenure.

The Green is: Female

The Gold is: Male

The values in the bars are the average charges and the tenure in months and years.

We can see that the customers who churned generate less revenue even though they have subscribed to Streaming Movies. There should be more customer-centric plans to increase the revenue and retain them.

The average tenure is also very low, which means they do not even use the service for a full year before dropping it. More yearly movie plans should be designed with a customer-centric mindset.

Insights: Phone Service

In a single view, we will look at the gender ratio among the customers who churned, by Phone Service, and their charging pattern.

The above figures show the combination of Phone Service and the average total and monthly charges, along with the tenure.

The Green is: Female

The Gold is: Male

The values in the bars are the average charges and the tenure in months and years.

We can see that the customers who churned generate less revenue. Significantly many of them show up in the monthly charges, which suggests that something is wrong with the service provided by the company. The phone service should be more customer-centric.

The average tenure is also very low, which means they do not even use the service for a full year before dropping it. More yearly phone service plans should be designed with a customer-centric mindset.

Insights: Online Security

In a single view, we will look at the gender ratio among the customers who churned, by Online Security, and their charging pattern.

The above figures show the combination of Online Security and the average total and monthly charges, along with the tenure.

The Green is: Female

The Gold is: Male

The values in the bars are the average charges and the tenure in months and years.

We can see that the customers who churned generate less revenue. We should dive deeper into the online security measures and resolve, one to one, the online security problems people are facing, so that customers can be retained.

The average tenure is also very low, which means they do not even use the service for a full year before dropping it. More Online Security plans should be designed with a customer-centric mindset.

Insights: Online Backup

In a single view, we will look at the gender ratio among the customers who churned, by Online Backup, and their charging pattern.

The above figures show the combination of Online Backup and the average total and monthly charges, along with the tenure.

The Green is: Female

The Gold is: Male

The values in the bars are the average charges and the tenure in months and years.

We can see that the customers who churned generate less revenue. We should dive deeper into the online backup measures and resolve, one to one, the online backup problems people are facing, so that customers can be retained.

The average tenure is also very low, which means they do not even use the service for a full year before dropping it. More Online Backup plans should be designed with a customer-centric mindset.

Insights: Internet Service

In a single view, we will look at the gender ratio among the customers who churned, by Internet Service, and their charging pattern.

The above figures show the combination of Internet Service and the average total and monthly charges, along with the tenure.

The Green is: Female

The Gold is: Male

The values in the bars are the average charges and the tenure in months and years.

We can see that the customers who churned generate less revenue. We can also see that fewer people opt for Fiber optic and DSL. The company should come up with flexible, customer-centric plans that provide at least some internet service, since streaming TV and movies depend on it.

The average tenure is also very low, which means they do not even use the service for a full year before dropping it. More flexible Internet Service plans should be designed with a customer-centric mindset.

Insights: Contract pattern overall Data

In a single view, we will look at the contract pattern by churn across all the data and the charging pattern.

The above figures show the combination of contract pattern and the average total and monthly charges, along with the tenure.

The Green is: Churn NO

The Red is: Churn Yes

The values in the bars are the average charges and the tenure in months and years.

In the total charges we can see that Churn No is creating a huge loss for the company, but the month-to-month contract is not much to worry about.

The monthly customers do not show much of a pattern across contracts, but the average tenure per contract is worth looking into for the company.

Insights: Payment Method pattern overall Data

In a single view, we will look at the payment method pattern by churn across all the data and the charging pattern.

The above figures show the combination of Payment Method and the average total and monthly charges, along with the tenure.

The Green is: Churn NO

The Red is: Churn Yes

The values in the bars are the average charges and the tenure in months and years.

Pattern In Total Charges Among Various Payment Methods:

The people who pay via Bank Transfer and have less than 23000 are likely to be Churn YES.

The people who pay via Credit Card and have less than 24000 are likely to be Churn YES.

The people who pay via Electronic Check and have less than 15000 are likely to be Churn YES.

The people who pay via Mailed Check and have less than 500 are likely to be Churn YES.

Pattern In Monthly Charges Among Various Payment Methods:

On average, customers who pay less than 80 are likely to be Churn YES.

Pattern In Tenure Among Various Payment Methods:

If a customer has been with the company for less than 30 months and pays via Bank Transfer or Credit Card, they are likely to be Churn Yes.

If a customer has been with the company for less than 17 months and pays via Electronic Check, they are likely to be Churn Yes.

If a customer has been with the company for less than 9 months and pays via Mailed Check, they are likely to be Churn Yes.

The company should try to increase the tenure of these customers and move them to automatic payment options by offering more attractive cashback, which will help reduce Churn Yes.
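The tenure patterns above read like simple rules of thumb, so they can be written down as a small lookup. This is only a sketch of those heuristics, not a model: the thresholds are the month values quoted above, the payment method labels follow the attribute list, and `at_risk` is a hypothetical helper name.

```python
# Tenure thresholds (in months) quoted in the text, per payment method.
TENURE_THRESHOLDS = {
    "Bank transfer (automatic)": 30,
    "Credit card (automatic)": 30,
    "Electronic check": 17,
    "Mailed check": 9,
}


def at_risk(payment_method: str, tenure_months: int) -> bool:
    """True when tenure is below the threshold for the payment method."""
    threshold = TENURE_THRESHOLDS.get(payment_method)
    if threshold is None:
        return False  # unknown payment method: no rule applies
    return tenure_months < threshold


flags = [
    at_risk("Electronic check", 10),
    at_risk("Mailed check", 9),
    at_risk("Credit card (automatic)", 29),
]
```

A flag like this could be used to shortlist customers for the cashback offers mentioned above, while the actual prediction is left to the model.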

Let's kick off the machine learning part: we will apply XGBoost and tune the model to reach our best accuracy score (using the confusion matrix).

Understanding XGBoost:

Whenever an imbalanced data set comes to mind, XGBoost performs really well. I use XGBoost on the imbalanced data set because I don’t want to opt for upsampling or downsampling: upsampling creates a bias, and downsampling loses valuable information.

There is a hyperparameter, scale_pos_weight, that lets XGBoost penalize misclassifications of the positive class more heavily, which helps it reach better accuracy where other algorithms fail.
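A common starting value for scale_pos_weight is the ratio of negative to positive examples in the training labels. The sketch below only computes that ratio; the helper name and the toy labels are hypothetical, and it assumes Churn Yes is encoded as 1.

```python
from collections import Counter


def suggested_scale_pos_weight(labels) -> float:
    """Ratio of negative (0) to positive (1) examples."""
    counts = Counter(labels)
    if counts[1] == 0:
        raise ValueError("no positive examples in labels")
    return counts[0] / counts[1]


# Hypothetical toy labels: 75 non-churners (0) and 25 churners (1).
labels = [0] * 75 + [1] * 25
weight = suggested_scale_pos_weight(labels)
```

The resulting number would be passed to the model as `scale_pos_weight` and can then be tuned further like any other hyperparameter.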

Machine Learning (XGBOOST) Highlights:

Whenever you get imbalanced data, always do a stratified split. It keeps the same proportion of each class in both the training and testing sets, which is very important.
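A minimal sketch of a stratified split with scikit-learn's `train_test_split`. The data is hypothetical (100 customers, 25 churners), and the 80/20 split and the random seed are my own choices, not values from the original notebook.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical data: 100 customers, 25 of them churners (label 1).
X = np.arange(100).reshape(-1, 1)
y = np.array([1] * 25 + [0] * 75)

# stratify=y keeps the churn proportion identical in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

train_churn_rate = y_train.mean()
test_churn_rate = y_test.mean()
```

Without `stratify`, a small test set can end up with noticeably more or fewer churners than the training set, which makes the evaluation harder to trust.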

Always keep track of the AUC scores while training the model. Use early stopping with 10 rounds so that training stops at the best validation AUC.
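The idea behind early stopping can be shown without any library: stop once the validation AUC has not improved for a given number of rounds, and keep the best round. XGBoost exposes this through its `early_stopping_rounds` setting; the function below is only a plain-Python sketch of the rule, with hypothetical AUC values.

```python
def best_round(val_auc, patience=10):
    """Return the 0-based round with the best AUC seen before stopping.

    Training stops once `patience` rounds pass without a new best score.
    """
    best_idx, best_score = 0, float("-inf")
    for i, score in enumerate(val_auc):
        if score > best_score:
            best_idx, best_score = i, score
        elif i - best_idx >= patience:
            break  # no improvement for `patience` rounds
    return best_idx


# Hypothetical validation AUC per boosting round.
scores = [0.70, 0.75, 0.80] + [0.79] * 15
kept = best_round(scores, patience=10)
```

Note that the rule looks for the highest validation AUC; a score that keeps rising on the training set but stalls on the validation set is a sign of overfitting.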

You can see here that, without tuning the hyperparameters, the model does not do a good job of correctly classifying the Churn YES customers, which is our main concern.

Round one of hyperparameter tuning, to find out which parameters to tune further and to get better performance in classifying the Churn YES customers.

Round two of hyperparameter tuning, for better classification.

Final hyperparameters for XGBoost.
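The two tuning rounds can be pictured as a coarse grid followed by a finer grid around the first winner. The sketch below is a generic illustration, not the search used in the original notebook: `toy_score` is a hypothetical stand-in for "train XGBoost with these parameters and return the validation AUC", and the grid values are made up.

```python
from itertools import product


def grid_search(score_fn, grid):
    """Try every combination in `grid`; return (best_params, best_score)."""
    names = list(grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(grid[n] for n in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(params):
    """Hypothetical score that peaks at max_depth=4, learning_rate=0.1."""
    return -abs(params["max_depth"] - 4) - abs(params["learning_rate"] - 0.1)


# Round one: a coarse grid to see which region looks promising.
round_one = {"max_depth": [2, 4, 8], "learning_rate": [0.01, 0.1, 1.0]}
best1, _ = grid_search(toy_score, round_one)

# Round two: a finer grid around the round-one winner.
depth, rate = best1["max_depth"], best1["learning_rate"]
round_two = {
    "max_depth": [depth - 1, depth, depth + 1],
    "learning_rate": [rate / 2, rate, rate * 2],
}
best2, best2_score = grid_search(toy_score, round_two)
```

In practice the scoring function would use cross-validation on the training set only, so that the test set stays untouched until the final evaluation.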

Accuracy Measure:

We were able to correctly classify 86% of the Churn Yes customers.
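The share of actual Churn Yes customers that the model catches is usually called recall, and it can differ from overall accuracy. A small sketch with hypothetical labels (1 = Churn Yes, 0 = Churn No) shows the difference; the numbers are not the results of the model above.

```python
def recall(y_true, y_pred, positive=1):
    """Share of actual positives that were predicted positive."""
    true_pos = sum(
        1 for t, p in zip(y_true, y_pred) if t == positive and p == positive
    )
    actual_pos = sum(1 for t in y_true if t == positive)
    if actual_pos == 0:
        raise ValueError("no positive examples in y_true")
    return true_pos / actual_pos


# Hypothetical labels: 4 churners, 6 non-churners.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

churn_recall = recall(y_true, y_pred)  # 3 of the 4 churners are caught
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

On imbalanced data, recall on the churn class is often the more useful number, because a model can reach high accuracy while missing most churners.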

Let's plot the first decision tree to get an idea of how the model works.

Conclusion:

We have successfully solved the business problem: the exploratory analysis of the data produced insightful measures to help the company retain customers, and we proposed a model that correctly identifies 86% of the customers who are going to churn.

I hope you liked this journey of business problem solving with me. Next time, I will come up with another interesting business problem.
