Customer Churn Forecasting: Make Accurate Predictions with Decision Tree.

Learn machine learning with decision tree algorithms to enhance customer retention strategies.

Asish Biswas
AnalyticSoul
3 min readJun 11, 2024

--

Welcome back, data enthusiasts! In our previous lesson, we built and evaluated a logistic regression model to predict customer churn. Today, we will explore another powerful algorithm: Decision Trees. Decision Trees are intuitive and interpretable, making them an excellent choice for classification tasks like churn prediction.

Decision Tree

A decision tree is a tree-based algorithm that can be used for classification and regression problems. In a decision tree, the dataset is transformed into a tree-like structure of conditional control statements, and predictions are made by following through the conditional steps of the tree to the leaf node.

Let’s assume a hypothetical scenario where we try to estimate customer churn based on three features: gender, marital status, and internet service. We can form a decision tree based on the features, where the features are the nodes of our tree and the possible values are the edges. The node at the top is called the root node. For each customer, the algorithm starts evaluating from the root node and follows the edges depending on the feature values. At the leaf node, there is the target (churn or no churn).

Traversing through the decision tree, we can see that marital status is a stronger predictor of churn than gender, as unmarried individuals are more likely to churn regardless of their gender. Where married persons with internet service are likely to stay (not churn).

Implementing decision tree

Let’s implement the decision tree with our telco dataset. The steps are the same for most of the classification models.

  • Step 0: Load the encoded dataset.
  • Step 1: Split the data into train and test datasets.
  • Step 2: Initializing the decision tree model.
  • Step 3: Training the model with the training data.
  • Step 4: Predicting with the test data.
  • Step 5: Measuring the model performances.
# load the dataset
df_telco = pd.read_csv('data/telco_customer_churn_encoded.csv', header=0)

# step 1: split testing and training data
y = df_telco_encoded['Churn']
X = df_telco_encoded.drop(columns='Churn')
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=123)

# step 2: init decision tree classifier
dtree = DecisionTreeClassifier()

# step 3: training the model
dtree.fit(X_train, y_train)

# step 4: predicting
y_pred = dtree.predict(X_test)

# step 5: model performance evaluation (accuracy)
accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred)
recall = recall_score(y_test, y_pred)

# Print results
print('Accuracy:', round(accuracy, 2))
print('Precision:', round(precision, 2))
print('Recall:', round(recall, 2))
# Output:
Accuracy: 0.72
Precision: 0.5
Recall: 0.49

As we see, the accuracy rate of our model is quite good (72%), similar to the logistic regression model but the recall rate is suffering similarly (around 50%). You will find more ways to evaluate and interpret the results in the accompanying jupyter notebook.

This concludes our Customer Churn Prediction chapter. You’ve learned how to explore the dataset, engineer features, and build and interpret machine learning models for predicting customer churn.

We look forward meet you in the next chapter where we’ll predict customer lifetime value and revenue. Keep learning :-)

Recap of the Customer Churn Prediction Chapter

  1. Lesson 4.1 — Introduction to Customer Churn: Understanding the importance of predicting customer churn and an overview of the dataset.
  2. Lesson 4.2 — Explore the Dataset: Initial data exploration to understand the distribution and relationships between features.
  3. Lesson 4.3 — Feature Engineering: One-Hot Encoding: Transforming categorical features into a format suitable for machine learning models.
  4. Lesson 4.4 — Customer Churn Prediction: Implement logistic regression to predict whether a customer will churn.
  5. Lesson 4.5 — Predicting Customer Churn with Decision Tree: Use decision tree algorithms to improve prediction accuracy and interpretability.

Join the community

Join our vibrant learning community on Discord! You’ll find a supportive space to ask questions, share insights, and collaborate with fellow learners. Dive in, collaborate, and let’s grow together! We can’t wait to see you there!

--

--