Building Behavior Segmentation by Leveraging Machine Learning Model

Published in Life at Telkomsel · Feb 4, 2021

Have you ever been challenged to build customer segmentation? Customer segmentation is important for companies to learn more about their customers (behavior, wants, and needs) so that they can offer relevant products and services. Let’s have a closer look at this.

How do we build the right customer segmentation? One technique is to use an unsupervised machine learning model. But before we go into the more detailed, technical part, let’s have a quick glance at the three types of machine learning models:

1. Supervised machine learning model

A statistical approach to predicting a target variable, which may be binary or continuous, by finding the relationship between the features and the target variable in the training data set. Let’s take the example of a binary target variable that indicates whether a customer will buy a product or not. The target variable is 1 if the customer buys and 0 otherwise. With this model, we can predict whether a specific customer is likely to buy a specific product. A few commonly used algorithms:

a) XGBoost

b) Logistic regression

c) Decision tree

d) Random forest

e) etc

In short, a supervised machine learning algorithm helps the computer learn the relationship between the features and the target variable in the training data set.
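To make the buy / not-buy example concrete, here is a minimal sketch using logistic regression from scikit-learn. The feature names and numbers are made up for illustration and do not come from any real customer data.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical toy data: one row per customer,
# columns are [monthly_data_usage_gb, monthly_spend] (made-up features).
X_train = np.array([
    [1.0, 5.0],
    [2.0, 6.0],
    [1.5, 4.0],
    [8.0, 20.0],
    [9.0, 25.0],
    [7.5, 22.0],
])
# Target variable: 1 if the customer bought the product, 0 otherwise.
y_train = np.array([0, 0, 0, 1, 1, 1])

# Learn the relationship between the features and the target.
model = LogisticRegression()
model.fit(X_train, y_train)

# Predict for two new (also hypothetical) customers.
X_new = np.array([[1.2, 5.5], [8.5, 23.0]])
predictions = model.predict(X_new)               # predicted class (0 or 1)
probabilities = model.predict_proba(X_new)[:, 1]  # estimated probability of buying
print(predictions, probabilities)
```

The key point is that the target `y_train` is given upfront; the model only has to learn how the features map to it.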

2. Unsupervised machine learning model

In an unsupervised machine learning model, we don’t set a target variable upfront. We rely only on the features and let the model group similar samples into one cluster while separating dissimilar samples into other clusters. That is why this method is a good fit for customer segmentation. A few commonly used algorithms:

a) K-Means

b) Hierarchical Clustering

c) etc

In an unsupervised machine learning model, the data set contains only features and no target variable, so in a sense we let the computer learn by itself.
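As a rough illustration of learning without a target variable, the sketch below clusters a handful of made-up customers with scikit-learn's `KMeans`. The two feature columns and their values are assumptions chosen only to make the grouping obvious.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical features only, no target column:
# [monthly_data_usage_gb, monthly_spend] (made-up values).
X = np.array([
    [1.0, 5.0], [2.0, 6.0], [1.5, 4.0],
    [8.0, 20.0], [9.0, 25.0], [7.5, 22.0],
])

# Ask for 2 clusters; n_init and random_state are set for repeatability.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)

print(labels)                   # cluster index assigned to each customer
print(kmeans.cluster_centers_)  # one centroid per cluster
```

Note that the cluster indices themselves carry no meaning; interpreting what each cluster represents is left to the analyst.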

3. Reinforcement machine learning model

In a reinforcement machine learning model, the machine learns through self-discovery: it interacts with its environment by trial and error, using feedback from its own actions and experiences.

If the machine meets its objectives, it will be rewarded. This process will continue to maximize the rewards. An analogy for this would be a baby learning to walk, where the baby would explore different possibilities until they can walk on their own.

A few commonly used algorithms:

a) Q-Learning

b) State Action Reward State Action (SARSA)

c) Deep Q Network

d) etc
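To show the reward-driven trial and error in code, here is a small sketch of tabular Q-Learning. The environment is a hypothetical five-cell corridor invented for this example, and the learning rate, discount factor, and exploration rate are arbitrary choices.

```python
import numpy as np

N_STATES = 5            # corridor cells 0..4; cell 4 is the goal
ACTIONS = [-1, +1]      # action 0 = step left, action 1 = step right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1  # assumed hyperparameters

rng = np.random.default_rng(0)
q_table = np.zeros((N_STATES, len(ACTIONS)))

def step(state, action):
    """Move along the corridor; the reward is 1.0 only on reaching the goal."""
    next_state = min(max(state + ACTIONS[action], 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done

for episode in range(200):
    state, done = 0, False
    while not done:
        if rng.random() < EPSILON:
            # Explore: try a random action.
            action = int(rng.integers(len(ACTIONS)))
        else:
            # Exploit: take the best known action, breaking ties at random.
            best = np.flatnonzero(q_table[state] == q_table[state].max())
            action = int(rng.choice(best))
        next_state, reward, done = step(state, action)
        # Q-Learning update: move the estimate toward reward + discounted best future value.
        target = reward if done else reward + GAMMA * q_table[next_state].max()
        q_table[state, action] += ALPHA * (target - q_table[state, action])
        state = next_state

# Best action for each non-goal cell after training.
greedy_policy = q_table[:-1].argmax(axis=1)
print(greedy_policy)
```

Like the baby learning to walk, the agent starts with no knowledge and gradually prefers the actions that led to a reward.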

Going back to the main challenge of building a behavior segmentation, we can follow some of the steps described in CRISP-DM (Cross-Industry Standard Process for Data Mining):

1. Business Understanding

Before jumping into building a behavior segmentation, we first need to define the background and objectives. This first step is the foundation for deciding which, and how many, features (input variables) are relevant to that particular segmentation.

2. Data Understanding

In this step, we explore the data and build descriptive analytics as a basis for the behavior segmentation. Different hypotheses will emerge at this stage.

3. Data Preparation

This is a demanding, time-consuming stage in which we build an analytical base table that brings together all relevant features/variables. These features will be used as the input variables for the behavior segmentation.
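As a sketch of what this preparation might look like, the snippet below aggregates hypothetical raw usage records into one row per customer with pandas. The column names and values are invented for illustration. The standardization step at the end is an assumption on our part: it is a common choice before distance-based clustering, not something this process prescribes.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical raw usage records; column names are made up for illustration.
records = pd.DataFrame({
    "customer_id": ["A", "A", "B", "B", "B", "C"],
    "data_usage_gb": [1.0, 2.0, 8.0, 9.0, 7.0, 0.5],
    "spend": [5.0, 6.0, 20.0, 25.0, 22.0, 3.0],
})

# Build the feature table: one row per customer, one column per feature.
features = records.groupby("customer_id").agg(
    total_data_gb=("data_usage_gb", "sum"),
    avg_spend=("spend", "mean"),
    n_transactions=("spend", "count"),
)

# Assumed extra step: put the features on a comparable scale so that
# no single feature dominates the distance calculation during clustering.
scaled = StandardScaler().fit_transform(features)
features_scaled = pd.DataFrame(scaled, index=features.index, columns=features.columns)
print(features_scaled.round(2))
```

In practice the feature list comes from the business understanding and data understanding steps, so the real table would usually be much wider than this.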

4. Modeling & Evaluation

This is the core of building a behavior segmentation with an unsupervised machine learning model. Let’s take a more detailed look at K-Means, one of the most commonly used algorithms for segmentation.

There are two types of data clustering: Hierarchical and Non-Hierarchical. K-Means is one of the methods in non-hierarchical data clustering.

How Does K-Means Clustering Work?

The aim of this algorithm is to split the data into several clusters. The computer groups the data without knowing any target class beforehand. Other than the data, we need to define one hyperparameter, K, which is the number of desired clusters. The algorithm assigns each data point or object to one of these K clusters. Each cluster has a center point (centroid) that represents it.

The following steps describe how K-Means updates its centroids:

a. Randomly pick locations for the K centroids. This is the initialization of the centroids.

b. Assign each sample to the cluster whose centroid is nearest, i.e. at the shortest distance.

c. Update the location of each centroid to the mean of the samples assigned to it. Steps b and c are repeated to find the optimal centroid locations. The process ends when:

· The centroid locations have stabilized, meaning samples are no longer reassigned to different clusters.

· The maximum number of iterations is reached.
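The steps above can be sketched in a few lines of NumPy. This is a simplified illustration, not a production implementation: the function name `k_means` and the demo data are hypothetical, and initializing the centroids by picking K random samples is just one of several possible ways to "randomly pick locations".

```python
import numpy as np

def k_means(X, k, max_iter=100, seed=0):
    """Minimal K-Means sketch following steps a, b and c."""
    rng = np.random.default_rng(seed)
    # Step a: initialize centroids (assumption: use k distinct random samples).
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    labels = np.full(len(X), -1)
    for _ in range(max_iter):  # stop condition 2: iteration limit
        # Step b: assign each sample to its nearest centroid (Euclidean distance).
        distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        new_labels = distances.argmin(axis=1)
        # Stop condition 1: no sample changed cluster, so centroids are stable.
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
        # Step c: move each centroid to the mean of its assigned samples.
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    return labels, centroids

# Made-up demo data with two visibly separate groups.
X_demo = np.array([
    [1.0, 5.0], [2.0, 6.0], [1.5, 4.0],
    [8.0, 20.0], [9.0, 25.0], [7.5, 22.0],
])
demo_labels, demo_centroids = k_means(X_demo, k=2)
print(demo_labels)
print(demo_centroids)
```

Because the starting centroids are random, different seeds can lead to different results on less clearly separated data, which is why library implementations typically run several initializations and keep the best one.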