Stay or Stray: How Brands Can Drive and Maintain Customer Loyalty Using Machine Learning

Gin Bai · Published in SSENSE-TECH · 6 min read · Feb 9, 2024
Figure 1. The SSENSE+ loyalty program, which allows customers to accumulate points on purchases and unlock a variety of benefits and services.

Customer loyalty is one of the most important drivers of a brand’s success. Numerous companies have developed their own in-house loyalty programs to ensure customer retention and satisfaction, ultimately helping them generate revenue, increase referrals, and achieve overall growth.

Let’s look at a few well-known examples. Aeroplan is Air Canada’s loyalty program, which offers its members benefits as they accumulate travel miles. Another example is Sephora’s loyalty program, Beauty Insider, which allows customers to accumulate points to be applied to future purchases, and even offers customers a special gift on their birthday. Within the hospitality industry, most large hotel chains also offer loyalty rewards programs. All of these programs are point- and tier-based, as are the majority of loyalty programs these days, offering rewards to their most loyal customers. The more customers spend, the more they are rewarded… and hopefully, the more loyal they become.

So how can we ensure our customers remain faithful? The truth is, cultivating customer loyalty is more competitive than ever. Visa or Mastercard? Uber or Lyft? Think about the brands you are loyal to. What keeps you faithful? Do you ever flirt with the idea of straying? What would tempt you to look elsewhere? Brands are working hard to understand this… spoiler alert: it’s not easy!

The million-dollar question is: how can brands drive and maintain customer loyalty in an ever-changing competitive landscape? For any tiered loyalty program, if we could predict the probability that a customer will become eligible for the next tier, we could be proactive and personalize our interactions with that customer. Organizations could then act on predicted customer behaviours rather than react to observed ones. With a different response variable, the same approach applies to the gamification aspects of loyalty programs, including non-tiered ones. Accurate proactiveness is the key to retaining existing customers, or even encouraging them to reach the next level. But there is a fine line to navigate between engaging customers and driving them away when your advances are not welcome.

In this article, we will tackle this conundrum using Machine Learning solutions.

The idea is to use customers’ historical data and current eligibility labels (0 being not eligible for the tier/program, 1 being eligible) to train a predictive model. Then, based on current data, we’ll use this trained model to predict the probability that each customer will be eligible to reach the next tier in the future.

Classification is a supervised Machine Learning method in which the model tries to predict the correct label for a given input. Probabilistic classification is a special type of classification: instead of predicting a class directly, a probabilistic classification model predicts a probability (e.g., Customer A is predicted to be 79% likely to reach the next tier). For our use case, we will implement probabilistic classification with LightGBM [1], a fast, distributed, high-performance gradient boosting framework based on decision tree algorithms.

Figure 2. LightGBM is a gradient boosting framework that uses tree-based learning algorithms.
Figure 3. The training dataset on a rolling window is used to train a LightGBM model. Customers’ future eligibility for the tier/program is predicted using the trained model and the inference dataset. The time window size can be adjusted based on the use case.

Make sure to include as many important features as possible, such as transaction features and browsing features. Features we sometimes think are unimportant can end up playing a big role! For example, a feature like “the standard deviation of browsing time” could contribute meaningfully to the predicted probability of eligibility. Keep in mind, however, that too many features can be a bad thing, as this may lead to overfitting. But that’s a story for another time.
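To make this concrete, here is a minimal sketch of computing such features with pandas. The table and column names (customer_id, browsing_seconds, order_amount) are illustrative assumptions, not an actual schema:

```python
import pandas as pd

# Hypothetical raw tables -- the column names are illustrative only.
sessions = pd.DataFrame({
    "customer_id": [1, 1, 1, 2, 2],
    "browsing_seconds": [120, 340, 95, 60, 75],
})
orders = pd.DataFrame({
    "customer_id": [1, 2, 2],
    "order_amount": [250.0, 80.0, 120.0],
})

# Browsing features: mean and standard deviation of time spent per session.
browse_feats = sessions.groupby("customer_id")["browsing_seconds"].agg(
    browse_mean="mean", browse_std="std"
)

# Transaction features: order count and total spend.
txn_feats = orders.groupby("customer_id")["order_amount"].agg(
    order_count="count", total_spend="sum"
)

# One row per customer, ready to join with the eligibility label.
features = browse_feats.join(txn_feats, how="outer").fillna(0.0)
print(features)
```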

Moreover, class imbalance is one of the most common issues for classification models. Imbalanced data refers to datasets where the target class has an uneven distribution of observations, i.e., one class label has a very high number of observations and the other has very few. The main problem with imbalanced data is predicting both the majority and minority classes accurately. In our case, the imbalance occurs because far fewer customers can be labeled as “most valuable customers” than cannot. To address this, we can use parameters like ‘is_unbalance’ or ‘scale_pos_weight’ to adjust the balance (only in ‘binary’ and ‘multiclassova’ applications). We can also incorporate a custom loss function, focal loss [2], into the LightGBM model to address the imbalance and learn from hard, misclassified examples.
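As a minimal sketch (with illustrative values), the built-in reweighting options could be set like this; note that is_unbalance and scale_pos_weight are mutually exclusive:

```python
import lightgbm as lgb

# Illustrative parameters for an imbalanced binary problem.
params = {
    "objective": "binary",
    "is_unbalance": True,        # auto-reweight the rare positive class
    # "scale_pos_weight": 25.0,  # OR weight positives explicitly, e.g. with
                                 # the negative-to-positive count ratio
}
```

A focal loss variant would instead be supplied as a custom objective (gradient and Hessian functions) to lightgbm.train(); the exact hookup depends on your LightGBM version, so check the documentation for the release you use.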

Figure 4. Focal loss is a dynamically scaled cross entropy loss, where the scaling factor decays to zero as confidence in the correct class increases.

Here’s a step-by-step guide on how to perform classification with LightGBM:

Step 1: Data Preparation

Prepare your data in a format that can be used by LightGBM. This typically involves splitting your dataset into features (X) and labels (y). If you have categorical features, you can convert them into numeric representations using techniques like one-hot encoding or ordinal encoding.
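Here is a minimal sketch of this step on a tiny synthetic table; in practice the frame would come from your feature pipeline, and the column names (region, eligible) are illustrative assumptions:

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Tiny synthetic customer table with the current eligibility label
# (1 = eligible for the tier/program, 0 = not eligible).
df = pd.DataFrame({
    "customer_id": range(8),
    "total_spend": [50, 900, 120, 40, 700, 60, 30, 850],
    "order_count": [1, 12, 3, 1, 9, 2, 1, 11],
    "region": ["NA", "EU", "NA", "EU", "NA", "EU", "NA", "EU"],
    "eligible": [0, 1, 0, 0, 1, 0, 0, 1],
})

# One-hot encode the categorical feature.
df = pd.get_dummies(df, columns=["region"])

X = df.drop(columns=["customer_id", "eligible"])
y = df["eligible"]

# Hold out a validation set for early stopping and threshold selection.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)
```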

Step 2: Dataset Creation

Create a LightGBM dataset from your prepared data. LightGBM provides a native data structure called Dataset for efficient data handling. You can create a dataset from your feature matrix (X) and label vector (y) using the lightgbm.Dataset() function.
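Continuing the sketch from Step 1:

```python
import lightgbm as lgb

# LightGBM's native Dataset structure; the validation set references the
# training set so that bin boundaries are shared.
train_set = lgb.Dataset(X_train, label=y_train)
val_set = lgb.Dataset(X_val, label=y_val, reference=train_set)
```

Note that LightGBM can also handle categorical columns natively via the categorical_feature argument, as an alternative to one-hot encoding in Step 1.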

Step 3: Model Training

Train your LightGBM model using the created dataset. Specify the objective function as “binary” for binary classification tasks. You can also specify additional hyperparameters such as the learning rate, number of trees, and maximum depth, as well as any custom objective you may have defined for your use case. Use the lightgbm.train() function to train the model.
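A sketch with illustrative, untuned hyperparameters (the early-stopping callback assumes a recent LightGBM version):

```python
params = {
    "objective": "binary",
    "learning_rate": 0.05,
    "num_leaves": 31,
    "max_depth": -1,       # -1 means no depth limit
    "is_unbalance": True,  # from the imbalance discussion above
    "metric": "auc",
}

model = lgb.train(
    params,
    train_set,
    num_boost_round=500,  # upper bound; early stopping picks the best round
    valid_sets=[val_set],
    callbacks=[lgb.early_stopping(stopping_rounds=50)],
)
```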

Step 4: Predictions

Once the model is trained, you can use it to make predictions on new data. For probabilistic classification, you can use the predict() method to get the probability estimates. For binary tasks, LightGBM returns the probability of the positive class (class 1) only; the negative-class probability is simply its complement. For multiclass problems, train with a multiclass objective and set the num_class parameter, and predict() will return a probability estimate for each class.
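Continuing the sketch, scoring the validation set from Step 1:

```python
# For a binary objective, Booster.predict() returns the probability of the
# positive class -- here, the probability of reaching the next tier.
probs = model.predict(X_val, num_iteration=model.best_iteration)
```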

Step 5: Threshold Selection And Predictions

LightGBM provides probability estimates, but you still need to select a threshold to make binary predictions. You can choose a threshold that balances precision and recall based on your specific problem requirements. Alternatively, you can inspect the ROC or precision-recall curve on a validation set to determine an optimal threshold.

In addition, you can select multiple thresholds beyond the primary one selected above for carrying out personalized interactions. For example, if a customer’s predicted probability of reaching the next tier is somewhat high but not very high, the business might want to take action to engage and encourage that customer.
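A sketch of both ideas, picking a primary threshold from the validation set and an illustrative secondary “nudge” band below it:

```python
import numpy as np
from sklearn.metrics import precision_recall_curve

# Pick the primary threshold by maximizing F1 on the validation set; the
# right trade-off depends on the business cost of a missed versus an
# unwanted outreach.
precision, recall, thresholds = precision_recall_curve(y_val, probs)
f1 = 2 * precision * recall / (precision + recall + 1e-12)
best_threshold = thresholds[np.argmax(f1[:-1])]

# Illustrative secondary band: customers who are "somewhat likely" but fall
# below the main threshold are candidates for softer engagement.
nudge_threshold = 0.5 * best_threshold

predicted_eligible = probs >= best_threshold
worth_nudging = (probs >= nudge_threshold) & ~predicted_eligible
```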

Conclusion

In the quest to cultivate customer loyalty, machine learning — specifically LightGBM — emerges as a formidable ally for brands. The judicious application of predictive analytics enables companies to not only interpret but also anticipate the ever-shifting patterns of consumer loyalty. By employing probabilistic classification, brands can engage customers with a degree of personalization that transforms the loyalty experience from generic to genuinely bespoke.

This strategic foresight is imperative in today’s volatile market, where brand loyalty is continuously challenged. LightGBM offers the agility and precision necessary for brands to fine-tune their loyalty programs, ensuring that customer interactions are both timely and relevant. The goal is a seamless blend of anticipation and adaptability, creating a loyalty framework that resonates with customers on a personal level while steering clear of unwanted advances.

The fusion of data-driven acumen and a customer-centric strategy paves the way for a future where loyalty is not just rewarded, but deeply rooted in a mutual exchange of value between the brand and its customers.

[1] LightGBM Documentation. https://lightgbm.readthedocs.io/en/latest/index.html

[2] Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, Piotr Dollár. Focal Loss for Dense Object Detection. ICCV 2017. https://paperswithcode.com/paper/focal-loss-for-dense-object-detection

Editorial reviews by Catherine Heim, Gregory Belhumeur & Mario Bittencourt

Want to work with us? Click here to see all open positions at SSENSE!
