Predicting the best way to reach customers

Tech

Published in ABN AMRO Developer Blog

9 min read · Aug 15, 2024

Imagine the countless times you’ve received marketing emails or messages about products or services that were not relevant for you. I’m sure it’s happened to many of us. Often, these messages come not just once, but repeatedly through different channels, leaving us annoyed and frustrated. Eventually, we find ourselves unsubscribing from all communications from that sender. This scenario highlights exactly what most companies strive to avoid in digital marketing.


As Seth Godin (an influential author in the field of marketing) wisely said, “Marketing is a contest for people’s attention”. Misusing that attention through spamming can quickly turn potential customers into detractors. As a bank, ABN AMRO wants to ensure that we can effectively grab the attention of our customers by delivering the appropriate message, at the right time, through the best channel.

Over the past few months, we have been working on a use case to address the challenge of reaching the customer through the correct channel based on their past interactions. This helps businesses send messages through the channels that customers are most likely to engage with, thereby improving customer interaction metrics such as Click-Through Rate (CTR).

This use case falls under the umbrella of recommendation systems, which aim to predict the “rating” or “preference” a user would assign to an item. These systems are widely used in various applications to suggest products, services, information, or content to users based on factors such as past behavior, preferences, or demographic information.

Just as recommendation systems suggest products, movies, or songs to users based on their preferences, our channel preference model recommends the correct communication channels for the marketeers to reach the customers and maximize user interactions.

In today’s world, there are several types of recommendation systems, each using different algorithms and having its own strengths and weaknesses. Below are some of the most used systems:

Popularity-based systems
Content-based filtering
Collaborative filtering
Hybrid systems
Matrix factorization
Deep Learning-based systems
Reinforcement Learning-based systems

Before diving into more complex systems and investing significant time, we implemented a popularity-based system as the foundational model for our use case. The primary reason for choosing this approach was to quickly establish a baseline framework that could effectively demonstrate the end-to-end process of recommendation, from data collection to system integration. A popularity-based model is relatively simple to implement and understand, allowing us to validate the core functionalities without the complexities of more advanced algorithms.

The dataset used for our use case comprises customer interaction data, including impressions (views) and clicks across a variety of channels and campaigns. We selected the Click-Through Rate (CTR), the ratio of clicks to impressions, as our interaction and performance metric, and it served as the basis for computing the channel ranks for each customer.

Channel Preference Model

We designed a popularity-based model comprising two components:

Individual customer channel rank and
Segment channel rank (A segment can be a group of customers that shares common characteristics or behavior. An example segment for customers would be student or professional).

These components are combined using a weighted average, which determines the final channel rank for each customer. This approach ensures that a customer's own preferred channel takes precedence over the segment's most popular channel when sufficient data is available for that customer. Conversely, it also addresses the cold start problem, falling back on the segment when little or no data is available for a specific customer.

To provide a clearer understanding, allow me to break down the individual components of the model and discuss the potential drawbacks of using them in isolation, as well as advantages of combining them, in the following sections.

Individual customer channel rank

The individual customer channel rank component focuses on the unique preferences of each customer, ranking channels based on their specific interactions.

Consider an illustrative example of a retail customer, Customer X. In the past, we sent this customer messages about products, services, tips, and so on through various campaigns, using a number of digital channels such as Email, Bank mail, Banner, and Pop-up notifications.

The table below summarizes this customer’s interactions with these channels.

Table summarizing Customer X's interactions with digital channels.

Based on this data, we could infer that the preferred channel for Customer X is Email, as it has the highest CTR compared to other channels.
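As a minimal sketch of this step, the ranking can be computed by dividing clicks by impressions per channel and sorting the result. The counts below are hypothetical (they are not the figures from the table), and the function names are our own.

```python
def channel_ctr(impressions, clicks):
    """Click-through rate; defined as 0.0 when there were no impressions."""
    return clicks / impressions if impressions > 0 else 0.0


def rank_channels(interactions):
    """Return (channel, CTR) pairs sorted by CTR, highest first.

    `interactions` maps channel name -> (impressions, clicks).
    """
    ctrs = {ch: channel_ctr(imp, clk) for ch, (imp, clk) in interactions.items()}
    return sorted(ctrs.items(), key=lambda item: item[1], reverse=True)


# Hypothetical counts for Customer X, for illustration only.
customer_x = {
    "Email": (200, 30),
    "Bank mail": (150, 12),
    "Banner": (420, 21),
    "Pop-up": (100, 4),
}

ranking = rank_channels(customer_x)
for channel, ctr in ranking:
    print(f"{channel}: {ctr:.1%}")
```

With these made-up numbers, Email comes out on top, which mirrors the conclusion drawn from the table.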

Drawbacks of individual customer channel rank

There are some drawbacks in relying solely on individual customer rank to determine the preferred communication channel, especially when data is sparse for a specific customer.

Consider an alternative scenario for Customer X:

In this case, the Email and Bank mail channels show CTRs of 100% and 50%, respectively. However, this data prompts several critical questions:

Can we draw robust statistical conclusions from channels with very few impressions, such as Email and Bank mail?
What constitutes enough impressions to make a reliable determination?

These questions highlight the necessity of an approach that balances individual customer data with broader segment insights to ensure more accurate conclusions.
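One way to see why a handful of impressions is not enough is to put a confidence interval around the observed CTR. The sketch below uses the Wilson score interval, which is our choice for illustration and not part of the model described here. The impression counts are assumptions: a 100% CTR is taken to be 1 click out of 1 impression, and a 50% CTR to be 1 click out of 2.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(clicks, impressions, confidence=0.95):
    """Wilson score interval for a click-through rate."""
    if impressions == 0:
        return (0.0, 1.0)  # no data: the CTR could be anything
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p = clicks / impressions
    denom = 1 + z ** 2 / impressions
    centre = (p + z ** 2 / (2 * impressions)) / denom
    half_width = (
        z * sqrt(p * (1 - p) / impressions + z ** 2 / (4 * impressions ** 2)) / denom
    )
    return (max(0.0, centre - half_width), min(1.0, centre + half_width))


# Assumed counts behind the 100% and 50% CTRs, plus a well-observed channel.
email_low, email_high = wilson_interval(1, 1)
bank_low, bank_high = wilson_interval(1, 2)
busy_low, busy_high = wilson_interval(150, 1000)

print(f"Email     1/1:      {email_low:.2f} to {email_high:.2f}")
print(f"Bank mail 1/2:      {bank_low:.2f} to {bank_high:.2f}")
print(f"Busy      150/1000: {busy_low:.2f} to {busy_high:.2f}")
```

The intervals for the sparse channels span most of the range between 0 and 1, while the well-observed channel is pinned down to a few percentage points.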

Segment channel rank

To include insights at a broader level, we looked into the metrics at segment level, identifying the most popular channels within that segment.

The definition of a segment varies depending on the business use case. In the example below, the segment is defined by a combination of Customer group, Region, and Age group. The computation follows the same methodology as the individual rank but is applied at the segment level. Based on this approach, for the customer segment “Retail — NL — Young,” we observe that the preferred channel appears to be Banner.
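The segment-level computation could look roughly like the sketch below. It assumes the segment CTR is obtained by pooling impressions and clicks over all customers in the segment (rather than averaging per-customer CTRs); the rows and the function name are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows, one per customer and channel:
# (customer_group, region, age_group, channel, impressions, clicks)
rows = [
    ("Retail", "NL", "Young", "Banner", 1000, 80),
    ("Retail", "NL", "Young", "Banner", 500, 40),
    ("Retail", "NL", "Young", "Email", 800, 40),
    ("Retail", "NL", "Young", "Email", 200, 10),
    ("Retail", "NL", "Senior", "Email", 600, 60),
]


def segment_channel_ctr(rows):
    """Pool impressions and clicks per (segment, channel) and return the CTRs."""
    totals = defaultdict(lambda: [0, 0])
    for group, region, age, channel, impressions, clicks in rows:
        key = ((group, region, age), channel)
        totals[key][0] += impressions
        totals[key][1] += clicks
    return {key: clk / imp for key, (imp, clk) in totals.items() if imp > 0}


segment_ctr = segment_channel_ctr(rows)

young = ("Retail", "NL", "Young")
young_channels = [ch for (seg, ch) in segment_ctr if seg == young]
best_channel = max(young_channels, key=lambda ch: segment_ctr[(young, ch)])
print(best_channel)
```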

Drawbacks of Segment channel rank

While this approach benefits from a broader perspective, it may not accurately reflect the preferences of individual customers. It assumes that all customers within a segment behave similarly, which is a strong assumption that often does not hold in real-world scenarios. Therefore, this approach results in less personalized outcomes.

Weighted average component

To address the drawbacks of both ranking methods, we devised a method that combines the customer and segment channel CTRs into a single score using the formula below. This weighted average score is used to determine the final channel rank for each customer.

Weighted average = ((constant * segment CTR) + (customer impressions * customer CTR)) / (constant + customer impressions).
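Written as code, the formula is a one-liner. The sketch below is a direct transcription; the function and parameter names are ours, and the example values are hypothetical. It also shows the two limiting cases: with no customer impressions the score equals the segment CTR, and with many impressions it approaches the customer CTR.

```python
def weighted_score(segment_ctr, customer_ctr, customer_impressions, constant):
    """Weighted average of segment CTR and customer CTR.

    The segment CTR is weighted by `constant`, the customer CTR by the
    customer's own number of impressions.
    """
    total_weight = constant + customer_impressions
    if total_weight == 0:
        return 0.0  # nothing to weight by
    return (
        constant * segment_ctr + customer_impressions * customer_ctr
    ) / total_weight


# Hypothetical values: segment CTR 10%, customer CTR 2%, constant 100.
no_history = weighted_score(0.10, 0.0, customer_impressions=0, constant=100)
balanced = weighted_score(0.10, 0.02, customer_impressions=100, constant=100)
rich_history = weighted_score(0.10, 0.02, customer_impressions=10_000, constant=100)

print(no_history, balanced, rich_history)
```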

The value of the constant depends on the use case. For illustration purposes, let’s consider the overall median impressions of the channels for each customer. An example is shown below:

Based on the data above, we can infer that the median customer received 100 Emails within a specific timeframe. The constant addresses the drawbacks discussed in the section "Drawbacks of individual customer channel rank" by providing a reference for what constitutes a sufficient number of impressions for each customer.
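As a small illustration, the constant can be derived with the standard library's median function. The sketch assumes one constant per channel, computed over the impression counts of all customers; the individual counts are hypothetical and were chosen so that the Email median is 100.

```python
from statistics import median

# Hypothetical impressions per customer for each channel.
impressions_by_channel = {
    "Email": [40, 100, 100, 250, 900],
    "Banner": [10, 70, 70, 420],
    "Text": [0, 5, 20],
}

# One constant per channel: the median number of impressions per customer.
constants = {
    channel: median(counts) for channel, counts in impressions_by_channel.items()
}
print(constants)
```

A customer with fewer impressions than the constant is scored mostly on the segment CTR, while a customer with more impressions is scored mostly on their own CTR.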

Final channel rank

Based on the data we used for illustration purposes, the final channel ranks for the customer X after computing the weighted average score will be the following:

Now, let's take a closer look at the results. Although the Banner channel has a high segment CTR, it does not rank highly for Customer X. This is because Customer X has far more Banner impressions (420) than the overall population (the median is 70). As a result, the customer-specific CTR carries more weight in the final score.

On the other hand, Customer X has zero impressions on the Text channel, so the customer CTR is also zero and the score falls back entirely to the segment CTR. At the segment level, Text has the highest CTR. This insight provides us with a new channel recommendation to explore for this customer.

By combining customer and segment metrics through the weighted average score, we can generate personalized results for each customer while also exposing them to new potential channels with which they have not previously interacted.
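The two effects described above can be reproduced with a short sketch. Only the Banner impressions (420) and the Banner median (70) come from the example; every CTR and the remaining counts are hypothetical, so the resulting scores are illustrative only.

```python
def weighted_score(segment_ctr, customer_ctr, customer_impressions, constant):
    """Weighted average of segment CTR and customer CTR."""
    total_weight = constant + customer_impressions
    if total_weight == 0:
        return 0.0
    return (
        constant * segment_ctr + customer_impressions * customer_ctr
    ) / total_weight


# channel: (segment CTR, customer CTR, customer impressions, constant)
# All CTRs are made up; Text has the highest segment CTR and no customer history.
customer_x = {
    "Email": (0.06, 0.15, 200, 100),
    "Banner": (0.10, 0.02, 420, 70),
    "Text": (0.11, 0.00, 0, 50),
}

scores = {channel: weighted_score(*values) for channel, values in customer_x.items()}
final_rank = sorted(scores, key=scores.get, reverse=True)

for position, channel in enumerate(final_rank, start=1):
    print(position, channel, round(scores[channel], 4))
```

Banner drops to the bottom because the customer's own low CTR dominates its score, while Text, a channel the customer has never seen, is ranked on its segment CTR alone.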

Evaluation

Following the implementation of the model, our next natural challenge was its evaluation. To assess the model, we employed both online and offline evaluation methodologies. Offline evaluation offers a controlled and cost-effective environment for initial model selection and tuning, while online evaluation provides valuable insights into real-world performance.

Offline evaluation

Offline evaluation enabled us to experiment with various hyperparameter configurations, including the choice of constant, training period, and different segment definitions. Our goal was to closely simulate an A/B testing scenario within our offline evaluation framework. The setup we designed is outlined below:

Data preparation: For offline evaluation, we performed a temporal split of the interaction data. The models were trained on data spanning a specific number of months and subsequently tested on campaigns initiated after the training period (referred to as test campaigns).

Methodology: To measure the model performance, we used campaign CTR and channel-level CTR as our metrics. We started by filtering the test data. We selected only the interactions from the test campaigns for each customer that matched the channels predicted by the model and then calculated the metrics.

By comparing models with different hyperparameter configurations against random predictions and predefined business rules on the test campaigns, we identified the hyperparameters that yielded the best performance.
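A simplified sketch of this setup is shown below. The interaction log, the cutoff date, and the predictions are all hypothetical; in practice the predictions would come from a model trained on the rows before the cutoff. The "always Email" rule stands in for a predefined business rule.

```python
from datetime import date

# Hypothetical log: (customer_id, channel, campaign_start, impressions, clicks)
interactions = [
    ("A", "Email", date(2024, 1, 10), 10, 2),
    ("A", "Banner", date(2024, 2, 5), 20, 1),
    ("A", "Email", date(2024, 4, 2), 5, 1),
    ("A", "Banner", date(2024, 4, 2), 8, 0),
    ("B", "Banner", date(2024, 4, 9), 10, 2),
    ("B", "Email", date(2024, 4, 9), 10, 0),
]

# Temporal split: campaigns starting on or after the cutoff are test campaigns.
cutoff = date(2024, 4, 1)
train = [row for row in interactions if row[2] < cutoff]
test = [row for row in interactions if row[2] >= cutoff]


def ctr_on_predicted(test_rows, predicted_channel):
    """CTR over the test rows whose channel matches the customer's prediction."""
    matched = [row for row in test_rows if predicted_channel.get(row[0]) == row[1]]
    impressions = sum(row[3] for row in matched)
    clicks = sum(row[4] for row in matched)
    return clicks / impressions if impressions else 0.0


# Assumed model output and a simple business-rule baseline.
model_prediction = {"A": "Email", "B": "Banner"}
rule_prediction = {"A": "Email", "B": "Email"}

model_ctr = ctr_on_predicted(test, model_prediction)
rule_ctr = ctr_on_predicted(test, rule_prediction)
print(f"model: {model_ctr:.3f}, business rule: {rule_ctr:.3f}")
```

The same function can be applied to each hyperparameter configuration and to random predictions, which gives comparable CTRs on identical test campaigns.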

Online evaluation

To assess the real-world performance of our model, we conducted an online evaluation using an A/B test in a campaign, with campaign CTR and channel-level CTR serving as the primary metrics.

Methodology: In collaboration with the business, we developed a campaign to evaluate the performance of the model. We analyzed past interaction data from similar campaigns to establish baseline CTRs (at both campaign and channel level) and conducted a power analysis. A power analysis determines the sample size required to detect a true effect of a specified size at a chosen significance level and statistical power. This helps ensure the experiment is large enough to detect the effect we care about, keeping the Type I (false positive) error rate at the chosen significance level and limiting the Type II (false negative) error rate.
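For a CTR comparison, the sample size can be approximated with the standard normal-approximation formula for two proportions. The sketch below is one common way to do this and is not necessarily the calculation we used; the baseline CTR, the expected CTR, the significance level, and the power are all assumed values.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_group(baseline_ctr, expected_ctr, alpha=0.05, power=0.8):
    """Approximate customers needed per group for a two-sided two-proportion test."""
    if baseline_ctr == expected_ctr:
        raise ValueError("the two CTRs must differ to define an effect size")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # controls Type I errors
    z_beta = NormalDist().inv_cdf(power)           # controls Type II errors
    p_bar = (baseline_ctr + expected_ctr) / 2
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta
        * sqrt(
            baseline_ctr * (1 - baseline_ctr) + expected_ctr * (1 - expected_ctr)
        )
    ) ** 2
    return ceil(numerator / (expected_ctr - baseline_ctr) ** 2)


# Assumed baseline CTR of 5% and a hoped-for CTR of 7% in the treatment group.
n_per_group = sample_size_per_group(0.05, 0.07)
print(n_per_group)
```

Smaller expected differences require considerably larger groups, which is why the baseline CTRs from similar campaigns matter.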

Based on the power analysis results, we then divided the audience into treatment and control groups. The treatment group received messages in the channel predicted by the model (with hyperparameter configurations chosen from the offline evaluation). Meanwhile the control group continued to receive messages with the existing setup. This allowed for a direct comparison of metrics between the two groups.

Upon completion of the campaign, we calculated the CTRs for both the treatment and control groups and conducted a significance test on these CTRs. The results indicated a statistically significant increase of 73% in the CTR for the treatment group compared to the control group. It is important to note that this finding pertains to a single campaign; the robustness of the model will be further evaluated across multiple campaigns.
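A significance test on two CTRs can be sketched as a two-proportion z-test. This is one common choice and an assumption on our side, since the specific test is not named above; the click and audience counts are hypothetical and do not reproduce the campaign results.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(clicks_control, n_control, clicks_treatment, n_treatment):
    """Two-sided z-test for a difference between two click-through rates."""
    p_control = clicks_control / n_control
    p_treatment = clicks_treatment / n_treatment
    pooled = (clicks_control + clicks_treatment) / (n_control + n_treatment)
    std_error = sqrt(pooled * (1 - pooled) * (1 / n_control + 1 / n_treatment))
    if std_error == 0:
        return 0.0, 1.0  # no variation at all, so no evidence of a difference
    z = (p_treatment - p_control) / std_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical outcome: 100 clicks from 2000 control customers,
# 140 clicks from 2000 treatment customers.
z_stat, p_value = two_proportion_z_test(100, 2000, 140, 2000)
relative_uplift = (140 / 2000 - 100 / 2000) / (100 / 2000)

print(f"z = {z_stat:.2f}, p = {p_value:.4f}, uplift = {relative_uplift:.0%}")
```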

Limitations of popularity-based models

While popularity-based recommendation models are simple and easy to implement, they have several drawbacks when compared to more advanced recommendation models. Some of the common drawbacks we could encounter are:

Long-Tail Problem: They neglect less popular items, resulting in a lack of diversity.
Temporal Dynamics: They struggle to adapt to changing user preferences.
Bias and Fairness: Popularity-based models can reinforce existing biases by continually promoting already popular items. This can create a feedback loop that further entrenches the popularity of certain items while marginalizing others.
Dependency on segment definition: The model's effectiveness depends on how the segments are defined. If the segments are too broad, the recommendations will be too general. If they are too narrow, the model may suffer from data sparsity, which limits its ability to give accurate recommendations.

Conclusion

By leveraging historical interaction data and conducting in-depth analysis, we developed a model that enables the business to reach customers through the right communication channel, thereby enhancing user engagement. We will continue to test the model across various campaigns and closely monitor its performance. Building on the insights gained from this foundational model, we plan to take the next step and develop more advanced models in the future.

Author

Veera is a data scientist at ABN AMRO N.V., working within the Customer Digital Experience department. He is passionate about solving business challenges by uncovering hidden patterns and deriving actionable insights from data.

Connect with Veera on LinkedIn.