How to build a recommender system: A real Use Case to personalize E-Mail campaigns

Benedikt Schifferer
Published in NVIDIA Merlin
5 min read · Nov 2, 2022

There are many great resources on how to train a recommender model, yet it is still difficult to apply them to your own real-world use-case. The materials often focus on technical aspects, such as architectures, and neglect the challenges around scope, data and deployment. We want to address this gap by sharing our experience in building a recommender system to personalize some of our E-Mail campaigns. If you are planning to develop an in-house recommender system or are just interested in the domain, these blog posts should give you guidance on how to structure and start your own use-case.

This is a series of blog posts in which we share our approach, challenges, decisions and learnings from developing a recommender system to personalize E-Mail marketing campaigns. The content is based on our own internal use-case, and we will provide a step-by-step guide on how to build a recommender system from scratch (of course, we will not share any user information :)).

How to start a recommender system? Defining the Goal is a good Step

Many companies engage with their customers and members through their own events. NVIDIA hosts the GPU Technology Conference (GTC) twice a year, a global AI conference that brings together developers, engineers, researchers, inventors, and IT professionals and hosts hundreds of sessions across different topics. For GTC Fall 2022, the NVIDIA marketing team and the NVIDIA Merlin team discussed how to use personalization to improve the GTC attendees’ experience. We had multiple brainstorming meetings to define the scope of the project. We evaluated potential ideas by collecting information about:

  • What would be the value of the project? E.g. how many people are impacted by it?
  • Would personalization improve the user experience?
  • Do we have relevant historical data to train on? This can be a quick estimate. If we do not have the data for an idea, even a good one, we cannot develop a system without collecting the data first.

We decided to develop multiple E-Mail campaigns for registered users of GTC Fall 2022 to recommend talks at the conference. Several considerations supported this choice. First, the project targets a large audience, all registered GTC users: over 75,000 people attended GTC Spring 2022 (online), which hosted around 1,000 talks [1] [2]. Second, GTC contains a large number of diverse talks, covering graphics, data science, machine learning, many subareas of deep learning, supercomputing, virtual reality/augmented reality, etc. (I sometimes forget that GPUs are used in so many domains nowadays #advertisement :)). As there are so many cross-domain talks to choose from, E-Mail recommendations can help attendees consider relevant talks that they might otherwise miss. Finally, E-Mail campaigns are offline batch processing, which does not require the same level of reliability as an online service.

Although that might sound really easy, it required some time to collect and evaluate all ideas, as well as align on the goal with every department that was involved.

Translate the Scope into a technical Architecture

After we decided on the scope, we visualized the architecture and dataflow (a simplified version is above) to understand all the dependencies and who needs to be involved. We chose this scope because of its simplicity: it is a great starting point for developing a recommender system from scratch.

Great, now we have a scope and an architecture, so the execution will be easy, won’t it? We wish. Let’s look at some of the challenges ahead of us.

The Extreme Cold-Start Problem

The use-case of recommending (in advance) conference talks to attendees has a unique characteristic: Every talk (item in recommender system language) is new and we have no user behavior information for it.

Quick side topic: In traditional recommender systems, we have users and items and collect interactions between them (e.g. a user watched an item). Every day, new items are added to the catalog. In the beginning, we do not have user behavior for them, but we can collect data (exploration) by recommending new items to a few users. After a few days, once we have collected enough information, our recommender system can learn the behavior for new items.
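
The exploration idea from this side topic can be sketched in a few lines of Python. This is only an illustration of the general technique, not something we used for the E-Mail campaigns: the function name `recommend_with_exploration`, the `explore_rate` parameter and the toy item IDs are all made up for this example.

```python
import random

def recommend_with_exploration(ranked_items, new_items, k=5, explore_rate=0.1, rng=None):
    """Fill k slots, mostly from the model's ranking, sometimes with a new item.

    ranked_items: item IDs sorted by model score, best first
    new_items:    item IDs that have no interaction data yet
    explore_rate: chance that a slot goes to a new item (hypothetical knob)
    """
    rng = rng or random.Random()
    ranked = [item for item in ranked_items if item not in new_items]
    fresh = list(new_items)
    result = []
    for _ in range(k):
        want_new = bool(fresh) and rng.random() < explore_rate
        if want_new or not ranked:
            if not fresh:
                break  # nothing left to recommend
            pick = fresh.pop(rng.randrange(len(fresh)))
        else:
            pick = ranked.pop(0)
        result.append(pick)
    return result

# Toy usage: four known items, two brand-new ones, 30% of slots explore.
print(recommend_with_exploration(
    ["a", "b", "c", "d"], ["n1", "n2"], k=3, explore_rate=0.3, rng=random.Random(7)
))
```

The interactions collected on the explored items are what later allow a model to learn about them. That only works if there is time to collect them.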

In our use-case, we do not have any behavior data for the GTC Fall 2022 talks. The conference runs over only 3 days, so there is little time to collect interactions during the event. We do collect data for past talks, but we cannot recommend them because they have already taken place. In addition, we have a significant number of new attendees at every GTC. That is a real cold-start problem.

Image Adapted from: Off-policy Learning in Two-stage Recommender Systems

We will discuss how to model this use-case in future blog posts. But a small spoiler: We developed a two-tower architecture, where we feed the attendee (user) features into one MLP tower and the talk (item) features into another MLP tower. The dot product of the two tower outputs determines the final score (the probability that the attendee is interested in the talk). The trick is to select attendee and talk features that stay consistent across all GTCs. In our next blog post, we will discuss the challenges with the data.
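
To make the two-tower idea more concrete, here is a minimal sketch in plain Python. It is not our actual model: the weights are random and untrained, there are no bias terms, and the layer sizes, the feature values, the sigmoid on top of the dot product and all function names are assumptions chosen for illustration. The point is that the score depends only on features, so a brand-new talk can be scored without any interaction history.

```python
import math
import random

def init_mlp(sizes, rng):
    """Create random (untrained) weight matrices for the given layer sizes."""
    return [
        [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
        for n_in, n_out in zip(sizes[:-1], sizes[1:])
    ]

def mlp_forward(layers, x):
    """Apply each linear layer, with a ReLU between layers (none on the output)."""
    for idx, weights in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) for row in weights]
        if idx < len(layers) - 1:
            x = [max(0.0, v) for v in x]
    return x

def score(user_tower, item_tower, user_features, item_features):
    """Dot product of the two tower outputs, mapped to (0, 1) by a sigmoid."""
    u = mlp_forward(user_tower, user_features)
    t = mlp_forward(item_tower, item_features)
    logit = sum(a * b for a, b in zip(u, t))
    return 1.0 / (1.0 + math.exp(-logit))

rng = random.Random(42)
user_tower = init_mlp([3, 8, 4], rng)  # 3 attendee features -> 4-dim embedding
item_tower = init_mlp([2, 8, 4], rng)  # 2 talk features -> 4-dim embedding

attendee = [1.0, 0.0, 0.5]  # made-up attendee features
talks = {"talk_a": [1.0, 0.0], "talk_b": [0.0, 1.0]}  # made-up talk features

# Score every talk for this attendee and sort from best to worst.
scores = {name: score(user_tower, item_tower, attendee, feats)
          for name, feats in talks.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Because the two towers only meet in the dot product, the talk embeddings can be computed once and reused for every attendee, which fits an offline batch job.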

Key Learnings and Summary

In this first blog post, we explained our considerations in selecting a recommender system project, described how we settled on personalizing E-Mail campaigns for GTC, and gave a preview of the challenges.

Next week, we will share the data cleaning and preprocessing steps. If you are interested, subscribe to our blog to get notified when we publish our next post.

In the meantime, if you want to learn more about two-tower architecture, we can recommend our blog post “Scale faster with less code using Two Tower with Merlin” or our examples on GitHub. Feel free to check out more of our work.

Team

Thanks to the great team developing the in-house use-case: Angel Martinez, Pavel Klemenkov, Benedikt Schifferer

[1] https://en.wikipedia.org/wiki/Nvidia_GTC
[2] https://www.nvidia.com/en-us/on-demand/

Benedikt Schifferer is a Deep Learning Engineer at NVIDIA working on recommender systems. Previously, he graduated with an MSc in Data Science from Columbia University.