How we balance relevance and surprise in AI-generated content recommendations

Reika Fujimura
CBC Digital Labs
Feb 28, 2023

Imagine opening your laptop and wandering around the internet for a while. You will probably see plenty of recommendations popping up in front of you, suggesting “these are the best items for you to check out next!” Reflecting on these moments, how many times did you find them irrelevant, too biased or repetitive? You might recall the time you found the page filled with near-identical air fryers right after you had bought a new one.

It’s often difficult to balance relevance and surprise in real-world applications: we want just the right degree of relevance, while also ensuring recommendations are not so heavily filtered that you miss the chance to discover new content or products.

Indeed, there is plenty of research on how to introduce surprise into recommendations. However, these approaches tend to be highly technical and heuristic, meaning they depend heavily on each specific business problem. There is still no simple, established and generalizable method for solving this problem, and that is why so many recommendations fail to feel “natural” to us.

Here, the Customization and Machine Learning (CaML) team at CBC tackled this problem with an original method, which turns out to be not only robust but also generalizable to different machine learning models. In this blog post, we will share our method, which borrows the notion of “entropy” from statistical physics to balance “relevance” and “surprise” in our recommendations, and explain how to understand entropy in this new context.

Idea of Entropy

In the context of data science, entropy is frequently used as a measure of the impurity, disorder or randomness of the information processed. For example, the decision tree algorithm learns to minimize the impurity of its leaf nodes by choosing the most homogeneous split at every node, using entropy as the impurity measure.

More generally, entropy can be thought of as the expected “surprise”. For example, let’s think about drawing a ball from a bag. In bag A, all the balls are orange. In bag B, exactly half of them are orange and the other half are blue. The “expected surprise” of drawing a ball from bag A is zero, since we already know we will always get an orange ball, whereas for bag B it is maximized, because we have no idea which colour we will get.

“Expected surprise” of drawing an orange or blue ball from a bag. When all the balls in the bag are orange, the “expected surprise” is zero, while when half are orange and the other half are blue, the “expected surprise” is maximized.

As suggested by this example, the entropy or “expected surprise” becomes smaller when items are more similar to each other. Conversely, it gets higher when there is more diversity in the items.
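To make the bag example concrete, here is a minimal sketch of the entropy calculation for the two bags (the function and the numbers are ours, purely for illustration):

```python
import numpy as np

def entropy(p):
    """Shannon entropy (in bits) of a probability distribution."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]  # zero-probability outcomes contribute nothing
    return -np.sum(p * np.log2(p))

# Bag A: every ball is orange, so there is nothing to be surprised about.
print(entropy([1.0]))        # 0.0 bits

# Bag B: half orange, half blue -- the most uncertain a two-outcome bag can be.
print(entropy([0.5, 0.5]))   # 1.0 bit
```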

In our method, which we named “entropy sampling”, we use this measure to adjust the “expected surprise” in our recommendations and thereby balance relevance and diversity.

Balancing the relevance and diversity in the recommendation for a person who liked a coral pink cat. With too low “expected surprise”, the recommendation looks boring, whereas with too high “expected surprise” it looks irrelevant.
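The exact implementation is covered in the full post. As a rough sketch of the general idea only, one way to tune the “expected surprise” is to reweight the similarity-based probability distribution with a temperature-like exponent and keep the exponent whose distribution has entropy closest to a chosen target. The function below is our illustrative assumption, not the CaML team’s production code:

```python
import numpy as np

def entropy(p):
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def sample_with_target_entropy(scores, target_entropy, k=3,
                               betas=np.linspace(0.1, 5.0, 50)):
    """Illustrative sketch: raise similarity scores to an exponent beta,
    normalize, and keep the beta whose entropy is closest to the target."""
    best_p, best_gap = None, np.inf
    for beta in betas:
        p = scores ** beta
        p = p / p.sum()
        gap = abs(entropy(p) - target_entropy)
        if gap < best_gap:
            best_p, best_gap = p, gap
    # Sample k items from the entropy-adjusted distribution.
    return np.random.choice(len(scores), size=k, replace=False, p=best_p)

# Hypothetical similarity scores for five candidate shows.
scores = np.array([0.9, 0.7, 0.5, 0.2, 0.1])
picks = sample_with_target_entropy(scores, target_entropy=1.5)
```

In this sketch, a very low target entropy reproduces a near-deterministic “top items” behaviour, a very high one approaches uniform random sampling, and values in between give a blend of the two.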

If you’d like to dive deep into the details, here is the full version of this article with technical and implementation details!

How does it work?

From here, we will show how entropy sampling affects the recommendation results by comparing them with the results before it is applied.

In the case of an item-based collaborative filtering model, our recommender works like this: user A listened to show X, and show X tends to be liked by the same people who like show Y. Therefore, given that user A has not listened to show Y, they should be given it as a recommendation.
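As a rough sketch of that item-based logic (with a made-up interaction matrix and cosine similarity, which may differ from our production setup):

```python
import numpy as np

# Hypothetical user-show interaction matrix: rows are users, columns are shows.
# 1 means the user listened to the show.
interactions = np.array([
    [1, 1, 0, 1],   # user 0 (user A)
    [1, 1, 1, 0],   # user 1
    [0, 1, 1, 1],   # user 2
])

# Item-item cosine similarity: shows listened to by the same people score high.
norms = np.linalg.norm(interactions, axis=0)
item_sim = (interactions.T @ interactions) / np.outer(norms, norms)

# Score shows for user A by their similarity to the shows user A listened to.
user_a = interactions[0]
scores = item_sim @ user_a
scores[user_a == 1] = -np.inf   # never re-recommend what they already listened to
top_recommendation = int(np.argmax(scores))
```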

Let’s first look at the bare result and see what this collaborative filtering model recommends.

Example recommendation for user A using the CBC Listen app with Top 5 sampling. A person who liked Metro Morning, which is a news show, gets a recommendation of Here and Now Toronto, Fresh Air, The Current, Ontario Today, and The Sunday Magazine, all of which are news shows.

These shows are reasonable recommendations, but they might be too obvious for the user’s tastes. Furthermore, even if they are of interest to a user, opening the CBC Listen app again the next day and seeing the exact same recommendations would get awfully unhelpful fast!

So, let’s think about adding some randomness here to avoid the too obvious, too fixed recommendations we saw above. The most straightforward solution would be random sampling from the similarity score distribution. In other words, from the user’s side, the similarity score distribution can be transformed into a probability distribution (a probability mass function) over the items that could appear in the recommendations. After applying this random sampling technique, the recommendations for user A look like:

Example recommendation for user A sampled from the similarity score distribution. A person who liked Metro Morning, which is a news show, gets a recommendation of The Loop (Edmonton local news), Evil by Design (society), The Debaters (comedy show), World Report (news show), and Party Lines (politics).
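Mechanically, that straightforward sampling looks something like the sketch below (the show names and similarity scores are made up for illustration):

```python
import numpy as np

# Hypothetical similarity scores between Metro Morning and candidate shows.
candidates = ["Here and Now Toronto", "Fresh Air", "The Loop",
              "The Debaters", "Party Lines"]
scores = np.array([0.8, 0.7, 0.3, 0.2, 0.2])

# Normalize the scores into a probability mass function...
p = scores / scores.sum()

# ...and sample a handful of shows without replacement.
picks = np.random.choice(candidates, size=3, replace=False, p=p)
```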

Now another problem arises. This user liked Metro Morning, which is a news show, but got a lot of shows that don’t sound related to it. User A might be confused and wonder where The Loop or Party Lines came from.

Now, let’s see what the recommendation for this user looks like after entropy sampling is applied.

Example recommendation for user A sampled with entropy sampling. A person who liked Metro Morning, which is a news show, gets a recommendation of Podcast Playlist (Producer’s Pick), Fresh Air (news), Here and Now Toronto (news), Evil by Design (society), and Ontario Today (news).

Now we have much more relevant recommendations like Fresh Air, Here and Now Toronto and Ontario Today. Moreover, there are also a few new genres of shows. This recommendation is not as boring as the Top 5 recommendation, and it is much more relevant than the random sampling one.

One side benefit of entropy sampling is that its effect is robust across all users, since each user’s recommendation can be optimized separately. (For technical and implementation details, please refer to the full post!)

Although there is plenty of research on how to introduce diversity into recommendations, we took a relatively simple approach by introducing the concept of entropy.

With this approach, we see an improvement in our recommendations, with a more finely tuned balance of relevance and surprise at our delivery endpoint.

Not only that, we find our approach robust: because it is a post-processing method, it can be applied to other recommendation models as well. It’s also a simple approach compared to other techniques used for similar purposes.

If you’re interested in the details, here is the full version of this article!
