A practical guide to content-based recommender systems

Guillaume Galante
4 min read · May 12, 2022


Recommendations are all around us, from YouTube recommending what we should watch to Amazon proposing what book we should read. We’re constantly being suggested things, but have you ever wondered how these recommendations work behind the scenes?

Before we start, if recommender systems don’t mean anything to you, I highly recommend reading my previous post, which is an intro to recommender systems.

What is content-based filtering?

Content-based filtering is a recommendation technique that uses item features to recommend similar items a user might be interested in. For example, if a user likes a song, you can highlight other songs that share the same features.

As users perform actions within your product, you learn about their intent (“what are they looking for?”). These actions are the key to increasing the relevance of your recommendations.

For example, if a user watches movies across multiple genres (Drama, Comedy, Action) but you find out that the same actor appears in all of them, you can recommend other movies featuring this actor.
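To make the actor example concrete, here is a minimal sketch in plain Python. The catalogue, the movie titles, the actor names and the function name recommend_by_shared_actor are all made up for illustration; a real system would work from your own item features.

```python
from collections import Counter

# Hypothetical catalogue: every title, genre and actor below is invented.
catalogue = {
    "Movie A": {"genre": "Drama", "actors": {"Jane Doe", "Sam Poe"}},
    "Movie B": {"genre": "Comedy", "actors": {"Jane Doe"}},
    "Movie C": {"genre": "Action", "actors": {"Jane Doe", "John Roe"}},
    "Movie D": {"genre": "Thriller", "actors": {"Jane Doe"}},
    "Movie E": {"genre": "Drama", "actors": {"John Roe"}},
}

# What the user has watched so far (three different genres).
watched = ["Movie A", "Movie B", "Movie C"]


def recommend_by_shared_actor(watched, catalogue):
    """Find the actor the user has seen most often, then suggest unseen movies with them."""
    actor_counts = Counter(
        actor for title in watched for actor in catalogue[title]["actors"]
    )
    top_actor, _ = actor_counts.most_common(1)[0]
    suggestions = [
        title
        for title, info in catalogue.items()
        if title not in watched and top_actor in info["actors"]
    ]
    return top_actor, suggestions


top_actor, suggestions = recommend_by_shared_actor(watched, catalogue)
print(top_actor, suggestions)
```

The genres differ across the watched movies, so the shared feature here is the actor. The same counting approach could be applied to any other feature (director, decade, keywords).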

Datasets

Before we start, you’ll need a dataset to work with. You can build your own, for example by scraping websites to collect data, or you can use an existing one from Kaggle (MovieLens is a popular one for movie recommendations). In this article, we’ll be using a Craft Beer dataset I’ve created (referencing 60,000+ active beers) to build a recommendation engine that can suggest which beers I should try next.

Applications of content-based recommendations

There are many applications for content-based filtering. With a bit of practice, you’ll find multiple ways to recommend content to your users. To build them yourself, you’ll need some knowledge of Python and familiarity with Pandas (a data analysis and manipulation tool).

Categories

One easy way to group your recommendations is by category: music genres, book authors or, in this example, beer styles.

In this example, I’ve split my dataset by beer styles (IPA, Stout, Sour, Pale Ales, …). I’ve kept only the beers that have at least 15 user ratings and sorted them from highest to lowest rating.

There are endless ways to categorise your dataset as long as it creates value for users. For example, I could decide to display similar beer style recommendations on certain pages of my product, whereas on other pages I could group recommendations by breweries.
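The filtering and sorting by style described here could be sketched in Pandas roughly as follows. The sample rows and the column names (name, style, rating, rating_count) are assumptions made for illustration, not the actual schema of my dataset; only the 15-rating threshold comes from the text.

```python
import pandas as pd

# Hypothetical sample data; names, ratings and column names are invented.
beers = pd.DataFrame({
    "name": ["Hop Cloud", "Night Shift", "Citrus Haze", "Dark Roast", "Tart Cherry"],
    "style": ["IPA", "Stout", "IPA", "Stout", "Sour"],
    "rating": [4.2, 4.5, 3.9, 4.1, 4.0],
    "rating_count": [120, 45, 15, 9, 30],
})

MIN_RATINGS = 15  # threshold mentioned in the article


def top_by_style(df, min_ratings=MIN_RATINGS, top_n=10):
    """Keep sufficiently rated beers, then return the best-rated ones per style."""
    rated = df[df["rating_count"] >= min_ratings]
    # Sort by style, then by rating from highest to lowest within each style.
    ranked = rated.sort_values(["style", "rating"], ascending=[True, False])
    # head() on a groupby keeps the first top_n rows of every style.
    return ranked.groupby("style").head(top_n)


top_beers = top_by_style(beers)
print(top_beers)
```

The minimum number of ratings is there so that a beer with one enthusiastic review doesn’t outrank a beer that hundreds of people rated well.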

Popularity

Another way to highlight certain items is to use a popularity score. For example, Netflix could highlight the top trending series this month based on views. It is a common way of showing which of your products are popular. However, keep in mind that a popular item isn’t always relevant for a specific user.

In this example, I’ve sorted the dataset by popularity and restricted the results to the month of July.
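A popularity ranking limited to one month could look something like the sketch below. I’m assuming popularity is measured by counting check-ins per beer. The checkins table, its column names and the year 2021 are all hypothetical; only the month of July comes from the text.

```python
import pandas as pd

# Hypothetical check-in log; beers, dates and column names are invented.
checkins = pd.DataFrame({
    "beer": ["Hop Cloud", "Night Shift", "Hop Cloud",
             "Tart Cherry", "Hop Cloud", "Night Shift"],
    "checked_in_at": pd.to_datetime([
        "2021-07-02", "2021-07-10", "2021-07-15",
        "2021-06-28", "2021-07-28", "2021-08-01",
    ]),
})


def most_popular(df, year, month, top_n=10):
    """Count check-ins per beer within one calendar month, most popular first."""
    in_month = df[
        (df["checked_in_at"].dt.year == year)
        & (df["checked_in_at"].dt.month == month)
    ]
    # value_counts() sorts from the most to the least frequent value.
    return in_month["beer"].value_counts().head(top_n)


july_top = most_popular(checkins, year=2021, month=7)
print(july_top)
```

Views, purchases or ratings could replace check-ins as the popularity signal, depending on what your product tracks.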

User preferences

If you’re aiming for more relevant recommendations, you’ll need to understand your users. As they interact with your product, you can start learning their preferences. For example, Netflix can look at what you’ve added to your watchlist, Spotify can learn from the playlists you’ve created, and Amazon can deduce what you’re looking for based on the pages you’ve explored.

In this example, I’m using the breweries a user “liked” to recommend the top-rated beers from those breweries.
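Here is a rough sketch of that idea in Pandas. The sample beers, the brewery names, the liked_breweries list and the column names are assumptions for illustration; in a real product the “likes” would come from your own user data.

```python
import pandas as pd

# Hypothetical sample data; beers, breweries and column names are invented.
beers = pd.DataFrame({
    "name": ["Hop Cloud", "Night Shift", "Citrus Haze", "Tart Cherry"],
    "brewery": ["Brewery North", "Brewery South", "Brewery North", "Brewery East"],
    "rating": [4.2, 4.5, 3.9, 4.0],
})

# Breweries this (hypothetical) user has liked.
liked_breweries = ["Brewery North", "Brewery East"]


def recommend_from_liked(df, liked, top_n=5):
    """Return the top-rated beers brewed by the breweries the user liked."""
    from_liked = df[df["brewery"].isin(liked)]
    return from_liked.sort_values("rating", ascending=False).head(top_n)


recommendations = recommend_from_liked(beers, liked_breweries)
print(recommendations)
```

Night Shift has the highest rating in this sample, but it is left out because the user never liked its brewery. That is the trade-off between popularity and personal relevance mentioned earlier.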

Last but not least, avoid these mistakes

Making recommendations comes with a great sense of responsibility. If badly executed, it can damage your brand or even reduce trust in your product. That’s why combining regular research studies (user interviews and testing to understand user needs) with experiments within the product (such as A/B tests) is highly recommended. Below are the common mistakes teams make when developing recommendations for the first time:

  • The Filter Bubble

Recommendations can narrow users down into a single thread of ideas or products. Be careful to regularly highlight different opinions or ideas to your users. If you want to learn more, I recommend this great talk from Eli Pariser.

  • Internal Biases

Biases are everywhere. Make sure to constantly remind your team of them. Diversity makes us stronger, so make sure to get people with different opinions and cultural backgrounds on your team. It’s also worth sharing an early beta around (company, friends, family) to catch problems before they turn into bigger mistakes.

  • Be Explicit

Avoid vagueness or ambiguity in your recommendations. Be clear about why each item is being recommended, so users understand why it might be a good fit for them.

  • Give Controls

Allow users to give feedback on your recommendations, from a thumbs up/down vote on any recommendation to simply being able to remove a suggestion. This will give your team valuable insight into what can be improved and where.

Thanks for reading! :)

Guillaume Galante

Currently working at Omio, powering journeys for millions of travelers. I’m passionate about personalisation, data science and system-thinking.