Hands-on Content Based Recommender System using Python

Here’s how to create a content based recommender system in a few steps with Python

Piero Paialunga

Published in

Towards Data Science

6 min read · Jan 16, 2022

One of the most surprising and fascinating applications of Artificial Intelligence is, without a doubt, the recommender system.

In a nutshell, a recommender system is a tool that suggests the next piece of content to you, given what you have already seen and liked. Companies like Spotify, Netflix or YouTube use recommender systems to suggest the next video or song, given what you have already watched or listened to.

The idea of building a recommender system is certainly not new. In 2006 Netflix announced a 1 million dollar reward for the research team able to build the best possible recommender system given some test data. This was called the “Netflix Prize”, by the way.

Since the idea is relatively old, there are some crazy good recommender systems out there, and they consider a lot of variables and specific information about the user. Of course, Netflix’s recommender system is not open source, and even if we have some idea of how it might work, we don’t have their pre-trained, ready-to-use magic models.

Anyway, there is a way to keep a recommender system pretty simple, easy to run, and still working surprisingly well!

In this notebook I’ll show you how to build a content based recommender system using a few lines of code and some basic knowledge of machine learning and linear algebra.

Let’s dive in :)

0. The Libraries

This is what you’ll need to make it work:

P.S. I also imported KMeans to try an unsupervised learning approach, but I didn’t end up using it in this notebook, so you don’t need to install it if you don’t have it.
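The original import cell is not shown here, so the package list below is only an assumption based on the steps described later (BERT embeddings, PCA, cosine similarity, a dataframe, a few plots). As a minimal sketch, this is one way to check what is already installed before starting:

```python
from importlib.util import find_spec

# Assumed package list: these names are a guess based on the steps in the
# article, not the author's actual import cell.
ASSUMED_PACKAGES = {
    "numpy": "vector math",
    "pandas": "loading and storing the movie table",
    "sklearn": "PCA and cosine similarity (scikit-learn)",
    "sentence_transformers": "BERT sentence embeddings",
    "matplotlib": "plots",
}


def check_packages(packages=ASSUMED_PACKAGES):
    """Return {import_name: True/False} depending on whether it can be found."""
    return {name: find_spec(name) is not None for name in packages}


if __name__ == "__main__":
    for name, installed in check_packages().items():
        status = "ok" if installed else "missing"
        print(f"{name:<22} {status:<8} {ASSUMED_PACKAGES[name]}")
```

Anything reported as missing can be installed with pip before moving on.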

1. The Dataset

The first thing we need is of course the dataset. I found the dataset here and it is basically a collection extracted from IMDb (yes, the famous website). In this collection we have a list of movies with their corresponding features. The features we are using are:

An overview of the movie (i.e. a brief description)
Its title
Its genre (actually, each movie has multiple genres)

Let’s take a look then:

1.1 Importing it:

1.2 Viewing it:
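The import and viewing cells are not shown here, so here is a minimal sketch of what they could look like with pandas. The column names (Series_Title, Genre, Overview) and the two sample rows are assumptions made up for illustration; with the real file you would point pd.read_csv at your own path and adjust the names.

```python
import io

import pandas as pd

# Tiny in-memory stand-in for the downloaded CSV. The column names and the
# rows are hypothetical; adjust them to match your file.
SAMPLE_CSV = io.StringIO(
    "Series_Title,Genre,Overview\n"
    '"Movie A","Drama, Crime","A hypothetical overview about a crime family."\n'
    '"Movie B","Action, Sci-Fi","A hypothetical overview about a space mission."\n'
)

# With the real dataset this would be pd.read_csv("path/to/your_file.csv").
data = pd.read_csv(SAMPLE_CSV)

# Keep only the three features used in the article and drop incomplete rows.
data = data[["Series_Title", "Genre", "Overview"]].dropna().reset_index(drop=True)

print(data.head())  # first rows of the table
print(data.shape)   # (number of movies, number of columns)
```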

2. The Approach

When working with textual data the first thing to do is to convert this text into numbers. More precisely we are converting a string of text into a vector.

But how do we do it? How should we convert the string in a smart way, so that the vectors represent the meaning of the text?
Well, this is a really good question and of course it is not easy to find an optimal answer. Nonetheless, there are really good models (more precisely, transformers), like BERT, which are able to convert sentences into vectors in a smart way. If you are interested in the details, the paper is actually a masterpiece and very clear.

Idea of BERT Pre-Training and Fine-Tuning. Image taken from BERT’s paper (here)

But why do we want to convert texts into vectors in the first place?

Well, because now that we have vectors we are able to do all the operations that you usually do on vectors :)

Imagine you are doing a classification task and you want to use Support Vector Machines. Well, of course, you will need to have vectors to do that. So if you have to classify a text you will first convert the text into a vector and then apply the SVM algorithm.

In our case we will use the vectorized text to find the similarity between two vectors. Our recommendations will be the 5 vectors most similar to the one we are considering.

Here is an example below:

Let’s say x and y are two components (we’ll have way more than two). If a movie talks about science, space and rockets and another one talks more or less about the same stuff, we expect the two vectors to be close, like the blue ones.
On the other hand, if the other movie is a love story, we expect its vector to be far away, as the red vector is from the blue ones.
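To put numbers on this picture, here is a small sketch with three made-up 2D vectors. The coordinates are invented for illustration and are not taken from the dataset; real BERT vectors have hundreds of components.

```python
import numpy as np


def cosine(a, b):
    """Cosine of the angle between two vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


# Hypothetical (x, y) coordinates: two "space" movies and one love story.
space_movie_1 = np.array([0.9, 0.2])
space_movie_2 = np.array([0.8, 0.3])
love_story = np.array([0.1, 0.9])

sim_space = cosine(space_movie_1, space_movie_2)
sim_love = cosine(space_movie_1, love_story)

print(f"space vs space: {sim_space:.2f}")
print(f"space vs love:  {sim_love:.2f}")
```

The two space movies point in almost the same direction, so their similarity is close to 1, while the love story scores much lower.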

So here is what we’ll do:

Use BERT to convert our text into a vector
Get the cosine similarity (the cosine of the angle between the two vectors) of a fixed movie (vector) and all the other ones
Pick the movies (vectors) with the largest cosine similarity. We are going to pick 5 of them.
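For reference, the cosine similarity used in the second step is the standard one. For two vectors a and b with n components:

```latex
\[
\operatorname{sim}(\mathbf{a}, \mathbf{b})
  = \cos\theta
  = \frac{\mathbf{a} \cdot \mathbf{b}}{\lVert \mathbf{a} \rVert \, \lVert \mathbf{b} \rVert}
  = \frac{\sum_{i=1}^{n} a_i b_i}{\sqrt{\sum_{i=1}^{n} a_i^{2}} \, \sqrt{\sum_{i=1}^{n} b_i^{2}}}
\]
```

It ranges from -1 (opposite directions) to 1 (same direction), and it ignores the length of the vectors, so only the direction matters.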

And that’s it :)

I hope it sounds simple already, but it will become clearer as I show you how to do it. So let’s do this!

3. The Method

3.1 From text to vector:

Here’s how you convert a text to a vector:
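The conversion cell is not shown here, so the following is only a sketch of one common way to do it, assuming the sentence-transformers library and a BERT-based checkpoint. The model name and the embed_overviews helper are assumptions, not necessarily what was used in the article.

```python
import numpy as np

# Assumption: a BERT-based checkpoint from the sentence-transformers library.
# The model actually used in the article is not shown here.
ASSUMED_MODEL_NAME = "bert-base-nli-mean-tokens"


def embed_overviews(overviews, model=None):
    """Turn a list of strings into a 2D array with one row (vector) per string."""
    if model is None:
        # Imported lazily so the function can also be tried with any object
        # that exposes an encode(list_of_strings) method.
        from sentence_transformers import SentenceTransformer

        model = SentenceTransformer(ASSUMED_MODEL_NAME)
    return np.asarray(model.encode(list(overviews)), dtype=float)
```

With the movie table loaded in a dataframe, you would pass the column of overviews to this function and get back one vector per movie. The first call downloads the model, so it can take a while.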

3.2 PCA (optional)

I’m now using a well-known method to reduce the dimensionality of our dataset (Principal Component Analysis). It is only there to plot the data and give you an idea of the result of the text-to-vector conversion, so you can safely skip it.
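The PCA cell is not shown here. A library such as scikit-learn offers PCA out of the box; the sketch below computes the same kind of 2D projection with plain NumPy (via the singular value decomposition), using random numbers as a stand-in for the real embeddings. The pca_2d helper and the 768-dimensional fake data are assumptions for illustration.

```python
import numpy as np


def pca_2d(vectors):
    """Project the rows of `vectors` onto their first two principal components."""
    X = np.asarray(vectors, dtype=float)
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, sorted by decreasing variance.
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ Vt[:2].T


# Random stand-in for the real embeddings: 50 "movies", 768 components each.
rng = np.random.default_rng(0)
fake_embeddings = rng.normal(size=(50, 768))

points = pca_2d(fake_embeddings)
print(points.shape)  # (50, 2)
```

Each row of points is an (x, y) pair that can go straight into a scatter plot.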

3.3 Cosine Similarity and Recommendation Function

Here is how to compute the cosine similarity (one line of code) and the function that we’ll use to get our recommendations:
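The cell itself is not shown here, so this is a minimal sketch of the idea with NumPy. The function names, the toy titles and the toy vectors are all made up for illustration. scikit-learn’s sklearn.metrics.pairwise.cosine_similarity computes the same matrix in one call.

```python
import numpy as np


def cosine_similarity_matrix(embeddings):
    """Cosine similarity between every pair of rows."""
    X = np.asarray(embeddings, dtype=float)
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    X_unit = X / np.clip(norms, 1e-12, None)  # avoid dividing by zero
    return X_unit @ X_unit.T


def recommend(movie_index, similarity, titles, k=5):
    """Return the k (title, score) pairs most similar to the given movie."""
    order = np.argsort(similarity[movie_index])[::-1]  # most similar first
    top = [i for i in order if i != movie_index][:k]   # skip the movie itself
    return [(titles[i], float(similarity[movie_index, i])) for i in top]


# Toy data: three "space" movies and one romance, as 2D vectors.
toy_titles = ["Space A", "Space B", "Romance C", "Space D"]
toy_embeddings = np.array([[0.9, 0.2], [0.8, 0.3], [0.1, 0.9], [1.0, 0.0]])

sim = cosine_similarity_matrix(toy_embeddings)
print(recommend(0, sim, toy_titles, k=2))
```

For "Space A" the two suggestions are the other two space movies, and the romance is left out.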

4. The Results

Let’s plot some recommendations:

We can appreciate something that is really cool. The algorithm recognizes that The Godfather, The Godfather II and The Godfather III are similar!

Let’s explore it deeper.

So if you have just watched The Dark Knight it gives you Joker, Dirty Harry, Batman Begins, Guardians of the Galaxy and Death Note! I find these recommendations pretty accurate, especially considering that the algorithm is pretty simple!

We can appreciate how accurate these recommendations are by looking at the plots:

So, the movie that you have watched has the following plot:

“When the menace known as the Joker wreaks havoc and chaos on the people of Gotham, Batman must accept one of the greatest psychological and physical tests of his ability to fight injustice.”

The first recommended movie has this plot:

“In Gotham City, mentally troubled comedian Arthur Fleck is disregarded and mistreated by society. He then embarks on a downward spiral of revolution and bloody crime. This path brings him face-to-face with his alter-ego: the Joker.”

So they are actually pretty similar!

We can even appreciate that the recommended movies have the same genre as well:

So the movie that you watched is a Drama and Family movie, while the recommended movies are:

Drama
Drama
Drama
Drama, History, War
Adventure, History, Family

Again, this is a sign of quality for such a simple algorithm!

Finally, this is the way you store your recommendations in a dataframe:
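The cell is not shown here, so here is a minimal sketch with pandas, assuming the recommendations are available as (title, score) pairs. The recommendations_to_frame helper, the column names and the example values are hypothetical.

```python
import pandas as pd

COLUMNS = ["watched", "rank", "recommended", "cosine_similarity"]


def recommendations_to_frame(watched_title, recommendations):
    """Build one dataframe row per recommended movie."""
    rows = [
        {
            "watched": watched_title,
            "rank": rank,
            "recommended": title,
            "cosine_similarity": score,
        }
        for rank, (title, score) in enumerate(recommendations, start=1)
    ]
    return pd.DataFrame(rows, columns=COLUMNS)


# Made-up titles and scores, only to show the shape of the result.
example = recommendations_to_frame("Movie A", [("Movie B", 0.91), ("Movie C", 0.84)])
print(example)
```

From here, example.to_csv can write the table to disk if you want to keep it.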

5. Conclusions

I know that there are many ways to build a recommender system. Our assumption here is that we recommend something similar to what you have already watched, but maybe you want to surprise the user, in which case the recommendations shouldn’t be as similar as the ones we are suggesting.

Plus, we are only considering one movie and we are not really considering the user’s taste: it is just about the movie itself.

In other words, it is just one of many, many, many approaches out there, and it is actually really simple. Nonetheless, I find the recommendations pretty accurate and interesting, so it is a method that works well and, computationally speaking, is very cheap!

I had a lot of fun writing this article. I hope you had a lot of fun reading it. If you have any questions or comments I’ll be glad to hear them here: piero.paialunga@hotmail.com.

Ciao :)