Two minutes NLP — Tips for Recommender Systems with NLP

Content-based and User-based Filtering, Collaborative Filtering, and Hybrid Approaches

Fabio Chiusano
NLPlanet
3 min read · Dec 12, 2021


How suggestions work in different types of recommender systems. Image by the author.

There are several types of recommender systems, but not all of them lend themselves to NLP techniques. As a running example, suppose we are building a recommender system for Medium, where the goal is to suggest articles to users.

Content-based filtering

How suggestions work in content-based filtering recommender systems. Image by the author.

Content-based filtering methods rely on descriptions of the items to be recommended. They are best suited to situations where data about the items is known (name, description, etc.) but data about the users is scarce (for example, their previously read articles). These algorithms try to recommend items similar to those a user liked in the past or is examining right now.

A key advantage of content-based filtering is that it doesn’t need a long list of items the user has interacted with, which is usually collected only after the user has used the service for some time. The article the user is currently reading is enough to suggest similar ones, so content-based filtering works well from a new user's first day.

If we have text data that describes the items, we can leverage NLP techniques to compute item similarity with document embeddings, such as those obtained with Doc2vec.
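As a rough illustration of the idea, here is a minimal sketch of ranking articles by embedding similarity. To stay dependency-free it uses plain term-count vectors as a stand-in for Doc2vec embeddings; the article ids, the texts, and the function names (`embed`, `cosine`, `similar_articles`) are all made up for the example.

```python
import math
from collections import Counter

# Hypothetical article descriptions; real input would be the article texts.
ARTICLES = {
    "intro_to_nlp": "introduction to natural language processing with python",
    "nlp_embeddings": "document embeddings for natural language processing",
    "cooking_pasta": "how to cook fresh pasta at home",
}


def embed(text):
    """Stand-in embedding: a sparse term-count vector.

    Assumption: in a real system this is where Doc2vec (or any other
    document embedding model) would produce a dense vector instead.
    """
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similar_articles(article_id, articles, top_k=2):
    """Rank the other articles by similarity to the one being read."""
    vectors = {aid: embed(text) for aid, text in articles.items()}
    query = vectors[article_id]
    scored = [
        (aid, cosine(query, vec))
        for aid, vec in vectors.items()
        if aid != article_id
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]


if __name__ == "__main__":
    for aid, score in similar_articles("intro_to_nlp", ARTICLES):
        print(aid, round(score, 3))
```

Note that nothing here depends on the user's history: the only input is the article currently being read, which is why this approach can serve brand-new users.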

Collaborative filtering

How suggestions work in collaborative filtering recommender systems. Image by the author.

Collaborative filtering is based on the assumption that people who agreed in the past will agree in the future, and that they will like the same kinds of items they liked in the past.

A key advantage of the collaborative filtering approach is that it does not rely on items’ descriptions and therefore it is capable of accurately recommending complex items such as movies without requiring an “understanding” of the item itself. However, this approach needs to know a list of items the user has interacted with in the past, thus suffering from the cold start problem.

Because pure collaborative filtering does not rely on descriptive features of items or users, there is no text for NLP techniques to work on.
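To make this concrete, here is a minimal sketch of a neighborhood-style collaborative filter. It is only one of many possible formulations: users are compared by the Jaccard overlap of the articles they liked, and unseen articles are scored by the similarity of the users who liked them. The `LIKES` data and the function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical interaction log: user -> set of article ids they liked.
LIKES = {
    "alice": {"a1", "a2", "a3"},
    "bob": {"a1", "a2", "a4"},
    "carol": {"a5"},
}


def jaccard(a, b):
    """Overlap between two sets of liked items, in [0, 1]."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(user, likes, top_k=3):
    """Score unseen items by the similarity of the users who liked them."""
    seen = likes.get(user, set())
    scores = defaultdict(float)
    for other, other_items in likes.items():
        if other == user:
            continue
        weight = jaccard(seen, other_items)
        if weight == 0.0:
            continue  # users with no overlap contribute nothing
        for item in other_items - seen:
            scores[item] += weight
    # Highest score first; ties broken by item id for determinism.
    return sorted(scores, key=lambda item: (-scores[item], item))[:top_k]


if __name__ == "__main__":
    print(recommend("alice", LIKES))  # only bob overlaps with alice
    print(recommend("dave", LIKES))   # unknown user: nothing to go on
```

The code never looks at article text, only at who liked what. It also shows the cold start problem in miniature: a user with no recorded likes overlaps with nobody, so the function returns an empty list.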

In my experience, collaborative filtering outperforms the content-based approach when enough interaction data is available.

User-based filtering

How suggestions work in user-based filtering recommender systems. Image by the author.

It is possible to create recommender systems that are based on similarities between users as well, though they commonly perform worse than content-based and collaborative filtering.

Similarly to content-based filtering, if we have text data that describes the users, we can use it to compute user similarity with embeddings.
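A minimal sketch of this idea follows, assuming each user's description (for example a profile bio) has already been turned into a dense embedding by some model. The three-dimensional vectors, the reading lists, and the function names below are invented for illustration.

```python
import math

# Hypothetical precomputed embeddings of user descriptions (e.g. profile bios).
USER_VECTORS = {
    "alice": [0.9, 0.1, 0.0],
    "bob": [0.8, 0.2, 0.1],
    "carol": [0.0, 0.1, 0.9],
}

# Hypothetical reading history: user -> set of article ids.
USER_READS = {
    "alice": {"nlp_intro"},
    "bob": {"nlp_intro", "doc2vec_tips"},
    "carol": {"fresh_pasta"},
}


def cosine(u, v):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def most_similar_user(user, vectors):
    """Return the other user whose description embedding is closest."""
    others = [name for name in vectors if name != user]
    return max(others, key=lambda name: cosine(vectors[user], vectors[name]))


def recommend_from_similar_user(user, vectors, reads):
    """Suggest what the most similar user read and this user has not."""
    neighbor = most_similar_user(user, vectors)
    return sorted(reads.get(neighbor, set()) - reads.get(user, set()))


if __name__ == "__main__":
    print(most_similar_user("alice", USER_VECTORS))
    print(recommend_from_similar_user("alice", USER_VECTORS, USER_READS))
```

Using only the single nearest user keeps the sketch short; averaging over several neighbors would be a natural extension.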

Hybrid filtering

How suggestions work in hybrid filtering recommender systems. Image by the author.

Given the pros and cons of the different types of recommender systems, it is very common to use a hybrid approach.
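One possible way to combine the approaches, sketched here under assumptions of my own rather than as a standard recipe, is to blend content-based and collaborative scores with a weight that grows with the amount of interaction data available for the user. The function name `hybrid_scores` and the `ramp` parameter are hypothetical.

```python
def hybrid_scores(content_scores, collab_scores, n_interactions, ramp=20):
    """Blend two item -> score dicts into one.

    Assumption: the collaborative scores are trusted more as the user's
    history grows, reaching full weight after `ramp` interactions.
    """
    alpha = min(n_interactions / ramp, 1.0)  # 0.0 for a brand-new user
    items = set(content_scores) | set(collab_scores)
    return {
        item: (1 - alpha) * content_scores.get(item, 0.0)
        + alpha * collab_scores.get(item, 0.0)
        for item in items
    }


if __name__ == "__main__":
    content = {"a1": 1.0}
    collab = {"a2": 1.0}
    for n in (0, 10, 20):
        print(n, hybrid_scores(content, collab, n))
```

With this weighting, a new user gets purely content-based suggestions, which sidesteps the cold start problem, while a long-standing user gets purely collaborative ones.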

Fabio Chiusano
NLPlanet

Freelance data scientist — Top Medium writer in Artificial Intelligence