An overview of recommendation systems and how to implement one from scratch

This blog gives an overview of recommendation systems, covering their history, present and future, their main categories, and finally the code to build a joke recommendation system.

prabhudayal acharya
Analytics Vidhya
6 min read · Jan 14, 2020


Photo by Edo Nugroho on Unsplash

Past, present and future:

If we look at the history of recommendation systems, the idea was ignited between 1992 and 1996. Even before recommendation systems were imagined, before people talked about collaborative filtering, the practice was known as personalization, and it was all manual. For example: a travel agent who knows that you're interested in a safari will keep his or her eyes out for the type of trip you would like, not just something that anybody would want to do. As a second example: personal shoppers kept simple databases they could run people's profiles through when new products came in, to get an idea of who might be a good candidate for a new product.
In 2006 Netflix announced a competition with a 1 million dollar prize, and it changed the course of recommendation systems. It attracted people from many backgrounds to participate, and new algorithms emerged along with new mathematical formulations. By the way, the surprise library I will be using to build the joke recommendation system was developed with a close eye on the research papers published by the winning team of the Netflix Prize.
As we look forward, there are still many things we don't know, such as the problem of temporal recommendation: how do I key my recommendations not only to things like season, which people have worked on for a while, but to sequence, i.e. what will you consume next given what you have already consumed? Recommendation for education is one of the areas where the temporal aspect matters a lot.

Broad division of recommendation systems:

There are two main categories of recommendation systems:
1. Collaborative filtering
2. Content-based filtering

Collaborative filtering: This approach is based on the assumption that people who agreed in the past will agree in the future, and that they will like the same kinds of items they liked in the past.
The three main categories of collaborative filtering are:
1. User-user similarity
2. Item-item similarity
3. Model-based

User-user vs. item-item similarity (illustration)

Content-based filtering: Content-based filtering methods are based on a description of the item and a profile of the user's preferences. These methods are best suited to situations where there is known data on an item (name, location, description, etc.), but not on the user.

Collaborative vs. content-based recommendation (illustration)

Prerequisites

1. Basic Python
2. Basic pandas
3. Eagerness to explore the surprise library
4. Keras (optional)

If you want to jump to the code directly, please go to this GitHub link and find the Jupyter notebook.

I will explain each major step I followed while solving the problem, but I strongly believe that if you are interested in a full explanation of the problem and in how the surprise library is used, you should take a look at the git repo after going through the blog.

Let's begin.
First, some basic info about the data we are going to use.

Data description

Now I will lay out a plan for approaching the problem and move step by step towards the solution. We will go through each step with code snippets.
1. Collecting Data
2. Train test split
3. Basic statistics
4. Structure data into the format expected by the surprise library
5. Defining error metric
6. Using baseline model
7. Try different models
8. Results

Collecting Data

1. There are 3 Excel sheets provided with the data. We will merge them into a single pandas DataFrame. We have 73,421 users in total.
Merge all data
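Below is a minimal sketch of this step; the file names and the leading rating-count column are assumptions based on the dataset description, so adjust them to the actual files.

```python
import pandas as pd

# Hypothetical file names for the three Excel sheets
files = ["jester-data-1.xls", "jester-data-2.xls", "jester-data-3.xls"]

# Each sheet holds one row per user and one column per joke,
# with (assumed) a leading column counting how many jokes the user rated
frames = [pd.read_excel(f, header=None) for f in files]
ratings = pd.concat(frames, ignore_index=True)

print(ratings.shape)  # expect 73421 rows, one per user
```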

2. As described in the dataset information, a rating of 99 means the user has not rated that joke. We will remove those records and prepare the data in [‘user_id’, ‘joke_id’, ‘rating’] format.

Data preparation
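A sketch of the preparation, continuing from the merged frame above (the layout assumptions from the previous snippet still apply):

```python
# Drop the (assumed) leading rating-count column and relabel joke columns 0..n-1
ratings = ratings.drop(columns=[0])
ratings.columns = range(ratings.shape[1])
ratings["user_id"] = ratings.index

# Melt the wide matrix into [user_id, joke_id, rating] rows
data = ratings.melt(id_vars="user_id", var_name="joke_id", value_name="rating")

# 99 marks "not rated", so drop those records
data = data[data["rating"] != 99].reset_index(drop=True)
```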

Train test split

We will use scikit-learn's train_test_split and split the data 70-30: 70% of the data will be available for training and 30% for testing.
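In code this is a one-liner (the random_state here is just for reproducibility):

```python
from sklearn.model_selection import train_test_split

# 70% train, 30% test on the [user_id, joke_id, rating] rows
train, test = train_test_split(data, test_size=0.3, random_state=42)
```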

Basic statistics

1. Average rating per user and per joke

Distribution of ratings across all users (plot)
Distribution of ratings across all jokes (plot)
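A short sketch of how these statistics can be computed; the plots above are plain histograms of the two resulting series:

```python
import matplotlib.pyplot as plt

# Average rating per user, per joke, and overall (on the training split)
user_avg = train.groupby("user_id")["rating"].mean()
joke_avg = train.groupby("joke_id")["rating"].mean()
global_avg = train["rating"].mean()

user_avg.hist(bins=50)
plt.xlabel("average rating per user")
plt.show()
```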

Structure data into the format expected by the surprise library

We will structure the data as the surprise library expects it, i.e. [‘user’, ‘joke’, ‘rating’]. If we had a movie recommendation problem, we would structure the data as [‘user’, ‘movie’, ‘rating’].

prepare data in surprise library style
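A minimal sketch using surprise's Reader and Dataset; the (-10, 10) rating scale is an assumption based on the data description:

```python
from surprise import Dataset, Reader

# Tell surprise the rating scale, then load the DataFrame columns
# in [user, item, rating] order
reader = Reader(rating_scale=(-10, 10))
train_data = Dataset.load_from_df(train[["user_id", "joke_id", "rating"]], reader)
trainset = train_data.build_full_trainset()
```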

Defining error metric

We will use Normalized Mean Absolute Error (NMAE) as the error metric: the mean absolute error divided by the range of the rating scale.

Normalized Mean Absolute Error (NMAE) formula: NMAE = MAE / (r_max - r_min)
code to compute NMAE
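A small helper implementing this, assuming the usual definition that divides MAE by the width of the rating scale (20 for ratings in [-10, 10]):

```python
import numpy as np

def nmae(y_true, y_pred, r_min=-10, r_max=10):
    """Mean absolute error normalized by the rating range."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return np.abs(y_true - y_pred).mean() / (r_max - r_min)
```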

Using baseline model

We will create a baseline using the Baseline model provided by the surprise library. The baseline model gives 0.2033 NMAE. We will then try different surprise models and combine all the results to do better.
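A sketch using surprise's BaselineOnly, which is presumably the baseline model meant here; it predicts the global mean plus learned user and item biases:

```python
from surprise import BaselineOnly

baseline = BaselineOnly()
baseline.fit(trainset)

# Score on the held-out 30%
test_preds = [baseline.predict(u, j).est
              for u, j in zip(test["user_id"], test["joke_id"])]
print(nmae(test["rating"], test_preds))  # ~0.2033 reported above
```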

Try different models

1. KNN Baseline model:
It uses a similarity-based technique to predict users' ratings for new jokes. In our case the NMAE is 0.196.
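A sketch with surprise's KNNBaseline; the item-based pearson_baseline similarity here is my choice of configuration, not necessarily the one used in the notebook:

```python
from surprise import KNNBaseline

knn = KNNBaseline(sim_options={"name": "pearson_baseline", "user_based": False})
knn.fit(trainset)

knn_test = [knn.predict(u, j).est
            for u, j in zip(test["user_id"], test["joke_id"])]
print(nmae(test["rating"], knn_test))  # ~0.196 reported above
```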

2. XGBoost on userAverageRating, jokeAverageRating, the Baseline output and the KNN Baseline output:
We combine all the outputs of the previous surprise models and run an XGBoost regression model on the data after hyperparameter tuning. Here we get a slightly better result of 0.1928 NMAE.
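A sketch of this stacking step, reusing the averages and fitted models from the earlier snippets; the XGBoost hyperparameters are illustrative, not the tuned values:

```python
import pandas as pd
import xgboost as xgb

def make_features(df, models):
    """Per-row features: user/joke averages plus each model's prediction."""
    feats = pd.DataFrame({
        "user_avg": df["user_id"].map(user_avg).fillna(global_avg).values,
        "joke_avg": df["joke_id"].map(joke_avg).fillna(global_avg).values,
    })
    for name, algo in models.items():
        feats[name] = [algo.predict(u, j).est
                       for u, j in zip(df["user_id"], df["joke_id"])]
    return feats

models = {"baseline": baseline, "knn": knn}
reg = xgb.XGBRegressor(n_estimators=200, max_depth=4, learning_rate=0.1)
reg.fit(make_features(train, models), train["rating"])
print(nmae(test["rating"], reg.predict(make_features(test, models))))
```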

3. SVD model:
The SVD model uses matrix factorization to solve the matrix completion problem and predict ratings.
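A sketch with surprise's SVD; the factor count and epoch count are illustrative defaults. Adding it to the models dict feeds the second XGBoost ensemble described next:

```python
from surprise import SVD

svd = SVD(n_factors=50, n_epochs=20)
svd.fit(trainset)

models["svd"] = svd  # make_features() will now also include the SVD output
```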

4. XGBoost on userAverageRating, jokeAverageRating, and the outputs of the Baseline, KNN Baseline and SVD models: This model gives 0.18 NMAE and is the best so far.

5. Model with feature engineering:
I derived two simple features to check their effect on the model. One feature is user_avg + joke_avg - global_avg. This model gives roughly the usual NMAE of 0.202. I tried other feature engineering techniques too; they did not work well either.
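A one-line sketch of that derived feature, reusing the averages computed earlier:

```python
# user average + joke average - global average, per (user, joke) row
for df in (train, test):
    df["avg_feature"] = (df["user_id"].map(user_avg).fillna(global_avg)
                         + df["joke_id"].map(joke_avg).fillna(global_avg)
                         - global_avg)
```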

6. Deep learning models using Keras:
As feature engineering did not work well, I decided to try some simple neural network models using Keras. I tried 3 different models: one with the basic features like user average and joke average, and a second and third with all features but different architectures. One model reached 0.149 (14.9%) NMAE.
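A sketch of one plausible dense architecture on the stacked features; the layer sizes and training settings here are illustrative, not the exact setup behind the reported numbers:

```python
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.models import Model

X_train = make_features(train, models)

inp = Input(shape=(X_train.shape[1],))
h = Dense(64, activation="relu")(inp)
h = Dense(32, activation="relu")(h)
out = Dense(1)(h)  # predicted rating (regression)

nn = Model(inp, out)
nn.compile(optimizer="adam", loss="mae")
nn.fit(X_train.values, train["rating"].values,
       epochs=10, batch_size=256, validation_split=0.1)

print(nmae(test["rating"],
           nn.predict(make_features(test, models).values).ravel()))
```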

Results

As we can see, the Second_NN model works best, as it has the lowest test error.

Final Thoughts

As we all know, no model is perfect; there is and always will be scope for improvement in this model. Trying different feature engineering techniques, getting some domain expert advice, and exploring different neural network architectures could lead to better models.

As a second note, recommendation systems are growing rapidly nowadays. The surprise library makes developers' lives much easier by providing implementations of all the famous models, and it is really useful for building a basic recommendation system. I think I have fulfilled my purpose in writing this blog: introducing developers to the surprise library with a small coding intro. The whole code is available here in the git repo.
