Creating your first ever recommendation system like Netflix

Tech Talk With Alex
4 min readNov 22, 2022

--

Follow this step by step guide to create your own recommendation engine. Because it´s much simpler than what you think!

Netflix is of course way more sophisticated, however you can also build a tool to recommend movies, items of a shop, songs and whatever comes to your mind! I recommend coding the below lines in Google Colaboratory, you just need a Google Account to access it.

Importing Libraries & Data

First of all, let´s import our libraries and your datasheet containing customer information, in our case their purchasing history.

Retrieving Information

Let´s preview the data with df.head() and get some information, like the data type, the different columns, amount of entries etc. with df.info().

Python Code — Retrieving Information

Managing Missing Data

Let´s identify the amount of missing, and we can use three different codes, to see the missing data per column, in total, or in percentage %.

Python Code — Missing Data

We can now proceed with dropping all the rows that contain missing data. In total we have 531266 rows before dropping, and 397914 after. Always double check again whether you still have null values, with df.isnull().sum().

Python Code — Missing Data

Great! Now let´s continue with the actual recommendation engine.

Developing the Customer-Item matrix

We are creating a pivot table that lists all of our shoppers by the Customer ID in the vertical column and the items by the Stock Code on the top row.
Let´s see what we coded in the different code lines.
[15] If the user has purchased the item we will see a 1.0 (or numbr bigger than 1 based on the quantity purchased), otherwise a NaN.
[16] To remove the NaNs and replace the non-purchases with 0s, we use the lambda function.

Python Code — Customer-Item Matrix

Identifying Similarity among Users

We now can calculate how similar users are to each other.
In example if Bob purchased trees and garden decoration, and John did so too, you can later on recommend John other purchases Bob did, since they are very similar to each other → In example both interested in Gardening.

To do so we use Cosine Similarity — You can learn more about the math behind it in my other article.
The closer the number is to 1, the more similar the purchasing behaviour of the two users is.

Python Code — Cosine Similarity Matrix

We can now assign the values 0,1,2,3 etc. the correct Customer ID present in the customer_item_matrix.

Python Code — Cosine Similarity Matrix

Let´s find one similar user

We now locate and list similar users to the User 14606, by the cosine similarity value.
As mentioned above, the closer the number is to 1, the more similar the purchasing behaviour of the two users is. User 12748 is the most similar with a cosine similarity of 0,44.
Next step is, we find the items that User 14606 purchased as well as those that User 12748 bought.

Let´s find the items to recommend

We are then going to substract Items purchased by User12748 from those purchased by User 14606.

You don´t want to manually check the stock code to find out the item name, so you are going to do it with a simple line of code ;) Et voilà!

CONGRATULATIONS! YOU HAVE MADE IT!

Disclaimer: The author of this article assumes no responsibility or liability for any errors or omissions in the content of this site. The information contained in this site is with no guarantees of completeness, accuracy, usefulness or timeliness.

--

--

Tech Talk With Alex

At the end of the day, nothing is too hard to understand when it is explained simply. Making complex concepts easy for everyone to get smart!