Python Implementation of Baseline Item-Based Collaborative Filtering

Bin Wang
7 min read · Apr 19, 2018

Related article: Comparison of User-Based and Item-Based Collaborative Filtering

Introduction

Over the past decade, the number of websites on the internet has grown explosively, and that growth is likely to continue for a long time. This brings people extremely rich and varied content, but it also confronts users with so many options that making a decision becomes acutely difficult. Recommender Systems (RS), a personalised information filtering technology, were introduced to reduce those options to a handful that interest a specific user.

Generally speaking, there are three approaches to recommender systems: Content-Based Filtering (CBF), Collaborative Filtering (CF), and Hybrid, which combines the first two. Three baseline recommender systems are worth knowing: one baseline CBF method, the Content-Based Recommender System (CBRS), and two baseline CF methods, User-Based Collaborative Filtering (UBCF) and Item-Based Collaborative Filtering (IBCF).

This tutorial focuses on a Python implementation of IBCF on the MovieLens Small (MS) dataset. The MS dataset contains 100,000 ratings and 1,300 tag applications applied to 9,000 movies by 700 users. Given this dataset, the objective of the RS is to select, for a specific user, the top-10 rated movies out of the 9,000 and suggest them to that user; The…
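The top-10 objective described above can be illustrated with a minimal sketch on a toy rating matrix. This is not the article's implementation: the function names `item_cosine_similarity` and `recommend_top_n`, the toy data, the use of cosine similarity, and the convention that 0 means "not rated" are all assumptions made for illustration.

```python
import numpy as np


def item_cosine_similarity(ratings):
    """Cosine similarity between item columns (assumes 0 means 'not rated')."""
    norms = np.linalg.norm(ratings, axis=0)
    norms[norms == 0] = 1.0  # avoid division by zero for items nobody rated
    normalized = ratings / norms
    sim = normalized.T @ normalized
    np.fill_diagonal(sim, 0.0)  # an item should not vote for itself
    return sim


def recommend_top_n(ratings, user, n=10):
    """Return up to n unrated item indices, ordered by predicted score."""
    sim = item_cosine_similarity(ratings)
    user_ratings = ratings[user]
    rated = user_ratings > 0

    # Predicted score: similarity-weighted average of the user's own ratings.
    weights = sim[:, rated]
    denom = np.abs(weights).sum(axis=1)
    denom[denom == 0] = 1.0
    scores = weights @ user_ratings[rated] / denom

    # Rank all items by descending score, then drop the ones already rated.
    ranked = np.argsort(-scores)
    return [int(i) for i in ranked if not rated[i]][:n]


# Hypothetical toy data: 4 users (rows) x 5 movies (columns).
toy_ratings = np.array([
    [5, 4, 0, 1, 0],
    [4, 5, 1, 0, 0],
    [1, 0, 5, 4, 5],
    [0, 1, 4, 5, 4],
], dtype=float)

top = recommend_top_n(toy_ratings, user=0, n=2)
print(top)
```

On the real dataset the same idea would apply to a roughly 700 x 9,000 user-by-movie matrix with `n=10`; a dense array is used here only to keep the sketch short.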

Bin Wang

Years of experience in AI/Machine Learning research and in leading engineering teams across software development, DevOps, data science and MLOps.