Published in Analytics Vidhya

What is Item-Based Filtering? An Applied Example In Python

Creating movie recommender

Hi. In this story, we will try to understand what item-based filtering is and we will see an applied example in Python.

You can access the Kaggle notebook that I created for this story from here.

Photo by Markus Winkler on Unsplash

Item-Based Filtering

Item-based filtering is a type of collaborative filtering technique, and it is sometimes described as “memory-based”. The core idea is to recommend the items whose “liked” structure is similar to that of a given item X.

We will create a matrix like the one below.

In this matrix, rows represent users, columns represent items, and each intersection cell holds the liked count. For example, User1 didn’t like Item4, and Item1 is liked 10 times. In this technique, however, we will focus on items more than users.
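To make the matrix idea concrete, here is a minimal sketch with made-up numbers. The user names, item names, and liked counts are all assumptions for illustration, not values from the dataset used later. It shows why the focus is on columns: two items are "similar" when their columns rise and fall together across users.

```python
import pandas as pd

# Hypothetical toy data: rows are users, columns are items,
# and each cell is how many times that user liked that item.
likes = pd.DataFrame(
    {
        "Item1": [10, 8, 1, 0],
        "Item2": [9, 7, 0, 1],
        "Item3": [0, 1, 9, 8],
    },
    index=["User1", "User2", "User3", "User4"],
)

# Item-based filtering compares columns (items), not rows (users).
# DataFrame.corr() returns the Pearson correlation between every pair of columns.
item_similarity = likes.corr()
print(item_similarity.round(2))
```

In this toy matrix, Item1 and Item2 are liked by the same users, so their correlation is strongly positive, while Item3 is liked by the opposite group and correlates negatively with both.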

Applied Example

We will use this dataset. I’ve already imported it into my workspace.

I’m going to import pandas and read the data, then merge the 2 datasets. As mentioned above, we need the item (movie) names and the ratings given to those items, because this filtering technique creates recommendations from the items’ liked structure. That is why I merge the 2 datasets on the movie id.

import pandas as pd

movie = pd.read_csv('movie_lens_dataset/movie.csv')
rating = pd.read_csv('movie_lens_dataset/rating.csv')
df = movie.merge(rating, how="left", on="movieId")
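If the effect of how="left" is not obvious, the following small sketch may help. The two tiny frames are made up for illustration and only borrow the column names used in this story. A left merge keeps every movie, so a movie with no ratings still appears once, with empty rating fields.

```python
import pandas as pd

# Hypothetical stand-ins for movie.csv and rating.csv
movie = pd.DataFrame(
    {"movieId": [1, 2, 3], "title": ["Movie A", "Movie B", "Movie C"]}
)
rating = pd.DataFrame(
    {"userId": [10, 11, 10], "movieId": [1, 1, 2], "rating": [4.0, 5.0, 3.0]}
)

# Left merge: one output row per (movie, rating) pair,
# plus one row with NaN ratings for each movie that nobody rated.
df = movie.merge(rating, how="left", on="movieId")
print(df)
```

Here "Movie C" has no ratings, so it survives the merge as a single row whose userId and rating are NaN.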

Creating User-Item Matrix

This is the main point of item-based filtering. We need to create the matrix that the filtering technique will use.

Firstly, I’m going to choose the movies that have more than 1000 ratings.

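# count how many rows (ratings) each title has
comment_counts = df["title"].value_counts()
# titles with 1000 ratings or fewer are treated as rare
rare_movies = comment_counts[comment_counts <= 1000].index
# keep only the rows that belong to the remaining movies
common_movies = df[~df["title"].isin(rare_movies)]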

I don’t want the filtering technique to be affected by movies that have only a few ratings.
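The same filtering step can be tried on a toy frame to see exactly where the cut-off falls. This is only a sketch: the data is made up, and min_ratings is a hypothetical stand-in for the 1000 used above.

```python
import pandas as pd

# Hypothetical toy data: Movie A has 3 ratings, Movie B has 2, Movie C has 1
df = pd.DataFrame(
    {
        "title": ["Movie A"] * 3 + ["Movie B"] * 2 + ["Movie C"],
        "rating": [4.0, 5.0, 3.0, 2.0, 4.0, 5.0],
    }
)

min_ratings = 2  # stand-in for the 1000 used in the story

rating_counts = df["title"].value_counts()
# "<=" means a movie with exactly min_ratings ratings is also dropped
rare_movies = rating_counts[rating_counts <= min_ratings].index
common_movies = df[~df["title"].isin(rare_movies)]

print(common_movies["title"].unique())
```

Only "Movie A" remains, because the comparison is "less than or equal to": a movie needs strictly more ratings than the threshold to stay.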

Now I can create the user-item matrix. It’s just a pivot table: it holds the user ids in the rows, the movie titles in the columns, and each user’s rating for each movie in the intersection cells. A cell stays empty (NaN) when the user hasn’t rated that movie.

user_movie_df = common_movies.pivot_table(index=["userId"], columns=["title"], values="rating")
user_movie_df.shape
# output: (138493, 3159)
user_movie_df.head(10)
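A pivot table of this kind is easier to read on a few rows. The sketch below uses made-up ratings. It also shows two details worth knowing: missing user and movie pairs become NaN, and because pivot_table aggregates with the mean by default, duplicate ratings for the same pair are averaged.

```python
import pandas as pd

# Hypothetical long-format ratings; user 2 rated "Movie A" twice
ratings = pd.DataFrame(
    {
        "userId": [1, 1, 2, 2, 3],
        "title": ["Movie A", "Movie B", "Movie A", "Movie A", "Movie B"],
        "rating": [4.0, 5.0, 2.0, 4.0, 3.0],
    }
)

# One row per user, one column per title, ratings in the cells
user_movie = ratings.pivot_table(index="userId", columns="title", values="rating")
print(user_movie)
```

User 2 ends up with 3.0 for "Movie A" (the mean of 2.0 and 4.0) and NaN for "Movie B".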

With that, we have completed the most challenging part. The rest takes just a few lines of code.

Recommending Movies

I’m going to choose a movie and select its rating values from our pivot table.

movie_name = "Matrix, The (1999)"
# getting the ratings of the chosen movie
movie_name = user_movie_df[movie_name]

As we said above, this filtering technique compares the selected item’s (movie’s) liked (rating) structure with that of the other items (movies). Therefore, I need to calculate the correlations between movie_name and the other movies. The movie_name variable now holds the selected movie’s liked structure, so I can calculate the correlations by using this variable.

movie_name = "Matrix, The (1999)"
movie_name = user_movie_df[movie_name]
user_movie_df.corrwith(movie_name).sort_values(ascending=False).head(10)
The recommended movies

That’s it, we completed item-based filtering. The recommended movies for “Matrix, The (1999)” are shown above.
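To see what corrwith does with a sparse matrix, here is a small sketch on made-up ratings. Each correlation is computed only over the users who rated both movies. Since a movie always correlates perfectly with itself, the chosen movie tops the sorted list. Dropping it, as done below, is my own addition and not part of the steps above.

```python
import numpy as np
import pandas as pd

# Hypothetical user-item matrix with missing ratings
user_movie = pd.DataFrame(
    {
        "Movie A": [5.0, 4.0, 1.0, 2.0, np.nan],
        "Movie B": [4.0, 5.0, 2.0, 1.0, 3.0],
        "Movie C": [1.0, 2.0, 5.0, np.nan, 4.0],
    },
    index=pd.Index([1, 2, 3, 4, 5], name="userId"),
)

target = user_movie["Movie A"]

# Correlate every column with the target column; NaN pairs are skipped
correlations = user_movie.corrwith(target)

# Remove the target itself (its self-correlation is 1.0) and rank the rest
similar = correlations.drop("Movie A").sort_values(ascending=False)
print(similar)
```

"Movie B" is rated like "Movie A" by the users they share, so it ranks first, while "Movie C" is rated in the opposite way and gets a negative score. Note that the score for "Movie C" rests on only three shared users, and correlations built on very few common ratings can be misleading.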

If you go to the Kaggle notebook, you can see some helpful functions for recommending movies.
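As a rough idea of what such a helper could look like, here is a sketch that wraps the steps from this story in one function. The function name, its parameters, and the toy data are hypothetical and are not taken from the notebook.

```python
import pandas as pd


def recommend_similar_movies(user_movie_df, movie_name, top_n=10):
    """Hypothetical helper: return the top_n movies most correlated with movie_name."""
    if movie_name not in user_movie_df.columns:
        raise KeyError(f"{movie_name!r} is not a column of the user-item matrix")
    # ratings of the chosen movie, one value per user
    target_ratings = user_movie_df[movie_name]
    # correlation of every movie's ratings with the chosen movie's ratings
    correlations = user_movie_df.corrwith(target_ratings)
    # drop the chosen movie itself, then keep the highest correlations
    return correlations.drop(movie_name).sort_values(ascending=False).head(top_n)


# Toy user-item matrix, made up for illustration
toy = pd.DataFrame(
    {
        "Movie A": [5.0, 4.0, 1.0, 2.0],
        "Movie B": [4.0, 5.0, 2.0, 1.0],
        "Movie C": [1.0, 2.0, 5.0, 4.0],
    }
)

result = recommend_similar_movies(toy, "Movie A", top_n=1)
print(result)
```

With the real matrix, the call would be something like recommend_similar_movies(user_movie_df, "Matrix, The (1999)").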

Hopefully, you enjoyed this. To read about other recommendation techniques, you can visit my profile.

Kind regards.




