What Should I Read Next? Books Recommendation

Implementing Books Recommender Using Weighted Average Technique

Hargurjeet
Nerd For Tech
4 min read · Jul 2, 2021

A recommender system, or a recommendation system (sometimes replacing ‘system’ with a synonym such as platform or engine), is a subclass of information filtering system that seeks to predict the “rating” or “preference” a user would give to an item.

Recommender systems are used in a variety of areas, with commonly recognized examples taking the form of playlist generators for video and music services, product recommenders for online stores, or content recommenders for social media platforms and open web content recommenders. These systems can operate using a single input, like music, or multiple inputs within and across platforms like news, books, and search queries.

There are many approaches to building a recommendation system. I have built this recommender system using the weighted average technique.

Weighted Average

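The simplest way to explain it is that although the dataset contains all the votes cast by users, not all votes have the same impact (or ‘weight’) on the final rating. A book with a perfect average from a handful of votes should not outrank a book with a slightly lower average from thousands of votes. Hence we calculate a weighted rating for each book using the formula described in section №3.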

Table Of Contents

  1. About the Dataset
  2. Exploratory Data Analysis
  3. Implementing the Weighted Averages
  4. Recommendation based on Weighted Averages
  5. Recommendation based on Weighted Averages and Text Reviews
  6. Summary
  7. Future Work
  8. References

№1: About the Dataset

This file primarily contains detailed information about the books. Each column is described below.

  • bookID: A unique Identification number for each book.
  • title: The name under which the book was published.
  • authors: Names of the authors of the book. Multiple authors are delimited with
  • average_rating: The average rating of the book received in total.
  • isbn: Another unique number to identify the book, the International Standard Book Number.
  • isbn13: A 13-digit ISBN to identify the book, instead of the older 10-digit ISBN.
  • language_code: The primary language of the book. For instance, eng stands for English.
  • num_pages: Number of pages the book contains.
  • ratings_count: Total number of ratings the book received.
  • text_reviews_count: Total number of written text reviews the book received.
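
To make the column list concrete, here is a minimal sketch of loading such a file with pandas. The three rows are made-up placeholders rather than real records, and the file name `books.csv` mentioned in the comment is an assumption. Reading the ISBN columns as strings is a design choice so that leading zeros are not lost.

```python
import io

import pandas as pd

# Made-up sample rows with the columns described above. With the real file
# you would call pd.read_csv("books.csv") instead (file name assumed).
SAMPLE_CSV = """bookID,title,authors,average_rating,isbn,isbn13,language_code,num_pages,ratings_count,text_reviews_count
1,Book A,Author One,4.57,0000000001,9780000000001,eng,652,2000000,27000
2,Book B,Author Two,3.80,0000000002,9780000000002,eng,320,150,12
3,Book C,Author Three,5.00,0000000003,9780000000003,spa,96,2,0
"""

# Keep ISBNs as text so that leading zeros survive the load.
books = pd.read_csv(
    io.StringIO(SAMPLE_CSV),
    dtype={"isbn": str, "isbn13": str},
)

print(books.shape)   # (rows, columns)
print(books.dtypes)  # column types inferred by pandas
```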

№2: Exploratory Data Analysis

Now we do the exploratory data analysis to get insights.

The dataset seems to have no null values. This is good!
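
A null check of this kind can be sketched as follows. The small DataFrame is a made-up stand-in for the real dataset, so the zero counts here only illustrate the call, not the actual result.

```python
import pandas as pd

# Placeholder data standing in for the books DataFrame.
books = pd.DataFrame({
    "title": ["Book A", "Book B", "Book C"],
    "average_rating": [4.57, 3.80, 5.00],
    "ratings_count": [2000000, 150, 2],
})

# Count missing values per column; all zeros means no nulls.
missing_per_column = books.isnull().sum()
has_missing = bool(missing_per_column.any())

print(missing_per_column)
print("Any missing values?", has_missing)
```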

Let us try to understand the rating trend.

Most ratings fall between 3 and 5, and very few books are rated below 3.
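
One way to check a spread like this without a plot is to bucket the ratings and count each bucket. This is only a sketch: the ratings below are invented, and the one-point bin edges are an assumption.

```python
import pandas as pd

# Invented ratings standing in for the average_rating column.
ratings = pd.Series(
    [4.57, 3.80, 4.10, 2.50, 3.95, 4.30, 3.20],
    name="average_rating",
)

# Bucket the ratings into one-point-wide bins and count each bucket.
bins = [0, 1, 2, 3, 4, 5]
rating_counts = pd.cut(ratings, bins=bins, include_lowest=True).value_counts(sort=False)

# Fraction of books rated below 3.
share_below_3 = (ratings < 3).mean()

print(rating_counts)
print(f"Share rated below 3: {share_below_3:.2%}")
```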

Extracting relevant features from the original DataFrame
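
The post does not list which columns were kept, so the selection below is an assumption: the title and authors for display, plus the three numeric fields used later for scoring. The data is a made-up placeholder.

```python
import pandas as pd

# Placeholder rows standing in for the original DataFrame.
books = pd.DataFrame({
    "bookID": [1, 2],
    "title": ["Book A", "Book B"],
    "authors": ["Author One", "Author Two"],
    "average_rating": [4.57, 3.80],
    "isbn": ["0000000001", "0000000002"],
    "num_pages": [652, 320],
    "ratings_count": [2000000, 150],
    "text_reviews_count": [27000, 12],
})

# Assumed set of relevant columns; adjust to taste.
relevant_columns = [
    "title",
    "authors",
    "average_rating",
    "ratings_count",
    "text_reviews_count",
]

# .copy() avoids modifying the original frame when new columns are added later.
features = books[relevant_columns].copy()
print(features.head())
```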

№3: Implementing the Weighted Averages

Using the weighted average for each book’s average rating:
W = (Rv + Cm) / (v + m)

where
W = weighted rating
R = average rating of the book (0 to 5)
v = number of votes for the book
m = minimum number of votes required to be listed
C = the mean rating across the whole dataset
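
As a small worked example, the formula can be written as a plain Python function, with the numerator (Rv + Cm) summed first and then divided by (v + m). The values of m and C below are invented for illustration, not taken from the dataset.

```python
def weighted_rating(R, v, m, C):
    """Weighted rating: (R*v + C*m) / (v + m).

    R: average rating of the book
    v: number of votes for the book
    m: minimum number of votes required to be listed
    C: mean rating across the whole dataset
    """
    return (R * v + C * m) / (v + m)


# Illustrative values only (assumed, not from the dataset).
m = 100
C = 3.9

# A perfect 5.0 from just 2 votes is pulled strongly toward the mean C.
few_votes = weighted_rating(R=5.0, v=2, m=m, C=C)

# A 4.5 from 10,000 votes stays close to its own average.
many_votes = weighted_rating(R=4.5, v=10_000, m=m, C=C)

print(round(few_votes, 3), round(many_votes, 3))
```

The fewer votes a book has relative to m, the closer its weighted rating sits to C; with many votes, the book's own average dominates.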

№4: Recommendation based on Weighted Averages

This is the list of the most favored books based on the weighted scores. The Complete Calvin and Hobbes tops this chart.
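
A ranking like this can be sketched with pandas as below. The rows are placeholders, and using the median vote count as m is an assumption, since the post does not say how the threshold was chosen.

```python
import pandas as pd

# Placeholder rows, not real records from the dataset.
books = pd.DataFrame({
    "title": ["Book A", "Book B", "Book C", "Book D"],
    "average_rating": [4.5, 5.0, 3.8, 4.2],
    "ratings_count": [10000, 2, 500, 3000],
})

C = books["average_rating"].mean()        # mean rating across all books
m = books["ratings_count"].quantile(0.5)  # assumed threshold: median vote count

R = books["average_rating"]
v = books["ratings_count"]
books["weighted_score"] = (R * v + C * m) / (v + m)

# Highest weighted score first.
top_books = books.sort_values("weighted_score", ascending=False)
print(top_books[["title", "weighted_score"]])
```

Note how the book with a perfect 5.0 from only two votes no longer ranks first once the vote count is taken into account.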

№5: Recommendation based on Weighted Averages and Text Reviews

Here the recommendation is based on the scaled weighted average and the text review count, with each given 50% priority.

The two fields are on very different scales, so both are normalized to a common range before they are combined.
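
A minimal sketch of this step, assuming min-max scaling to the range 0 to 1 (the post does not name the scaler that was used). The helper `min_max_scale` and the sample rows are hypothetical.

```python
import pandas as pd

# Placeholder rows with an already computed weighted score.
books = pd.DataFrame({
    "title": ["Book A", "Book B", "Book C"],
    "weighted_score": [4.5, 4.0, 3.5],
    "text_reviews_count": [100, 1000, 0],
})


def min_max_scale(column):
    """Scale a numeric column to the range [0, 1] (hypothetical helper)."""
    span = column.max() - column.min()
    if span == 0:
        # All values identical: return zeros instead of dividing by zero.
        return column * 0.0
    return (column - column.min()) / span


books["score_scaled"] = min_max_scale(books["weighted_score"])
books["reviews_scaled"] = min_max_scale(books["text_reviews_count"])

# Equal priority: 50% weighted score, 50% text review count.
books["combined_score"] = 0.5 * books["score_scaled"] + 0.5 * books["reviews_scaled"]

ranked = books.sort_values("combined_score", ascending=False)
print(ranked[["title", "combined_score"]])
```

Without scaling, the review counts (in the hundreds or thousands) would swamp the weighted scores (at most 5), so the 50/50 split would not mean much.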

Below is the list of books ranked by weighted average and text reviews, equally weighted.

Below I have formatted the columns so that the complete names of the books are displayed 😃
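
One way to get full titles in the output, sketched here as an assumption about how the formatting was done, is to lift the column width limit in pandas' display options. The long title is a placeholder.

```python
import pandas as pd

# None removes the column width limit, so long strings are not cut off with "...".
pd.set_option("display.max_colwidth", None)

long_title = (
    "A Deliberately Long Placeholder Title That Would Normally "
    "Be Truncated In The Output Table"
)
books = pd.DataFrame({"title": [long_title], "combined_score": [0.75]})

rendered = repr(books)
print(rendered)
```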

№6: Summary

  • We downloaded the Books dataset from Kaggle.
  • We ran EDA and analyzed the input features.
  • We calculated the weighted average as per the formula.
  • We recommended books based on the weighted score.
  • We recommended books based on the weighted score and review count.

№7: Future Work

  • Try evaluating the results by implementing collaborative filtering.
  • Pearson correlation can also be implemented and the results analyzed.
  • Take a different dataset (for example, IMDb movies) and implement the weighted average technique.

№8: References

I really hope you guys learned something from this post. Feel free to 👏 if you like what you learnt. Let me know if there is anything you need my help with.

Hargurjeet
Nerd For Tech

Data Science Practitioner | Machine Learning | Neural Networks | PyTorch | TensorFlow