What To Read?

Using data visualization to personalize book recommendations.

Published in VisUMD, Dec 14, 2022

A person holding up a stack of books in their hand. — Photo by Thought Catalog on Unsplash.

As someone who likes to read, I often look to the internet for recommendations when picking my next book. In the past I have used social media platforms, editorial reviews, and websites dedicated to books. These recommendations usually come as suggestions from a critic or from users who have read books similar to the ones you have read. No matter which of these methods you try, you don’t get much control over the books being recommended to you, and you don’t really know why some books showed up on your list.

Many research studies on music, movie, and book recommendations have found that people trust recommended content more if they can understand the reasoning behind it. Some studies have used data visualization techniques to disclose the data behind recommendations and found them to be effective.

The purpose of this project is to demonstrate how the book recommendation process can be made more transparent while providing more control to the readers using data visualization.

A screen with computer code written on it. — Photo by Florian Olivo on Unsplash

The technology of recommendations

To get started, I did a bit of digging into the kind of data that is available on the internet which can be used to recommend books. Fortunately, this data is available in many varieties, and I picked one which contained information like book titles, authors, year of publication, and user ratings among other things.

The next order of business was to understand how recommendation algorithms work. After looking into this for a while, I chose to build my recommender with a collaborative filtering machine learning model. This approach essentially recommends items that similar users liked. I already had ratings for books from other users, so this model was a good fit.
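To make the idea concrete, here is a minimal sketch of one common flavor of collaborative filtering: item-based similarity computed from user ratings. The tiny ratings table, the book names, and the choice of cosine similarity are all assumptions for illustration, not the exact model or data used in this project.

```python
import numpy as np
import pandas as pd

# Hypothetical ratings: one row per (user, book, rating).
ratings = pd.DataFrame({
    "user":   ["u1", "u1", "u2", "u2", "u3", "u3", "u3"],
    "title":  ["A", "B", "A", "B", "A", "C", "B"],
    "rating": [5, 4, 4, 5, 1, 5, 2],
})

# Pivot into a user x book matrix; books a user has not rated become 0.
matrix = ratings.pivot_table(index="user", columns="title", values="rating").fillna(0)


def cosine_similarity(user_book):
    """Cosine similarity between every pair of book columns.

    Assumes every book has at least one nonzero rating.
    """
    values = user_book.to_numpy(dtype=float)
    norms = np.linalg.norm(values, axis=0)
    sims = (values.T @ values) / np.outer(norms, norms)
    return pd.DataFrame(sims, index=user_book.columns, columns=user_book.columns)


similarity = cosine_similarity(matrix)


def recommend(title, n=2):
    """Return the n books whose rating patterns are closest to `title`."""
    return similarity[title].drop(title).nlargest(n)


print(recommend("A", n=1))
```

Books that the same users rated in similar ways end up with a high score, which is the "similar users liked it" intuition expressed as a number.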

Then I had to figure out how best to create visualizations based on this data and the recommendation system that I had built. Python has many amazing libraries for building complex systems like recommenders, and also for creating visualizations in different ways. This seemed like a good match for all the tasks that I had in mind, so all the programming and visualization for this project have been implemented in Python using the pandas, networkx, and jaal libraries.

Visualizing book recommendations

Since a recommendation system tells us how closely related the different items under consideration are, I am using a network visualization to show this relationship between books.

Step 1: The default view

By default, you see a network of all books showing how they are connected. Each circle, or node, represents one book, and the lines, or edges, represent the connection between books.

A network showing books and the connections between them
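As a rough sketch of how such a network can be assembled with networkx, the snippet below turns a list of book pairs into a weighted graph. The pairs, the placeholder titles ("Book A" and so on), and the scores are made up for illustration; in practice they would come from the recommendation model.

```python
import networkx as nx

# Hypothetical (book, book, similarity) triples, as a recommender might output.
similar_pairs = [
    ("Exclusive", "Book A", 0.9),
    ("Exclusive", "Book B", 0.6),
    ("Book A", "Book B", 0.4),
    ("Book B", "Book C", 0.7),
]

# Each book becomes a node; each similarity score becomes a weighted edge.
G = nx.Graph()
G.add_weighted_edges_from(similar_pairs)

print(G.number_of_nodes(), "books,", G.number_of_edges(), "connections")
```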

While the above visualization fulfills the purpose of explaining how closely all of these books are related, and we can tell which book might be a good recommendation based on another selected book, it is a little overwhelming to look at.

Step 2: Selecting a book

It would be better if we could see recommendations based on just one book instead of all the ways in which all of the books are connected to each other. Using a book titled “Exclusive” as an example, this is what it would look like:

A network showing all the recommendations based on one book titled “Exclusive”
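One way to get this view, sketched here with networkx, is to keep only the selected book and the edges that touch it. The helper name `recommendations_for` and the sample graph are hypothetical.

```python
import networkx as nx

# Hypothetical book network; weights stand in for similarity scores.
G = nx.Graph()
G.add_weighted_edges_from([
    ("Exclusive", "Book A", 0.9),
    ("Exclusive", "Book B", 0.6),
    ("Book A", "Book B", 0.4),
    ("Book B", "Book C", 0.7),
])


def recommendations_for(graph, title):
    """Return a new graph with only `title` and the edges that touch it."""
    star = nx.Graph()
    star.add_weighted_edges_from(graph.edges(title, data="weight"))
    return star


selected = recommendations_for(G, "Exclusive")
print(sorted(selected.nodes))
```

Edges between the recommended books themselves are dropped here, so the result is a star centered on the chosen book.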

This narrows down the options that you are looking at, but can still be a lot for someone looking for one book to read. Since one of the main ideas behind this project is to provide readers with more control, I have provided a way to narrow down the recommendations by applying more filters.

Step 3: Filters

The first filter limits the recommendations to the top few options, with the number supplied by the reader. For example, if I ask for 4 recommendations, the network looks like this:

Network showing the top 4 recommendations for the book “Exclusive”
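A top-N filter like this can be sketched by sorting the selected book's edges by weight and keeping the strongest few. The function name, the sample books, and the weights below are assumptions for illustration.

```python
import networkx as nx

# Hypothetical network: "Exclusive" connected to five other books.
G = nx.Graph()
G.add_weighted_edges_from([
    ("Exclusive", "Book A", 0.9),
    ("Exclusive", "Book B", 0.8),
    ("Exclusive", "Book C", 0.7),
    ("Exclusive", "Book D", 0.6),
    ("Exclusive", "Book E", 0.2),
])


def top_n_recommendations(graph, title, n):
    """Keep only the n strongest edges that touch `title`."""
    edges = sorted(
        graph.edges(title, data="weight"),
        key=lambda edge: edge[2],  # edge is (title, neighbor, weight)
        reverse=True,
    )
    top = nx.Graph()
    top.add_node(title)
    top.add_weighted_edges_from(edges[:n])
    return top


top4 = top_n_recommendations(G, "Exclusive", n=4)
print(sorted(top4.nodes))
```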

Going further, more filters can be applied to these recommendations. The two that I have tried are based on the author’s name and the year of publication of the book. Continuing the example of “Exclusive” by Sandra Brown, suppose we only wanted to see similar books by the same author. The network above would then change to this:

Network showing the top 2 recommendations for the book “Exclusive” written by the author Sandra Brown.
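Filters like these can be sketched by storing book metadata as node attributes and keeping only the neighbors that match. Everything below (the function name, the placeholder titles, and the publication years) is hypothetical sample data rather than values from the real dataset.

```python
import networkx as nx

G = nx.Graph()
# Hypothetical metadata stored as node attributes; years are placeholders.
G.add_node("Exclusive", author="Sandra Brown", year=1996)
G.add_node("Book A", author="Sandra Brown", year=2001)
G.add_node("Book B", author="Author X", year=1989)
G.add_node("Book C", author="Sandra Brown", year=1990)
G.add_weighted_edges_from([
    ("Exclusive", "Book A", 0.9),
    ("Exclusive", "Book B", 0.8),
    ("Exclusive", "Book C", 0.5),
])


def filter_recommendations(graph, title, author=None, min_year=None):
    """Keep neighbors of `title` that pass the optional author/year filters."""
    keep = [title]
    for book in graph.neighbors(title):
        info = graph.nodes[book]
        if author is not None and info["author"] != author:
            continue
        if min_year is not None and info["year"] < min_year:
            continue
        keep.append(book)
    # subgraph() also keeps any edges between the surviving neighbors.
    return graph.subgraph(keep).copy()


same_author = filter_recommendations(G, "Exclusive", author="Sandra Brown")
print(sorted(same_author.nodes))
```

Passing `min_year` as well narrows the result further, so the two filters can be combined.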

Step 4: Make it more interactive

Since it looked like the recommendation system was coming along pretty well so far, my next step was to make it interactive so that these filters and inputs don’t need to be manipulated in the code file, but can be applied in real-time by a reader looking for their next book.

As seen in the video, all the filters we looked at above can be applied right there on the dashboard. This dashboard has the additional capability to color the book nodes based on whether they were published before or after a year supplied by the reader. Moreover, if you size edges by the weight property as shown, you can see how strongly recommended a book is based on the one it is connected to. The wider the edges, the more it implies that readers who liked one of those books also liked the other one.
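For readers who want to try something similar, here is a sketch of preparing data for a jaal dashboard. It assumes jaal's documented input format: an edge table with "from" and "to" columns and a node table with an "id" column, with extra columns available as attributes for coloring and sizing. The sample rows, the `add_year_flag` helper, and the "published" column are hypothetical, and the launch call is left commented out so the snippet runs without starting a server.

```python
import pandas as pd

# Edge table: "from"/"to" name the connected books; "weight" is the score.
edge_df = pd.DataFrame({
    "from":   ["Exclusive", "Exclusive", "Book A"],
    "to":     ["Book A", "Book B", "Book B"],
    "weight": [0.9, 0.6, 0.4],
})

# Node table: "id" must match the names used in the edge table.
# Authors and years are placeholder values.
node_df = pd.DataFrame({
    "id":     ["Exclusive", "Book A", "Book B"],
    "author": ["Sandra Brown", "Sandra Brown", "Author X"],
    "year":   [1996, 2001, 1989],
})


def add_year_flag(nodes, cutoff_year):
    """Add a categorical column that nodes can be colored by."""
    flagged = nodes.copy()
    flagged["published"] = [
        "after" if year > cutoff_year else "on or before"
        for year in flagged["year"]
    ]
    return flagged


node_df = add_year_flag(node_df, cutoff_year=1995)
print(node_df)

# To open the dashboard (requires the jaal package; starts a local server):
# from jaal import Jaal
# Jaal(edge_df, node_df).plot()
```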

Possible enhancements

This project is my attempt at making book recommendations more transparent and providing readers with more control while browsing for their next book.

Future work on this project could incorporate more information about each book, such as its genre. The dashboard could also be enhanced with the ability to hover over a book node or an edge to get more information about the book or the connection between two books.