Build a React + Flask App that Suggests Novel Novels with a Python Graph
Project Summary:
- Build a Graph database of Users and the Books they read
- Develop a Flask App that serves up rare, interesting Books to Users based on their submitted favorites
- Implement a React App that integrates with Flask + our Graph to showcase to users their next favorite book
There are a lot of great resources for learning Data Science techniques out there: MOOCs, blogs, tutorials, and bootcamps are all avenues for learning. Personally, I learn best by working on projects that I find interesting and engaging. Nothing motivates me more to push my level of understanding to new heights than working on an intriguing problem. How do I identify new avenues that I can explore? Generally by reading other people’s work!
The idea for this project was first realized by reading Vicki Boykis’ great blog post about the changing nature of Netflix’s recommendation engine. In short (sorry for the hack job, Vicki), gone are the days of minimizing the RMSE of a user’s rating preferences with Matrix Factorization or Deep Learning. Recommendations are still partly an art form, with context, nuance, and design all playing a large part in providing a great experience to the user. In addition, the business value of a recommendation is not simply to provide the best content to the user. In the case of Netflix, it may be to provide the best Netflix content to the user.
I had the great pleasure of attending RecSys 2019 in Copenhagen this year. The winner for Best Paper was one that was not a State of the Art neural network approach. Instead, it was a careful examination of previous, open source recommendation systems, public datasets, and their actual performance in a head-to-head matchup. The results, you may ask? Simple models (Graph based, User/Item Nearest Neighbors, Top Popular items) nearly always outperform the best deep neural networks.
It is safe to say I have been rethinking Recommender Systems in light of these revelations. What do I, as a user, want from YouTube’s recommendation engine? What about a Search Engine like Ecosia/DuckDuckGo? In the former example, I might want to see videos that entertain/inform me in that moment. In the latter example, I want to find the most relevant information, quickly, without worrying about conflicts of interest. In either case, the returned items may not be the most popular, highly rated, or controversial. Advertisers sometimes find virality to be profitable, but I may not be a fan.
So what do I truly want? How can I build a system that optimizes for this elusive, unprofitable goal? There was a landmark moment of clarity in my mind as I was walking to the library the other day: I absolutely love when I find a new author or book that I had not considered or heard of. The sheer discovery and novelty of that moment can inspire weeks of happiness as I flip through the pages.
Time spent reading a book is an investment, and so it can be a difficult proposition to try out anything new (explore vs. exploit). A system that just recommends random items may provide unexpected results, but they may not be good ones. I want to have my cake and eat it, too. I want to find great books that I would never have considered before.
This idea is simple:
- People who read and like the same books as I do may have roughly similar tastes to mine
- These same neighbors probably have read multiple books, some of which I have not read
- Some of these unread books may be so obscure/unpopular, that I would be unlikely to discover them on my own
None of the above statements should be that surprising to you, particularly the first two points. The last point, of course, is the most fundamental to this work. What if I built a Graph database of books, authors, and readers, and then walked along the nodes of the graph to find these hidden gems of books?
The site GoodReads has published a dataset of 10,000 books, 50,000+ users, and nearly 6 million ratings (1–5 Stars) of books. Surely I could find something interesting in this treasure trove! I have never built a real graph before in my life, so I thought this would be a great trial by fire. Nodes and Edges, how hard could it be?
The Nodes of this Graph are the Users, Books, and Authors.
The Edges are the ratings associated with each (User, Book, Author, rating) tuple.
I first created 4 Classes to account for my Graph components. Each Node will also have a List associated with it: Users will have a shelf of Books, Authors will have their bibliography, and Books will have their audience. The Graph will connect all of these objects together into a cohesive unit that can be traversed.
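As a rough illustration of how four such classes could fit together, here is a minimal sketch. It is not the full code from the repo: the attribute names (shelf, bibliography, audience) follow the description above, while the add_rating method, the dictionary lookups, and the sample ratings are assumptions made for this example.

```python
from dataclasses import dataclass, field


# eq=False keeps identity-based equality, which avoids recursive comparisons
# between nodes that reference each other.
@dataclass(eq=False)
class Author:
    name: str
    bibliography: list = field(default_factory=list)  # Books by this Author


@dataclass(eq=False)
class Book:
    title: str
    author: Author
    audience: list = field(default_factory=list)  # (User, rating) pairs


@dataclass(eq=False)
class User:
    user_id: int
    shelf: list = field(default_factory=list)  # (Book, rating) pairs


class Graph:
    """Connects Users, Books, and Authors through rating edges."""

    def __init__(self):
        self.users = {}    # user_id -> User
        self.books = {}    # title -> Book
        self.authors = {}  # name -> Author

    def add_rating(self, user_id, title, author_name, rating):
        """Add one (User, Book, Author, rating) edge, creating nodes as needed."""
        author = self.authors.setdefault(author_name, Author(author_name))
        if title not in self.books:
            book = Book(title, author)
            self.books[title] = book
            author.bibliography.append(book)
        book = self.books[title]
        user = self.users.setdefault(user_id, User(user_id))
        user.shelf.append((book, rating))
        book.audience.append((user, rating))


# Made-up ratings, for illustration only.
graph = Graph()
graph.add_rating(1, "Dune", "Frank Herbert", 5)
graph.add_rating(2, "Dune", "Frank Herbert", 4)
graph.add_rating(2, "Children of Dune", "Frank Herbert", 5)
```

Storing each rating on both the User's shelf and the Book's audience means the graph can be walked in either direction, from a reader to their books or from a book to its readers.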
The full Graph code is pretty chunky, and there are a number of methods inside that are not going to be explored here. We will focus on one that can be directly used with an easy, consumer facing React App API: _book2book. So let’s say that we can now walk around our Graph, and everything compiles fine. How might we find meaningful Books to present to a User? In our mind, let us take a hypothetical walk:
- You, as a User, love a Book by an Author (e.g., a five star rating)
- This Author has many fans, and she has written similar Books you may also love. These are not novel Novels, as you may be aware of them already.
- This Author has many fans, some of whom may be similar to you
- One of these Users, selected randomly, will have a shelf of favorite Books
- Select the least popular, five star rated Book from this shelf
- Either present this to the User, or head back to Step 1 and keep walking through the Graph
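The walk above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the _book2book method itself: the shelves and authors dictionaries are invented toy data, popularity is assumed to mean "number of users with the title on their shelf", and the rule "keep walking if the pick is by the same author" is one possible way to decide between presenting a book and heading back to Step 1.

```python
import random

# Toy data invented for this sketch: user -> {book title: star rating}.
shelves = {
    "ann": {"Book A": 5, "Book B": 5},
    "bob": {"Book A": 5, "Book C": 5, "Book D": 5},
    "cat": {"Book C": 5, "Book E": 4},
}
authors = {
    "Book A": "Author X", "Book B": "Author X",
    "Book C": "Author Y", "Book D": "Author Y", "Book E": "Author Z",
}


def popularity(title):
    """Assumed popularity measure: how many users have the title on a shelf."""
    return sum(title in shelf for shelf in shelves.values())


def fans_of(author):
    """Users who gave five stars to at least one book by this author."""
    return [
        user for user, shelf in shelves.items()
        if any(authors[t] == author and stars == 5 for t, stars in shelf.items())
    ]


def novel_book(liked_title, rng, max_steps=20):
    """Walk book -> author -> random fan -> that fan's rarest five star book."""
    start_author = authors[liked_title]
    current, seen = liked_title, {liked_title}
    for _ in range(max_steps):
        fans = fans_of(authors[current])
        if not fans:
            return None
        fan = rng.choice(fans)  # one fan, selected randomly
        favorites = [
            t for t, stars in shelves[fan].items() if stars == 5 and t not in seen
        ]
        if not favorites:
            continue  # this fan has nothing new to offer; try another fan
        pick = min(favorites, key=popularity)  # least popular favorite
        if authors[pick] != start_author:
            return pick  # a different author: present it
        seen.add(pick)  # same author, so not novel: keep walking
        current = pick
    return None


print(novel_book("Book A", random.Random(0)))
```

On this toy data the only title the walk can return is Book D: it is five star rated on bob's shelf, written by a different author than Book A, and read by fewer users than Book C. If the step limit runs out first, the function returns None.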
Now we just have to return this Book to our User as a possible great discovery! We are missing the key ingredient to this project, though. We need the App to deliver these interesting Books. Let’s start with the part that is still Python based, for familiarity: we need to build the back end of our App that handles API requests, and so, of course, we are going to use Flask. Flask allows us to make a small web server in Python, and it is pretty easy to use. I found two tutorials to be extremely useful, and they were nicely linked to one another: a Flask tutorial, and a React App that calls the aforementioned Flask app.
So we first make a new directory that we will call api/. Inside this folder, we will have two Python files: __init__.py, which will build up the basics of our Flask app, import the necessary variables, etc., and an app.py (or whatever you want to call it).
Inside the app.py, we are doing a number of important things. First, we call our Python code that builds our graph database for finding new books. It also sets up the Blueprint of our backend, so when we make API calls to this service, we are pointing in the correct direction. Let’s unpack these things at a very basic level:
- Our app name is initialized to be @main, but we could have other stuff here as well
- We are hosting two .routes in our @main app: /input_book and /novel_novel
- You can think of these as separate pages on our Web app, and they both handle different behaviors
- The HTTP methods of GET and POST, which get something from our backend or post something to it (good terminology!)
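The pieces listed above can be sketched as a single-file Flask app. This is a minimal illustration under assumptions rather than the project's actual api/ package: the SUGGESTIONS dictionary stands in for the call into the Graph, and the names SUGGESTIONS, latest, and create_app, along with the example URL, are made up for this sketch. Only the Blueprint name main, the two routes, and the JSON shapes come from the description in this post.

```python
from flask import Blueprint, Flask, jsonify, request

# Stand-in for the Graph lookup: input title -> cover image URL of the
# suggested book. The title and URL are invented for this sketch.
SUGGESTIONS = {"book a": "http://example.com/book_d.png"}

main = Blueprint("main", __name__)

# Module-level state shared by both routes, playing the role of the
# global variable that holds the image URL of the output book.
latest = {"image_url": ""}


@main.route("/input_book", methods=["POST"])
def input_book():
    """Receive {"book": "book_title"} and remember the suggestion for it."""
    payload = request.get_json(silent=True) or {}
    title = str(payload.get("book", "")).strip().lower()
    if title not in SUGGESTIONS:
        # Unknown title: tell the client instead of breaking the app.
        return jsonify({"error": "title not found"}), 400
    latest["image_url"] = SUGGESTIONS[title]
    return jsonify({"status": "ok"}), 200


@main.route("/novel_novel", methods=["GET"])
def novel_novel():
    """Return the most recent suggestion as {"image_url": "..."}."""
    return jsonify(latest)


def create_app():
    """Build the Flask app and attach the Blueprint."""
    app = Flask(__name__)
    app.register_blueprint(main)
    return app
```

Flask's built-in test client is a convenient way to try the two calls without starting a server, for example create_app().test_client().post("/input_book", json={"book": "Book A"}) followed by a GET on /novel_novel. Note that module-level state like latest is shared by every visitor to the site, which is fine for a learning project but not for a multi-user service.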
For the /input_book call, what we are doing is receiving a raw text input from the User on the webpage, packaging it as a JSON that looks like {"book": "book_title"}, and then updating the back end server. In this case, we are using this initial book to find a similar, rare book the User may enjoy. We call our Graph with this input, and set a global Python variable to the image URL of the output book.
Now you can imagine that the WebApp would make a 2nd API call to the Flask App, saying “What is the output of that input?”. This is our GET call in /novel_novel, which returns a JSON as well. This probably sounds pretty reasonable to most of you right now, and it did to me as well. You may have even used Flask before. But have you ever built a React App before?
React is a Facebook sponsored open source JavaScript library for making UIs easier to code up. It still took me a bit of time to learn some JavaScript basics, understand the syntax for React, and try to get everything to look decent. The two important resources that I used are the previously mentioned YouTube video and this TowardsDataScience article. The video guide utilizes a second React library, semantic-ui-react, that makes App building even easier, and so I mostly followed their advice. And honestly, you should too. The basics of installing the necessary tools, building the boilerplate App.js file, and explaining how to write functions are better left to the JavaScript professionals.
I will go into a little detail about the actual React components I hacked together to integrate with our Graph, though. The main function here is the App.js file, so we will unpack that first. We import our necessary libraries, initiate App(), and then return some HTML/JavaScript-like entities (Divisions, Containers, Images, etc.).
You will notice inside the Container we have two components: <BookEntry/> and <GrabBook/>. These correspond to our two API calls in our Flask app: the first makes a POST call with a title that our Graph understands, and the second makes a GET call to grab our interesting new novel for the user. Let us look at the more complicated component first, the BookEntry.
We import a few extra packages from semantic-ui-react in this component: { Form, Input, Button }. We want the user to type text input into a Form corresponding to the title of the book they love, and then hit a submit Button. First, we initialize a state variable and its setter with const [title, setTitle] = useState('');, so the default value is an empty string.
Next, we build the <Form.Field>, and we add in a placeholder value that provides instructions to the user on what to do. Once the user has entered some text, we use setTitle to set the title state to the entered text. Easy!
Next, we add a new <Form.Field> in the shape of a Button that the user can click on right underneath the text input. On the click of the button (<Button onClick= ...), we make our first call to our API via the await fetch("/input_book" call. Recall that in our Flask app, our Blueprint has @main.route('/input_book', so this call hits our backend at this contact point. The returned value of this POST call is just a response that says everything went OK if and only if we find the title in our Graph. Otherwise we return a 400 and ask the user to submit a new title. It is possible that our database does not have the title, or maybe it is spelled differently, etc., so we do not want our App to break in these cases.
Prior to moving on to the next component, I would encourage anyone trying to build one of these web apps to use console.log() calls all over the place in your code. Not only do they double as comments, they can be so helpful in debugging your code. Recall that you can right click on any webpage in Chrome, choose Inspect, and then open up the console tab to see everything that is going on.
So, let’s assume that everything went perfectly and we have found the relevant title. Now we actually have to return a new book for the user in our GrabBook component.
Here we make another API call, this time to the GET endpoint, via a fetch("/novel_novel"). Remember that we are returning a JSON object that looks like {"image_url": "http://image_of_book.png"}. We then pass this into an <Image> block that renders the image and returns it to our App. Cool!
To be fair, the page does not look great. If you are a React expert, please let me know how I can make the layout look better:
- After submitting a request, it requires a refresh of the page to display the correct image.
- I would also prefer if the background image covered the entire page, and the form fields were in the foreground of it.
This work is not intended to be a commercialized, production ready system. It is meant to be a learning adventure that is fun to use and thought provoking in its scope. Here is a link to the repo if you are interested in checking it out.
Thank you for stopping by! Feel free to comment below and ask for books. I personally have checked out 10+ recommendations from the Graph, and I was nearly always surprised by the results. I was glad I investigated the suggestions, and although some were not my cup of tea, others are wonderful discoveries.