My Experience with Flask and Bokeh plus a small tutorial (kinda)
When I set out on my first Python project, I had a clear goal in mind. I wanted to scrape data and schedule the download to run automatically without any user input, which I detailed in my first post. Next, I needed to store the data I had just scraped, so my second post describes putting it into MongoDB, using Beautiful Soup to scrape more data, and creating a data visualization with Altair.
For my third installment, I wanted to display the data visualizations and any analysis on a website. To accomplish this, I chose Flask and Bokeh. I picked Flask over Django because I had read that Flask is a lightweight framework whereas Django is a full-stack framework, and I really just wanted something I could use to quickly develop a base concept. Based on that little piece of information, I went with Flask. My decision to go with Bokeh was based on reading a few blog posts and knowing that I would probably be better off just making a choice and moving on. So Bokeh it was!
To start, I did the most basic task in Flask…creating the “Hello World” app from the Flask homepage. Quickly moving on, I found the Creating Interactive Bokeh Applications with Flask tutorial, which uses the Iris data set to create interactive histograms. After a short read through the tutorial, I picked up some of what the author was saying. For example, I understood that the name parameter in the URL was being passed to Flask, but that did not directly apply to my project, at least not at this stage. Eventually, I got to the part where the author used a Jinja2 template to create a drop-down menu which then updated the Bokeh plot…yeah, that went over my head. Instead I tried to R&D, retrieve and duplicate, the author’s work by copying and pasting the author’s code into Sublime and altering it to work for my application.
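The two Flask ideas from that paragraph, a “Hello World” route and a name parameter read from the URL, can be sketched as follows. This is a minimal illustration and not the tutorial's code: the /greet route and the "name" query parameter are names I made up for the example.

```python
from flask import Flask, request
from markupsafe import escape

app = Flask(__name__)


@app.route("/")
def index():
    # The classic "Hello World" view.
    return "Hello, World!"


@app.route("/greet")
def greet():
    # Read a hypothetical "name" query parameter, e.g. /greet?name=Iris,
    # falling back to "World" when it is missing.
    name = request.args.get("name", "World")
    # escape() keeps user-supplied text from being interpreted as HTML.
    return f"Hello, {escape(name)}!"
```

Saved as app.py, this can be served locally with `flask run`, after which visiting /greet?name=Iris should greet Iris.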
I started by creating a function to build a plot and calling it inside def index():. Many bugs were encountered, but alas, Terminix was closed.
I found the first bug while looking at the HTML template: the template loaded Bokeh version 0.12.5, but the version installed on my machine was 1.0.2. I promptly updated the version in the HTML template, and like magic my plot appeared.
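One way to avoid this kind of mismatch is to let Bokeh generate the script tags itself instead of hard-coding a version in the template. The sketch below is an assumption on my part and not what the tutorial does: it uses components() together with CDN.render(), so the BokehJS version in the page always follows the installed Python package. The render_plot_page helper is a made-up name.

```python
import bokeh
from bokeh.embed import components
from bokeh.plotting import figure
from bokeh.resources import CDN


def render_plot_page(plot):
    """Return a complete HTML page that embeds the given Bokeh plot."""
    # components() returns the <script> that draws the plot and the <div>
    # it draws into.
    script, div = components(plot)
    # CDN.render() emits the BokehJS tags for the installed Bokeh version,
    # so there is no version number to keep in sync by hand.
    return f"""<!DOCTYPE html>
<html>
  <head>
    {CDN.render()}
  </head>
  <body>
    {div}
    {script}
  </body>
</html>"""


plot = figure(title="Version check")
plot.scatter([1, 2, 3], [4, 5, 6])
html = render_plot_page(plot)
```

A Flask view could return the html string directly, or pass script, div, and CDN.render() into a Jinja2 template.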
Next, to add interactivity, I added the Select widget, which creates a drop-down menu. I could then pick a value to change my graph, or so I thought. My code ran successfully and the widget appeared, but the graph did not change when I selected a new value.
Since I did not have a sound foundation in either Flask or Bokeh, I went to the Bokeh documentation and copied a simple example to see if I could get that to work. Choosing the Sliders example, I copied the code into a Jupyter Notebook and added a call to output_notebook(), which made the plot appear. But to my dismay I was greeted with an error.
Seeing that only JavaScript callbacks may be used with standalone output, I started to look for how to implement a CustomJS callback instead of using the event handler .on_change(). Following the first example from the JavaScript Callbacks page in the Bokeh documentation, I was able to implement a working interactive plot!
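For readers who have not seen one, a CustomJS callback looks roughly like the sketch below. It is written in the spirit of the documentation example rather than copied from it, so the data, the slider range, and the variable names are my own assumptions. The key point is that the update logic is a string of JavaScript that runs in the browser, which is why it works without a Bokeh server.

```python
from bokeh.layouts import column
from bokeh.models import ColumnDataSource, CustomJS, Slider
from bokeh.plotting import figure

# Made-up data: x runs from 0 to 1 and y starts out equal to x.
x = [i / 100 for i in range(101)]
source = ColumnDataSource(data=dict(x=x, y=list(x)))

plot = figure(title="y = x ^ power")
plot.line("x", "y", source=source)

slider = Slider(start=0.1, end=4, value=1, step=0.1, title="power")

# The JavaScript below runs in the browser every time the slider moves.
# cb_obj is the widget that fired the callback (here, the slider).
callback = CustomJS(args=dict(source=source), code="""
    const power = cb_obj.value;
    const x = source.data.x;
    const y = Array.from(x, (v) => Math.pow(v, power));
    source.data = {x: x, y: y};
""")
slider.js_on_change("value", callback)

layout = column(slider, plot)
```

Calling show(layout) from bokeh.io, after output_notebook() in a notebook, should display the slider and plot together.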
Even though I found success, there was a major flaw…I’m a statistician who doesn’t know JavaScript!!! What a cruel world. I looked into the Bokeh documentation some more and saw that it was possible to embed a Bokeh server in a Flask server. Interesting. Looking back at the documentation, I saw in the Adding Widgets section that I could use bokeh serve to start the Bokeh server and set up event handlers with .on_change (or for some widgets, .on_click). Immediately after reading this line it clicked, and I found the flask_embed example.
Following the example I was able to generate my first interactive plot with Bokeh and Flask!
Now to change the example to fit my own needs, which resulted in the figure below.
So a lot is going on in this example. First I import all the required libraries and then define a function, called modify_doc(), that creates my plots. In the function, I create a connection to my MongoDB, query the database, and then perform a little cleaning because I didn’t clean my data before loading it into the database, a rookie error. Next, I create my plots by calling figure(). From there I start the Bokeh server, and then the Flask server. And with a little bit of magic I have my first web app!
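The overall shape of that app can be sketched as follows. This is a stripped-down illustration modeled on the flask_embed example, not my actual code: the MongoDB query is replaced by a small made-up PITCHES dictionary, and the ports, the /bkapp path, and the main() helper are assumptions.

```python
from threading import Thread

from bokeh.embed import server_document
from bokeh.layouts import column
from bokeh.models import ColumnDataSource, Select
from bokeh.plotting import figure
from bokeh.server.server import Server
from flask import Flask, render_template_string
from tornado.ioloop import IOLoop

# Made-up stand-in for the pitch data that the post pulls from MongoDB.
PITCHES = {
    "Fastball": {"x": [-0.5, 0.1, 0.4], "y": [2.0, 2.6, 3.1]},
    "Curveball": {"x": [-0.2, 0.3, 0.8], "y": [1.5, 1.9, 2.4]},
}


def modify_doc(doc):
    """Build the plot and widget for one browser session."""
    source = ColumnDataSource(data=PITCHES["Fastball"])
    plot = figure(title="Pitch location")
    plot.scatter("x", "y", source=source)
    select = Select(title="Pitch type", value="Fastball", options=list(PITCHES))

    def update(attr, old, new):
        # Runs in Python on the Bokeh server, so no JavaScript is needed.
        source.data = PITCHES[new]

    select.on_change("value", update)
    doc.add_root(column(select, plot))


app = Flask(__name__)
PAGE = "<!DOCTYPE html><html><body>{{ script|safe }}</body></html>"


@app.route("/")
def index():
    # server_document() returns a <script> tag that loads the Bokeh app.
    script = server_document("http://localhost:5006/bkapp")
    return render_template_string(PAGE, script=script)


def bk_worker():
    # Run the Bokeh server (default port 5006) on its own event loop.
    server = Server(
        {"/bkapp": modify_doc},
        io_loop=IOLoop(),
        allow_websocket_origin=["localhost:8000", "127.0.0.1:8000"],
    )
    server.start()
    server.io_loop.start()


def main():
    # Start the Bokeh server in a background thread, then Flask on port 8000.
    Thread(target=bk_worker, daemon=True).start()
    app.run(port=8000)
```

Calling main() and opening http://localhost:8000 should show the drop-down and the plot. Because update() runs on the Bokeh server, the .on_change handler works here where it failed in standalone output.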
However, these graphs are merely displaying points, and I would like to perform some analysis. That’s where a generalized additive model comes into play. Using a generalized additive model, I am able to model the probability that a pitch will be called a strike depending on where it crosses home plate. As you can see in the graph below, the strike zone is not a square but more of an ellipse!
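One common way to write such a model is shown below. The exact form is an assumption for illustration: it uses a logit link and a single smooth surface f over the horizontal location x and vertical location z of the pitch as it crosses the plate.

```latex
\operatorname{logit} p(x, z)
  = \log \frac{p(x, z)}{1 - p(x, z)}
  = \beta_0 + f(x, z),
\qquad
p(x, z) = P(\text{called strike} \mid x, z)
```

Under this form, the edge of the called strike zone is the contour where p(x, z) = 0.5, and because f is a smooth function estimated from the data, that contour is free to come out rounded rather than rectangular.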
Now, my next steps will include performing more analysis like the generalized additive model and putting it into my Flask app in order to tell a story with data.