Bokeh: Interactive Visualizations (With a Side Of Wine)
Grab your glass — we’re getting clicky with it. This tutorial serves as a “getting started” guide for how to create interactive visualizations with Bokeh.
Data is one of the most important assets a business or organization can have, but it will serve virtually no purpose if the data insights are not clear and understandable to its users. Data visualization solves this problem: Successful data visualization translates your data into clear results that are easy for anyone to absorb.
A personal example: A few weeks back, my Flatiron School teammates and I gave a practice presentation for our second Data Science cohort project, which consisted of creating a linear regression model to predict housing prices. We were proud to have built, tested, and improved a multiple linear regression model from the ground up, and were eager to share the results! Fortunately, before the real presentation, we received this feedback:
“Too technical. Way. Too. Technical.”
I was shocked. After diving head-first into enough Statistics and Calculus to want to bang my head against the wall, and excitedly feeling like I was beginning to comprehend the “science” behind Data Science, receiving feedback like this momentarily made my head spin.
But it was valid, and important, feedback. I was naive to assume that a stakeholder in a business setting would care about which log transformations I performed in the modeling process! (And in a real-world setting, they’d probably want to log me out of the conference room.) This is why clear data visualizations are key to making your data findings useful to others. Today, I will show you how the Bokeh Python library makes the data visualization process easy, fun, and interactive.
What is Bokeh?
Bokeh is an open-source Python library for creating interactive data visualizations. The options for customization are impressively abundant and easy to implement.
With just a couple of quick lines of code for installation and setup, you can build powerful custom interactive applications in minutes, then export them as an HTML webpage or embed them in your Jupyter notebook. It is also important to note that this library lets you create interactive visualizations for modern web browsers using only Python code, no JavaScript needed!
In this article, I will walk you through the easy installation steps to get started with our super cool Bokeh-generated data dashboards.
How to Install Bokeh
Bokeh is installable with one line of code via conda or pip.
To install with conda, use the following:
conda install bokeh
Or, to install with pip, use the below:
pip install bokeh
It should take less than a minute to complete.
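To confirm the install worked, one quick check (a minimal sketch, not part of the original steps) is to import the package and print its version. The exact version string will depend on when you install.

```python
# Confirm that Bokeh is importable and report which version was installed.
import bokeh

installed_version = bokeh.__version__
print(f"Bokeh version: {installed_version}")
```

If the import fails, the install did not land in the Python environment you are currently running.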
Getting Started
Before diving into advanced customizations, it is important to understand the fundamentals of how Bokeh visualizations are coded. Below, I will demonstrate a very simple and straightforward generation of a line chart.
First, import the Bokeh plotting module (I will also be using Pandas to handle my data):
from bokeh.plotting import figure, show
import pandas as pd
Create two simple lists for a sample x and y plane:
x = [2, 2, 3, 4, 5]
y = [1, 7, 5, 1, 9]
Great, we have data! Now, as with other plotting libraries, we first create a figure, giving it a title and labels for the x and y axes.
p = figure(title="Our Simple Line Example", x_axis_label='X values', y_axis_label='Y values')
Now we call the line() method on our plot to add a line renderer. To this renderer we pass the lists containing our data, a legend label, and an optional custom line width.
p.line(x, y, legend_label="Our simple line connecting X and Y", line_width=3)
Hooray! We are ready to create our first Bokeh plot. All we need to do is call the show() function and pass in our plot. Note: By default, show() writes your plot to an HTML file and opens it in your browser, usually in a new tab. I want to keep my plot in my Jupyter notebook, so I will call the Bokeh output_notebook() function before generating my plot.
# import Jupyter notebook tool
from bokeh.io import show, output_notebook
output_notebook()
# show us our first Bokeh plot
show(p)
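If you are working in a plain Python script rather than a notebook, a rough equivalent is to write the plot to an HTML file yourself. The sketch below reuses the same x and y lists; the file name simple_line.html and the temporary-directory location are placeholder choices of mine, not anything Bokeh requires.

```python
import os
import tempfile

from bokeh.plotting import figure, output_file, save

x = [2, 2, 3, 4, 5]
y = [1, 7, 5, 1, 9]

# Rebuild the simple line plot from above.
p = figure(title="Our Simple Line Example", x_axis_label="X values", y_axis_label="Y values")
p.line(x, y, legend_label="Our simple line connecting X and Y", line_width=3)

# Tell Bokeh where to write the standalone HTML page (placeholder path).
html_path = os.path.join(tempfile.gettempdir(), "simple_line.html")
output_file(html_path, title="Our Simple Line Example")

# save() writes the file without opening a browser and returns the path it wrote.
saved_path = save(p)
print(saved_path)
```

Opening the saved file in a browser gives the same interactive plot. Calling show(p) instead of save(p) would write the file and open it for you.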
To the right of your new Bokeh plot, you will find a built-in navigation panel containing several ways to explore your data. Even with the simplest data set containing two lists with 5 data points each, you can still perform exploratory functions like panning around the chart or zooming in and out. You can also save the image with a click of a button while zoomed in, and reset the view afterwards.
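That navigation panel (Bokeh calls it the toolbar) can be adjusted when you create the figure. As a small sketch, with a tool selection chosen purely for illustration, the tools argument takes a comma-separated list of tool names and toolbar_location controls where the toolbar sits:

```python
from bokeh.plotting import figure

# Pick the tools explicitly and keep the toolbar on the right-hand side.
p_tools = figure(
    title="Our Simple Line Example",
    tools="pan,wheel_zoom,box_zoom,save,reset",
    toolbar_location="right",
)
p_tools.line([2, 2, 3, 4, 5], [1, 7, 5, 1, 9], line_width=3)

# List the class names of the tools attached to this figure.
tool_names = [type(tool).__name__ for tool in p_tools.toolbar.tools]
print(tool_names)
```

Leaving tools out entirely gives you Bokeh's default set, which is what the plot above uses.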
Let’s Visualize Some Data!
Congratulations, you’ve created your first interactive Bokeh plot! Now let’s demonstrate how to use some of Bokeh’s unique functionality with real data, and explore how it lets you build a sophisticated bridge between data analytics and user interactivity.
I am writing this article on a weekend, so the first common data set that comes to mind is the wine data set from the UCI Machine Learning Repository. Let’s import this and get started! (Optional: grab a glass of your favorite pinot.)
wine = pd.read_csv('wine.csv')
# peek what data we are working with
wine.head()
This Italian Wine data set is the result of a chemical analysis of 178 wines grown in the same region in Italy, but derived from three different cultivars. We can see that the three types of wine are classified in the “Wine” column under the labels “1”, “2”, and “3”.
Per the data set documentation, these labels represent grapes from the Barbera, Barolo, and Grignolino regions. I’m interested in visualizing how these regions differ in the characteristics of the wine, so I will first rename these integers to their corresponding strings:
wine_names = {1: "Barbera", 2: "Barolo", 3: "Grignolino"}
wine["Wine"] = wine["Wine"].replace(wine_names)
Let’s make sure the Wine column in our data frame now holds the corresponding names:
wine.sample(5)
Looking good! We are ready to plot our data.
Glyphs
Before we get started, it’s important to understand that Bokeh draws its figures with “glyphs” such as line, scatter, circle, quadratic, vbar (vertical bars), etc. The Bokeh glyph documentation lists them all. Glyphs can be layered on top of each other in a figure to create a graph with various types of data.
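To make the layering idea concrete, here is a minimal sketch that draws two glyphs on one figure, reusing the small x and y lists from the line example. The figure title and legend labels are my own placeholders.

```python
from bokeh.plotting import figure

x = [2, 2, 3, 4, 5]
y = [1, 7, 5, 1, 9]

layered = figure(title="Line and Scatter Layered Together")

# First glyph: a line connecting the points.
layered.line(x, y, line_width=2, legend_label="line")

# Second glyph: markers drawn on top of the same points.
layered.scatter(x, y, size=10, fill_color="white", legend_label="points")
```

Each glyph call adds its own renderer to the figure, which is what makes the per-category legend later in this tutorial possible.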
Now, instead of a line figure, let’s generate another type, a scatter plot.
Create A Scatter Plot Figure
# create figure
graph = figure(title="Bokeh Scatter Graph", x_axis_label='Hue', y_axis_label='Proline')
# use scatter glyph to plot our data
graph.scatter(source=wine, x='Hue', y='Proline')
# show our figure
show(graph)
Here, we are exploring how our wine samples vary by their Hue vs their Proline. Now that we have all of our samples plotted, we can begin customizing and gaining meaningful insights.
Add Color Categorization
I want to know which samples belong to each of our three types of grape, so I will map each category to a color with factor_cmap (stored as index_cmap) and set legend_group to the “Wine” column.
# factor_cmap maps each category in a column to a color
from bokeh.transform import factor_cmap
# assign colors red, blue, and green to each of our three categories
index_cmap = factor_cmap('Wine', palette=['red', 'blue', 'green'],
                         factors=sorted(wine.Wine.unique()))
# create a new, colorful scatter plot
graph2 = figure(title="Bokeh Scatter Graph", x_axis_label='Hue', y_axis_label='Proline')
graph2.scatter(source=wine, x='Hue', y='Proline', fill_alpha=0.6, fill_color=index_cmap, size=10, legend_group='Wine')
show(graph2)
Here we can see how the three regions’ different types of grapes change the characteristics of our wine. Now, let’s make it interactive!
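One quick aside before we do: the colors passed to factor_cmap do not have to be typed out by hand. As a sketch, assuming you would rather use one of Bokeh's built-in palettes, Category10 from bokeh.palettes provides a ready-made set of three colors. The variable names below are my own.

```python
from bokeh.palettes import Category10
from bokeh.transform import factor_cmap

# The three category names used in this tutorial.
wine_types = ["Barbera", "Barolo", "Grignolino"]

# Category10[3] is a built-in palette holding three colors.
palette = Category10[3]
palette_cmap = factor_cmap("Wine", palette=palette, factors=wine_types)

# Factors and palette entries are paired up in order.
color_by_type = dict(zip(wine_types, palette))
print(color_by_type)
```

Passing fill_color=palette_cmap to scatter() in place of index_cmap would apply these colors instead.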
Add Hover Feature
By adding tools="hover" to our figure, we can specify custom data to display when the user hovers over each point. In this example, since we are comparing Hue and Proline, I added those two columns to our hover feature via the tooltips argument.
graph3 = figure(title="Bokeh Scatter Graph - Interactive", tools="hover", tooltips="@Wine: (@Hue,@Proline)")
graph3.scatter(source=wine, x='Hue', y='Proline', fill_alpha=0.6, fill_color=index_cmap, size=10, legend_group='Wine')
show(graph3)
Now we can hover over our data points and view the data affiliated with each point! This is completely customizable and will only display what you tell it to.
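Tooltips can also be given as a list of (label, value) pairs, which renders as a small labeled table rather than a single string. The sketch below uses a few placeholder rows that I made up for illustration (they are not values from the wine data set) so that it runs on its own; with the real data you would pass source=wine instead.

```python
import pandas as pd
from bokeh.models import HoverTool
from bokeh.plotting import figure

# Placeholder rows for illustration only; NOT values from the wine data set.
sample = pd.DataFrame({
    "Wine": ["Barbera", "Barolo", "Grignolino"],
    "Hue": [1.0, 0.9, 0.6],
    "Proline": [1000, 500, 600],
})

# Each tuple is (label shown to the user, column reference).
tooltips = [
    ("Wine", "@Wine"),
    ("Hue", "@Hue{0.00}"),  # display Hue with two decimal places
    ("Proline", "@Proline"),
]

hover_graph = figure(title="Hover With Labeled Tooltips", tools="hover", tooltips=tooltips)
hover_graph.scatter(source=sample, x="Hue", y="Proline", size=10)

# Look up the hover tool to confirm the tooltips were attached to it.
hover_tool = next(t for t in hover_graph.toolbar.tools if isinstance(t, HoverTool))
print(hover_tool.tooltips)
```

The @ prefix refers to a column in the data source, so any column in your data frame can be surfaced this way.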
Customize Legend and Hide Data
We can customize our data further by creating an interactive legend that allows you to display or hide each category.
As I mentioned before, glyphs can be layered in Bokeh. To create an interactive legend, we will split our data into three dataframes to capture our three categories: Barbera, Barolo, and Grignolino.
barbera = wine[wine["Wine"] == "Barbera"]
barolo = wine[wine["Wine"] == "Barolo"]
grignolino = wine[wine["Wine"] == "Grignolino"]
Now, instead of plotting all the data in one scatter glyph, we will layer three scatter glyphs on top of our figure and set the legend’s click_policy to "hide", which lets us hide a category by clicking it in the legend.
# Creating a new graph containing all our previous work for demonstration purposes
graph4 = figure(title="Bokeh Scatter Graph - Interactive", tools="hover", tooltips="@Wine: (@Hue,@Proline)")
# Here, we create a custom legend, layering our three different scatter glyphs
# Barbera data
graph4.scatter(source=barbera, x='Hue', y='Proline', fill_alpha=0.6, fill_color=index_cmap, size=10, legend_label='Barbera')
# Barolo data
graph4.scatter(source=barolo, x='Hue', y='Proline', fill_alpha=0.6, fill_color=index_cmap, size=10, legend_label='Barolo')
# Grignolino data
graph4.scatter(source=grignolino, x='Hue', y='Proline', fill_alpha=0.6, fill_color=index_cmap, size=10, legend_label='Grignolino')
# Hide one of the legend categories when clicked on in our legend
graph4.legend.click_policy = "hide"
show(graph4)
Let’s see it work all together!
Now, the user can split out specific categories they’d like to see, and hover over each point to view detailed information about that point (as opposed to guesstimating from the x and y axes).
The possibilities with Bokeh are endless. In fact, we have barely scratched the surface in this tutorial! I invite you to explore the Bokeh documentation to view all the other features you can customize your plots with. I hope you had fun learning Bokeh with me!
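The three near-identical scatter calls can also be written as a loop. The sketch below is one possible variation, not the approach used above: it groups a data frame by the Wine column and uses the "mute" click policy, which fades a category instead of hiding it. The rows are placeholders I made up so the block runs on its own (they are not values from the wine data set), and the color choices are likewise just an example.

```python
import pandas as pd
from bokeh.plotting import figure

# Placeholder rows for illustration only; NOT values from the wine data set.
sample = pd.DataFrame({
    "Wine": ["Barbera", "Barbera", "Barolo", "Barolo", "Grignolino", "Grignolino"],
    "Hue": [1.0, 1.1, 0.9, 0.8, 0.6, 0.7],
    "Proline": [1000, 1100, 500, 450, 600, 650],
})
colors = {"Barbera": "red", "Barolo": "blue", "Grignolino": "green"}

legend_graph = figure(title="Interactive Legend Built in a Loop")

# One scatter glyph per category, so each gets its own clickable legend entry.
for wine_type, group in sample.groupby("Wine"):
    legend_graph.scatter(
        source=group, x="Hue", y="Proline", size=10,
        fill_alpha=0.6, fill_color=colors[wine_type],
        muted_alpha=0.1,  # how faint a muted glyph becomes
        legend_label=wine_type,
    )

# "mute" fades a category on click instead of removing it like "hide" does.
legend_graph.legend.click_policy = "mute"
```

Muting keeps the faded points visible for context, which can be handy when you want to compare one category against the rest.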