Visualizing Gapminder Data Using Bokeh

Creating Interactive Data Visualization in Python

Abraham Setiawan
CodeX
6 min read · Feb 25, 2022


Photo by Hikmet on Unsplash

As someone who works with data, chances are you’re familiar with the late Hans Rosling and his life’s work at Gapminder. His presentations were fascinating, using visualizations such as scatterplots to tell a story.

If you work with data in Python, you might have used matplotlib or seaborn for data visualization. They do the job most of the time, but sometimes you want an interactive plot instead of a static one. This is where Bokeh comes into play.

The Bokeh documentation describes the library as flexible, interactive, shareable, productive, powerful, and open-source. You can build a simple visualization or a more complex one with it, and by taking advantage of Bokeh server, you’ll be able to make an interactive visualization that gets closer to Hans Rosling’s presentations (okay, this article won’t give you the exact same masterpiece, but the concept is there).

Our goal is to make an interactive visualization where you can see how different countries progress over time with a play button and slider, as well as modifiable x and y axes.

End goal (Image by author)

Creating the base plot

Before we start, you can download the dataset here. Also, make sure Bokeh is installed on your computer. If not, you can install it with pip by running: pip install bokeh

Now that we have Bokeh installed, we’re going to start with a basic, static plot. First we import the packages that we need from Bokeh and pandas, then we import the data.
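As a rough sketch of the loading step, the snippet below reads a tiny stand-in table instead of the real download. The column names (Country, Year, fertility, life, population, region), the fictional countries, and the scaling factor behind pop_size are all assumptions made for illustration; the actual dataset may be laid out differently.

```python
import io

import pandas as pd

# Stand-in for the downloaded file (hypothetical columns and values).
# With the real dataset you would pass its path to pd.read_csv instead.
csv_text = """Country,Year,fertility,life,population,region
Freedonia,2000,2.1,75.0,5000000,Region A
Sylvania,2000,4.5,60.0,20000000,Region B
Freedonia,2001,2.0,75.5,5100000,Region A
Sylvania,2001,4.3,61.0,21000000,Region B
"""
data = pd.read_csv(io.StringIO(csv_text))

# Scale population down so it can be used as a marker size later.
# The divisor is a guess; tune it until the dots fit your plot.
data["pop_size"] = data["population"] / 1_000_000

print(data.head())
```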

Afterwards, we define the plot axes and axis labels. I use a dictionary for the axis labels so that the plot shows a clear label whenever we switch axes. We use ColumnDataSource to provide the data for the Bokeh plot and CategoricalColorMapper to color-code the geographical regions using the Spectral6 palette.
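A minimal sketch of this setup could look like the following. The toy DataFrame, its column names, and the label texts are assumptions for illustration, not the real Gapminder file.

```python
import pandas as pd
from bokeh.models import CategoricalColorMapper, ColumnDataSource
from bokeh.palettes import Spectral6

# Toy data; column names and values are illustrative assumptions.
data = pd.DataFrame({
    "Country": ["Freedonia", "Sylvania", "Freedonia", "Sylvania"],
    "Year": [2000, 2000, 2001, 2001],
    "fertility": [2.1, 4.5, 2.0, 4.3],
    "life": [75.0, 60.0, 75.5, 61.0],
    "pop_size": [5.0, 20.0, 5.1, 21.0],
    "region": ["Region A", "Region B", "Region A", "Region B"],
})

# Human-readable axis labels, keyed by column name.
axis_labels = {
    "fertility": "Fertility (children per woman)",
    "life": "Life expectancy (years)",
}

# Start with a single year; the glyphs will read these columns by name.
subset = data[data["Year"] == 2000]
source = ColumnDataSource(data={
    "x": subset["fertility"].tolist(),
    "y": subset["life"].tolist(),
    "country": subset["Country"].tolist(),
    "region": subset["region"].tolist(),
    "pop_size": subset["pop_size"].tolist(),
})

# Map each region name to one color of the palette.
regions = sorted(data["region"].unique())
color_mapper = CategoricalColorMapper(factors=regions, palette=Spectral6)
```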

Then we create the figure and draw the scatter plot with plot.circle(). Fertility goes on the x axis, life expectancy on the y axis, and population sets the dot size (I use pop_size so that the dots fit in the plot). Finally, output_file() saves the plot as an .html file and show(plot) displays it.
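The base plot might be sketched as follows, again with made-up values. Two substitutions are mine rather than the article’s: the sketch calls plot.scatter() instead of plot.circle(), since recent Bokeh releases steer screen-sized markers toward scatter(), and it calls save() instead of show() so that it runs without opening a browser. The output file name is arbitrary.

```python
from bokeh.io import output_file, save
from bokeh.models import CategoricalColorMapper, ColumnDataSource
from bokeh.palettes import Spectral6
from bokeh.plotting import figure

# Toy values for a single year (illustrative assumptions).
source = ColumnDataSource(data={
    "x": [2.1, 4.5],            # fertility
    "y": [75.0, 60.0],          # life expectancy
    "country": ["Freedonia", "Sylvania"],
    "region": ["Region A", "Region B"],
    "pop_size": [5.0, 20.0],    # scaled population, used as marker size
})
color_mapper = CategoricalColorMapper(
    factors=["Region A", "Region B"], palette=Spectral6
)

plot = figure(
    title="Gapminder data for 2000",
    x_axis_label="Fertility (children per woman)",
    y_axis_label="Life expectancy (years)",
)

# One marker per country: position from x/y, size from pop_size,
# color from the region via the color mapper.
plot.scatter(
    x="x", y="y", source=source, size="pop_size", fill_alpha=0.8,
    color={"field": "region", "transform": color_mapper},
    legend_field="region",
)

output_file("gapminder.html")  # arbitrary file name
save(plot)                     # show(plot) would also open it in a browser
```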

Base plot (Image by author)

Adding hover tooltip

Bokeh allows you to add a hover tooltip that shows extra information when you hover over a data point. To add one, import HoverTool and initialize it with the information you want to display. The tooltips parameter accepts a list of tuples, with the label on the left and the data on the right. As with the figure, the values come from the ColumnDataSource: a field written as @name refers to the column called name. Finally, use plot.add_tools() to add the hover tooltip you just made to the figure. Make sure to add this code before output_file() and show(plot).
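A small sketch of the hover setup, assuming a data source with columns named x, y, and country (these names and the values are illustrative):

```python
from bokeh.models import ColumnDataSource, HoverTool
from bokeh.plotting import figure

# Toy values; the column names are assumptions for this sketch.
source = ColumnDataSource(data={
    "x": [2.1, 4.5],
    "y": [75.0, 60.0],
    "country": ["Freedonia", "Sylvania"],
})

# Start without a default hover tool so only ours is present.
plot = figure(title="Gapminder data for 2000", tools="pan,wheel_zoom,reset")
plot.scatter(x="x", y="y", source=source, size=10)

# Each tuple is (label shown in the tooltip, @column to read the value from).
hover = HoverTool(tooltips=[
    ("Country", "@country"),
    ("Fertility", "@x"),
    ("Life expectancy", "@y"),
])
plot.add_tools(hover)
```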

HoverTool in action (Image by author)

Adding a slider and dropdown menus

To make the plot interactive, Bokeh also lets you add a slider and dropdown menus. To do this, we import more classes from Bokeh. To add a slider, we use Slider and specify its parameters. To add a dropdown menu, we use Select and pass a list of the possible options, a default value, and a title. To update the plot when we move the slider or use a dropdown menu, we use the .on_change() method. It registers a callback, here the function update_plot(), that runs whenever the widget’s value changes. More on this later.
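To illustrate how the widgets and .on_change() fit together, here is a sketch with a placeholder update_plot() that only records what changed. The year range, option names, and the events list are assumptions for the demo; a real callback would update the plot instead.

```python
from bokeh.models import Select, Slider

events = []  # demo-only log of (attribute, old value, new value)


def update_plot(attr, old, new):
    """Placeholder callback; Bokeh always passes attr, old, and new."""
    events.append((attr, old, new))


# Slider over the available years (toy range).
slider = Slider(start=2000, end=2001, step=1, value=2000, title="Year")
slider.on_change("value", update_plot)

# Dropdown menus for choosing which column goes on each axis.
options = ["fertility", "life", "population"]
x_select = Select(options=options, value="fertility", title="x-axis data")
y_select = Select(options=options, value="life", title="y-axis data")
x_select.on_change("value", update_plot)
y_select.on_change("value", update_plot)

# Changing a value from Python fires the callback as well.
slider.value = 2001
print(events)
```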

We also need to create a layout for our plot. Using column and row, we design the layout of our plot. Finally, we show the layout (not the plot) using show(layout), which replaces show(plot).
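One possible arrangement, with the widgets stacked in a column to the left of the plot, is sketched below. The widget parameters are placeholders.

```python
from bokeh.layouts import column, row
from bokeh.models import Select, Slider
from bokeh.plotting import figure

plot = figure(title="Gapminder data for 2000")
plot.scatter(x=[2.1, 4.5], y=[75.0, 60.0], size=10)  # toy points

slider = Slider(start=2000, end=2001, step=1, value=2000, title="Year")
x_select = Select(options=["fertility", "life"], value="fertility",
                  title="x-axis data")
y_select = Select(options=["fertility", "life"], value="life",
                  title="y-axis data")

# column() stacks the widgets vertically; row() puts them beside the plot.
controls = column(slider, x_select, y_select)
layout = row(controls, plot)
```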

Before we can proceed further, we have to define the function update_plot() for our callbacks. This function reads the current values of the slider and dropdown menus and uses them to rebuild the data source for the plot. We also update the hover tool by removing the old one and adding a newly created one. The hover tool has to be declared as global inside the function, since the original hover tool was defined outside of it and the function replaces it with a new object. Remember to put this function before the previous code, so that it already exists when it is passed to .on_change().

After implementing this code, we will have this nice layout with the slider and dropdown menus on the left side of the plot.

The plot with slider and dropdown menus (Image by author)

Running on Bokeh server

You might notice that the plot doesn’t update when you use the slider or dropdown menus. This is because show(layout) only produces a standalone HTML page; there is no running Python process behind it to execute the callbacks. We need to run the code on Bokeh server to make it interactive. To do so, we first need to use curdoc instead of show(layout), so that the layout is attached to the document that the server serves.
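The switch to curdoc can be sketched as follows. The layout here is a stripped-down placeholder, and the document title is an optional extra of mine.

```python
from bokeh.io import curdoc
from bokeh.layouts import column, row
from bokeh.models import Slider
from bokeh.plotting import figure

# Placeholder layout; in the full script this is the plot plus its widgets.
plot = figure(title="Gapminder data for 2000")
plot.scatter(x=[2.1, 4.5], y=[75.0, 60.0], size=10)
slider = Slider(start=2000, end=2001, step=1, value=2000, title="Year")
layout = row(column(slider), plot)

# Attach the layout to the current document instead of calling show(layout).
# Under Bokeh server, this document is what gets sent to the browser.
curdoc().add_root(layout)
curdoc().title = "Gapminder"  # optional browser tab title
```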

Then we start the server from the Terminal with a command of the form bokeh serve --show gapminder_bokehserver.ipynb (add ! at the beginning if you’re running it from a Python notebook cell). Make sure you have saved your file, your Terminal is in the correct directory, and your file name is correct. For me, the file name is gapminder_bokehserver.ipynb.

Slider and dropdown menus in action (Image by author)

Adding a play button

While the previous step is already good enough, I decided to go the extra mile and add a play button that will interact with the slider so that the data flows automatically throughout the years. Once played, the play button will turn into a pause button to stop the flow. To do this, we use Button and set the label and width.

As with the slider and dropdown menus, we need to add a callback for the button, this time with button.on_click(animate). This gets a bit tricky, so bear with me.

Right before the animate() function, we define an empty callback_animate variable. We use this variable inside the animate() function by declaring global callback_animate. A conditional statement switches between the play and pause states. In the if branch, we assign callback_animate another callback that triggers animate_update(), which updates the slider, which in turn updates the plot. In the else branch, we remove that callback so that the slider stops updating. The global declaration is needed because animate() assigns to the variable: without it, Python would treat callback_animate as a local name, and the handle stored when play is clicked would be gone by the time pause is clicked and the callback has to be removed.

Then we update the layout to include the newly created button.

Finally, we run this code on Bokeh server again.

Final result (Image by author)

I hope you found this useful. For me, using Bokeh is more challenging than matplotlib or seaborn. But the interactivity that Bokeh offers really makes the plot stand out compared to static plots. You can find the full code on my GitHub repository.

Cheers!


Data Analyst student at Hyper Island with experience in product and innovation. I write about my journey in the data world. Website: abrahamsetiawan.com