
Python and Bokeh: Part II

Gleb Ivashkevich
Yandex school of Data Science
Jul 18, 2019


The beginner’s guide to creating interactive dashboards: Bokeh server and applications.

This is the second part of our tutorial series on the Bokeh visualization library. In this part, we will explore Bokeh applications and how to serve them using the Bokeh server.

Please refer to the first part of this series for the basics of building scatter, line, and bar plots, and to learn how to apply styling and embed plots in web pages.

All the examples in Part I of the series displayed simple embedded visualizations with only the basic interactivity. However, we often need to create full-fledged applications to properly visualize the data, as real-world visualizations usually require not only interactions with a user via buttons, dropdown menus, and other elements, but also dynamic data updates.

Bokeh allows this: plots can be not only embedded, but also combined into large and elaborate web applications with the built-in Bokeh server. The server handles data and plot updates between the Python backend and the client-side Bokeh JavaScript library, which is responsible for the actual drawing in a browser.

Anatomy of a Bokeh application

Bokeh defines a set of abstractions for storing and transporting objects between backend and frontend. The main and most important one is the document. A Bokeh document is a container, which incorporates all the elements, including plots, widgets and interactions.

Each document contains one or more models. In the context of Bokeh, models are plots, axes, tools, or any other visual or non-visual elements that are drawn, or used to draw something else, on the user’s screen. The Bokeh backend serializes documents to JSON, which is sent to the BokehJS client and displayed there.

In turn, a Bokeh application is an entity that creates documents on the backend and populates them with updates, with all the configuration necessary to properly transfer data between the server and the client. Although Bokeh applications are written in Python and handled by the server, it is also possible to add custom JavaScript functionality on the client side, with JavaScript callbacks or otherwise. This is a very powerful tool, but we will use it only occasionally, and mostly for styling. The BokehJS client can also handle user actions.

Without further ado let’s proceed to actual coding. We will start by exploring the main building blocks of Bokeh apps and leverage that knowledge to create a truly interactive dashboard for real-world data visualizations with all the interactivity and data facilities we may need.

Running Bokeh server

Let’s create a simple application, which plots some random data. The most basic way to do so is to use a single Python script, which takes an empty document and fills it with models:

Bokeh exposes the curdoc() function, which provides a handle for the current default document. Although you can create and populate documents manually, with all the flexibility possible, curdoc() is the most straightforward way to get a document and start working with it, without compromising on flexibility too much.

After that, we can use figure() to create a model, which is a Figure instance in this case:

Although we already specified figure size, this model is still empty and is not attached to the document. Let’s plot some random values on a newly created figure and attach it to the document we have:

Each document has one or more root elements. Root elements are the direct children of a document and, as we will see later, they can be directly referenced elsewhere.

As simple as it looks, we have our first Bokeh application. Just a final touch: we will define a page title to be displayed in the browser tab with bokeh_doc.title = "Sample Bokeh App" and we're ready to pass it to Bokeh server:

As you might have already figured out, the bokeh command runs a server and performs other backend-related tasks, like downloading sample data sets. The serve subcommand launches a server for the given script (bokeh serve --show followed by the script’s file name), while the --show option opens the app in a browser window.

The complete code for this application:

Basic Bokeh application
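A minimal sketch of such a script is shown below. It assumes a recent Bokeh release (3.x), where the figure size is passed as width and height (older releases used plot_width and plot_height), and it uses scatter in place of circle, since recent releases deprecate circle for fixed-size markers. The variable name sample_plot and the figure size are illustrative choices.

```python
import numpy as np
from bokeh.io import curdoc
from bokeh.plotting import figure

# Handle to the current default document.
bokeh_doc = curdoc()

# An empty figure model; width/height is the Bokeh 3.x spelling (assumption).
sample_plot = figure(width=800, height=400)

# Plot ten random points; scatter draws circle markers by default.
sample_plot.scatter(np.random.random(size=10), np.random.random(size=10))

# Attach the figure as a root of the document and set the browser tab title.
bokeh_doc.add_root(sample_plot)
bokeh_doc.title = "Sample Bokeh App"
```

Saved to a file, a script like this can be passed to bokeh serve with the --show option.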

Under the hood, Bokeh server performed a lot of tasks: it added a figure model to the document, serialized the model and sent it to the client browser session. All of that happened without our intervention at all.

Widgets and callbacks

Now we have a working app which displays a static scatter plot. Admittedly, we already covered static plots in Part I, so what’s the difference? It’s simple: the Bokeh server provides rich functionality to make plots and other document elements actually dynamic and interactive. This is achieved by a system of callbacks and attribute updates, which are handled by the Bokeh server.

For example, to update a plot with new data, we only need to add that data to the corresponding data source. There is no need to send updates to a client session: Bokeh will perform this for us.

Callbacks and various data source update mechanisms are the main building blocks of Bokeh interactivity, so let’s explore them.

Periodic callbacks

The main and probably the most common type of callback used in dynamic dashboards is the periodic callback. After being registered with a document, it is fired by the Bokeh server at specified time intervals.

Periodic callbacks are most commonly used to fetch new data and update plots. Although it’s better to perform long-running I/O operations outside the main thread, we will not worry about that in this tutorial, as it is a very general Python topic, not specific to Bokeh.

To illustrate how periodic callbacks are used, let’s create a simple application with an empty figure in it:

To add data to the figure, we use a periodic callback, which draws random numbers at each run:

The method bokeh_doc.add_periodic_callback tells the Bokeh server that the add_circles function must be fired every second, that is, once per 1000 milliseconds. Note that each call adds a new renderer to the plot, so after a while we will have a bunch of independent glyphs in our app. While this is fine for the sake of this presentation, more efficient scenarios require the ColumnDataSource functionality (to be introduced later in this tutorial).

As previously, we launch the app:

If everything works as expected, the app will add a circle to the plot every second. The complete code for this app is as follows:

Application with a periodic callback
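A minimal sketch of such an app follows, assuming Bokeh 3.x (width and height for the figure size) and using scatter in place of circle. The function name add_circles comes from the text above; the figure size is an illustrative choice.

```python
import numpy as np
from bokeh.io import curdoc
from bokeh.plotting import figure

bokeh_doc = curdoc()

# Start with an empty figure attached to the document.
sample_plot = figure(width=800, height=400)
bokeh_doc.add_root(sample_plot)


def add_circles():
    # Each call draws one random point and adds a new renderer to the plot.
    sample_plot.scatter(np.random.random(size=1), np.random.random(size=1))


# Ask the server to fire add_circles once per 1000 milliseconds.
periodic_callback = bokeh_doc.add_periodic_callback(add_circles, 1000)
```

The callback only fires on a schedule when the script runs under the Bokeh server; in a plain Python session it is merely registered with the document.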

Widgets and attributes callbacks

Periodic callbacks help to make Bokeh applications dynamic, but we also want to make our plots responsive to user interactions, like clicks and selections. The typical mechanism of interaction is when the user engages with some element or visual component, like a Button, Dropdown menu, Slider, etc., to change some aspect of plotting.

For example, a user may select a category from a dropdown menu to filter the data for display, based on the selected category. Or, we may want to turn on or off periodic callback on a button click (yes, these callbacks may be added or removed dynamically). Both types of interactions basically work the same.
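Switching a periodic callback on and off from a button could look roughly like the sketch below. The names toggle_button, toggle_updates, and fetch_data are hypothetical, and fetch_data is only a placeholder for real periodic work. The sketch relies on add_periodic_callback returning a callback object that can later be passed to remove_periodic_callback.

```python
from bokeh.io import curdoc
from bokeh.models import Button

bokeh_doc = curdoc()
toggle_button = Button(label="Start updates")

# Holds the currently registered periodic callback, if any.
state = {"callback": None}


def fetch_data():
    # Placeholder for the periodic work (hypothetical).
    pass


def toggle_updates():
    if state["callback"] is None:
        # Register the periodic callback and remember the returned handle.
        state["callback"] = bokeh_doc.add_periodic_callback(fetch_data, 1000)
        toggle_button.label = "Stop updates"
    else:
        # Remove the previously registered callback.
        bokeh_doc.remove_periodic_callback(state["callback"])
        state["callback"] = None
        toggle_button.label = "Start updates"


toggle_button.on_click(toggle_updates)
bokeh_doc.add_root(toggle_button)
```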

User interaction typically causes some changes in the attributes of one or more models. For example, a slider’s position is bound to its value attribute, and as a user interacts with the slider, the attribute changes correspondingly.

Bokeh allows us to bind a callback to these attribute changes. The way to register such a callback is through the on_change method, which is exposed by various models. The callback function must have a specific signature, as you will see in a moment. This mechanism is extremely powerful and enables custom handling of virtually any change in a Bokeh application.

We will extend our app from the previous section:

Bokeh provides predefined widgets, like the Button, which can be used immediately in an application:

We use a predefined style called success for a green color button (very similar to the Bootstrap CSS framework) with Generate label on it.

So far the app does nothing: no data is plotted to the sample_plot, the button is not responsive to a click, and it is not even added to the document.

Buttons in Bokeh expose a simpler callback mechanism to handle clicks: callback function should have no arguments and is registered with on_click method. You certainly can use any Python functional tool to create such a callback function from a generic function with an arbitrary signature.

Now we need to add a plot and a button to the document. We will use a basic column layout for this, with only a minor change: we will wrap the button in an element called widgetbox, which is responsible for the proper placement and padding of the widgets (check how it will look without it):
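A minimal sketch of the button, its click callback, and the layout follows. It assumes Bokeh 3.x and places the button directly in the column, since widgetbox appears to have been removed in recent Bokeh releases. The n_points parameter is a hypothetical addition that shows how functools.partial turns a generic function into the zero-argument callback that on_click expects.

```python
from functools import partial

import numpy as np
from bokeh.io import curdoc
from bokeh.layouts import column
from bokeh.models import Button
from bokeh.plotting import figure

bokeh_doc = curdoc()
sample_plot = figure(width=800, height=400)

# A green button, styled with the predefined "success" type.
generate_button = Button(label="Generate", button_type="success")


def add_circles(n_points):
    # Draw n_points random points as a new renderer.
    sample_plot.scatter(np.random.random(size=n_points),
                        np.random.random(size=n_points))


# on_click takes a function with no arguments, so n_points is bound up front.
generate_button.on_click(partial(add_circles, 10))

# Stack the plot and the button vertically and attach them to the document.
bokeh_doc.add_root(column(sample_plot, generate_button))
```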

So far we have not created any attribute callbacks, but we will shortly.

When any of the high-level plotting methods, like circle or vbar, is called on a Figure instance, it adds a new renderer to that figure (look into the renderers attribute of a Figure instance). As you have probably figured out already, we can attach a callback to any such change. To illustrate this, let's create a very basic callback with the correct signature:

Note the signature: it’s generic, so that the same callback function may be bound to different attributes: the attribute name itself is the input parameter to the function.

Now, on each button click, we will see "attribute 'renderers' changed" in the terminal. As simple as that. Each time we click the button, we add a new renderer to the plot and Bokeh fires the renderer_added callback. The power of this mechanism is that we do not connect the renderer_added callback to the button at all. We just properly handle the sequence of events launched by the button click. This allows us to wire callbacks in a Bokeh application in a complex and flexible way.

The complete code for this app is as follows:
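A minimal sketch of the whole app, assuming Bokeh 3.x, scatter in place of circle, and a plain column layout without widgetbox. The changed_attributes list is a hypothetical addition that records what the callback has seen, next to the printed message.

```python
import numpy as np
from bokeh.io import curdoc
from bokeh.layouts import column
from bokeh.models import Button
from bokeh.plotting import figure

bokeh_doc = curdoc()
sample_plot = figure(width=800, height=400)
generate_button = Button(label="Generate", button_type="success")

# Hypothetical log of attribute names seen by the callback.
changed_attributes = []


def add_circles():
    # Adds a new renderer to sample_plot on each call.
    sample_plot.scatter(np.random.random(size=10), np.random.random(size=10))


def renderer_added(attr, old, new):
    # Generic attribute callback signature: attribute name, old and new value.
    changed_attributes.append(attr)
    print(f"attribute '{attr}' changed")


# The button only knows about add_circles; renderer_added is bound to the
# renderers attribute of the plot, not to the button.
generate_button.on_click(add_circles)
sample_plot.on_change("renderers", renderer_added)

bokeh_doc.add_root(column(sample_plot, generate_button))
```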

Providing data

In all the examples above, we plotted data in a straightforward way: by providing numpy arrays to a corresponding glyph method.

In larger applications, this simple implementation has a couple of important drawbacks. First, each call to a circle or any other glyph function adds a new renderer to the plot which makes them hard to track. Second, this approach couples the data layer to the view layer, which makes the code entangled and harder to maintain as the app grows larger.

The right way to handle data in Bokeh applications is via ColumnDataSource and CDSView. ColumnDataSource is a data container, which was introduced in Part I of this tutorial. CDSView is a filtering mechanism for column data sources, which allows us to visualize only certain data elements with a single data source under the hood. We will use CDSView later in our examples and in our dashboard application.

Streaming

Let’s rewrite our button application in a more efficient way. First, we create a data source and use it for plotting:

For now, sample_plot depends on data_scr as the source for its data to be presented as circles. But this data source is still empty. How should we populate it with data?

Instead of plotting data directly on a button click, we will now stream data into the data source:

Let’s break down this code: the data_scr.stream method of ColumnDataSource appends new data to the data source. Remember, Bokeh keeps track of two versions of our data: one on the server side and the other on the client side.

Using data_scr.stream ensures that only the changes are sent to the client, while the data source itself is not recreated or resent from scratch. This is important, as sometimes you may need to stream large amounts of data, and creating new data sources on each update is costly both in terms of resources and performance.

Note also the rollover argument: it caps the maximum number of most recent data points that the client will keep. Again, this is more suitable for applications, which update frequently and with large amounts of data.

The complete code for this app is as follows:
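A minimal sketch of the streaming version, assuming Bokeh 3.x and scatter in place of circle. The ROLLOVER value, the batch size of ten points, and the name stream_circles are illustrative choices. The new values are converted to plain lists so that they match the list-typed columns the source starts with.

```python
import numpy as np
from bokeh.io import curdoc
from bokeh.layouts import column
from bokeh.models import Button, ColumnDataSource
from bokeh.plotting import figure

bokeh_doc = curdoc()

# An empty data source with two columns.
data_scr = ColumnDataSource(data=dict(x=[], y=[]))

# The glyph refers to columns of the data source by name.
sample_plot = figure(width=800, height=400)
sample_plot.scatter(x="x", y="y", source=data_scr)

ROLLOVER = 50  # illustrative cap on the number of points kept


def stream_circles():
    # Append ten new random points; only this increment is sent to the client.
    new_data = dict(
        x=np.random.random(size=10).tolist(),
        y=np.random.random(size=10).tolist(),
    )
    data_scr.stream(new_data, rollover=ROLLOVER)


generate_button = Button(label="Generate", button_type="success")
generate_button.on_click(stream_circles)

bokeh_doc.add_root(column(sample_plot, generate_button))
```

With these values, the source grows by ten points per click until it reaches the rollover limit, after which the oldest points are dropped.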

Patching

While streaming is used to provide new data, sometimes you need to change data which is already in the data source. For this use case, ColumnDataSource exposes a patch method. Again, only the changes are sent to the client, with minimal network overhead.

Let’s rewrite our callback function so that it randomly selects 3 circles and changes their x field:

The patch method requires a dictionary whose keys are the fields of the data source to be changed. You do not need to change all the fields at once. Values are sequences of pairs, where the first element is the index to patch at and the second element is the new value to patch with.

Patching in Bokeh has two drawbacks to be mentioned, though. One is the very strict type checking in data_scr.patch: int will pass, while np.int64 or np.int32 won't. That's why we need to do patch_idx = [int(ix) for ix in patch_idx].

Also, patching won’t handle generators, so the result of zip must first be converted to an actual sequence, for example with list().

The complete code for this app is as follows:
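A minimal sketch of the patching version, assuming Bokeh 3.x and scatter in place of circle. N_POINTS and the name patch_circles are illustrative, and returning the applied patches from the callback is a hypothetical convenience for inspection.

```python
import numpy as np
from bokeh.io import curdoc
from bokeh.layouts import column
from bokeh.models import Button, ColumnDataSource
from bokeh.plotting import figure

bokeh_doc = curdoc()

N_POINTS = 20  # illustrative number of circles

data_scr = ColumnDataSource(data=dict(
    x=np.random.random(size=N_POINTS).tolist(),
    y=np.random.random(size=N_POINTS).tolist(),
))

sample_plot = figure(width=800, height=400)
sample_plot.scatter(x="x", y="y", source=data_scr)


def patch_circles():
    # Pick 3 distinct row indices and cast them to plain int for patch().
    patch_idx = np.random.choice(N_POINTS, size=3, replace=False)
    patch_idx = [int(ix) for ix in patch_idx]
    new_x = np.random.random(size=3).tolist()
    # zip is lazy, so materialize the (index, value) pairs as a list.
    patches = list(zip(patch_idx, new_x))
    data_scr.patch({"x": patches})
    return patches


patch_button = Button(label="Generate", button_type="success")
patch_button.on_click(patch_circles)

bokeh_doc.add_root(column(sample_plot, patch_button))
```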

Streaming and patching methods enable us to decouple the data layer from the view layer. Now our data updates are efficiently handled by Bokeh, so we do not have to bother with how data changes will find their way to the client. We can now design Bokeh applications around data, and instruct plots to use specific data sources for drawing.

Using tables

One important widget to explore before creating a real dashboard is a table. A common use case is to display actual data values for inspection.

Let’s create a simple application with tables. Along the way, we will also explore filtering with CDSView.

First, we need to add some imports:

In Bokeh, a table is a collection of columns, linked to some data source (what a surprise!). To create a table, we need to construct table columns first:

Note that we provide not only field names, which correspond to columns in a data source, but also titles, which will be displayed in the header row. We also need the data source, and then we’re ready to create the table itself:

Now, to illustrate how Bokeh handles filtering and selections, we will add the second table, with the same columns, but with CDSView:

This code may look unfamiliar, so let’s break it down a bit. In this table, we want to display only filtered rows, and we start by creating a mask. This mask filters nothing for now, as we only need a placeholder to create a CDSView. We will update it later.

To create a boolean mask in CDSView, we use a BooleanFilter. Another option may be an IndexFilter, which allows selecting which indices should be displayed in a view. Finally, we create the table itself and instruct it to show data from the data_scr according to columns and subject to any filtering, as defined in the data_view.

Now, how do we change which rows are displayed in our filtered table? Remember, Bokeh tracks all the changes in the document and its children. All we need to do is change the view, and Bokeh will transfer the changes to the client.

Let’s first arrange our application together:

If you launch it now, nothing interesting will happen: the tables will be absolutely identical. To make them look different, let’s add a periodic callback:

In this callback, we randomly select a subset of rows to recreate the view. No additional actions are needed: Bokeh will notify every model which needs to re-render on this change (in this case, only filtered_table).

Note several useful things about tables and data in Bokeh in general:

  • you can sort table rows by a column’s values by clicking on the column header,
  • to get back to the original order, you need to Ctrl+click on the column header.

Moreover, if you select one or more rows in one table, you will see that the same rows are selected in the other table (except those which are filtered out at the moment). This is an important observation: in reality, you selected not rows in the table, but rows in the underlying ColumnDataSource.

Bokeh notices the client change and notifies every model (tables, plots, and others), which uses the same data source, causing all of them to respond to the change. Actually, you can even attach a callback to the selection and handle even more elaborate behavior.
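A selection callback can be sketched as below, assuming that the data source exposes its selection through the selected.indices attribute, as recent Bokeh releases do. The names selection_changed and selection_log are hypothetical, and the last line sets the selection from Python only to stand in for a user clicking rows in a table.

```python
from bokeh.models import ColumnDataSource

data_scr = ColumnDataSource(data=dict(x=[1, 2, 3, 4], y=[4, 3, 2, 1]))

# Hypothetical log of the selections seen so far.
selection_log = []


def selection_changed(attr, old, new):
    # new holds the indices of the currently selected rows.
    selection_log.append(list(new))
    print(f"selected rows: {list(new)}")


# Bind the callback to the selection of the data source, not to any table.
data_scr.selected.on_change("indices", selection_changed)

# Stand-in for a user selecting the second and fourth rows.
data_scr.selected.indices = [1, 3]
```

Because the callback is bound to the data source, it reacts in the same way no matter which table or plot the selection was made in.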

The complete code for this app is as follows:

Next steps

Now that we know how to create glyphs, provide them with data (filtered or not), and create dynamic and interactive Bokeh applications, we can proceed to the final part of the series: dynamic dashboards. In the next part, we will create a real interactive dashboard, using real data coming from an external data source. We will handle all the aspects of a dynamic dashboard: data management, plotting, interactivity, and styling. Stay tuned!
