Introducing Dash HoloViews

Jon Mease
Plotly
Dec 3, 2020

We’re happy to announce the release of Dash HoloViews. This collaboration between the HoloViews and Dash projects makes it possible to build certain classes of interactive Dash applications without the need to manually define any callbacks.

Two particularly powerful use cases are automatically linking selections across multiple plots (also known as crossfiltering) and displaying large datasets using Datashader. Both can be implemented on top of pandas, Dask, or GPU-accelerated cuDF DataFrames.

For a demo and Q&A session for Dash HoloViews with Plotly’s Chief Scientist, watch our recorded webinar!

Datashader and Linked Selections with Dash HoloViews

Dash HoloViews was just released as part of HoloViews version 1.14. You can install it today with pip:

$ pip install holoviews==1.14

HoloViews Overview

HoloViews is an ambitious project that aims to provide a flexible grammar of visualization types and plot interactions. HoloViews specifications can be displayed using a variety of technologies, including Plotly’s Graphing Library and Dash.

While HoloViews can be used to create a large variety of visualizations, for Dash users it is particularly helpful for two use cases: automatically linking selections across multiple plots and displaying large datasets using Datashader.

HoloViews also provides a uniform interface to a variety of data structures, making it easy to start out by visualizing small pandas DataFrames and then scale up to GPU-accelerated RAPIDS cuDF DataFrames or larger-than-memory Dask DataFrames.

For more background information, see the main HoloViews documentation at https://holoviews.org/

Dash Overview

Dash is the easiest way to build highly-scalable analytic web applications using pure Python, R, or Julia. It is downloaded over 350,000 times per month and is relied upon by hobbyists, researchers, and businesses to share data analysis and visualization results, and to operationalize machine learning and data science models.

For HoloViews users, Dash provides a new deployment technology with excellent scalability characteristics. In contrast to other technologies that can be used to deploy HoloViews dashboards (Bokeh, Panel, and Voila), Dash is unique in that it stores all user-level session data exclusively in the user’s web browser. This means that server memory requirements depend only on the app itself, and they do not increase linearly with the number of simultaneous users.

Not only does Dash’s architecture make it possible for a single server to support many simultaneous users, it also makes it easy to horizontally scale an application across multiple servers using a load balancer like NGINX or Dash Enterprise Kubernetes.

Linking Selections

HoloViews provides a powerful link_selections transformation that automates the process of setting up linked selections across plots. The function understands how each plot type was generated from a source dataset, and how a selection made in a plot's local coordinates corresponds to a symbolic expression that can be used to filter that original dataset. For more information, see the Linked Brushing section in the HoloViews documentation.
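To make the idea concrete, here is a minimal pandas sketch of what such a selection expression amounts to. It is not how link_selections is implemented; the helper name box_to_mask and the six sample rows are made up for illustration.

```python
import numpy as np
import pandas as pd

# Small stand-in dataset (hypothetical values, not the real iris data).
df = pd.DataFrame({
    "sepal_length": [5.1, 4.9, 6.3, 5.8, 7.1, 6.5],
    "sepal_width":  [3.5, 3.0, 3.3, 2.7, 3.0, 3.0],
    "petal_width":  [0.2, 0.2, 2.5, 1.9, 2.1, 2.2],
})


def box_to_mask(data, x, y, x_range, y_range):
    """Turn a box drawn in a scatter plot's coordinates into a row filter."""
    return data[x].between(*x_range) & data[y].between(*y_range)


# A box selection on the sepal_length / sepal_width scatter plot.
mask = box_to_mask(df, "sepal_length", "sepal_width", (6.0, 7.5), (2.9, 3.4))

# The same filter drives a different view: a histogram of petal_width.
bins = np.linspace(0.0, 3.0, 4)  # bin edges 0, 1, 2, 3
all_counts, _ = np.histogram(df["petal_width"], bins=bins)
selected_counts, _ = np.histogram(df.loc[mask, "petal_width"], bins=bins)
```

Because the filter is expressed over the source columns rather than over pixels, it can be applied to any plot derived from the same data, which is what makes the linking automatic.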

Thanks to Dash HoloViews, the full power of the link_selections transformation can be used in Dash. Here is an example of linking a scatter plot to a histogram to visualize the classic iris data set.

Automatic Linked Selections

See the Dash HoloViews documentation for the full source code of this example.

Visualizing Large Datasets with Datashader

Another HoloViews feature that is particularly convenient for Dash users is the integration with Datashader.

Datashader is a Python library for quickly creating a variety of principled visualizations of large datasets.

While the Plotly.js WebGL-accelerated scatter trace can handle hundreds of thousands of points, Datashader can handle tens to hundreds of millions. The difference is that rather than passing the entire dataset from the Python server to the browser for rendering, Datashader rasterizes the dataset to a heatmap or image and transfers only that heatmap or image to the browser.
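The core idea can be sketched with NumPy alone. This is not Datashader's API, just a stand-in aggregation built on `np.histogram2d`; the `rasterize` helper, the grid size, and the random data are illustrative assumptions.

```python
import numpy as np


def rasterize(x, y, x_range, y_range, width=400, height=300):
    """Count points per pixel; the result has shape (height, width)."""
    counts, _, _ = np.histogram2d(
        y, x, bins=(height, width), range=(y_range, x_range)
    )
    return counts


rng = np.random.default_rng(0)
n = 1_000_000
x = rng.normal(size=n)
y = rng.normal(size=n)

# Only this fixed-size grid would need to travel to the browser.
grid = rasterize(x, y, x_range=(-4, 4), y_range=(-4, 4))
```

The grid holds 300 x 400 counts however large `n` becomes, so the payload sent to the browser depends on the image resolution rather than on the number of rows.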

To effectively use Datashader in an interactive context, it’s necessary to re-render the data set each time the figure viewport changes. This can already be accomplished in Dash by installing a callback function that listens for changes to the relayoutData prop, but coordinating the full update process is not straightforward.
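As a rough sketch of the bookkeeping such a callback has to do, the snippet below parses a payload shaped like relayoutData and re-aggregates the points for the new viewport. It uses plain NumPy instead of Datashader and omits the Dash wiring; the helper names `update_viewport` and `rasterize` are hypothetical.

```python
import numpy as np


def update_viewport(viewport, relayout_data):
    """Return a new viewport after applying a relayoutData-style payload.

    Only explicit range updates are handled; autorange resets are ignored
    in this sketch.
    """
    new = dict(viewport)
    for axis in ("xaxis", "yaxis"):
        lo = relayout_data.get(f"{axis}.range[0]")
        hi = relayout_data.get(f"{axis}.range[1]")
        if lo is not None and hi is not None:
            new[axis] = (lo, hi)
    return new


def rasterize(x, y, viewport, width=200, height=200):
    """Count the points that fall in each pixel of the current viewport."""
    counts, _, _ = np.histogram2d(
        y, x, bins=(height, width),
        range=(viewport["yaxis"], viewport["xaxis"]),
    )
    return counts


rng = np.random.default_rng(1)
x = rng.uniform(0, 10, 100_000)
y = rng.uniform(0, 10, 100_000)

viewport = {"xaxis": (0.0, 10.0), "yaxis": (0.0, 10.0)}
full = rasterize(x, y, viewport)

# Payload shaped like the one a zoom on a graph reports.
zoom = {"xaxis.range[0]": 2.0, "xaxis.range[1]": 4.0,
        "yaxis.range[0]": 5.0, "yaxis.range[1]": 6.0}
viewport = update_viewport(viewport, zoom)
zoomed = rasterize(x, y, viewport)
```

A real app would also have to handle reset events, keep the viewport in sync across linked figures, and push the new image back into the figure, which is the coordination work referred to above.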

With HoloViews, the datashade transformation can simply be applied to a scatter element without the need to write any callbacks defining how to re-evaluate the datashading operation in response to viewport changes.

This example loads the same iris data set, but then duplicates it many times with added noise to generate a DataFrame with 1.5 million rows.

See the Dash HoloViews documentation for the full source code of this example.

Combining Linked Selections and Datashader

It’s even possible to combine the link_selections and datashade transformations to create linked visualizations across large datasets. This example shows how the two previous examples can be combined to support linking selections across a histogram and a datashaded scatter plot of 1.5 million points.

Notice how the box selection tool can be used to select regions of space in each figure and how the selection of the corresponding points is displayed in both figures. Also, notice how the datashaded scatter plot dynamically resamples after zoom and pan operations.

See the Dash HoloViews documentation for the full source code of this example.

Mapping Support

HoloViews allows most 2-dimensional visualization types to be overlaid on top of a map. Here is a more refined example of displaying the 10 million row NYC Taxi data set that is provided by the PyViz project (https://examples.pyviz.org/nyc_taxi/nyc_taxi.html).

The drop-off locations are rendered using Datashader on top of a Mapbox map and are accompanied by a histogram of fare amounts. The datashaded scatter plot automatically resamples in response to zoom and pan events, and selections on both plots are linked.

This full app is under 100 lines of code with zero callbacks. For the full source code, see the GitHub repository at https://github.com/plotly/dash-holoviews-taxi.

GPU Acceleration with RAPIDS

Many HoloViews operations, including datashade and link_selections, can be accelerated on modern NVIDIA GPUs using technologies from the RAPIDS ecosystem. All of the previous examples can be GPU-accelerated simply by replacing the pandas DataFrame with a cuDF DataFrame.
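A hedged sketch of that swap is below. The helper `to_gpu_if_available` is hypothetical and exists only so the snippet also runs on machines without a GPU; `cudf.from_pandas` is the cuDF call that performs the conversion.

```python
import pandas as pd


def to_gpu_if_available(df):
    """Return a cuDF copy of df when cudf can be imported, else df unchanged."""
    try:
        # Needs an NVIDIA GPU and a RAPIDS install; the import can fail
        # for several reasons, so any failure falls back to pandas.
        import cudf
    except Exception:
        return df
    return cudf.from_pandas(df)


df = pd.DataFrame({
    "sepal_length": [5.1, 6.3, 7.1],
    "sepal_width": [3.5, 3.3, 3.0],
})

# The rest of the app is meant to stay the same whichever frame comes back,
# for example by wrapping `data` in hv.Dataset exactly as before.
data = to_gpu_if_available(df)
```

Keeping the conversion in one place means the same script can be developed on a laptop with pandas and deployed on a GPU machine without further edits.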

The ability to develop exploratory data analysis workflows using pandas and then GPU-accelerate them with almost no code changes is one of the most exciting recent developments in the PyData ecosystem.

Acknowledgments

The Dash HoloViews integration was funded by the NVIDIA RAPIDS project as part of the ongoing partnership between Plotly and NVIDIA.
