TDS Archive

An archive of data science, data analytics, data engineering, machine learning, and artificial intelligence writing from the former Towards Data Science Medium publication.

Multivolume rendering in Jupyter with ipyvolume: cross-language 3d visualization


The Jupyter notebook is becoming the standard environment for data science in many fields, and with it comes the need for visualization. Two-dimensional visualization is arguably the most important, and there is a rich set of libraries to choose from. Many of these build on top of the workhorse of visualization in the Python world: Matplotlib. Others, such as plotly, bokeh and bqplot, take better advantage of the browser by providing fully interactive plots that let you zoom, pan and make selections with high framerates and smooth transitions.

Although 3D visualization is used less often, it is sometimes essential for understanding intrinsically 3D datasets (e.g., a brain scan) or complex structures that are difficult to comprehend with 2D projections.

Transparency in 2D is not a problem. Here showing a million points in bqplot (an open PR).

However, 3D visualization is hard. While transparency in 2D is trivial, since it requires only blending operations, it is nearly impossible to do correctly in 3D at acceptable framerates. In fact, it is an area of active research.

For large datasets in 2D, it does not make sense to plot each individual point. Instead, it makes more sense to work with statistics of the data. An approach that datashader and vaex take is to reduce the data to a 2d histogram (or any other statistic) and visualize this as a heat map.

Showing 150 million taxi pickup locations in NYC by showing a heat map instead of individual points, using vaex with bqplot
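The reduction from individual points to a heat map can be sketched with plain NumPy. This is only an illustration of the idea, not how datashader or vaex implement it; the synthetic Gaussian data, the grid size of 128 and the log scaling are all assumptions made for the example.

```python
import numpy as np

# Synthetic stand-in for a large dataset (assumption: two Gaussian columns).
rng = np.random.default_rng(42)
n_points = 1_000_000
x = rng.normal(0.0, 1.0, n_points)
y = rng.normal(0.0, 1.0, n_points)

# Reduce one million points to a 128x128 grid of counts.
# Points outside the given range are simply not counted.
counts, x_edges, y_edges = np.histogram2d(
    x, y, bins=128, range=[[-4, 4], [-4, 4]]
)

# Log scaling keeps sparse regions visible next to dense ones.
heat_map = np.log1p(counts)
```

The resulting `heat_map` array has a fixed size regardless of the number of input points, so any image-based plotting routine can display it as a heat map.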

Volume rendering

The same technique can be used in 3D, except that rendering a 3D cube is more difficult: it requires volume rendering, more specifically volume ray casting. With this technique, a ray is cast through the scene for each pixel on the screen; it accumulates the RGB and alpha values of each voxel it passes through in the 3D volumetric cube and blends them together.

(left) Big data: exploring 100+ million rows by volume rendering a 3d histogram. (middle) Astronomical data cube: Radio observations. (right): Medical data cube: Scan of a male head.
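The accumulation along a ray can be sketched in NumPy as front-to-back compositing. This is a simplified illustration under stated assumptions: the rays are axis-aligned (one ray per pixel, travelling along the first axis), there is no interpolation or lighting, and the function name `composite_front_to_back` is made up for this example. A real ray caster runs on the GPU and handles arbitrary view directions.

```python
import numpy as np

def composite_front_to_back(rgb, alpha):
    """Blend samples along axis 0, the assumed ray direction.

    rgb has shape (depth, height, width, 3), alpha has shape
    (depth, height, width); both hold values between 0 and 1.
    """
    height, width = alpha.shape[1:]
    color = np.zeros((height, width, 3))
    # Fraction of light that still passes through what the ray has crossed.
    transmittance = np.ones((height, width))
    for rgb_slice, alpha_slice in zip(rgb, alpha):
        weight = transmittance * alpha_slice
        color += weight[..., None] * rgb_slice
        transmittance *= 1.0 - alpha_slice
    return color, 1.0 - transmittance

# Toy cube: two occupied voxels behind each other along one ray.
cube = np.zeros((4, 2, 2))
cube[1, 0, 0] = 1.0
cube[2, 0, 0] = 1.0

# Toy transfer function: red everywhere, opacity proportional to intensity.
alpha = 0.5 * cube
rgb = np.zeros(cube.shape + (3,))
rgb[..., 0] = 1.0

image, opacity = composite_front_to_back(rgb, alpha)
```

Each half-opaque voxel hides half of what lies behind it, so the second voxel contributes less to the pixel than the first.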

Some datasets are intrinsically volumetric cubes. In radio astronomy, two of the axes are sky coordinates and the third is the frequency axis, forming a 3D intensity cube of (usually) emission. In medical imaging, the three axes are usually just the spatial dimensions.

Multivolume rendering

Sometimes, multiple large datasets or multiple data cubes need to be rendered in the same scene. They could overlap fully, partially, or not at all. To make things even more complex, a different volume rendering technique called maximum intensity projection works in a different way, which makes it difficult to combine the two techniques in the same scene. In fact, I would argue it makes multivolume rendering at least as difficult as transparency techniques in 3D.
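One way to see why the two techniques behave differently is to compare them on a single ray. The sketch below is an illustration only, assuming axis-aligned rays and a scalar cube with values between 0 and 1; the function names and the fixed `opacity` are made up for the example. Maximum intensity projection keeps only the brightest sample and ignores the order of the samples, while alpha blending depends on which sample comes first.

```python
import numpy as np

def max_intensity_projection(cube, axis=0):
    """Keep only the brightest sample along each ray."""
    return cube.max(axis=axis)

def alpha_composite(cube, axis=0, opacity=0.5):
    """Front-to-back blending of a scalar cube with values in [0, 1]."""
    samples = np.moveaxis(cube, axis, 0)
    result = np.zeros(samples.shape[1:])
    transmittance = np.ones(samples.shape[1:])
    for sample in samples:
        alpha = opacity * sample
        result += transmittance * alpha * sample
        transmittance *= 1.0 - alpha
    return result

# A single ray with three samples, viewed from the front and from the back.
ray = np.array([0.2, 1.0, 0.4]).reshape(3, 1, 1)

mip_forward = max_intensity_projection(ray)
mip_backward = max_intensity_projection(ray[::-1])

blend_forward = alpha_composite(ray)
blend_backward = alpha_composite(ray[::-1])
```

Here `mip_forward` and `mip_backward` are identical, while `blend_forward` and `blend_backward` differ. A renderer that mixes both kinds of volume in one scene has to reconcile an order-independent maximum with an order-dependent blend.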

Glue-Jupyter uses ipyvolume for 3D rendering and also provides a richer UI and higher level data bookkeeping.

Glue to Glue-Jupyter

Glue is a desktop application for multi-dimensional linked-data exploration and uses multivolume rendering in its 3D visualizations, for instance, to visualize 3D selections (as demonstrated in the above screencast). Glue-jupyter (a project I am working on) aims to bring glue to the Jupyter notebook. It, therefore, requires a solution for multivolume rendering in the Jupyter notebook. Glue-jupyter is still in early development, but already provides an easy modern way to explore (multiple) datasets interactively, while at the same time giving programmatic access from the notebook.

Ipyvolume

Ipyvolume is the best 3D visualization package for the Jupyter notebook (disclaimer: I am the main author ;). With ipyvolume, you can do scatter plots, quiver plots, and multivolume rendering in one scene with just a few lines of code. It does not get any easier!

Simple example showing volshow, scatter and quiver.
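A scene like this one could be put together roughly as follows. This is a sketch rather than the code behind the figure: the synthetic data, the helper names `make_scene_data` and `show_scene`, and the marker sizes are assumptions, and the `extent` argument of `volshow` assumes ipyvolume 0.5 or later. The ipyvolume import sits inside `show_scene` because the widget only renders in a notebook.

```python
import numpy as np

def make_scene_data(n_grid=32, n_points=200, seed=0):
    """Build a toy data cube plus positions and vectors (all synthetic)."""
    rng = np.random.default_rng(seed)
    # A Gaussian blob as the data cube.
    axis = np.linspace(-1.0, 1.0, n_grid)
    z, y, x = np.meshgrid(axis, axis, axis, indexing="ij")
    cube = np.exp(-(x**2 + y**2 + z**2) / 0.2).astype(np.float32)
    # Random positions in the unit cube, with vectors pointing away from its centre.
    positions = rng.uniform(0.0, 1.0, size=(3, n_points))
    vectors = positions - 0.5
    return cube, positions, vectors

def show_scene(cube, positions, vectors):
    """Render volume, scatter and quiver in one figure (notebook only)."""
    import ipyvolume as ipv  # lazy import: requires ipyvolume and a notebook

    x, y, z = positions
    u, v, w = vectors
    ipv.figure()
    # Place the cube in the unit box so it shares coordinates with the points.
    ipv.volshow(cube, extent=[[0, 1], [0, 1], [0, 1]])
    ipv.scatter(x[::2], y[::2], z[::2], marker="sphere", size=2)
    ipv.quiver(x[1::2], y[1::2], z[1::2], u[1::2], v[1::2], w[1::2], size=4)
    ipv.show()

cube, positions, vectors = make_scene_data()
```

In a notebook cell, calling `show_scene(cube, positions, vectors)` should display the three layers in a single interactive widget.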

Moreover, ipyvolume is built on top of ipywidgets (Jupyter widgets for the frontend part), giving you all the features that it provides out of the box. Examples render live in the documentation, plots are easy to embed in a static HTML file, and you can easily link properties together on the frontend (browser) or kernel side (Python process) and listen to any change of a property (a byproduct of the bidirectional communication). Also, ipyvolume reuses pythreejs, exposing a large part of the threejs API for free! And using ipywebrtc, we can record movies, stream them to a remote computer or take snapshots.

Multivolume rendering: male head scan and dark matter particle simulation in one scene.

As of version 0.5, ipyvolume supports multivolume rendering, allowing you to visualize two datasets on top of each other. In the example shown here, we show a scan of a human head and at the same time a 3D histogram of the particles in a dark matter simulation. Any number of data cubes can be put in the same scene, a big feat for visualization in the Jupyter ecosystem! As far as I know, it is the first package to do multivolume rendering for Jupyter.

What about large data cubes?

In case you are using a remote Jupyter notebook, possibly using JupyterHub, you will have the best of both worlds: your code is close to the data and the visualization is local on your laptop/desktop. However, that means that the data needs to be transferred to the client/browser, which for large data cubes can be a considerable amount (yeah, anything to the power 3 is usually large). To solve this, ipyvolume sends a low-resolution data cube to the browser by default. If you zoom in, a change in the coordinates of the bounding box is detected (yay ipywidgets!), and a zoomed-in, higher-resolution cutout is sent to the browser.

After zooming in, a new high-resolution cutout will be sent to the browser.
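The idea of sending an overview first and a sharper cutout after zooming can be sketched with NumPy. This is not ipyvolume's actual implementation: the function names, the block averaging, the fractional bounds and the `max_size` limit are all assumptions chosen to illustrate the principle that the transferred cube stays small whatever the zoom level.

```python
import numpy as np

def downsample(cube, factor):
    """Block-average a cube whose sides are divisible by factor."""
    nz, ny, nx = cube.shape
    blocks = cube.reshape(
        nz // factor, factor, ny // factor, factor, nx // factor, factor
    )
    return blocks.mean(axis=(1, 3, 5))

def view_for_bounds(cube, lower, upper, max_size=32):
    """Return the data to send for a visible box given as fractions (0..1)."""
    shape = np.array(cube.shape)
    start = np.floor(np.array(lower) * shape).astype(int)
    stop = np.ceil(np.array(upper) * shape).astype(int)
    cutout = cube[start[0]:stop[0], start[1]:stop[1], start[2]:stop[2]]
    # Pick the smallest factor that keeps every side within max_size.
    factor = max(1, int(np.ceil(max(cutout.shape) / max_size)))
    # Trim so that each side is a multiple of the factor.
    trimmed = tuple(n - n % factor for n in cutout.shape)
    cutout = cutout[:trimmed[0], :trimmed[1], :trimmed[2]]
    return downsample(cutout, factor)

full = np.random.default_rng(1).random((128, 128, 128), dtype=np.float32)

# Whole cube visible: a coarse overview is enough.
overview = view_for_bounds(full, (0, 0, 0), (1, 1, 1))

# Zoomed in on a sub-box: same transfer size, but full resolution.
zoomed = view_for_bounds(full, (0.25, 0.25, 0.25), (0.5, 0.5, 0.5))
```

Both `overview` and `zoomed` have the same shape, so the amount of data sent to the browser stays constant while the level of detail follows the zoom.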

Cross-Language

The joy does not stop here: since most of the code of ipyvolume is frontend code (JavaScript) that runs in the browser, there is not much preventing us from using it from another language. The BeakerX team at Two Sigma has already shown it can be used from all the JVM languages (Java, Clojure, Groovy, Scala, …).

Now together with the QuantStack team, we are building xvolume, a C++ binding to the frontend code of ipyvolume (yeah, I should rename that to jupyter-volume right?).

This means that with a single code base (plus some per-language glue) we can have a serious 3D visualization package (ipyvolume) for many languages in the Jupyter notebook.

TL;DR version / Conclusion

As of ipyvolume 0.5, we have a serious 3D visualization package for the Jupyter ecosystem that can even do multivolume rendering with mixed rendering techniques, combined with regular meshes, scatter and quiver plots. Because it is a Jupyter widget with reusable front-end code, the library can be used from all the JVM languages and C++.

Special thanks to Casper van Leeuwen and all other contributors for this 0.5 release!


Published in TDS Archive



Written by Maarten Breddels

Data Science tooling for Python, creator of Reacton, Solara, Vaex, ipyvolume
