Introducing Kaleido ✨

Static image export for web-based visualization libraries with zero dependencies

Jon Mease
Jul 16, 2020 · 6 min read

Background

As simple as it sounds on the surface, programmatically generating static images (e.g. raster images like PNGs or vector images like SVGs) from web-based visualization libraries (e.g. Plotly.js) is a complex problem. It’s a problem that library developers have struggled with for years, and it has delayed the adoption of these libraries among scientific communities that rely on print-based publications for sharing their research. Today we introduce Kaleido: an easy-to-install, Chromium-based library for static image export from web-based visualization libraries.

The core difficulty is that web-based visualization libraries don’t actually render plots on their own. Instead, they delegate this work to web technologies like SVG, Canvas, WebGL, etc. Similar to how matplotlib relies on various backends to display figures, web-based visualization libraries rely on a web browser rendering engine to display figures.

When a figure is displayed in a browser window, it’s relatively straightforward for a visualization library to provide an export-image button because it has full access to the browser for rendering. The difficulty arises when trying to export an image programmatically (e.g. from Python) without displaying it in a browser and without user interaction. To accomplish this, the Python portion of the visualization library needs programmatic access to a web browser’s rendering engine.

There are three main approaches that are currently in use among Python web-based visualization libraries (e.g. Plotly, Bokeh, Altair, ipyvolume, etc.):

  1. The Selenium or pyppeteer Python libraries can be used to control a full system web browser such as Firefox, Chrome, or Chromium to perform image rendering.
  2. A custom headless Electron application can be used to perform image rendering using the Chromium browser engine built into Electron. This is the approach taken by Plotly’s current Orca image export library.
  3. When operating in the Jupyter notebook or JupyterLab, a Python library can use the Jupyter Comms protocol to communicate with a custom Jupyter extension running in the browser. This extension can perform the image export and then communicate the results back to the Python process using the Comms protocol.

While approaches 1 and 2 can both be installed using conda, they still rely on all of the system dependencies of a complete web browser, even the parts that aren’t actually necessary for rendering a visualization. For example, on Linux both require the installation of system libraries related to audio (libasound.so), video (libffmpeg.so), GUI toolkit (libgtk-3.so), screensaver (libXss.so), and X11 (libX11-xcb.so) support. Many of these are not typically included in headless Linux installations like those found in JupyterHub, Binder, Colab, Azure Notebooks, SageMaker, etc. Also, conda is still not as universally available as the pip package manager, and neither approach is installable using pip packages.

Additionally, both 1 and 2 communicate between the Python process and the web browser process over a local network port. While not typically a problem, certain firewall and container configurations can interfere with this local network connection.

The advantage of option 3 is that it introduces no additional system dependencies. The disadvantage is that it relies on running within a Jupyter notebook, so it can’t be used in standalone Python scripts.

The end result is that all of these visualization libraries have in-depth documentation pages on how to get image export working, and how to troubleshoot the inevitable failures and edge cases that people run into. While this is a great improvement over the state of affairs just a couple of years ago, and a lot of excellent work has gone into making these approaches work as seamlessly as possible, the fundamental limitations detailed above still result in sub-optimal user experiences. This is especially true when comparing web-based plotting libraries to traditional plotting libraries like matplotlib and ggplot2 where there’s never a question of whether image export will work in a particular context.

The goal of the Kaleido project is to make static image export of web-based visualization libraries as universally available and reliable as it is in matplotlib and ggplot2.

The Kaleido Approach

To accomplish this goal, Kaleido introduces a new approach. The core of Kaleido is a standalone C++ application that embeds the open-source Chromium browser as a library. This architecture allows Kaleido to communicate with the Chromium browser engine using the C++ API rather than requiring a local network connection. A thin Python wrapper runs the Kaleido C++ application as a subprocess and communicates with it by writing image export requests to standard-in and retrieving results by reading from standard-out.
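The standard-in/standard-out exchange described above can be sketched in a few lines of Python. This is only an illustration of the pattern, not Kaleido's actual wire protocol: the newline-delimited JSON messages, the field names, and the StdioExporter class are all assumptions made for the example, and a tiny Python child process stands in for the C++ application.

```python
import json
import subprocess
import sys

# Stand-in for the C++ executable (hypothetical): it reads one JSON request
# per line from standard-in and writes one JSON response per line to
# standard-out. A real renderer would return image data instead of "ok".
CHILD_SOURCE = r"""
import json, sys
for line in sys.stdin:
    request = json.loads(line)
    response = {"id": request["id"], "format": request["format"], "ok": True}
    sys.stdout.write(json.dumps(response) + "\n")
    sys.stdout.flush()
"""


class StdioExporter:
    """Thin wrapper that talks to a child process over pipes, not a network port."""

    def __init__(self):
        self._proc = subprocess.Popen(
            [sys.executable, "-c", CHILD_SOURCE],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
        )
        self._next_id = 0

    def request(self, figure, fmt="png"):
        """Send one export request and block until the matching response line arrives."""
        self._next_id += 1
        message = {"id": self._next_id, "format": fmt, "figure": figure}
        self._proc.stdin.write(json.dumps(message) + "\n")
        self._proc.stdin.flush()
        return json.loads(self._proc.stdout.readline())

    def close(self):
        """Closing standard-in ends the child's read loop, so it exits cleanly."""
        self._proc.stdin.close()
        self._proc.wait()
        self._proc.stdout.close()


if __name__ == "__main__":
    exporter = StdioExporter()
    print(exporter.request({"data": []}, fmt="svg"))
    exporter.close()
```

Because the two processes share only a pair of pipes, nothing in this exchange needs to bind or connect to a local port.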

By compiling Chromium as a library, we have a degree of control over what is included in the Chromium build. In particular, on Linux we can build Chromium in headless mode, which eliminates a large number of runtime dependencies, including the audio, video, GUI toolkit, screensaver, and X11 dependencies mentioned above. The remaining dependencies can then be bundled with the library, making it possible to run Kaleido in minimal Linux environments with no additional dependencies required. In this way, Kaleido can be distributed as a self-contained library that plays a similar role to a matplotlib backend.

Improvements

Compared to Orca, Kaleido brings a wide range of improvements to plotly.py users.

pip installation support

Pre-compiled wheels for 64-bit Linux, macOS, and Windows are available on PyPI and can be installed using pip. As with Orca, Kaleido can also be installed using conda.

Improved startup time and resource usage

Kaleido starts up about twice as fast as Orca, and uses about half as much system memory.

Docker compatibility

Kaleido can operate inside Docker containers based on Ubuntu 14.04+ or CentOS 7+ (or almost any other Linux distro released after ~2014) without the need to install additional dependencies using apt or yum, and without relying on Xvfb as a headless X11 server.

Hosted notebook service compatibility

Kaleido can be used in just about any online notebook service that permits the use of pip to install the kaleido package. These include Colab, SageMaker, Azure Notebooks, Databricks, Kaggle, etc. In addition, Kaleido is compatible with the default Docker image used by Binder.

Security policy / Firewall compatibility

There were occasionally situations where strict security policies and/or firewall services would block Orca’s ability to bind to a local port. Kaleido does not have this limitation since it does not use ports for communication.

Try it out

Kaleido can be installed from PyPI using pip (the package is named kaleido) or using conda.

Out of the box, Kaleido supports converting Plotly figures to PNG, JPG, WebP, SVG, and PDF output formats. Support for the EPS format is available when the poppler library is installed. Poppler can be installed using either conda or a system package manager.

When Kaleido is installed, plotly.py 4.9.0+ will automatically use it for image export operations, falling back to Orca if Kaleido is not available. For example…
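A minimal sketch of such an export, assuming plotly.py 4.9.0+, pandas (needed by Plotly Express), and kaleido are installed. The iris dataset, the column choices, and the helper names kaleido_available and export_example are illustrative assumptions; the export itself goes through plotly.py's write_image method.

```python
import importlib.util


def kaleido_available() -> bool:
    """Return True if both plotly and kaleido are importable in this environment."""
    return all(
        importlib.util.find_spec(name) is not None
        for name in ("plotly", "kaleido")
    )


def export_example(path: str = "fig.png") -> str:
    """Build a small scatter plot and write it to `path` as a static image."""
    # Imported here so that the availability check above still works
    # on machines where plotly is not installed.
    import plotly.express as px

    fig = px.scatter(
        px.data.iris(), x="sepal_length", y="sepal_width", color="species"
    )
    # The output format is inferred from the file extension.
    fig.write_image(path)
    return path
```

With those packages installed, export_example() can be called from a script or a notebook cell.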

This will produce a file named fig.png in the current working directory.
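The same figure can be written to each of the output formats listed earlier by changing only the file extension. The sketch below assumes `fig` is a plotly.py figure and that kaleido is installed; the SUPPORTED tuple and the helper names are assumptions for illustration, and EPS is left out because it additionally requires poppler.

```python
from pathlib import Path

# Formats the article lists as supported out of the box.
SUPPORTED = ("png", "jpg", "webp", "svg", "pdf")


def output_paths(stem, out_dir="."):
    """Return one output path per supported format, e.g. fig.png, fig.jpg, ..."""
    return [Path(out_dir) / f"{stem}.{ext}" for ext in SUPPORTED]


def export_all(fig, stem="fig", out_dir="."):
    """Write `fig` once per supported format and return the paths written."""
    written = []
    for path in output_paths(stem, out_dir):
        # write_image infers the output format from the file extension.
        fig.write_image(str(path))
        written.append(path)
    return written
```

Only export_all needs plotly and kaleido; output_paths is plain path handling and can be used to check file names ahead of time.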

Beyond plotly.py: Kaleido Scopes

While the development of Kaleido has been motivated by the needs of the plotly.py community, we know we’re not unique in facing these challenges. We’ve designed the C++ and Python portions of Kaleido using a basic plugin architecture (Kaleido plugins are called Scopes) with the goal of making it possible to support image export for other web-based visualization libraries. If you’re interested in adding support to Kaleido for another web-based visualization library, check out the Scope (Plugin) Architecture wiki page and let us know how we can help!

Additionally, since the core of Kaleido is a standalone C++ application that receives export requests on standard-in and writes responses to standard-out, it is relatively straightforward to build wrappers for other languages besides Python. In fact, Kaleido support will be coming soon to the Plotly for Rust library. If you’re interested in writing a Kaleido wrapper for another language, check out the Language Wrapper Architecture wiki page and, again, let us know how we can help.

Written by Jon Mease, Chief Scientist at Plotly