Data to .dashboard()

Allan Enemark
Jan 16 · 3 min read

The RAPIDS viz team has wrapped up a massive refactor since our last post, and we are happy to announce a new, easier-to-use Pythonic notebook interface. With just a few lines of code, you can start a GPU-accelerated, cross-filtered, browser-based dashboard directly from your notebook. Basically, we take the headache out of interconnecting multiple charts to a cuDF backend, so you can get to visually exploring data faster, straight from where you are already working.

By leveraging Jupyter notebooks, Bokeh server, and Panel to greatly reduce complexity, we remove the need for any setup beyond a RAPIDS installation. Starting with the 0.12 release, cuxfilter will be part of the RAPIDS conda and Docker installations. For up-to-date details, check out the cuxfilter installation instructions.

Major features of the updated cuxfilter

One more thing. We’ve been working closely with the fantastic HoloViz Datashader team to develop a GPU-accelerated version of their library. After prototyping with our own cuDatashader project, HoloViz developed a native cuDF and GPU-accelerated version of Datashader that has just been released. They’ve done such a good job that we’ve now integrated Datashader with cuxfilter.

Get Started

The fastest way to get started is to check out ‘10 minutes to cuxfilter.’ We have notebook examples on our docs page, as well as information on available chart types, dashboard layouts, and color themes.
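
To give a feel for the "few lines" involved, here is a minimal sketch of wiring two bar charts into a cross-filtered dashboard. It assumes a working RAPIDS installation with a GPU and an existing cuDF DataFrame; the helper name `build_dashboard` and the column names `hour` and `day` are made up for illustration, so check the docs for the exact options available in your version.

```python
def build_dashboard(gdf, first_col="hour", second_col="day"):
    """Build a two-chart cross-filtered dashboard from a cuDF DataFrame.

    `gdf` is assumed to be a cudf.DataFrame that contains the two
    (hypothetical) columns named by `first_col` and `second_col`.
    """
    # Imported lazily because cuxfilter needs a RAPIDS install and an NVIDIA GPU.
    import cuxfilter

    # Wrap the cuDF DataFrame so cuxfilter can manage it.
    cux_df = cuxfilter.DataFrame.from_dataframe(gdf)

    # Declare one bar chart per column; selections in one filter the other.
    charts = [
        cuxfilter.charts.bar(first_col),
        cuxfilter.charts.bar(second_col),
    ]

    # Link the charts together into a single dashboard object.
    return cux_df.dashboard(charts, title="Example dashboard")
```

In a notebook, calling `.show()` on the returned dashboard should start it as a browser-based app.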

By the way, cuxfilter is best used to interact with large (1 million+) tabular datasets. GPUs are fast, but accessing that speedup requires some overhead that isn’t worthwhile for smaller datasets.

Finding that usability sweet spot

This big refactor reflects how we have honed our goals since the initial PoC release. The notebook interface is aimed specifically at supporting Python-focused data scientists and analysts in their workflows. While there are many fantastically capable visualization libraries available, often the goal of simply “seeing my data” is stymied by the mental and technical overhead of learning how to use them, especially for large datasets with multiple interactive charts.

To alleviate this, our general principle is to use existing viz libraries, enhance their capability with GPU acceleration, and simplify the deployment of cross-filter focused dashboards with opinionated templates. Essentially, we’re trying to find a usability sweet spot between Bokeh and Tableau.

What’s a cuDataTile?

Values for the charts are precomputed to allow for very fast slider scrubbing when cross-filtering, without any pause for recalculation. This is enabled through cuDataTiles, a GPU-accelerated version of data tiles inspired by the Falcon project.
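
To make the precomputation idea concrete, here is a small CPU-only sketch of a data tile in plain Python. It is not cuxfilter's implementation (which runs on the GPU); the function names and the toy `hours`/`days` data are invented for illustration. The tile is a 2D histogram between the chart being brushed (the "active" chart) and a linked ("passive") chart, prefix-summed along the active axis so that any slider range reduces to subtracting two rows.

```python
def build_data_tile(active_bins, passive_bins, n_active, n_passive):
    """Return a cumulative 2D histogram with n_active + 1 rows.

    tile[a][p] is the number of records whose active bin is < a
    and whose passive bin is exactly p.
    """
    # Plain 2D histogram: one pass over the data.
    counts = [[0] * n_passive for _ in range(n_active)]
    for a, p in zip(active_bins, passive_bins):
        counts[a][p] += 1

    # Prefix sum along the active axis, starting from a row of zeros.
    tile = [[0] * n_passive]
    for row in counts:
        tile.append([prev + c for prev, c in zip(tile[-1], row)])
    return tile


def passive_histogram(tile, lo, hi):
    """Passive chart counts for active bins lo..hi (inclusive).

    Only two precomputed rows are read; the raw data is never rescanned.
    """
    return [upper - lower for upper, lower in zip(tile[hi + 1], tile[lo])]


# Toy data: each record has an hour bin (0-3) and a day bin (0-1).
hours = [0, 1, 1, 2, 3, 3, 3]
days = [0, 0, 1, 1, 0, 1, 1]
tile = build_data_tile(hours, days, n_active=4, n_passive=2)

# Scrubbing the hour slider to the range 1..2 updates the day chart instantly.
print(passive_histogram(tile, 1, 2))  # [1, 2]
```

The cost of building the tile is paid once per active chart, which is why scrubbing a slider afterwards does not need to pause for recalculation.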

Hey, where’s the original cuxfilter and mortgage viz demo?

The original version of cuxfilter, best known for powering the mortgage viz demo, has been moved into a separate branch on GitHub. Since it has a much more complicated backend and a hardcoded JavaScript frontend, we’ve decided to focus on the streamlined notebook version in the master branch. More on JavaScript later.

What’s next?

Much more is to come. With this refactor as a foundation, we are planning to extend cuxfilter with Pydeck charts, large-scale network graphs, and better-looking dashboard templates. We are also working on several notebook demos that we can use for an easy-to-browse example gallery.

For those who would like to create bespoke JavaScript visualization applications, we are planning to develop a proper JavaScript API based on Node.js in the near future (so keep your eyes peeled).

If you have any questions about building your own GPU-backed viz app or have a feature request, reach out! Ping us on Slack or raise an issue on our GitHub. While you are there, why not help us improve cuxfilter?

Thank you to these open source projects

Cuxfilter in its current form wouldn’t be possible without these great open source projects on which we rely: Bokeh, Datashader, Panel, Falcon, Jupyter. So, thanks all for being part of the open source community.

RAPIDS AI

RAPIDS is a suite of software libraries for executing end-to-end data science & analytics pipelines entirely on GPUs.

