The Road to 1.0 — Building for the Long Haul

RAPIDS was everywhere at NVIDIA’s GTC 2019 in Silicon Valley. It was great to see so many vendors, partners, and customers showcasing RAPIDS in their booths and presentations. Even the RAPIDS press coverage blew me away. Not that we need more motivation, but the adoption of RAPIDS justifies the team’s long nights and tireless work. We won’t let you all down and will continue to build a cutting-edge GPU-accelerated data science ecosystem. You can help by continuing to submit issues, feature requests, and pull requests.

RAPIDS 0.5 is dead, long live 0.6 (for the next 6 weeks)

On that note, I’m also excited to announce RAPIDS 0.6! If you were pleased with version 0.5, you’ll love 0.6. RAPIDS 0.6 is more robust and includes some awesome new features. First and foremost though, we have new docs! You asked for it, and we’re happy to provide a much cleaner doc experience with many details regarding how the project is run.

In cuDF 0.6, we added:

  • initial support for strings in cuDF DataFrames,
  • join & groupby string columns,
  • typecasting string columns to numeric columns,
  • elementwise functions such as length, concatenation, regex search/replace, and split,
  • Parquet, ORC, JSON, HDF5, & Feather file format support via PyArrow/Pandas,
  • PyTorch, MXNet, and Chainer/CuPy compatibility via DLPack & __cuda_array_interface__,
  • scaleout support to multi-GPU & multi-node with Dask-CUDA and Dask-cuDF,
  • HDFS & Cloud Object Store CSV readers, and
  • distributed joins, groupbys and basic aggregations on groups.

libcudf, the C++ and CUDA backend for cuDF, includes a number of new features and improvements. RAPIDS 0.6 continues an ongoing theme of more robust, modular and consistent C++ design throughout libcudf, including its testing infrastructure. This is important to improve stability and maintainability as we accelerate more Python functionality using CUDA C++. These improvements also aim to make libcudf more usable from other C++ applications and from languages other than Python.

libcudf 0.6 also introduces:

  • GPU-accelerated scatter, gather and transpose,
  • infrastructure to support a complete set of GPU binary operations on all data types via just-in-time (JIT) compilation,
  • DLPack support, and
  • Doxygen-generated API documentation.

This release also introduces cuGraph, an initial collection of graph features. RAPIDS is committed to graph analytics, as highlighted in our graph blog.

Highlights of cuGraph in 0.6 include:

  • Louvain clustering,
  • Jaccard Similarity,
  • PageRank, plus many more.
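
For readers new to these graph measures, here is a minimal CPU-only sketch of Jaccard similarity and PageRank in plain Python. It is not cuGraph’s API or implementation; the function names, the toy edge list, and the damping value of 0.85 are all assumptions made for illustration.

```python
def build_adjacency(edges):
    """Return {vertex: set of out-neighbors} for a directed edge list."""
    adj = {}
    for src, dst in edges:
        adj.setdefault(src, set()).add(dst)
        adj.setdefault(dst, set())  # vertices with no out-edges still get a key
    return adj


def jaccard(adj, u, v):
    """Jaccard similarity of the neighbor sets of u and v."""
    union = adj[u] | adj[v]
    if not union:
        return 0.0
    return len(adj[u] & adj[v]) / len(union)


def pagerank(adj, damping=0.85, tol=1e-10, max_iter=200):
    """PageRank by power iteration; ranks sum to 1."""
    n = len(adj)
    rank = {v: 1.0 / n for v in adj}
    for _ in range(max_iter):
        # Rank held by vertices without out-edges is spread evenly over all vertices.
        dangling = sum(rank[v] for v in adj if not adj[v])
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in adj}
        for v, neighbors in adj.items():
            for w in neighbors:
                new_rank[w] += damping * rank[v] / len(neighbors)
        delta = sum(abs(new_rank[v] - rank[v]) for v in adj)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Hypothetical toy graph, used only to exercise the functions above.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "a"), ("d", "c")]
adj = build_adjacency(edges)
ranks = pagerank(adj)
similarity_ab = jaccard(adj, "a", "b")
```

In this toy graph, vertices a and b share one of their two distinct neighbors, and vertex c collects the most rank because three of the four vertices link to it.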

Finally, we made huge improvements in cuML, primarily in three key areas: new single-GPU and multi-GPU algorithms, a new package to ease building the multi-GPU models with Dask, and significant enhancements to our C++ and Python APIs.

Notably, we added to cuML:

  • UMAP (more below),
  • mini-batch Stochastic Gradient Descent (SGD) Regression, including options for ordinary least squares (OLS), Logistic, and Hinge objective functions, as well as Lasso and Elastic Net regularization,
  • single-node multi-GPU k-Nearest Neighbors (k-NN) and OLS, and
  • the initial release of Dask-cuML, a new package to connect multi-GPU cuML algorithms to Dask-cuDF. Initial algorithms are k-NN and Linear Regression.
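
To make the SGD item above concrete, here is a minimal pure-Python sketch of mini-batch SGD for a one-feature linear model with a squared-error objective and an elastic-net penalty. It is not cuML’s API or implementation; the function name, parameter names, default values, and synthetic data are assumptions chosen for illustration. Swapping in a logistic or hinge objective would change only the per-sample gradient.

```python
import random


def sgd_regression(xs, ys, lr=0.1, epochs=200, batch_size=16,
                   alpha=0.0, l1_ratio=0.5, seed=0):
    """Fit y ~ w*x + b with mini-batch SGD on mean squared error.

    alpha scales an elastic-net penalty on w:
        alpha * (l1_ratio * |w| + 0.5 * (1 - l1_ratio) * w**2)
    l1_ratio=1 gives a Lasso penalty, l1_ratio=0 a ridge penalty.
    """
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # new mini-batch composition every epoch
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            grad_w = 0.0
            grad_b = 0.0
            for i in batch:
                err = w * xs[i] + b - ys[i]
                grad_w += 2.0 * err * xs[i]
                grad_b += 2.0 * err
            grad_w /= len(batch)
            grad_b /= len(batch)
            # Subgradient of the elastic-net penalty (the bias is not penalized).
            sign = (w > 0) - (w < 0)
            grad_w += alpha * (l1_ratio * sign + (1.0 - l1_ratio) * w)
            w -= lr * grad_w
            b -= lr * grad_b
    return w, b


# Synthetic, noise-free data from y = 3x + 1 (illustrative only).
data_rng = random.Random(42)
xs = [data_rng.random() for _ in range(200)]
ys = [3.0 * x + 1.0 for x in xs]

w_plain, b_plain = sgd_regression(xs, ys)           # no regularization
w_reg, b_reg = sgd_regression(xs, ys, alpha=0.1)    # elastic net shrinks w
```

Without regularization the fit recovers the slope and intercept used to generate the data; with a nonzero alpha the slope is pulled toward zero, which is the trade-off Lasso and Elastic Net make.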

Accelerating Innovation: UMAP

I want to thank Leland McInnes for all his support and work on UMAP. UMAP is a new, fast dimensionality reduction technique that can be used for visualization, much like t-SNE, and also for general non-linear dimensionality reduction. Over the last year, UMAP use has been growing in research papers, especially in genomics, and most recently Google and OpenAI used UMAP to make neural net activation spaces more interpretable.

As we build out cuML, we want to balance cutting-edge techniques like UMAP with methods that you learn at the start of your career as a data scientist but keep coming back to (such as regularized regression and PCA). For both cutting-edge and fundamental algorithms, our goal is the same: to make them very fast, reliable, and easy to use. By the way, t-SNE fans, don’t fret: we’re working on that algorithm as well.

More Features; Better Support; Stronger Ecosystem

As you can see from the chart, the number of issues filed against RAPIDS keeps growing. This is great! We love the feature requests and bug reports; the only way to make RAPIDS better is to know what works, what doesn’t, and what’s needed. To build a feature-complete and stable code base faster, we’re going to align our support policy with the CUDA and PyData ecosystems that RAPIDS is a part of.

Image on Wookieepedia by John Jude Palencar

For CUDA, we’re going to move to the “Rule of Two” (no, not that Rule of Two): we will only officially support two major CUDA versions in our support matrix. This means that when we add CUDA 10.1 support we will drop official CUDA 9.2 support.
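
As a rough illustration of the policy, the sketch below models the support matrix as a list of version tuples and keeps only the two newest entries whenever a release is added. The function name and the tuple representation are hypothetical, and each CUDA release (9.2, 10.0, 10.1) is treated as its own entry, following the example above.

```python
def apply_rule_of_two(supported, new_version, limit=2):
    """Add new_version, then keep only the `limit` newest versions."""
    versions = sorted(set(supported) | {new_version})
    return versions[-limit:]


# Hypothetical starting matrix: CUDA 9.2 and 10.0.
supported = [(9, 2), (10, 0)]

# Adding CUDA 10.1 pushes the oldest entry, 9.2, out of the matrix.
supported = apply_rule_of_two(supported, (10, 1))
```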

For the PyData and Python ecosystem, we will follow the conda ecosystem for versions we support. This currently means we will continue to support Python 3.6 and 3.7, and we will add 3.8 support once the conda ecosystem does so. Likewise, support for 3.6 will be dropped once support is dropped by the conda ecosystem. This allows us to reliably test, package, and distribute packages that work well with the rest of the ecosystem.

After talking with many early adopters, startups, and partners, we feel getting more features out and more tightly scoping the support matrix is the right thing to do. Once we get to 1.0, we will expand our officially supported matrix of versions. If this reduced support is going to impact your usage of RAPIDS, please let us know via a GitHub issue so we can discuss!

What’s Coming Next

Finally, a quick preview of our next release. In 0.7 we plan to:

  • deprecate the current implementation of K-Means (but don’t worry! We will have a new and improved version to replace it),
  • change the way that cuDF talks to the underlying C++ to allow the user to get readable errors for improved troubleshooting,
  • add GPU-accelerated Parquet reader support (yes, with decompression and RLE decoding on the GPU),
  • add rolling windows to cuDF and Dask-cuDF,
  • add OpenUCX support to Dask-cuDF (via UCX-Py) for faster P2P communication, GPU RDMA, and better use of NVLink/NVSwitch, and much more.

As always, we invite you to engage with us on GitHub. If there’s a feature, algorithm, or function you’d like to see, please let us know by opening a GitHub issue. Thank you again for using RAPIDS, and I look forward to all the feedback on 0.6. Onward to 1.0!