RAPIDS Release 0.18

Tons of New Features and Updates. Let’s Dig in.

Alex Ravenel
RAPIDS AI
4 min read · Mar 2, 2021


In the 0.18 release of RAPIDS, the team continues its mission of delivering better, faster, more refined tools. Here’s what we shipped.

RAPIDS Core Libraries Updates

RAPIDS Data Frames: cuDF

cuDF has added broader support for fixed-point decimal types in C++, with initial support in Python. Support for nested types continues to improve, and additional groupby and rolling window aggregations have been added. cuDF has also improved its documentation on groupby, missing data, and I/O.
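To show why fixed-point decimals are worth having, here is a minimal pure-Python sketch of the idea: a value stored as an integer plus a base-10 scale. The `FixedPoint` class is hypothetical and is not cuDF's API. It only illustrates how scaled integers avoid the rounding that binary floating point introduces.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FixedPoint:
    """Hypothetical fixed-point decimal: real value = unscaled * 10**scale."""

    unscaled: int  # integer representation of the value
    scale: int     # base-10 exponent, negative for fractional digits

    def rescale(self, scale: int) -> "FixedPoint":
        # Only allow moving to a finer (smaller) scale so no digits are lost.
        if scale > self.scale:
            raise ValueError("rescaling to a coarser scale would lose digits")
        return FixedPoint(self.unscaled * 10 ** (self.scale - scale), scale)

    def __add__(self, other: "FixedPoint") -> "FixedPoint":
        # Bring both operands to the finer scale, then add the integers exactly.
        scale = min(self.scale, other.scale)
        a, b = self.rescale(scale), other.rescale(scale)
        return FixedPoint(a.unscaled + b.unscaled, scale)


# 0.10 + 0.20 with two fractional digits (scale = -2) is exactly 0.30.
total = FixedPoint(10, -2) + FixedPoint(20, -2)

# The same sum in binary floating point is not exactly 0.3.
float_total = 0.1 + 0.2
```

Here `total` equals `FixedPoint(30, -2)`, while `float_total == 0.3` is `False`. That difference is the main reason financial and accounting workloads prefer decimal types.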

RAPIDS Machine Learning: cuML and XGBoost

cuML continues to emphasize three key themes: model explainability, performance, and support for flexible data types.

https://en.wikipedia.org/wiki/Shapley_value

The experimental implementation of the SHAP explainability algorithm has improved performance and robustness; stay tuned for a deep-dive blog post on this topic. DBSCAN can now take advantage of multiple GPUs or nodes, improving performance for this compute-intensive algorithm. Conversion of Random Forest models to FIL's high-performance inference format is now much faster. t-SNE now accepts sparse input data, and Incremental PCA has graduated from “experimental” to production-grade, allowing users to perform dimensionality reduction on very large, batched datasets. Finally, the nearest neighbors algorithm has been extended to support approximate nearest neighbors methods for queries of very large datasets.
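For readers new to SHAP, the quantity it estimates is the Shapley value: each feature's marginal contribution, averaged over every order in which features could be added. The sketch below computes it exactly for a tiny, hypothetical value function. It is not how cuML computes SHAP values (exact enumeration is only feasible for a handful of features), and `toy_value` is made up purely for illustration.

```python
from itertools import permutations
from math import factorial


def shapley_values(value_fn, n_features):
    """Exact Shapley values: average marginal contribution over all orderings."""
    phi = [0.0] * n_features
    for order in permutations(range(n_features)):
        present = set()
        for feature in order:
            before = value_fn(frozenset(present))
            present.add(feature)
            after = value_fn(frozenset(present))
            phi[feature] += after - before
    n_orderings = factorial(n_features)
    return [p / n_orderings for p in phi]


def toy_value(coalition):
    """Hypothetical model output when only the features in `coalition` are known."""
    score = 0.0
    if 0 in coalition:
        score += 2.0
    if 1 in coalition:
        score += 1.0
    if 0 in coalition and 2 in coalition:
        score += 3.0  # interaction: feature 2 only matters together with feature 0
    return score


phi = shapley_values(toy_value, 3)
```

The interaction term is split evenly between features 0 and 2, and the three values sum to the difference between the full model output and the empty baseline. That additivity property is what makes Shapley values attractive for explaining predictions.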

XGBoost has been updated to version 1.3.3, bringing a number of bug fixes in addition to key features from the 1.3.x series, such as updates to the Dask API and GPU-accelerated model explainability via GPUTreeSHAP (see the full blog here).

RAPIDS Graph Analytics: cuGraph

cuGraph has added two new algorithms: Traveling Salesman (TSP) and Egonet extraction. The team has also rolled out a new version of Multi-GPU PageRank, built on cuGraph primitives that use 2D data partitioning, which scales to extremely large graphs. BFS now accepts a depth limit. The release also includes further improvements to the graph primitives.
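Two of these features are closely related: an egonet is the subgraph induced by all vertices within a given number of hops of a center vertex, which is what a depth-limited BFS finds. The plain-Python sketch below shows both ideas on a small adjacency list. The function names are hypothetical and this is not the cuGraph API, which runs on the GPU.

```python
from collections import deque


def bfs_depth_limited(adj, source, depth_limit):
    """Return {vertex: hop distance} for vertices within depth_limit hops of source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == depth_limit:
            continue  # do not expand vertices that already sit at the limit
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def egonet(adj, center, radius=1):
    """Induced subgraph on all vertices within `radius` hops of `center`."""
    keep = set(bfs_depth_limited(adj, center, radius))
    return {u: sorted(v for v in adj.get(u, ()) if v in keep) for u in sorted(keep)}


# Small undirected graph: a triangle (0, 1, 2) with a tail 2 - 3 - 4.
adj = {
    0: [1, 2],
    1: [0, 2],
    2: [0, 1, 3],
    3: [2, 4],
    4: [3],
}

reachable = bfs_depth_limited(adj, 0, 2)  # vertex 4 is three hops away and is skipped
ego = egonet(adj, 0, radius=1)            # just the triangle around vertex 0
```

Stopping the traversal at the depth limit means the work is proportional to the neighborhood size rather than the whole graph, which matters when the graph has billions of edges.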

https://en.wikipedia.org/wiki/Travelling_salesman_problem

RAPIDS Signal Processing: cuSignal

cuSignal is moving into phased-array systems, which are relevant to developers whose signal processing workloads draw on multiple sensors. This release adds support for both pulse compression and pulse-Doppler processing, and we plan to keep growing our support in this space.
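Pulse compression is, at its core, a matched filter: the received signal is correlated with the transmitted pulse so that a long, coded pulse collapses into a sharp peak at the echo delay. The sketch below shows this with a Barker-13 code in plain Python. It does not use cuSignal, the `matched_filter` helper is hypothetical, and the echo is an idealized, noise-free one.

```python
# Barker-13 code: a classic binary phase code with very low correlation sidelobes.
BARKER_13 = [1, 1, 1, 1, 1, -1, -1, 1, 1, -1, 1, -1, 1]


def matched_filter(received, pulse):
    """Correlate `received` with a real-valued `pulse` at every lag where it fully overlaps."""
    m = len(pulse)
    return [
        sum(received[lag + k] * pulse[k] for k in range(m))
        for lag in range(len(received) - m + 1)
    ]


# Hypothetical noise-free echo: the pulse comes back 17 samples late.
delay = 17
received = [0] * 40
received[delay:delay + len(BARKER_13)] = BARKER_13

compressed = matched_filter(received, BARKER_13)
peak_lag = max(range(len(compressed)), key=lambda i: compressed[i])
```

The 13-sample pulse compresses to a single peak of height 13 at `peak_lag == delay`, with every other lag at magnitude 1 or less. A GPU implementation typically performs the same correlation with FFTs across many pulses at once.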

RAPIDS cuXfilter and Visualization

cuXfilter has new fully customizable responsive layouts, new themes, and new panel widgets. Find usage details in our API docs, and keep an eye out for our upcoming end-to-end tutorial video.

New RAPIDS cuXfilter responsive layouts

RAPIDS Memory Manager (RMM)

RMM has added a new C++ cuda_async_memory_resource built on cudaMallocAsync, a new C++ stream pool, a Python stream wrapper class, and Thrust interop improvements.

Cyber Log Accelerators (CLX)

For this release, CLX focused on adding out-of-the-box capabilities that help cybersecurity and information security professionals tackle new use cases. Now included is anomaly detection using the Lightweight Online Detector of Anomalies (LODA), with both a notebook example and the underlying CLX module needed to build an end-to-end pipeline. cyBERT has been extended to cover predictive maintenance tasks, which use the new sequence classifier.
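As background on LODA, the algorithm scores a sample by projecting it onto many sparse random directions, looking up how likely each projection is in a one-dimensional histogram built from training data, and averaging the negative log-probabilities. The sketch below is a minimal, CPU-only illustration of that idea. The `Loda` class, its parameters, and the smoothing scheme are assumptions made for this example and are not the CLX module's API.

```python
import math
import random


class Loda:
    """Hypothetical minimal LODA: 1-D histograms on sparse random projections."""

    def __init__(self, n_projections=50, n_bins=10, seed=0):
        self.n_projections = n_projections
        self.n_bins = n_bins
        self.rng = random.Random(seed)
        self.models = []

    @staticmethod
    def _project(w, row):
        return sum(wi * xi for wi, xi in zip(w, row))

    def fit(self, rows):
        dim = len(rows[0])
        n_nonzero = max(1, round(math.sqrt(dim)))
        total = len(rows) + self.n_bins + 1  # counts plus one pseudo-count per bucket
        self.models = []
        for _ in range(self.n_projections):
            # Sparse projection: about sqrt(dim) Gaussian entries, zeros elsewhere.
            w = [0.0] * dim
            for j in self.rng.sample(range(dim), n_nonzero):
                w[j] = self.rng.gauss(0.0, 1.0)
            z = [self._project(w, r) for r in rows]
            lo, hi = min(z), max(z)
            width = (hi - lo) / self.n_bins or 1.0
            counts = [0] * self.n_bins
            for value in z:
                counts[min(int((value - lo) / width), self.n_bins - 1)] += 1
            # Smoothed bin probabilities; `floor` is the mass left for out-of-range values.
            probs = [(c + 1) / total for c in counts]
            floor = 1 / total
            self.models.append((w, lo, hi, width, probs, floor))
        return self

    def score(self, row):
        """Average negative log-probability across projections; higher is more anomalous."""
        acc = 0.0
        for w, lo, hi, width, probs, floor in self.models:
            z = self._project(w, row)
            if lo <= z <= hi:
                p = probs[min(int((z - lo) / width), self.n_bins - 1)]
            else:
                p = floor
            acc -= math.log(p)
        return acc / len(self.models)


# Hypothetical training data: 500 four-dimensional Gaussian samples.
data_rng = random.Random(42)
train = [[data_rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(500)]

model = Loda().fit(train)
inlier_score = model.score([0.0, 0.0, 0.0, 0.0])
outlier_score = model.score([8.0, 8.0, 8.0, 8.0])
```

A point far from the training data falls into sparse or out-of-range bins on most projections, so `outlier_score` comes out higher than `inlier_score`. Because each histogram is cheap to update, the method adapts well to streaming data.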

If you’re interested in Exploratory Data Analysis (EDA) for security-focused datasets, the EDA module can help you perform that task with less manual setup time. If you’re a SlashNext user, you can wrap your API calls with CLX to keep your entire enrichment pipeline on the GPU. Many CLX use cases have complex dependencies. While building a Docker container yourself was always an option, the CLX Docker image is now available on Docker Hub (stable and nightly) and NGC.

CLX contributors are excited about the future of cybersecurity and how the security infrastructure of the future will be transformed by the tight integration of sensors, telemetry, transport, and raw AI compute capabilities. We are also pleased to announce that CLX has additional NLP-based workflows planned for the near future.

BlazingSQL

BlazingSQL has added new string functions (REPLACE, TRIM, UPPER, and others). Additionally, there is now a new communication layer that improves distributed performance and supports UCX.
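For readers unfamiliar with these SQL string functions, the sketch below shows their standard behavior. It uses SQLite from the Python standard library as a stand-in so that it runs without a GPU. It does not use BlazingSQL, the table and data are made up, and BlazingSQL's handling of edge cases may differ.

```python
import sqlite3

# In-memory SQLite database as a stand-in engine (an assumption for this example).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (msg TEXT)")
conn.executemany(
    "INSERT INTO logs VALUES (?)",
    [("  gpu-ready  ",), ("cpu-only",)],
)

# REPLACE swaps substrings, TRIM strips surrounding whitespace, UPPER uppercases.
rows = conn.execute(
    "SELECT UPPER(TRIM(REPLACE(msg, '-', '_'))) AS cleaned FROM logs ORDER BY rowid"
).fetchall()
conn.close()
```

Here `rows` is `[("GPU_READY",), ("CPU_ONLY",)]`. Having these functions in the SQL engine means string cleanup can stay in the query rather than in a separate preprocessing pass.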

RAPIDS Community

The RAPIDSFire podcast is now available weekly, and we’ve added several new episodes since the last release.

We have an exciting series of guests lined up for the coming weeks, so remember to subscribe wherever you get your podcasts.

If there’s a topic you’d like to hear or if you’d like to be a guest, reach out to host Paul Mahler on Twitter.

Conclusion

RAPIDS has continued to evolve from our first release, and we have an exciting year ahead.

Before we close out, we’d like to acknowledge Josh Patterson who has moved on to explore new challenges outside of NVIDIA. We want to thank Josh for his hard work and can’t wait to see what the future holds for him.
