RAPIDS 24.04 Release

Nick Becker
Published in RAPIDS AI
6 min read · Apr 24, 2024

The RAPIDS 24.04 release is now available, and it includes a new accelerated vector search library, expanded zero-code change experiences for pandas and NetworkX workflows, optional query optimization for Dask workflows, support for Python 3.11 and pandas 2, and more.

Bringing accelerated vector search to everyone with cuVS

To better serve the needs of the vector search and database community, we’ve released cuVS, a new library dedicated to vector search.

cuVS contains state-of-the-art implementations of several algorithms for running approximate nearest neighbors and clustering on the GPU. It’s derived from the core algorithms in RAPIDS RAFT and purpose built to be easily embeddable in the world’s vector databases or used as a standalone tool.
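At its core, vector search finds the dataset vectors closest to a query vector. For intuition only, here is the exact brute-force computation that approximate nearest neighbor indexes like CAGRA approximate and accelerate at scale (a NumPy sketch, not cuVS code):

```python
import numpy as np

# Tiny brute-force nearest-neighbor search, for intuition only: this exact
# O(n_queries * n_vectors) scan is the computation that ANN indexes like
# CAGRA approximate and accelerate at scale on the GPU.
rng = np.random.default_rng(42)
dataset = rng.random((1000, 64), dtype=np.float32)  # 1000 vectors, 64 dims
queries = rng.random((5, 64), dtype=np.float32)

# Pairwise squared L2 distances between each query and every dataset vector
dists = ((queries[:, None, :] - dataset[None, :, :]) ** 2).sum(axis=-1)

# Indices of the 10 nearest neighbors for each query, nearest first
k = 10
neighbors = np.argsort(dists, axis=1)[:, :k]
print(neighbors.shape)  # (5, 10)
```

Approximate methods trade a small amount of recall for orders-of-magnitude speedups over this exhaustive scan.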

As part of our work to simplify tapping into accelerated vector search, cuVS now includes both C and Rust APIs in addition to the existing C++ and Python APIs.

The example below shows how to build a vector search index with cuVS’ new C API and DLPack:

#include <cuvs/neighbors/cagra.h>

// Handles for the library resources, index parameters, and index
cuvsResources_t res;
cuvsCagraIndexParams_t index_params;
cuvsCagraIndex_t index;

// Load the dataset as a DLPack tensor (load_dataset is user-provided)
DLManagedTensor *dataset;
load_dataset(dataset);

cuvsResourcesCreate(&res);
cuvsCagraIndexParamsCreate(&index_params);
cuvsCagraIndexCreate(&index);

// Build the CAGRA index from the dataset
cuvsCagraBuild(res, index_params, dataset, index);

// Tear everything down in reverse order
cuvsCagraIndexDestroy(index);
cuvsCagraIndexParamsDestroy(index_params);
cuvsResourcesDestroy(res);

For those who prefer Rust, the example below builds a vector search index with cuVS’ new Rust API:

use cuvs::cagra::{Index, IndexParams};
use cuvs::{Resources, Result};

use ndarray_rand::rand_distr::Uniform;
use ndarray_rand::RandomExt;

/// Example showing how to index data with CAGRA
fn cagra_example() -> Result<()> {
    let res = Resources::new()?;

    // Create a new random dataset to index
    let n_datapoints = 65536;
    let n_features = 512;
    let dataset =
        ndarray::Array::<f32, _>::random((n_datapoints, n_features), Uniform::new(0., 1.0));

    // Build the CAGRA index
    let build_params = IndexParams::new()?;
    let index = Index::build(&res, &build_params, &dataset)?;
    println!(
        "Indexed {}x{} datapoints into cagra index",
        n_datapoints, n_features
    );

    Ok(())
}

With these new C and Rust bindings, you can now integrate CAGRA and all of its goodness into nearly any codebase.

For example, the team at SearchScale has been working on bringing cuVS into Apache Lucene via the new C API to help power an ecosystem built on Java. We’re excited to see this work progress and to help bring accelerated vector search to every platform.

Stay tuned for more information about cuVS in an upcoming blog.

Expanding Zero Code Change Acceleration for pandas and NetworkX code

We’re continuing to invest in making accelerated computing more accessible to data scientists and engineers who may be working with their favorite CPU-based tools and frameworks day-to-day.

cudf.pandas is now GA
cuDF’s pandas accelerator mode, the zero-code change accelerator for pandas workflows, has graduated from Beta and is now Generally Available (GA). Over the past six months, we’ve fixed dozens of bugs, improved the profiler, and expanded our comprehensive suite of unit and integration tests.

Now that cudf.pandas is GA, it’s fully supported by NVIDIA AI Enterprise, enabling every organization to go to production confidently. You can learn more in the cudf.pandas section of the cuDF documentation.
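As a reminder of what zero code change means in practice: an ordinary pandas script like the sketch below runs unmodified, and launching it through the cudf.pandas module (`python -m cudf.pandas script.py`, or `%load_ext cudf.pandas` in Jupyter) executes supported operations on the GPU. The data here is made up for illustration.

```python
import pandas as pd

# An ordinary pandas workflow -- no cuDF-specific code. Run it as
#   python -m cudf.pandas script.py
# (or load %load_ext cudf.pandas in Jupyter) and supported operations
# execute on the GPU, falling back to CPU pandas where needed.
df = pd.DataFrame({
    "store": ["a", "b", "a", "c", "b", "a"],
    "sales": [10.0, 20.0, 30.0, 40.0, 50.0, 60.0],
})
totals = df.groupby("store")["sales"].sum().sort_values(ascending=False)
print(totals)
```

Because the script itself never imports cuDF, the same file runs on machines with or without a GPU.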

cuGraph Backend for NetworkX
With the 24.04 release, the cuGraph backend for NetworkX (nx-cugraph) now supports more than 60 algorithms. We continue to collaborate with the NetworkX community to both solidify the backend dispatching infrastructure for all backends and ensure that the user experience for GPU-accelerated NetworkX is smooth.

In line with that goal, NetworkX 3.3 includes support for graph caching to backends. In prior versions, each time NetworkX dispatched to cuGraph (or any other backend) the underlying graph data structure needed to be converted into something the backend could operate on (e.g., a GPU object).

Now, this graph conversion only occurs once — bringing significant performance gains to workflows that involve running multiple graph algorithms on the same underlying data.
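The caching benefit shows up whenever several algorithms run against one graph. Below is a plain-NetworkX sketch of that pattern; with nx-cugraph installed, adding `backend="cugraph"` to each call dispatches it to the GPU, and under NetworkX 3.3 the converted GPU graph is reused rather than rebuilt per call.

```python
import networkx as nx

# Build one graph and run several algorithms on it. With nx-cugraph
# installed, passing backend="cugraph" to each call dispatches to the GPU;
# under NetworkX 3.3 the converted GPU graph is cached after the first
# call instead of being rebuilt for every algorithm.
G = nx.karate_club_graph()

pr = nx.pagerank(G)                # e.g. nx.pagerank(G, backend="cugraph")
bc = nx.betweenness_centrality(G)  # would reuse the cached converted graph

print(len(pr), len(bc))
```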

If you’re curious about whether your favorite algorithm is supported by the cuGraph backend, the latest NetworkX documentation indicates whether an algorithm is supported by alternative backends.

To learn more about NetworkX backend dispatching in general, visit the NetworkX Backends documentation.

Dask-cuDF Now Supports Query Planning [Experimental]

We’ve been working with the Dask community to bring automatic query planning and optimization to Dask DataFrames through the dask-expr project. Beginning with the 24.04 release, Dask cuDF now supports automatic query optimization as an experimental, optional configuration.

Query optimization can improve performance and reduce peak memory requirements for a given workflow by avoiding unnecessary computation and choosing the most efficient execution path.

With query planning enabled, a query that reads a Parquet file and then filters rows and selects columns will have both the equality filter and the column selection pushed down into the read_parquet call, avoiding significant unnecessary I/O and computation. Without query planning, you’d need to manually optimize your code to achieve the same behavior (which can get quite complex in larger workflows).

We see this play out in practice when running various analytics workflows on a 100GB dataset. Query optimization provides benefits for both single and multi-GPU workflows, resulting in both faster performance and the ability to complete larger workflows on a single-GPU system without running out of memory.

Various analytics workflows on a 100GB dataset

In the RAPIDS 24.06 release, we anticipate Dask cuDF will support query optimization by default rather than as an experimental, opt-in configuration.

RAPIDS-wide support for Python 3.11 and pandas 2

Python 3.11
The 24.04 release brings Python 3.11 support to RAPIDS with packages and containers now available. The RAPIDS Quick Start guide now uses Python 3.11 by default, but you can always use the Release Selector to choose your specific combination of platform dependencies.

Pandas 2
Version 2 was a major milestone for pandas, bringing significant updates like expanded PyArrow support, copy-on-write enhancements, nullable data types, and more.

Beginning with the 24.04 release, cuDF and all RAPIDS libraries now support (and require) pandas 2. As part of this upgrade, cuDF gains numerous feature improvements, API optimizations, and enforcement of long-standing deprecations that simplify data analysis workflows. In addition to providing a better experience for cuDF and cudf.pandas users, these enhancements propagate to downstream libraries like cuML and cuGraph that are designed to accept both cuDF and pandas inputs.
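Two of the pandas 2 behaviors mentioned above, nullable data types and copy-on-write, are easy to see in a quick sketch (plain pandas shown here for illustration):

```python
import pandas as pd

# Two pandas 2 behaviors: nullable integer dtypes and copy-on-write.
pd.set_option("mode.copy_on_write", True)

# Nullable Int64 keeps integers as integers even with missing values,
# instead of silently casting the whole series to float64.
s = pd.Series([1, 2, None], dtype="Int64")
print(s.dtype, s.isna().sum())

# With copy-on-write, mutating a derived object never silently
# modifies its parent DataFrame.
df = pd.DataFrame({"a": [1, 2, 3]})
view = df["a"]
view.iloc[0] = 99          # view gets its own copy on write
print(df["a"].tolist())    # df is left untouched: [1, 2, 3]
```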

For those interested in a full list of breaking changes compared to prior versions of cuDF tied to pandas 1.x, you can learn more in the breaking changes section of the cuDF user guide.

Free on demand video deep dives into Data Science from GTC

NVIDIA GTC took place last month and highlighted some of the incredible work going on in the world of accelerated data science and data processing. In case you missed it, you can still register for free and watch the sessions on demand.

You can search all of the data science and data processing sessions here.

Conclusion

The RAPIDS 24.04 release takes another step forward in our mission to make accelerated computing more accessible to data scientists and engineers. We can’t wait to see what people do with these new capabilities.

To get started using RAPIDS, visit the Quick Start Guide.
