RAPIDS 23.06 Release — Accelerating Data Science

Alex Ravenel
RAPIDS AI
Jun 21, 2023

RAPIDS’ 23.06 release is now available! This release adds a number of features and improvements we are excited about, including:

  • Major improvements to cuSpatial, including DE-9IM support, pip packages, performance improvements, and improved example notebooks
  • RAFT integrations with Implicit (collaborative filtering) and Meta’s FAISS (similarity search), as well as improvements to pylibraft
  • Performance and memory usage improvements in cuGraph
  • Timezone support and numerous performance increases in cuDF
  • CUDA 12 support in pip wheels

Get started with the quick start installation instructions for conda, pip, or docker, or read on for more details on this month’s upgrades!

Expanded spatial relationship and distance queries in cuSpatial

23.06 is one of our biggest releases yet, with a focus on extending cuSpatial’s reach. It is easier than ever to use cuSpatial, and keeping both data and compute on the GPU makes further processing and analysis faster. Major changes include:

  • cuSpatial can now be installed with pip
  • Cartesian distance between GeoSeries of any geometry type
  • Spatial predicates for any two non-multigeometry columns
  • Notebook improvements, including using cuML with cuSpatial and a cuDF-accelerated WKT parser

Starting with 23.06, you can now install cuSpatial using pip alongside the other main RAPIDS libraries, hosted on the NVIDIA index. We support both CUDA 11.2–11.8 and 12.0 with pip.

# If using driver 525+, with support for CUDA Toolkit 12.0+

pip install --extra-index-url=https://pypi.nvidia.com cuspatial-cu12

# If using driver 450.80+, with support for CUDA Toolkit 11.2+

pip install --extra-index-url=https://pypi.nvidia.com cuspatial-cu11

This release also includes fully parallelized Cartesian distance between any pair of geometries. It’s easier and faster than ever to find the distance between objects in large datasets of (multi)points, (multi)polygons, and (multi)linestrings.
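To make "pairwise distance" concrete, here is a minimal CPU sketch in plain Python that computes the distance between each point and the linestring at the same position in a second list. It is not cuSpatial's API or implementation; the helper names (`point_segment_distance`, `point_linestring_distance`, `pairwise_point_linestring_distance`) are hypothetical and exist only for illustration.

```python
import math


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment a-b."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: both endpoints coincide.
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment's line, then clamp to the segment itself.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def point_linestring_distance(p, line):
    """Distance from p to the nearest segment of a linestring."""
    return min(point_segment_distance(p, a, b) for a, b in zip(line, line[1:]))


def pairwise_point_linestring_distance(points, lines):
    """Row i of the result is the distance between points[i] and lines[i]."""
    if len(points) != len(lines):
        raise ValueError("pairwise inputs must have the same length")
    return [point_linestring_distance(p, ln) for p, ln in zip(points, lines)]


points = [(0.0, 3.0), (5.0, 0.0)]
lines = [
    [(-4.0, 1.0), (4.0, 1.0)],
    [(0.0, 0.0), (2.0, 0.0)],
]
distances = pairwise_point_linestring_distance(points, lines)
print(distances)  # [2.0, 3.0]
```

Each row is independent of every other row, which is why this kind of computation parallelizes well on a GPU.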

23.06 now supports the complete set of binary spatial predicates defined by the DE-9IM (Dimensionally Extended 9-Intersection Model). Nine predicates are supported in this release: contains, geom_equals, intersects, covers, crosses, disjoint, overlaps, touches, and within. These predicates let you use GPU parallelization to determine spatial relationships between two GeoSeries.
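For readers new to DE-9IM, the sketch below shows how named predicates reduce to pattern masks over a nine-character intersection matrix (interior, boundary, and exterior of one geometry against those of the other). It is a plain-Python illustration of the standard model, not cuSpatial's implementation; `matches`, `relate`, and `PATTERNS` are hypothetical names, and the two matrix strings are hand-written examples rather than output from any library. Note that `equals` here corresponds to the predicate cuSpatial calls geom_equals.

```python
# Standard DE-9IM masks: "T" means any non-empty intersection (0, 1, or 2),
# "F" means empty, "*" means "don't care".
PATTERNS = {
    "equals": ["T*F**FFF*"],
    "disjoint": ["FF*FF****"],
    "within": ["T*F**F***"],
    "contains": ["T*****FF*"],
    "covers": ["T*****FF*", "*T****FF*", "***T**FF*", "****T*FF*"],
    "touches": ["FT*******", "F**T*****", "F***T****"],
}


def matches(matrix, pattern):
    """Return True if a 9-character DE-9IM matrix satisfies a mask."""
    if len(matrix) != 9 or len(pattern) != 9:
        raise ValueError("DE-9IM strings have exactly 9 characters")
    for cell, want in zip(matrix, pattern):
        if want == "*":
            continue
        if want == "T":
            if cell == "F":
                return False
        elif cell != want:
            return False
    return True


def relate(matrix, predicate):
    """Evaluate a named predicate against a DE-9IM matrix."""
    if predicate == "intersects":
        # intersects is defined as the negation of disjoint.
        return not relate(matrix, "disjoint")
    return any(matches(matrix, p) for p in PATTERNS[predicate])


# Two linestrings crossing at a single interior point.
crossing_lines = "0F1FF0102"
# A polygon lying strictly inside another polygon.
nested_polygons = "2FF1FF212"

print(relate(crossing_lines, "intersects"))  # True
print(relate(crossing_lines, "touches"))     # False
print(relate(nested_polygons, "within"))     # True
print(relate(nested_polygons, "contains"))   # False
```

The crosses and overlaps predicates are left out of this sketch because their masks depend on the dimensions of the two geometries.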

LineString intersects LineString
import cuspatial
import geopandas
import numpy as np
from shapely import LineString

# Three distinct left-hand linestrings.
left = geopandas.GeoSeries([
    LineString([(-4, 1), (-2, 1), (0, 1), (2, 1), (4, 1)]),
    LineString([(-4, 2), (-1, 2), (-1, -2), (1, -2), (1, 2), (4, 2)]),
    LineString([(0, 4), (0, -4)]),
])
# The same horizontal linestring along y = 0, repeated three times.
right = geopandas.GeoSeries([
    LineString([(-4, 0), (-2, 0), (0, 0), (2, 0), (4, 0)]),
    LineString([(-4, 0), (-2, 0), (0, 0), (2, 0), (4, 0)]),
    LineString([(-4, 0), (-2, 0), (0, 0), (2, 0), (4, 0)]),
])

# Randomly sample 10 million rows from each three-element series.
sample1 = np.random.randint(0, 3, 10_000_000)
sample2 = np.random.randint(0, 3, 10_000_000)

# CPU inputs (GeoPandas).
gpd_left_lines = left[sample1].reset_index(drop=True)
gpd_right_lines = right[sample2].reset_index(drop=True)

# GPU inputs (cuSpatial).
left_lines = cuspatial.from_geopandas(left)[sample1].reset_index(drop=True)
right_lines = cuspatial.from_geopandas(right)[sample2].reset_index(drop=True)

This example demonstrates a 96x speedup using cuSpatial for LineString-LineString intersection tasks on an AMD EPYC 7642 (48 core) CPU and an NVIDIA A100 GPU.

In [2]: %time gpu_result = left_lines.intersects(right_lines)
...:
CPU times: user 422 ms, sys: 59.4 ms, total: 482 ms
Wall time: 481 ms

In [3]: %time host_result = gpd_left_lines.intersects(gpd_right_lines)
CPU times: user 46.4 s, sys: 90.2 ms, total: 46.5 s
Wall time: 46.5 s

To get your wheels turning on ways to use cuSpatial and leverage the power of RAPIDS alongside it, we’ve updated the Hausdorff Clustering Notebook to use cuML, and the zip code stop sign comparison notebook now uses a cuDF-accelerated WKT parser that improves IO performance by up to 40x.

Just like with the 23.04 cuSpatial release, there’s too much to cover in this article, so we are planning another standalone cuSpatial 23.06 blog — stay tuned for more!

Nearest neighbors and more in RAFT

RAFT continues to provide important building blocks for machine learning and data analytics applications, and release 23.06 is no exception! pylibraft continues to expand, with 23.06 bringing RAFT’s powerful k-selection APIs to Python libraries.

Implicit, the popular library for collaborative filtering in Python, now officially uses RAFT’s C++ k-selection primitives and saw an immediate 25% speedup over its prior implementation. Stay tuned while Implicit continues to integrate RAFT primitives, such as brute-force nearest neighbors, in coming releases. Meta’s FAISS library, which has long been known as the standard for approximate nearest neighbors (ANN) algorithms on the GPU, has also been integrating RAFT’s C++ brute-force nearest neighbors (aka Flat index) and will soon be integrating RAFT’s C++ IVF-Flat, IVF-PQ, and CAGRA implementations.
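As a rough picture of the two operations named above, the following plain-Python sketch shows k-selection (picking the k smallest values along with their positions) and how brute-force nearest neighbors builds on it. This is a CPU illustration only and does not use RAFT or pylibraft; `select_k` and `brute_force_knn` are hypothetical helper names.

```python
import heapq
import math


def select_k(values, k):
    """Return the k smallest values as (value, index) pairs, ascending."""
    return heapq.nsmallest(k, ((v, i) for i, v in enumerate(values)))


def brute_force_knn(dataset, query, k):
    """Compute the distance to every row, then keep the k closest."""
    distances = [math.dist(query, row) for row in dataset]
    return select_k(distances, k)


dataset = [(0.0, 0.0), (3.0, 4.0), (1.0, 0.0), (0.0, 2.0)]
neighbors = brute_force_knn(dataset, (0.0, 0.0), k=2)
print(neighbors)  # [(0.0, 0), (1.0, 2)]
```

Brute-force search spends its time in two places, the distance computation and the selection step, so a faster k-selection primitive speeds up any algorithm built this way.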

Nearest neighbors aside, all of RAFT’s public APIs now accept the new raft::resources object, which provides API-level environment and dependency agnosticism. As an example, CUDA math libraries are no longer required dependencies for users who aren’t invoking APIs that need them.

Better memory utilization and performance in cuGraph

In release 23.06, cuGraph improved performance and expanded the list of algorithms that scale to Multi-Node Multi-GPU (MNMG) environments. The team focused on reducing the memory overhead of creating a graph in MNMG environments. The memory overhead (what cuGraph uses relative to the size of the input data) was reduced as follows:

  • Directed graph: from 5.9x to 3.07x
  • Undirected graph: from 12x to 6.14x

An undirected graph requires the data to be symmetrized, so its memory overhead is about 2x that of a directed graph.
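The factor of two can be seen with a small plain-Python sketch of symmetrization, where every edge (u, v) also gets its reverse (v, u). This illustrates the idea only and says nothing about how cuGraph stores graphs internally; `symmetrize` is a hypothetical helper.

```python
import math


def symmetrize(edges):
    """Return the edge list with each reverse edge added, without duplicates."""
    seen = set()
    result = []
    for src, dst in edges:
        for edge in ((src, dst), (dst, src)):
            if edge not in seen:
                seen.add(edge)
                result.append(edge)
    return result


edges = [(0, 1), (1, 2), (2, 3)]
symmetric_edges = symmetrize(edges)
growth = len(symmetric_edges) / len(edges)
print(symmetric_edges)
print(growth)  # 2.0

# The reported overheads follow the same ratio: 3.07x doubled is 6.14x.
print(math.isclose(3.07 * 2, 6.14))  # True
```

The ratio is exactly 2 only when the input has no self-loops and no edge already present in both directions, which is why the earlier figures (5.9x and 12x) are close to, but not exactly, a factor of two apart.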

By eliminating unneeded data copies, graph creation became between 1.7x and 4.2x faster, depending on the size of the graph.

For release 23.06, the cuGraph team is pleased to announce MNMG Leiden for community detection and clustering on large graphs. We also enhanced Node2Vec to support MNMG for GNN sampling on large graphs. Speaking of GNNs, we spent a lot of time optimizing and improving our sampling algorithms. The cugraph-pyg code was refactored so that the node loader now matches upstream PyG. Additionally, we released a blog discussing cugraph-pyg. Lastly, the RMAT generator was updated to support the creation of bipartite graphs.
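For context on the RMAT generator, here is a minimal plain-Python sketch of the general R-MAT technique: each edge is placed by recursively choosing one of four quadrants of the adjacency matrix with fixed probabilities. This is not cuGraph's generator and does not cover the new bipartite option; the function name `rmat_edges`, its parameters, and the default probabilities (0.57, 0.19, 0.19, with the remainder for the last quadrant) are assumptions chosen for illustration.

```python
import random


def rmat_edges(scale, num_edges, a=0.57, b=0.19, c=0.19, seed=0):
    """Generate num_edges (src, dst) pairs over 2**scale vertices."""
    rng = random.Random(seed)
    edges = []
    for _ in range(num_edges):
        src = dst = 0
        # Each level picks one quadrant and appends one bit to src and dst.
        for _ in range(scale):
            r = rng.random()
            src <<= 1
            dst <<= 1
            if r < a:
                pass          # top-left quadrant: both bits stay 0
            elif r < a + b:
                dst |= 1      # top-right quadrant
            elif r < a + b + c:
                src |= 1      # bottom-left quadrant
            else:
                src |= 1      # bottom-right quadrant
                dst |= 1
        edges.append((src, dst))
    return edges


sample_edges = rmat_edges(scale=4, num_edges=10, seed=1)
print(sample_edges)
```

Skewing the quadrant probabilities toward one corner is what gives R-MAT graphs their heavy-tailed degree distribution.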

Timezone support and performance boosts in cuDF

cuDF 23.06 introduces support for timezone-aware datetime types. You can use the tz_localize method to create timezone-aware data:

>>> import cudf
>>> dti = cudf.date_range("2001-01-01", freq="3h", periods=10)
>>> dti
DatetimeIndex(['2001-01-01 00:00:00', '2001-01-01 03:00:00',
               '2001-01-01 06:00:00', '2001-01-01 09:00:00',
               '2001-01-01 12:00:00', '2001-01-01 15:00:00',
               '2001-01-01 18:00:00', '2001-01-01 21:00:00',
               '2001-01-02 00:00:00', '2001-01-02 03:00:00'],
              dtype='datetime64[ns]')

>>> dti_tz = dti.tz_localize("America/New_York")
>>> dti_tz
DatetimeIndex(['2001-01-01 00:00:00-05:00', '2001-01-01 03:00:00-05:00',
               '2001-01-01 06:00:00-05:00', '2001-01-01 09:00:00-05:00',
               '2001-01-01 12:00:00-05:00', '2001-01-01 15:00:00-05:00',
               '2001-01-01 18:00:00-05:00', '2001-01-01 21:00:00-05:00',
               '2001-01-02 00:00:00-05:00', '2001-01-02 03:00:00-05:00'],
              dtype='datetime64[ns, America/New_York]')

To convert between timezones, use tz_convert:

>>> dti_tz.tz_convert("America/Chicago")
DatetimeIndex(['2000-12-31 23:00:00-06:00', '2001-01-01 02:00:00-06:00',
               '2001-01-01 05:00:00-06:00', '2001-01-01 08:00:00-06:00',
               '2001-01-01 11:00:00-06:00', '2001-01-01 14:00:00-06:00',
               '2001-01-01 17:00:00-06:00', '2001-01-01 20:00:00-06:00',
               '2001-01-01 23:00:00-06:00', '2001-01-02 02:00:00-06:00'],
              dtype='datetime64[ns, America/Chicago]')
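The difference between localizing and converting can be shown with the Python standard library alone. In this sketch, fixed UTC offsets of -5 and -6 hours stand in for America/New_York and America/Chicago in January; that is a simplifying assumption, since real timezones change offset with daylight saving time. The code does not use cuDF.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets standing in for the two zones in winter (assumption).
EST = timezone(timedelta(hours=-5), "EST")
CST = timezone(timedelta(hours=-6), "CST")

naive = datetime(2001, 1, 1, 0, 0)

# "Localize": keep the wall-clock time and attach an offset to it.
localized = naive.replace(tzinfo=EST)

# "Convert": keep the instant in time and change the wall-clock time.
converted = localized.astimezone(CST)

print(localized.isoformat())   # 2001-01-01T00:00:00-05:00
print(converted.isoformat())   # 2000-12-31T23:00:00-06:00
print(localized == converted)  # True, both are the same instant
```

The two printed timestamps match the first entries of the tz_localize and tz_convert outputs above.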

These functions are similar to their pandas counterparts but much faster, roughly 155x in the benchmark below (1.55 s versus 9.98 ms):

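import cudf
import pandas as pd

# Ten years of minute-resolution timestamps (about 5.26 million rows).
psr = pd.Series(pd.date_range("1970-01-01", "1980-01-01", freq="1T"))
sr = cudf.from_pandas(psr)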

In [6]: %timeit psr.dt.tz_localize("America/New_York", ambiguous="NaT", nonexistent="NaT")
1.55 s ± 9.91 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [7]: %timeit sr.dt.tz_localize("America/New_York", ambiguous="NaT", nonexistent="NaT")
9.98 ms ± 246 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

libcudf, the C++ library that powers cuDF, also gained several important performance improvements and new features in 23.06.

CUDA 12 pip packages now available!

Last but not least, we are proud to announce that RAPIDS pip packages supporting CUDA 12 are now available via the NVIDIA pip package index. Give them a try via the install picker or try the instructions below:

# Installs cudf, cuml, cugraph, and cuspatial; adjust packages as needed
pip install --extra-index-url=https://pypi.nvidia.com cudf-cu12 cuml-cu12 cugraph-cu12 cuspatial-cu12
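After installing, one quick way to see which of these packages Python can find is the standard library's `importlib.util.find_spec`, which locates a top-level module without importing it. This is a minimal sketch rather than an official RAPIDS check; `find_installed` is a hypothetical helper, and finding a module does not confirm that a compatible GPU or driver is present.

```python
import importlib.util

# Import names of the packages installed by the command above.
RAPIDS_MODULES = ("cudf", "cuml", "cugraph", "cuspatial")


def find_installed(modules):
    """Map each module name to True if Python can locate it."""
    return {name: importlib.util.find_spec(name) is not None for name in modules}


status = find_installed(RAPIDS_MODULES)
for name, found in status.items():
    print(f"{name}: {'found' if found else 'missing'}")
```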

Conclusion

We are proud of the updates that 23.06 brings, and hope that you will give it a try!
