10 Minutes to cuDF and CuPy

Nick Becker
Published in RAPIDS AI
Aug 1, 2019 · 2 min read

Echoing the theme of our RAPIDS 0.8 release post, we want to encourage interoperability and enable an ecosystem. RAPIDS libraries and the GPU PyData ecosystem are developing quickly. Yet, sometimes a library may not have the particular functionality you need. This is where it’s important to lean on the bridges between GPU libraries.

Clean handoffs between GPU PyData libraries are forming. For example, if you’re working with cuDF but need a more linear-algebra-oriented function that exists in CuPy, you can use that function directly, thanks to the interoperability of the GPU PyData ecosystem. Just as you can with NumPy and pandas, you can weave cuDF and CuPy together in the same workflow while keeping the data entirely on the GPU.

To help encourage this interoperability, we’ve released a new addition to our 10-minute notebook series called “10 Minutes to cuDF and CuPy”. This is an introductory notebook that explains how easy it is to transition between the two libraries if your workflow can benefit from it. In this tutorial, we show how the CUDA Array Interface and DLPack allow us to share our data between cuDF and CuPy in microseconds. This gives us near-instant access to the best of both libraries.
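To make the handoff concrete, here is a minimal sketch of a round trip between the two libraries. It is not taken from the notebook itself: the column names, the values, and the `roundtrip_demo` helper are made up for illustration, and it assumes reasonably recent cuDF and CuPy releases (older CuPy versions spell the DLPack import `cupy.fromDlpack`). The guard at the top is only there so the snippet does nothing, rather than fail, on a machine without a GPU.

```python
# Sketch only: the real work needs a CUDA GPU with cudf and cupy installed.
try:
    import cudf
    import cupy as cp
    GPU_READY = cp.cuda.runtime.getDeviceCount() > 0
except Exception:  # libraries or GPU not available
    cudf = cp = None
    GPU_READY = False


def roundtrip_demo():
    """Move data cuDF -> CuPy -> cuDF without leaving the GPU.

    Hypothetical helper for illustration; returns None when no GPU is usable.
    """
    if not GPU_READY:
        return None

    # Illustrative data; the column names "a" and "b" are arbitrary.
    df = cudf.DataFrame({"a": [1.0, 2.0, 3.0], "b": [4.0, 5.0, 6.0]})

    # cuDF Series -> CuPy array via the CUDA Array Interface.
    a = cp.asarray(df["a"])

    # Use a linear-algebra function from CuPy on the column.
    norm_a = float(cp.linalg.norm(a))

    # CuPy array -> cuDF Series.
    doubled = cudf.Series(a * 2)

    # Whole DataFrame -> 2-D CuPy array via DLPack.
    mat = cp.from_dlpack(df.to_dlpack())

    return norm_a, float(doubled.sum()), mat.shape


if __name__ == "__main__":
    print(roundtrip_demo())
```

Both paths hand over a pointer to memory that is already on the device, which is why the conversion cost stays so small compared with a round trip through host memory.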

How impactful is GPU-accelerated array processing? Turns out, for many operations on large arrays you can get a speedup of more than 100x using CuPy on the GPU compared to NumPy on the CPU. With that much horsepower at your fingertips for both dataframe and array-based workflows, cuDF and CuPy can fundamentally change the way data science is done and how you work.

Want to get started with RAPIDS and CuPy? Check out cuDF and CuPy on GitHub and let us know what you think! You can download pre-built Docker containers for our 0.8 release from NGC or Docker Hub to get started, or install it yourself via Conda. Need something even easier? You can quickly get started with RAPIDS in Google Colab and try out all the new things we’ve added with just a single push of a button.

Don’t want to wait for the next release to use upcoming features? You can download our nightly containers from Docker Hub or install via Conda to stay at the tip of our development branch.
