NVIDIA RAPIDS Accelerates Kubeflow Pipelines with GPUs on Kubernetes

Vartika Singh (NVIDIA), Jeffrey Tseng (PM for RAPIDS, NVIDIA), Pete MacKinnon (Red Hat), Abhishek Gupta (Google)

Data science workflows are complex, non-trivial to manage, and compute intensive. NVIDIA and the Kubeflow team are working to simplify these workflows and, at the same time, speed them up.

Today, we’re announcing the availability of the NVIDIA RAPIDS GPU-accelerated data science libraries as a container image for Kubeflow Pipelines.

Inherently complicated, data science pipelines span the iterative phases of ingestion, validation, training, deployment, and more. They scale across clusters of servers running software from different parts of the workflow, and they are often compute- and I/O-intensive. All this results in slow machine learning model development and deployment cycles.

The integration of RAPIDS with Kubeflow Pipelines streamlines the model development workflow and drastically decreases end-to-end model iteration times by automating the deployment of open, GPU-accelerated data science tools. By combining the simple orchestration of machine learning pipelines with RAPIDS, a collection of CUDA-accelerated libraries, data scientists can train and deploy machine learning pipelines significantly faster to solve business problems.

RAPIDS is an open-source data analytics and machine learning acceleration platform for executing end-to-end data science training pipelines entirely on GPUs. The RAPIDS framework lets you leverage GPU parallelism to run the complete end-to-end data science workflow at high speed.

RAPIDS exploits GPU parallelism and high-bandwidth memory through user-friendly Python interfaces, focusing on common data preparation tasks for analytics and data science. This includes a familiar DataFrame API that integrates with a variety of machine learning algorithms, enabling end-to-end pipeline acceleration without the typical serialization costs. RAPIDS also supports multi-node, multi-GPU deployments, enabling vastly accelerated processing and training on much larger datasets.
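To give a feel for the "familiar DataFrame API", here is a minimal sketch of a typical data preparation step. cuDF, the RAPIDS DataFrame library, mirrors the pandas API, so this example uses pandas (which runs anywhere); on a GPU machine you would swap the import for `import cudf as pd` and the same code would run largely unchanged. The sample data is hypothetical.

```python
# cuDF mirrors the pandas API; on a RAPIDS-enabled GPU node, replace
# this import with `import cudf as pd` to run the same code on the GPU.
import pandas as pd

# Hypothetical sample data: per-transaction amounts to aggregate by customer.
df = pd.DataFrame({
    "customer": ["a", "b", "a", "c", "b"],
    "amount": [10.0, 5.0, 7.5, 3.0, 12.5],
})

# A common data-preparation step: group-by aggregation.
totals = df.groupby("customer", as_index=False)["amount"].sum()
print(totals)
```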

RAPIDS on Kubeflow

Kubeflow lets users spawn Jupyter notebooks from pre-built or custom Jupyter runtime environments and deploy them to production as seamlessly as possible. A RAPIDS image that uses NVIDIA GPUs and the RAPIDS libraries on Kubeflow Pipelines shortens the time from ingestion to deployment.

[Figure: The new Jupyter spawner UI for Kubeflow, available in version 0.4.]

Users simply go to the Kubeflow JupyterHub spawner interface, select the appropriate container image, and specify the container's resource requirements, including any NVIDIA GPUs, e.g. {"nvidia.com/gpu": 2}. The spawner interface lists a few images by default in a dropdown, but also lets the user type in a path to an image. For the RAPIDS image, paste in gcr.io/kubeflow-dev/kubeflow-rapidsai-notebook:latest.
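For readers unfamiliar with how that GPU request lands in Kubernetes: the spawner ultimately produces a container spec whose resource limits include the `nvidia.com/gpu` extended resource, which the NVIDIA device plugin exposes on GPU nodes. A rough sketch of that fragment, using only a plain Python dict (the container name here is illustrative, not something the spawner mandates):

```python
import json

# Sketch of the container resource section generated when NVIDIA GPUs
# are requested. "nvidia.com/gpu" is the Kubernetes extended-resource
# name exposed by the NVIDIA device plugin; it must appear under limits.
container_spec = {
    "name": "rapids-notebook",  # illustrative name
    "image": "gcr.io/kubeflow-dev/kubeflow-rapidsai-notebook:latest",
    "resources": {
        "limits": {"nvidia.com/gpu": 2},  # schedule onto a node with 2 free GPUs
    },
}

print(json.dumps(container_spec, indent=2))
```

Because GPUs are requested as limits, the scheduler will only place the notebook pod on a node that can satisfy the full count.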

Once the notebook is ready, users can easily experiment with and develop data transformation and training code with RAPIDS.
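As a sketch of what that training step might look like inside the notebook: cuML, the RAPIDS machine learning library, follows the scikit-learn estimator API. The example below uses scikit-learn so it runs anywhere; on a RAPIDS GPU node you would swap `sklearn.linear_model` for `cuml.linear_model`. The model choice and data are illustrative assumptions, not something named in the original post.

```python
import numpy as np
# cuML follows the scikit-learn estimator API; on a GPU node, swap this
# for `from cuml.linear_model import LinearRegression`.
from sklearn.linear_model import LinearRegression

# Hypothetical noise-free data following y = 2x + 1.
X = np.arange(10, dtype=float).reshape(-1, 1)
y = 2.0 * X.ravel() + 1.0

# Fit a linear model; the estimator recovers the slope and intercept.
model = LinearRegression().fit(X, y)
print(float(model.coef_[0]), float(model.intercept_))
```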

With the availability of RAPIDS-based Jupyter images, end users can build and execute an accelerated, end-to-end data analytics and machine learning pipeline on Kubeflow and NVIDIA GPUs.

Learn more about Kubeflow and RAPIDS.