Using RAPIDS with Singularity

Mike Beaumont
Published in RAPIDS AI · 2 min read · Feb 26, 2019

By: Scott McMillan

RAPIDS is accelerating High Performance Computing (HPC). Containers make it easy to use RAPIDS for your HPC workload; however, the RAPIDS containers have only been available in NGC and Docker… until today. Since Docker is not typically available to users of shared HPC systems, many HPC systems rely on the Singularity container runtime instead. Singularity was developed to better satisfy the requirements of HPC users and system administrators, including the ability to run containers without superuser privileges.

Since Singularity was expressly designed for HPC use cases, its default behaviors differ in some respects from other container runtimes such as Docker. The RAPIDS containers were designed for Docker, so a few minor workarounds are currently necessary to use them with Singularity. Essentially, the workarounds are needed because the Singularity container image is read-only and the process namespace is not isolated from the host.

Several variations of the RAPIDS container are available to download; please choose the variant most appropriate for your needs. The example below uses the Ubuntu 16.04 and CUDA 9.2 runtime container image, but the workflow is the same regardless of the selected image. Singularity is typically installed by the system administrator, so it will already be present on many HPC systems. Singularity documentation can be found here.
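As a rough sketch of that workflow, the Python snippet below assembles the Singularity command lines for converting a Docker image into a local image file and then running it with GPU support. The `pull` and `run` subcommands, the `docker://` URI prefix, and the `--nv` and `--bind` options are standard in Singularity 3.x. The image reference, the output file name, the bind paths, and the helper names are placeholders chosen for illustration and are not taken from the original instructions. Binding a writable host directory into the container is one assumed way of coping with the read-only image.

```python
import shlex


def pull_command(docker_ref, sif_path):
    """Command that converts a Docker image into a local Singularity image file."""
    return ["singularity", "pull", sif_path, f"docker://{docker_ref}"]


def run_command(sif_path, binds=()):
    """Command that runs the image with NVIDIA GPU support (--nv).

    binds is a sequence of (host_dir, container_dir) pairs; each pair is
    bind-mounted so the container has a writable location.
    """
    cmd = ["singularity", "run", "--nv"]
    for host_dir, container_dir in binds:
        cmd += ["--bind", f"{host_dir}:{container_dir}"]
    cmd.append(sif_path)
    return cmd


def as_shell(cmd):
    """Render an argument list as a copy-pasteable shell line."""
    return " ".join(shlex.quote(arg) for arg in cmd)


# Placeholders (assumptions): substitute the image variant you selected.
IMAGE_REF = "rapidsai/rapidsai:TAG"
SIF_PATH = "rapids.sif"

print(as_shell(pull_command(IMAGE_REF, SIF_PATH)))
print(as_shell(run_command(SIF_PATH, binds=[("/path/on/host", "/path/in/container")])))
```

The printed lines can be pasted into a shell on the HPC system, or the argument lists can be passed directly to `subprocess.run`.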

First, please review the container host prerequisites below:

  • NVIDIA Pascal GPU architecture or better
  • CUDA 9.2 or 10.0 compatible NVIDIA driver
  • Singularity version 3.0 or later
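The prerequisites above can be checked with a short script. The following is a minimal sketch, not an official tool: it assumes that `singularity --version` prints a string containing a dotted version number (for example `singularity version 3.0.1`), and it relies on the fact that Pascal GPUs report CUDA compute capability 6.x. The function names are hypothetical, and the driver check is left out.

```python
import re
import subprocess

MIN_SINGULARITY = (3, 0)
MIN_COMPUTE_CAPABILITY = (6, 0)  # Pascal GPUs report compute capability 6.x


def parse_version(text):
    """Return (major, minor) from the first 'X.Y' found in text, or None."""
    match = re.search(r"(\d+)\.(\d+)", text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def meets_minimum(version, minimum):
    """True if version is known and is at least the required minimum."""
    return version is not None and version >= minimum


def installed_singularity_version():
    """Ask the local singularity binary for its version; None if not installed."""
    try:
        result = subprocess.run(
            ["singularity", "--version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
        )
    except FileNotFoundError:
        return None
    return parse_version(result.stdout)


def check_singularity():
    """Report whether the installed Singularity satisfies the 3.0 minimum."""
    version = installed_singularity_version()
    return meets_minimum(version, MIN_SINGULARITY)
```

The same `meets_minimum` helper can compare a GPU's compute capability, obtained by whatever means is available on your system, against `MIN_COMPUTE_CAPABILITY`.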

The included Jupyter notebooks are a great way to explore RAPIDS using Singularity.

  • /rapids/notebooks/cuml — cuML demo notebooks. These notebooks have data pre-loaded in the container image.
  • /rapids/notebooks/mortgage — cuDF, Dask, XGBoost demo notebook. This notebook requires downloading the Mortgage Data; see notebook E2E.ipynb for more details. In cell 3, modify the following line of code to add the “local_dir” parameter:
cluster = LocalCUDACluster(ip=IPADDR, local_dir='/tmp/dask_cuda')
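The `local_dir` change is presumably needed because Dask workers write scratch files and the container image itself is read-only. If `/tmp` is not suitable on your system, a small helper can pick a writable location instead. This is a sketch under that assumption; the helper names and the candidate list are hypothetical and not part of RAPIDS or Dask.

```python
import os
import tempfile


def first_writable_dir(candidates):
    """Return the first existing, writable directory in candidates, else None."""
    for path in candidates:
        if path and os.path.isdir(path) and os.access(path, os.W_OK):
            return path
    return None


def dask_scratch_dir(subdir="dask_cuda"):
    """Create and return a writable scratch directory for Dask worker files."""
    candidates = [os.environ.get("TMPDIR"), "/tmp", tempfile.gettempdir()]
    base = first_writable_dir(candidates)
    if base is None:
        raise RuntimeError("no writable scratch directory found")
    path = os.path.join(base, subdir)
    os.makedirs(path, exist_ok=True)
    return path


# In the notebook, the result could then be passed along, for example:
# cluster = LocalCUDACluster(ip=IPADDR, local_dir=dask_scratch_dir())
```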

Get started with RAPIDS here: https://rapids.ai/start.html.

The RAPIDS GitHub repository can be found at https://github.com/rapidsai.

Read more about Singularity here: https://www.sylabs.io/singularity/.

I hope that using RAPIDS with Singularity makes it easy to pull the RAPIDS container and accelerate your HPC workloads.
