CuPy v9 is here.

Kenichi Maehashi
Published in CuPy
Apr 22, 2021 · 3 min read

We are excited to announce the availability of CuPy v9.0.0. This release is the result of seven months of development and includes a CUDA JIT to transpile Python code to CUDA, support for NVIDIA cuSPARSELt, AMD ROCm support through binary packages, and more.

See here for the full release note.

CuPy v9 Highlights

JIT API

CuPy v9 introduces the JIT API, which allows you to define CUDA kernels in Python code. Here is a sketch of how such a kernel can look with the `cupyx.jit` interface (a single-block sum reduction using shared memory):
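```python
import cupy
from cupyx import jit


@jit.rawkernel()
def reduction(x, y, size):
    # Each thread accumulates a strided partial sum of the input.
    tid = jit.threadIdx.x
    ntid = jit.blockDim.x

    value = cupy.float32(0)
    for i in range(tid, size, ntid):
        value += x[i]

    # Store per-thread partial sums in shared memory, then let thread 0
    # combine them into the final result.
    smem = jit.shared_memory(cupy.float32, 1024)
    smem[tid] = value
    jit.syncthreads()

    if tid == cupy.uint32(0):
        value = cupy.float32(0)
        for i in range(ntid):
            value += smem[i]
        y[0] = value
```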

This is equivalent to the reduction code shown in NVIDIA’s slide deck (page 7). You can then execute the kernel by passing the grid size, block size, and arguments, for example like this:
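```python
size = cupy.uint32(2 ** 22)
x = cupy.random.normal(size=(size,), dtype=cupy.float32)
y = cupy.empty((1,), dtype=cupy.float32)

# Launch one block of 1024 threads: (grid, block, args).
reduction((1,), (1024,), (x, y, size))

print(y[0], x.sum())  # the two values agree up to float32 rounding
```

The grid and block sizes here are illustrative; this sketch assumes the block size does not exceed the 1024-element shared-memory buffer allocated in the kernel.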

The JIT API parses the Python code into an AST and transpiles it to CUDA code on the fly, depending on the dtypes of the input arrays, so kernels compiled through this API can achieve the same level of performance as CUDA kernels written directly in C/C++. You can find the full Python code and the transpiled CUDA code here.

Refer to the User Guide for more information about the JIT API.

NVIDIA cuSPARSELt Support

CuPy now integrates the Python binding for the cuSPARSELt library that accelerates sparse matrix multiplications on NVIDIA Ampere GPUs. We are planning to start using it in CuPy sparse APIs to transparently improve performance.

AMD ROCm Binary Packages

Support for the AMD ROCm platform has been significantly improved in CuPy v9. You can now install CuPy for ROCm 4.0 via binary packages (`pip install cupy-rocm-4-0`). Docker images are also available. See Using CuPy on AMD GPU for detailed instructions.
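Once installed, the same NumPy-compatible code runs unchanged on AMD GPUs. As a quick sketch (assuming the `cupy.cuda.runtime.is_hip` flag, which CuPy uses internally to distinguish the CUDA and HIP backends), you can confirm which backend is active like this:

```python
import cupy

# On a cupy-rocm-* package, CuPy runs on top of the HIP runtime.
print(cupy.cuda.runtime.is_hip)  # expected to be True on a ROCm build

# The rest of the API is unchanged.
x = cupy.arange(10, dtype=cupy.float32)
print((x ** 2).sum())
```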

New Random Generator API

CuPy v9 implements the new Random Generator API introduced in NumPy v1.17.

Since our implementation is based on cuRAND, we currently support the following BitGenerator objects: XORWOW (default), MRG32k3a, and Philox4x3210. Note that these differ from the BitGenerators available in NumPy.
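Usage mirrors NumPy. Here is a minimal sketch, assuming `standard_normal` is among the distributions already available in this release:

```python
import cupy

# default_rng returns a Generator backed by the default BitGenerator
# (XORWOW on CuPy), seeded for reproducibility.
rng = cupy.random.default_rng(42)
x = rng.standard_normal(size=(1000,))
print(x.mean(), x.std())
```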

The new random module is currently in development and more distributions will be added in later releases.

Refined Documentation

Documentation has also been improved in this release.

  • The documentation website is now powered by pydata-sphinx-theme, a cleaner theme widely adopted by the community.
  • The “Tutorial” chapter has been reorganized into the User Guide so that we can cover more general topics beyond tutorials.
  • The API Reference has been reorganized to follow the structure of the latest NumPy/SciPy API references to improve the browsing experience.

Dropping Support for Python 3.5 and CUDA 9.0

Support for Python 3.5 and CUDA 9.0 has been removed in CuPy v9. Use Python 3.6+ and CUDA 9.2+. Please refer to the Upgrade Guide for more information about the dependency changes.

Other Enhancements

Besides the major points presented above, there are several minor features in this release that we would like to briefly highlight.

  • CuPy has started to support Generalized Universal Functions in its codebase. This allows `cupy.matmul` to be invoked through NumPy’s ufunc dispatcher (see the sketch after this list).
  • Several new functions have been added to `cupyx.scipy.ndimage`; these became the basis for NVIDIA’s newly released cuCIM library.
  • Performance improvements by extending the use of cuTENSOR to routines such as prod/min/max/mean.
  • `cupyx.lapack` is now available as a public interface to cuBLAS.
  • Coverage of `cupyx.scipy.sparse` has been greatly improved.
  • Smaller binary packages by omitting cuDNN and NCCL libraries from the distribution. You can easily install these libraries via CuPy’s built-in tool after installing CuPy (`python -m cupyx.tools.install_library`).
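As an illustration of the ufunc dispatch mentioned above, the following sketch relies on NumPy’s `__array_ufunc__` protocol: calling `numpy.matmul` on CuPy arrays dispatches to `cupy.matmul`, and the result stays on the GPU.

```python
import numpy
import cupy

a = cupy.random.random((4, 3))
b = cupy.random.random((3, 5))

# numpy.matmul is a generalized ufunc; with CuPy arrays as inputs the call
# is dispatched to cupy.matmul and the result is a cupy.ndarray.
c = numpy.matmul(a, b)
print(type(c), c.shape)
```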

Acknowledgments

CuPy v9 features numerous contributions from the community. We would like to thank everyone involved in CuPy development, including NVIDIA Corporation and the U.S. Department of Energy Brookhaven National Laboratory’s CSI-HPC group.

We are also happy to welcome developers from AMD to the community. We believe this will boost the development of ROCm support in CuPy.

Join the CuPy community!

Our community is open to everyone interested in CuPy. Anyone is welcome to join, ask for help, and contribute! Join Gitter to talk with developers and other users, open an issue for feature requests or feedback, and submit a pull request to contribute to the code and documentation.

Finally, don’t forget to watch and star the repository on GitHub and follow us on Twitter and Medium to stay updated on the latest news!

Kenichi Maehashi

Working on CuPy & PyTorch development in the Deep Learning Ecosystem Team at Preferred Networks