Vladimir Iglovikov
3 min read · Mar 26, 2024
Relative performance of Image Augmentation libraries

The plot shows the relative performance of popular image augmentation libraries under the following conditions:

  1. CPU Utilization: The benchmark utilized a single CPU core from an AMD Ryzen Threadripper 3970X.
  2. Image Format: Images were RGB, stored as three-channel uint8 arrays. The first 2000 images from ImageNet were used. The benchmark was run five times, shuffling the order in which the libraries were applied each time.
  3. Processing: Each image was processed individually.
  4. Transforms: A limited set of 20 transformations was employed.

This setup aims to replicate the most common scenario in computer vision model training, where each CPU core processes a separate image in parallel.
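The measurement loop described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the benchmark's actual code: the images are small synthetic arrays rather than ImageNet files, the two "libraries" (`flip_by_slicing`, `flip_by_numpy`) are hypothetical stand-ins, and the thread-limiting environment variables are one common way to approximate a single-core run.

```python
import os

# Assumption: capping these thread pools approximates the single-core setup.
# They generally need to be set before NumPy is first imported to take effect.
os.environ.setdefault("OMP_NUM_THREADS", "1")
os.environ.setdefault("OPENBLAS_NUM_THREADS", "1")
os.environ.setdefault("MKL_NUM_THREADS", "1")

import random
import time

import numpy as np


def make_images(n, height=64, width=64, seed=0):
    """Synthetic stand-ins for real images: RGB, uint8, shape (H, W, 3)."""
    rng = np.random.default_rng(seed)
    return [
        rng.integers(0, 256, size=(height, width, 3), dtype=np.uint8)
        for _ in range(n)
    ]


# Hypothetical "libraries": each maps one image to one augmented image.
def flip_by_slicing(image):
    return np.ascontiguousarray(image[:, ::-1])


def flip_by_numpy(image):
    return np.fliplr(image).copy()


def images_per_second(transform, images):
    """Apply the transform to each image individually and return throughput."""
    start = time.perf_counter()
    for image in images:
        transform(image)
    elapsed = time.perf_counter() - start
    return len(images) / elapsed


def run_benchmark(candidates, images, n_runs=5, seed=0):
    """Repeat the measurement n_runs times, shuffling candidate order each run."""
    order_rng = random.Random(seed)
    results = {name: [] for name in candidates}
    for _ in range(n_runs):
        names = list(candidates)
        order_rng.shuffle(names)
        for name in names:
            results[name].append(images_per_second(candidates[name], images))
    return results


images = make_images(200)
results = run_benchmark(
    {"slicing": flip_by_slicing, "fliplr": flip_by_numpy}, images
)
for name, runs in results.items():
    print(f"{name}: {np.median(runs):.0f} images/s (median of {len(runs)} runs)")
```

Shuffling the order on every run helps keep cache warm-up or thermal effects from consistently favoring whichever candidate happens to run first.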

It’s important to note that altering any of these conditions, such as using multiple CPU cores per image, switching to GPU processing, or moving from uint8 to float32 images, could change the relative standings. Future iterations of this benchmark will add float32 images as well as evaluations for bounding boxes, keypoints, and segmentation masks.

The latest version of this benchmark can always be found at LINK.

For batch processing of images on the GPU, or for differentiable augmentations, we recommend Kornia, which specializes in these areas.

The primary purpose of this benchmark is not to highlight the speed of Albumentations but to identify opportunities for optimization. The table below shows that transformations such as Rotate, ShiftRGB, JpegCompression, and GaussianNoise could, and therefore should, be optimized.

We hope the developers of other libraries will also use this benchmark as a profiling tool, resulting in faster versions of all libraries.

If the benchmark setup contains any code errors or inaccuracies, please report them via an issue at LINK.

The code for the benchmark is available at LINK.

Results

Libraries

  • Albumentations: 1.4.1
  • ImgAug: 0.4.0
  • torchvision: 0.17.1+rocm5.7
  • pillow: 0.7.2
  • Augly: 1.00

Images per second. Higher is better.

To create the plot in the header, we divided each row of the table by its maximum value and then averaged each column.
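As a rough illustration of that normalization, here is a minimal sketch. It assumes rows are transforms and columns are libraries; the numbers and the library names `lib_a`, `lib_b`, and `lib_c` are made up for the example and are not benchmark results.

```python
import numpy as np

# Illustrative numbers only, NOT the benchmark's results.
# Rows are transforms, columns are libraries; values are images per second.
libraries = ["lib_a", "lib_b", "lib_c"]
throughput = np.array([
    [1000.0, 500.0, 250.0],  # transform 1
    [200.0, 400.0, 100.0],   # transform 2
    [300.0, 150.0, 150.0],   # transform 3
])

# Divide each row by its maximum, so the fastest library
# for a given transform scores exactly 1.0.
relative = throughput / throughput.max(axis=1, keepdims=True)

# Average each column to get one relative score per library.
scores = relative.mean(axis=0)

for name, score in zip(libraries, scores):
    print(f"{name}: {score:.3f}")
```

Normalizing per row first keeps a single very fast transform from dominating the average, so every transform contributes equally to a library's score.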

Conclusions

If you preprocess images on the CPU and notice low GPU utilization (which you can monitor with nvtop) alongside high CPU utilization (observable through htop), switching to Albumentations could significantly improve performance.

All libraries, including Albumentations, still have room to improve performance.

Request

  1. If you have expertise with any of these libraries, we encourage you to review the benchmark code for potential biases or “unfair” practices.
  2. If you know of transforms that are shared by at least two of the libraries but are not yet included in this benchmark, we strongly encourage you to create a Pull Request that adds them. Alternatively, you can leave a comment or message Vladimir Iglovikov directly.