Albumentations Outpaces torchvision, Keras, and imgaug: A Drastic Leap in Image Augmentation Speed

Vladimir Iglovikov
4 min read · Feb 28, 2024


Photo by Sabri Tuzcu on Unsplash

In machine learning, specifically for computer vision tasks, the speed at which images are processed isn’t just a tech issue — it’s also a cost issue. If your image loading is sluggish, it can slow down batch preparation to the point where your GPU is just waiting around. Here’s how that hits your wallet:

  1. Longer training times: In the cloud, you pay for GPU usage by the hour. If your batches are slow to prepare, you’re burning through cash without actually training the model.
  2. Slower iterations: More time per iteration means slower progress. It’s not just about patience; it’s about efficiency and cost.
  3. Underutilized talent: While your GPUs idle, so do your machine learning engineers and researchers. Their time is expensive; if they’re waiting on slow model training, that’s money down the drain.

So, speedy image processing isn’t a luxury — it’s essential to keeping costs down and productivity up.

This article compares the performance of the leading image augmentation libraries to help you optimize your machine learning pipelines.

For those interested in running the analysis themselves, here is the link to the GitHub repository with all the necessary benchmarking code.

Contents:

  1. Introduction
  2. Benchmark Setup
  3. Results
  4. Conclusions and Recommendations
  5. Running the Benchmark Yourself

Introduction

Low GPU utilization is a tricky beast in the world of machine learning and can stem from a myriad of issues, each with its own solutions. In a previous article, we compared different JPEG decoding libraries.

But today, our focus shifts to a particularly impactful scenario: image augmentation.

Why focus on this? Unlike hardware upgrades, which can be costly and time-consuming, switching the augmentation library is a quick fix: a few lines of code.

The real question is: with all the libraries available, how do they stack up against each other in terms of performance?

Benchmark Setup

Libraries and versions

For our evaluation, we selected the following image augmentation libraries:

Hardware Setup

Tests were performed on an AMD Ryzen Threadripper 3970X 32-core Processor.

Benchmark Methodology

We utilized the first 2000 images from the ImageNet validation dataset for this evaluation. All outputs were standardized to contiguous NumPy arrays with the “np.uint8” data type.

Each library was assessed on a single core to gauge its standalone performance, eliminating any advantages from multi-threading or GPU support.

Our evaluation followed these key steps (a minimal sketch of the measurement loop appears after the list):

  1. Pre-loading Images: We loaded all the images into memory before testing. This removes any differences that might arise from disk read speeds. For libraries like torchvision and Augmentor, we used PIL images; for all the other libraries, we used NumPy arrays.
  2. Warm-up Phase: We started with a warm-up run for each library. This helps get everything running smoothly by finishing up any initial setup or caching that needs to happen first.
  3. Measurement Runs: We ran each library five times to ensure our results were solid and smooth out any random ups and downs. Also, we mixed up the order in which we tested the libraries each time to keep things fair and avoid any order-based bias.
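To make the procedure concrete, here is a minimal sketch of the measurement loop, assuming a hypothetical transform callable and a list of pre-loaded NumPy images; the actual code in the benchmark repository is more elaborate.

```python
import time

import numpy as np


def benchmark(transform, images, n_runs=5, warmup=10):
    """Measure throughput (images/second) of a single augmentation callable.

    `transform` maps one uint8 NumPy image to an augmented image;
    `images` is a list of images pre-loaded into memory.
    """
    # Warm-up phase: let the library finish any lazy initialization or caching.
    for img in images[:warmup]:
        transform(img)

    throughputs = []
    for _ in range(n_runs):
        start = time.perf_counter()
        for img in images:
            out = transform(img)
            # Standardize the output: contiguous NumPy array with np.uint8 dtype.
            out = np.ascontiguousarray(out, dtype=np.uint8)
        elapsed = time.perf_counter() - start
        throughputs.append(len(images) / elapsed)

    # Report the median over runs to smooth out random fluctuations.
    return float(np.median(throughputs))
```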

Results

Source: https://github.com/albumentations-team/albumentations

The numbers in the table show how many images each library can handle in a second — the higher, the better.

Conclusions and Recommendations

Evaluating System Utilization

If GPU utilization during training is low (which can be monitored with tools such as nvidia-smi or nvtop) while CPU usage is high, typically at 100% (observable with tools like htop), this suggests that image augmentation is the bottleneck.

Optimizing with Albumentations

Our benchmarks indicate that Albumentations outperforms other libraries in CPU-based image processing speeds. If you’re facing bottlenecks in your augmentation pipeline, switching to Albumentations could significantly improve performance.

Even if your current setup presents a manageable bottleneck, adopting Albumentations might be beneficial due to its operational efficiency and straightforward integration process. For detailed guidance, refer to transition examples from torchvision and Keras to Albumentations.
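As an illustration of how small the change usually is, below is a hedged sketch of a basic torchvision pipeline and a roughly equivalent Albumentations one; transform names and argument conventions vary slightly between library versions, so check the current documentation.

```python
import albumentations as A
from albumentations.pytorch import ToTensorV2
from torchvision import transforms

# torchvision: operates on PIL images.
torchvision_pipeline = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.ToTensor(),
])
# tensor = torchvision_pipeline(pil_image)

# Albumentations: operates on NumPy arrays (H, W, C), dtype uint8.
albumentations_pipeline = A.Compose([
    A.RandomResizedCrop(height=224, width=224),  # newer versions use size=(224, 224)
    A.HorizontalFlip(p=0.5),
    ToTensorV2(),
])
# tensor = albumentations_pipeline(image=numpy_image)["image"]
```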

Exploring GPU-based Augmentations with Kornia

While Albumentations sets the pace for CPU-based image augmentation, Kornia shines in environments leveraging GPU acceleration. For projects that implement uniform transformations across large batches or specifically require GPU-enhanced processing, Kornia could provide the necessary efficiency boost. Though this GPU-focused approach may not be universally applicable, it proves highly effective in certain scenarios. Consider integrating Kornia for GPU-driven augmentation tasks here.

Source: https://kornia.readthedocs.io/en/latest/augmentation.html#benchmark
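For a rough idea of what batched GPU augmentation looks like, here is a minimal Kornia sketch; it assumes a CUDA device is available and uses only a couple of transforms for illustration.

```python
import torch
import kornia.augmentation as K

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Kornia augmentations are nn.Modules that act on whole (B, C, H, W) batches.
augment = torch.nn.Sequential(
    K.RandomHorizontalFlip(p=0.5),
    K.ColorJitter(brightness=0.2, contrast=0.2, p=0.8),
).to(device)

# A batch of 64 random "images" as float tensors in [0, 1], already on the GPU.
batch = torch.rand(64, 3, 224, 224, device=device)
augmented = augment(batch)  # the entire batch is transformed on the GPU
```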

Running the Benchmark Yourself

For hands-on results, or to test different image sets and hardware configurations, access the benchmark code here. Running it yourself can provide insights tailored to your specific setup.

Engagement

If you found this analysis helpful, consider supporting it with claps on Medium, which allows up to 50 per article.

Additionally, if the benchmark code aids your project, a “star” on the repository would be greatly appreciated.

I’m open to connecting on various platforms:

Your feedback and connections are highly valued.
