Boosting image processing performance, from ImageMagick to Libvips

Context

Dimitri Bouron
Sep 30, 2019 · 5 min read

Criteo displays billions of personalized ads per day. Thanks to our dynamic creative generation pipeline, Criteo creates optimized banners featuring a set of recommended products the user is interested in.

Each day, billions of product images are displayed across the Web. To handle this traffic, we developed our own Content Delivery Network (CDN). This application stores, serves and processes our customers' product images.

This service handles up to 170k queries per second worldwide, at peak time. We process more than 18k images per second, applying some basic operations like resizing, cropping or padding. These operations have a certain CPU cost and a significant impact on request latency.

Motivations

Until now, our application used a Java wrapper around ImageMagick for image-related computations. We observed a lack of concurrency, and processing PNGs was a bottleneck with ImageMagick: refreshing a customer catalog containing many PNGs triggered a significant peak in CPU utilization. This performance was frustrating, given that our context requires images to be processed as fast as possible.

Libvips is an image processing library, first written in 1990, designed around concurrency and low memory usage. We found benchmarks demonstrating good scalability along with impressive speed and memory usage compared to other libraries, with Libvips reported to be up to 5.5 times faster than ImageMagick. Another advantage of Libvips is its proactive community: the maintainers continuously experiment with new implementations to improve performance (e.g., libspng). We therefore decided to test this library in our application.

Proof of concept

We started by implementing a tiny Java wrapper that uses the Java Native Interface (JNI) to call a few Libvips functions chosen for our use case. We tested the wrapper on product images in a realistic scenario to measure its performance against ImageMagick, as well as the overhead of JNI.

First, we ran a simple benchmark that read an input image, resized it, and wrote the result back into a memory buffer. The results were promising:

  • Up to 4x faster on JPEGs with 1 thread, with a slight improvement on 4k PNGs
  • Up to 8x faster on JPEGs and 2.3x faster on PNGs with 8 threads

Performance comparison between Libvips and ImageMagick
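
The benchmark loop described above (read an input, resize it, write it back into a memory buffer, with 1 or 8 threads) could be sketched in plain Java as follows. This is not the JVips or ImageMagick code we measured: it uses the JDK's javax.imageio as a stand-in for the native calls, and the class name ResizeBenchmark, the helper syntheticPng, the 300x300 target size and the 64 tasks are all assumptions made for illustration.

```java
import javax.imageio.ImageIO;
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ResizeBenchmark {

    /** Decodes an image from memory, resizes it, and encodes it back into a memory buffer. */
    static byte[] resize(byte[] input, int width, int height, String format) throws IOException {
        BufferedImage source = ImageIO.read(new ByteArrayInputStream(input));
        if (source == null) {
            throw new IOException("unsupported input image");
        }
        BufferedImage target = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = target.createGraphics();
        try {
            g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                    RenderingHints.VALUE_INTERPOLATION_BILINEAR);
            g.drawImage(source, 0, 0, width, height, null);
        } finally {
            g.dispose();
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (!ImageIO.write(target, format, out)) {
            throw new IOException("no writer for format " + format);
        }
        return out.toByteArray();
    }

    /** Runs `tasks` resize operations on `threads` workers and returns the elapsed time in ms. */
    static long run(byte[] input, int threads, int tasks) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            Callable<byte[]> task = () -> resize(input, 300, 300, "png");
            long start = System.nanoTime();
            List<Future<byte[]>> results = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                results.add(pool.submit(task));
            }
            for (Future<byte[]> result : results) {
                result.get(); // wait for completion and surface any failure
            }
            return (System.nanoTime() - start) / 1_000_000;
        } finally {
            pool.shutdown();
        }
    }

    /** Builds a synthetic gradient PNG so the sketch needs no input file (hypothetical helper). */
    static byte[] syntheticPng(int size) throws IOException {
        BufferedImage image = new BufferedImage(size, size, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < size; y++) {
            for (int x = 0; x < size; x++) {
                image.setRGB(x, y, ((x & 0xFF) << 16) | ((y & 0xFF) << 8) | ((x + y) & 0xFF));
            }
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        ImageIO.write(image, "png", out);
        return out.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        byte[] input = syntheticPng(1024);
        for (int threads : new int[] {1, 8}) {
            System.out.println(threads + " thread(s): " + run(input, threads, 64) + " ms");
        }
    }
}
```

A serious comparison would add warm-up iterations and repeat each run several times. The sketch only shows the shape of the measurement.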

Testing in production

Following these promising results, we decided to test Libvips with real traffic. We deployed two applications in production for three months: a reference running ImageMagick and a test application running Libvips.

Thanks to the high throughput of processed images, we discovered and helped fix two issues in Libvips: a segmentation fault when saving a PNG image, and a memory leak. CMYK to sRGB color space conversion was also missing. We reported the issue and implemented a straightforward conversion algorithm in Libvips. The maintainer merged a complementary patch with a different approach: falling back to a default ICC profile.

We significantly reduced image processing time:

  • up to 2.3x faster (mean)
  • up to 1.33x faster (50th percentile)
  • up to 2x faster (99th percentile)
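
To make these three figures concrete, here is a minimal sketch of how mean, median and 99th percentile speedups can be derived from two sets of latency samples. The class LatencyStats, the nearest-rank percentile method and the sample values in main are assumptions for illustration only. They are not our measurement code or our production data.

```java
import java.util.Arrays;

final class LatencyStats {

    /** Arithmetic mean of the samples. */
    static double mean(double[] samples) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        double sum = 0;
        for (double s : samples) {
            sum += s;
        }
        return sum / samples.length;
    }

    /** Nearest-rank percentile: the smallest sample such that p% of samples are <= it. */
    static double percentile(double[] samples, double p) {
        if (samples.length == 0 || p <= 0 || p > 100) {
            throw new IllegalArgumentException("need samples and 0 < p <= 100");
        }
        double[] sorted = samples.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[rank - 1];
    }

    /** "N times faster" expressed as the ratio of the old time to the new time. */
    static double speedup(double before, double after) {
        return before / after;
    }

    public static void main(String[] args) {
        // Hypothetical processing times in milliseconds, not measured values.
        double[] reference = {4, 5, 6, 7, 40};
        double[] candidate = {3, 4, 5, 6, 12};
        System.out.println("mean: " + speedup(mean(reference), mean(candidate)) + "x");
        System.out.println("p50:  " + speedup(percentile(reference, 50), percentile(candidate, 50)) + "x");
        System.out.println("p99:  " + speedup(percentile(reference, 99), percentile(candidate, 99)) + "x");
    }
}
```

The mean is sensitive to a few very slow images, which is why its speedup can be larger than the median's when a library mostly improves the worst cases.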

In the meantime, one of our automated tests reported that 8% of our banners were too heavy. The test fails when the banner weight exceeds 800 KB for widths below 970 pixels, or 1200 KB for widths of 970 pixels and above. The 10 heaviest banners were all caused by oversized PNGs. To solve this problem, we enabled color quantization in Libvips. Color quantization reduces the colors of an image to a palette of 256 colors. It drastically reduces PNG weight (and thus the size of our banners) without noticeably altering image quality.
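
The failure rule of this test is simple enough to write down. The sketch below is an illustration rather than our actual test code: the class name BannerWeightCheck and its methods are hypothetical, and it assumes that 1 KB means 1000 bytes.

```java
final class BannerWeightCheck {

    // Assumption: 1 KB = 1000 bytes. Use 1024 if the limits are meant in KiB.
    static final long NARROW_LIMIT_BYTES = 800L * 1000;
    static final long WIDE_LIMIT_BYTES = 1200L * 1000;
    static final int WIDE_BANNER_MIN_WIDTH_PX = 970;

    /** Returns true when a banner exceeds the weight limit for its width. */
    static boolean isTooHeavy(int widthPx, long weightBytes) {
        long limit = widthPx < WIDE_BANNER_MIN_WIDTH_PX ? NARROW_LIMIT_BYTES : WIDE_LIMIT_BYTES;
        return weightBytes > limit;
    }

    /** Fraction of banners failing the check, given parallel arrays of widths and weights. */
    static double failureRate(int[] widthsPx, long[] weightsBytes) {
        if (widthsPx.length != weightsBytes.length || widthsPx.length == 0) {
            throw new IllegalArgumentException("arrays must be non-empty and of equal length");
        }
        int failures = 0;
        for (int i = 0; i < widthsPx.length; i++) {
            if (isTooHeavy(widthsPx[i], weightsBytes[i])) {
                failures++;
            }
        }
        return (double) failures / widthsPx.length;
    }

    public static void main(String[] args) {
        // Hypothetical banners: (width in pixels, weight in bytes).
        int[] widths = {300, 728, 970, 970};
        long[] weights = {250_000, 900_000, 900_000, 1_500_000};
        System.out.println("failure rate: " + failureRate(widths, weights));
    }
}
```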

Original image
256-colors image
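
To illustrate what a 256-color reduction means, here is a deliberately naive sketch in plain Java. It maps every pixel to a fixed palette (3 bits of red, 3 bits of green, 2 bits of blue) and stores the result as an indexed image. This is an assumption-laden stand-in: production quantizers, including the one enabled through Libvips, choose a palette adapted to each image and give far better visual results. The class name UniformQuantizer and its methods are hypothetical.

```java
import javax.imageio.ImageIO;
import java.awt.image.BufferedImage;
import java.awt.image.IndexColorModel;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

final class UniformQuantizer {

    /** Maps a 24-bit RGB value to an index in the fixed 3-3-2 palette (0..255). */
    static int paletteIndex(int rgb) {
        int r = (rgb >> 16) & 0xFF;
        int g = (rgb >> 8) & 0xFF;
        int b = rgb & 0xFF;
        return ((r >> 5) << 5) | ((g >> 5) << 2) | (b >> 6);
    }

    /** Builds the 256-entry palette matching paletteIndex. */
    static IndexColorModel palette() {
        byte[] r = new byte[256];
        byte[] g = new byte[256];
        byte[] b = new byte[256];
        for (int i = 0; i < 256; i++) {
            r[i] = (byte) (((i >> 5) & 0x7) * 255 / 7);
            g[i] = (byte) (((i >> 2) & 0x7) * 255 / 7);
            b[i] = (byte) ((i & 0x3) * 255 / 3);
        }
        return new IndexColorModel(8, 256, r, g, b);
    }

    /** Returns an indexed copy of the source image using at most 256 colors. */
    static BufferedImage quantize(BufferedImage source) {
        BufferedImage out = new BufferedImage(source.getWidth(), source.getHeight(),
                BufferedImage.TYPE_BYTE_INDEXED, palette());
        for (int y = 0; y < source.getHeight(); y++) {
            for (int x = 0; x < source.getWidth(); x++) {
                out.getRaster().setSample(x, y, 0, paletteIndex(source.getRGB(x, y)));
            }
        }
        return out;
    }

    static int pngSize(BufferedImage image) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        ImageIO.write(image, "png", buffer);
        return buffer.size();
    }

    public static void main(String[] args) throws IOException {
        // Synthetic gradient standing in for a product image.
        BufferedImage source = new BufferedImage(256, 256, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < 256; y++) {
            for (int x = 0; x < 256; x++) {
                source.setRGB(x, y, (x << 16) | (y << 8) | ((x + y) & 0xFF));
            }
        }
        System.out.println("truecolor PNG bytes: " + pngSize(source));
        System.out.println("indexed PNG bytes:   " + pngSize(quantize(source)));
    }
}
```

An indexed image stores one palette index per pixel instead of three color channels, and the reduced number of distinct values also tends to compress better, which is where the weight reduction comes from.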

Result summary

Our application now processes images 2.3 times faster on average. Operations on PNG images are clearly faster than they were with ImageMagick, which allows us to enable color quantization despite its high CPU cost.

Libvips rollout in production
Scaling Libvips thread pool size from one to two threads

Reducing banner weight with color quantization increased the number of clicked banners (+0.5% click-through rate) for clients with heavy images. On average, this improvement divides the weight of a customer image by eighteen, which is three times better than our previous implementation.

Image color quantization impact on byte failure test

The biggest impact for end users is on banner display time, which makes our banners more responsive to user events. It also saves a considerable amount of data, especially when a web page contains many banners with PNG images. This is particularly valuable on mobile devices, where it makes web browsing faster and smoother.

To conclude, Libvips is made for those who run a pipeline of several complex operations and/or deal with large images. Criteo does not use Libvips only for its CDN: we also use it when reprocessing datasets of customer images to feed our deep learning models.

Finally, we open sourced our Java wrapper, named JVips, on GitHub: https://github.com/criteo/JVips. Many features are not implemented there, so feel free to contribute!

Criteo R&D Blog

Tech stories from the R&D team
