Tinkering with Imaginary for performance and scale
By Vaibhav Sharma, Daleef Rahman, Soumyadeep Mukherjee
Transforming images for users is a common use case: apps and websites often need to show the same image in different sizes, shapes, colour schemes, opacities, and so on, depending on what the design needs.
Before on-the-fly image processing became common, this was solved by storing n versions of an image, with all the variations, in the storage layer and serving each one separately depending on what the frontend needed. As frontends evolved and designs grew more complex for better UX, this could mean maintaining up to 20 versions of the same image, adding significant storage cost as well as maintenance overhead for UI engineers.
As compute became cheaper and image processing went mainstream, many teams moved this work from storage to compute: images are transformed on the fly based on URL parameters. This made designs and UIs much leaner and reduced storage needs too.
At udaan, we used one such third-party tool for image transformation for quite a while before asking: why not move it in-house? It obviously helps with cost, but it also lets us expand to new use cases without depending on a third party. Thus began our journey to build our own HTTP-based image processing layer.
While evaluating, we found many solutions for image processing alone, such as Sharp, Pillow, and libvips, but not many with HTTP request capability. Then we stumbled upon Imaginary. Apart from a few advanced features our third-party provider offers, Imaginary provided almost every basic image processing feature we need at udaan, and it was open source, as we desired.
Why Imaginary?
- Imaginary is an HTTP-based service built on libvips (https://www.libvips.org/API/current/), which describes itself as "a fast image processing library with low memory needs". libvips is widely used, and bindings are available in a few different languages, e.g. JavaScript (Sharp) and Go (govips).
- libvips is already stable, so adding new features to Imaginary is very much possible in the future. The libvips open-source community is also quite active with issues and doubts, and solutions to a few specific problems are not hard to find on the internet.
- Imaginary provides immense flexibility in transforming images through its endpoints and GET params.
- Microservice-based architecture.
POC
The idea was to use Imaginary's APIs and check whether we could keep the same URL-format endpoints already in use throughout the udaan application.
We put an Nginx reverse proxy in front of our Imaginary deployments. After mapping the existing URL transformation endpoints to the respective Imaginary endpoints through the Nginx config, we could start load testing and found a few bottlenecks with Imaginary. We also fixed a few issues along the way and merged those changes upstream, which we will talk about soon.
Take the URL https://xyz.udaan.com/f_png,w_200,q_auto:good/u/merchandising/70ydwrsehevfn3xs7qo.jpg: here we need the format as png, the width as 200, and the quality as good (a string mapped to 80%).
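To make the mapping concrete, here is an illustrative sketch (not udaan's actual code) of translating such a legacy transform segment into the query params an Imaginary resize call would expect. The `q_auto` mapping table is an assumption; only "good → 80%" is given above.

```python
# Illustrative sketch: parse a legacy segment like "f_png,w_200,q_auto:good"
# into Imaginary-style GET params. Names and mappings are assumptions.
from urllib.parse import urlencode

QUALITY_MAP = {"good": 80}  # hypothetical: only "good" -> 80 is stated in the post

def legacy_to_imaginary(segment: str) -> dict:
    """Translate comma-separated legacy transform tokens into query params."""
    params = {}
    for token in segment.split(","):
        key, _, value = token.partition("_")
        if key == "f":                         # output format -> "type"
            params["type"] = value
        elif key == "w":                       # width in pixels
            params["width"] = int(value)
        elif key == "q" and value.startswith("auto:"):
            params["quality"] = QUALITY_MAP[value.split(":", 1)[1]]
    return params

params = legacy_to_imaginary("f_png,w_200,q_auto:good")
print(params)             # {'type': 'png', 'width': 200, 'quality': 80}
print(urlencode(params))  # type=png&width=200&quality=80
```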
So our Nginx config looks something like this:
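The original snippet isn't reproduced here, but a minimal illustrative sketch of such a mapping (hypothetical upstream name and regex, not the production config) might look like:

```nginx
# Illustrative only: a regex location extracts the legacy transform tokens
# from the /v2 path and proxies to Imaginary with equivalent GET params.
location ~ ^/v2/f_(?<fmt>\w+),w_(?<wd>\d+),q_auto:good/(?<img>.+)$ {
    proxy_pass http://imaginary:9000/resize?type=$fmt&width=$wd&quality=80&file=/$img;
}
```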
This configuration allowed us to redirect all requests from /v2 to Imaginary while keeping the same URLs we were already using across udaan.
Issues
While testing Imaginary under our load, we observed the following issues:
- OOMKilled errors — Imaginary pods started restarting within a few minutes of load testing. We later figured out that Imaginary was not freeing the memory of the output image buffers produced by libvips.
- PNG file size — The file size of PNG images was up to 2x that of our third-party provider, even after resizing with reduced quality.
- Two containers in a single pod — We initially ran Nginx and Imaginary in a single pod. After Imaginary pod crashes due to OOMKilled, we started getting lots of 5xx responses.
- High CPU usage — Image processing is always a memory- and CPU-intensive task. libvips provides an effort parameter to tune this, but it is not exposed through Imaginary's GET APIs.
Fixes
- Memory leak — Imaginary uses a cgo (https://pkg.go.dev/cmd/cgo) binding to the libvips library for image processing. The application was unable to free the memory allocated for each image quickly enough; switching to jemalloc fixed this. PR: https://github.com/h2non/imaginary/pull/381
- External palette quantisation flag support — Palette quantisation is 8-bit colour quantisation in libvips; in short, it reduces the number of distinct colours in an image with minimal degradation. libvips supports palette quantisation for PNG images and reduces the size of the output image buffer with 8-bit quantisation when the palette param is true. This solved our PNG size issue, and the output sizes became very similar to our third-party provider's. PR: https://github.com/h2non/imaginary/pull/380
- Enabled quality param for PNG images in vips — We also passed the quality param to libvips for PNG images. If imagequant is available (which it is in Imaginary), then with palette quantisation you can tune the quality param for PNGs. PR: https://github.com/h2non/bimg/pull/398/files
- External speed support — libvips supports a CPU effort parameter while processing PNG images. We wanted to leverage that capability, so we exposed that param in Imaginary as well. PR: https://github.com/h2non/imaginary/pull/383/files
- Separate deployments — Isolated deployments for Nginx and Imaginary, removing the single point of failure.
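With those upstream changes in place, a PNG request can carry the new params alongside the usual ones. A sketch of such a request URL, assuming the param names `palette` and `speed` from the PRs above and a hypothetical internal host and file path:

```python
# Sketch of a request URL to the patched Imaginary. Host, port, and file
# path are hypothetical; "palette" and "speed" follow the PRs referenced above.
from urllib.parse import urlencode, urlunsplit

params = {
    "width": 200,       # resize width
    "type": "png",      # output format
    "quality": 80,      # passed through to libvips' quality for PNGs
    "palette": "true",  # enable 8-bit palette quantisation
    "speed": 5,         # CPU-effort knob exposed by the speed PR
    "file": "/u/merchandising/sample.jpg",  # hypothetical mounted path
}
url = urlunsplit(("http", "imaginary.internal:9000", "/resize",
                  urlencode(params, safe="/"), ""))
print(url)
# http://imaginary.internal:9000/resize?width=200&type=png&quality=80&palette=true&speed=5&file=/u/merchandising/sample.jpg
```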
There are a few other requirements for which we still depend on our third-party provider; for those cases we fall back to it. Nginx as a reverse proxy has been really useful for us because it meant no significant changes to the existing URLs throughout our client applications at udaan. We got it all working just by adding a /v2 to the previous domain 🙂
HLD
An Imaginary image is already available on Docker Hub (https://hub.docker.com/r/h2non/imaginary/), so it didn't take much time for us to start experimenting with the architecture. (The latest published image is not up to date; we built our own from master after the latest contributions.) The components we use for our service are given below:
- CDN Ingress
- Nginx deployment with two containers: a. Nginx, b. a Prometheus exporter for Nginx logs
- Imaginary deployment
- Nginx service pointing to the Nginx deployment
- Imaginary service pointing to the Imaginary deployment
- HPA for the Nginx deployment
- HPA for the Imaginary deployment
The whole stack is built upon Kubernetes.
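Since image transformation is CPU-bound, scaling the Imaginary pods on CPU utilisation is a natural fit. A minimal sketch of what such an HPA could look like (hypothetical names and targets, not the production manifest):

```yaml
# Illustrative HPA for the Imaginary deployment; names and thresholds are assumptions.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: imaginary-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: imaginary
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 60
```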
Speed vs Size
Initially we were getting great results for jpeg images but not for png: output was almost three times the size of our third-party provider's images. We dived deeper into libvips and found the palette param for png images; tweaking it gave us similar results.
💡 Palette enables 8-bit quantisation of PNG images
While the size results were really good, latency was still an issue; we needed to get it down to ~500 ms. We found another libvips param, effort, which tunes the level of CPU effort spent on reducing file size. Image transformation is a CPU-intensive job, so once the CPU becomes the bottleneck, latency takes a huge hit: the higher the effort value, the higher the CPU usage and thus the latency. We started tuning this param, with the following results:
We settled on a speed of 5 for our production service, along with the palette param when the output format is png.
Scalability
Right now, Imaginary serves 100% of our production traffic ;)
Kindly check out our engineering blog for more such content. Thanks.