Tinkering with Imaginary for performance and scale
By Vaibhav Sharma, Daleef Rahman, Soumyadeep Mukherjee
Transforming images for users is a common use case: apps and websites often need to show the same image in different sizes, shapes, colour schemes, opacities, and so on, depending on what the design needs.
Before on-the-fly image processing became common, this was solved by storing n versions of an image, with all the variations, in the storage layer and serving each one separately depending on what the frontend needed. As frontends evolved and designs grew more complex for better UX, this could mean maintaining up to 20 versions of the same image, adding significant storage cost as well as maintenance overhead for UI engineers.
As compute became cheaper and image processing went mainstream, many teams moved this work from storage to compute: images are transformed on the fly based on URL parameters. This made designs and UIs much leaner and reduced storage needs too.
At udaan, we used one such third-party tool for image transformation for quite a while before asking: why not move it in-house? It obviously helps with cost, but it also lets us expand to new use cases without depending on a third party. Thus began our journey to build our own HTTP-based image processing layer.
While evaluating, we found many solutions for image processing alone, such as Sharp, Pillow, and libvips, but not many with HTTP request capability. Then we stumbled upon Imaginary. Apart from a few advanced features our third-party provider offers, Imaginary provided almost every basic image processing feature we need at udaan, and it was open source, as we desired.
Why Imaginary?
- Imaginary is an HTTP-based service built on libvips (https://www.libvips.org/API/current/), which describes itself as "a fast image processing library with low memory needs". libvips is widely used, and bindings are available in a few different languages, e.g. JavaScript (Sharp) and Go (govips).
- libvips is already stable, so adding new features to Imaginary is very much possible in the future. The libvips open-source community is also quite active with issues and doubts, and solutions to a few specific problems are not hard to find on the internet.
- Imaginary provides immense flexibility in transforming images through its endpoints and GET params.
- Microservice-based architecture.
POC
The idea was to use Imaginary's APIs and check whether we could keep the same URL-format endpoints already in use throughout the udaan application.
We put an Nginx reverse proxy in front of our Imaginary deployments. After mapping the existing URL transformation endpoints to the respective Imaginary endpoints through the Nginx config, we could start load testing and found a few bottlenecks with Imaginary. We also fixed a few issues along the way and merged those changes upstream, which we will talk about soon.
Take the URL https://xyz.udaan.com/f_png,w_200,q_auto:good/u/merchandising/70ydwrsehevfn3xs7qo.jpg: here we need the format as png, the width as 200, and the quality as good (a string mapped to 80%).
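To make the mapping concrete, here is an illustrative sketch (not udaan's actual code) of translating such a legacy transform segment into the query params an Imaginary resize call would expect. The `q_auto` mapping table is an assumption; only "good → 80%" is given above.

```python
# Illustrative sketch: parse a legacy segment like "f_png,w_200,q_auto:good"
# into Imaginary-style GET params. Names and mappings are assumptions.
from urllib.parse import urlencode

QUALITY_MAP = {"good": 80}  # hypothetical: only "good" -> 80 is stated in the post

def legacy_to_imaginary(segment: str) -> dict:
    """Translate comma-separated legacy transform tokens into query params."""
    params = {}
    for token in segment.split(","):
        key, _, value = token.partition("_")
        if key == "f":                         # output format -> "type"
            params["type"] = value
        elif key == "w":                       # width in pixels
            params["width"] = int(value)
        elif key == "q" and value.startswith("auto:"):
            params["quality"] = QUALITY_MAP[value.split(":", 1)[1]]
    return params

params = legacy_to_imaginary("f_png,w_200,q_auto:good")
print(params)             # {'type': 'png', 'width': 200, 'quality': 80}
print(urlencode(params))  # type=png&width=200&quality=80
```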
So our Nginx config looks something like this:
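The original snippet isn't reproduced here, but a minimal illustrative sketch of such a mapping (hypothetical upstream name and regex, not the production config) might look like:

```nginx
# Illustrative only: a regex location extracts the legacy transform tokens
# from the /v2 path and proxies to Imaginary with equivalent GET params.
location ~ ^/v2/f_(?<fmt>\w+),w_(?<wd>\d+),q_auto:good/(?<img>.+)$ {
    proxy_pass http://imaginary:9000/resize?type=$fmt&width=$wd&quality=80&file=/$img;
}
```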
This configuration allowed us to redirect all requests from /v2 to Imaginary while keeping the same URLs we were already using across udaan.
Issues
While testing Imaginary under our load, we observed the following issues:
- OOMKilled errors — Imaginary pods started restarting within a few minutes of load testing. We later figured out that Imaginary was not freeing the memory of the output image buffers produced by libvips.
- PNG file size — The file size of PNG images was up to 2x that of our third-party provider, even after resizing with reduced quality.
- Two containers in a single pod — We initially ran Nginx and Imaginary in a single pod. After Imaginary pod crashes due to OOMKilled, we started getting lots of 5xx responses.
- High CPU usage — Image processing is always a memory- and CPU-intensive task. libvips provides an effort parameter to tune this, but it is not exposed through Imaginary's GET APIs.
Fixes
- Memory leak — Imaginary uses a cgo (https://pkg.go.dev/cmd/cgo) binding to the libvips library for image processing. The application was unable to free the memory allocated for each image quickly enough; switching to jemalloc fixed this. PR: https://github.com/h2non/imaginary/pull/381
- External palette quantisation flag support — Palette quantisation is 8-bit colour quantisation in libvips; in short, it reduces the number of distinct colours in an image with minimal degradation. libvips supports palette quantisation for PNG images and reduces the size of the output image buffer with 8-bit quantisation when the palette param is true. This solved our PNG size issue, and the output sizes became very similar to our third-party provider's. PR: https://github.com/h2non/imaginary/pull/380
- Enabled quality param for PNG images in vips — We also passed the quality param to libvips for PNG images. If imagequant is available (which it is in Imaginary), then with palette quantisation you can tune the quality param for PNGs. PR: https://github.com/h2non/bimg/pull/398/files
- External speed support — libvips supports a CPU effort parameter while processing PNG images. We wanted to leverage that capability, so we exposed that param in Imaginary as well. PR: https://github.com/h2non/imaginary/pull/383/files
- Separate deployments — Isolated deployments for Nginx and Imaginary, removing the single point of failure.
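With those upstream changes in place, a PNG request can carry the new params alongside the usual ones. A sketch of such a request URL, assuming the param names `palette` and `speed` from the PRs above and a hypothetical internal host and file path:

```python
# Sketch of a request URL to the patched Imaginary. Host, port, and file
# path are hypothetical; "palette" and "speed" follow the PRs referenced above.
from urllib.parse import urlencode, urlunsplit

params = {
    "width": 200,       # resize width
    "type": "png",      # output format
    "quality": 80,      # passed through to libvips' quality for PNGs
    "palette": "true",  # enable 8-bit palette quantisation
    "speed": 5,         # CPU-effort knob exposed by the speed PR
    "file": "/u/merchandising/sample.jpg",  # hypothetical mounted path
}
url = urlunsplit(("http", "imaginary.internal:9000", "/resize",
                  urlencode(params, safe="/"), ""))
print(url)
# http://imaginary.internal:9000/resize?width=200&type=png&quality=80&palette=true&speed=5&file=/u/merchandising/sample.jpg
```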
There are a few other requirements for which we still depend on our third-party provider; for those cases we fall back to it. Nginx as a reverse proxy has been really useful for us because it meant no significant changes to the existing URLs throughout our client applications at udaan. We got it all working just by adding a /v2 to the previous domain 🙂
HLD
An Imaginary image is already available on Docker Hub (https://hub.docker.com/r/h2non/imaginary/), so it didn't take much time for us to start experimenting with the architecture. (The latest published image is not up to date; we built our own from master after the latest contributions.) The components we use for our service are given below:
- CDN Ingress
- Nginx deployment with two containers: a. Nginx, b. a Prometheus exporter for Nginx logs
- Imaginary deployment
- Nginx service pointing to the Nginx deployment
- Imaginary service pointing to the Imaginary deployment
- HPA for the Nginx deployment
- HPA for the Imaginary deployment
The whole stack is built upon Kubernetes.
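Since image transformation is CPU-bound, scaling the Imaginary pods on CPU utilisation is a natural fit. A minimal sketch of what such an HPA could look like (hypothetical names and targets, not the production manifest):

```yaml
# Illustrative HPA for the Imaginary deployment; names and thresholds are assumptions.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: imaginary-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: imaginary
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 60
```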
Speed vs Size
Initially we were getting great results for jpeg images but not for png: output was almost three times the size of our third-party provider's images. We dived deeper into libvips and found the palette param for png images; tweaking it gave us similar results.
💡 Palette enables 8-bit quantisation of PNG images
While the size results were really good, latency was still an issue; we needed to get it down to ~500 ms. We found another libvips param, effort, which tunes the level of CPU effort spent on reducing file size. Image transformation is a CPU-intensive job, so once the CPU becomes the bottleneck, latency takes a huge hit: the higher the effort value, the higher the CPU usage and thus the latency. We started tuning this param, with the following results:
We settled on a speed of 5 for our production service, along with the palette param when the output format is png.
Scalability
Right now, Imaginary serves 100% of our production traffic ;)
Kindly check out our engineering blog for more such content. Thanks.