Image Resizing Architecture at housing.com

The Problem

Images are at the core of almost every business on the internet today, be it hotels, real estate, or e-commerce. With every great image come not-so-great image versions: the search page uses thumbnails, the product page uses medium or large images, and so on. Now consider a scenario: you have 100 TB of images with a fixed set of image versions, and your business demands a complete design overhaul of your website. Consider the time and resources you would have to spend creating new versions of those images (approx. 6 crore, i.e. 60 million, images). We faced a similar problem a few months back.

On top of that, there is the storage cost of 100 TB of data in S3 (more than the salary of a good software developer in India :)).

Lazy Load

What if I told you that you can generate any version of an image at runtime? The solution is not as complicated as you might think!

Ever heard of AWS Lambda? I feel AWS Lambda is the Uber of the computation industry. No maintenance of servers, no memory leak issues. It just feels like magic. Amazon took it a step further and introduced Lambda@Edge in November last year. Lambda@Edge is basically AWS Lambda replicated to every edge location, where it can be triggered by CloudFront requests.

Lambda@Edge

The Solution

The solution we built involves multiple Lambdas, each with a specific function:

1. WebP checker: This Lambda inspects the request headers to determine whether the browser accepts WebP images (at the time of writing, essentially only Google Chrome does).
2. One-time Image Resizer: When an image is uploaded to S3, we convert it to a standard 1500*1500 image and move the originally uploaded image to Amazon Glacier.
3. Run-Time Image Resizer: When a request misses the CloudFront cache, this Lambda converts the standard image in S3 to the requested version, and the result is stored in the CloudFront cache.
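
The WebP checker described in item 1 can be sketched roughly as follows. This is a minimal illustration rather than our production code: it assumes a Python Lambda@Edge function attached to the CloudFront viewer-request event, and the function name `handler` is a placeholder. The event layout (`Records[0].cf.request`, with lower-cased header names mapping to lists of key/value pairs) follows the standard CloudFront event structure.

```python
def handler(event, context):
    """Viewer-request sketch: append .webp when the browser advertises WebP support."""
    request = event["Records"][0]["cf"]["request"]

    # CloudFront lower-cases header names; each maps to a list of {key, value} dicts.
    accept_values = [h["value"] for h in request["headers"].get("accept", [])]
    accepts_webp = any("image/webp" in value.lower() for value in accept_values)

    # Rewrite the URI so WebP and non-WebP responses get separate cache entries.
    if accepts_webp and not request["uri"].endswith(".webp"):
        request["uri"] += ".webp"

    # Returning the request lets CloudFront continue with the (possibly rewritten) URI.
    return request
```

Because the decision is encoded in the URI itself, the cache key differs per format and no extra header forwarding is needed for this step.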

The Glacier (this one is not melting)

Glacier is almost 5 times cheaper than S3, but fetching data from it takes hours. We send the original high-resolution image to Glacier and keep in S3 a copy of the standard 1500*1500 image produced by the One-time Image Resizer. This image is perfect for creating image versions at runtime.
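
To make the "standard 1500*1500 image" concrete, here is a small sketch of how the target dimensions could be computed. It assumes the standard image is scaled to fit inside a 1500x1500 box while preserving the aspect ratio and never upscaling; that policy and the helper name `fit_within` are illustrative assumptions, not a description of the exact resizer.

```python
def fit_within(width, height, box=1500):
    """Return (new_width, new_height) scaled to fit inside box x box.

    Assumption: aspect ratio is preserved and smaller images are left as-is.
    """
    if width <= 0 or height <= 0:
        raise ValueError("dimensions must be positive")
    # Use the tighter of the two scale factors, capped at 1.0 to avoid upscaling.
    scale = min(box / width, box / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))
```

For example, a 6000x4000 original would become 1500x1000 under this policy, while an 800x600 upload would be stored unchanged.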

Request flow
We store every image configuration (height, width, watermark, crop, white-space shrinking, etc.) in the database and keep one copy of it in a document on S3. Whenever a configuration is created or updated, we update the S3 file.
A request first reaches the WebP checker, which appends a .webp extension based on the request headers. If CloudFront has an image cached for this URL, it returns that image. Otherwise the request goes to the Run-Time Image Resizer, which fetches the standard image and the configuration document from S3. It looks at the extension of the URL (WebP images have URLs ending in .jpg.webp) to decide the output format, converts the image, and the result is stored in the CloudFront cache.
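
The first step inside the Run-Time Image Resizer, working out what to produce from the URL, might look like the sketch below. Several details here are assumptions made for illustration: the URL layout `/<version>/<path-to-image>`, the JSON shape of the configuration document, and the helper name `resolve_request`. Fetching from S3 and the actual image conversion are left out.

```python
import json
import posixpath


def resolve_request(uri, config_document):
    """Split a URI like /thumb/photos/1.jpg.webp into (source key, output format, config).

    Assumptions: the first path segment names the image version, and
    config_document is the JSON text of the configuration file kept on S3.
    """
    config = json.loads(config_document)
    path = uri.lstrip("/")

    # A trailing .webp was appended by the WebP checker; strip it to find the source image.
    output_format = "webp" if path.endswith(".webp") else None
    if output_format:
        path = path[: -len(".webp")]

    version, _, source_key = path.partition("/")
    if version not in config or not source_key:
        raise KeyError(f"unknown image version in {uri!r}")

    # Without .webp, keep the format implied by the original extension.
    if output_format is None:
        output_format = posixpath.splitext(source_key)[1].lstrip(".").lower() or "jpg"

    return source_key, output_format, config[version]
```

With a configuration document such as `{"thumb": {"width": 200, "height": 150}}`, a request for `/thumb/photos/1.jpg.webp` would resolve to the source key `photos/1.jpg`, the output format `webp`, and the `thumb` settings.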

Useful links:

https://aws.amazon.com/blogs/networking-and-content-delivery/resizing-images-with-amazon-cloudfront-lambdaedge-aws-cdn-blog/