Converting images to WebP from CDN

Wisani Shilumani
Nona Digital

--

The rise of WebP: A new image format for the web

The WebP format has become increasingly popular since Google introduced it in 2010. Its biggest selling point is its ability to produce much smaller files while maintaining similar image quality. Smaller files mean faster load times, and faster load times tend to mean higher conversion rates.

WebP is a modern image format that provides superior lossless and lossy compression for images on the web. WebP lossless images are 26% smaller in size compared to PNGs. WebP lossy images are 25–34% smaller than comparable JPEG images at equivalent SSIM quality index. — Google

source: https://bitsofco.de/why-and-how-to-use-webp-images-today/

Tooling and considerations

Tooling: AWS (S3, CloudFront as the CDN, Lambda@Edge), Sharp, and the User-Agent request header

There are a few considerations we have to make before getting to the code:

  1. Firstly, not all browsers support WebP. At the time of writing, WebP is natively supported in later versions of Google Chrome, Firefox, Edge, Opera, Android Browser and Samsung Internet.
  2. We may have a store of hundreds or thousands of existing pictures, and we only want to convert them when a supporting browser requests them.
  3. We have to change what the HTTP request and response objects look like.

WebP support table: https://caniuse.com/#feat=webp

The plan: On-the-fly conversion

We’re going to listen for requests to the CDN and return a WebP image to all supporting browsers, provided that a WebP image exists. Otherwise, we’re going to fetch the image in its original format, convert it to WebP, and return the newly converted image.

It’s going to be…

LEGENDARY!

CDN requests and responses

On top of the considerations, we have to understand what the CDN request and response objects look like:

CDN events (that can be used to trigger Lambda functions)

We’ll be triggering our Lambdas on the viewer request and origin response events. The reason for using an origin response is that we want to leverage CDN caching for responses where the image conversion has already happened. For requests, however, we modify the request uri, which changes the cache key, so the rewrite has to run on every viewer request.

The CDN Request Object

Don’t get intimidated. The important thing is that this object has the request headers object and the request uri string.
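As a rough guide, a trimmed-down viewer request event looks something like the object below. The field layout follows the Lambda@Edge event structure, but every value here (distribution ID, host, IP address, path) is a made-up placeholder, and getHeader is a hypothetical helper rather than part of any AWS API.

```javascript
// A trimmed viewer request event. All values are illustrative placeholders.
const viewerRequestEvent = {
  Records: [
    {
      cf: {
        config: { distributionId: 'EXAMPLE', eventType: 'viewer-request' },
        request: {
          clientIp: '203.0.113.10',
          method: 'GET',
          querystring: '',
          uri: '/images/hero.jpg',
          // Header names are lowercased keys; each maps to a list of
          // { key, value } pairs so that repeated headers are preserved.
          headers: {
            host: [{ key: 'Host', value: 'd111111abcdef8.cloudfront.net' }],
            'user-agent': [
              {
                key: 'User-Agent',
                value:
                  'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 ' +
                  '(KHTML, like Gecko) Chrome/80.0.3987.132 Safari/537.36',
              },
            ],
          },
        },
      },
    },
  ],
};

// Hypothetical helper: return the first value of a header, or undefined.
const getHeader = (request, name) => {
  const entries = request.headers[name.toLowerCase()];
  return entries && entries.length > 0 ? entries[0].value : undefined;
};

const viewerRequest = viewerRequestEvent.Records[0].cf.request;
console.log(viewerRequest.uri);
console.log(getHeader(viewerRequest, 'User-Agent'));
```

The two things the rest of this post relies on are request.uri and the user-agent entry in request.headers.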

The CDN Response Object

Again, fear not — what’s really important here is that we have access to the headers and the request uri string.
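For orientation, here is a trimmed sketch of what an origin response event might look like when a rewritten .webp request misses at the origin. The values are placeholders, and original-resource-type is a hypothetical header name standing in for whichever header carries the original extension.

```javascript
// A trimmed origin response event. All values are illustrative placeholders.
const originResponseEvent = {
  Records: [
    {
      cf: {
        config: { distributionId: 'EXAMPLE', eventType: 'origin-response' },
        // The request that was sent to the origin.
        request: {
          method: 'GET',
          uri: '/images/hero.webp',
          headers: {
            // Hypothetical header carrying the original file extension.
            'original-resource-type': [
              { key: 'original-resource-type', value: 'jpg' },
            ],
          },
        },
        // What the origin sent back. Note that status is a string.
        response: {
          status: '404',
          statusDescription: 'Not Found',
          headers: {
            'content-type': [{ key: 'Content-Type', value: 'application/xml' }],
          },
        },
      },
    },
  ],
};

const { request: originRequest, response: originResponse } =
  originResponseEvent.Records[0].cf;

// The condition the origin response function cares about.
const isMissingWebp =
  originResponse.status === '404' && originRequest.uri.endsWith('.webp');
console.log(isMissingWebp);
```

Because the event carries both the request and the response, the function can tell which file was asked for and whether the origin had it.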

Let’s get coding!

Summary

Here are the steps we’re going to take:

  1. Listen for requests to CDN, and trigger a Lambda function that hijacks any viewer request.
  2. Determine if the request event is for an image and if the browser requesting the resource supports WebP based on the user-agent we receive from the request.
  3. If we determine that the request is for an image and that the browser supports WebP, we replace the image extension in the request uri with .webp and add the original extension to a request header.
  4. Next, we trigger a separate Lambda that hijacks any CDN origin response.
  5. If the request uri on the response event has a .webp extension, and the response status is a 404, we check our S3 bucket for the same image, but with the original extension that we placed into our request header in step 3.
  6. If we find an image with the original extension in S3, we run a WebP conversion using Sharp and place the result in the origin response. Otherwise, we leave the 404 response unaltered.

The code: Viewer Request

The code for the Viewer Request Lambda is straightforward. It compares the browser and browser version from the request to a predefined list of supported browsers to determine WebP support, rewrites png, jpg and jpeg extensions to webp, and leaves all the heavy lifting to the Origin Response Lambda.

This keeps our function lightweight, which is ideal since Viewer Request and Viewer Response Lambdas can’t be more than 1MB in size.
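A minimal sketch of such a viewer request function might look like the following. It swaps the predefined browser list for a few regular expressions on the user-agent string, which is a simplification rather than the approach described above. The minimum versions (Chrome 32, Firefox 65, legacy Edge 18) are assumptions to check against a current support table, and original-resource-type is a hypothetical header name.

```javascript
const IMAGE_EXTENSION = /\.(png|jpe?g)$/i;
// Hypothetical header used to pass the original extension downstream.
const ORIGINAL_EXT_HEADER = 'original-resource-type';

// Rough user-agent check. The version thresholds are assumptions.
const supportsWebp = (userAgent) => {
  if (!userAgent) return false;
  // Legacy Edge also advertises a Chrome token, so it is checked first.
  const edge = /Edge\/(\d+)/.exec(userAgent);
  if (edge) return Number(edge[1]) >= 18;
  // Chromium-based browsers (Opera, Samsung Internet, ...) carry this token too.
  const chrome = /Chrome\/(\d+)/.exec(userAgent);
  if (chrome) return Number(chrome[1]) >= 32;
  const firefox = /Firefox\/(\d+)/.exec(userAgent);
  if (firefox) return Number(firefox[1]) >= 65;
  return false;
};

const handler = async (event) => {
  const { request } = event.Records[0].cf;
  const uaEntries = request.headers['user-agent'];
  const userAgent = uaEntries && uaEntries.length > 0 ? uaEntries[0].value : '';
  const match = IMAGE_EXTENSION.exec(request.uri);

  if (match && supportsWebp(userAgent)) {
    // Remember the original extension so it can be looked up later.
    request.headers[ORIGINAL_EXT_HEADER] = [
      { key: ORIGINAL_EXT_HEADER, value: match[1] },
    ];
    // Rewrite /path/image.jpg to /path/image.webp.
    request.uri = request.uri.replace(IMAGE_EXTENSION, '.webp');
  }

  // Returning the request lets it continue through the CDN.
  return request;
};

exports.handler = handler;
```

Requests for anything other than png, jpg or jpeg files, or from browsers that don’t pass the check, are returned untouched, so they behave exactly as they did before.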

The code: Origin Response

The origin response function does all the heavy lifting. If the response status is a 404, it fetches request headers to determine the original file extension. It then replaces the webp extension in the request uri with the original file extension and queries S3 with the new uri (s3Key).

If it finds the file in S3, it then converts the image to WebP using Sharp, puts it in the S3 bucket, and places it in the response body as a base64 image. It finally sets the Content-Type header to image/webp. If it fails to find the image in the S3 bucket, it sets the Content-Type header to image/webp and leaves the response as a 404.
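The flow described above can be sketched as follows. To keep the example readable and runnable without AWS credentials or native modules, the S3 and Sharp calls are passed in as three hypothetical functions (getObject, putObject, toWebp). The createHandler factory and the original-resource-type header name are likewise assumptions for illustration, not the original implementation.

```javascript
// Hypothetical header name; it must match the viewer request function.
const ORIGINAL_EXT_HEADER = 'original-resource-type';

// Builds the handler from injected dependencies:
//   getObject(key)                      -> Promise<Buffer>, rejects if missing
//   putObject(key, buffer, contentType) -> Promise<void>
//   toWebp(buffer)                      -> Promise<Buffer>
const createHandler = ({ getObject, putObject, toWebp }) => async (event) => {
  const { request, response } = event.Records[0].cf;
  const originalExt = request.headers[ORIGINAL_EXT_HEADER];

  // Only step in when a rewritten .webp request missed at the origin.
  if (
    String(response.status) !== '404' ||
    !request.uri.endsWith('.webp') ||
    !originalExt
  ) {
    return response;
  }

  // S3 keys have no leading slash.
  const webpKey = request.uri.replace(/^\//, '');
  const s3Key = webpKey.replace(/\.webp$/, `.${originalExt[0].value}`);

  // As described above, Content-Type is set whether or not the image is found.
  response.headers['content-type'] = [
    { key: 'Content-Type', value: 'image/webp' },
  ];

  try {
    const original = await getObject(s3Key);
    const converted = await toWebp(original);
    // Store the result so the next origin fetch finds the .webp directly.
    await putObject(webpKey, converted, 'image/webp');

    response.status = '200';
    response.statusDescription = 'OK';
    response.body = converted.toString('base64');
    response.bodyEncoding = 'base64';
  } catch (err) {
    // Original not found, or conversion failed: the response stays a 404.
    console.log(`No WebP generated for ${s3Key}: ${err.message}`);
  }

  return response;
};
```

In a real deployment, toWebp could be something like (buffer) => sharp(buffer).webp().toBuffer(), with getObject and putObject as thin wrappers around the AWS SDK’s S3 client. One thing to verify for your own images is the Lambda@Edge limit on generated response bodies at origin events (1 MB at the time of writing, as far as I’m aware), since a large converted image would not fit in the response.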

That’s it!

Gotchas

  1. If you’re deploying using the Serverless Application Model (I’ve attached a conjoined template in the appendix), make sure you use 2 separate projects for your viewer request and origin response functions. AWS won’t let you deploy viewer request functions larger than 1MB, and installing Sharp will make your zip exceed this.
  2. You need to give your functions an execution role that the edgelambda.amazonaws.com service can assume (alongside lambda.amazonaws.com).
  3. CloudFront triggers for Lambda@Edge are only available in us-east-1. Make sure your Lambdas are deployed in that specific region.
  4. CloudWatch logs for your Lambdas won’t necessarily be in the us-east-1 region. Instead, they’ll be in the region closest to where the function executed (it’s a CDN after all).
  5. If you’re on macOS, Sharp might not run if you install it locally and deploy it to AWS, because it needs to be installed specifically for Linux. There are multiple ways to do this. Sharp recommends using a t2.micro instance and ssh’ing into it. I find this unnecessarily complex and difficult to maintain across teams, so I use a Docker container running Linux to install all my npm packages and create a zip that I push using AWS SAM. I’ve attached it in the appendix.

Appendix

Conjoined SAM Template

Creating Functions ZIP from Docker Container

With the Makefile and Dockerfile in your root, run make all

--

Hi! I’m Wisani, a software developer at Allan Gray at the V&A Waterfront. I love building tech that inspires.