Converting images to WebP from CDN
The rise of WebP: A new image format for the web
The WebP format has become increasingly popular since Google introduced it in 2010. Its biggest selling point lies in its ability to produce much smaller file sizes while maintaining similar image quality. Faster load times = higher conversion rates.
WebP is a modern image format that provides superior lossless and lossy compression for images on the web. WebP lossless images are 26% smaller in size compared to PNGs. WebP lossy images are 25–34% smaller than comparable JPEG images at equivalent SSIM quality index. — Google
Tooling and considerations
Tooling: AWS (S3, CDN, Lambda@Edge), Sharp, User Agent
There are a few considerations we have to make before getting to the code:
- Firstly, not all browsers support WebP. Currently, WebP is natively supported in recent versions of Google Chrome, Firefox, Edge, Opera, Android Browser and Samsung Internet.
- We may have a store of hundreds or thousands of pictures that we want to convert when supporting browsers request them.
- We have to change what the HTTP request and response objects look like.
The plan: On-the-fly conversion
We’re going to listen for requests to the CDN and return a WebP image to every supporting browser, provided that a WebP image exists. Otherwise, we’re going to fetch the image in its original format, convert it to WebP, and return the newly converted image.
It’s going to be…
CDN requests and responses
On top of the considerations, we have to understand what the CDN request and response objects look like:
We’ll be triggering our lambdas with the viewer request and origin response objects. The reason for using an origin response is that we want to leverage CDN caching for responses where the image conversion has already happened. However, for requests, since we modify the request uri, we change the cache key, and therefore need to do this on every viewer request.
The CDN Request Object
Don’t be intimidated. The important thing is that this object has the request headers object and the request
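As a rough illustration, an abridged viewer request event can be sketched as below. The nesting under Records[0].cf follows the Lambda@Edge event structure; every value (path, user agent string) is a made-up example, and real events carry more fields than shown here.

```javascript
// Abridged CloudFront viewer request event as delivered to Lambda@Edge.
// All values are illustrative examples, not taken from a real request.
const viewerRequestEvent = {
  Records: [
    {
      cf: {
        config: { eventType: 'viewer-request' },
        request: {
          method: 'GET',
          uri: '/images/cat.jpg', // path only; the query string lives in `querystring`
          querystring: '',
          // Header names are lowercased keys; each maps to an array of { key, value }.
          headers: {
            'user-agent': [
              {
                key: 'User-Agent',
                value:
                  'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 ' +
                  '(KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36',
              },
            ],
          },
        },
      },
    },
  ],
};

// The two pieces the rest of the walkthrough relies on.
const viewerRequest = viewerRequestEvent.Records[0].cf.request;
const viewerUserAgent = viewerRequest.headers['user-agent'][0].value;
console.log(viewerRequest.uri, viewerUserAgent);
```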
The CDN Response Object
Again, fear not. What’s really important here is that we have access to the headers and the request
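For orientation, here is an abridged sketch of an origin response event. It carries both the response from the origin and the request that produced it. The values are invented, and the original-resource-type header is a hypothetical name for wherever the original extension gets stored.

```javascript
// Abridged CloudFront origin response event as delivered to Lambda@Edge.
// Values are illustrative; 'original-resource-type' is a hypothetical header name.
const originResponseEvent = {
  Records: [
    {
      cf: {
        config: { eventType: 'origin-response' },
        // The request that was sent to the origin (already rewritten to .webp).
        request: {
          method: 'GET',
          uri: '/images/cat.webp',
          headers: {
            'original-resource-type': [{ key: 'Original-Resource-Type', value: 'jpg' }],
          },
        },
        // What the origin answered. Note that `status` is a string, not a number.
        response: {
          status: '404',
          statusDescription: 'Not Found',
          headers: {
            'content-type': [{ key: 'Content-Type', value: 'application/xml' }],
          },
        },
      },
    },
  ],
};

const originRequest = originResponseEvent.Records[0].cf.request;
const originResponse = originResponseEvent.Records[0].cf.response;
console.log(originRequest.uri, originResponse.status);
```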
Let’s get coding!
Here are the steps we’re going to take:
- Listen for requests to the CDN, and trigger a Lambda function that hijacks any viewer request.
- Determine if the request event is for an image and if the browser requesting the resource supports WebP, based on the user-agent we receive from the request.
- If we determine that the request is for an image and that the browser supports WebP, we replace the request uri image extension with .webp and add the original extension into the request headers.
- Next, we trigger a separate Lambda that hijacks any CDN origin response.
- If the request uri on the response event has a .webp extension, and the response status is a 404, we check our S3 bucket for the same image, but with the original extension that we placed into our request header in step 3.
- If we find an image with the original extension in S3, we run a WebP conversion using Sharp and place it in the origin response; otherwise, we leave the 404 response unaltered.
The code: Viewer Request
The code for the Viewer Request Lambda is straightforward. It compares the browser and browser version from the request to a predefined list of supported browsers to determine WebP support, rewrites png, jpg and jpeg extensions to webp, and leaves all the heavy lifting to the Origin Response Lambda.
This keeps our function lightweight, which is ideal since Viewer Request and Response Lambdas can’t be more than 1MB in size.
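A minimal sketch of such a viewer request function might look like the following. It is not the original code: the regex-based user agent check stands in for a proper user agent parsing library, the minimum versions in the table are illustrative and should be checked against current browser support data, and the original-resource-type header name is an assumption.

```javascript
// Illustrative minimum browser versions with WebP support (verify before relying on them).
// Order matters: Edge and Opera user agents also contain a "Chrome/" token.
const MIN_WEBP_VERSIONS = [
  { pattern: /Edge\/(\d+)/, min: 18 },
  { pattern: /OPR\/(\d+)/, min: 19 },
  { pattern: /Firefox\/(\d+)/, min: 65 },
  { pattern: /Chrome\/(\d+)/, min: 32 },
];

// Simplified stand-in for a user agent parsing library.
function supportsWebp(userAgent) {
  for (const { pattern, min } of MIN_WEBP_VERSIONS) {
    const match = pattern.exec(userAgent);
    if (match) return Number(match[1]) >= min;
  }
  return false; // unknown browsers get the original image
}

const IMAGE_EXTENSION = /\.(png|jpe?g)$/i;
const ORIGINAL_TYPE_HEADER_NAME = 'original-resource-type'; // hypothetical header name

// Rewrites the request in place when it targets an image and the browser supports WebP.
function rewriteForWebp(request) {
  const uaHeader = request.headers['user-agent'];
  const userAgent = uaHeader && uaHeader.length ? uaHeader[0].value : '';
  const match = IMAGE_EXTENSION.exec(request.uri);

  if (match && supportsWebp(userAgent)) {
    // Remember the original extension (case preserved) for the origin response function.
    request.headers[ORIGINAL_TYPE_HEADER_NAME] = [
      { key: 'Original-Resource-Type', value: match[1] },
    ];
    request.uri = request.uri.replace(IMAGE_EXTENSION, '.webp');
  }
  return request;
}

// Lambda@Edge entry point: returning the request lets CloudFront continue processing it.
const viewerRequestHandler = async (event) => rewriteForWebp(event.Records[0].cf.request);

exports.handler = viewerRequestHandler;
```

Keeping the rewrite in a plain function with no dependencies makes it easy to unit test and keeps the deployment package far below the size limit.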
The code: Origin Response
The origin response function does all the heavy lifting. If the response status is a 404, it reads the request headers to determine the original file extension. It then replaces the webp extension in the request uri with the original file extension and queries S3 for that file.
If it finds the file in S3, it then converts the image to WebP using Sharp, puts it in the S3 bucket, and places it in the response body as a base64 image. It finally sets the Content-Type header to image/webp. If it fails to find the image in the S3 bucket, it sets the Content-Type header to image/webp and leaves the response as a 404.
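The flow described above can be sketched roughly as follows. This is an approximation under stated assumptions, not the original function: the bucket name and the original-resource-type header are hypothetical, the AWS SDK for JavaScript v2 and Sharp are assumed to be bundled with the function, and the deps parameter is an invented seam that lets the S3 and Sharp calls be replaced with fakes in tests.

```javascript
const SOURCE_BUCKET = 'my-image-bucket'; // hypothetical; hardcoded rather than read from configuration
const ORIGINAL_TYPE_HEADER = 'original-resource-type'; // hypothetical header name

// Real dependencies, required lazily so the core logic runs without them installed.
function defaultDeps(bucket) {
  const AWS = require('aws-sdk'); // AWS SDK for JavaScript v2
  const sharp = require('sharp');
  const s3 = new AWS.S3();
  return {
    getObject: async (key) => (await s3.getObject({ Bucket: bucket, Key: key }).promise()).Body,
    putObject: (key, body) =>
      s3.putObject({ Bucket: bucket, Key: key, Body: body, ContentType: 'image/webp' }).promise(),
    toWebp: (buffer) => sharp(buffer).webp().toBuffer(),
  };
}

async function handleOriginResponse(event, deps) {
  const { request, response } = event.Records[0].cf;
  const header = request.headers[ORIGINAL_TYPE_HEADER];

  // Only act on missing .webp files that the viewer request function rewrote.
  if (response.status !== '404' || !request.uri.endsWith('.webp') || !header) {
    return response;
  }

  const webpKey = request.uri.replace(/^\//, ''); // S3 keys have no leading slash
  const originalKey = webpKey.replace(/\.webp$/, `.${header[0].value}`);
  response.headers['content-type'] = [{ key: 'Content-Type', value: 'image/webp' }];

  try {
    const original = await deps.getObject(originalKey);
    const webp = await deps.toWebp(original);
    await deps.putObject(webpKey, webp); // store it so later requests hit the origin directly
    response.status = '200';
    response.statusDescription = 'OK';
    response.body = webp.toString('base64');
    response.bodyEncoding = 'base64';
  } catch (err) {
    // Original not found or conversion failed: leave the 404 as it is.
  }
  return response;
}

exports.handler = (event) => handleOriginResponse(event, defaultDeps(SOURCE_BUCKET));
```

Passing the S3 and Sharp operations in as deps is purely a testing convenience. In a deployed function only the exported handler is used.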
- If you’re deploying using the Serverless Application Model (I’ve attached a conjoined template in the appendix), make sure you use 2 separate projects for your viewer request and origin response functions. AWS won’t let you deploy viewer request functions larger than 1MB, and installing Sharp will make your zip exceed this.
- You need to give your functions the
- CloudFront triggers for Lambda@Edge are only available in us-east-1. Make sure your Lambdas are deployed in that specific region.
- CloudWatch logs for your Lambdas won’t necessarily be in the us-east-1 region. Instead, they’ll be in the region closest to where the response is served from (it’s a CDN after all).
- If you’re on macOS, Sharp might not run if you install it locally and deploy it to AWS, because it needs to be installed specifically for Linux. There are multiple ways to do this. Sharp recommends using a t2.micro instance and ssh’ing into it; I find this unnecessarily complex and difficult to maintain across teams. Instead, I use a Docker container running Linux to install all my npm packages and create a zip that I push using AWS SAM. I’ve attached it in the appendix.
Conjoined SAM Template
Creating Functions ZIP from Docker Container
Dockerfile in your root, run