AWS CloudFront Compression using Lambda@Edge - where is brotli?!
IMO it gives you a really good example of how to serve compressed files
without using the auto compression feature in CF, and how to bypass the fact that AWS removes entries from the Accept-Encoding header.
In this article, I’ll share:
- Why I think you should handle compression on your own with Lambda@Edge instead of using CF’s gzip compression feature.
- W̶h̶y̶ ̶I̶ ̶b̶e̶l̶i̶e̶v̶e̶ ̶y̶o̶u̶ ̶s̶h̶o̶u̶l̶d̶ ̶j̶u̶s̶t̶ ̶u̶s̶e̶ ̶g̶z̶i̶p̶ ̶f̶o̶r̶ ̶n̶o̶w̶ ̶a̶n̶d̶ ̶n̶o̶t̶ ̶a̶n̶y̶ ̶o̶t̶h̶e̶r̶ ̶a̶l̶g̶o̶r̶i̶t̶h̶m̶ ̶(̶l̶i̶k̶e̶ ̶b̶r̶o̶t̶l̶i̶)̶ ̶e̶v̶e̶n̶ ̶t̶h̶a̶t̶ ̶y̶o̶u̶ ̶c̶a̶n̶ ̶d̶o̶ ̶i̶t̶ ̶w̶i̶t̶h̶ ̶L̶a̶m̶b̶d̶a̶@̶E̶d̶g̶e̶.̶
Edit: You should use brotli now, since on July 11th, 2018 Amazon added support for whitelisting the Accept-Encoding HTTP header 🎉 Details at the bottom.
- There is also a nice performance trick at the end 😄
5 minutes of work that can save ~300ms on your app’s loading time.
Why is not using the CF compression feature a good idea?
First of all, you can easily do it on your own:
1. Upload the files (js, css, etc.) as-is to the origin, plus compressed versions of them in any algorithm you want (gzip, as I stated).
2. Then, using the origin request Lambda@Edge event, check the Accept-Encoding header and point to the right file (e.g. myfile.js.gzip / myfile.js.br) in your origin.
3. In the origin response Lambda@Edge event, just add the right Content-Encoding header according to the file’s format.
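Steps 2 and 3 can be sketched roughly like this. It is a minimal sketch, not production code: it assumes the Python Lambda@Edge runtime, the `.gzip` / `.br` naming from the example above, and that the origin response event carries the already rewritten request URI. The names `ENCODINGS`, `COMPRESSIBLE`, `accepted_encodings`, `origin_request_handler` and `origin_response_handler` are made up for illustration.

```python
# Assumed naming convention: myfile.js -> myfile.js.br / myfile.js.gzip
ENCODINGS = [("br", ".br"), ("gzip", ".gzip")]  # in order of preference
COMPRESSIBLE = (".js", ".css", ".html")  # hypothetical list of file types


def _refused(params):
    """True when the client explicitly refuses an encoding with q=0."""
    params = params.replace(" ", "").lower()
    if not params.startswith("q="):
        return False
    try:
        return float(params[2:]) == 0.0
    except ValueError:
        return False


def accepted_encodings(header_value):
    """Parse an Accept-Encoding value into a set of lowercase tokens."""
    accepted = set()
    for part in header_value.split(","):
        token, _, params = part.partition(";")
        token = token.strip().lower()
        if token and not _refused(params):
            accepted.add(token)
    return accepted


def origin_request_handler(event, context):
    """Step 2: point the request at the pre-compressed file in the origin."""
    request = event["Records"][0]["cf"]["request"]
    values = request.get("headers", {}).get("accept-encoding", [])
    accepted = accepted_encodings(",".join(h["value"] for h in values))
    if request["uri"].endswith(COMPRESSIBLE):
        for token, suffix in ENCODINGS:
            if token in accepted:
                request["uri"] += suffix
                break
    return request


def origin_response_handler(event, context):
    """Step 3: add Content-Encoding according to the file that was served."""
    cf = event["Records"][0]["cf"]
    response = cf["response"]
    if response.get("status") == "200":
        for token, suffix in ENCODINGS:
            if cf["request"]["uri"].endswith(suffix):
                response["headers"]["content-encoding"] = [
                    {"key": "Content-Encoding", "value": token}
                ]
                break
    return response
```

If the browser sends no Accept-Encoding header, or the file type isn’t in the list, the request goes to the origin untouched and the uncompressed file is served.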
But why would you want to do it just for gzip support if CF already has a nice radio button that does it for you?
A few reasons (which might seem small to some of you):
1. When the origin returns the files to CF, it does so without any compression, meaning the full file size is transferred. As stated in the docs:
The origin server returns an uncompressed version of the requested file to CloudFront.
It’s so important to you that the browser gets compressed files that you’re spending time reading about it here, but you wouldn’t do the same for the rest of the pipeline that serves the files..? Think about it 😃
2. CF compression adds time before the file gets to your users. Why not do that in advance and upload the files already compressed..?
Some of you won’t agree, because after the first compression CF caches the files and the above no longer happens. But it’s still important IMO: after every deployment (which in the agile world can happen even a few times a day), the first user asking for the files in each edge location gets a slower response than everyone else. Pre-compressing helps them as well 😃
3. You’re preparing yourself for the day Amazon stops removing entries from the Accept-Encoding header and you’re able to serve any compression algorithm you like 😎
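The “compress in advance” idea from reason 2 could look something like the build step below. It is only a sketch using Python’s standard library: the `precompress` function, the extension list and the `.gzip` suffix are assumptions for illustration, and a `.br` variant would need a third-party brotli binding, which is not shown here.

```python
import gzip
import shutil
from pathlib import Path

# Hypothetical list of file types worth compressing before upload.
COMPRESSIBLE = {".js", ".css", ".html", ".svg", ".json"}


def precompress(build_dir):
    """Write a <name>.gzip copy next to every compressible file in build_dir."""
    written = []
    # sorted() materialises the listing first, so the files created below
    # are never visited by this loop.
    for path in sorted(Path(build_dir).rglob("*")):
        if path.is_file() and path.suffix in COMPRESSIBLE:
            target = path.with_name(path.name + ".gzip")
            with open(path, "rb") as src, gzip.open(target, "wb", compresslevel=9) as dst:
                shutil.copyfileobj(src, dst)
            written.append(target)
    return written
```

Run it on your build output folder right before uploading to the origin, so both `myfile.js` and `myfile.js.gzip` end up there.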
OK got it! Now tell me why not brotli using Lambda@Edge and CF
In this article, using a Lambda@Edge viewer request event was suggested as a way to trick AWS: the function stores the Accept-Encoding values in a new made-up header called X-COMPRESSION.
That way, later on (if you’ve made the changes suggested above), you can check that header instead of the Accept-Encoding header and serve any algorithm you want.
The article also had a disclaimer on that approach:
Some further investigation would be needed to estimate the cost of this process, especially since these functions will be triggered for every user request.
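For reference, the viewer request trick boils down to something like the following. This is my own rough sketch in Python, not the code from that article, and the handler name is made up. I’m also assuming the X-COMPRESSION header would have to be forwarded to the origin in the cache behavior for the origin request function to see it.

```python
def viewer_request_handler(event, context):
    """Copy Accept-Encoding into X-COMPRESSION before CF normalizes it."""
    request = event["Records"][0]["cf"]["request"]
    headers = request["headers"]
    accept = headers.get("accept-encoding")
    if accept:
        # Lambda@Edge headers are keyed by lowercase name and hold a list
        # of {"key": ..., "value": ...} entries.
        headers["x-compression"] = [
            {"key": "X-COMPRESSION", "value": accept[0]["value"]}
        ]
    return request
```

Tiny as it is, this function still runs on every single request, which is exactly the problem described next.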
Why is that bad IMO? 😱
1. You just made your website slower
The viewer request event fires on every request to CF.
The origin request/origin response events, on the other hand, get called only if the file is NOT in the CF cache. That typically happens only after a deployment, for one customer per CF edge location.. that’s rare!
So does using br over gzip really save you that much time compared to adding a Lambda function that runs on every request to load your website? Even the fastest function needs to spin up first (because that’s what Lambda functions do..).
The answer is probably no and you just made your website load slower 🤦
🎉 Performance trick alert 🎉
Some of you might say:
But what about your index.html file? It cannot be cached on the client so it will also go to the origin every time…
That’s wrong! True, it’s not cached on the client, but it can be cached on the CF side, and after a deployment we just invalidate the CF cache... BOOM, just saved you around 300ms on your website loading!
All you need to do is modify your default cache behavior:
Instead of “Use Origin Cache Headers”, just set the cache TTLs yourself, and upon a new deployment just invalidate your cache. That’s it! Now the request for your index.html file will still reach CF, but CF won’t call the origin since it has the file in its cache (don’t forget to invalidate CF after a deployment..)
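The invalidation can be part of your deployment script. Here is a minimal sketch, assuming you deploy with Python and have boto3 (the AWS SDK) installed and credentials configured. The helper names, the default `/index.html` path and the distribution ID you pass in are placeholders for illustration.

```python
import time


def build_invalidation_batch(paths, caller_reference=None):
    """Build the InvalidationBatch structure CloudFront expects."""
    paths = list(paths)
    return {
        "Paths": {"Quantity": len(paths), "Items": paths},
        # CallerReference must be unique per invalidation request.
        "CallerReference": caller_reference or str(time.time()),
    }


def invalidate_after_deploy(distribution_id, paths=("/index.html",)):
    """Ask CloudFront to drop the cached copies of the given paths."""
    import boto3  # third-party AWS SDK, imported lazily

    client = boto3.client("cloudfront")
    return client.create_invalidation(
        DistributionId=distribution_id,
        InvalidationBatch=build_invalidation_batch(paths),
    )
```

Call `invalidate_after_deploy` with your own distribution ID as the last step of the deployment, after the new files are already in the origin.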
2. Congratulations! You’re now paying more money to Amazon 🎉
Maybe that’s the true reason AWS strips entries from the Accept-Encoding header, thinking: “hmmm if someone really really wants to use a different algorithm, they should have a Lambda@Edge viewer request event trigger every time they load their website and then pay us a LOT since a LOT of events will be triggered.. mohahahaha 😈”
Don’t pay extra for that. It’s not worth it.
I’m sold, but I still really want to use brotli eventually 😢
H̶e̶l̶p̶ ̶u̶s̶ ̶c̶o̶n̶v̶i̶n̶c̶e̶ ̶A̶W̶S̶ ̶t̶o̶ ̶s̶t̶o̶p̶ ̶r̶e̶m̶o̶v̶i̶n̶g̶ ̶i̶n̶f̶o̶r̶m̶a̶t̶i̶o̶n̶ ̶f̶r̶o̶m̶ ̶t̶h̶e̶ ̶A̶c̶c̶e̶p̶t̶-̶E̶n̶c̶o̶d̶i̶n̶g̶ ̶h̶e̶a̶d̶e̶r̶ ̶b̶y̶ ̶a̶d̶d̶i̶n̶g̶ ̶👍̶ ̶o̶n̶ ̶t̶h̶e̶ ̶i̶s̶s̶u̶e̶ ̶i̶n̶ ̶t̶h̶e̶ ̶A̶W̶S̶ ̶f̶o̶r̶u̶m̶.̶
Edit: On July 11th, 2018 Amazon added support for whitelisting the Accept-Encoding HTTP header 🎉 We did it! You can now use brotli without using the viewer request.
✅ As I stated above, I recommend working with the Lambda@Edge origin request/response events to serve pre-compressed files instead of using the CF compression feature.
✅ Don’t use the viewer request event to support algorithms other than gzip. It just isn’t worth the speed and cost..
✅ Also cache your index.html file in CF to speed things up with almost zero effort.