Running Selenium and Headless Chrome on AWS Lambda
There seems to be a small community of developers building interesting browser things on AWS Lambda with Headless Chrome.
Around a month ago, I was also prototyping lambdium, a small AWS Lambda project that runs Selenium WebDriver with Headless Chrome inside a Lambda function (disclaimer: no relation to laudanum).
I wanted to share the general architecture of the app and some things I learned while building it to help any other developers wanting to give serverless and Headless Chrome a try.
There’s also a great post by Ken Soh that gives an overview of other projects experimenting with Headless Chrome and AWS Lambda. The comments in a recent Hacker News post about the Chromeless project are worth a read, too.
Binaries not included
In brief, AWS Lambda is just an event-driven Linux container on EC2 with some special cgroup sauce (see my Gluecon talk for more on that). Because of that, you can run almost any non-privileged process inside Lambda, just as you would in a container. From the main function handler, you invoke a 64-bit Linux binary that doesn’t require special privileges using something like Node.js’s child_process module or Python’s subprocess library.
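As a rough illustration of that pattern, here is a minimal Node.js sketch of a handler that shells out to a bundled binary with child_process. The binary name and location (`bin/my-binary`) and the `--version` argument are hypothetical placeholders, not anything from lambdium, and the helper name `runBinary` is made up for this example.

```javascript
const { spawnSync } = require('child_process');

// Hypothetical location of a binary shipped inside the function package.
// LAMBDA_TASK_ROOT is set by the Lambda runtime to the deployed code directory;
// outside Lambda this falls back to the current working directory.
const BINARY_PATH = `${process.env.LAMBDA_TASK_ROOT || process.cwd()}/bin/my-binary`;

// Run a binary to completion and collect its exit status and output.
function runBinary(binaryPath, args = []) {
  const result = spawnSync(binaryPath, args, { encoding: 'utf8' });
  if (result.error) {
    // The process could not be started at all (missing file, no execute bit, ...).
    throw result.error;
  }
  return { status: result.status, stdout: result.stdout, stderr: result.stderr };
}

// Lambda entry point: invoke the binary and return what it printed.
async function handler(event) {
  const { status, stdout } = runBinary(BINARY_PATH, ['--version']);
  return { status, stdout: stdout.trim() };
}

exports.handler = handler;
```

A synchronous call keeps the sketch short. A long-running process such as a browser would more likely be started with `spawn` and left running for the duration of the invocation.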
Complex Linux binaries like chromium, though, need lots of shared libraries and assume a couple things about the Linux environment that don’t play perfectly with Lambda. Marco Lüthy came to the rescue in early March and figured out how exactly to get the chromium binary working. His serverless-chrome project is now the reference implementation for many projects, including chromeless and lambdium.
With the ability to run Headless Chrome in AWS Lambda, I was curious whether I could get Selenium tests to run using ChromeDriver. Then, using one of the many Selenium client libraries, I could run pre-existing tests in different languages. After many iterations of copying shared libraries around (I detailed the general recipe for getting untested binaries working on GitHub), ChromeDriver was able to run and interact with Headless Chrome inside a Lambda function. With ChromeDriver running, I just had to connect a Selenium client library to it.
Finally: running Selenium tests
By going through the process above, I was able to get ChromeDriver and Headless Chrome running inside an AWS Lambda function, barely making it under the 50 MB compressed size limit for functions (the Chrome binary is around 80% of this).
Surprisingly, running the Selenium tests using the Node.js selenium-webdriver module ended up being possible with just a few changes to the default options:
- There are a bunch of Headless Chrome-specific command line options you need to specify.
- You have to tell Selenium where to find the Headless Chrome binary (because it’s in AWS Lambda, it’s not in the PATH).
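The two changes above might look roughly like the following with the Node.js selenium-webdriver module. This is a sketch under assumptions, not lambdium's actual configuration: the binary path and the specific list of flags are illustrative guesses, and the function names are made up for this example.

```javascript
// Hypothetical path to the Headless Chrome binary inside the function package.
const CHROME_BINARY_PATH = '/var/task/bin/headless-chrome';

// Illustrative set of Chrome command line switches; the exact list a given
// Lambda setup needs is an assumption here and may differ.
function buildChromeArgs() {
  return [
    '--headless',
    '--disable-gpu',
    '--no-sandbox',
    '--window-size=1280,1024',
    // /tmp is the writable scratch directory inside Lambda.
    '--user-data-dir=/tmp/chrome-user-data',
  ];
}

// Build a WebDriver session that points Selenium at the bundled binary.
function buildDriver() {
  // Required lazily so the flag list can be inspected without Selenium installed.
  const { Builder } = require('selenium-webdriver');
  const chrome = require('selenium-webdriver/chrome');

  const options = new chrome.Options()
    .setChromeBinaryPath(CHROME_BINARY_PATH)
    .addArguments(...buildChromeArgs());

  return new Builder().forBrowser('chrome').setChromeOptions(options).build();
}

// Visit a page, read its title, and always shut the browser down afterwards.
async function fetchTitle(url) {
  const driver = await buildDriver();
  try {
    await driver.get(url);
    return await driver.getTitle();
  } finally {
    await driver.quit();
  }
}
```

Calling `fetchTitle('https://www.google.com')` from a handler would then be the kind of "visit a page and check the title" script described next.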
With that done, there was a mostly working Selenium WebDriver session with Headless Chrome running in AWS Lambda, ready to go. The output of a script that visits Google.com and checks the title looks like this in AWS CloudWatch (again, not much to see, because it’s headless after all):
According to the AWS Lambda calculator, if I ran that same function 1,000 more times, my monthly bill (not including the free tier of the service) would come to $0.04. I increased the function memory to 1152 MB, because running Chrome takes a lot more than the default memory size in AWS Lambda.
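To make that estimate concrete, here is a small sketch of the arithmetic. The prices ($0.00001667 per GB-second and $0.20 per million requests) are assumed from the published Lambda pricing of the time and may have changed, and the 2-second run time is a hypothetical figure, since the actual duration isn't stated above.

```javascript
// Assumed Lambda prices (free tier ignored); check current pricing before relying on them.
const PRICE_PER_GB_SECOND = 0.00001667;
const PRICE_PER_REQUEST = 0.20 / 1e6;

// Estimate cost as compute (GB-seconds) plus a flat per-request charge.
function estimateLambdaCost({ invocations, memoryMb, durationSeconds }) {
  const gbSeconds = invocations * durationSeconds * (memoryMb / 1024);
  return gbSeconds * PRICE_PER_GB_SECOND + invocations * PRICE_PER_REQUEST;
}

const cost = estimateLambdaCost({
  invocations: 1000,
  memoryMb: 1152,
  durationSeconds: 2, // hypothetical run time per invocation
});

console.log(`$${cost.toFixed(2)}`); // prints "$0.04"
```

Under those assumptions, 1,000 runs consume 2,250 GB-seconds, so nearly all of the cost is compute time and the per-request charge is negligible.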
Project setup and known issues
The prototype project, which I initially tried to design to be as “framework-less” and as boring as possible (it’s just regular Node.js, a Makefile, and some shell scripts), is up on GitHub. It shouldn’t be too hard to port to, say, Python, C# or Java (all AWS Lambda-supported languages). As of early January 2018, I migrated the project to use AWS SAM Local, and in March 2018 I ported it to the AWS Serverless Application Repository (one-click install!).
I’m still working on a new branch that should improve startup times, but that’s a subject for another post.
Will Headless Chrome shake things up in the browser QA/test world?
The duo of Headless Chrome and AWS Lambda seems to have a real chance of shaking up automated browser testing, due in no small part to how cost-effective serverless is for low-volume traffic like occasional browser testing in QA. With the generous Lambda free tier, you could probably run low-volume browser tests on AWS infrastructure for free.
Also, there’s no reason why Headless Chrome couldn’t work on other serverless compute platforms, and there seems to be plenty of interest in running it in Docker as well. I give it 6 months, max, before someone writes some sort of Kubernetes cluster that runs Headless Chrome at scale.
I think it’s a pretty fun time to be hacking on serverless compute platforms or even containers in general, especially with the recent release of Headless Chrome and the cool things that are going to be built with it.
Pull requests and questions are welcome. I also highly recommend you check out the other projects in Ken Soh’s post.