Implementing Selenoid with BrowserMob Proxy and an NGINX Server
Two years ago I began a new job at AppsFlyer as a Quality Automation Developer.
I started in a small group focusing on a small feature with plenty of time.
Since we are a SaaS company providing an advanced and complex dashboard that presents a massive amount of data, most of our testing of the product's dashboards is based on UI testing.
As the QA group was pretty small back then, the company used 3rd party services, such as BrowserStack and Ghost for Selenium testing.
But the number of tests kept increasing, and we needed a new solution to control the large scale of tests across the QA teams, which now stands at around 4.5K sessions per day. Our new solution needed to give us control over coding, debugging and maintaining the browser versions, and to provide value and impact for our quality efforts.
The traditional Selenium Grid Solution — (spoiler alert) it was catastrophic
The direction was pretty clear at the start: we needed a solution based on Selenium Grid for running multiple tests across different browsers, operating systems and machines in parallel. I remembered from past experience that the traditional Selenium Grid solution was catastrophic: it wasn't stable and had plenty of issues in the VM environment, maintaining the environment was very difficult, each machine represented a different browser type or version, and it used too many resources.
Our challenges were to find a stable, secure and reliable solution for our infrastructure in order to run as many browser sessions in parallel as possible.
Since AppsFlyer works with AWS, I had the relevant resources for my POC.
So I allocated 3 Amazon EC2 Spot Instances: one of the servers acted as a Hub, and the other two acted as Nodes (as described in the Selenium Grid documentation).
First I wrote a very simple Selenium test in Python. The test simply runs against a remote Chrome WebDriver, where the remote address is the Hub address.
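It looked roughly like the sketch below, assuming the Hub listens on its default port 4444 (the hostname is a placeholder):

```python
# Minimal remote Selenium test against the Grid Hub (Selenium 3 style).
from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities

driver = webdriver.Remote(
    command_executor="http://selenium-hub.internal:4444/wd/hub",  # Hub address (placeholder)
    desired_capabilities=DesiredCapabilities.CHROME,
)
try:
    driver.get("https://www.appsflyer.com")
    assert "AppsFlyer" in driver.title
finally:
    driver.quit()
```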
At the beginning of the POC, I tried to implement the traditional Selenium Grid solution, and as expected, it soon turned into a mess.
As long as I ran a single test, it worked pretty well, but when we started scaling up, the instability appeared very quickly. It took two long weeks of struggling to achieve stability for concurrent sessions.
There had to be a better way to run Selenium tests in 2019, and since I had the time, I expanded my research across the web for the perfect solution.
The Selenoid Solution
During my research, I came across an interesting solution called Selenoid, a powerful Golang implementation of the original Selenium hub code that uses Docker to launch browsers. You get a fresh Dockerized environment for each browser session, which is terminated at the end of the test, once the driver's close function runs in your code.
This Docker-based solution caught my eye, and I settled down to implement it as quickly as possible. The Selenoid project documentation is very clear and detailed; you can configure your environment with just one script command:
$ ./cm selenoid start --vnc
Our needs were different though, and we required our own configuration for setting up the Selenoid environment. We also used the GGR solution, Go Grid Router, which manages all sessions across the Selenoid machines. I decided to wrap the whole configuration in a docker-compose.yml file and added it to our AWS Chef recipe, so each time a machine initializes it contains the full configuration.
Below you can see the configuration of the GGR, the GGR-UI that gathers all Selenoid UI machines, and the BrowserMob Proxy that we'll talk about later. You can also see that the Docker images come from our local repository; they can be downloaded from the Aerokube page on Docker Hub.
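A hedged sketch of that GGR docker-compose.yml; the image tags, ports and mounted paths are assumptions, since our real images and paths live in our private repository:

```yaml
# GGR machine: router, its UI aggregator and BrowserMob Proxy (sketch).
version: '3'
services:
  ggr:
    image: aerokube/ggr:latest        # pulled from our local registry in practice
    restart: always
    ports:
      - "4444:4444"                   # all Remote WebDriver traffic enters here
    volumes:
      - ./grid-router:/etc/grid-router:ro   # users.htpasswd + quota XML files

  ggr-ui:
    image: aerokube/ggr-ui:latest
    restart: always
    ports:
      - "8888:8888"
    volumes:
      - ./grid-router/quota:/etc/grid-router/quota:ro  # discovers Selenoid UI hosts from the quota files

  browsermob-proxy:
    image: browsermob/browsermob-proxy:latest  # hypothetical tag; we build this image ourselves
    restart: always
    ports:
      - "9090:9090"                   # BrowserMob REST API
      - "9092-9099:9092-9099"         # per-test proxy ports
```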
The GGR machine is now our Remote Webdriver address and all tests run against its hostname.
The GGR also contains XML quota files that list all Selenoid machines. We set the number of sessions for each browser version on each machine, as you can see here:
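A hedged example of such a quota file (the file is named after the GGR user, e.g. test.xml for the user test; hostnames, regions and session counts are placeholders):

```xml
<qa:browsers xmlns:qa="urn:config.gridrouter.qatools.ru">
    <browser name="chrome" defaultVersion="80.0">
        <version number="80.0">
            <region name="us-east-1">
                <!-- "count" caps the parallel sessions for this browser version on this machine -->
                <host name="selenoid-1.internal" port="4444" count="5"/>
                <host name="selenoid-2.internal" port="4444" count="5"/>
            </region>
        </version>
        <version number="79.0">
            <region name="us-east-1">
                <host name="selenoid-2.internal" port="4444" count="3"/>
            </region>
        </version>
    </browser>
</qa:browsers>
```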
On the remote Selenoid machines, we had a separate YML file for the Selenoid hub and the Selenoid-UI Docker containers. In the "command" section below you can see the Selenoid CLI flags used for custom configuration.
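A hedged sketch of that per-machine file; the flag values, paths and limits are assumptions rather than our exact production settings:

```yaml
# Selenoid machine: hub + UI containers (sketch).
version: '3'
services:
  selenoid:
    image: aerokube/selenoid:latest
    restart: always
    network_mode: bridge
    ports:
      - "4444:4444"
    volumes:
      - ./selenoid:/etc/selenoid:ro               # browsers.json lives here
      - /var/run/docker.sock:/var/run/docker.sock # lets Selenoid start browser containers
    command: ["-conf", "/etc/selenoid/browsers.json",
              "-limit", "10",                     # max parallel sessions on this machine
              "-timeout", "3m"]

  selenoid-ui:
    image: aerokube/selenoid-ui:latest
    restart: always
    network_mode: bridge
    links:
      - selenoid
    ports:
      - "8080:8080"
    command: ["--selenoid-uri", "http://selenoid:4444"]
```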
So, how does it work now?
The test runs from a local computer or a Jenkins job; in the desired capabilities we choose the browser version and the stage of the run (remote). Once the test is running, the session is created, and the GGR XML file allocates it, by its browser version, to an available Selenoid machine that is ready to receive it.
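In Python that selection looks roughly like this; the GGR hostname, the credentials and the custom stage capability name are assumptions based on the description above:

```python
# Choosing browser version and run stage through desired capabilities (sketch).
from selenium import webdriver

capabilities = {
    "browserName": "chrome",
    "version": "80.0",     # matched by GGR against its quota XML
    "enableVNC": True,     # Selenoid capability: live VNC into the browser container
    "stage": "remote",     # custom capability used by our framework (assumed name)
}

driver = webdriver.Remote(
    # GGR authenticates with the users.htpasswd credentials
    command_executor="http://user:password@ggr.internal:4444/wd/hub",
    desired_capabilities=capabilities,
)
```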
On the Selenoid machine, we have a browsers.json configuration file that points to the requested Docker image. For example, if we run our test on Chrome version 80, the GGR allocates an available Selenoid machine from its test.xml file. On the Selenoid machine, browsers.json initiates the Chrome 80 Docker image in a Chrome 80 container, where the test runs with full VNC access.
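A hedged example of such a browsers.json (the image tags are assumptions; ours come from the local registry):

```json
{
  "chrome": {
    "default": "80.0",
    "versions": {
      "80.0": {
        "image": "selenoid/vnc_chrome:80.0",
        "port": "4444",
        "path": "/"
      },
      "79.0": {
        "image": "selenoid/vnc_chrome:79.0",
        "port": "4444",
        "path": "/"
      }
    }
  }
}
```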
Selenoid and BrowserMob Proxy
Another challenge we needed to deal with was trapping the XHR requests from the browser's developer tools (something that will be supported natively in Selenium 4). We wanted to catch all the REST API requests being sent after a page load. Some short research revealed the BrowserMob Proxy project: BrowserMob Proxy allows you to manipulate HTTP requests and responses, capture HTTP content and export performance data as a HAR file.
We decided to install it on the GGR machine, since all sessions run through it, so we added its configuration to the GGR docker-compose.yml file shown earlier.
To initiate the proxy, we send stage:proxy in our test's desired capabilities. This initiates the proxy server, and we can use it for the rest of the test.
The proxy server listens on port 9090, and the client port range is 9092–9099. The following code returns all the XHR requests fired after loading a specific page in your Selenium test:
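A hedged sketch of that flow with the browsermob-proxy Python client; the hostnames, the page URL and the XHR filter (here, JSON responses) are assumptions:

```python
# Capture the REST/XHR calls fired after a page load via BrowserMob Proxy (sketch).
from browsermobproxy import Client
from selenium import webdriver

# Ask the BrowserMob REST API (port 9090 on the GGR machine) to spawn a proxy
# on one of the client ports (9092-9099).
proxy = Client("ggr.internal:9090")

capabilities = {"browserName": "chrome", "version": "80.0", "enableVNC": True}
proxy.selenium_proxy().add_to_capabilities(capabilities)  # route the browser through the proxy

driver = webdriver.Remote(
    command_executor="http://user:password@ggr.internal:4444/wd/hub",
    desired_capabilities=capabilities,
)

proxy.new_har("dashboard", options={"captureContent": True})
driver.get("https://example-dashboard.appsflyer.com")  # placeholder URL

# Keep only the entries that look like XHR/REST calls (JSON responses).
xhr_entries = [
    entry for entry in proxy.har["log"]["entries"]
    if "json" in (entry["response"]["content"].get("mimeType") or "")
]

proxy.close()
driver.quit()
```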
Selenoid and NGINX
Our last challenge came from our PBA team. Our PBA customers implement the AppsFlyer Web SDK in their website code; once the page is loaded, it pulls our latest SDK version for web attribution. One of our developers' challenges was to test the SDK version in a staging environment; they didn't know how the new version would affect the customer's website.
We wanted to redirect the first load of the SDK from our production path to our staging path, in order to test the Beta SDK on a real customer’s site without damaging anything.
What was the final solution?
We run our tests with a custom Chrome build that we created: we wrapped the Chrome version with our own browser certificates and created our own version of the Selenoid browser Docker image.
Once the test runs with this specific version, the REST calls for pulling the SDK are redirected through the NGINX server, and our staging SDK version is pulled from our CDN. Now the developer can test their SDK version before releasing it to production. We can also control other REST calls with the NGINX server by redirecting the traffic to our namespace (staging environment).
The NGINX default-ssl configuration default-ssl.conf:
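A hedged sketch (server names, upstream hosts and certificate paths are placeholders, not our real production values):

```nginx
# default-ssl.conf (sketch): redirect the production Web SDK path to the staging SDK.
server {
    listen 443 ssl;
    server_name websdk.appsflyer-prod.example;       # SDK host the test browser resolves to this NGINX

    # The custom Chrome image trusts these certificates, so HTTPS interception works
    ssl_certificate     /etc/nginx/ssl/selenoid.crt;
    ssl_certificate_key /etc/nginx/ssl/selenoid.key;

    # First load of the SDK goes to the staging (beta) version on our CDN
    location /sdk/ {
        proxy_pass https://staging-cdn.example.com/web-sdk-beta/;
        proxy_set_header Host staging-cdn.example.com;
    }

    # Everything else keeps flowing to production untouched
    location / {
        proxy_pass https://production-origin.example.com;
    }
}
```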
All in all, this was an interesting and complex, multi-faceted project, but it was well worth the effort to attain a great working solution, and we gained additional experience and knowledge along the way!