Sersan: Auto-Scalable Test Automation in Kubernetes

Dimas Aryo Prakoso
Sale Stock Engineering
Nov 28, 2019

Testing has been a major part of our software development cycle. We have adopted manual and automated testing since the early days of Sorabel (formerly Sale Stock). At Sorabel, we release several times a day, so it is mandatory to keep our software quality high while shipping changes as fast and as frequently as possible. We invest in automated tests to maintain our sanity in ensuring all software functionality works properly, given our limited capacity of QA Analysts and Engineers. We use automated tests heavily for regression tests, smoke tests, and core tests (mandatory test suites that act as release gatekeepers).


We chose Selenium as the foundation of our automated tests due to its maturity and rich feature set. It is compatible with Appium, so we can extend our framework to run against our native app. Software engineers at Sorabel write functional tests on top of our in-house test automation framework. As our system grows in features and capability, the number of test suites to execute has also increased tremendously, spread across several deployment stages. Running test suites that take hours no longer makes sense for us, hence we embrace parallelisation when running test suites in the various stages.

It may look overwhelming, but running automated tests promptly is crucial for us. We want better safety and faster iteration in development, since automated tests help us detect bugs early and mitigate issues faster at various stages. Having automated tests run against various versions and environments means less manual testing effort spent checking for regressions, so we can focus on exciting initiatives or have our weekends undisturbed by critical bugs.

At Sorabel, we strive to do our best and push past limitations. After months of research, our QA Engineers came up with the idea of building something like Selenium Grid with an auto-provisioning mechanism powered by the Kubernetes API.

Our main goal is to reduce maintenance effort at a reasonable cost. Moreover, we would like a more resilient ecosystem so our engineers can spend time on more productive things rather than fixing issues in the test automation infrastructure. Here are the features we would like to achieve:

  • Compatible with the WebDriver protocol. We do not need to make any changes to our test framework (see the sketch after this list).
  • Auto-scalable. It automatically scales browsers up or down according to the number of incoming tests.
  • Resilient. Any failure will not disrupt the running tests.
  • Reasonable cost. It can run on pre-emptible nodes.
  • Easy to set up and low maintenance: a single binary deployment handles everything and works with any existing browser image without modification.
  • Able to handle highly concurrent tests seamlessly. We would like a single service to run hundreds of tests from all projects with various browser specifications.
  • Able to see what is going on in the browser during test execution, for debugging.
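
Because Sersan is compatible with the WebDriver protocol, an existing Selenium test only needs its remote URL pointed at Sersan. Below is a minimal Python sketch, assuming Sersan is reachable in the cluster at http://sersan:4444/wd/hub; the host, port, and capability values here are illustrative rather than Sersan's documented defaults.

    # Point an ordinary Selenium RemoteWebDriver at Sersan instead of a Selenium hub.
    from selenium import webdriver

    capabilities = {"browserName": "chrome", "version": "78.0"}  # illustrative values

    driver = webdriver.Remote(
        command_executor="http://sersan:4444/wd/hub",   # assumed in-cluster address
        desired_capabilities=capabilities,
    )
    driver.get("https://example.com")
    print(driver.title)
    driver.quit()  # triggers the delete-session flow described later in this article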

Hello Sersan!

Execution time matters to us, and running tests in parallel is mandatory. We developed Sersan, an auto-scalable browser test automation service that runs in Kubernetes and leverages Selenium's features. Sersan builds on the capabilities of Selenium Grid and adds features to optimise its scalability and reliability when running in a Kubernetes cluster. With Selenium Grid, we can easily spawn a node for a test and wait for it to register with the Selenium hub before we can run the test. Although it is easy to scale up nodes in Kubernetes, some effort is still required when running tests with higher diversity, such as various browser compositions.

We may encounter cases in a software project where some test suites need to run in Chrome while others need Firefox, or various combinations of browsers and versions. The possible solutions are either to spin up some nodes running Chrome and some running Firefox, or to run the Firefox tests after the Chrome tests have finished. It is also worth noting that it is difficult to scale down nodes while some tests are still running, even though some browser nodes are idle; as a startup, we need to use our limited resources efficiently.

Sersan is built to automate that effort and simplify the setup. It manages the node lifecycle and ensures each node is always in a healthy and clean state. We no longer need to worry about browser composition and scaling because Sersan takes care of all of that. Whenever Sersan receives a new session request, it spins up a browser node based on the requested capabilities and deletes it after the test finishes. Sersan is stateless, which makes it possible to run it on pre-emptible nodes (to save even more cost) or to do maintenance such as scaling up or deploying a new version of Sersan while tests are still running. Sersan simplifies complex browser compositions: configuring the available browser versions is as easy as editing a YAML file, as sketched below.
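
As an illustration only, such a configuration could map each browser name and version to a container image, as in the hypothetical structure below (shown as the Python dict an equivalent YAML file would load into; the keys and image tags are assumptions, not Sersan's actual schema).

    # Hypothetical browser configuration: browser name -> version -> container image.
    # A YAML file of the same shape would parse into this structure with yaml.safe_load().
    browsers = {
        "chrome": {
            "default": "78.0",
            "versions": {
                "78.0": {"image": "selenium/standalone-chrome:latest", "port": 4444},
            },
        },
        "firefox": {
            "default": "70.0",
            "versions": {
                "70.0": {"image": "selenium/standalone-firefox:latest", "port": 4444},
            },
        },
    }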

How It Works

Each time Sersan receives a new session request, it tries to find an available browser image based on the browser and version specified in the capabilities. If one is found, Sersan requests pod creation from the Kubernetes API. Sersan does not contain any queueing mechanism because that is already handled by Kubernetes. Once the browser pod is running, Sersan forwards the new session request to that pod immediately, without waiting for other sessions. Information about the original session ID, the pod IP and port, and the VNC port (if available) is packed into a JSON Web Token (JWT). This token is later used to determine where the test is running.

New session
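
A rough sketch of that new-session path is shown below, written with the official Kubernetes Python client and PyJWT purely for illustration. The pod spec, helper names, namespace, and JWT claim fields are assumptions; the real Sersan implementation is not reproduced here. It reuses the hypothetical browsers mapping from the earlier sketch.

    import time
    import uuid

    import jwt        # PyJWT, used to pack routing info into a token
    import requests
    from kubernetes import client, config

    config.load_incluster_config()   # Sersan itself runs inside the cluster
    core = client.CoreV1Api()
    SECRET = "replace-me"            # signing key for the routing token

    def create_session(capabilities: dict) -> str:
        browser = capabilities.get("browserName", "chrome")
        version = capabilities.get("version", browsers[browser]["default"])
        image = browsers[browser]["versions"][version]["image"]

        # 1. Ask the Kubernetes API for a fresh browser pod. Kubernetes itself
        #    acts as the queue, so there is no queueing logic in Sersan.
        pod_name = f"browser-{uuid.uuid4().hex[:8]}"
        pod = client.V1Pod(
            metadata=client.V1ObjectMeta(name=pod_name, labels={"app": "sersan-browser"}),
            spec=client.V1PodSpec(
                restart_policy="Never",
                containers=[client.V1Container(name="browser", image=image)],
            ),
        )
        core.create_namespaced_pod(namespace="default", body=pod)

        # 2. Wait until the pod is running and has an IP (simplified polling).
        while True:
            status = core.read_namespaced_pod(pod_name, "default").status
            if status.pod_ip and status.phase == "Running":
                break
            time.sleep(1)

        # 3. Forward the new-session request straight to that pod.
        resp = requests.post(
            f"http://{status.pod_ip}:4444/wd/hub/session",
            json={"desiredCapabilities": capabilities},
        )
        upstream_session = resp.json()["sessionId"]   # JSON Wire Protocol response shape

        # 4. Pack the routing information into a JWT and return it to the
        #    client as its session ID.
        return jwt.encode(
            {"sessionId": upstream_session, "host": status.pod_ip, "port": 4444,
             "vncPort": 5900, "pod": pod_name},
            SECRET,
            algorithm="HS256",
        )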

Requests other than a new session are forwarded directly to the browser pod, based on the information decoded from the JWT carried in the client's session ID.

Common action in WebDriver session
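
Continuing the same sketch, the forwarding step could look like the following; the helper name and proxying details are assumptions (SECRET and the imports come from the new-session sketch above).

    def forward(client_session_id: str, method: str, path: str, body=None):
        # Decode the routing token minted at session creation to locate the pod,
        # then proxy the request as-is (paths such as /url, /element, /screenshot).
        claims = jwt.decode(client_session_id, SECRET, algorithms=["HS256"])
        url = (f"http://{claims['host']}:{claims['port']}"
               f"/wd/hub/session/{claims['sessionId']}{path}")
        return requests.request(method, url, json=body)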

Finally, when Sersan receives a delete session request, it simply forwards that request to the right browser pod and then requests pod deletion from the Kubernetes API.

Delete session
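
In the same illustrative style, the delete-session path could be sketched as follows (core, SECRET, and the namespace come from the earlier sketch and are assumptions).

    def delete_session(client_session_id: str):
        claims = jwt.decode(client_session_id, SECRET, algorithms=["HS256"])

        # Forward DELETE /session/{id} to the browser pod so the browser quits cleanly...
        requests.delete(
            f"http://{claims['host']}:{claims['port']}/wd/hub/session/{claims['sessionId']}"
        )

        # ...then ask the Kubernetes API to remove the pod and release its resources.
        core.delete_namespaced_pod(name=claims["pod"], namespace="default")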

Now, what happens if the client never requests deletion? The browser pod will destroy itself after a specified timeout. When Sersan requests pod creation from the Kubernetes API, it injects a timeout script into the pod manifest. If the timeout is reached, the browser pod executes exit 0; the pod is then marked as Completed and all reserved resources are released.
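
One way such an injection could look is to wrap the browser image's entrypoint with a timeout when building the container spec. In the fragment below, the entrypoint path (that of the official Selenium images) and the timeout value are assumptions for illustration; the script Sersan actually injects is its own.

    # Hypothetical container spec fragment: wrap the entrypoint with a timeout so the
    # container exits 0 on its own if the client never deletes the session. Combined
    # with restartPolicy: Never, the pod ends up in the Completed state.
    timeout_seconds = 1800  # illustrative value

    container = client.V1Container(
        name="browser",
        image=image,
        command=["/bin/sh", "-c"],
        args=[f"timeout {timeout_seconds} /opt/bin/entry_point.sh; exit 0"],
    )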

Cost

Memory and CPU usage depend on many factors, e.g. concurrency, browser type, test duration, and so on. We can use observability tools like Grafana or honeycomb.io to get a memory and CPU utilisation profile. In our case, based on the average resource utilisation, we set the browser pod's CPU request to 400m, CPU limit to 500m, memory request to 500Mi, and memory limit to 1000Mi.

CPU and Memory Utilisation
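
Those numbers translate directly into the pod's resource requirements, sketched here with the same Python client types used earlier, purely for illustration.

    resources = client.V1ResourceRequirements(
        requests={"cpu": "400m", "memory": "500Mi"},   # based on average utilisation
        limits={"cpu": "500m", "memory": "1000Mi"},    # ceiling before throttling / OOM kill
    )
    # Attached to the browser container, e.g.
    # client.V1Container(name="browser", image=image, resources=resources)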

From this profile, we could determine which machine type suits our needs. We decided to use a custom machine type with 4 vCPUs and 8 GB of memory, which can accommodate 7 concurrent tests at the same time. It only costs about $0.03 per hour (persistent disk not included). If we want higher concurrency, we can enable the cluster autoscaler feature on the node pool; it will automatically scale the nodes up or down according to the number of incoming tests. Browser pods only live for a short period, so we can use pre-emptible nodes to lower the cost further.
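
As a back-of-the-envelope check on that figure, memory is the binding constraint with the limits above; reserving roughly 1 GB for the OS and kube-system (an assumed overhead, not a number from our profile) leaves room for about 7 browser pods per node.

    node_memory_mib = 8 * 1024        # 8 GB custom machine type
    reserved_mib = 1024               # assumed OS + kube-system overhead
    pod_memory_limit_mib = 1000       # per-browser memory limit from above

    concurrent_tests = (node_memory_mib - reserved_mib) // pod_memory_limit_mib
    print(concurrent_tests)           # -> 7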

Sersan is already used in production at Sorabel to perform hundreds of concurrent tests, and there is still plenty of work ahead to make it better. Sersan enables us to run test automation at scale while being prudent with our infrastructure budget. As this article is being written, we are experimenting with having Sersan orchestrate automated tests for our Android app.

Sersan is open source and will be available soon on GitHub; feel free to contribute. We are keen to have discussions on how to improve Sersan. We are also looking for software engineers who are passionate about testing and building testing platforms at Sorabel; check out our career site for more information.

Edited by Benedictus Yoga
