Selenium: Reflections about Cluster Economics
Welcome back to my series of articles about the wonderful world of Selenium testing! Previously I mostly covered the technical issues you may encounter while building your own Selenium cluster. Today we are going to touch on Selenium cluster economics.
Why do economics matter?
In the IT community we all like to talk about new technologies (tools, frameworks, hardware) and methodologies (agile, extreme programming and so on). Both serve the same goal: delivering a product that meets customer requirements with reasonable quality in a reasonable time. Focused only on requirements and schedule, a lot of software engineers never calculate how much their work costs the company.
In the real world, however, this cost very often determines whether the company will survive. From the business point of view every company has two cumulative parameters: revenue and expenses. Revenue is usually determined by the total amount of products or services sold. Increasing overall revenue is a job for the sales and marketing teams; a software engineer can hardly influence it. Expenses, on the contrary, grow in direct proportion to the effort spent delivering the product. That is to say, the more time your team spends working on the same product, the more expenses you have. With constant revenue, more expenses mean a less efficient business. Simple as that.
While a lot of decisions about a reasonable level of expenses are made by top managers, there is at least one domain where almost any software engineer can influence the situation: the means of production, i.e. tools and frameworks. Managers usually delegate technical choices such as the preferred programming language, IDE, framework and accompanying tools to the development team, which simply knows more about them. The main reason we use various tools is that they usually decrease the total effort needed to deliver the same piece of software. A tool only makes sense if the benefit from the reduced effort is greater than the total expense of buying and adopting it.
So by choosing the right tools a software developer can decrease company expenses and thus directly influence business efficiency. Being a part of the development cycle, software testing follows the same economic rules, and so does automated testing with Selenium. Returning to Selenium clusters: our experience shows that a lot of development teams decide to create and maintain Selenium clusters themselves. Very often they underestimate the overall cost of a Selenium cluster and thus decrease business efficiency. In the next sections I would like to calculate how much it costs to run your own Selenium cluster in the cloud and provide a criterion for when it makes sense.
Selenium Cluster: How much does it cost?
To understand how much it costs to have your own Selenium cluster, let’s first of all enumerate its integral parts. A typical Selenium cluster consists of:
- A set of machines with running browsers. One desktop browser running in a Linux Docker container requires approximately 1 CPU core and 1 Gb RAM (this can vary depending on the tested application). More complicated platforms such as Android and Windows require twice as much CPU and memory. In a fault-tolerant cluster these machines should be distributed across two or more datacenters. We recommend using Selenoid as an efficient browser session launcher.
- A Selenium load-balancer. It is needed to distribute the load across browser machines. We recommend the open-source Ggr load-balancer for that purpose. One instance of Ggr consumes at most 1 CPU and 1 GB RAM. In a fault-tolerant cluster it is sufficient to have two Ggr instances in two datacenters.
- A network load-balancer. It provides a single entry-point (IP-address) to the cluster.
- Reliable storage. It is used to store Selenium session logs and recorded videos.
Certainly your company can have its own datacenters and network engineers, but the most common case is to use a cloud platform: Amazon Web Services, Google Cloud, Microsoft Azure and so on. In this article I will use Amazon Web Services to calculate the price because it seems to be one of the most popular enterprise cloud platforms and we often receive questions about it in our support channel. As an example, let’s calculate the total cluster cost for 10 parallel sessions.
For 10 parallel Selenium sessions we need to start 4 on-demand virtual machines:
- 2 x 1 CPU, 1 Gb RAM for Ggr load balancer instances. For example, an AWS t2.small is suitable. Currently it costs $0.023-$0.027 per hour depending on region, that is to say up to $20 per month each.
- 2 x 4 CPU, 16 Gb RAM for instances with browsers. Every machine can run 5 browsers in parallel. A suitable AWS instance type would be m5.xlarge, costing $0.19-$0.23 per hour depending on region, i.e. up to $171 monthly.
If we sum up virtual machine prices and divide by 10 parallel sessions we get:
(2 * $20 + 2 * $171) / 10 = ~ $38 per parallel session per month
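If you want to plug in the prices for your own region, the same arithmetic fits in a few lines of Python. This is only a sketch: the function names are mine, the hourly prices are the upper bounds quoted above, and a 31-day month is assumed.

```python
HOURS_PER_MONTH = 24 * 31  # assumption: a 31-day month billed around the clock


def monthly_price(hourly_price):
    """Convert an on-demand hourly price into a monthly one."""
    return hourly_price * HOURS_PER_MONTH


def vm_cost_per_session(instances, parallel_sessions):
    """instances is a list of (count, hourly_price) pairs."""
    total = sum(count * monthly_price(price) for count, price in instances)
    return total / parallel_sessions


# 2 x t2.small for Ggr and 2 x m5.xlarge for browsers, upper-bound prices
per_session = vm_cost_per_session([(2, 0.027), (2, 0.23)], parallel_sessions=10)
print(f"~ ${per_session:.2f} per parallel session per month")
```

With the unrounded monthly prices the result lands slightly above $38, which agrees with the rounded figure.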
Now let’s add the load-balancer price. AWS provides 3 types of load balancers: application load balancer, network load balancer and classic load balancer. Any of them is suitable for Selenium, so let’s take the cheapest one. The load-balancer itself starts at $0.022 per hour, plus charges for traffic and network connections. So the minimum price would be
$ 0.022 * 24 * 31 = ~ $16 per month or
~ $1.6 per session per month.
To estimate the maximum S3 storage price we assume that every Selenium session records video and needs its log stored too. By default Selenoid uses the h264 codec, and a short 10 second video takes approximately 150 kilobytes of disk space. So per minute we get 150 Kb * 6 = 900 Kb, and adding a log file we can assume that every Selenium session eats 1 Mb of disk space every minute or
60 * 24 * 30 = 43200 Mb ~ 43 Gb per month. Usually we don't need 1 month old videos, 2 weeks are sufficient, so the last number can be divided by 2, i.e. every Selenium session can consume roughly 20 Gb of S3 disk space monthly. Standard S3 storage costs $0.02 - $0.03 per Gb per month depending on region. Thus S3 storage costs approximately
20 Gb * $0.03 = $0.6 per month for each session.
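The storage estimate can be reproduced the same way. The sketch below assumes, as above, that a session writes about 1 Mb per minute around the clock; the function name and the 14-day retention parameter are my own illustration.

```python
def storage_cost_per_session(mb_per_minute=1.0, retention_days=14, price_per_gb=0.03):
    """Return (stored Gb, monthly S3 cost) for one parallel session."""
    minutes_kept = 60 * 24 * retention_days          # minutes of recordings retained
    stored_gb = mb_per_minute * minutes_kept / 1000  # decimal units: 1 Gb = 1000 Mb
    return stored_gb, stored_gb * price_per_gb


stored_gb, cost = storage_cost_per_session()
print(f"{stored_gb:.1f} Gb stored, ~ ${cost:.2f} per session per month")
```

Changing `retention_days` or `price_per_gb` shows quickly that storage stays a minor part of the bill even with a longer retention period.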
To get the approximate monthly cost of one Selenium session we need to sum up all previously calculated values:
$38 + $1.6 + $0.6 = ~ $40 per session per month
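To see which component dominates the bill, here is a small sketch that sums the three rounded figures and prints the share of each one. The dictionary keys are just labels I picked for illustration.

```python
# Rounded per-session monthly costs from the calculations above
costs = {
    "virtual machines": 38.0,
    "load balancer": 1.6,
    "S3 storage": 0.6,
}

total = sum(costs.values())
shares = {name: value / total for name, value in costs.items()}

for name, value in costs.items():
    print(f"{name:>16}: ${value:5.2f} ({shares[name]:.0%})")
print(f"{'total':>16}: ${total:5.2f}")
```

Virtual machines account for well over 90% of the total, so this is the part worth optimizing first.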
The approximate price above was calculated for mainstream browsers like Firefox, Chrome or Opera. These browsers can easily be packed into Docker images and run on standard Linux virtual machines without special requirements. If you need platforms like Android or Windows, things become more complicated. To run Selenium tests efficiently on these platforms you need twice the computing resources: 2 CPUs and at least 2 Gb RAM. We have already seen that virtual machines make up the majority of the overall Selenium cluster cost. But in the case of Android and Windows not every standard virtual machine is suitable. In order to work fast these machines have to support nested virtualization! Although it is supported in Google Cloud and Microsoft Azure, in AWS there are no instances with nested virtualization enabled. So the only solution here is to launch bare-metal instances. The cheapest bare-metal instance in AWS (i3.metal) costs approximately $5.5 per hour (i.e. ~ $4000 per month), and knowing that this instance can run up to 36 Android / Windows sessions in parallel you get ~ $110 of monthly expenses per session. So trying to run Android and Windows Selenium tests in the cloud can cost you a fortune!
All my previous calculations included only the cost of the computing resources needed to launch the cluster. But even a truly reliable cluster still requires one or more engineers to maintain it: update browser versions, help users resolve issues in tests and so on. Typical software engineer salaries differ from country to country, so I don’t provide exact numbers here. Still, depending on the total amount of computing resources, an engineer’s salary can be a significant part of the Selenium cluster cost and it certainly should be considered.
With all these numbers in hand, let me show you one more thing… It’s our new Selenium cloud platform called Browsers, which gives you a ready-to-use Selenium cluster in minutes.
Browsers offers only unlimited billing plans and provides a fault-tolerant Selenium cluster with the majority of popular browsers and platforms supported. A standard set of Firefox, Chrome and Opera browsers costs $40 per parallel session, while adding Android emulators and Windows-only browsers costs $80 per parallel session because these platforms require twice as much hardware. Logs and videos are saved automatically for every session when you request it and are available via direct URLs, so you can attach them to your test execution report.
So having your own cluster in AWS is even more expensive than using this cloud solution. One parallel browser session in AWS costs the same, and you additionally have to pay at least one engineer to look after the cluster and do periodic browser updates.
Just email us at firstname.lastname@example.org and we will be happy to deliver an account with any desired number of parallel sessions.
What if I need an in-house Selenium cluster?
It only makes sense to deploy your own cluster if you have your own hardware or if your company security policy does not allow using an external service. In that case you can dramatically decrease the Selenium cluster cost by using auto-scaling solutions such as Kubernetes. All mainstream cloud platforms including AWS provide Kubernetes out of the box. You can easily configure your Kubernetes cluster to remain small at night and during weekends and to grow automatically when you really need a lot of browsers for testing.
Let’s calculate how much money you could save by using Kubernetes auto-scaling for your Selenium cluster. AWS support for Kubernetes is called EKS (Elastic Kubernetes Service). Every such Kubernetes cluster consists of a master node serving the Kubernetes API and a set of worker nodes running on virtual machines of any desired flavor. You pay $0.2 per hour ($144 per month) for the master node (compare to Google Cloud where it is free) and the usual price for the selected virtual machines. Let’s assume we need the same virtual machines as before, that is to say 10 parallel sessions costing ~ $38 per month each, i.e. ~ $380 per month in total. Suppose your cluster load is 100% during working hours (8 hours per day) and 30% during the rest of the time. By using Kubernetes auto-scaling you now spend on virtual machines (not counting the master node fee):
$380 * 1/3 + $380 * 0.3 * 2/3 = ~ $200 per month = ~ $20 per parallel session per month
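The same formula is easy to replay for your own load profile. The sketch below is a simplification: it assumes capacity follows the load perfectly and is billed proportionally, and all names in it are mine. The EKS master node fee quoted above is kept as an optional fixed term, since the formula covers only the virtual machines.

```python
def autoscaled_monthly_cost(full_cost, peak_hours_per_day=8, off_peak_load=0.3,
                            fixed_fee=0.0):
    """Monthly cost when capacity follows the load instead of staying at 100%."""
    peak_share = peak_hours_per_day / 24
    variable = full_cost * peak_share + full_cost * off_peak_load * (1 - peak_share)
    return variable + fixed_fee


workers_only = autoscaled_monthly_cost(380)
with_master = autoscaled_monthly_cost(380, fixed_fee=144)
print(f"workers only: ~ ${workers_only:.0f}, with EKS master fee: ~ ${with_master:.0f}")
```

Since the master fee does not depend on the number of browsers, its weight per session shrinks as the cluster grows.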
With auto-scaling you spend almost half as much on virtual machines, so it is highly recommended.
If you need an in-house Selenium cluster running in Kubernetes or OpenShift, take a look at Moon.
Any More Articles?
Definitely. Consider these ones:
- Selenium on Windows: Docker Revolution
- Selenium: Exploring the Moon
- Selenium: back to the Moon
- Selenoid: storing data efficiently
- Selenoid: more Android sweets
- Selenium: a new hope (part I)
- Selenium: a new hope (part II)
- Selenium: done in 60 seconds
- Selenium on Windows: revisited
- Selenium: easy as a pie
- Selenium: an Apple story
- Selenium: Growing Muscles
- Selenium: Clear as a Bell