Playwright with dynamic parallel steps and HTML Publisher plugin

Playwright on Jenkins at scale with HTML reports

A comprehensive guide to running Playwright tests in parallel on Jenkins

Bruno Mańczak
Fandom Engineering

--

If you’re anything like me, sometimes you need to integrate old battle-tested solutions with new and shiny technologies. In today’s episode, my perennial butler called Jenkins was asked to run end-to-end tests using Playwright, the new hot and steamy test automation framework. Let the tutorial begin!

Part 1: The setup

To be able to run any tests on Jenkins, we need tests. If you have your own repo, skip this part.

For those of you who want to start from scratch, run npm init playwright demo in your terminal. The project creator will ask you a few questions. I chose TypeScript for my project and disabled GitHub Actions. This resulted in the creation of one test, tests/example.spec.ts:

import { test, expect } from '@playwright/test';

test('basic test', async ({ page }) => {
  await page.goto('https://playwright.dev/');
  await page.locator('text=Get started').click();
  await expect(page).toHaveTitle(/Getting started/);
});

Also, a file called playwright.config.ts was created, which stores the global configuration for our tests. By default, tests are configured there to run on 3 browsers: desktop Chrome, desktop Firefox, and desktop Safari (mobile versions are included too, but commented out).

Tests can now be run using npx playwright test

Example test being run in console

This is a good start, but we want many more tests to be able to showcase running them at scale. You could copy and paste the same test multiple times, or generate as many tests as you like dynamically, e.g. using the classic for() loop.

import { test, expect } from '@playwright/test';

// Generate 150 tests with unique names.
for (let i = 0; i < 150; i++) {
  test(`basic test number ${i}`, async ({ page }) => {
    await page.waitForTimeout(10 * 1000); // wait 10s to make the test slower
    expect(true).toBe(true);
  });
}

Notice that I also updated the test name to include a dynamic number, as Playwright does not allow you to have 2 tests with the same name. I also needed to change the apostrophe (') to a backtick (`) to be able to use template literals.

I also changed the body of the test to only have a simple check: is true still true? A truly philosophical assertion to search for glitches in the matrix. I did so, as I want to be on good terms with the Playwright team and I don’t want to be responsible for a DoS attack on their site. I also added a wait of 10s (10 * 1000ms) to make tests a little slower.

This now results in 150 tests run in 3 browsers, i.e. 450 test runs of 10s each, which would take 1 hour and 15 minutes to execute if run without any parallelization. Now we’re talking! We’ll make it faster on Jenkins.
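The arithmetic behind that figure can be sketched in a few lines of TypeScript. This is only a back-of-the-envelope model: it assumes every test takes exactly the same time and ignores browser startup and other overhead. The function names and the parallelSlots parameter (how many tests run at the same moment) are made up for this illustration.

```typescript
// Rough wall-clock estimate for a suite where every test takes the same time.
// `parallelSlots` is a hypothetical knob: how many tests run at the same moment.
function estimateSeconds(
  tests: number,
  browsers: number,
  secondsPerTest: number,
  parallelSlots: number = 1,
): number {
  const totalRuns = tests * browsers;
  // Tests run in waves of `parallelSlots`; the last wave may be partially filled.
  const waves = Math.ceil(totalRuns / parallelSlots);
  return waves * secondsPerTest;
}

function formatDuration(totalSeconds: number): string {
  const hours = Math.floor(totalSeconds / 3600);
  const minutes = Math.floor((totalSeconds % 3600) / 60);
  const seconds = totalSeconds % 60;
  return `${hours}h ${minutes}m ${seconds}s`;
}

console.log(formatDuration(estimateSeconds(150, 3, 10)));     // 1h 15m 0s
console.log(formatDuration(estimateSeconds(150, 3, 10, 10))); // 0h 7m 30s
```

With one slot, the 450 runs take 4500 seconds; ten tests at a time would, under the same idealized assumptions, bring that down to about seven and a half minutes.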

Part 2: The Jenkins

Now we want to run those tests on Jenkins. We want to create a pipeline as a script available in a git repository. In other words we want to create a Jenkinsfile that will tell Jenkins what to execute, and how. We have 2 syntax conventions to choose from: Declarative versus Scripted Pipeline syntax. In this example, we will create a Declarative Pipeline. The basic syntax looks as follows:

pipeline {
    agent any
    stages {
        stage('Stage 1') {
            steps {
                echo 'Hello world!'
            }
        }
    }
}

In our case, we want to replace echo with running the tests using npx playwright test, and we want to run them in a Docker image that has all the required libraries and browsers already installed. This is why I added the agent {} block in line 5 of the Jenkinsfile below. Why not in line 2? Line 2 defines the agent that orchestrates the whole pipeline with multiple stages and steps. We do not need our Docker image for this task; we need it only when running tests.

Why am I using an image with a specific version, v1.17.1, rather than mcr.microsoft.com/playwright:focal as in the example on the playwright.dev site? If Playwright released a new version, the focal tag would point to it, and my build could start failing because the browsers in the image would no longer match the Playwright version installed in the project. To make sure your builds do not fail because of changes in external dependencies, it’s recommended to point to a specific version.

pipeline {
    agent any
    stages {
        stage('Many tests') {
            agent {
                docker {
                    image 'mcr.microsoft.com/playwright:v1.17.1'
                }
            }
            steps {
                sh 'npx playwright test'
            }
        }
    }
}

Part 3: The speed-up

Playwright offers us two ways of scaling tests: horizontally (more computers) and vertically (more tests on 1 computer). Vertical scaling is enabled by default: tests run in parallel on one machine using separate workers. By default, the number of workers is half the number of CPU cores. You can change that in your playwright.config.ts file, or on the command line with the --workers 5 parameter. In my example, the default settings resulted in 2 workers on Jenkins.
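For the config-file route, a minimal sketch could look like the fragment below. It shows only the workers option; the value 5 is arbitrary and simply mirrors the command-line example. In the file generated by npm init playwright you would add the workers key to the existing config object rather than replace the whole file.

```typescript
// playwright.config.ts (fragment, only the option discussed here)
const config = {
  // Fixed number of parallel worker processes on one machine.
  // 5 is an arbitrary illustrative value.
  workers: 5,
};

export default config;
```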

To scale horizontally, we can leverage a built-in Playwright feature called sharding. It is enabled from the command line by passing the --shard=1/3 parameter, where 1 is the index of the machine running the tests and 3 is the total number of machines. To use it, we could update our Jenkinsfile to run, for example, 2 shards on 2 machines, with the stages running in parallel (see parallel { in line 5):

pipeline {
    agent any
    stages {
        stage('Many tests') {
            parallel {
                stage('Shard #1') {
                    agent {
                        docker {
                            image 'mcr.microsoft.com/playwright:v1.17.1'
                        }
                    }
                    steps {
                        sh 'npx playwright test --shard=1/2'
                    }
                }
                stage('Shard #2') {
                    agent {
                        docker {
                            image 'mcr.microsoft.com/playwright:v1.17.1'
                        }
                    }
                    steps {
                        sh 'npx playwright test --shard=2/2'
                    }
                }
            }
        }
    }
}

A nice step forward, which results in the following pipeline being built on Jenkins (screenshot from the Blue Ocean Jenkins UI):

Our first Parallel pipeline with 2 shards
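The two sh lines in the pipeline above differ only in the shard index, and that pattern holds for any shard count. As a small illustration (the helper name shardFlags is hypothetical, not a Playwright API), the flags can be generated like this:

```typescript
// Build the --shard flag for every shard, given the total number of shards.
// Shard indices are 1-based, as in --shard=1/3.
function shardFlags(totalShards: number): string[] {
  if (!Number.isInteger(totalShards) || totalShards < 1) {
    throw new RangeError(`totalShards must be a positive integer, got ${totalShards}`);
  }
  return Array.from(
    { length: totalShards },
    (_, i) => `--shard=${i + 1}/${totalShards}`,
  );
}

for (const flag of shardFlags(2)) {
  console.log(`npx playwright test ${flag}`);
}
// npx playwright test --shard=1/2
// npx playwright test --shard=2/2
```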

Ok, so now let’s try to have 5 parallel steps, not 2. This is the moment you realize that the current setup is not perfect, as nobody wants to copy & paste 10 lines of code for every new machine they want to use. Let’s try to generate new parallel stages dynamically. You could achieve that with the following Groovy script:

def doDynamicParallelTestSteps() {
    def tests = [:]
    int totalShards = 5
    for (int i = 0; i < totalShards; i++) {
        def shardNum = "${i+1}"
        tests["${shardNum}"] = {
            node() {
                stage("Shard #${shardNum}") {
                    docker.image('mcr.microsoft.com/playwright:v1.17.1').inside {
                        git branch: 'main',
                            credentialsId: '<credentialsID>',
                            url: '<repoURL>'
                        sh "npx playwright test --shard=${shardNum}/${totalShards}"
                    }
                }
            }
        }
    }
    parallel tests
}

What is happening here? We started mixing Declarative and Scripted Pipeline syntax. In the above example, we define a function that creates a separate pipeline stage for each of the requested shards (totalShards = 5 in line 3). Unfortunately, we also need to clone the git repository again on each node. This wasn’t required in a purely declarative pipeline, as Jenkins handles this context automatically. Once we start doing manual magic, the automatic magic disappears. We clone the repository using the git step, passing the branch (here called main), the link to your repository and, if the repository is private, the credentialsId as stored in your Jenkins instance.

In line 13 we run tests for a given shard, in line 19 we define our newly created stages to run in parallel. Now we need to merge it with our pipeline:

pipeline {
    agent any
    stages {
        stage('Many tests') {
            steps {
                script {
                    doDynamicParallelTestSteps()
                }
            }
        }
    }
}
def doDynamicParallelTestSteps() {
...

We needed to add the function to the file and call it in a script {} block. Now all is good and ready to go… or is it? What if we want even more shards? What if we want to change the number of shards dynamically too? Right now 5 shards are hard-coded; let’s get rid of that.

pipeline {
    agent any
    parameters {
        string(name: 'SHARDS', defaultValue: '2', description: 'How many shards should we use?')
    }
    stages {
        stage('Many tests') {
            steps {
                script {
                    doDynamicParallelTestSteps()
                }
            }
        }
    }
}

Here we added a parameter to our job (lines 3–5), which allows us to choose how many shards we want to use. You also need to update the doDynamicParallelTestSteps() function to use the value. Simply change line 3 of the function to:

int totalShards = Integer.parseInt(params.SHARDS)

Now after clicking the “Run” button you will be prompted to provide the number of shards to use.

Triggering a parametrized build in Jenkins Blue Ocean
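Keep in mind that SHARDS arrives as a string, and Integer.parseInt throws on anything that is not a whole number, so a typo in that field fails the build. The validation idea itself is language-independent; here is a sketch of it in TypeScript. The function name parseShardCount and the error messages are invented for this illustration and are not part of Jenkins or Playwright.

```typescript
// Parse a job parameter such as SHARDS into a positive integer.
function parseShardCount(raw: string): number {
  const trimmed = raw.trim();
  // Accept only plain decimal digits: no signs, decimals or units.
  if (!/^\d+$/.test(trimmed)) {
    throw new Error(`SHARDS must be a whole number, got "${raw}"`);
  }
  const count = Number.parseInt(trimmed, 10);
  if (count < 1) {
    throw new Error("SHARDS must be at least 1");
  }
  return count;
}

console.log(parseShardCount("7")); // 7
```

Failing early with a readable message is usually friendlier than letting the job crash somewhere in the middle of the shard loop.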

Well done, we’re almost done.

Part 4: The report

I know you might be a console geek who hates to touch any GUI, but not everyone is. Sometimes people prefer to review nice HTML reports instead of long console logs. Playwright has you covered with a built-in HTML reporter. You just need to make it usable on Jenkins.

Currently, it’s impossible to have 1 aggregated HTML report when using sharding (unless you use a 3rd party reporting tool like Allure), but it’s planned for Playwright v1.19. In the meantime, we will create a separate report file for each shard, and we will try to make it as convenient as possible.

Using the HTML Publisher plugin, you can add the following code to your doDynamicParallelTestSteps() function, which creates a separate HTML report for each shard:

publishHTML([
    allowMissing: false,
    alwaysLinkToLastBuild: true,
    keepAll: true,
    reportDir: 'playwright-report',
    reportFiles: "index${shardNum}.html",
    reportName: "shard ${shardNum}",
    reportTitles: "title ${shardNum}"
])

Reports are now visible in the classic Jenkins UI, on the left-hand side.

Where to find newly published HTML reports

As you might have noticed, to access the report from the next shard you need to go back to Jenkins and click on the link to that shard. We could make the report slightly more usable, with all shards in separate tabs at the top, if we leveraged stash/unstash together with a new stage that aggregates the reports. The end result will look like this:

pipeline {
    agent any
    parameters {
        string(name: 'SHARDS', defaultValue: '7', description: 'How many shards should we use? (Enter a number; the job will fail on non-numeric input.)')
    }
    environment {
        REPORT_FILES = "index1.html"
        REPORT_TITLES = "Shard 1"
    }
    stages {
        stage('Many tests') {
            steps {
                script {
                    generateReportFiles()
                    generateReportTitles()
                    doDynamicParallelTestSteps()
                }
            }
        }
        stage('Make report') {
            steps {
                script {
                    doUnstashShards()
                }
                publishHTML([
                    allowMissing: false,
                    alwaysLinkToLastBuild: true,
                    keepAll: true,
                    reportDir: 'playwright-report',
                    reportFiles: REPORT_FILES,
                    reportName: "aggregated",
                    reportTitles: REPORT_TITLES
                ])
            }
        }
    }
}
// Appends index2.html..indexN.html to the initial "index1.html"
def generateReportFiles() {
    int totalShards = Integer.parseInt(params.SHARDS)
    for (int i = 1; i < totalShards; i++) {
        int shardNum = i + 1
        REPORT_FILES = REPORT_FILES + ', index' + shardNum + '.html'
    }
}
// Appends "Shard 2".."Shard N" to the initial "Shard 1"
def generateReportTitles() {
    int totalShards = Integer.parseInt(params.SHARDS)
    for (int i = 1; i < totalShards; i++) {
        int shardNum = i + 1
        REPORT_TITLES = REPORT_TITLES + ', Shard ' + shardNum
    }
}
def doDynamicParallelTestSteps() {
    def tests = [:]
    int totalShards = Integer.parseInt(params.SHARDS)
    for (int i = 0; i < totalShards; i++) {
        def shardNum = "${i+1}"
        tests["${shardNum}"] = {
            node('qa-executors') {
                stage("Shard #${shardNum}") {
                    docker.image('mcr.microsoft.com/playwright:v1.17.1').inside {
                        git branch: 'main',
                            credentialsId: '<credentialsID>',
                            url: '<repoURL>'
                        // Keep going after test failures so the report is still stashed
                        catchError() {
                            sh "npx playwright test --shard=${shardNum}/${totalShards}"
                        }
                        sh "mv playwright-report/index.html playwright-report/index${shardNum}.html"
                        stash includes: "playwright-report/index${shardNum}.html", name: "shard${shardNum}"
                    }
                }
            }
        }
    }
    parallel tests
}
// Restores every shard's report into the current workspace
def doUnstashShards() {
    int totalShards = Integer.parseInt(params.SHARDS)
    for (int i = 0; i < totalShards; i++) {
        unstash "shard${i+1}"
    }
}

Here we did the following:

  1. Once Playwright finishes the tests, we rename the report to index${shardNum}.html, e.g. index1.html.
  2. We stash (i.e. save for later) our HTML reports.
  3. We unstash the files in a new stage called “Make report”.
  4. In the “Make report” stage we publish the HTML report with 2 variables passed to it: REPORT_FILES and REPORT_TITLES.
  5. REPORT_FILES defines which files will be used in the aggregated report. The file names are generated in the generateReportFiles() function, where we build a string with all file names separated by commas, e.g. "index1.html, index2.html, index3.html" in the case of 3 shards.
  6. REPORT_TITLES defines which titles will be used in the aggregated report. We build the titles in the generateReportTitles() function, where we also create a string with all titles separated by commas, e.g. "Shard 1, Shard 2, Shard 3" in the case of 3 shards.
  7. Lastly, we wrapped the test execution in catchError() {..}. This allows the report to be generated even when tests fail. Without it, Jenkins would stop executing further steps and stages (in our case: creation of the HTML reports) once tests fail. With catchError in place, our job will still be failed/red, but a report will also be available.
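To make the string building from points 5 and 6 concrete, here is the same logic sketched in TypeScript. It is an illustration rather than a drop-in replacement for the Groovy helpers, and the function names buildReportFiles and buildReportTitles are made up for this example.

```typescript
// Build the comma-separated values passed to reportFiles and reportTitles.
function buildReportFiles(totalShards: number): string {
  const names: string[] = [];
  for (let shard = 1; shard <= totalShards; shard++) {
    names.push(`index${shard}.html`);
  }
  return names.join(", ");
}

function buildReportTitles(totalShards: number): string {
  const titles: string[] = [];
  for (let shard = 1; shard <= totalShards; shard++) {
    titles.push(`Shard ${shard}`);
  }
  return titles.join(", ");
}

console.log(buildReportFiles(3));  // index1.html, index2.html, index3.html
console.log(buildReportTitles(3)); // Shard 1, Shard 2, Shard 3
```

The Jenkinsfile version starts from the first entry and appends the rest in a loop, which produces the same strings.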

The end result looks like this:

Switching between HTML report tabs in aggregated report

As you can see, it’s now easier to switch between shard reports: each one is a tab in a single aggregated report.

Epilogue

Setting up dynamic sharding and HTML reports for Playwright on Jenkins is not so hard. Once it is in place, we can leverage as many executors as are available to make tests as fast as they can be. With enough resources you could also discover server-side performance issues on your test application server :)

Do you think it can be done in a better way? Leave me a note in the comments below!

Bruno Mańczak
Fandom Engineering

Psychologist and cognitive scientist by education, tester by occupation, dad and husband by love.