How Foleon runs automated screenshot comparison tests using Puppeteer, Headless Chrome and Pixelmatch

Levi9 Serbia · levi niners articles
Jul 8, 2021

Two cents about Foleon, a super-powerful content creation platform, and how Levi9 contributed to it

Foleon’s leading content creation platform empowers business teams to create an engaging and intelligent content experience at scale. Organizations use Foleon to produce personalized content for every stage of the customer journey and give buyers the flexibility to self-educate and consume content at their own pace. It features an intuitive drag & drop editor and powerful template engine, providing sophisticated design capabilities while dramatically reducing complexity, enabling increased content velocity across the entire organization. Foleon content provides the audience insights needed to turn customer journeys into seamless experiences.

Collaboration between Foleon and Levi9 began in May 2019, with the onboarding of the first Levi9 development team. Over the past two years the partnership has grown: we currently have three development teams, one IT Ops team, and a mixed team working on the assignment, alongside Foleon's own development teams. As we at Levi9 love to underline: nobody makes an impact alone.

Levi9 is a nearshore technology service provider with around 1000 employees and 50+ customers. Like the comet Shoemaker-Levy 9, we make an impact that shapes the future. We impact businesses with technology services.

Reasons for running regression tests on Foleon content using browser automation

To provide some long-desired functionality in the Foleon editor, part of the core applications in charge of the content needed to be rewritten. With that came the need for regression testing of how all existing Foleon documents would be presented using the new setup; obviously, there should be no visual changes when they are opened in the browser.

Since we are talking about the rendered view of the HTML, not the structure of the DOM, we needed a way to create a screenshot of Foleon content rendered by the applications currently running the Production site and a screenshot of the same content rendered by our new, shiny, modern stack. If the difference in pixels between those two images is zero, we have not introduced any bugs with our new code, regardless of any changes to the HTML DOM.

What is browser automation?

Browser automation is not a new invention. Selenium was released in 2004, and for a long time it was the go-to tool in this category. It is primarily used to implement test suites for websites, but it can also be used to take screenshots or automate tasks in scenarios where an API is not provided.

What is a “headless” browser?

The landscape changed significantly in 2017, when Google announced headless Chrome, with Firefox following later that year. “Headless” in this case means that there is no browser UI, no GUI of any sort. This comes in very handy when running tests, especially in a CI environment: since nobody is looking at the visuals, the extra overhead of rendering the browser GUI is simply wasted.

This feature made it possible to write efficient tests and scripts against the very browsers used by the users.
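As a quick illustration of headless mode outside any framework, Chrome itself can take a screenshot or dump the rendered DOM straight from the command line (these flags are part of Chrome's documented headless mode):

# Take a screenshot of a page without opening any window
chrome --headless --disable-gpu --screenshot=example.png https://example.com

# Print the rendered DOM to stdout
chrome --headless --disable-gpu --dump-dom https://example.com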

What is Puppeteer?

Puppeteer is a Node library developed by Google, more precisely the Chrome DevTools team. That gives it a major advantage over similar projects because it is backed by the same company that makes the most widely used browser in the world.

It provides a high-level API to control Chrome or Chromium over the DevTools Protocol. Puppeteer runs headless by default but can be configured to run the browser in a full (non-headless) mode.

Puppeteer for Firefox is a work in progress — as of puppeteer v2.1.0 you can interact with Firefox Nightly.

Most things that you can do manually in the browser can be done using Puppeteer. It can be used to:

· Generate screenshots and PDFs of pages
· Crawl web pages
· Automate form submission, UI testing, keyboard input, etc.
· Create an up-to-date, automated testing environment. Run your tests directly in the latest version of Chrome using the latest JavaScript and browser features
· Capture a timeline trace of your site to help diagnose performance issues
· Test Chrome Extensions

You can try it out at: https://try-puppeteer.appspot.com/

By default, installing Puppeteer also downloads a compatible version of Chromium, so the browser you drive is guaranteed to work with the Puppeteer version you have installed.
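To give a feel for the API, here is a minimal sketch (not part of our tool) that launches headless Chromium, navigates to a page, and saves a full-page screenshot:

import puppeteer from 'puppeteer';

(async () => {
  // Launches the bundled Chromium; headless by default
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto('https://example.com', { waitUntil: 'networkidle0' });
  await page.screenshot({ path: 'example.png', fullPage: true });
  await browser.close();
})();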

Why we chose Puppeteer over Selenium

Selenium automates browsers. Primarily it is used for automating web applications for testing purposes, but it is certainly not limited to that. It supports numerous features that are helpful for automation testing.

It offers cross-browser support and several other features, but it is fairly resource-heavy and has a steep learning curve compared to Puppeteer. The fact that Selenium tests can be written in different languages did not matter to us at this point, since we had decided to build our testing framework exclusively in Node.js. All future projects planned after this one were supposed to be written in Node as well, so it was a natural choice for practicing a language that was new to us, especially since npm offers loads of libraries and tools waiting to be used in your project.

Selenium uses the WebDriver protocol, which requires a running server to act as an intermediary between Selenium and the browser, resulting in added complexity.

Puppeteer, in contrast, controls Chrome using the internal, non-standard DevTools protocol. It communicates with the browser over a direct line, which makes it both simpler to use and more performant.

Why we chose Puppeteer over Playwright

When the first public version of Playwright was released in January 2020, we got another option for web automation. Playwright is a Node.js library developed by Microsoft.

The top two contributors to Puppeteer now work on Playwright: the Puppeteer team moved from Google to Microsoft and became the Playwright team. As a result, the two packages are very similar in many aspects (the APIs are alike, and both bundle compatible browsers by default).

Playwright’s biggest advantage is cross-browser support. It can run Chromium, Firefox, and WebKit (the Safari browser engine). To accomplish this, however, Playwright ships with patched versions of Firefox and WebKit. The long-term maintainability of those patches is questionable, since they are not developed together with the main branches of those projects. And since cross-browser support was not one of our main requirements, we decided to go with Puppeteer.

Puppeteer-cluster and Pixelmatch

Puppeteer-cluster enables us to run a cluster of puppeteer workers. This library spawns a pool of Chromium instances via Puppeteer and helps to keep track of jobs and errors. Since our tool can be assigned to create screenshots of many pages, we wanted to run that process in parallel. Puppeteer Cluster takes care of reusing Chromium and restarting the browser in case of errors.

This library gives you the following possibilities:

· Handling of crawling errors
· Auto restarting of the browser in case of a crash
· Automatic retry if a job fails
· Different concurrency models to choose from (pages, contexts, browsers)
· Simple to use, small boilerplate
· Progress view and monitoring statistics (see below)

Pixelmatch is the smallest, simplest, and fastest JavaScript pixel-level image comparison library, created to compare screenshots in tests. It features accurate anti-aliased pixel detection and perceptual color difference metrics. Pixelmatch is around 150 lines of code, has no dependencies, and works on raw typed arrays of image data, so it is blazing fast and can be used in any environment (Node or browsers).

The tool — Testing engine

Our tool was built with two use cases in mind: to be run locally as a CLI and to be run as part of the CI flow. We decided to name it the Testing engine.

Using the Testing engine as a CLI application

When running the application in this mode, the user is guided through a series of steps and asked to provide various inputs.

To build the CLI, we used Inquirer — a collection of common interactive command-line user interfaces.
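For a rough idea of what this looks like, a prompt for the two URLs could be defined along these lines (a sketch; the question names and wording here are hypothetical, not our exact prompts):

import inquirer from 'inquirer';

// Hypothetical questions, for illustration only
const answers = await inquirer.prompt([
  {
    type: 'input',
    name: 'expectedUrl',
    message: 'URL of the expected (stable) application:',
  },
  {
    type: 'input',
    name: 'actualUrl',
    message: 'URL of the actual (modified) application:',
    default: 'http://localhost:3000',
  },
]);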

Imagine a real-world scenario where a front-end developer would run a local instance of the Foleon application with some code changes. That developer will most likely want to check how the modifications to the code affect the result rendered in the browser, when compared to the stable application deployed, for example on the Test environment.

The developer would run the Testing engine and enter the URL of the expected application (on the Test environment) and of the actual application (e.g. localhost:3000).

Using the CLI parameters, the developer would also point both applications at the same data source (the Test environment), so that only the front-end applications rendering the HTML differ.

If we task the tool with comparing one Foleon document containing 10 pages, it needs to create 10 pairs of screenshots, two screenshots per pair, which are then compared. To improve performance, this process runs in parallel using puppeteer-cluster.

This is our puppeteer-cluster configuration:

const puppeteerCore = require('puppeteer-core');

const cluster: Cluster<ScreenshotRequest> = await Cluster.launch({
  concurrency: config.get('puppeteer-cluster.concurrency-type'),
  maxConcurrency: calculateConcurrency(),
  timeout: config.get('puppeteer-cluster.timeout-ms'),
  monitor: config.get('puppeteer-cluster.monitor'),
  puppeteerOptions: {
    headless: config.has('puppeteer-cluster.headless') ? config.get('puppeteer-cluster.headless') : true,
    executablePath: config.get('docker.enabled') ? config.get('puppeteer-cluster.executable-path') : undefined,
  },
  puppeteer: config.get('docker.enabled') ? puppeteerCore : undefined,
});

cluster.on('taskerror', async (err, data) => {
  logger.error(`Error browsing %o: ${err.message}`, data.url);
  await teardown();
});

There are multiple concurrency implementations available in puppeteer-cluster. We decided to use CONCURRENCY_PAGE, the one that shares everything between jobs (cookies, localStorage, etc.), since it fits our use case best; sharing common data is a benefit for us and can only speed up the process.
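These models are exposed as constants on the Cluster class; a configured name could be mapped to them along these lines (a sketch, not our exact code):

import { Cluster } from 'puppeteer-cluster';

// Map a configured name to one of puppeteer-cluster's concurrency constants
const concurrencyByName: Record<string, number> = {
  page: Cluster.CONCURRENCY_PAGE,       // one tab per worker, everything shared
  context: Cluster.CONCURRENCY_CONTEXT, // isolated incognito contexts
  browser: Cluster.CONCURRENCY_BROWSER, // a separate browser per worker
};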

We made the headless switch configurable to be able to have an option of running the tool with the browser showing to the user, for troubleshooting purposes.

Properties executablePath and puppeteer are overridden when we run the tool through Docker; read more about this in the “Using the tool as part of the CI pipeline” section.

Property executablePath points to the external installation of the browser.

Property puppeteer is set to puppeteer-core to avoid downloading Chromium. More details on this can be found in the Puppeteer documentation.

The number of workers is calculated from the number of CPU cores available on the system running the tool, or it can be capped at a configured maximum value, depending on the configuration.

function calculateConcurrency(): number {
  if (config.has('puppeteer-cluster.workers-per-cpu')) {
    const workersPerCpu: number = config.get('puppeteer-cluster.workers-per-cpu');
    const totalWorkersCalculated = workersPerCpu * os.cpus().length;
    logger.info(`Total workers to be used = ${totalWorkersCalculated}`);
    return totalWorkersCalculated;
  }
  return config.get('puppeteer-cluster.max-concurrency');
}

Each task queued in the cluster is assigned a callback function which:

1. navigates to the actual URL
2. takes a screenshot
3. stores it in a buffer.

The process is repeated for the expected URL, after which the tab is closed. Image buffers of both screenshots are sent for comparison.

await cluster.task(async ({ page, data }) => {
  page.setDefaultNavigationTimeout(config.get('puppeteer-cluster.page-navigation-timeout-ms'));

  async function navigateAndScreenshot(url: string) {
    await page.goto(url, { waitUntil: 'networkidle0' });
    return (await page.screenshot({ fullPage: true })) as Buffer;
  }

  const actualImageBuffer = await navigateAndScreenshot(data.imageComparatorRequest.previewerActualUrl);
  const expectedImageBuffer = await navigateAndScreenshot(data.imageComparatorRequest.previewerExpectedUrl);

  await page.close();

  data.imageComparatorRequest.actualImage = actualImageBuffer;
  data.imageComparatorRequest.expectedImage = expectedImageBuffer;

  compareImages(data.imageComparatorRequest);
});

Each pair of URLs for comparison is queued to the cluster:

imageComparatorRequests.forEach((imageComparatorRequest, pairIndex) => {
  cluster.queue({
    url: `${imageComparatorRequest.previewerExpectedUrl}`,
    imageComparatorRequest,
    index: pairIndex,
  });
  logger.info(`[${pairIndex}] pair queued.`);
});

The actual comparison, the compareImages() function using pixelmatch, looks as follows:

export const compareImages = (contentComparisonRequest: ContentComparisonRequest): void => {
  try {
    const minPixelDiff: number = config.get('image-comparator.min-pixel-diff');
    const threshold: number = config.get('image-comparator.threshold');

    const actualImagePNG = PNG.sync.read(contentComparisonRequest.actualImage);
    const expectedImagePNG = PNG.sync.read(contentComparisonRequest.expectedImage);

    const { width, height } = actualImagePNG;
    const imageDiff = new PNG({ width, height });

    const numberOfDifferentPixels: number = pixelmatch(
      actualImagePNG.data,
      expectedImagePNG.data,
      imageDiff.data,
      width,
      height,
      {
        threshold: threshold,
      }
    );

    logger.info(`Difference in pixels is = ${numberOfDifferentPixels}`);

    if (numberOfDifferentPixels > minPixelDiff) {
      const currentTimestamp = new Date().getTime().toString();

      storeImage(contentComparisonRequest, '_actual', currentTimestamp, actualImagePNG);
      storeImage(contentComparisonRequest, '_expected', currentTimestamp, expectedImagePNG);
      storeImage(contentComparisonRequest, '_diff', currentTimestamp, imageDiff);
    }
  } catch (error) {
    logger.error(
      `Error occurred during image comparison for publicationId=${contentComparisonRequest.publicationId} and pageId=${contentComparisonRequest.pageId} ${error}`
    );
  }
};

pixelmatch allows you to override the matching threshold (ranging from 0 to 1) to adjust the comparison sensitivity. On top of that, we introduced a configurable minPixelDiff variable to control the allowed difference in pixels, which comes in handy in certain scenarios while developing new features.

We used pngjs to create the PNG images stored in the filesystem. The storeImage() function simply writes the PNG files to the filesystem using fs.
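A minimal sketch of what storeImage() boils down to (the folder layout and file-naming scheme here are assumptions, not our real code):

import fs from 'fs';
import path from 'path';
import { PNG } from 'pngjs';

// Sketch only; the real naming scheme and folder layout differ
const resultsFolder = './results'; // assumed location

function storeImage(
  request: ContentComparisonRequest,
  suffix: string,
  timestamp: string,
  image: PNG
): void {
  const fileName = `${request.publicationId}_${request.pageId}${suffix}_${timestamp}.png`;
  fs.mkdirSync(resultsFolder, { recursive: true });
  fs.writeFileSync(path.join(resultsFolder, fileName), PNG.sync.write(image));
}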

In the end, we open the directory containing the results using the open library, making them available for the user to inspect.

async function printOutResultsFolderNameAndOpenIt(initialFolderPathForResults: string) {
  if (fs.existsSync(initialFolderPathForResults)) {
    logger.info('Comparison results are stored in: ' + path.resolve(initialFolderPathForResults));
    await open(initialFolderPathForResults);
  } else {
    logger.info('No difference in content found, no results folder created.');
  }
}

Finally, the teardown() function shuts the whole cluster down:

async function teardown() {
  await cluster.idle();
  await cluster.close();
}

Using the tool as part of the CI pipeline

We plan to have the Testing engine integrated into our PR flow on Github and Jenkins. This is still a work in progress.

We plan to create a predefined set of Foleon documents containing pages featuring all sorts of different content that can be added to Foleon pages.

Our PR testing pipeline would initiate a specific set of actions upon PR creation:

1. Create a new environment dedicated to that PR containing a set of FE applications able to render the content
2. Spin up the Testing engine in a Docker container in Jenkins, using an external installation of Chrome
3. Run a set of screenshot comparison tests for mentioned preset of Foleon documents
4. Report any differences between the screenshots back on the PR to its creator and reviewers

We expect this automated flow will give developers more reliability in the end product they build while also making them more efficient.

Our goal is to run the Testing engine in a Docker container to maximize flexibility, making use of the environment variables to queue content to be tested.
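For illustration, starting the container could look roughly like this (the image name and environment variable names are hypothetical, purely to show the idea):

# Hypothetical invocation; image name and variables are illustrative only
docker run --rm \
  -e EXPECTED_BASE_URL="https://test.foleon.example" \
  -e ACTUAL_BASE_URL="https://pr-build.foleon.example" \
  -v "$(pwd)/results:/usr/src/app/results" \
  testing-engine:latest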

We opted for Alpine Linux as our base image since it has a minimal footprint, keeping the Docker image as light as possible.

In the builder phase we just build the application, while in the second phase we set up Chrome and copy the packages from the builder phase.

As mentioned before, Puppeteer by default downloads a compatible version of the browser, but when running in the Docker container we want to disable that behavior. The reason is that the bundled Chromium that Puppeteer installs is missing the necessary shared library dependencies.

To remedy this, the “install chromium” section uses the edge repository to get Chromium for Linux, together with the libraries required to run Chrome on Alpine.

Therefore, we set the PUPPETEER_SKIP_CHROMIUM_DOWNLOAD environment variable to true.

After running the Docker build, we get our Chromium executable: /usr/bin/chromium-browser. This is used as our Puppeteer Chrome executable path.
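To tie this back to the cluster configuration shown earlier: the Dockerfile below sets NODE_ENV=docker, which makes the node-config library load a docker override file. The relevant keys could look like this in config/docker.json (a sketch; the keys mirror the config.get() calls above, the file contents are an assumption):

{
  "docker": { "enabled": true },
  "puppeteer-cluster": {
    "executable-path": "/usr/bin/chromium-browser"
  }
}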

The complete Dockerfile can be found below:

FROM node:12-alpine AS builder

WORKDIR /usr/src/app

# add packages
RUN apk --no-cache add curl bash rsync

# install pnpm
RUN curl -sL https://unpkg.com/@pnpm/self-installer | node

ENV PUPPETEER_SKIP_CHROMIUM_DOWNLOAD="true"

COPY ./package.json /usr/src/app/package.json
COPY ./pnpm-lock.yaml /usr/src/app/pnpm-lock.yaml
COPY ./pnpm-workspace.yaml /usr/src/app/pnpm-workspace.yaml
COPY ./tsconfig.json /usr/src/app/tsconfig.json
COPY ./packages /usr/src/app/packages

RUN pnpm install --frozen-lockfile
RUN pnpm build
RUN rsync -a --exclude="src" --exclude="coverage" --exclude="node_modules" --exclude="tsconfig.json" --exclude="*.md" /usr/src/app/packages/ /usr/src/app/packages-be-copied/

FROM node:12-alpine

WORKDIR /app

# add curl and bash packages
RUN apk --no-cache add curl bash

# install pnpm
RUN curl -sL https://unpkg.com/@pnpm/self-installer | node

# install chromium
RUN apk update && apk add --no-cache nmap && \
    echo @edge http://nl.alpinelinux.org/alpine/edge/community >> /etc/apk/repositories && \
    echo @edge http://nl.alpinelinux.org/alpine/edge/main >> /etc/apk/repositories && \
    apk update && \
    apk add --no-cache \
      chromium \
      harfbuzz \
      "freetype>2.8" \
      ttf-freefont \
      nss

COPY ./chromium.conf /etc/chromium/chromium.conf

WORKDIR /usr/src/app

COPY ./package.json /usr/src/app/package.json
COPY ./pnpm-lock.yaml /usr/src/app/pnpm-lock.yaml
COPY ./pnpm-workspace.yaml /usr/src/app/pnpm-workspace.yaml
COPY --from=builder /usr/src/app/packages-be-copied/ /usr/src/app/packages/

ENV NODE_ENV=docker \
    NODE_CONFIG_DIR='/usr/src/app/packages/loader-cli/config/' \
    PUPPETEER_SKIP_CHROMIUM_DOWNLOAD="true"

RUN pnpm install --prod --frozen-lockfile --no-optional

CMD ["node", "packages/loader-cli/dist/cli/cli.service.js"]

This is the content of our chromium.conf file:

CHROMIUM_FLAGS="--no-sandbox --headless"

The explanation for the --no-sandbox flag can be found in the Chrome documentation.

Testing engine running in practice:

The gif below shows the tool at runtime. For demo purposes, we compared one Foleon page with an edited duplicate of it, to be able to spot the differences in the result.

At the end of the process, the tool prints out the location of the results folder, and opens it with the default application assigned for that action by the OS.

Images of the Actual, Expected and Comparison result of the demo look like this:

Conclusion

Keeping in mind that nobody on the team had written Node.js before and that all of us come from the world of Java, it was a very fun experience, and we learned a lot. The most interesting part was optimizing the performance of the screenshot creation flow when the tool has to compare far more than just a few pages.

To speed it up, we initially tried to parallelize the process on our own, but then we discovered puppeteer-cluster, which made things much easier. We identified this as a major practical benefit compared to Java: the npm registry contains a huge list of ready-to-use libraries, available with a simple npm install command.

Ultimately it was a win-win situation: our client Foleon was happy with what the Testing engine delivered, and we had a blast building it! It was great team building as well, as the Levi Niners gathered in this team (called Plum, how funny is that) for the first time and also worked with Node.js for the first time.

Filip Stanišić
Tech Lead @ Levi9Serbia
