Mastering Puppeteer

Crunch Tech
10 min read · Nov 23, 2020


At Crunch, we like getting features into our users’ hands as soon as they’re ready. The purpose of testing is to let us know if we’re about to break something while doing this!

Puppeteer is part of our E2E testing process.

What is E2E testing? 🤔

End To End (E2E) testing means firing up a “copy” of a system and running automated actions on it to ensure that everything works together. These actions simulate what a user might do with the software, and are normally broken down into journeys:

  1. Can the user log in?
  2. Can they navigate to the invoices page?
  3. Can they access the Help Centre articles?

Along with unit and integration tests, E2E testing is the final “troll under the bridge” that our code needs to give the password to in order to go in front of our users.

What about Puppeteer? 🖐 🧦

Many E2E testing systems exist for browser-based software like ours, and Puppeteer is what we use at Crunch.

Puppeteer helps us compose E2E journeys — the buttons to tap and the fields to type into in order to do something that a user might.

So we’re not testing security headers here, but point-and-click and hunt-and-peck. We’re simulating a real user, or at least, trying to!

Simulating a user 👩‍💼

Running a Puppeteer test is a satisfying thing to do. Once a journey has been written and the “run debugger” button is hit, the ghost of a fevered speed-runner possesses your machine for a few short seconds while a journey is clicked, swiped, and tapped to completion.

But hang on, this doesn’t really sound like a real user, does it? Real users aren’t all speed-runners, tapping away tens of times per second. Real users tap, wait for something to happen, and then take action on what their screen shows them, or screen reader tells them.

Mash-o-tron 4000 🤖

If we really tested our software at the “speed-of-user” much of the benefit of automation goes out of the window. We want to deploy new features quickly, and we also want to make sure that fixes to any newly discovered bugs go out quickly, too.

So there’s a balance to be struck between running tests as fast as we can, and doing so in a way that’s as close to the behaviour of a “real user” as possible.

Additionally, there are timing issues to consider:

Since a script can run through a series of actions faster than a user can, the browser may not have finished processing a step before more come along, sending instructions the page isn’t ready for.

Picture a “button masher”: a script that fires off actions as fast as it can, regardless of whether the page has kept up. (That may actually be a useful thing to simulate, but we’ll creep around the edges of that for now!)

Slowly slowly… 🐵

A first stab at solving this might be to add a fixed delay between instructions, simulating a user patiently waiting for something to happen.

Using Puppeteer’s API, this can look a bit like this:

Using timeouts in this way is far from ideal: different machines accomplish tasks at different speeds, while a timeout is a fixed period of “real” time.

Whether your development machines are faster than your CI/CD pipelines or the other way around, it’s clear that this method isn’t easy to maintain.

Development machines can compete for resources in an ad hoc way. CI/CD nodes with slightly different performance metrics can be chosen between runs of the testing pipeline. All of this affects the time needed for a task to complete, so such tests should be as “atemporal” as possible.

When performing an action like a click, we need to wait for its expected side-effects before continuing, hopefully in the form of an event that can be listened for.

Elementary, my dear 🕵️‍♂️

What works better is to wait for an element to be in the document before we do something with it.

Our first example can be changed to:

Since we know which button needs to be clicked next, we can just wait for its associated CSS selector to appear in the document.

This means we can arrange our tests in a more user-like, or behavioural way: see button, click button. Just like a user:

Behaviour vs. Implementation 🤹‍♀️

If we could draw a line from code to user, the closer we get to the user the more behavioural our tests ought to be. At the same time, it would be great if our tests failed in ways that leave “breadcrumbs” for the developers who eventually need to revisit them.

If page.waitForSelector() fails, why? Was there a full page navigation in between? A network request? An in-place re-render? Maybe we ought to test these mechanisms?

There’s a trap here. These are implementation details, and the more we write tests to check that they work, the harder we make it to maintain the tests while adding new features or refactoring the code. We should try to test against behaviour as much as possible.

In an app with a lot of moving parts this ideal can be difficult to achieve. Third-party and legacy code can be opinionated in ways that differ from our current best practices. It’s not uncommon, for example, for click-handlers to be inserted after an element has been added to the page by lazy-loading techniques, or by libraries that simply do things another way.

Sometimes we have to do a bit more than waiting for text to appear.

Generalising implementation 🔨

We developed some utility functions that offer a good balance between the considerations we’ve outlined:

  1. Speed of blasting through a test
  2. Trying to be as behavioural as possible
  3. Testing implementation in a broad, but forensically useful way

These methods are:


  • A network-idle helper. Puppeteer already offers functionality very similar to this in the form of the waitUntil: 'networkidle0' option to page.waitForNavigation() and page.goto(). Ours is a re-implementation that watches the number of active XHR requests without also needing to wait for a full page change, a common use-case for Single Page Apps.

  • waitForDom.toIdle(). Complex pages can take a few runs of the event loop to finish rendering, and this utility helps us determine when things have “gone quiet” without waiting for a specific element to show up. We use MutationObserver for this, and step through its implementation below.

  • A focus-settling helper. We’ve been tripped up on occasion by third-party components messing about with focus/blur (or “cursor stealing” as we call it) after updating an input element, which gives the impression that page.type() is misbehaving. document.activeElement keeps track of the current focus, and this helper lets us know when it has stopped being updated for the specified time interval.
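As an illustration of the first of these, here is a minimal browser-side sketch of the XHR-counting idea. The names (trackXhr, waitForNetwork) are hypothetical, and the real helper would also debounce and time out:

```javascript
let activeRequests = 0;

// Patch XMLHttpRequest so we can count in-flight requests:
const trackXhr = () => {
  const send = XMLHttpRequest.prototype.send;
  XMLHttpRequest.prototype.send = function (...args) {
    activeRequests += 1;
    // 'loadend' fires on success, error, and abort alike:
    this.addEventListener('loadend', () => {
      activeRequests -= 1;
    });
    return send.apply(this, args);
  };
};

const waitForNetwork = {
  toIdle: ({ pollMs = 50 } = {}) =>
    new Promise((resolve) => {
      const poll = () =>
        activeRequests === 0 ? resolve() : setTimeout(poll, pollMs);
      poll();
    }),
};
```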

Some code! 👩‍💻

Let’s dissect one of these helpers…

Here’s waitForDom.toIdle():
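Below is a sketch assembled from the walkthrough that follows; the helper names match the text, the defaults (half a second of quiet, a five-second bailout) come from the description, and everything else is an assumption:

```javascript
// A reconstruction of waitForDom.toIdle(), pieced together from the
// prose in this post; treat names and defaults as approximate.
const seconds = (n) => n * 1000;

const makePromise = () => {
  let resolve;
  let reject;
  const promise = new Promise((res, rej) => {
    resolve = res;
    reject = rej;
  });
  return { promise, resolve, reject };
};

const makeDebouncer = (fn, delayMs) => {
  let timer;
  const debounced = () => {
    clearTimeout(timer);
    timer = setTimeout(fn, delayMs);
  };
  const cancel = () => clearTimeout(timer);
  return [debounced, cancel];
};

const runAfterMutation = (callback) => {
  const observer = new MutationObserver(callback);
  observer.observe(document.body, {
    attributes: true,
    childList: true,
    subtree: true,
  });
  return () => observer.disconnect();
};

const waitForDom = {
  toIdle: ({ withinMs = 500 } = {}) => {
    const { promise, resolve, reject } = makePromise();

    // Every DOM mutation pushes resolve back by another withinMs...
    const [resolveSoon, cancelResolve] = makeDebouncer(resolve, withinMs);
    // ...but after five seconds of continuous activity we give up:
    const [rejectLater, cancelReject] = makeDebouncer(reject, seconds(5));

    const stopObserving = runAfterMutation(resolveSoon);
    resolveSoon();
    rejectLater();

    return promise.finally(() => {
      stopObserving();
      cancelResolve();
      cancelReject();
    });
  },
};
```

Calling waitForDom.toIdle() resolves once the DOM has been quiet for withinMs, or rejects if it never settles within five seconds.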

Essentially, we’re racing resolve against reject in the returned Promise.

After five seconds, reject will fire. Any DOM mutation will delay the call to resolve by a half-second.

We have a few helpers in the code we posted above, with one for the debouncing functionality that “pushes back” the call to resolve if needed:

  1. seconds()
  2. makePromise()
  3. makeDebouncer()
  4. runAfterMutation()
  5. promise.finally(…)

Let’s add these back in as we step through waitForDom.toIdle() piece by piece.

1: seconds()

Starting small, instead of individual function arguments we’re using the props/destructuring pattern to make the call more readable, for example:

  • waitForDom.toIdle({ withinMs: 100 })

We also have a function that makes “5000” more obviously read as “5 seconds”. It’s not much, but it makes our lives easier while debugging. :)
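That helper is tiny; a sketch:

```javascript
const seconds = (n) => n * 1000;

// seconds(5) reads as "5 seconds" where 5000 would need a double-take:
// makeDebouncer(reject, seconds(5));
```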

2: makePromise()
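A sketch of how makePromise() might look:

```javascript
// Return an unsettled promise together with its resolve/reject methods:
const makePromise = () => {
  let resolve;
  let reject;
  const promise = new Promise((res, rej) => {
    resolve = res;
    reject = rej;
  });
  return { promise, resolve, reject };
};
```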

To help guard against excessive “nesting” in our Promise constructors, the above helper returns an unsettled promise along with its resolve/reject methods.

3: makeDebouncer()
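A sketch, returning the debounced function along with the cleanup-handler mentioned later in the walkthrough:

```javascript
// Postpone fn by delayMs; each call restarts the countdown. The second
// item in the returned pair cancels any pending call.
const makeDebouncer = (fn, delayMs) => {
  let timer;
  const debounced = () => {
    clearTimeout(timer);
    timer = setTimeout(fn, delayMs);
  };
  const cancel = () => clearTimeout(timer);
  return [debounced, cancel];
};
```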

This implements the debouncing we spoke about previously. It lets us postpone the execution of a function for a period of time, with each call to the debouncer starting the delay over until the method is finally run.

We use another to handle the “bailout” condition:
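A self-contained sketch of the bailout debouncer; seconds() and makeDebouncer() are repeated here so the fragment runs on its own:

```javascript
const seconds = (n) => n * 1000;

const makeDebouncer = (fn, delayMs) => {
  let timer;
  const debounced = () => {
    clearTimeout(timer);
    timer = setTimeout(fn, delayMs);
  };
  return [debounced, () => clearTimeout(timer)];
};

const reject = () => { /* settle the promise as failed */ };

// Only ever called once, so no real debouncing happens here:
const [rejectLater, cancelReject] = makeDebouncer(reject, seconds(5));
rejectLater();
cancelReject(); // in the real helper this runs later, in .finally()
```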

The purpose of reject is to set an upper-limit on how long we’re willing to wait for the DOM to be busy. Five seconds is a long time to watch a browser rendering, so this could be much shorter. Long durations like this usually imply the need to “waitFor…” something else first, like a network round-trip.

We’re not doing any actual debouncing for reject: rejectLater() will be called only once, unlike resolveSoon(). We just use makeDebouncer here for convenience, since it rather usefully returns a cleanup-handler that we’ll run later on.

Of course, rejecting with an error-reason would be best-practice, but we left this out previously for the sake of brevity. Your linter may complain without it. :)

4: runAfterMutation()
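A sketch of the wrapper (this is browser-side code):

```javascript
// Run callback on any DOM activity; hand back a way to stop observing.
const runAfterMutation = (callback) => {
  const observer = new MutationObserver(callback);
  observer.observe(document.body, {
    attributes: true,
    childList: true,
    subtree: true,
  });
  return () => observer.disconnect();
};
```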

Here, we’ve wrapped MutationObserver in a slightly more convenient API. We can pass a callback to runAfterMutation() that gets run every time there is any kind of DOM activity, and we receive a function that removes the mutation observer when we’re done with it.

5. promise.finally(…)

Just before returning the promise we built, we start both debouncers, resolveSoon() and rejectLater().

Lastly, we put all the clean-up methods in a .finally() for the promise we’re returning:
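A sketch of the tail end of the helper, with minimal stand-ins for the pieces described earlier so it runs on its own:

```javascript
const makeDebouncer = (fn, delayMs) => {
  let timer;
  const debounced = () => {
    clearTimeout(timer);
    timer = setTimeout(fn, delayMs);
  };
  return [debounced, () => clearTimeout(timer)];
};

let settle;
const promise = new Promise((resolve) => { settle = resolve; });

const [resolveSoon, cancelResolve] = makeDebouncer(settle, 500);
const [rejectLater, cancelReject] = makeDebouncer(() => {}, 5000);
const stopObserving = () => {}; // stands in for the MutationObserver cleanup

resolveSoon();
rejectLater();

// Whether we resolved or rejected, everything gets tidied up:
const result = promise.finally(() => {
  stopObserving();
  cancelResolve();
  cancelReject();
});
```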

This isn’t strictly necessary for the promise to work properly, especially in browser-side code on a short-lived page. But unsettled promises and uncleared setTimeout() calls prevent Node from halting, which can be annoying if you’re doing the same kind of thing in server-side code.

It’s also just good practice to clean up after yourself. 🧹

Adding it to a test 🖋

Going back to where we left our last example of a Puppeteer API call, we would like to update it to read like this:

Puppeteer itself doesn’t provide an API for detecting DOM mutation, hence the need to write our own browser-side function above.

But to use it we need to inject the code we wrote into the page…

Really adding it to a test 😬

Puppeteer provides a way to do this using page.evaluate():
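A sketch of the injection; the toIdle body is abbreviated, standing in for the MutationObserver-based implementation walked through above:

```javascript
const injectHelpers = (page) =>
  page.evaluate(() => {
    // This callback runs in the browser, so we can define the helper
    // on window for later page.evaluate() calls to reach:
    window.waitForDom = {
      toIdle: (/* { withinMs } = {} */) => {
        /* ...MutationObserver-based idle detection goes here... */
      },
    };
  });
```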

Once injected, we can call this function later using page.evaluate() again:

Since we have a few more of these “waitFor…” helpers, and we’d like a way to attach them all at once, our production code looks more like this:
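One plausible shape for this, with hypothetical names: bundle the helpers into a single function, then use page.evaluateOnNewDocument() so they are re-attached on every navigation.

```javascript
// Stand-in for the real bundle of "waitFor..." helpers:
const helpersSource = () => {
  window.waitForDom = { toIdle: async () => { /* ... */ } };
  // window.waitForNetwork = ..., and so on.
};

// Attach everything once; Puppeteer re-injects it after each navigation:
const attachWaitForHelpers = (page) =>
  page.evaluateOnNewDocument(helpersSource);
```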

Back in the test we can insert the helper like so:

Wrapping up

To summarise, we broke down a helper that answers the question “Can we tell if the browser is busy rendering something?” without being too specific about how it does so.

We also set a reasonable timeout for how long a rendering-operation should take. Notice that such a “timeout” is distinct from what we were talking about previously with page.waitForTimeout(1000):

  1. Waiting an arbitrary length of time
  2. Waiting for an event, and bailing-out if it doesn’t happen within a reasonable timeframe

We hope it’s clear that the timeout in waitForDom.toIdle() satisfies the second point more than the first, and this is the ideal we try to achieve with other “eventful” helper methods in our codebase.

Having such methods means we can specify “failing timeouts” that are intuitive: a render operation really shouldn’t take more than a few hundred milliseconds, and a network operation really shouldn’t take more than a few seconds.

Obviously we’re putting in a lot of effort to run our tests quickly, but if they fail, we’d like them to fail quickly too!

One more thing…

For those still reading, there are a couple more technical things to say about how Puppeteer coordinates Node and the Browser.

For example, if'.crunchyButton') — a Puppeteer command running in Node — triggers an action in the Browser that renders the next button within a single run of its Event Loop, we can reduce our test to:

No need to page.waitForSelector() or waitForDom.toIdle() at all!

Again, this would only work if the click-handler for .crunchyButton rendered the .chewyButton element within the same event-loop:
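For example, a browser-side handler along these lines (a sketch; the names are illustrative):

```javascript
const renderChewyButton = () => {
  const button = document.createElement('button');
  button.className = 'chewyButton';
  document.body.appendChild(button);
};

const handleClick = () => {
  // Synchronous: .chewyButton exists before this handler returns,
  // and before yields back to Node.
  renderChewyButton();
};
```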

Our test would fail if the target page were updated like so:
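A sketch of the breaking change, where the render is deferred with setTimeout():

```javascript
const renderChewyButton = () => {
  const button = document.createElement('button');
  button.className = 'chewyButton';
  document.body.appendChild(button);
};

const handleClick = () => {
  setTimeout(() => {
    // Deferred: this runs only after the browser has yielded back to
    // Node, so may already have fired.
    renderChewyButton();
  }, 0);
};
```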

That setTimeout() will put the call to renderChewyButton() after'.chewyButton') is called from Node in our test. 😱

The reason for this is a little complicated, but it does give a hint as to why Puppeteer leans so heavily on Promises for its API. We’re waiting for side-effects to happen, and they may not necessarily be done immediately after the action that causes them.

Imagine that Node runs'.crunchyButton') and pauses for a moment for the Browser to return, or “unblock”.

The setTimeout() in the handleClick() function will cause the Browser to yield back to Node immediately, before renderChewyButton() runs. So there’s no guarantee that the .chewyButton element will be rendered in time for'.chewyButton') to see it!

Obviously we’re all using UI frameworks in our day-to-day work rather than Vanilla JavaScript, and things can get a little more sophisticated than setTimeout() in the world of micro/macro event-loop tasks. Still, the principle applies also if your UI code is at the mercy of React’s scheduler.

The intention of waitForDom.toIdle() and the other helpers is to give just the right distance from such intricacies without ignoring them wholesale.

Thanks for reading, and good luck with your test writing!

Written by Conan Theobald. Conan is a JavaScript Developer at Crunch, and has been making websites since the Great Browser War. He also studies and teaches Aikido, and is very much looking forward to practicing it again without all the social distancing!

Special thanks goes to Conor O’Driscoll for producing the graphics that help bring the post to life.

Please take a look at our past posts. If you’re a Developer looking for a new role in the Brighton area or require some great value Accountancy services — check out our website.