Screenshot testing with React and Storybook

Visual regression testing for fun, profit, and peace of mind

Illustration by Marta Pucci

A friend of mine recently related a scary story about the lack of automated visual regression testing where he works—a huge international tech firm that you’ve heard of and probably use. He’d added a CSS class to an element to hide it from the user. Unbeknownst to him, and despite using BEM-style naming for CSS classes, this class name was already being used for an important checkbox in the user settings screen of his company’s web app. The result? Users could no longer change the setting represented by the checkbox, because it was invisible!

I asked him why that hadn’t been caught by automated tests. He explained that, although they did have end-to-end tests in place that went through the UI and tested its functionality, the tests didn’t catch the bug. Selenium was still able to check the original checkbox via its selector, so its visibility had no effect on the outcome of the test. The only way this could have been caught is via visual regression testing — or, as it is sometimes called, screenshot testing.

💅 Switching to styled-components

I recalled this story recently while working on a sizable project at Clue: converting all of our “native” CSS to use styled-components. The new helloclue.com website consists of dozens of components. An article page, for example, contains at least ten different React components, all of which have accompanying CSS files. Converting all of this CSS to styled-components means there’s an enormous risk of bugs exactly like the one my friend experienced. So before the project began, I investigated setting up screenshot testing for the site.

📕 Enter Storybook

(Note: if you’re already familiar with Storybook and @storybook/react, feel free to skip to the next section.)

We use Storybook when developing simple presentational components. This way, we can test them in every possible state, without having to reproduce the logic required to get them into that state.

For example, on our author pages (the AuthorPage component), there are four possible states:

  • Both the author and the author’s articles are loading.
  • The author is loaded, but the articles are still loading.
  • The articles are loaded, but the author is still loading.
  • Both the author and the articles are loaded.

Loading-state placeholder elements on the author page in Storybook

The loading states are reflected in the UI via placeholder elements. The problem is, how do I make sure these elements look right? On our office’s speedy internet connection, I only get a split second to check the loading state of the author page before the placeholders are replaced with real content.

That’s where Storybook comes in. With Storybook, you can render a component in the specific state you want to test it in, and keep it in that state for as long as you need¹:

// stories/AuthorPage.tsx
// Note that the code samples in this article are written in TypeScript.
// React must be in scope for the JSX below.
import * as React from "react"
import { storiesOf } from "@storybook/react"
import AuthorPage from "../components/AuthorPage/AuthorPage"

storiesOf("AuthorPage", module)
  .add("loading", () => (
    <AuthorPage
      isLoadingArticles={true}
      isLoadingAuthor={true}
    />
  ))

In the example above, I’ve created a “story” for the author page in which both the articles and the author are still loading. I can then run yarn run storybook (or npm run storybook), which starts a Storybook server where I can view the AuthorPage component in its loading state. (Of course, I created additional stories for each of the other states, as well.)
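The remaining stories follow the same pattern. As a minimal sketch, the four combinations of the two loading flags can be enumerated as plain data and then registered in a loop. The prop names come from the story above; the StoryCase type and the authorPageStoryCases and describeState helpers are hypothetical names used only for illustration, not part of Storybook.

```typescript
// Hypothetical helper: enumerates every combination of the AuthorPage loading flags.
interface AuthorPageLoadingProps {
  isLoadingArticles: boolean
  isLoadingAuthor: boolean
}

interface StoryCase {
  name: string
  props: AuthorPageLoadingProps
}

// Turns a pair of flags into a human-readable story name.
function describeState(isLoadingAuthor: boolean, isLoadingArticles: boolean): string {
  if (isLoadingAuthor && isLoadingArticles) return "loading"
  if (isLoadingArticles) return "author loaded, articles loading"
  if (isLoadingAuthor) return "articles loaded, author loading"
  return "loaded"
}

// Returns one story case per possible state of the author page (four in total).
function authorPageStoryCases(): StoryCase[] {
  const cases: StoryCase[] = []
  for (const isLoadingAuthor of [true, false]) {
    for (const isLoadingArticles of [true, false]) {
      cases.push({
        name: describeState(isLoadingAuthor, isLoadingArticles),
        props: { isLoadingArticles, isLoadingAuthor },
      })
    }
  }
  return cases
}
```

Each case could then be registered by looping over authorPageStoryCases() and calling .add(name, () => <AuthorPage {...props} />) on the object returned by storiesOf("AuthorPage", module).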

While viewing the component, I can use Chrome DevTools to inspect the elements on the page, debug UI issues, and generally have a stable environment in which to develop the loading state of the component.

📸 But what about screenshots?

Since each component can be isolated and locked into a specific state in Storybook, and I wanted to screenshot every possible state of many of the site’s components, our Storybook setup seemed like a great place to do screenshot testing.

A bit of searching turned up an excellent library called storybook-chrome-screenshot, a Storybook addon. Using Storybook Chrome Screenshot, I can add a decorator to my Storybook stories:

// stories/AuthorPage.tsx
// React must be in scope for the JSX below.
import * as React from "react"
import { initScreenshot, withScreenshot } from "storybook-chrome-screenshot/lib"
import { addDecorator, storiesOf } from "@storybook/react"
import AuthorPage from "../components/AuthorPage/AuthorPage"

// Register the addon once, before any stories are added.
addDecorator(initScreenshot())

storiesOf("AuthorPage", module)
  .add("loading", withScreenshot()(() => (
    <AuthorPage
      isLoadingArticles={true}
      isLoadingAuthor={true}
    />
  )))

When I run storybook-chrome-screenshot -p 9001 -c .storybook, a Storybook server is once again spun up. But this time, the Storybook Chrome Screenshot addon goes through each story that I’ve added the withScreenshot decorator to and takes a screenshot of it. (The decorator can also be added to the entire Storybook setup to automatically screenshot every story.²)

I can then add the storybook-chrome-screenshot command to the scripts key in my package.json, and automatically run it as a part of my CI process.

🔬 Comparing screenshots: before and after

OK, so I have screenshots of every component in the site in every possible state. But the whole point of taking screenshots is to compare them against the screenshots from master, to make sure nothing unintentionally changed.

That’s where reg-suit comes in. It’s an npm package that does the actual work of pixel-by-pixel visual regression testing, and then generates an HTML report of the results.
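To make “pixel-by-pixel” concrete, here is a minimal sketch of the idea. It is not reg-suit’s actual implementation. It assumes both screenshots have already been decoded into RGBA byte arrays; the RgbaImage type and the compareScreenshots function are hypothetical names, and thresholdRate is modeled here simply as the fraction of pixels that is allowed to differ.

```typescript
// Hypothetical in-memory representation of a decoded screenshot.
interface RgbaImage {
  width: number
  height: number
  data: Uint8Array // 4 bytes per pixel: red, green, blue, alpha
}

interface ComparisonResult {
  diffPixels: number // number of pixels that differ in at least one channel
  diffRate: number   // diffPixels divided by the total pixel count
  passed: boolean    // true when diffRate does not exceed thresholdRate
}

function compareScreenshots(
  expected: RgbaImage,
  actual: RgbaImage,
  thresholdRate = 0,
): ComparisonResult {
  const totalPixels = expected.width * expected.height
  // Assumption: screenshots of different sizes are treated as entirely different.
  if (expected.width !== actual.width || expected.height !== actual.height) {
    return { diffPixels: totalPixels, diffRate: 1, passed: false }
  }
  let diffPixels = 0
  for (let pixel = 0; pixel < totalPixels; pixel++) {
    const offset = pixel * 4
    for (let channel = 0; channel < 4; channel++) {
      if (expected.data[offset + channel] !== actual.data[offset + channel]) {
        diffPixels++
        break // count each differing pixel only once
      }
    }
  }
  const diffRate = totalPixels === 0 ? 0 : diffPixels / totalPixels
  return { diffPixels, diffRate, passed: diffRate <= thresholdRate }
}
```

With a thresholdRate of 0, a single changed pixel is enough to flag a screenshot, which is the strictest setting this sketch supports.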

First, I created regconfig.json in our app’s root directory (with a few non-JSON-compliant comments added for your benefit):

{
  "core": {
    // The directory where storybook-chrome-screenshot dumps its screenshots, which reg-suit will use to compare against screenshots from master.
    "actualDir": "__screenshots__",
    // The directory where reg-suit will dump its HTML report, as well as images showing the visual differences if there are any.
    "workingDir": "__screenshots_tmp__",
    // This determines how forgiving reg-suit should be of differences between screenshots.
    "threshold": 0,
    "addIgnore": true,
    "thresholdRate": 0,
    "ximgdiff": {
      "invocationType": "client"
    }
  },
  "plugins": {
    // Use CI environment variables to determine which commits to compare screenshots from (in this case, master vs. HEAD).
    "reg-simple-keygen-plugin": {
      "expectedKey": "${TARGET_GIT_COMMIT}",
      "actualKey": "${GIT_COMMIT}"
    },
    // Notify GitHub of the results of the visual regression test.
    "reg-notify-github-plugin": true,
    // Publish screenshots and test results to S3. Note that your CI will need to have AWS credentials configured for this to work.
    "reg-publish-s3-plugin": {
      "bucketName": "<BUCKET_NAME>"
    }
  }
}

Given the above config, reg-suit stores the screenshots for each commit in an S3 bucket, under a directory named for the Git SHA of that commit. Screenshots from master live in the same bucket, under the directory for master’s commit SHA. reg-suit then uses the reg-simple-keygen-plugin to identify those two directories and run comparisons between the screenshots in each.
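Conceptually, once the screenshots for both commits are available, the first step of a comparison is to pair them up by file name. The following is a rough sketch of that bookkeeping under stated assumptions, not reg-suit’s source code: classifyScreenshots, its result fields, and the file names in the usage note are all hypothetical.

```typescript
// Hypothetical result of pairing two directories of screenshots by file name.
interface ScreenshotSets {
  toCompare: string[] // present at both commits: needs a pixel comparison
  added: string[]     // only present at the PR's commit
  removed: string[]   // only present on master
}

// expectedFiles: file names stored for master; actualFiles: file names stored for the PR.
function classifyScreenshots(expectedFiles: string[], actualFiles: string[]): ScreenshotSets {
  const toCompare = actualFiles.filter(file => expectedFiles.indexOf(file) !== -1)
  const added = actualFiles.filter(file => expectedFiles.indexOf(file) === -1)
  const removed = expectedFiles.filter(file => actualFiles.indexOf(file) === -1)
  return { toCompare, added, removed }
}
```

For example, calling classifyScreenshots(["a.png", "b.png"], ["b.png", "c.png"]) marks b.png for comparison, c.png as added, and a.png as removed. Only the files in toCompare need a pixel-level diff.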

🍱 Putting it all together

I mentioned earlier that the impetus for screenshot testing of helloclue.com was the conversion of CSS files to styled-components. Once we got the setup described above working on helloclue.com, I started the conversion process.

What was so cool about having automated screenshot tests during this process was that I could make tons of changes to the styling of components without worrying too much about making mistakes, since I knew they’d be caught.

At right, how the loading author page should look on mobile. In the middle, how it actually looked in my PR. On the left, red highlights where changes were detected.

And of course, I did make mistakes! While converting the author page to styled-components, for example, I mistakenly left out the styling that made the placeholder for the author’s title appear as a gray bar. Since our visual regression tests are integrated with GitHub, reg-suit commented on my PR to inform me that some visual comparisons had failed³: they didn’t match what was on master. My visual regression testing was actually working!

🎁 Fin

There are two things that I hope are clear from this article. First, visual regression testing is important! It can help you catch major UI bugs that your end-to-end tests missed. And second, there are tools available to make this easy in React. It’s just a matter of putting them together to make them work for you!

I’d love to hear from you in the comments:

  1. Was there anything that could be made clearer about the setup steps in this article? I’m happy to explain in the comments, or even to edit this article to make it easier to understand.
  2. What other tools, if any, are you using right now to do visual regression testing—particularly for React?
  3. Any other thoughts or feedback?

¹ Note that this works best with presentational components. Our presentational components simply take properties (such as isLoading) and output DOM. API calls, timeouts, etc. are all handled in container components. If you’d like to read more about separating business logic from presentation logic in React, I highly recommend Dan Abramov’s excellent article on the topic.

² Our actual screenshot configuration (with comments added for your benefit) looks like this:

// stories/index.tsx
import { addDecorator } from "@storybook/react"
import { initScreenshot, withScreenshot } from "storybook-chrome-screenshot/lib"

addDecorator(initScreenshot())

// Rather than wrapping an individual story in `withScreenshot()(...)`, we'll add a decorator to the entire Storybook instance. This way, it'll take screenshots of every single story.
addDecorator(withScreenshot({
  // A one-second delay ensures that fonts load before screenshots are taken.
  delay: 1000,
  // We take screenshots at multiple viewport sizes, to ensure that various media queries are covered.
  viewport: [
    {
      width: 320,
      height: 568,
      isMobile: true,
      hasTouch: true,
    },
    {
      width: 768,
      height: 1024,
      isMobile: true,
      hasTouch: true,
    },
    {
      width: 1024,
      height: 768,
      isMobile: true,
      hasTouch: true,
    },
    {
      width: 1280,
      height: 800,
    },
    {
      width: 1440,
      height: 900,
    },
  ],
}))

// Import paths omit the .tsx extension, which TypeScript resolves automatically.
import "./AuthorPage"
// [import all other stories...]

³ It’s worth noting that we use a custom GitHub integration which doesn’t actually fail the build when screenshot comparisons “fail.” This is because changes to the UI are often intentional. Instead of failing the build, our GitHub integration simply comments on the PR with a count of how many screenshots changed from master to the PR. If that count is greater than 0, I can manually review the visual regression report and determine whether or not all the changes in the PR were intentional.
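As a minimal sketch of that behavior, the function below turns a count of changed screenshots into a PR comment while always reporting success to CI. Our integration’s real code isn’t shown in this article, so buildVisualRegressionReport, its field names, and the comment wording are hypothetical.

```typescript
// Hypothetical shape of what the integration hands back to the CI job.
interface VisualRegressionReport {
  commentBody: string // text to post as a comment on the PR
  exitCode: number    // always 0: changed screenshots never fail the build
}

function buildVisualRegressionReport(changedCount: number, reportUrl: string): VisualRegressionReport {
  if (changedCount === 0) {
    return { commentBody: "No screenshots changed from master.", exitCode: 0 }
  }
  const noun = changedCount === 1 ? "screenshot" : "screenshots"
  const commentBody = [
    `${changedCount} ${noun} changed from master.`,
    `Review the visual regression report: ${reportUrl}`,
  ].join("\n")
  // Changes may be intentional, so a human reviews them instead of CI rejecting them.
  return { commentBody, exitCode: 0 }
}
```

Keeping the exit code at 0 leaves the decision with the reviewer: the comment flags that something changed, and the linked report shows what.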