How we do visual regression testing

There are well-established tools for automating the testing of how software behaves, but very few that automate the testing of how software looks. This is how Friday does visual regression testing.

We first looked into visual regression testing in 2012, beginning with Wraith, which we later forked to use Selenium WebDriver rather than PhantomJS. We also tried Backstop and PhantomCSS, but neither stuck in our workflow. A major stumbling block was deciding when a frontend engineer should run the tests. Ad hoc during feature development? Before submitting a pull request for a feature? On the develop branch once a feature is merged? Nightly? Or should a QA automation engineer run them instead, as part of a test suite?

After a quick brainstorm with our engineering team, we decided that we needed the following basics:

  • Separate tools for taking screenshots and reporting differences. There are so many ways to define and set up a test (load a single URL; crawl a whole site; target a specific DOM element; change DOM or session state first, etc.) that screenshot capture should not be tied to the reporting tool.
  • Use our preferred tool, Selenium, so that we can run tests across multiple browsers and platforms.
  • Negligible setup for each new project.
  • Simple integration with continuous integration tools (we use Jenkins) so that developers do not need to run tests manually.
  • Integration with existing test automation tools to reduce the overhead of maintaining yet another test suite.

Introducing Spectre

Spectre is a Ruby on Rails application that manages your visual regression test suites. It provides an API for runner scripts to submit screenshots and receive a pass or fail in real time, and a simple UI for browsing and inspecting diffs.

Here’s how we use it:

Each of our projects has a test runner (usually a Rake task), triggered nightly (1), that either contains a list of URLs or is pointed at a styleguide to crawl. The script uses our Selenium grid (2) to load each URL and take a full-height screenshot at multiple viewport widths (3), then posts each screenshot to Spectre along with metadata such as test name, viewport width and source URL (4).
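As a rough illustration of the runner's bookkeeping, the sketch below expands a list of pages into one screenshot job per viewport width and builds the metadata that would travel with each screenshot. Everything named here (ScreenshotJob, build_jobs, payload_for, the field names and the example URLs) is hypothetical and chosen for illustration; the browser automation and the HTTP call to Spectre are deliberately left out.

```ruby
require "uri"

# Hypothetical names throughout: ScreenshotJob, build_jobs and payload_for
# are illustrative only and are not part of Spectre or Selenium.
ScreenshotJob = Struct.new(:name, :url, :width)

# Expand a hash of { test name => URL } into one job per viewport width.
def build_jobs(pages, widths)
  pages.flat_map do |name, url|
    widths.map { |width| ScreenshotJob.new(name, url, width) }
  end
end

# Metadata to send alongside the screenshot. The field names are assumptions.
def payload_for(job, browser:)
  {
    name: job.name,
    browser: browser,
    size: job.width,
    source_url: URI(job.url).to_s
  }
end

pages = {
  "Homepage" => "http://example.com/",
  "Styleguide buttons" => "http://example.com/styleguide/buttons"
}
widths = [320, 768, 1280]

jobs = build_jobs(pages, widths)
jobs.each { |job| puts payload_for(job, browser: "firefox") }
```

In a real runner, each job would resize the browser, capture the screenshot and submit it together with this metadata.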

Spectre ingests the screenshot and compares it against a previous test of the same name. If the images are sufficiently different, the test fails (5) and the Rake task reports a Jenkins build failure. Jenkins then notifies the team via HipChat (6).
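To make "sufficiently different" concrete, here is a minimal sketch of a threshold check, assuming each image is available as a flat array of pixel values. This is not Spectre's actual comparison code; diff_percentage, pass? and the 1% default threshold are assumptions for illustration.

```ruby
# Percentage of pixels that differ between two images, each given as a
# flat array of pixel values. Hypothetical helper, not Spectre's own code.
def diff_percentage(baseline, candidate)
  unless baseline.size == candidate.size
    raise ArgumentError, "images must have the same dimensions"
  end
  return 0.0 if baseline.empty?

  changed = baseline.zip(candidate).count { |a, b| a != b }
  changed * 100.0 / baseline.size
end

# A test passes when the difference stays at or below the threshold
# (the threshold value here is an assumed default).
def pass?(baseline, candidate, threshold: 1.0)
  diff_percentage(baseline, candidate) <= threshold
end

baseline  = [0xFFFFFF] * 100   # 100 white pixels
candidate = baseline.dup
candidate[0] = 0x000000        # change two pixels to black
candidate[1] = 0x000000

puts diff_percentage(baseline, candidate)
puts pass?(baseline, candidate) ? "pass" : "fail"
```

A runner could map a failing result to a non-zero exit status, which is what turns a visual difference into a failed CI build.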

Static and dynamic content

But while testing against a frontend styleguide or set of HTML templates is all fine and dandy, and will catch the most obvious regressions, it doesn’t guarantee complete coverage. The main reason we developed Spectre in the way we did was to completely integrate visual regression testing into our existing testing toolsets of choice: RSpec and Cucumber.

Frontend and QA, sitting in a tree, visually r-e-g-r-e-s-s-i-n-g

Landed on a product page? Take a screenshot. Opened an accordion? Take a screenshot. Added four products to your basket? Take a screenshot. Submitted a lead form to Salesforce? Take a screenshot. The result is *full* visual regression coverage, from static frontend styleguides through to end-to-end integrations with content management systems, CRMs and payment gateways. No stone left unturned.
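One way to picture these checkpoints inside an RSpec or Cucumber run is a small helper that a step calls whenever the page reaches an interesting state. The sketch below is hypothetical: VisualCheckpoints, its check method and the driver and client interfaces are invented for illustration, and the fake driver and client stand in for a real browser session and a real Spectre client so the example runs on its own.

```ruby
# Hypothetical helper: the class, method names and the driver/client
# interfaces are assumptions, not an existing Spectre or Selenium API.
class VisualCheckpoints
  def initialize(driver:, client:, widths:)
    @driver = driver
    @client = client
    @widths = widths
  end

  # Capture the current page at each width and submit it under `name`.
  # Every width is submitted; returns true only if all submissions passed.
  def check(name)
    results = @widths.map do |width|
      @driver.resize_to(width)
      @client.submit(name: name, size: width, screenshot: @driver.screenshot)
    end
    results.all?
  end
end

# Stand-ins so the sketch runs without a browser or a server.
class FakeDriver
  def resize_to(width)
    @width = width
  end

  def screenshot
    "png-bytes-at-#{@width}"
  end
end

class FakeClient
  attr_reader :received

  def initialize
    @received = []
  end

  def submit(**test)
    @received << test
    true
  end
end

client = FakeClient.new
checkpoints = VisualCheckpoints.new(driver: FakeDriver.new, client: client, widths: [320, 1280])

# In a step definition this would follow the action under test.
passed = checkpoints.check("Basket with four products")
puts passed
```

Because the helper only depends on the two injected objects, the same call can sit in a step definition, an RSpec example or a standalone script.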

So many uses

  • Day-to-day maintenance of large-scale component libraries with many concurrent frontend and QA engineers.
  • Screenshotting a local site after making large-scale typography changes for client review.
  • Performing a full regression of 120 components and 40 HTML templates after swapping ruby-sass for libsass, to ensure that differences in float rounding in the CSS output would not impact layout.
  • Performing a full site regression after refactoring a Handlebars data binding implementation.

Give Spectre a try and let us know how you get on!

Are you an engineer looking for a new home? Want to see what else we can do? Why not drop us a line at

Autodidact but not autistic.
