Test Pyramid: the key to good automated test strategy

The problem

Companies are investing a lot of money into automated testing, but they are

  • not seeing improved quality
  • not seeing improved productivity
  • not seeing improved user sentiment

One of the main reasons for this is the over-reliance on browser-based end-to-end tests.

If a tech team has a visible build radiator, then it might look like this:

I often see

  • Long-running test suites: 30 minutes to 4 hours, sometimes days
  • Flaky tests that fail 1 in 10 or even 1 in 3 runs
  • Too much QA and developer time spent creating and maintaining regression tests, rather than finding and fixing problems
  • Environments that are broken or do not reflect production, negating the usefulness of the regression tests

Introducing the Test Pyramid

In 2012 Martin Fowler wrote an article about the test pyramid, a concept developed by Mike Cohn. Within my company (ThoughtWorks) it is well known and used day to day to inform how we design our automated testing and continuous integration practices, but it is surprisingly little known in the wider software industry.

The test pyramid is a tool to fix the problem of over-reliance on long-running UI tests.

Testing pyramid

The pyramid says that tests on the lower levels are cheaper to write and maintain, and quicker to run. Tests on the upper levels are more expensive to write and maintain, and slower to run. Therefore you should have lots of unit tests, some service tests, and very few UI tests.

I often see a company's test suite looking like an ice cream cone.

Testing ice cream cone

Here they have very few unit tests, some UI tests, lots of QA tests and lots of manual tests.

In this situation the QA department has created an automated test suite, but the development team has not. It will be very long-running and flaky because the development team has not helped build the suite or architect the application in a way that makes it easy to test. It is broken by the devs very regularly, and they rely on the QA department to fix it.

There is probably both a manual test team and an automated test team. There is likely not enough trust in the test suite, so regression is being performed twice, once manually and once automatically. Because of this double work, and because of the extra hand-offs between the testing teams, this approach has made the company go slower.

Testing hourglass

In this situation there are a lot of unit and UI tests but few service-based tests.

This is better. There is no manual testing, and the QA department has created a test suite that the company is relying on. The developers are writing a lot of unit tests; perhaps they are practicing TDD. However, a lot of the logic that is tested in the unit tests is also tested in the user interface tests, again causing double work. The QA team is relying almost solely on browser-based tests for regression, instead of cheaper service-based tests.

Both the hourglass and the ice cream cone are indicators of a lack of collaboration and communication between the QA and Development departments. They are probably separated organizationally and even by location.
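To make the three shapes described above concrete, here is a minimal sketch that labels a suite from the number of automated tests at each layer. The `SuiteCounts` structure, the `suite_shape` function and the comparison rules are my own illustrative assumptions, not a standard definition, and manual tests are left out entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SuiteCounts:
    """Number of automated tests at each layer (hypothetical structure)."""
    unit: int
    service: int
    ui: int


def suite_shape(counts: SuiteCounts) -> str:
    """Label a suite by comparing layer sizes; the rules are illustrative only."""
    if counts.unit > counts.service > counts.ui:
        return "pyramid"
    if counts.ui > counts.service > counts.unit:
        return "ice cream cone"
    if counts.service < counts.unit and counts.service < counts.ui:
        return "hourglass"
    return "unclear"


# Made-up numbers, purely to exercise each branch.
print(suite_shape(SuiteCounts(unit=1200, service=150, ui=20)))
print(suite_shape(SuiteCounts(unit=30, service=80, ui=400)))
print(suite_shape(SuiteCounts(unit=900, service=10, ui=350)))
```

A count like this is only a rough signal. It says nothing about how much logic each layer duplicates, which is the real cost in the hourglass case.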

Why are the hourglass and the ice cream cone a problem

With a lot of UI tests, the regression suite will be slow. This affects the productivity of the whole team, because it takes a long time for developers to get feedback on their changes. The continuous part of Continuous Integration is slowed down. It adds a delay before developers can find out whether their changes work and whether they integrate with their dependencies, and before they can get feedback from business owners or users.

Ideally, you would want a test suite that runs in 10–15 mins to not adversely affect productivity.

The slippery slope

One solution teams take when the test suite grows large and slow is to change the workflow of the team. This is a slippery slope. Perhaps the team no longer expects developers to run the full suite of tests before they check in to source control, and the test suite runs on the CI server only after the check-in.

If the team is still complaining about the time or stability they may choose to move the test suite to an overnight daily build.

Worse still, the company may lose all confidence in the test suite and decide to release software to QA environments or even production without the test suite passing.

This starts to change the behavior of the team. Slowly the team relies less on the automated test suite, and it becomes out of sight and out of mind for the developers. To a developer, the test suite is something that another team is running and worrying about somewhere. It is not maintained well, and it becomes slower and more flaky, thus losing its effectiveness.

There comes a point when someone will ask why we are spending so much money to build a test suite but not seeing any value: we have not improved quality, productivity or user sentiment.

Why are UI tests so expensive to create and maintain

User interface (UI) tests are usually browser-based tests written with a tool such as Selenium. These tools go through multiple layers (network, browser, database) to get to the code they are testing, which adds latency to every operation. Of course, this is by design: they are intended to be as close to the user’s real experience as possible, but they can never be as fast as an in-process unit test or an API service test.

It is not uncommon to see a test for every acceptance criterion or for every story, which very quickly balloons the number of tests you are writing. Each test could take 10 seconds, or even up to 30 seconds, and each one adds to the time you wait for feedback on the change that has been made.
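To see how quickly this adds up, here is a back-of-the-envelope sketch using the 10 to 30 seconds per test mentioned above. The test counts are made-up examples, and the calculation assumes the tests run one after another with no setup time or parallelism.

```python
def suite_minutes(test_count: int, seconds_per_test: float) -> float:
    """Serial runtime in minutes, ignoring setup, teardown and parallelism."""
    return test_count * seconds_per_test / 60


# Hypothetical suite sizes, each timed at 10 and at 30 seconds per test.
for count in (50, 200, 500):
    fastest = suite_minutes(count, 10)
    slowest = suite_minutes(count, 30)
    print(f"{count} UI tests: {fastest:.0f} to {slowest:.0f} minutes")
```

Under these assumptions, 200 UI tests take roughly 33 to 100 minutes, already well beyond a 10–15 minute feedback loop.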

UI tests are very hard to write well. Browsers are asynchronous by nature, and different parts of the web page load at different times. The use of AJAX is now commonplace, so it takes quite a bit of effort to write a good deterministic test.
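One common way to cope with this asynchrony is to poll for a condition with a timeout instead of sleeping for a fixed time. The sketch below shows the idea with the standard library only. `wait_until` and `results_visible` are hypothetical names, and the "page" is simulated with a timer rather than a real browser. Selenium ships its own explicit-wait helpers, which you would normally use instead.

```python
import time
from typing import Callable


def wait_until(condition: Callable[[], bool],
               timeout: float = 5.0,
               interval: float = 0.05) -> bool:
    """Poll `condition` until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if condition():
            return True
        time.sleep(interval)
    return condition()  # one last check at the deadline


# Simulated page state: the "results" element appears only after 0.2 seconds,
# standing in for content loaded by an AJAX call.
loaded_at = time.monotonic() + 0.2


def results_visible() -> bool:
    return time.monotonic() >= loaded_at


assert wait_until(results_visible, timeout=2.0)
```

The test proceeds as soon as the condition holds, so it is neither slower than necessary nor dependent on a guessed sleep duration.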

The feedback cycle for a UI test is slow. When you write a unit test and make a mistake, you find out in under a second. With a UI test it could take 10–30 seconds, long enough to get distracted, look at your phone or check your email.

With an extensive test suite, it is vital to add structure, abstractions, and organization. UI tests are typically seen as scripts and are not treated as production code. Without refactoring to clean code, the test suite will be difficult to maintain.
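One widely used abstraction for this is the page object: a class that hides a page's selectors behind intention-revealing methods, so that tests do not break in many places when the markup changes. The sketch below is only an illustration. The `Driver` interface, the selectors and `FakeDriver` are all hypothetical, and the fake stands in for a real browser driver so the example runs on its own.

```python
from typing import Protocol


class Driver(Protocol):
    """Minimal browser-driver interface assumed for this sketch."""
    def fill(self, selector: str, text: str) -> None: ...
    def click(self, selector: str) -> None: ...
    def text_of(self, selector: str) -> str: ...


class LoginPage:
    """Page object: the only place that knows the login page's selectors."""
    USERNAME = "#username"
    PASSWORD = "#password"
    SUBMIT = "#submit"
    BANNER = "#welcome"

    def __init__(self, driver: Driver) -> None:
        self._driver = driver

    def log_in(self, username: str, password: str) -> str:
        """Fill in the form, submit it, and return the banner text."""
        self._driver.fill(self.USERNAME, username)
        self._driver.fill(self.PASSWORD, password)
        self._driver.click(self.SUBMIT)
        return self._driver.text_of(self.BANNER)


class FakeDriver:
    """In-memory stand-in so the sketch runs without a browser."""
    def __init__(self) -> None:
        self.fields = {}
        self.clicked = []

    def fill(self, selector: str, text: str) -> None:
        self.fields[selector] = text

    def click(self, selector: str) -> None:
        self.clicked.append(selector)

    def text_of(self, selector: str) -> str:
        # Pretend the page greets whoever was typed into the username field.
        return f"Welcome, {self.fields.get('#username', '')}"


def test_login_shows_welcome_banner() -> None:
    page = LoginPage(FakeDriver())
    assert page.log_in("alice", "secret") == "Welcome, alice"


test_login_shows_welcome_banner()
```

The test reads as a user action and an expectation, and a selector change touches only `LoginPage`.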

The solution

By aligning your development and automated testing goals, and organizing your test suite into a pyramid, you can avoid these problems. I will explain more in these articles:

Pushing down the test pyramid; using the test pyramid to optimize automated test usefulness

Quality is everyone’s responsibility; Avoiding disconnected Development and QA departments