Stepping back from end-to-end tests: taming external services.

Published in Koble · 6 min read · Dec 2, 2020

This is the first of two articles in which we will go over the techniques and approaches employed at Koble to reduce our reliance on end-to-end tests.

Koble is an API-first company. Our main product is an API for managing insurance products. Koble was built as a platform for insurers to connect to, a service that their own applications could rely on instead of re-implementing core insurance processes.

The Koble API server isn’t a standalone service: it communicates with other systems to fetch, manage, and process the information it requires. The world of software is very much interconnected. As more engineering teams push to become cloud-native, the general trend is for applications to be less monolithic and more nebular: an orchestration of different services that communicate through programmatic interfaces (e.g., HTTP interfaces and pub/sub messaging systems).

Running end-to-end (e2e) tests is often seen as the most foolproof way to test software: you’re using the system the same way a user would, and you’re testing all of the connections between the different components. However, e2e testing tends to become more difficult as system complexity increases. There comes a point when it’s worth taking a step back and questioning whether you want to continue investing in e2e tests or whether you can find a more effective way to test your system.

In this article, we’re going to explore how to replace e2e tests with smaller, more focused tests. In particular, we’re going to cover how to control external services when testing. We’ll explain a technique we use a lot when testing our API’s business logic: the use of test doubles and, more specifically, stubs to replace third-party services. We will introduce the why and the how of this technique and explain the different tests necessary to implement stubs well.

Costs and benefits of e2e tests

We’re not questioning the value of a reliable e2e test. We are, however, questioning the cost of relying on such tests as the primary focus. To enable the e2e testing of complex systems, expensive staging environments are generally created to mimic production as much as possible. A dedicated QA team is often tasked with interacting with the system through a user interface, exercising as much functionality as possible.

This isn’t a terrible strategy: it typically yields useful results, especially in manual testing. However, the nature of interconnected systems often makes e2e testing difficult to do quickly and reliably. A manual tester can check the system reliably, but they’re constrained by the limits of what a human can do (in terms of working hours, for example). An automated test can run more quickly, at any time of the day, but reliability becomes the issue. It’s not uncommon for e2e tests to fail because “something went wrong.” Investigating this “something” takes time, and updating the test code to defend against this possibility can lead to more complicated test code.

Narrow your focus, increase your control

Testing our API in a faster and more cost-effective manner involves reducing complexity so that we can run automated tests reliably, obviating the need for humans and long testing cycles. Effective automated testing requires a great deal of control over the system under test. Although our system is interconnected, we want to test its parts as if they existed by themselves: isolating our code from the outside world is essential. We want to take a step back from e2e tests and focus our testing on specific parts of the system instead. If we test the parts in isolation and then test the connections between the parts, then we’re essentially testing the system in its entirety, but in a way that’s easier to control.

*Figure 1: Understanding the logical separation of our system helps focus testing on one part at a time.*

If our system connects to third-party APIs, then we find ourselves communicating with somebody else’s software, hosted somewhere else. We connect over a network that may or may not work as fast as we want it to. There may be restrictions in place, such as request throttling, where the number of requests we can make in a given time frame is limited. Equally, we might well be building against a service that’s still under development and prone to unexpected bugs and outages, which we don’t want to affect us while we’re trying to develop our own software. These challenges can make automated e2e testing frustrating and painful. We want to eliminate our dependency on third-party APIs when we’re running automated tests for the business logic. The common way to do this is to use test doubles instead of real services.

Unit test + integration test = e2e test?

The idea of using test doubles — fakes, stubs, and mock objects — is by now quite well understood in the world of software development. These test doubles take the place of real services, creating an alternate (yet realistic) controlled environment. Every major programming language has libraries we can use to implement test doubles.
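
To make the idea concrete, here is a minimal Python sketch of a stub standing in for a third-party service. Every name in it (`RiskAdapter`, `StubRiskAdapter`, `quote_premium`, the `band` field) is hypothetical and chosen for illustration; it is not Koble’s actual code or API.

```python
from dataclasses import dataclass
from typing import Protocol


class RiskAdapter(Protocol):
    """Hypothetical interface to a third-party risk-scoring service."""

    def fetch_risk(self, postcode: str) -> dict: ...


@dataclass
class StubRiskAdapter:
    """In-memory stub: returns a canned response, no network involved."""

    canned_response: dict

    def fetch_risk(self, postcode: str) -> dict:
        return self.canned_response


def quote_premium(adapter: RiskAdapter, base_premium: float, postcode: str) -> float:
    """Business logic under test: apply a loading for high-risk areas."""
    risk = adapter.fetch_risk(postcode)
    loading = 1.5 if risk["band"] == "high" else 1.0
    return base_premium * loading


# The test controls exactly what the "external service" says.
stub = StubRiskAdapter(canned_response={"band": "high", "score": 87})
premium = quote_premium(stub, base_premium=100.0, postcode="AB1 2CD")
```

Because the business logic only depends on the adapter interface, the same function can be handed a real adapter in production and a stub in tests.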

There is an obvious risk involved when testing your application against fake services: what if your stubs don’t reflect reality? You’ll have a suite of passing tests, but when it comes to plugging into the real services, you might be met with a few surprises. Therefore, it’s necessary to test these stubs. We need to test that this world of doubles remains a valid representation of the real world.

Testing the stubs shouldn’t pose too many difficulties. For every stub, we know what request we send to the adapter. We need a simple test that sends the same request to the real adapter, then checks that the fake response we’ve been using is still equivalent to the real response.

*Figure 2: Testing a stub is as simple as checking that our fake responses are equivalent to the real responses*
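
One possible way to sketch such a check in Python is shown below. It assumes that “equivalent” means “same keys and same value types”, which is only one reasonable reading; the helper names (`same_shape`, `check_stub`) and the sample payloads are hypothetical, and the real service call is replaced by a stand-in function so the example runs offline.

```python
from typing import Callable


def same_shape(fake, real) -> bool:
    """Return True when two JSON-like values have the same structure."""
    if isinstance(fake, dict) and isinstance(real, dict):
        return fake.keys() == real.keys() and all(
            same_shape(fake[key], real[key]) for key in fake
        )
    if isinstance(fake, list) and isinstance(real, list):
        # Only compare element shapes when both lists have content.
        return not fake or not real or same_shape(fake[0], real[0])
    return type(fake) is type(real)


def check_stub(canned_response: dict, call_real_service: Callable[[], dict]) -> None:
    """Fail loudly if the canned response no longer matches the real one."""
    real_response = call_real_service()
    assert same_shape(canned_response, real_response), "stub has drifted"


canned = {"band": "high", "score": 87}


def stand_in_for_real_service() -> dict:
    # Assumption: in a real check this would send the request over the network.
    return {"band": "low", "score": 12}


check_stub(canned, stand_in_for_real_service)  # values differ, shape matches

# A renamed field in the real response is reported as drift.
drifted = not same_shape(canned, {"band": "low", "risk_score": 12})
```

Comparing structure rather than exact values keeps the check stable when the real service returns different data from one call to the next.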

If we test our doubles well, then we’ve organized things so that we’re performing an e2e test in a segmented fashion. We test the business logic in isolation (unit tests), then we test the assumptions we used to control things in those tests (integration tests). With the parts tested in this way, we should have confidence that an e2e test would work. Essentially, an e2e test would just be making sure that our tests were correct.

Benefits of this approach

Designing our code with stubs brings a few key benefits to the development process:

The tests run faster. Using in-memory stubs is far faster than making network requests. This is a huge bonus for people who follow the red-green-refactor development methodology of running tests after each small code change. We don’t need to validate the stubs every time we run the functional tests: we validate them periodically, at key moments (at Koble, for example, we test the stubs before approving merges).
The tests are more reliable. We’re no longer prey to the vagaries of external services. In our controlled world, they always behave as expected.
The point of failure is clearer. We’ve limited the possibilities of what can go wrong.
Upcoming changes are easier to plan for. If breaking changes to an API we communicate with are announced in advance, we can update our stubs to reflect the changes and see how our system is affected, allowing us to adapt our application before the external service changes.
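
The idea of validating stubs only at key moments can be sketched with Python’s standard `unittest` module. The environment variable `VALIDATE_STUBS`, the helper `stub_checks_enabled`, and the placeholder `fetch_real_response` are all assumptions made for this example; the article does not describe how Koble actually wires this into its merge process.

```python
import os
import unittest
from typing import Mapping

CANNED_RESPONSE = {"band": "high", "score": 87}


def fetch_real_response() -> dict:
    """Hypothetical call to the real third-party service (placeholder body)."""
    raise NotImplementedError("replace with a call through the real adapter")


def stub_checks_enabled(env: Mapping[str, str] = os.environ) -> bool:
    """Stub validation is opt-in, e.g. switched on in a pre-merge pipeline."""
    return env.get("VALIDATE_STUBS") == "1"


@unittest.skipUnless(stub_checks_enabled(), "stub validation runs before merges only")
class StubValidationTest(unittest.TestCase):
    """Slow, network-bound check; skipped during day-to-day development."""

    def test_stub_keys_match_real_service(self):
        self.assertEqual(CANNED_RESPONSE.keys(), fetch_real_response().keys())
```

With the flag unset, the test runner reports this class as skipped, so the fast stub-based tests stay fast; setting `VALIDATE_STUBS=1` opts into the slower checks against the real service.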

Closing thoughts

E2E tests may be the gold standard, but e2e testing can become an expensive exercise. Breaking your dependence on this type of testing can make testing quicker, easier, and more reliable. Replacing e2e tests with unit tests that focus on the business logic and integration tests that focus on joining all the parts will give you the same coverage but in a more controlled manner. At Koble, the original driver for substituting real service calls with stubs was to improve test reliability. We have benefited from the approach in other ways too. It’s a technique worth exploring if you don’t do it already.

Written by Peter Loggie