Integration testing our services for fun and profit

Gerad Suyderhoud
Gladly Engineering
May 5, 2015

At Gladly, unlike most companies, we had the luxury of building our product with a multi-service pubsub-backed architecture from the ground up. While this has been mostly great, we’ve occasionally found documentation on best practices and conventions a bit sparse.

One area where we initially had trouble finding good information was on how to best do integration testing in such an environment. Here’s a brief summary of our research and results.

A brief aside: if you’re also interested in researching and solving problems like this in your day-to-day job, we’re hiring.

tl;dr

We opted to do in-process, contract-based, integration testing by stubbing out connection libraries in each of our services’ existing test suites.

Our challenges

  1. We have a polyglot environment. Our environment consists of Go, Node, and Python-backed services, so our solution needed to work with multiple frameworks and languages.
  2. Our services communicate via pubsub. Since our product has many real-time push-based requirements, we use pubsub for inter-service communication. Our solution needed to support that technology, so we couldn’t use any of the great REST-based solutions like pact and pacto.
  3. We wanted our tests (even our integration tests) to be fast, reliable, and easy to maintain. Who doesn’t?

Our solution

We have a GitHub repo that defines the contracts for our services, along with the source code for a stub communication library for each platform; these stubs validate and respond to requests according to the contract.

These stub libraries are exported via package management (e.g., npm, go get) to each of the services, which are then responsible for exercising all of their relevant contracts by using the stub library within their existing test suites.
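
As a rough picture, the repo pairs the contract definitions with a stub library per platform (this layout is illustrative, not our exact structure):

```
contracts/
  echo.json            # contract definitions, one file per interaction
  go/stubpubsub/       # Go stub library, fetched via go get
  node/stub-pubsub/    # Node stub library, published to npm
  python/stubpubsub/   # Python stub library, installed via pip
```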

How this addresses our challenges

  1. We support multiple languages by exposing a library shim for each language in our environment.
  2. Similarly, we support pubsub by mocking out pubsub functionality in each library shim.
  3. Finally, because our tests are run in-process, without accessing the network, they are as fast and reliable as possible.

The code

Say you have a Go echo service that subscribes to the /echo topic and publishes an echo back to the topic whenever it receives a message.
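
A minimal sketch of such a service, written against a hypothetical pubsub client interface (the Subscribe/Publish names here are illustrative, not our actual client API):

```go
package echo

// PubSub is a stand-in for our pubsub client's interface; the stub
// library provides an in-process implementation of the same shape.
type PubSub interface {
	Subscribe(topic string, handler func(msg []byte))
	Publish(topic string, msg []byte)
}

// Run wires up the echo service: every message received on /echo
// is published straight back to the topic.
func Run(ps PubSub) {
	ps.Subscribe("/echo", func(msg []byte) {
		ps.Publish("/echo", msg)
	})
}
```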

And you have a JavaScript echo library that publishes a message to the /echo topic and calls a callback when it hears anything back.
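
A sketch of that library, assuming the same illustrative subscribe/publish client API:

```javascript
// echo.js: publish a message to /echo and invoke cb with whatever
// comes back. The pubsub client API here is illustrative.
module.exports = function echo(pubsub, message, cb) {
  pubsub.subscribe('/echo', function (reply) {
    cb(reply);
  });
  pubsub.publish('/echo', message);
};
```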

We define a contract that describes the relationship between the two services in JSON.
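
A sketch of what that contract might look like (the field names are illustrative, not a fixed schema):

```json
{
  "topic": "/echo",
  "interactions": [
    {
      "description": "echoes the consumer's message back on the topic",
      "consumer": { "publishes": { "message": "hello" } },
      "provider": { "publishes": { "message": "hello" } }
    }
  ]
}
```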

To test the Go side of the service, we test that it fulfills the contract as a provider (that is, it receives the consumer's message and responds with the provider's expected message).
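
A sketch of the provider-side test. The stub here is inlined for illustration; in our suites the stub library comes from the contracts repo via go get and drives these steps from the contract file itself:

```go
package echo

import "testing"

// stubPubSub is a minimal in-process stand-in for the real client library.
type stubPubSub struct {
	handlers  map[string][]func([]byte)
	published map[string][][]byte
}

func newStubPubSub() *stubPubSub {
	return &stubPubSub{
		handlers:  map[string][]func([]byte){},
		published: map[string][][]byte{},
	}
}

func (s *stubPubSub) Subscribe(topic string, h func(msg []byte)) {
	s.handlers[topic] = append(s.handlers[topic], h)
}

func (s *stubPubSub) Publish(topic string, msg []byte) {
	// record only; no redelivery, so the service cannot loop on its own echo
	s.published[topic] = append(s.published[topic], msg)
}

func TestEchoFulfillsContractAsProvider(t *testing.T) {
	ps := newStubPubSub()
	Run(ps) // the service subscribes in-process

	// Play the consumer's side of the contract: deliver its message.
	for _, h := range ps.handlers["/echo"] {
		h([]byte("hello"))
	}

	// Assert the provider's side: the expected echo was published back.
	if got := ps.published["/echo"]; len(got) != 1 || string(got[0]) != "hello" {
		t.Fatalf("want provider to echo %q, got %v", "hello", got)
	}
}
```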

To test the JavaScript side of the service, we test that it fulfills the contract as a consumer (that is, it sends the consumer's message and handles the provider's expected response).
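
And a sketch of the consumer-side test, again with the stub inlined for illustration (the real stub is published to npm from the contracts repo and replays the contract file):

```javascript
// echo_test.js: run with `node echo_test.js`.
var assert = require('assert');
var echo = require('./echo');

// In-process stub that plays the provider's side of the contract.
function stubPubSub() {
  var handlers = {};
  return {
    subscribe: function (topic, handler) {
      (handlers[topic] = handlers[topic] || []).push(handler);
    },
    publish: function (topic, message) {
      // Verify the consumer sent the contract's message...
      assert.equal(message, 'hello');
      // ...then respond with the provider's expected echo.
      (handlers[topic] || []).forEach(function (h) { h(message); });
    },
  };
}

echo(stubPubSub(), 'hello', function (reply) {
  assert.equal(reply, 'hello'); // the consumer handled the provider's response
  console.log('contract fulfilled as consumer');
});
```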

Why we chose this approach

In process

We chose in-process testing because it’s fast and reliable. Since everything runs in-process, there’s no network overhead or instability. As Dirk, one of our founders, put it, “you don’t want your tests to break due to code that is not intentionally being tested.”

In-process testing necessitates stubbing at some level, and for us it made the most sense to stub at the communication library level. This provides two benefits:

  1. It forces us to encapsulate our communication libraries with a consistent API. This will make it easier to switch underlying communication protocols down the road, should the need arise.
  2. We only test code that we have written; we don’t double-test code provided by upstream libraries.

Contract-based

We chose contract-based tests because they allow us to define tests that run against both upstream and downstream services independently (so teams can develop and test their code independently, but be confident that dependent systems will continue to work).

Contracts are limited to the messages exchanged between services. It’s up to each service’s integration test to set up its state so that response messages match the contract (the ease of doing this is another benefit of in-process testing).

Challenges with this approach

The biggest challenge with this approach is that we need to maintain a stub communication library for every platform we support. This adds a bit of extra cost and complexity every time we add a new language to our supported architecture.

Other challenges to this approach include:

  • Service authors are responsible for ensuring that they test all their contracts.
  • Authors are also responsible for manually testing upstream/downstream services when they change a contract.
  • Since state is not defined within the contract, authors are responsible for setting up initial state within their tests so that their input and output can match the contract.

Other solutions

We also considered and rejected the following alternative approaches for testing our services:

End to end testing

Early in the product’s lifecycle, we could probably have gotten away with just running our application against the full stack and verifying the results in the UI. It was tempting, since it didn’t require any additional development work! (We already had a front-end test driver.)

But we collectively had had enough bad experiences with the performance and reliability challenges of end-to-end testing that we decided to make the upfront investment in a slightly more sophisticated solution.

We still do run end-to-end testing in all our clusters with synthetic transactions. Since we are stubbing at the library level, this is our best guarantee of correct full system behavior.

Wireline playback tests

Another approach we considered was running a proxy at the wire level to record, verify, and playback behavior.

Similar to end-to-end testing, we’ve had bad experiences with this approach in the past. It breaks the test-first approach, since in order to capture wireline traces for future replay there needs to be an existing implementation. Furthermore, it can be hard to debug and maintain tests that are written like this.

Stub server

The most compelling alternative proposal we explored was to create a stub pubsub server that each service could connect to for the purpose of running tests.

There are a number of benefits to this approach:

  • it would insulate us from having to stub out communication libraries for each of our platforms
  • the server can more easily verify that all the contracts have been tested (right now service authors must ensure this in their test suites)
  • the server can more easily ensure that both ends of the contract are tested when the contract changes (again, this is a manual step with our current solution)
  • the server could provide a uniform graphical UI for showing progress and displaying how a service diverged from the contract in the case of failures.

Ultimately, we shied away from this solution due to the complexity and perceived costs of implementing it, though mountebank looks promising in the long-term.

Conclusion

We’ve been using this approach for our pubsub infrastructure for almost 3 years and it has served us well. And we’ve continued to invest in it even though we have recently introduced pact for our REST endpoints.

If you know of a better solution, we’d love to hear it in the comments. And if you’re interested in working on problems like this, we’re hiring.

