How to test your backend in a microservice architecture

I hardly need to tout the benefits of a microservices architecture these days; I feel the entire internet has done that already. But how do you go about testing all these services?

Firstly, we must consider: why do we have tests? I’ve actually heard several answers to this question, and depending on the answer a person gives, you can almost guess at what approach to testing that person will advocate. In my view, the primary reason we have tests is to make sure we don’t deploy a broken component to production, protecting our customers from unwanted bugs and regressions.

From this purpose, we can easily derive the task our tests must perform: ensure that any API call an end user might make has the expected result. No more, no less.

Therefore, for the purpose of testing, we can assume that your app looks like this:

the “app cloud” here represents all microservices that make up our app

The clients speak to the external API, which in turn speaks to the microservices that make up your app. Note that the client doesn't know what services hide behind the external API, nor should it; that's an implementation detail, and thus subject to change!

Integration tests speak to the app via the external API, just like a client would

I’m going to argue that your tests should mimic the client’s behavior. After all, if they don’t, how will you know with certainty that your customers’ experience won’t be negatively impacted by your next deploy?

While we’re talking about trusting your tests, there is one ground rule to testing that I cannot stress enough:

Never ever use mocks! Ever.

If you mock, you can no longer trust your tests. Unit tests are similarly worthless for our defined purpose, since our customers will hardly use an isolated component; they use the system as a whole.

This is not to say that you shouldn’t build isolated components! You most definitely should make sure your components are as isolated and loosely coupled as possible! But when we test, we test the system.

Test what you sell.

Moving on. Often, our features deal with data of some kind. Almost always, that data is persisted somehow within our application. This means we likely need some test data!

Remember, the point of testing is discovering bugs before they affect customers, so we probably shouldn’t use production data to test. What if we have a bug that destroys or corrupts the data? Obviously we can’t risk that, so what do we do?

This is where things get a little less obvious. A very common pattern I’ve seen with regard to test data for integration tests is to load some database dump that contains a known state.

The app loads a dump with a known state

This might make intuitive sense, but don’t do it!

Inserting data by importing a dump basically amounts to mocking the parts of your code that would normally create that data. Surely that’s not how data is created in production? Most likely, you have a set of API endpoints, or maybe even an entire separate API, that handles data creation:

On my current project, there’s a separate client for creating data. It’s often called the CMS, but here I’ll refer to it as the “backoffice API”

Again, our tests should mimic the client, so test data should be created dynamically, via an API. That API should be the same API that is used in production to create data. Whether it consists of extra endpoints on the same API from which clients retrieve their data, or is a separate API altogether (as in the example above), depends on the application. For clarity, I’ll assume it’s a separate API, and I’ll refer to it as the backoffice API.

This is what our test environment might look like, with the test suite consuming both the external and backoffice APIs.

Test suite using both the backoffice API and the external API in order to test the whole system.

In this setup, each test should create its own test data in a suitable manner, run its assertions, and then remove the test data. The values of the test data should, whenever possible, be randomized, such that the test never runs twice with identical data.

It’s important to eliminate assumptions from our test code. Therefore, we should aim to have values randomized at runtime whenever possible. This ensures our code can handle a wider range of possible inputs, widening the range of bugs caught by our tests!

Ignoring HTTP requests for a moment, one could even picture this as a circuit, where data flows from the setup phase of the test, via the backoffice API, through the application, out from the external API, and back into the assertion phase of the test, where it’s verified to have gone through the expected transformations (if any).
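Here’s what one loop around that circuit might look like, as a minimal TypeScript sketch. The /entries endpoints, the base URLs, and the response shapes are assumptions made purely for illustration; your own APIs will differ:

import { randomUUID } from "node:crypto";

// Hypothetical base URLs; substitute your own services.
const BACKOFFICE = "https://backoffice.example.com";
const EXTERNAL = "https://api.example.com";

async function testEntryIsRetrievable(): Promise<void> {
  // Setup: create randomized test data via the backoffice API,
  // so the test never runs twice with identical data.
  const title = `test entry ${randomUUID()}`;
  const created = await fetch(`${BACKOFFICE}/entries`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ title }),
  });
  const { id } = await created.json();

  try {
    // Exercise: read the entry back through the external API,
    // exactly as a client would.
    const response = await fetch(`${EXTERNAL}/entries/${id}`);
    const entry = await response.json();

    // Verify: the data made it around the circuit as expected.
    if (entry.title !== title) {
      throw new Error(`expected title "${title}", got "${entry.title}"`);
    }
  } finally {
    // Teardown: remove the test data, even if the assertion failed.
    await fetch(`${BACKOFFICE}/entries/${id}`, { method: "DELETE" });
  }
}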

One important thing to note here is the number of assumptions our test suite makes. In a perfect world, you have zero assumptions in your test suite. In reality, though, it’s often not possible to reach that goal. Configuration parameters, for example, are often hard or impossible to change dynamically or even read from the test suite, so any feature that depends on some config variable might include an assumption about what that value is set to.

Second law of testing: Make as few assumptions about state as possible.

The second law of testing is a lot harder to adhere to than the first. Allow me to dive into an example:

Say we’re testing a search feature. First, we create two new entries, say:

{"id":1, "title":"Hoola Bandoola Band Live in concert", ...},
{"id":2, "title":"Bananas are tasty and rich in fiber", ...}

When we search for “hoola bandoola” we should expect the result to only contain the first entry, right?

Wrong! Before you continue, can you guess why?

Answer: The database we’re querying might contain entries about “Hoola Bandoola Band” other than the one we created as part of this test! After all, who wouldn’t write more than once about these guys?

Still, we want to make sure we don’t get any irrelevant results from our search, right?

The proper assertion to make in this case is to check that our search result contains the entry where id=1, but not the entry where id=2. It may or may not include an entry where id=3.

So we must be careful not to assume that the entry we’re looking for is the only, or even the first, result of our query.
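As a sketch, that tolerant assertion might look something like this. The search endpoint and response shape are assumed, and in a real test the ids would come from the setup phase rather than being hard-coded:

async function testSearchIsRelevant(): Promise<void> {
  // Query the search feature through the external API, like a client would.
  const response = await fetch("https://api.example.com/search?q=hoola+bandoola");
  const results: Array<{ id: number }> = await response.json();
  const ids = results.map((entry) => entry.id);

  // Our matching entry must be present...
  if (!ids.includes(1)) throw new Error("expected the entry with id=1 in the results");
  // ...the irrelevant one must not be...
  if (ids.includes(2)) throw new Error("did not expect the entry with id=2 in the results");
  // ...and we deliberately assert nothing about any other entries,
  // since the database may contain data we didn't create.
}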

A simple example

Following this approach in your testing, you’ll soon realize that practically all of your tests utilize the backoffice API, and writing test code to call these endpoints quickly gets tiresome.

It’s therefore useful to set up some utility code that helps build all these HTTP requests. A nice abstraction is to have one function (or method, depending on your language) per data type that, with a single call, creates a database entry of that specific type. Wow, that was a mouthful! Allow me to clarify:

Say we are building tests for a blog. A blog will likely contain blog posts, categories that the posts belong to, and comments. There will also be users of different types (authors and readers).
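Sketched as types, that data model might look something like this (the exact shapes are assumptions, just to fix ideas):

type User = { id: number; name: string; role: "author" | "reader" };
type Category = { id: number; name: string };
type Post = { id: number; authorId: number; categoryId: number; title: string; corpus: string };
type Comment = { id: number; postId: number; authorId: number; body: string };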

A simple test verifying if we can list the posts of a specific author now needs to:

  1. Create a user of the author type
  2. Login as that user
  3. Create a title and corpus
  4. Publish the post
  5. Repeat steps 3&4 a couple of times so that we actually have a list of posts
  6. Log out
  7. Log in as visitor
  8. Request all the posts of the author we created in step one
  9. Verify the result

All of the above steps except the last mean sending at least one HTTP request each. That’s gonna be a lot of code! And if you want a test for adding comments, that test will have to do it all over again!

Nobody wants to write hundreds of lines if they don’t have to. The good news is, we don’t!

By writing utility code, we can reduce steps 1&2 down to

loginAsNewAuthor();

steps 3–5 won’t be more than

publishPosts(3);

and so on, you get the point.
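As a sketch, those helpers could keep the session state internally, so the call sites stay as terse as above. The endpoints, payloads, and token handling here are all assumptions, not a prescription:

import { randomUUID } from "node:crypto";

const BACKOFFICE = "https://backoffice.example.com";

// Session token shared by the helpers, set by loginAsNewAuthor.
let sessionToken = "";

// Creates a fresh author with randomized credentials and logs in.
async function loginAsNewAuthor(): Promise<void> {
  const email = `author-${randomUUID()}@example.com`;
  const password = randomUUID();
  await fetch(`${BACKOFFICE}/users`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ email, password, role: "author" }),
  });
  const response = await fetch(`${BACKOFFICE}/login`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ email, password }),
  });
  sessionToken = (await response.json()).token;
}

// Creates and publishes `count` posts with randomized titles,
// returning their ids for later assertions.
async function publishPosts(count: number): Promise<number[]> {
  const ids: number[] = [];
  for (let i = 0; i < count; i++) {
    const response = await fetch(`${BACKOFFICE}/posts`, {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${sessionToken}`,
      },
      body: JSON.stringify({ title: `post ${randomUUID()}`, published: true }),
    });
    ids.push((await response.json()).id);
  }
  return ids;
}

With helpers like these, the nine-step test above shrinks to a handful of lines, and every new test gets its setup for free.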

A bonus of doing this is that your backoffice API will be the most well-tested code in your project, without you even thinking about it! You’re welcome.