The Magic of Consumer Driven Contract Tests

Published in Hootsuite Engineering, Aug 27, 2020

Preface

Why hello there! My goal in this article is to share what Consumer-Driven Contract Tests are, how to design them, how we use them at Hootsuite, and some of the technical challenges that come with them. For simplicity, I’ll refer to Consumer-Driven Contract Tests as contract tests. So let’s get started!

What are Contract Tests?

Before you think you are signing your life away: contract tests are an integral part of building a reliable service-oriented architecture.

As an application evolves, code changes often have side effects. These could be schema changes to core application models, changes to the values of individual fields, or changes that alter how your application behaves. In a monolithic application, good unit tests and integration tests would catch breaking modifications. However, unit tests and integration tests simply aren’t enough for microservices. Services are often maintained by different teams. The more teams there are, the more changes can occur to any shared data schema (for example, the events sent between services), and the higher the chance of a breaking change. This is where contract tests shine.

Contract testing is first and foremost a design pattern for testing that does NOT require frameworks to create! Developers often look for frameworks or tools to create tests, but contract tests are not restricted to a particular language and do not require additional tools or frameworks.

Contract tests can be thought of as integration tests between services that depend on each other. These tests ensure that the link between a service providing some functionality (the producer) and a service that depends on it (the consumer) is not broken. That link can be an HTTP endpoint that consumers call, or an event that the producer emits and the consumer consumes. In this article, we focus on contract testing for “asynchronous” flows (producers produce an event that consumers later consume). By no means are contract tests end-to-end tests of your entire application. Instead, they verify that specific parts of your application work correctly by probing those parts, and they are meant to be faster than end-to-end tests!

Here in this article, we focus on consumer-driven contract tests. What does this mean? In each test case, consumers are essentially telling the producers “hey, you are required to behave exactly in this manner, and don’t you dare change that!”. This way tests are focused on ensuring the consumer’s demands are met.
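To make "consumer-driven" a bit more concrete, here is a minimal sketch in plain Scala with no test framework. The `Event` and `Contract` types, the service name, and the field names (`postId`, `text`) are hypothetical illustrations, not Hootsuite's actual schema. The consumer declares the fields it relies on, and the check reports which of them are missing from a producer's event.

```scala
object ConsumerContractSketch {
  // Hypothetical event shape: a name plus a flat map of fields.
  final case class Event(name: String, fields: Map[String, String])

  // The contract is owned by the consumer: it lists the fields the consumer reads.
  final case class Contract(consumer: String, eventName: String, requiredFields: Set[String])

  // Returns the required fields absent from the event (an empty set means the contract is met).
  def missingFields(contract: Contract, event: Event): Set[String] =
    if (event.name != contract.eventName) contract.requiredFields
    else contract.requiredFields.diff(event.fields.keySet)

  def main(args: Array[String]): Unit = {
    val contract = Contract("assignment-service", "post.created", Set("postId", "text"))
    // Extra fields are fine: the consumer only cares about what it reads.
    val ok = Event("post.created", Map("postId" -> "42", "text" -> "hi", "lang" -> "en"))
    // Simulated breaking change: the producer renamed postId to id.
    val broken = Event("post.created", Map("id" -> "42", "text" -> "hi"))
    println(missingFields(contract, ok))
    println(missingFields(contract, broken))
  }
}
```

The point of the design is that the producer is free to add fields, but removing or renaming one that a consumer listed shows up as a failure.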

Contract Test Case Recipe

So, what makes up an individual contract test case? Well, let’s take a look at the following diagram…

(Helpful definitions)

Events — Data resulting from an action or activity that is generated by one service and meant to be sent/consumed by other services
Producers — Services Emitting events
Broker — Intermediate service that holds emitted events and provides them to consumers
Consumers — Services Consuming events

At Hootsuite, contract tests run against our live staging infrastructure, using the services and Kafka brokers deployed on staging.

At a high level, the end-to-end flow of a contract consists of 3 phases:

Phase 1 — Emit Test Data

This normally involves defining and generating or mocking the necessary data by using the producer service. This can be done either through SDKs or by interacting directly with a service’s endpoints to create the data. When that is not possible, mocking the data within the test case is an option, but it’s better to test with live data. The goal is then to emit the data to the brokers.

Phase 2 — Broker Receives Data

Once data is created and emitted, a broker acts as the intermediary in a microservice architecture. At Hootsuite, we use Kafka as the broker between the producers and consumers. A producer service places the data or “event” into Kafka, and services listening to Kafka grab the events they want, when they want them. In phase 2, we want to check whether the event has been emitted and stored in Kafka. This check is optional, as a similar check can be performed in phase 3.

Phase 3 — Consumers Receive Data

Consumers listen to Kafka, identify the events they want, and consume those events into their services. Once the data is consumed, the consumer acts on it to produce either expected data or expected behavior in your application. It is this data or behavior that you want to assert on!
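The three phases can be sketched end to end in a few lines. This is only an illustration: `InMemoryBroker` stands in for Kafka, and `AssignmentConsumer` is a hypothetical consumer that assigns any post mentioning a given user. A real contract test would talk to deployed services instead.

```scala
import scala.collection.mutable

object ThreePhaseSketch {
  final case class Event(postId: String, text: String)

  // Stand-in for Kafka: an in-memory, append-only log of events.
  final class InMemoryBroker {
    private val log = mutable.ArrayBuffer.empty[Event]
    def publish(event: Event): Unit = log += event
    def find(postId: String): Option[Event] = log.find(_.postId == postId)
    def all: List[Event] = log.toList
  }

  // Phase 1: the producer emits an event to the broker.
  def emit(broker: InMemoryBroker, postId: String, text: String): Unit =
    broker.publish(Event(postId, text))

  // Hypothetical consumer: assigns every post that mentions `user`.
  final class AssignmentConsumer(user: String) {
    private val assigned = mutable.Set.empty[String]
    def consume(broker: InMemoryBroker): Unit =
      broker.all.filter(_.text.contains("@" + user)).foreach(e => assigned += e.postId)
    def isAssigned(postId: String): Boolean = assigned.contains(postId)
  }

  def main(args: Array[String]): Unit = {
    val broker = new InMemoryBroker
    val consumer = new AssignmentConsumer("alice")

    emit(broker, "post-1", "@alice please take a look") // Phase 1: emit test data
    assert(broker.find("post-1").isDefined)             // Phase 2: optional broker check
    consumer.consume(broker)
    assert(consumer.isAssigned("post-1"))               // Phase 3: assert on consumer behavior
    println("contract holds")
  }
}
```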

Currently, most contract tests are embedded in the build pipelines of both the producer and the consumer at Hootsuite. When a build on either service runs, the contract tests are cloned from the repo, built, and executed. This way, a breaking change in either the producer or the consumer service is caught before any code gets deployed. To a certain degree, these tests also serve as documentation for developers modifying either codebase (though any test added to a codebase should serve as good documentation). However, build pipelines are not the only place you can run contract tests! Some additional ideas:

Jenkins Scheduled Build Jobs — Running on intervals
Create a contract testing service that executes them at scheduled intervals — proactive and can serve as health status for your services
Embed the tests inside a Docker container, so that the setup time for contract test execution is reduced.

Hootsuite Contract Tests with a Twist

At Hootsuite, contract tests are done with a little more complexity. To achieve more purposeful testing, phase one in the previous section has a bit of a twist. Instead of mocking the data, we can generate the necessary data for the producers by hitting the APIs of major social networks such as Facebook, Twitter, Instagram, and LinkedIn. Once a Tweet or Post is created, if the social network accounts are linked to Hootsuite, our services will receive the creation event and use it as the data for the contract tests! LIVE DATA!

(Pretty cool isn’t it? 😉)

Though this is cool, it adds 3 additional challenges to writing reliable contract tests:

Data Schema Changes

It’s no surprise that APIs get deprecated or changed all the time. This means contract tests often require changes, depending on how the API’s data is formatted and which fields are used.

Data correctness

People make mistakes, and an API’s underlying changes can sometimes go sideways, so you can always end up receiving faulty data. Sometimes your tests aren’t failing because you’ve made a mistake, but because the social network you depend on has a bug!

Timing of Data delivery

Delivery is probably more “on-time” than your average package from Canada Post, but there can always be a delay before you receive the data generated by external APIs, especially when traffic is worse than the morning rush.

Here’s the key point

Hootsuite’s contract tests rely on live data provided by Social Network APIs

This opens the door to a lot of cool testing strategies. At the same time, it introduces additional points of failure for test cases. Still, the pros outweigh the cons, as changes in upstream third-party APIs are also captured by these tests, implicitly contributing to the application’s test coverage and reliability.

Twitter Contract Tests

Recently, at Hootsuite, we migrated our Twitter services from polling to Webhooks, a project referred to as the Twitter Migration. So why was this important for building out contract tests?

Well, to begin with, Webhooks provide a more reliable and near-instantaneous response to any Twitter-related activity! Thus, we are able to create live data directly from the Twitter APIs to test with. This made the tests more meaningful, robust, and reliable: data arrives faster than with polling, which reduces the chance that tests fail due to timeouts.

Here’s how we do it…. with a diagram 😎

Within the contract test suite (the same test suite should also be executed in the producer service), each test case runs through the following steps.

1. Generate Live Data (Producer)

At the beginning of the test case, we make calls to generate Twitter Mentions or DMs through a service that specifically handles interactions with the Twitter API. This allows us to supply dynamic content to be tested. If you are generating live data, the message content should be dynamic and change on each request.

2. Check if Data is Present in Kafka (Broker)

Once the service responsible for emitting Twitter-related events (the producer service) receives the event via Webhook, the data is packaged into an event object and placed into the Kafka brokers. Here, we can assert that the event was generated and placed into Kafka.

3. Check Expected Behavior (Consumer)

Once we know an event is in Kafka, we can test whether the event produces the expected behavior in the consumer services. Note that although the service that determines message assignment and tagging matches the definition of a consumer service, it is easier to test the behavior of its own consumers (the services that actually perform the assigning and tagging of messages). Using the SDKs of these downstream services, we can check whether the correct assignment or tagging occurred.

Alright, this can’t be a technical article without some code …. so……. let’s take a look at an example! Here’s something we’re using for Twitter contract tests!

/*
    Scenario: A registered user directly mentions a registered user
    Expected Result: The receiving registered user should be assigned.
*/
"Creating a Twitter Mention" should "generate an assignment" in new TestCase {
  // Prepare the request with the unique text we want to tweet
  val request = generateCreatePostRequest(
    messageBody = s"@$receiverName $randomDayQuote",
    socialProfileId = senderSocialProfileId
  )

  // PHASE 1
  // Create the tweet
  whenReady(scumApi.createPosts(List(request))) {
    // Wait for the confirmation that a tweet has been made and keep the
    // response fields so we can use them to validate within our own system.
    result: List[CreatePostResponse] =>
      val response = result.headOption.get
      response.postId shouldBe defined

      // PHASE 2
      // Check the broker: verify that the upstream producer service
      // successfully obtained a response and placed the correct event in
      // Kafka, by matching the postId from the tweet creation response
      // against the event's postId.
      eventually {
        validateEventBus(response.postId.get)
      }

      // PHASE 3
      // Check that downstream services performed the expected actions
      // and assert on them.
      eventually {
        // The message assigning service should have assigned the post to a user
        whenReady(isAssigned(response.postId.get, receiverOrgId)) { assigned =>
          assigned shouldBe true
        }
      }

      eventually {
        // The message should be tagged correctly, with the correct number of tags
        whenReady(getTags(response.postId.get)) { tagsResponse =>
          tagsResponse.tags.length shouldBe 1
          tagsResponse.tags.head.id shouldBe tagId
        }
      }
  }
}

Couple things to note:

Eventually blocks

These are critical to performing these tests in Scala. We can use retry policies to handle scenarios like a race condition, where the actual result hasn’t been generated yet.

Imagine having two endpoints for

Creating a message
Retrieving a message

If I make a request to create a message and immediately make a request to retrieve it, I may get an empty response because the message hasn’t been created yet. Does this mean the message can’t be created? Nope! It may just not have been created by the time the retrieve request was made. Thus, using an eventually block with a retry policy that specifies the retry interval and the maximum number of retries lets you check the retrieve endpoint again later to see whether it returns a non-empty response!
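To show the idea behind such a block, here is a hand-rolled sketch in plain Scala. It is not ScalaTest's implementation (ScalaTest's own `eventually` is configured with a timeout and an interval rather than an attempt count); `RetryPolicy`, `retryUntilSuccess`, and the fake `retrieveMessage` endpoint are names made up for this example.

```scala
import scala.annotation.tailrec
import scala.util.{Failure, Success, Try}

object EventuallySketch {
  // intervalMillis: pause between attempts; maxAttempts: total tries before giving up.
  final case class RetryPolicy(intervalMillis: Long, maxAttempts: Int)

  // Re-runs `check` until it stops throwing, or rethrows the last failure
  // once the policy's attempts are used up.
  def retryUntilSuccess[A](policy: RetryPolicy)(check: => A): A = {
    @tailrec
    def loop(attempt: Int): A =
      Try(check) match {
        case Success(value) => value
        case Failure(error) if attempt >= policy.maxAttempts => throw error
        case Failure(_) =>
          Thread.sleep(policy.intervalMillis)
          loop(attempt + 1)
      }
    loop(1)
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical "retrieve message" endpoint that is empty for the first two calls,
    // simulating a message that has not been created yet.
    var calls = 0
    def retrieveMessage(): Option[String] = {
      calls += 1
      if (calls < 3) None else Some("hello")
    }

    val message = retryUntilSuccess(RetryPolicy(intervalMillis = 50, maxAttempts = 5)) {
      retrieveMessage().getOrElse(throw new NoSuchElementException("message not created yet"))
    }
    println(s"got '$message' after $calls calls")
  }
}
```

An empty response on the first attempts is treated as "not yet" rather than "broken"; the test only fails if the message is still missing after the last attempt.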

Define your policy carefully

It may be alright for calls to internal Hootsuite services to use more frequent retries, but when it comes to third-party APIs? Use longer intervals between retries and fewer retries!
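One way to reason about this trade-off is to write the two policies down and compare how many calls each can make and how long each can wait in the worst case. The numbers below are purely illustrative assumptions, not Hootsuite's settings, and `RetryPolicy` is a hypothetical type for this sketch.

```scala
object RetryBudgetSketch {
  final case class RetryPolicy(intervalMillis: Long, maxAttempts: Int) {
    // Worst case: every attempt fails, so we sleep once between each pair of attempts.
    def worstCaseWaitMillis: Long = intervalMillis * (maxAttempts - 1).max(0)
  }

  // Illustrative values only; tune them against real latencies and rate limits.
  val internalService: RetryPolicy = RetryPolicy(intervalMillis = 200, maxAttempts = 25)
  val thirdPartyApi: RetryPolicy = RetryPolicy(intervalMillis = 5000, maxAttempts = 4)

  def main(args: Array[String]): Unit =
    for ((name, p) <- List("internal" -> internalService, "third-party" -> thirdPartyApi))
      println(s"$name: up to ${p.maxAttempts} calls, worst-case wait ${p.worstCaseWaitMillis} ms")
}
```

With these example values, the third-party policy is willing to wait longer overall while making far fewer calls, which is the shape you want when someone else's rate limits are involved.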

WhenReady blocks

These can be thought of like await in JavaScript: they just handle async responses! That’s it!
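For readers who want to see the analogy spelled out, here is a rough approximation using only the Scala standard library. It is a sketch, not ScalaTest's `whenReady` (which comes from its `ScalaFutures` trait and takes its patience settings from a configuration); `whenReadySketch` and the fake `isAssigned` call are hypothetical names.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object WhenReadySketch {
  // Blocks until the future completes (or the timeout passes), then hands the value to `body`.
  // A failed or timed-out future surfaces as an exception, which fails the test.
  def whenReadySketch[A, B](future: Future[A], timeout: FiniteDuration = 2.seconds)(body: A => B): B =
    body(Await.result(future, timeout))

  // Hypothetical async call standing in for an SDK method.
  def isAssigned(postId: String): Future[Boolean] = Future(postId.nonEmpty)

  def main(args: Array[String]): Unit =
    whenReadySketch(isAssigned("post-1")) { assigned =>
      assert(assigned)
      println("assigned")
    }
}
```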

Technical Challenges

Now, a couple of things I’ve encountered while creating these contract tests.

Testing Flakiness — Many moving parts within contract tests can result in test flakiness

Do your technical due diligence, wait for a reasonable time for network calls to return, and set a good retry policy that maximizes the success rate.
Accept the fact that some tests may fail due to flakiness. As long as the test case still highlights and covers an important part of your code, some flakiness may be acceptable.

Identifying points of failure

This relates to the point above. Sure, contract tests can alert developers that something is wrong. However, it is often difficult to pinpoint exactly where the problem is. This still requires manual debugging: tracing an event’s path to identify what could have gone wrong. For example, sometimes the failure is not related to your tests or your services; the upstream APIs have failed instead. It may take quite a bit of time to identify that. To combat this, an SOP (Standard Operating Procedure) can be created to tell developers where to look first. This way, it’ll be easier to identify the points of failure!
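Part of such an SOP can even live next to the tests. The sketch below is one possible shape, with hypothetical names throughout: it runs the phase checks in order, reports the first phase that fails, and maps it to a "look here first" hint. The hints themselves are example text, not an actual Hootsuite procedure.

```scala
object TriageSketch {
  sealed trait Phase
  case object EmitTestData extends Phase     // Phase 1: creating data via the external API
  case object BrokerReceives extends Phase   // Phase 2: event expected in the broker
  case object ConsumerReceives extends Phase // Phase 3: expected consumer behavior

  // Runs the checks in order and returns the first phase whose check fails.
  // Later checks are not run once a failure is found.
  def firstFailure(checks: List[(Phase, () => Boolean)]): Option[Phase] =
    checks.find { case (_, check) => !check() }.map(_._1)

  // Example SOP entries: where to look first for each failing phase.
  def whereToLook(phase: Phase): String = phase match {
    case EmitTestData     => "third-party API status, credentials, and rate limits"
    case BrokerReceives   => "producer service logs and webhook delivery"
    case ConsumerReceives => "consumer service logs and the event schema it expects"
  }

  def main(args: Array[String]): Unit = {
    val checks = List[(Phase, () => Boolean)](
      (EmitTestData, () => true),
      (BrokerReceives, () => false), // simulate: the event never reached the broker
      (ConsumerReceives, () => true)
    )
    firstFailure(checks).foreach(p => println(s"$p failed; check ${whereToLook(p)}"))
  }
}
```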

Don’t Spam! You aren’t a malicious bot!

If you are depending on external APIs or services, you may encounter rate-limiting or spam detection for the data you are emitting. Be courteous and mindful of how frequently you run the tests and randomize your data with some meaningful text and change it up in each call! I’m sure there’s a list of jokes somewhere on the internet you could use to randomize what you post 😂
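As a small illustration of "change it up in each call", the sketch below builds a mention from a random phrase plus a short unique suffix, so two consecutive posts are very unlikely to be identical. The phrase list and the `randomMention` helper are placeholders invented for this example.

```scala
import scala.util.Random

object RandomContentSketch {
  // Placeholder phrases; swap in any list of harmless, human-readable text.
  val phrases: Vector[String] = Vector(
    "Testing, testing, one two three",
    "Have a great day",
    "Coffee first, then code"
  )

  // Builds a mention from a random phrase plus the first 8 characters of a random UUID,
  // so repeated calls almost never produce the same text.
  def randomMention(receiver: String, rng: Random = new Random()): String = {
    val phrase = phrases(rng.nextInt(phrases.length))
    val suffix = java.util.UUID.randomUUID().toString.take(8)
    s"@$receiver $phrase [$suffix]"
  }

  def main(args: Array[String]): Unit =
    println(randomMention("receiverName"))
}
```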

Why is it Important?

Additional Functions of Contract Tests

Serve as documentation for use case scenarios for critical components of your application
Provide confidence in future changes (avoiding breaking changes) of these critical components
Help drive Test-Driven Development (TDD) — Once one test exists, it’ll open the door to more tests to write!

Thinking back to my personal experience with the Twitter Migration work, it was certainly bumpy: even for a simple Tweet, one must also account for scenarios such as retweets, retweets with mentions, replies, nested replies, single mentions, multi-mentions, and the list goes on. For a new developer who’s never touched Twitter itself, this can get truly confusing, and something could easily be missed.

Concluding Remarks

Sure, contract tests may consume a good chunk of a developer’s time to implement and may be expensive to maintain. However, the value they provide when applied to critical components of your application can be extremely high. Having effective tests allows you to scale your product, teams, and company efficiently, with fewer application outages! Your SLOs will thank you later 😂