Contract Testing Serverless and Asynchronous Applications — Part 2

“windmill pointing east” by Jordan Ladikos on Unsplash

By Ron Holshausen

We have now entered the era of the serverless function and we no longer have to worry about where or how our code runs. Someone else will do the worrying for us (for a nominal price) and we only have to concern ourselves with getting our functions to fulfill their destinies and become all they can be. Which raises some questions relating to our testing practices. Well, it raises other questions too, but for the purpose of this blog post that is the important one.

Technically there are still pesky servers. We just have no visibility into which data centre our functions are running in, or on what computer, or in fact, whether they are running on a computer or someone else’s fridge. But our functions are interacting with each other in an asynchronous manner, consuming events as input and generating output which in turn results in more events to be consumed. All the thinking we have done for testing contracts between consumers and providers over message queues can be applied equally well here.

Contract Testing in a Serverless World

For this post we will be using AWS Lambda functions as our examples. I’m assuming everything will hold true for Google Cloud Functions and Microsoft Azure Functions as well, but I have never used those.

The very first Lambda function I used helped us break out functionality from a legacy monolithic system. We needed to add a feature to email a PDF to customers of our client, and we did not want to have to go through the release process of the large application. The solution to this was to put the smallest code we could in the legacy system, and once the data was out we could use all the AWS services and enable a continuous delivery pipeline.

In this case we chose to update the legacy system to write a JSON file with all the necessary data required to an S3 bucket. It was a small change which gave us a lot of flexibility. From there (once that change was deployed), we could wire up a function to respond to the bucket events, render a PDF back to the bucket and then have another function respond to those events to email it out. It was wondrous.

Until the day someone didn’t get their PDF in an email.

It turned out that a small change to a model class in the legacy system had changed the JSON format, which caused the Lambda function to fail. No working Lambda function meant no PDF, which in turn meant the second Lambda function was never invoked and no email was sent. And no alerts meant nobody saw the error logs in CloudWatch Logs. It also turns out that delivery managers get really upset about a lack of delivery. Who knew?

Looks like a contract test is in order. We just need to work out where the contracts are.

In this case we can use the same Message Pact solution as described in the previous post to test this contract. The S3 bucket and S3 events are just the transport mechanism. The actual contract is between the bit of code we added to the legacy system and our Lambda function. Had we been paying attention, we would have noticed that writing the JSON file to the S3 bucket was crossing a context boundary, and we would have put a contract test in place. We should also have had alerting on our logs. And drunk less, exercised more and eaten healthier. But it was the days of the wild serverless west. Which, of course, is actually no excuse.

Here is a diagram of the flow of an asynchronous message-based contract test using Pact:

The Lambda function is the consumer of the JSON message, and the legacy application is the provider. Well, it is a little more complex than that. The message is not the actual JSON data, but an S3 event which contains a reference to the JSON file stored in an S3 bucket. There is also an HTTP request under the covers between the AWS S3 service and the Lambda execution service, but we can focus on the higher level and assume we get an asynchronous S3 event passed to our function via the S3 bucket.
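To make that indirection concrete, here is a minimal sketch in plain Java (which runs on the same JVM as our Groovy handler) of pulling the bucket name and object key out of an S3 event. It assumes the event has already been parsed into nested maps and lists, following the raw S3 notification layout of Records[].s3.bucket.name and Records[].s3.object.key. The class and method names are hypothetical and were not part of our handler; a real handler would more likely use the S3Event class from the AWS Lambda events library, as the test further down does.

```java
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration: finds the S3 object that an event
// refers to, given the event already parsed into nested maps and lists.
class S3EventReference {

    // Returns {bucketName, objectKey} for the first record in the event.
    @SuppressWarnings("unchecked")
    static String[] firstObject(Map<String, Object> event) {
        List<Map<String, Object>> records =
            (List<Map<String, Object>>) event.get("Records");
        if (records == null || records.isEmpty()) {
            throw new IllegalArgumentException("S3 event contains no records");
        }
        Map<String, Object> s3 = (Map<String, Object>) records.get(0).get("s3");
        Map<String, Object> bucket = (Map<String, Object>) s3.get("bucket");
        Map<String, Object> object = (Map<String, Object>) s3.get("object");
        return new String[] { (String) bucket.get("name"), (String) object.get("key") };
    }

    public static void main(String[] args) {
        // The event only carries a pointer to the JSON file, not the JSON itself
        Map<String, Object> event = Map.of("Records", List.of(
            Map.of("s3", Map.of(
                "bucket", Map.of("name", "testbucket"),
                "object", Map.of("key", "po.json")))));
        String[] ref = firstObject(event);
        System.out.println(ref[0] + "/" + ref[1]); // prints testbucket/po.json
    }
}
```

The point of the sketch is that the event is only an envelope. The payload the contract cares about is whatever lives at that bucket and key, which is why the consumer test can mock the envelope and focus on the JSON.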

The Lambda Function Consumer

Most Lambda functions would probably be written in Node.js, but in this case we were well versed in the Java PDF generation libraries so we chose to write it as a Groovy JVM function. The overhead of using a JVM-based Lambda function was OK, as it didn’t really matter how long the function took to respond to the event as long as the PDF was emailed by the start of the following business day. We could also then use Spock to write the test and Pact-JVM to generate the pact.

The consumer Pact test defined the message interaction based on our JSON data payload, and wrapped it in a mock of the S3 event. It looked something like this:

import au.com.dius.pact.consumer.groovy.messaging.PactMessageBuilder
// The package for Message differs between Pact-JVM versions; adjust to match yours
import au.com.dius.pact.model.v3.messaging.Message
import com.amazonaws.services.lambda.runtime.Context
import com.amazonaws.services.lambda.runtime.events.S3Event
import com.amazonaws.services.s3.model.S3Object
import com.amazonaws.services.s3.model.S3ObjectInputStream
import com.google.gson.Gson
import groovy.json.JsonOutput
import spock.lang.Specification

class PoToPdfHandlerPactSpec extends Specification {

    // our Lambda function handler
    private PoToPdfHandler handler
    // mock of the service which will generate the PDF file
    private PDFGenerator pdfGenerator
    // mock of the service that will fetch the JSON from the S3 bucket
    private PurchaseOrderService orderService

    def setup() {
        pdfGenerator = Mock()
        orderService = Mock()
        handler = new PoToPdfHandler(pdfGenerator, orderService)
    }

    def 'has a contract with the Big Bad Legacy App with regards to POs'() {
        given:
        def poStream = new PactMessageBuilder().call {
            serviceConsumer 'PoToPdfHandlerLambdaFunction'
            hasPactWith 'Big Bad Legacy App'

            given('there is a valid address and email')
            expectsToReceive 'a purchase order in json format'
            withContent(contentType: 'application/json') {
                supplierName string('Test Supplier')
                supplierOrderId identifier()
                processingCentreId identifier()
                orderDate timestamp('yyyy-MM-dd\'T\'HH:mm:ss')
                lineItems minLike(1) {
                    productCode regexp(~/\d+/, '000011')
                    productDescription string('JIM BEAM WHITE LABEL COLA') // oh, yeah
                    quantityOrdered integer(20)
                }
                summary {
                    orderTotalExTax decimal(2000.0)
                    orderTotalIncTax decimal(2200.0)
                }
                supplierEmail string('TestSupplier@wild-serverless-west.com')
                senderEmail string('buyers@wild-serverless-west.com')
            }
        }

        def bucket = 'testbucket'
        def inputKey = 'po.json'
        def outputKey = 'po.pdf'

        // We need to mock out the AWS objects for this test,
        // as the handler will use the AWS SDK to fetch the
        // actual message from the S3 bucket using the information
        // from the event we receive
        Context context = [:] as Context
        def poBytes
        def mockS3Object = Mock(S3Object) {
            getObjectContent() >> {
                new S3ObjectInputStream(new ByteArrayInputStream(poBytes), null)
            }
        }

        // The event JSON we will receive
        def eventJson = [
            records: [
                [s3: [bucket: [name: bucket], object: [key: inputKey]]]
            ]
        ]
        def event = Gson.newInstance().fromJson(JsonOutput.toJson(eventJson), S3Event)

        when:
        poStream.run { Message message ->
            // The actual JSON from the message will be wrapped in an input stream
            // which will be read by our handler via the mocked AWS SDK call
            poBytes = message.contentsAsBytes()

            // And we now invoke the handler
            handler.handleRequest(event, context)
        }

        then:
        // We expect the PDF generator to be called. It means we were
        // able to correctly process the JSON from the upstream system
        1 * pdfGenerator.generate(_) >> new byte[1]
        1 * orderService.fetch(bucket, inputKey) >> mockS3Object
        1 * orderService.save(bucket, outputKey, _)
    }
}

The test sets up the expected message as a Message Pact. It then mocks out the AWS SDK calls so that the S3 object referenced by the event returns the payload of the expected message. Finally, it invokes the Lambda function with the mocked event and verifies that the PDF and persistence services were called.

Running this test results in a Message Pact file being generated with our expected JSON format. This is nearly identical (apart from mocking the AWS stuff) to the consumer test in the previous post. But instead of testing consuming a message off a message queue, we are pretending to respond to an AWS event.

The Legacy Application Provider

Now for the important part. Well, it is all important. So, now for the more important part. We want to ensure that the bit of code we added to the legacy application will always generate JSON files that can be processed by our Lambda function. We can do that by using the Message Pact verification that we used for the message provider. We create a test method annotated with the @PactVerifyProvider annotation that matches the description from the pact file from the consumer test. This method must invoke the code that generates the JSON data that normally gets written to the S3 bucket and return it so that it can be verified against what our Lambda function expects.

As with most legacy applications, it was not easy to write, as we had to mock out a lot of collaborators to get it to work, and I won’t bore you with those details. Here is a simplified version of the verification function:

import au.com.dius.pact.provider.PactVerifyProvider

class SubmittableSupplierOrderPact {

    @PactVerifyProvider('a purchase order in json format')
    String jsonForPurchaseOrder() {
        // This was the cause of our failure. A change in these classes
        // caused the generated JSON to change
        SupplierOrderView supplierOrder = supplierOrderViewFixture()

        ISupplierOrderItemView item1 = new SupplierOrderItemView()
        item1.with {
            setProductCode('1234')
            setProductDescription('Test Product 1')
            setPurchaseOrderQuantity(10)
            setListPrice(200)
            setTotalPriceExTax(100)
        }
        ISupplierOrderItemView item2 = new SupplierOrderItemView()
        item2.with {
            setProductCode('1235')
            setProductDescription('Test Product 2')
            setPurchaseOrderQuantity(50)
            setListPrice(200)
            setTotalPriceExTax(900)
        }
        List orderItems = [ item1, item2 ]

        IOrganisation organisation = [
            getEmailAddress: { 'TestSupplier@wild-serverless-west.com' }
        ] as IOrganisation

        // The model class that gets rendered to JSON
        def subject = new SubmittableSupplierOrder(supplierOrder, orderItems, organisation,
            // yikes! timezones!
            SystemParameterManager.timeZoneId)

        // This is the object mapper used to convert the model classes to JSON. Hopefully nobody
        // changes the actual code to use something else. But as it is a legacy application, it
        // is unlikely.
        def mapper = new JSONMapperModule().provideReportObjectMapper()

        // return the JSON representation
        mapper.writeValueAsString(subject)
    }
}

Running the Pact-JVM verifier will result in the function jsonForPurchaseOrder being called and the result verified against the message payload from the pact file. Now if the JSON generated by the legacy application ever changes in a way that our Lambda function cannot process, we will get a failing build.
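If you are wondering what “verified against the message payload” boils down to, here is a deliberately naive sketch in plain Java. It is an illustration only and not how Pact-JVM is implemented: it assumes the expectations have been reduced to a map of field name to expected type, and that the provider payload has already been parsed into a map. The class name, the method name and the renamed supplierEmailAddress field are all hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustration only: a hand-rolled structural check, NOT Pact-JVM's matching logic.
class PayloadContractCheck {

    // Returns one message per expected field that is missing or has the wrong type.
    // Fields the consumer did not ask for are ignored.
    static List<String> violations(Map<String, Class<?>> expected, Map<String, Object> actual) {
        List<String> problems = new ArrayList<>();
        for (Map.Entry<String, Class<?>> field : expected.entrySet()) {
            Object value = actual.get(field.getKey());
            if (value == null) {
                problems.add("missing field: " + field.getKey());
            } else if (!field.getValue().isInstance(value)) {
                problems.add("wrong type for field: " + field.getKey());
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        // What the consumer relies on (a small subset of the fields in the consumer test)
        Map<String, Class<?>> expected = new LinkedHashMap<>();
        expected.put("supplierName", String.class);
        expected.put("supplierEmail", String.class);
        expected.put("lineItems", List.class);

        // Hypothetical provider output after someone renamed supplierEmail
        Map<String, Object> actual = new LinkedHashMap<>();
        actual.put("supplierName", "Test Supplier");
        actual.put("supplierEmailAddress", "TestSupplier@wild-serverless-west.com");
        actual.put("lineItems", List.of());

        System.out.println(violations(expected, actual));
        // prints [missing field: supplierEmail]
    }
}
```

A real verifier also handles nested structures, matchers such as regular expressions and minimum array sizes, and reporting. The idea is the same though: the consumer states what it needs, and the provider build fails when its output stops satisfying that.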

For us it resulted in customers always getting their PDFs as promised and a happier Delivery Manager. But my suspicion is that the last bit is just a coincidence.

Look, Ma, I have a Contract Testing Hammer!

Once we had implemented a contract test over a message queue, and one with a Lambda function invocation, we started to see patterns all over the place where we could use these asynchronous contract tests. We had also learned that a context boundary is an important place to have a contract test.

We had a service in one bounded context that was creating an event and needed to pass it over to a service in another bounded context. A pattern you could use was to have an adapter service on the boundary that could accept the request. But in our case we did not have much time (most of the services in the second context did not exist yet), and the ownership of the data was passing with the call (the data was now owned by the new service, with only a reference to it stored in the former context). The data flow was also one way at this stage.

It was felt that with a contract test in place we could just push the JSON document to the document store in the correct format. In essence, we could treat the document store as the transport mechanism in the same way we treated the S3 bucket as one. Service 1A is the provider, the JSON document the message payload and the service on the other side (service 2A) is the consumer.

In Summary

Contract testing serverless functions turned out to be not that different from the contract testing we had done before. They have inputs, outputs and stuff in the middle that loves to misbehave, the same as regular functions (and children). Any asynchronous invocation (like Lambda functions being invoked in response to events) with data passed around in schema-less formats (like JSON) is susceptible to failure if the format of the data changes. And the format of the data will change. In fact, the format of your JSON has probably already changed; you may just not have noticed it yet.

As with message queues, knowing who is consuming your messages and how they are consuming them can be a blessing when you need to change the message format. The same goes for serverless functions. Knowing which functions are going to respond to the JSON file you write to that S3 bucket makes it easier to change that JSON file in the future. And you will have to change it in the future. Or you may have already changed it and just not noticed yet!