Dyno 🦕 : Meet the Mockasaur

11 min read · May 11, 2019

In the last two articles we’ve been building up a library to communicate with Amazon’s DynamoDB database, by using their boto3 library. On the way, we’ve explored how well Python integrates with Swift, and started building a Reactive interface to allow results to be returned to the end user.

We’ve got more to do though:

build a reactive user interface which will automatically display our results
add caching to deal with failed or slow connections.

Now, however, we need to pause before we build more functionality, and add some tests so that we can be sure future changes don’t break our library.

As a bonus, we’ll also fix our interface so that we can store an object in the database simply by making it conform to the Codable protocol.

I’m hoping to show some techniques you can use in your own code for developing asynchronous tests on libraries you don’t own. As usual, the Dyno source code is available on GitHub; this article uses the testability branch.

Testing, testing everywhere

Testing our library is not going to be easy. For a start:

It relies on a Python interface
That interface is supplied by a third party (Amazon)
We need to connect to an external database to check the results
If we make too many database requests, we’ll be charged real money! 💵

And we of course want to make sure our tests are reliable and repeatable (oh, and free !)

I’m attacking the problem with two complementary testing approaches. We will build up a suite of tests using both, and hence cover a wide range of scenarios.

Interaction Diagram

Here’s a high-level diagram of how Dyno interacts with boto3 and ultimately with our DynamoDB database on AWS. This will help us understand the various testing approaches.

Approach I : You say Boto, I say Bötö

The first way of making ourselves independent from boto3’s calls to AWS is: not to use boto3 at all ! Instead, we mock out the whole of boto3 and replace it with DynoMockBoto which can replicate the whole round-trip interaction with AWS. We build a very simple mock database (just a Dictionary) which we can use to store and retrieve data.

We can also use this to simulate connection failures or very slow connections.

But before we can use this mocked boto3, we have to refactor our code a little bit to provide a place to put the mock code. So we insert a new protocol, called DynoConnection, which handles the connectivity to DynamoDB.

Our DynoConnection protocol looks like this:

Once we have that, we change our Dyno library interface so it uses DynoConnection. This is an example of dependency injection (D.I.): rather than hard-coding into Dyno a particular way of connecting to DynamoDB, the connection is passed in by the caller. D.I. is a good technique for testing, as it allows you to insert test code in place of your actual logic, though it does mean you need to refactor your code to use protocols.

We then move the boto3 connection logic into a new class DynoBoto3.

So that clients of our library can continue to use Dyno without worrying about passing a DynoConnection, we use Swift’s default arguments to use DynoBoto3 if no other connection is passed in. Now, our structure looks like this:

Why is DynoBoto3 a class and not a struct ? This is so that we can subclass it later to provide a different type of mocked-out interface in Approach II.
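To make the shape of this refactoring concrete, here is a minimal sketch of the pattern. It is not the Dyno source: Connection, RealConnection, LocalOnlyConnection, Library and the example URLs are hypothetical stand-ins for DynoConnection, DynoBoto3, the later test subclass and Dyno itself.

```swift
import Foundation

// Hypothetical stand-in for the article's DynoConnection protocol.
protocol Connection {
    func scan(table: String) -> [String]
}

// Stand-in for the "real" connection. It is a class (not a struct)
// so that a test subclass can override individual settings.
class RealConnection: Connection {
    // Overridable hook: where requests would be sent.
    var endpoint: String { return "https://aws.example.invalid" }

    func scan(table: String) -> [String] {
        // A real implementation would call out to the service here.
        return ["\(endpoint)/\(table)"]
    }
}

// Test-only subclass that changes a single setting and nothing else.
final class LocalOnlyConnection: RealConnection {
    override var endpoint: String { return "https://dummy.example.invalid" }
}

// Stand-in for the library's public entry point.
struct Library {
    let connection: Connection

    // Default argument: existing callers can keep writing Library().
    init(connection: Connection = RealConnection()) {
        self.connection = connection
    }

    func scan(table: String) -> [String] {
        return connection.scan(table: table)
    }
}

let production = Library()                                   // uses the default
let underTest = Library(connection: LocalOnlyConnection())   // injected
```

Callers that never mention a connection are unaffected, while tests can pass in anything that conforms to the protocol.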

Writing our Mock

Now, we can actually write our DynoMockBoto class. We build it so we can provide it with a mini database (just a Dictionary) like this:

Note we provide a keyField for each table: this is actually a requirement of DynamoDB. DynoMockBoto uses this to do faster Dictionary lookups of the test data in the table.

We can then write a very simple mock class that implements the DynoConnection protocol. (For mocks, the simpler the better — we don’t want errors in the mock classes to cause confusion in our tests!) For example, for scan we just find the passed-in table in our test data, and return it in the format that Dyno needs to do its processing:
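As a rough illustration of the idea (the real DynoMockBoto is in the repository and will differ), a dictionary-backed mock might look something like the sketch below. Row, MockTable, MockConnection, MockError and the simulateFailure flag are all names invented for this example.

```swift
import Foundation

// Hypothetical shapes: a row is a flat dictionary of strings.
typealias Row = [String: String]

// A mock table indexes its rows by the value of the key field,
// so lookups by key are plain Dictionary lookups.
struct MockTable {
    let keyField: String
    private(set) var rows: [String: Row] = [:]

    init(keyField: String, items: [Row]) {
        self.keyField = keyField
        for item in items {
            // Items without the key field are silently skipped.
            if let key = item[keyField] { rows[key] = item }
        }
    }
}

enum MockError: Error {
    case noSuchTable(String)
    case connectionFailed
}

final class MockConnection {
    private let tables: [String: MockTable]
    // Flip this to simulate a failed connection.
    var simulateFailure = false

    init(tables: [String: MockTable]) {
        self.tables = tables
    }

    // Returns every row in the table, ordered by key for repeatable tests.
    func scan(table name: String) throws -> [Row] {
        if simulateFailure { throw MockError.connectionFailed }
        guard let table = tables[name] else { throw MockError.noSuchTable(name) }
        return table.rows.keys.sorted().compactMap { table.rows[$0] }
    }
}

let mock = MockConnection(tables: [
    "Dinosaurs": MockTable(keyField: "id", items: [
        ["id": "1", "name": "Tyrannosaurus"],
        ["id": "2", "name": "Emojisaurus"]
    ])
])
let rows = (try? mock.scan(table: "Dinosaurs")) ?? []
```

Keeping the mock this small means a failing test is far more likely to point at the library than at the mock.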

Our structure:

Testing with a Reactive Mock

Finally, we can actually write a test. Way back in the first article, Swift Package Manager created a Tests/DynoTests.swift file for us, and created a Test target to build. So we put our tests in that file, as an XCTestCase subclass whose functions will be our test cases.

So let’s think of a test for scan. How about making sure it returns the dinosaur we provided in our test dataset? But here we run into a challenge – remember that all of our calls are asynchronous and return Observables. Does that mean we need asynchronous tests too? That’s possible, but adds plenty of complexity and boilerplate, and hence our test code becomes hard to maintain (and discourages us from writing tests, which is the last thing we want).

RxBlocking

Fortunately, those nice folks at RxSwift have solved this problem for us. By including the RxBlocking module (which is for test purposes only) and adding the toBlocking() operator into our Observable stream, the test blocks until the stream completes and the emitted elements are collected into a regular array.

In fact, RxBlocking also provides convenience functions like last() to get hold of the last element of the array. So our check for the test data becomes much simpler to write (see github for the full code) – we can check that the last item in the Observable stream was a successful result containing our dinosaur 🦕:

And testing works!

When I first added RxBlocking to the project, I found some of my tests started showing errors. RxBlocking was new to me – perhaps I was using it incorrectly? But after checking carefully, I realized I’d missed out completing the Observable stream in some of the failure cases in the Dyno library – which meant that the toBlocking() call was not finishing (it was timing out).

So this was a genuine bug which I’d uncovered, and quite a complex asynchronous dependency that would have been hard to find any other way. So RxBlocking was certainly worth the (minimal) effort!

Approach II : Who am I talking to ?

The problem with the Mock approach is that it doesn’t really check we are passing the right data to Boto in the first place. With scan for instance, we have quite complicated logic to build a filter (see DynoFilter in github for the details). What if that’s not correct?

What I’d like to be able to do is to check the data being sent from Dyno, via Boto, to AWS — and compare that to what I’d expect, which I’d obtain from running the same command using the boto3 library directly in Python. That assures me that I’ve created the right calls. So, the question becomes: how do we find what Boto3 is sending to AWS?

This would be possible by putting a network analyser onto my Wi-Fi, but of course I want a way to test this repeatedly without actually making the calls. Fortunately, it’s possible to set a couple of flags in boto3 to get hold of the information in a reusable way:

First, use boto3.set_stream_logger to make boto3 log what it’s doing, and use the Python logging module (which I loaded as PYLOGGING) to send that log to a temporary file created for that test.

Secondly, when we create the connection to boto3, tell it not to use the default AWS URL, but to use a dummy URL which we pass in.

If we do this, boto3 doesn’t actually make the call to AWS, and nicely prints out for us the call it’s going to make, which looks something like this:

Making request for OperationModel(name=Scan) with params: {'body': '{"TableName": "Mockosaurs", "Limit": 100}', 'url': 'https://dynamodb.dummy.com/', 'headers': {'User-Agent': 'Boto3/1.9.94 Python/2.7.10 Darwin/18.5.0 Botocore/1.12.94 Resource', 'Content-Type': u'application/x-amz-json-1.0', 'X-Amz-Target': u'DynamoDB_20120810.Scan'}, 'context': {'auth_type': None, 'client_region': 'us-east-2', 'has_streaming_input': False, 'client_config': <botocore.config.Config object at 0x1073103d0>}, 'query_string': '', 'url_path': '/', 'method': u'POST'}

boto3 then fails with an error message – which is OK because we just want to test what we are sending to AWS, not receiving.

If we wanted to test both sending and receiving, we would probably end up writing a local web service to pretend to be AWS, and hence we could control the data being sent and received. However, that ends up duplicating the testing of boto3 itself, and that’s overkill: we trust Amazon to have done that! So, we are just testing sending for now.

Parsing the log file

Once we have this log file, we can parse it to find the call being made, turn the text into a JSON object, and compare that to the JSON object we get by making the same call directly from Python.

To make this work, we create another DynoConnection implementation: this time subclassing the “real” DynoBoto3 class. This subclass is called DynoLocalOnlyBoto3 (the localOnly meaning it doesn’t go to AWS, and the Boto3 part because it otherwise does call through to the Python boto3 library). In fact, the only thing we override in the subclass is the flags to set the logger (to a temporary logfile created for the purpose), and setting the dummy URL – the rest of the logic does not change.

DynoLocalOnlyBoto3 does however contain a log parser. The idea is that once the test has made its call, we parse the log and extract the JSON for comparison to the expected JSON (which we’ll hard-code into the test itself). The log parser is this function:

…which returns a parsed dictionary [String:Any] of the data on the OperationModel line in the log file.

The parser is pretty simplistic, but one interesting extension it adds to String is suffix(after:String). This searches the string for the text passed as the after parameter and, if it’s found, returns everything that follows it (or an empty string, if the text is not found).

"abc def-ghi".suffix(after: "def-") == "ghi"

My original implementation tried to use indexes into the string to remove different parts, but this is much more straightforward to understand, and in line with Swift’s string processing paradigm.
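One possible implementation, written from the description above rather than copied from the Dyno source, uses Foundation’s range(of:) so that no manual index arithmetic is needed:

```swift
import Foundation

extension String {
    /// Returns the text following the first occurrence of `marker`,
    /// or an empty string when `marker` does not occur.
    func suffix(after marker: String) -> String {
        guard let found = range(of: marker) else { return "" }
        return String(self[found.upperBound...])
    }
}

let tail = "abc def-ghi".suffix(after: "def-")
let missing = "abc def-ghi".suffix(after: "xyz")
```

Returning an empty string rather than an optional keeps chained calls simple, at the cost of not distinguishing "not found" from "found at the very end".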

Once we’ve parsed the file, we use Foundation’s JSONSerialization.jsonObject method, which tries to convert data into a JSON structure. To compare the parsed data with the expected result, I added a small comparison method, XCTAssertEqualDictionaries, in DynoTests. It compares nested dictionaries, which is fine for simple testing.
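The parse-and-compare step can be sketched as follows. This is an illustration under assumptions, not the parser from DynoLocalOnlyBoto3: requestBody(fromLogLine:) and sameDictionary are hypothetical helpers, the sketch only looks at the 'body' entry of the logged parameters, and it assumes the body itself never contains the sequence quote-comma-space-quote.

```swift
import Foundation

/// Pulls the JSON request body out of a boto3 "Making request" log line.
/// Returns nil when the line has no body or the body is not a JSON object.
func requestBody(fromLogLine line: String) -> [String: Any]? {
    // The body sits between "'body': '" and the next "', '".
    guard let start = line.range(of: "'body': '"),
          let end = line.range(of: "', '", range: start.upperBound..<line.endIndex)
    else { return nil }
    let json = String(line[start.upperBound..<end.lowerBound])
    guard let data = json.data(using: .utf8) else { return nil }
    return (try? JSONSerialization.jsonObject(with: data)) as? [String: Any]
}

/// Compares two (possibly nested) dictionaries by bridging to NSDictionary,
/// whose equality check recurses into nested collections.
func sameDictionary(_ lhs: [String: Any], _ rhs: [String: Any]) -> Bool {
    return NSDictionary(dictionary: lhs).isEqual(NSDictionary(dictionary: rhs))
}

// A shortened version of the log line shown earlier.
let logLine = "Making request for OperationModel(name=Scan) with params: "
    + "{'body': '{\"TableName\": \"Mockosaurs\", \"Limit\": 100}', "
    + "'url': 'https://dynamodb.dummy.com/'}"

let parsed = requestBody(fromLogLine: logLine) ?? [:]
let expected: [String: Any] = ["TableName": "Mockosaurs", "Limit": 100]
let matches = sameDictionary(parsed, expected)
```

In a test case, the final comparison would sit inside an XCTest assertion instead of being stored in a constant.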

Snakes on a Test 🐍🐍

One problem did appear with this testing approach. Running tests one after the other (by clicking the ‘play’ button in Xcode on each test method) worked properly, but trying to run them all at the same time (by clicking ‘play’ for the whole suite) caused seemingly-random failures in the tests using Approach II.

From the random nature of the failures it looked like a race condition, but even after adding synchronization to the tests to prevent them running at the same time, I saw the same behaviour: a race condition, but in single-threaded code! What was going on?

After some head-scratching and searching of Stack Overflow, I realized that the problem was with my use of the basicConfig function in Python’s logging module. This is actually a one-shot setup: the first call configures the logger, but calling it again doesn’t change the log file path! So whichever test managed to set the logfile name first would always “win” and succeed, but the other tests would fail because they’d all be trying to parse that same logfile each time.

That was fixed with yet more Python magic (I will spare the details, see DynoLocalOnlyBoto3.init). Again I was pleasantly surprised by how easily Python and Swift integrate.

Testing works, part 2

And again all of this setup effort paid off, as I uncovered another bug, with the Filter setup (I had assumed I could pass all filter values as Strings, but numbers need to be set as numbers).

I would have definitely found that if I’d tried to run against a real database, but the advantage of the unit tests is that I could refactor the Filter code to make it rather more straightforward, and be sure it would work again afterwards… without having to use a real database and potentially pay the read costs!

Coder coda

After all that, I have a suite of unit tests for the Dyno library. This will be very useful for future development. We’ll put this to use next time when we add a user interface mini-library to process those Observables and actually show some data!

One More Thing: Python Objects to Codables

Although Swift can deal with Python objects, it doesn’t actually know much about them: they are just opaque PythonObjects. Wouldn't it be great if we could turn them into regular Swift types?

You may have noted that functions like scan can now take the type of object to create, rather than using a Builder to turn a Python object into a Swift object.

This is done by having the types you want to use with Dyno conform to the Codable protocol. I then use a bridging function convertDecodableToBuilder to convert the passed-in [De]Codable into a Builder :

The final challenge then is to provide a PythonDecoder, which can take a Python object and turn it into the type T which you pass in. This magic is performed by conforming to Swift’s Decoder protocol (used by things like JSONDecoder to similarly convert JSON objects to arbitrary passed-in types).

The problem with conforming to Decoder (and Encoder, which goes the other way) is that it requires a huge amount of boilerplate code: there are very few examples on the Internet of writing your own, and in fact the best examples are from JSONDecoder.swift and Codable.swift in the open-source Swift library.
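To get a feel for what such a decoder buys you without writing all that boilerplate, here is a much simpler stand-in. It is not how PythonDecoder works: it takes a plain Swift dictionary instead of a Python object and round-trips it through JSON, so that the existing JSONDecoder does the work. The Dinosaur type, its fields and the decode helper are hypothetical.

```swift
import Foundation

// Hypothetical model type; conforming to Codable is all the caller does.
struct Dinosaur: Codable, Equatable {
    let id: String
    let name: String
    let teeth: Int
}

/// Stand-in for a custom Decoder: serialises the dictionary to JSON,
/// then lets JSONDecoder build the requested type from it.
func decode<T: Decodable>(_ type: T.Type, from dictionary: [String: Any]) throws -> T {
    let data = try JSONSerialization.data(withJSONObject: dictionary)
    return try JSONDecoder().decode(type, from: data)
}

let row: [String: Any] = ["id": "1", "name": "Emojisaurus", "teeth": 40]
let dino = try? decode(Dinosaur.self, from: row)

// A row with missing fields fails to decode instead of producing
// a half-initialised object.
let incomplete = try? decode(Dinosaur.self, from: ["id": "2"])
```

The round trip costs an extra serialisation pass and only handles JSON-compatible values, which is one reason a dedicated Decoder is the more general solution.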

I actually got a big jump-start using code written by my friend Sam of https://elegantchaos.com, who adapted the Apple code to decode to dictionaries; I’ve just adapted that further. Look at the PythonEncoder.swift and PythonDecoder.swift files: they are not pretty but the end result is very nice!

Note, adding these files has therefore changed the distribution licence to Apache License v2.0. If you don’t like this, remove those files and use Builder/Writer directly.