Dyno: Meet the Mockasaur
In the last two articles we've been building up a library to communicate with Amazon's DynamoDB database, using their boto3 library. Along the way, we've explored how well Python integrates with Swift, and started building a Reactive interface to allow results to be returned to the end user.
We've got more to do though:
- build a reactive user interface which will automatically display our results
- caching to deal with failed or slow connections.
Now, however, we need to pause before we build more functionality, and add some tests so that we can be sure future changes don't break our library.
As a bonus, we'll also fix our interface so that we can store an object in the database simply by making it conform to the Codable protocol.
I'm hoping to show some techniques you can use in your own code for developing asynchronous tests on libraries you don't own. As usual, the Dyno source code is available on GitHub; this article uses the testability branch.
Testing, testing everywhere
Testing our library is not going to be easy. For a start:
- It relies on a Python interface
- That interface is supplied by a third party (Amazon)
- We need to connect to an external database to check the results
- If we make too many database requests, we'll be charged real money!
And of course we want to make sure our tests are reliable and repeatable (oh, and free!)
I'm attacking the problem with two complementary testing approaches. We will build up a suite of tests using both, and hence cover a wide range of scenarios.
Interaction Diagram
Here's a high-level diagram of how Dyno interacts with boto3 and ultimately with our DynamoDB database on AWS. This will help us understand the various testing approaches.
Approach I : You say Boto, I say Bötö
The first way of making ourselves independent of boto3's calls to AWS is not to use boto3 at all! Instead, we mock out the whole of boto3 and replace it with DynoMockBoto, which can replicate the whole round-trip interaction with AWS. We build a very simple mock database (just a Dictionary) which we can use to store and retrieve data.
We can also use this to simulate connection failures or very slow connections.
But before we can use this mocked boto3, we have to refactor our code a little to provide a place to put the mock code. So we insert a new protocol, called DynoConnection, which handles the connectivity to DynamoDB.
Our DynoConnection protocol looks like this:
Once we have that, we change our Dyno library interface so it uses DynoConnection. This is an example of dependency injection (D.I.): rather than hard-coding into Dyno that we are using a particular way to connect to DynamoDB, the connection is passed in by the caller. D.I. is a good technique for testing, as it allows you to insert test code in place of your actual logic, though it does mean you need to refactor your code to use protocols.
We then move the boto3 connection logic into a new class, DynoBoto3.
So that clients of our library can continue to use Dyno without worrying about passing a DynoConnection, we use Swift's default arguments to use DynoBoto3 if no other connection is passed in. Now, our structure looks like this:
Why is DynoBoto3 a class and not a struct? This is so that we can subclass it later to provide a different type of mocked-out interface in Approach II.
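To make the dependency-injection idea concrete, here is a minimal sketch of the pattern: a protocol, a "live" class used as the default argument, and a test double passed in by the caller. Every name here (SketchConnection, LiveConnection, FixedConnection, Library) and the simplified return type are my own placeholders for illustration; they are not Dyno's actual declarations, which live in the repository.

```swift
// All names in this sketch are hypothetical stand-ins, not Dyno's real API.
protocol SketchConnection {
    // Returns every row of a table as simple string dictionaries.
    func scan(table: String) -> [[String: String]]
}

// A class rather than a struct, so that a test variant could subclass it later.
class LiveConnection: SketchConnection {
    func scan(table: String) -> [[String: String]] {
        // A real implementation would call through to the database here.
        return []
    }
}

// A trivial test double that always returns the rows it was given.
struct FixedConnection: SketchConnection {
    let rows: [[String: String]]
    func scan(table: String) -> [[String: String]] { return rows }
}

struct Library {
    private let connection: SketchConnection

    // The default argument keeps existing callers working unchanged.
    init(connection: SketchConnection = LiveConnection()) {
        self.connection = connection
    }

    func names(in table: String) -> [String] {
        return connection.scan(table: table).compactMap { $0["name"] }
    }
}

let production = Library()  // uses LiveConnection
let underTest = Library(connection: FixedConnection(rows: [["name": "Emojisaurus"]]))
```

Callers that never mention a connection get the live one, while a test can hand in whatever behaviour it needs.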
Writing our Mock
Now, we can actually write our DynoMockBoto class. We build it so we can provide it with a mini database (just a Dictionary) like this:
Note we provide a keyField for each table: this is actually a requirement of DynamoDB. DynoMockBoto uses this to do faster Dictionary lookups of the test data in the table.
We can then write a very simple mock class that implements the DynoConnection protocol. (For mocks, the simpler the better: we don't want errors in the mock classes to cause confusion in our tests!) For example, for scan we just find the passed-in table in our test data, and return it in the format that Dyno needs to do its processing:
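The real listing is in the repository. As a rough illustration of the idea only, a dictionary-backed mock might look something like the sketch below. MockTable, MockDatabase, MockError and the [[String: String]] row format are assumptions made for this example, not the types DynoMockBoto actually uses.

```swift
// Hypothetical sketch of a dictionary-backed mock; not DynoMockBoto itself.
struct MockTable {
    let keyField: String
    // Rows indexed by the value of their key field, for direct lookups.
    private(set) var items: [String: [String: String]] = [:]

    init(keyField: String, rows: [[String: String]]) {
        self.keyField = keyField
        for row in rows {
            // Rows without the key field are silently skipped in this sketch.
            if let key = row[keyField] { items[key] = row }
        }
    }
}

enum MockError: Error {
    case noSuchTable(String)
}

final class MockDatabase {
    private let tables: [String: MockTable]

    init(tables: [String: MockTable]) {
        self.tables = tables
    }

    // Returns every row in the table, sorted by key so results are repeatable.
    func scan(table: String) throws -> [[String: String]] {
        guard let found = tables[table] else { throw MockError.noSuchTable(table) }
        return found.items.keys.sorted().compactMap { found.items[$0] }
    }

    // Looks up a single row through the key index.
    func item(in table: String, key: String) throws -> [String: String]? {
        guard let found = tables[table] else { throw MockError.noSuchTable(table) }
        return found.items[key]
    }
}

let database = MockDatabase(tables: [
    "Mockosaurs": MockTable(keyField: "id", rows: [
        ["id": "2", "name": "Mockasaurus"],
        ["id": "1", "name": "Emojisaurus"]
    ])
])
```

Sorting the keys in scan is a choice made for this sketch: Dictionary ordering is not stable, and a mock that returns rows in a different order on each run would make tests flaky.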
Our structure:
Testing with a Reactive Mock
Finally, we can actually write a test. Way back in the first article, Swift Package Manager created a Tests/DynoTests.swift file for us, and created a Test target to build. So we put our tests in that file, as an XCTestCase subclass whose functions will be our test cases.
So let's think of a test for scan. How about making sure it returns the dinosaur we provided in our test dataset? But here we run into a challenge: remember that all of our calls are asynchronous and return Observables. Does that mean we need asynchronous tests too? That's possible, but it adds plenty of complexity and boilerplate, so our test code becomes hard to maintain (and discourages us from writing tests, which is the last thing we want).
RxBlocking
Fortunately, those nice folks at RxSwift have solved this problem for us. If we include the RxBlocking module (which is for test purposes only) and add the toBlocking() operator into our Observable stream, we can block until the stream completes and collect its elements into a regular array.
In fact, RxBlocking also provides convenience functions like last() to get hold of the last element of the array. So our check for the test data becomes much simpler to write (see GitHub for the full code): we can check that the last item in the Observable stream was a successful result containing our dinosaur:
And testing works!
When I first added RxBlocking to the project, I found some of my tests started showing errors. RxBlocking was new to me; perhaps I was using it incorrectly? But after checking carefully, I realized I'd forgotten to complete the Observable stream in some of the failure cases in the Dyno library, which meant that the toBlocking() call was not finishing (it was timing out).
So this was a genuine bug which I'd uncovered, and quite a complex asynchronous dependency that would have been hard to find any other way. So RxBlocking was certainly worth the (minimal) effort!
Approach II : Who am I talking to ?
The problem with the Mock approach is that it doesn't really check we are passing the right data to Boto in the first place. With scan, for instance, we have quite complicated logic to build a filter (see DynoFilter on GitHub for the details). What if that's not correct?
What I'd like to be able to do is to check the data being sent from Dyno, via Boto, to AWS, and compare that to what I'd expect, which I'd obtain from running the same command using the boto3 library directly in Python. That assures me that I've created the right calls. So, the question becomes: how do we find what Boto3 is sending to AWS?
This would be possible by putting a network analyser onto my Wi-Fi, but of course I want a way to test this repeatedly without actually making the calls. Fortunately, it's possible to set a couple of flags in boto3 to get hold of the information in a reusable way:
- First, use boto3.set_stream_logger to log what it's doing to a file, and the Python logging module (which I loaded as PYLOGGING) to use a temporary filename for that test.
- Secondly, when we create the connection to boto3, tell it not to use the default AWS URL, but to use a dummy URL which we pass in.
If we do this, boto3 doesn't actually make the call to AWS, and nicely prints out for us the call it's going to make, which looks something like this:
Making request for OperationModel(name=Scan) with params: {'body': '{"TableName": "Mockosaurs", "Limit": 100}', 'url': 'https://dynamodb.dummy.com/', 'headers': {'User-Agent': 'Boto3/1.9.94 Python/2.7.10 Darwin/18.5.0 Botocore/1.12.94 Resource', 'Content-Type': u'application/x-amz-json-1.0', 'X-Amz-Target': u'DynamoDB_20120810.Scan'}, 'context': {'auth_type': None, 'client_region': 'us-east-2', 'has_streaming_input': False, 'client_config': <botocore.config.Config object at 0x1073103d0>}, 'query_string': '', 'url_path': '/', 'method': u'POST'}
boto3 then fails with an error message, which is OK because we just want to test what we are sending to AWS, not what we receive.
If we wanted to test both sending and receiving, we would probably end up writing a local web service to pretend to be AWS, so that we could control the data being sent and received. However, that ends up duplicating the testing of boto3 itself, and that's overkill: we trust Amazon to have done that! So, we are just testing sending for now.
Parsing the log file
Once we have this log file, we can parse it to find the call being made, turn the text into a JSON object, and compare that to the JSON object we get by making the same call directly from Python.
To make this work, we create another DynoConnection implementation, this time subclassing the "real" DynoBoto3 class. This subclass is called DynoLocalOnlyBoto3 (the localOnly meaning it doesn't go to AWS, and the Boto3 part because it otherwise does call through to the Python boto3 library). In fact, the only things we override in the subclass are the flags that set the logger (to a temporary logfile created for the purpose) and the dummy URL; the rest of the logic does not change.
DynoLocalOnlyBoto3 does however contain a log parser. The idea is that once the test run is complete, we parse the log and extract the JSON for comparison to the expected JSON (which we'll hard-code into the test itself). The log parser is this function:
...which returns a parsed dictionary [String:Any] of the data on the OperationModel line in the log file.
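The actual parser is in the repository. Purely as an illustration, here is one way the request body could be pulled out of a log line shaped like the sample above. The function name requestBody(inLog:) is hypothetical, and the sketch assumes the body is wrapped as 'body': '...' and does not itself contain the sequence ', ' with surrounding quotes.

```swift
import Foundation

// Hypothetical sketch: finds the first OperationModel line in a log and
// decodes the JSON text stored under its 'body' key.
func requestBody(inLog log: String) -> [String: Any]? {
    for line in log.components(separatedBy: .newlines) where line.contains("OperationModel") {
        guard let start = line.range(of: "'body': '"),
              // Assumes the body ends at the next "', '" separator.
              let end = line.range(of: "', '", range: start.upperBound..<line.endIndex),
              let data = String(line[start.upperBound..<end.lowerBound]).data(using: .utf8)
        else { continue }
        return (try? JSONSerialization.jsonObject(with: data)) as? [String: Any]
    }
    return nil
}

let sampleLog = [
    "Some unrelated debug line",
    #"Making request for OperationModel(name=Scan) with params: {'body': '{"TableName": "Mockosaurs", "Limit": 100}', 'url': 'https://dynamodb.dummy.com/'}"#
].joined(separator: "\n")

let body = requestBody(inLog: sampleLog)
```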
The parser is pretty simplistic, but one interesting extension it adds to String is suffix(after:String). This searches for the string after in a bigger string, and if it's found, returns the rest of the string following it (or an empty string, if the parameter is not found).
"abc def-ghi".suffix(after: "def-") == "ghi"
My original implementation tried to use indexes into the string to remove different parts, but this is much more straightforward to understand, and in line with Swift's string processing paradigm.
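For readers who want to try the idea, an extension with the behaviour described above can be sketched in a few lines. This is my own guess at an implementation based on that description, not necessarily the code in the repository.

```swift
import Foundation

extension String {
    // Returns everything following the first occurrence of `marker`,
    // or an empty string when `marker` does not occur.
    func suffix(after marker: String) -> String {
        guard let found = self.range(of: marker) else { return "" }
        return String(self[found.upperBound...])
    }
}

let tail = "abc def-ghi".suffix(after: "def-")
let missing = "abc def-ghi".suffix(after: "xyz")
```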
Once we've parsed the file, we use Foundation's JSONSerialization.jsonObject method, which tries to convert data into a JSON structure. To compare the parsed data with the expected result, I added a small comparison method XCTAssertEqualDictionaries in DynoTests which compares nested dictionaries; this is fine for simple testing.
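XCTAssertEqualDictionaries itself is in the repository. A different way to get a similar check, shown here only as a sketch, is to normalise both JSON texts by parsing them and re-serializing with sorted keys, then compare the resulting strings. The helper name canonicalJSON is hypothetical.

```swift
import Foundation

// Hypothetical helper: parses JSON text and writes it back out with sorted
// keys, so two documents with the same content produce the same string.
func canonicalJSON(_ text: String) -> String? {
    guard let data = text.data(using: .utf8),
          let object = try? JSONSerialization.jsonObject(with: data),
          JSONSerialization.isValidJSONObject(object),
          let canonical = try? JSONSerialization.data(withJSONObject: object, options: [.sortedKeys])
    else { return nil }
    return String(data: canonical, encoding: .utf8)
}

let sent = #"{"TableName": "Mockosaurs", "Limit": 100}"#
let expected = #"{"Limit": 100, "TableName": "Mockosaurs"}"#
let wrongType = #"{"Limit": "100", "TableName": "Mockosaurs"}"#

let matches = canonicalJSON(sent) != nil && canonicalJSON(sent) == canonicalJSON(expected)
let typeMismatchDetected = canonicalJSON(sent) != canonicalJSON(wrongType)
```

Key order does not matter with this approach, but value types do: a number sent as a string no longer matches the expected request.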
Snakes on a Test
One problem did appear with this testing approach. Running tests one after the other (by clicking the "play" button in Xcode on each test method) worked properly, but trying to run them all at the same time (by clicking "play" for the whole suite) caused seemingly random failures in the tests using Approach II.
From the random nature of the failures it looked like a race condition, but even after adding synchronization to prevent the tests running at the same time, the same behaviour appeared: a race condition, but in single-threaded code! What was going on?
After some head-scratching and searching of Stack Overflow, I realized that the problem was with my use of the basicConfig method on the Python logger. This is actually a one-shot setup: the first time it is called it sets up the logger, but calling it repeatedly doesn't change the logger path again! So whichever test managed to set the logfile name first would always "win" and succeed, but the other tests would fail because they'd all be trying to parse the same logfile each time.
That was fixed with yet more Python magic (I will spare you the details; see DynoLocalOnlyBoto3.init). Again I was pleasantly surprised by how easily Python and Swift integrate.
Testing works, part 2
And again all of this setup effort paid off, as I uncovered another bug, this time in the Filter setup (I had assumed I could pass all filter values as Strings, but numbers need to be set as numbers).
I would definitely have found that if I'd tried to run against a real database, but the advantage of the unit tests is that I could refactor the Filter code to make it rather more straightforward, and be sure it would work again afterwards... without having to use a real database and potentially pay the read costs!
Coder coda
After all that, I have a suite of unit tests for the Dyno library. This will be very useful for future development. We'll put this to use next time when we add a user interface mini-library to process those Observables and actually show some data!
One More Thing: Python Objects to Codables
Although Swift can deal with Python objects, it doesn't actually know much about them: they are just opaque PythonObjects. Wouldn't it be great if we could turn them into regular Swift types?
You may have noted that functions like scan can now take the type of object to create, rather than using a Builder to turn a Python object into a Swift object.
This is done by having the types you want to use with Dyno conform to the Codable protocol. I then use a bridging function convertDecodableToBuilder to convert the passed-in [De]Codable into a Builder:
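The real bridging function works on Python objects and is in the repository. To show the general shape of the idea without Python in the picture, here is a sketch in which a plain [String: Any] dictionary stands in for the Python object and a JSON round trip stands in for the custom decoder. The Builder typealias, makeBuilder(for:) and the Dinosaur type are all hypothetical names for this example.

```swift
import Foundation

// Hypothetical model type; any Decodable type would work the same way.
struct Dinosaur: Codable, Equatable {
    let id: String
    let name: String
    let teeth: Int
}

// Stand-in for a builder: turns a loosely typed record into a Swift value.
typealias Builder<T> = ([String: Any]) -> T?

// Produces a builder for any Decodable type, so callers only name the type.
// The JSON round trip here replaces the custom decoder used by the library.
func makeBuilder<T: Decodable>(for type: T.Type) -> Builder<T> {
    return { record in
        guard JSONSerialization.isValidJSONObject(record),
              let data = try? JSONSerialization.data(withJSONObject: record)
        else { return nil }
        return try? JSONDecoder().decode(T.self, from: data)
    }
}

let build = makeBuilder(for: Dinosaur.self)
let rex = build(["id": "1", "name": "Tyrannosaurus", "teeth": 60])
let incomplete = build(["id": "2"])  // missing fields, so decoding fails
```

The point of the generic function is that the call site only has to mention Dinosaur.self; nobody writes field-by-field conversion code for each new type.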
The final challenge then is to provide a PythonDecoder, which can take a Python object and turn it into the type T which you pass in. This magic is performed by conforming to Swift's Decoder protocol (used by things like JSONDecoder to similarly convert JSON objects to arbitrary passed-in types).
The problem with conforming to Decoder (and Encoder, which goes the other way) is that it requires a huge amount of boilerplate code: there are very few examples on the Internet of writing your own, and in fact the best examples are from JSONDecoder.swift and Codable.swift in the open-source Swift library.
I actually got a big jump-start using code written by my friend Sam of https://elegantchaos.com, who adapted the Apple code to decode to dictionaries; I've just adapted that further. Look at the PythonEncoder.swift and PythonDecoder.swift files: they are not pretty, but the end result is very nice!
Note: adding these files has therefore changed the distribution licence to Apache License v2.0. If you don't like this, remove those files and use Builder/Writer directly.