Self mockery
Here on the Data Team at When I Work we manage a lot of projects: 5 web apps, 7 services, 20 packages, 15 ETLs and several other assorted endeavors. That’s a lot for a team of six people to keep up with. One of the biggest reasons I believe we succeed in managing all of these projects is our commitment to our practice: we focus not only on the final results we deliver, but also on the path we take to deliver them.
As part of this, we commit ourselves to having full test coverage on all of our projects. It doesn’t matter if they are core or ancillary, one-offs or recurring: all of them need tests, and all of the tests need to pass before any change is considered complete. Driven by this commitment to testing, we’ve experimented with design patterns that enable us to write our tests quickly.
Our most recent success in this area comes from a pattern one might call “self-mocking”: having libraries that know how to mock themselves. Anyone familiar with testing in Python has probably used unittest.mock. It is a library in the Python standard library that allows programmers to easily replace dependencies within their code, allowing them to write tests and check their output against a known result. Self-mocking follows the same basic idea, but shifts responsibility around.
In normal unit testing in Python, the package being tested uses unittest.mock to mock out any dependencies that would cause problems while running tests. These are things such as network or disk operations, or really anything that gets complicated without providing much testing value for the code at hand. Mocking out these dependencies allows our unit tests to be run quickly and reliably with fairly minimal effort.
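As a minimal sketch of that conventional approach, the snippet below patches a DNS lookup so the code under test never touches the network. The resolve_host function, the hostname and the address are made up for illustration; socket.gethostbyname and mock.patch come from the standard library.

```python
import socket
from unittest import mock


def resolve_host(hostname):
    """Hypothetical code under test: look up a host's IP address."""
    return socket.gethostbyname(hostname)


# Replace the network lookup with a canned answer for the duration of the block.
with mock.patch("socket.gethostbyname", return_value="203.0.113.7") as mock_lookup:
    address = resolve_host("example.com")

# The test can now check the result and how the dependency was called.
assert address == "203.0.113.7"
mock_lookup.assert_called_once_with("example.com")
```

Note that the calling code has to know exactly which name to patch, and every package that depends on the lookup has to repeat that knowledge in its own tests.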
When practicing self-mocking we take some of this responsibility and move it out of the code being tested and into the package being mocked. Where normally it would be the responsibility of the calling code to handle mocking its dependencies, we shift that responsibility onto the dependency itself. We do this by adding mocking utilities to the dependency’s public interface. The consuming code can then use these utilities provided by the dependency instead of mocking it directly. This pattern lets us take the work that is repeated across many consuming packages and move it into a single location.
To put this in concrete terms, suppose we had a package, let’s call it streamer, whose job is to handle streaming data records out to an AWS Kinesis stream.
import streamer

# Put records onto some Kinesis stream
streamer.put_records(records)
Now suppose streamer does more than just put records straight onto the stream. It might also do some validation, confirming that the given records are formatted correctly and will be acceptable to Kinesis. If we write our tests the normal way, we omit all of this checking.
# One way of writing tests without self-mocking
from unittest import mock

@mock.patch('streamer.put_records')
def test_my_code(mock_put_records):
    """Test my code without actually trying to send records."""
Since we mock out all of streamer.put_records, we get rid of the validation along with the network operations. So this test may pass, but we may be misusing the library and there could be a validation error when we run it for real.
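To see how that failure mode can hide, here is a small self-contained sketch. The Streamer class is a hypothetical stand-in for the streamer package, and the partition_key check is an assumed validation rule chosen only for illustration; neither comes from the real library.

```python
from unittest import mock


class Streamer:
    """Hypothetical stand-in for the streamer package."""

    def put_records(self, records):
        # Assumed validation rule: every record needs a partition key.
        for record in records:
            if "partition_key" not in record:
                raise ValueError("record is missing 'partition_key'")
        self._send(records)

    def _send(self, records):
        # Stands in for the real network call to Kinesis.
        raise RuntimeError("network access is not allowed in tests")


streamer = Streamer()


def publish_events(events):
    """Code under test. It forgets to set a partition key."""
    streamer.put_records([{"data": event} for event in events])


# Mocking at the interface: the malformed records are never validated,
# so this block finishes without any error.
with mock.patch.object(streamer, "put_records") as mock_put_records:
    publish_events(["signup", "login"])

mock_put_records.assert_called_once()
```

Outside the patch, the same call to publish_events raises ValueError, which is exactly the kind of problem the mocked test was unable to catch.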
The self-mocking way of doing this would be to add a new function to streamer. Let’s call it streamer.mock_kinesis_stream.
# In streamer/__init__.py
import functools
from unittest import mock

def mock_kinesis_stream(func):
    """A Python decorator that mocks the stuff we need."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Patch only the internals that cause side effects
        with mock.patch('the_things_we_use_internally'):
            return func(*args, **kwargs)
    return wrapper
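To make the placeholder patch target more concrete, here is one way the idea could look in a single runnable file. This is only a sketch under assumptions: the Streamer class stands in for the streamer package, _send stands in for whatever it uses internally to reach Kinesis, and the partition_key rule is an invented validation check.

```python
import functools
from unittest import mock


class Streamer:
    """Hypothetical stand-in for the streamer package."""

    def put_records(self, records):
        # Assumed validation rule, kept active even under the mock.
        for record in records:
            if "partition_key" not in record:
                raise ValueError("record is missing 'partition_key'")
        return self._send(records)

    def _send(self, records):
        # Stands in for the real network call to Kinesis.
        raise RuntimeError("network access is not allowed in tests")

    def mock_kinesis_stream(self, func):
        """Decorator that patches only the network call."""
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            with mock.patch.object(self, "_send"):
                return func(*args, **kwargs)
        return wrapper


streamer = Streamer()


@streamer.mock_kinesis_stream
def check_valid_records():
    streamer.put_records([{"partition_key": "user-1", "data": "signup"}])
    return "sent"


@streamer.mock_kinesis_stream
def check_invalid_records():
    try:
        streamer.put_records([{"data": "signup"}])
    except ValueError as error:
        return str(error)
    return "no error"


valid_result = check_valid_records()      # nothing is sent, validation passes
invalid_result = check_invalid_records()  # validation still runs and fails
```

Because the patch sits below the validation rather than on put_records itself, well-formed records go through without touching the network, while malformed ones still raise.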
Now, instead of just mocking right at the interface, we can allow streamer to be intelligent and remove any side effects that we wouldn’t want during testing, but preserve any validation that we would.
import streamer

# Writing tests with self-mocking
@streamer.mock_kinesis_stream
def test_my_code():
    """Test my code without sending records, but with validation."""
Now I know some of you will be saying, “That’s not unit testing,” and the truth is you’re right: that isn’t unit testing under the strictest definition. We are not taking steps to isolate each component from all of the others. In fact, we’re actively working to keep them together. However, our experience suggests that this is how we’d prefer it. It gives us isolation where it is advantageous, allowing us to feel confident that our code is going to work across package boundaries while still getting the snappy feedback that we’ve grown to expect out of unit tests. In the end, we get higher quality code for less effort, which is a trade I’m always eager to make.