Tips and tricks for unit tests (with Python & Pytest)
In my career as a Python developer I have had the luck of being taught to test my work thoroughly from the beginning (thanks Rubén!), and at each new job I have been able, to a certain degree, to keep testing a primary part of my work, which has given me time to keep improving my skills.
What I have realized along this path is that many people, even seasoned developers, lack the tools or the knowledge to do unit testing effectively. And that is not the whole problem: many people, from managers to developers, consider testing a waste of time and cut corners when under pressure (deadlines, you know).
In my opinion, testing saves you time in the short and the long term. It also helps you write better and more reliable code (see the section Why Unit Testing below).
Over the last year at my current company, Worldsensing, we’ve been stepping up our unit testing game, and Pytest has been a great enabler. We have also been developing an internal guide, which anyone can read and contribute to, about unit testing in Python. A good part of the material for this article comes from that guide.
In this article I will talk mainly about testing with Pytest, but many of the tips are valid for other frameworks and languages, and most of the key features I’ll cover should be available in whichever framework you use.
Code examples are embedded using Github’s Gist. Due to restrictions on the embedded plugin, each example is in a separate Gist, but all examples are available in a single multi-file Gist, so you can download it, play with it and contribute: https://gist.github.com/hectorcanto/40a7ecbd9e02b0550840f850165bc162
Disclaimer: code snippets have been tested only in Python 3.6; other versions might need some adaptation.
Why Unit Testing
Testing serves you in many ways:
- It permanently validates what you’ve done; when you say that something works, tests are the proof.
- It saves you from regression bugs and side effects. If your code is thoroughly tested you can be much more at ease developing.
- Tests enable you to refactor. As a rule, you should not only develop code that works, but also code that is readable and maintainable. For that you need to refactor the code, and, to do so, you must have tests.
- Developing tests changes your mindset. When writing tests you change sides and “attack” your code, which makes it easier to detect corner cases and potential errors. Since you are unit testing, you may identify blocks (units) of code that would fit better in a function, so you refactor and test again.
- Tests may save you from the “blame” culture. When an error occurs somewhere in the project architecture (app, frontend, API, DB …), if you have tested that particular situation, you are free to go.
- Debugging is much easier. When an unexpected error occurs, if you have a good test suite you’ve already saved a lot of time: you have the basic cases covered, you have probably discarded some possibilities, and you can reproduce the error with a few modifications on a given test. Also, you can patch in the debugger to see what is happening behind the curtains without having to deploy the full monster.
Nevertheless, unit testing will not protect you from everything. You need to consider the full Test Pyramid to balance unit tests with other test types to fully validate your project. But that’s a whole topic for another article.
Why pytest: productivity
In my experience, I find Pytest more appropriate for code reusability than the built-in unittest module; however, its learning curve is a bit steeper and some artifacts seem a bit odd at the beginning.
Once you get the hang of the Pytest style, you will be faster creating tests. Also, Pytest is capable of running tests written for other frameworks, so you can keep your old tests unchanged and start with Pytest right away.
To increase your test development productivity even more I recommend using repository templates to reuse the boilerplate and keep the recurrent code synchronized between projects — You can find out more about repository templates in my other article https://medium.com/worldsensing-techblog/project-templates-and-cookiecutter-6d8f99a06374.
In some cases, [private] libraries are another tool to reuse code. For instance, I have a DB client that initializes and drops the database, and provides commands to help you manage it, like clearing tables between tests.
Basic concepts: fixtures and mocks
Before talking about testing directly, you must know two concepts that are common to every programming language: fixtures and test doubles (also known as mocks).
Fixtures
Fixtures are the set of artifacts surrounding the test of a certain component (or unit): input data, real elements, fake elements (mocks and others), pre-created files.
In particular, Pytest pays special attention to the loading of fixtures in two ways: through the input parameters of a given test, and through decorators (especially “parametrize”). If a fixture is named in the test signature it will be automatically loaded and accessible in the test scope, and unloaded at the end of the test if applicable. A fixture can also be reused if its scope is bigger than the usual “function” scope; we will talk about scopes later on.
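For instance, a minimal sketch of both mechanisms (the fixture names here are illustrative):

```python
import pytest

@pytest.fixture
def user_record():
    """Data fixture: rebuilt for every test that names it."""
    return {"name": "Ada", "email": "ada@example.com"}

@pytest.fixture(scope="module")
def shared_resource():
    """Module-scoped: created once, reused by every test in this module."""
    resource = {"connections": 0}   # stand-in for something expensive
    yield resource                  # everything after yield runs as tear-down
    resource.clear()

def test_user_has_email(user_record, shared_resource):
    # Naming the fixtures as parameters is enough for Pytest to inject them
    shared_resource["connections"] += 1
    assert "@" in user_record["email"]
```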
Mocks
Mocks are a particular type of fixture. Mocks are replacements for “real things” in your program, with a bare minimum implementation. To be more precise, a mock is one type of test “double”; other test doubles are stubs, fakes and dummies. They are all pretty similar, though, and many people use “mock” as an umbrella term for any test double.
Mocks are useful for reducing the load of a test, and removing dependencies between different elements so you can test only the component you want (that’s why we call them unitary tests). If you don’t mock certain elements you may encounter undesired effects like calling an external API or failing to reach an inaccessible dependency.
Fixtures in general, and mocks in particular, will be used in the examples of the next sections.
Know your mocks
Once you understand mocks you may be tempted to implement them yourself, but usually, if not always, there are libraries that already do it for you.
In Python, they come with the great built-in unittest.mock library, made available to Pytest through the pytest-mock plugin. The community has also created many Pytest plugins for more specific mocking needs.
Mocks make our tests lighter than the real implementation, help us remove dependencies and serve as spies and interceptors of our program flow.
By default, mocks intercept messages going through them. In case you need messages to pass through, you can either configure the mock to let things pass or use a spy.
Mocks can replace a full module, a class, a method or a single attribute. The most common way to use a mock is to patch a method of a class or a function of a module.
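For instance, a minimal sketch with the mocker fixture from pytest-mock; the function under test is a toy example built on the standard smtplib:

```python
import smtplib

def send_alert(message):
    """Function under test: would talk to a real SMTP server in production."""
    server = smtplib.SMTP("localhost")
    server.sendmail("app@example.com", ["ops@example.com"], message)
    server.quit()

def test_alert_sends_one_mail(mocker):
    # Patch the class where it lives; anything the test touches gets the mock
    fake_smtp = mocker.patch("smtplib.SMTP")
    send_alert("disk almost full")

    fake_smtp.assert_called_once_with("localhost")        # constructor spy
    fake_smtp.return_value.sendmail.assert_called_once()  # method spy
```

The patch is undone automatically when the test finishes, so no other test sees the mocked class.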
More applied examples of mocks are presented in the next sections.
Monkeypatching
An alternative technique to mocking is monkeypatching.
Monkeypatching means changing some code element at runtime. In Python this is simpler than in other programming languages because of the mutable nature of classes and modules: attributes and functions are easily modifiable at runtime.
Contrary to mocks, monkeypatches do not add any spy functionality, they just replace the patched element.
While this looks dangerous (it is, proceed with caution), Pytest makes monkeypatching easier for you: it comes with a built-in plugin that patches and unpatches the target automatically in the context of a given test. See some simple examples:
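Here is a small self-contained sketch of the built-in monkeypatch fixture; build_event is a toy function written for the example:

```python
import os
import time

def build_event(name):
    """Toy function under test: depends on the clock and on an env var."""
    return {"name": name, "ts": time.time(), "env": os.environ.get("APP_ENV", "dev")}

def test_build_event_is_deterministic(monkeypatch):
    # setattr and setenv are reverted automatically after the test
    monkeypatch.setattr(time, "time", lambda: 1500000000.0)
    monkeypatch.setenv("APP_ENV", "test")
    assert build_event("boot") == {"name": "boot", "ts": 1500000000.0, "env": "test"}
```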
Pytest features and tips
On each of the next points I will explain an aspect of unit testing, associated with a Pytest feature.
Parametrize your tests
I’ve seen the same test copy-pasted many times with different inputs. With Pytest you can reduce all of them to a single test. An example:
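A sketch along these lines, using the built-in int for brevity:

```python
import pytest

@pytest.mark.parametrize("raw, expected", [
    ("42", 42),
    ("  7 ", 7),
    ("-1", -1),
])
def test_parse_int(raw, expected):
    # One test function, three independently reported cases
    assert int(raw) == expected
```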
Each entry will be run and asserted separately, so if one input causes a failure, you will be notified for that particular error, and the rest of the parametrized entries of the test will be run.
Be unit my friend
Make your unit tests as small as possible, so that the minimum possible piece of code is involved. This way, when an error appears, you will be able to quickly assess where it originated. On the contrary, if your test is too broad, it will be more difficult to determine the culprit.
A typical anti-pattern is to use an API to insert the fixtures in the DB and then interact with it to test different methods. If the insertion fails, your test also fails, but not because of the targeted API call. Avoid this, if possible, by loading the fixtures off-API, for instance with the ORM, which is usually something you trust and have thoroughly tested.
Treat test code as core code
Your tests have to be maintained and they will be read by others. Also, tests are an expression of what you expect of your implementation, so they are a good way to understand what a particular piece of code is doing.
So, make your test code readable, use docstrings and comments and respect style, almost as if it were a part of the functional code base. You and your colleagues will be thankful for it later on.
Make your test suite fast
Tests should be fast so you can run them often. Every time you add or change something in your code, launch your tests to see if what's already developed keeps working.
My rule of thumb is one minute tops, but that really depends on what you are doing and on whether you are connecting to something external like a DB or a cloud service.
If your tests cannot be made really fast, organize them in suites — see the next sections.
Organize your tests by concern
Separate your tests by concern into different suites (a fancy name for folders) so you can run them according to what you are testing. This is especially important if you have tests that hit something external.
Pytest can help you in several ways to run each test group separately: folders, markers, names …
I usually split tests in three folders:
- smoke: just to check initialization and configuration. I usually mark them to be run first with the pytest-ordering plugin at module level: pytestmark = pytest.mark.first. If smoke tests are not easily separable, you can use a marker per function: decorate your basic tests with “@pytest.mark.smoke” and launch them with “pytest -m smoke”.
- unitary: the really small ones, with no dependencies. They should run in a few seconds and be launched often (more often than the integration ones, anyway).
- integration: the ones that need a DB or that hit something external like a cloud server or a third-party authorization service. I usually make subfolders for each dependency.
You can further group your tests, for instance, by dependency integration, by API resource, by affected subdomain…
This folder tree gives an example of how I organize my tests.
├── tests
│   ├── data
│   │   ├── some_input.json (input fixtures from files)
│   │   └── list_of_names.txt
│   ├── unitary
│   │   ├── __init__.py (not mandatory but sometimes useful)
│   │   ├── conftest.py (particular fixtures of this subfolder)
│   │   ├── test_api_basics.py
│   │   ├── test_resource_user.py
│   │   ├── test_other_resource.py
│   │   ├── test_business_logic.py
│   │   ├── test_validation.py
│   │   └── test_factories.py
│   ├── integration
│   │   ├── aws
│   │   │   ├── __init__.py
│   │   │   ├── conftest.py
│   │   │   ├── test_some_service.py
│   │   │   └── test_bucket_upload.py
│   │   └── repositories
│   │       ├── __init__.py
│   │       ├── conftest.py
│   │       ├── test_interface_db.py
│   │       └── test_external2.py
│   ├── __init__.py
│   ├── common.py (extract common code and utils here, f.i.)
│   ├── conftest.py (most fixtures are stored here)
│   ├── test_smoke.py
│   └── test_cli_aux_commands.py
├── reports/ (stores test results, test logs and coverage reports)
├── pytest.ini (configures the pytest runner)
├── .coveragerc (configures the coverage plugin)
└── README.md (always tell what you do; code only explains behaviour, not intention)
Following the names in the snippet, you can launch them separately with:
- pytest tests/test_smoke.py
- pytest tests/unitary
- pytest tests/integration
- pytest tests/integration/aws
To launch them all just execute pytest.
Test for errors
For some reason, many people only test to see that everything works as expected. But what does your program do when something goes wrong?
My recommendation is to test what your program does when an error occurs: Does it raise the right exception? Does it log what you need?
Pytest helps you do that:
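A minimal sketch with the pytest.raises context manager and the built-in caplog fixture; withdraw is a toy function invented for the example:

```python
import logging
import pytest

def withdraw(balance, amount):
    """Toy function under test."""
    if amount > balance:
        logging.getLogger("bank").error("insufficient funds")
        raise ValueError("insufficient funds")
    return balance - amount

def test_withdraw_rejects_overdraft():
    # The block only passes if the expected exception is raised inside it
    with pytest.raises(ValueError, match="insufficient funds"):
        withdraw(balance=10, amount=100)

def test_overdraft_is_logged(caplog):
    with pytest.raises(ValueError):
        withdraw(balance=10, amount=100)
    # caplog captures the log records emitted during the test
    assert "insufficient funds" in caplog.text
```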
Don't wait for sleeps and timeouts
It is usual for programs to be idle for a few seconds waiting for a response or trying to reconnect to some service.
To test these mechanisms you don’t need to wait: monkeypatch them to get results immediately. See three examples:
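Sketches of the three tricks; connect_with_retry and RETRY_SECONDS are toy stand-ins for your own code:

```python
import sys
import time
import datetime

RETRY_SECONDS = 30  # the kind of constant a retry loop honours

def connect_with_retry(connect, retries=3):
    """Toy helper under test: sleeps RETRY_SECONDS between attempts."""
    for _ in range(retries):
        try:
            return connect()
        except ConnectionError:
            time.sleep(RETRY_SECONDS)
    raise ConnectionError("gave up")

def test_retry_without_waiting(monkeypatch, mocker):
    # 1) Neutralize the sleep so three "30-second" waits cost nothing
    monkeypatch.setattr(time, "sleep", lambda seconds: None)
    # 2) Or shrink the timeout constant itself (both are undone afterwards)
    monkeypatch.setattr(sys.modules[__name__], "RETRY_SECONDS", 0)

    connect = mocker.Mock(side_effect=[ConnectionError, ConnectionError, "ok"])
    assert connect_with_retry(connect) == "ok"
    assert connect.call_count == 3

def test_frozen_today(monkeypatch):
    # 3) datetime.date is immutable C code, so swap in a subclass instead
    class FakeDate(datetime.date):
        @classmethod
        def today(cls):
            return cls(2020, 1, 1)
    monkeypatch.setattr(datetime, "date", FakeDate)
    assert datetime.date.today().isoformat() == "2020-01-01"
```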
Also, monkeypatching can be used to generate timestamps or datetimes that emulate a certain condition (see the previous code snippet).
Mock HTTP responses
I’ve seen test suites that implement a mock server to respond to the requests of the program under test. That’s usually overkill. Instead, mock the request method (requests.get, for example) and make it return an object with the desired status code and content; you can probably ignore other attributes like headers.
Remember, you are testing the response of your program to certain conditions, keep it simple.
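A sketch of the idea; the endpoint and get_user_name are invented for the example:

```python
import requests

def get_user_name(user_id):
    """Function under test: hits a (hypothetical) REST endpoint."""
    response = requests.get("https://api.example.com/users/%d" % user_id)
    response.raise_for_status()
    return response.json()["name"]

def test_get_user_name(mocker):
    # A bare Mock exposing just the attributes our code reads; no server runs
    fake_response = mocker.Mock(status_code=200)
    fake_response.json.return_value = {"name": "Ada"}
    fake_response.raise_for_status.return_value = None
    mocked_get = mocker.patch("requests.get", return_value=fake_response)

    assert get_user_name(7) == "Ada"
    mocked_get.assert_called_once_with("https://api.example.com/users/7")
```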
Note that the patch above blocks the real call entirely. A spy, in contrast, would let the request pass through while recording it; and if you need to patch an attribute of an object you already hold, use mocker.patch.object.
Spy on loops
Loops are especially hard to test, above all when you have to run several iterations before the behavior you are looking for shows up.
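Mocks double as spies here: they record every call made inside the loop, and side_effect lets you script one answer per iteration. A sketch, with poll_until_ready as a toy loop written for the example:

```python
def poll_until_ready(check, max_iterations=100):
    """Toy loop under test: polls `check` until it answers True."""
    for iteration in range(max_iterations):
        if check():
            return iteration
    raise TimeoutError("never became ready")

def test_loop_stops_as_soon_as_ready(mocker):
    # One scripted answer per iteration; the mock records every call
    check = mocker.Mock(side_effect=[False, False, True])
    assert poll_until_ready(check) == 2
    assert check.call_count == 3  # proves the loop did not over-iterate

def test_spy_counts_real_calls(mocker):
    import json
    spy = mocker.spy(json, "dumps")  # the real function still runs
    json.dumps({"a": 1})
    assert spy.call_count == 1
```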
Exception as breaking point
Sometimes it seems impossible to break a piece of your program apart into a testable function. In that case, you can use exceptions to your own benefit: patch some function to raise a custom exception and check the state of your application at that exact point.
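A minimal sketch of the trick; the pipeline and the exception are invented for the example:

```python
import pytest

class BreakingPointError(Exception):
    """Raised only from tests, to halt the program at a chosen point."""

def run_pipeline(load, transform, store):
    """Toy pipeline under test: hard to split further."""
    data = load()
    data = transform(data)
    store(data)  # we want to stop right before this side effect

def test_state_right_before_storing(mocker):
    store = mocker.Mock(side_effect=BreakingPointError)
    seen = {}

    def transform(data):
        seen["result"] = data * 2
        return seen["result"]

    with pytest.raises(BreakingPointError):
        run_pipeline(load=lambda: 21, transform=transform, store=store)
    # Execution stopped exactly at `store`; now inspect the state up to there
    assert seen["result"] == 42
```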
Make your fixtures programmatically
Fixtures are the elements that create the conditions of a given test case. In a broad definition, fixtures include the inputs you feed into your tests and the elements you set up, like a DB, an application state or a file on disk.
In a narrower definition, fixtures refer only to the input data: data fixtures, if you prefer.
An anti-pattern that I've seen many times is to have dozens of JSONs or other files stored to make up those data fixtures. That’s plainly wrong.
Create functions that generate those files or inputs on demand. This pattern is usually called a factory (the factory pattern is a recurrent one in object-oriented programming; look it up).
In fact, there is an interesting Python package called factory_boy that helps you create instances from your DB model, whether it is a Django model, a Mongo model or an SQLAlchemy model.
For one particular use case, I like to convert the model fixtures into dictionaries and use them to make requests against the REST API I'm developing. In that case I create an EntityDictFactory like this:
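A sketch of such a factory with factory_boy; the field names are illustrative:

```python
import factory

class UserDictFactory(factory.DictFactory):
    """Builds plain dicts, ready to send as JSON payloads to the API."""
    name = factory.Sequence(lambda n: "user%d" % n)
    email = factory.LazyAttribute(lambda obj: "%s@example.com" % obj.name)
    role = "viewer"

payload = UserDictFactory()            # e.g. {'name': 'user0', 'email': ...}
admin = UserDictFactory(role="admin")  # override any field inline
```

For model instances instead of dicts, factory_boy provides, among others, factory.django.DjangoModelFactory and factory.alchemy.SQLAlchemyModelFactory.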
Use a separate DB for testing
While developing, you will probably create some mock data and store it in a DB. Creating that data is costly and you surely want to keep the database as it is for some time. So use a different DB schema for your tests, to avoid unwanted deletions and conflicts between different data sets.
I usually have a different configuration for local development and testing. I name the test DB schema as test_$whatever_schema_name, so it is easy to identify.
Regarding the mock data, automate its creation to be safe and save time during the developing phase. Django has mechanisms to load fixtures, Django and Flask have command frameworks to make the database management simpler and there are a few CLI tools that will make your life easier: Click and Fire are my favorite ones.
Recreate your auxiliary database in every launch
While developing, the database model will evolve constantly. If you don’t update your database tables, you will get errors and they won’t be very descriptive. To avoid this, create and drop your schema in every test run (the whole suite), so it will be synced with your code.
Pytest helps you with this through session-scoped fixtures that run automatically:
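A sketch for a conftest.py; create_schema, drop_schema and truncate_all_tables are placeholders for your own helpers (with SQLAlchemy, for instance, Base.metadata.create_all and drop_all):

```python
import pytest

@pytest.fixture(scope="session", autouse=True)
def database():
    """Runs once per test run: build the schema before, drop it after."""
    create_schema()        # placeholder for your own helper
    yield
    drop_schema()          # tear-down: everything after yield

@pytest.fixture(autouse=True)
def clean_tables():
    """Wraps every single test: leave the tables empty afterwards."""
    yield
    truncate_all_tables()  # placeholder: TRUNCATE/DELETE each table
```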
No matter which test filtering you use, this fixture will be run at the beginning and end of the whole test run. If you are not familiar with Pytest, mind that what is after the yield statement will be run at the end, as tear-down.
And clean your tables in every test
While recreating the DB is costly, cleaning the tables is not. Since we are talking about unit tests, you should clean up the tables you use, so the next tests find no conflicts. See the previous code snippet for an example of how to clean tables and configure the fixture.
If you really need a lot of fixtures loaded for a series of tests, you may be outside the unitary spectrum. If that’s your case, I recommend that you keep those tests apart (in a separate folder) and manage the fixtures with a module or class scope instead of a function one.
Wrap up
I hope this article has helped you understand why testing is important, given you new arguments to defend its importance, or made your life easier by introducing you to testing in general and Pytest in particular.
Before ending, I would like to thank Melanie Ator, Daniel Lázaro and Sergi García for reviewing this article, as well as all my colleagues at Worldsensing that are improving our testing efforts.