Integration testing with databases is one of the most vital, yet most commonly overlooked, parts of any software development process. This is, in part, due to the need for a reachable database server to be running somewhere. In this article, we explore unit and integration testing with databases that can be repeatedly spun up for testing, provide a uniform schema and test data, and can be easily torn down afterwards. All of this can be done with the help of some nifty open-source Python libraries that are readily available.
A Little Bit on Pytest
pytest is the current de facto testing framework for Python applications, surpassing in popularity the unittest module, which ships as a standard package with all modern Python distributions. Vanilla use of pytest is very straightforward: all one needs to do is write a function whose name starts with test, and run the pytest command on it.
However, the real power of pytest comes when one understands all the bells and whistles under the hood, in particular fixtures, which set up and tear down the environments required to run test cases.
There are a lot of very good tutorials on
pytest and thus we will not go into them in detail. This article focuses purely on using fixtures to set up temporary databases to achieve a more realistic and repeatable testing environment.
Fixtures are objects that set up certain conditions that are then used in testing. Think of them as functions that are called before the actual test functions run. In pytest, this is done by marking a function with the @pytest.fixture decorator. Thereafter, you can use the fixture in any test function simply by adding it as an argument to the function definition. A simple example is shown below.
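A minimal sketch of such a fixture might look like this (the names sample_numbers and test_sum are illustrative, not from the original snippet):

```python
import pytest


@pytest.fixture
def sample_numbers():
    # Runs before each test that requests it; the return value is injected.
    return [1, 2, 3]


def test_sum(sample_numbers):
    # pytest matches the argument name to the fixture and passes its value in.
    assert sum(sample_numbers) == 6
```

Running pytest on this file collects test_sum, evaluates the fixture, and injects its return value as the argument.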
Of course, this simple description belies their flexibility and extensibility. One can even use a fixture within another fixture to break the setup down into more manageable pieces, as well as to share fixtures between tests. Later we will see some of these techniques used in the test code for database connections.
Testing Database Interactions with Fixtures
We shall work through this tutorial with a simple example of a bank account transaction system (I mentioned in my previous article that I will not use examples that bear no semblance to real-life applications, yet this one is simple enough to illustrate the concepts). As a proponent of good, clean design, I also recommend using SQLAlchemy (https://www.sqlalchemy.org) to model the various business objects.
In this system, each customer is a user, and each user can have one or more accounts. An account can belong to a single user, or it can be a joint account belonging to multiple users. Every time a user credits or debits an account, a transaction is created. The entity-relationship diagram reflecting these relationships is shown below.
Now that we have the business objects, their object-relational model definitions in SQLAlchemy are shown in the snippet below.
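The original code snippet did not survive in this version of the article; the following is a plausible reconstruction of the schema just described, with a many-to-many association table for joint accounts (all table, column, and class names here are assumptions):

```python
from sqlalchemy import Column, ForeignKey, Integer, Numeric, String, Table
from sqlalchemy.orm import declarative_base, relationship

Base = declarative_base()

# Association table: a joint account can belong to several users.
user_account = Table(
    "user_account",
    Base.metadata,
    Column("user_id", ForeignKey("users.id"), primary_key=True),
    Column("account_id", ForeignKey("accounts.id"), primary_key=True),
)


class User(Base):
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False)
    accounts = relationship("Account", secondary=user_account, back_populates="users")


class Account(Base):
    __tablename__ = "accounts"
    id = Column(Integer, primary_key=True)
    balance = Column(Numeric, nullable=False, default=0)
    users = relationship("User", secondary=user_account, back_populates="accounts")
    transactions = relationship("Transaction", back_populates="account")


class Transaction(Base):
    __tablename__ = "transactions"
    id = Column(Integer, primary_key=True)
    account_id = Column(Integer, ForeignKey("accounts.id"), nullable=False)
    amount = Column(Numeric, nullable=False)  # positive = credit, negative = debit
    account = relationship("Account", back_populates="transactions")
```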
We next define all the operations that can be applied to the system in another module, core.py. For simplicity, we create a module with the following operations.
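The operations module itself is also missing from this version of the article. Below is a sketch of what core.py could contain; condensed model definitions are inlined so the example runs standalone, and the function names (credit, debit, get_balance) are assumptions:

```python
# core.py (sketch). The session is always received as an argument, never
# created inside the business functions.
from decimal import Decimal

from sqlalchemy import Column, ForeignKey, Integer, Numeric, String
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Account(Base):
    __tablename__ = "accounts"
    id = Column(Integer, primary_key=True)
    balance = Column(Numeric, nullable=False, default=Decimal("0"))


class Transaction(Base):
    __tablename__ = "transactions"
    id = Column(Integer, primary_key=True)
    account_id = Column(Integer, ForeignKey("accounts.id"), nullable=False)
    amount = Column(Numeric, nullable=False)


def credit(session: Session, account_id: int, amount: Decimal) -> Transaction:
    """Add funds to an account and record the transaction."""
    account = session.get(Account, account_id)
    account.balance = account.balance + amount
    txn = Transaction(account_id=account_id, amount=amount)
    session.add(txn)
    session.commit()
    return txn


def debit(session: Session, account_id: int, amount: Decimal) -> Transaction:
    """Withdraw funds; a debit is stored as a negative amount."""
    return credit(session, account_id, -amount)


def get_balance(session: Session, account_id: int) -> Decimal:
    return session.get(Account, account_id).balance
```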
Although the snippets might seem verbose, we try to instill good programming habits by including typing information and making sure that the code conforms to PEP8 standards.
Now, let’s get on to testing them.
Scenario 1 — Using in-memory SQLite
Whilst most articles on testing lean heavily on mock objects to mock away interaction with databases, the completeness of the tests depends on the correctness of that interaction. We can imagine a case whereby an SQL statement, or a query-building expression, contains errors that go undetected. One of the easiest remedies (and of course a main reason why we use SQLAlchemy) is to use fixtures to create an in-memory version of the database and insert test data into it, similar to how we inserted test data in the previous examples.
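A sketch of what those fixtures could look like, again with a condensed stand-in model so it runs standalone (the fixture names session and seeded_session are assumptions):

```python
import pytest
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class User(Base):
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False)


@pytest.fixture
def session():
    # "sqlite://" with no path gives a throwaway in-memory database.
    engine = create_engine("sqlite://")
    Base.metadata.create_all(engine)
    with Session(engine) as s:
        yield s
    engine.dispose()


@pytest.fixture
def seeded_session(session):
    # A fixture built on top of another fixture: insert the test data.
    session.add_all([User(name="alice"), User(name="bob")])
    session.commit()
    return session


def test_user_count(seeded_session):
    assert seeded_session.query(User).count() == 2
```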
A few things to note here:
- We use one fixture to set up the database, and another to insert the test data.
- Within the core functions, it is good practice not to assume or set up a database connection. Rather, we treat the connection (the session in this case) as a reference that can be passed around amongst the business logic functions.
Next we ask ourselves: can we do better? Whilst this approach to testing is very straightforward and does not require many external dependencies, it does suffer from a few shortcomings, namely:
- If we use database-specific features such as JSONB data types, some of the operations cannot be tested.
- Forcing our application structure to be testable on SQLite prevents us from using the more advanced features that some databases provide.
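To make the first limitation concrete: a PostgreSQL-only column type cannot even be rendered by SQLite's DDL compiler, so the table creation itself fails (the Event model here is hypothetical):

```python
from sqlalchemy import Column, Integer, create_engine
from sqlalchemy.dialects.postgresql import JSONB
from sqlalchemy.exc import CompileError
from sqlalchemy.orm import declarative_base

Base = declarative_base()


class Event(Base):
    __tablename__ = "events"
    id = Column(Integer, primary_key=True)
    payload = Column(JSONB)  # PostgreSQL-only type


# Attempting to create this table on SQLite fails at DDL compile time.
engine = create_engine("sqlite://")
try:
    Base.metadata.create_all(engine)
except CompileError as exc:
    print("SQLite cannot render JSONB:", exc)
```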
Scenario 2 — Testing with pytest-postgresql
There is a plethora of database fixture libraries that allow one to temporarily spin up the database of choice (provided you have the required executable). Whether by design or coincidence, most of them share similar interface signatures.
Here we are going to use pytest-postgresql, which is a pytest plugin, to set up the test database. So let’s get started!
We need to install a working version of pg_ctl. One of the easiest ways is to use Anaconda (https://anaconda.org), like so:
conda install -c anaconda postgresql
First we define the scope in which the fixture is going to be valid. By default, factories.postgresql_proc() will create a session-scoped fixture. We can put it in conftest.py so that it can be shared across multiple test modules. Thereafter we create a function-scoped fixture using the call to factories.postgresql().
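A sketch of that conftest.py, based on the pytest-postgresql factories API (running it requires a local PostgreSQL installation):

```python
# conftest.py (sketch)
from pytest_postgresql import factories

# Session-scoped fixture: starts one PostgreSQL process for the whole run.
postgresql_proc = factories.postgresql_proc()

# Function-scoped fixture: hands each test a fresh client connection to a
# clean database on that process, referenced by the fixture name above.
postgresql = factories.postgresql("postgresql_proc")
```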
This plugin was not designed with SQLAlchemy in mind, so we have to perform some coding gymnastics. The reason we want to use SQLAlchemy is so that we can use its object models to create the tables. To do so, we need to get the underlying connection and pass it to the engine. This is made possible by the inner function dbcreator(), which returns a connection. We then pass it to create_engine(), which creates the engine with which we can automatically generate all the DDL statements to create the tables during testing.
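Putting that together, the engine fixture could look something like the following sketch. It assumes the psycopg2 driver, a models module exposing the declarative Base, and a local PostgreSQL installation; the fixture name engine is an assumption:

```python
import pytest
from sqlalchemy import create_engine

from models import Base  # hypothetical module holding the declarative Base


@pytest.fixture
def engine(postgresql):
    # pytest-postgresql's `postgresql` fixture yields a live DB-API connection.
    def dbcreator():
        # Hand SQLAlchemy the connection that the plugin already opened.
        return postgresql.cursor().connection

    engine = create_engine("postgresql+psycopg2://", creator=dbcreator)
    # With an engine in hand, SQLAlchemy can emit all the DDL for our models.
    Base.metadata.create_all(engine)
    yield engine
    engine.dispose()
```

The `creator` argument of create_engine() lets us bypass the usual URL-based connection logic entirely, which is what makes this pairing possible.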
With all the pieces in place, we then insert all the required test data in yet another fixture, and run the same set of tests as in the SQLite scenario. This time, however, we can test with a higher level of confidence.
So that’s it! We have briefly gone through the concept of fixtures and introduced the pytest-postgresql plugin for more realistic testing of a database-driven application. For most other databases or messaging frameworks, there will most likely be an equivalent plugin:
- pytest-mysql
- pytest-redis
- pytest-rabbitmq
- … and more
Coincidentally, they are all created with similar interfaces, such that setting them up follows roughly the same process.
Hopefully this article gives readers a clearer understanding of how to better test database-driven applications, leading to more robust and reliable software.
Given the ubiquity of Python and pytest, if you find that a plugin for a database you want to test against is missing, you can either write one (or contact me), or appeal to the general Python community, and I’m sure a usable test plugin will be created in no time.