Python & Django
Django toolkit for Python stored procedures in PostgreSQL
PostgreSQL is one of the most advanced relational databases, covering a huge spectrum of tasks and showing outstanding performance. The PL/Python extension allows writing stored procedures in Python.
Django integrates with PostgreSQL through its contrib module and supports many of its modern features, but not PL/Python. My latest library, django-plpy, closes this gap.
About PL/Python and its features
PL/Python allows you to create fully functional stored procedures in Python instead of PostgreSQL's native procedural language, PL/pgSQL. The Python version used is determined by the PostgreSQL build.
Here is a simple example of a PL/Python function:
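For instance, a function returning the larger of two integers can be defined like this (function and argument names are illustrative):

```sql
CREATE FUNCTION pl_max (a integer, b integer)
    RETURNS integer
AS $$
    if a > b:
        return a
    return b
$$ LANGUAGE plpython3u;
```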
and a trigger:
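In a PL/Python trigger, the row data arrives in the global TD dictionary; a sketch (table and column names are illustrative):

```sql
CREATE FUNCTION pl_update_name() RETURNS trigger AS $$
    # TD holds the trigger data; modify the new row before it is written
    TD["new"]["name"] = TD["new"]["name"] + " (reviewed)"
    return "MODIFY"
$$ LANGUAGE plpython3u;

CREATE TRIGGER pl_update_name_trigger
    BEFORE INSERT ON core_book
    FOR EACH ROW EXECUTE FUNCTION pl_update_name();
```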
Maintaining PL/Python functions with Django in raw SQL can become quite annoying: although the Django ORM integrates deeply with the database, stored procedures and triggers were never really favored by Django. One possible reason is the separation of concerns: the database is responsible for data storage, the web application for logic. Django-plpy makes it possible to keep the logic, at least cosmetically, in the database, while still testing the stored procedures without one. Let's dive in.
The main workflow for bringing python functions to the database is to decorate them with @plpython and run the manage.py command syncfunctions to install them. Full type annotation is needed so that the arguments can be automatically mapped to the corresponding Postgres types in the function signature.
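A minimal sketch (the decorator's import path is an assumption; a no-op fallback keeps the snippet runnable even without django-plpy installed):

```python
try:
    from django_plpy.installer import plpython  # assumed import path
except ImportError:
    def plpython(f):
        return f  # no-op stand-in: the function stays a plain python function

@plpython
def pl_max(a: int, b: int) -> int:
    # full type annotations let django-plpy map the arguments
    # to the corresponding Postgres types: (integer, integer) -> integer
    return a if a > b else b

print(pl_max(10, 20))  # → 20
```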
When you run manage.py syncfunctions, django-plpy wraps the function in a stored procedure and installs it in the database. The function still remains a python function and can easily be unit tested without involving a database. Here is how the function can be used once installed.
Python functions in SQL queries
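Once installed, the function is callable from plain SQL like any stored procedure; e.g., assuming an installed function pl_max(a integer, b integer) that returns the larger argument:

```sql
SELECT pl_max(10, 20);
-- returns 20
```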
Python functions in annotations
Using python functions for custom ORM lookups
Installing python triggers
Triggers are a very powerful mechanism; django-plpy allows you to easily mark a python function as a trigger, so that some logic from your project is directly associated with data-changing events in the database.
Here is an example of a python trigger using the @pltrigger decorator:
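A sketch (the decorator's import path is an assumption; a no-op fallback keeps the snippet runnable without the library):

```python
try:
    from django_plpy.installer import pltrigger  # assumed import path
except ImportError:
    def pltrigger(**kwargs):  # no-op stand-in for running without the library
        def wrapper(f):
            return f
        return wrapper

@pltrigger(event="INSERT", when="BEFORE", table_name="core_book")
def pl_normalize_name(td, plpy):
    # td["new"] holds the row about to be written
    td["new"]["name"] = td["new"]["name"].strip().title()
```

The function stays a plain python callable, so it can be exercised in a unit test with a hand-built td dictionary.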
Using Django models in triggers
The parameters of the @pltrigger decorator declare the trigger parameters, like the event the trigger will bind to and the table name. You can replace table_name with a model name, and the table name will be looked up automatically:
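For example (again with a no-op fallback; the exact parameter names are assumptions based on the decorator described here):

```python
try:
    from django_plpy.installer import pltrigger  # assumed import path
except ImportError:
    def pltrigger(**kwargs):
        def wrapper(f):
            return f
        return wrapper

# model_name instead of table_name: the table is resolved from the model
@pltrigger(event="UPDATE", when="BEFORE", model_name="Book")
def pl_restock(td, plpy):
    if int(td["new"]["amount_stock"]) < 10:
        td["new"]["amount_stock"] = 100
```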
Read more about plpy triggers in the official Postgres documentation: https://www.postgresql.org/docs/13/plpython-database.html.
Using Django ORM in triggers comes at a price, though, so if you want to use them, check out the details of the implementation below.
Bulk operations and triggers, migrations
Python triggers are fully featured Postgres triggers, meaning they fire for every row, unlike Django signals. So if you define a trigger with event="UPDATE" and call a bulk update on a model, the trigger will be called for every row affected by the operation:
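For example, with a trigger bound to UPDATE on a hypothetical Book model, a single bulk call still fires it once per row:

```python
# one UPDATE statement, but the row-level trigger runs FOR EACH affected ROW
Book.objects.filter(amount_stock__lt=10).update(amount_stock=100)
```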
Unlike the code of Django models or signals, triggers will also be called during migrations.
Turning Django signals to PL/Python triggers
Django signals are neither asynchronous nor able to run in another thread or process, yet many developers mistakenly expect them to behave that way. Django signals implement a dispatcher-receiver pattern and only create an impression of asynchronous execution, which often leads to callback hell and a complex execution flow.
With django-plpy, you can quickly turn your signals into triggers and make them truly asynchronous.
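A sketch of the move (names, the import path, and the plpy.notice call are illustrative; a no-op fallback keeps the snippet runnable):

```python
# before: a signal receiver, executed synchronously in the web process
# (connected elsewhere via post_save.connect(log_price, sender=Product))
def log_price(sender, instance, **kwargs):
    print(f"price is now {instance.price}")

try:
    from django_plpy.installer import pltrigger  # assumed import path
except ImportError:
    def pltrigger(**kwargs):
        def wrapper(f):
            return f
        return wrapper

# after: the same logic as a PL/Python trigger, executed inside Postgres
@pltrigger(event="UPDATE", when="AFTER", table_name="shop_product")
def log_price_trigger(td, plpy):
    plpy.notice(f"price is now {td['new']['price']}")
```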
On the way there
How the code is installed
Django-plpy copies the function code, wraps it in a PL/Python stored procedure or trigger, and then installs it with manage.py syncfunctions. If you use Django models, the database needs access to your project files and virtualenv (see below); if you build your own database docker image, it has to be provisioned accordingly. This scenario is quite exotic and comes with certain risks.
Using Django in PL functions and triggers
While installing with syncfunctions, the source code of the function is copied into a corresponding stored procedure and installed in Postgres. Your local context is therefore not available to the functions, which means that no models or libraries can be used within the transferred functions.
To solve this problem, you need to set up your python project and environment within a Postgres python interpreter. Django-plpy supports the following two scenarios of how you use your database.
Database setup and configuration
Database and application are on the same host
Rarely used nowadays, but still out there, this scenario is the simplest for sharing the environment. Django-plpy creates the stored procedures and transfers the necessary configuration to the database:
- secrets and database access credentials
- the path to the python env (defaults to distutils.sysconfig.get_python_lib(); for more configuration see below)
- loading of the Django applications, the way manage.py does it
Database is in a separate docker container
A more common production scenario is that the database runs in a separate docker container.
The official Postgres image doesn't support the PL/Python extension out of the box, so if you want to use it, you either need to create your own image or use the one I created: thorinschiffer/postgres-plpython.
All the images provide python 3.7, because Postgres uses the default python environment of the OS the image is based on, and 3.7 is the standard for Debian Buster.
Using django-plpy with dockerized Postgres
To make the code available to the Postgres python interpreter, it has to appear inside the docker container somehow. You can either provision the image with it at build time, if you write your own image/dockerfile, or share the code using volumes.
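For example, mounting both paths into the database container (paths are illustrative; adjust to your layout):

```shell
# mount the environment and the project into the database container
docker run -d --name postgres-plpython \
  -v "$(pwd)/site-packages:/env" \
  -v "$(pwd):/app" \
  thorinschiffer/postgres-plpython
```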
Once the code and environment exist somewhere within the docker container, django-plpy can be told to use them. If your environment lives under /env (copy the site-packages folder to this path) and your app under /app, add the following to your Django settings:

```python
PLPY_ENV_PATHS = ["/env"]
PLPY_PROJECT_PATH = "/app"
```
AWS RDS and other managed databases
In the times of SaaS, databases rarely run in a docker container; much more frequently they are consumed as a managed database like AWS RDS. In this case, django-plpy can only install simple functions and triggers, because there is no access to the database's file system in such a setup.
Besides that, some managed databases won’t give you superuser rights, meaning installing extensions in such a scenario will be troublesome.
An accidental call to Model.save in a trigger function can create a really nasty situation: the trigger function connects to the database over the network using psycopg, which creates a new transaction, fires the trigger again, and repeats the whole cycle indefinitely. This hidden recursion can overwhelm both the database and the application server.
The same effect can be caused by any write operation performed from within the trigger. Therefore, you should avoid ORM write methods in trigger functions.
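The anti-pattern looks innocent (a sketch; import paths and model are hypothetical):

```python
from django_plpy.installer import pltrigger  # assumed import path
from shop.models import Product              # hypothetical model

@pltrigger(event="UPDATE", when="AFTER", model_name="Product")
def pl_touch(td, plpy):
    product = Product.objects.get(pk=td["new"]["id"])
    # DON'T: save() opens a new connection from inside Postgres and
    # issues another UPDATE, which fires this very trigger again
    product.save()
```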
Installation and documentation
Full documentation and installation instructions can be found here. Check it out for your project, and you are welcome to contribute.
If you liked this, check out my recent articles:
🇩🇪 Employment in Germany: the ultimate guide
🕒 Why your software quality degrades with time: short story