Python & Django

Django-plpy

Django toolkit for Python stored procedures in PostgreSQL

Published in Handmade Software · 5 min read · Jan 28, 2022


PostgreSQL is one of the most advanced of the good old SQL databases, covering a huge spectrum of tasks and showing outstanding performance. The PL/Python extension allows writing stored procedures in Python.

Django integrates with PostgreSQL through its contrib module (django.contrib.postgres) and supports many of those modern features, but not PL/Python. My latest library, django-plpy, closes this gap.

TL;DR: GITHUB

Looking for some Python or JS freelance gig, rates up to 50 €/hour? Please don’t hesitate to contact me on Twitter or Linkedin with your CV, availability and hourly rate. (CET +/- 3 hours).

About PL/Python and its features

PL/Python allows you to create fully functional stored procedures in Python instead of PostgreSQL's native procedural language, PL/pgSQL. The Python version used is determined by the PostgreSQL build.

Here is a simple example of a PL/Python function:
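A minimal sketch of what such a function looks like (the function and argument names are illustrative):

```sql
CREATE FUNCTION pl_max (a integer, b integer)
  RETURNS integer
AS $$
  # The body between the dollar quotes is plain Python
  if a > b:
      return a
  return b
$$ LANGUAGE plpython3u;
```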

and a trigger:
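A sketch of a trigger, again with illustrative names; `TD` is the trigger dictionary PL/Python exposes to the function body:

```sql
CREATE FUNCTION pl_rename() RETURNS trigger
AS $$
  # TD["new"] is a dict with the incoming row's column values
  TD["new"]["name"] = TD["new"]["name"] + "_copy"
  return "MODIFY"
$$ LANGUAGE plpython3u;

CREATE TRIGGER book_insert_trigger
  BEFORE INSERT ON myapp_book
  FOR EACH ROW EXECUTE FUNCTION pl_rename();
```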

Maintaining PL/Python functions from Django in raw SQL can become quite annoying: although the Django ORM integrates deeply with the database, stored procedures and triggers were never really favored by Django. One possible reason is the separation of concerns: the database is responsible for data storage, the web application for logic. Django-plpy makes it possible to test stored procedures without a database, while the logic, at least cosmetically, remains in the database. Let's dive in.

Using django-plpy

The main workflow for bringing Python functions to the database is to decorate them with @plpython and run the manage.py command syncfunctions to install them. Full type annotation is needed so that the arguments can be automatically mapped to the corresponding Postgres types in the function signature.

After you execute manage.py syncfunctions, django-plpy wraps the function in a stored procedure and installs it in the database. The function still remains a plain Python function and can be easily unit tested without involving a database. Here is how the function can be used once installed.
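A sketch of this workflow. The import path is hypothetical and the decorator lines are commented out so the snippet runs standalone; exact names may differ in the released package:

```python
# from django_plpy.installer import plpython  # hypothetical import path

# @plpython
def pl_max(a: int, b: int) -> int:
    # Full type annotations let django-plpy map the arguments to the
    # corresponding Postgres types (here: integer, integer, returns integer)
    return a if a >= b else b

# The decorated object stays a plain Python function, so it can be
# unit tested without a database:
assert pl_max(10, 20) == 20
```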

Python functions in SQL queries
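Once installed, the function is available to any SQL query; assuming a hypothetical `pl_max(a integer, b integer)` function and an illustrative `myapp_book` table:

```sql
SELECT pl_max(10, 20);

SELECT id, pl_max(rating, 3) FROM myapp_book;
```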

Python functions in annotations

Using python functions for custom ORM lookups

Installing Python triggers

Triggers are a very powerful mechanism; django-plpy allows you to easily mark a Python function as a trigger, so that some logic from your project is directly associated with data-changing events in the database.

Here is an example of a python trigger using the @pltrigger decorator.
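A sketch of such a trigger. The decorator name comes from this article, its import path and exact parameters are assumptions, and the decorator lines are commented out so the snippet runs standalone:

```python
# from django_plpy.installer import pltrigger  # hypothetical import path

# @pltrigger(event="INSERT", when="BEFORE", table="myapp_book")
def pl_update_name(td, plpy):
    # td is the PL/Python trigger dictionary; td["new"] holds the new row
    td["new"]["name"] = td["new"]["name"] + "_triggered"
```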

Using Django models in triggers

The parameters of the @pltrigger decorator declare the trigger parameters, such as the event the trigger binds to and the table name. You can replace table_name with a model name; the table name will then be looked up automatically:
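A sketch of the model-bound variant. The parameter names and the ORM-instance signature are assumptions based on the article's description, and the decorator lines are commented out so the snippet runs standalone:

```python
# from django_plpy.installer import pltrigger
# from myapp.models import Book  # hypothetical model

# @pltrigger(event="UPDATE", when="BEFORE", model=Book)
def pl_normalize_title(new, old, td, plpy):
    # With a model bound, the trigger can work with ORM-style instances
    new.name = new.name.strip().lower()
```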

Read more about plpy triggers in the official Postgres documentation: https://www.postgresql.org/docs/13/plpython-database.html.

Using Django ORM in triggers comes at a price, though, so if you want to use them, check out the details of the implementation below.

Bulk operations and triggers, migrations

Python triggers are fully featured Postgres triggers, meaning they are executed for every affected row, unlike Django signals. So if you define a trigger with event="UPDATE" and call a bulk update on a model, the trigger will be called for every row affected by the operation:

update() calls the trigger for every row
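For example, a single bulk UPDATE statement over N matching rows fires a row-level trigger N times (table and column names are illustrative):

```sql
-- One statement, three matching rows: three trigger invocations
UPDATE myapp_book SET price = 0 WHERE rating < 3;
```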

Unlike the code of Django models or signals, triggers will also be called during migrations.

Turning Django signals to PL/Python triggers

Django signals are neither asynchronous nor able to execute in another thread or process, yet many developers mistakenly expect them to behave this way. This often leads to callback hell and a complex execution flow, because Django signals implement a dispatcher-receiver pattern and merely give the impression of asynchronous execution.

With django-plpy, you can quickly turn your signals into triggers, moving their execution out of the application process and into the database.
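A before/after sketch. The receiver and decorator lines are commented out so the snippet runs standalone; the django-plpy import path and decorator parameters are assumptions:

```python
# Before: a Django signal receiver, which runs synchronously in the
# web process as part of save():
# from django.db.models.signals import post_save
# from django.dispatch import receiver

# @receiver(post_save, sender=Book)  # hypothetical Book model
def backfill_slug(sender, instance, **kwargs):
    instance.slug = instance.name.lower().replace(" ", "-")

# After: the same logic as a trigger, executed by Postgres itself:
# from django_plpy.installer import pltrigger

# @pltrigger(event="INSERT", when="BEFORE", table="myapp_book")
def backfill_slug_trigger(td, plpy):
    td["new"]["slug"] = td["new"]["name"].lower().replace(" ", "-")
```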

On the way there

How the code is installed

Django-plpy copies the function code, wraps it in a PL/Python stored procedure or trigger, and installs it with manage.py syncfunctions. If you use Django models, the database host needs access to your project files and virtualenv (see below); if you build your own database Docker image, it has to be provisioned accordingly. This scenario is quite exotic and comes with certain risks.

Using Django in PL functions and triggers

During installation with syncfunctions, the source code of the function is copied into a corresponding stored procedure and installed in Postgres. This means your local context is not available to the function: no models or libraries can be used within the transferred functions.

To solve this problem, you need to set up your Python project and environment within the Postgres Python interpreter. Django-plpy supports the following two scenarios of database deployment.

Database setup and configuration

Database and application are on the same host

Rarely used nowadays, but still out there, this scenario is the simplest for environment sharing. Django-plpy creates the stored procedures and transfers the necessary configuration to the database:

  • secrets and database access credentials
  • the path to the Python env (defaults to distutils.sysconfig.get_python_lib(); for more configuration see below)
  • loading of the Django applications, the way manage.py does it

Database is in a separate docker container

A more common production scenario is that the database runs in a separate Docker container.

The official Postgres image doesn't ship the PL/Python extension out of the box, so if you want to use it you need to build your own image or use the one I created: thorinschiffer/postgres-plpython.

All the images provide Python 3.7, because Postgres uses the default Python environment of the OS the image is based on, and 3.7 is the standard for Debian Buster.

Using django-plpy with dockerized Postgres

To make the code available to the Postgres Python interpreter, it has to somehow appear within the Docker container. You can either provision the image with it at build time, if you write your own Dockerfile, or share the code using volumes.

Once the code and environment exist somewhere within the Docker container, django-plpy can be told to use them. So if your environment lives under /env (copy the site-packages folder to this path) and your app under /app, add the following settings to your settings.py:

PLPY_ENV_PATHS = ["/env"]
PLPY_PROJECT_PATH = "/app"

AWS RDS and other managed databases

In the times of SaaS, databases rarely run in a Docker container you control; far more often they are managed services like AWS RDS. In this case, django-plpy can only install simple functions and triggers, because there is no access to the database's file system in such a setup.

Besides that, some managed databases won't give you superuser rights, so installing extensions in such a scenario will be troublesome.

Hidden recursion

An accidental call to Model.save() in a trigger function can create a really nasty situation: the trigger function connects back to the same database over the network using psycopg, which opens a new transaction that repeats the whole cycle, indefinitely. This hidden recursion can overwhelm both the database and the application server.

The same effect can be triggered by any write operation performed from within the trigger. Therefore, avoid using any ORM write methods in trigger functions.
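An anti-pattern sketch of this trap; the decorator lines are commented out so the snippet runs standalone, and the model and names are hypothetical:

```python
# from django_plpy.installer import pltrigger

# @pltrigger(event="UPDATE", when="AFTER", model=Book)
def bad_trigger(new, old, td, plpy):
    # DON'T: save() opens a new connection back into the same database,
    # issues another UPDATE, and fires this very trigger again,
    # producing hidden, unbounded recursion.
    new.save()
```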

Installation and documentation

Full documentation and installation instructions can be found here. Check it out for your project, and you are welcome to contribute.

=====================================

If you liked this, check out my recent articles:

🇩🇪 Employment in Germany: the ultimate guide

🕒 Why your software quality degrades with time: short story

😃 RapidAPI: first steps with Python
