Breaking Free From the ORM: Replacing Connection Handling

Handle performant connections to the database

Omar Rayward
Building the system
3 min readAug 24, 2018

--

TL;DR, Understand the threading model running your program when reusing database connections.

In previous articles, we looked at why object relational mapping is a detrimental construct, and how to start breaking free of the ORM by first decoupling database migrations.

An ORM usually combines three competencies in one. So, when we remove the ORM, we have to replace:

  1. Database migrations
  2. Database connection handling
  3. Database queries

In this article, we’ll explore how to handle connecting to the database. In the next and final article, we’ll look at how to implement database queries without using the ORM.

Handling Database Connection

The scope of this article isn’t to explain how to connect from every potential application to every potential database. Here, we are only exploring how to connect from a Python program to a PostgreSQL database.

No matter how we connect to the database, we always have to take into consideration two things:

1. What performance bottlenecks are there when connecting from our program to the database?

In the case of PostgreSQL, one bottleneck is establishing a connection to the database. Since establishing the connection takes so much time, we want to reuse connections as much as possible. That way, we don’t have to open a new connection with every database query.

2. Does the threading model running our program affect performance when querying the database?

The processes running our programs can be single-threaded or multi-threaded.

Single Threaded

In this case, each process starts a different instance of the application. Each instance of the application opens a database connection, and this will be reused for different requests. There will never be concurrent requests since each process is running on a single thread.

Multiple Threads

In this case, each process has multiple threads, therefore, some code may be shared between threads. If the database connection is shared, then database queries will slow down since every query will go through the same connection.

For this, it’s useful to have a pool of database connections, and just use one of them for each transaction. Once each database transaction is finished, the connection is placed back to the pool. This way, even though the pool is shared, there’s no database connection shared between threads.

Here is some code:

# db_connection.py
from psycopg2.pool import ThreadedConnectionPool
from contextlib import contextmanager
min_connections = 1
max_connections = 10
dsn = ‘postgresql://...’
psql_connection_pool = ThreadedConnectionPool(min_connections, max_connection, dsn=dsn)
@contextmanager
def connection():
try:
connection = psql_connection_pool.getconn()
yield connection
finally:
psql_connection_pool.putconn(connection)
# feature.py
from db_connection import connection
def format_user(user):
return # do something
with connection() as conn:
with conn.cursor() as cursor:
cursor.execute("""SELECT * FROM users""")
users = [format_user(x) for x in cursor.fetchall()]
return users

Building the system

In this third article of a four-part series, we’ve covered the need to understand how we connect to our database. This part of the program is usually hidden deep in the guts of the ORM.

To performantly connect to the database, we need to reuse the connection between transactions, without using that same connection in different threads at the same time.

Once we do that, we’ll have a better understanding of the system that we’re creating, and how its different parts — in this case, the application and database — communicate through a common interface.

Next, we’ll cover practical advice on how to deal with database queries to break free from ORM once and for all.

--

--