How does Django manage DB connections?

Jordan Bae
Django Unleashed
Published in
9 min readJan 20, 2024
Django

❗️ This article is based on Django 5.0. The sections where I analyzed the internal code are my interpretations, and there may be misunderstandings

How does Django creates DB connection

“One of the inconvenient aspects of switching from Django to another framework is having to explicitly manage the DB connection.

For example, when using Flask with SQLAlchemy, you need to create and manage the database connection as follows.”

app.config['SQLALCHEMY_DATABASE_URI'] = 'sqlite:///example.db'
db = SQLAlchemy(app)

# add user API
@app.route('/user', methods=['POST'])
def add_user():
data = request.get_json()
new_user = User(name=data['name'])
db.session.add(new_user)
db.session.commit()
return jsonify({'message': 'User added successfully'}), 201

“As you can see in this code, when using SQLAlchemy, you need to explicitly create a database session, commit and close the session after performing operations. This provides developers with clear control over database connection management, but at the same time, it has the downside of requiring additional code writing.

In contrast, in Django, the creation and management of database connections are handled internally and automatically. When using Django’s ORM and executing a QuerySet, Django automatically creates a database connection, and the Old Connection is closed when the HTTP request starts and ends (I will explain more about this below). This relieves developers from worrying about creating and managing database connections.

The way Django automatically manages database connections greatly simplifies development, but it also raises curiosity about how this process actually works. Understanding in detail how Django’s database connection management system is structured and how it operates efficiently can be helpful in gaining a deeper understanding of Django! In this article, we’ll examine how Django internally creates and terminates DB Connections in its internal code, and share settings to improve application performance by reusing DB connections.

Let’s now take a closer look at how DB Connections are created in Django!”

Misunderstandings About Django Connection Creation

Django connections are actually established inside the worker thread only when the first query is executed in a situation where there is no connection. Often, people misunderstand that a connection is made for each HTTP request because it is sometimes written in Django documentation, but more accurately, the connection is created not when the HTTP request arrives, but when SQL is executed by QuerySet in the Request’s Handler (View). (That is, if there is no code that executes a query in Middleware or Handler, a connection is not created.)

Persistent connections avoid the overhead of reestablishing a connection to the database in each HTTP request. They’re controlled by the CONN_MAX_AGE parameter which defines the maximum lifetime of a connection. It can be set independently for each database.

! Of course, as mentioned above, even if there is no direct query in the view, if DB lookup is required for authentication or if ATOMIC_REQUESTS is set to True, a DB connection may be created for each request to create a transaction.

FYI) If you want to test this aspect, you should use gunicorn to run the wsgi.py file, not manage.py runserver. This is because the operating mechanism of runserver works differently

Creation of DB Connection in Django

Yes, to reiterate, the DB Connection is established when the first query is executed in a situation where there is no connection. As an example, when the result of a QuerySet needs to return the DB Select result as an object, the connection is created at the time of executing the SQL if there is no existing connection.

Let’s verify this through the code!

Let’s suppose we execute len(User.objects.all()) .

  1. The above code calls the __len__() magic method of the QuerySet class, which in turn calls the _fetch_all() method, and then it invokes self._iterable_class (which is ModelIterable).
# db/models/query.py
class QuerySet(AltersData):
"""Represent a lazy database lookup for a set of objects."""

def __init__(self, model=None, query=None, using=None, hints=None):
self.model = model
self._db = using
self._hints = hints or {}
self._query = query or sql.Query(self.model)
self._result_cache = None
self._sticky_filter = False
self._for_write = False
self._prefetch_related_lookups = ()
self._prefetch_done = False
self._known_related_objects = {} # {rel_field: {pk: rel_obj}}
# _fetch_all() 에서 return하는 속성
self._iterable_class = ModelIterable
self._fields = None
self._defer_next_filter = False
self._deferred_filter = None
....
def __len__(self):
self._fetch_all()
return len(self._result_cache)

def _fetch_all(self):
if self._result_cache is None:
self._result_cache = list(self._iterable_class(self))
if self._prefetch_related_lookups and not self._prefetch_done:
self._prefetch_related_objects()

2. The self._fetch_all() method triggers the invocation of the __iter__ magic method of the ModelIterable class. This step is responsible for generating and actually executing the SQL.

# db/models/query.py
class ModelIterable(BaseIterable):
"""Iterable that yields a model instance for each row."""

def __iter__(self):
queryset = self.queryset
db = queryset.db
compiler = queryset.query.get_compiler(using=db)
# Execute the query. This will also fill compiler.select, klass_info,
# and annotations.
results = compiler.execute_sql(
chunked_fetch=self.chunked_fetch, chunk_size=self.chunk_size
)

3. Inside `compiler.execute_sql()`, the `BaseDatabaseWrapper.cursor()` is called, and within this, `BaseDatabaseWrapper.connect()` is invoked. This is the part that actually creates the DB Connection. Let’s take a closer look at this part.

  • The `BaseDatabaseWrapper.cursor()` method called inside `execute_sql()` is where a cursor is created to execute queries in the database. This method calls `ensure_connection`.
def execute_sql(
self, result_type=MULTI, chunked_fetch=False, chunk_size=GET_ITERATOR_CHUNK_SIZE
):
"""
...
"""
result_type = result_type or NO_RESULTS
try:
sql, params = self.as_sql()
if not sql:
raise EmptyResultSet
except EmptyResultSet:
if result_type == MULTI:
return iter([])
else:
return
if chunked_fetch:
cursor = self.connection.chunked_cursor()
else:
cursor = self.connection.cursor()
.....
  • The BaseDatabaseWrapper.ensure_connection() method checks if the current database connection is valid and, if the connection is missing or closed, it calls connect. Please remember this part well!!
class BaseDatabaseWrapper:
@async_unsafe
def cursor(self):
"""Create a cursor, opening a connection if necessary."""
return self._cursor()

def _cursor(self, name=None):
self.close_if_health_check_failed()
self.ensure_connection()
with self.wrap_database_errors:
return self._prepare_cursor(self.create_cursor(name))
...
@async_unsafe
def ensure_connection(self):
"""Guarantee that a connection to the database is established."""
if self.connection is None:
with self.wrap_database_errors:
self.connect()
  • The `BaseDatabaseWrapper.connect()` method is what actually establishes the database connection.
class BaseDatabaseWrapper:
...
@async_unsafe
def connect(self):
"""Connect to the database. Assume that the connection is closed."""
# Check for invalid configurations.
self.check_settings()
# In case the previous connection was closed while in an atomic block
self.in_atomic_block = False
self.savepoint_ids = []
self.atomic_blocks = []
self.needs_rollback = False
# Reset parameters defining when to close/health-check the connection.
self.health_check_enabled = self.settings_dict["CONN_HEALTH_CHECKS"]
max_age = self.settings_dict["CONN_MAX_AGE"]
self.close_at = None if max_age is None else time.monotonic() + max_age
self.closed_in_transaction = False
self.errors_occurred = False
# New connections are healthy.
self.health_check_done = True
# Establish the connection
conn_params = self.get_connection_params()
self.connection = self.get_new_connection(conn_params)
self.set_autocommit(self.settings_dict["AUTOCOMMIT"])
self.init_connection_state()
connection_created.send(sender=self.__class__, connection=self)

self.run_on_commit = []

If we delve a bit deeper, the DB connection is created in the get_new_connection method, which is implemented in the respective DB (MySQL, Postgres, etc.) Wrapper classes.

To summarize once more, when our created QuerySet executes SQL, if there is no connection, it is created at that moment. Now we have understood how DB connections are created in Django. Next, let’s check when these DB connections are closed.

When Do DB Connections Close?

“We have seen that in Django, a connection is created in the Thread’s local object when the actual SQL is executed and there is no connection. So, when does the DB Connection close?

If the Django developer does not explicitly close the connection, Django terminates the Old Connection at the start and end of an HTTP Request.

If we look at `django/db/__init__.py`, we can find the following code.”

# For backwards compatibility. Prefer connections['default'] instead.
connection = ConnectionProxy(connections, DEFAULT_DB_ALIAS)


......

# Register an event to reset transaction state and close connections past
# their lifetime.
def close_old_connections(**kwargs):
for conn in connections.all(initialized_only=True):
conn.close_if_unusable_or_obsolete()


signals.request_started.connect(close_old_connections)
signals.request_finished.connect(close_old_connections)

Django executes the function `close_old_connections` both before starting and after completing a view handler for an HTTP request.

The `close_old_connections` function is responsible for closing Old connections that have been created. In detail, it closes connections in cases like the following. In other words, Old connections can be considered as belonging to the following cases:

  • When the current session’s autocommit setting is different from the global setting.
    - When an error has occurred and `is_usable` is False.
    - When `self.close_at` exists and is less than the current time (this part will be covered in more detail in the following section).
# db/backends/base/base.py

class BaseDatabaseWrapper:
...
def close_if_unusable_or_obsolete(self):
"""
Close the current connection if unrecoverable errors have occurred
or if it outlived its maximum age.
"""
if self.connection is not None:
self.health_check_done = False
# If the application didn't restore the original autocommit setting,
# don't take chances, drop the connection.
if self.get_autocommit() != self.settings_dict["AUTOCOMMIT"]:
self.close()
return

# If an exception other than DataError or IntegrityError occurred
# since the last commit / rollback, check if the connection works.
if self.errors_occurred:
if self.is_usable():
self.errors_occurred = False
self.health_check_done = True
else:
self.close()
return

if self.close_at is not None and time.monotonic() >= self.close_at:
self.close()
return

“As we have seen, Django terminates old connections at the start and end of an HTTP request, preventing the reuse of already closed connections.”

Reusing DB Connections with CONN_MAX_AGE Setting!

To summarize the above content, Django creates a connection when executing SQL if there is no existing connection. We confirmed this in the `BaseDatabaseWrapper.ensure_connection()`.

Therefore, in Django, if a DB Connection exists, it can be reused, which means that if the connection is not closed, it can be reused.

Conclusion: We can reuse DB Connections by adjusting the part where the connection is closed. Looking once again at the code for creating a connection and closing old connections, it is as follows:

If `self.close_at` is greater than the current time, we can reuse the DB Connection. The `self.close_at` is set to the current time plus `CONN_MAX_AGE`! In other words, we can reuse the DB Connection we want through the `CONN_MAX_AGE` setting.

class BaseDatabaseWrapper:
@async_unsafe
def connect(self):
"""Connect to the database. Assume that the connection is closed."""
# Check for invalid configurations.
self.check_settings()
# In case the previous connection was closed while in an atomic block
self.in_atomic_block = False
self.savepoint_ids = []
self.atomic_blocks = []
self.needs_rollback = False
# Reset parameters defining when to close/health-check the connection.
self.health_check_enabled = self.settings_dict["CONN_HEALTH_CHECKS"]
# 이 부분을 잘 봐주세요!
max_age = self.settings_dict["CONN_MAX_AGE"]
self.close_at = None if max_age is None else time.monotonic() + max_age

def close_if_unusable_or_obsolete(self):
"""
Close the current connection if unrecoverable errors have occurred
or if it outlived its maximum age.
"""
if self.connection is not None:
self.health_check_done = False
# If the application didn't restore the original autocommit setting,
# don't take chances, drop the connection.
if self.get_autocommit() != self.settings_dict["AUTOCOMMIT"]:
self.close()
return

# If an exception other than DataError or IntegrityError occurred
# since the last commit / rollback, check if the connection works.
if self.errors_occurred:
if self.is_usable():
self.errors_occurred = False
self.health_check_done = True
else:
self.close()
return

if self.close_at is not None and time.monotonic() >= self.close_at:
self.close()
return

In summary, you can set `CONN_MAX_AGE` to define the maximum lifespan of a connection for each database. When `CONN_MAX_AGE` is set, Django will close the connection at the start of each request if the connection has reached its maximum lifespan.

The reason to set `CONN_MAX_AGE` is that databases can terminate inactive connections after a certain period (for MySQL, this is the `wait_timeout`). When the database terminates a connection, Django may not be aware of this, and if a query is executed, a ‘Lost Connection’ error can occur. To prevent Django from attempting to use a connection that has been terminated by the database server, `CONN_MAX_AGE` should be set to a value lower than the database’s `wait_timeout`. (However, this issue may only affect sites with very low traffic, and it would impact at most one request per worker thread.)

“The MySQL `wait_timeout` defines how long to keep MySQL client connections active. If no client sends a request to the server within the specified time (in seconds) set by the `wait_timeout` value, the server will terminate that client connection.”

Ending

In this article, we explored how Django internally manages DB Connections. If there are any parts of the content that you think I may have misunderstood, please feel free to let me know through comments or email. It seems that there are fewer Django-related articles and use cases recently, which I personally find regrettable. I will try to post more content that could be helpful regarding Django from time to time.

Persistent connections avoid the overhead of reestablishing a connection to the database in each HTTP request. They are controlled by the CONN_MAX_AGE parameter, which defines the maximum lifetime of a connection. It can be set independently for each database.
! Of course, as mentioned above, even if there is no direct query in the view, if DB lookup is required for authentication or if ATOMIC_REQUESTS is set to True, a DB connection may be created for each request to create a transaction.

--

--