Manager vs Query Sets in Django

In my last two posts I talked about models and business logic in Django projects. In this article I want to talk about the roles of the Manager and the Query Set in the Django ORM.

I’ve seen many projects where queries were scattered all over the place. I’ve also seen projects where queries were concentrated in model managers. And, of course, projects with lots of custom query set classes.

Before we move forward, let me give you a quick explanation about the ORM and its patterns.

Model — It implements the Active Record pattern. In this pattern, an object wraps a row in a database table (or view). The object carries both data and behavior and is, in most cases, persisted to the data storage through Model Managers. Within the ORM design, the model class also carries the metadata used to map query results into Python objects (defined by its Fields), as well as the metadata for the entity that lives in the data storage, such as the table name and primary key. The query set uses this mapping to perform its operations and return one or more instances of the model with the data mapped to the object attributes.

Fields — A field is a representation of a column in your data storage. It contains all the storage constraints, such as “null”, “length”, “type”, and “unique”.

Fields implement the Value Object pattern; therefore, they have no meaning by themselves. However, one kind of field implements a different set of patterns: the relational fields.

The relational fields are ForeignKey, OneToOneField, and ManyToManyField. They implement the Foreign Key Mapping, Dependent Mapping, and Association Table Mapping patterns, respectively. They are responsible for defining the metadata needed for the data mapping that will be done by Query Sets.

Manager — According to Django documentation, a manager is the interface through which database query operations are provided to models.

It partially implements the Table Data Gateway pattern. However, Managers delegate the execution of queries to Query Sets.

I always see Managers more as a “Facade” over the complexity of the operations on your data storage. Another approach I like is to use them as Repositories in some projects (I will talk more about that in another post).

Whenever you need to perform a query, insert data, or manipulate existing data, the manager is a powerful way to simplify that process for the rest of your application.

Query Sets — Query sets are the final frontier between your domain and your data storage. They implement the Query Object pattern (as named by Martin Fowler). Query sets allow you to build queries with Python objects and use different backends to convert those queries into the actual SQL that will be executed against the data storage.
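
To make the Query Object idea concrete, here is a minimal sketch in plain Python. It is a toy and not how Django builds SQL: the `Query` class, its `where` and `to_sql` methods, and the `orders` table name are all made up for illustration. The point is only that the query is described as data first and rendered to SQL later.

```python
class Query:
    """Toy Query Object (hypothetical, not Django's): a query described as data."""

    def __init__(self, table, conditions=()):
        self.table = table
        self.conditions = tuple(conditions)

    def where(self, column, value):
        # Return a new object instead of mutating, so queries can be reused safely.
        return Query(self.table, self.conditions + ((column, value),))

    def to_sql(self):
        # Rendering is a separate step; a real backend would own this part.
        sql = f"SELECT * FROM {self.table}"
        if self.conditions:
            sql += " WHERE " + " AND ".join(
                f"{column} = ?" for column, _ in self.conditions
            )
        params = [value for _, value in self.conditions]
        return sql, params


sql, params = Query("orders").where("customer_id", 34).where("company_id", 7).to_sql()
```

Keeping the description separate from the rendering is what lets different backends produce different SQL from the same object.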

Whenever you need to customize the way the SQL query will be built or its response, Query Sets are the right place to do it.

Query sets perform queries using the pattern known as Lazy Load, which means that queries are executed only when the application actually requires the result. You can chain and refine a query set as much as you want before it gets evaluated.
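
The lazy behavior can be illustrated with a small plain-Python stand-in. This is a sketch under assumptions, not Django's QuerySet: `LazyQuery`, its `log` attribute, and the sample rows are hypothetical. Chaining `filter` only records what to do, and the "query" runs once, when the result is iterated.

```python
class LazyQuery:
    """Toy lazily evaluated query (hypothetical stand-in for a query set)."""

    def __init__(self, rows, predicates=(), log=None):
        self._rows = rows
        self._predicates = tuple(predicates)
        # Shared list that records every time the "database" is actually hit.
        self.log = log if log is not None else []

    def filter(self, predicate):
        # Chaining only builds a new description; nothing is executed here.
        return LazyQuery(self._rows, self._predicates + (predicate,), self.log)

    def __iter__(self):
        # Evaluation happens here, when the results are really needed.
        self.log.append("executed")
        for row in self._rows:
            if all(predicate(row) for predicate in self._predicates):
                yield row


rows = [
    {"id": 1, "customer_id": 34},
    {"id": 2, "customer_id": 35},
    {"id": 3, "customer_id": 34},
]
query = (
    LazyQuery(rows)
    .filter(lambda row: row["customer_id"] == 34)
    .filter(lambda row: row["id"] > 1)
)
executions_before = len(query.log)  # nothing has run yet
result = list(query)                # iterating triggers the single execution
executions_after = len(query.log)
```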

Backends — They perform the mapping from Query objects to SQL queries for a particular storage. This is very powerful if you consider that, by implementing this interface, you can talk to pretty much any data storage, from SQL to NoSQL. I’ve rarely seen customizations at this level.

So here is an overview of the components described above.

I guess this already gives you a glimpse of the roles of the Manager and the Query Set. However, one of the most frequent questions I get on projects is whether custom queries should be added to one or the other.

As I mentioned earlier, the manager can be seen as a “Facade” that hides the complexity of the operations you want to perform on your data storage. Every time I have a query to perform with known parameters, I add a manager method that abstracts the creation of the query set. I also work really hard to keep my queries very simple and avoid “leaking” the internals of my models.

But from time to time you have really complex queries in your project: queries that involve annotations or even complex filters that should be known only by a specific domain.

Let’s take a look at a quick example.

In this particular case we have three models: Order, OrderItem, and Product.

Here is the Python/Django representation of them.

from django.db import models

class Product(models.Model):
    name = models.CharField(max_length=90)
    category = models.CharField(max_length=50)
    price = models.DecimalField(max_digits=17, decimal_places=2)

class Order(models.Model):
    number = models.CharField(max_length=30, unique=True)
    created = models.DateTimeField(auto_now_add=True)
    # reference to objects in a different domain
    operator_id = models.IntegerField()
    customer_id = models.IntegerField()

class OrderItem(models.Model):
    order = models.ForeignKey(
        Order,
        related_name='items',
        on_delete=models.CASCADE
    )
    product = models.ForeignKey(
        Product,
        related_name='orders',
        on_delete=models.CASCADE
    )
    quantity = models.DecimalField(max_digits=19, decimal_places=3)
    price = models.DecimalField(max_digits=17, decimal_places=2)
    discount = models.DecimalField(
        max_digits=17,
        decimal_places=2,
        null=True,
        blank=True
    )

I will create a custom manager and add a find method. In this example I avoid raising exceptions, which makes the experience a bit smoother for other use cases that use this manager as a repository. Also, I just like this pattern :)

from typing import Optional
from django.core.exceptions import ObjectDoesNotExist
from django.db import models

class OrderManager(models.Manager):
    def find(self, order_id: int) -> Optional['Order']:
        queryset = self.get_queryset()
        try:
            return queryset.get(pk=order_id)
        except ObjectDoesNotExist:
            # No matching row: return None instead of raising.
            # Any other error still propagates to the caller.
            return None

class Order(models.Model):
    ...
    # Add custom manager
    objects = OrderManager()
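
A small plain-Python sketch shows what the None-returning find buys the calling code. Everything here is hypothetical and stands in for the real manager: `InMemoryOrders` mimics only the `find` contract, and `order_number_or_default` is an invented caller.

```python
from typing import Optional


class InMemoryOrders:
    """Hypothetical stand-in exposing the same find() contract as the manager."""

    def __init__(self, orders):
        self._orders = {order["id"]: order for order in orders}

    def find(self, order_id: int) -> Optional[dict]:
        # Same contract as the manager: the order, or None when it is missing.
        return self._orders.get(order_id)


def order_number_or_default(repository, order_id: int, default: str = "unknown") -> str:
    # The caller handles absence with a plain None check, no try/except needed.
    order = repository.find(order_id)
    return default if order is None else order["number"]


repository = InMemoryOrders([{"id": 1, "number": "A-001"}])
found = order_number_or_default(repository, 1)
missing = order_number_or_default(repository, 99)
```

Because the caller depends only on find, a stand-in like this can replace the manager in tests that should not touch the database.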

Let’s say my application now needs to retrieve all orders for a particular customer. In this case we may need to create a new method.

from django.db.models import QuerySet

class OrderManager(models.Manager):
    ...
    def find_all_for(self, customer_id: int) -> QuerySet:
        queryset = self.get_queryset()
        return queryset.filter(customer_id=customer_id)

Ok, that is pretty simple. But, what if I need to find all orders for a specific product? In this case we need to create another method.

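class OrderManager(models.Manager):
    ...

    def find_all_with_product(self, product_id: int) -> QuerySet:
        queryset = self.get_queryset()
        # distinct(): an order with several items of this product
        # would otherwise appear once per matching item
        return queryset.filter(items__product_id=product_id).distinct()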

Great! It is still simple. Let’s make things a bit harder.

Your product manager wants to add a new feature. For every new purchase, the system now needs to search for previous orders with combined sales, that is, all orders for a particular customer that contain more than one product from the same category.
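
Before writing the query, it can help to pin the rule down for a single order in plain Python. This is only a sketch of the requirement as I read it: the function name `is_combined_sale` and the (product_id, category) pair format are assumptions made for illustration.

```python
from collections import defaultdict


def is_combined_sale(items) -> bool:
    """items: iterable of (product_id, category) pairs for one order (assumed format)."""
    products_by_category = defaultdict(set)
    for product_id, category in items:
        # A set keeps distinct products, so a repeated product is counted once.
        products_by_category[category].add(product_id)
    # Combined sale: some category holds more than one distinct product.
    return any(len(products) > 1 for products in products_by_category.values())


two_books = is_combined_sale([(1, "books"), (2, "books")])
mixed = is_combined_sale([(1, "books"), (2, "toys")])
same_product_twice = is_combined_sale([(1, "books"), (1, "books")])
```

Cases like these make good test fixtures for whatever ORM query ends up implementing the rule. The manager method for this feature follows.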

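from django.db.models import Count

class OrderManager(models.Manager):
    ...
    def find_all_combined_sales_for(
        self,
        customer_id: int
    ) -> QuerySet:
        queryset = self.get_queryset()
        # Note: this counts the category values across the order's
        # items (one per item); it does not group them by category.
        return (
            queryset
            .annotate(
                count_category=Count(
                    'items__product__category'
                )
            )
            .filter(
                customer_id=customer_id,
                count_category__gt=1
            )
        )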

Here’s the important part: your manager is now performing quite a complex query. The manager now "builds" the query that will be executed. A much easier approach is to delegate the complex part to the QuerySet.

from django.db.models import Count

class OrderQuerySet(models.QuerySet):
    def doubled_categories_only(self):
        # Keep only orders whose per-order category count is above 1
        return (
            self
            .annotate(
                count_category=Count(
                    'items__product__category'
                )
            )
            .filter(count_category__gt=1)
        )

Let’s do the refactoring on the Manager now.

class OrderManager(models.Manager):

    def get_queryset(self):
        return OrderQuerySet(
            model=self.model,
            using=self._db,
            hints=self._hints
        )
    ...
    def find_all_combined_sales_for(
        self,
        customer_id: int
    ) -> QuerySet:
        # taking advantage of an existing method
        queryset = self.find_all_for(customer_id)
        return queryset.doubled_categories_only()

Cool! Now we've delegated the complexity of building the query to the query set and still use the manager as a facade to simplify and protect the internals of the domain.

Let’s make it even more interesting now. The CEO of your company wants to make this system multi-tenant!

It means that the system will be sold to other companies and you need to separate orders by company!! OH MY GOD!! What do we do now?

Ok, that’s not a big deal! All you have to do is make your Facade a little bit smarter. Let’s see how we can implement this.

First, I will add the company_id field to the Order model.

class Order(models.Model):
    number = models.CharField(max_length=30, unique=True)
    created = models.DateTimeField(auto_now_add=True)
    # reference to objects in a different domain
    company_id = models.IntegerField()
    operator_id = models.IntegerField()
    customer_id = models.IntegerField()
    objects = OrderManager()

Second, I will refactor the manager, adding an __init__ method that takes company_id as a parameter. I will also make get_queryset aware of this new attribute and create a factory method so that the state of the manager does not have to be changed from outside the class.

class OrderManager(models.Manager):
    def __init__(self, company_id=None, *args, **kwargs):
        self._company_id = company_id
        super().__init__(*args, **kwargs)

    def get_queryset(self) -> OrderQuerySet:
        queryset = OrderQuerySet(
            model=self.model,
            using=self._db,
            hints=self._hints
        )
        # Scope every query to the company when one was given
        if self._company_id is not None:
            queryset = queryset.filter(company_id=self._company_id)
        return queryset

    @classmethod
    def factory(cls, model, company_id=None):
        manager = cls(company_id)
        manager.model = model
        return manager
    ...

Now I will create a factory function that builds a Manager every time we need to set a company as the context of our queries.

def factory_manager_for_company(company_id):
    return OrderManager.factory(model=Order, company_id=company_id)

And the last act is to set the method on the Order model

class Order(models.Model):
    ...
    # staticmethod keeps the function from being bound to an instance
    as_company = staticmethod(factory_manager_for_company)
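
One Python detail is worth knowing when a function is attached to a class like this. A plain function stored as a class attribute works when called on the class, but on an instance it becomes a bound method and receives the instance as its first argument. The sketch below uses made-up names (`scope_for`, `PlainAttr`, `StaticAttr`) and no Django at all; it only demonstrates the binding rule.

```python
def scope_for(company_id):
    # Hypothetical stand-in for a factory function that takes a company id.
    return f"scope:{company_id}"


class PlainAttr:
    as_company = scope_for  # plain function: gets bound on instances


class StaticAttr:
    as_company = staticmethod(scope_for)  # never bound, same call everywhere


on_class_plain = PlainAttr.as_company(7)        # works: no binding on the class
on_class_static = StaticAttr.as_company(7)      # works
on_instance_static = StaticAttr().as_company(7) # works: staticmethod ignores the instance

try:
    # The instance is passed as company_id and 7 becomes an extra argument.
    PlainAttr().as_company(7)
    plain_instance_failed = False
except TypeError:
    plain_instance_failed = True
```

If the factory is only ever called on the model class, both forms behave the same; wrapping it in staticmethod just removes the surprise for instance access.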

Our manager can now operate either within the context of a particular company or across the entire database.

Here's how you use it:

company_id = 7
customer_id = 34
Order.as_company(company_id).find_all_combined_sales_for(customer_id)
# or
Order.objects.find_all_combined_sales_for(customer_id)
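
The reason this works without touching the other manager methods is that they all start from get_queryset. Here is a toy, database-free sketch of that idea; `ToyQuerySet`, `ToyManager`, and the sample rows are hypothetical and only mimic the shape of the Django classes.

```python
class ToyQuerySet:
    """Hypothetical in-memory stand-in for a query set."""

    def __init__(self, rows):
        self._rows = list(rows)

    def filter(self, **conditions):
        # Keep rows whose fields equal every given condition.
        return ToyQuerySet(
            row for row in self._rows
            if all(row[field] == value for field, value in conditions.items())
        )

    def ids(self):
        return [row["id"] for row in self._rows]


class ToyManager:
    """Hypothetical facade: every public method starts from get_queryset()."""

    def __init__(self, rows, company_id=None):
        self._rows = rows
        self._company_id = company_id

    def get_queryset(self):
        queryset = ToyQuerySet(self._rows)
        if self._company_id is not None:
            # The tenant scope is applied once, here, for all methods.
            queryset = queryset.filter(company_id=self._company_id)
        return queryset

    def find_all_for(self, customer_id):
        return self.get_queryset().filter(customer_id=customer_id)


rows = [
    {"id": 1, "company_id": 7, "customer_id": 34},
    {"id": 2, "company_id": 8, "customer_id": 34},
    {"id": 3, "company_id": 7, "customer_id": 35},
]
all_companies = ToyManager(rows).find_all_for(34).ids()
only_company_7 = ToyManager(rows, company_id=7).find_all_for(34).ids()
```

Because find_all_for never mentions the company, adding the tenant filter in one place is enough to scope every facade method.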

Wrapping up

Put in your managers the methods that make sense to the external world, with meaningful parameters. Put in your query sets the methods that change the structure of the query's response or perform operations such as aggregations and annotations.

To be clear: simple queries go in Managers, complex queries go in QuerySets. However, never access query sets directly from outside; making your manager the Facade layer prevents you from having multiple queries for the same purpose all over the place. That way there is just one source of truth for the external world: the manager. It is way easier to maintain as your application scales. Trust me!