Django F Expressions & Model-Less Serialization

Jason Johns
5 min readDec 29, 2017

--

Django Rest Framework and its serializations are an extremely powerful tool to enable creation of API resources. However, there are some situations where the nesting serialization is less than optimal. For example, if you have to construct a dashboard application, you really have two choices: multiple API calls per model to display, or construct a nested serialization of the result set. Both these solutions come with the probable caveat of requiring some transforming logic in your client consumer before you can use it.

The content here came from a project in which I had to construct a spreadsheet grid application using react-data-grid for internal usage. Having all the data serialized in a nested format required logic implemented to transform the API response into something that can be shown in a grid type layout.

A sample db schema:

class Institution(models.Model):
name = models.CharField(max_length = 100, blank = False)
address = models.CharField(max_length = 200, blank = False)
city = models.CharField(max_length = 50, blank = False)
state = models.CharField(max_length = 2, blank = False)


class Catalog(models.Model):
CATALOG_TYPES = (
('UG', 'Undergraduate'),
('GR', 'Graduate'),
('DR', 'Doctorate'),
('PR', 'Professional')
)

institution = models.ForeignKey(Institution)
catalog_type = models.CharField(max_length = 2, choices = CATALOG_TYPES, default = 'UG')
academic_year = models.CharField(max_length = 20, blank = False)
create_date = models.DateField(auto_now_add = True)
update_date = models.DateField(auto_now = True)
class Link(models.Model):
MEDIA_FORMATS = (
('P', 'PDF'),
('W', 'Web'),
('F', 'Flash'),
('M', 'Mixed')
)

catalog = models.ForeignKey(Catalog)
media_type = models.CharField(max_length = 1, choices = MEDIA_FORMATS, default = 'P')
url = models.URLField()

Here, we have a sample one to many structure where an Institution can have many Catalogs, each with many Link types. If we do a classical nested serializer implementation as described in the DRF documentation, then we can have something like

class LinkSerializer(serializers.ModelSerializer):
class Meta:
model = Link
fields = '__all__'


class CatalogSerializer(serializers.ModelSerializer):
link_set = LinkSerializer(many = True, read_only = True)

class Meta:
model = Catalog
fields = ('id', 'institution_id', 'academic_year', 'catalog_type', 'link_set')


class InstitutionSerializer(serializers.ModelSerializer):
catalog_set = CatalogSerializer(many = True, read_only = True)

class Meta:
model = Institution
fields = ('id', 'name', 'address', 'city', 'state', 'catalog_set')

where the nesting is top-down from Institution to Link. A resource endpoint of /api/institution returns the following snippet from data generated using factory-boy:

{
"id": 253,
"name": "Robinson-Henderson",
"address": "4065 Nicole Lakes Apt. 404",
"city": "New Jennifer",
"state": "Georgia",
"catalog_set": [
{
"id": 1002,
"institution_id": 253,
"academic_year": "2017-2018",
"catalog_type": "UG",
"link_set": [
{
"id": 5001,
"media_type": "P",
"url": "http://davis.com/",
"catalog": 1002
},
{
"id": 5201,
"media_type": "P",
"url": "http://smith.com/",
"catalog": 1002
},
{
"id": 5401,
"media_type": "P",
"url": "http://www.jackson-holland.com/",
"catalog": 1002
},
{
"id": 5601,
"media_type": "P",
"url": "http://kim.com/",
"catalog": 1002
},
{
"id": 5801,
"media_type": "P",
"url": "https://vincent.biz/",
"catalog": 1002
}
]
},
...
...
]
}

A response formatted this way will require some internal logic to transform to be usable with the grid components. That ends up causing delays in the client because of the nested iteration required. Here, there are three levels, so any transformer method will be O(n³). What if we could move this to the database instead, and work with the correctly formatted response right away?

Enter F expressions

Django’s models have F objects, which represent the value of a model field or annotated column. Instead of doing a database query to pull the value of the field into Python for operation, we can use F objects to do it all in the database. Here, we’ll use this to construct aliases for the fields we want to show.

Since this transformation will occur on the model queryset, the best place for this logic is on a custom model manager. For ease of demonstration, this will be a LinkManager

class LinkManager(models.Manager):
def get_composite_data(self):
mappings = {
'link_id': F('pk'),
'inst_id': F('catalog__institution__id'),
'inst_name': F('catalog__institution__name'),
'address': F('catalog__institution__address'),
'city': F('catalog__institution__city'),
'state': F('catalog__institution__state'),
'catalog_id': F('catalog__id'),
'type': F('catalog__catalog_type'),
'year': F('catalog__academic_year'),
'create_date': F('catalog__create_date'),
'update_date': F('catalog__update_date'),
}
keys = tuple(mappings.keys()) + ('media_type', 'url')

return self.get_queryset().annotate(**mappings).values(*keys)

and then update the Link model to use the custom manager. Now, this method will be available with the call Link.objects.get_composite_data().

The way this works is a dict of field names to F objects is defined. Using ’address’: F(‘catalog__institution__address’) as an example, it is saying Alias this Django model lookup for this institution's address to the field address. It is then used to annotate the queryset, and only the keys from the dict mapping are used to extract out the values from the queryset.

In [1]: records = Link.objects.get_composite_data()In [2]: records[0]
Out[2]:
{'address': '4065 Nicole Lakes Apt. 404',
'catalog_id': 1002,
'city': 'New Jennifer',
'create_date': datetime.date(2017, 11, 18),
'inst_id': 253,
'inst_name': 'Robinson-Henderson',
'link_id': 5001,
'media_type': 'P',
'state': 'Georgia',
'type': 'UG',
'update_date': datetime.date(2017, 11, 18),
'url': 'http://davis.com/',
'year': '2017-2018'
}

Serializing

So lets see. We have the data from the database in a flat format. How can we serialize this within DRF to push out to the client as JSON? Enter in model-less serializers! Interestingly enough, a DRF serializer doesn’t have to be bound to a Django model, or any kind of object. Here, I’ve defined a serializer to match up with all the fields of the values list queryset response:

class CompositeSerializer(serializers.Serializer):
catalog_id : serializers.IntegerField()
inst_id = serializers.IntegerField()
inst_name = serializers.CharField()
state = serializers.CharField()
city = serializers.CharField()
year = serializers.CharField()
type = serializers.SerializerMethodField()
url = serializers.URLField()
media_type = serializers.SerializerMethodField()
create_date = serializers.DateField(default = date.today())
update_date = serializers.DateField(default = date.today())

@staticmethod
def get_property_name(name, options):
return dict(options).get(name, '')

def get_media_type(self, obj):
return CompositeSerializer.get_property_name(obj.get('media_type', ''), Link.MEDIA_FORMATS)

def get_type(self, obj):
return CompositeSerializer.get_property_name(obj.get('type', ''), Catalog.CATALOG_TYPES)

The downside for my implementation of the db-less model is the serializer needs to define the fields on which are to be serialized, which can make for a somewhat verbose serializer implementation.

Views and Results

I defined an implementation of DRF’s ListAPIView to use the custom model manager method and serializer defined earlier

from rest_framework.generics import ListAPIView
from rest_framework.permissions import IsAuthenticatedOrReadOnly
from app.serializers import CompositeSerializerclass CompositeListView(ListAPIView):
permission_classes = (IsAuthenticatedOrReadOnly, )
serializer_class = CompositeSerializer
queryset = Link.objects.get_composite_data()

Now, retrieving data from the REST endpoint api/composite returns a result looking like

[
{
"inst_id": 253,
"inst_name": "Robinson-Henderson",
"state": "Georgia",
"city": "New Jennifer",
"year": "2017-2018",
"type": "Undergraduate",
"url": "http://taylor.com/",
"media_type": "PDF",
"create_date": "2017-11-18",
"update_date": "2017-11-18"
},
{
"inst_id": 254,
"inst_name": "Davis, Klein and Meza",
"state": "North Carolina",
"city": "East Brycemouth",
"year": "2017-2018",
"type": "Graduate",
"url": "https://www.grimes.com/",
"media_type": "Web",
"create_date": "2017-11-18",
"update_date": "2017-11-18"
},
...
...
]

which is much easier to use in a dashboard view! Best of all, everything is done in the database, and not within Python or Javascript, so performance impact is extremely minimal compared to other solutions. According to django-silk profiling, the data return requires just two queries and four join operations in all.

--

--