Django + MongoDB = Django REST Framework Mongoengine

Boris Burkov
Sep 19, 2016

I’ve been using Django REST Framework (DRF) to create my REST APIs for quite a while and I believe this tool gave a second life to Django. At PyCon Russia 2016 I was surprised to find out that nearly all of the ~150–200 Django developers there were using Django as a REST backend for their frontend Single-Page Apps, not for old-style generation of HTML pages on the backend. And almost all of them use DRF.

Pure Django REST Framework is meant to be used with an RDBMS only (such as PostgreSQL, MySQL or SQLite). In 2016 we got a new, fully-fledged tool that integrates DRF with MongoDB: DRF-Mongoengine. Here I’ll explain how to use it, but first I’ll cover how pure DRF works, because the architecture of DRF-Mongoengine mirrors the architecture of DRF. Feel free to skip that part if you already know DRF. If this introduction is too vague for you, try the official DRF tutorial: http://www.django-rest-framework.org/tutorial/quickstart/.

A 10-minute introduction to Django REST Framework by example (skip this if you know this)

Suppose that you want to create a typical REST endpoint for exchanging data between your database and frontend. For example, let’s store information about Unix command-line tools (such as cat, wget etc.) in our database and let users query and update this database via a REST API.

The typical pattern of writing an endpoint in DRF is as follows: you define a triplet of (Model, Serializer, ViewSet) and register a URL that corresponds to that endpoint with a DRF Router. This pattern applies to DRF-Mongoengine too, but we’ll get to that later. First, let me explain how pure DRF works.

Model is a standard model from classical Django. As you know, Django is built around Django ORM (Object-Relational Mapping) that allows you to declare the schema of your relational database tables via python classes — Models. You can write queries in pure python, not complicated SQL, to create/retrieve/update/destroy those models.

Let’s define a model for our Tool. A Tool might have multiple inputs and we store each ToolInput in a separate table, keeping it related to a Tool:

Serializer is the central entity of Django REST framework. It converts incoming JSONs into python data and represents outgoing python data as JSONs. There’s a special kind of serializers, called ModelSerializer, which is responsible for automatic conversion between JSONs and Django Models. Sometimes this conversion is non-trivial. For example, serializers offer several ways to represent related structures (e.g. our inputs): as hyperlinks, nested JSONs, plain ids etc.

With ModelSerializer we only have to declare the corresponding Model, and most of the time it automatically maps JSON fields to Django model fields. Sometimes though (as for inputs in this example, which is a many-to-one relation) we have to explicitly declare the corresponding field, which is a nested serializer.

Serializer is also responsible for validation of values in input JSONs and for representation of fields as html form inputs in Browsable API.

The ViewSet is a class that handles the standard REST requests (CREATE, UPDATE, PARTIAL_UPDATE (PATCH), DESTROY, LIST and RETRIEVE) with its methods create()/update()/partial_update()/destroy()/list()/retrieve(). The standard ModelViewSet already contains default implementations of those methods, but if you need to customize them, you can override some of them, for example destroy(). Note that ViewSet relies on Serializer to convert data from the database to JSON and from JSON to the database.

The only thing left is to register our ViewSet with DRF Router. Router automatically creates a url for our ViewSet, based on its name. When accessing a single instance of a collection, router also interprets last part of url as a lookup_field (lookup_field is how you identify a single instance of Tool in your api — in our case by an implicit primary key “id”, autogenerated by Postgres):

Now, if you say:

python manage.py runserver

And visit in your browser the following page:

localhost:8000/api/

you’ll see the list of your API endpoints. Currently, there’s only the /api/tool/ endpoint, representing your Tool:

This Browsable API is one of the killer features of Django REST Framework for me. It seals a declarative contract between frontend and backend developers about the API schema and keeps it up-to-date. Frontenders, have you heard the tale of “I’ve updated the API, but haven’t updated the docs yet” from your backend fellows? This never happens to you with DRF. Frontend programmer can always see and test the schema of API resources by sending requests via Browsable API.

Why MongoDB?

What if you need to work with deeply nested JSONs in your REST api? Like the one on the picture, which doesn’t fit into several screens?

If you are going to stick with PostgreSQL for storing these data, you have two options. Either use JSONFields or parse these JSONs into relations.

The JSONField solution has 2 major problems. First, you have to validate the incoming values manually, writing huge validators. This defeats the whole purpose of DRF, because validation out of the box is exactly what it was meant to do for you. The second major problem is that you can’t use relations within JSONFields.

The other approach, parsing large JSONs into multiple tables, takes a lot of time and makes your code error-prone and plain ugly. Just imagine a model Workflow that contains multiple Tools, so you parse each Workflow JSON into Workflow models and nested Tools into related Tool model instances. Then you create the related ToolInputs and their relations, related with… uh, you got it, right? You’ll go mad normalizing this tree of nested structures into relations. Especially given the fact that DRF doesn’t support writable nested serializers out of the box, because there are multiple different ways to deal with them.

So, your solution is simple: switch to MongoDB for such a resource. Mongo supports nested data structures out of the box and there’s also a way to deal with relations in Mongoengine, when you really need them.

What is Mongoengine?

Mongoengine was started in late 2009 by Harry Marr, but it was brought to production level later, in 2011-2015, by Ross Lawley. It is an Object-Document Mapper (ODM), a MongoDB analogue of what Django ORM is for relational databases. Basically, Mongoengine provides a declarative API for defining the schema of your MongoDB documents and a query language, almost identical to Django ORM:

Here we declared a schema for documents in the Tool collection. Each document is required to have a String label and description, and has inputs: a list of sub-JSONs with a String name and arbitrary data as value. The following is a valid Tool to store in MongoDB:

{
    label: "wget",
    description: "A non-interactive network downloader",
    inputs: [
        {
            name: "url",
            value: "https://medium.com"
        },
        {
            name: "options",
            value: 123
        }
    ]
}

Mongoengine also offers a query language, almost identical to Django ORM:

Here we created an instance of our Tool object in MongoDB.

Django REST Framework Mongoengine

Now, we want to use Django REST Framework with MongoDB.

Ok, if Mongoengine repeats the API of Django ORM, why don’t we use DRF on top of Mongoengine instead of Django ORM+Postgres? It should take a bit of glue code to do so, but we’ll get all the upsides of DRF and MongoDB with relatively little effort.

Luckily, this glue code was already written in a project known as Django-REST-Framework-Mongoengine. It was initially created by Umut Bozkurt in late 2015 and exhaustively covered with unit tests by Maxim Vasiliev. A meager 2000–3000 lines of code port 90% of DRF capabilities to MongoDB!

Now, let’s create our Tool endpoint in DRF-Mongoengine. Just as in the Postgres+DRF case, in the Mongo+DRF-Mongoengine case we’ll have to create a triplet of (Model, Serializer, ViewSet) and then register our endpoint with a Router.

But first of all, we need to create a connection with MongoDB. Let’s do this in settings.py:

import mongoengine

# Connect to the "tools" database on a local MongoDB server
mongoengine.connect(
    db="tools",
    host="localhost"
)

Note that there are 2 major versions of PyMongo, PyMongo2 and PyMongo3, and they have different connection syntax. Mongoengine 0.9 and older use PyMongo2, while Mongoengine 0.10+ uses PyMongo3. This syntax is for the OLD Mongoengine 0.9 and older. I use the older one, because it comes with a built-in integration with the Django authentication and sessions system. In Mongoengine 0.10 they’ve separated that code into a standalone project, Django-Mongoengine, but it is not stable yet.

Ok, we’ve already defined our Tool Model while introducing Mongoengine, so we can re-use it unchanged.

Now, we need a ModelSerializer, which is called DocumentSerializer (we’re calling Models “Documents” in MongoDB world, right?). And luckily, we don’t need to specify any nested Serializers as DRF-Mongoengine creates them for us automatically, if it sees EmbeddedDocuments within our Document. Just specify the Document in Meta.model:

And last, we need a ViewSet for this endpoint. Again, everything is very similar to pure DRF. Just define a lookup_field and serializer_class, but instead of specifying queryset in a declarative manner, define a method get_queryset(), which works equivalently. (If I’m not mistaken, the difference stems from the fact that Django ORM evaluates its queryset lazily, while Mongoengine attempts to do pre-caching.)

The registration of ViewSet with Router is exactly the same as with DRF, just don’t forget to import the router from DRF-Mongoengine, not from DRF (there are subtle differences).

That’s it! You’ve got a fully functional Django REST Framework endpoint, stored in Mongo! It has a fully functional validation out of the box and is available in Browsable API.

However, it would be a lie to say that your life with Mongoengine is as smooth as with Django ORM. There are some places in Django that are tightly coupled with Django ORM and expect an RDBMS to be available. As I said, you might experience some trouble setting up your authentication system with Mongo. Many management commands might fail. E.g. I experienced problems with testing when I set the Django database to django.db.backends.dummy and used MongoDB only.

Also, you might want to use database transactions from Postgres. MongoDB offers document-level transactions, which is nice, but if you want better transaction isolation, you’ll have to do it at the application level.

So, in my current project I keep both databases for different purposes and this combination saved me a lot of trouble.
