GraphQL on Django at JOOR

When we decided to build the next generation of JOOR’s APIs a year ago, we had to decide what software stack would power them. We chose Django because of our existing in-house expertise, and we chose GraphQL over REST (with the wonderful Django REST Framework) because GraphQL’s batching features solved some real problems on JOOR’s most complex frontends.

We officially broke ground on the API roughly 8 months ago. Since then, we’ve learned a lot of lessons about building GraphQL APIs with Graphene and Django. Here are some of the key takeaways!

Prefer vanilla graphene over graphene-django

Graphene provides an integration for Django, graphene-django, that makes APIs easier to develop by providing abstractions over Django models. With it, you can define a field that lists all users in just a few lines.

However, we ran into a few problems working with graphene-django that made us prefer writing code with plain graphene instead. These problems were all variations on a theme: too much magic.

A core facet of the Zen of Python is that explicit is better than implicit. The abstractions provided by graphene-django do a little too much for our comfort.

For instance, graphene-django can automatically generate schema connections, using the related_name you define on your model’s ForeignKey. These connections issue inefficient N+1 queries by default (with no warning) and do a little magic to handle pagination. If you add fields (e.g. computed values) that aren’t model fields, you end up with an awkward mix of implicit and explicit field resolvers.

At JOOR, we ended up doing more work overriding this magic than we would have spent simply defining our objects as vanilla Graphene objects.

Build a domain layer between your API & models

The best practice for designing GraphQL APIs is to treat the schema and its object types as a shared language about your business domain. This closely models the idea of ubiquitous language and underscores a domain-driven approach to resource modeling. This is a killer feature of GraphQL — you describe the what of your business instead of how to get data.

Thanks to the object-relational impedance mismatch, however, the public interfaces of our domain objects often differ from our persistence objects. You don’t necessarily want to just bottle up your DB schema as your API schema, but this is the workflow preferred by graphene-django.

As a trivial example, graphene-django will publicly expose all fields by default — and your ability to stop this is tucked away in the authorization documentation. The deeper problems come when you want to handle inheritance, create facades over multiple tables, or even provide aggregate fields.

Instead, we often build facades, adapters, and domain objects over and above our model layer. This lets us separate business logic from persistence, and model our real domain objects in our GraphQL schema.
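As a rough illustration of the facade idea, the sketch below builds one domain object from two persistence-shaped records and exposes only the fields the API should publish. Every name here (UserRow, OrderRow, Customer, build_customer, and their fields) is hypothetical and not taken from JOOR’s codebase; plain dataclasses stand in for Django models so the example runs on its own.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import List


# Persistence-shaped records: hypothetical stand-ins for Django model instances.
@dataclass
class UserRow:
    id: int
    first_name: str
    last_name: str
    password_hash: str  # stored in the DB, but must never reach the API


@dataclass
class OrderRow:
    user_id: int
    total: Decimal


# Domain object: the shape a GraphQL object type would mirror.
@dataclass(frozen=True)
class Customer:
    id: int
    display_name: str
    order_count: int         # aggregate field, not a column
    lifetime_value: Decimal  # aggregate field, not a column


def build_customer(user: UserRow, orders: List[OrderRow]) -> Customer:
    """Facade over two tables: combine a user row with that user's orders."""
    own_orders = [order for order in orders if order.user_id == user.id]
    return Customer(
        id=user.id,
        display_name=f"{user.first_name} {user.last_name}",
        order_count=len(own_orders),
        lifetime_value=sum((order.total for order in own_orders), Decimal("0")),
    )


user = UserRow(id=1, first_name="Ada", last_name="Lovelace", password_hash="not-a-real-hash")
orders = [OrderRow(1, Decimal("10.50")), OrderRow(1, Decimal("4.50")), OrderRow(2, Decimal("99"))]
customer = build_customer(user, orders)
```

A resolver can then return Customer instances directly, so the schema describes the domain object without needing to know which tables back it.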

Use dataloaders instead of Django’s prefetching

We mentioned before that graphene-django makes inefficient queries by default. What does that actually mean? Consider a query against an online family tree API that fetches a list containing each user’s name, their father’s name, and their grandfather’s name.

This query runs into the N+1 SELECT problem: we first load the list of users, then fetch each user’s father individually, and finally fetch each father’s father. With 5 users, that is 1 + 5 + 5 = 11 queries. Yikes!
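The count can be reproduced with a small simulation. The GraphQL query text, the field names, and the in-memory PEOPLE table below are assumptions made for illustration; each call to a fetch function stands in for one SQL SELECT.

```python
# Hypothetical GraphQL query matching the description (field names are assumptions).
QUERY = """
{
  users {
    name
    father {
      name
      father {
        name
      }
    }
  }
}
"""

# In-memory stand-in for a people table: id -> (name, father_id).
PEOPLE = {}
for i in range(1, 6):
    PEOPLE[i] = (f"user{i}", 10 + i)
    PEOPLE[10 + i] = (f"father{i}", 20 + i)
    PEOPLE[20 + i] = (f"grandfather{i}", None)

query_count = 0  # number of simulated SELECT statements


def fetch_users():
    """One SELECT that returns the five top-level users."""
    global query_count
    query_count += 1
    return [PEOPLE[user_id] for user_id in range(1, 6)]


def fetch_person(person_id):
    """One SELECT per call: this is what makes the resolution N+1."""
    global query_count
    query_count += 1
    return PEOPLE[person_id]


def resolve_naively():
    rows = []
    for name, father_id in fetch_users():                      # 1 query
        father_name, grandfather_id = fetch_person(father_id)  # +1 per user
        grandfather_name, _ = fetch_person(grandfather_id)     # +1 per user
        rows.append((name, father_name, grandfather_name))
    return rows


rows = resolve_naively()
print(query_count)  # 1 + 5 + 5 = 11
```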

If graphene-django makes inefficient queries by default, can we use Django’s built-in features for preventing the N+1 SELECT problem? Unfortunately, select_related and prefetch_related aren’t quite up to the task for a few reasons.

Django’s system of relation-following was designed to provision data for template-based views, where the data requirements are known before runtime. In a GraphQL query service, however, the queries are not known until runtime. To take advantage of Django’s prefetching features with dynamic runtime requirements, we would need to do one of the following:

  • Prefetch all nested data by default, which would add a dramatic amount of latency to small queries by overfetching data.
  • Prefetch all nested data by default, but cache very aggressively to reduce latency (feasibility depends on how often your data changes).
  • Develop a system that parses the GraphQL query AST and automatically applies the relevant prefetches (some have attempted this).
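To make the third option concrete, here is a minimal sketch of its core step: turning the relations a query selects into Django-style lookup paths such as father__father, which could then be passed to prefetch_related. The nested-dict selection format and the function name lookups_for are assumptions made for illustration; a real implementation would walk the GraphQL query AST instead.

```python
from typing import Dict, List, Optional

# Hypothetical, simplified selection tree: relation fields map to nested
# selections, scalar fields map to None.
Selection = Dict[str, Optional["Selection"]]


def lookups_for(selection: Selection, prefix: str = "") -> List[str]:
    """Return the deepest relation paths in Django's double-underscore lookup syntax."""
    paths: List[str] = []
    for field, children in selection.items():
        if children is None:
            continue  # scalar field: nothing to prefetch
        path = f"{prefix}__{field}" if prefix else field
        nested = lookups_for(children, path)
        # Keep only the deepest path, since it already implies its ancestors.
        paths.extend(nested or [path])
    return paths


selection = {
    "name": None,
    "father": {"name": None, "father": {"name": None}},
}
print(lookups_for(selection))  # ['father__father']
```

Note that the resulting strings are model relation names, so the API field names have to line up with the database schema for this to work.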

Even if you implement one of the above measures, you’ll run into issues as soon as you introduce pagination or filtering. Django has a solution for this, but we would have to develop a very sophisticated GraphQL query parser to automatically optimize our Django queries, and we would ultimately end up with a GraphQL schema tightly coupled to our database schema.

As Raymond Hettinger might say, there must be a better way. Well, it turns out that Facebook ran into this problem too, and developed a batch loading system called dataloaders to solve it. With dataloaders, you execute just one database query per relation in the GraphQL query. It means we’re giving up a healthy chunk of Django fundamentals, but it quickly solves our problem. If you’re not using Django, you might even be able to make use of asyncio dataloaders.
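The batching idea can be sketched with nothing but asyncio. The MiniLoader class below is a simplified stand-in written for this example; it is not Facebook’s DataLoader or the loader graphene ships with, and it omits the per-key caching that real dataloaders provide. The PEOPLE table and fetch_people function are likewise assumptions standing in for a database.

```python
import asyncio
from typing import Callable, Dict, List

# In-memory stand-in for a people table: id -> (name, father_id).
PEOPLE = {}
for i in range(1, 6):
    PEOPLE[i] = (f"user{i}", 10 + i)
    PEOPLE[10 + i] = (f"father{i}", 20 + i)
    PEOPLE[20 + i] = (f"grandfather{i}", None)

batch_calls: List[List[int]] = []  # one entry per simulated SELECT ... WHERE id IN (...)


def fetch_people(ids: List[int]) -> List[tuple]:
    """Simulate a single batched query; results come back in the order of ids."""
    batch_calls.append(list(ids))
    return [PEOPLE[person_id] for person_id in ids]


class MiniLoader:
    """Collects keys requested during one event-loop tick, then loads them together."""

    def __init__(self, batch_fn: Callable[[List[int]], List[tuple]]) -> None:
        self.batch_fn = batch_fn
        self.pending: Dict[int, asyncio.Future] = {}

    def load(self, key: int) -> asyncio.Future:
        loop = asyncio.get_running_loop()
        if not self.pending:
            # First key of a new batch: dispatch after the other callers have queued up.
            loop.call_soon(self._dispatch)
        if key not in self.pending:
            self.pending[key] = loop.create_future()
        return self.pending[key]

    def _dispatch(self) -> None:
        batch, self.pending = self.pending, {}
        keys = list(batch)
        for key, row in zip(keys, self.batch_fn(keys)):
            batch[key].set_result(row)


async def resolve_user(loader: MiniLoader, user_id: int) -> tuple:
    name, father_id = await loader.load(user_id)
    father_name, grandfather_id = await loader.load(father_id)
    grandfather_name, _ = await loader.load(grandfather_id)
    return (name, father_name, grandfather_name)


async def main() -> List[tuple]:
    loader = MiniLoader(fetch_people)
    return await asyncio.gather(*(resolve_user(loader, i) for i in range(1, 6)))


rows = asyncio.run(main())
print(len(batch_calls))  # 3 batched queries instead of 11
```

Each resolver still asks for one record at a time, which keeps resolvers simple; the loader is what turns those individual requests into one query per level.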

Tests, tests, tests!

Testing your graphene object types and queries can be difficult. We recommend a two-part approach to testing them.

First, keep your resolvers as simple as possible. Put all their business logic into custom QuerySet methods or DataLoader classes. You can then test your business and fetching logic in isolation from the API.

Second, use snapshot testing to test your API. Snapshot tests will allow you to ensure that any changes to your schema are caught and vetted before going out to production.

Since you’ve isolated your fetch logic, don’t use snapshot tests to test every possible variation of your queries — you’ll end up writing a ridiculous amount of setup logic. Instead, write simple queries that fetch all the fields on each object type in your schema and test each connection to ensure the core expected behavior works.
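For readers new to the idea, here is a minimal sketch of what a snapshot test does, using only the standard library. In practice you would likely reach for a dedicated snapshot-testing library; the helper assert_matches_snapshot and the execute_query stand-in below are hypothetical names written for this example.

```python
import json
import tempfile
from pathlib import Path


def assert_matches_snapshot(name: str, value: object, directory: Path) -> None:
    """Write the snapshot on first use; afterwards, fail if the value differs from it."""
    path = directory / f"{name}.json"
    rendered = json.dumps(value, indent=2, sort_keys=True)
    if not path.exists():
        path.write_text(rendered)  # first run: record the snapshot for review
        return
    if path.read_text() != rendered:
        raise AssertionError(f"snapshot {name!r} changed")


def execute_query() -> dict:
    """Hypothetical stand-in for running a 'fetch every field' query against the schema."""
    return {"data": {"users": [{"id": "1", "name": "Ada"}]}}


snapshot_dir = Path(tempfile.mkdtemp())
assert_matches_snapshot("all_user_fields", execute_query(), snapshot_dir)  # records
assert_matches_snapshot("all_user_fields", execute_query(), snapshot_dir)  # passes

# Simulate an unintended change to the API response.
changed = execute_query()
changed["data"]["users"][0]["name"] = "Grace"
try:
    assert_matches_snapshot("all_user_fields", changed, snapshot_dir)
    detected = False
except AssertionError:
    detected = True  # the response drifted from the recorded snapshot
```

The recorded file is committed alongside the tests, so any change to the response shows up as a diff that someone has to review and accept.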

Conclusion

We hope this post saves people time when building GraphQL APIs on top of Django. Since we started building out our GraphQL API almost a year ago, we have learned a lot of lessons and written a lot of tooling to make our lives easier. We hope to share or release a lot of these in the near future.

Additionally, the documentation for graphene has improved dramatically and the community has grown as well. Hopefully more developers will share their lessons learned and contribute to more and better tooling for the Python GraphQL community.


Like what you’re reading? Interested in learning more about working at JOOR, the world’s largest digital wholesale marketplace? We’re hiring! Check out https://jooraccess.com/careers.