Making GraphQL Easy at Simon

Joel Gardner
Simon Systems
Jul 18, 2023

In this post, I will introduce DjraphQL, the open source library we built at Simon to unlock the power of GraphQL for our Django web application. I'll describe some of the challenges we experienced with our previous approach, and explain how moving to GraphQL resolved them.

Background

Our goal at Simon is to empower marketers to access and use their customer data for highly-personalized, cross-channel campaigns. With Simon Data, marketers can build segments using data from robust customer profiles, as well as events like page views, clicks and purchases.

To make it all happen, we build and support a plethora of products, ranging from a Customer Data Platform (CDP) and marketing orchestration engine, to native and connected Snowflake apps.

In order to bring the data to marketers, Simon connects directly to your data warehouse, which allows us to build customer profile, segment, event, and identity tables within your infrastructure. These database objects are managed via our SaaS application and are fed into our Jenkins execution environment.

Both our web application and our execution environment are built on the Django framework, which is one of the most popular web frameworks in the Python ecosystem. Our React/TypeScript frontend pulls data from a set of endpoints backed by Django REST Framework (DRF).

The Problem

In the beginning, everything was easy. Complexity was low, and adding new features felt natural. We went through a few iterations of our core feature set and DRF served us quite well. Add a new Django model, a new Serializer, and a ViewSet. Rinse and repeat.
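For those unfamiliar with DRF, the recipe looks roughly like this (Widget and myapp are hypothetical, for illustration):

from rest_framework import routers, serializers, viewsets
from myapp.models import Widget  # hypothetical model

class WidgetSerializer(serializers.ModelSerializer):
    class Meta:
        model = Widget
        fields = "__all__"

class WidgetViewSet(viewsets.ModelViewSet):
    queryset = Widget.objects.all()
    serializer_class = WidgetSerializer

# Wire the ViewSet into the URL conf via a router.
router = routers.DefaultRouter()
router.register(r"widgets", WidgetViewSet)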

Our model layer grew rapidly, both horizontally and vertically. We had a lot of models and they were highly nested. As a result, we had lots of Serializers, and we’d overridden much of their base behavior to handle nested writes — functionality that DRF doesn’t provide out of the box.

It started innocently enough, but this ability to plug into a Serializer’s hooks turned out to be tough to resist, and eventually our API layer had become bloated with custom logic to handle not only the nested writes, but any and all edge cases around reading, filtering, and semantically validating trees of data.
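To give a flavor of that custom logic, here's a minimal sketch (Author and Book are hypothetical models) of the create() override DRF requires for a single writable nested relationship:

from rest_framework import serializers
from myapp.models import Author, Book  # hypothetical models

class BookSerializer(serializers.ModelSerializer):
    class Meta:
        model = Book
        fields = ("id", "title")

class AuthorSerializer(serializers.ModelSerializer):
    # Nested serializers are read-only by default; making them
    # writable means overriding create() (and update()) by hand.
    books = BookSerializer(many=True)

    class Meta:
        model = Author
        fields = ("id", "name", "books")

    def create(self, validated_data):
        books_data = validated_data.pop("books")
        author = Author.objects.create(**validated_data)
        for book_data in books_data:
            Book.objects.create(author=author, **book_data)
        return author

Multiply this across dozens of deeply nested models, add update() methods that diff existing children, and the custom logic piles up quickly.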

Because of this, things began to slow down. Our ability to deliver features quickly diminished, and bugs became more common. We made a couple of observations about how we'd gotten to this state.

  1. We overused DRF's flexibility to handle product requirements, rather than encoding those requirements in a more maintainable way.
  2. We had inadvertently built our own version of GraphQL by adding custom logic to handle related objects in our Serializers.

As luck would have it, we were eventually tasked with building a new coordinated marketing tool: our Journey Builder. This gave us a clean slate along with an opportunity to incorporate previous lessons learned.

We wanted to build a tool flexible enough to handle the typical use cases while also allowing the marketer to be creative. We settled on an industry-standard graph-building interface, but we wanted it to be performant and a joy to work with.

We knew that the complexity would necessarily be high on both the frontend and the backend. We reused much of our data model layer and added new modeling to capture the graph-like structure of a marketing journey. We also wanted to seamlessly save changes while the marketer worked, so we needed to queue updates as they were happening while also validating and providing consistent feedback about the state of the journey.

We wanted our frontend developers in the driver’s seat. If they could be decoupled from the API layer, we’d eliminate a significant source of friction while developing features. GraphQL appealed to us due to the ability to pick and choose which objects and fields we needed, along with graceful handling of nested objects for both reading and writing.
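For example, a single query against a hypothetical journey schema (the field names here are illustrative) can fetch exactly the fields a component needs, nested relations included:

query {
  journey(id: 42) {
    name
    steps {
      id
      type
      nextSteps { id }
    }
  }
}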

The Solution

We decided to explore the possibility of a proper GraphQL API using the Graphene library. It was appealing from a frontend engineer’s perspective, but the maintenance burden was daunting. We would have to hand-roll a hundred or so classes in order to have a usable API. Keeping those classes up to date and consistent was something we weren’t sure we wanted to take on.

We knew we’d have to automate the construction of these classes if we wanted this to work. We built a proof-of-concept library that generated the necessary classes while walking our Django model graph. We could then use Graphene to build a GraphQL API that handled reading, updating, creating, and/or deleting our model instances!
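The core idea of the proof of concept was a traversal of Django's model metadata, following relations outward from each registered model. This sketch is our own illustration of that idea, not DjraphQL's actual implementation:

# Illustrative only: collect every model reachable from a root model
# by following its relational fields.
def walk_model_graph(root_model, visited=None):
    visited = set() if visited is None else visited
    if root_model in visited:
        return visited
    visited.add(root_model)
    for field in root_model._meta.get_fields():
        if field.is_relation and field.related_model is not None:
            walk_model_graph(field.related_model, visited)
    return visited

Each model discovered this way carries enough metadata to derive a Graphene type, its resolvers, and its mutations.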

It actually worked quite well, and the maintenance burden was tiny. We felt we were onto something, but of course, there are concerns other than reading and writing data to the API.

  • Security — Preventing users from accessing data they aren’t authorized for (cross-account data leaks, object permissions, etc.) should be the #1 priority for any API author, and we needed to make it easy to do that.
  • Flexibility — Understanding that a generated API won’t be able to fulfill every need, we still wanted to be able to solve for 80% of the data needs for a given feature.
  • Performance — GraphQL is notorious for falling into N+1 query scenarios. Given that, resolving queries in the most optimal way was a goal from the very beginning (see the sketch after this list).
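To make the N+1 problem concrete, here is the classic Django version of it, along with the prefetching fix a query optimizer needs to apply automatically (this uses the Foo/Bar models from the example below, and assumes Bar's foreign key to Foo defines related_name="bars"):

from foo.models import Foo

# Naive: 1 query for the Foos, then N more queries, one per Foo,
# fired lazily as the loop touches each Foo's bars.
for foo in Foo.objects.all():
    names = [bar.name for bar in foo.bars.all()]

# Optimized: 2 queries total, regardless of how many Foos exist.
for foo in Foo.objects.prefetch_related("bars"):
    names = [bar.name for bar in foo.bars.all()]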

We addressed these concerns in a library called DjraphQL (pronounced “Draf-QL”). We’ve been using it in production for a few years to serve the API powering our Journey Builder, which remains our largest and most complex frontend application.

We no longer have to think about whether a resource is served by an existing endpoint, or whether it provides all the fields needed to power a feature. We can access data in a consistent way, no matter which model we need. Filtering, pagination, and ordering are handled securely, and each query is optimized for performance.

Attempts to read or write objects the authenticated user doesn't have access to throw an error automatically. We can trust that our API remains secure as our application grows in complexity.
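The mechanics deserve their own post, but the underlying idea is ordinary queryset scoping applied uniformly before any field resolution happens. A hypothetical sketch, not DjraphQL's actual API:

# Hypothetical illustration: every resolver starts from a queryset
# filtered down to the requesting user's account, so rows belonging
# to other accounts never reach field resolution.
# (The "account" field and request.user.account are assumptions.)
def scoped_queryset(model, request):
    queryset = model.objects.all()
    if hasattr(model, "account"):
        queryset = queryset.filter(account=request.user.account)
    return queryset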

How it works

For each Django model we want to expose, we define an Entity class that contains some configuration around which GraphQL operations to generate, which fields are readable and/or writable, any permission rules, how to perform validation, etc. Based on these configuration classes, DjraphQL will generate the necessary Graphene classes and attach them to a graphene.ObjectType for both query and mutation. By automatically generating the types and resolvers for each Django model, we slashed the maintenance burden significantly.

Here’s an example:

from foo.models import Foo, Bar
from graphene import Schema
from djraphql import SchemaBuilder, Entity, AllModelFields
from djraphql.access_permissions import C, R, U, D

class FooEntity(Entity):
    class Meta:
        model = Foo
        fields = (AllModelFields(),)

class BarEntity(Entity):
    class Meta:
        model = Bar
        fields = (AllModelFields(),)

# Create the SchemaBuilder and register our Entity classes
builder = SchemaBuilder()
builder.register_entity_classes(FooEntity, BarEntity)

# Create the Graphene Schema object and execute a query
schema = Schema(query=builder.QueryRoot)
result = schema.execute("query { FoosMany { id name bars { id name } } }")
print(result.data)
"""
{
    "FoosMany": [
        {"id": 1, "name": "Foo #1", "bars": [{"id": 1, "name": "Bar #1"}]},
        {"id": 2, "name": "Foo #2", "bars": [{"id": 2, "name": "Bar #2"}]},
    ]
}
"""

This is just a minimal example to show what’s involved in building a GraphQL API via DjraphQL. Check out all the ways to configure the Entity class in the documentation.
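For instance, the C/R/U/D permissions imported above control which operations are generated for an entity. The attribute name and shape in this sketch are our shorthand; consult the documentation for the real configuration surface:

class ReadOnlyBarEntity(Entity):
    class Meta:
        model = Bar
        fields = (AllModelFields(),)
        # Shorthand illustration of a read-only entity; see the docs
        # for the actual access-permission configuration.
        access_permissions = (R,)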

Trade-offs

What are the tradeoffs with using a library to automatically generate an API?

  • We gain ease of use and an almost nonexistent maintenance burden at the cost of transparently surfacing our complex model layer in the API. This means we require our frontend engineers to know and understand our model architecture. We were fine with making this tradeoff, because we encourage this full-stack mentality anyway. We likely would not use DjraphQL for a public-facing API, opting instead for a more intuitive experience that does not require knowledge of how the data is stored.
  • While the maintenance of our GraphQL API is minimal, we still have to maintain the library itself. We regularly find new opportunities for enhancements, and yes, bugfixes!
  • Recall that our goal was to serve about 80% of the use cases out of the box? The last 20% requires hand-rolled Graphene classes that sometimes live side by side with the autogenerated types (see the sketch after this list). This can confuse engineers unfamiliar with our codebase, who might wonder where the types/resolvers for each exposed GraphQL type live.
  • The library handles nested inserts and updates, but this comes with the cost of a more complex schema.
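Mixing a hand-rolled resolver with the generated ones looks roughly like this (journey_stats and its resolver are illustrative, not DjraphQL API; builder is the SchemaBuilder from the example above):

import graphene

# Subclass the generated QueryRoot and layer custom fields on top.
class QueryRoot(builder.QueryRoot, graphene.ObjectType):
    journey_stats = graphene.JSONString(journey_id=graphene.Int(required=True))

    def resolve_journey_stats(self, info, journey_id):
        # Placeholder for custom aggregation logic the generated
        # API doesn't cover.
        return {"journey_id": journey_id, "steps": 0}

schema = graphene.Schema(query=QueryRoot)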

At Simon, we’re solving big and interesting problems, and we use a ton of open source software in the process. We view this as an opportunity to give back to the community. DjraphQL has helped us focus on delivering value, and we believe it can do the same for others. It’s been hardened over a few years, but there’s no shortage of opportunities to make it better.

In the coming weeks, we will update this post with the GitHub repo location. If you’re interested in contributing, open a GitHub issue!

Potential enhancements

  • Reusing the generated types within hand-rolled query or mutation resolvers is possible today, but clunky. It’s an advanced use case, yet quite convenient when needed; with some work, we could provide a better, more intentional API for it.
  • We built our own schema, driven largely by the needs of our own product. In hindsight, adhering to a standardized schema such as Relay would have been a better choice.
  • The implementation for resolving queries involving aggregation (avg/sum/min/max, etc.) needs work. It supports basic use cases, but we believe a much better developer experience can be achieved by improving the schema and the internal implementation used to resolve these queries.

In future posts, we will go deeper technically, explaining how we ensure each request remains secure and how we execute performant SQL during query resolution.

Join us!

At Simon, we’re making the digital marketer’s life easier by bringing their customer data to the surface alongside powerful marketing tools, thus enabling them to create highly personalized content and deliver it via complex and responsive campaigns.

We’re working on the cutting edge, right at the intersection of big data and product engineering. The challenges unique to this space provide ample opportunity for engineers to learn and grow! If you’ve enjoyed this article and would like to learn more about the problems we’re tackling or opportunities at Simon, check out our job openings page! We’re always looking for smart and curious people to join our team.
