Why Kabbage is using GraphQL
I recently gave a talk about a small problem we had at Kabbage while developing a GraphQL API in .NET Core, and how we solved it within the bounds of the framework we were using. The talk went well, a few questions were asked, and afterwards someone came up to me and asked, quite bluntly, “Why use GraphQL at all? Sure, it’s cool, but what problem does it solve that REST doesn’t? Is it worth spending time solving all the new problems it introduces?”
Now, my role is very focused on the ‘how’, and when I started this project, I listened to my team lead’s reasoning on the ‘why’ and nodded along, happy to have a technically interesting greenfield project. But here I was, trying to defend a decision made months earlier without my input. And he had some really great points! For the record, I think GraphQL does make sense for our use case, but if you don’t think carefully about what your API is trying to achieve, it could just be a lot of work with little benefit over another spec.
What problem is your API solving?
Is it just a backend-for-frontend? Is it the top-level gateway for all of your organization’s data? Is it going to be used for internal data exploration and analysis? Is it meant for repetitive batch processes that pull large datasets, or for highly variable search requests?
At Kabbage, we needed to create a top-level gateway into our decisioning data; it’s a significant factor in our automated underwriting process and is also used by other internal systems. We needed to split out and scale the database that data lives in, which meant putting an API in front of it so our internal teams could get the data without knowing the particulars of how to connect to the data source. There are hundreds of tables we need to make available through this API. With REST, supporting the huge variety of access we needed would have taken years, one endpoint at a time. GraphQL lets us declare the schema very quickly, and our internal users keep the SQL-esque querying flexibility they are used to.
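To give a feel for why declaring a schema scales better than hand-writing endpoints, here is a minimal sketch that renders a table description as a GraphQL object type. It is only an illustration, not what we actually built: the `Column` and `Table` classes, the `SQL_TO_GRAPHQL` mapping, and the `Loan` example are all invented for this post.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Hypothetical mapping from SQL column types to GraphQL scalars (assumption).
SQL_TO_GRAPHQL = {
    "integer": "Int",
    "text": "String",
    "boolean": "Boolean",
    "numeric": "Float",
}


@dataclass
class Column:
    name: str
    sql_type: str
    nullable: bool = True


@dataclass
class Table:
    name: str
    columns: list[Column] = field(default_factory=list)


def to_sdl(table: Table) -> str:
    """Render one table as a GraphQL object type in schema definition language."""
    lines = [f"type {table.name} {{"]
    for col in table.columns:
        # Unknown SQL types fall back to String in this sketch.
        gql_type = SQL_TO_GRAPHQL.get(col.sql_type, "String")
        # A trailing "!" marks a non-null field in GraphQL.
        bang = "" if col.nullable else "!"
        lines.append(f"  {col.name}: {gql_type}{bang}")
    lines.append("}")
    return "\n".join(lines)


# Invented example table; real tables would come from database metadata.
loan = Table("Loan", [Column("id", "integer", nullable=False), Column("status", "text")])
print(to_sdl(loan))
```

The point is that once a function like this exists, adding the next table is a matter of data rather than a new endpoint.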
If your API is just serving a single front end, or it has finite, predetermined access patterns (get by id, get all, create new) then REST is perfect for that and would probably be a better fit both architecturally and for your end users.
How is your data structured?
GraphQL is a spec for communication between a client and a server; it offers no opinion or suggestion on how the data is ultimately stored. However, if you are using a graph database, then making your API layer GraphQL is a natural choice. If your data is in something like DynamoDB, where efficiency requires knowing your access patterns before designing the table, then trying to accommodate the highly variable queries that GraphQL allows is going to cause you unnecessary work, whereas a REST API could expose only the endpoints that make sense for your allowed access patterns.
Our data is stored in PostgreSQL, which is simple to describe as a graph: table rows become nodes, and foreign keys become edges. By using DataLoader, we avoid the N+1 problem that crops up when resolving relational database rows from a graph query.
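The batching idea behind DataLoader can be sketched in a few lines. What follows is a simplified, synchronous illustration of the technique, not the DataLoader library's API and not our production code. Real implementations defer the batch using the event loop or task scheduler. The `BatchLoader` class, the `CUSTOMERS` data, and `fetch_customers` are all hypothetical.

```python
from typing import Callable, Dict, Hashable, List


class BatchLoader:
    """Collects keys requested during a resolve pass and fetches them in one batch."""

    def __init__(self, batch_fn: Callable[[List[Hashable]], Dict[Hashable, object]]):
        self._batch_fn = batch_fn
        self._pending: List[Hashable] = []
        self._cache: Dict[Hashable, object] = {}

    def load(self, key):
        """Queue a key and return a zero-argument callable that yields its value later."""
        if key not in self._cache and key not in self._pending:
            self._pending.append(key)
        return lambda: self._resolve(key)

    def _resolve(self, key):
        if self._pending:
            # One round trip for every key queued so far, instead of one per key.
            self._cache.update(self._batch_fn(self._pending))
            self._pending = []
        return self._cache.get(key)


# Hypothetical in-memory stand-in for a "customers" table.
CUSTOMERS = {1: {"id": 1, "name": "Ada"}, 2: {"id": 2, "name": "Grace"}}
queries_run = []


def fetch_customers(ids):
    # Stands in for a single query such as: SELECT * FROM customers WHERE id = ANY(...)
    queries_run.append(list(ids))
    return {i: CUSTOMERS[i] for i in ids if i in CUSTOMERS}


# Three loan rows whose customer_id foreign keys are the "edges" to resolve.
loans = [
    {"id": 10, "customer_id": 1},
    {"id": 11, "customer_id": 2},
    {"id": 12, "customer_id": 1},
]
loader = BatchLoader(fetch_customers)
handles = [loader.load(loan["customer_id"]) for loan in loans]
names = [handle()["name"] for handle in handles]
print(names, "queries:", len(queries_run))
```

Resolving each loan's customer naively would issue three lookups; here the three requests collapse into one batch covering the two distinct ids.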
How likely is your organization to keep documentation up to date?
I’ve worked with hundreds of software engineers in my career and have met exactly two who delighted in comprehensive and accurate documentation. If you had a question about how to use their system, they would happily paste you a link to the docs and say come back later if anything was unclear. I got more done and they got fewer distractions.
I no longer work with either of those people, and Kabbagers are strong advocates for clean, “self-documenting” code. The trouble with that is you still have to take time to read through the source, or take up someone else’s time asking about intended functionality, to know how to integrate with their API. There are tools like Swagger that will generate documentation and create interfaces for you to explore endpoints for REST APIs, but they are something extra you add on, not an expected feature of a REST API.
GraphQL, on the other hand, includes introspection as part of the spec. This means you can query the API for information about the API. A user of your GraphQL API can use these introspection queries to ask questions like “What fields does this node have?” There are also tools like GraphQL Playground that use introspection to generate an interactive documentation and testing site for you.
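As a rough sketch of what “asking the API about itself” looks like, the snippet below builds a request around the spec-defined `__type` introspection field and pulls the field names out of a response. The `Loan` type and the sample response are invented for illustration, and sending the query as a JSON body with `query` and `variables` keys follows the common GraphQL-over-HTTP convention rather than anything specific to our API.

```python
import json

# __type and its fields are defined by the GraphQL introspection schema.
INTROSPECT_TYPE = """
query FieldsOf($typeName: String!) {
  __type(name: $typeName) {
    name
    fields {
      name
      type { kind name }
    }
  }
}
"""


def build_request(type_name: str) -> str:
    """Serialize the JSON body a client would typically POST to a GraphQL endpoint."""
    return json.dumps({"query": INTROSPECT_TYPE, "variables": {"typeName": type_name}})


def field_names(response: dict) -> list:
    """Pull field names out of an introspection response; empty if the type is unknown."""
    type_info = (response.get("data") or {}).get("__type")
    if not type_info:
        return []
    # "fields" is null for types that have no fields, such as scalars.
    return [f["name"] for f in type_info.get("fields") or []]


# Hypothetical response for an invented "Loan" type.
sample_response = {
    "data": {
        "__type": {
            "name": "Loan",
            "fields": [
                {"name": "id", "type": {"kind": "NON_NULL", "name": None}},
                {"name": "status", "type": {"kind": "SCALAR", "name": "String"}},
            ],
        }
    }
}
print(field_names(sample_response))
```

Tools that generate interactive docs do essentially this, just with a much larger introspection query covering the whole schema.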
New technologies and styles of designing systems are never silver bullets that solve all of your problems. Rather, they are like differently sized snowplows that take some of your problems and just push them somewhere else. There are always trade-offs between approaches, and GraphQL vs. REST is no exception. We chose GraphQL as the communication layer to our decisioning data because of the variety of access patterns we needed to support and the timeline we had to get this data behind a centralized API. That choice created new problems: translating and batching an incoming graph query into the correct SQL is complex, and building a solution that generates efficient queries took longer than writing any single query behind a REST endpoint would have. But in exchange for that upfront effort we get amazing query flexibility, and the built-in documentation via introspection helps us get our internal customers migrated faster and with less confusion.