Rethinking an API for Cacheability
At the beginning of anything new, there is just so much uncertainty. The first versions of an API built to support a growing, yet-to-be-defined product can easily be built on top of quicksand. You’re doomed to make mistakes, but if you overcome those errors and ship something tangible, you may get a second chance: redoing, but better.
Redoing gives you the possibility of starting with the favorable, unfair advantage of knowing (a little) about what doesn’t work. You may not know precisely what works, but you know a few things that don’t.
And this is amazing for your API.
I recently wrote about our API migration from REST/JSON to GraphQL at letsevents.com.br. A migration between different API query protocols is an opportunity to redesign your API as a whole. If you limit it to a protocol translation, you’re not using the additional knowledge you acquired the first time around, and you’re wasting resources. By the way, if you decide to redo something, make sure you do it clearly better, to compensate at least for the lost time to market.
Our original API grew like a cute little monster — as most software tends to do. As new features needed to be implemented, stuff was added to the API on top of what was already there. Fields. Endpoints. Relationships. More fields.
Like the figures in the geometric pattern of a coloring book, there was no explicit starting point and no well-defined way things were supposed to communicate, although they were somewhat correlated.
Fresh start: what to return?
In the API world, we looked at that question in terms of data vs. policies. Policies are a subset of the whole business logic, which somehow also includes your data, or at least the way you model and validate it. Our APIs were full of policies, and by rethinking our design we realized they were actually harmful.
Our business domain talks about Lists of people who attend Events. On every List, we have many Invitations, each of which roughly represents the attendance of a person at an Event through a List. We have a logged-in user (the Viewer) who may or may not be allowed to delete each of the invitations in each list of an event. That question can be answered with an authorization policy, calculated server-side for each invitation in an event (and an event may have thousands of them).
For simplicity, let’s say we could calculate this policy from data about the Event (it may be owned by the viewer), the List (it may have been created by the viewer), or the Invitation (the viewer may be the invitee, the inviter, or both). Since we had thousands of invitations, performance could become an issue, but we could cache these calculations to avoid repeating them every time, right?
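As a sketch of what that policy might look like (with hypothetical record shapes and field names, not our actual schema), it is just a boolean expression over the three records:

```typescript
// Hypothetical record shapes; field names are illustrative.
interface EventRecord { ownerId: string }
interface ListRecord { creatorId: string }
interface InvitationRecord { inviteeId: string; inviterId: string }

// The deletion policy as a boolean expression over the three records.
function canDeleteInvitation(
  viewerId: string,
  event: EventRecord,
  list: ListRecord,
  invitation: InvitationRecord
): boolean {
  return (
    event.ownerId === viewerId ||        // the viewer owns the event
    list.creatorId === viewerId ||       // the viewer created the list
    invitation.inviteeId === viewerId || // the viewer is the invitee
    invitation.inviterId === viewerId    // the viewer is the inviter
  );
}
```

The catch is the input: the result depends on the viewer, so it has to be evaluated (or cached) once per viewer per invitation.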
Policies usually depend on who is asking. Caching would need to happen on a per-viewer basis, and given that our permission model includes multiple staff members per event, caching could help but wouldn’t be really efficient, performance- and memory-wise.
But if our API only provided the final products (data) we have built, instead of policies, caching would become much easier. Caching Invitation responses could be as trivial as handling a few timestamps, since the response no longer depends on the Viewer. It’s up to our clients to calculate the policy themselves from the raw data returned by the API, but this is hardly a problem, as most policies can be expressed as simple boolean equations.
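The difference shows up clearly in the cache key. A minimal sketch, assuming each record carries a hypothetical `updatedAt` timestamp:

```typescript
// Hypothetical shape; we assume each record carries an updatedAt timestamp.
interface InvitationRecord { id: string; updatedAt: string }

// With policies out of the response, the cache key ignores the viewer:
// one cached entry serves every viewer until the invitation changes.
function invitationCacheKey(invitation: InvitationRecord): string {
  return `invitation:${invitation.id}:${invitation.updatedAt}`;
}

// With per-viewer policies baked into the response, the key must include
// the viewer, multiplying cache entries by the number of viewers.
function policyCacheKey(viewerId: string, invitation: InvitationRecord): string {
  return `${invitationCacheKey(invitation)}:viewer:${viewerId}`;
}
```

The same idea applies to HTTP-level caching: a viewer-independent response can carry a simple `Last-Modified` or `ETag` derived from the record’s timestamp.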
Having the data may be another issue, especially if you have an API that raises rigid walls within your data. But once you have the required data, you’re good to go.
Policies are just one example of information an API may return that depends on multiple contexts. Whenever such a calculation can also (safely) be done client-side, you should consider it. Obviously, delegating the calculation of policies to the client side doesn’t mean your server doesn’t need to enforce them whenever a client requests an operation on your data.
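To illustrate that last point, here is a sketch of a delete mutation that re-checks the policy server-side before touching the data. The names and the in-memory store are hypothetical, and the policy is simplified to the invitation-level checks:

```typescript
// Hypothetical shape; simplified to the invitation-level policy checks.
interface InvitationRecord { id: string; inviteeId: string; inviterId: string }

// A minimal in-memory store standing in for the real persistence layer.
const invitations = new Map<string, InvitationRecord>();

// The same boolean policy the client computes is re-checked on the server.
function canDelete(viewerId: string, invitation: InvitationRecord): boolean {
  return invitation.inviteeId === viewerId || invitation.inviterId === viewerId;
}

// The mutation enforces the policy before mutating anything.
function deleteInvitation(viewerId: string, invitationId: string): void {
  const invitation = invitations.get(invitationId);
  if (!invitation) throw new Error("Not found");
  if (!canDelete(viewerId, invitation)) throw new Error("Forbidden");
  invitations.delete(invitationId);
}
```

Clients compute the policy for UI purposes (showing or hiding a delete button); the server remains the authority for the actual operation.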