Replacing Relay with Redux

It’s been almost a year since GraphQL and Relay were released. Since then, GraphQL has been hyped as the clean, declarative successor to REST — and for good reason, too. With a GraphiQL UI for queries, a built-in schema that eliminates the need for an ORM, and a single endpoint, it’s safe to say that GraphQL is a hit. Conversely, Relay hasn’t seen the same enthusiasm as its server-side counterpart.

Does that mean it’s bad code? Absolutely not! It’s a feat of modern-day engineering made by incredibly smart people, for incredibly smart people; the ideas behind it are fantastic. To make sure this point is not lost on the reader (and because I just started watching Silicon Valley) I’ll preface each criticism by first stating that Relay is great, but, ya’know (Rigby).

  • Rigby, Redux is a great way to manage local state, and I’d like to use it to manage domain state, too
  • Rigby, a client cache shouldn’t require big server changes. If it makes for more efficient refetches, sure why not; but hopefully, a client cache works with any GraphQL server
  • Rigby, I just want a client cache, not a whole rewrite of my routes and containers that is tightly coupled to React
  • Rigby, HTTP/2 and websockets are here, so it’s not super critical that I fetch all the info in 1 trip. Frequent trips with smaller payloads can even make for a nicer UX
  • Rigby, I shouldn’t need to be an expert in graph theory to understand how to write a mutation
  • Rigby, when an array is mutated, sometimes the new doc needs more logic than append or prepend
  • Rigby, if I know what queries a mutation affects, I shouldn’t need to write a fat query
  • Rigby, optimistic updates are so similar to server updates that writing the same thing twice makes me sad
  • Rigby, my project isn’t Facebook-sized, so the gzipped size of Relay is bigger than the aggregate of the payload savings it provides

TL;DR

OK, enough justification on why we need something new; let’s focus on the solution. I wanted to take the best parts of Relay (the hundreds of hours of thought behind it) and the best parts of Redux (the easy-to-grok code and friendly API) and combine the two. The result is a package I call Cashay, which I posted on GitHub. I started writing it a while ago, but life was just too much fun to spend at the computer. Knowing full well that a new package would attract a sea of high-calibre fatiguer beliebers (I’ll say it… JS fatigue whiners are more annoying than Justin Bieber fans), I decided to go through each pain point I had in Relay and systematically explain how & why Cashay does it differently.

Problem #1: Unix Philosophy: Do one thing, and do it well

A cache is a tool that sacrifices a little memory in exchange for a faster result. That’s it. That’s all Cashay tries to do. It takes what you want, sees what you already have, and goes and gets the difference. Really, really simple. You wanna see what you got? Crack open redux-devtools. You’ll see all your variables & schema-normalized data waiting for your eager eyes. Did someone say serializable persistent data? No? OK, just me then…
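To make the "goes and gets the difference" idea concrete, here is a minimal sketch in plain JavaScript. It is not Cashay's actual code: the missingFields helper and the shape of the cache object (documents grouped by GraphQL type, then by id) are assumptions made up for illustration.

```javascript
// Hypothetical sketch, not Cashay's real internals.
// The cache is assumed to be normalized: type name -> id -> document.
const cache = {
  Post: {
    p1: { id: 'p1', title: 'Replacing Relay with Redux' },
  },
};

// Return the fields we still need to request for one document.
function missingFields(cache, typeName, id, wantedFields) {
  const docsOfType = cache[typeName] || {};
  const doc = docsOfType[id] || {};
  return wantedFields.filter(
    (field) => !Object.prototype.hasOwnProperty.call(doc, field)
  );
}

// We want id, title and body; only body is missing locally.
const toFetch = missingFields(cache, 'Post', 'p1', ['id', 'title', 'body']);
console.log(toFetch);
```

Whatever comes back empty never needs to hit the network; whatever is left is the only thing worth asking the server for.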

Colocating data requests with data requirements is fantastic, but it doesn’t belong in the cache. The cache should be front-end agnostic; colocation isn’t. If you care more about colocation than caching, I’d highly recommend adrenaline, and in the future, maybe Cashay and adrenaline will play nicely together. Anyways, the same problem also affects server-side CSS, and both problems can be solved with the same technique. To learn more:

Problem #2: Using a vanilla GraphQL endpoint

Let’s focus on the server. Client software that requires changes to your server might throw up some red flags, but Relay does it for a good reason: edges and nodes. In graph theory, everything is edges (relationships) and nodes (entities). Now, I’m no stranger to graphs (I’ve even written some npm packages for highly obscure bipartite graph heuristics that no one uses), but just because graphs are powerful doesn’t mean they’re necessary. So, let’s see how we can offload cursors and pageInfo (hasNextPage, hasPreviousPage) from the edges. Turns out it isn’t too hard.

First, let’s tackle cursors. Relay 100% got it right by using cursor-based pagination. There is no logical reason to “skip to page 3”. If that’s what your users are doing, it’s a clear sign that you are failing to offer the proper query, sort, or filter. I often wonder why Google still does this (although Google Images is an infinite scroll)… maybe they just accept that people don’t like change? Regardless, Relay needs a 1-to-1 relationship between cursors and documents. That way, if you want 10 documents after document #5, you get the cursor for #5 and send off a request for the next 10. So if we assume that a document always has the same cursor, regardless of the query (and I can’t think of a reason for that to be false), then we just put the cursor on the document itself. In the future, this metadata could even be attached via a GraphQL annotation. This could be a timestamp, UUID, you name it. To learn more, see how Disqus does it: http://cramer.io/2011/03/08/building-cursors-for-the-disqus-api
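As a rough sketch of what "the cursor lives on the document" could look like, assume every document has a cursor field (a timestamp here) and the list is already sorted by it. The field name cursor and the pageAfter helper are hypothetical, not part of Relay, GraphQL, or Cashay.

```javascript
// Hypothetical data: each document carries its own cursor (here, a timestamp).
const posts = [
  { id: 'a', cursor: 1001 },
  { id: 'b', cursor: 1002 },
  { id: 'c', cursor: 1003 },
  { id: 'd', cursor: 1004 },
];

// Return up to `count` documents that come after the document with `afterCursor`.
// Passing null or undefined starts from the beginning of the list.
function pageAfter(docs, afterCursor, count) {
  if (afterCursor == null) return docs.slice(0, count);
  const index = docs.findIndex((doc) => doc.cursor === afterCursor);
  if (index === -1) return []; // unknown cursor: nothing to page from
  return docs.slice(index + 1, index + 1 + count);
}

// The cursor for the next request is read straight off the last document.
const firstPage = pageAfter(posts, null, 2);
const lastCursor = firstPage[firstPage.length - 1].cursor;
const secondPage = pageAfter(posts, lastCursor, 2);
console.log(secondPage.map((doc) => doc.id)); // the ids c and d
```

No edge wrapper is needed: the client reads the cursor from whichever document it wants to continue from.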

Second, let’s tackle pageInfo. The Relay spec for it is pretty darn brittle:

hasPreviousPage is only meaningful when last is included, as it is always false otherwise

What’s that mean? Well if I click “next page” and I’m on page 2, then hasPreviousPage is false. Seriously. So how do we improve it? Again, pretty simple. pageInfo should be derived from how much data we have in the local state. If I’m showing documents 1–10 and I have an 11th document locally, then hasNextPage should be true. So how do we accomplish that?

Option #1: Request more and hide the rest in the application logic

You can accomplish this without any logic from your server or cache (i.e., in the view-model layer). When you want 5, ask for 6 and have the last one fade to white so the user understands that they need to request more. When they request more, you make your request starting with the cursor of the faded one, unfade it, and then wait for the next page to come in. When there’s no more, you don’t see a faded doc.
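Here is one way that view-model logic might look. It is only a sketch: toViewModel, fadedDoc and nextCursor are names invented for this example, and it assumes each document carries a cursor field.

```javascript
// Hypothetical view-model helper: we asked the server for pageSize + 1 documents.
function toViewModel(fetchedDocs, pageSize) {
  const hasNextPage = fetchedDocs.length > pageSize;
  const fadedDoc = hasNextPage ? fetchedDocs[pageSize] : null;
  return {
    docs: fetchedDocs.slice(0, pageSize), // rendered normally
    fadedDoc,                             // rendered fading to white, if present
    hasNextPage,
    // The next request starts at the faded document's cursor.
    nextCursor: fadedDoc ? fadedDoc.cursor : null,
  };
}

// We want 5, so we asked for 6 and got 6 back.
const sixDocs = [1, 2, 3, 4, 5, 6].map((n) => ({ id: `p${n}`, cursor: n }));
const page = toViewModel(sixDocs, 5);
console.log(page.docs.length, page.hasNextPage, page.nextCursor);
```

If the server returns 5 or fewer documents, fadedDoc is null and hasNextPage is false, which matches the "no faded doc" case.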

Extra application logic not your thing? Don’t mind modifying the GraphQL server? Ok…

Option #2: Have your server send n+1 documents

If the client asks for 5 documents, return 6. Cashay is smart enough to only show what you asked for, but it’ll keep that 6th one handy. Again, this has the benefit of decreasing perceived latency because the next document is returned immediately & while the user is consuming that fresh new info, you’re already coming back with the next 5 from the server.

Talk about a sweet UX! But I know there’s a turd out there saying something like, “I’m a very important person and I can’t afford to send a possibly useless 72 bytes down the wire.” Really? …really? Fine.

Option #3: Have your server send an extra null

If a small payload is what really makes you happy in life, then just add a null to the end of your server response. Cashay ignores nulls for everything but determining if there is another doc out there. Enjoy your perfectly run-of-the-mill UX.
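On the client, reading that trailing null could be as small as the sketch below. The readPage helper is hypothetical and only illustrates the convention described above; it is not Cashay's implementation.

```javascript
// Hypothetical helper: a trailing null means "there is at least one more doc".
function readPage(response) {
  const last = response[response.length - 1];
  return {
    docs: response.filter((doc) => doc !== null), // nulls are never shown
    hasNextPage: response.length > 0 && last === null,
  };
}

const withMore = readPage([{ id: 'p1' }, { id: 'p2' }, null]);
const lastPage = readPage([{ id: 'p3' }]);
console.log(withMore.hasNextPage, lastPage.hasNextPage);
```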

That solves our edges problem, but what about nodes? Relay offers up a really smart idea with their node interface. You send in a type and id (in an opaque base64 string), and it fetches that doc. Kind of like a getPostById query, but it works for Posts, Comments, and everything else in your schema. Let’s expand on that idea:

  1. Call a getTop5PostIDs query that returns an array of 5 IDs.
  2. Filter that list against the IDs you already have
  3. Call a second getPostsByIDs query passing in the filtered list

That makes for super efficient payloads at the expense of 2 network requests. So now we’ve got options: either fetch everything in 1 request (faster, bigger payload), or use a predictive fetch technique. For example, if the user hovers over a button, execute getTop5PostIDs. If they don’t click it, you only lose out on a few bytes (an array full of IDs). If they do click it, you efficiently grab the posts you don’t already have via getPostsByIDs. Decoupling cache from the view layer for the win!
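The three steps can be sketched as follows. The query names getTop5PostIDs and getPostsByIDs come from the list above; everything else (the api object, idsToFetch, loadTopPosts, and the fake API used so the example runs without a server) is an assumption for illustration.

```javascript
// Step 2: keep only the IDs that are not in the local cache yet.
function idsToFetch(ids, cachedPosts) {
  return ids.filter(
    (id) => !Object.prototype.hasOwnProperty.call(cachedPosts, id)
  );
}

// `api` is a stand-in for whatever sends your GraphQL queries.
async function loadTopPosts(api, cachedPosts) {
  const ids = await api.getTop5PostIDs();       // step 1: cheap list of IDs
  const missing = idsToFetch(ids, cachedPosts); // step 2: filter locally
  const fetched =
    missing.length > 0 ? await api.getPostsByIDs(missing) : []; // step 3
  const merged = { ...cachedPosts };
  for (const post of fetched) merged[post.id] = post;
  return ids.map((id) => merged[id]);
}

// A fake API so the sketch runs without a server.
const fakeApi = {
  requestedIds: null,
  async getTop5PostIDs() {
    return ['p1', 'p2', 'p3', 'p4', 'p5'];
  },
  async getPostsByIDs(ids) {
    this.requestedIds = ids;
    return ids.map((id) => ({ id, title: `Post ${id}` }));
  },
};

const cachedPosts = {
  p1: { id: 'p1', title: 'Post p1' },
  p3: { id: 'p3', title: 'Post p3' },
};

loadTopPosts(fakeApi, cachedPosts).then((posts) => {
  console.log(fakeApi.requestedIds); // only p2, p4 and p5 were requested
  console.log(posts.length);
});
```

For the predictive variant, step 1 would run on hover and steps 2 and 3 on click.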

And with that, the server problem is solved! No more Relay-enabled server, just your standard GraphQL schema. In the future, this could also work with 3rd party services that only return 1 cursor per page (albeit less efficiently).

I’ll be working on these features as soon as I have the time… or a billable project requires it. *hint hint*

Problem #3: Mutations

Rigby, the Mutations API for Relay is ugly. But for good reason. 80% of the difficulty of creating a client cache rests in the mutation. This is largely due to the huge number of variables at play. For example, let’s say you delete a post that was in the Top5Posts. Do you keep the list as is with a document that no longer exists? Do you only show the remaining 4 posts? Do you fill the hole with the next-best post you have locally? Do you say screw it and requery the whole thing? The answer, as every elbow-patched college professor loves to say, is, “it depends”.

When I started to think about problems like these, I began to appreciate why the API was so darn unwieldy. Relay accomplishes the herculean task of solving (most of) these questions while keeping mutations decoupled from queries and keeping the schema off the client.

I say screw it to both.

  1. If each query handles a mutation differently, let the developer decide exactly how a specific query should handle a specific mutation. No more complicated getConfigs logic and 100+ LOC mutations.
  2. The client schema isn’t that big. Cashay makes it even smaller. The client schema for a medium-large app is <10KB gzipped, and having the schema on the client yields some big benefits.

Once you include the clientSchema and write 1 handler per query-mutation, life gets easier. For example, you don’t have to write fatQueries; Cashay writes them for you (I’ll explain how in a future post). You can also use the same handler for your optimistic UI and your server mutation. Best of all, there’s no limit to how a mutation can affect a query. Wanna stick the new document in the middle of an array? Fine. If that new document contains X, can you edit every document in another array and then reverse it? Weird, but sure. Should deleting a document trigger an entire query invalidation? You’re the boss!
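To illustrate the "1 handler per query-mutation" idea, here is a small sketch. This is not Cashay's actual API: the handlers object, the applyMutation function, and the query and mutation names are all hypothetical, chosen only to show how one function can serve both the optimistic update and the server response.

```javascript
// Hypothetical shape: handlers[queryName][mutationName](currentData, doc) -> newData.
const handlers = {
  top5Posts: {
    // Deleting a post: this query chooses to just show the remaining posts.
    deletePost: (posts, deleted) =>
      posts.filter((post) => post.id !== deleted.id),
    // Creating a post: place it by score, wherever that lands, and keep the top 5.
    createPost: (posts, created) =>
      [...posts, created].sort((a, b) => b.score - a.score).slice(0, 5),
  },
};

// The same handler runs for the optimistic document and for the server's document.
function applyMutation(queryName, mutationName, currentData, doc) {
  const queryHandlers = handlers[queryName] || {};
  const handler = queryHandlers[mutationName];
  return handler ? handler(currentData, doc) : currentData;
}

const before = [
  { id: 'p1', score: 9 },
  { id: 'p2', score: 4 },
];

// Optimistic pass: a temporary document made up on the client.
const optimistic = applyMutation('top5Posts', 'createPost', before, {
  id: 'temp',
  score: 7,
});

// Server pass: the real document, applied to the pre-optimistic data.
const confirmed = applyMutation('top5Posts', 'createPost', before, {
  id: 'p3',
  score: 7,
});

console.log(optimistic.map((p) => p.id), confirmed.map((p) => p.id));
```

A different query could register its own deletePost handler that refills the hole from local data or throws the whole result away; the mutation itself doesn't need to know.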

Solution

All of this talk is pretty useless without a POC, so I created Cashay-playground. Go ahead and check out how pagination knows when there are no more docs to fetch. Or how the comments are fetched. Or how you can add a comment. Install the redux devtools extension and see how Cashay stores all of it in your state.

Under the hood, Cashay caches the data after it fetches & denormalizes it from the state. It also manages dependencies for those denormalized queries so there is no unnecessary re-rendering. The end result is a pretty dang simple tool that’s highly performant with easy-to-grok code. What’s exciting about it is that with all the information stored in your state, ordered by GraphQL type, building a Django-style admin tool is dead simple — just spit your entire state onto the screen. Soon, I’ll be building a package to do exactly that.
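For readers who haven't worked with normalized state, the sketch below shows the general technique: entities stored by GraphQL type and id, a denormalize step that stitches them back together, and a tiny memo that returns the previous result when none of the underlying documents changed. The state shape and the denormalizePost function are assumptions for illustration; Cashay's real internals may differ.

```javascript
// Hypothetical normalized state, ordered by GraphQL type and then by id.
const state = {
  entities: {
    Post: { p1: { id: 'p1', title: 'Hello', comments: ['c1', 'c2'] } },
    Comment: {
      c1: { id: 'c1', body: 'First!' },
      c2: { id: 'c2', body: 'Nice post' },
    },
  },
};

// Single-slot memo: remember which documents produced the last result.
let lastInputs = null;
let lastResult = null;

function denormalizePost(entities, postId) {
  const post = entities.Post[postId];
  if (!post) return undefined;
  const comments = post.comments.map((id) => entities.Comment[id]);
  const inputs = [post, ...comments];
  const unchanged =
    lastInputs !== null &&
    inputs.length === lastInputs.length &&
    inputs.every((doc, i) => doc === lastInputs[i]);
  if (unchanged) return lastResult; // same object, so views can skip re-rendering
  lastInputs = inputs;
  lastResult = { ...post, comments };
  return lastResult;
}

const first = denormalizePost(state.entities, 'p1');
const second = denormalizePost(state.entities, 'p1');
console.log(first === second); // same reference while nothing changed
```

Because Redux state is updated immutably, a changed comment is a new object, the reference check fails, and a fresh denormalized result is built.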

Closing Remarks

My goal here isn’t to crap on Relay. It’s brilliant, just a little too complicated. The reality is that Relay is built to solve a problem specific to Facebook, and that problem isn’t as common as the one that GraphQL solves, ergo different solutions will emerge. It works great for some folks, and some might even prefer it over Cashay. After all, Facebook is probably more reputable than some dude living down in Mexico pushing code on the weekends. Then again, the notion that a small shop can do something better than a megacorp is probably why half of us have jobs. That’s why we’ll be using Cashay at Parabol to build out our new open-sourced project management service: Action. Not only does it keep you out of boring meetings, but it serves as a production-quality example webapp that the JavaScript community so desperately needs.

Ultimately, I want to open up a conversation to discuss how we can improve client caches. Neither Relay nor Cashay is going to be the client cache of choice in 5 years. Eventually, we’ll need something that can handle TTL for persisting infrequently queried data, offline-first, subscriptions (working on it!), and CmRDT subscriptions (e.g., collaborative document editing, like swarm.js). These are really exciting times for JavaScript. Cashay isn’t the final answer, but maybe it’ll be a stepping stone to get there. After all, the beauty of OSS is that no matter who gets it right, we all win.