Reconsidering GraphQL because REST is actually terrible
By now, you’ve probably caught wind of the growing hype around GraphQL, the trendy new way to do APIs pioneered by Facebook.
If you’re a responsible adult working on an established product or at an established enterprise, the mature, responsible thing to do is to hesitate on buying the hype, lest ye get caught in the trough of disillusionment. After all, SOAP was also once pitched as the better way to build web services, but now it’s relegated to REST’s dedicated straw man and punching bag. The “correct” response with GraphQL is to wait and see, and stick with the tried-and-true industry standard, REST.
But what if REST APIs are actually so horrible that most alternatives, even if not fully vetted, are just better by default?
REST vs HTTP (and REST isn’t an API)
It’s a reasonable opinion to view REST as a perfected version of HTTP, the full expression of its principles. On the other hand, most rank-and-file programmers can go their entire careers consuming and writing web APIs while having no idea what distinguishes a true REST API from any other HTTP API (we’ve all heard the terms used interchangeably).
Another thing people don’t realize is that in Roy Fielding’s doctoral dissertation, REST was not a type of API! REST was an architectural style. As such, several of REST’s primary tenets should have no bearing on REST-as-an-API. For example, Fielding’s dissertation outlines six architectural constraints, including statelessness and layered systems. We can all agree that these are good architecture ideas, but they have no bearing on the API, which is an interface. I’ve worked with large systems built on session state, and it makes almost no difference to the API surface. Similarly, whether a back-end system has many layers or is one gigantic piece of spaghetti code should be merely an implementation detail, invisible to the API layer.
Hypermedia as the engine of application state (HATEOAS)
The most defining trait of REST-as-an-API is the one the fewest people want. HATEOAS says that you’re not supposed to know the REST endpoint URLs that you read on the documentation page. Instead, you’re supposed to go to an initial “home” endpoint, and the JSON (for example) for that resource will contain hyperlinks to other objects, and the JSON of those will contain more hyperlinks, etc.
That’s right — in true REST, you’re not supposed to know the API URLs. If you didn’t already know that, I’m sorry but it’s completely true and I’m not making it up. Most people go through a brief period of denial.
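As a rough sketch of the link-following flow described above, the Python below walks from a “home” resource to a nested one by reading hyperlinks out of each response instead of hard-coding URLs. The URLs, the `_links` field name, and the in-memory `FAKE_API` dictionary standing in for a server are all assumptions made for illustration; real hypermedia formats vary.

```python
# Hypothetical responses keyed by URL; a stand-in for real HTTP calls.
FAKE_API = {
    "/": {"_links": {"users": "/u"}},
    "/u": {"items": ["ada"], "_links": {"first": "/u/1"}},
    "/u/1": {"name": "ada", "_links": {"posts": "/u/1/p"}},
    "/u/1/p": {"items": ["hello world"], "_links": {}},
}


def fetch(url):
    """Pretend to GET a URL; a real client would issue an HTTP request."""
    return FAKE_API[url]


def follow(start_url, rels):
    """Start at start_url and follow each named link relation in turn.

    Returns the final resource and the list of URLs visited on the way.
    """
    url = start_url
    visited = [url]
    resource = fetch(url)
    for rel in rels:
        url = resource["_links"][rel]  # the URL is discovered, not known up front
        visited.append(url)
        resource = fetch(url)
    return resource, visited


posts, visited = follow("/", ["users", "first", "posts"])
print(posts["items"])  # ['hello world']
print(len(visited))    # 4
```

In this toy setup, reaching the posts takes four requests, because every hop has to be fetched before the next URL is known.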
HATEOAS isn’t just annoying, it’s a security flaw. It makes every resource endpoint an XSS vector.
HTTP is an insult to network engineering
The Other Kind of ‘Stateless’
HTTP is a “stateless” protocol. When Roy Fielding wrote his dissertation on REST, he also made statelessness a central tenet. But there’s statelessness, and then there’s statelessness.
If you’re old enough, you might mistakenly believe “state” refers to the outmoded practice of having a RAM-based session cache for HTTP servers in order for the server to identify consecutive requests from the same user and respond accordingly. (If you’re young enough, you don’t even understand what I’m describing.) I work at a software company where engineers are still working to remove session caches from legacy code.
RAM-based session caches are terrible for scalability, but they don’t actually matter for an API. Indeed, the statefulness of a server is a mostly invisible implementation detail from the vantage of the caller. The need for statefulness is a direct function of the needs of the application. We still have stateful servers. The only difference is we now move the needed state to storage like an RDBMS, Memcached, or Redis instead of RAM. Then we call this “stateless”, but it’s not stateless, it’s just scalable.
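A minimal sketch of that point, under loud assumptions: the class names, the `handle_add_to_cart` handler, and the plain dictionary standing in for Redis, Memcached, or an RDBMS are all hypothetical. The handler is identical in both cases; only where the state lives changes, and with it whether a second server can see the session.

```python
class InMemorySessions:
    """Session state held in one server process's own RAM."""

    def __init__(self):
        self._data = {}

    def get(self, session_id):
        return self._data.get(session_id, {})

    def put(self, session_id, state):
        self._data[session_id] = state


class ExternalSessions:
    """Session state held in a shared store (a dict stands in for Redis etc.)."""

    def __init__(self, shared_backend):
        self._backend = shared_backend

    def get(self, session_id):
        return self._backend.get(session_id, {})

    def put(self, session_id, state):
        self._backend[session_id] = state


def handle_add_to_cart(sessions, session_id, item):
    """Same handler for both stores: the caller cannot tell them apart."""
    state = dict(sessions.get(session_id))
    state["cart"] = state.get("cart", []) + [item]
    sessions.put(session_id, state)
    return state["cart"]


# Two "servers" with private RAM lose track of each other's sessions...
server_a, server_b = InMemorySessions(), InMemorySessions()
handle_add_to_cart(server_a, "s1", "book")
cart_ram = handle_add_to_cart(server_b, "s1", "pen")

# ...while two "servers" sharing an external store see the same state.
backend = {}
server_c, server_d = ExternalSessions(backend), ExternalSessions(backend)
handle_add_to_cart(server_c, "s1", "book")
cart_shared = handle_add_to_cart(server_d, "s1", "pen")

print(cart_ram)     # ['pen']
print(cart_shared)  # ['book', 'pen']
```

Both variants are stateful; the second one simply keeps the state somewhere every server can reach.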
HTTP is a thin wrapper of text over TCP
Actually, the RAM-based session cache is not what “stateless” refers to in REST at all! In HTTP, and therefore REST, what “stateless” really means is that each HTTP session closes the TCP socket after each request/response pair. It’s the whole point of HTTP, and it turns out it’s actually terrible. It was suitable for a WWW predicated on browsers navigating between documents and downloading exactly one, entire, static document, in exactly one request, per navigation, end of story.
Today, in a RESTful API, each data item on any page requires a separate request. In a truly stateless connection, that would mean each data item on the page would need a separate TCP handshake, followed by an SSL handshake, and then finally the HTTP request/response. Each TCP handshake requires a round trip across the internet before any data can flow, and the SSL handshake adds one or two more (and many more packets). That makes several round trips per REST call, each repeated for each of 100 data items on the page. Thousands of round trips, for thousands of simultaneous users, on each of thousands of sites using REST APIs. The internet would literally die. If you told a highly skilled network engineer in 1990 (pre-web) this is what we would be doing with the internet, he’d rip the wires out of the ground because we don’t deserve them. Fortunately, they fixed this by changing the browsers and servers to treat HTTP like it’s not HTTP anymore.
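For a back-of-the-envelope feel for the numbers, here is a small sketch. The per-step costs are assumptions, not measurements: one round trip for the TCP handshake, two for a full TLS 1.2-style handshake, one for the request/response itself, with requests issued one after another. The `round_trips` function and its parameters are invented for this illustration.

```python
def round_trips(items, tcp_rtt=1, tls_rtt=2, request_rtt=1, reuse_connection=False):
    """Estimate sequential round trips needed to fetch `items` resources."""
    setup = tcp_rtt + tls_rtt
    if reuse_connection:
        # One connection setup, then one round trip per request.
        return setup + items * request_rtt
    # A fresh connection (TCP + TLS) for every single request.
    return items * (setup + request_rtt)


per_page_fresh = round_trips(100)
per_page_reused = round_trips(100, reuse_connection=True)
print(per_page_fresh)   # 400
print(per_page_reused)  # 103
```

Under these assumptions, a page with 100 items costs roughly four times as many round trips when every request opens its own connection, which is the gap that connection re-use exists to close.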
The evolution of HTTP has been a slow-moving trudge toward fixing the stateless mistake (by making HTTP not actually stateless at all). First, we had keep-alive and HTTP/1.1’s persistent connections, which allow client and server to loosely agree to re-use the same TCP & SSL connection, probably. (This seems to be good enough to save the internet, although as an application developer it’s opaque, so there’s no way to verify if/when/how it works, nor to tweak or optimize your app around it. Theoretically, we may regard each request as discrete and stateless, but practically, we know the internet will only continue to work insofar as the requests re-use the TCP connection.)
Next, we had WebSockets, which are supposed to be the next best thing to HTTP having never been invented and just going back to writing custom duplex TCP protocols.
Somehow, these still weren’t enough, because next and most heinously, we will have HTTP/2’s Server Push (along with several other features applicable to this topic!) Server Push disturbingly advertises the ability for a web server to make a guess & push out additional responses before the client actually requests them. (For comparison, GraphQL just advertises letting your client request exactly the data it needs.)
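To make that parenthetical comparison concrete, here is a toy sketch of a client naming exactly the fields it wants. It is not a GraphQL implementation: the `select` helper, the nested-dictionary selection format, and the sample record are all invented for illustration, whereas a real GraphQL server parses a query language and runs resolvers.

```python
def select(data, selection):
    """Return only the parts of `data` named in `selection`.

    `selection` maps field names to True (take the value as-is) or to a
    nested selection (recurse into a dict or a list of dicts).
    """
    if isinstance(data, list):
        return [select(item, selection) for item in data]
    result = {}
    for field, sub in selection.items():
        value = data[field]
        result[field] = value if sub is True else select(value, sub)
    return result


# Hypothetical record a server might hold.
user = {
    "id": 1,
    "name": "Ada",
    "email": "ada@example.com",
    "posts": [
        {"id": 10, "title": "Hello", "body": "first post text"},
        {"id": 11, "title": "Again", "body": "second post text"},
    ],
}

# Roughly what the GraphQL query `{ name posts { title } }` asks for.
wanted = {"name": True, "posts": {"title": True}}
picked = select(user, wanted)
print(picked)
# {'name': 'Ada', 'posts': [{'title': 'Hello'}, {'title': 'Again'}]}
```

One request carries the whole shape of the answer, so nothing has to be guessed or pushed ahead of time.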
Each of these upgrades had to spend years being vetted, and then painstakingly implemented on both clients and servers. And they’re all designed to address how unbelievably terrible HTTP is for APIs.
HTTP is unstructured chaos
This is the OSI model.
Network engineers designed the OSI model to help describe how the complicated network we call the Internet works. Each layer runs on its own protocols and data model.
This is an HTTP request:
The thing to notice about HTTP is that everything that would be “on top” of the OSI model is now condensed into a single request/response protocol. Parts of the protocol which allow the client to communicate with different parts of the back-end system are intermingled into a single indecipherable request. With back-end systems becoming increasingly complicated, with multiple moving parts and microservices, the requests and responses are becoming unmanageable. Proxies, gateways, server software, your application, your database, your message broker, and other back-end servers can each handle a different part of the request separately, and any of them can return an error.
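As an illustration of that intermingling, the sketch below parses one made-up HTTP/1.1 request and labels each header with the component that commonly acts on it. The host, token, and cookie values are hypothetical, and the `TYPICAL_CONSUMER` mapping is a rough assumption about common deployments, not a rule from the HTTP specification.

```python
RAW_REQUEST = (
    "GET /api/users/1 HTTP/1.1\r\n"
    "Host: api.example.com\r\n"
    "Connection: keep-alive\r\n"
    "Cache-Control: no-cache\r\n"
    "Authorization: Bearer abc123\r\n"
    "Cookie: session=xyz\r\n"
    "Accept: application/json\r\n"
    "\r\n"
)

# Rough, assumed mapping of header -> the component that usually acts on it.
TYPICAL_CONSUMER = {
    "host": "load balancer / virtual host routing",
    "connection": "the next hop (proxy or server)",
    "cache-control": "caches and CDNs",
    "authorization": "gateway or application auth",
    "cookie": "application session handling",
    "accept": "application content negotiation",
}


def parse_request(raw):
    """Split a raw HTTP/1.1 request into its request line and headers."""
    head = raw.split("\r\n\r\n", 1)[0]
    lines = head.split("\r\n")
    method, path, version = lines[0].split(" ")
    headers = {}
    for line in lines[1:]:
        name, value = line.split(":", 1)
        headers[name.strip().lower()] = value.strip()
    return method, path, version, headers


method, path, version, headers = parse_request(RAW_REQUEST)
for name, value in headers.items():
    print(f"{name}: {value!r} -> {TYPICAL_CONSUMER.get(name, 'unknown')}")
```

Six headers in one flat list end up addressed to several different parts of the system, and nothing in the request itself says which is which.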
Applications achieve nominal “statelessness” by storing their state in the request with the expectation/need that it will be passed back (cookies, for example). The notion of building an API into this system is absurd. Your client software can follow your documentation explicitly and still never come close to creating working software because of the overhead involved in dealing with proxies, etc. This clearly only works when the browser is the client, and it must be able to manage all of the other cruft flawlessly. There’s really no such thing as a REST API over HTTP, because HTTP is already so bad it contradicts everything.
Understanding APIs through the lens of Google and Facebook
Both Google and Facebook’s strategies betray the fact that HTTP and the Web are completely awful.
Google’s main money maker is search, and by creating it, they fixed the Web. Google owns the Web. As a reward, they’ve been a benevolent master by continuing to give us great stuff like Gmail, G Suite, and dozens of other odds and ends, mostly for free (and probably usually at a loss), in order to help make the Web a great place to be and lock in their main money maker.
If you made another Internet today, you’d build search directly into the DNS, telemetry and analytics into the routing protocols, allow it to monetize the backbone providers, and Google’s main business model would be obsolete. Everything they do could be done more precisely at a lower level of the OSI model.
If the Web weren’t awesome, we’d probably set about creating just such a new Internet. The fact that Google creates so much great stuff for free is basically an admission of guilt: the guilt that the Web is, at its core, a terrible, terrible technology.
Google’s main competitor is Facebook, and Facebook is on the Web but completely independent from it. As such, you can see their approach to technology is to build tech that is independent from it. GraphQL abstracts HTTP away just as React abstracts the browser away.
By default, anything is better than REST
Yes, I’m concluding the article. If you were expecting another spam article advertising the latest GraphQL framework, sorry. This was always a diatribe against current standards, not a marketing pitch for a particular framework.
The problem is that we’re stuck on REST because we’re thinking of it as the industry standard, waiting for its throne to be usurped by a well-vetted, flawless solution beyond all reproach. The reality is REST-as-an-API has itself failed to deliver anything unique that we can value. REST encourages us to make superstitious, costly overtures concerning hypermedia, standard headers, response codes, statelessness, etc., but the most defining characteristics of REST are things nobody asked for, with no benefit to the client consumer.
When a new API standard is proposed, we should, by default, assume it’s better than REST, until proven otherwise.