A More Practical REST

Paul Wolf
Jul 27, 2018

The REST pattern for writing APIs has a lot of traction and is the basis for many public APIs. It does, however, have some serious problems. This article proposes some changes to the principles underlying REST to make it a better convention.

My criticism is of REST as defined by the following:

  • Roy Fielding’s dissertation
  • Common practices that have developed in recent years
  • Implementations by common framework libraries

There are many critical articles on REST and on REST vs JSON-RPC vs GraphQL et al. (see References). My purpose here is to be concise about the shortcomings of REST without comparing it to other protocols, and to propose an alternative set of conventions that are not miles away from current practice.

Problems with REST

REST was conceived in a time when the web was different. Modern development has shifted on some important issues:

  • HTTP Verbs: only excessive piety towards the ideal REST paradigm and rigidly prescriptive framework libraries keep developers using all of these instead of just GET and POST. “Use nouns in URLs, not verbs” is the advice usually offered, with no supporting evidence that there is a good reason for it.
  • Caching: One advantage is caching based on using the right verbs. Page caching is not what most applications are concerned about today. Caching on a “resource” level has to an extent been removed from the hands of the HTTP protocol in many use cases.
  • Stateless: Most applications are stateful if you include the case where state is itself retrieved via REST methods: GET list of bookings, GET a booking from the list. The usual meaning of ‘stateful’, though, seems to pertain to session information, and the concept of session information is vague. What information belongs to the session exactly?
  • Resource concept: REST confuses services with resources managed by services. /booking/ seems like a service and /booking/123/ seems like a resource locator. Saying that “resource” is an overarching abstraction is losing information in an unhelpful way.
  • HTTP response status codes: REST overstates the importance of using different verbs and understates the importance of consistent HTTP response codes. Application developers who look for guidance on using “proper” response status codes within REST find a carnival of exotic opinions.

Part of the problem with REST is that its founding principles were developed when resources were more static, caching was done differently, and web services as data services were not as clearly distinguished. Calls were more likely to be really stateless (Fielding’s dissertation was published in 2000).

Context

People write numerous internal services that communicate with one another: their internal services architecture. In addition, they present some of these interfaces publicly as external APIs. A common recommendation is to use JSON-RPC (or some RPC) internally and REST externally. The overwhelming trend in recent years is to use REST when presenting public APIs. But in many cases this is an unfortunate overhead. I think, without proof here, that most software designers want to have the same protocol for both internal and external APIs. That is one reason I’m not as interested in whether JSON-RPC or GraphQL is better than REST. Connected to this is my feeling that when developers use REST internally, they wrap it with their own framework code, making it look more like JSON-RPC but hewing to REST principles so that some of the interfaces can be exposed externally. This split in technology and design choices between internal and external APIs is harmful.

HTTP Response Codes

The HTTP response status code classes (setting aside 1xx, informational) are:

  • 2xx: successful
  • 3xx: redirect
  • 4xx: client has done something wrong
  • 5xx: service has done something wrong

The main example is 404, “not found”. Apache, Nginx, or whatever reverse proxy you use is going to return this for a URL that is not configured. A web application will also return 404 for a resource that is not found. This might be you, the developer, deciding it’s a 404, or it might be your REST framework. The team managing the web server might have rewritten the URL incorrectly. The client now does not know what he or she has done wrong. Example:

/api/v1/beast/666/

A large number of web services have a reverse proxy configuration of basic urls and then let the application use a router for directing to specific resources. Your client calls the above url and gets a 404 back.

The Beast does not exist? Maybe it was plural, /api/v1/beasts/666/? Trailing slash? Oh, that’s 301, is it? Maybe 666 is there but not visible to you because you don’t have permission to the resource. Maybe the web server rewrote this URL so the application router did not recognise it.

What a mess.

Let’s clarify what we mean when we say that an “entity” or a “layer” returns an HTTP response code. RFC 7230 puts it this way:

“HTTP enables the use of intermediaries to satisfy requests through a chain of connections.”

A gateway here is a reverse proxy, usually understood to be the web server, like Apache or Nginx. Confusingly, such a server responds with 502, “Bad Gateway”, when the upstream intermediary is not responding properly, so the terminology can be unclear about which intermediary is meant. For practicality I’ll use these terms:

  • client: e.g. some remote requesting entity, browser or some other user agent
  • reverse-proxy: a web server like Apache, Nginx or AWS API Gateway
  • application router: the code in a library linked in with the application code that routes requests based on URIs
  • application: code written by a developer that implements some logic like CRUD operations on a database, by way of example

There can be more intermediaries in the request chain but these are the ones that are useful for this discussion.

What we really want are different results depending on whether the web server or the application returns “not found” because the service doesn’t exist. Ideally, we also want to know if the application router is not finding the resource.

There’s an RFC for that (RFC 7807):

“Instead, the aim of this specification is to define common error formats[…], so that they aren’t required to define their own, or worse, tempted to redefine the semantics of existing HTTP status codes.”

It would be worse if developers defined their own codes, which is exactly the situation that REST plus standard HTTP response codes has caused. So, instead of fixing the problem, there is a proposal for a more complex protocol that still drags the old flaw with it.

It makes little sense to say the payload embeds the “application-specific” error code. And worse when someone decides to always return 200 via HTTP and a code embedded in the response provides the “real” status. And it can be remarkably difficult for developers to decide whether to return 400 or 500.

Current HTTP ranges (2xx, 3xx, 4xx, 5xx) are nowhere near expressive enough. There should be a way to indicate whose judgement was used to decide the status.

I propose this:

  • 404: reverse proxy does not have the resource url configured (“resource” in the “service” sense)
  • 604: application router has decided that a resource does not exist
  • 704: application business logic has decided that a resource does not exist (not found in db)

Status code ranges should be used that indicate which layer is responsible for the outcome. If 404 is returned from the web service, it’s obvious where the problem is only if you know the reverse proxy returned it. If 404 is returned from the application, the resource (not the service) does not exist. This distinction should not get lost.
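The layer-per-range idea can be sketched from the client's side. This is only an illustration of the proposal, not an existing standard: the function name `responsible_layer` is made up here, and treating every non-2xx code below 600 as belonging to the reverse proxy is a simplifying assumption.

```python
def responsible_layer(status: int) -> str:
    """Return the layer assumed to have decided this status code.

    Hypothetical helper illustrating the proposed ranges; the mapping of
    1xx/3xx/4xx/5xx to the reverse proxy is a simplifying assumption.
    """
    if not 100 <= status <= 799:
        raise ValueError(f"status {status} is outside the ranges discussed")

    hundreds = status // 100
    if hundreds == 2:
        # Success is reported by the application with the usual 2xx codes.
        return "application"
    if hundreds == 6:
        # 6xx: the framework's router made the decision (e.g. 604).
        return "application router"
    if hundreds == 7:
        # 7xx: the developer's business logic made the decision (e.g. 704).
        return "application"
    # Everything else keeps its standard HTTP meaning.
    return "reverse-proxy"


for code in (404, 604, 704):
    print(code, "->", responsible_layer(code))
```

With this in place a client can tell three different kinds of “not found” apart without parsing the response body.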

One possible objection is that architecture styles evolve and identifying layers in this way will become obsolete in the future. But it’s interesting that while some elements of Fielding’s description of system behaviour have changed over the years, the basic entities are pretty much the same. I don’t see a problem with the above scheme being used for serverless (e.g. AWS Lambda), Kubernetes/Docker or other container management systems.

Some people might react to this by saying it is a dumbing down of the protocol rather than adding features that overcome the flaws, such as those proposed by RFC 7807. Fielding had the good grace to not embed the word “simple” in his acronym. But if we want REST to have longevity, it would be better to not just claim it’s simple like other protocols (SNMP, SMTP, SOAP). Better to solve fundamental problems with fundamentally simple tweaks.

Nothing in the above prevents RFC 7807 from being implemented.

A possible supplement to the protocol would be to set a header that names the layer returning the status. So, an X04 is returned, and the client looks at the header X-STATUS-ENTITY, set to ‘application’.
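A minimal sketch of how such a header might be written and read follows. The header name X-STATUS-ENTITY comes from the paragraph above; the dictionary-shaped response, the helper names, and the fallback value ‘unknown’ are assumptions made for illustration.

```python
def build_response(status: int, entity: str, body: str = "") -> dict:
    """Build a toy response that records which layer decided the status."""
    return {
        "status": status,
        "headers": {"X-STATUS-ENTITY": entity},
        "body": body,
    }


def deciding_entity(response: dict) -> str:
    """Read the layer from the header; header names are case-insensitive."""
    headers = {name.lower(): value for name, value in response["headers"].items()}
    # Fall back to "unknown" when an intermediary did not set the header.
    return headers.get("x-status-entity", "unknown")


response = build_response(704, "application", "beast not found")
print(response["status"], deciding_entity(response))
```

A client that does not understand the header can ignore it, so the supplement does not break existing consumers.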

The exceptions are 200 and 201, which are always returned from the application layer anyway when appropriate. 202, 203, 204 and 205 have some additional significance that is not always relevant to REST services.

It’s pretty clear that RFC 7231 is mostly concerned with HTTP protocol exceptions and that 2XX codes are not really part of this protocol other than to say “success”. The nature of that success is an application matter. Likewise, codes above 505 are really none of the business of HTTP other than asserting that the HTTP intermediaries have given the application a fair crack at the request.

REST software frameworks should be considered a layer whether it’s called a Resource or Resource Controller. I think the problem here is that these frameworks think it’s their job to be prescriptive about REST compliance even on an application level. What they should do is provide the developer with the ability to define the 7xx codes rather than blindly return a code that should be the developer’s judgement when `/beast/666/` does not exist. If the framework’s router code decides the url doesn’t exist, it should return 604. Only the developer (user of the REST framework) should decide after that what to return.
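The division of labour between router and application could look roughly like the sketch below. It is deliberately framework-free: `route`, `get_beast`, the `ROUTES` table and the in-memory `BEASTS` store are hypothetical names, and the fixed /api/v1/<service>/<id>/ URL shape is an assumption taken from the earlier example.

```python
from typing import Callable, Dict, Tuple

Response = Tuple[int, dict]

# Hypothetical stand-in for the application's database.
BEASTS = {666: {"id": 666, "name": "The Beast"}}


def get_beast(beast_id: int) -> Response:
    """Application code: only the developer decides what 'missing' means."""
    beast = BEASTS.get(beast_id)
    if beast is None:
        return 704, {"detail": "beast not found in db"}
    return 200, beast


# Router table: service name -> handler written by the developer.
ROUTES: Dict[str, Callable[[int], Response]] = {"beast": get_beast}


def route(path: str) -> Response:
    """Router code: returns 604 when it cannot match the URL itself."""
    parts = [p for p in path.split("/") if p]
    # Expected shape (an assumption): /api/v1/<service>/<id>/
    if len(parts) != 4 or parts[:2] != ["api", "v1"] or not parts[3].isdecimal():
        return 604, {"detail": "no route matches this url"}
    handler = ROUTES.get(parts[2])
    if handler is None:
        return 604, {"detail": "unknown service"}
    return handler(int(parts[3]))


print(route("/api/v1/beast/666/")[0])   # handled by the application
print(route("/api/v1/beast/667/")[0])   # application decides: not in db
print(route("/api/v1/beasts/666/")[0])  # router decides: no such route
```

The router never guesses at the application's answer; once a handler is found, the status is entirely the handler's call.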

Django REST Framework or Spring Framework are examples of libraries that I consider an application router.

Resource Lifecycle Assumptions

One egregious aspect of REST is resource lifecycle semantics. For a client trying to update a resource:

PUT /api/beast/666/
{
    "id": 666,
    "name": "The Beast",
    "mode": "Slouching",
    "destination": "Bethlehem"
}

The client needs to know whether this resource exists, or else the request will fail with some unreasonable behaviour, like a 404 to the effect that the resource does not exist. If we are doing an update, we need to know that the resource exists on the remote system and use PUT, or know that it doesn’t exist and use POST without the 666 in the URL. This is fundamentally wrong. Distributed systems should not make assumptions about whether resources on remote services exist. And how is this consistent with the statelessness assumption?

Far better: POST should create or update. If it updates, return 200, if it creates, return 201.
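As a rough sketch of these update-or-create semantics, assuming an in-memory dictionary as the store and a payload that carries its own id: the function name `post_beast` is hypothetical, and merging new fields into an existing record (rather than replacing it wholesale) is one possible design choice, not something the proposal dictates.

```python
from typing import Dict, Tuple


def post_beast(store: Dict[int, dict], payload: dict) -> Tuple[int, dict]:
    """Create the record if it is missing, otherwise update it.

    Returns 201 on create and 200 on update, so the client never has to
    know in advance whether the resource exists.
    """
    beast_id = payload["id"]
    created = beast_id not in store
    # Assumption: updates merge the new fields over the existing record.
    store[beast_id] = {**store.get(beast_id, {}), **payload}
    return (201 if created else 200), store[beast_id]


store: Dict[int, dict] = {}
status, _ = post_beast(store, {"id": 666, "name": "The Beast", "mode": "Slouching"})
print(status)  # first call creates the record
status, beast = post_beast(store, {"id": 666, "mode": "Arrived"})
print(status, beast["mode"])  # second call updates it
```

The same request works whether or not the client has seen the resource before, which is the point of the proposal.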

Summary

Some REST proponents say that not using HATEOAS features of REST deprives you of exactly the motivation for using REST. But how many developers make use of that? Maybe not an insubstantial proportion, but it’s far from universal. It says a lot about the success of REST that it is adopted so widely even without a feature that is deemed essential.

It’s funny that JSON-RPC, which adheres to a specification document somewhere with words like “MUST” in all caps, is less capable of achieving widespread use than a protocol, REST, that relies on mere assumptions of how everybody else works. My feeling is that REST is just a more successful meme.

Equally, it’s astonishing to see the august reverence accorded the standard HTTP response status codes in comparison to the confusion wrought by trying to shoehorn them into real-life situations. It seems to me that for the vast majority of cases there would be little negative fallout from using integer ranges above the existing 5xx HTTP response codes to express which entity in the request chain is responsible for deciding the outcome.

My proposal for a more Practical REST:

  • Use 6XX and 7XX HTTP response codes for application router and application implementation code respectively (with the exception of 2XX in case of success)
  • Let POST have update-or-create semantics
  • Don’t consider this heretical:
POST /delete/beast/666/

All codes up to the application code (maybe up to the REST framework code) would be as before. The entirely arbitrary world of application responses would get its own space for defining response status. Clients would know that a 4xx or 5xx comes from the web server, a 6xx from framework router code that doesn’t know about the application, and a 7xx from the application itself.

References

https://tools.ietf.org/html/rfc7230

https://tools.ietf.org/html/rfc7231

https://www.ics.uci.edu/~fielding/pubs/dissertation/top.htm

https://www.jsonrpc.org/specification

https://medium.freecodecamp.org/rest-is-the-new-soap-97ff6c09896d

https://blog.apisyouwonthate.com/understanding-rpc-rest-and-graphql-2f959aadebe7

https://blog.restcase.com/rest-api-error-handling-problem-details-response/

https://tools.ietf.org/html/rfc7807

Paul Wolf

Software technical and business exec; Python, Django, Microservices