Advanced REST API Design

We review some advanced use cases of the Representational State Transfer (REST) architectural style.

By Jorge Albaladejo, Senior Software Engineer at CauseLabs

In my previous post, I covered the basics of designing a REST API: defining endpoints, using HTTP verbs and performing common read-write operations on data. In this one, I’m going to introduce some advanced use cases and edge cases you may want to consider in order to give your REST API a sound and long-lasting design.

More precisely, I will cover the following topics:

  1. Working with resource lists: filtering, ordering, and pagination
  2. Error handling and response codes
  3. API versioning
  4. Data formats

Let’s begin!

Working with lists: filtering, ordering, and pagination

In the previous chapter, we saw how to query a list of resources. If what we want is a list of contacts, for instance, we would do:

GET /contacts

And that’s it: the system should return a list of all the contacts in the database. However, this simple case is not very practical in production environments, because the data set can be very large, which makes it expensive to send over HTTP and inefficient to query. To keep API calls as nimble and lightweight as possible, we may want to be more specific about the data we ask for. We can filter the list by adding a query condition and order it so that the most relevant results appear first. Then, following a widespread industry practice, we can split that list into several chunks and serve only the one immediately required, like the first 20 items in the collection. This is called pagination.

There are several ways to pass these modifiers on to the API endpoint. We can use specific URIs, custom headers or query parameters in the URI, for instance. Let’s see the pros and cons of each one.

Specific URIs

Defining a dedicated endpoint for each operation we may want to perform on the list could look like this:

/contacts/filter/:param/:value
/contacts/orderby/:param/:asc
/contacts/page/:pagenumber

This looks pretty straightforward. But what if we want to combine the operations? We could end up with an endpoint like:

/contacts/filter/:param/:value/orderby/:param/:asc/page/:pagenumber

That begins to get complicated. What if we want to specify several fields to filter or to order by? We would have to make the endpoint even longer and harder to debug, maintain and remember. What if we don’t want to set all the available parameters or operations in a given request? We would have to omit them, and the server would have to guess which ones were provided and which ones weren’t.

This alternative won’t scale and will give us more headaches than joy, so let’s skip it and move to the next.

Custom Headers

Another way to convey operations and pass parameters to an API is to add custom headers to the HTTP request. For instance:

GET /contacts
Headers:
Query: param1=value,param2=value
Sort: param1:desc,param2:asc
Page: pagenumber,pagesize

Now, this is much more manageable and rather elegant. The URI is simple and easy to remember and we tell the backend what operations we want to do with the data by enclosing the instructions and parameters in the headers, keeping the URI unchanged. The server will check for the existence of each one of the headers and will apply or omit the operation accordingly. Extending the system with further fields is, therefore, very easy to do.

There are two main drawbacks with this solution, though. The first one is that not all clients and servers deal with custom headers the same way, and some of them may ignore non-standard headers altogether. Thus, this method may break interoperability with clients. You should use custom headers only for informational purposes, so that clients and servers do not fail when they don’t find them in the request.

The second argument against this method is that URIs are Uniform Resource Identifiers: each one is meant to identify a specific resource. While a paginated list of contacts is still about contacts, it has an associated modifier that should be made obvious and be part of the URI. The same goes for filtering conditions and sorting. A list of contacts from Ohio ordered alphabetically by last name is still a list of contacts, but it is not quite the same resource as a plain list.

To overcome these two limitations, we need to take a look at the third approach.

Query Parameters

Given the drawbacks of the previous two approaches, query parameters in the URI are the best way to represent modifications to the base resource list. Here, you can use whatever naming convention makes the most sense in your domain, but try to follow industry conventions to make life easier for your API consumers. For instance:

GET /contacts?sortBy=param1:desc,param2&foo=bar&page=3&pagesize=20

Let’s take a look at this endpoint in detail.

The first thing we see is that the root of the URI has not changed: it is still /contacts. This makes the API design easy to remember and to consume. Then, we can see the different parameters added after the question mark (?). Query parameters are flexible enough that we can add or omit them as we want, and it’s easy for the backend to check whether they are defined. The URI specifically identifies a list of contacts sorted by param1 descending and then by param2 (ascending by default, as a common-sense convention), and filtered by a field or attribute named foo whose value is bar. Finally, we ask for page number 3 with a page size of 20 items per page, that is, the items ranging from #41 to #60.
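To make the reading of that URI concrete, here is a minimal sketch in Python of how a backend might split the query string into sorting, filtering and pagination instructions. The function name parse_list_query, the RESERVED set and the default page size are assumptions made for illustration; the parameter names are the ones from the example above.

```python
from urllib.parse import parse_qs

# Names reserved for sorting and pagination; anything else is treated as a filter.
# This split is an assumption of the sketch, not a standard.
RESERVED = {"sortBy", "page", "pagesize"}


def parse_list_query(query_string, default_page_size=20):
    """Split a query string into sort, filter and pagination instructions."""
    # Keep the last value when a parameter is repeated.
    params = {key: values[-1] for key, values in parse_qs(query_string).items()}

    # "param1:desc,param2" -> [("param1", "desc"), ("param2", "asc")]
    sort = []
    for term in filter(None, params.get("sortBy", "").split(",")):
        field, _, direction = term.partition(":")
        sort.append((field, direction or "asc"))

    filters = {key: value for key, value in params.items() if key not in RESERVED}

    page = int(params.get("page", 1))
    page_size = int(params.get("pagesize", default_page_size))
    first_item = (page - 1) * page_size + 1  # 1-based position of the first item
    last_item = page * page_size

    return {
        "sort": sort,
        "filters": filters,
        "page": page,
        "page_size": page_size,
        "item_range": (first_item, last_item),
    }


result = parse_list_query("sortBy=param1:desc,param2&foo=bar&page=3&pagesize=20")
print(result["sort"])        # [('param1', 'desc'), ('param2', 'asc')]
print(result["filters"])     # {'foo': 'bar'}
print(result["item_range"])  # (41, 60)
```

A real implementation would also validate the values (a non-numeric page, an unknown sort field) and answer with 400 Bad Request when they are wrong.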

This method is easy to design, implement and remember. The order of the query parameters does not affect the result, and we can extend it with as many filter parameters as we need. We could argue that by changing the order of the query parameters we produce several different URIs for the same resource, which goes against the idea of a URI as a unique identifier. However, enforcing a fixed order would make the API inflexible, so we will have to live with that trade-off.
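If the duplicate URIs become a practical problem, for example when caching responses by URI, one possible mitigation is to normalize the query string on the server before using it as a key. The sketch below assumes that sorting parameters by name is an acceptable canonical form; canonical_uri is a hypothetical helper, and the value of sortBy is left untouched because the order of its fields is meaningful.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def canonical_uri(uri):
    """Return the URI with its query parameters sorted by name, then by value."""
    parts = urlsplit(uri)
    pairs = sorted(parse_qsl(parts.query, keep_blank_values=True))
    # Keep ":" and "," readable instead of percent-encoding them.
    return urlunsplit(parts._replace(query=urlencode(pairs, safe=":,")))


a = canonical_uri("/contacts?page=3&foo=bar&pagesize=20&sortBy=param1:desc,param2")
b = canonical_uri("/contacts?sortBy=param1:desc,param2&foo=bar&page=3&pagesize=20")
print(a == b)  # True
print(a)       # /contacts?foo=bar&page=3&pagesize=20&sortBy=param1:desc,param2
```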

Error Handling and Response Codes

Most of the time, requesting data from an API will return the data we asked for in the way we specified. But errors do happen, and sometimes the back-end cannot process the request. It is important that the API handles these exceptions and returns an appropriate error message so that clients know what to do, e.g. whether to ask again, fix the request, or wait for some time.

There is a whole list of status codes your API can return with the response. They are grouped into the following families:

  • 1xx: informational
  • 2xx: success
  • 3xx: redirection
  • 4xx: client error
  • 5xx: server error

The ones you will come across more often and that will be more helpful are:

  • 200 OK: the server got the request and responded accordingly.
  • 201 Created: a new resource has been created; you should return this code when handling a POST action on a collection, for instance.
  • 202 Accepted: the request has been accepted but not processed; it is queued for execution and might not be fulfilled. You can use this code when a request generates an action that will happen asynchronously, like a data update or an email.
  • 204 No Content: the request was processed but the server does not have any content to return. You can use this code when handling DELETE requests.
  • 301 Moved Permanently: the URL changed permanently and it can be found elsewhere.
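  • 302 Found: the URL changed temporarily and the resource can be found elsewhere for now.
  • 303 See Other: the response to the request can be found at another URL, which the client should retrieve with GET; this is common after handling a POST.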
  • 304 Not Modified: use this code to implement caching systems. If the request headers If-Modified-Since or If-None-Match are used and the server does not have a newer version to provide, then return 304 so that the client can use the copy it has stored locally.
  • 400 Bad Request: the request is malformed. Use this code to tell the client to re-send the information correcting the format issue.
  • 401 Unauthorized: the user is unauthenticated and cannot be granted access to a restricted resource. Logging into the system will solve this error.
  • 403 Forbidden: the user is authenticated but is not allowed to access this resource. Logging into the system will not change this error.
  • 404 Not Found: as it sounds, probably the most well-known error code on the Internet.
  • 405 Method Not Allowed: use this code when the client requests an HTTP verb that the given resource cannot handle, like POST on individual resources or PUT on collections.
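  • 406 Not Acceptable: use this code when your API cannot produce any of the content types the client requested in the Accept header.
  • 415 Unsupported Media Type: use this code when the content type of the data sent by the client is not supported by your API, like JSON when you would expect XML.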
  • 500 Internal Server Error: an unexpected error prevented the server from returning a response.
  • 503 Service Unavailable: the server is temporarily unavailable because it’s overloaded or due to maintenance reasons.
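As an illustration of the 304 Not Modified case from the list, here is a minimal sketch of a conditional GET based on If-None-Match. The get_contact function, the dictionary of request headers and the way the ETag is derived (a hash of the JSON body) are all assumptions of this example; a complete implementation would also handle If-Modified-Since, lists of ETags and the "*" value.

```python
import hashlib
import json
from http import HTTPStatus


def get_contact(contact, request_headers):
    """Return (status, headers, body) for a GET on a single contact."""
    body = json.dumps(contact, sort_keys=True)
    # An ETag derived from the representation itself (assumption of this sketch).
    etag = '"' + hashlib.sha256(body.encode("utf-8")).hexdigest() + '"'

    if request_headers.get("If-None-Match") == etag:
        # The copy stored by the client is still current: send no body.
        return HTTPStatus.NOT_MODIFIED, {"ETag": etag}, ""

    return HTTPStatus.OK, {"ETag": etag, "Content-Type": "application/json"}, body


contact = {"id": 7, "name": "Ada"}
first_status, first_headers, first_body = get_contact(contact, {})
print(int(first_status))  # 200

second_status, _, second_body = get_contact(
    contact, {"If-None-Match": first_headers["ETag"]}
)
print(int(second_status), repr(second_body))  # 304 ''
```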

We should make sure that the right HTTP codes are returned along with the response, whether it was successful or produced an error. This will help your API consumers understand what your system is saying and build their applications on top of it. It will also ease debugging and testing your back-end: you might find edge cases faster by looking at the error codes returned.
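A rough sketch of how a back-end could pick some of these codes consistently is shown below. The ALLOWED and SUCCESS tables describe a hypothetical policy (which verbs a collection or an individual resource accepts, and which success code each verb returns), not a rule imposed by HTTP; status_for is an illustrative name.

```python
from http import HTTPStatus

# Hypothetical policy: which verbs each kind of endpoint accepts.
ALLOWED = {
    "collection": {"GET", "POST"},
    "item": {"GET", "PUT", "DELETE"},
}

# Success code returned for each verb once the request has been processed.
SUCCESS = {
    "GET": HTTPStatus.OK,
    "POST": HTTPStatus.CREATED,
    "PUT": HTTPStatus.OK,
    "DELETE": HTTPStatus.NO_CONTENT,
}


def status_for(method, kind):
    """Return (status, headers) for a well-formed request on a 'collection' or 'item'."""
    allowed = ALLOWED[kind]
    if method not in allowed:
        # A 405 response tells the client which verbs are accepted instead.
        return HTTPStatus.METHOD_NOT_ALLOWED, {"Allow": ", ".join(sorted(allowed))}
    return SUCCESS[method], {}


created, _ = status_for("POST", "collection")
print(int(created))  # 201

rejected, rejected_headers = status_for("POST", "item")
print(int(rejected), rejected_headers)  # 405 {'Allow': 'DELETE, GET, PUT'}
```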

Another important best practice is to return an explicit and meaningful error message, with a link to a further explanation of the problem, so that the developers consuming your API can learn how to properly use it. Also, consider adding a Date header with the time at which the error was produced, and an error ID if it was tracked for further reference. For instance:

HTTP/1.1 400 Bad Request
Date: Wed, 04 Jan 2017 20:41:00 GMT
Link: <http://www.example.org/errors/badrequest.html>; rel="help"
Content-Type: application/json

{
"message": "Wrong input parameters. Missing value for 'sortBy'",
"errorId": "348-587-956"
}

You should try to make the error message as explicit as possible so that it’s easier for your API consumers to build their applications on top of your system.
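As a sketch of how such a response could be assembled, the following Python function builds the status line, the headers and the JSON body. The name error_response, the tuple it returns and the use of a random UUID as the default error ID are assumptions for illustration; the field names message and errorId come from the example above.

```python
import json
import uuid
from email.utils import formatdate
from http import HTTPStatus


def error_response(status, message, help_url, error_id=None):
    """Build (status_line, headers, body) for an error response."""
    # Fall back to a random identifier when the error was not tracked elsewhere.
    error_id = error_id or str(uuid.uuid4())
    headers = {
        "Date": formatdate(usegmt=True),  # current time in HTTP date format
        "Link": f'<{help_url}>; rel="help"',
        "Content-Type": "application/json",
    }
    body = json.dumps({"message": message, "errorId": error_id})
    status_line = f"HTTP/1.1 {status.value} {status.phrase}"
    return status_line, headers, body


status_line, headers, body = error_response(
    HTTPStatus.BAD_REQUEST,
    "Wrong input parameters. Missing value for 'sortBy'",
    "http://www.example.org/errors/badrequest.html",
    error_id="348-587-956",
)
print(status_line)      # HTTP/1.1 400 Bad Request
print(headers["Link"])  # <http://www.example.org/errors/badrequest.html>; rel="help"
print(body)
```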

API Versioning

An API is our system’s gateway to the world. Many external applications, systems, and services may depend on it. As a result, once an API is made public, it should not change: its endpoints and response data structures should stay consistent over time.

However, software systems should be designed with maintainability in mind, and they tend to evolve and change very rapidly in the modern cloud web world. When your application needs to change in a way that breaks compatibility, or when some clients require different behavior from other clients, it is the right time to version your API.

The most commonly used pattern is to mark your URIs with easily detectable terms like v1 or v2 in sub-domain names, path segments or query parameters. These are some valid alternatives:

http://v1.yourapidomain.com/contacts
http://yourapidomain.com/v2/contacts
http://yourapidomain.com/contacts?version=v1

Use the pattern that works best with your server deployment and software frameworks. If you want the different versions to be handled by the same application, using path segments or query parameters may be convenient.
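If a single application serves several versions, it first has to work out which version a request is asking for. Here is a minimal sketch that looks at the three places shown above, in that order. The api_version name, the v<number> pattern and the default of v1 when no version is given are assumptions of this example.

```python
import re
from urllib.parse import parse_qs, urlsplit

# Versions are assumed to look like "v1", "v2", ...
VERSION = re.compile(r"v\d+")


def api_version(url, default="v1"):
    """Find the API version in the sub-domain, the first path segment or ?version=."""
    parts = urlsplit(url)

    subdomain = (parts.hostname or "").split(".")[0]
    if VERSION.fullmatch(subdomain):
        return subdomain

    first_segment = parts.path.strip("/").split("/")[0]
    if VERSION.fullmatch(first_segment):
        return first_segment

    requested = parse_qs(parts.query).get("version", [default])[-1]
    return requested if VERSION.fullmatch(requested) else default


print(api_version("http://v1.yourapidomain.com/contacts"))          # v1
print(api_version("http://yourapidomain.com/v2/contacts"))          # v2
print(api_version("http://yourapidomain.com/contacts?version=v3"))  # v3
```

In practice you would pick only one of the three conventions; checking all of them here is just a way to compare the alternatives side by side.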

Please consider that versioning may also introduce other problems, such as:

  • Data compatibility may break between versions, or when upgrading to the latest version.
  • Different business rules and application flows may require the clients to update.
  • Maintaining multiple versions of the same application will add complexity to your system.
  • Clients will need to be updated to use the new API endpoints.

Once an API is versioned, that version should be immutable, and all future changes and adjustments should be released in a newer version. This is vital for your API clients to keep working correctly.

Response Content Types

JSON is the de facto standard content type in the industry nowadays, although XML is also used by some major APIs. Depending on your use case, you may want to offer the ability to retrieve the same data in different content types, including HTML or any other representation format that may appear in the future.

HTTP comes with a few headers specifically designed to deal with content types. On the client, you can request a specific format with the Accept header, like this:

Accept: application/json

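If your client specifies several content types, the server will respond with one that it supports. Many servers pick the first one available in the list, although strictly speaking preference is expressed with optional quality values (for example application/json;q=0.8) rather than by position: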

Accept: application/atom+xml, application/xml, application/json

If you are sending data to the server and you want to tell it which content type to expect, there is the Content-Type header ready for you to use:

Content-Type: application/json

Please note that this header, unlike Accept, will only take one value. The server response will include the same Content-Type header specifying the format returned, which your client can check to make sure it can understand the data from the API.

This is a very neat and elegant solution since the data format is not represented in the URI and therefore it’s not part of the endpoint as such. The resource is the same, we are only changing its representation.

Your API should always check the Accept header received with the request. If it cannot satisfy any of the requested content types, it should return the error code 406 Not Acceptable and let the client know the list of supported content types; 415 Unsupported Media Type is the counterpart for a request body whose Content-Type the API cannot process. If the Accept header is not present, your API should return a default content type. Choose the one that suits your project best; if you want a data format that’s easy to work with and widely supported, JSON is probably your best bet.
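The check described above can be sketched as follows. The SUPPORTED list, the function names and the choice of JSON as the default are assumptions for illustration. Note that this sketch answers an Accept header it cannot satisfy with 406 Not Acceptable, the status HTTP defines for that situation, and that it ignores partial wildcards such as application/*.

```python
from http import HTTPStatus

# Content types this hypothetical API can produce; the first one is the default.
SUPPORTED = ["application/json", "application/xml"]


def parse_accept(header):
    """Return the media types of an Accept header, most preferred first."""
    entries = []
    for position, item in enumerate(header.split(",")):
        media_type, *params = [part.strip() for part in item.split(";")]
        if not media_type:
            continue
        quality = 1.0  # default when no q parameter is given
        for param in params:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        if quality > 0:
            # Order by quality first, then by position in the header.
            entries.append((-quality, position, media_type.lower()))
    return [media_type for _, _, media_type in sorted(entries)]


def negotiate(accept_header, supported=SUPPORTED):
    """Return (status, content_type); content_type is None when nothing matches."""
    if not accept_header:
        return HTTPStatus.OK, supported[0]
    for wanted in parse_accept(accept_header):
        if wanted == "*/*":
            return HTTPStatus.OK, supported[0]
        if wanted in supported:
            return HTTPStatus.OK, wanted
    return HTTPStatus.NOT_ACCEPTABLE, None


status, content_type = negotiate("application/atom+xml, application/xml, application/json")
print(int(status), content_type)  # 200 application/xml

status_none, type_none = negotiate("text/html")
print(int(status_none), type_none)  # 406 None
```

When the result is 406, the body of the response is a good place to list the supported content types.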

We have covered a few topics that will help you boost your API to professional levels:

  1. Use query parameters to implement resource modifiers like list sorting, filtering, and pagination.
  2. Use HTTP error codes to tell the clients what went right or wrong and include a human-readable meaningful message if an error happened.
  3. Version your API as soon as you need to introduce changes that would break compatibility with existing clients.
  4. Use the Accept and Content-Type HTTP headers to specify the data format sent to and expected from the server; default to JSON for a quick starting point.

If you wish to learn more, please leave the CauseLabs development team a comment or question. We’d love to chat!