What REST is not!
Oftentimes in interviews, in a hurry to throw in some buzzwords and look smart quickly, somebody will say REST.
How do you know your APIs are RESTful?
Here are the top 3 answers that I’ve heard in interviews, and they are all wrong:
- I use JSON.
- We use Swagger.
- I follow the HTTP protocol.
RESTful APIs follow 5 principles. If your design adheres to these 5 principles then you have achieved RESTfulness! (if that’s a word)
The 5 principles of REST
- Contract first approach / Uniform Resource Identifiers
- Statelessness
- Client-Server model
- Caching
- Layered architecture
REST stands for Representational State Transfer. Representational State is simply the current state of an object. The objects, in this case, are objects stored on a server inside a database. Depending on your app, you’ll need to store certain things in your database. It can be user information, it can be information about flights or a list of things to do or a million other things.
For an app that tracks flights, the flight information would be one of the objects to store. When the flight arrives, the state of the flight would be ARRIVED. When the flight is boarding, the state of the flight changes to BOARDING, and so on.
We are transferring this representational state from the server to the client, a.k.a. the app, the front end.
There is a little problem. Have you ever looked at the way data is stored on a server inside a database? It looks like those fast green scrolling lines on a computer screen spewing information in a dark room in TV shows. Not really, but in any case utterly unreadable!
The app, the front end, is the presentation and interaction layer. It is beautiful, with buttons and icons and menus. There are pictures and colorful labels with formatted text. For the data to be transferred from the database on the server to the app, there needs to be some piece of code that converts the data into something that makes sense to the app. This piece of code is called the API.
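To make the conversion step concrete, here is a minimal Python sketch of an API turning a raw database row into a representation the app can read. The column names and the to_representation function are made up for illustration; a real API would sit behind a web framework and a real database driver.

```python
import json

# Hypothetical column layout for a "flights" table; not taken from a real schema.
FLIGHT_COLUMNS = ("carrier", "flight_number", "status")


def to_representation(row):
    """Turn a raw database row (a tuple) into a JSON document for the app."""
    record = dict(zip(FLIGHT_COLUMNS, row))
    return json.dumps(record, sort_keys=True)


# A row roughly as a database driver might hand it back.
raw_row = ("skywest", 3157, "BOARDING")
print(to_representation(raw_row))
```

When the flight lands, the row changes to ARRIVED and the same function produces the new representation. The app never sees the row itself.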
1. Contract first approach / Uniform Resource Identifiers (URI/URL)
The specification of the API is a contract between the app and the server. The same way that you expect to go to the same mailbox every day to collect mail, and not have to worry about changing mailboxes, the app needs to know that it can hit the same URI to get a particular piece of data, every time! It also needs to know that it’ll get the data in the format that it expects, and that format will not change. This is the Contract and the URI.
For example, /api/v1/flights/skywest/3157/status/ should give me all the information about the current status of Skywest flight number 3157 in JSON. I should be able to see what information is returned and how it is structured in the API documentation, which should be located on the company’s developer portal at developer.skywest.com
Most developer portals are called developer.company.com. Did you know Walmart has an API? Look up developer.walmart.com
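As a rough illustration of the contract idea, the Python sketch below maps the flight status URI from the example to a response whose keys never change. The regular expression, the FLIGHTS dictionary, the get_flight_status function and the field names are all assumptions made for this example, not part of any real Skywest API.

```python
import re

# Pattern for the versioned URI used in the example above.
STATUS_URI = re.compile(
    r"^/api/v1/flights/(?P<carrier>[a-z]+)/(?P<number>\d+)/status/$"
)

# Hypothetical in-memory stand-in for the server's database.
FLIGHTS = {("skywest", 3157): "BOARDING"}


def get_flight_status(path):
    """Return (http_status, body) for a flight status URI."""
    match = STATUS_URI.match(path)
    if match is None:
        return 404, {"error": "unknown resource"}
    key = (match["carrier"], int(match["number"]))
    if key not in FLIGHTS:
        return 404, {"error": "unknown flight"}
    # The contract: a successful response always has exactly these three keys.
    return 200, {
        "carrier": key[0],
        "flight_number": key[1],
        "status": FLIGHTS[key],
    }


print(get_flight_status("/api/v1/flights/skywest/3157/status/"))
```

Keeping the version in the path (v1) is one common way to let the format evolve later without breaking apps that rely on the current contract.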
2. Statelessness
Every request to the API should carry all the information needed to process it. The server does not expect or assume that any state from previous calls has been preserved. This state is different from the representational state. The database on the server holds the state of the resources/objects, but the server should not hold the state of the client, the app.
The server should have no knowledge of prior requests. The client needs to provide all the information necessary for the server to provide a response. If this was in real life, the following questions would be considered stateful.
Where does John Smith live? What is his age?
The person asking the question expects the name to be remembered (which is a state of a person…a pretty permanent state but a state nonetheless). So, the second question assumes that the conversation is still about John Smith.
In a REST API, the same conversation between machines would go something like:
Where does John Smith live?
What is John Smith’s age?
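The same idea in code: a minimal Python sketch of a stateless handler, where every request has to name the person it is asking about. The PEOPLE data, the handle function and the request shape are invented for illustration.

```python
# Hypothetical data store; the name, address and age are made up.
PEOPLE = {"John Smith": {"address": "12 Elm Street", "age": 42}}


def handle(request):
    """Answer a request using only what the request itself contains."""
    person = PEOPLE.get(request.get("name"))
    if person is None:
        # No memory of earlier calls, so a missing name cannot be guessed.
        return {"status": 400, "error": "request must name a known person"}
    field = request.get("field")
    if field not in person:
        return {"status": 400, "error": "unknown field"}
    return {"status": 200, field: person[field]}


print(handle({"name": "John Smith", "field": "address"}))
print(handle({"name": "John Smith", "field": "age"}))
print(handle({"field": "age"}))  # the stateful version of the question fails
```

Because the handler keeps nothing between calls, any server in a pool can answer any request.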
This brings us to our next principle.
3. Client-Server model
The app doesn’t need to bother about how the server stores data, the particular database used, the tech stack, etc. The point of intersection between the server and the app is the schema of the data that the API exposes. This schema describes the data that is returned and the layout in which it is returned. It should be hosted on the developer portal as part of the API documentation.
A lot of the time, a conversation between a backend engineer and a frontend engineer takes a wrong turn and misses the more useful discussion about the format. The two most important points of intersection between the client (front-end) and the server (back-end) are:
- The data format that the client needs
- The granularity of the information.
a. The data format that the client needs
The server needs to support this format on the backend, otherwise the client/app may not be able to use the API.
The app is the consumer of the data. The app is the client. The API is a product. For the API to be successful in the long run, it needs to make sense for the client.
Discovering what the client/app needs reveals the data format that the client uses. For example, if the client needs pictures in JPEG format, then the backend needs to make sure that it can deliver pictures in that format. Maybe the server stores all its pictures as RAW. In that case, the backend will need to develop modules that can convert RAW to JPEG on the fly, or store a JPEG along with the RAW for every picture. Choosing between the two involves weighing the cost of data storage against the cost of compute, along with performance metrics, availability, fault-tolerance, and security models for both.
Since statelessness is a hallmark of a REST API, the server is not expected to remember this information when a call comes in. In every call, the client has to specify the format it needs. This can be done by specifying a query parameter or a resource format type.
An example of a query parameter would be /api/user/id/profilepicture?format=jpg
The same call specifying a resource format type would look like this:
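A minimal Python sketch of how a server could read the format from the request itself is given below. The query-parameter branch follows the URL above. Treating the HTTP Accept header as the "resource format type" is an assumption made for illustration, as are the requested_format function and the MEDIA_TYPES table.

```python
from urllib.parse import parse_qs, urlsplit

# Formats this hypothetical server can produce, keyed by media type.
MEDIA_TYPES = {"image/jpeg": "jpg", "image/png": "png"}


def requested_format(url, headers):
    """Work out the picture format from the request alone (no server memory)."""
    query = parse_qs(urlsplit(url).query)
    if "format" in query:
        return query["format"][0]
    # Simplified: treats the Accept value as one media type, ignoring q-values.
    accept = headers.get("Accept", "").strip().lower()
    return MEDIA_TYPES.get(accept)


print(requested_format("/api/user/id/profilepicture?format=jpg", {}))
print(requested_format("/api/user/id/profilepicture", {"Accept": "image/jpeg"}))
```

Either way, the format travels with every call, so the server has nothing to remember between requests.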
b. The granularity of the information
If the app wants to look up the flight arrival time, a response with detailed information about the flight, including a list of mechanical parts, is probably overkill. Being specific eliminates wasted bandwidth. You don’t want to send out the picture album when the request is for one picture. But there are things to consider before making every property of a resource available as an endpoint.
It is a balancing act to decide between how granular an API endpoint needs to be and the increasing number of API endpoints that comes from granularity.
Often, a large share of the data that mobile users pay for is TCP/IP overhead, or updates with no meaningful change to the data. Together, a TCP header and an IPv4 header take up at least 40 bytes (20 bytes each, without options). If the packet times out or the sender doesn’t receive an ACK, the same packet is sent again. When the cell towers are busy and there’s a premature timeout, the packets are sent again. Sending only a few bytes of data, a.k.a. small payloads, in these packets incurs a higher percentage of overhead. This is why having an API endpoint for every property of a resource is a bad idea.
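The arithmetic behind that claim is easy to check. The Python sketch below assumes a single packet with 40 bytes of TCP/IP headers and ignores retransmissions, TLS and HTTP headers, all of which would only make small payloads look worse. The function name overhead_percent is made up for this example.

```python
# 40 bytes of headers per packet, the figure used in the text above.
HEADER_BYTES = 40


def overhead_percent(payload_bytes, header_bytes=HEADER_BYTES):
    """Share of one packet that is taken up by headers, as a percentage."""
    return 100 * header_bytes / (header_bytes + payload_bytes)


for payload in (10, 100, 1000):
    print(f"{payload:>5} byte payload: {overhead_percent(payload):.1f}% overhead")
# Prints 80.0%, 28.6% and 3.8% respectively.
```

A 10-byte answer to "what is the arrival time?" spends four fifths of the packet on headers, which is the cost of making endpoints too granular.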
Moral of the story: users on mobile are on a data plan and on slower connections. Minimize the number of round trips and make every byte you send count. This brings us to the next topic: caching!
4. Caching
Caching is the temporary storage of information outside of the server. Between the client and the server, there are many points of presence where a cache can live. The cache can be stored near the server, like an API gateway cache on AWS. The cache can be stored somewhere in the middle between the client and the server, using a third-party solution or a hierarchy of proxy servers that only hold cached data and are called caching proxy servers. This is usually a shared cache, so it can be used by many clients. Lastly, the cache can be stored on the client/in the app/on the device. It is not shared and is only available to that client. This is called a private cache.
Caching is stateful. The requirement for a REST API to be stateless increases chatter and caching can be used to compensate for some of that.
Since the server doesn’t remember the context of the calls, it cannot connect a chain of calls to a specific user or object. e.g. If the client wants to know the address of a person and then makes a call to find out the age of the person, the client will have to send the name of the person to the server for both calls. All of the person-specific information can be stored in the cache. Next time these calls don’t necessarily need to hit the server. Instead, the information can be readily fetched from the cache if it’s not stale.
This information remains available even if the server goes down, and since a roundtrip to the server is not necessary, the information reaches the app faster. These are the other uses of caching: speed and availability. But this has a downside. The data can be stale. To solve that, the cache needs to be refreshed, and refreshing the cache needs a roundtrip to the server.
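A private cache with a freshness limit can be sketched in a few lines of Python. The TTLCache class and its fetch callback are illustrative names, not a real library API. The clock is injectable only so the behavior is easy to test.

```python
import time


class TTLCache:
    """Tiny private cache: entries older than ttl_seconds count as stale."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries = {}  # key -> (time stored, value)

    def get(self, key, fetch):
        """Return a fresh cached value, or call fetch(key) and store the result."""
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1]  # fresh: no roundtrip to the server
        value = fetch(key)  # missing or stale: one roundtrip to refresh
        self._entries[key] = (now, value)
        return value


def fetch_from_server(name):
    # Stand-in for a real network call to the API.
    print(f"roundtrip to server for {name}")
    return {"name": name, "age": 42}


cache = TTLCache(ttl_seconds=60)
cache.get("John Smith", fetch_from_server)  # hits the server
cache.get("John Smith", fetch_from_server)  # served from the cache
```

Note that this sketch refuses to serve stale entries. Serving the old value while the server is down would trade freshness for availability, which is a separate design decision.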
Who decides when the cache should be refreshed? The server specifies it, and the entity managing the cache ought to enforce it. The server does so through a set of directives in the Cache-Control HTTP header. The maximum age of a cached response (max-age, given in seconds) is one of these directives. This is also known as time-to-live, a.k.a. TTL. On a side note, if you have ever run a ping command in your terminal, you have probably seen a TTL value in each reply. That TTL belongs to the response packet, and it is counted in hops rather than seconds: it is the number of IP routers the packet can still pass through. Each router decrements it, and once it reaches zero the packet is thrown away.
Many other details about how a cache should be used are specified in the cache control directive in the HTTP header.
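As a small illustration, the Python sketch below pulls max-age out of a Cache-Control header value and uses it to decide whether a cached response is still fresh. It is deliberately simplified: directives such as no-store, no-cache and s-maxage are ignored here, and the function names max_age_seconds and is_fresh are made up for the example.

```python
def max_age_seconds(cache_control):
    """Return the max-age value from a Cache-Control header, or None if absent."""
    for directive in cache_control.split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age" and value.isdigit():
            return int(value)
    return None


def is_fresh(age_seconds, cache_control):
    """True if a response cached age_seconds ago is still within max-age."""
    limit = max_age_seconds(cache_control)
    # Without a max-age, this sketch plays it safe and treats the entry as stale.
    return limit is not None and age_seconds < limit


header = "public, max-age=3600"
print(max_age_seconds(header))  # 3600
print(is_fresh(120, header))    # True
print(is_fresh(4000, header))   # False
```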
5. Layered architecture
A layered architecture is like your bed. You sleep on the mattress which is on top of the frame but you don’t feel the frame. You can switch out the frame but keep the same mattress and nobody would know.
The app calls the API gateway which hosts the endpoint. The gateway routes the request to a compute node (EC2 in case of AWS) or a serverless function (Lambda in AWS).
This is layered architecture. Each layer only knows about the layer next to it, and no more. This promotes separation-of-concerns. Each layer is responsible for a specific role. It knows how to do its job, and passes information to and from its immediate neighboring layer to get the job done. You only know how to use your phone screen to type a text message and hit send. You don’t need to know how to program the chips inside your phone to send that text.
The app only knows about the gateway which hosts the endpoint. The app doesn’t interface with the compute nodes directly. You can switch out the compute nodes and nobody would know. When a million users are hitting up the endpoints, the compute nodes may need to scale up. The results could be coming in from different servers but to the app, the result is coming from one single endpoint — the API Gateway. The app doesn’t need to manage its connection, and traffic routing to and from multiple servers that got spawned when traffic spiked. The gateway takes care of that.
In our example above the layers are as follows -
The API Gateway/Endpoint layer followed by the Compute layer (EC2/Serverless). What layer do you think comes after that? (We are not counting the app as a layer because the layered architecture is all on the server side). The answer is the Database layer. Compute might need to store or fetch some information from a database to do its job but the API Gateway doesn’t need to know anything about the database. You can switch out the database and nobody would know other than the Compute layer.
This makes the above example a 3-tier architecture. This is usual for small businesses and apps. Large organizations and complex apps may need 5 or more layers.
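The three layers described above can be sketched as three small Python classes, each holding a reference only to the layer directly below it. The class names, the method names and the /flights/3157/status path are assumptions for illustration. In a real deployment these would be separate services, not objects in one process.

```python
class Database:
    """Storage layer. Only the compute layer talks to it."""

    def __init__(self):
        self._status_by_flight = {3157: "BOARDING"}

    def read_status(self, flight_number):
        return self._status_by_flight[flight_number]


class Compute:
    """Business logic layer. Knows the database, not the gateway."""

    def __init__(self, database):
        self._database = database

    def flight_status(self, flight_number):
        status = self._database.read_status(flight_number)
        return {"flight_number": flight_number, "status": status}


class Gateway:
    """Endpoint layer. Knows the compute layer, nothing about storage."""

    def __init__(self, compute):
        self._compute = compute

    def get(self, path):
        # Expects a hypothetical path of the form /flights/<number>/status
        parts = path.strip("/").split("/")
        return self._compute.flight_status(int(parts[1]))


# The app only ever holds a reference to the gateway.
app_facing = Gateway(Compute(Database()))
print(app_facing.get("/flights/3157/status"))
```

Swapping Database for any other object with a read_status method changes nothing in Gateway or in the app, which is the "switch out the frame, keep the mattress" property.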
Can you check off the 5 guiding principles on your checklist? If not then your API is not RESTful, and no amount of throwing buzzwords in an interview will make it so.
For all of you who have stayed with this article, here’s a bonus piece of information: there is a 6th principle of REST, but it is optional. It’s called Code-On-Demand (COD). The server sends some code to the client along with the data. This code tells the client what to do with the data. This way the client doesn’t need to store business logic. It becomes a passive display device, a dumb terminal whose CPU executes the code it receives from the server on the data it receives from the server.
REST is not a technology. It’s an architectural style to support API design. Although it is widely used, it has some drawbacks. Fine-grained endpoints multiply the number of endpoints to maintain. Coarse-grained endpoints cut down on the number of endpoints and on per-packet overhead, but they shuttle around data that the client never uses.
Another API architecture that is gaining popularity is GraphQL. It lets the client retrieve information from an API endpoint in a way that is tailored to its query. There is no extra JSON to de-serialize and parse through to get to what you really want. In GraphQL, the client’s query, rather than the endpoint, defines the shape of the response.
If you’re interested in GraphQL, google it. As a developer, you should be familiar with both.
Good luck with your API design with REST architecture. Again, it is by far the most widely used today and its simplicity facilitates rapid development, integration, testing, and deployment.