Part 7 — Cache, compression and much more on RESTful APIs performance

Kapi, from LinkApi
7 min read · Sep 13, 2019


Hey hackers!

I’m always saying that designing a solution’s architecture is an art.

Even though there are many patterns and techniques, a real project has countless variables, so it's not as trivial as it may seem.

One of the most discussed topics in architecture is performance, because it directly impacts server resource usage and, therefore, how far your solution can scale.

I'll clarify this topic a little and show you how to get better performance out of RESTful APIs.

What is cache?

Cache is a computational storage structure designed to keep copies of frequently accessed data.

Its purpose is to make data lookups faster and to reduce the load on server resources.

With cache, your API will have the following benefits:

  • Lower network latency
  • Less processing load on the servers
  • Faster response times for clients

Caches are divided into two major categories:

Shared cache — stores server responses for reuse among multiple users.

Private cache — stores server responses for a single user.

These caches can be implemented in browsers, proxy servers, gateways, CDNs, reverse proxies, or web server load balancers.

Cache in client

If you have read my last articles from this series, you already know that a RESTful API uses HTTP protocol for client and server communication.

Having said that, the HTTP protocol has its own specification for caching (RFC 7234).

All of these policies are declared through HTTP headers. Let's jump in and understand what the available policies are.

Note: I won't cover the HTTP/1.0 cache headers (such as Pragma), since they have been superseded.

Cache Control

The Cache-Control header is the recommended way to declare caching behavior. Its main directives are:

No caching

With the “no-store” directive, you specify that neither client requests nor server responses will be cached. In other words, the client downloads the full server response on every request.

Cache-Control: no-store

Cache but revalidate

With the “no-cache” directive, the response may be stored, but the browser must validate it with the server before each reuse.

The point here is to save bandwidth while making sure the cache stays up to date: if the server confirms the cached copy is still valid, the browser reuses it instead of downloading the data again.

Cache-Control: no-cache

Private and public caches

The “public” directive allows data from server responses to be stored in a shared cache, while the “private” directive only allows it to be stored in the browser, for a single user.

Expiration

The “max-age” directive sets the maximum time (in seconds) that data may be served from cache.

This directive implicitly tells your browser to store the response in cache, but to revalidate it with the server once that maximum age is exceeded.

Cache-Control: max-age=30
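As an illustration, here is how a client-side cache could decide whether a stored response is still fresh under max-age. This is a minimal sketch; the function and its parameters are mine, not part of any HTTP library:

```python
import time

def is_fresh(stored_at, max_age, now=None):
    """Return True while the cached response's age is within max-age (seconds)."""
    now = time.time() if now is None else now
    return (now - stored_at) < max_age

# A response cached at t=100 with Cache-Control: max-age=30
# is still fresh at t=120, but stale at t=140 — at that point
# the browser must revalidate it with the server.
```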

Validation

The “must-revalidate” directive tells caches that once a response becomes stale, it must be revalidated with the server before being reused — a stale copy must never be served. While the response is still fresh, the cache keeps being used normally.

Cache-Control: must-revalidate

Entity Tags

ETag is one of the most important headers in the cache topic.

It's sent in a server response to identify the version of the returned data.

It works as a consistent unique key, enabling an intelligent flow for the client to consume cached data.

This is what the flow looks like:

  1. The client makes a GET request.
  2. The server sends the data along with an ETag key, which the client stores locally.
  3. The client makes the next request with the ETag value in the “If-None-Match” header.
  4. The server checks whether the data was modified; if it wasn't, it returns HTTP status code 304 — Not Modified, with no body.
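The steps above can be sketched on the server side: derive an ETag from the response data and compare it against If-None-Match. This is a minimal sketch with hypothetical helper names; real servers often derive ETags from version numbers or timestamps rather than hashing the whole body:

```python
import hashlib

def make_etag(body):
    # Derive a version key from the content; any stable scheme works.
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'

def handle_get(body, if_none_match=None):
    """Return (status, payload, etag) for a possibly conditional GET."""
    etag = make_etag(body)
    if if_none_match == etag:
        return 304, b"", etag   # client copy is current: no body sent
    return 200, body, etag      # first request or data changed: full body

# 1st request: no ETag yet -> 200 with the body and an ETag
status, payload, etag = handle_get(b'{"id": 1}')
# 2nd request: data unchanged, ETag matches -> 304 with an empty body
status2, payload2, _ = handle_get(b'{"id": 1}', if_none_match=etag)
```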

Cache in Server

In the last topic, we saw how cache instructions work at the level of HTTP clients.

So we now know how to handle caching on the frontend; next, we need to understand how to do it on the server side.

Let's look at the main techniques available to anyone developing a RESTful API.

Load Balancing

When your API receives more requests than a single server can handle, you can use this technique to distribute both network traffic and processing across multiple servers. This makes your API more scalable and safer, because if one of the servers fails, others remain available.

Load balancers don't cache by default, but many do offer session affinity (sticky sessions).

As we know, HTTP is stateless, so it doesn't store sessions; a load balancer, however, can pin each client's session to a particular server while distributing requests. Because of this, that server can keep reusing locally cached state for the client, which works as something similar to a cache.

There are numerous load balancing algorithms, and the main implementations are made through web servers’ own settings, like Nginx, Apache, and IIS.
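As a sketch, a round-robin setup in Nginx takes only a few lines. The backend addresses below are placeholders, and `ip_hash` is one way to get the session affinity mentioned above:

```nginx
# Round-robin load balancing across two API instances (addresses are examples)
upstream api_backend {
    # ip_hash;            # uncomment for session affinity (sticky clients)
    server 10.0.0.1:8080;
    server 10.0.0.2:8080;
}

server {
    listen 80;
    location / {
        proxy_pass http://api_backend;
    }
}
```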

Reverse Proxy

A reverse proxy is, basically, a public interface for your API: it works as an agent that intermediates all external requests.

Main advantages of using reverse proxy:

Security — features that protect against DDoS attacks, plus centralized SSL certificate handling.

Performance — native features for caching and compressing transmitted data. Because it acts as an intermediary for every request, it can treat data intelligently before sending it back to the client.

Usually, the implementations are also made through settings for the Web server itself, like Nginx, Apache, and IIS.
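Here is a minimal sketch of response caching and compression in an Nginx reverse proxy; the paths, durations, and upstream address are all illustrative (in a full config, `proxy_cache_path` lives in the `http` context):

```nginx
# Cache upstream API responses for a short time at the reverse proxy
proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=api_cache:10m max_size=100m;

server {
    listen 80;
    location / {
        proxy_cache api_cache;
        proxy_cache_valid 200 30s;   # keep successful responses for 30 seconds
        gzip on;                     # compress responses on the way out
        proxy_pass http://localhost:8080;
    }
}
```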

Gateway

Think about a reverse proxy with more features.

That’s the Gateway. It has all reverse proxy features plus many more.

In practice, besides being your API’s access interface, the gateway also brings interesting features, like routing, monitoring, authentication and authorization, data transformation, advanced security by policies, and more.

It's the standard architectural choice and the most recommended technique for using cache in a RESTful API.

In that case, instead of using a web server, you can use API Management platforms like LinkApi, for instance.

An API Gateway ships with embedded cache intelligence: it makes use of its own reverse proxy features, and for more advanced needs it relies on high-performance data stores such as Redis and Memcached.
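The gateway's caching follows the classic cache-aside pattern. The sketch below uses a plain dict with a TTL as a stand-in for Redis; the class, the fetch callback, and the TTL value are all illustrative:

```python
import time

class TTLCache:
    """Tiny cache-aside store; a dict stands in for Redis here."""
    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.store = {}          # key -> (value, stored_at)

    def get_or_fetch(self, key, fetch, now=None):
        now = time.time() if now is None else now
        hit = self.store.get(key)
        if hit and now - hit[1] < self.ttl:
            return hit[0]        # fresh cached value: skip the backend
        value = fetch(key)       # miss or stale: call the backend
        self.store[key] = (value, now)
        return value
```

With Redis the flow is the same: GET the key, and on a miss call the backend and SET the key with an expiry (the EX option).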

CDN

CDN stands for Content Delivery Network, and it's a widespread concept across the web.

It works as a network of servers that keeps replicas of web content and distributes them in an optimized way.

The key idea behind a CDN is that those replicas live in strategic geographic locations, so the CDN can redirect each client's request to a nearby server, making latency far shorter.

As latency is reduced, the time for responses is also reduced. Therefore, it’s a great choice for a high-performance API that’s consumed in different geolocations.

This is a technique that can be adopted for the general Web, not just for the APIs world.

The main CDN players are Akamai and CloudFront.

Data Compression [Bonus]

The REST pattern allows many data formats, like XML, JSON, HTML, and more.

All of them can be compressed, making your server deal with smaller amounts of data and enhancing your API's performance.

Compression is negotiated through two request/response headers:

Accept-Encoding

As a client, the request’s header would be:

Accept-Encoding: gzip,compress

Gzip is the most widely used compression format.

Content-Encoding

When a server implements data compression, it compresses the data before returning it to the client, and it signals that through the Content-Encoding header. This is what the client receives:

200 OK
Content-Type: text/html
Content-Encoding: gzip
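To see what happens behind Content-Encoding: gzip, here is a minimal sketch in Python. The JSON payload is made up; repetitive text, as API responses often are, compresses very well:

```python
import gzip

# A repetitive JSON-like body, typical of API list responses
body = b'[' + b'{"id": 1, "name": "example"},' * 200 + b'{"id": 1}]'

compressed = gzip.compress(body)          # what the server sends over the wire
restored = gzip.decompress(compressed)    # what the client does on receipt

print(len(body), len(compressed))         # the compressed body is far smaller
assert restored == body                   # compression is lossless
```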

Some important considerations:

If the server cannot produce any of the encodings listed in Accept-Encoding, it may return status code 406 — Not Acceptable (in practice, most servers simply respond uncompressed).

If a client sends a request body with a Content-Encoding the server doesn't support, the server should return status code 415 — Unsupported Media Type.

An interesting fact is that the majority of browsers automatically make requests using those data compression techniques.

That’s it, folks, we’re coming to the end of another article on our RESTful APIs series.

As you may have noticed, performance is a broad concept in software development, but to keep things brief, I'll stop here.

If you are interested in learning more about performance, leave me a comment in this article. If a lot of people are interested, I’ll make a focused article on how to diagnose performance issues through APM (Application Performance Management) tools, stress and load testing, and many more.

We’re reaching the end of our series, and the next and last article will be a thorough list of best practices when developing APIs. Stay tuned and see you on the next one!
