3 Best Practices for API Clients
There are many blog posts/articles on best practices for well-engineered REST APIs. This article is an attempt to do the same for the clients of REST APIs.
You have selected a well-engineered API that is resilient, secure, and scalable to meet your needs as an API client application. It shouldn’t be expected to go down, right? The reality is, a large number of badly behaving clients (either malicious or poorly coded) can overwhelm even the best API or API gateway. If building with APIs is a partnership between the API client and producer, how can clients be good stewards in that relationship? The focus of this article is on helping API clients better integrate with their APIs and is based on my experience as an engineering leader for Capital One DevExchange.
API Client Best Practices
Know Thy HTTP Status Codes
HTTP status codes have been around for a long time and are very well documented. Yet a large number of poor client behaviors stem from improper handling of HTTP status codes. Here are two simple rules to follow:
1. HTTP 4XX Status Code: This means one thing: the server (API or gateway) is pointing the finger at the client for having erred. If this happens, do not retry the request as-is; something needs to change before the client can retry. More often than not, the client will need to do something different, such as providing missing request data. In some cases, an out-of-band action has to be taken first, for example provisioning access on the server to clear an HTTP 401 (or 403).
Example: I recently came across a client that, instead of slowing down after receiving an HTTP 429 (rate limited by the gateway for sending too many requests), kept retrying blindly. This put unnecessary load on the gateway infrastructure and made the client's own rate-limiting problem worse.
2. HTTP 5XX Status Code: This means one thing: the server (API or gateway) is saying ‘mea culpa’, so give it a break. Clients have the right to retry but should do so cautiously. Retrying blindly on HTTP 5XX could further destabilize the API. So, what should clients do?
- In cases where a user is waiting, the client should abort the process so the user can choose whether to retry.
- For background calls or system-to-system calls with no user waiting, the client may retry using an exponential backoff strategy (more on that below).
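The two rules above can be condensed into a small decision function. This is only a sketch: the class name `RetryPolicy`, the `Action` values, and the `userIsWaiting` flag are hypothetical names chosen for illustration, not part of any particular HTTP library.

```java
// Hypothetical helper: maps an HTTP status code to the action the two rules suggest.
final class RetryPolicy {
    enum Action { SUCCEED, FIX_REQUEST, SLOW_DOWN, ABORT_AND_ASK_USER, RETRY_WITH_BACKOFF, OTHER }

    static Action decide(int status, boolean userIsWaiting) {
        if (status >= 200 && status < 300) {
            return Action.SUCCEED;
        }
        if (status == 429) {
            // Rate limited: send fewer requests instead of retrying immediately.
            return Action.SLOW_DOWN;
        }
        if (status >= 400 && status < 500) {
            // Client error: the same request will fail again until something changes.
            return Action.FIX_REQUEST;
        }
        if (status >= 500 && status < 600) {
            // Server error: let a waiting user decide; background work may back off and retry.
            return userIsWaiting ? Action.ABORT_AND_ASK_USER : Action.RETRY_WITH_BACKOFF;
        }
        // 1XX and 3XX are outside the scope of these two rules.
        return Action.OTHER;
    }
}
```

For instance, `RetryPolicy.decide(503, false)` yields `RETRY_WITH_BACKOFF`, while `RetryPolicy.decide(400, false)` yields `FIX_REQUEST`.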
Know Thy OAuth Tokens
There are a number of ways for an API to be protected, including Basic and Digest Authentication. But if the API you are accessing is secured using OAuth 2.0, then as an API client you will deal with tokens a lot: access tokens, bearer tokens, refresh tokens, and so on. Here are a few tips on how to handle these tokens:
- Reuse Tokens: More often than not, access tokens are reusable until they expire. Reuse the token and do not request a new token for every API call.
- Handle Expired Tokens Gracefully: Do not hardcode the token expiry time. A well-designed API should return an expires_in field when it issues a new token and/or return HTTP 401 when an expired token is used. Use these as triggers to request a new access token (using the previously provided refresh token).
- Treat Tokens Opaquely: Do not depend on or make any assumptions about the structure, content, or size of the tokens. They are server-side implementation details, and relying on them could break clients inadvertently.
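The three tips above might be combined in a small token cache along the following lines. Everything named here is an assumption made for illustration: `TokenCache`, its `Token` holder, the 30-second safety margin, and the `fetcher` supplier, which stands in for whatever call your client makes to the token endpoint (for example, a refresh-token request).

```java
import java.time.Clock;
import java.time.Duration;
import java.time.Instant;
import java.util.function.Supplier;

// Hypothetical cache: reuses an access token until the server-reported lifetime runs out.
final class TokenCache {
    /** Assumed shape of a token response: an opaque value plus expires_in, in seconds. */
    static final class Token {
        final String value;
        final long expiresInSeconds;

        Token(String value, long expiresInSeconds) {
            this.value = value;
            this.expiresInSeconds = expiresInSeconds;
        }
    }

    private final Supplier<Token> fetcher;  // assumed to call the token endpoint
    private final Clock clock;
    private final Duration safetyMargin = Duration.ofSeconds(30);  // illustrative value
    private String cached;                  // treated as opaque, never parsed
    private Instant expiresAt = Instant.MIN;

    TokenCache(Supplier<Token> fetcher, Clock clock) {
        this.fetcher = fetcher;
        this.clock = clock;
    }

    /** Returns the cached token, fetching a new one only when none is usable. */
    synchronized String get() {
        if (cached == null || !clock.instant().isBefore(expiresAt)) {
            Token fresh = fetcher.get();
            cached = fresh.value;
            // Expiry comes from the server's expires_in, not from a hardcoded constant.
            expiresAt = clock.instant().plusSeconds(fresh.expiresInSeconds).minus(safetyMargin);
        }
        return cached;
    }

    /** Call this when the API answers HTTP 401 so the next get() fetches a new token. */
    synchronized void invalidate() {
        cached = null;
    }
}
```

The safety margin simply stops the client from sending a token that is about to expire while the request is in flight; the right value depends on your latency and is not prescribed by OAuth 2.0.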
Exponential Backoff Strategy
Upon receiving an HTTP 5XX status code, a client could retry the request as-is, but should do so after a delay. One of the best ways to do these retries is by employing an exponential backoff algorithm.
Simply put, the client waits progressively longer between retries. For example, retry after 1 second, 2 seconds, 4 seconds, 8 seconds, 16 seconds… and so on.
In Java, this could be expressed this way:
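A minimal sketch of the doubling schedule, assuming a hypothetical helper class named `ExponentialBackoff` and a 1-second starting delay to match the example above:

```java
// Hypothetical helper: computes the wait before each retry by doubling from 1 second.
final class ExponentialBackoff {
    /** Delay in seconds before retry number `attempt` (0-based): 1, 2, 4, 8, 16, ... */
    static long delaySeconds(int attempt) {
        if (attempt < 0 || attempt > 62) {
            // Beyond 62 the shift below would overflow a signed 64-bit long.
            throw new IllegalArgumentException("attempt must be between 0 and 62");
        }
        return 1L << attempt;  // 2^attempt
    }
}
```

Calling `delaySeconds(0)` through `delaySeconds(4)` gives 1, 2, 4, 8, and 16 seconds.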
Such an algorithm should also enforce a maximum retry interval and a maximum number of retries, with values tuned to the API and the use case. For example, retry after 1 second, 2 seconds, 4 seconds, 8 seconds, 16 seconds… up to a maximum interval of 300 seconds or up to 10 retries.
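One way to bound the schedule is sketched below, using the 300-second cap and 10-retry limit from the example. The class `BoundedRetry`, the `request` supplier (assumed to send the call and return its HTTP status), and the `sleepMs` callback (assumed to wait for the given number of milliseconds) are hypothetical names, injected so the loop can be exercised without real network calls or real sleeping.

```java
import java.util.function.IntSupplier;
import java.util.function.LongConsumer;

// Hypothetical retry loop: exponential backoff with a capped interval and a retry limit.
final class BoundedRetry {
    static final long BASE_DELAY_MS = 1_000;    // first wait: 1 second
    static final long MAX_DELAY_MS = 300_000;   // never wait longer than 300 seconds
    static final int MAX_RETRIES = 10;          // give up after 10 retries

    /** Delay before retry number `retry` (0-based), doubled each time and capped. */
    static long cappedDelayMs(int retry) {
        // Clamp the shift so very large retry counts cannot overflow the long.
        long uncapped = BASE_DELAY_MS << Math.min(retry, 20);
        return Math.min(uncapped, MAX_DELAY_MS);
    }

    /** Repeats `request` while it returns 5XX, waiting between tries, and returns the last status. */
    static int callWithRetry(IntSupplier request, LongConsumer sleepMs) {
        int status = request.getAsInt();
        for (int retry = 0; retry < MAX_RETRIES && status >= 500 && status < 600; retry++) {
            sleepMs.accept(cappedDelayMs(retry));
            status = request.getAsInt();
        }
        return status;
    }
}
```

With these constants the waits run 1, 2, 4, … 256 seconds and then stay at 300 seconds. The caller still has to decide what to do when the final status is a 5XX.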
And finally, if the client runs on a large number of devices (such as a mobile client calling an API in the background), it must add jitter (a randomized delay) to the exponential backoff algorithm. Without that randomness in the wait time, millions of clients could all retry at the same moment during an error scenario and overwhelm the API further. Here is a sample graph of what system load may look like with and without jitter.
Refer to the AWS Architecture Blog post "Exponential Backoff And Jitter" for more.
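As a rough illustration, the sketch below uses the "full jitter" variant discussed in that AWS post: each wait is drawn uniformly at random between zero and the capped exponential delay. The class and parameter names are hypothetical, and other jitter variants exist.

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical helper: exponential backoff with full jitter.
final class JitteredBackoff {
    /**
     * Returns a random delay in [0, min(capMs, baseMs * 2^retry)] milliseconds.
     * Assumes non-negative inputs and a modest baseMs (the shift is clamped at 20 bits).
     */
    static long fullJitterDelayMs(int retry, long baseMs, long capMs) {
        long ceiling = Math.min(capMs, baseMs << Math.min(retry, 20));
        // nextLong's bound is exclusive, so add 1 to make the ceiling reachable.
        return ThreadLocalRandom.current().nextLong(ceiling + 1);
    }
}
```

Because every client draws its own random delay, retries that would otherwise land at the same instant are spread across the whole interval.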
While malicious API clients do occur, most APIs and API gateways are overwhelmed by well-intentioned but poorly coded API clients. A poorly coded API client can undo all the care that went into selecting a resilient, secure, and scalable API for your app. Help prevent this by understanding and implementing these simple best practices. Your users will thank you.
DISCLOSURE STATEMENT: These opinions are those of the author. Unless noted otherwise in this post, Capital One is not affiliated with, nor is it endorsed by, any of the companies mentioned. All trademarks and other intellectual property used or displayed are the ownership of their respective owners. This article is © Capital One 2018.