Aspects for API Designing (1/3)

David E Lares S
Feb 22 · 7 min read

APIs are among the most common software pieces used today for building products and services. They act as an abstraction or middleware layer over low-level system interactions.

Many authors skip the logic behind an API implementation and go straight to explaining how to use a framework and connect to persisted data, leaving the design aspect aside.

The aim of this post is to talk briefly about good practices of API development, the HTTP protocol, the REST architecture and its resource representation, error handling, versioning, caching, and so on.

The API value chain

Let’s start with the API value chain. In a few words, it describes how data moves from repositories or databases to a later offering for end users.

Along the way, the data is exposed by API developers and then leveraged by product owners or app developers, who design the mechanisms for interacting with that data based on the design principles established by the API owners.

The most important link in this chain is the app developer, because these are the people who control how end users access your exposed data. So you, as an API owner, should make your API easier for developers to use than your competition's. The easier it is to adopt, the better.

The API value chain treats APIs as a product. For that reason, you should always pay attention to design principles such as simplicity, unambiguous design, clear intent for each action, and, of course, consistency.

You can check this article that explains in detail how APIs work across many industries, and if you are already familiar with APIs, I recommend this lecture about managing APIs.

REST APIs

REST APIs are a type of API oriented around identifying resources. Here’s an example URL from the Marvel API:

http://gateway.marvel.com/v1/public/comics

It consists of a base URL, a version, and a referred resource. As you can see, a good practice is to avoid exposing the API directly on the main domain name; instead, use a simple base URL or a subdomain.
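To make the three parts concrete, here is a minimal sketch that splits the Marvel URL above with Python's standard library. The split_endpoint helper is my own name for illustration, and it assumes the first path segment is always the version.

```python
from urllib.parse import urlsplit


def split_endpoint(url: str) -> dict:
    """Break an endpoint URL into base URL, version, and resource path."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    # Assumption: the first path segment carries the version (e.g. "v1").
    version = segments[0] if segments else ""
    resource = "/".join(segments[1:])
    return {
        "base_url": f"{parts.scheme}://{parts.netloc}",
        "version": version,
        "resource": resource,
    }


endpoint = split_endpoint("http://gateway.marvel.com/v1/public/comics")
```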

Also, use plural nouns to name resources, and keep in mind that API operations (actions) are not always the same thing as CRUD operations. Resources can also be associated in an endpoint to perform mixed operations if required. Avoid deep nesting and subqueries as well.

The uniform interface constraint says we need to establish a contract between client and server for the CRUD implementation. So, depending on the HTTP method and the status code of the response, you can figure out what type of action was performed on the resource.

The HTTP method indicates what is to be performed on a resource:

  1. POST is for creating a resource
  2. GET is for reading a resource
  3. PUT / PATCH are for updating a resource
  4. And DELETE, well, deletes a resource

What about the HTTP status? Responses have a 3-digit status code that represents the outcome of the action.

  1. The 1xx is for informational responses
  2. The 2xx is for success events
  3. The 3xx is for redirection scenarios, such as a temporary redirection
  4. The 4xx specifies client errors
  5. And the 5xx specifies server errors
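The five families above can be told apart by the first digit of the code. A minimal sketch, where status_class and STATUS_CLASSES are illustrative names of mine and HTTPStatus comes from the Python standard library:

```python
from http import HTTPStatus

# First digit of the status code -> family, as listed above.
STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def status_class(code: int) -> str:
    """Return the family a status code belongs to, or "unknown"."""
    return STATUS_CLASSES.get(code // 100, "unknown")


family = status_class(HTTPStatus.NOT_FOUND)
```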

This is the standard. Of course, you can create your own codes, but that is strongly discouraged. You can work with the many status codes available, and I’m pretty sure they will suit your needs. Check them out here.

Here are some examples of CRUD scenarios.

[POST]

  1. success (2xx): return the new object
  2. failure: 4xx (bad request) or 5xx (an issue in processing)

[GET]

  1. success (2xx): OK, return the response
  2. failure: 4xx (bad request) or 5xx (an issue in processing)

[PUT (update the whole element) or PATCH (update part of the element)]

  1. success: 200 (OK), 204 (no content), or 201 (created)
  2. there is no need to send back a link to the resource

[DELETE]

  1. success (2xx): OK. It uses the same responses as the PUT method, and there is no need to return content
  2. failure: same as usual (4xx or 5xx)
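The scenarios above can be sketched as a tiny in-memory handler. This is an illustration, not a real framework: the ComicStore class, its handle method, and the required "title" field are all assumptions of mine, and the status choices are just one reasonable reading of the lists above.

```python
from http import HTTPStatus


class ComicStore:
    """In-memory stand-in for a /comics resource (hypothetical)."""

    def __init__(self):
        self.comics = {}
        self.next_id = 1

    def handle(self, method, comic_id=None, body=None):
        """Return a (status, payload) pair for one request."""
        if method not in {"POST", "GET", "PUT", "PATCH", "DELETE"}:
            return HTTPStatus.METHOD_NOT_ALLOWED, {"error": "unsupported method"}
        if method == "POST":
            # Assumption: a comic needs at least a title to be created.
            if not body or "title" not in body:
                return HTTPStatus.BAD_REQUEST, {"error": "title is required"}
            comic = {**body, "id": self.next_id}
            self.comics[self.next_id] = comic
            self.next_id += 1
            return HTTPStatus.CREATED, comic
        if comic_id not in self.comics:
            return HTTPStatus.NOT_FOUND, {"error": "comic not found"}
        if method == "GET":
            return HTTPStatus.OK, self.comics[comic_id]
        if method == "PUT":
            # PUT replaces the whole element.
            self.comics[comic_id] = {**(body or {}), "id": comic_id}
            return HTTPStatus.OK, self.comics[comic_id]
        if method == "PATCH":
            # PATCH merges the given fields into the existing element.
            merged = {**self.comics[comic_id], **(body or {}), "id": comic_id}
            self.comics[comic_id] = merged
            return HTTPStatus.OK, merged
        # DELETE: nothing to return, so answer with 204 (no content).
        del self.comics[comic_id]
        return HTTPStatus.NO_CONTENT, None
```

Calling ComicStore().handle("POST", body={"title": "X-Men"}) answers with 201 and the new object, while a GET for an id that does not exist answers with 404.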

The Resource Representation

Depending on your needs, your API can support multiple response formats such as JSON, XML, or CSV, which widens the range of clients it can serve.

A common approach is to specify the desired format in the query parameters, adding something like ...&format=json or ...&format=csv . Another option is to use HTTP headers: the client asks for JSON with Accept: application/json or for XML with Accept: application/xml , and the server states what it actually returned in the Content-Type response header.

The response must include the Content-Type , the HTTP status code, and the document itself in the requested format.
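As a rough sketch of the format parameter idea, the function below renders the same rows as JSON or CSV and returns the matching Content-Type. The render name and the sample rows are made up for illustration; only the standard json and csv modules are used.

```python
import csv
import io
import json


def render(rows, fmt="json"):
    """Return (content_type, body) for a list of flat dicts."""
    if fmt == "json":
        return "application/json", json.dumps(rows)
    if fmt == "csv":
        buffer = io.StringIO()
        # Column names are taken from the first row (assumes uniform rows).
        fieldnames = list(rows[0]) if rows else []
        writer = csv.DictWriter(buffer, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
        return "text/csv", buffer.getvalue()
    raise ValueError(f"unsupported format: {fmt}")


rows = [{"id": 1, "title": "X-Men"}, {"id": 2, "title": "Avengers"}]
content_type, body = render(rows, "csv")
```

An unsupported format raises an error here; in a real API that would typically map to a 4xx response.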

API responses

When responding to an API call, you return a success or a failure response based on the HTTP status. Sometimes the browser resolves this for you, but regardless of the case, you need to decide in your code which status to use for each response.

That also includes error handling. Depending on the error, you will need to set the HTTP status code to a 4xx or 5xx value, and format the response with a common template for the sake of simplicity and readability.

The template is intended for developers only. It must include a text message, the timestamp, the method used, the endpoint information, and a detailed list of errors.

For success cases, you can use a simpler variant of the error format, with just the status code and the payload (the data itself).
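Here is one possible shape for that template, as a sketch. The field names (status, message, timestamp, method, endpoint, errors, data) are my own choice; the post only says which pieces of information should be present.

```python
from datetime import datetime, timezone


def error_response(status, message, method, endpoint, errors):
    """Build the common error template described above."""
    return {
        "status": status,
        "message": message,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "method": method,
        "endpoint": endpoint,
        "errors": list(errors),
    }


def success_response(status, data):
    """Success variant: just the status code and the payload."""
    return {"status": status, "data": data}


failure = error_response(
    400, "Validation failed", "POST", "/v1/public/comics",
    [{"field": "title", "detail": "title is required"}],
)
success = success_response(200, {"id": 1, "title": "X-Men"})
```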

The API changes

API changes should be transparent to the end user, but many times they will affect the experience of internal and external consumers. For that reason, you need to be really cautious when planning and delivering changes.

When adding a change, you need to ensure that it is non-breaking; if it is not, the change will break the API for existing consumers. Adding new operations, resources, or optional parameters is usually safe, while changing the HTTP verb of an existing operation is a typical breaking change.

Intense planning is a must for API development. You need to eliminate or minimize the impact on developers, preserve backward compatibility, provide support, and establish a well-scheduled change.

This transition can be managed with semantic versioning: provide support for at least X previous versions for a period of time, mark unsupported versions as deprecated, and publish a rollout plan in advance.

A changelog file helps to clarify what the new features are and why the changes were made.
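A small sketch of that policy, assuming MAJOR.MINOR.PATCH version strings where a major bump signals a breaking change. The function names and the default of two supported previous majors are placeholders for whatever X you pick.

```python
def parse_version(version: str) -> tuple:
    """Split a "MAJOR.MINOR.PATCH" string into three integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def is_breaking(old: str, new: str) -> bool:
    """Under semantic versioning, a higher major number signals a breaking change."""
    return parse_version(new)[0] > parse_version(old)[0]


def support_status(version: str, latest: str, previous_supported: int = 2) -> str:
    """Label a version "supported" or "deprecated" relative to the latest one."""
    age = parse_version(latest)[0] - parse_version(version)[0]
    if age < 0:
        raise ValueError("version is newer than latest")
    return "supported" if age <= previous_supported else "deprecated"
```

With the default setting, support_status("1.0.0", "4.0.0") reports the 1.x line as deprecated, while 2.x and 3.x are still supported.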

Cache

Another strategic behavior is caching. You can specify the cache behavior to improve performance.

Caching can happen on the client, at the ISP, at the gateway, or on the back-end side, and it is used mainly to improve performance. When the hardware can no longer handle the volume of incoming traffic, a cache can extend the life of your bare-metal servers.

In the mid-tier, the call goes from the API to the database, and the response is then sent to the client. Whatever the type of cache, the closer it sits to the API consumer, the better for performance.

Caching is not for every type of data. It is ideal for static data, files, and front-end assets. Caching dynamic data is risky because you need to evaluate its rate of change, time sensitivity, and security.

With this in mind, you need to make decisions on which component should control the cache, what to cache, and for how long. This is done with the cache directives.

The Cache-Control directives govern caching as defined in the HTTP specification (originally RFC 2616, whose caching rules now live in RFC 9111). They must be obeyed by every caching mechanism along the request chain. The directives appear in the HTTP response headers set at the API level and in the request itself, and the settings should be consistent across them.

The Cache-Control header controls who can cache the response and under what conditions. It can override the default caching behavior and protect sensitive data from being cached.

The public and private directives define who may cache the response. A public response can be stored by any cache, including intermediaries. A private response is meant for a single user, so intermediaries must not store it; only the user's own cache (the browser) may.

Here’s an example: Cache-Control: private, max-age=60

The no-store directive forbids storing the response in any cache, whether in the browser or in an intermediary, so nothing is written to disk or replicated in backups. Sensitive data should not be stored anywhere.

Here’s an example: Cache-Control: no-store

The no-cache directive does not forbid caching. It means a cached copy must be revalidated with the server before it is reused, because a subsequent request to the same URL can return different data. This is where the ETag comes in.

The ETag is calculated by the server and sent along with the response data. It works like a hash of the data: if the data changes, so does the ETag, so the ETag received in the original response can be used to check whether the data has changed.

The max-age directive sets how long, in seconds, the cached response remains valid.

Here’s another example: Cache-Control: no-cache . In this case, the client can check whether the data has changed by sending the ETag back to the server (in the If-None-Match request header).
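The ETag check can be sketched like this. It is a simplified model, not a server: respond and make_etag are hypothetical helpers, the ETag is assumed to be a SHA-256 hash of the JSON body, and if_none_match stands for the value a client would send in the If-None-Match header.

```python
import hashlib
import json


def make_etag(payload: bytes) -> str:
    """Derive an ETag from the response body (quoted, as HTTP expects)."""
    return '"' + hashlib.sha256(payload).hexdigest() + '"'


def respond(data, if_none_match=None):
    """Return (status, headers, body), answering 304 when the ETag still matches."""
    body = json.dumps(data, sort_keys=True).encode()
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": "no-cache"}
    if if_none_match == etag:
        # The client's copy is still current: no body needs to be sent.
        return 304, headers, b""
    return 200, headers, body


# First request: full response. Second request: revalidation with the ETag.
status, headers, body = respond({"id": 1, "title": "X-Men"})
revalidated = respond({"id": 1, "title": "X-Men"}, if_none_match=headers["ETag"])
```

When the data changes, the hash changes too, so the same revalidation request gets a fresh 200 with the new body.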

Here are some considerations for caching.

  1. Always cache when you are running a high-volume API
  2. Use no-store and private for sensitive data
  3. Provide the ETag for large responses
  4. Carefully decide on the optimal max-age
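These considerations can be folded into a small helper that picks the header value. This is only a sketch of one possible policy; the cache_control function and its three parameters are assumptions for illustration.

```python
def cache_control(sensitive: bool, per_user: bool, max_age: int) -> str:
    """Pick a Cache-Control value from a few coarse properties of the data."""
    if sensitive:
        # Sensitive data should not be stored anywhere.
        return "no-store"
    # Per-user data may only live in that user's own cache.
    scope = "private" if per_user else "public"
    return f"{scope}, max-age={max_age}"


header_value = cache_control(sensitive=False, per_user=True, max_age=60)
```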

What’s next

We just covered caching for performance, but it is not the only mechanism for that.

Partial responses give the consumer control to ask only for what makes sense to it. Many developers just return whatever the database query produces; partial responses are all about parametrizing the queries from the consumer's perspective.

This optimizes resource usage when parsing the data (CPU, memory, request size, and bandwidth). The consumer asks for exactly what it wants, which lets it control the granularity of the data regardless of the type of client.

Another great tool for performance is pagination: the API takes a big chunk of data and delivers it in pages. Pagination can be cursor-based, where the client passes a pointer to the last item it received and asks for the next page after it, or offset-based, where the pagination is set at the query level by limiting the number of records and skipping the ones already shown.

There are a lot of things to consider when developing APIs. Make sure to cover as many as possible and go in with a plan. A pro tip is to work with data wrappers for better scaling, show only the most relevant data, and use a common response format. Pagination is a plus, but try to avoid running a bare SELECT * query.
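A minimal sketch of a partial response, assuming the consumer passes a comma-separated fields parameter (a common convention, but the parameter name and the select_fields helper are my own here):

```python
def select_fields(record: dict, fields: str = "") -> dict:
    """Keep only the requested keys; an empty selection returns everything."""
    if not fields:
        return dict(record)
    wanted = [name.strip() for name in fields.split(",") if name.strip()]
    # Unknown field names are silently ignored in this sketch.
    return {name: record[name] for name in wanted if name in record}


comic = {"id": 1, "title": "X-Men", "pages": 32, "description": "..."}
partial = select_fields(comic, "id, title")
```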
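Both pagination styles can be sketched over a plain list. The function names and response keys are illustrative, and the cursor version assumes the items are sorted by an ascending id, which acts as the cursor.

```python
def offset_page(items, offset=0, limit=10):
    """Offset-based: skip `offset` records, then return up to `limit`."""
    return {
        "data": items[offset:offset + limit],
        "offset": offset,
        "limit": limit,
        "total": len(items),
    }


def cursor_page(items, after_id=None, limit=10):
    """Cursor-based: return up to `limit` records that come after `after_id`."""
    remaining = [i for i in items if after_id is None or i["id"] > after_id]
    page = remaining[:limit]
    # Only hand out a cursor when there is something left to fetch.
    next_cursor = page[-1]["id"] if len(remaining) > limit else None
    return {"data": page, "next_cursor": next_cursor}


items = [{"id": n} for n in range(1, 8)]
first = cursor_page(items, limit=3)
second = cursor_page(items, after_id=first["next_cursor"], limit=3)
```

In a real backend the slicing would happen in the database query rather than in memory, which is the point of limiting the number of records at the query level.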

Join me next for a brief talk on the architectural constraints in the API development world.

Written by David E Lares S, Backend Developer, Pentesting and InfoSec Student