Enterprise API design practices: Part 3


Welcome to the third part of my series on creating valuable, usable, and future-proof Enterprise APIs. Part 1 covered some background context and the importance of design, while the second post was about security and backend protection; this chapter is on optimization and scale.

3. Optimize interactions

When dealing with scenarios that weren’t anticipated in the initial release of an API (for example when integrating with an external product from a new partner), developers often have to rely on data-driven APIs to extract information from the backend and process it externally. While use-case-driven APIs are generally considered more useful, sometimes there might not be one available that suits the requirements of the novel use case that you must implement.

By considering the following guidelines when building your data-driven APIs, you can make them easier to consume and more efficient for the backend and the network, improving performance and reducing the operational cost (fewer data transfers, faster and cheaper queries on the DBs, etc.).

I’ll use an example with sample data. Consider the following data as a representation of your backend database: an imaginary set of alerts that your Enterprise product detected over time. Instead of just three, imagine the following JSON output with thousands of records:

{
  "alerts": [
    {
      "name": "Impossible Travel",
      "alert_type": "Behavior",
      "alert_status": "ACTIVE",
      "severity": "Critical",
      "created": "2020-09-27T09:27:33Z",
      "modified": "2020-09-28T14:34:44Z",
      "alert_id": "8493638e-af28-4a83-b1a9-12085fdbf5b3",
      "details": "... long blob of data ...",
      "evidence": [
        "long list of stuff"
      ]
    },
    {
      "name": "Malware Detected",
      "alert_type": "Endpoint",
      "alert_status": "ACTIVE",
      "severity": "High",
      "created": "2020-10-04T11:22:01Z",
      "modified": "2020-10-08T08:45:33Z",
      "alert_id": "b1018a33-e30f-43b9-9d07-c12d42646bbe",
      "details": "... long blob of data ...",
      "evidence": [
        "long list of stuff"
      ]
    },
    {
      "name": "Large Upload",
      "alert_type": "Network",
      "alert_status": "ACTIVE",
      "severity": "Low",
      "created": "2020-11-01T07:04:42Z",
      "modified": "2020-12-01T11:13:24Z",
      "alert_id": "c79ed6a8-bba0-4177-a9c5-39e7f95c86f2",
      "details": "... long blob of data ...",
      "evidence": [
        "long list of stuff"
      ]
    }
  ]
}

Imagine implementing a simple /v1/alerts REST API endpoint to retrieve this data, knowing that you can’t anticipate all future needs. I recommend considering the following guidelines (a rough sketch of how they could map to query parameters follows the list):

  • Filters: allow your consumers to reduce the result set by offering filtering capabilities on as many fields as possible without stressing the backend too much (if some filters are not indexed, they could become expensive, so you must find the right compromise). In the example above, good filters might include: name, severity, alert_type, alert_status, created, and modified. More complex fields like details and evidence might be too expensive for the backend (as they might require full-text search), so you would probably leave them out unless really required.
  • Data formats: be consistent in how you present and accept data across your API endpoints. This holds true especially for types such as numbers and dates, or complex structures. For example, to represent integers in JSON you can use numbers (e.g. "fieldname": 3) or strings (e.g. "fieldname": "3"): whatever you choose, be consistent across all your API endpoints, and use the same format when returning outputs and accepting inputs.
  • Dates: dates and times can be represented in many ways: timestamps (in seconds, milliseconds, or microseconds), strings (ISO8601 as in the example above, or custom formats such as 20200101), with or without time zone information. This can easily become a problem for developers. Again, the key is consistency: try to accept and return only a single date format (e.g. timestamps in milliseconds, or ISO8601) and be explicit about whether you consider time zones: doing everything in UTC is usually a good idea because it removes ambiguity. Make sure to document the date formats properly.
  • Filter types: depending on the type of field, you should provide appropriate filters, not just equality. A good example is supporting range filters for dates, which, in our example above, allow consumers to retrieve only the alerts created or modified in a specific interval. If some fields are enumerations with a limited number of possible values, it might be useful to support a multi-select filter (e.g. IN): in the example above it should be possible to filter by severity and include only the High and Critical values using a single API call.
  • Sorting: is your API consumer interested only in the oldest alerts, or the newest? Supporting sorting in your data-driven API is extremely important. One field to sort by is generally enough, but sometimes (depending on the data) you might need more.
  • Result limiting and pagination: you can’t expect all the entries to be returned at once (and your clients might not be interested in, or ready to ingest, all of them anyway), so you should implement some logic where clients retrieve a limited number of results and can request more when they need them. If you are using pagination, clients should be able to specify the page size within a maximum allowed value. Defaults and maximums should be reasonable and properly documented.
  • Field limiting: consider whether you really need to return all the fields of your results all the time, or whether your clients usually need just a few. By letting the client decide which fields (or groups of fields) your API should return, you can reduce network traffic and backend cost, and improve performance. You should provide and document some sane default. In the example above, you could return all fields by default except details and evidence, which the client can request explicitly using an include parameter.
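
To make this more concrete, here is a minimal sketch of how these guidelines could translate into query parameters on the /v1/alerts endpoint. It assumes Python with Flask, and the parameter names (created_after, sort_by, limit, include) are illustrative choices for this example, not a prescribed standard:

# A minimal sketch of a data-driven endpoint with filters, sorting,
# result limiting, and field limiting. Parameter names are illustrative.
from flask import Flask, jsonify, request

app = Flask(__name__)

ALLOWED_SORT_FIELDS = {"created", "modified", "severity"}
DEFAULT_PAGE_SIZE = 50
MAX_PAGE_SIZE = 100

@app.route("/v1/alerts")
def list_alerts():
    # Multi-select filter, e.g. ?severity=High&severity=Critical
    severities = request.args.getlist("severity")
    # Range filters on dates, accepted only as ISO8601 in UTC for consistency
    created_after = request.args.get("created_after")
    created_before = request.args.get("created_before")
    # Sorting: a single field plus a direction, restricted to indexed fields
    sort_by = request.args.get("sort_by", "created")
    sort_order = request.args.get("sort_order", "desc")
    if sort_by not in ALLOWED_SORT_FIELDS:
        return jsonify(error=f"cannot sort by {sort_by}"), 400
    # Result limiting: page size is capped at a documented maximum
    limit = min(int(request.args.get("limit", DEFAULT_PAGE_SIZE)), MAX_PAGE_SIZE)
    offset = int(request.args.get("offset", 0))
    # Field limiting: heavy fields (details, evidence) only when explicitly requested
    include = request.args.getlist("include")

    # Here you would run the actual query against your data store; this
    # sketch simply echoes the validated parameters back.
    return jsonify(
        filters={
            "severity": severities,
            "created_after": created_after,
            "created_before": created_before,
        },
        sort={"by": sort_by, "order": sort_order},
        page={"limit": limit, "offset": offset},
        include=include,
    )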

Let’s put it all together. In the above example, you should be able to retrieve, using a single API call, something like this:

Up to 100 alerts that were created between 2020-04-01T00:00:00Z (April 1st 2020) and 2020-10-01T00:00:00Z (October 1st 2020) with severity “High” or “Critical”, sorted by “modified” date, newest first, including all the fields but “evidence”.
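
Reusing the hypothetical parameter names from the sketch above, that single call could look something like this with Python’s requests library:

import requests

# Hypothetical request matching the scenario above; parameter names are illustrative.
response = requests.get(
    "https://api.example.com/v1/alerts",
    params={
        "created_after": "2020-04-01T00:00:00Z",
        "created_before": "2020-10-01T00:00:00Z",
        "severity": ["High", "Critical"],  # sent as repeated severity= parameters
        "sort_by": "modified",
        "sort_order": "desc",              # newest first
        "limit": 100,
        "include": "details",              # adds details, but not evidence, to the default fields
    },
    timeout=30,
)
alerts = response.json()["alerts"]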

There are multiple ways you can implement this: through REST, GraphQL, or custom query languages. In many cases you don’t need anything too complex, as data sets are often fairly simple. The right approach depends on many design considerations that are outside the scope of this post, but having some or most of these capabilities in your API will make it better and more future-proof. By the way, if you’re into GraphQL, I recommend reading this post.

4. Plan for scale

A good API shouldn’t be presumptuous: it shouldn’t assume that its clients have nothing better to do than wait for its response, especially at scale, where performance is key.

If your API requires more than a few milliseconds to produce a response, I recommend considering support for asynchronous jobs instead. The logic can be as follows (a minimal client-side sketch follows the list):

  • Implement an API endpoint to start an operation that is expected to take some time. If the request is accepted, it returns immediately with a jobId.
  • The client stores the jobId and periodically reaches out to a second endpoint that, when provided with the jobId, returns the completion status of the job (e.g. running, completed, failed).
  • Once results are available (or some are), the client can invoke a third endpoint to fetch the results.
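
As a rough illustration, a client-side flow for this pattern could look like the sketch below. The endpoint paths and field names (/v1/jobs, jobId, status) are assumptions made for the example, not part of any specific API:

import time
import requests

BASE = "https://api.example.com/v1"  # hypothetical base URL

# 1. Start the long-running operation; the server replies immediately with a jobId.
job = requests.post(f"{BASE}/jobs", json={"operation": "export_alerts"}, timeout=30).json()
job_id = job["jobId"]

# 2. Periodically check the job status until it completes or fails.
while True:
    status = requests.get(f"{BASE}/jobs/{job_id}", timeout=30).json()["status"]
    if status in ("completed", "failed"):
        break
    time.sleep(5)  # wait between checks instead of holding a connection open

# 3. Fetch the results once the job has completed.
if status == "completed":
    results = requests.get(f"{BASE}/jobs/{job_id}/results", timeout=30).json()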

Other possible solutions include publisher/subscriber approaches or pushing data with webhooks, depending also on the size of the result set and the speed requirements. There isn’t a one-size-fits-all solution, but I strongly recommend avoiding designs where API clients are kept waiting on an open connection for the server to reply while it runs long jobs in the backend.

If you need high performance and throughput in your APIs, consider gRPC: its binary representation of data using protocol buffers has significant speed advantages over REST.

Side note: if you want to learn more about REST, GraphQL, webhooks, or gRPC use cases, I recommend starting from this post.

Finally, other considerations for scale include supporting batch operations on multiple entries at the same time (for example mass updates), but I recommend considering them only when you have a real use case in mind.
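
For example, a batch update could look something like the sketch below: a single call that changes the status of several alerts at once. The endpoint and payload shape are assumptions made purely for illustration:

import requests

# Hypothetical bulk endpoint: close several alerts with one request instead of N separate calls.
requests.post(
    "https://api.example.com/v1/alerts/batch",
    json={
        "alert_ids": [
            "8493638e-af28-4a83-b1a9-12085fdbf5b3",
            "b1018a33-e30f-43b9-9d07-c12d42646bbe",
        ],
        "update": {"alert_status": "CLOSED"},
    },
    timeout=30,
)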

What’s next

In the next and last chapter, I’ll share some more suggestions about monitoring and supporting your API and developer ecosystem.
