API Design: Stability Versus Readability — Must One Choose?

By Martin Nally

Good API design separates APIs that merely expose assets from those that help developers get things done. As I’ve written before, and as we’ll explore in this article, good design includes the style in which web API URLs are constructed.

Below are two API URLs that exemplify two divergent schools of thought on URL style. The first example is an anonymized and simplified version of a real URL and the second is a theoretical URL:

https://ebank.com/accounts/a49a9762-3790-4b4f-adbf-4577a35b1df7

https://cinemacanon.com/genre/documentary/film/los-angeles-plays-itself

Some major differences are obvious:

  • The first URL is opaque, providing enough information for a human to infer that the URL references a bank account at ebank.com, but nothing else. For those without a photographic memory, the URL details will be difficult to remember and difficult to distinguish from other, similar-looking URLs.
  • The second URL is much easier to interpret, memorize, and compare with other URLs. It tells a clear story: Los Angeles Plays Itself is a film in the documentary genre, and it is listed, along with other films in other genres, among the offerings on the fictional Cinema Canon website.

Because the second example is much friendlier to humans and because APIs are products used by human developers, it may seem that the hierarchical style is preferable. This is not always the case, however.

To understand why, let’s consider another difference between the two URLs above:

  • the first references a bank account, which is a persistent entity;
  • the second relies on a categorization of a film, which may change as cinemacanon.com reassesses its information architecture or otherwise changes the way it organizes content for users.

Persistent entities such as bank accounts are typically referenced by identifiers, even though these strings of characters may be hard for humans to read, because references to these records must remain valid and unambiguous even when other things change.

We cannot identify an account using information about its owner, such as address, marital status, or name, because those details may change. To reference the account in the future, it is necessary to have a unique identifier that is not altered even as other related information is modified. Hexadecimal strings are one way to do this.

URLs based on names and genre classifications, in contrast, are easier than alphanumeric identifiers for humans to use, construct, and get information from. But they may stop working when a name or classification changes.

In our experience, API designers may not anticipate the need to rename or reorganize within a hierarchy — but it is nevertheless generally necessary or desirable to provide this flexibility. In the case of cinemacanon.com, if a film or genre name is changed, references based on the previous hierarchy may break.
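The breakage described above can be illustrated with a minimal sketch. It assumes a hypothetical in-memory store keyed by a generated identifier; the helper names (`add_film`, `find_by_path`) and the new genre name `essay-film` are illustrative assumptions, not part of any real Cinema Canon API.

```python
import uuid

# Hypothetical in-memory store: stable identifier -> film record.
films = {}

def add_film(name, genre):
    """Store a film under a newly generated identifier and return that identifier."""
    film_id = str(uuid.uuid4())
    films[film_id] = {"name": name, "genre": genre}
    return film_id

def find_by_path(genre, name):
    """Resolve a name-based reference by matching the film's current values."""
    for film_id, film in films.items():
        if film["genre"] == genre and film["name"] == name:
            return film_id
    return None

film_id = add_film("los-angeles-plays-itself", "documentary")
old_reference = ("documentary", "los-angeles-plays-itself")
found_before = find_by_path(*old_reference)

# The site reclassifies the film under a different (hypothetical) genre.
films[film_id]["genre"] = "essay-film"
found_after = find_by_path(*old_reference)

print(found_before == film_id)  # the name-based reference worked before the change
print(found_after)              # the same reference no longer resolves
print(film_id in films)         # the identifier still points at the record
```

In this sketch the identifier keeps working after the reclassification, while the reference built from the old genre name does not.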

The ramifications of a break may be far reaching. URLs exposed by an API are generally based on identities stored in a database by the API implementation, meaning that decisions that affect URLs typically also affect database and API implementation design, and vice versa.

So, faced with these compromises, what should an API designer do? Spurn human readability, and thus perhaps undermine the ease with which others might leverage the API? Or make concessions for people even if it means sometimes having to find and repair broken URLs?

Luckily, it doesn’t have to be a binary decision — in Google Cloud’s Apigee team, we recommend providing both. By providing both styles of URL, one’s API entities can possess both a stable identifier and more user-friendly hierarchical naming.

Direct References Versus Search

We often think of URLs as representing a specific entity. In the ebank example above, the URL refers to a specific bank account.

But suppose that the Los Angeles Plays Itself URL refers today to a specific physical DVD. In the future, it may refer to a different copy of the film if the existing copy is lost or stolen.

This shows that whereas the bank URL does refer to a specific entity, an account, the second URL does not. So if it does not refer to a specific entity, what does it refer to? Think of such URLs as referencing search results. In the case of the movie, for example, the search might be: “find the film that is currently named ‘los-angeles-plays-itself,’ and that is currently categorized in the ‘documentary’ genre.”

The difference between hierarchical naming and fixed identifiers, in other words, is the difference between referring to a search result and referring to a specific entity.

Names and Identifiers Working Together

To use both URL styles together, one should allocate a permalink — or identifier — for each entity.

For example, Cinema Canon might start with the creation of a genre by POSTing the following to https://cinemacanon.com/genres:

{"kind": "genre",
   "name": "documentary"
}

This results in the allocation of the following URL for the genre:

https://cinemacanon.com/genre/25311fcf-0117-4217-9346-bz21c4834374

Then, to create the entry for the film, they could POST the following to https://cinemacanon.com/films:

{"kind": "film",
   "name": "los-angeles-plays-itself",
   "genre": "/genre/25311fcf-0117-4217-9346-bz21c4834374"
}

This results in the allocation of the following URL for the film:

https://cinemacanon.com/film/755ab01d-51a1-4215-9871-gg14d15bb3ay

This stable URL will always refer to this particular copy of Los Angeles Plays Itself, even if the copy of the DVD is lost or destroyed or if the Cinema Canon website is reorganized.
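The allocation step might be sketched as follows. This is only an illustration under stated assumptions: the in-memory `store` dictionary and the `create` helper are hypothetical, and the identifier is generated with Python's `uuid` module rather than by whatever scheme a real implementation would use.

```python
import uuid

# Hypothetical in-memory store: permalink path -> entity body.
store = {}

def create(body):
    """Allocate a permalink for a new entity, store the entity, and return the permalink."""
    permalink = f"/{body['kind']}/{uuid.uuid4()}"
    store[permalink] = dict(body)
    return permalink

# Create the genre first, then the film, which references the genre
# by its permalink rather than by its name.
genre_url = create({"kind": "genre", "name": "documentary"})
film_url = create({
    "kind": "film",
    "name": "los-angeles-plays-itself",
    "genre": genre_url,
})

# Renaming the genre touches only the genre's own record. The film's
# reference and both permalinks stay exactly as they were.
# "non-fiction" is a hypothetical new name.
store[genre_url]["name"] = "non-fiction"

print(film_url)
print(store[film_url]["genre"] == genre_url)
```

Because the film stores the genre's permalink instead of its name, the rename requires no changes to the film record.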

Based on these entities, the following search URLs could be valid:

https://cinemacanon.com/genre/documentary/film/los-angeles-plays-itself

https://cinemacanon.com/search?kind=film&name=los-angeles-plays-itself&genre=(name=documentary)

These two URLs have exactly the same meaning — the difference is just a style choice. Developers should choose the style they like better, or implement both.
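One way to see that the two styles carry the same meaning is to reduce both to the same search criteria. The sketch below uses Python's standard `urllib.parse` module; the function names, the `genre_name` key, and the handling of the parenthesized `genre=(name=...)` form are assumptions made for illustration.

```python
from urllib.parse import parse_qs, urlsplit

def criteria_from_path(url):
    """Parse a URL of the form /genre/<genre-name>/film/<film-name>."""
    parts = urlsplit(url).path.strip("/").split("/")
    if len(parts) != 4 or parts[0] != "genre" or parts[2] != "film":
        raise ValueError(f"unrecognized hierarchical URL: {url}")
    return {"kind": "film", "name": parts[3], "genre_name": parts[1]}

def criteria_from_query(url):
    """Parse a URL of the form /search?kind=...&name=...&genre=(name=...)."""
    query = parse_qs(urlsplit(url).query)
    # The nested criterion arrives as the literal string "(name=documentary)".
    key, _, value = query["genre"][0].strip("()").partition("=")
    if key != "name":
        raise ValueError(f"unsupported genre criterion: {key}")
    return {
        "kind": query["kind"][0],
        "name": query["name"][0],
        "genre_name": value,
    }

path_style = "https://cinemacanon.com/genre/documentary/film/los-angeles-plays-itself"
query_style = (
    "https://cinemacanon.com/search"
    "?kind=film&name=los-angeles-plays-itself&genre=(name=documentary)"
)

path_criteria = criteria_from_path(path_style)
query_criteria = criteria_from_query(query_style)
print(path_criteria == query_criteria)
```

Once both forms are normalized to the same criteria, a single search routine can serve either style.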

Whenever a client performs a GET on one of these search URLs, the identity URL (i.e., its permalink, in this case https://cinemacanon.com/film/755ab01d-51a1-4215-9871-gg14d15bb3ay) of the found entity should be included in the response, either in a header (the HTTP Content-Location header exists for this purpose), in the body, or, ideally, in both. Following this practice enables clients to move freely between the permalink URLs and the search URLs for the same entities.
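A search handler that follows this practice might look like the sketch below. It is framework-independent and returns a plain (status, headers, body) tuple; the `ENTITIES` store, the `handle_search` function, and the `self` field used to carry the permalink in the body are assumptions for illustration, and the identifiers are the ones from the examples above.

```python
import json

GENRE_URL = "/genre/25311fcf-0117-4217-9346-bz21c4834374"
FILM_URL = "/film/755ab01d-51a1-4215-9871-gg14d15bb3ay"

# Hypothetical in-memory store: permalink path -> entity.
ENTITIES = {
    GENRE_URL: {"kind": "genre", "name": "documentary"},
    FILM_URL: {"kind": "film", "name": "los-angeles-plays-itself", "genre": GENRE_URL},
}

def handle_search(genre_name, film_name):
    """Find the film by its current names and report its permalink in header and body."""
    genre_urls = {
        url for url, entity in ENTITIES.items()
        if entity["kind"] == "genre" and entity["name"] == genre_name
    }
    for url, entity in ENTITIES.items():
        if (entity["kind"] == "film"
                and entity["name"] == film_name
                and entity["genre"] in genre_urls):
            headers = {
                "Content-Type": "application/json",
                "Content-Location": url,  # permalink of the entity that was found
            }
            body = dict(entity, self=url)  # permalink repeated in the body
            return 200, headers, json.dumps(body)
    return 404, {"Content-Type": "application/json"}, json.dumps({"error": "not found"})

status, headers, body = handle_search("documentary", "los-angeles-plays-itself")
print(status, headers["Content-Location"])
```

A client that receives this response can store the permalink from either place and use it for later requests, even if the film is subsequently renamed or reclassified.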

Provide Two URLs For Stability, Reliability, and Ease-of-Use

Every design has its drawbacks. Obviously, it takes a little more effort to implement both permalink entity URLs and search URLs in the same API. For a fuller discussion of these drawbacks and how to manage them, see my previous article on this topic, linked in the first paragraph of this article, and Apigee’s ebook Web API Design: The Missing Link.

But it is not possible to build APIs that are stable, reliable, and user-friendly with a single set of URLs. Well-designed APIs include both permalink URLs based on identifiers and search URLs based on values such as names.

[Looking for more API design best practices? See API Design, From A to Z, including an on-demand webcast with Martin Nally, “API Design Best Practices & Common Pitfalls.”]