Why Hypermedia Makes Sense

If you’ve worked in the API space in the last few years, you’ve probably come across the concept of hypermedia and the debate surrounding it. Some people believe it’s a silly idea that makes client applications more complex than necessary while others believe it’s a powerful concept that can be used to simplify applications for easier upgrades and greater flexibility.

At Clarify.io, we believe that hypermedia makes sense because it makes our API explorable for clients and more flexible in design, and because it provides explicit descriptions of the relationships between resources.

Before we get into the “why,” let’s make sure we’re talking about the same thing.

What is Hypermedia (or HATEOAS)?

First, hypermedia is the idea that the API itself tells you what is available based on your context. This is similar to when you visit Amazon.com. The first time you ever visited Amazon, you saw links like “create account” and “log in.” Once you logged in, those links were replaced with “log out” and “account history.” The site knows your current context and only gives you options relevant at that time. The alternative — and how most APIs are structured — would require you to dig through the documentation and find the specific URL for each step and figure out what to do next. Relatively simple multi-step processes that we take for granted — like checking out — become complicated and fragile to the point of being unusable for mere mortals.
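The log in/log out example can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular site works: the links_for function, the link relation names, and the URLs are all made up for the example.

```python
# Hypothetical sketch: relation names and URLs are invented for illustration.
def links_for(session):
    """Return only the links that make sense in the caller's current context."""
    links = {"self": {"href": "/"}}
    if session.get("logged_in"):
        # A signed-in visitor sees account options, not sign-up prompts.
        links["account-history"] = {"href": "/account/history"}
        links["logout"] = {"href": "/logout"}
    else:
        links["create-account"] = {"href": "/account/new"}
        links["login"] = {"href": "/login"}
    return {"_links": links}


anonymous = links_for({})
member = links_for({"logged_in": True})
print(sorted(anonymous["_links"]))  # ['create-account', 'login', 'self']
print(sorted(member["_links"]))     # ['account-history', 'logout', 'self']
```

The client never has to work out which actions are valid. It only has to look at which links are present.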

Hypermedia: Choose Your Own Adventure

In a Choose Your Own Adventure book, you don’t read the pages sequentially. Instead, as you progress through the story, you are presented with choices that send you to different pages. You go to that page, are presented with another choice, and repeat. In an API, these choices are represented by URLs, each described by a relationship or intent. As you use the API, the relationships your client understands become the options available to it. That understanding comes from a defined set of relationships called link relations. Some link relations are already standardized, such as next, previous, first, and last, but you can also create your own that are specific to your industry, application, or even use case.

How is hypermedia used?

The “how” is where things get a little squishier.

Great APIs are designed around noun/verb combinations. Those combinations are listed in the documentation, and developers build against them and embed that information in their client applications. This all works until the API changes. At the simple end of the spectrum, new optional parameters are not a problem. At the other end is changing an existing URL, a breaking change that implies a new version of the API. Somewhere between those extremes is the case where the API adds new resources.

A new resource gives you easier, faster, or more powerful functionality that you don’t currently have. Unfortunately, in most APIs, a new resource also means more documentation, helper library updates, new quickstart guides, more support emails, and so on. It’s a pain at best, because we can’t take advantage of the resource until we update our application to build the URLs and process the results. But if we shift our thinking from URLs to link relations (or intents), things change in subtle but powerful ways.

If a payload suddenly includes a URL with the “next” link relation, we know that following it will get another page of results. As long as we already know how to process the payloads, our application picks up the new capability without any further work. We can skip concatenating strings or doing string replacements to build URLs. Which brings us to the point…
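Here is a minimal sketch of a client that follows the “next” link relation instead of building page URLs itself. It assumes a HAL-style _links object in each payload. The FAKE_API dictionary, the fetch stand-in, and the URLs are hypothetical and take the place of real HTTP calls so the example runs on its own.

```python
# Hypothetical in-memory "API": each URL maps to the payload a server might return.
FAKE_API = {
    "/items?page=1": {
        "items": ["a", "b"],
        "_links": {"next": {"href": "/items?page=2"}},
    },
    "/items?page=2": {"items": ["c"], "_links": {}},
}


def fetch(url):
    """Stand-in for an HTTP GET that returns parsed JSON."""
    return FAKE_API[url]


def iter_items(start_url, fetch=fetch):
    """Yield items from every page by following the 'next' link relation."""
    url = start_url
    while url is not None:
        page = fetch(url)
        yield from page["items"]
        # The client never builds a URL; it only follows what the payload offers.
        next_link = page.get("_links", {}).get("next")
        url = next_link["href"] if next_link else None


all_items = list(iter_items("/items?page=1"))
print(all_items)  # ['a', 'b', 'c']
```

If the server later changes how its page URLs are formed, or starts paginating a resource that used to fit in one response, this loop keeps working unchanged.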

So why does Clarify use hypermedia?

At Clarify, we’re building machine learning systems to process, understand, and transform audio and video from bits and bytes into data that’s actionable and understandable to mere mortals. The most important part is that those systems are constantly dissecting and analyzing media files, both the audio and the video, to extract what they contain, understand what it means, and turn it into something useful. As a result, our systems sometimes discover unexpected patterns, relationships, and attributes that someone finds useful.

Or to put it another way: we don’t always know what’s next.

Therefore, we end up with an unpredictable collection of ever-expanding objects beneath each piece of media that we need to process and consider. In our opinion, hypermedia is the only way to support this growing web of related resources.

When we launched, we only supported search. Although we were processing and understanding quite a bit about the media, we weren’t exposing it via the API. As our automated transcripts improved, we applied basic Natural Language Processing to extract the spoken_keywords report to give you the important words. Taking this a level deeper, we analyzed the content to determine the spoken_topics to figure out what the media is about. As we explore and learn more, we simply add more link relations to represent each insight we discover about your media.

The resulting JSON payload looks something like this:
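As a rough sketch of that shape, assume a HAL-style _links object keyed by link relation. Only the relation names insight:spoken_keywords, insight:spoken_topics, and parent come from this post. The id, the self link, and every href below are placeholders, not real API paths.

```python
import json

# Hypothetical payload: the id and every href are placeholders for illustration.
payload = {
    "id": "example-id",
    "_links": {
        "self": {"href": "/media/example-id/insights"},
        "parent": {"href": "/media/example-id"},
        "insight:spoken_keywords": {
            "href": "/media/example-id/insights/spoken_keywords"
        },
        "insight:spoken_topics": {
            "href": "/media/example-id/insights/spoken_topics"
        },
    },
}

# Render the dictionary as the JSON a client would receive.
print(json.dumps(payload, indent=2))
```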

To find the keywords, a client looks up the insight:spoken_keywords key and retrieves the corresponding href. Alternatively, the client application can let the user explore related resources, or go back to the beginning with the parent link relation. Our helper libraries emphasize this further by hiding the URLs completely and letting you request specific insights by name. In fact, when we launched the topics insight recently, none of our libraries needed updating, because you simply request the insight by name and the resulting data has the same structure as keywords.
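Requesting an insight by name might look something like the sketch below. It does not show the actual Clarify helper libraries: get_insight, available_insights, the fetch stand-in, the URLs, and the terms field in the responses are all assumptions made for illustration.

```python
# Hypothetical payload and responses; every URL and field here is a placeholder.
MEDIA = {
    "_links": {
        "parent": {"href": "/media"},
        "insight:spoken_keywords": {
            "href": "/media/example-id/insights/spoken_keywords"
        },
        "insight:spoken_topics": {
            "href": "/media/example-id/insights/spoken_topics"
        },
    }
}
RESPONSES = {
    "/media/example-id/insights/spoken_keywords": {"terms": ["api", "hypermedia"]},
    "/media/example-id/insights/spoken_topics": {"terms": ["software"]},
}


def fetch(url):
    """Stand-in for an HTTP GET that returns parsed JSON."""
    return RESPONSES[url]


def available_insights(payload):
    """List the insight names the payload currently advertises."""
    prefix = "insight:"
    return sorted(
        rel[len(prefix):] for rel in payload["_links"] if rel.startswith(prefix)
    )


def get_insight(payload, name, fetch=fetch):
    """Fetch an insight by name; the caller never sees or builds a URL."""
    link = payload["_links"].get(f"insight:{name}")
    if link is None:
        raise KeyError(f"insight not available: {name}")
    return fetch(link["href"])


print(available_insights(MEDIA))            # ['spoken_keywords', 'spoken_topics']
print(get_insight(MEDIA, "spoken_topics"))  # {'terms': ['software']}
```

Because the helper resolves the name against whatever links the payload contains, a newly added insight becomes reachable as soon as the server starts advertising it.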

The point is that once you understand one part of the API, that understanding can be applied to other parts with almost no effort. As new parts of the API light up, developers — or their client applications — can explore quickly and simply. It speeds the initial development and encourages more and deeper integrations over time.

If you’d like to delve deeper into hypermedia, check out Luke Stokes’ post on “Why your Colleagues Still Don’t Understand Hypermedia APIs.”