APIs aren’t perfect, but they can be better

Published in Telnyx Engineering · Jun 2, 2020 · 8 min read

In the 10 years Telnyx has been around, our product offerings have grown and evolved. By working with our customers, we have learned to fulfill their needs in increasingly easy-to-use ways. Our telephony engineers would release some features, messaging engineers (that’s me!) others, and number inventory engineers more. We delivered a very functional API to customers — important to be sure — but it was not always the most consistent or easy to understand.

Our engineering division used to be able to maintain consistency with zero effort using this one weird trick: just have one engineer. As the team grew, one engineer would do things one way, another a different way, but because everyone shared the same codebase, the code would passively converge on the true way. Later on, engineering split roughly along the lines of our usage products and account support, and that ended the era of one group of people looking at one codebase. The river between the groups caused a speciation event, and new traits could emerge independently in the two groups. Additional splits occurred later: splitting off inventory, splitting off messaging. In each case, new traits could and would arise to fill each team’s technological niche.

We have lively debates about which modern-day organisms our teams most resemble. I’ve found Thermus aquaticus to be fantastically useful. Image: Gringer, public domain

When I joined the company two years ago, our teams had already speciated. I joined the colony of Telnyxus mercurii (from the Roman god Mercury) and adopted the view from the messaging niche: a very deep, but very narrow view of everything messaging and only messaging. I knew every permutation of options, the features we had introduced that few customers used, and what happens in each strange edge case. When I added features and fields, my main consideration was consistency with whatever our group already had. Often they were novel to us, so I just shopped them around my team. However, they may not have been new to other teams. I wasn’t as familiar with what the other teams were doing, so time was spent on problems that telephony apps or number inventory may have already solved and, worse, inconsistencies arose in our API.

As much fun as I’m having, we don’t want our customers to have to be evolutionary technologists to make sense of what’s going on. Coming to our docs and API today and interacting with multiple products, they may wonder why widgets are timestamped with “created_at” in Unix milliseconds while doodads have “date_created” in the RFC 3339 format. One object can be removed with HTTP DELETE, but another requires an HTTP POST with a ?delete=true query string. These are the remnants of our API’s evolution. They came about because individual product teams seldom need to interact with the public APIs of other teams, and because the API was generally viewed as the “last thing” on the path to developing a new feature.
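To make the cost of that inconsistency concrete, here is a minimal sketch of the glue code a client ends up writing when two resources express the same idea differently. The resource shapes and the function name `created_time` are hypothetical, built only from the “created_at” and “date_created” examples above.

```python
from datetime import datetime, timezone


def created_time(resource: dict) -> datetime:
    """Return a resource's creation time as an aware UTC datetime.

    Handles both legacy shapes described above (hypothetical payloads).
    """
    if "created_at" in resource:
        # Unix epoch in milliseconds, e.g. 1591056000000
        return datetime.fromtimestamp(resource["created_at"] / 1000, tz=timezone.utc)
    if "date_created" in resource:
        # RFC 3339 string, e.g. "2020-06-02T00:00:00Z".
        # Older Pythons do not accept a trailing "Z", so swap in an explicit offset.
        text = resource["date_created"].replace("Z", "+00:00")
        return datetime.fromisoformat(text).astimezone(timezone.utc)
    raise KeyError("resource has no recognized creation timestamp")


widget = {"id": "w1", "created_at": 1591056000000}
doodad = {"id": "d1", "date_created": "2020-06-02T00:00:00Z"}

# Both describe the same instant, but only after per-resource special-casing.
print(created_time(widget) == created_time(doodad))
```

With one agreed field name and one agreed format, the whole function collapses to a single parse call.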

Time for a change

So we’ve evolved. It was messy, and it’s not obvious why there’s that appendix or tailbone. To clean up after our evolution, we decided to release a second major API version and adopt a style guide so our engineers could work towards a common public facade. By providing a consistent API,

our documentation can be more consistent,
developers integrating can understand new products faster, and
we can maintain more SDKs for your language of choice by relying on the uniformity.

…but change is hard.

APIs are difficult to change. Adding fields is generally safe, but clients may expect fields in a specific order or process responses in ‘creative’ ways that could be thrown off by something different. Removing or changing existing fields can be much more difficult, if not impossible. In theory, you could log and check who updates those fields, but you can’t check whether they’re being read. Notifying users to update their code via your website might reach some, and email some more, but until you make the breaking change, you won’t know who was missed. The ramifications are not always immediate, either: the breakage might only show up in circumstances that are rare for the customer. The customer may not use that one feature often, or could just be in a lull. Months later you can get questions like “why don’t we see X anymore?”
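As an illustration of why adding a field is only “generally” safe, the sketch below contrasts a client that reads fields by position with one that reads them by name. The message payloads and the added “cost” field are hypothetical.

```python
import json


def fragile_read(payload: str) -> dict:
    """A 'creative' client: assumes the first two fields are id and text."""
    values = list(json.loads(payload).values())
    return {"id": values[0], "text": values[1]}


def tolerant_read(payload: str) -> dict:
    """Reads only the fields it needs, by name, and ignores everything else."""
    data = json.loads(payload)
    return {"id": data["id"], "text": data.get("text", "")}


# The same message before and after the server adds a field and reorders.
old = '{"id": "m1", "text": "hello"}'
new = '{"cost": "0.0025", "text": "hello", "id": "m1"}'

print(tolerant_read(old) == tolerant_read(new))  # unaffected by the change
print(fragile_read(new))                         # silently picks up the wrong id
```

The fragile client does not crash, which is the worrying part: it keeps running with bad data until someone notices.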

There are some proactive approaches to mitigate this API ossification. A project I used to work on, OpenStack, generalized their V2 API to use microversions. This allows clients to declare which microversion they want via an HTTP header, and if the server supports it (servers support a range), it replies with the mutually supported response schema. However, this in and of itself requires a change to the API. Additionally, the clients and the software using them must be updated periodically, even if nothing functionally changes, to stay within the supported band of versions. This is OK for a coalition of projects like OpenStack, but when the components are us and our customers’ software, it can be a costly proposition to tell them they need to update every release cycle.
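The negotiation step can be sketched roughly as follows. This is loosely modeled on the microversion idea rather than on any particular OpenStack service: the supported range, the “latest” keyword handling, and the choice to serve the oldest version when no header is sent are assumptions made for illustration.

```python
from typing import Optional, Tuple

Version = Tuple[int, int]

# Hypothetical band of microversions this server can speak.
SERVER_MIN: Version = (2, 1)
SERVER_MAX: Version = (2, 12)


def parse_version(text: str) -> Version:
    """Turn '2.9' into (2, 9) so versions compare numerically, not as strings."""
    major, minor = text.split(".")
    return int(major), int(minor)


def negotiate(requested: Optional[str]) -> Optional[Version]:
    """Pick the version to answer with, or None if the request is unsupported.

    `requested` is the value of a (hypothetical) version header, if present.
    """
    if requested is None:
        # No header: serve the oldest schema so legacy clients keep working.
        return SERVER_MIN
    if requested == "latest":
        return SERVER_MAX
    version = parse_version(requested)
    if SERVER_MIN <= version <= SERVER_MAX:
        return version
    # Outside the supported band: the server would reject the request.
    return None


print(negotiate("2.9"))   # inside the band
print(negotiate("2.13"))  # newer than the server knows about
```

The `None` branch is the costly part described above: once the server's minimum moves past what a client asks for, that client has to be updated even though nothing it uses has changed.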

However you change, it should be gradual, allowing users to migrate by opting in over time. Taking down your entire version X API over a weekend and then bringing up version X+1 is how not to do it (though we’ve seen it done).

World v1 shutting down for maintenance, v2 will be up in a few years. Evolve or die! Image: NASA/Don Davis, public domain

Guiding principles

OK, so we needed to unify, but how? What would it look like? Adopting an API design guide is good for any organization. As a Pythonista, I try to take many “Pythonic” ideas (which are not necessarily about Python) to heart. For instance, PEP 8 teaches us:

A style guide is about consistency. Consistency with this style guide is important. Consistency within a project is more important. Consistency within one module or function is the most important.

But it should be tempered. The Zen of Python also applies, with lines such as:

Readability counts.
Special cases aren’t special enough to break the rules.
Although practicality beats purity.

Whatever you do, choose a style as a team. It ensures a consistent “voice” no matter who is writing parts of your API, with what language, using whatever libraries. If you don’t like something about it, fork it and change it. If it’s silent on a matter that causes inconsistencies, make it explicit, make it yours. Engineers, testers, and users should all be contributing to the guide.

Finding the One True Way

What is it? Does it exist? When we chose our final direction a year ago, and still now, there did not appear to be One True Way. If there were, everyone from AWS to GitHub to Google would be using it and this wouldn’t be hard. That said, I am severely understating my predecessors’ contributions: HTTP, TLS, TCP/IP, and more underpin it all and are all part of The Way.

On the application layer, however, things are a little rockier. REST (or REST-esque, depending on how stringent you want to be) has given HTTP APIs sane semantics, unlike the wild west of 90s web dev. JSON isn’t XML, therefore it’s good. However, how to model and structure data still varies. Back in the day, SOAP, a form of XML-based RPC, would allow auto-discovery of calls and objects, but XML was a bit of a bear to deal with and is now considered more of a dinosaur.

HATEOAS (Hypermedia As The Engine Of Application State) is typically mentioned along with REST and does prescribe a bit more. It uses links (URLs or paths) to connect objects, define relationships, and show the valid actions in a given state. This has the advantage of making the API discoverable, and you can avoid versioning entirely! But that then means your users must rediscover the API every time they use it. Unfortunately, lazy developers like me generally make calls by firing a JSON body or query string directly at the previously discovered or documented endpoint with a semantically correct HTTP verb and happily getting the data in one round trip. The code it takes to do that is clear and straightforward, but even as a fairly experienced Python programmer, I don’t know what clients are out there that will do HATEOAS discovery for me, and I don’t have the time to develop my own. Even if I did, you would still need to stop your users from “misusing” your API by hardcoding paths. In effect, SDKs then turn from an optional feature (to provide and use) into a mandatory one.
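The trade-off above can be sketched with a tiny in-memory stand-in for an HTTP API. Everything here is hypothetical: the paths, the “links” layout, and the `get` helper that simulates a round trip.

```python
from typing import Dict, List

# Hypothetical in-memory stand-in for an HTTP API: path -> JSON body.
FAKE_API: Dict[str, dict] = {
    "/": {"links": {"messages": "/v2/messages"}},
    "/v2/messages": {
        "data": [{"id": "m1", "links": {"self": "/v2/messages/m1"}}],
    },
    "/v2/messages/m1": {
        "id": "m1",
        "text": "hello",
        "links": {"self": "/v2/messages/m1", "collection": "/v2/messages"},
    },
}

request_log: List[str] = []  # records every simulated round trip


def get(path: str) -> dict:
    """Simulate one HTTP GET against the fake API."""
    request_log.append(path)
    return FAKE_API[path]


def fetch_hardcoded(message_id: str) -> dict:
    """The 'lazy' client: build the documented path and fetch it directly."""
    return get(f"/v2/messages/{message_id}")


def fetch_by_discovery(message_id: str) -> dict:
    """The hypermedia client: start at the root and follow links only."""
    root = get("/")
    collection = get(root["links"]["messages"])
    for item in collection["data"]:
        if item["id"] == message_id:
            return get(item["links"]["self"])
    raise LookupError(message_id)


fetch_hardcoded("m1")
print(len(request_log))  # round trips used by the hardcoded client
```

The discovery client survives a server-side change of paths, since it never builds one itself, but it pays for that with extra round trips and more client code.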

During our initial search for a guide, JSON-API was brought up. It’s a specification that follows many RESTful and HATEOAS principles, notably extensive linking and referencing of related resources. The decision against it principally boiled down to a perception of complexity and a lack of adoption by bigger players. On our initial review, the sample specifications and responses seemed overly complicated for us to provide, and we were concerned that customers might feel the same about consuming them. I’m still having trouble finding larger players that have a publicly avowed “JSON-API” interface. Many, like GitHub V3, make use of hypermedia but seem more generally HATEOAS.

Our choice

We ended up using the Heroku/Interagent API design guide as a starting point, then forked and extended it to cover our domain of telephony. Phone numbers will always be phone_number, not number, pn, tn, tel, telephone_number, or anything else. Local access and transport areas (LATA) are lata. The style was a good balance: prescriptive in many areas, parsimonious, and able to accommodate the wide variety of workflows required at Telnyx.
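A naming rule like this is easy to check mechanically. The sketch below is not Telnyx’s tooling; it is a hypothetical linter built from the phone_number example above, plus an assumed snake_case rule for everything else.

```python
import re
from typing import Iterable, List

# Hypothetical excerpt of a style guide: discouraged alias -> canonical name.
CANONICAL_NAMES = {
    "number": "phone_number",
    "pn": "phone_number",
    "tn": "phone_number",
    "tel": "phone_number",
    "telephone_number": "phone_number",
}

# Assumed rule: lowercase words separated by single underscores.
SNAKE_CASE = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")


def lint_fields(field_names: Iterable[str]) -> List[str]:
    """Return one human-readable complaint per field that breaks the guide."""
    problems = []
    for name in field_names:
        if name in CANONICAL_NAMES:
            problems.append(f"{name!r}: use {CANONICAL_NAMES[name]!r}")
        elif not SNAKE_CASE.match(name):
            problems.append(f"{name!r}: not snake_case")
    return problems


for problem in lint_fields(["phone_number", "lata", "tn", "createdAt"]):
    print(problem)
```

Run against a schema during review or in CI, a check like this catches naming drift before it reaches customers, regardless of which team wrote the endpoint.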

Typical additions to the API guide are usually about constraining formats, as what’s “natural” in one language may be slightly different in another language. “Trivia” like how a date-time is formatted, or if query parameters are “starts_with” or “startswith”, don’t take much effort to fix if done as part of the development. The improved fit and finish of the API is something users won’t notice, as they can just get on with what they need to do. My OXO peeler isn’t the sharpest in the drawer, but it’s the one I reach for most often because it’s just easy and feels right.

Thermus aquaticus lives on, just being awesome. Image: Bernd Thaller CC-BY 2.0

Looking forward

So now we have a consistent version 2 of our API. That’ll last forever, right? Nope, we’ll need to improve it later. We even needed to improve it between our beta and final releases, and did so in a rolling manner where our customers could switch over at their leisure and ours. Tech is constantly evolving, and the smoothest, most obvious way to do things changes with it. I have hopes that one of the “binary JSON” formats like Protobuf, MsgPack, or Avro takes off with a killer design and sees ubiquitous usage in public APIs (I imagine they’re much more common internally). GitHub’s version 4 moved to GraphQL, which has many strengths that I value much more after undertaking our whole saga, and maybe we’ll beat them to it (version 3 is before version 4…haha). Maybe it’ll be something I read about on Hacker News tomorrow.

But I’m done for a bit. It’s your turn to use our API and see if we succeeded. Hit me up @nicktimko on Twitter with what you think.