Designing Beautiful APIs

hamid
6 min read · Sep 23, 2014

“Just list the fundamental functions and classes in your module and give us 2 minutes to guess how we should be using them.”

That was the statement I used to explain a preliminary step I suggested for code reviews within one of the teams I worked with. I called it the “API intuitiveness review”, and I am quite sure I did not invent it.

The whole idea of this review was to make sure that pretty much any developer who lays his or her eyes on the piece of code in question will be able to guess how to use it and/or how it works without having to refer to documentation. If all of the reviewers fail to guess how the code in question works or how it should be used, we would all listen to the author’s explanation of how to use it. If the explanation looks far-fetched or — worse yet — misleading, we start thinking of suggestions to make the code more intuitive before we go into any further details or code reviews.
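As a rough illustration of the "list the functions and classes" step, here is a minimal Python sketch. The helper name public_api is hypothetical, and the standard random module merely stands in for the module under review.

```python
import inspect
import random  # stand-in for "the module under review"


def public_api(module):
    """Return the names of the public functions and classes in `module`."""
    names = []
    # inspect.getmembers yields (name, value) pairs sorted by name
    for name, value in inspect.getmembers(module):
        if name.startswith("_"):
            continue  # skip private helpers
        if inspect.isclass(value) or inspect.isroutine(value):
            names.append(name)
    return names


api = public_api(random)
print(api)
```

Handing reviewers nothing but this list of names is one way to run the two-minute guessing exercise.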

But there is more to designing beautiful APIs than just intuitiveness.

Use case: Random number generation

Let’s imagine ourselves on a typical day at work writing code. We run into this simple problem that requires basic (pseudo-)random number generation for some reason. Let’s take a tour of what the standard libraries of some of today’s prominent programming languages have to offer.

  • C
#include <time.h>
#include <stdlib.h>
srand(time(NULL)); // seed with current time
int random = rand() % 11; // range [0, 10]
  • C++ (as of C++11)
#include <random>

//
// seed generator
//
std::random_device rd;
std::default_random_engine e1(rd());
//
// we have to pick a distribution: uniform makes sense
//
std::uniform_int_distribution<int> uniform_dist(0, 10);
int random = uniform_dist(e1);
  • C#
Random r = new Random();       // seeding happens internally
int random = r.Next(0, 11);
  • Python 2
from random import randint    # seeding happens upon importing
random = randint(0, 10)

If the C++ version looks a bit too sophisticated for our purposes, I have to tell you that it’s doing a little more work: it asks explicitly for a uniform distribution, so no number in the range is more likely to appear than the others. This is not guaranteed across the other snippets. The C one in particular, rand() % 11, is slightly skewed towards the smaller values whenever RAND_MAX + 1 is not a multiple of 11. Remember that it is quite common and acceptable for C++ developers to fall back to the C version, relying on the C standard library instead.
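To make the skew concrete for the C snippet, the sketch below counts how many raw generator values map to each result of `% 11`. It assumes RAND_MAX is 32767 (the smallest value the C standard permits) and that rand() itself is perfectly uniform over its raw range; real implementations may differ on both counts.

```python
from collections import Counter

RAND_MAX = 32767  # assumption: the minimum RAND_MAX the C standard allows

# Count how many raw values in [0, RAND_MAX] land on each result of `% 11`.
hits = Counter(value % 11 for value in range(RAND_MAX + 1))

for result in range(11):
    print(result, hits[result])
```

Under these assumptions the results 0 through 9 are each reachable from 2979 raw values, while 10 is reachable from only 2978. The bias is tiny here, but it grows as the requested range approaches RAND_MAX.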

Let’s look back at the code. Nothing surprising so far. All code snippets seem to be doing the same thing more or less. There is probably another class/function for generating real values as well. OK. What if we have a sequence/collection of objects from which we would like to draw 1 element at random? This is a pretty common use case, isn't it?

We definitely know how to do that. We’re just going to generate a random number in the range [0, n) where n is our sequence/collection size and then — assuming our sequence/collection is random-accessible — we will use that generated random number to index into it, thus retrieving our randomly chosen element. That was pretty easy!
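The index-based approach just described could be sketched in Python as follows. The function name pick_by_index is hypothetical, chosen only for this illustration.

```python
import random


def pick_by_index(seq):
    """Pick one element by generating a random index in [0, n)."""
    n = len(seq)
    if n == 0:
        raise IndexError("cannot pick from an empty sequence")
    index = random.randrange(n)  # randrange excludes the upper bound
    return seq[index]


primes = [2, 3, 5, 7]
picked = pick_by_index(primes)
print(picked)
```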

Guess what? It gets even better. For this particular scenario, Python’s standard library offers a direct solution intended to solve this very problem, i.e.

random.choice(seq)
Return a random element from the non-empty sequence seq. If seq is empty, raises IndexError.
e.g.
>>> import random
>>> arr = [2, 3, 5, 7]
>>> element = random.choice(arr) # choose 1 element at random

Now this is starting to reveal another aspect of beauty in API design, i.e. scenario-driven APIs. If you know the most common patterns of use that clients of your module(s) will exhibit, then you might as well offer APIs that serve their most common uses in addition to the low-level rich-in-detail APIs.

This principle is so profoundly honored in the vast majority of Python standard libraries that it has become almost a tradition across Python communities. In fact, the random module exhibits some of the most commonly adopted patterns for producing beautiful APIs. For instance:

  • As you saw in the Python code snippet above, the random number generator is seeded automatically upon random module import (from an operating-system randomness source when one is available, otherwise from the current system time), which is a fairly decent default for the vast majority of uses. For all other purposes, there is a seed function that you can call if you need to.
  • Not only does the random module offer the aforementioned choice function, but it also offers a sample function for choosing a set of objects at random from any given sequence/collection.
random.sample(population, k)
Return a k length list of unique elements chosen from the population sequence. Used for random sampling without replacement.
e.g.
>>> arr = range(100) # arr = [0 .. 99] inclusive
>>> sample = random.sample(arr, 10) # 10 random ints from arr
  • Although Python’s random module seems to be extending its services exclusively through functions, it is actually employing an instance of a Random class under the hood, just like the one we explicitly created and used in the corresponding C# code snippet above. For convenience though, all of the instance methods of the Random object employed by the module have been exposed as functions. Should you want to create more than one instance of the Random class (maybe because you want to give them different seeds or don’t want them to share state), or should you want to subclass the Random class in order to change its generation behavior altogether, you can go ahead and do so more or less the same way you would in any other language. Actually, the random module itself contains a couple of alternative subclasses of the Random class, i.e.
class random.WichmannHill([seed])
Class that implements the Wichmann-Hill algorithm as the core generator.
class random.SystemRandom([seed])
Class that uses the os.urandom() function for generating random numbers from sources provided by the operating system.
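The points above about separate instances, seeding, and subclassing can be sketched as follows. The seed value 2014 and the subclass AlwaysHalf are hypothetical, picked only for illustration. The subclass overrides random(), the core method that the class's other generators build on.

```python
import random

# Two generators with the same seed produce the same sequence, and neither
# one touches the state of the module-level generator.
gen_a = random.Random(2014)
gen_b = random.Random(2014)
rolls_a = [gen_a.randint(0, 10) for _ in range(5)]
rolls_b = [gen_b.randint(0, 10) for _ in range(5)]
print(rolls_a == rolls_b)


class AlwaysHalf(random.Random):
    """Hypothetical subclass with a deliberately boring core generator."""

    def random(self):
        return 0.5


midpoint = AlwaysHalf().uniform(0, 10)  # uniform() is built on random()

# Instances offer the same helpers that the module exposes as functions.
sample = gen_a.sample(range(100), 10)  # 10 unique ints from [0, 99]
```

The same seed yields the same rolls, which is handy for reproducible tests, while the module-level functions remain available for everyday use.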

The pillars of designing beautiful APIs

This is a topic worthy of far more elaborate discussion and use cases. However, most of the beautifully designed APIs I have encountered share a few common pillars. For instance:

  • Providing/Choosing good defaults
    Be they template types or function parameters, always make sure to provide good defaults for the most commonly needed purposes and uses. A few good examples are:
  1. Boost’s tokenizer class
#include <iostream>
#include <string>
#include <boost/tokenizer.hpp>
using namespace std;
using namespace boost;

string s = "This is, a test";
//
// Thanks to good defaults, you don't need to spell out the full type:
//
// tokenizer<char_delimiters_separator<char>,
// string::const_iterator,
// string> tok(s);
//
tokenizer<> tok(s);
for(tokenizer<>::iterator beg = tok.begin(); beg != tok.end(); ++beg)
{
    cout << *beg << "\n";
}

  2. Python’s requests library (HTTP library)

>>> import requests
>>> req = requests.request('GET', 'http://httpbin.org/get')
>>> req
<Response [200]>
"""
Parameters:
(1) method – method for the new Request object.
(2) url – URL for the new Request object.
(3) params – (optional) Dictionary or bytes to be sent in the query string for the Request.
(4) data – (optional) Dictionary, bytes, or file-like object to send in the body of the Request.
(5) headers – (optional) Dictionary of HTTP Headers to send with the Request.
(6) cookies – (optional) Dict or CookieJar object to send with the Request.
(7) files – (optional) Dictionary of 'name': file-like-objects (or {'name': ('filename', fileobj)}) for multipart encoding upload.
(8) auth – (optional) Auth tuple to enable Basic/Digest/Custom HTTP Auth.
(9) timeout (float or tuple) – (optional) How long to wait for the server to send data before giving up, as a float, or a (connect timeout, read timeout) tuple.
(10) allow_redirects (bool) – (optional) Boolean. Set to True if POST/PUT/DELETE redirect following is allowed.
(11) proxies – (optional) Dictionary mapping protocol to the URL of the proxy.
(12) verify – (optional) if True, the SSL cert will be verified. A CA_BUNDLE path can also be provided.
(13) stream – (optional) if False, the response content will be immediately downloaded.
(14) cert – (optional) if String, path to ssl client cert file (.pem). If Tuple, ('cert', 'key') pair.
"""
  • Save your clients the need to write boilerplate code. Provide helpers and utility functions if necessary. An elegant example is Boost’s lexical_cast. Instead of having to write:
#include <sstream>
#include <string>

void str(int i, std::string& s)
{
    std::ostringstream ostream;
    ostream << i;
    s = ostream.str(); // error-checking omitted for brevity
}

one could write:

std::string s = boost::lexical_cast<std::string>(i); // error-checking omitted here too
  • Write scenario-driven APIs that serve the most commonly needed uses of your module. A good example is Python’s collections.Counter, a clever sub-class of dictionary that is intended for counting hashable objects.
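As a small illustration of that last pillar, here is a sketch of collections.Counter applied to a made-up sentence. The sample text is invented for this example.

```python
from collections import Counter

words = "the quick brown fox jumps over the lazy dog the end".split()

counts = Counter(words)      # counting is the whole scenario
top = counts.most_common(1)  # the most frequent word with its count
missing = counts["cat"]      # absent keys count as 0 instead of raising KeyError

print(top)      # [('the', 3)]
print(missing)  # 0
```

Doing the same with a plain dictionary takes an explicit loop plus a check for keys seen for the first time. That is exactly the kind of boilerplate a scenario-driven API spares its clients.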

Please, do take the time to tell me about some of the beautiful aspects you have encountered in your favorite library. Thanks for reading.
