Usability testing for APIs (and other textual interfaces)

Text-based interfaces are both an opportunity and a challenge for UX practitioners. In this story I’ll discuss some of what makes testing these interfaces both easier and harder.

Al Sweigart put together a text-based adventure system in Python. It’s pretty awesome.

What’s an API?

Honestly, I used the term API because it gets the most hits. I’m really talking about any text-based interface, whether that’s a library for a programming language, a Unix-shell based program, a text-based adventure game, or a chatbot.

It might seem like text-based interfaces are outmoded, but their use remains strong in some segments of the population. Nearly every consumer website has a chatbot these days, and almost all[1] programming is done via text. Chat software — SMS — was the killer app for the cell phone, and still is, with booming business from all the major players.

Differences

Textual interfaces have some unique qualities that make them fundamentally different from graphical interfaces.

Difference 1: You can type *anything* next

In a GUI, the user’s choices are constrained by the on-screen elements. If your dialog box only has six buttons and two fields, there are only eight things the user is going to interact with. And for each one, there’s only a handful of gestures — clicking on buttons, entering text, maybe swiping around a scrolling region.

In a properly designed textual interface, the number of things the user can interact with is very high, and so is the number of ways they can interact with them. Let’s take the Unix command line as an example. Even restricting ourselves to the simple world of file manipulation and navigation, there are dozens of commands (cd, ls, cp, mv, pushd, …) and dozens or hundreds of files to act on. And that’s not even getting into composition (ls -al | sort -nr and all that sort of thing).

This divergability poses a significant challenge to the UX practitioner. Practically speaking, mockups for a text-based interface cannot cover every possible workflow. They can’t even cover a meaningful fraction of them. Often this is a blessing in disguise, because it forces the team to really focus on the intended workflows, while allowing opportunistic wins by permitting interaction with other pieces of the system. (More on opportunistic wins later.)

Difference 2: The interface is part of a larger interface

Every text-based interface is really part of a larger whole, which has its own set of rules and expectations. In ZORK-like games, you expect to have an inventory, navigation controls like (s)outh or (u)p, and perhaps to be able to use a handful of well-known verbs (examine, take, etc.) on many things in the world.[2]
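To make those inherited expectations concrete, here is a minimal sketch of the kind of input handling a ZORK-like prototype might start from. The direction abbreviations and the verbs examine and take come from the conventions above; the other verbs, the aliases (x, i, get), and every function and variable name are assumptions made up for illustration, not taken from any real game engine.

```python
# Hypothetical conventions for a ZORK-like prototype; these tables are
# illustrative and not copied from any real game.
DIRECTIONS = {"n": "north", "s": "south", "e": "east", "w": "west",
              "u": "up", "d": "down"}
VERBS = {"examine", "take", "drop", "inventory"}
VERB_ALIASES = {"x": "examine", "i": "inventory", "get": "take"}


def parse(line):
    """Turn raw player input into a (verb, object) pair."""
    words = line.lower().split()
    if not words:
        return ("unknown", "")
    first, rest = words[0], " ".join(words[1:])
    # A bare direction, abbreviated or spelled out, means "go that way".
    if first in DIRECTIONS:
        return ("go", DIRECTIONS[first])
    if first in DIRECTIONS.values():
        return ("go", first)
    # Expand well-known abbreviations before checking the verb list.
    verb = VERB_ALIASES.get(first, first)
    if verb in VERBS:
        return (verb, rest)
    # Anything else is something the prototype did not anticipate.
    return ("unknown", line.strip())
```

In a usability session, the inputs that come back as "unknown" are probably the most interesting ones: they show which conventions the participant assumed your interface had inherited.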

Command line utilities (e.g. sed, git, brew) likewise inherit a bunch of goodness from the environment: commands are often expected to be composable using pipes, they generally should have some built-in documentation behind -h or the equivalent, and there are even some flags and usages that are common. For example, many commands offer a -n / --dry-run flag that lets you preview the outcome of a command without actually performing it.
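As a rough sketch of how cheaply a prototype can pick up these conventions, Python's standard argparse module generates -h / --help on its own, and a preview flag is one extra line. The utility name "tidy", its behavior, and the function names are invented for this example; the long spelling --dry-run is the one used by tools such as rsync and git, but your environment's conventions may differ.

```python
import argparse


def build_parser():
    # "tidy" is a made-up utility name used only for this sketch.
    # argparse adds -h / --help automatically.
    parser = argparse.ArgumentParser(
        prog="tidy", description="Delete the named files.")
    parser.add_argument("paths", nargs="+", help="files to delete")
    parser.add_argument("-n", "--dry-run", action="store_true",
                        help="show what would happen without doing it")
    return parser


def plan(argv):
    """Return the lines the tool would print for the given arguments.

    This sketch only describes the work; it never touches the filesystem.
    """
    args = build_parser().parse_args(argv)
    prefix = "would delete" if args.dry_run else "deleting"
    return [f"{prefix} {path}" for path in args.paths]
```

Calling plan(["-n", "a.txt"]) returns the preview wording, while plan(["a.txt"]) returns the wording for the real run. Because the function returns lines rather than printing them, the same code can drive a test session without a real shell.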

While GUIs are often part of an operating system with common paradigms (Window, Icon, Menu, Pointer), it’s rare for GUIs to be composable or separable. You can’t tear off the Layers Toolbar from Photoshop and apply it to your MS Word document, as cool as that would be.

This is also a blessing in disguise. By shaping your test to lean on these constraints, you can defer a lot of the pain of prototyping — because the work is done by some other thing.

Difference 3: The name *is* the interface

This is a consequence of a text-based interface. The name for something is the thing itself; cd is cd, and maybe you know it means “change directory” and what a directory is, but it’s immaterial. Users end up thinking about it as “the cd command”, and that’s fine.

This means that getting the names right is really, really important. Unlike the name for a GUI application, which is mostly branding (“Photoshop”, “Little Snitch”, etc. are names meant to evoke the functionality, while names like “Safari” are purposely chosen for their catchiness), the name of a piece of an API has a big impact on how hard it is to type, how tab completion behaves (if it exists), where it sorts in a list, what it gets confused with, and so forth.
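Some of these naming concerns can be checked mechanically before you ever sit down with a participant. The sketch below is one possible approach, not an established method: it uses Python's standard difflib module to flag names that look alike, and a simple prefix filter to show what a tab-completer would offer. The function names, the similarity cutoff of 0.75, and the example API names in the usage note are all assumptions chosen for illustration.

```python
import difflib


def confusable_pairs(names, cutoff=0.75):
    """Return sorted pairs of names similar enough to be mistaken for each other."""
    pairs = set()
    for name in names:
        others = [n for n in names if n != name]
        # get_close_matches scores similarity from 0.0 to 1.0 and keeps
        # only candidates at or above the cutoff.
        for match in difflib.get_close_matches(name, others, n=3, cutoff=cutoff):
            pairs.add(tuple(sorted((name, match))))
    return sorted(pairs)


def completions(prefix, names):
    """Names a tab-completer would offer for the given prefix, in sorted order."""
    return sorted(n for n in names if n.startswith(prefix))
```

For a hypothetical API with the names get_user, get_users, set_user, and delete_account, confusable_pairs flags the first three as mutually confusable, and completions("get", ...) shows that typing "get" and pressing tab still leaves the user with two choices. A check like this does not replace watching real users, but it can suggest which names deserve a task in your test plan.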

Difference 4: It’s just text

This is the crux of the matter. These interfaces are, largely, “just text”.[3]

This is a place where APIs have a huge advantage over GUIs when it comes to prototyping. To prototype a GUI, you need someone proficient in a prototyping tool such as Balsamiq or Sketch, or at least with a whiteboard marker. Interactive GUIs can be constructed from paper, and I’ve had good success with doing that. But interactive APIs? All you need is a text editor.

I do 90% of my usability testing in a text editor; sometimes it’s an IDE that is syntax-aware, but that’s not necessary. The user types code. I tell them what it does. Alternately, we run a “Wizard of Oz” session where a team member launches a shared editor and acts as the ‘computer’, cutting and pasting canned responses or making up results if the user’s been inventive.

As low fidelity as this is, you learn an awful lot, at a lower cost than even the cheapest wireframes.
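A shared text editor is all a “Wizard of Oz” session needs, but if you want a little structure, the bookkeeping can be sketched in a few lines of Python. Everything here is hypothetical: the canned replies, the command names, and the function names are invented for illustration, and the improvise callback stands in for the team member making up a result on the spot.

```python
# Hypothetical canned replies prepared before the session.
CANNED = {
    "help": "Commands: list, open <name>, quit",
    "list": "report.txt  notes.txt",
}


def respond(user_input, canned, improvise):
    """Return the reply to show the participant, plus whether it was improvised."""
    key = user_input.strip().lower()
    if key in canned:
        return canned[key], False
    # Nothing prepared: hand off to the human "wizard" to make up a result.
    return improvise(user_input), True


def run_session(inputs, canned, improvise):
    """Replay a list of participant inputs and keep a log for later review."""
    log = []
    for text in inputs:
        reply, improvised = respond(text, canned, improvise)
        log.append({"input": text, "reply": reply, "improvised": improvised})
    return log
```

In a live session, improvise could be as simple as a function that prompts the wizard to type a reply. Afterwards, the log entries marked as improvised are a list of the things participants tried that nobody on the team anticipated.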

They’re Still User Interfaces To Me

A final note for experienced UXers who might be faced with the daunting challenge of supporting a text-based product. After all is said and done, they’re still UIs, interfaces between a computer and a human, and all your regular knowledge and know-how applies. Things have to be discoverable, understandable, differentiable; the gulfs of execution and evaluation should be kept small; and so forth.

For a list of specific things I watch for when designing programming languages, check out my article on that subject.

And if you’d like to see more content in this vein, let me know by leaving a comment or just a clap or three. It really helps me know it’s worth my time.


[1] There are some largely-graphical programming environments, like Scratch or Simulink. These are really neat and I hope the trend continues. But text is a good match for programming because of the qualities outlined above: when programming, you need to be able to do “almost anything” next.

[2] Many games flout these conventions, to very good effect — but this is the basic ZORK/ADVENT paradigm you often start from.

And some years ago, PC Gamer had fun wondering what modern computer games would look like if they were text adventures.

[3] Most modern text-based interfaces are not “just text”. IDEs provide color shading, popups, autocompletion, and a host of other features. Most chat applications put smileys and special effects over your text. These are important considerations, and worth a future article.