“Real developers don’t use UIs”

The value of web UIs for CLI-oriented users

Micah Linnemeier
IBM Design
30 min read · Jul 18, 2018


Written by Micah Linnemeier & Spencer Reynolds — July 12, 2018

Executive Summary

  • Web UIs should be complementary to their CLI counterparts, not redundant with them.
  • Even the most die-hard CLI user will be more productive with a web UI in certain situations, due to fundamental strengths of GUIs that ultimately are tied to traits inherent to all humans.
  • CLIs are strong when it comes to tasks that should be automated, tasks where the user already knows exactly what they want, and tasks that benefit from interoperating with other CLI tools.
  • Web UIs are strong when it comes to the ‘getting started’ phase of product adoption, and particular tasks in the ‘productive use’ phase such as: making sense of large data sets, browsing, important tasks that need human intervention, and tasks that benefit from interoperating with other web experiences.

Introduction

As user experience (UX) designers working on a cloud platform with both a command-line and web interface, the elephant in the room when it comes to prioritizing and allocating resources to the web UI is often the following: “Why should we invest in the web UI when our users only work in the command-line interface (CLI)?” This is a good question that deserves a well thought out answer.

Unfortunately, many UX designers working on products for developers, DevOps professionals, data scientists, or system administrators (hereafter referred to as ‘software professionals’) don’t engage with this issue head on. They treat the CLI as a mysterious world that is beyond the reach of UX. Instead, many designers attempt to create a simpler, easier web UI world they can understand, assuming that this will automatically have some beneficial ramification for their users (“I actually get the web UI, so it must be easier for users too, right?”).

For the uninitiated, the CLI can seem impenetrable. But don’t assume that necessarily makes it difficult for software professionals.

Ignoring the CLI leaves UX designers vulnerable to a massive problem. If they don’t design the web UI intentionally with respect to the capabilities of the CLI, they are opening themselves up to the possibility that the entire web UI is redundant with the CLI, bringing no additional value to the overall experience. Working on such a web UI is not a good use of a designer’s time, or their company’s resources. And ultimately, putting effort into something that users will never touch does the users themselves a disservice, because it takes designers’ time away from improving other areas of the product (like the UX of the CLI. But that is a discussion for another day).

All things considered, web UI and graphical user interface (GUI) designers in general need not fear — there is still a role for graphical UIs in a world dominated by CLIs, even for products with deeply technical user bases.

In this article, we will attempt to give perspective and guidance on the relationship between, and relative merits of, web UIs (GUIs in the medium of the web browser) and their CLI equivalents, with an audience of UX designers as well as other product stakeholders such as developers and product managers in mind.

Many of the qualities that make web UIs valuable are true of GUIs in general, but we will also discuss a number of traits and recommendations that are specific to web UIs and their CLI equivalents.

We’ll start by comparing the relative merits of CLIs and web UIs, which will prepare us to discuss how a web UI and a CLI can work together to create the best possible experience for your product.

Why aren’t CLIs extinct?

A typical CLI, responding to a command to list the detailed contents of the current directory.

At first glance, the CLI may feel like a relic of a bygone era. One where computers couldn’t quite figure out how to make nice pictures, and had to make do with text instead. Why haven’t GUIs blown CLIs off the face of the earth by now?

Let’s do a quick comparison of their interaction models:

At first glance, the CLI appears to be less capable in every way: fewer options for providing input, and fewer options for seeing the results of your actions. But as good designers know, sometimes constraints can drive fantastic outcomes.

In the case of the CLI, these constraints help to create an ecosystem of extremely powerful tools for software professionals. Let’s dig into how these constraints manifest as some of the best qualities of CLIs.

Why CLI tools dominate with software professionals

CLIs are almost always available

CLIs are the lowest common denominator way of interacting with any machine. Machines that are in a different location, have a minimalistic installation, are in a state of partial installation, or are in an error state can often only be interacted with via CLI.

Why are CLIs so widely available?

  • Because CLIs are simple, they have fewer dependencies and require fewer system resources, making them a good fit for servers that want to put as many of their resources towards serving content as possible.
  • The CLI mode of interaction (send a command, wait, receive results) is more resilient to latency issues, making CLIs a good choice for remote interaction with a machine.
  • The simplicity and time-testedness of the CLI means reduced security risk compared to GUIs.
  • CLIs predate GUIs, and sit metaphorically closer to the underlying operating system. As such they are sometimes the only way to perform certain low-level operations that software professionals must do.

Because the CLI is all but guaranteed to be available, there is a logic behind software professionals focusing as much of their attention as possible on becoming comfortable with the CLI, rather than learning two different ways to achieve the same thing.

Even for personal computers, almost all operating systems have a way of booting the system directly to a command line, circumventing the GUI and giving more direct access to the operating system in times of need.

The CLI is automatable

With the CLI, the way a human does a task is the same way an automated script does it: by providing a series of text commands. If you do something once on the CLI as a human, you can make a script do it for you a thousand times, using the exact same commands. You can also make a computer do it for you when you’re not even around. This is absolutely vital for software professionals looking to reduce the manual effort involved in common tasks.
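To make this concrete, here is a minimal sketch of the kind of task that moves from “done by hand” to “done by script” with essentially no translation (the hostnames, paths, and script name are hypothetical placeholders):

    # back up a config file from each of a list of servers,
    # using the exact same commands you would type by hand
    for host in web-01 web-02 web-03; do
      scp "$host:/etc/nginx/nginx.conf" "backups/$host-nginx.conf"
    done

    # or let cron run the same script every night at 2am, with no human present:
    # 0 2 * * * /home/me/backup-configs.sh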

Some GUI environments do have the capacity for automation, but because of the more complex mode of interaction entailed in working with GUIs, GUI automation scripts are often much more cumbersome to create and maintain.

Give a man a GUI and he solves his problem for a day. Give a man a CLI and he solves his problem for a lifetime.

CLI commands make it easy to re-use the work of others

Because all CLI commands are text, it is easy for CLI users to share, find, and re-use the commands of others. Simply copy and paste.

Consuming someone else’s GUI instructions is much more difficult. Consider the process of following a Photoshop video tutorial. You have to constantly jump back and forth between watching the video and attempting to mimic actions in your workspace. This takes much more effort, and is much more likely to result in error.

GUI instructions are also much more difficult to share. You either have to create and edit a video, produce a series of screenshots with annotations, or awkwardly describe your mouse interactions in written form.

The ease of sharing CLI scripts led to the term ‘script kiddie,’ an unsophisticated user that blindly uses scripts to perform hacks with little to no knowledge of the underlying technology.

CLI tools connect to each other relatively easily

Because the CLI input format is the same as the output format, it is relatively easy to chain or ‘pipe’ a series of command line tools together to accomplish a complex task.
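As a small sketch of what piping looks like in practice (the log file name is a hypothetical placeholder), here is a chain of standard tools answering a question that no single one of them was written to answer:

    # find the ten IP addresses that appear most often in a web server log
    awk '{print $1}' access.log | sort | uniq -c | sort -rn | head -10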

GUIs take text, files, and mouse clicks (and other touch input) in the right locations as input, and produce text, files, and images on your screen as output.

From the perspective of chaining, there are two problems here:

  • GUIs struggle to produce mouse clicks in the right locations as output.
  • GUIs struggle to accept images on your screen as input.

GUIs can produce mouse clicks in specific locations as output, but picking the right click location means making adjustments based on the location of a given GUI element, which means adjusting for the user’s desktop environment, the state of the app, and the specific version of the app. No thanks.

GUIs can’t accept images on your screen as input, although you can technically ‘scrape’ screenshots for data using optical character recognition. What’s less clear is whether users would rather scrape a series of GUI screenshots, or scrape their eyes out from the difficulty of making this work.

The end result for software professionals is that with GUI tools, you often have to pray that the full workflow you want to accomplish can be done within a single tool, since interoperability between tools is far less than guaranteed.

Jimmy’s pipe dream of interconnectedness between tools is real with the magic of CLI piping.

CLIs are faster than web UIs

CLIs often return results from remote servers significantly faster than their equivalent web UIs. Why is this the case?

  • Where CLIs typically request only raw data from the server, web UIs must transfer not only raw data, but also presentation information (e.g. CSS), and information about other actions that can be taken on a page (e.g. navigation).
  • CLI commands typically only hit one server endpoint once. Web UIs often request resources which then request their own resources, which then request their own resources, etc., leading to multiple round trips before a page is fully rendered.
  • CLI tools often allow you to be more precise about the exact data you want to retrieve, resulting in less unnecessary data being transferred in the first place (see the sketch below).
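To make the contrast concrete, here is a hedged sketch of a single, precise request that pulls down only the fields of interest, with no stylesheets, scripts, or navigation markup coming along for the ride (the endpoint and field names are hypothetical):

    # ask the API for the name and status of each cluster, and nothing else
    curl -s https://api.example.com/v1/clusters | jq '.[] | {name, status}'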

CLIs can also increase the effective speed of advanced users in ways that are unrelated to the network:

  • Knowledgeable users can execute any command at any time, without needing to ‘navigate’ to a specific area in the tool.
  • Users that already have their hands on the keyboard from writing code may find it faster to switch to another keyboard-driven tool (CLI) compared to a mouse-driven one (GUI).
  • CLI commands are often more concise and faster to enter than their GUI equivalents. The mouse is typically not a high throughput means of conveying information.
Real footage of a CLI user at work.

CLIs make remote systems seem local

CLI tools that operate on remote resources (a computer or other entity elsewhere on the internet or local network) follow the same interaction model as CLI tools that operate on the local machine: text and files in, text and files out. In combination with the automation and interoperability capabilities of the CLI, this means that software professionals using the CLI can often orchestrate complex tasks between the local machine and multiple remote resources with relative ease.

The ability to SSH (Secure SHell), or log in to a remote machine and begin working with it as if you were on that machine locally, also enhances the CLI’s ability to make the remote feel local. While remote GUI desktop sessions are possible, the GUI model of constant screen feedback suffers greatly under even slight network delays, leading software professionals to pine for their CLIs. The better alternative to a remote desktop session is a remote interaction via a web UI, which we will get to in a moment.
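As a small sketch of how seamless this can be (the host name, file path, and search term are hypothetical), the output of a command run on a remote machine feeds straight into tools running on your own:

    # count timeout errors in a log that lives on a remote build server,
    # piping its output through local tools as if the whole pipeline were local
    ssh me@build-server 'cat /var/log/app/error.log' | grep 'timeout' | wc -l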

CLIs fit the culture of software professionals

In addition to the concrete factors listed above, there may also be a set of cultural factors at play behind some software professionals’ preference for CLI tools over web UIs. In keeping with the developer ethos, I’d prefer to focus on the concrete benefits of the CLI mentioned above for the purposes of this article. But if some of these additional factors are of interest to you, I’ve described my personal understanding of some of them in a separate article.

  • The Anti-Aesthetic Aesthetic
  • Complexity Masochism
  • The ‘Matrix’ Effect
  • ‘RTFM’ Culture

So why use a web UI at all?

All of the concrete reasons above are, in our opinion, very good reasons to use a CLI over its web UI equivalent. These factors are so strong that they make CLIs the right interface for a substantial portion of an experienced software professional’s tasks.

The funny thing is, web UIs should be strictly superior to CLIs. They have a superset of the capabilities of the CLI.

Let’s take a look at the CLI vs. GUI interaction model comparison again:

The impossible is unknown with GUIs.

Fundamentally, what are we getting with GUIs that we aren’t with CLIs?

  • The GUI can control every pixel on the screen.
  • Users can provide input in new ways to the system via mouse/touch input.

The biggest benefit is control of the screen. More visual control gives us more options when it comes to leveraging the user’s visual cortex: an information processing beast.¹

The mouse also plays a role, by leveraging human intuitions and capabilities from the physical world around the direct manipulation of objects.²

The screen and the mouse work together to fuel the most secret and controversial weapon of the GUI: paternalism.

The advantages of fine-grain control of the screen

While humans have an innate capacity for language, reading itself is unnatural, having only been part of human culture for the past five thousand years or so.³ Vision, by comparison, has not only been part of human biology since day one, but predates humans altogether, occurring in the vast majority of animals. And humans have some of the best vision in the animal kingdom.

Which one gives you the important information faster: the text or the image? Our pro-reading ancestors sadly did not survive this test.

From our evolutionary legacy we’ve had both the time and motive to develop specialized cognitive circuitry for deriving value from visual information. Much like how data can be processed more efficiently when formatted in a way that a specialized chip (like a graphics card) can understand, representing information in a visual form is an extremely useful ‘hack’ for making information much more digestible to the human mind. This additional cognitive bandwidth can be used both to improve users’ understanding of your tool’s domain (“How many servers are down?”) and also their understanding of the capabilities of your tool (“How do I turn those servers on again?”).

GUIs can leverage richer visual signifiers to convey more information faster

A “signifier” is any form that implies something else. Words are signifiers, but they aren’t the only ones, and often aren’t the best ones for helping our users understand. Because GUIs allow us to leverage a wider spectrum of signifiers, they have more opportunities to convey information efficiently and effectively.⁴

Consider the following examples:

Which would you prefer to see in your UI?

GUIs are full of signifiers. They can be as obvious as an icon, or as subtle as the small distance between two objects implying they are grouped (see ‘Gestalt Principles’ for more on this). Some signifiers are specific to computers (e.g. hyperlinks and tooltips), whereas others leverage more natural intuitions (e.g. buttons looking pressable, red things catching the eye.)

CLIs don’t have much to work with to form signifiers: just plain monospaced text and a few other bells and whistles depending on your CLI environment (foreground and background color per character, bold vs. normal text, the ‘empty’ character to create spatial metaphors, flashing command prompts, sometimes hyperlinks.) Text also implies a few other capabilities that are sometimes but not always supported, such as copy & paste, or text-based search. These capabilities can be quite powerful compared to not having them at all, but are still a subset of the possibilities of total screen control.
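Those bells and whistles are driven by in-band escape sequences. As a minimal sketch using standard ANSI codes, this is roughly the ceiling of what a CLI signifier can do: bold a word and color it.

    # print "ERROR" in bold red, then reset back to plain text
    printf '\033[1;31mERROR\033[0m something went wrong\n'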

In the movie Shinboru (Symbol, 2009), a man must escape a room full of ‘buttons’, where the buttons in no way signify what happens when they are pressed. Suffice to say, it takes him quite a bit of trial and error.

GUIs can use images to make information easier to digest

As my grandma used to say, a 2D pixel array is worth a thousand UTF-8 strings.

The capacity to represent information in visual forms other than text allows GUIs to leverage the processing capabilities of the visual cortex that make it easier to filter, find, and make sense of information.⁴

Consider the following image:

A composite photograph of the world at night. Beautiful isn’t it? You can learn a lot from it quite easily.

How hard would it be to attain the same level of understanding this image gives you if it were presented as a table of luminosity ratings at various longitudes and latitudes? How about if you were to describe it in words to another person? How long would it take to describe it to the same degree of accuracy vs. just showing them the image?

Let’s take a look at a simpler image to better illustrate some of our ingrained visual capabilities.

Try answering each of these questions using only the left side, then only the right side of the image.

  • How many red items are there?
  • How many triangles are there?
  • How many blue circles are there?

It’s likely that you were able to answer all of these questions faster with the image on the left compared to the image on the right. Why?

One effect at play here is our ability to ask our brains to perform a special ‘color highlight’ operation. This operation comes at a low one-time cost, and reduces the search space by two-thirds: a tidy savings, computationally speaking, compared to examining each item in series.

The other trick you probably used is highlighting on the basis of shape: bringing items with a certain appearance to the mental forefront.

Interestingly, this shape highlighting effect is in play when examining both the image of shapes on the left and the image of text on the right. On the left, the shape your mind highlights is the triangle. On the right, your mind attempts to highlight the shape of the word ‘triangle’ itself.

The shapes of individual words are not always terrible to use as the basis for this mental highlighting trick, but the trick gets weaker the longer the word or words are, or the busier the context is (like being amongst a ton of other text). Because GUIs are able to use a wider visual space than just the shape of words as the basis for these ingrained highlighting tricks, they have the potential to be more effective at communicating information to software professionals.

GUIs use vision to trigger recognition more easily

Why is recognition so important for user interfaces? It is because humans fundamentally find it easier to recognize an item from a set that is presented to them, rather than to recall that same item without any prompting.⁵

Think about the difference between a waiter that says “for your side, do you want soup, a salad, or chips?”, vs. a waiter that says “what side do you want?”. The first waiter is making it cognitively easier for the guest by presenting the options to their guest and letting them decide. The second waiter is requiring the guest to either already know their side options, or to look it up themselves somewhere on the menu. Both waiters are asking for the same information, but the first is reducing the guest’s cognitive burden.

(Note however, that if this guest is a regular at the restaurant, she may find it easier to specify in her initial order that she wants the salad, with no prompting at all. This should be well supported for power users. See discussions on the ‘productive use’ phase later on.)

Now imagine a third waiter that brings by a cart containing the sides themselves, and lets you point and grunt at the ones you want. This is perhaps the easiest of all possibilities. Seeing the sides saves you from having to mentally connect the name of the sides to the sides themselves. It gives you much more information about the sides with very little additional cognitive overhead (for instance, are the fries thick cut or thin? What ingredients are in the salad? Does the soup look good?). The visual presence of the sides relieves you from having to keep the list of sides as a focus of your mental attention, freeing more of your attention for actually reasoning about your options. And finally, the ability to grunt at your sides of choice saves you from even having to remember the words for the objects in front of you. Indeed, clicking your way through a web UI is at its easiest a series of grunts: what I refer to with veneration as having achieved “caveman simple.”

For a more technical example, on IBM’s managed Kubernetes Service (which we work on), creating a cluster from the command line involves many different arguments that you may need to provide via the ‘cluster-create’ command…
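A rough sketch of what such an invocation might have looked like follows; the command prefix, flag names, and values are approximations for illustration only, not the exact syntax of the IBM Cloud CLI:

    # create a cluster by recalling and correctly formatting every argument yourself
    bx cs cluster-create \
      --name my-cluster \
      --location dal10 \
      --machine-type u2c.2x4 \
      --workers 3 \
      --public-vlan 1502175 \
      --private-vlan 1763611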

Many users would find it difficult to recall their desired location, public vlan ID, private vlan ID, machine type and cluster name without any assistance (setting aside the issue of formatting these arguments in the way the CLI expects).

Selecting a machine type for your worker nodes in the Kubernetes Service web UI. Just grunt at the one you want (with your mouse please, pending Siri integration).

Now imagine this same experience in a web UI. Instead of memorizing all of the different flags and values that you can pass in for each argument, each option appears as a separately labeled interactive field. For fields whose values can be enumerated, the options appear in a list, reducing the set of options you need to consider to exactly the options that will work.

Because GUIs are built around visually presenting recognizable options rather than expecting users to recall commands, and because GUIs are able to leverage images that are more recognizable than sentences, GUIs can reduce the cognitive burden for software professionals.

The dark side of better access to the human mind

From the perspective of many software professionals, web UIs often abuse the additional bandwidth their visual capabilities afford. With dubious aesthetic choices, gratuitous animations, and advertisements, web UIs can use their additional bandwidth as a channel for noise. Don’t dump trash into your user’s brain! With great power comes great responsibility.

Note also that not all information necessarily has a visual encoding or augmentation that is superior to plain text. Text and language in general are powerful because they are able to represent many abstract concepts clearly. Know when to leave a good thing alone.

The advantages of touch input

Hands are our most effective natural tool for interacting with the world. Poking keys in rapid succession is clearly an effective, if difficult to learn, channel for transmitting text, but there’s more to hands than just typing.

GUIs can use touch interaction as a more intuitive input mechanism

Direct manipulation is a mode of interaction that interprets a user’s physical input (typically moving the mouse) more literally with respect to its repercussions on digital ‘objects.’ Examples include tapping icons to open a program, scrolling via touch gestures, or dragging the corner of an image to make it bigger.

All else being equal, actions you take on objects in the physical world are most obvious when they can be performed directly on the object itself, rather than through some intermediary. The use of direct manipulation in user interfaces allows us to apply the mental circuitry we have been training against the physical world since birth more directly to the digital realm.²

For domains that have a direct physical analog, like 3D modeling, touch input is a no-brainer. For other tasks that are more common to software professionals, the benefits of reusing mental circuitry trained on the physical world are smaller. Software professionals’ digital mental models may be as rich or richer than their physical mental models. But there may still be some physical intuition baked into the human mind from birth that continues to give physical analogies an advantage, regardless of your level of digital experience.

The argument I find more persuasive is around what I will call co-location of action and object, where for instance the actions that can be taken on a digital object are revealed by right-clicking that object. Taking action near the on-screen object may piggyback on the already expended cognitive effort involved in looking at and recognizing that object on the screen. The alternative being that you take all actions at the command prompt, regardless of where or whether the object you are manipulating appears on screen.

Einstein famously derided quantum entanglement as ‘spooky action at a distance’ because of the sheer unintuitiveness of particles seemingly affecting each other instantaneously across great distances. While we cannot ultimately make the reality of our world conform to our notions of intuitiveness, we can control the systems we create for ourselves.

GUIs can use touch interaction for superior performance in select tasks

Touch input can be a faster means of providing input in certain situations, even for the CLI adept. Consider the ease of scrolling down a page through touch gestures on a trackpad. With one movement you transmit information about direction and distance using natural motion. Scrolling with the keyboard entails either repeatedly pressing a key, or holding a key down for an extended period, a well-known aggravation prior to the days of the mouse.

And yet, my father still grabs the scroll bar and moves it down manually. Old habits die hard.

The non-software professional, clearly befuddled in the face of technology.

Paternalism: the secret weapon of the GUI

Also known as ‘nudging’, or suggestion, paternalism is limiting a person’s autonomy in a way that is intended to promote that person’s own good. For the purpose of UIs, paternalism is showing, hiding, highlighting, or de-emphasizing information and actions with an intent to help a user accomplish their goals.

Paternalism is key to the success of GUIs, and the adoption of personal computers in general, because it helps manage complexity. Users that don’t already have a comprehensive mental model of a system are more likely to benefit from having their choices intelligently reduced. Also known as the paradox of choice or information overload, presenting more information and more options can actually increase the cognitive burden entailed in making a decision, leading to lower productivity for users as they struggle to divine the right information from what lies before them.

Whether paternalism helps or hinders a UI is dependent on three key factors:

  1. How smart your UI is

If (and it’s a big if) your UI can use what it knows to bring the right actions and information to a user’s attention at the right time, it can dramatically cut down the user’s cognitive burden, and improve their productivity. But if your UI makes bad calls, it will get in the way more often than it assists.

Your product’s UX lives or dies by this factor.

  2. How experienced your user is

Users need more suggestion in the earliest stages of their journey because their mental model isn’t built out yet. In later stages, when the user can conceive of exactly what they want to do and how to do it without any feedback from the system, your suggestions are already too late to the mental process party.

  3. Your user’s personality

Different users experience suggestion differently. Some may take it as an affront to their intelligence (“I’ll tell YOU what I want!”). Others will see it as a considerate accommodation that helps build a relationship (“Oh, how thoughtful of you!”). See our article on the culture of CLI usage for more on this.

What does any of this have to do with GUIs in particular?

Both CLIs and GUIs can use paternalism. But GUIs tend to do it much more. GUIs make paternalism more effective due to the combination of two of their core capabilities:

  1. Images trigger recognition more strongly than text. Recognition drives the ability to suggest one alternative over another.
  2. Direct manipulation (or more specifically, co-location of action and object) allows users to ‘click’ the thing they recognize, completing the recognize-act-recognize cycle of paternalistic suggestion.
Evernote clearly thinks ‘Work Chat’ is less important than making a ‘New Note’. And they selected the ‘All Notes’ tab for me by default when I opened their app. How dare they!

I use the word paternalism specifically because of its negative “father knows best” connotation. In the eyes of many software professionals, GUIs and web UIs have abused this technique, keeping them from finding and quickly accessing the functionality and information they know exists. It is this history of abuse that UX designers are up against when creating a web UI for software professionals.

Most CLIs take the opposite approach to paternalism, being highly un-opinionated about the right next action. GUIs have modes (e.g. what page you are on), whereas CLIs are often modeless (all data and actions are reachable in one action). GUIs can use visual bandwidth to make certain actions more or less visible (e.g. using a big bright button or a more sedate one). CLIs often list their actions in alphabetical order.

The GUI wins when the relief of having the right thing suggested to you outweighs the pain of things being hidden from you that you want.

The first portion of the AWS CLI help topic for EC2. This alphabetized list of commands takes no stance on which are more or less useful to the user right now. Yikes.
Yeoman is an example of a CLI tool that uses paternalism heavily. It also stretches the limitations of the CLI. Can you see the GUI influences?

A summary of the GUI’s fundamental advantages

The GUI has a handful of tricks up its sleeve that make it more efficient and easier for users to understand and act on information. Via more direct control of the screen, it can leverage visual signifiers, mental ‘highlighting’ techniques, and the power of visual recognition to communicate information more efficiently and more easily than with plain text alone. Via touch input, it can leverage human physicality to make certain operations more intuitive and more efficient. And most importantly, it uses these capabilities in tandem with paternalistic UX patterns to make systems easier to understand and act on for users that aren’t already complete experts.

And because many of these tricks rely on fundamental qualities that are universal to all humans, they apply to both software professionals and more typical users alike.

This is not to say however that these advantages apply uniformly to any task. Far from it. The strengths of GUIs and the strengths of CLIs when considered side by side reveal the situations in which a given task is a better fit for one or the other. The end result being that for most cloud-based products with an audience of software professionals, both a CLI and web UI have some role to play.

What the CLI should specialize in

Anything that should be automated

A fundamental strength of computers in general is automation, and CLIs are excellent at taking advantage of this.

In keeping with developer values, anything that can be automated should be. The CLI is the perfect environment for this, due to the strengths mentioned above, particularly around ease of automation and the ease of connecting CLI tools together.

The “productive use” phase

The user here knows exactly what they want without any need for prompting. The guidance provided by a GUI would only get in their way. (From xkcd)

Experienced users often know exactly what they want to do and how to do it. In this situation the CLI is often the best interface for the job, because of its aforementioned raw speed advantage. As users move from the initial ‘getting started’ phase of product adoption to the everyday ‘productive use’ phase, the likelihood that they already know exactly what they want to do increases dramatically. In this phase, you should get out of your user’s way as much as possible; and the CLI is often the best way of giving them that level of direct access. (Note that keyboard shortcuts can help reduce this disadvantage in GUIs, although their use on the web is not standardized and a topic of some debate.)

Workflows involving other CLI tools

Because of the ease of using CLI tools together to accomplish complex tasks, and the already rich ecosystem of CLI tools for software professionals, it goes without saying that workflows that benefit from interoperability with these other tools are good candidates for being available on the CLI.

As a rule of thumb, if the end goal of a workflow is to manipulate data into a different form, the CLI is often a good bet. If the immediate end result of a session with your product is a human insight that doesn’t need to be fed into another program, the web UI may be a good candidate.

What the web UI should specialize in

The “getting started” phase

In the early stages of product adoption, users often struggle to make sense of the wide array of capabilities your product offers. They may need help in making a mental model of your product. They may need suggestions on what they can or should do. Because the GUI specializes in suggestion (see ‘paternalism’ above), it is the ideal interface for this phase. Once the user has a strong mental model of the capabilities of your product and how to interact with them, they no longer need suggestiveness as much as they once did.

Because of the web UI’s strength in early phases and weakness in later phases, the web UI should ultimately seek to facilitate a transition to the CLI as software professionals enter the ‘productive use’ phase of product adoption. One way a product can help to do this is by giving the CLI equivalent of actions taken in the web UI — especially commands that you expect users to want to use on the CLI in their productive use phase.

Heroku, a cloud hosting platform, showing the CLI equivalent of adding a service, right next to the web UI button for that action.

Another more general way to facilitate the transition to the CLI is to make sure that you are teaching a mental model that is complementary to, rather than in conflict with, the model taught by the CLI. This means using the same terminology consistently across both interfaces, and making the underlying entities and relationships that users need to understand in the CLI clear in the web UI as well.

Browsing workflows

To browse, in contrast to ‘searching’, is to look around for something that you don’t already have a specific definition of. In this mode, you need rich feedback from the world as much as you need to tell the world what you want. Because GUIs specialize in being suggestive and driving recognition, they are a perfect fit for browsing experiences. If your product has something analogous to a catalog, manifesting this in the web UI is a no-brainer.

Imagine browsing Netflix in a CLI vs. in a GUI.

Important one-time or infrequent workflows

Because of the CLI’s strength in automation, the end game in your user’s product journey may be that your product is almost entirely automated by that user, and thus not manually interacted with at all for most workflows. But there are always workflows that may be inappropriate or overkill to automate, either because they are one-time workflows (e.g. upgrading to a paid tier of the product), or because they need special care from a real human user (e.g. merging two pieces of code together). In these tasks, a user experienced with a tool in general may find themselves in a more exploratory mode that the web UI can assist with, for the same reasons that the web UI is beneficial in the ‘getting started’ phase of the experience.

Note that if a workflow is truly unimportant as well as infrequently used, it may be ok to leave it as a CLI-only feature, simply because it doesn’t warrant the extra effort entailed in exposing it in the web UI.

Workflows that involve making sense of large amounts of data

Even the most hardened and grizzled software professionals will take a look at a monitoring dashboard of their app. And very few of us use command-line web browsers on a regular basis.

Lynx: the classic CLI web browser. See also ‘browsh’, a more modern take on this, in the ‘ongoing developments’ section.

As discussed in the section on a GUI’s visual capabilities, some information is more easily communicated and interacted with in a visual form, even when the user has worked with your product for years. These situations should be identified, so that the web UI can be leveraged when the time is right.

Workflows that involve the rest of the web

As a matter of practicality, many workflows for your user already begin on the web: either in a totally different corner of the internet, or perhaps in another website or app you own that nevertheless isn’t integrated with this particular product’s CLI experience. The most common example is the ‘workflow’ of your user discovering your product in the first place. Unless they are using a package manager to install your CLI tool directly (and do you support registration on the CLI?), chances are extremely high that they will see your web presence first, making this a logical jumping off point for other workflows.

But initial product discovery is only one example. Consider a DevOps service that links a GitHub commit to a build of that code, and the build to a dashboard showing data for the servers running that build. Traversing this chain of services on the web is only a matter of clicking hyperlinks. But the equivalent CLI tools are each separate programs on the user’s computer. Each needs to be individually downloaded and installed. And each is likely maintained by separate teams and perhaps entirely separate companies. As the number of links between disparate areas in a user’s workflow increases, the potential value of the web UI experience increases relative to the CLI.

Note that none of this precludes interoperability between CLI tools and web UIs. You should be open to workflows where users move seamlessly between the web UI and the CLI. This means including hyperlinks where appropriate in the CLI; and references to CLI commands, download links, and the like in the web UI.

What the web UI should NOT do

Please, don’t reproduce the entire CLI

Reproducing 100% of the functionality of a mature CLI is a fool’s errand. The CLI will often contain a ton of obscure functionality. It is valuable that this functionality exists, but that does not mean it rises to the level of needing to be reproduced in the web UI. For better or for worse, the CLI often specializes in making everything POSSIBLE. There is a relatively low amount of effort involved in exposing existing functionality on the CLI vs. exposing that same functionality in the web UI. The web UI shouldn’t be in the business of playing catch-up to a fast-evolving, complex CLI. Instead, think of the CLI as a safety net for your users. In a pinch, they can always do it in the CLI, especially if it is rarely needed or only needed by power users. This frees up the web UI to do what it does best, and keep the focus on areas where web UIs shine.

A very “feature-rich” bulk rename utility, suffering from trying to do too many things in a GUI. (Image from CorsairMediaServices)

Please, don’t overdo it

All of the tricks of GUIs can be used for evil more easily than they can be used for good. Increased visual capabilities often become a channel for noise more than a channel for increased understanding. Touch interactions can easily be less efficient and less intuitive than just pressing some keys. And using paternalistic UX to hide capabilities can lead to much more aggravation from users not finding what they want, compared to the minor cognitive benefits they may reap from seeing one fewer option in their dropdown menu.

To make matters worse, the historical abuse of these capabilities may already give your web UI a negative perception from the get-go with software professionals. Convincing them otherwise is an uphill battle that can only be won through the thoughtful application of these capabilities from a user’s first exposure to the web UI.

Make sure your visual tricks add valuable information and not noise. Don’t visualize information unless it’s a clear improvement over just text. Don’t add touch interactions unless there is a strong rationale or a precedent that your users will expect. Don’t make assumptions about what your user does or doesn’t want to do unless you have a good basis for it.

Only a Scumbag Steve would add unnecessary visual flourishes to a practical screen.

Ongoing ‘developments’

Let’s look at a couple of areas that give us new perspective on the traditional strengths and weaknesses of GUIs and CLIs.

CLI/GUI cross-pollination

The separation between CLI and GUI, or CLI and web UI, may be more of a historical fact than a necessary difference. It’s not clear (to the authors at least) why these worlds can’t be combined for mutual benefit. Perhaps some opportunities lie in wait here.

CLIs continue to explore the incorporation of graphical elements and other GUI-world patterns in order to make the best use of human mental circuitry, while keeping many of the CLI’s advantages intact.

Tools such as Google Cloud Platform offer CLI-like experiences in their web UIs, opening the doors to a world of workflows that support seamless interoperability between these two modes of interaction.

“Hey, you got your CLI in my web UI!”

Tools such as browsh push the graphical limits of CLI tools, ‘hacking’ the background color of empty characters to function as (very large) pixels.

Browsh uses a terminal to approximate the graphics of a web page. For use cases that can benefit from the other benefits of CLIs, such as being available everywhere, this can be very useful.
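The underlying trick is easy to sketch; this is an illustration of the general technique rather than browsh’s actual code. Set a background color with an ANSI escape code, print a couple of spaces, and you have painted one very large ‘pixel’:

    # paint a tiny red, green, and blue "image" out of background-colored spaces
    printf '\033[41m  \033[42m  \033[44m  \033[0m\n'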

Wttr.in makes great use of ASCII art, color, and tabular layout in a CLI tool. It also doesn’t require you to download a program at all, defying a typical limitation of CLI tools.

Web Automation

Currently CLI-based tools are the king of automation, even when it comes to making public services on the internet work with each other. But a new(ish) set of tools are making service-to-service automation easier every day, without the need to stick a machine you own in the middle. Tools such as IFTTT and Zapier simplify setting up service-to-service automation via a web UI that makes it easier to accomplish common use cases (e.g. ‘If this RSS feed updates, POST to this URL’.) And function-as-a-service offerings such as AWS Lambda and IBM Cloud Functions support more open-ended service-to-service automation through code.

Zapier supports chaining of web services in a manner reminiscent of CLI piping.
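For contrast, here is a hedged sketch of the traditional CLI-centric way to wire two services together, which is exactly the ‘machine you own in the middle’ that these newer tools let you skip (the feed URL, webhook URL, and script path are hypothetical):

    # run from cron every 15 minutes on a machine you maintain:
    # */15 * * * * /home/me/check-feed.sh

    # fetch the feed, and POST it to a webhook only if it has changed since last time
    curl -s https://example.com/feed.xml -o /tmp/feed-new.xml
    if ! cmp -s /tmp/feed-new.xml /tmp/feed-last.xml; then
      curl -s -X POST -d @/tmp/feed-new.xml https://hooks.example.com/notify
      mv /tmp/feed-new.xml /tmp/feed-last.xml
    fi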

Conclusion

Remember, the CLI is part of the user experience as much as the web UI is, even if UX traditionally considers the web UI more of its territory. The web UI and CLI should be complementary to each other, rather than redundant or in opposition to each other.

For designers, it is important to acknowledge and not downplay the prevalence and the reality of the strengths of the CLI. Web UIs that are poor replicas of the CLI experience only perpetuate the ‘GUIs are only for n00bs’, or even worse, the ‘GUIs aren’t necessary’ fallacy. Instead, if you fully appreciate the situations in which the web UI is the best tool for the job, it will allow the web UI to specialize and succeed in that narrower and more thoughtfully defined scope. That means the work put into creating the web UI will be more appreciated, both inside your company and with your users. And that should be something for everyone to be happy about.

Micah Linnemeier and Spencer Reynolds are UX designers at IBM Cloud, based in Austin, Texas. The above article is personal and does not necessarily represent IBM’s positions, strategies or opinions.

References

Broad credit is due to Jeff Johnson 2010, Designing with the Mind in Mind for being my first introduction to many of the human factors considerations described in this article.

[1] “Researchers at the University of Pennsylvania School of Medicine estimate that the human retina can transmit visual input at roughly 10 million bits per second, similar to an Ethernet connection.”
Koch K, McLean J, Segev R, et al. How much the eye tells the brain. Current Biology. 2006;16(14):1428–1434. doi:10.1016/j.cub.2006.05.056.
https://www.sciencedaily.com/releases/2006/07/060726180933.htm
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1564115/

[2] Edwin L. Hutchins, James D. Hollan, and Donald A. Norman. 1985. Direct manipulation interfaces. Hum.-Comput. Interact. 1, 4 (December 1985), 311–338. https://dl.acm.org/citation.cfm?id=1453235

[3] The first written language, Sumerian, is dated to around 3100 B.C. Gelb, Ignace J. “Sumerian language”. Encyclopædia Britannica Online. Encyclopædia Britannica.

[4] Ware 2012, Information Visualization, Third Edition: Perception for Design (Interactive Technologies)

[5] Tulving, E., & Osler, S. (1968). Effectiveness of retrieval cues in memory for words. Journal of Experimental Psychology, 77(4), 593–601.(http://alicekim.ca/osler%2068.pdf)

From the Cool Retro Terminal project. Credit Diego Pacheco.
