Return of the Secretary

Gender, ethics, and the conversational user interface

Earlier this year, a six-year-old girl used her family’s Amazon Alexa to order four pounds of cookies and a dollhouse. Her parents were bewildered when the packages showed up at their door. Then, a news channel reporting on the story awakened Alexas in the homes of people watching the news on TV. Scores of these Alexas allegedly ordered yet more dollhouses.

Amazon maintains that none of the orders initiated via the news channel went through, but the truth is hard to uncover because at least one article about the saga has mysteriously disappeared from the internet. All we know for certain is this: Alexa inadvertently ordered either one dollhouse or multiple dollhouses.

I can see why Amazon might move quickly to set the record straight, but I find the truth uninteresting. For me, the magnitude of Alexa’s mess-up isn’t as fascinating as the fact that we live in a world in which we expect Alexa not to mess up. Folks! Pause to consider the fact that our collective reality now includes intelligent machines that can talk to you, listen to you, perhaps even decipher the intent lurking in your words. That six-year-old must have really wanted her cookies.

A conversational user interface (CUI) allows people to interact with a machine via some combination of the written word, the spoken word, and nonverbal sounds. For its users, the CUI simulates the experience of communicating with another human being.

The most prevalent interface today is the graphical user interface (GUI), or the screen. Screens are so ubiquitous that toddlers are left confused by magazines that can’t be swiped. But some tech insiders believe that the graphical user interface is on its way out. “At first, there were desktops, then came laptops, then tablets and smartphones, then smartwatches,” one of my professors likes to say. “The screen is shrinking, and soon, it will vanish altogether.”

What will replace it? Perhaps the CUI.

The conversation market is ripe for innovation, and all the big players have waded in. Google Assistant, Amazon’s Alexa, Microsoft’s Cortana, and Apple’s Siri are all guided by the same idea: why press buttons and navigate menus when you might simply ask for what you want?

The first version of a new technology is seldom good. Think back, if you can, to the first cell phone. It was as large and heavy as a brick. You couldn’t carry it around in your pocket. It didn’t do very much, did it? Think of the first computer, the first airplane, the first car. Same story. And so it is with CUIs. Alexa messes up orders, and Siri cracks jokes in a monotone. Even as teams of gifted designers and programmers shift their focus to CUIs, the work is still more exploratory than definitive. One might say that the conversation revolution has entered its awkward-teenager-with-freckles-and-a-slouch phase.

CUI design is an arena without best practices, a town burgeoning without plans. But the ethical issue at its heart is classically old. Here is a question that all conversation designers must ask themselves early on:

Should our CUI be male or female?

Research by Stanford’s Clifford Nass suggests that people consider a female voice to be more helpful than a male one. According to a voice designer at Amazon, that’s why the company created an Alexa instead of an Alex. Apple’s Siri and Microsoft’s Cortana are also manifestly female. At first glance, Google seems to stand apart; its CUI has the gender-agnostic name Assistant. When Google Assistant speaks, however, its voice is female.

On its own, no individual CUI is explicitly problematic. Bunched together, however, Alexa et al. might be pushing us back into a toxic, sexist past. The problem lies with the putative role of these CUIs in our lives. All the CUIs mentioned above are marketed as virtual assistants: robots whose job it is to grant our everyday requests. An assistant with a female voice might imply reliability, but it also reinforces the stereotype of female subservience. With a market full of docile female CUIs, the technology industry has handed its unwitting users another generation of female actors in supporting roles.

According to Sheryl Brahnam, a researcher at Missouri State University, up to 50% of all interactions with conversational agents involve subversion, mockery, or general meanness. People abuse their CUIs surprisingly often, and it’s worth asking how much of this abusive behavior stems from frustration with the failings of a first-generation technology.

It’s also worth wondering why the manufacturers of these technologies appear to be abetting abusive behavior. Some invitations to abuse are programmed directly into the interfaces. For example, cursing at Alexa elicits either feigned ignorance (from a Reddit user: “I have told it to ‘shut your whore mouth’ and it stops playing music.”) or a meek response (“That’s not very nice to say.”). Other invitations to abuse are more insidious. In a 2015 ad, Apple marketed Siri as a bashful woman that a famous man might flirt with.

Products trace the arc of power. Ordinarily, the product falls in line; sometimes, it resists the norms; on rare occasions, it upends them altogether. For the conversational interface, the trend so far has been bleak. Our current CUIs aren’t smashing the patriarchy; they’re imbibing it.

Perhaps it is naive to expect any product targeting a mass audience to resist the tide of what people want. If market research suggests that female voices make for better assistants, and if Amazon and Apple and Microsoft and Google really want their CUIs to flourish, it seems logical, one might argue, that they would all choose to create high-tech equivalents of the mid-century office secretary.

But what if the research is misleading? What if we’ve taken Clifford Nass’s findings too far? I am a user experience designer, and I know not to draw too boldly on research that has come before. Products need to be tested with their actual users. I’m not privy to user testing data from any of the above companies, but I’m willing to bet that user satisfaction is nowhere near what it should be. And maybe — just maybe — that’s partly because our tech giants have got the gender thing wrong.

Maybe CUIs shouldn’t be female and meek. Conversational agents are designed to imitate real people, and most people we encounter in our daily lives aren’t subservient women. They are bros in pink tees, girls with tattoos, fathers with high-pitched voices, Asian women with pink hair, grandpas with piercings, ladies with spunk. Real people are refreshing and unique and singularly interesting, and the conversational interface must recreate these qualities if it is to succeed.

I believe that the CUI revolution needs to reimagine itself, not just for moral reasons but also for fiscal ones, because when a user hurls abuse at a timid female CUI, it is not just society that bleeds, but the manufacturers of that CUI as well.

Imagine a conversational future beyond Alexa and Siri and Cortana and Assistant. What might it look like? Here are some ideas:

Trevor knows a ton about queer theory and shoes. He helps you shop for clothing, at retail stores and online, and he speaks his mind when he believes something will look hideous on you.
Jacquise speaks with all the lilt of his musical training in the deep South. He tells you what roads to take to unfamiliar destinations, and he picks out the best jazz for your tired ears.
Yin speaks impeccable English with a Singaporean accent. They track your investment portfolio across markets in Asia, Europe, and the Americas. When you wake up, they are always ready with the morning news.
Raven knows a ton about machinery. With her on your side, fixing your car or your oven or your garage door is a piece of cake. Also, she is not at all partial toward Subarus.
Beverly has read every book ever written. Call her anytime for anecdotes, inspirational quotes, or a citation. If you’re nice, she might share with you her recipe for lemon meringue pie.

Would you wish to be part of such a future? Yes? Sing along. Hop aboard. Let the user testing begin.