This article covers primary UX research with Alexa users in London, what we learned and how designers, marketers and product owners can build relevant voice experiences.
It’s Valentine’s Day, and you’re anticipating a new relationship in your life with Alexa. You look at her, on the kitchen counter, amongst the salt and pepper shakers. She looks like an awkward teenager at her first school disco. You say, “Alexa, what now?” Her blue light flashes. You smile in anticipation. She replies, “Now is usually defined as the momentary present”.
The reality of living with a voice assistant is not the romantic AI future painted in films and television. Alexa is not going to start your car and help you chase down criminals anytime soon. Though imperfect, today’s voice assistants do serve a very critical purpose. They’ve successfully introduced voice interactions to the public.
This first phase of mass voice interactivity has concentrated on skills and utility. As expected, most current research measures the types of skills developed and their overall popularity. We know which skills are used most often. This understanding of lead use cases will inform future development and improvement. Amazon, for all we know, is constantly capturing and analysing the user data generated by the high penetration of its devices, and will accelerate the technology in the years to come.
Our Quest for User Research
Data only measure what people are doing. As a qualitative researcher and a designer, we were keen to know how interactions manifest, why Alexa matters to its users and what impact these devices have on their lives. It is in these human-centred interactions that we can understand people’s relationships with the technology and identify other creative design opportunities.
To answer this, we ran a self-funded project which included six 90-minute home interviews, secondary research and expert interviews. We focused on Londoners who had had Alexa for longer than 6 months. Our sample included households with young families, couples without children, and singles. The youngest Alexas were 6 months old; the oldest was 2.5 years old.
Young Technology Has Its Limits
Alexa arrived in the UK in September 2016. At the time of our interviews, the Echos in use were very young. As with many new technologies, we expected the people we spoke with to be early adopters who loved exploring and testing new gadgets. We discovered something very different.
All of the participants interviewed described themselves as average technology users. They liked the idea of operating technology through voice. They wanted to break away from the constant hand-holding and screen gazing we’ve become so attached to with our phones. They believed it would be more convenient, fun and, in some cases, educational for their kids.
In reality, our users described an imperfect system. They struggled both to be understood and to find the right commands to perform tasks. One user described the focus required quite well:
“I want [my device] to speak like me. It’s tiring to have to speak like her. It’s like dealing with a foreign exchange student all the time.” [Luke, 29]
Users also lost interest in devices and upgrades because they perceived them to be a lot of work.
“I don’t need any more devices or upgrades. It’s great for music and not much else.” [Poppy, 42]
In our secondary research, which included reading countless reviews, we found Alexa’s limited grasp of natural language to be a common criticism. Our participants described the assistant as a novelty: “it’s a great digital radio”, “it’s a bot”, “it’s 20% a cat”.
We often read about the rapid penetration of devices, the robust library of skills, the absolute convenience of smart plugs and Hue lights. However, all of our interviewees seemed disengaged from this plethora of possibilities. This contrast between potential and reality inspired us to map the learning journey.
Life with Alexa: An Abrupt Honeymoon
Relationships are built on a learning journey of curiosity, trial, error and reflection. The more often this happens, the more confident and knowledgeable we become as technology users. In our research, the Alexa relationship plateaus rather quickly.
Our map of this journey shows two themes. The first is experimentation. The people we interviewed were most excited and willing to experiment when they first brought their Alexa home. Very quickly they found key use cases for which voice-driven commands worked well: round-ups of news headlines, music playback, and built-in utilities like timers, alarms and weather. These convenient, hands-free applications are often the inspiration for YouTube tutorials.
We call this period of settling “the comfort zone”. People tend to stick to these features and not go much further. The desire to experiment drops dramatically. A year into owning their devices, none of our participants seemed engaged in learning. Amazon’s email updates were left unread, the mobile app was hardly used, and new skills were never explored online. Our interviewees were not motivated to engage in Amazon’s digital ecosystem to broaden their learning. Their experience remained predominantly informed by Alexa’s voice interface.
The only time our interviewees were truly motivated to experiment and learn was during social occasions. We were told friends would visit and want to play with Alexa. We also learned that music features became especially useful at dinner parties, where guests could ‘play DJ’. However, immediately after the social event, experimentation dropped again.
The second theme we heard was about the effort to learn new things. People spoke of a point of exhaustion where they lost interest. The payout did not equal the effort.
Why the Relationship Stalls
Most technologies have established design patterns, like sliders and the triangle play button, which transfer across devices and interfaces. This makes new applications and interfaces accessible and familiar. These visual design patterns don’t transfer naturally into auditory cues. Voice is a medium with low affordance. Here we have to create a new cadence of design; however, the challenge is significant because language is idiosyncratic.
In the current voice experience, each new action requires the user to remember the right command. For example, if you need to turn on your Dyson fan, you need to say, ‘Alexa, ask Dyson to set the speed to 3’. Though you can also ‘set your Nest thermostat’, you may not naturally associate turning up the heat with “setting the temperature”. The cognitive load is akin to learning a new dialect. The more devices connected, the higher the cognitive load.
Design Brief: Light a Spark and Grow it into a Fire
As of November 2018, Voicebot.ai identified nearly 46,000 skills in the US and 25,000 in the UK. The challenge for brands and designers is to develop a ‘voice’ experience that breaks through the clutter and builds a rewarding habit loop.
We identified three opportunities where this could be done: personal engagement, social opportunities and feedback loops.
Use Emotion to Engage People
Fundamentally, people will continue to engage with a product or service when they find it truly rewarding. Though it is possible to buy an Alexa with a screen, the primary use case for voice assistants is still grounded in audio. People we spoke with, during the interviews and since, have identified music as the most enjoyable skill offered by their voice assistants. We believe this is just the beginning.
Visual media are useful for presenting data and complex information. Audio is very powerful in triggering emotions. A good voice designer rewards users by evoking feelings: joy, excitement, focus or love, for example. Exploratory research can help us understand the current emotional connections associated with a category or a brand. Voice design practice can help us plan and test the conversations that will bring people back.
Make it Social
In our research, we learned that social experiences triggered renewed experimentation with voice assistants. “Dropping in” on distant family members was also a source of amusement.
“Sometimes [my kids] drop into their cousins. It’s pretty funny and cool for them to have a way to connect without the adults.” [Stephanie, 42]
In this example, Alexa acted almost as an antidote to the isolation and one-upmanship associated with today’s mobile-on world. The skills and features facilitated interpersonal interactions. At dinner parties, everyone could play DJ, and at playdates, children relished the opportunity to discover and give Alexa commands on demand.
This new dimension offered by voice assistants is exciting and should be explored further.
How do we facilitate opportunities that build on these social events over time?
Establish a Feedback Loop
The voice assistant explosion is unique in its ability to reach mass-market consumers so quickly. Other technologies developed their feedback loops over years, starting with a select, often patient, audience.
This audience forms a community that feeds back to the developer, shares experiences and establishes best practices.
A common loop emerges. Failures are reported, acknowledged and, in some way, addressed. In software and app development, this feedback loop is communicated to users through either pushed alerts or updates.
Today, voice assistant users understand the technology is young. They are ready and willing to share their experience with Amazon. Though the Alexa app provides a platform to troubleshoot ‘failures’, it requires the user to take the extra step of picking up another device. Our participants weren’t invested enough to take out their mobile phones to troubleshoot their hands-free, device-free experiences. Within the voice format, users didn’t know whether errors or ‘conversational misunderstandings’ were acknowledged, captured and fixed. The auditory feedback loop is non-existent, and so people complained that Alexa does not understand them or, worse, is ‘cold’, ‘rude’ or simply ‘dense’.
If voice assistants are to evolve, it needs to be easy for users to provide qualitative feedback within the voice interaction. Individuals should be able to use voice itself to easily share, access and review performance-related issues, without the need for an additional device. Ideally, as the technology improves, they should also be able to track the progress and resolution of these issues through voice.
Some Final Thoughts
We have seen that relationships are being formed with Alexa, but these relationships are limited by the current experience. This is expected in the early years of a new interface where use cases are still being tested and improved.
Users find an immediate ‘comfort zone’ when using convenient utilities that help make simple tasks hands-free. However, experimentation drops because subsequent interactions are hard. Commands are tricky to recall, and failures are not verbally acknowledged.
Moments of experimentation are linked to social occasions and sharable activities. People connect over music and Alexa makes this frictionless. Children congregate around Alexa because it can facilitate group play and discovery. Adults felt a presence because a voice responded to their calls.
As we look to the future, perhaps the role of a voice assistant can evolve from a utility-driven productivity tool to a facilitator of human interactions. Within this scenario, the balance of giving and taking, asking and telling, curating and producing can bring us closer to the user experience we’re seeking.