Designing an Alexa Skill for the Public Library

When designing the information architecture of a web or mobile app, we map out the user flow. We use visual design principles to guide the user through this flow to where they want and need to go. But what happens when our interface isn’t visual at all? What shifts in thinking do we as designers need to make? I decided to design an Alexa skill to find out.

Amazon has a great guide to designing an Alexa skill that I based this process on. I recommend checking it out!

  1. Establish the purpose and user stories
  2. Write scripts
  3. Develop the flow
  4. Get ready to build

I thought about web services I use that would translate well to voice. Since I often check my account on the public library’s website, I decided to design an Alexa skill for my public library.

Step One: Establish the Purpose and User Stories

This first step, as in any design process, is all about empathy. Here, the goal is to think of scenarios where the skill will be “useful and desirable” to users.

  • What is the purpose of the skill? Why will people want to use it? The purpose of the skill is for users to check their accounts, get important information, and use library services quickly and efficiently. Users will want to use the skill to check book due dates, fines, and status of reserved books. They will want to use the skill to renew or reserve books or add a book to their wish list.
  • What can a user do, or not do, with the skill? A user can do anything they could do through their account on the library website: view items checked out, items on hold, unpaid fines, view and add to reading history and wish lists, write and view ratings/reviews, and renew books.
  • What information is the person expected to have available? Since library card numbers are long and difficult to memorize, a user-friendly voice interface would remember the user each time. There may be multiple users in a household so Alexa may clarify whose account she is checking based on the user’s first name. Library account information is not highly sensitive, so there is not a privacy issue.
  • What are the ways a user can invoke the skill? Users can invoke the skill by asking about the status of a specific book, or for a more generalized account overview. Users can also invoke the skill by asking to place a book on hold or renew a book.

Step Two: Write Scripts

So far, the design process has been very similar to how we might begin to think about any digital product. Step two, writing scripts, is where voice design is unique. The way we speak and listen is inherently different from the way we read, write, and look. Rather than thinking about user flow as a series of buttons and links to be clicked, we have to think about it as a conversation. Amazon offers these tips:

  • Keep interactions brief.
  • Write for how people talk, instead of how they read and write.
  • Avoid repetitive phrases.
  • Indicate when the user needs to provide information.
  • Don’t assume that the user knows what to do or what will happen.
  • Clearly present options.
  • In general, provide no more than three choices at a time.
  • Ask for information one piece at a time.

Here’s an example script for my library Alexa skill:

User: “Alexa, ask SAPL to place a hold on the book Little Fires Everywhere.”

Alexa: “You have placed a hold on Little Fires Everywhere. You are number 93 of 93 on the list. Would you like to check the status of your other holds?”

User: “Yes.”

Alexa: “You are number 30 out of 45 on the list for The Nightingale. You are number 2 out of 27 on the list for The Lying Game. Would you like to place another book on hold?”

User: “No, thanks.”

Notice that because the user asked about something specific, Alexa is presenting related options (and, as recommended by Amazon, she is only offering a couple choices at a time). The interaction is very brief. In about 30 seconds, the user is able to complete a task and learn important information about their account. Alexa is speaking conversationally, using phrases like “Would you like to?” In a visual interface, this polite phrasing wouldn’t be necessary, and might even be considered superfluous, but it is reflective of typical conversation.
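
To make the script a little more concrete, here is a minimal sketch, in plain Python, of how the skill might assemble Alexa's reply to the hold request. Everything here is illustrative: the function name, parameters, and data are my own stand-ins, and a real skill would return this text through the Alexa Skills Kit rather than printing it.

```python
# A minimal sketch, in plain Python, of composing the skill's reply to a hold
# request. The function name and parameters are hypothetical stand-ins; a real
# skill would return this text through the Alexa Skills Kit.

def build_hold_response(title: str, position: int, queue_length: int,
                        other_holds: int) -> str:
    """Compose a brief confirmation plus, at most, one follow-up prompt."""
    speech = (
        f"You have placed a hold on {title}. "
        f"You are number {position} of {queue_length} on the list."
    )
    # Offer a single related follow-up only when the user actually has other
    # holds, keeping the interaction brief and the choices limited.
    if other_holds > 0:
        speech += " Would you like to check the status of your other holds?"
    return speech


if __name__ == "__main__":
    print(build_hold_response("Little Fires Everywhere", 93, 93, other_holds=2))
```

The conditional follow-up is one way to honor Amazon's advice above: keep the interaction brief and offer only a couple of related choices at a time.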

Step Three: Develop the Flow

In their design guide, Amazon cautions that “A basic script doesn’t fully represent how people will interact with your skill in real life. Users may say too little, too much, or say things that you weren’t expecting.” With a traditional visual user interface, we have much greater control over how our users interact with the product. User flows can be pretty straightforward depending on the product; there are only so many things the user can click on, after all. User flows in voice design, however, have to plan for the unpredictable nature of human speech.

Let’s look again at the script I planned in step two. While Alexa’s side of the conversation would rarely vary, there are infinite phrasings possible within human speech. Another user may be seeking the same information but phrase it in a totally different way. For example, a user might say, “Alexa, reserve Little Fires Everywhere” or “Alexa, put me on the list for Little Fires Everywhere.” A successful Alexa skill would need to be programmed to anticipate and respond to these variations in phrasing and grammar.
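
In an actual Alexa skill, these variations are handled by declaring sample utterances with a slot (for example, a {BookTitle} slot) in the interaction model, and Alexa's speech recognition does the matching. The toy matcher below is only a rough illustration of the idea in plain Python; the sample phrases and function name are hypothetical.

```python
# A toy illustration of variant phrasings mapping to one "place a hold" intent
# with a {BookTitle} slot. In a real skill the sample utterances live in the
# Alexa interaction model and Alexa does the matching; this sketch only shows
# the idea of many phrasings resolving to one intent.
import re

PLACE_HOLD_SAMPLES = [
    "place a hold on the book {BookTitle}",
    "reserve {BookTitle}",
    "put me on the list for {BookTitle}",
    "put a hold on {BookTitle}",
]

def match_place_hold(utterance: str):
    """Return the book title if the utterance matches any sample phrasing."""
    for sample in PLACE_HOLD_SAMPLES:
        # Turn the sample into a regex where {BookTitle} captures the title.
        pattern = re.escape(sample).replace(re.escape("{BookTitle}"), "(.+)")
        match = re.fullmatch(pattern, utterance.strip().lower())
        if match:
            return match.group(1)
    return None

print(match_place_hold("Reserve Little Fires Everywhere"))
# -> little fires everywhere
```

In practice, covering variations in phrasing and grammar is a matter of listing enough sample utterances in the interaction model rather than writing matchers like this one.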

A user might also leave out important information. What if, for example, a user remembered the title incorrectly? Let’s imagine what would happen if the user said, “Alexa, ask SAPL to place a hold on the book Little Fires All Around.” After searching the library database and not finding a match, the skill should still seek to engage the user and help them accomplish their task. We want to be careful not to place blame on the user but instead help them to do what they set out to do. Perhaps the user flow might look like this:

User: “Alexa, ask SAPL to place a hold on the book Little Fires All Around.”

Alexa: “Hm, I didn’t find that book. Do you know the author’s name?”

User: “No.”

Alexa: “Do you want me to search for a key word in the title?”

User: “Yes, search ‘fires.’”

Alexa: “I found the book Little Fires Everywhere. Would you like to place a hold on this book?”

User: “Yes, please.”

Alexa: “You have placed a hold on Little Fires Everywhere. You are number 93 of 93 on the list. Would you like to check the status of your other holds?”

User: “No.”

The skill attempts to problem solve by asking the user for alternative search terms, such as the author’s name or a title keyword. This alternative pathway allows the user to still accomplish the task, with minimal added time and no blame placed on the user.
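
Here is a hypothetical sketch of that recovery logic in plain Python. The catalog dictionary and function names stand in for the library's real catalog API, which I haven't specified; the prompts mirror the script above.

```python
# A hypothetical sketch of the recovery flow above in plain Python. The catalog
# dictionary and function names stand in for the library's real catalog API;
# the prompts mirror the script.

def handle_hold_request(title: str, catalog: dict) -> str:
    """Return the next thing the skill should say for a hold request."""
    if title in catalog:
        return f"You have placed a hold on {title}."
    # No exact match: offer a way forward instead of blaming the user.
    return "Hm, I didn't find that book. Do you know the author's name?"

def handle_keyword_search(keyword: str, catalog: dict) -> str:
    """Suggest the first title containing the keyword, if any."""
    matches = [title for title in catalog if keyword.lower() in title.lower()]
    if matches:
        return (f"I found the book {matches[0]}. "
                "Would you like to place a hold on this book?")
    return f"I couldn't find any titles containing '{keyword}'."

catalog = {"Little Fires Everywhere": "Celeste Ng"}
print(handle_hold_request("Little Fires All Around", catalog))
print(handle_keyword_search("fires", catalog))
```

A full skill would also track the yes/no turns in between, which I've left out of this sketch.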

Step Four: Get Ready to Build

In this step, we determine the “intents,” or unique things the skill is able to do. We also determine “utterances,” which are the things users will say to engage with the content. For my library skill, I have determined three primary intents:

  • Place a book on hold
  • Check my account status
  • Renew a book

In further iterations, I might add intents such as reviewing books or paying fines, but I would want to start with simple functionality that allows for easy user testing. For each intent, I have identified the utterances that users will say to Alexa:

  • Place a book on hold: “Reserve a book,” “Place a hold,” “Place a hold on the book The Nightingale,” “Reserve the book The Nightingale,” “Put me on the list for The Nightingale,” “Put a hold on The Nightingale.”
  • Check my account status: “Check my account status,” “Tell me about my library account,” “Account details,” “Account information,” “My account.”
  • Renew a book: “Renew Little Fires Everywhere,” “Renew,” “Renew a book,” “Renew my book,” “Change a due date.”
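
As a rough sketch, the intents and utterances above might be organized like this. Plain Python stands in for the JSON interaction model an Alexa skill actually uses, and the intent names and the {BookTitle} slot are my own illustrative choices.

```python
# A rough sketch of the three intents and their sample utterances as plain
# Python data. In an actual skill these would be declared in the Alexa
# interaction model; the intent names and the {BookTitle} slot are my own
# illustrative choices.

INTENTS = {
    "PlaceHoldIntent": [
        "reserve a book",
        "place a hold",
        "place a hold on the book {BookTitle}",
        "reserve the book {BookTitle}",
        "put me on the list for {BookTitle}",
        "put a hold on {BookTitle}",
    ],
    "CheckAccountIntent": [
        "check my account status",
        "tell me about my library account",
        "account details",
        "account information",
        "my account",
    ],
    "RenewBookIntent": [
        "renew {BookTitle}",
        "renew",
        "renew a book",
        "renew my book",
        "change a due date",
    ],
}

# Quick sanity check that every intent has sample utterances defined.
for intent, samples in INTENTS.items():
    print(f"{intent}: {len(samples)} sample utterances")
```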

After completing these steps, I would test my user flow with users to determine which utterances I did not account for. In this case, user testing might look like me telling users which task (intent) they are trying to accomplish and then playing the role of Alexa myself as they navigate their way through the skill. It may be beneficial to have pre-recorded responses, since users may interact with a human differently than they would with a robot.

While the user flow can be more complicated and additional considerations need to be made, this process showed me that the basic principles of UX apply to voice design just as they do to visual design. As always, we empathize with the user and keep their goals and intents at the forefront of our work. Good design is good design no matter the output.