Building Games For Voice

Published in Well Red, Jul 31, 2018

Smart speakers and devices with voice interfaces are becoming increasingly common in households across the world.

In 2017, total device shipments were estimated at 433.1 million units worldwide, an increase of 27.6% from the previous year, and they are expected to grow to 939.7 million units by 2022.

The biggest challenge with this new medium is moving from visual and physical interactions to conversations. That shift is hard enough when translating UI-driven applications, but it can be especially tough for game designers and developers, who focus on creating games with varied rules and stories.

Ground rules for voice

Before getting into game design, it’s worth knowing some basics about building voice-driven experiences. Voice design is an emerging field, and we’re likely to see lots of change over the next several years as people learn to build better conversational interfaces. However, there are a few solid guidelines that aid in designing heavily interactive voice experiences.

Explicit options

Present users with the options available to them in order to progress, and do so in the way you want them to be spoken back to you. Users shouldn’t have to guess what to do when prompted to speak; they should have a clear understanding of what they can say and how they should say it.

There are exceptions to this rule, however. If you want to surprise users, you can have hidden options for people who try to outthink the application and say something different. It’s fun to think of the crazy things a user might say at any given point.

Another exception is when you provide general purpose actions the user can trigger at any time, like quitting or asking to have options repeated. You can give these options to the user up front and then assume they will keep them in mind if needed later on. That said, it doesn’t hurt to give them the ability to ask for help if they want to hear all of the general purpose commands again.
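As a rough illustration of general purpose actions, here is a minimal sketch that checks for them before the current scene gets a chance to respond. The command names, phrases, and the `state.lastPrompt` field are all assumptions made up for this example, not part of any platform's API.

```javascript
// Hypothetical general purpose commands that work at any point in the game.
const GLOBAL_COMMANDS = {
  help: ["help", "what can i say"],
  repeat: ["repeat", "say that again"],
  quit: ["quit", "stop", "exit"],
};

// Returns the name of the matching global command, or null if none matches.
function matchGlobalCommand(utterance) {
  const spoken = utterance.trim().toLowerCase();
  for (const [command, phrases] of Object.entries(GLOBAL_COMMANDS)) {
    if (phrases.includes(spoken)) return command;
  }
  return null;
}

// Decides what to say next. `state.lastPrompt` (an assumed field) holds
// the most recent prompt so it can be repeated on request.
function respond(utterance, state) {
  switch (matchGlobalCommand(utterance)) {
    case "help":
      return "At any time you can say help, repeat, or quit.";
    case "repeat":
      return state.lastPrompt;
    case "quit":
      return "Thanks for playing!";
    default:
      return null; // Not a global command: let the current scene handle it.
  }
}
```

A `null` result means the utterance should fall through to whatever options the current choice point offers.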

Give alternatives

Provide as many alternative ways to say something as possible. Even though you present a clear option, you can’t be sure that users will phrase it exactly the same way. For example, you may want users to say “Move left”, but they could easily say “Go left”, “left”, or even throw a curveball like “Move west”.

Anticipate responses like these as much as you can so users don’t hit roadblocks and their experience stays seamless. Few things are as frustrating as being unable to communicate, and this is further amplified when talking to a machine.
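To make the “Move left” example concrete, here is a small sketch that maps several phrasings onto one canonical action. The phrase lists and function names are assumptions for illustration. On a real voice platform you would usually register these alternatives in the platform's own interaction model rather than match strings yourself.

```javascript
// Hypothetical table of canonical actions and the phrasings we accept for each.
const ACTION_PHRASES = {
  left: ["move left", "go left", "left", "move west", "go west", "west"],
  right: ["move right", "go right", "right", "move east", "go east", "east"],
};

// Lowercases the utterance, strips punctuation, and collapses whitespace.
function normalize(utterance) {
  return utterance
    .toLowerCase()
    .replace(/[^a-z\s]/g, " ")
    .replace(/\s+/g, " ")
    .trim();
}

// Returns the canonical action for an utterance, or null if nothing matches.
function resolveAction(utterance) {
  const spoken = normalize(utterance);
  for (const [action, phrases] of Object.entries(ACTION_PHRASES)) {
    if (phrases.includes(spoken)) return action;
  }
  return null;
}
```

With this table, `resolveAction("Go left")` and `resolveAction("Move west!")` both resolve to the same `left` action, so the game logic only ever has to deal with one name per action.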

Keep it simple

Don’t overwhelm players with choices. Amazon recommends giving users only three options at a time, which is a good rule of thumb to stick to. Hidden secret options and general purpose actions that don’t need to be stated are still okay, however.

Also, be sure to keep the actions themselves succinct. Keep them between one and three words; don’t make users repeat a whole sentence.

The gist is that you don’t want users to feel as if they’re listening to the options in a phone directory. If they have to wait too long, they will check out mentally and will probably move on to something else.
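One way to keep yourself honest about these limits is to enforce them when prompts are built. The sketch below is only an illustration: the function names and the choice to throw an error are our assumptions, and the limits simply mirror the guidelines above.

```javascript
const MAX_OPTIONS = 3; // at most three options per prompt
const MAX_WORDS_PER_OPTION = 3; // each option is one to three words

// Joins options into a natural spoken list: "a", "a or b", "a, b, or c".
function joinOptions(options) {
  if (options.length === 1) return options[0];
  if (options.length === 2) return `${options[0]} or ${options[1]}`;
  return `${options.slice(0, -1).join(", ")}, or ${options[options.length - 1]}`;
}

// Builds the spoken prompt, rejecting option lists that break the guidelines.
function buildPrompt(question, options) {
  if (options.length === 0 || options.length > MAX_OPTIONS) {
    throw new Error(`Expected 1 to ${MAX_OPTIONS} options, got ${options.length}`);
  }
  for (const option of options) {
    const wordCount = option.trim().split(/\s+/).length;
    if (wordCount > MAX_WORDS_PER_OPTION) {
      throw new Error(`Option "${option}" is longer than ${MAX_WORDS_PER_OPTION} words`);
    }
  }
  return `${question} You can say ${joinOptions(options)}.`;
}
```

Failing loudly during development means an overlong prompt gets caught by whoever is writing the story, rather than by a player who has already stopped listening.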

Rethinking game design

Translating visuals to voice

Since traditional game design is built around things like graphics, physics, and underlying systems that will challenge players, you really have to step back and rethink how to design a game for voice. It’s kind of like trying to turn a movie into a radio play.

Since users generally won’t be playing with any form of physical input, and they won’t be looking at anything visual, the kinds of games that work best in this new format are narrative-driven experiences or social party style games.

When looking at voice game design ourselves, we referred back to classic gaming experiences like text adventure games and choose your own adventure novels. These types of games immerse players in a story where they direct and determine the outcome.

Prototyping

Our first foray into voice gaming was to build a choose your own adventure prototype where the player was a superhero who could choose the appropriate power to solve various problems.

At each choice point, players are given the explicit options of flying, using super strength, or shrinking down. There is also a hidden fourth option at each choice point to just be a regular person.

For an example of how this plays out, one choice point has the player trying to get across a city to stop a super villain, but they are blocked by a massive traffic jam. They can choose to fly over everything, use super strength to smash through the blockade of cars, or shrink down and dodge in between vehicles. The hidden option allows players to direct traffic and proceed peacefully.
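A choice point like this could be modeled roughly as follows. This is a sketch rather than our prototype's actual code: the object shape, the outcome lines, and the function names are invented for illustration. The key idea is that hidden options are accepted as answers but never read out.

```javascript
// Hypothetical data for the traffic jam choice point.
const trafficJam = {
  prompt: "A massive traffic jam blocks your way across the city.",
  options: {
    fly: "You soar over the gridlock.",
    "super strength": "You smash through the blockade of cars.",
    shrink: "You shrink down and dodge in between the vehicles.",
  },
  // Accepted as an answer, but never announced to the player.
  hidden: {
    "direct traffic": "You direct traffic and proceed peacefully.",
  },
};

const has = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);

// Builds the spoken prompt from the explicit options only.
function describe(choicePoint) {
  const names = Object.keys(choicePoint.options);
  const list = `${names.slice(0, -1).join(", ")}, or ${names[names.length - 1]}`;
  return `${choicePoint.prompt} You can say ${list}.`;
}

// Resolves an utterance against explicit options first, then hidden ones.
function choose(choicePoint, utterance) {
  const spoken = utterance.trim().toLowerCase();
  if (has(choicePoint.options, spoken)) {
    return { outcome: choicePoint.options[spoken], secret: false };
  }
  if (has(choicePoint.hidden, spoken)) {
    return { outcome: choicePoint.hidden[spoken], secret: true };
  }
  return null; // Unrecognized: re-prompt with describe().
}
```

The `secret` flag lets the game react differently when a player finds the hidden path, for example with a special sound cue.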

Players go through a series of these choices until they reach a faceoff with the super villain and save the day. From there, they can repeat the experience and try out the other options.

Learnings

This process taught us a lot about voice game design. The two biggest takeaways were around sound design and player agency.

Something that became clear quickly was that sound design is crucial in voice-driven games. It’s one of the key methods of immersing players in a story. Essentially, it has to make up for the lack of graphics. It doesn’t hurt to listen to radio plays, or to Graphic Audio’s audiobooks for reference.

The other big takeaway was that this gameplay structure is a great way to immerse players and give them agency over how everything plays out. There are graphical and written choose your own adventure stories out there, but when you speak out the commands there is a personal connection you get with the game that you don’t get by clicking a button or flipping a page.

This personal connection is what drove us to add a secret path for players, because discovering it feels like you’ve uncovered something special of your own. This aspect is going to be where voice games make their mark.

Tools and technology

While working on the prototype we had two key technical goals: we wanted to support both Amazon Alexa devices and Google Home devices, and we wanted a game engine that could support new stories without having to write all the code from scratch.

Cross platform support

To achieve cross platform support, we used the Jovo framework.

https://github.com/jovotech/jovo-framework-nodejs

It’s a Node.js based framework that acts as an abstraction layer above the Alexa and Google Assistant APIs. This means you write your voice app code once, and it works on both platforms. This vastly simplifies voice app development, and saves a ton of time.

You still have to set up platform specific intents and hook them up to your Jovo based backend, but the code itself is only written once without any porting needed.

Jovo also exposes platform specific features, like displaying video on Amazon Echo Show devices, so you can still support special features for those users as well!
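To show the general idea of an abstraction layer, here is a tiny sketch that does not use Jovo at all and is not how Jovo is implemented. The intent names and adapter functions are hypothetical, and the request and response shapes are heavily trimmed-down illustrations loosely modeled on an Alexa custom skill payload and a Dialogflow-style webhook payload, so check each platform's documentation before relying on them.

```javascript
// Platform-neutral game logic: written once, knows nothing about Alexa or Google.
function handleIntent(intentName) {
  switch (intentName) {
    case "FlyIntent":
      return { speech: "You soar over the traffic.", endSession: false };
    case "QuitIntent":
      return { speech: "Thanks for playing!", endSession: true };
    default:
      return { speech: "Sorry, I didn't catch that.", endSession: false };
  }
}

// Hypothetical adapters. Each one reads the intent name from a simplified
// request and wraps the reply in a simplified response for that platform.
const adapters = {
  alexa: {
    intentName: (req) => req.request.intent.name,
    reply: ({ speech, endSession }) => ({
      version: "1.0",
      response: {
        outputSpeech: { type: "PlainText", text: speech },
        shouldEndSession: endSession,
      },
    }),
  },
  google: {
    intentName: (req) => req.queryResult.intent.displayName,
    // Ending the conversation is left out of this simplified response.
    reply: ({ speech }) => ({ fulfillmentText: speech }),
  },
};

// Single entry point: pick the adapter, run the shared logic, format the reply.
function handleRequest(platform, req) {
  const adapter = adapters[platform];
  if (!adapter) throw new Error(`Unsupported platform: ${platform}`);
  return adapter.reply(handleIntent(adapter.intentName(req)));
}
```

Only the adapters know about platform details, which is the same separation a framework gives you, and platform specific extras would live in the adapters too.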

Game expandability

To ensure that our game engine could support a variety of branching stories, we looked at dialogue tree editors and settled on Dialogger.

Dialogger is a simple dialogue tree editor that exports data in JSON format, which we can easily read in and use to drive the structure of the story.

We don’t write out the story inside Dialogger, but instead use key variables that bind to script lines and audio files in a separate JSON mapping file.

We also have it set up to set variable values when the user hits specific branches, allowing for anything from simple actions like setting flags to more advanced ones like score keeping. This allows for a lot of expandability down the road without having to rewrite everything.
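The sketch below shows how a data-driven story like this can work in principle. The node shape is an assumption invented for this example and is not Dialogger's actual export schema. The keys, script lines, and audio file names are likewise made up.

```javascript
// Hypothetical story graph. Text nodes carry a key, not the story text itself.
const nodes = [
  { id: "start", type: "text", key: "TRAFFIC_JAM",
    choices: { fly: "flew", "direct traffic": "directed" } },
  { id: "flew", type: "set", variable: "score", add: 1, next: "end" },
  { id: "directed", type: "set", variable: "foundSecret", value: true, next: "end" },
  { id: "end", type: "text", key: "FACEOFF", choices: {} },
];

// Hypothetical mapping file: binds each key to a script line and an audio file.
const scriptMap = {
  TRAFFIC_JAM: { line: "A traffic jam blocks the road.", audio: "traffic_jam.mp3" },
  FACEOFF: { line: "You face the super villain.", audio: "faceoff.mp3" },
};

const byId = new Map(nodes.map((node) => [node.id, node]));
const has = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);

// Follows "set" nodes, applying their variable changes, until a text node is reached.
function advance(nodeId, variables) {
  let node = byId.get(nodeId);
  while (node && node.type === "set") {
    if (has(node, "add")) {
      variables[node.variable] = (variables[node.variable] || 0) + node.add;
    } else {
      variables[node.variable] = node.value;
    }
    node = byId.get(node.next);
  }
  return node;
}

// Resolves the player's choice and returns the next text node with its script entry.
function step(currentId, utterance, variables) {
  const current = byId.get(currentId);
  const spoken = utterance.trim().toLowerCase();
  if (!has(current.choices, spoken)) return null;
  const next = advance(current.choices[spoken], variables);
  return { id: next.id, ...scriptMap[next.key] };
}
```

Because the engine only walks nodes and looks up keys, a new story is a matter of swapping in a different graph and mapping file rather than writing new code.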

Moving forward

It’s an exciting time to design and build voice games. Smart speaker adoption is increasing quickly, and the capabilities of the technology are developing at a rapid pace. Over the next few years, we are likely to see some genuine standout games that immerse players in totally new ways.

We can’t wait to see (and create) what comes next!

REDspace is a full-service digital studio specializing in web, mobile, gaming, and video solutions. We craft digital experiences that empower our clients, helping deliver their content to their audiences. redspace.com