Recipe for a great #VoiceFirst interactive story

Florian Hollandt
Published in #VoiceFirst Games
11 min read · Mar 4, 2018

After having tested and reviewed some amazing interactive story Skills, let’s wrap up the patterns and techniques you can use for your own voice game.

Wait, you haven’t ever produced an interactive story voice game!
What makes you think you can give advice?

Great point! I’m writing this article from a critic’s and analyst’s perspective. Over the past few weeks, I have played quite a number of interactive story games and written detailed reviews of four exceptional ones.
In my analyses, I recognized patterns and techniques that increase a game’s engagement and/or retention, as well as some pitfalls that reduce immersion. Now I want to share these insights with you!

You’ll probably get the most out of this article if you’re an independent or hobby voice app developer wondering how to implement your cool interactive story game idea, or a game with strong interactive story elements.
If you’re a fiction writer or someone who has already produced a successful interactive story game, you’ve probably given many of the points listed here at least some thought already. Either way, I’d love to hear what you think makes sense, where you disagree, and what might be missing.

Let’s get started!

Prerequisite: Some writing capacity

There’s no way around this: If you plan to produce an interactive story game, you will need to get quite some authoring work done. Possible alternatives are external (freelance?) authors, crowdsourcing or using an interactive story from a different medium, but you will have less flexibility with these.
You should also have at least a rough idea for a good plot, i.e. one solid storyline. You could do without a strong plot if you offer a sufficiently fascinating environment to explore. I can’t give you any recommendations about which genre is particularly well-suited, but I would avoid introducing too many characters and concepts (because the player can’t easily turn back and read about who and what X was), and I personally wouldn’t do fan fiction, in order not to violate copyrights.

Defining the shape of the story graph

An interactive story consists of scenes, each of which begins and/or ends with a player interaction. If you imagine each scene as a node, with arrows connecting each pair of consecutive nodes, you get the story graph (or story network). Let’s call each path from the start node to a ‘happy end’ node one storyline.

Figure 1: Two story graphs consisting of multiple (1a) and one single (1b) storylines

While having multiple storylines increases your voice game’s potential retention, it’s not mandatory: Several successful interactive games have only one storyline, and most of the techniques I discuss work irrespective of the number of storylines.
If you did make the effort of creating multiple storylines, you should make sure your players realize it and remain curious to explore them!
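To make the story graph idea concrete, here is a minimal sketch in Python. The scene names, the `STORY_GRAPH` dictionary and the `find_storylines` helper are all made up for illustration; the only thing taken from the text above is the definition of a storyline as a path from the start node to a ‘happy end’ node.

```python
from typing import Dict, List, Set

# Hypothetical story graph: each scene maps to the scenes reachable from it.
STORY_GRAPH: Dict[str, List[str]] = {
    "beach": ["shore", "hills"],
    "shore": ["lighthouse"],
    "hills": ["cave", "cliff_fall"],
    "cave": ["treasure"],
    "cliff_fall": [],   # dead end
    "lighthouse": [],   # happy end
    "treasure": [],     # happy end
}
HAPPY_ENDS: Set[str] = {"lighthouse", "treasure"}


def find_storylines(
    graph: Dict[str, List[str]], start: str, happy_ends: Set[str]
) -> List[List[str]]:
    """Return every loop-free path from `start` to a happy-end node."""
    storylines: List[List[str]] = []

    def walk(node: str, path: List[str]) -> None:
        path = path + [node]
        if node in happy_ends:
            storylines.append(path)
            return
        for next_node in graph.get(node, []):
            if next_node not in path:  # skip loops so the search terminates
                walk(next_node, path)

    walk(start, [])
    return storylines


storylines = find_storylines(STORY_GRAPH, "beach", HAPPY_ENDS)
for line in storylines:
    print(" -> ".join(line))
```

A helper like this can also serve as a sanity check while writing: if it returns no storyline at all, the happy end is unreachable.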

Let’s analyze which types of interactions and decisions you can use in a storyline.

Figure 2: Storyline with only single-item ‘choices’
  • Single item ‘choice’
    In this type of interaction, the player is required to say one word or phrase in order to continue the game. While this kind of interaction seems superfluous, it can have at least three possible uses: 1) To increase the game’s rate of interaction with the player or break down overly long scenes, 2) to ask the user for the solution of a riddle, or 3) to serve as a savepoint from where the player can continue their game.
    However, such ‘choices’ should be used in moderation, because they can trigger a feeling of disempowerment and frustrate players.
Figure 3: Storyline with progress choices
  • Progress choices
    This is probably the most intuitive kind of choice in an interactive story game: If you make the wrong choice, you forfeit your chance of a happy end and have to start again. Despite their ubiquity, these choices should be handled with care: If a perfectly plausible choice leads to an early end, players might perceive this as arbitrary. On the other hand, if some choices are obviously wrong, the decision feels ‘cheap’. So the best choices are those where one option is just slightly ‘better’ than the other.
    We’ll explore a bit later how to mitigate the frustration players might still have from losing.
Figure 4: A storyline starting with a fake choice
  • ‘Fake’ choices
    A fake choice is a node of the story graph that emits two (or more) paths that converge again after one (or more) nodes. A player won’t notice a fake choice unless they have traversed both paths.
    Despite the negative name, fake choices are tremendously useful: Since they cover how (and not if) the player proceeds towards the happy end, they can build autonomy (the player can make a decision), immersion (the player identifies more with the protagonist) and curiosity/retention (if both choices are equally plausible, the player might want to try out the other one too) at the same time.
Figure 5: Storyline with a loop
  • Loops
    Loops are situations where a node emits a path which (after traversing one or more other nodes) leads back into the emitting node.
    The ‘natural habitat’ of loops are ‘escape game’-type situations where the player can or must investigate several options before moving on. While loops are a great way to force the player to engage with the game, they are easy to get wrong: It can reduce the immersion if, after a player has already activated option A, they are offered A, B and C again as if A had never happened. There’s no general solution to this, but my proposal is to use different texts depending on whether A, B or C have already been traversed.
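The proposal for loops (different texts depending on what has already been traversed) could be sketched roughly as follows. The room, its three options and all function names are hypothetical; the point is only that the set of visited options drives both the narration and the prompt.

```python
from typing import Dict, List, Set

# Hypothetical 'escape game' hub scene with three options to investigate.
FIRST_VISIT_TEXT: Dict[str, str] = {
    "desk": "You search the desk and find a rusty key.",
    "window": "The window is barred, but you spot a ladder outside.",
    "door": "The door is locked. It has an old keyhole.",
}
REPEAT_VISIT_TEXT: Dict[str, str] = {
    "desk": "The desk is empty now. You already took the key.",
    "window": "The ladder is still out there, just out of reach.",
    "door": "Still locked. The keyhole stares back at you.",
}


def visit(option: str, visited: Set[str]) -> str:
    """Return the narration for an option and remember that it was visited."""
    texts = REPEAT_VISIT_TEXT if option in visited else FIRST_VISIT_TEXT
    visited.add(option)
    return texts[option]


def join_options(options: List[str]) -> str:
    """Join option names as 'a', 'a or b', or 'a, b or c'."""
    if len(options) == 1:
        return options[0]
    return ", ".join(options[:-1]) + " or " + options[-1]


def hub_prompt(visited: Set[str]) -> str:
    """Build the loop's prompt so it only offers what is still unexplored."""
    remaining = [o for o in FIRST_VISIT_TEXT if o not in visited]
    if not remaining:
        return "You have examined everything. Do you want to try the key in the door?"
    if not visited:
        return "You are locked in a study. Do you want to examine the " + join_options(remaining) + "?"
    return "What next? You could still examine the " + join_options(remaining) + "."
```

Whether already-visited options should disappear from the prompt or merely get a different text is a design decision; this sketch does both.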

With these options, you can determine the overall ‘architecture’ of your interactive story and set a solid foundation for an engaging experience.
Here’s another ‘global’ choice for your game, which you should make early on:

Choosing the point of view

Here we’re already deep in investigating techniques for building immersion! Your game’s point of view might already be determined by the plot or your personal style, but it’s still worth giving it some thought.

Figure 6: The emojis we’ll use to illustrate the different points of view

Here are your options, ordered from most to least plausible:

Figure 7: The player is the protagonist, and the narrator is outside of the action
  • Abstract second person
    You wake up at the beach. Do you want to walk along the shore, or go to the direction of the hills?
    This is the traditional perspective of all ‘Choose Your Own Adventure’ books I’ve ever read, and as such it’s a natural starting point when thinking about your voice game’s point of view. Typically, the narrator is limited and knows only what the protagonist currently knows, but it can also be omniscient. In either case, it doesn’t matter much whether you use the voice of Alexa or not, because the narrator has no character of its own.
    The advantage is that your users expect this perspective, and it’s easier to write in because you can find inspiration easily. It also allows you to describe both the external world and the player character’s inner world as needed.
    The disadvantage is that it’s just what’s expected, and you might miss out on the potential of a more interesting point of view.
Figure 8: The player is the protagonist, but the narrator is also part of the action
  • Second person character
    We come across a river. Do you want us to try and swim across, or look for a bridge?
    In this point of view, the story is told from the perspective of a non-player character accompanying the player. It’s your choice whether you give the narrator a rich or sparse characterization.
    If you use Alexa’s voice, you will also have to use her character… I won’t go into the details, but if Alexa’s perceived character is not aligned with your narrator’s actions, your users will be irritated. On the other hand, this option has a deep potential to leverage the user’s connection to their virtual assistant.
    If you use another voice, either using Amazon Polly or a recorded voice actor (or maybe yourself?), there are few limits to the adventures your player and their narrator buddy can experience!
    The advantage is that you can increase immersion by creating a bond between player and narrator, and that it opens interesting plot options where they (inter)act as a team.
    One disadvantage is that this point of view is very uncommon in other media, and it might be more challenging to write in. It also cuts off your options to describe your protagonist’s inner world.
Figure 10: The narrator is the protagonist, and the player is not part of the action
  • First person character
    I got stuck in quicksand! Should I quickly wade out, or lean back and try to float?
    All points of view are located on a spectrum of how close or far the narrator is to the action, and this perspective constitutes one of the spectrum’s poles: In this case, the narrator is the protagonist, and the player is their prompter or director.
    Ironically, using Alexa’s voice with this perspective would make the player her virtual assistant, which sounds like a game I would love to play!
    The more natural choice, as seen in the Mr. Robot Alexa Skill, is to use a recorded voice. This point of view is also quite robust against low audio quality, if it simulates a phone conversation with a bad connection.
    The advantage is that this point of view makes it easy to create a dense and immersive atmosphere, with a strong bond between player and narrator, room for describing the protagonist’s inner world, and a feeling of immediacy and urgency.
    The main disadvantage I see is that this perspective doesn’t work well with Alexa’s voice (if she’s not the protagonist) and requires the effort of using a different voice.
Figure 11: Neither the player nor the narrator is part of the action
  • Third person
    Antonius follows the thief to the aqueduct, where he disappears into the catacombs. The catacombs can be dangerous places at night! Do you want Antonius to enter the catacombs, or fetch reinforcements?
    This is the traditional perspective of literary prose, but it ‘removes’ both the narrator and the player from the action, and thus gives away quite some potential for immersion. There might be some use cases where this ‘distanced’ storytelling can be used as an intended narrative device, but it’s not a point of view that I generally recommend.

Now that you’ve determined the structure of the story graph and the relationship between protagonist, narrator and player, we can move on to the structure of your scenes.

Composing your scenes

You want your scenes to be engaging and enjoyable to listen to. Let’s brainstorm the factors determining this:

  • Amount of story covered
    There’s a minimum of action that should take place in a scene: something like ‘You open the drawer. It’s empty! What do you want to do next?’ is probably not enough to keep the gaming experience dynamic. Conversely, if you cover too much story between two interactions, the experience might feel disconnected.
  • Richness of the narration
    It’s hard to build engagement if the narration is monotonous! Alexa (like computer-generated speech in general) is not great at reading longer passages of text, so your options are 1) to use recorded audio from a human voice actor, which might be hard to come by, 2) to use a good amount of sound effects (like from Freesound or the recently established Alexa Skills Kit Sound Library), or 3) to structure the text in interesting ways (like rhetorical questions, speechcons, pauses, modulation of prosody, direct speech, …).
  • Asking the right questions
    The established way of prompting the user at the end of a scene follows the pattern ‘Do you want to do A or B?’, which is fine if A and B are easily distinguishable words or phrases. A rather unconventional alternative is to use the pattern ‘There is A and there is B. What do you want to do?’, optimally with a reprompt offering both options in their traditional form. This approach offers the chance to be more immersive and challenging, but comes with a risk of confusing the player.
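As an illustration of the last two points, here is a rough sketch of how a scene with pauses, a speechcon, modulated prosody and an ‘unconventional’ prompt plus a traditional reprompt might be assembled as SSML. The helper names and the returned dictionary are my own assumptions, not part of any SDK; the tags themselves (`break`, `prosody`, and `say-as` with `interpret-as="interjection"`) are the ones I know from Alexa’s SSML support, and which words work as speechcons depends on the locale.

```python
from typing import Dict, List
from xml.sax.saxutils import escape


def pause(milliseconds: int) -> str:
    """A silent pause of the given length."""
    return f'<break time="{milliseconds}ms"/>'


def speechcon(word: str) -> str:
    """An expressive interjection; supported words vary by locale."""
    return f'<say-as interpret-as="interjection">{escape(word)}</say-as>'


def slow_and_low(text: str) -> str:
    """Slower, lower-pitched speech, e.g. for an ominous aside."""
    return f'<prosody rate="slow" pitch="low">{escape(text)}</prosody>'


def build_scene(narration_parts: List[str], prompt: str, reprompt: str) -> Dict[str, str]:
    """Wrap narration and prompt in <speak> tags (hypothetical return shape)."""
    speech = "<speak>" + " ".join(narration_parts + [escape(prompt)]) + "</speak>"
    return {"speech": speech, "reprompt": "<speak>" + escape(reprompt) + "</speak>"}


scene = build_scene(
    narration_parts=[
        escape("You open the drawer."),
        pause(700),
        speechcon("wow"),
        escape("A map is hidden under a false bottom!"),
        slow_and_low("Somewhere behind you, a floorboard creaks."),
    ],
    # The unconventional pattern: describe the options, then ask openly.
    prompt="There is a door to the cellar, and there is a staircase to the attic. "
           "What do you want to do?",
    # The reprompt falls back to the traditional 'A or B' form.
    reprompt="Do you want to go down to the cellar, or up to the attic?",
)
print(scene["speech"])
```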

By now, we’ve got much of our interactive story covered. Let’s look at some special ingredients to make our game more interesting.

Identifying and eliminating sources of frustration

We shouldn’t cut ourselves any slack in this regard: If players are sufficiently irritated in a moment of engagement, they might not only abandon the game, but even leave negative reviews.
Here are some places to look for sources of frustration:

  • The language model
    Make sure that variants of utterances for your choices’ options are properly matched to their respective intents, even if they are uttered by a child or someone with an accent.
  • Early progress choices
    If a player discovers your game and reaches a dead end early on, you risk losing them, because they’re not yet sufficiently engaged with your game to give it another try. Consider incorporating your first progress choices only after the player is a few nodes into the story.
  • Late progress choices
    In these cases, frustration can arise from the fact that players have to start from the beginning to have another shot at reaching the end. This might also discourage players from exploring suboptimal endings just for the fun of it. Here are some ideas for mitigating this source of frustration:
    A) Use plenty of fake choices so that players can explore alternative approaches in their re-runs.
    B) Use savepoints to which players can jump from the game’s start.
    C) Offer ‘another chance’ to make the right decision after the player has lost their game. In order not to give away too much retention potential, you could grant your player only one retry for each start from the beginning.
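Ideas B and C can be sketched as a small piece of per-player state. Everything here (the `GameSession` class, the scene names, the helper functions) is hypothetical, and a real skill would have to persist this state between sessions; the sketch only shows the bookkeeping for savepoints and for a single retry per run.

```python
from dataclasses import dataclass, field
from typing import Optional, Set

SAVEPOINT_SCENES = {"village", "castle_gate"}  # made-up scene names


@dataclass
class GameSession:
    """Hypothetical per-player state."""
    current_scene: str = "start"
    last_choice_scene: Optional[str] = None   # where the latest decision was made
    savepoints: Set[str] = field(default_factory=set)
    retry_used: bool = False                  # one retry per run from the beginning


def enter_scene(session: GameSession, scene: str, is_choice: bool) -> None:
    """Move to a scene, unlocking savepoints and remembering choice nodes."""
    session.current_scene = scene
    if scene in SAVEPOINT_SCENES:
        session.savepoints.add(scene)
    if is_choice:
        session.last_choice_scene = scene


def restart(session: GameSession, savepoint: Optional[str] = None) -> None:
    """Begin a new run, optionally at a savepoint the player has reached before."""
    session.retry_used = False
    session.last_choice_scene = None
    session.current_scene = savepoint if savepoint in session.savepoints else "start"


def handle_dead_end(session: GameSession) -> str:
    """Grant one retry at the last choice; otherwise send the player back to the start."""
    if not session.retry_used and session.last_choice_scene is not None:
        session.retry_used = True
        session.current_scene = session.last_choice_scene
    else:
        restart(session)
    return session.current_scene
```

Unlocked savepoints deliberately survive a restart here, so that a player who lost late in the game can jump back in without replaying the beginning.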

After having invested a lot of work and thought into creating an immersive and engaging interactive story, we should focus on what happens after the player has reached the happy end.

Increasing your retention potential

Your motivation to build for retention is to get more usage from your players, so that your game rises in the featured ranking and becomes more successful.
Here are some ideas for increasing retention:

  • More storylines or fake choices
    This is straightforward: If your engagement level is high, players will want to have more of the gaming experience. Just make sure they are aware that there’s more content waiting for their exploration.
  • Scattered parts of a riddle
    We’ve seen this technique in the Mein Auftrag Alexa Skill: In order to reach the happy end, you need to solve a riddle whose solution consists of parts distributed throughout the game. It should be so hard that only about half the people get it right on their first try, but so easy that people get it on their second or third attempt.
  • Feedback on the player’s choices
    People love getting feedback on themselves, so you can give them an evaluation of their choices, maybe even in the form of a kind of personality test.
  • Hidden ‘bonus’ content
    At a few choices, you can add a hidden option (like ‘There is A, B and C. Do you want to do A or B?’) which triggers some interesting ‘bonus’ content. A good time to make the player aware of an example of what they missed and send them ‘on the hunt’ would be after their first successful walk-through.
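The hidden ‘bonus’ option could be modelled like this. The `CHOICE` dictionary, its scene names and both helper functions are invented for illustration: the prompt mentions three things but only offers two of them, while the game quietly accepts the third as well.

```python
from typing import Any, Dict, Optional, Set, Tuple

# Hypothetical choice node following the 'There is A, B and C. Do you want A or B?' pattern.
CHOICE: Dict[str, Any] = {
    "prompt": "There is a path, a bridge and an old well. "
              "Do you want to take the path or the bridge?",
    "options": {"path": "forest", "bridge": "river"},   # offered in the prompt
    "hidden_options": {"well": "bonus_well"},           # mentioned, but never offered
}


def resolve_choice(choice: Dict[str, Any], spoken_option: str) -> Tuple[Optional[str], bool]:
    """Return (next_scene, found_bonus); next_scene is None if nothing matched."""
    if spoken_option in choice["options"]:
        return choice["options"][spoken_option], False
    if spoken_option in choice["hidden_options"]:
        return choice["hidden_options"][spoken_option], True
    return None, False


def bonus_hint(choice: Dict[str, Any], found_bonus_scenes: Set[str]) -> Optional[str]:
    """After a successful walk-through, hint at one bonus the player has missed."""
    missed = [
        option
        for option, scene in choice["hidden_options"].items()
        if scene not in found_bonus_scenes
    ]
    if not missed:
        return None
    return f"By the way, did you ever wonder what would have happened at the {missed[0]}?"
```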

This concludes my collected thoughts on how to produce a great interactive game. I’m immensely curious: Where do you disagree? Which techniques have you used, and what are your experiences with those? What did you find surprising or particularly useful? What would you have liked to read more, or less, about?
Both in order to spark interesting conversations and to provide more value to voice game producers, I would love to hear your feedback: here, on Twitter or by email! Thanks and kind regards!

Florian Hollandt
#VoiceFirst Games

Maker, with a focus on Arduino, LEDs & 3D printing. There’s a range of other topics I’m also engaged and/or interested in, most notably Alexa skill development.