Conversation as Gameplay (Talk)

Published in Spirit AI, Jan 20, 2019

[This is the approximate text of a talk Emily Short gave about conversational mechanics in games.]

The Problem Statement

I want more games to be about human interaction, about the nuances of how people deal with one another, about the kinds of topics that appear in dramatic movies. That’s partly because I’d like to play more games about conversation and social interaction. I’m not as interested in action as a topic, and to be honest I often fall asleep during superhero movies these days.

Meanwhile, as an artist, part of the reason I write games is to explore and interrogate things I don’t yet fully understand. Building procedural systems and seeing how they perform is a great way to explore whether our mental models are correct. How people understand each other (or don’t), how they connect and why, are topics of enduring fascination for me.

So I want more conversation-rich games. For that to work as I’d like, the conversation needs to be rewarding as gameplay — not just bolted on around gameplay, as it so often is.

When it comes to my own work, I have a few more ambitions and requirements as well:

First, I want it to allow the player to act with intentionality: to lay plans and carry them out. That means that we need some systematic mechanics that the player can learn and manipulate.

For the purposes of this talk, I’m not spending much time on things that are pure branching dialogue trees without ongoing state or clear mechanics. I’ve sometimes written work in that space, and if you’re interested in how to get the most out of a relatively state-light dialogue presentation, I recommend having a look at Jon Ingold’s AdventureX talk about writing sparkling interactive dialogue. But that’s not what we’re looking at today.

[I’ve written more about world model and systematic mechanics for conversation elsewhere.]

Second, I want the resulting mechanic to have good pacing and dramatic qualities — so a mechanic that systematizes conversation but makes it feel very slow, stilted, metaphorical, or hard to manipulate is not what I’m looking for. Some of these can be cool to play, but I myself tend to be looking to write something that has a bit more fluidity.

What’s out there?

Before we get into my own attempts to solve this problem, let’s take a quick survey of what some other games are doing in this space, and of mechanics I’ve considered as inspirations:

Some games use minigames to represent the dynamics of conversation, like Red Strings Club’s manipulation of dialogue via cocktails, or the mini-games that drive interaction in Dangerous High School Girls in Trouble. Some use other gameplay as the main point of interaction and just have the characters in the game react to it, the way Textfire Golf makes the golf-playing into the means of communicating with your colleagues.

Some games, especially mysteries, let you manipulate inventories of information or knowledge in order to turn conversation into a puzzle. Consider the clue inventories used in Phoenix Wright, or the evidence-assembly screens in Detective Grimoire:

Being able to construct evidence like this and put your own questions to NPCs works pretty well in the context of a mystery game. Puzzles like this are often a good way to confirm that the player has understood the key points of the narrative.

What’s more, the utterance-building is designed systematically enough that the game can give you a little feedback about your wrong guesses. But individual utterances take a while to construct, and the interface is more about manipulating thoughts than about being present in the back-and-forth flow of dialogue with another person. So this example does well on intentionality, but less well on the dramatic/writing strengths that I’m interested in capturing.

Finally, Ladykiller in a Bind gives the player options that appear and vanish again over time, and overtly tells the player when particular choices are going to increase suspicion (undesirable) or gain them votes in a sort of popularity contest (one of your main goals during play):

(Note that this game is not safe for work.)

This comes closer to capturing the you-are-there aspects of conversation that interest me — the sense of thoughts coming into your head that might or might not be the right things to say; the sense of taking risks or staying silent — and because there are a lot of ways to manipulate your suspicion levels and vote stats, the player gets a bit more room for intentional action than in some systems.

[If you’re interested in even more things than I was able to fit into my talk, I’ve also written a semi-recent mailbag post with examples and categories of conversation mechanics in recent games.]

Aside from all those examples from existing conversation-focused games, I’ve also drawn, historically, on games that aren’t about conversation at all, but whose mechanics feel like potential inspiration material.

Polaris is a tabletop story game in which players negotiate about what is allowed to happen next in the story, using key phrases (“okay, that happens, but only if this other thing happens as well”) — which provides an example of how negotiation mechanics might work in general. You might let players stack caveats and requirements, adding more stakes and outcomes to an agreement, until the characters either agree or decide that their negotiation has failed.
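As a rough illustration of that stacking structure, here is a minimal Python sketch. It is not how Polaris itself is played or implemented; the `Negotiation` class, its method names, and the example terms are all hypothetical, chosen only to show caveats accumulating until the parties agree or the negotiation fails.

```python
from dataclasses import dataclass, field


@dataclass
class Negotiation:
    """Hypothetical model: terms stack up until someone agrees or walks away."""
    terms: list = field(default_factory=list)
    status: str = "open"  # one of "open", "agreed", "failed"

    def propose(self, term):
        """Put an initial outcome on the table."""
        self._add(term)

    def but_only_if(self, caveat):
        """Accept everything so far, provided this extra outcome also happens."""
        self._add(caveat)

    def agree(self):
        """Close the negotiation; every stacked term now takes effect."""
        self._close("agreed")
        return list(self.terms)

    def walk_away(self):
        """Close the negotiation with nothing taking effect."""
        self._close("failed")
        return []

    def _add(self, term):
        if self.status != "open":
            raise ValueError("negotiation is already closed")
        self.terms.append(term)

    def _close(self, status):
        if self.status != "open":
            raise ValueError("negotiation is already closed")
        self.status = status
```

Each call to `but_only_if` raises the stakes of the eventual agreement, which is the property that makes this structure interesting for dialogue: the longer the exchange runs, the more is riding on how it closes.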

Combat games provide examples of attacking, blocking, and parrying that can be used (with a more cooperative spin) in conversational contexts as well. In fact, Tea-Powered’s in-progress Elemental Flow explores exactly that idea, so people who were at my talk had a chance to hear that concept unpacked in some detail.

In Diplomacy — a board game — the players discuss what their alliances will be, then write down detailed moves describing how their armies will move around the board. All of the written moves are resolved at once before anyone gets to enter further discussion or write new moves. A structure based on making complex commitments and then resolving them all at once would allow for players to make really subtle, socially complex decisions that were very expressive. At the same time, that mechanic undercuts pacing, because most conversations don’t involve that much thinking in between lines of dialogue. It’s a better fit for games about writing letters or at the very least emails.

Finally, Jenga captures a short-term mechanic that I’ve often seen successfully rolled into interactive conversations, where you invite the player to push an NPC as far as they dare and see what happens.

Galatea and other parser-based conversation games

My own work in this space started with the exploratory gameplay but unyielding UI of Galatea (nearly 20 years old now):

Galatea let the player ask or tell the character about various keyword topics — and there were enough contextual clues in the environment of the interaction that the player could actually know about some information that the non-player character didn’t have, including emotionally fraught information about what happened to her creator. Dozens of different endings were possible, depending on what you found out about the character and what relationship you ultimately formed with her.

After Galatea, I wrote a number of other parser-based conversational games.

Perhaps the most notable of these for the current conversation is Alabaster, a game written collaboratively with almost a dozen other writers. To create Alabaster, I built a game with an initial story hook and an engine that let players add new dialogue content whenever they ran into a topic that the game didn’t cover.

We also added a generative art system, the imagery on the left side of the screen, that gave the player visual cues about the mood of the major character. This was particularly useful in the Jenga-mechanic-esque moments of the game. It’s possible for the player, by insistent questioning, to drive the non-player character into bad states of mind and cause bad endings. Players have generally responded well to the visual signals that they’re pushing their luck.

[My blog contains a number of posts about the progress of Alabaster, from when the project was live, if anyone is curious to know more about the development process and design decisions.]

Versu

The Versu system presented options to the player based on its underlying social model and dialogue options that were hand-authored for that particular moment. A full explanation of how that system worked could occupy an hour all by itself, so I won’t get into too much depth here, except to talk about the UI and model for conversation, and how that design played.

Moving away from the parser made the game significantly more accessible to new users, while the underlying social model made it possible for players to intentionally play towards different relationships with key characters. The flow of interaction was much smoother, especially since Versu allowed characters to talk to one another: as the player, you could choose to watch NPCs interact for long periods before deciding to interrupt them.

The biggest work released in Versu was Blood and Laurels, a game of Roman imperial intrigue that could end with you getting yourself stupidly killed, or getting your friends killed, or becoming emperor, or restoring the Republic, all the while romancing (or not) any of several NPCs.

One of my favorite affordances of the game was that you could at one point get some poison, and subsequently use it to knock off an enemy at a banquet. When you poisoned someone, that character would have time for another line or two of dialogue before keeling over (though what that dialogue might be depended on the rest of the current conversation flow). It could be a lot of fun watching your enemy say another sentence or two of fatuous or self-serving nonsense, then face-plant into their dinner.

That sequence combined some planning on the player’s part (I want to remove this character, I got the resources to do it in advance), plenty of perceivable consequence (the corpse, the slaves running around in consternation, long-term changes to narrative flow based on that character being gone)… and some procedural variety, since the conversation was likely to be going slightly differently every time you played the poison move, and therefore it wasn’t verbatim the same each time.

Overall, then, the Versu approach was getting me closer to what I want to do with conversation in games, but still with some drawbacks:

A lot of what didn’t work was about communicating world state to the player. We did have some UI features, like the pictures of the characters along the bottom of the screen, and the ability to tap on their heads to see what they were thinking currently, that were meant to supply that information more richly.

[Though I didn’t get into this in the talk, I at one point ported Galatea to Versu, and I’ve posted about how that change of format affected the game experience. I’ve also written about the experience of creating Versu content in general. My keynote at ICCC 2015 gets into more detail than this talk, and its “what next” discussion at the end lays out some of the thinking that later affected the design of Character Engine.]

Character Engine and Restless

I am now working on Character Engine for Spirit AI. Character Engine is middleware designed to let game designers — and in fact people in other industries, such as education, entertainment, and social care — build interactive characters with memory and personality.

Character Engine can build characters that respond to natural language and gestural input, expanding on the repertoire of a traditional chatbot. It produces dialogue that can include emotional markup and expression cues, allowing a CE character to drive real-time text-to-speech, lip-syncing, and performance.

As with Versu, there’s a lot more to say about Character Engine than would fit into a short talk. One of the major differences, however, is Character Engine’s ability to build very dynamic text output, so that a character’s lines can be subtly rephrased to reflect changing world state, moods, and emotional closeness. The image here with the diagram on the right side of the screen is showing how text is actually realized in real time.

But natural language isn’t the best fit for every possible game, so Character Engine also features a mode in which it can dynamically suggest menu options for the player to choose from. Those menu options can also be built with very dynamic text. To explore this, we built internally a small piece called Restless for a Halloween-themed game jam.

Restless puts the player in the role of a ghost haunting a house. Initially, you can’t do anything but make sinister noises and smells manifest in the house — using the single menu item at the bottom of the screen.

But you also have two emotions, “angry” and “hungry”, that you can choose to turn on and off, which will change your speech options. “A crunch from the entrance hall” might become “A pervasive scent of apples in the scullery” if you want to manifest as hungry. Mixing in anger might change that to “An overwhelming scent of apples” or something else a little more intense. Choosing anger alone might create “Thunder rattling outside the upstairs window” or something along those lines. The player is free to explore and remix their options as much as they wish before selecting a menu item.

[For this part of the talk, I ran silent video of gameplay while I spoke. It comes out to a fairly long sequence, and it’s hard to make that link up with text in a blog post, so I’ve substituted a few screenshots here. However, the game is available to play from itch if you’d like to explore more fully what the user interface feels like.]

In addition, when new major topics come into the conversation, those also become selectable, so that you can prefer menu items that will mention those entities.

After your new acquaintance gives you some animal blood to drink, you have enough of a voice to be able to communicate in words, and the dialogue opens up quite a bit, with new emotion affordances as well. It’s possible to explore the information space of the game, find things out from the NPCs, or work with them to find particular discoveries.

Some actions remain available almost all the time. At almost any point in the game, you can make sinister noises in the house or even set it on fire. You also have a store of knock-knock jokes you can tell to relieve the other character’s anxiety, if you decide to play the “amused” emotional trait. Those are limited resources, because you can only play each punchline once; it’s up to you as a player when or whether you think it’s worthwhile to deploy them.

And in fact pairings of emotions and topics, or more than one emotion, can produce specific meanings.

This is a game in part about anxiety, and about characters who have trouble facing certain things about themselves. The protagonist was an anxious perfectionist during life, and as a ghost, is still set off by the anxiety of other people, and by elements that remind her of her worst experiences in life. Being open with other characters in the story can be healing, and can unlock backstory that might interest the player, but there are certain topics you can’t go into without deliberately allowing yourself to be sad.

Even when a pair of emotions doesn’t make a major difference, there are some subtle remix effects. For instance, if you’re set to Curious, you might be offered the question “Why not?” If you turn on Angry as well, that question might morph to “Why the hell not?”

The other character can still respond with the same information either way, but the second version of that question carries a different emotional freight.

Restless even includes some of those Jenga-esque, pushing-your-luck moments I talked about earlier. There are certain sequences where choosing to be silent repeatedly has a strong influence on the character who is trying to talk to you, and they respond with increasing distress. At any time in that process, you can break out and answer them. But if you’re curious, you can choose to keep pushing their buttons. We used some character art to communicate when they were getting more worried or angrier, together with tweaks to the exact wording of their dialogue.
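A push-your-luck sequence like this can be sketched very simply in Python. This is an illustrative assumption about the structure, not the Restless implementation: the stage names, the `SilenceTracker` class, and the choice to reset the NPC's mood as soon as the player answers are all hypothetical.

```python
# Hypothetical escalation ladder for the NPC's mood during a silence sequence.
STAGES = ["calm", "uneasy", "worried", "distressed"]


class SilenceTracker:
    """Tracks consecutive silences and maps them to an NPC mood stage."""

    def __init__(self):
        self.silences = 0

    def mood(self):
        # Clamp to the last stage so extra silences cannot run off the ladder.
        return STAGES[min(self.silences, len(STAGES) - 1)]

    def stay_silent(self):
        """The player keeps pushing; the NPC moves one stage up the ladder."""
        self.silences += 1
        return self.mood()

    def answer(self):
        """The player breaks out of the sequence (assumed here to reset mood)."""
        self.silences = 0
        return self.mood()
```

The returned stage is what would drive both the character art and the tweaked wording, so the player gets a visible signal of how far they have pushed before deciding whether to continue.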

A Slightly More Theoretical Analysis

Conceptually, the way I think about this design is that we are giving the player a search mechanism to explore the range of all the currently-open affordances. We only show three menu items at a time, but that’s a small window on a larger space of things that could possibly be said next.

For instance, we could picture the space of all possible things we could say this way, with the dot labeled “lover” representing a question for Sylvie about her lover Anna:

Meanwhile, the axes of this space (in practice more than three) might represent the emotional tonality or topical relevance of particular options.

As soon as we think of things this way, it becomes clear that we need that state space to be pretty densely populated if we want the player to feel that their search is consistently productive and that they’re getting a lot of expressiveness out of the system. And obviously, generating enough variants by hand to fill all of that territory is not really doable in reasonable human time.
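One way to picture the menu as a window onto that space is the following minimal Python sketch. It assumes each candidate line is tagged with the emotions and topics it expresses, and that the menu shows the best matches for the player's current selections. The `Line` type, the scoring rule, and the "Tell me about Anna." line are hypothetical illustrations, not Character Engine's actual model or content.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Line:
    text: str
    emotions: frozenset  # emotional tones this line expresses
    topics: frozenset    # entities or subjects this line mentions


def score(line, emotions, topics):
    """Count how many of the player's selected filters this line matches."""
    return len(line.emotions & emotions) + len(line.topics & topics)


def menu(candidates, emotions, topics, size=3):
    """Return the `size` best-matching lines: a small window on the full space."""
    ranked = sorted(candidates,
                    key=lambda line: score(line, emotions, topics),
                    reverse=True)
    return ranked[:size]


CANDIDATES = [
    Line("Why not?", frozenset({"curious"}), frozenset()),
    Line("Why the hell not?", frozenset({"curious", "angry"}), frozenset()),
    Line("Tell me about Anna.", frozenset({"curious"}), frozenset({"Anna"})),
    Line("A crunch from the entrance hall.", frozenset(), frozenset()),
]
```

With only four candidates, most selections return nearly the same three lines, which is the sparsity problem in miniature: the player's search only feels productive when many differently tagged lines compete for those three slots.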

This is where Character Engine’s text generation abilities become particularly important. When creating new lines, the author can use a generative grammar to substitute individual words or entire phrases, and associate those elements with different moods or topics. Here’s a bit of the script view of Restless, as seen in Character Engine’s authoring tool:

And we can add layers and layers of variation to any of these lines, representing different world states and emotion combinations:

Above are a couple of screens of building out that grammar, where the word “pervasive” might have some differently shaded replacements if the protagonist is expressing anger (“intense”) or sadness (“lingering”, e.g.). There’s a point quite early on where the system has many, many adjectives for slurping (vigorous slurping, playful slurping, grotesque slurping, and so on) depending on exactly what mood you’re in when you drink your first allotment of blood. The tool has some built-in provisions to pull content from dictionaries etc. to help authors build this kind of content faster.
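For readers who want a concrete picture of mood-tagged substitution, here is a minimal Python sketch using the “pervasive” / “intense” / “lingering” example. The slot format, the `realize` function, and the rule that the first listed mood wins are assumptions made for illustration; Character Engine's real grammar format is not shown in this post.

```python
# Hypothetical slot table: each slot has a default word plus mood-specific variants.
SLOTS = {
    "scent_adj": {"default": "pervasive", "angry": "intense", "sad": "lingering"},
}

TEMPLATE = "The {scent_adj} scent of apples in the scullery"


def realize(template, slots, moods):
    """Fill each {slot} with the first variant whose mood is active, else the default."""
    chosen = {}
    for name, options in slots.items():
        word = options["default"]
        for mood, variant in options.items():
            # Dicts keep insertion order, so earlier moods take priority.
            if mood in moods:
                word = variant
                break
        chosen[name] = word
    return template.format(**chosen)
```

Calling `realize(TEMPLATE, SLOTS, {"sad"})` swaps in “lingering”, while an empty mood set falls back to “pervasive”. A real system would also need to handle agreement effects such as “a” versus “an”, which this sketch sidesteps by using “The”.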

Of course, the other point is about how we handle those choices and allow them to affect the world model. Restless does less with this than it could — some of our subsequent experiments go further.

But suppose you’ve written the system to generatively build your “Why not?” (curious) and “Why the hell not?” (curious + angry) outcomes. Either way, you want the NPC to answer the question, but you do also want to register the shift to the emotional landscape that comes from the player picking an angry approach. Character Engine tracks this kind of thing with numeric traits within a bounded range and fairmath-style adjustments. That means you can have the player’s choice of angry action make the NPC a little more displeased each time (for instance), until that adds up enough to produce significant changes in behavior.
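As a worked illustration, here is a minimal Python sketch of fairmath in the form popularized by ChoiceScript, where each adjustment moves a trait by a percentage of the distance remaining to its bound. The 0 to 100 range, the 20% step, the starting value, and the threshold of 70 are assumptions for the example; the post does not specify the exact formula or ranges Character Engine uses.

```python
def fair_add(value, percent, high=100.0):
    """Raise `value` by `percent` of the distance remaining to `high`."""
    return value + (high - value) * percent / 100.0


def fair_subtract(value, percent, low=0.0):
    """Lower `value` by `percent` of the distance remaining to `low`."""
    return value - (value - low) * percent / 100.0


def angry_choices_until(threshold, start=50.0, step=20.0):
    """Count angry choices needed before displeasure passes `threshold`."""
    if threshold >= 100.0 or step <= 0.0:
        # Fairmath approaches the bound but never reaches it.
        raise ValueError("threshold must be below 100 and step must be positive")
    displeasure, choices = start, 0
    while displeasure <= threshold:
        displeasure = fair_add(displeasure, step)
        choices += 1
    return choices
```

Starting from 50, three angry choices at 20% move displeasure to 60, then 68, then 74.4, so a behavior change gated at 70 would trigger on the third one. Because each step is proportional to the remaining headroom, the trait can never leave its range, and repeated pushes have diminishing effect.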

(In fact, Character Engine can go quite a bit further with this than we exposed in Restless: the social action model allows characters to make a decision about how to respond to input depending on their personality traits and current moods.)

Restless Outcomes

I was very happy about a lot of aspects of the game. It’s a game jam piece, but it has a significant amount of optional content and a range of possible endings, which I managed to put together in about 40 hours of writing work. I estimate that building something similar in less procedural systems would have taken at least three times as long and yielded less freedom for the player.

I also really enjoyed the fact that I could make some affordances for the protagonist that were highly specific to particular emotion and topic mixes. If you talk to Sylvie about her girlfriend Anna, and you’re set to be both angry and hungry, you can make menacing comments that suggest you might like to drink Anna’s blood. It’s a creepy interpretation of who the protagonist is.

If I’d turned up that option as one of three immutable options in a more static dialogue tree, that would have had a much bigger effect on the gameplay, because I would have been making a strong statement to the player about who the protagonist wanted to be. Putting the option in the game but in a place where the player had to intentionally pursue an emotional strategy allowed me to include the feature without so strongly biasing the game in the direction of that content.

Overall, some takeaways from this project based on the feedback we got:

Most of the “didn’t work” elements are fairly easy to address, especially with this experience already in place. Because this was a game jam piece and was written in a small time window, we did more limited iteration than you’d typically do on game UI, and a lot of our testers were people already acquainted with the system. What we found in practice is that some players were confused or surprised by the fact that they could change up the menu. We didn’t tutorialize this much at all — in Restless the only concession to the user’s initial ignorance is that you start out with two emotions to remix rather than six.

Along the same lines, for this size of game it might have made sense to start with fewer emotional axes, perhaps using just four instead of six, but put even more effort into giving the combinatorics interesting effects. Not all players realized that pairs of emotions could be meaningful, and that may mean that we still underpopulated that state space.

A number of players explicitly said that they enjoyed regenerating their character’s text, as a fun/playful/toy aspect of the game, while a few others said that it felt emotionally distancing that they were allowed to do so, and they would rather have been forced to commit to an emotional reaction before seeing what text that turned into. That would have felt much more visceral, and I think in particular it would have decreased the backstory-exploration aspects of the story, in favor of something that really emphasized the emotional wrangling. The underlying engine would allow any of those affordances with some tweaks to the Unity UI level, so it is definitely possible to explore some more of those concepts in the future.

Another point several people raised was that the topic-selecting freedom let the protagonist change the subject abruptly. The game does constrain that in a few places — occasionally an NPC will ask you a significant question that demands an answer, not a non-sequitur — and it was a choice not to do more of that. We also could have set things up so that the player could only move to topics that were somehow related on a graph to the current one, or so that changing the subject drastically caused the other character to react with surprise/confusion.
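Both of those alternatives can be sketched with a simple topic graph. The Python below is a hypothetical illustration, not something Restless does: the graph contents, the function names, and the "follows" / "confused" labels are all made up for the example.

```python
# Hypothetical topic graph: each topic maps to the topics it is related to.
TOPIC_GRAPH = {
    "house": {"noises", "Sylvie"},
    "Sylvie": {"house", "Anna"},
    "Anna": {"Sylvie"},
    "noises": {"house"},
}


def allowed_topics(current, graph=TOPIC_GRAPH):
    """Option 1: only the current topic and its neighbours are selectable."""
    return {current} | graph.get(current, set())


def npc_reaction(current, requested, graph=TOPIC_GRAPH):
    """Option 2: any topic is selectable, but an unrelated jump confuses the NPC."""
    if requested in allowed_topics(current, graph):
        return "follows"
    return "confused"
```

The first option restricts the menu itself, which keeps conversations coherent at the cost of player freedom. The second keeps the freedom but makes abrupt subject changes a visible, in-fiction event.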

I’d also say, as an aesthetic point, that Restless taught me a few things about the ideal size for the elements being swapped out by the generative grammar. In this project, we often provided a lot of substitute forms for nouns and verbs, and that can lead to some less-idiomatic expressions without actually giving good value from a gameplay perspective. This and subsequent work have taught me that it’s often more valuable to focus on swapping out elements that are on the level of dialogue actions: is the character using hedges (“somewhat”, “a bit”, “I think [statement]”)? Emphases, as in the “why not / why the hell not” example above? Some individual words and phrases can be productively swapped, especially when they’re e.g. the name that a character is using to refer to another character.

But the takeaway from this is not “provide variants for everything.” Instead, it’s best to focus on variation where a) the variation is within a tightly constrained generative space and variety is part of the point (for instance, Restless can generate many many possible ice cream flavors); or b) the variation is tightly tied to the statefulness we’re trying to communicate.

Conclusion

There are a lot of additional experiments to do in this space. Character Engine is flexible enough to allow for many different UI choices, and those choices are likely to yield significantly different player experiences.

The combination of world model, narrative structure, and dynamic generative text does afford the freedom to make low-stakes exploratory conversations side-by-side with higher-stakes emotional ones. We can give the player an expressive freedom that’s not available in most conversation games, and build content more efficiently for that kind of experience.

If you’d like to do your own work with these tools, please do get in touch. I can be reached as emily at spiritai.com, as well as through my usual contact information here.

Originally published at emshort.blog on January 20, 2019.