Open Sourcing the Magnificent Escape Action
In a previous post, I discussed the design and implementation of my Magnificent Escape game for the Google Assistant. I’ve received lots of questions regarding the code and the Dialogflow agent. So, I’ve decided to open source the game.
I’ll explain how the project is organized and also discuss the latest learnings from the game in this post.
The game’s Dialogflow agent and fulfillment source code are available on GitHub. Some of the main files include:
- app.js — the main entry point for the Node.js app, which launches a web server for handling the Dialogflow agent fulfillment.
- fulfillment.js — the main fulfillment endpoint for the web server.
- game.js — functions for managing the game logic and state.
- rooms.json — the JSON data structure for the rooms of the game.
- prompts.js — the prompts used in the responses to the user.
- analytics.js — a utility class for sending events to Google Analytics.
- utils.js — a collection of utility functions for handling intent matching and response generation.
The Action is implemented in Dialogflow and its Node.js fulfillment is deployed on App Engine. I have used App Engine for other large projects and it has been reliable and performed extremely well at scale.
As I mentioned in my previous posts about this game, I bought royalty-free audio from both Pond5 and Getty Images music. However, I don’t have the rights to distribute these sounds, so I’ve removed all of them from the open-sourced code and replaced them with free sounds from the Actions on Google sound library.
There are also various images and icons that I’ve replaced with Creative Commons licensed content from Wikimedia.
The source code
The latest version of the Action has over 6,000 lines of code. Originally, the main fulfillment logic for the intent handlers was all in a single Node.js file, which worked well for a small number of intents. But, as the number of intents grew, this became less manageable.
The Actions on Google Node.js client library doesn’t provide a good way to modularize the intent handlers into separate Node.js files. Two of the Actions on Google GitHub samples provide their own custom routing solutions for this: Number Genie and the I/O Action. Since I had the goal of open sourcing the game and ensuring that the code is reusable for other Actions, I decided to avoid these custom solutions.
One aspect of the evolution of the game that helped make the code more modular was the custom slot-filling logic. This required the introduction of helper functions, which could then be moved to another file to reduce the line count of the main fulfillment file. Other utility functions were also combined into another file to reduce the line count even more.
I wasn’t completely satisfied with how the design turned out and was going to spend some time modularizing the code even more, but the various options I considered would only have made the code more obscure and less useful for other developers to learn from and reuse. A better solution might be to have a routing option built into the client library that would support larger Actions. A related issue is that Dialogflow could also do a better job of allowing intents to be grouped instead of just providing a long list.
The Dialogflow agent
The Dialogflow agent has 122 intents and 12 entities. To make the custom slot filling work (which I discuss in my first post), I created an intent for each kind of slot data and assigned each its own context, which is dynamically enabled in fulfillment. The priority of these slot-filling intents is also set to the highest level, so that other intents with similar user phrases don’t get selected by Dialogflow while the associated context is live.
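The context-per-slot idea can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the game's actual code: the context names, the `SLOT_CONTEXTS` table and the `askForSlot` helper are hypothetical, and `createFakeConv` is a plain stand-in that only mimics the `conv.contexts.set(name, lifespan)` and `conv.ask(text)` call shapes of the Actions on Google Node.js client library.

```javascript
// Hypothetical mapping from slot type to the Dialogflow context that gates
// the matching slot-filling intent. These names are assumptions.
const SLOT_CONTEXTS = {
  direction: 'slot_direction',
  item: 'slot_item',
  color: 'slot_color',
};

// Stand-in for the library's conversation object, for illustration only.
function createFakeConv() {
  const active = {};
  return {
    responses: [],
    contexts: {
      set: (name, lifespan, parameters = {}) => {
        active[name] = { lifespan, parameters };
      },
      get: (name) => active[name],
    },
    ask(text) {
      this.responses.push(text);
    },
  };
}

// Ask the user for one missing value and enable only that slot's context,
// so the high-priority slot intent can match the next utterance.
function askForSlot(conv, slot, prompt) {
  const contextName = SLOT_CONTEXTS[slot];
  if (!contextName) {
    throw new Error(`Unknown slot type: ${slot}`);
  }
  // A short lifespan limits how long the slot intent stays eligible.
  conv.contexts.set(contextName, 1);
  conv.ask(prompt);
}

const conv = createFakeConv();
askForSlot(conv, 'direction', 'Which direction do you want to look?');
```

Because each slot intent lists its own context as an input context, it can only match while fulfillment has explicitly asked for that kind of value.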
Here is one of the slot intents for getting a direction value from the user:
The entities cover all the kinds of data that the game needs to get from the user to explore the rooms and change the state of the various items:
For each of these, I disabled the “Allow automated expansion” setting since I got some unexpected matches, but your mileage may vary.
The data for the rooms are stored in a single static JSON file. Since the room data wasn’t going to change often, there wasn’t a need to use a database.
Each room has the following features:
- Metadata that describes each room (e.g. name, level).
- Resources for screen devices (e.g. URLs for images).
- A set of rewards: users are rewarded with hints as they explore the room.
- A set of directions: north, south, east, west, up, down.
- Each direction has a description and at least one item.
- A set of stuff: items that are in the room that the user can look at and interact with.
- Each item supports multiple actions (e.g. look, move, use, etc.).
- Each item has a state and can also depend on the state of other items.
- Interacting with an item can change the item state and might reveal other items (e.g. open a drawer, find a tool).
- Users can collect items they find and add them to their inventory.
- Items in the inventory can be used on other items in the room (e.g. use the screwdriver on the screw).
- Special items, like safes, have a solution that the user needs to solve (e.g. a number of different turns of the dial).
- There are easter eggs hidden in each room, which reward a user with a special hint.
- Interacting with certain items will let the user escape the room (e.g. unlock a door).
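To make the list above more concrete, here is a small sketch of how a room with stateful items might be represented and updated. The field names (`stuff`, `requires`, `sets`, `reveals`) and the `interact` helper are illustrative assumptions; they are not the actual schema of rooms.json or the logic in game.js.

```javascript
// Hypothetical room definition; the field names are illustrative and are not
// the schema of the project's rooms.json.
const room = {
  name: 'office',
  level: 1,
  directions: {
    north: { description: 'There is a desk against the wall.', items: ['desk'] },
  },
  stuff: {
    desk: {
      state: 'closed',
      actions: {
        look: { response: 'A wooden desk with a single drawer.' },
        open: {
          requires: { desk: 'closed' }, // item states that must hold first
          sets: 'open',                 // new state of this item
          reveals: ['screwdriver'],     // items made visible by the action
          response: 'You open the drawer and find a screwdriver.',
        },
      },
    },
    screwdriver: {
      state: 'hidden',
      actions: { look: { response: 'A small flat-head screwdriver.' } },
    },
  },
};

// Apply an action to an item, updating item states in place, and return
// the text to speak to the user.
function interact(roomData, itemName, actionName) {
  const item = roomData.stuff[itemName];
  if (!item || item.state === 'hidden') {
    return "You don't see that here.";
  }
  const action = item.actions[actionName];
  if (!action) {
    return "You can't do that.";
  }
  // Check dependencies on the state of this item or other items.
  const requires = action.requires || {};
  const blocked = Object.keys(requires).some(
    (name) => roomData.stuff[name].state !== requires[name]
  );
  if (blocked) {
    return 'Nothing happens.';
  }
  if (action.sets) {
    item.state = action.sets;
  }
  (action.reveals || []).forEach((name) => {
    roomData.stuff[name].state = 'visible';
  });
  return action.response;
}
```

Keeping the rules in data like this means the fulfillment code stays generic: adding a new item is a change to the JSON rather than to the intent handlers.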
I had considered breaking out the data for each item so they could be reused across rooms. For the current set of rooms, there are variations in the items in every room so I wouldn’t have gotten much benefit in such a design. However, if a larger number of rooms need to be supported, then modularizing the room definitions would likely be necessary.
Based on the data in Google Analytics, the top Dialogflow intents matched for the first room, the office, are, in order of usage:
- Direction — for looking in a particular direction (e.g. “I want to look east”)
- Look — for looking at a particular item (e.g. “look at the desk”)
- Turns — for turning the dial of the safe (e.g. “turn to the right”)
- Open — for opening items (e.g. “open the drawer”)
- Side — for looking at the various sides of items (e.g. “look behind the painting”)
- Hint — for getting a hint on playing the game (e.g. “give me a hint”)
- Default Fallback — for handling any user input that isn’t matched by any other intents.
The results for the other rooms are similar, but the “Hint” intent gets invoked less.
I still find it very encouraging how hard some users try to win each room. Some users take hours exploring each room. Most of the users who win the first room take less than 20 minutes, and I’m assuming these users are more familiar with the game genre and mechanics.
For the lobby, it’s interesting to see how users pick a room. This matches our experience with users of templates, where we found that users will select answers to questions in various ways. These include selecting options by position or even giving partial option values. Even if you restrict the way you expect the user to select options, we’ve found that users don’t follow instructions and just want to answer the questions in a normal conversational manner. So, for this game, there are intents for “the bedroom”, “the first one” and “number two”.
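In the game this matching is handled by Dialogflow intents. Purely to illustrate the three selection styles (by name or partial name, by position, and by number), the same idea can be sketched in plain JavaScript. The room list and the `resolveRoom` helper below are hypothetical.

```javascript
// Hypothetical list of rooms offered in the lobby, in the order presented.
const ROOMS = ['office', 'bedroom', 'garage'];

const ORDINALS = new Map([['first', 0], ['second', 1], ['third', 2]]);
const NUMBERS = new Map([
  ['one', 0], ['two', 1], ['three', 2],
  ['1', 0], ['2', 1], ['3', 2],
]);

// Resolve a free-form selection to a room name, or null if nothing matches.
function resolveRoom(utterance) {
  const words = utterance.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
  // By name, allowing partial values of three or more letters ("bed").
  for (const word of words) {
    const match = ROOMS.find(
      (name) => word.length >= 3 && name.startsWith(word)
    );
    if (match) return match;
  }
  // By position ("the first one"), checked before numbers so that the
  // trailing "one" is not read as a number.
  for (const table of [ORDINALS, NUMBERS]) {
    const word = words.find((w) => table.has(w));
    if (word !== undefined) return ROOMS[table.get(word)];
  }
  return null;
}
```

Returning `null` for unmatched input leaves room for the caller to re-prompt rather than guess.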
Interestingly, the “No-input” intent appears in the top 10 intents for the lobby. The game implements good error recovery techniques to re-prompt the user to select a room.
Also very interesting is that the “Cancel” intent doesn’t have a high hit rate for the lobby. This means that most users actually pick a room and give the game a try.
I’m very happy with how well the Google Analytics Measurement Protocol has worked out for tracking various in-game events and data. If anything, I should be tracking even more aspects of the game. You can read more about how to use Analytics in my post “Analytics for Actions”.
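For readers unfamiliar with the Measurement Protocol, an event hit is just a form-encoded set of parameters posted to a collection endpoint. The sketch below builds such a payload using the protocol's v1 parameter names; the `buildEventHit` helper, the tracking ID, the client ID and the category/action/label values are placeholders of my own and are not taken from analytics.js.

```javascript
// Collection endpoint for the Measurement Protocol (v1).
const GA_ENDPOINT = 'https://www.google-analytics.com/collect';

// Build the form-encoded body for a single event hit.
function buildEventHit({ trackingId, clientId, category, action, label }) {
  const params = new URLSearchParams({
    v: '1',          // protocol version
    tid: trackingId, // Analytics property ID
    cid: clientId,   // anonymous client ID
    t: 'event',      // hit type
    ec: category,    // event category
    ea: action,      // event action
  });
  if (label !== undefined) {
    params.append('el', label); // optional event label
  }
  return params.toString();
}

// Placeholder values; the body would be sent as an HTTP POST to GA_ENDPOINT.
const body = buildEventHit({
  trackingId: 'UA-XXXXX-Y',
  clientId: '555',
  category: 'office',
  action: 'intent',
  label: 'Direction',
});
```

Recording the room as the category and the matched intent as the label is one way to get per-room intent rankings like the ones listed earlier.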
Reflection on the project
It has been quite gratifying to work on a project that is much larger than the typical samples my team creates for developers in my role as a Developer Relations Engineer at Google. I’ve encountered many issues and designed various solutions. These have been passed on to our product and engineering teams to help improve our developer experience.
It has also been encouraging to see the reactions from developers on my previous posts about the project and how this has helped them with various production issues they have encountered.
I still think an escape-the-room game was a good genre to implement as an Action. Similar games have had success on other voice platforms. Also, it is a simple game mechanic that can quickly be learned by novice users. I was particularly interested in how well this would work on audio-only devices, and I’ve been very pleased with how well it has turned out.
Creating a production-ready Action that follows best practices is a substantial effort that requires a large number of intents and associated fulfillment logic. It also requires a range of expertise, especially regarding conversation design, which most developers are not familiar with. However, the effort can provide an engaging experience for users, even at this early stage of voice platforms.
By open sourcing the game, I hope this encourages more developers to give Actions a try. It really can be a lot of fun! I encourage you to take as much of the code and as many Dialogflow intents as you need for your own ideas, and hope that you also share your experiences with your fellow developers.
Want more? Join the Actions on Google developer community program and you could earn a $200 monthly Google Cloud credit and an Assistant t-shirt when you publish your first app.