Creating a voice assistant for games (tutorial for FIFA)
Play games with voice commands using a Deep Learning powered wake-word detection engine.
Voice assistants like Amazon Alexa and Google Home have become widely popular for automating and simplifying mundane, everyday tasks. They allow users to quickly get things done by using their voice, without having to go through multiple interactions with their phones or computers.
Their rise in popularity and recent widespread adoption are in no small part thanks to improvements in speech recognition technology, driven by advances in Deep Learning.
I wanted to explore how we can leverage this tech in order to make our gaming experience better by adding a new input method to go along with the legacy game-pad control system. Hence, I created a voice assistant for the soccer/football simulation game FIFA which can change your team’s tactics or perform skill moves and goal celebrations in a match, all with just voice commands.
In this tutorial, I’ll cover how you can recreate the same for FIFA, but you can also follow the same steps to create your own custom voice command and add it to any game of your liking!
Motivation: Why I chose FIFA
If you have been playing FIFA through the years like me, you would know about the different key combinations you need to remember in order to change tactics in the game. To change formations or substitute players during a match, you need to pause the game and go to the menus, which breaks the flow and becomes really annoying, especially in online mode. Moreover, there are more key combinations to remember for the different goal celebrations, and you usually end up remembering only a few. The same goes for in-game skill moves, which add almost a hundred more key combinations to remember.
Let’s start by seeing how speech recognition engines work in voice assistants.
How do speech recognition algorithms work?
Assume we want to detect the wake-word “Okay EA”.
We capture the raw audio from the microphone and convert it to a spectrogram, a visual representation of the spectrum of frequencies present in the sound over time. This is inspired by how the human ear captures audio. Next, we feed thin slices of this spectrogram to a Recurrent Neural Network model as consecutive time steps. This model is trained to predict the likelihood of each character being spoken in that time frame. Putting these likelihoods together in a sequence gives us the textual form of the words spoken in the audio input, thus converting speech to text.
We’ll use a library called Porcupine, which claims to use such deep learning models to perform real-time keyword detection; we can use it to identify voice commands. Clone the repository below; we’ll use it next to create custom voice commands.
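To make the spectrogram step concrete, here is a minimal sketch in pure Python. It is not how Porcupine computes its features; the frame length, hop size, Hann window, and the synthetic 1 kHz test tone are all assumptions chosen for illustration. Each row of the result is one thin time slice, and each column is a frequency bin.

```python
import cmath
import math


def frame_signal(samples, frame_len, hop):
    """Split a list of samples into overlapping frames (thin time slices)."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0..N/2 after applying a Hann window."""
    n = len(frame)
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(windowed)))
            for k in range(n // 2 + 1)]


def spectrogram(samples, frame_len=64, hop=32):
    """One magnitude spectrum per frame: rows are time steps, columns are bins."""
    return [magnitude_spectrum(f) for f in frame_signal(samples, frame_len, hop)]


# Synthetic input (assumed values): a 1 kHz tone sampled at 8 kHz.
SAMPLE_RATE = 8000
FRAME_LEN = 64
tone = [math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE) for t in range(512)]

spec = spectrogram(tone, frame_len=FRAME_LEN, hop=32)

# The strongest bin of the first time slice, converted back to Hz.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * SAMPLE_RATE / FRAME_LEN
```

A real pipeline would use an FFT and usually a mel-scaled filter bank rather than this naive DFT, but the shape of the data handed to the model is the same: a sequence of per-frame frequency vectors.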
Tutorial
For this tutorial, I’ll show how we can create a custom wake-word (like “Alexa” or “Okay Google”) and then fire a command to perform a skill move in the game.
Step 1: Create custom voice commands
From the root directory of the cloned Porcupine repository, run the following in a terminal to create a wake-word detection file that activates the assistant. I’m using “Okay EA” (since EA is the publisher of the FIFA series). The file produced by this command will be stored in a directory named output.
tools/optimizer/windows/amd64/pv_porcupine_optimizer -r resources/ -w "Okay E A" -p windows -o output/
Next, create the command to perform a “rainbow flick” skill move.
tools/optimizer/windows/amd64/pv_porcupine_optimizer -r resources/ -w "flick right" -p windows -o output/
This will give us two .ppn files in the output directory.
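If you plan to add many commands, the two optimizer invocations above can be scripted. The sketch below only assembles the same command line shown above, with the same `-r`, `-w`, `-p` and `-o` flags, for each phrase. The helper names `optimizer_command` and `build_keywords` are hypothetical, and nothing is executed until you call `build_keywords` yourself from the repository root.

```python
import subprocess

# Path to the optimizer binary, as used in the commands above.
OPTIMIZER = "tools/optimizer/windows/amd64/pv_porcupine_optimizer"


def optimizer_command(phrase, platform="windows",
                      resources="resources/", output="output/"):
    """Build the argument list for one keyword (hypothetical helper)."""
    return [OPTIMIZER, "-r", resources, "-w", phrase, "-p", platform, "-o", output]


def build_keywords(phrases, run=subprocess.run):
    """Run the optimizer once per phrase; `run` is injectable for dry runs."""
    for phrase in phrases:
        run(optimizer_command(phrase), check=True)


# The wake word and the skill-move command from this tutorial.
PHRASES = ["Okay E A", "flick right"]
commands = [optimizer_command(p) for p in PHRASES]
```

Passing the arguments as a list avoids shell quoting issues with phrases that contain spaces.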
Step 2: Set up the Porcupine detection engine
Step 3: Set up real-time command detection from the microphone
Download the helper file directkeys.py from here. This file helps simulate keyboard key presses so that the Python script can interact with the game. Then, execute the following to detect keywords in a continuous loop and activate the relevant skill move when the appropriate command is detected.
Note that lines 38 to 48 of the script need to match the keys you need to press in the game in order to perform the skill move. You can add more such moves by using their respective key combinations.
That’s all! You can now use this script to detect voice commands and activate the skill in the game. Find the full code, with more supported actions, in my GitHub repository below.
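The mapping from a detected command to key presses can be sketched independently of the audio code. The snippet below is not the script from this article. It assumes a command only counts if it follows the wake word, and the key names in `KEY_SEQUENCES` are placeholders that you would replace with the combination your game uses. The `tap_key` callable is injected, so in a real setup it would wrap whatever press and release helpers directkeys.py provides; here it just records the keys.

```python
WAKE_WORD = "okay ea"

# Placeholder bindings (assumed): replace with your game's real key combinations.
KEY_SEQUENCES = {
    "flick right": ["KEY_A", "KEY_B", "KEY_B"],
}


class VoiceCommander:
    """Two-stage dispatcher: wait for the wake word, then run one command."""

    def __init__(self, key_sequences, tap_key, wake_word=WAKE_WORD):
        self.key_sequences = key_sequences
        self.tap_key = tap_key      # callable that presses and releases one key
        self.wake_word = wake_word
        self.awake = False

    def on_keyword(self, keyword):
        """Handle one detected keyword; return the command executed, if any."""
        if keyword == self.wake_word:
            self.awake = True
            return None
        if self.awake and keyword in self.key_sequences:
            for key in self.key_sequences[keyword]:
                self.tap_key(key)
            self.awake = False      # require the wake word again next time
            return keyword
        return None


# Dry run: record key taps instead of sending them to the game.
pressed = []
commander = VoiceCommander(KEY_SEQUENCES, tap_key=pressed.append)
detections = ["flick right", "okay ea", "flick right", "flick right"]
executed = [commander.on_keyword(k) for k in detections]
```

In this dry run only the command that directly follows the wake word is executed. Adding a new move is then a matter of generating its .ppn file and adding one entry to the dictionary.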
Results
Goal-scoring celebration:
Changing tactics during game:
More such results can be found on my YouTube channel, with the video embedded below.
Conclusion
It would be great if EA could incorporate such a voice assistant in future editions of FIFA. It could become possible to give commands like “Substitute Pedro for Willian” or “Change formation to 4–4–2” and immediately see the changes in the game, without having to go to the pause menu. A voice assistant like this could also simplify the game’s controls and act as a complementary input method for casual gamers.
What do you guys think, would you like to use your voice as an additional input method to control games or are you satisfied with the existing game-pad controls? Let me know down below!
Thank you for reading. If you liked this article, please follow me on Medium, GitHub, or subscribe to my YouTube channel. Oh, and Happy New Year 2019!
Note: This is a repost of the article originally published with towardsdatascience in 2018.