Towards *Generalized* Game-Playing RL Agents

Overview of the paper “Crossing The Gap: A Deep Dive into Zero-Shot Sim-to-Real Transfer for Dynamics” by E Valassakis et al.

Chintan Trivedi
deepgamingai

--

I am of the strong opinion that Reinforcement Learning (RL) is the future of game-playing AI agents. However, to transition to a world where we can use RL in real-life games, there are many sub-problems we need to solve first. One such sub-problem is that it is very difficult to build a generalized AI that plays well in games it wasn't trained on, even when the genre is exactly the same.

For example, if we train a bot that is excellent at driving a car in Gran Turismo, it will not perform very well in other racing games like Forza Horizon. This means we need to train RL agents from scratch for every game, which is far from ideal because current RL algorithms are not very sample-efficient and take a lot of resources to train.

This is why today I want to cover a review paper from the robotics world that could also be applied to our problem in game development. It is titled "Crossing The Gap: A Deep Dive into Zero-Shot Sim-to-Real Transfer for Dynamics" and was published by the Robotics Lab at Imperial College London.


This paper surveys the various techniques used in robotics research to transfer a trained model from simulation to real life. The problem is that robot and environment dynamics, such as surface friction, wind resistance, and the speed of a robotic arm, are ever so slightly different between the simulation and real life, making it difficult to transfer a model successfully between the two. A robot arm that performs well in simulation may therefore fail in real life due to these differences in dynamics, which closely mirrors the game-AI problem stated earlier.

It seems one of the better approaches to training robust RL policies is to introduce variation and randomness at training time, rather than adapting the learned policies at deployment time. By adding small random perturbations to the environment dynamics, the RL policy learns to generalize beyond any single configuration and transfers much better to real life. Similarly, applying random forces to the environment appears to work just as well, if not better, than tweaking the simulation dynamics.
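To make this concrete, here is a minimal sketch of the idea in Python. The `CartEnv` toy environment and all its parameters (`friction`, `force_scale`, the perturbation range) are my own illustrative assumptions, not from the paper: the randomized subclass re-samples the dynamics on every reset, and occasionally injects a random force during a step.

```python
import random


class CartEnv:
    """Toy 1-D cart environment: push the cart toward a target position."""

    def __init__(self, friction=0.1, force_scale=1.0):
        self.friction = friction        # dynamics parameter we may randomize
        self.force_scale = force_scale  # dynamics parameter we may randomize
        self.reset()

    def reset(self):
        self.pos, self.vel = 0.0, 0.0
        return self.pos

    def step(self, action):
        # action in {-1, +1}; the resulting motion depends on the dynamics
        self.vel += self.force_scale * action - self.friction * self.vel
        self.pos += self.vel
        reward = -abs(self.pos - 5.0)   # target position is 5.0
        return self.pos, reward


class DomainRandomizedEnv(CartEnv):
    """On every reset, sample the dynamics parameters from a range, so the
    policy never gets to overfit to one exact simulator configuration."""

    def __init__(self, friction_range=(0.05, 0.2), force_range=(0.8, 1.2),
                 perturb_prob=0.1, seed=None):
        self.rng = random.Random(seed)
        self.friction_range = friction_range
        self.force_range = force_range
        self.perturb_prob = perturb_prob
        super().__init__()

    def reset(self):
        # re-sample the dynamics for this episode
        self.friction = self.rng.uniform(*self.friction_range)
        self.force_scale = self.rng.uniform(*self.force_range)
        return super().reset()

    def step(self, action):
        pos, reward = super().step(action)
        # random-force injection: occasionally shove the cart
        if self.rng.random() < self.perturb_prob:
            self.vel += self.rng.uniform(-0.5, 0.5)
        return pos, reward
```

A policy trained against `DomainRandomizedEnv` sees a slightly different world every episode, so it cannot latch onto one exact friction value; that is the intuition behind why such policies transfer better.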


Something similar could be done when training game bots on a particular game. By randomizing parameters such as the agents' reaction time, we could build a more generalized game AI that can also be transferred to other games of the same genre.
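As one possible way to apply this, here is a hypothetical wrapper (my own sketch, not from the paper) that randomizes the agent's reaction time: each episode it samples a delay in frames, and the agent's actions only reach the game that many frames later, with no-ops filling the gap. It assumes a generic environment exposing `reset()` and `step(action)`.

```python
import random
from collections import deque


class ReactionTimeWrapper:
    """Hypothetical wrapper around a game environment: delays the agent's
    actions by a randomly sampled number of frames each episode, so the
    trained policy cannot rely on one exact reaction time."""

    def __init__(self, env, delay_range=(0, 3), noop_action=0, seed=None):
        self.env = env
        self.delay_range = delay_range    # min/max delay, in frames
        self.noop_action = noop_action    # action sent while waiting
        self.rng = random.Random(seed)

    def reset(self):
        delay = self.rng.randint(*self.delay_range)
        # buffer pre-filled with no-ops: actions take `delay` frames to land
        self.buffer = deque([self.noop_action] * delay)
        return self.env.reset()

    def step(self, action):
        self.buffer.append(action)
        # the environment executes the action queued `delay` frames ago
        return self.env.step(self.buffer.popleft())
```

A bot trained under varying reaction delays should, by the same argument as domain randomization, be less brittle when moved to a game whose timing differs slightly from the one it was trained on.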

Thank you for reading. If you liked this article, you may follow more of my work on Medium, GitHub, or subscribe to my YouTube channel.


AI, ML for Digital Games Researcher. Founder at DG AI Research Lab, India. Visit our publication homepage medium.com/deepgamingai for weekly AI & Games content!