Pitting Game-Playing Agent Against Game-Designing Agent
Overview of the paper “Fully Differentiable Procedural Content Generation through Generative Playing Networks” by Bontrager and Togelius.
In an earlier article, I shared a Procedural Content Generation (PCG) paper that showed how Reinforcement Learning can be used to train an agent to design game levels instead of playing them. Once that agent designed a level, the level’s quality was evaluated with a hand-crafted method, which made the approach cumbersome to apply across different games.
So today, I want to share a follow-up paper from the same research group, titled “Fully Differentiable Procedural Content Generation through Generative Playing Networks” by P. Bontrager and J. Togelius. The key difference in this paper is that it simultaneously trains a game-playing RL agent and a level-generating agent, with the two in a symbiotic relationship.
The RL agent learns to win the game by completing the levels it is given, while the generator agent supplies it with increasingly difficult levels. The reinforcement learning method used is an Actor-Critic setup: the Actor network learns which actions to take in the current state, and the Critic network learns to estimate how well the agent is likely to do from that state. This architecture matters because the Critic is central to the evaluation process: its estimates are the signal that drives the Generator to produce increasingly challenging levels.
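The feedback loop described above can be sketched in a few lines of toy Python. Everything here is an illustrative assumption, not the paper’s implementation: the actual method trains neural networks end-to-end with gradient descent, whereas this sketch uses scalar “skill” and “difficulty” values and hand-written update rules purely to show the structure of the loop, where the critic’s value estimate both improves the agent and steers the generator toward harder levels.

```python
import random

def generate_level(generator_params, noise):
    # Hypothetical toy generator: maps random noise to a level difficulty.
    return generator_params["difficulty_scale"] * noise

def critic_value(level_difficulty, agent_skill):
    # Toy critic: estimated return is high when the agent's skill
    # comfortably exceeds the level's difficulty, low otherwise.
    return max(0.0, agent_skill - level_difficulty)

def train_step(generator_params, agent_skill):
    noise = random.random()
    level = generate_level(generator_params, noise)
    value = critic_value(level, agent_skill)
    # The agent improves slightly by playing the level.
    agent_skill += 0.05 * value
    # The generator nudges difficulty so the critic's value sits near a
    # mid-range "challenging but solvable" target (0.5 here, arbitrary).
    target = 0.5
    generator_params["difficulty_scale"] += 0.1 * (value - target)
    return generator_params, agent_skill

params = {"difficulty_scale": 0.5}
skill = 1.0
for _ in range(100):
    params, skill = train_step(params, skill)
```

In the real system the roles are the same, but the critic is a learned value network and the generator is updated by backpropagating through the critic rather than by the hand-tuned rule above.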
This is a notable paradigm shift in procedural content generation because it lets us simultaneously train agents that learn to design and to play a game. Since the system produces a difficulty estimate for every artificially generated level, we can use that signal to automatically generate new levels that are fun and challenging for humans to play.