DeepMind’s AlphaStar Unplugged Offline RL Benchmark: Baseline Agent Achieves 90% Win Rate Against SOTA AlphaStar Supervised Agent
StarCraft II is one of the most challenging reinforcement learning (RL) environments: it requires agents to carry out smart strategic planning over long time horizons while executing actions in real time.
While online RL algorithms have achieved great success by training in such challenging environments, most real-world applications require RL agents to learn in an offline setting, from a fixed dataset and without further environment interaction. This calls for more challenging offline RL benchmarks for agent training.
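To make the distinction concrete, the simplest offline approach is behavior cloning: fit a policy to logged (state, action) pairs with no environment interaction at all. The sketch below is a minimal illustration on a toy tabular problem; the dataset, state/action sizes, and demonstrator rule are all hypothetical and not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy offline dataset of (state, action) pairs logged by a demonstrator.
# The agent trains purely from this fixed dataset -- the defining
# constraint of the offline RL setting.
num_states, num_actions, n = 4, 3, 2000
states = rng.integers(0, num_states, size=n)
# Hypothetical demonstrator: action = state mod num_actions, 10% noise.
actions = np.where(
    rng.random(n) < 0.9,
    states % num_actions,
    rng.integers(0, num_actions, size=n),
)

# Behavior cloning: minimize cross-entropy between a tabular softmax
# policy and the logged actions via full-batch gradient descent.
logits = np.zeros((num_states, num_actions))
lr = 0.5
for _ in range(200):
    probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    grad = np.zeros_like(logits)
    for s, a in zip(states, actions):
        g = probs[s].copy()
        g[a] -= 1.0  # gradient of cross-entropy for one sample
        grad[s] += g
    logits -= lr * grad / n

greedy = logits.argmax(axis=1)
print(greedy)  # recovers the demonstrator's majority action per state
```

Real offline RL methods go beyond cloning (e.g. by reweighting actions with value estimates), but the training loop shares this shape: every update is computed from the logged dataset alone.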
In the new paper AlphaStar Unplugged: Large-Scale Offline Reinforcement Learning, a DeepMind research team presents AlphaStar Unplugged, an unprecedentedly challenging large-scale offline RL benchmark that leverages an offline dataset from StarCraft II for agent training. The team's baseline offline agent achieves a 90% win rate against the previous state-of-the-art AlphaStar supervised agent.
The team views StarCraft II as a two-player game that combines high-level reasoning over long horizons with fast and delicate unit management. It is suitable for…