DeepMind & UCL’s Stochastic MuZero Achieves SOTA Results in Complex Stochastic Environments

By Synced | Published in SyncedReview | Jul 21, 2022

Introduced in 2020, MuZero (Schrittwieser et al.) is a model-based general reinforcement learning agent that combines a learned model of the environment dynamics with a Monte Carlo tree search planning algorithm. Although MuZero has achieved state-of-the-art results across domains ranging from board games to visually rich environments, it is limited to deterministic models and thus struggles in real-world environments that are inherently stochastic.

In the new paper "Planning in Stochastic Environments with a Learned Model", a research team from DeepMind and University College London proposes Stochastic MuZero for stochastic model learning. The novel approach achieves performance comparable or superior to state-of-the-art methods in complex single- and multi-agent environments while maintaining the superhuman performance of the original MuZero in deterministic environments such as the game of Go.

Stochastic MuZero combines a learned stochastic transition model of the environment dynamics with a Monte Carlo tree search (MCTS) variant, allowing it to model and plan in stochastic environments.
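
To make the contrast between deterministic and stochastic models concrete, here is a minimal toy sketch in Python. It is not the paper's implementation: the function names, the two-action `toy_model`, and its probabilities and rewards are all hypothetical, chosen only to illustrate why a planner that averages over chance outcomes can choose differently from one that assumes a single deterministic next state.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical toy format: each action maps to a list of
# (probability, reward, value_of_next_state) chance outcomes.
Outcome = Tuple[float, float, float]


def expected_action_value(outcomes: List[Outcome], discount: float = 1.0) -> float:
    """Average over all chance outcomes, weighted by their probability."""
    return sum(p * (r + discount * v) for p, r, v in outcomes)


def most_likely_action_value(outcomes: List[Outcome], discount: float = 1.0) -> float:
    """Keep only the most probable outcome, as a deterministic model would."""
    _, r, v = max(outcomes, key=lambda o: o[0])
    return r + discount * v


def best_action(model: Dict[str, List[Outcome]],
                evaluate: Callable[[List[Outcome]], float]) -> str:
    """Pick the action with the highest value under the given evaluator."""
    return max(model, key=lambda action: evaluate(model[action]))


# Illustrative numbers only (an assumption, not data from the paper).
toy_model: Dict[str, List[Outcome]] = {
    "safe": [(1.0, 1.0, 0.0)],
    "risky": [(0.6, 3.0, 0.0), (0.4, -10.0, 0.0)],
}

if __name__ == "__main__":
    print(best_action(toy_model, expected_action_value))
    print(best_action(toy_model, most_likely_action_value))
```

In this toy setup the deterministic view sees only the likely payoff of the "risky" action and prefers it, while the expectation over outcomes accounts for the less likely but costly result and prefers "safe".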
