Monte Carlo Tree Search (MCTS), AlphaZero and MuZero for dummies!

michelangelo
2 min read · Aug 4, 2022

I will try to explain and code a simple but working implementation of Monte Carlo Tree Search (MCTS), AlphaZero and MuZero, in order to play video games!

Photo by GR Stocks on Unsplash

If you are a computer scientist like me, you have likely come across the AI research and results coming out these days.
Chances are, you already know about DeepMind and similar AI companies, and how they are revolutionizing AI research and development in pursuit of AGI (Artificial General Intelligence).

For me, everything changed when I saw the AlphaGo documentary, more or less a year ago. This is what I understood at the time: some very smart people at DeepMind managed to create an algorithm that learns, through self-play, to play video games and board games at a superhuman level (some of these games, like Go, are very difficult given the size of their state space).

How did they do it? This is the question that has stuck in my mind ever since.

I am a computer scientist, though not one specialized in AI, but damn, I wanted to understand how they did it. After all, these are the video games I played as a kid, and now it is possible to write algorithms that learn how to play them.

I wanted to try to replicate their results on a small scale.

So this is where my journey started: reading some general books and watching videos about AI, and then graduating from Udacity's Deep Reinforcement Learning course.

This series of articles assumes you already have some familiarity with basic programming (we will use Python), with basic ML (Machine Learning) and RL (Reinforcement Learning) concepts, and with the algorithms themselves. In it, I will try to show how to implement a (very) simple but working version of the MCTS, AlphaZero and MuZero algorithms, and to explain some of the key concepts behind them.

This is the kind of implementation I wish I had when I started learning these concepts. The ones I found may be much better and more complete, but they are also quite complex, which makes it hard to see the basic ideas behind the algorithms.

So stay tuned and see you in the next articles!
