AI Beats Grandmasters in Chess
By now, most people know of the famous 1997 match between IBM’s Deep Blue and grandmaster Garry Kasparov, arguably the greatest player ever. Kasparov lost, and now, over 20 years later, we have another AI: AlphaZero, built by DeepMind. These AIs are almost perfect at playing chess, but it turns out there is a lot to learn when we try to replicate them ourselves.
AI in chess
When creating an AI to play chess, there are two commonly used methods: supervised learning and reinforcement learning.
First, let me give you some background on chess. Every time you make a move, you create a position. An enormous number of chess games have been played and recorded throughout history, so in the early part of a game it is very likely that someone has already played the same moves and reached the same position as you. Supervised learning in chess is built on this fact.
Supervised learning makes sense of an environment (the chessboard) using historical data to try to pick the most rewarding move. There are huge databases containing millions of recorded chess games. We can program an AI to analyze all of these games so that, when in use, it can pick the move most likely to give it an advantage by referring to games that have already been played.
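To make the idea concrete, here is a minimal sketch of a move lookup built from past games. It is not the code from my project: the tiny game list, the position labels, and the function names are all made up for illustration, and a real system would key positions by something like a FEN string.

```python
from collections import defaultdict

# Hypothetical toy "database". Each game is a list of (position, move) pairs
# for White, plus the result from White's point of view
# (1.0 = win, 0.5 = draw, 0.0 = loss). Position names are invented labels.
GAMES = [
    ([("start", "e4"), ("after_e4_e5", "Nf3")], 1.0),
    ([("start", "e4"), ("after_e4_c5", "Nf3")], 0.5),
    ([("start", "d4"), ("after_d4_d5", "c4")], 1.0),
    ([("start", "e4"), ("after_e4_e5", "f4")], 0.0),
]

def build_book(games):
    """Map each position to {move: [total_score, times_played]}."""
    book = defaultdict(lambda: defaultdict(lambda: [0.0, 0]))
    for moves, result in games:
        for position, move in moves:
            stats = book[position][move]
            stats[0] += result
            stats[1] += 1
    return book

def suggest_move(book, position):
    """Return the move with the best average result, or None if unseen."""
    if position not in book:
        return None
    moves = book[position]
    return max(moves, key=lambda m: moves[m][0] / moves[m][1])

book = build_book(GAMES)
print(suggest_move(book, "start"))        # d4 (average score 1.0 vs 0.5 for e4)
print(suggest_move(book, "after_e4_e5"))  # Nf3
print(suggest_move(book, "never_seen"))   # None
```

Notice the last line: a position that never appears in the data gets no suggestion at all, which is exactly the weakness of relying on past games.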
Some of the move-suggestion tools on chess websites like chess.com, such as opening explorers, work in this spirit: they suggest a move by looking up what was played in their databases.
These have serious drawbacks, though. Firstly, the number of possible chess positions is astronomically large, far more than any database can hold. In almost every game you will, at some point, end up in a position that has never been played before. At that point, the AI has to refer to positions that are similar to the one it’s in. This sometimes leads not to the best move, but to one that is merely okay or mediocre.
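Here is a rough sketch of what "referring to a similar position" could look like. I'm assuming a very crude similarity measure (counting the squares that differ), and the positions are just an 8-character back rank instead of a full board, so treat the names and data as hypothetical.

```python
# Hypothetical mini "database": known positions mapped to the move played.
# Each position is only a back rank written as 8 characters ("." = empty);
# a real system would compare full 64-square boards.
KNOWN = {
    "RNBQKBNR": "e4",
    "RNBQK..R": "O-O",
    "R...KBNR": "O-O-O",
}

def hamming(a, b):
    """Count the squares on which two equally sized boards differ."""
    if len(a) != len(b):
        raise ValueError("boards must have the same number of squares")
    return sum(x != y for x, y in zip(a, b))

def suggest_from_similar(position, known=KNOWN):
    """Return (move, distance) taken from the most similar known position."""
    closest = min(known, key=lambda k: hamming(position, k))
    return known[closest], hamming(position, closest)

print(suggest_from_similar("RNBQK..R"))  # exact match: ('O-O', 0)
print(suggest_from_similar("R.BQK..R"))  # nearest neighbour: ('O-O', 1)
```

In the second call the position was never seen, so the move is borrowed from a board that differs by one square. Sometimes that works out, but nothing guarantees the borrowed move is still the best one.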
Secondly, the longer the game goes on, the longer it takes the AI to come up with a good move. Again, the sheer number of possibilities in chess works against the AI: as the game goes on, it has to look through more and more positions to find the right move. Unfortunately, chess is played under a time limit, so this doesn’t work in its favor either.
Thirdly (this point is more about building the AI and quality of life than about its chess-playing ability), on less powerful machines it takes an extremely long time to train these AIs because of the many millions of games they need to analyze. Training on a small number of games means the AI would struggle in more situations, which is not ideal.
Certainly a complicated topic, but I hope you’re still with me. Next, let’s take a look at the method I found more advantageous for my situation.
Reinforcement learning works on the basis that we can make an AI try to maximize a reward. It focuses on getting the AI to understand the changes in the environment that are caused by its own actions, whether that is in chess or in a video game. The AI then tries to predict which action will maximize its reward.
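One common way to turn "predict which action maximizes reward" into code is an epsilon-greedy rule: usually pick the action with the highest predicted reward, but occasionally try a random one so the AI keeps discovering new options. This is a sketch under that assumption, not necessarily what any particular engine does, and the reward estimates below are invented numbers.

```python
import random

def choose_action(predicted_reward, epsilon=0.1, rng=random):
    """Pick the action with the highest predicted reward, but explore a
    random action with probability epsilon.

    predicted_reward: dict mapping each legal action to the agent's current
    estimate of the reward that follows it.
    """
    actions = list(predicted_reward)
    if rng.random() < epsilon:
        return rng.choice(actions)                 # explore
    return max(actions, key=predicted_reward.get)  # exploit

# Hypothetical estimates for three candidate moves.
estimates = {"Nf3": 0.4, "Qh5": -0.2, "e4": 0.6}
print(choose_action(estimates, epsilon=0.0))  # always greedy: e4
```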
But computers and AI don’t inherently understand winning or losing, so we have to find another way to quantify it: points. Every time the AI changes its environment (in our example, the chessboard) by making a move that improves its position, it’s rewarded, with the maximum reward reserved for winning the game. If the AI makes a mistake or blunder, we penalize it by taking away some points. Over time, the AI learns which moves maximize reward in any position. The easiest way to train a chess-playing AI like this is to have it play itself over and over using the system I described above. If you’re asking how this is better than supervised learning, let me explain.
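As an illustration of the points idea, here is one possible reward function. I'm assuming the conventional piece values (pawn 1, knight and bishop 3, rook 5, queen 9) as the points and an arbitrary bonus of 100 for winning; the function names and the way pieces are passed in as strings are hypothetical simplifications.

```python
# Conventional piece values used as "points". The king is left out because
# losing it ends the game, which is handled by the win/loss bonus instead.
PIECE_VALUES = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9}
WIN_REWARD = 100  # assumed size of the maximum reward for winning

def material(pieces):
    """Total point value of a side's pieces, e.g. "QRP" -> 15."""
    return sum(PIECE_VALUES.get(p, 0) for p in pieces)

def reward(own_before, opp_before, own_after, opp_after, won=False, lost=False):
    """Points for one move, measured after the opponent's reply.

    The reward is the change in material balance, so winning material is
    rewarded and a blunder that loses material is penalized.
    """
    if won:
        return WIN_REWARD
    if lost:
        return -WIN_REWARD
    before = material(own_before) - material(opp_before)
    after = material(own_after) - material(opp_after)
    return after - before

print(reward("QRP", "RNP", "QRP", "RP"))   # captured a knight: 3
print(reward("QRP", "RNP", "RP", "RNP"))   # blundered the queen: -9
print(reward("QRP", "RNP", "QRP", "RNP", won=True))  # 100
```

Material is only one way to measure whether a move is "advantageous"; the same structure would work with any other scoring of the position.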
Firstly, reinforcement learning does not rely on historical games to make or suggest a move. We don’t need a reference for the AI to make the best possible move, as it has learned the concepts and rules itself. On top of that, the length of a chess game doesn’t matter to this type of AI, since it has learned which moves are advantageous in any position. As for training time, let me give you a pretty incredible statistic: the greatest grandmasters, such as Garry Kasparov and Magnus Carlsen, process about 2 chess positions per second, which is around the fastest the human mind can go. A computer can process thousands of positions per second. Not only does this give us a huge advantage in the number of moves we can consider, but it also means we can speed up games dramatically. With reinforcement learning, we can have our AI play thousands of games a minute, games that would take thousands of hours if humans played them at a normal pace.
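To show how cheap self-play is for a computer, here is a small sketch that learns by playing itself. Chess is far too big for a short example, so I've swapped in a toy game: a pile of 10 stones, each player takes 1 to 3 stones, and whoever takes the last stone wins. The learning rule is a simple table-based Q-learning update; the constants and names are my own assumptions for this sketch.

```python
import random
import time

PILE = 10      # stones at the start of each toy game
MAX_TAKE = 3   # a player takes 1 to 3 stones; taking the last stone wins

def legal_moves(stones):
    return range(1, min(MAX_TAKE, stones) + 1)

def best_move(q, stones):
    """Greedy move: the one with the highest learned value."""
    return max(legal_moves(stones), key=lambda a: q.get((stones, a), 0.0))

def self_play(games, alpha=0.5, epsilon=0.3, seed=0):
    """Learn move values by letting one table play both sides of every game."""
    rng = random.Random(seed)
    q = {}  # (stones, move) -> value for the player making the move
    for _ in range(games):
        stones = PILE
        while stones > 0:
            if rng.random() < epsilon:
                move = rng.choice(list(legal_moves(stones)))  # explore
            else:
                move = best_move(q, stones)                   # exploit
            remaining = stones - move
            if remaining == 0:
                target = 1.0  # maximum reward: this move wins the game
            else:
                # The opponent moves next, so whatever is good for them is
                # bad for us: negate the value of their best reply.
                target = -max(q.get((remaining, a), 0.0)
                              for a in legal_moves(remaining))
            old = q.get((stones, move), 0.0)
            q[(stones, move)] = old + alpha * (target - old)
            stones = remaining
    return q

start = time.perf_counter()
q_table = self_play(games=20_000)
elapsed = time.perf_counter() - start
print(f"20,000 self-play games in {elapsed:.2f} s")
print({s: best_move(q_table, s) for s in range(1, PILE + 1) if s % 4 != 0})
```

The known winning strategy in this game is to leave your opponent a multiple of four stones, and the learned table should end up picking exactly those moves without ever being shown a single example game. The timing will depend on your machine, so I haven't quoted a number.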
The rest of this article showcases some of the code and how I went about creating the AI.
Closing thoughts and results
So, in the end, after around 20 hours of training, I pitted the AI against a 2600-rated bot (2500 is the minimum rating needed to be considered a grandmaster). After 76 moves, the AI was able to secure a draw against an opponent rated above the grandmaster threshold. This project gave me real insight into which situations different types of AI are suited to. Hopefully, you learned something as well!