M2M Day 370: I think this new approach might actually work

Max Deutsch
5 min read · Nov 6, 2017


This post is part of Month to Master, a 12-month accelerated learning project. For October, my goal is to defeat world champion Magnus Carlsen at a game of chess.

This entire month’s challenge has hinged on a single question:

If I have a big set of chess positions labelled as either “good” or “bad”, can I create a function that correctly maps the numerical representations of these chess positions to the appropriate label with a very high degree of accuracy?

I think the answer to this question is yes, but it assumes that the original dataset is properly labelled. In other words, if I’ve incorrectly labelled some chess positions as “good” when they are actually “bad” and some as “bad” when they are actually “good”, my function won’t be able to find and correctly extrapolate patterns.

I’m almost certain that this is the current problem I’m facing and that I’ve mislabelled much of my dataset.

Let me explain where I went wrong (and then share how I’m going to fix it)…

Labelling my dataset v1

The dataset I’ve been using was labelled in the following way:

  1. For a given chess position, compute its evaluation.
  2. Then, for this given position, compute the evaluation of the position that came immediately before it.
  3. Find the difference between the evaluations (i.e. the current evaluation minus the previous evaluation).
  4. If this difference is greater than a certain threshold value, label the position as “good”; otherwise, label it as “bad”.

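The v1 rule above can be sketched as a small function. This is just an illustration: the evaluations would come from an engine like Stockfish in practice, and the default threshold of 0.0 is an assumed value, not one stated in the post.

```python
def label_v1(prev_eval, curr_eval, threshold=0.0):
    """Label a position by the quality of the move that produced it.

    prev_eval and curr_eval are numeric engine evaluations (e.g. from
    Stockfish) of the previous and current positions. The threshold
    default of 0.0 is an assumption for illustration.
    """
    diff = curr_eval - prev_eval
    # An improvement beyond the threshold means the last move was good,
    # so the resulting position is labelled "good"; otherwise "bad".
    return "good" if diff > threshold else "bad"
```

For example, `label_v1(0.5, 1.2)` returns `"good"` and `label_v1(1.2, 0.5)` returns `"bad"`.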
This method assumes that if the most recent move you made reduced the evaluation of the position, it was a bad move. Therefore, if you are evaluating this position, you should consider it bad because the move that got you there was also bad. (The same logic applies to good positions.)

But there’s a problem here: there are multiple ways to reach any given position, and the final moves leading to it aren’t necessarily all “good” or all “bad”.

For example, in one case, I can play a sequence of all good moves, followed by one bad move, and end up in the “bad” position. In another case, I can start the game by making bad moves, followed by a series of good moves (in an attempt to work out of the mess I created), and still end up in the same position.

In other words, it’s highly likely that, in my dataset, there are multiple copies of the same chess position with different labels.

If this is the case, it’s no surprise that I couldn’t find a reasonable function to describe the data.

Originally, I overlooked this problem via the following justification:

Sure, there might be multiple ways to get to a single chess position, but because I’m going to be using the algorithm for every move of the game, I will only be making a sequence of “good” moves prior to the final move in the sequence. Thus, the evaluation of a given chess position will only be based on this final move, where if the move is “good”, then the chess position is naturally good, and if the move is “bad”, then the chess position itself is bad.

Even though this is how I would implement my chess algorithm in practice, my dataset was still created with all possible paths to a given chess position, not just the one path that I would take during game time.

And so, I essentially created a crappy dataset, resulting in a crappy machine learning model and an ineffective chess algorithm.

Labelling my dataset v2

An hour ago, I came up with a much better (and computationally faster) way to label my dataset. Here’s how it works:

  1. Load a bunch of chess games.
  2. For each game, determine if White won or lost.
  3. For each chess position in each game, label the position +1 if White won the game, -1 if White lost the game, and 0 if the game was a draw.
  4. Process millions of games in this way, and keep a running tally for each distinct position (i.e. If the position has never been seen before, add it to the list. If the position has already been seen, increment the total associated with the position by either +1 or -1).
  5. Once all the games are processed, go through each of the distinct positions, and label the position as “good” if its tally is greater than or equal to zero, and “bad” if its tally is less than zero.
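A minimal sketch of this tallying scheme, assuming each game is represented as a list of hashable position keys (FEN strings, say) plus a result of +1, -1, or 0 (the function name and game representation are my own, not from the original program):

```python
from collections import defaultdict

def label_v2(games):
    """games: iterable of (positions, result) pairs, where positions is
    a list of position keys and result is +1 (White won), -1 (White
    lost), or 0 (draw). Returns a dict mapping each distinct position
    to "good" or "bad"."""
    tally = defaultdict(int)
    for positions, result in games:
        for pos in positions:
            # Every position in a won game gets +1, in a lost game -1.
            tally[pos] += result
    # A tally >= 0 means the position led to at least as many wins as
    # losses, so it is labelled "good"; otherwise "bad".
    return {pos: ("good" if t >= 0 else "bad") for pos, t in tally.items()}
```

Note that because the tally is kept per distinct position, every position gets exactly one label, which is the whole point of v2.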

In this way, every chess position has a unique label. And each label should be highly accurate based on the following two thoughts: “Over millions of games, this chess position led to more wins than losses, so it’s almost certainly a good chess position” and “Over millions of games, this chess position led to more losses than wins, so it’s almost certainly a bad chess position”.

Additionally, because I don’t need to use the Stockfish engine to evaluate any of the moves, my new create_data program runs about 10,000x faster than my previous program, so I’ll be able to prepare much more data (it processes 1,000 chess games, or about 40,000 chess positions, every 15 seconds).

Right now, I have it set up to process 2 million games or about 80 million chess positions overnight.
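As a quick sanity check that 2 million games really is an overnight job at the stated rate of 1,000 games per 15 seconds:

```python
games = 2_000_000
rate = 1000 / 15           # games per second at the stated rate
hours = games / rate / 3600
print(round(hours, 1))     # about 8.3 hours, which fits "overnight"
```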

It’s been running for 30 minutes now and has already processed 193,658 games.

Honestly, this morning I wasn’t sure how or if I was going to make forward progress, but I feel really optimistic about this new approach.

In fact, I can’t see how this data wouldn’t be clean (so if you have any ideas, leave a comment and let me know).

Assuming this data is clean though, I should be able to generate a workable chess algorithm. Whether or not this chess algorithm will be human-learnable is another story. But, I’m going to take this one step at a time…

Read the next post. Read the previous post.

Max Deutsch is an obsessive learner, product builder, guinea pig for Month to Master, and founder at Openmind.

If you want to follow along with Max’s year-long accelerated learning project, make sure to follow this Medium account.
