M2M Day 371: My computer just can’t handle this

Max Deutsch
4 min read · Nov 7, 2017


This post is part of Month to Master, a 12-month accelerated learning project. For October, my goal is to defeat world champion Magnus Carlsen at a game of chess.

Last night, I left my computer running while it processed chess games using my new labelling method from yesterday.

As a reminder, this new method labels all chess positions from a winning game as +1 and all chess positions from a losing game as -1. Then, each time the program sees the same chess position, it adds these +1 and -1 labels together until a final total is reached. The higher the number, the more definitively “good” the position is, and the lower the number, the more definitively “bad” the position is.
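
As a rough illustration of the tallying idea, here is a minimal Python sketch. It is not the actual program: the function name tally_positions, the (positions, result) game format, and the placeholder position strings are all assumptions made for the example.

```python
from collections import Counter


def tally_positions(games):
    """Sum +1/-1 labels per position across many games.

    `games` is assumed to be an iterable of (positions, result) pairs,
    where `positions` is a list of hashable position keys (for example
    FEN-style strings) and `result` is +1 for a win or -1 for a loss.
    """
    tally = Counter()
    for positions, result in games:
        for position in positions:
            tally[position] += result
    return tally


# Hypothetical toy data: three games sharing some positions.
games = [
    (["pos_a", "pos_b"], +1),  # a winning game
    (["pos_a", "pos_c"], +1),  # another winning game
    (["pos_a", "pos_c"], -1),  # a losing game
]

tally = tally_positions(games)
print(tally["pos_a"], tally["pos_b"], tally["pos_c"])
```

In this toy data, pos_c shows up in one win and one loss, so its total lands at zero, which is the kind of position that carries no clear signal.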

After waking up this morning, I saw that my program had crashed after processing around 700,000 chess games. Simply, my computer ran out of usable memory.

Sadly, as a result, none of the output was saved.
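
One way to avoid losing everything in a crash like this is to write the running tally to disk every so often. The sketch below is only an assumption about how that could look, not what my program does: save_checkpoint, load_checkpoint, process_games, and the JSON file layout are hypothetical names and choices, and positions are assumed to be strings so they can serve as JSON keys.

```python
import json
import os
import tempfile
from collections import Counter


def save_checkpoint(tally, games_done, path):
    """Write the tally to a temp file, then swap it into place."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump({"games_done": games_done, "tally": dict(tally)}, f)
    os.replace(tmp_path, path)  # avoids leaving a half-written checkpoint


def load_checkpoint(path):
    """Return (tally, games_done), or an empty state if no file exists."""
    if not os.path.exists(path):
        return Counter(), 0
    with open(path) as f:
        data = json.load(f)
    return Counter(data["tally"]), data["games_done"]


def process_games(games, path, every=10_000):
    """Tally positions, saving progress every `every` games.

    Resuming assumes `games` yields the same games in the same order.
    """
    tally, done = load_checkpoint(path)
    for index, (positions, result) in enumerate(games):
        if index < done:
            continue  # already counted before the last checkpoint
        for position in positions:
            tally[position] += result
        done = index + 1
        if done % every == 0:
            save_checkpoint(tally, done, path)
    save_checkpoint(tally, done, path)
    return tally


# Hypothetical toy run in a temporary folder.
with tempfile.TemporaryDirectory() as folder:
    checkpoint_path = os.path.join(folder, "tally_checkpoint.json")
    toy_games = [(["pos_a", "pos_b"], +1), (["pos_a"], -1), (["pos_b"], +1)]
    print(process_games(toy_games, checkpoint_path, every=2))
```

This does not reduce memory use, so the program would still crash at the same point. It only means that a restart could pick up from the last saved tally instead of from zero.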

Thus, I restarted the program, setting the cutoff to around 700,000 games.

Once the dataset was successfully created, I uploaded it to Floyd (the cloud computing platform I use), mounted it to my train_model program, and started training my machine learning model.

However, Floyd (which is built on top of AWS) also ran out of memory very quickly and threw an error message. I maxed out the specs on Floyd and reran the program, only to run out of memory again.

So, I scaled things back and created a dataset based on 100,000 chess games… This still broke Floyd.

I scaled back to 25,000 chess games, and finally Floyd had enough memory capacity to handle the training.

I’ve been running the training program for about four hours now and the accuracy of the program has been steadily climbing, but it still has a long way to go:

It started at around 45.5% accuracy on the test data (worse than randomly guessing, assuming good and bad positions are about equal).

And, after four hours, reached about 54.4% accuracy on the test data (slightly better than randomly guessing)…

Hopefully, if I let this program run through the night, it will continue to steadily march up towards 99%. (The program is only cycling through ~400 iterations per hour, so this could take a long time.)

To hedge, I’m preparing a few other datasets that I also want to use for training purposes in parallel. In particular, I’m worried that, because I had to shrink down my dataset to get the program to run on Floyd, there may be many chess positions in my dataset that were only processed once, effectively labelled randomly (rather than being properly labelled by the aggregate view).

Thus, I’m creating a few datasets where I’m processing significantly more chess games, but only accepting chess positions into my labelled dataset that have been seen multiple times and demonstrate a definitive label (i.e. the chess position has a tally that is >25, or >50, or >100).

In this way, I can likely eliminate any one-off positions and create a dataset that has a cleaner divide between “good” positions and “bad” positions.
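
The filtering step could look something like the Python sketch below. The function name filter_definitive and the toy tallies are made up for illustration, and I’m assuming the cutoff applies symmetrically, so a tally below the negative cutoff counts as a definitive “bad” position.

```python
def filter_definitive(tally, threshold):
    """Keep only positions whose tally is clearly positive or negative.

    Returns a dict mapping position -> +1 ("good") or -1 ("bad").
    Positions with a tally between -threshold and +threshold are dropped.
    """
    labelled = {}
    for position, total in tally.items():
        if total > threshold:
            labelled[position] = 1
        elif total < -threshold:
            labelled[position] = -1
    return labelled


# Hypothetical tallies, including two near-zero one-off positions.
toy_tally = {"pos_a": 120, "pos_b": -60, "pos_c": 3, "pos_d": -1}

for threshold in (25, 50, 100):
    print(threshold, filter_definitive(toy_tally, threshold))
```

The tradeoff is visible even in the toy data: a higher cutoff gives cleaner labels but leaves fewer positions to train on.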

I also always have the option of introducing a third label called “neutral” that is assigned to these less definitive chess positions, but an additional label will add significantly more complexity, so it’s only worth it if it greatly increases the effectiveness of the algorithm.
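
For comparison, a three-label version might be sketched like this in Python. Again, label_with_neutral, the string labels, and the symmetric cutoff are assumptions for the example rather than anything I have actually built.

```python
def label_with_neutral(tally, threshold):
    """Assign "good", "bad", or "neutral" to every position.

    Unlike dropping the less definitive positions, this keeps them in the
    dataset, so the model has to learn three classes instead of two.
    """
    labels = {}
    for position, total in tally.items():
        if total > threshold:
            labels[position] = "good"
        elif total < -threshold:
            labels[position] = "bad"
        else:
            labels[position] = "neutral"
    return labels


# Hypothetical tallies for illustration.
example_tally = {"pos_a": 120, "pos_b": -60, "pos_c": 3}
print(label_with_neutral(example_tally, 25))
```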

Anyway, today has been a lot of waiting around, running out of memory, and crunching data. Hopefully, tomorrow, I’ll have some indication whether or not I’m headed in the right direction.

Honestly, this is starting to seem like a job for Google’s AlphaGo or IBM Watson as far as infrastructure and optimization are concerned. Is it too late in the game to pursue some kind of sponsorship/collaboration…?


Max Deutsch is an obsessive learner, product builder, guinea pig for Month to Master, and founder at Openmind.

If you want to follow along with Max’s year-long accelerated learning project, make sure to follow this Medium account.
