Deep Learning by Example

Checkers, anyone?

Decision-First AI
3 min read · Jun 10, 2017


In 1959, Arthur Samuel taught a machine to play checkers. He coined the phrase “machine learning”. And the rest was history… actually deep learning, well, really just partially.

As a result, checkers is now the most popular board game… the machine is the best player in the world… IBM stock went up 15 points in one day!

There we go. Samuel taught a machine to be a better player than he was, and the business world realized this was a pretty big deal… until it really wasn’t. Still, it is a great place to start learning the components of deep learning. So let’s jump in.

It Helps To Be Lazy

Checkers is not exactly the most complicated game on the planet. There are 64 squares on a checkerboard. Each side gets 12 pieces. Pieces move in a limited number of directions, and the rules are pretty straightforward. Unless it is the 1950s and your “super” computer has the same computational abilities as a wristwatch (and I am not talking about a Fitbit).

In which case, a game includes far more possibilities than you have punch cards, and you have ample incentive to be lazy. Samuel couldn’t run a Monte Carlo simulation or some other exhaustive search; he didn’t have the CPU. So he tried something else: he projected only a few moves ahead and scored the resulting configuration of the board. It is very similar to what human players do (minus the formal scoring).

Scoring, Pruning, and Minimax

Samuel’s score was based on several things: the number of pieces, the number of kings, and how close pieces were to becoming kings. The scores allowed his machine to value a series of moves without playing them out to completion. They also allowed him to use alpha-beta pruning.
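To make that concrete, here is a minimal sketch of that kind of evaluation function, not Samuel’s actual program or weights. The board encoding is hypothetical: a dict mapping (row, column) squares to 'b'/'w' for men and 'B'/'W' for kings, with black men crowning on row 7 and white men on row 0.

```python
# Hypothetical board encoding: {(row, col): piece}, where 'b'/'w' are men,
# 'B'/'W' are kings, black men crown on row 7 and white men on row 0.

def score(board, player='b'):
    """Score a position from `player`'s point of view; higher is better."""
    total = 0.0
    for (row, _col), piece in board.items():
        sign = 1 if piece.lower() == player else -1
        if piece.isupper():
            value = 3.0                      # kings are worth more than men
        else:
            # men get a small bonus for being closer to the crowning row
            advance = row if piece == 'b' else 7 - row
            value = 1.0 + 0.1 * advance
        total += sign * value
    return total
```

With a function like this, “a few moves ahead” just means: generate future positions, call score on each, and keep the best.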

Samuel’s machine would take the current state of the board and iterate through the potential future states produced over the next few moves. The structure this exploration produces is known as a search tree. By scoring the branches of the tree, traversal can be sped up by ignoring any branch that can no longer beat your current best score. This is known as pruning.
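A rough sketch of building that tree of future states, leaning on two hypothetical helpers, legal_moves(board, mover) and apply_move(board, move), that are assumed rather than shown:

```python
# Sketch of enumerating every position reachable within `depth` half-moves
# (plies). legal_moves() and apply_move() are hypothetical helpers.

def expand(board, mover, depth):
    """Yield every position reachable from `board` within `depth` plies."""
    if depth == 0:
        return
    opponent = 'w' if mover == 'b' else 'b'
    for move in legal_moves(board, mover):       # hypothetical: all legal moves
        child = apply_move(board, move)          # hypothetical: board after move
        yield child
        yield from expand(child, opponent, depth - 1)
```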

Scores are evaluated using a process known as minimax. Minimax is a rule for managing risk: the computer makes the choice that minimizes its loss in the maximum-loss (worst) case, assuming the opponent plays its best reply. As such, it is a very conservative algorithm.
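Putting the pieces together, here is a minimal depth-limited minimax with alpha-beta pruning, in the spirit of Samuel’s approach rather than a reconstruction of it. It reuses score() and the hypothetical legal_moves()/apply_move() helpers from above, and treats black ('b') as the maximizing side.

```python
def minimax(board, depth, maximizing, alpha=float('-inf'), beta=float('inf')):
    """Depth-limited minimax with alpha-beta pruning; black maximizes."""
    mover = 'b' if maximizing else 'w'
    moves = legal_moves(board, mover)            # hypothetical helper
    if depth == 0 or not moves:
        return score(board, 'b')                 # stop searching and score the board
    if maximizing:
        best = float('-inf')
        for move in moves:
            child = apply_move(board, move)      # hypothetical helper
            best = max(best, minimax(child, depth - 1, False, alpha, beta))
            alpha = max(alpha, best)
            if beta <= alpha:                    # prune: opponent already has a better line
                break
        return best
    else:
        best = float('inf')
        for move in moves:
            child = apply_move(board, move)
            best = min(best, minimax(child, depth - 1, True, alpha, beta))
            beta = min(beta, best)
            if beta <= alpha:                    # prune
                break
        return best
```

The pruning test is where the laziness pays off: once a branch is provably worse than a line already found, the rest of that branch is never explored at all.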

Shallow Learning

Samuel’s checkers player may best be labeled shallow learning. Deep learning is characterized by layers (often hidden) of similar algorithms; his did not swim so deep. But it was an excellent first step. Enough to dip our toes in the water, so to speak.

For those who would like to dive right into the code (yeah, really selling this metaphor), the following link can get you started:

Draughts, by the way, is the English name for Checkers. Thanks for reading!
