Part 1 — Computer Tic Tac Toe Basics
Basic Tic Tac Toe support classes and game logic
In the first part of this series, we will introduce the basic framework and helper classes that we will use throughout. We will also create the first computer player.
Let’s get started:
We will use the following classes which are defined in Board.py:
Board
: Contains all the Tic Tac Toe board state management plus some utility methodsGameResult
: Enum of all the possible game states. A game can be eitherNOT_FINISHED
,DRAW
,CROSS_WIN
, orNAUGT_WIN
CROSS
,NAUGHT
: Will tell our players what side they play, as well as indicate what pieces are on each square of the board - which can also beEMPTY
.
We also define a utility method print_board
that prints a board state pretty in HTML:
We can now create a new board and print it in all its empty glory like thus:
Time to make a move. We can use the methods random_empty_spot
and move
to find a random empty spot on the board and put a CROSS
there. We can then print the board to confirm:
We can use these methods to play a whole game.
We reset the board state and play alternating CROSS and NAUGHT until the game is either won by one side or has ended in a draw. We print the board after each move. After the game has finished we print out which side won:
Let’s wrap this code in a utility function called play_game
. It takes a board and 2 players and plays a complete game. It returns the result of the game at the end. We will use this going forward to have computer players play each other:
The RandomPlayer
Time to introduce our first contender, the RandomPlayer
. It looks for a random empty spot on the board and puts its piece there - pretty much the same way as we just did above - just wrapped into a class.
To use it, we need to import it, create 2 instances and let them play a game:
Establishing some ground truth
Using the code we have introduced above, we can now establish some ground truth: If we let 2 random players play against each other a large enough number of times, how many games do we expect to be won by NAUGHT
, how many by CROSS
, and how many do we expect to end in a draw?
Going forward, building more intelligent players, we can then measure how much better those new players play compared to a random player.
After 100000 game we have
draws: 12827,
cross wins: 58313, and
naught wins: 28860.
Which gives percentages of
draws : cross : naught of about 12.83% : 58.31% : 28.86%
In addition to the statistics computed above, we also know that if both players always make the best possible move, a game of Tic Tac Toe will always end in a draw. The gives us the following baseline:
Player | P1 Win | P2 Win | Draw
================================
Perfect | 0% | 0% | 100%
Random | 58% | 29% | 13%
In the following parts we will aim to create players that play perfectly. Where we don’t quite achieve that, we will still be able to at least measure how much better than the RandomPlayer
our player is.
The other parts of this series can be found here:
- Intro — Teaching a computer how to play Tic Tac Toe
- Part 1 — Computer Tic Tac Toe Basics
- Part 2 — The Min Max Algorithm
- Part 3 — Tabular Q-Learning
- Part 4 — Neural Network Q-Learning
- Part 5 — Q Network review and becoming less greedy
- Part 6 — Double Duelling Q Network with Experience Replay
- Part 7 — This is deep. In a convoluted way
- Part 8 — Tic Tac Toe with Policy Gradient Descent
The source code and Jupyter notebooks for all parts of this series are available at GitHub.
If you don’t want to install Jupyter or Python on your computer, you can run the Jupyter notebooks online in Binder.