Building a game of Tic-Tac-Toe in Python
This is the first post in a series of posts about building a Reinforcement Learning AI with Python.
The first thing that came to my mind when I started my journey with Reinforcement Learning was to train a bot that would play difficult games for me. I was hoping to build a chess bot that could beat players I couldn’t beat in my lifetime (I’m still on it! But here’s a simple bot for you to try out). But as with anything, my journey had its humble beginnings with some much easier games.
I figured that instead of writing a very long article that confusingly tries to explain both how to code the logic for the game and an RL algorithm, I would write some much shorter articles that break down the elements and make it easy to understand. So here goes …
The Basics
If you are unfamiliar with the game, you can have a quick look here or play some games here. We will be building the standard game, which has 3 rows and 3 columns. So first we import the packages that we need and define the constants:
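A minimal sketch of this step. The constant names BOARD_ROWS and BOARD_COLS are placeholders I am assuming for illustration; only the values (3 and 3) and the use of numpy come from the text.

```python
import numpy as np  # used later to model the board as a 2-D array

# Assumed constant names; a standard board has 3 rows and 3 columns.
BOARD_ROWS = 3
BOARD_COLS = 3
```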
We will structure the program in an object-oriented way, so let's first write the main loop that tells us what the program will do when executed.
As you can see above, we need two classes: one for the game and another for the player. The game class takes in two players as input and has a play method that contains the game loop.
The Player Class
The player class will be short and easy to write. We have a name for the player and a method called make_move that asks the user for input, validates the input, and then passes the move on to the game.
Init
The __init__ function is pretty simple: it takes the name provided by the user earlier and stores it in the player’s name attribute.
Make Move
The make_move function takes the list of available moves as input. The first line prints out whose turn it is, and the next two lines ask the user which row and column they want to cross or circle. The try-except block makes sure that the while loop keeps running until the user inputs a valid integer and not some random nonsense.
The method then packs the row and column values into a tuple and checks whether the tuple appears in the list of available moves. If it does, the method returns the (row, column) pair; otherwise it asks the user for input again.
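Here is a minimal sketch of a player class that follows the description above. The prompt texts, the error messages, and the choice of 0-based row and column indices are my assumptions, so adapt them as needed.

```python
class Player:
    def __init__(self, name):
        # Store the name supplied by the user.
        self.name = name

    def make_move(self, available_moves):
        """Ask for a row and a column until the pair is a legal move."""
        print(f"{self.name}'s turn")
        while True:
            try:
                # Assumed: indices are 0-based (0, 1 or 2).
                row = int(input("Row: "))
                col = int(input("Column: "))
            except ValueError:
                # Non-integer input: ask again.
                print("Please enter whole numbers only.")
                continue
            move = (row, col)
            if move in available_moves:
                return move
            print("That square is not available, try again.")
```

Keeping the validation inside the player means the game never has to deal with a malformed or illegal move.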
The Game Class
The game class is the bulk of the program. It is a larger class, so we will discuss it in parts and go through the logic of each part.
Initialization
The Game class has a few variables that it needs in order to function. The constructor takes in two objects of the player class as inputs and stores them in two variables. The board variable stores the board for the game. I have modeled it as a 2-D numpy array, initialized to an array full of 0’s. The boolean variable gameover stores whether the game is over or still in progress. Finally, the turn variable keeps track of which player’s turn it is (1 indicates player 1’s turn and -1 indicates player 2’s turn).
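A minimal sketch of such a constructor. The attribute names player1 and player2 are assumed for illustration; board, gameover and turn are the names used in the text.

```python
import numpy as np


class Game:
    def __init__(self, player1, player2):
        # Assumed attribute names for the two player objects.
        self.player1 = player1
        self.player2 = player2
        # 3x3 board full of zeros; 0 means the square is empty.
        self.board = np.zeros((3, 3), dtype=int)
        self.gameover = False
        # 1 -> player 1's turn, -1 -> player 2's turn.
        self.turn = 1
```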
The Game Loop
Again, let’s first tackle the game loop, which will tell us more about what the other methods need to do.
The main game loop is a while loop whose condition depends on the value of the gameover variable. Basically, as long as the game is not over, the code inside the while loop is executed.
self.show_board() is a function that prints the board to the screen. In the second line, we generate a list of moves using the available_moves function. We call the make_move function from the player class to let the player choose a move and store it in the p1_action variable. In the next line, we update the state of the game by updating the board. Next, we check whether the game is over and we have a winner. If we have a winner or it’s a draw, we print out the result, reset the game, and break the loop. Otherwise, we continue and it’s player 2’s turn. In the else block we repeat the same code, but now for player 2’s move. Finally, we ask the user if they want to play again. If the reply is a “yes” (y/Y), we call the game loop again.
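The loop described above can be sketched roughly as follows. This is a condensed variant, not the exact code: instead of an if/else with duplicated code for each player, it picks the player from game.turn. It is written as a standalone function that takes the game object, and it assumes that check_win returns None while the game is still running and that the players are stored as player1 and player2.

```python
def play(game):
    """One pass through the game loop, then an optional replay."""
    while not game.gameover:
        game.show_board()
        moves = game.available_moves()
        # Pick whoever is on turn instead of duplicating the branch.
        player = game.player1 if game.turn == 1 else game.player2
        action = player.make_move(moves)
        game.update_board(action)
        # Assumed return values: 1, -1, 0 (draw) or None (keep playing).
        result = game.check_win()
        if result is not None:
            game.show_board()
            print("It's a draw!" if result == 0 else f"{player.name} wins!")
            game.reset()
            break
    # A reply of y or Y starts the loop again.
    if input("Play again? (y/n): ").strip().lower() == "y":
        play(game)
```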
Next, we implement the methods as they appear in the game loop.
Print the Board
The board uses the same encoding as the turn variable: a value of 1 indicates that it is player 1’s square and a value of -1 indicates that it is player 2’s square, while a value of 0 indicates an empty space. We assign the default symbols: X for player 1 and O for player 2. The top and bottom print statements print the top and bottom edges of the board. The other print statement prints the middle part (the rows) of the board. If the value is 1 it prints an ‘X’, if it is -1 it prints an ‘O’, and if it is 0 it prints an empty space.
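A minimal sketch of the printing logic, written as standalone functions that take the board as an argument rather than as a method. The exact border and separator characters are my own choice, and render_board is a hypothetical helper that builds the text so it can be inspected before printing.

```python
# Symbols from the text: X for player 1, O for player 2, blank for empty.
SYMBOLS = {1: "X", -1: "O", 0: " "}


def render_board(board):
    """Return the board as text; works for a NumPy array or nested lists."""
    border = "-" * (4 * len(board[0]) + 1)
    lines = [border]  # top edge
    for row in board:
        cells = " | ".join(SYMBOLS[int(value)] for value in row)
        lines.append("| " + cells + " |")
    lines.append(border)  # bottom edge
    return "\n".join(lines)


def show_board(board):
    print(render_board(board))
```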
Legal Moves
The available_moves function iterates over the board and appends every position that has the value 0 (that is, every empty square). It is a very basic and easy function (should’ve started with this to make things look easier! 😃).
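As a rough sketch, here is the same idea as a standalone function that receives the board instead of reading it from self:

```python
def available_moves(board):
    """Return the (row, column) positions of all empty squares."""
    moves = []
    for r, row in enumerate(board):
        for c, value in enumerate(row):
            if value == 0:  # 0 marks an empty square
                moves.append((r, c))
    return moves
```

The positions come back as (row, column) tuples, which is the same shape the player’s make_move checks against.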
Update the Board
Ah, I think this is the easiest one now. The purpose of the update_board function is simple: add the symbol of the current player to the board, then switch to the other player. As the symbols for the players are 1 and -1, we can easily do this by multiplying the value of self.turn by -1.
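A minimal sketch of that update, assuming a standalone function that returns the next turn instead of mutating self.turn as the method would:

```python
def update_board(board, turn, action):
    """Place the current player's mark and return the next player's turn."""
    row, col = action
    board[row][col] = turn  # mark the square with 1 or -1
    return turn * -1        # 1 becomes -1 and -1 becomes 1
```

For example, starting from an empty board with turn 1, playing (1, 1) writes 1 into the centre square and hands back -1, so player 2 moves next.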
Check for Win
The check_win function checks whether either player has won. A player wins when a row, a column, or one of the diagonals is filled with that player’s symbol. As we are using 1, 0, and -1 for the values in the board, we can simply look at the sums of the rows, columns, and diagonals to check whether a player has won. If a sum is 3, player 1 won; if a sum is -3, player 2 won.
The first block of code checks whether one of the rows sums to 3 or -3 and returns the sign of the winner (remember that self.turn has already been switched to the other player, so we need to multiply it by -1 again). The second block is similar and does the same for the columns.
To calculate the values of the diagonals, we make use of some functions from numpy to make things easier. The trace() function gives the sum of the values on the main diagonal, so that gives us the value for the first diagonal. To calculate the value for the other diagonal, we flip the matrix left to right with fliplr() and then take the trace again.
Finally, if no one has won, and we have no available moves, the game is a draw and the function returns 0.
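The checks above can be sketched as a standalone function. Two things here are my assumptions: the winner is read directly from the sign of the sum rather than from self.turn, and None is returned while the game is still in progress.

```python
import numpy as np


def check_win(board):
    """Return 1 or -1 for a winner, 0 for a draw, None if play continues."""
    board = np.asarray(board)
    sums = list(board.sum(axis=1))           # one sum per row
    sums += list(board.sum(axis=0))          # one sum per column
    sums.append(np.trace(board))             # main diagonal
    sums.append(np.trace(np.fliplr(board)))  # other diagonal
    if 3 in sums:
        return 1   # player 1 filled a line
    if -3 in sums:
        return -1  # player 2 filled a line
    if not (board == 0).any():
        return 0   # no winner and no empty squares: draw
    return None    # assumed: game still in progress
```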
Reset the Game
Lastly, we reset the game when there is a win or tie. Resetting includes:
- setting the board to zeroes
- setting the variable gameover to False
- setting the turn to player 1 (1)
Hooray!
We finally have working code for the game. Let’s start playing!
Want to know how it ends? Tune in for the next article next week!