Analytics Vidhya
Published in

Analytics Vidhya

Minimax Algorithm in Tic-Tac-Toe: Adversarial Search Example

Photo by Kvistholt on Unsplash

I was around eight years old. We would play Tic-Tac-Toe with my uncle, who was five years older than me, and each time he would either win or tie. I don’t blame him for hurting my self-esteem, but I must have been quite foolish at that age.

Today, after years of school and a year and a half of college, I can finally say that I implemented an algorithm that can play Tic-Tac-Toe without ever losing. It is called the Minimax Decision Rule, which is a type of Adversarial Search, meaning that this algorithm faces an opponent that is playing against the machine. For example, in Tic-Tac-Toe, the AI plays against a human; the AI also knows the complete state of the game (it can see the entire situation, unlike the game of Poker), which is a requirement for the Minimax Algorithm.

Demo From My GitHub Repo

Minimax uses each state’s representation, labeling a winning condition as 1, a losing situation as -1, and a neutral condition as 0. So, getting into details might make it sound quite complicated (at least that’s how it sounds for me), but let’s imagine a situation in which you play Tic-Tac-Toe against yourself. You would probably want to maximize your score and minimize your component’s score (which is also you, lol), hence the name Minimax. The same you do when you play for the other side, picking the action to maximize your score and minimize your opponent’s score. What you would probably end up doing is you would probably start considering each and every movement you can do and the possible move your opponent might do. Calculating each and every possible outcome is the cornerstone of Minimax because that helps the AI pick the best action, which is also worst for the opponent.

Alright, if that wasn’t technical enough for you, let’s dive deeper. The functions we need to implement for the Minimax Algorithm are player to determine which player is currently taking action (X or O); actions to query which actions are still available (free cells); result to build a hypothetical board with the proposed action, meaning it is a complete copy on which we put our potential action (it has to be a copy to explore all the available state without modifying the original); winner to determine if the game is done and who is the winner; terminal to see if the game is a tie or someone won; IMPORTANT, utility to calculate whether a hypothetical board with an action resulted in a win or loss (bread and butter of minimax); and finally minimax that is meant to take a board, apply all of the functions mentioned above, and return the optimal move.

So, let’s go over the implementations of those functions. It sounds like a lot of work, but most of them take no more than 10 lines of code. The player function, given a board (which is a simple 2D list 3x3), tells us whose turn it is to go. Our goal here is to count the number of Xs and Os to find which one is smaller than the other, which means it is their turn to take action.

def player(board):
Xs = 0
Os = 0
# simply iterate over the given 2D array and calculate how many Xs and Os are there
for y_axis in board:
for x_axis in y_axis:
if x_axis == X:
Xs += 1
elif x_axis == O:
Os += 1
# if numer of Xs is smaller or equal to Os, it is a turn for an X because it always goes first
if Xs <= Os:
return X
else: # otherwise it is a turn for an O
return O

The actions function, given a board, tells us what actions we can take. Our goal here is to keep a set of tuples (i, j), where i is the row of, and j is the board's column representing an empty cell.

def actions(board):
possible_actions = set() # set is used just to be sure there will only be unique tuples
for y, y_axis in enumerate(board):
for x, x_axis in enumerate(y_axis):
# initial implementation puts variable EMPTY in all cells, which is equal to None
if x_axis == EMPTY:
possible_actions.add((y, x))
return possible_actions

The result function, given a board and an action (which is a tuple of the cell we should fill out), returns a deep copy of the board. What is a deep copy? It is a copy that is an entirely identical but a separate object, which does not share any pointers with the original. We need a deep copy to explore what actions we can take to find the states to maximize or minimize the score without modifying the original.

def result(board, action):
if len(action) != 2: # check if given action is a tuple of two elements
raise Exception("result function: incorrect action")
# check if given action is within the boundaries of the board (3x3)
if action[0] < 0 or action[0] > 2 or action[1] < 0 or action[1] > 2:
raise Exception("result function: incorrect action value")
y, x = action[0], action[1]
board_copy = copy.deepcopy(board) # using the imported library 'copy'
# check if action is already there (even though we will call 'actions' before it)
if board_copy[y][x] != EMPTY:
raise Exception("suggested action has already been taken")
else: # here we use the player function to know which letter to put in the copy
board_copy[y][x] = player(board)
return board_copy

The winner function, given a board, tells whether there is a winner. Just between you and me, it was a hectic experience trying to figure out how to write this one (it got better once I remembered it was a 3x3 grid).

def winner(board):
# Since the board is always 3x3, I believe this approach is reasonable
for y in range(3):
# Check horizontal lines
if (board[y][0] == board[y][1] == board[y][2]) and (board[y][0] != EMPTY):
return board[y][0]
# check vertical lines
if (board[0][y] == board[1][y] == board[2][y]) and (board[0][y] != EMPTY):
return board[0][y]
# Check diagonals
if (board[0][0] == board[1][1] == board[2][2]) or (board[0][2] == board[1][1] == board[2][0]) \
and board[1][1] != EMPTY:
return board[1][1]
return None

The terminal function, given a board, tells whether the game is over. Really. That's it, lol. IMPORTANT, it is useful because we would like to know wheter our prospectve state of the game is over, that is why it is a separate function.

def terminal(board):
if winner(board) == X or winner(board) == O: # check if there is a winner
return True
# check if there is a tie (if no cells left and neither X nor O won)
elif EMPTY not in board[0] and EMPTY not in board[1] and EMPTY not in board[2]:
return True
else: # otherwise return that the game is still going on
return False

The utility function, given a board, tells whether X or O won by returning 1, -1, or 0. IMPORTANT, it is useful when choosing what is the optimal choice in the Minimax as it tells whether the given state is winning or losing.

def utility(board):
if winner(board) == X:
return 1
elif winner(board) == O:
return -1
else:
return 0

And finally, the minimax function, given a board, returns the optimal action for the current player. The algorithm first checks if the board is terminal, meaning the game is over and no actions can be taken. If not, it then check whose turn it is to take take a move. IMPORTANT, if the AI plays for X, it tries to the pick the best out of all minimum values that have been selected from maximum values; likewise, it the AI plays for O, it tries to pick the worst out of all maximum values that have been selected from minimum values.

def minimax(board):
if terminal(board):
return None
if player(board) == X:
score = -math.inf
action_to_take = None
for action in actions(board):
min_val = minvalue(result(board, action))
if min_val > score:
score = min_val
action_to_take = action
return action_to_take elif player(board) == O:
score = math.inf
action_to_take = None
for action in actions(board):
max_val = maxvalue(result(board, action))
if max_val < score:
score = max_val
action_to_take = action
return action_to_takedef minvalue(board):
# if game over, return the utility of state
if terminal(board):
return utility(board)
# iterate over the available actions and return the minimum out of all maximums
max_value = math.inf
for action in actions(board):
max_value = min(max_value, maxvalue(result(board, action)))
return max_valuedef maxvalue(board):
# if game over, return the utility of state
if terminal(board):
return utility(board)
# iterate over the available actions and return the maximum out of all minimums
min_val = -math.inf
for action in actions(board):
min_val = max(min_val, minvalue(result(board, action)))
return min_val

It is quite hard to wrap your head around the minimax function because it involves calling a function (e.g. maxvalue) that then calls another function (e.g. minvalue) that calls the first function, which is even more complicated to understand than recursion (double recursion? mutual recursion?). Please let me know if you have any questions about any of those functions.

The complexity is certainly not the strongest suit of the project as it involves calculating each and every state on the 3x3 board. You can feel it when playing for O because the AI has to calculate all the possible states before taking any action. Please note that this project is a part of my coursework towards Harvard's CS50 Artificial Intelligence; you can find more about it here. My GitHub repo with demos for the course is here.

--

--

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store
Damir Temir

Damir Temir

25 Followers

Data Science Aspirant w/ interest for writing. Student at the University of Illinois. Born and raised in Kazakhstan. temir.dev