Published in

Analytics Vidhya

# Minimax Algorithm in Tic-Tac-Toe: Adversarial Search Example

I was around eight years old. We would play Tic-Tac-Toe with my uncle, who was five years older than me, and each time he would either win or tie. I don’t blame him for hurting my self-esteem, but I must have been quite foolish at that age.

Today, after years of school and a year and a half of college, I can finally say that I implemented an algorithm that can play Tic-Tac-Toe without ever losing. It is called the Minimax Decision Rule, which is a type of Adversarial Search, meaning that this algorithm faces an opponent that is playing against the machine. For example, in Tic-Tac-Toe, the AI plays against a human; the AI also knows the complete state of the game (it can see the entire situation, unlike the game of Poker), which is a requirement for the Minimax Algorithm.

Minimax uses each state’s representation, labeling a winning condition as 1, a losing situation as -1, and a neutral condition as 0. So, getting into details might make it sound quite complicated (at least that’s how it sounds for me), but let’s imagine a situation in which you play Tic-Tac-Toe against yourself. You would probably want to maximize your score and minimize your component’s score (which is also you, lol), hence the name Minimax. The same you do when you play for the other side, picking the action to maximize your score and minimize your opponent’s score. What you would probably end up doing is you would probably start considering each and every movement you can do and the possible move your opponent might do. Calculating each and every possible outcome is the cornerstone of Minimax because that helps the AI pick the best action, which is also worst for the opponent.

Alright, if that wasn’t technical enough for you, let’s dive deeper. The functions we need to implement for the Minimax Algorithm are `player` to determine which player is currently taking action (X or O); `actions` to query which actions are still available (free cells); `result` to build a hypothetical board with the proposed action, meaning it is a complete copy on which we put our potential action (it has to be a copy to explore all the available state without modifying the original); `winner` to determine if the game is done and who is the winner; `terminal` to see if the game is a tie or someone won; IMPORTANT, `utility` to calculate whether a hypothetical board with an action resulted in a win or loss (bread and butter of minimax); and finally `minimax` that is meant to take a board, apply all of the functions mentioned above, and return the optimal move.

So, let’s go over the implementations of those functions. It sounds like a lot of work, but most of them take no more than 10 lines of code. The `player` function, given a board (which is a simple 2D list 3x3), tells us whose turn it is to go. Our goal here is to count the number of Xs and Os to find which one is smaller than the other, which means it is their turn to take action.

`def player(board):    Xs = 0    Os = 0    # simply iterate over the given 2D array and calculate how many Xs and Os are there    for y_axis in board:         for x_axis in y_axis:            if x_axis == X:                Xs += 1            elif x_axis == O:                Os += 1    # if numer of Xs is smaller or equal to Os, it is a turn for an X because it always goes first    if Xs <= Os:         return X    else:  # otherwise it is a turn for an O        return O`

The `actions` function, given a board, tells us what actions we can take. Our goal here is to keep a set of tuples `(i, j)`, where `i` is the row of, and `j` is the board's column representing an empty cell.

`def actions(board):    possible_actions = set() # set is used just to be sure there will only be unique tuples    for y, y_axis in enumerate(board):        for x, x_axis in enumerate(y_axis):            # initial implementation puts variable EMPTY in all cells, which is equal to None            if x_axis == EMPTY:                 possible_actions.add((y, x))    return possible_actions`

The `result` function, given a board and an action (which is a tuple of the cell we should fill out), returns a deep copy of the board. What is a deep copy? It is a copy that is an entirely identical but a separate object, which does not share any pointers with the original. We need a deep copy to explore what actions we can take to find the states to maximize or minimize the score without modifying the original.

`def result(board, action):    if len(action) != 2:  # check if given action is a tuple of two elements        raise Exception("result function: incorrect action")    # check if given action is within the boundaries of the board (3x3)    if action[0] < 0 or action[0] > 2 or action[1] < 0 or action[1] > 2:        raise Exception("result function: incorrect action value")    y, x = action[0], action[1]    board_copy = copy.deepcopy(board) # using the imported library 'copy'    # check if action is already there (even though we will call 'actions' before it)    if board_copy[y][x] != EMPTY:        raise Exception("suggested action has already been taken")    else:  # here we use the player function to know which letter to put in the copy        board_copy[y][x] = player(board)    return board_copy`

The `winner` function, given a board, tells whether there is a winner. Just between you and me, it was a hectic experience trying to figure out how to write this one (it got better once I remembered it was a 3x3 grid).

`def winner(board):    # Since the board is always 3x3, I believe this approach is reasonable    for y in range(3):        # Check horizontal lines        if (board[y][0] == board[y][1] == board[y][2]) and (board[y][0] != EMPTY):            return board[y][0]        # check vertical lines        if (board[0][y] == board[1][y] == board[2][y]) and (board[0][y] != EMPTY):            return board[0][y]    # Check diagonals    if (board[0][0] == board[1][1] == board[2][2]) or (board[0][2] == board[1][1] == board[2][0]) \            and board[1][1] != EMPTY:        return board[1][1]    return None`

The `terminal` function, given a board, tells whether the game is over. Really. That's it, lol. IMPORTANT, it is useful because we would like to know wheter our prospectve state of the game is over, that is why it is a separate function.

`def terminal(board):    if winner(board) == X or winner(board) == O: # check if there is a winner        return True    # check if there is a tie (if no cells left and neither X nor O won)    elif EMPTY not in board[0] and EMPTY not in board[1] and EMPTY not in board[2]:        return True    else: # otherwise return that the game is still going on        return False`

The `utility` function, given a board, tells whether X or O won by returning 1, -1, or 0. IMPORTANT, it is useful when choosing what is the optimal choice in the Minimax as it tells whether the given state is winning or losing.

`def utility(board):    if winner(board) == X:        return 1    elif winner(board) == O:        return -1    else:        return 0`

And finally, the `minimax` function, given a board, returns the optimal action for the current player. The algorithm first checks if the board is `terminal`, meaning the game is over and no actions can be taken. If not, it then check whose turn it is to take take a move. IMPORTANT, if the AI plays for X, it tries to the pick the best out of all minimum values that have been selected from maximum values; likewise, it the AI plays for O, it tries to pick the worst out of all maximum values that have been selected from minimum values.

`def minimax(board):    if terminal(board):        return None    if player(board) == X:        score = -math.inf        action_to_take = None        for action in actions(board):            min_val = minvalue(result(board, action))            if min_val > score:                score = min_val                action_to_take = action        return action_to_take    elif player(board) == O:        score = math.inf        action_to_take = None        for action in actions(board):            max_val = maxvalue(result(board, action))            if max_val < score:                score = max_val                action_to_take = action        return action_to_takedef minvalue(board):    # if game over, return the utility of state    if terminal(board):        return utility(board)    # iterate over the available actions and return the minimum out of all maximums    max_value = math.inf      for action in actions(board):        max_value = min(max_value, maxvalue(result(board, action)))    return max_valuedef maxvalue(board):    # if game over, return the utility of state    if terminal(board):        return utility(board)    # iterate over the available actions and return the maximum out of all minimums    min_val = -math.inf    for action in actions(board):        min_val = max(min_val, minvalue(result(board, action)))    return min_val`

It is quite hard to wrap your head around the `minimax` function because it involves calling a function (e.g. maxvalue) that then calls another function (e.g. minvalue) that calls the first function, which is even more complicated to understand than recursion (double recursion? mutual recursion?). Please let me know if you have any questions about any of those functions.

The complexity is certainly not the strongest suit of the project as it involves calculating each and every state on the 3x3 board. You can feel it when playing for O because the AI has to calculate all the possible states before taking any action. Please note that this project is a part of my coursework towards Harvard's CS50 Artificial Intelligence; you can find more about it here. My GitHub repo with demos for the course is here.

--

--

## More from Analytics Vidhya

Analytics Vidhya is a community of Analytics and Data Science professionals. We are building the next-gen data science ecosystem https://www.analyticsvidhya.com

## Damir Temir

25 Followers

Data Science Aspirant w/ interest for writing. Student at the University of Illinois. Born and raised in Kazakhstan. temir.dev