Minimax algorithm and alpha-beta pruning

Aaron Brennan
4 min readJan 20, 2023

--

The minimax algorithm is a decision-making algorithm used in game theory and artificial intelligence to find the best move for a player in a situation where the other player is also playing optimally. The algorithm is typically used in two-player, zero-sum games, such as chess, tic-tac-toe, and Go.

Search tree used for the minimax algorithm. Player taking even turns tries to maximise his ‘score’ whereas the player taking odd turns tries to minimize it.

How it works

  • The algorithm starts at the current game state and generates all possible subsequent game states by simulating all possible moves for both players. Then for each of those states, it again generates all possible subsequent states and so on, until it reaches a terminal state/leaf node(e.g. win, loss, draw) or a maximum search depth is reached.
  • The algorithm assigns a score to each terminal state/leaf node, usually +1 for a win, 0 for a draw, and -1 for a loss. Then, it propagates these scores back through the game tree using the minimax rule: for a max node (a node representing the current player’s move), the score is the maximum of the scores of its children, and for a min node (a node representing the opponent’s move) the score is the minimum of the scores of its children.
  • The algorithm then chooses the move that leads to the state with the highest score for max nodes and lowest score for min nodes, considering the opponent will play optimally.
Basic example of the minimax algorithm. If both players play optimally the score will be 4 after 3 turns.

Alpha-Beta pruning

Alpha-beta pruning is a technique used to improve the efficiency of the minimax algorithm in game-playing AI. The technique is based on the observation that in many games, some branches of the game tree can be safely pruned (ignored) because they are guaranteed to be worse than other branches.

Example used to explain alpha-beta pruning. Can you see why the right hand section is ignored in this example? White nodes are maximisers and black nodes are minimizers.

Why is the 4th leaf node not considered above?

  • Here we evaluate the first 3 leaf nodes as before (-1, 3, 5 are the values). The first maximizer picks 3.
  • Since the second maximizer has a leaf node of 5, we know that the second maximizer value is at least ≥ 5.
  • Since the minimizer above picks the lesser of the two maximizers, we already know that the minimizer will chose 3.
  • Therfore it does not matter what the 4th leaf node value is since it will not be considered again.

Why is the last maximizer and the last 2 leaf nodes not considered?

  • Next we find the values of the next 2 leaf nodes (-6 and -4), The maximizer will then choose -4.
  • Therefore the minimizer further above has the option to choose a value of -4. Therfore the minimizers value is ≤ -4.
  • Since the minimizer on the left side of the tree has a value of 3, the root node being a maximizer will choose the left hand side of value 3.
  • Therefore the final maximizer and the last 2 leaf nodes don’t matter since they will never be considered in this tree (for minimax).

More technical explanation (alpha-beta values)

  • The algorithm uses two parameters, alpha (best value for the maximizing player) and beta (best value for the minimizing player).
  • The algorithm starts with alpha and beta having negative and positive infinity values, respectively. Note these are the worst possible scores possible i.e. alpha will go up and beta will go down.
  • As the algorithm evaluates each node, it compares the score of the node with the current alpha and beta values. If the current alpha value is > than the current beta value, it means that the current branch is worse than a branch that has already been evaluated and can be safely pruned.
  • Here the root node as alpha and beta values or +- infinity respectively. These values get passed down the tree to bottom left maximizer.
  • Reading the first leaf node the alpha value is set to -1 (better than -infinity), and upon reading the second node the alpha value is upgraded to 3. Beta remains the same
  • Now in the minimizer above the beta value is set to 3 (better than +infinity for minimizer) and alpha remains at -infinity.
  • A = -infinity and B = 3 is passed down to the second maximizer. Upon reading the third leaf node, the alpha value is set to 5.
  • Now since the alpha is > the beta value we can prune (ignore) the rest of this section since we know that there is another branch with a max value of 3 (beta = 3 )
  • Now that the left side is complete we can upgrade the alpha value at the root node to 3. The value a=3 and b=+infinity is passed down the tree to next leaf nodes.
  • The alpha value remains 3 after looking at the two leaf nodes -4 and -6. For the minimizer node above the beta value can be upgraded to -4.
  • Now since a > b we know that there is a more optimal path that the maximizing player (player with alpha) would have taken, hence we can ignore/prune the rest of the tree!

By pruning branches that are guaranteed to be worse than other branches, alpha-beta pruning can significantly reduce the number of nodes that need to be evaluated, making the minimax algorithm run much faster.

--

--