Artificial Intelligence plays Kariba!

Bram Kooiman · Published in Analytics Vidhya · Feb 24, 2020

So you want to use Monte Carlo Tree Search (MCTS), but your game has imperfect information and an element of chance. That’s difficult! In this blog post we’ll apply MCTS to such a game: Kariba. We’ll discover that what we really need is Multiple-Observer Information Set Monte Carlo Tree Search (MOISMCTS).

Kariba is a fun little card game with a few simple rules. When it’s your turn, you play one or more cards of the same species from your hand onto the field (around the waterhole, because animals need to drink). If there are then 3 or more animals of one species on the field, they chase away all animals of the closest weaker species present. However, everyone knows elephants are afraid of mice, so the mouse is the only animal that can chase away the elephant. For every animal you chase away, you get a point. That’s it!

Example: There are 2 mice, 2 zebras and 2 leopards on the field. Player A plays a third leopard card from her hand to the field. There are now 3 leopards, which chase away the 2 zebras. Player A receives 2 points (because her action chased away 2 animals). Player B now plays another leopard. There are now 4 leopards on the field, which is more than 3, so they chase away the two remaining mice. Player B now also receives 2 points.
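To make the rule concrete, here is a minimal sketch of the chase-away logic in Python. The species ranks, the dict-based field, and the function name are my own illustration (not the repo’s code), and I’ve only modeled the single chase that follows one play:

```python
# Species are assumed to be ranked 1 (mouse) .. 8 (elephant); the field
# is a dict {species_rank: count}. Illustrative names, not the repo's API.
MOUSE, ELEPHANT = 1, 8

def play_cards(field, species, n_cards):
    """Play n_cards of one species onto the field; return points scored."""
    field[species] = field.get(species, 0) + n_cards
    if field[species] < 3:
        return 0  # no herd of 3+, nothing is chased away
    if species == MOUSE:
        # The special rule: three or more mice chase away the elephants.
        prey = ELEPHANT if field.get(ELEPHANT, 0) > 0 else None
    else:
        # Otherwise: the closest weaker species present on the field.
        weaker = [s for s, count in field.items() if s < species and count > 0]
        prey = max(weaker) if weaker else None
    if prey is None:
        return 0
    return field.pop(prey)  # one point per chased-away animal

# The example from the text (ranks assumed: mouse=1, zebra=3, leopard=5):
field = {1: 2, 3: 2, 5: 2}
print(play_cards(field, 5, 1))  # -> 2: three leopards chase away the zebras
print(play_cards(field, 5, 1))  # -> 2: four leopards chase away the mice
```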

Cute animal cards
MythBusters proving that elephants are afraid of mice

Kariba is a game with imperfect information, because a player cannot see the other player’s hand. It is also a game of chance, because you never know which cards you will draw next. Vanilla MCTS doesn’t handle either of these.

So let’s talk about MCTS!

Each game state will be called a ‘node’. Starting from the root node, we select the child node we find most ‘interesting’. We do the same for that node, and keep selecting child nodes until we hit one without any children (a ‘leaf node’). At this point we expand: we create the children this node can have. From one of these new nodes we perform a simulation (or “rollout”): we simulate a game in which both players make random moves, and record who wins. We then backpropagate this result to all ancestor nodes. When backpropagating a simulation’s result through the tree, we only record it as a win at a node if that node’s action was taken by the agent who went on to win; if it was taken by the agent who went on to lose, we record it as a loss.

We repeat this loop (selection, expansion, simulation, backpropagation) until it’s time to make a ‘real’ move (after, say, 500 simulations). Then we usually pick the action we ran the most simulations for. When we’re in a new situation and must make a new move, we start MCTS from scratch again.

Monte Carlo Tree Search in 40 seconds. Also available as pdf in the git repo
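Here is a minimal sketch of one iteration of that loop. The game-state interface (legal_moves, apply, is_terminal, winner, current_player) is a hypothetical stand-in, not the repo’s code; select_child is the selection rule, which the next section fills in with UCB:

```python
import random

class Node:
    def __init__(self, state, parent=None, move=None):
        self.state, self.parent, self.move = state, parent, move
        self.children = []
        self.wins = 0    # simulations recorded as wins at this node
        self.visits = 0  # simulations that passed through this node

def run_iteration(root):
    node = root
    # 1. Selection: descend until we reach a leaf node.
    while node.children:
        node = select_child(node)  # highest UCB; defined below
    # 2. Expansion: create the leaf's children (unless the game is over).
    if not node.state.is_terminal():
        for move in node.state.legal_moves():
            node.children.append(Node(node.state.apply(move), node, move))
        node = random.choice(node.children)
    # 3. Simulation ("rollout"): both players play random moves to the end.
    state = node.state
    while not state.is_terminal():
        state = state.apply(random.choice(state.legal_moves()))
    winner = state.winner()
    # 4. Backpropagation: a node's result counts as a win only if its
    #    action was taken by the agent who went on to win the rollout.
    while node is not None:
        node.visits += 1
        if node.move is not None and node.parent.state.current_player == winner:
            node.wins += 1
        node = node.parent
```

After, say, 500 calls to run_iteration(root), picking max(root.children, key=lambda ch: ch.visits).move gives the ‘real’ move.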

But when is a node ‘interesting’? That’s an interesting question! In selecting which actions to learn more about, the AI should strike a balance between exploration and exploitation. An excellent way to do that is to select the node with the highest Upper Confidence Bound (UCB):

UCBᵢ = wᵢ / nᵢ + c · √(ln Nᵢ / nᵢ)

where nᵢ is the number of simulations starting from the node, wᵢ the number of wins among those, Nᵢ the number of simulations of the parent node, and c a hyperparameter. High values of c promote exploration and low values promote exploitation; c is typically set to √2.
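As a sketch, the formula translates directly into the select_child used above (returning infinity for unvisited children is a common convention so every child gets tried at least once, not something from the post):

```python
import math

def ucb(child, c=math.sqrt(2)):
    if child.visits == 0:
        return float("inf")  # always try unvisited children first
    exploit = child.wins / child.visits                 # wᵢ / nᵢ
    explore = c * math.sqrt(math.log(child.parent.visits) / child.visits)
    return exploit + explore                            # + c·√(ln Nᵢ / nᵢ)

def select_child(node, c=math.sqrt(2)):
    return max(node.children, key=lambda child: ucb(child, c))
```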

The problem with vanilla MCTS is that it assumes both players can fully observe the state, but Kariba is a game with imperfect information. Additionally, states change not only through actions, but also through drawing cards, which complicates matters by adding an element of chance.

The way to deal with both is Multiple-Observer Information Set Monte Carlo Tree Search (MOISMCTS).

To deal with imperfect information, MOISMCTS keeps track of a separate tree per player. A node in this tree is not a state but an ‘information set’: the set of all states the game could be in, given the information that player has.

States and information sets. If player 0 can only observe part A of the state, and player 1 can only observe part B, then the game could be in the state (A=2,B=4), which would be represented in player 0’s tree as the information set (A=2) and in player 1’s tree as the information set (B=4).
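As a sketch (the field and hand representations here are assumptions, not the repo’s), each player’s tree could key its nodes on exactly the part of the state that player can observe:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class KaribaState:
    field: tuple       # public: the animals around the waterhole
    scores: tuple      # public: (player 0's points, player 1's points)
    hands: tuple       # hands[i] is private to player i
    deck_size: int     # public: number of cards left to draw

def information_set(state, observer):
    """Everything `observer` can see. States that agree on this key are
    indistinguishable to the observer and share one node in their tree."""
    return (state.field,
            state.scores,
            state.hands[observer],           # own hand is fully visible...
            len(state.hands[1 - observer]),  # ...of the opponent's, only its size
            state.deck_size)
```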

To deal with the element of chance, the tree distinguishes between neutral nodes and post-action nodes. Transitioning from a neutral node to a post-action node works ‘the regular way’: we select the action with the highest UCB (luckily, the outcome of an action is not subject to chance). Every other state transition we model as traversing to the next node as the result of an ‘outside force’: the opponent’s move, or a draw of cards from the deck.
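A sketch of what one step of tree descent might look like with this split; is_neutral, child_for and sample_outside_event are hypothetical stand-ins for bookkeeping the real implementation would need:

```python
import math

def traverse(node, state, c=math.sqrt(2)):
    """Advance one step down a player's tree during selection."""
    if node.is_neutral:
        # Our own action: pick it with UCB; its outcome is deterministic,
        # so it leads straight to the corresponding post-action node.
        child = select_child(node, c)
        return child, state.apply(child.move)
    # Outside force: sample the event (a card draw from the shuffled deck,
    # or the opponent's move as chosen in *their* tree) and simply follow
    # the branch that matches the observed outcome.
    event = sample_outside_event(state)  # hypothetical helper
    return node.child_for(event), state.apply(event)
```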

Multiple-Observer Information Set Monte Carlo Tree Search (MOISMCTS) for the card game Kariba. Also available as pdf in the git repo

Pdf versions of the gifs are available in the GitHub repo, along with the source code (Python) and a small battle arena in which you can play Kariba against an MOISMCTS agent.

Other interesting resources on MCTS:
- int8’s beginner’s guide to MCTS
- int8’s MCTS repo on tic-tac-toe (Python)
- Cowling et al.’s paper on Information Set Monte Carlo Tree Search (ISMCTS), particularly section G (page 10)
- tetraptych’s synapsen repo (Python): Information Set Monte Carlo Tree Search (ISMCTS) for Schnapsen
- junxiaosong’s Gomoku repo (Python): AlphaZero for a game that’s less complex than Go (MCTS without the MOIS part, but with deep learning to estimate the policy and value functions)
