My AI found its first wins! Sort of …

This post is part of ChurrPurr.ai, a challenge to design an online strategy game and the AI to master it. Play the latest version of the game here.

Steven Adler
5 min readOct 29, 2017

After yesterday’s examination of key questions for my AI design to handle, I’m pleased to announce that today my AI found its first victories.

Of course, these started from late-stage positions in which the AI was only one or two moves from savoring its sweet, sweet ‘+1’ reward. We’ve got to start from somewhere, right?

In fact, my AI has found considerably more than a handful of victories. This is because, rather than having my AI wander through depths of the game’s complexity to eventually locate wins, I’m pre-seeding it with semi-random positions in which wins are likely, and then testing my program’s ability to detect them.

Above are results for a few thousand simulated Stage 2 board scenarios, in which player 1 has 8 pieces remaining, and player 2 has only 3. Wins are both more likely and far easier to detect in this scenario than they will be in many earlier-game positions, but we’re getting somewhere.

Starting from simplified positions has been a godsend, as it turns out there were a few edge-case bugs in the program that otherwise would have been very difficult to detect. Starting from simple scenarios in which I knew there were wins allowed me to narrow down what wasn’t working, then gradually build up to more randomness.

For anybody wondering how my AI is operating, there are a few essential components:

  • A representation of the game’s current ‘state’ (including non-board elements, like whose turn it is to move)
  • Translating a particular state to a valid set of possible ‘actions’
  • For each pair of (action, starting-state), understanding the ensuing ‘future-state’ (in the same form as the initial state representation — just updated values)
  • Evaluating future-states for ‘rewards’, and in particular, whether they meet a win condition
After some time getting back up to speed in JavaScript, I now feel pretty comfortable writing code to do the things I want — though maybe I could work at some shorter variable names …

Each of these components is easy enough to understand in English, but they’ve presented varying challenges in integrating with my code — particularly when there’s a mismatch between systems I’m using for the basic game-engine vs. what the computer needs to be able to look ahead. Noted that I should think more carefully about this design in the future.

(Slightly more technical example — when checking for 3-in-a-rows in the actual game, e.g., it’s okay to simply check whether a series of squares is currently occupied. But when the computer is looking ahead, it can’t just borrow this function without modification, as a square that will be occupied in its forecasts might not be occupied as of yet. And because the data structures I use to convey this information to the computer are generally single-character based, rather than the engine’s more complex squares, I had to add on a few new functions.)

Probably the most interesting aspect for now is how the computer understands the state of the game. Similar to aspects of encoding a chess-board in a string of 0s and 1s, I also want to convey information about the game in a relatively simple string. For my purposes, however, I’m willing to trade some efficiency for human-understandability, which helps me to debug and grasp what I’m doing.

Check out the picture below — it showcases 10 random states my program generated for testing — and let’s talk about what’s going on here:

There’s something oddly comforting and Matrix-y about a long string of digits.

Each of the above lines is a different 33-character long representation of the game’s state. In reality, the state is broken up into a few pieces:

  • Digits 1–24 (aka indices 0–23); occupation status for various spots on the board. 0 means open; 1 means player 1; and 2 means player 2. The first eight digits are the inner ring, then the middle ring, then the outer ring, with each ring beginning in its bottom-left spot and proceeding clockwise.
  • Digit 25 is either ‘A’ or ‘B’ depending whether the game is in stage one or two.
  • Digit 26 represents which player’s turn it is, 1 or 2.
  • Digits 27–28 are the remaining pieces to place for player 1 and player 2; these are all 0 because I set the random scenarios to be stage-2 only.
  • Digits 29–30 are the remaining live pieces for player 1 and player 2.
  • Digit 31 is ‘m’, ‘r’, or ‘s’, depending whether the next action is a move, remove, or slide. This is somewhat different than how I conceptualize actions for the human player but overall maps pretty well.
  • Digits 32–33 are win-loss conditions — ‘NN’ while game ongoing, ‘WL’ if player 1 wins, and ‘LW’ if player 2 wins.

What I like about this representation is, after a few minutes working with it, I have a pretty decent feel for patterns and what certain states mean. What’s frustrating is that there’s certainly interdependency that could run amok somehow — e.g., boards with ‘sNN’ at the end shouldn’t be possible if players still have pieces to place — but this method doesn’t let me easily regulate.

Overall though I’m happy with this approach to encoding the information, and now that the last edge-cases are worked out (maybe?), I’m looking forward to the next steps: Figuring out how to help my AI look more deeply into the future, and then configuring the game to take this computer input directly.

P.S. Upon playing with a friend over the weekend, there’s been a mild rules update — gridlock scenarios and scenarios in which there are no legal removals now result in a ‘pass’, with it becoming the other player’s turn. How quickly the world of Churr-Purr is evolving …

Read the previous post. Read the next post.

Steven Adler is a former strategy consultant focused across AI, technology, and ethics.

If you want to follow along with Steven’s projects and writings, make sure to follow this Medium account. Learn more on LinkedIn.

--

--

Steven Adler

I work at the intersection of AI, ethics, and business strategy; thoughts are my own. www.linkedin.com/in/sjgadler