Ryan D’s Terminal Strategy

Ryan Draves
Terminal Player Strategies
Feb 18, 2019 · 7 min read

My name’s Ryan Draves. I’m a freshman at the University of Michigan studying computer science with an interest in AI and robotics. Coming into this competition, I had some prior experience with the Halite II AI challenge, some basic Python skills, and a bit too much free time. Since then, I’ve gotten to explore Python in depth, along with good programming and version-management practices and some basic machine learning applications.

I started working on the Terminal competition after watching CodeBullet’s announcement video for a sponsored mini competition. After teaming up to place second in the amazing live event C1Games hosted at UMich, I worked with a friend of mine to finish 3rd in the global competition.

Below is a write-up of the thought process behind developing my algos, as well as some of the underlying strategy. A huge shout-out to Correlation One for hosting such an in-depth and fun opportunity to program, as well as to Roger, my teammate, for being the driving force behind our prediction phase and mentoring me on the programming practices that made our algo’s development possible.

Q1.1. Which programming languages, environments, and tools did you use during the competition, and why? If any were new to you, how did you assimilate them into your existing knowledge and how did you leverage them in practice?

I primarily coded in Python for the competition, using my typical environment of Visual Studio Code hosting an Ubuntu WSL terminal to manage the Git repository and run Terminal matches locally. A couple of weeks into the competition, I started working with a partner (a friend of mine I had invited to help me in the live event at UMich), and the Git repository became more essential as we had to coordinate our work. Later in the competition, and especially in the second season, we began porting things over to C++ to help maximize performance. I started the competition with a beginner’s level of experience in Python, so a lot of time was spent learning generic Python skills. Particular to this competition’s demands, I became much more familiar with Python lists and dictionaries, as most of the functional code involved working with those data structures. I have more experience and coursework in C++, so for many Python-specific issues I already knew the underlying technique and could simply look up the Python equivalent, such as a copy constructor. Other techniques, such as compiling C++ code and interfacing it back into Python, took longer to learn, and I spent a while reading through tutorials to get comfortable with the new concepts.

Q2.1. What was your detailed process for improving your strategies and code over time? What significant obstacles did you encounter in this process, and how specifically did you work through those obstacles?

I primarily developed in a short cycle of analysis, planning, design, and implementation, using the game’s short feedback time to see how my algo was performing. When making a major overhaul, this could span many hours or several days in order to plan out the change with my teammate, but typical cycles involved analyzing a replay, thinking “well, it shouldn’t have done that!”, and designing a behavioral adjustment to make. A major obstacle with this workflow was the increasing difficulty of making major overhauls, which stemmed from testing and maintainability being left out of the cycle. For example, when making an optimization overhaul of the simulator engine, which simulates what would happen in a theoretical state of the game, a large part of the main strategy file was left broken and incredibly messy to sort through. Even worse, the changes I made were not easy to use, and my partner would have needed an explanation of what was going on under the hood in order to use the simulator. To overcome this, I began to precede each major change with a refactoring period to simplify and abstract the codebase, as well as some refactoring after the overhaul to clean up the large-scale changes. The result was a much more robust algorithm that can interchangeably swap out key points in the logic, or entire strategies, with new implementations.
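
To give a sense of what those refactoring passes aimed for, here is a minimal sketch of hiding the simulator behind a small interface so that an optimized implementation can be swapped in without touching the strategy file; the class and method names are illustrative, not our actual code:

```python
class Simulator:
    """Minimal interface the strategy code programs against."""
    def simulate(self, game_state, friendly_placements, enemy_placements):
        raise NotImplementedError

class ReferenceSimulator(Simulator):
    def simulate(self, game_state, friendly_placements, enemy_placements):
        # Straightforward, easy-to-verify simulation of a theoretical turn.
        ...

class FastSimulator(Simulator):
    def simulate(self, game_state, friendly_placements, enemy_placements):
        # Optimized (e.g., C++-backed) simulation with the same signature.
        ...

# The strategy only ever sees the interface, so swapping implementations
# becomes a one-line change rather than a rewrite of the strategy file.
simulator = FastSimulator()
```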

Q2.2. Please walk us through your final algorithm’s logic, as well as one significant non-obvious change you made to your algo along the way and the rationale for that change.

The algorithm starts by trying to obtain perfect knowledge of the state of the game. Due to the simultaneous turn-based nature of the game, the moves an opposing player makes on a given turn are unknown to the player. To overcome this, the algorithm owns a small army of “predictors” that analyze the opponent for specific patterns of behavior. As these “predictors” latch on and become confident in their predictions, the algorithm uses them to spend the enemy’s resources for that turn and passes the predicted game state to the regular logic. The regular logic is fueled by a min-max strategy: it simulates the outcome of a turn for several combinations of X defensive placements and Y offensive placements. It then runs these outcomes through a series of metrics to pick the turn that maximizes its own outcome while minimizing the opponent’s.
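
A minimal sketch of that search loop, with function and variable names that are assumptions for illustration rather than our actual code:

```python
from itertools import product

def choose_turn(predicted_state, defense_options, offense_options, simulate, evaluate):
    """Try combinations of defensive/offensive placements and keep the best one."""
    best_move, best_value = None, float("-inf")
    for defense, offense in product(defense_options, offense_options):
        outcome = simulate(predicted_state, defense, offense)  # theoretical end-of-turn state
        value = evaluate(outcome)  # reward our scoring/damage, penalize the opponent's
        if value > best_value:
            best_move, best_value = (defense, offense), value
    return best_move
```
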
One of the less intuitive changes we made is how we separated the metrics for the early phases of the game from those for the rest. In the early phases, less is known about the opponent, and the more savvy “predictors” are still trying to gain confidence in their predictions. What we ended up with was two minor adjustments to early-game outcomes. The first was a tolerance, which discounted scoring-based outcomes to account for unknown defenses that could still be placed. The second was to punish scoring-based outcomes that also damage enemy structures: we often noticed outcomes chasing a minuscule amount of extra damage by getting close to enemy defenses when they should instead have been avoiding their range.

Q3.1. Please walk us through the main components of your code. How did you design and organize these components to maximize code quality and efficiency?

After many refactoring passes, we ended up with a hierarchy that enabled smooth replacements and additions to isolated parts of the algorithm’s logic. At the top level, the code is organized into a “master” strategy file which handles the routine function calls and passes information to and from the prediction phase and the main strategy. Keeping the main logic out of the top level allowed us to simultaneously develop several different strategies that could be interchanged with a single line in the “master” strategy. At the prediction-phase level, the “small army of predictors” are each added and controlled in a sensible class hierarchy, such that adding a new predictor is no more troublesome than its development. Similarly, in the main logic of each strategy file, each type of simulation the algorithm runs is abstracted in a versatile manner that allows each component to be easily replaced with a new implementation.
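
A minimal sketch of the kind of predictor hierarchy described above; the class names and confidence threshold are illustrative, not our actual code:

```python
class Predictor:
    """Base class: watches the opponent and reports a confidence in its guess."""
    def observe(self, turn_state):
        raise NotImplementedError
    def confidence(self):
        raise NotImplementedError
    def predicted_placements(self):
        raise NotImplementedError

class RepeatedOpenerPredictor(Predictor):
    """Hypothetical example: latches onto an opponent that repeats its opening."""
    ...

class PredictionPhase:
    def __init__(self, predictors):
        self.predictors = predictors  # adding a new predictor = appending to this list

    def predict(self, turn_state, threshold=0.8):
        for p in self.predictors:
            p.observe(turn_state)
        confident = [p for p in self.predictors if p.confidence() >= threshold]
        return max(confident, key=lambda p: p.confidence(), default=None)
```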

Q3.2. Which sections of your algorithm code were you most proud of, and why? Please display snippets of those sections which demonstrate the quality of your code.

I’m most proud of my recent work on ML_Bootstrap, an algo that generates its defensive placements with a neural net. Its development was divided into four parts: a replay-fetching script to interact with Terminal’s API, a data generator to turn replay files into usable datasets, a training environment, and a C++ port of the neural net so the algo could be uploaded without Keras, TensorFlow, or NumPy dependencies. The model I ended up with was able to successfully mimic placement patterns in a supervised learning environment, climbing the leaderboards up to ~40th place.
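
To give a flavor of the dependency-free port, here is an equivalent pure-Python sketch of a dense layer and softmax; the real port was in C++, and the function names here are made up for illustration:

```python
import math

def dense_forward(inputs, weights, biases, activation="relu"):
    """One fully connected layer: weights is a list of rows, one per output neuron."""
    outputs = []
    for row, bias in zip(weights, biases):
        total = sum(w * x for w, x in zip(row, inputs)) + bias
        outputs.append(max(0.0, total) if activation == "relu" else total)
    return outputs

def softmax(values):
    """Convert final-layer outputs into placement probabilities."""
    exps = [math.exp(v - max(values)) for v in values]
    total = sum(exps)
    return [e / total for e in exps]
```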

The segment I chose is a set of simulations in the main logic, which highlights the robustness of the codebase. In order to swap in the neural net’s placements in the ml_bootstrap strategy file, I simply had to change the placements on the two self.simulate_map() lines.
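
A rough sketch of what such a pair of calls might look like; the surrounding method, helpers, and argument names are assumptions rather than the real signature:

```python
class MlBootstrapStrategy:
    # Hypothetical sketch of the two simulate_map() calls mentioned above.
    def evaluate_attack(self, game_state, attack_placements):
        # Swap point: feed the neural net's defense instead of the heuristic defense.
        defense = self.neural_net_defense(game_state)
        with_attack = self.simulate_map(game_state, defense, attack_placements)
        without_attack = self.simulate_map(game_state, defense, [])
        return with_attack, without_attack
```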

Q4.1. Compare and contrast your algorithm with other top algorithms on the leaderboard. What do you see or do that others missed, and vice versa? What would you do differently next time?

The algorithm’s min-max approach was similar to many other top algos. Most top players had independently developed their own simulation engines, and while each had its own definition of what “maximizes” a turn, the results often looked similar. What set our algorithm apart, however, was its ability to predict the opposing player’s placements and use that against them. Many of our competition matches featured the predictors shutting down opponents’ attacks and dictating a favorable pace for the game. After the final season-one competition, my partner and I opened up a conversation about the prediction phase on the forums and discovered that it had not been as much of a priority for other players, which is something I think others missed out on. On our end, we seem to have missed out on the structure of the algorithm’s placements. I had firmly believed in the benefits of our structure, such as its ability to control the flow of the game and boost the predictors’ reliability, but it became clear that other aspects of it, like its vulnerable early game, were a burden on the algorithm’s performance. Since the season ended, I made a refactoring pass that enabled easier structural experimentation, and I noticed a significant improvement when using one of the more popular structures among top-10 players.
