Video Game AI and Machine Learning

Nicky Liu · Published in Future Vision · Nov 9, 2018 · 9 min read

Machine learning is the ability of a computer program to improve its own performance from experience. A common approach is the neural network, which is built from several layers: 1) an input layer, 2) hidden layers, and 3) an output layer.

Every neuron is connected to every neuron in the next layer. When certain neurons in the input layer fire, they follow some of these paths and activate some of the neurons in the hidden layer, which in turn fire and activate some of the neurons in the output layer. These connections, however, are not all equal: each one has its own weight. A neuron holds an activation value between 0 and 1, and the activation of a neuron in the next layer is found by taking each incoming neuron's activation, multiplying it by the weight of the connection, and summing the results; that sum determines whether and how strongly the next neuron fires.
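
To make that arithmetic concrete, here is a minimal sketch of a forward pass in Python with NumPy. The layer sizes, random weights, and sigmoid squashing function are illustrative assumptions, not any specific game's network:

```python
import numpy as np

def layer_forward(activations, weights, biases):
    """Compute the next layer's activations: weighted sum, then squash to 0-1."""
    z = weights @ activations + biases   # sum of (weight * activation) per neuron
    return 1.0 / (1.0 + np.exp(-z))      # sigmoid keeps values between 0 and 1

# A tiny 3-4-2 network: 3 inputs, one hidden layer of 4 neurons, 2 outputs.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)
W2, b2 = rng.normal(size=(2, 4)), np.zeros(2)

x = np.array([0.5, 0.1, 0.9])            # input activations (each between 0 and 1)
hidden = layer_forward(x, W1, b1)
output = layer_forward(hidden, W2, b2)
print(output)                            # two output activations between 0 and 1
```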

Some stuff happens that does some stuff somewhere else.

If the input nodes and output nodes have set values (again between 0 and 1), we can use the backpropagation algorithm. After the network attempts to reach an answer, the difference between the actual result and the expected goal is measured and carried backwards through the network, strengthening or weakening the weights along the path that produced that result. Tons of experience and data are needed for the program to learn to map inputs to outputs; otherwise it effectively starts from scratch whenever it is given new data. The ultimate goal is for the network to map inputs it has never seen to the correct outputs.
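
A toy version of that update loop, under the same assumptions as the sketch above (sigmoid neurons, squared error, made-up layer sizes):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy setup: learn to map a single input vector to a target output.
rng = np.random.default_rng(1)
W1, W2 = rng.normal(size=(4, 3)), rng.normal(size=(2, 4))
x = np.array([0.5, 0.1, 0.9])
target = np.array([1.0, 0.0])              # the "expected goal"

for step in range(1000):
    # Forward pass.
    h = sigmoid(W1 @ x)
    y = sigmoid(W2 @ h)

    # Error at the output: actual result minus expected goal.
    delta_out = (y - target) * y * (1 - y)  # sigmoid derivative is y * (1 - y)

    # Carry the error backwards to the hidden layer.
    delta_hid = (W2.T @ delta_out) * h * (1 - h)

    # Strengthen or weaken each connection against its share of the error.
    W2 -= 0.5 * np.outer(delta_out, h)
    W1 -= 0.5 * np.outer(delta_hid, x)

print(y)  # after training, close to [1, 0]
```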

Inputs and outputs are in number form, but input that isn't numbers can be used as well; it just needs to be converted to numbers first. For example, if we want a program to decide between actions in a video game, we simply assign each action to an output node; the network then gives each node an activation value, and the action with the highest activation is the one the AI is most likely to choose.
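
A hypothetical action layer might look like this; the action names and the choice of sampling proportionally to activation are illustrative assumptions:

```python
import numpy as np

# Hypothetical mapping from output nodes to in-game actions.
ACTIONS = ["move_left", "move_right", "jump", "attack"]

def choose_action(output_activations, rng):
    """Treat activations as preferences: higher activation -> more likely chosen."""
    probs = output_activations / output_activations.sum()  # normalize to probabilities
    return rng.choice(ACTIONS, p=probs)

rng = np.random.default_rng(2)
activations = np.array([0.1, 0.2, 0.05, 0.9])  # network output, one value per action
print(choose_action(activations, rng))          # "attack" most of the time
```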

This AI was bull****.

Originally, video game AI often barely qualified as AI at all; there was no complex decision tree, and it was practically the same as simple flow control. In Mortal Kombat 2 and 3, for example, the AI simply reacted to your input to counter you. If you pressed uppercut, the AI would crouch to avoid your uppercut, then uppercut you back. If you tried to jump at an AI, it would instantly jump straight up and kick you out of the air. And if you ever tried to throw an AI, it would instantly throw you back, with its throw given priority (this got ridiculous to the point that if you had defeated an AI and it was in its "FINISH HIM" state while you were low on HP, trying to throw it would cause it to snap out of that state, throw you first, and kill you).
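
This is not the actual Mortal Kombat code, but a sketch of what "flow control rather than AI" means in practice: a fixed table of counters keyed directly off the player's input, with no learning or decision-making anywhere:

```python
# A hypothetical reactive "AI" in the Mortal Kombat 2/3 style: no learning,
# no decision tree, just flow control that counters the player's last input.
def reactive_ai(player_input):
    if player_input == "uppercut":
        return ["crouch", "uppercut"]        # duck the uppercut, then return it
    if player_input == "jump_kick":
        return ["jump_straight_up", "kick"]  # knock the player out of the air
    if player_input == "throw":
        return ["throw"]                     # counter-throw, with built-in priority
    return ["idle"]

print(reactive_ai("uppercut"))  # ['crouch', 'uppercut']
```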

This specific Mario Kart's AI is actually a lot better and fairer, but this is a really nice picture, isn't it?

Mario Kart was similar: each AI followed a set path determined by its character, the course, and the speed class selected. AIs used any item they picked up instantly, with no secondary input (shells and banana peels can be thrown backwards or held, but AIs would press the use button exactly once, so shells flew straight forward and banana peels were dropped in place). AI difficulty was determined solely by an inherent speed boost rather than a difference in play style, although some AIs took slightly different paths than others based on character. When a cup (a set of four courses) was chosen, each AI was seemingly assigned a speed-boost factor, with the AIs determined to rank higher (number 1, 2, and so on) given a bigger boost (say 1.5x, versus an 8th-place AI being given 1x). This can be seen in certain hacked versions: if you get really far ahead and lap them, they speed up at insane rates to catch up (so-called rubber-band AI, an artificial mechanism for keeping AIs competitive), but they always speed up exactly according to their rank and stay in formation.
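
A sketch of how such rank-based rubber-banding could work; the boost table, distance threshold, and scaling factor here are invented for illustration, not Nintendo's actual values:

```python
# Each AI racer gets a speed multiplier from its assigned rank, scaled up
# when it falls far behind the player, so the pack always catches up while
# staying in rank order ("formation").
BASE_BOOST = {1: 1.5, 2: 1.4, 3: 1.3, 4: 1.2, 5: 1.1, 6: 1.05, 7: 1.0, 8: 1.0}

def ai_speed(base_speed, rank, distance_behind_player):
    boost = BASE_BOOST[rank]
    if distance_behind_player > 500:  # hypothetical threshold, in track units
        # The farther behind the AI is, the bigger the catch-up multiplier.
        boost *= 1.0 + distance_behind_player / 2000.0
    return base_speed * boost

print(ai_speed(100.0, 1, 0))     # 150.0 -- rank-1 AI at its normal boost
print(ai_speed(100.0, 1, 2000))  # 300.0 -- lapped AI catching up at an insane rate
```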

In August of 2017, a bot built by OpenAI (the research lab co-founded by Elon Musk) was able to defeat a professional player in a 1v1 Defense of the Ancients (DotA) match. DotA is much more complicated than a board game: there are over a hundred different characters, over 40 different items and combinations that can be bought, an open map with an enormous number of positions (along with hero abilities that can target that same range of positions), limited information in that the map and everyone's position on it are never fully known, and five players on each team who need to coordinate and work together. The scale of this 1v1, however, was less impressive: both players were limited to one hero, items were pre-set, and since it was a 1v1, the relevant map area was small and hidden information mattered far less. The bot was also given direct access to DotA's API, which fed it the exact position of everything, allowing it to move perfectly and hold its hero's maximum attack range against the enemy's location. While this may sound unimpressive, even with all these advantages, using the correct abilities, positioning correctly, and fighting at the proper opportunities is still extremely difficult for an AI. The bot is basically a script, and players in games like DotA do cheat with scripts, but guaranteed accuracy on skills and dodging is still not enough to win a 1v1; positioning and choosing when to attack minions versus attacking the enemy is a key skill that cannot be easily scripted, since it is completely dynamic. The bot also learned a few interesting tricks by playing against itself for several lifetimes' worth of games, such as feinting (starting and then cancelling an attack).
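
As a hypothetical illustration of what that API access buys, perfect range control reduces to a distance check that a human player can only eyeball; the attack-range value and function names here are invented for the example:

```python
import math

# With exact coordinates for both heroes, a bot can hold a fight at precisely
# its own maximum attack range, something humans can only estimate on screen.
ATTACK_RANGE = 550.0  # illustrative value, not any real hero's number

def next_move(my_pos, enemy_pos):
    dist = math.dist(my_pos, enemy_pos)
    if dist <= ATTACK_RANGE:
        return "attack"
    return "step_toward_enemy"  # close the gap until exactly in range

print(next_move((0, 0), (400, 300)))  # distance 500 -> "attack"
```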

All these concerns, however, became moot on June 25, 2018, when a team of OpenAI bots was able to take down a team of humans: the very employees who worked on OpenAI.

They are turning on their creators! :monkaS

Here OpenAI's learning followed the more traditional path of trial and error. The bots play against themselves for 180 years' worth of game time every day; they start off just wandering the map, picking up the basics after a couple of hours. They learn through reinforcement learning: the bot takes actions at random but is given points whenever it performs a desired action, such as killing a player or winning a match. OpenAI eventually learned the combos and moves most likely to lead to kills and would go around chasing them, since kills were highly rewarded. In DotA, however, kills are not an objective but a means to an end: getting kills rewards you with gold, which is used to buy items and make you stronger, and also takes the enemy off the map for a set amount of time, letting you destroy buildings until you can destroy the main base and win. OpenAI at this point resembled a low-rated player; all it could do was chase and try to kill enemies. Later, a new attribute, "Team Spirit", was introduced, rewarding points for pursuing shared objectives (things like grouping together and killing buildings or bosses), which caused the bots to focus more on team play rather than individual play. Interestingly, when a human was subbed in for one of the bots, he reported that it felt like he was on a very supportive team. The bots improved to the point where they beat a team of ex-pros and popular streamers who play at a pro level (albeit with less teamwork), and set their sights on competing in The International, the most prestigious tournament of the game.
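
A minimal sketch of how such a reward blend might work; the function, the parameter name tau, and the values are illustrative assumptions, not OpenAI's actual configuration:

```python
# Blend each bot's individual reward with the team's average reward.
# tau = 0 is pure selfish play (chase kills); tau = 1 means a bot only
# cares how the whole team is doing.
def shaped_rewards(individual_rewards, tau):
    team_mean = sum(individual_rewards) / len(individual_rewards)
    return [(1 - tau) * r + tau * team_mean for r in individual_rewards]

kills_and_objectives = [3.0, 0.5, 0.2, 0.1, 0.1]  # raw per-bot rewards
print(shaped_rewards(kills_and_objectives, 0.0))   # every bot for itself
print(shaped_rewards(kills_and_objectives, 0.8))   # mostly shared credit
```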

The International — The Biggest DotA 2 Tournament of the Year
Live footage of OpenAI’s rise against humanity being quelled.

Defeat

OpenAI experienced its first failure at The International, where it lost to paiN Gaming, then again to Big God, a Chinese team consisting of some of the best players in the world, known as "The Gods". OpenAI was crushed, and once it fell behind it was unable to make any moves to try to come back. The matches here displayed the limits of OpenAI's abilities, and perhaps the limits of machines playing these types of games in general.

  1. The AI learned to play the game by matching up directly against the enemy in lanes, and was completely lost when facing a different strategy. The Chinese team used a strategy where one player was given all the resources while the other four ran interference until that one player became impossibly strong later on. This is countered by dog-piling that one player, but the AI was unable to read and adapt to the strategy; it played as though it was going to be matched up normally.
  2. The most successful algorithms prioritize the probability of success over the magnitude of success. Just like the AI that defeated the champion at Go, moves that were more likely to succeed (a 90% chance of gaining 10%) were favored over moves that were less likely to succeed but had a bigger payout (a 60% chance to gain 90%); see the sketch after this list. In a game like DotA, the big successes are inherently tied to big risks, and playing only for small guaranteed successes leads to inactivity. For example, if your team is currently stronger but will eventually get outscaled (the enemy team's heroes become stronger than yours as the game goes longer), you need to group and force a fight over an important objective (such as Roshan, a strong neutral monster on the map). Either you take down the objective and win with the power it provides, or you force the enemy to fight you for it, defeat them, and then take more of their base or take the objective and carry on with your power boost. But fighting over the objective is always very dangerous, so while the reward is monumental, the risk is almost always much higher than the other options available to OpenAI that would lead to small advantages; this led to OpenAI not making the right calls in game. Similarly, when a team falls behind, the best way to come back is to take a high-risk, high-reward fight; merely minimizing big losses means slowly losing the game and any hope of a comeback. So once a team gained a solid advantage over OpenAI, it basically rolled over and died.
  3. OpenAI lacked foresight. All the program can do is figure out the optimal use of an ability in the moment (the one that hits the most people); it cannot act for the future. A player might save a spell predicting that enemies will group up later (making it more effective), or hold it as a threat to take certain options away from the enemy. The AI cannot plan for the future of a game; it cannot use strategies like the Chinese team's, which involve powering up slowly to become stronger later, because during the time you are powering up you are making decisions with suboptimal payoffs, knowing you can win if you eventually reach a certain point. The ability to play for the future seems to be one of the hardest things to teach with an algorithm.
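
Here is a small sketch of the trade-off in point 2, using the article's own illustrative numbers: an agent that ranks options by success probability picks the safe play, while one that ranks by expected value picks the risky objective fight.

```python
# Each option is (chance of success, payout). Picking by success probability
# favors the safe 90%-for-10 play; picking by expected value favors the
# risky objective fight. The option names are invented for illustration.
options = {
    "safe_farm":       (0.90, 0.10),
    "objective_fight": (0.60, 0.90),
}

by_probability = max(options, key=lambda k: options[k][0])
by_expected_value = max(options, key=lambda k: options[k][0] * options[k][1])

print(by_probability)     # safe_farm       -> what OpenAI tended to choose
print(by_expected_value)  # objective_fight -> EV 0.54 beats 0.09
```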

Machine learning is still on the rise, improving in almost every aspect, and without doubt video game AI will continue to improve with it. Perhaps a video game champion can even be made by simply making an AI so dominant that it doesn't need to predict the future and can win through its "skill" alone. Even if not, an AI good enough to fairly beat strong players, even if it can't beat "The Gods", is impressive in its own right, and will make practicing in single-player games much more interesting. We will continue to await the day when a perfect and unstoppable AI sits at the top of all video games.
