An Analysis On How Deepmind’s Starcraft 2 AI’s Superhuman Speed is Probably a Band-Aid Fix For The Limitations of Imitation Learning

  1. AlphaStar played with superhuman speed and precision.
  2. Deepmind claimed to have restricted the AI from performing actions that would be physically impossible for a human. They have not succeeded in this and are most likely aware of it.
  3. The reason why AlphaStar is performing at superhuman speeds is most likely due to its inability to unlearn the human players’ tendency to spam click. I suspect Deepmind wanted to restrict it to a more human-like performance but they are simply not able to. It’s going to take us some time to work our way to this point but it is the whole reason why I’m writing this so I ask you to have patience.

The Superhuman Speed of AlphaStar

Here’s the lead designer of the AI giving us their mission statement.

Spam Clicks, APM and the Surgical Precision of Robots

Most human players have a tendency to spam click. Spam clicks are exactly what they sound like: meaningless clicks that don’t have an effect on anything. For example, a human player moving their army will, curiously enough, often click more than once on the spot where they want the army to go. What effect does this have? None. The army won’t walk any faster. A single click would have done the job just as well. Why do they do it then? There are two reasons:

  1. Spam clicking is the natural by-product of a human being trying to click around as fast as possible.
  2. It helps to warm up finger muscles.
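
To make the difference between raw clicks and clicks that actually do something concrete, here is a rough Python sketch that computes APM and an EPM-like number from a list of actions. Everything in it is an assumption made for illustration: the `Action` record, the `effective_actions` filter and the half-second `repeat_window_s` are hypothetical, and the rule "a repeat of the previous command on the same target within a short window is spam" is only one simple way to approximate effective actions, not how the game or any replay tool actually counts them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """Hypothetical record of one player action (not a real replay format)."""
    time_s: float   # seconds since the start of the game
    command: str    # e.g. "move" or "attack"
    target: tuple   # (x, y) map coordinates


def apm(actions, duration_s):
    """Actions per minute, spam clicks included."""
    return len(actions) * 60.0 / duration_s


def effective_actions(actions, repeat_window_s=0.5):
    """Drop actions that repeat the immediately preceding action
    (same command, same target) within repeat_window_s seconds."""
    kept = []
    prev = None
    for action in sorted(actions, key=lambda a: a.time_s):
        is_repeat = (
            prev is not None
            and action.command == prev.command
            and action.target == prev.target
            and action.time_s - prev.time_s <= repeat_window_s
        )
        if not is_repeat:
            kept.append(action)
        prev = action  # compare against the last action seen, kept or not
    return kept


def epm(actions, duration_s, repeat_window_s=0.5):
    """Effective actions per minute under the simple repeat filter above."""
    return len(effective_actions(actions, repeat_window_s)) * 60.0 / duration_s


if __name__ == "__main__":
    # Four clicks on the same spot, then one attack command elsewhere.
    log = [Action(0.1 * i, "move", (10, 10)) for i in range(4)]
    log.append(Action(1.0, "attack", (20, 20)))
    print(apm(log, 60.0), epm(log, 60.0))
```

In this toy log the player issues five actions, but only two of them change anything, which is exactly the gap between the two numbers that spam clicking creates.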

Doing Things the Right Way VS Doing Things the Fast Way

Are you sure about that, David?

Why Does Deepmind Allow AlphaStar to Have Superhuman Mechanical Ability?

Now we finally get to the meat and potatoes of this essay. Thank you for sticking with me for this long. First, let’s recap.

  • We know what APM, EPM and spam clicking are.
  • We have a rudimentary understanding of what the upper limits of human play look like.
  • We understand that AlphaStar’s gameplay is in direct contradiction with what the developers claim it was allowed to execute.
  • We understand that the consensus among the Starcraft 2 scene is that AlphaStar won the games through superhuman army control and that superior strategic thinking wasn’t even needed.
  • We understand that the goal of Deepmind is not to create a bot that only micros really well or abuses the game in ways it was never meant to be played.
  • It is incredibly unlikely that no one in Deepmind’s Starcraft AI team questioned whether a burst APM of 1500+ is possible for a human player to replicate. Their Starcraft guy probably knows more about the game than I do. They are working closely with Blizzard, the company that owns the Starcraft IP. It is in their interest (see the previous bullet point and the mission statements from David Silver and Oriol Vinyals mentioned earlier in this essay) to make the bot act as close to a human as possible.
  • Maximum average APM over the span of a whole game.
  • Maximum burst APM over a short period of time. I think capping it around 4–6 clicks per second would be reasonable. Remember Serral and his 344 EPM that was head and shoulders above his competitors? That is less than 6 clicks per second. The version of AlphaStar that played against Mana was able to perform 25 clicks per second over sustained periods of time. This is so much faster than even the fastest spam clicks a human can do that I don’t think the original restrictions allowed for it.
  • Minimum time between clicks. Even if the bot’s speed bursts were capped, it could still fire off almost instantaneous actions back to back at some point within the time slice it was currently occupying, and so still play in an inhuman way. A human being obviously could not do this.
Look closely for the blue circle animation thingy.
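
For a sense of scale, 344 EPM works out to 344 / 60 ≈ 5.7 actions per second, while a burst of 1500 APM is 1500 / 60 = 25 actions per second. The three kinds of cap listed above can be combined into a single gate, and the Python sketch below shows one way that might look. It is purely illustrative and makes no claim about how Deepmind actually limits AlphaStar: the `ActionLimiter` class, its parameter names and its default values (300 average APM, 6 actions per one-second window, 50 ms between actions) are all assumptions chosen to mirror the numbers discussed in this essay.

```python
from collections import deque


class ActionLimiter:
    """Hypothetical gate that applies three caps to an agent's actions:
    average APM over the game, burst actions per second, and a minimum
    gap between consecutive actions. All defaults are illustrative."""

    def __init__(self, max_avg_apm=300.0, max_burst_per_s=6, min_gap_s=0.05):
        self.max_avg_apm = max_avg_apm
        self.max_burst_per_s = max_burst_per_s
        self.min_gap_s = min_gap_s
        self._recent = deque()  # accepted timestamps inside the last second
        self._total = 0         # accepted actions so far
        self._last = None       # timestamp of the last accepted action

    def allow(self, now_s):
        """Return True and record the action if it passes all three caps."""
        # Cap 3: minimum time between clicks.
        if self._last is not None and now_s - self._last < self.min_gap_s:
            return False

        # Cap 2: burst cap over a sliding one-second window.
        while self._recent and now_s - self._recent[0] >= 1.0:
            self._recent.popleft()
        if len(self._recent) >= self.max_burst_per_s:
            return False

        # Cap 1: average APM over the game so far. The first minute is
        # treated as a full minute so the average is defined at the start.
        elapsed_min = max(now_s, 60.0) / 60.0
        if (self._total + 1) / elapsed_min > self.max_avg_apm:
            return False

        self._recent.append(now_s)
        self._total += 1
        self._last = now_s
        return True


if __name__ == "__main__":
    limiter = ActionLimiter()
    # An agent trying to act 25 times per second for one second.
    accepted = sum(limiter.allow(i / 25) for i in range(25))
    print(accepted)
```

With these illustrative defaults, an agent attempting 25 actions in one second gets 6 of them through: the minimum gap rejects every other attempt, and the burst cap rejects the rest once six actions have landed inside the window.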

Why Care About This? What’s the Big Deal?

So there you have it. I suspect that the agent was not able to unlearn the spam clicking it picked up from imitating human players, and Deepmind had to tinker with the APM cap to allow experimentation. This had the unfortunate side effect of superhuman execution, which resulted in the agent essentially breaking the game by being able to execute strategies that were never intended to be possible in the first place.

This image was posted by Deepmind to their blog: https://deepmind.com/blog/alphastar-mastering-real-time-strategy-game-starcraft-ii/
