Attaining Maximum Performance with Automated Tests

Michael Short
Published in spaceapetech · Mar 21, 2019

Performance

Why is it important?

At Space Ape Games we take performance very seriously, and over the last year the studio has made huge progress in this field. Performance is king. Our players deserve the best games. The best games are fun to play, look amazing, maintain a steady frame rate and don't leave you with 10% battery charge after each session. If we look at one of our older games, Transformers: Earth Wars, we can see a correlation between frame rate and day-30 retention.

Day 30 retention vs Average FPS

When our games run well, our players have a better experience, and they play for longer.

How do we track performance stats?

We are a mobile-only studio, making games for iOS and Android. There are relatively few iOS devices to optimize for, and we tend to support everything from the iPhone 5S onward. However, once you factor in Android, where we target everything from the Samsung Galaxy S5 and up, the number of handsets and tablets we support skyrockets. We simply don't have the bandwidth to test each of our games on every one of these devices.

In this pie chart we can see a breakdown of the devices that Transformers: Earth Wars is installed on. Each game differs, but roughly speaking:

  • Low = Samsung Galaxy S5/iPhone 5S
  • Medium = Samsung Galaxy S7/iPhone 7
  • High = Samsung Galaxy S9/iPhone X
The range of devices on which our games run. It's a pretty even split.

We track performance metrics in all of our games. These metrics are completely anonymous. As users play our games, these stats are sent back to us, allowing us to make sure our games are running at peak performance; if they're not, we can do something about it so that our players get the best possible experience. For more information on how this works, check out my Realtime Mobile Performance Logging blog post. Essentially, it allows us to make correlations between performance metrics and hardware, operating systems, graphics APIs, etc.

Average frame rate across Android and iOS devices over the past 30 days
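The pipeline behind those dashboards is covered in the blog post above, but as a rough illustration, a minimal Unity-side sampler could look something like the sketch below. This is purely illustrative: the class name, flush cadence and payload shape are hypothetical, not our actual implementation.

```csharp
using System.Collections.Generic;
using UnityEngine;

// Illustrative sketch only: samples frame times and periodically logs an
// anonymous summary that a real build would POST to a logging service.
public class PerfMetricsSampler : MonoBehaviour
{
    const float FlushIntervalSeconds = 60f; // hypothetical cadence

    readonly List<float> frameTimes = new List<float>();
    float nextFlushTime;

    void Update()
    {
        frameTimes.Add(Time.unscaledDeltaTime);
        if (Time.realtimeSinceStartup >= nextFlushTime)
        {
            Flush();
            nextFlushTime = Time.realtimeSinceStartup + FlushIntervalSeconds;
        }
    }

    void Flush()
    {
        if (frameTimes.Count == 0) return;

        float total = 0f;
        foreach (float t in frameTimes) total += t;
        float avgFps = frameTimes.Count / total;

        // Tag with hardware, OS and graphics API so the backend can
        // correlate performance with device characteristics.
        Debug.Log($"avgFps={avgFps:F1} device={SystemInfo.deviceModel} " +
                  $"os={SystemInfo.operatingSystem} gfx={SystemInfo.graphicsDeviceType}");
        frameTimes.Clear();
    }
}
```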

How do we fix performance regression?

Tracking these performance metrics allows us to fix issues as we find them, but this is a very reactive approach to the problem: our users would have to experience a game running poorly before we even knew about it. Ideally we want to catch performance regressions before we ship a new build.

Overdraw in one of our games

Our senior tech artist, Stoyan Dimitrov, created a monthly game report to help the teams identify problems during development. These reports highlighted issues such as overdraw, draw call counts and shader complexity, and at the end of each report he would give some advice on what each games team should do to remedy the issues he found. Unfortunately, as we spun up more and more exciting prototypes, Stoyan was unable to dedicate enough time to each team.

A subsection of a Stoyan Report

This was a much more proactive approach: we found and fixed issues before builds were released. But it took up far too much of Stoyan's time.

Collaborating with Arm

Workflow

Our old workflow looked a little like this: we would identify a slowdown, then someone would take the time to profile the game and try to pin down the issue. The first step is to identify whether the game is CPU or GPU bound. From there we need to find out why we are CPU bound (e.g. we're submitting too many draw calls to the graphics API) or GPU bound (e.g. we're shading too many pixels). Say we are GPU bound: are we overloading one of the shader stages (vertex or fragment)? Are we maxing out the ALUs? Are we stalling on texture reads? This can be a very time-consuming task.

Old workflow
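Even that first step, working out whether you're CPU or GPU bound, needs tooling. As a rough first pass inside Unity, you can compare CPU and GPU frame times directly. This sketch assumes a Unity version with FrameTimingManager (2018.1+) and Frame Timing Stats enabled in Player Settings; on platforms that don't report frame timings it will simply stay silent.

```csharp
using UnityEngine;

// Rough first-pass check for CPU vs GPU bound frames.
public class BoundCheck : MonoBehaviour
{
    readonly FrameTiming[] timings = new FrameTiming[1];

    void Update()
    {
        FrameTimingManager.CaptureFrameTimings();
        if (FrameTimingManager.GetLatestTimings(1, timings) == 0)
            return; // this platform/driver doesn't report frame timings

        double cpuMs = timings[0].cpuFrameTime;
        double gpuMs = timings[0].gpuFrameTime;

        // The 1.2x margin is an arbitrary threshold for the sketch.
        if (gpuMs > cpuMs * 1.2)
            Debug.Log($"Likely GPU bound: cpu={cpuMs:F2}ms gpu={gpuMs:F2}ms");
        else if (cpuMs > gpuMs * 1.2)
            Debug.Log($"Likely CPU bound: cpu={cpuMs:F2}ms gpu={gpuMs:F2}ms");
    }
}
```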

As we often use Streamline and the Mali Graphics Debugger to identify these issues, we decided to reach out to Arm. After some discussion we agreed to collaborate on a prototype profiling tool. Our aim was to produce Stoyan reports automatically, which would essentially change our workflow to look more like this.

Desired workflow

Unity Integration

The first step is to integrate Arm's tooling into our games. We use Unity across all games in our studio. If you look inside your Mali Graphics Debugger install folder, you will find a /target/android/arm/unrooted/ folder. Copy this directory into the /Assets/Plugins/ folder within your Unity project.
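On macOS or Linux that copy might look like this (both paths are placeholders for your own install and project locations):

```sh
cp -r "/path/to/MaliGraphicsDebugger/target/android/arm/unrooted" \
      "/path/to/MyUnityProject/Assets/Plugins/"
```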

If you’re using Unity 2018.2 or newer a new native interface was introduced allowing us to pass Unity profiler annotations directly to external profilers such as Streamline. We actually open sourced the Unity profiler integration so you can simply drop it into your project.

Now you should be able to run your game on an Android device and attach the Mali Graphics Debugger or Streamline, complete with detailed profiler annotations.
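Anything you already mark up with Unity's standard profiler API will then show up as annotations in the Streamline timeline. For example (EnemySpawner is just a stand-in class for this sketch):

```csharp
using UnityEngine;
using UnityEngine.Profiling;

public class EnemySpawner : MonoBehaviour
{
    void Update()
    {
        // Standard Unity profiler samples; the native integration
        // forwards these annotations to Streamline.
        Profiler.BeginSample("EnemySpawner.Update");
        SpawnPendingEnemies();
        Profiler.EndSample();
    }

    void SpawnPendingEnemies()
    {
        // Game-specific work; a placeholder for the example.
    }
}
```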

Streamline, Gator and the Performance Advisor

Before we dive in, let's take a quick look at how Streamline works. At the base layer, our hardware and software expose counters and events. These are collected by a proxy tool known as Gator, which gathers all of the metrics and composites them into a single capture. The capture is then opened in Streamline, where we can view all of the counters and events in graph form.

Streamline can be quite intimidating to use; it's very low level. We were desperate for a tool that would give us a high-level overview of performance within our games, one that the entire team could understand and use. Arm were more than happy to prototype this tool. Below we can see the basic flow required to output data in a simple report format.

The flow required to generate a performance report.

Arm created what's now known as the Analysis Engine. It gathers all the data from a Gator capture and analyses it, making sense of the raw counters so that it can provide insightful information and suggestions via a report: a simple HTML document that anyone can open. We then worked with Arm to build this into our CI system. Each night we would build one of our games, run it on an Android device, capture the performance data with Gator and automatically generate a report for the team to see each morning when they sat down with their coffee.

CI Integration

The Continuous Integration setup was a little trickier than we anticipated (at Space Ape we use Jenkins). Capturing data on Android using Gator is currently a very manual process. Working with Arm, we managed to create a prototype automation system. The plan going forward is to develop this into an out-of-the-box solution.

We developed a shell script that runs through the following steps:

  1. Uninstall any old build from the device and install the new APK.
  2. Run the newly installed APK on the device.
  3. Capture data via Gator.
  4. Transfer this capture data back to the build box.
  5. Analyse this data and produce a Performance Report.

There’s a lot to cover here, so I’ve shared a script below that we developed with a lot of help from Arm. Like I say this is a prototype, but going forward it will help us to develop a really simple, out of the box solution.

Automated Performance Advisor report generation

Devices

Because the Android platform is so fragmented, we found it really difficult to identify devices that would give us all of the counters and events we needed out of the box (i.e. on a non-rooted device). In the end we found that these three devices were the best at providing reliable results:

  • Oppo R15
  • Huawei Mate 10 Pro
  • Samsung Galaxy S9

Going forward, it would be awesome to see Google and the hardware vendors collaborate on this and standardize the platform a little to help out developers.

Performance Advisor Reports

So we finally have our reports! At the top is an overall summary of the capture, showing the percentage of time our game was bound by VSync, the CPU, GPU vertex shading or GPU fragment shading, along with an average frame rate.

Arm’s Performance Advisor frame summary

Underneath this is a more in-depth graph. It shows the average frame rate over the entire capture and the level of overdraw in your game, and it colour codes the capture so you can see at any point where you are bound. We also worked with Arm to add regions, which let us categorise parts of the capture, so we can see clearly whether we're on a loading screen, the character select screen, in game, etc.

Arm’s Performance Advisor frame analysis

The Performance Advisor then gives us a region-by-region overview, allowing us to see where each region was bound and its average frame rate. Using our plugin, you can define a region in your game by calling StreamlinePlugin.BeginRegion("myRegion"); and StreamlinePlugin.EndRegion("myRegion");

Arm’s Performance Advisor region analysis
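For example, wrapping a loading flow in a region could look like the sketch below. LevelLoader is a stand-in for your own game code; only the StreamlinePlugin calls come from our plugin.

```csharp
using System.Collections;
using UnityEngine;
using UnityEngine.SceneManagement;

public class LevelLoader : MonoBehaviour
{
    // Marks the whole loading flow as a "Loading" region, so the
    // Performance Advisor can report on it separately.
    public IEnumerator LoadLevel(string levelName)
    {
        StreamlinePlugin.BeginRegion("Loading");
        yield return SceneManager.LoadSceneAsync(levelName);
        StreamlinePlugin.EndRegion("Loading");
    }
}
```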

The Future

We've been working closely with Arm to help drive the development of this tool, and it's been awesome working with them. We really feel they have valued our feedback, and they've delivered an awesome prototype that's allowed our game teams to take control of performance. The data is so accessible that everyone on the game team, from QA and product to art and dev, can find what they need. We're hoping to collaborate further with Arm and take the Performance Advisor tool from prototype to a shippable product. Arm have discussed taking this even further, providing automated frame analysis like what you see inside the Mali Graphics Debugger or Streamline.

Automated Frame Analysis

For anyone that's interested, we're holding a talk with Arm at GDC this year, so if you're there drop by and say hello, or come speak to someone at the Expo (South Hall, #1049). If you're not lucky enough to be at GDC, you can always download the slides afterwards. If you're interested in becoming a beta tester and providing feedback for the Performance Advisor suite, reach out to me and I can pass your details along to someone at Arm, or sign up here.
