How we accelerated development in Unity with our build pipeline on Assemble With Care.
By Matthew Newcombe, Lead Developer on Assemble With Care @ ustwo games
Assemble With Care was an 18-month project recently released on Apple Arcade and Steam. It is a meditative, tactile game in which you restore objects that have sentimental value to their owners, and explore their relationships to one another.
This follows a previous post focused on the Technical Art behind the project, which you can find here.
A build pipeline is a bit of an unsung hero of game development, but it is a critical component of a healthy development life cycle. It is an automated way to validate every change to your game, identify problems and bugs before they’re seen, and allow any team member to quickly download the game onto their platform.
Ultimately it is the time it frees up from your development that makes it so valuable, allowing you to concentrate on making something great. For Assemble With Care, we wanted our pipeline to:
- Ensure the code compiles for every commit
- Test that the game’s critical path is playable with Automated Testing
- Ensure that the team is aware of any issues via Slack
- Produce builds on all our target platforms (iOS, tvOS, macOS and Steam)
- Make these builds easily accessible and available to anyone in the studio through App Center
- Move tickets in our bug database (Jira) over to our QA team for deeper validation
- Free Up Time — Accelerate Development
Here we’ll dive deeper into some specific parts of our pipeline that gave us a lot of value, and hopefully share some lessons that may help others. We’ll look at:
- The Hardware — What our setup is and how we arrived there, comparisons with Unity Cloud
- Validating and Feeding back — How we set up automatic testing for Assemble, integrated Slack for feedback, and some of the cultural problems we’ve had with build failures
- Integration with process — The importance of Jira for our workflow with QA, and how we ensure tickets are updated
- Downloading Assembler — Making builds easily accessible and available to anyone in the studio
Hardware behind the Pipeline
For a small studio there is a literal cost to creating a build pipeline: you need one (or more) separate machines to run these automated processes, and they need to be maintained. Making suitable choices saves money and reduces overhead.
We’ve adopted Jenkins, an open source automation software, to orchestrate our pipeline.
For Assemble, at our peak, we had 12 team members contributing to the GitHub repository. This was made up of 4 programmers, 3 artists, 2 designers, and a sound engineer. During peak development we were averaging 102 commits a day. In a perfect world we would be able to verify that each of those commits builds successfully.
There are many solutions for building Unity games: a multitude of CI software solutions to roll out on your own hardware, but also services like Unity Cloud Build, which remove the need to maintain your own hardware entirely — a huge saving, and ideally we wouldn’t need to manage our own. However, our highest priority was turnaround time (i.e. the time from a commit to having a build), which allows a team to react much faster to problems and helps ensure quality stays high. For comparison, we ran Unity Cloud alongside our own hardware builds for 2 months and then used the results to decide which to run with.
The time it took for Unity Cloud to make a single build from a commit over these 2 months was:
Min — 52 minutes
Average — 95 minutes
Max — 156 minutes
We then compared this against hardware we had available in our office, and ran multiple builds in parallel to simulate what our server load might look like.
This showed us that even for the lowest spec hardware we run builds on, the best case Unity Cloud build time was 10–20 minutes slower, and compared to the Mac mini it was over 5 times slower. These times meant we could run around 16 builds an hour per Mac mini, or 128 builds a day, which met our turnaround time requirements.
We have the Jenkins master installed on an iMac, and our tiny ‘server farm’ consists of three Mac minis acting as slaves. We typically have a Mac mini reserved for each game project currently under development, with one free Mac mini that can be used for time-critical builds when needed.
Within Jenkins we run a pipeline job that covers our gamut of builds whenever a new commit is detected. As we don’t have our IP address exposed, Jenkins polls the GitHub repository to detect changes.
We start off by building and running our automated testing (integration tests) which plays through the game’s critical path automatically and ensures each level can be completed.
Once this stage has passed, we spawn iOS, macOS, tvOS and Steam builds, which run in parallel.
Once a build has finished, it will move any bugs that were fixed in Jira over to our QA team, and upload the build to App Center, TestFlight or Steam, depending on the target.
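Sketched as a declarative Jenkinsfile, the flow above might look something like this. This is a simplified illustration: the stage names, agent label and script paths are assumptions, not our actual job definition.

```groovy
pipeline {
  agent { label 'mac-mini' }
  // No inbound webhook (our IP isn't exposed), so poll the repository.
  triggers { pollSCM('H/5 * * * *') }
  stages {
    stage('Integration Tests') {
      // On-device play-through of the game's critical path; fails the build
      // if any level cannot be completed or errors are logged.
      steps { sh './run_integration_tests.sh' }
    }
    stage('Platform Builds') {
      parallel {
        stage('iOS')   { steps { sh './build.sh ios' } }
        stage('macOS') { steps { sh './build.sh macos' } }
        stage('tvOS')  { steps { sh './build.sh tvos' } }
        stage('Steam') { steps { sh './build.sh steam' } }
      }
    }
    stage('Distribute') {
      // Upload to App Center / TestFlight / Steam, then move Jira tickets.
      steps { sh './upload_build.sh && ./update_jira_tickets.sh' }
    }
  }
}
```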
Validating and Feeding Back
There are a number of methodologies and techniques for reducing the bugs which make it into a build. One of the key techniques we rely on is our Automated Test Runner — an automatic playthrough of the entire game from start to finish, which will fail the build if any level cannot be completed or any errors are logged.
On each of our build machines we have an iPhone attached. Using the Unity Test Framework, the tests run on the device: the phone plays through the game and reports back the status (fail or success) for each level, ensuring that every object can be repaired and the level reaches its end screen.
We run our tests at 16x real speed so that the entire game takes only a couple of minutes rather than the 20 or so a real-time play-through would take. Here’s a test in-editor, showing the breakdown of each gameplay step through to finishing the level.
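Running faster than real time in Unity typically comes down to the engine’s time scale. A minimal sketch of one common way to do this — not necessarily exactly how we wired it up:

```csharp
using UnityEngine;

public static class TestSpeed
{
    // Run the game clock 16x faster during automated play-throughs.
    // Animation, tweens and other scaled-time systems speed up together.
    public static void Apply(float multiplier = 16f)
    {
        Time.timeScale = multiplier;
    }
}
```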
It was important to us that it was as close to real world conditions as possible, i.e. resembled what it would look like if someone was actually playing the game on the phone.
This can be achieved by actually playing the game, recording the input, then playing it back.
This is the most ‘real world’ approach, but it is incredibly prone to breaking if level design changes even slightly — a problem we wanted to avoid given how much we iterate on design. To avoid needing to re-record whenever a level changes, we created an auto-player using the visual scripting plugin Fungus.
Each level has a flowchart with a test block made up of a series of Commands (as in the gif above), where each command is a discrete action to perform in the game; sequencing these actions creates the play-through. Key to this was implementing those actions as emitters of touch data fed to our input code — emulating a more real-world interaction, protecting the tests against design changes, and giving us broad coverage of the code.
Diving into a step, here’s our Attach Command:
In this snippet we integrated a C# implementation of the promises paradigm by Real Serious Games, which enables highly readable code when writing sequenced logic. It does like to allocate though, so be careful when using it for anything performance critical. (Allocations are a huge topic in themselves, but there’s a good overview by Unity.)
This sequence checks where an object in the game is in screen space, then sends a faked finger through a wrapped API to our touch code.
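As an illustration of the shape of that code, a hypothetical attach step built on those promises might look like this. `TouchEmitter`, the command signature and the stubbed bodies are stand-ins for our actual API, not our real implementation:

```csharp
using RSG;           // C# promises library by Real Serious Games
using UnityEngine;

// Stand-in for our wrapped input API: each call fabricates touch data,
// hands it to the game's input code, and resolves once it has been consumed.
public static class TouchEmitter
{
    public static IPromise Press(Vector2 screenPos)   { /* feed TouchPhase.Began */ return Promise.Resolved(); }
    public static IPromise Drag(Vector2 a, Vector2 b) { /* feed TouchPhase.Moved */ return Promise.Resolved(); }
    public static IPromise Release(Vector2 screenPos) { /* feed TouchPhase.Ended */ return Promise.Resolved(); }
}

// Hypothetical "Attach" command: find the part and its socket in screen
// space, then drive a faked finger along the same path a player's would take.
public class AttachCommand
{
    public IPromise Execute(Transform part, Transform socket)
    {
        Vector2 from = Camera.main.WorldToScreenPoint(part.position);
        Vector2 to   = Camera.main.WorldToScreenPoint(socket.position);

        // Each step resolves before the next starts, so the sequence reads
        // top to bottom like the play-through it performs.
        return TouchEmitter.Press(from)
            .Then(() => TouchEmitter.Drag(from, to))
            .Then(() => TouchEmitter.Release(to));
    }
}
```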
Viewing this in-editor, the green dot is the debug visualisation of the touches that the game is being sent.
A note on the Unity Testing Framework — the generated Test DLL and your Game DLL are separate. As we are running the scenes in the Game DLL, we need to be able to pass results from the Game DLL to the Testing DLL, but the Testing DLL cannot depend on the Game DLL. We do this with a third “Bridge” DLL that both Game and Test have dependencies on. This DLL contains a single static class that the Test uses to pass data back and forth between the Game and the Test, such as which level to load, and fields for the Test to poll, determining when a level has reached completion.
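A minimal sketch of what such a bridge class can look like — all of the names here are assumptions:

```csharp
// Lives in the "Bridge" assembly that both the Game and Test assemblies
// reference, so results can cross the boundary without a direct dependency.
public static class TestBridge
{
    public static string LevelToLoad;     // set by the Test, read by the Game
    public static bool   LevelComplete;   // set by the Game, polled by the Test
    public static bool   ErrorLogged;     // set by the Game's log handler
    public static string StatusMessage;   // e.g. which repair step was reached

    // Called by the Test between levels so state never leaks across runs.
    public static void Reset()
    {
        LevelToLoad   = null;
        LevelComplete = false;
        ErrorLogged   = false;
        StatusMessage = null;
    }
}
```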
In summary, this approach to creating an automated play-through verified the game’s critical path for every build we made. There is an upfront development cost to creating something like this, but it gives the entire team another level of confidence when pushing commits, catches problems within 30 minutes of a commit so developers can react rapidly, and became invaluable for Assemble’s development.
A critical part of the build pipeline is communicating its status back to the team. We run our builds via a small bash script, where we check whether the return result from Unity indicates that the process errored (i.e. a non-zero exit code).
We then check whether it is a compilation failure or a test failure; if it is a test failure, we grep the failed levels from the output and assign them to an env var, SLACK_MESSAGE, which is later consumed by the Jenkins Slack plugin.
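A rough sketch of that check. The Unity invocation is replaced by a canned log and exit code so the sketch runs offline, and the log line format is an assumption:

```shell
#!/bin/sh
# Simulate the Unity run: in the real job this would be
#   "$UNITY" -batchmode ... > "$LOG" 2>&1; UNITY_EXIT=$?
LOG="unity_build.log"
printf 'Compilation succeeded\nTest failed: Level_07_Watch\nTest failed: Level_11_Camera\n' > "$LOG"
UNITY_EXIT=1

if [ "$UNITY_EXIT" -ne 0 ]; then
  if grep -q 'Compilation failed' "$LOG"; then
    SLACK_MESSAGE="Build failed: compilation error"
  else
    # Grep the failed levels out of the output for the Slack notification.
    FAILED=$(grep '^Test failed:' "$LOG" | sed 's/^Test failed: //' | tr '\n' ' ')
    SLACK_MESSAGE="Build failed in integration tests: $FAILED"
  fi
  echo "$SLACK_MESSAGE"   # picked up as an env var by the Jenkins Slack plugin
fi
```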
This then gets posted directly to our Slack channel so the entire team is instantly aware of a build failure; usefully, it links to the Unity output and calls out which levels in the integration test failed.
When the issue is resolved, the team is notified again in Slack.
Slack is an inherently noisy place. Our team grew to around 12 at its peak during development, and at that size there are a lot of people on a channel, so notifications (like a build failure) have the potential to distract a large number of people if not handled carefully.
We had a lot of problems initially due to false positives, either our iPhone SE was in a weird state (iOS update notifications are a bane of my life — pro tip, once your device is working, put it into flight mode, you won’t receive blocking notifications that stop apps from opening), or the tests themselves had some bugs which caused intermittent failures (maybe 1 in 10).
As a team we wanted to encourage a culture where the less-technical disciplines like design and art would read the build failures and feel empowered to fix them themselves.
Unfortunately, our early technical issues skewed the team’s perception of build failures, leading to the (false) assumption that most issues would need to be resolved by a developer; once we had solved the teething issues, it took a fair while to readjust team culture toward shared ownership of build failures. As a result of these mistakes I’d suggest the following:
- Try to make sure your integration tests (or any tests) are pretty rock solid from the word go. We threw together a quick solution, and then only took the time to refactor it properly a few months later, incorrectly prioritising it below some early game features.
- Limit the message rate from your builds as much as possible. Our strategy is to notify on the first build failure and again when builds return to normal, but for no other builds — pairing a failure with a success keeps the noise ratio low.
Integrating Builds with Process: Jira
The next part of the build pipeline to look at is how we integrate it within our workflow. Jira is a powerful tool we adopt for many aspects of production at ustwo games.
Our Jira boards are typically setup to look something like the following (this is a snapshot of our Assemble bug board):
When work is done on a ticket, the developer moves it to “In Develop”, which refers to a git branch called “develop” — the change is ready for a new build, as our build servers pull from that same develop branch. Once that build is created, we want to transition the tickets into “Ready for QA”, noting the build version number in a field on the ticket.
When this is a manual process it is prone to error: people forget to update their tickets, you have to go hunting for a build you know your changes are in and copy the right build number, and QA have to trust you’ve done all this or they end up testing the wrong version — you quickly end up with miscommunication. To solve this we wanted to automate the process.
So we wrote a small bash script that calls through to the Jira REST API (which is fully featured; you can find it documented here).
- When we start a build, a script “get_jira_tickets.sh” caches the current ticket ids under the IN DEVELOP column.
- When that build completes, “update_jira_tickets.sh” moves the cached tickets to the READY FOR QA column and updates the fixed-in version number.
Looking at each of these steps individually, the first uses curl to call through to the Jira API with a JQL query to find the tickets under the IN DEVELOP column. It then parses the ticket keys from the curl response JSON (we used jq — https://stedolan.github.io/jq/ — to do this) and writes them into a cache file.
The JQL query
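A sketch of what that first step can look like. The project key, column name and canned response are assumptions; the real script parsed with jq as described above, while this dependency-free version uses grep and cut:

```shell
#!/bin/sh
# Sketch of get_jira_tickets.sh; credentials would come from Jenkins secrets.
JQL='project = ASM AND status = "In Develop"'
CACHE_FILE="jira_tickets.cache"

# In the real script the response comes from the Jira search endpoint:
#   curl -s -u "$JIRA_USER:$JIRA_TOKEN" -G "$JIRA_URL/rest/api/2/search" \
#        --data-urlencode "jql=$JQL" --data-urlencode "fields=key"
# A canned response is used here so the sketch runs offline.
RESPONSE='{"issues":[{"key":"ASM-101"},{"key":"ASM-102"}]}'

# Pull each issue key out of the JSON and cache it, one key per line.
printf '%s' "$RESPONSE" | grep -o '"key":"[^"]*"' | cut -d'"' -f4 > "$CACHE_FILE"
cat "$CACHE_FILE"
```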
The second step, after a successful build, is to read this file back into bash, add the new build number to each ticket, and transition the tickets in Jira.
Adding the build number is pretty simple (note that VERSION_NUM is written into the environment variables as part of the build job).
Transitioning the tickets has a few more steps (both entire scripts can be found just below) but the Jira API call looks like:
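A rough offline sketch of that transition step. The transition id and the “fixed in” custom field id are assumptions (every Jira project has its own), and the curl calls are shown as comments so the sketch runs without network access:

```shell
#!/bin/sh
# Sketch of update_jira_tickets.sh.
VERSION_NUM="1.2.345"                # written into the env vars by the build job
CACHE_FILE="jira_tickets.cache"
printf 'ASM-101\nASM-102\n' > "$CACHE_FILE"   # normally left behind by get_jira_tickets.sh

RESULT=$(while read -r KEY; do
  # Stamp the fixed-in version on the ticket, then transition it:
  #   curl -s -u "$JIRA_USER:$JIRA_TOKEN" -X PUT -H 'Content-Type: application/json' \
  #        -d "{\"fields\":{\"customfield_10042\":\"$VERSION_NUM\"}}" \
  #        "$JIRA_URL/rest/api/2/issue/$KEY"
  #   curl -s -u "$JIRA_USER:$JIRA_TOKEN" -X POST -H 'Content-Type: application/json' \
  #        -d '{"transition":{"id":"31"}}' \
  #        "$JIRA_URL/rest/api/2/issue/$KEY/transitions"
  echo "$KEY -> READY FOR QA (fixed in $VERSION_NUM)"
done < "$CACHE_FILE")
echo "$RESULT"
```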
So just a couple of simple scripts overall, but we now have tickets tied to reliable build numbers and sitting in the right column for QA, giving the QA team confidence that the build they are playing contains the right things.
Downloading Assembler
The final part to mention is being able to rapidly upload builds to a distribution platform so that developers and testers can access them as soon as possible. Depending on the platform you are developing for, the platform holder’s approach to this can vary wildly. With Steam, for example, you can push a build through the upload API and have it immediately accessible via Steam for anyone with access, which is fantastic, and you can separate builds into different branches to differentiate your development and release builds. For Apple it’s a different story, and there are a number of non-trivial gotchas: a newly pushed build can take 1–3 hours to become available through TestFlight, and you only have a single app to publish builds to, making differentiating streams of work incredibly difficult.
App Center (https://appcenter.ms/apps, formerly HockeyApp) is a Microsoft-published online platform that accepts uploads of iOS, Android, Mac and Windows builds, and lets you download and install them directly to the device from a weblink. The only restriction for iOS is ensuring that your provisioning profile includes any phone that will install the build from App Center. We hook this up in Jenkins using the open source App Center plugin (https://github.com/jenkinsci/appcenter-plugin).
All in all, a lot of effort went into our build pipeline, but we think it was well worth it, enabling our team to ship on three platforms at release with confidence in every build. We will be building on this pipeline for all our new games in the future.
Finally, we wanted to acknowledge Van Le and Laure De Mey, who took the initiative on developing and integrating many aspects of the pipeline and are always on the lookout for future improvements to keep the ustwo games build wheel spinning.
Hopefully this was useful for anyone interested in build pipelines, or just curious about what went into ours. If there’s anything you’re particularly curious about, please reach out and let us know in the comments, or via Twitter.
Thanks for reading!