Beyond the routine in QA: how we automated regression testing

MY.GAMES
Apr 16, 2024 · 11 min read

QA work is not tedious and monotonous; it’s a creative endeavor with a rich range of opportunities. To be fair, there is a boring side: regression. So we decided to automate regression testing. Here’s how.

Within the IT industry, there is a prejudice that the work of a tester is both tedious and monotonous. Allow us to disagree. In our opinion, QA is both a creative and technical endeavor that presents a full range of research opportunities. Further, to do this job well, you need to immerse yourself in the task, understand all its subtleties and complexities, and be aware of potential pitfalls and able to manage them.

Automating the mechanics of an auto-battler

First, let us provide a little context about the project that our workflow fits into. We’re responsible for the development of a game called Hustle Castle, a mobile auto-battler with both economic-strategy and RPG elements. Unity is our engine of choice for this project, and the server is written in Java. The core mechanics of auto-battle are as follows: the player’s and the enemy’s units clash, all characters have special equipment that confers abilities, and the battle plays out automatically. The user can only cast spells and use their hero’s talents.

Hustle Castle also has a castle feature that involves a myriad of rooms where you can extract resources, craft items, and so on. Additionally, there are network mechanics featuring clans, territory development, arenas, and much more. All this coexists with each other and follows a common logic.

With the description of Hustle Castle out of the way, now we can move on to deeper things. The gameplay is tied to the abilities of the units. From a technical point of view, an ability is a kind of entity that has many settings: how and when it will be activated, who and how it will affect, what other abilities it can activate, and so on.

We describe the behavior of each ability in the form of JSON objects. Abilities can be simple or structural; structural ones activate other abilities sequentially or in parallel. In addition, there are different types of abilities, such as counters, buffs, and stuns. In general, there are a lot of different settings to consider.

All this is taken into account by the auto-battle algorithm. To put it simply, it works like this: there is some data as the input — the state of our fighters, the stats and abilities they have. We transfer this data to the battle calculator, which does all the calculations. For each calculation step, it reports what event happened, whether some kind of ability was activated, and whether some damage was dealt. After a full cycle of calculations, we get the result of the battle.

It’s worth noting that the battle calculator code is available both on the client side and on the server side. This is necessary to validate the results of the battle — the client, the server, and the battle calculator need to look at the same data.

To automate the auto-battle check, we decided to make a custom client that can generate a request to the server to receive the initial battle state based on incoming data. Autotests use this client, receive a state, pass it to the calculator, subscribe to the event of a new calculation step, and then check the conditions at each step.
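The flow described above can be sketched in a few lines of Java. This is a minimal illustration and not the game's actual calculator: the `BattleCalculator`, `Fighter`, and `BattleEvent` names, the event fields, and the toy "trade hits in turn" rule are all assumptions made for the example. It only shows the shape of the approach, where a test subscribes to calculation steps and checks a condition at each one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical fighter state; the real state also carries stats and abilities.
final class Fighter {
    final String name;
    final int attack;
    int hp;

    Fighter(String name, int hp, int attack) {
        this.name = name;
        this.hp = hp;
        this.attack = attack;
    }
}

// Hypothetical event reported after every calculation step.
final class BattleEvent {
    final int step;
    final String attacker;
    final String target;
    final int damage;
    final int targetHpAfter;

    BattleEvent(int step, String attacker, String target, int damage, int targetHpAfter) {
        this.step = step;
        this.attacker = attacker;
        this.target = target;
        this.damage = damage;
        this.targetHpAfter = targetHpAfter;
    }
}

final class BattleCalculator {
    private final List<Consumer<BattleEvent>> listeners = new ArrayList<>();

    // Tests subscribe here to observe each calculation step.
    void onStep(Consumer<BattleEvent> listener) {
        listeners.add(listener);
    }

    // Toy rule set: fighters trade hits in turn until one reaches 0 HP.
    // Assumes attack > 0 for both fighters, otherwise the loop never ends.
    String run(Fighter first, Fighter second) {
        Fighter attacker = first;
        Fighter target = second;
        int step = 0;
        while (first.hp > 0 && second.hp > 0) {
            step++;
            target.hp = Math.max(0, target.hp - attacker.attack);
            BattleEvent event =
                new BattleEvent(step, attacker.name, target.name, attacker.attack, target.hp);
            for (Consumer<BattleEvent> listener : listeners) {
                listener.accept(event);
            }
            Fighter swap = attacker;
            attacker = target;
            target = swap;
        }
        return first.hp > 0 ? first.name : second.name;
    }
}

final class BattleDemo {
    public static void main(String[] args) {
        List<BattleEvent> events = new ArrayList<>();
        BattleCalculator calculator = new BattleCalculator();
        calculator.onStep(events::add);
        // A per-step check, in the spirit of "check the conditions at each step".
        calculator.onStep(e -> {
            if (e.targetHpAfter < 0) {
                throw new AssertionError("HP below zero at step " + e.step);
            }
        });
        String winner = calculator.run(new Fighter("knight", 30, 10), new Fighter("orc", 20, 5));
        System.out.println(winner + " won in " + events.size() + " steps"); // knight won in 3 steps
    }
}
```

In the real setup the initial state would come from the server through the custom client rather than being constructed by hand.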

Average test of our combat system

To check the mechanics of the combat system, we use abilities collected specifically for testing. Checks are aimed at mechanics; we don’t check skills from the production.

We also use this approach in our in-house product called Battle Runner. Our game designers need it to balance the combat system. A game designer can take any player states from production, combine them into test groups, optionally replace one ability with another, and run the game on a large number of battles. As a result of the run, the game designer receives the desired battle statistics.

This was the first step towards automation. Our autotests help us ensure that the basic logic of the battle calculator is still working after the game designers and developers have made some changes. This is a complex process, and unfortunately, we don’t have unit tests for this. Therefore, these battle calculator autotests are our only way to make sure that most of the combat system is stable.

Tests on the server and on the client

To explain what will happen next, let’s consider some theory — let’s say you’re observing the well-known test pyramid.

The essence of the pyramid is simple. The closer we are to the bottom, the cheaper and faster the tests are; and the higher we are, the longer and more expensive the tests become.

The solution seems obvious: use the fastest, cheapest tests, that is, unit tests. But things are a little more complicated than that. For developers to write unit tests, the application must have a specific design; that is, it must be testable. And, unfortunately, not every application is.

There are almost no unit tests for the server logic. What we do have are component tests that mainly cover matchmaking and the basic logic of modes and features.

If we consider the server separately, then we have the following situation: there are no integration tests on our server. We could theoretically write API tests, since our client and server communicate using protobuf. So, there is a description of the protocol, and we can take the client and send requests. But for now, we’re keeping this idea in reserve.

On the client side, things are a little more tragic. There are no unit tests, and no component tests either. So, we find ourselves at the top of the pyramid and we just have to test our application through the UI. Most of our game looks like this: lots of buttons, dialogs, pop-ups, more dialogs, more buttons. Almost all interface elements live on Canvas.

As a basic tool, we use the open-source solution AltUnityTester — this is a driver that provides:

  1. Search for objects using x-path
  2. Scene management
  3. Simulation of input methods (tap, scroll, drag-n-drop, and so on)
  4. Calling methods and getting properties of game-objects
  5. The protocol of interaction through a web socket, which allows you to add many other commands.

We chose Java, Allure, and TestNG, decided to apply the Page-Object pattern, and started writing tests. At first, everything was pretty cool. We wrote about 10–15 basic tests that just checked that the interface did something.

However, it quickly became clear that our codebase contained a number of issues that would affect us more and more as the project continued to grow. The first issues were related to selectors. The screenshot below shows an example of how we used Page-Object. The fields of the class are selectors, and the methods contain calls to the driver and additional logic.

The problem was not only its massive appearance, but also that the AltUnity API ended up in all of our classes. And if the developers were to change something in a new version, it would be excruciatingly painful for us to update.

Another issue was the responsibility of Page objects. First, inside the Page object, we called the driver (hello, API!). Second, objects could have a twisted logic. Third, our Page objects knew about other Page objects — that is, they were engaged in navigating through objects.

Our Page objects looked something like this

Another problem was dependency injection. When there were few classes, everything was in order. But as the tests grew more complex, we had to wire in a bunch of dependencies and also keep track of which ones existed at all.

This is what the dependencies of a typical test look like

A large number of dependencies causes unnecessary difficulties: for example, if a new person comes to the company and tries to write an autotest, they will need to study the entire variety of APIs, create a mental map of the classes we have and how they are related, and put in a lot of effort to dive into the process.

The tests looked very confusing; it was difficult to understand them

And the last problem we encountered was code duplication. For example, the picture above shows the OpenShopAndBuyRoom method, which is private to this test class, so we can’t use it elsewhere. But since we want to write more tests, we want to somehow reuse this method and it must belong to some class.

Time to stop and think

The use of AltUnityTester and the Page object pattern is very similar to automation in web application development. There our colleagues use Selenium WebDriver. And if we take concepts from the web and use them in our subject area, then we get:

  1. UnityDriver — interaction with the game.
  2. Unity-Object — a structural pattern for describing dialogs, screens and scenes. We use them only to describe the structure, and STEPS deal with the logic.
  3. Unity-Element — buttons, pictures, dialogs, text and so on. In general, everything that we have in the scene in Unity is a Unity-Element.

We looked into the sources of WebDriver and the HTML Elements framework and managed to adapt the code to our needs. We also used the STEPS pattern to separate the test logic from Unity-Objects. As a result, we got a framework that allows us to:

  1. Group entities into separate classes (Button, Label, AbstractDialog, and so on).
  2. Set the x-path of UI elements using @FindBy annotations, as well as introduce new annotations and extensions.
  3. Create separate groups of elements and reuse them in different dialogs by searching for objects in the context of another object.
  4. Create representations of components in Unity on the test side (since there can be several components in an object).
  5. Using STEPS, write tests in terms of the business logic of the game (“Open a shop”, “Buy a product”, and so on).
  6. Keep the AltUnity code deep in the core, with the driver hidden behind an interface.
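Several of these points can be illustrated with a small Java sketch. It is not the framework's code: `UnityDriver`, `UnityElement`, `PriceTag`, the `FakeDriver` stand-in, and all the paths are hypothetical names chosen for the example. It shows typed elements (`Button`, `Label`), a reusable group of elements resolved in the context of a parent object, and a driver interface that keeps the AltUnity calls out of every other class.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// The only seam that a real AltUnity-backed implementation would sit behind.
interface UnityDriver {
    void tap(String path);
    String text(String path);
}

// In-memory stand-in for a real driver, so the sketch runs on its own.
final class FakeDriver implements UnityDriver {
    final Map<String, String> texts = new HashMap<>();
    final List<String> taps = new ArrayList<>();

    @Override public void tap(String path) { taps.add(path); }
    @Override public String text(String path) { return texts.get(path); }
}

abstract class UnityElement {
    protected final UnityDriver driver;
    protected final String path;

    UnityElement(UnityDriver driver, String path) {
        this.driver = driver;
        this.path = path;
    }

    // Resolves a child path in the context of this element.
    protected String child(String relativePath) {
        return path + "/" + relativePath;
    }
}

final class Button extends UnityElement {
    Button(UnityDriver driver, String path) { super(driver, path); }
    void press() { driver.tap(path); }
}

final class Label extends UnityElement {
    Label(UnityDriver driver, String path) { super(driver, path); }
    String text() { return driver.text(path); }
}

// A reusable group of elements that can be embedded in any dialog.
final class PriceTag extends UnityElement {
    final Label amount;
    final Button buy;

    PriceTag(UnityDriver driver, String path) {
        super(driver, path);
        amount = new Label(driver, child("Amount"));
        buy = new Button(driver, child("BuyButton"));
    }
}

final class ElementsDemo {
    public static void main(String[] args) {
        FakeDriver driver = new FakeDriver();
        driver.texts.put("//ShopDialog/PriceTag/Amount", "150");

        // The same group is reused under two different parent dialogs.
        PriceTag shopPrice = new PriceTag(driver, "//ShopDialog/PriceTag");
        PriceTag forgePrice = new PriceTag(driver, "//ForgeDialog/PriceTag");

        System.out.println(shopPrice.amount.text()); // 150
        forgePrice.buy.press();
        System.out.println(driver.taps); // [//ForgeDialog/PriceTag/BuyButton]
    }
}
```

Because elements depend only on the `UnityDriver` interface, a change in the underlying tool's API would touch one implementation class rather than every dialog description.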

A little about STEPS: they connect our tests with Unity-Objects. Unity-Objects make it possible to click on an element or read some data from the game, and all the logic lives in STEPS. This gives us the ability to write tests in terms of a business process. For example, “On this location, open a barracks”, “In the barracks, upgrade the barracks”, “Take a unit and transfer it to the barracks”. Under the hood, these are drag-n-drops, clicks, and everything else.
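As a rough sketch of that split, assume a hypothetical `BarracksSteps` class built on two structure-only Unity-Objects. The class names, method names, paths, and the `RecordingDriver` stand-in are all invented for illustration; only the division of responsibilities follows the description above.

```java
import java.util.ArrayList;
import java.util.List;

interface UnityDriver {
    void tap(String path);
    void dragAndDrop(String fromPath, String toPath);
}

// Stand-in driver that records actions instead of talking to a game instance.
final class RecordingDriver implements UnityDriver {
    final List<String> actions = new ArrayList<>();

    @Override public void tap(String path) {
        actions.add("tap " + path);
    }

    @Override public void dragAndDrop(String fromPath, String toPath) {
        actions.add("drag " + fromPath + " -> " + toPath);
    }
}

// Unity-Objects: structure only, no logic.
final class CastleScreen {
    final String barracks = "//Castle/Rooms/Barracks";

    String unit(String name) {
        return "//Castle/Units/" + name;
    }
}

final class BarracksDialog {
    final String upgradeButton = "//BarracksDialog/UpgradeButton";
}

// STEPS: game-level actions expressed through Unity-Objects and the driver.
final class BarracksSteps {
    private final UnityDriver driver;
    private final CastleScreen castle = new CastleScreen();
    private final BarracksDialog dialog = new BarracksDialog();

    BarracksSteps(UnityDriver driver) {
        this.driver = driver;
    }

    BarracksSteps openBarracks() {
        driver.tap(castle.barracks);
        return this;
    }

    BarracksSteps upgradeBarracks() {
        driver.tap(dialog.upgradeButton);
        return this;
    }

    BarracksSteps moveUnitToBarracks(String unitName) {
        driver.dragAndDrop(castle.unit(unitName), castle.barracks);
        return this;
    }
}

final class StepsDemo {
    public static void main(String[] args) {
        RecordingDriver driver = new RecordingDriver();
        // The test body reads like the business process, not like UI plumbing.
        new BarracksSteps(driver)
            .openBarracks()
            .upgradeBarracks()
            .moveUnitToBarracks("Archer");
        driver.actions.forEach(System.out::println);
    }
}
```

Returning `this` from each step is a design choice for the sketch: it lets a scenario be chained, and the same step class can be reused outside functional tests.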

The second feature of STEPS is that they can be reused in the future, and not only within the framework of functional tests. For example, recently we needed to implement a trial run of one scenario on many different player states. We created a new project, plugged in the library with STEPS, used a few lines of code to run the scenario, and it was done.

Below is the Unity Object. Remember what our selectors looked like? They were terribly ugly. Now we just use annotations where we prescribe how to search for a desired element — and that’s it.

Example of a typed element

This is how the description of almost any dialog in our project appears. At the same time, STEPS have access to clickable buttons and lists of repeating objects, and can also read information from the entire dialog (how much gold we have, which slots are open or closed, and so on).

The Unity Element Loader is responsible for initializing the class fields: it receives a specific class and a driver, and creates a proxy element for each field in the class. Thanks to this, we can simply write “press the button”, although in fact the system first finds the button, receives information about it, and only then executes the “Press” command.
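One way such a loader could work is with Java's standard dynamic proxies. The sketch below is an assumption about the mechanism, not the framework's source: the `@FindBy` annotation here is a hypothetical one with a single `path` attribute, and `UnityDriver`, `Button`, `QuestDialog`, and `RecordingDriver` are invented for the example. It shows the lazy behavior described above, where nothing is looked up until an element is actually used.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Hypothetical annotation carrying the search path of an element.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface FindBy {
    String path();
}

interface UnityDriver {
    int find(String path);   // returns an id of the found game object
    void tap(int objectId);
}

interface Button {
    void press();
}

// A dialog description: fields only, filled in by the loader.
final class QuestDialog {
    @FindBy(path = "//QuestDialog/ClaimButton")
    Button claim;
}

final class UnityElementLoader {
    // Replaces every @FindBy Button field with a proxy that searches on use.
    static <T> T load(T page, UnityDriver driver) {
        for (Field field : page.getClass().getDeclaredFields()) {
            FindBy findBy = field.getAnnotation(FindBy.class);
            if (findBy == null || field.getType() != Button.class) {
                continue;
            }
            final String path = findBy.path();
            InvocationHandler handler = (proxy, method, args) -> {
                if (method.getDeclaringClass() == Object.class) {
                    switch (method.getName()) {
                        case "toString": return "Button(" + path + ")";
                        case "hashCode": return System.identityHashCode(proxy);
                        default: return proxy == args[0]; // equals
                    }
                }
                // The lookup happens only now, right before the action.
                driver.tap(driver.find(path));
                return null;
            };
            Object element = Proxy.newProxyInstance(
                Button.class.getClassLoader(), new Class<?>[] {Button.class}, handler);
            try {
                field.setAccessible(true);
                field.set(page, element);
            } catch (IllegalAccessException e) {
                throw new IllegalStateException("Cannot initialize " + field.getName(), e);
            }
        }
        return page;
    }
}

// Stand-in driver that records calls instead of talking to a game instance.
final class RecordingDriver implements UnityDriver {
    final List<String> calls = new ArrayList<>();

    @Override public int find(String path) {
        calls.add("find " + path);
        return 7;
    }

    @Override public void tap(int objectId) {
        calls.add("tap " + objectId);
    }
}

final class LoaderDemo {
    public static void main(String[] args) {
        RecordingDriver driver = new RecordingDriver();
        QuestDialog dialog = UnityElementLoader.load(new QuestDialog(), driver);
        System.out.println(driver.calls); // [] because nothing has been searched yet
        dialog.claim.press();
        System.out.println(driver.calls); // [find //QuestDialog/ClaimButton, tap 7]
    }
}
```

Deferring the search until the call means a dialog object can be created before the dialog is on screen, which is the same reason Selenium's page factories use proxies.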

Below you can see that we have STEPS for the quest dialog. And these steps are already described in terms of the game itself.

Test example

All tests look like this: we use only one injection of STEPS. Based on this, we describe everything we want to do using the terms of business logic — in the end, everything looks pretty neat.

Future plans

In the future, our primary plan is to write even more tests. All these efforts are aimed at a convenient, simple, and understandable extension of our code base; but the problem of running tests in parallel looms on the horizon.

At the moment, the tests are running in one thread for one game instance. Everything works well, but this takes a long time.

To deal with this, we could create multiple instances on our remote server. Or we could assemble a farm of devices and connect to them. But some of our features are global and could interfere with tests: for example, if the Farming Portal is open — then it’s open to everyone. Notifications might appear when opening or closing a portal in the interface of a parallel running test, and instead of a desired element, this notification might be tapped.

The next thing we would like to implement is back-to-back testing. This is when you take two versions of an application, run the same script, take screenshots at some point, then compare. So, you can check if something has gone wrong, if a feature has appeared ahead of time, and so on.
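The comparison step of such a check could look roughly like the Java sketch below. It is only one possible approach and not something we have implemented: the `ScreenshotDiff` class and the idea of a "fraction of differing pixels" metric are assumptions for illustration, and a real check would likely need tolerance for animations and rendering noise.

```java
import java.awt.image.BufferedImage;

final class ScreenshotDiff {
    // Returns the fraction of pixels whose color differs between two screenshots.
    // Screenshots of different sizes are treated as completely different.
    static double diffRatio(BufferedImage a, BufferedImage b) {
        if (a.getWidth() != b.getWidth() || a.getHeight() != b.getHeight()) {
            return 1.0;
        }
        long different = 0;
        for (int y = 0; y < a.getHeight(); y++) {
            for (int x = 0; x < a.getWidth(); x++) {
                if (a.getRGB(x, y) != b.getRGB(x, y)) {
                    different++;
                }
            }
        }
        return (double) different / ((long) a.getWidth() * a.getHeight());
    }
}

final class ScreenshotDiffDemo {
    public static void main(String[] args) {
        // Two synthetic 10x10 "screenshots" stand in for captures from two builds.
        BufferedImage oldBuild = new BufferedImage(10, 10, BufferedImage.TYPE_INT_RGB);
        BufferedImage newBuild = new BufferedImage(10, 10, BufferedImage.TYPE_INT_RGB);
        newBuild.setRGB(3, 4, 0xFF0000); // one pixel changed in the new build

        double ratio = ScreenshotDiff.diffRatio(oldBuild, newBuild);
        double threshold = 0.05; // arbitrary tolerance chosen for the example
        System.out.println(ratio <= threshold ? "screens match" : "screens differ");
    }
}
```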

At the moment, we’re expanding feature coverage, we have a smoke set with at least one test for any aspect of the game, and we have also begun training colleagues to write tests.

A test automation framework is a product. It should be simple, understandable and easily expandable. When designing it, you should remember the patterns and principles of software development, as well as those related to refactoring. Otherwise, getting rid of regression will become a maintenance headache.

MY.GAMES is a leading European publisher and developer with over one billion registered users worldwide, headquartered in Amsterdam.