Changing how we work with QA in Pet Rescue Saga
By Johan Hoberg — Senior Quality Engineer at King
Over the past three years, we in the Pet Rescue Saga team here at King have changed how we work with QA to become more efficient and deliver a higher-quality product.
I will single out a few factors that I believe are the most significant, and describe how things were, what they changed into, and what impact these changes had on the game team:
At King we work with Quality Assistance instead of Quality Assurance, and in the Pet Rescue Saga project we continuously evaluate if we are staying true to this and analyze how we can improve in the right direction.
For more information read our article on quality assistance and what it stands for at King.
Three years ago the Pet Rescue Saga game team did not have any dedicated test professionals (QA analysts). Now we have one embedded in the game team, plus an extra remote QA analyst who can get involved when the workload is higher.
Having an embedded QA analyst has been critical for improving quality, as this person can be involved throughout the development life cycle, cooperate closely with the teams, and take part in the continuous development of the QA process.
Our remote QA analyst, who is located with the Central QA (offsite testers) regression test teams, has a broader view of different games, access to more and different mobile devices, and works in a different country with different environmental factors. This has added a lot of value for us. Having an expert in our game close by to help the Central QA regression test teams has also facilitated communication immensely.
We also continuously improve how the QA analysts work, trying to focus on being proactive instead of reactive. This means getting QA analysts involved early in the development cycle, not just testing when a feature is done.
We used to believe that we had to run all our tests before every release ‘just to be sure.’ What we saw was that we were finding all the critical problems, but we were running a lot of unnecessary tests in the process.
In hindsight this seems extremely inefficient, but it is an easy trap to fall into just to feel that you are in control of a complex situation.
The key to an efficient QA process is understanding and mitigating risk. You then spend your testing resources where they are most needed — where changes have been made, risks have been introduced, and where complexity is high.
These days we always look at all the changes that have been made for a release, analyze the risks, and run the appropriate tests to mitigate those risks.
Some features and areas are so important that a bug in those areas would be catastrophic, for example, purchases and Facebook Connect. Those areas we always test to some degree even though no change has been made. We don’t, however, run every possible test we could think of — we run enough tests so that we are confident that it works.
A key concept here is understanding test coverage. The number of possible tests for a complex feature is effectively infinite. Just because you have a hundred tests doesn’t mean that those tests give you full test coverage. You can basically never have full test coverage — but you can have enough to mitigate the risks. Never allow yourself to believe that the scripted tests you created cover every possible risk, and that if you just run them you are safe. That is a trap, and it is very inefficient. Always aim for enough testing to mitigate risks.
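To make the risk reasoning concrete, here is a minimal sketch of risk-based test selection. The area names, weights, and scoring rule are purely illustrative assumptions for the example, not the actual model our team uses:

```python
# Hypothetical sketch of risk-based test selection. The weights and area
# attributes below are invented for illustration.

def risk_score(area):
    """Score an area by how risky it would be to leave it untested this release."""
    score = 0
    if area["changed"]:          # code in this area changed this release
        score += 3
    score += area["complexity"]  # 1 (simple) .. 3 (complex)
    if area["critical"]:         # e.g. purchases, Facebook Connect
        score += 2               # always worth some testing, changed or not
    return score

def select_tests(areas, budget):
    """Spend a limited test budget on the riskiest areas first."""
    ranked = sorted(areas, key=risk_score, reverse=True)
    return [a["name"] for a in ranked[:budget]]

areas = [
    {"name": "purchases",  "changed": False, "complexity": 2, "critical": True},
    {"name": "new_levels", "changed": True,  "complexity": 3, "critical": False},
    {"name": "settings",   "changed": False, "complexity": 1, "critical": False},
]
print(select_tests(areas, budget=2))  # -> ['new_levels', 'purchases']
```

Note that the critical areas still earn a score even with no changes, which mirrors the point above: purchases always get some testing, just not every possible test.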
Scripted and Exploratory Testing
When we started our journey towards better quality, all the tests we ran were scripted tests, created as an afterthought when the feature was done, and executed as internal regression tests before each release. We also had Central QA regression test teams run scripted tests covering different game features.
We felt that this was not the best way to utilize the high level of test competence our embedded and remote QA analysts possessed, and no one in the game team enjoyed running the scripted regression test suites. We also executed many test cases without actually using the information stored in them, because we already knew what to do.
Our first change was that the QA analyst stopped running scripted tests and focused completely on exploratory testing throughout the release and development cycle. We would have simple checklists of important features and scenarios, and the QA analyst would quickly go through them before a release to make sure that everything on them was covered. This saved us a lot of time because we no longer had to create test cases for new features — time that we could instead spend doing actual testing.
A second change was that we no longer provided Central QA with scripted game specific tests, and instead let them focus on their core strengths, which I will go into more later.
The final change was that we stopped running scripted testing during our internal regression tests, which I will cover more extensively below.
In short, we no longer run any scripted tests in our game teams, and no one is running game-specific scripted tests for Pet Rescue Saga. This has enabled us to do a better job of leveraging the high test competence of our QA analysts, to stop creating waste in the form of scripted test cases that no one needs, and to make testing a more enjoyable and interesting activity.
When you create an artifact such as a test case, always ask yourself what value that artifact will provide. In our case it was close to zero, and so we just stopped creating those artifacts.
Internal Regression Testing
We had created a set of scripted test cases that the game teams ran for each release. There was no enthusiasm or ownership from anyone; people just ran the tests as quickly as possible to be done with it. Usually three or four developers or artists took care of all the tests, and everyone tried their best to avoid having to test.
We were not risk-based, and it was not fun.
What we did to address these problems was switch to exploratory testing. Everyone in the game teams participates for one hour, receiving a test mission to focus on. We also provide everyone with a risk list at the beginning of the session so that extra focus can be given to risk areas. People sit together, testing to the best of their abilities, within the boundaries of the test mission they have been given. More ownership, more risk focus, and more fun.
For a more in-depth analysis of the exploratory testing approach, see my previous article about this.
Central QA Regression Testing
Previously we used the Central QA regression tests as a safety net, trying to cover both game-specific tests and cross-game tests (features that span multiple games) for each release. What we saw was that on the rare occasions when they found major game-specific bugs, it was too late to fix them anyway.
We needed to rethink how we used Central QA resources. What could they bring to the table, that we could not just as easily solve ourselves?
Devices — they had a larger device lab than we did, so a basic smoke test on different devices added a lot of value to us. This was something we could not do ourselves.
Cross-game features — they had a better understanding of cross-game features than we did, and they also saw how these features worked, and did not work, across a broad spectrum of games. This was something they were better equipped to handle than we were.
So we refocused how we used Central QA resources, and removed all our own scripted tests from the test suites.
Our goal is to reduce our reliance on Central QA resources as much as possible, and only use them for activities that leverage their unique competence and environment.
We have tried to move to a more Agile way of setting up teams, by creating cross-functional teams that can own and solve problems, and adopting other practices described in the Scrum framework. It is important to understand that good quality cannot be added at the testing stage — it needs to be built in from the start. And a good way to enable this is to adopt Agile practices, such as Scrum.
As our Agile Coach always said: “Read the Scrum Guide!”
A key to good quality games is that everyone feels ownership of the game. Your responsibility does not stop because you committed something to the master branch. Everyone is responsible for the quality of the game together. Shared responsibility does not mean no responsibility.
Everyone needs to play the game, everyone needs to test the game, and everyone needs to highlight any flaws or bugs they find in the game. Everyone needs to care about the game.
We have worked extensively in Pet Rescue Saga on making the entire team feel ownership of the game. One of the first things we did in this regard was to set the expectation that everyone tests whatever they have done before committing it to the main branch. Another important factor for us is, of course, adopting the Scrum framework, with all its benefits.
In the past developers and artists often made a change and committed it, and expected QA analysts or ‘someone’ to test it. Having a QA analyst doing basic testing for developers or artists is wrong in so many ways; not only does it erode ownership completely, it also diminishes the competence and value of QA professionals.
The key here is that everyone must test to the extent of his or her abilities. If you have developed a feature, you test as much as you can yourself. If you feel that there is still a risk, you call in a QA analyst to support you.
QA analysts can support developers with test ideas, and these can be developed early during feature development, not only helping developers with what they need to test but also helping them design the code with those test ideas in mind.
But if you have developed a complex feature, which presents a complex testing problem, and you have tested to the extent of your abilities, you should always feel comfortable calling on a QA analyst for help.
It can also be good to provide developers with some general basic test ideas, such as described here.
I have gone into more detail on the subject of who tests what in an article I have written on Gamasutra.
Test automation has been completely driven by developers in Pet Rescue Saga, and this has been a key component in allowing us to transform how we work with QA.
Björn Gustavsson, who has driven this development, has already written an article about test automation in Pet Rescue Saga.
One small but important change was that we moved from only creating new builds on request, or for releases, to building every night, and running our automated tests on each build. This has had a number of positive side effects.
It facilitates the QA analysts’ job: you don’t have to request a build; you can just start running exploratory tests on the latest build each morning, and thanks to the automated tests and a change log, you have a better understanding of the remaining risks and can focus on those.
But it also makes it much easier to find when a bug has been introduced, and also which commit introduced it.
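With one build per night and automated test results for each build, locating the build (and thus the small set of commits) that introduced a bug becomes a binary search. A sketch of the idea, where `test_passes` is a hypothetical stand-in for running the automated suite against a given nightly build:

```python
def first_bad_build(builds, test_passes):
    """Binary-search nightly builds for the first one where the tests fail.

    `builds` is ordered oldest to newest; assumes the tests passed on the
    oldest build, fail on the newest, and the bug was introduced exactly once.
    """
    lo, hi = 0, len(builds) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if test_passes(builds[mid]):
            lo = mid + 1   # the bug was introduced after this build
        else:
            hi = mid       # this build already contains the bug
    return builds[lo]

# Toy example: the bug appeared in the build from 2017-10-12.
nightly = ["2017-10-09", "2017-10-10", "2017-10-11", "2017-10-12", "2017-10-13"]
passes = lambda build: build < "2017-10-12"   # stand-in for the automated test run
print(first_bad_build(nightly, passes))  # -> 2017-10-12
```

Over individual commits rather than nightly builds, this is essentially what `git bisect run` automates.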
‘Continuous’ Integration (CI)
While we are not doing CI in its true sense, we understand the importance of not having a feature on a separate branch for a long time, and have changed how we work in this direction. We want to avoid big-bang integrations. We still develop bug fixes and features on separate branches, but try to merge them to master as soon as possible (after an appropriate amount of testing). A feature doesn’t have to be ‘done’ before we merge it to master. Many features that are not done are merged to master, but are then turned off until they are ready for release.
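The pattern of merging unfinished features to master but keeping them turned off is a feature toggle. A minimal illustrative sketch; the flag names and the hard-coded flag table are assumptions for the example (a real game would typically read flags from configuration):

```python
# Illustrative feature-toggle sketch; flag names are invented for the example.
FEATURE_FLAGS = {
    "new_booster": False,   # merged to master, but not ready for release
    "daily_reward": True,   # finished and switched on
}

def feature_enabled(name):
    """Unknown flags default to off, so dormant code stays dormant."""
    return FEATURE_FLAGS.get(name, False)

def show_main_menu():
    items = ["play", "settings"]
    if feature_enabled("daily_reward"):
        items.append("daily_reward")
    if feature_enabled("new_booster"):   # code lives on master, but is inert
        items.append("new_booster")
    return items

print(show_main_menu())  # -> ['play', 'settings', 'daily_reward']
```

The point of the toggle is that the integration cost is paid early and continuously, while the release decision stays a one-line switch.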
King has a crash analytics tool that allows us to keep track of all crashes in the game, which not only lets us focus on fixing the most important crashes, but also gives us risk information about unstable areas that helps guide our test effort. We have always used it, but in 2015 we started to take a more structured approach to how we used it and what actions we took when a crash was discovered.
- Continuously monitor both live and QA versions of the game
- Analyze each crash that occurs above a certain frequency threshold
- Create stories for all those crashes in the backlog and prioritize them
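The steps above amount to a simple filter-and-rank over crash statistics. A hypothetical sketch, with an invented threshold and invented crash signatures:

```python
# Illustrative crash-triage sketch; the threshold and data shapes are
# assumptions for the example, not King's actual analytics tooling.
CRASH_THRESHOLD = 100   # minimum occurrences before a crash gets analyzed

def crashes_to_triage(crash_counts):
    """Return crash signatures above the frequency threshold, worst first,
    ready to be turned into prioritized backlog stories."""
    frequent = [(sig, n) for sig, n in crash_counts.items() if n >= CRASH_THRESHOLD]
    return sorted(frequent, key=lambda item: item[1], reverse=True)

counts = {
    "NullPointer@LevelLoader": 540,
    "OOM@TextureCache": 130,
    "Timeout@AdProvider": 12,   # below threshold: monitored, not triaged yet
}
print(crashes_to_triage(counts))
# -> [('NullPointer@LevelLoader', 540), ('OOM@TextureCache', 130)]
```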
With this approach we reduced the number of crashes in our game, and this in turn makes it easier to discover, analyze, and have time to fix new crashes. A positive spiral.
A few years ago we had only sporadic cooperation with customer support, but we have actively worked on improving this relationship so that we can always stay on top of the things that matter to our players. They are now an integral part of our development process, helping us focus on the right things that really bring value to our players.
Platform Integration Tests
Previously, when we received new platform releases, we often just integrated them directly into the master branch, with all the problems that entails. We now know better: we create an integration branch, test it properly before merging the new platform release into the master branch, and always do this early in the release cycle so we have time to handle unexpected problems.
King Bug Policy
Previously when a bug was reported in JIRA we either fixed it or just left it there, and over time the ‘left it’ pile started to grow and grow. There was always a good intention behind it: “This is something we should fix at some point.”
But that point usually never came. So when we had several hundred bugs that had been left unattended for more than half a year, we decided to start working differently. Our goal now is to always make a decision on each bug: is this something we are going to fix within a reasonable time frame, or not? If not, we close the bug. It is always easier to not make a decision and just leave it there, but a cluttered JIRA project with hundreds of bugs that will never be fixed brings a lot of problems with it. Most importantly, it says something about your culture: that bugs are not worth your attention. And they should be, if you care about customer value.
For a long time we relied on internal play testing of all new levels we released for our game, but we saw a lot of flaws in this way of working. The most glaring one was that we, as developers of the game, are not really good representatives of our typical players. Another problem was finding the time to do it.
Now we work with a crowdsourcing company and send all our levels out for external playtesting, which has not only freed up a lot of time but also given us testers who represent our players much better.
All of the above needs to be taken into consideration when designing your release process. We are continuously trying to improve and fine-tune our release process to leverage all the improvements we have made to our work process. One of the keys is to always take complexity and risk into consideration when selecting a release candidate. Don’t pick an untested release candidate one or two days before a release and expect everything to go smoothly. The less testing you have done, and the more changes you have made, the more uncertainty there is. Account for that uncertainty.
These are the big changes we have made to improve the quality of our game, and we have had very good results, but you can never stop learning and trying out new ideas. We have to continuously revisit how we work and what we do to manage the ever-growing complexity that is mobile gaming.
Originally published at techblog.king.com on October 18, 2017.