Custom tooling for the test stack management

The evolution of apps QA at Azimo

Published in AzimoLabs · 4 min read · Jul 8, 2021

This is the third in a series of blog posts in which we outline our several years of experience with testing our Android app at Azimo. Most of the principles, goals, and achievements also apply to our iOS app.

Table of contents

The evolution of apps QA, first days, and unit testing
QA engineers, functional and UI testing
(This post) — Custom tooling for testing stack management
Removing and subtracting tests is part of development too
QA testing in the cloud

After around three years of development, we had significantly improved our testing stack and the app’s quality. We went from 0 tests to about 60% unit test coverage, from manual QA testing to hundreds of functional and UI tests run automatically, and from one release every two months to a new app version published in the store every two weeks. The app’s stability went from countless crashes to 99% crash-free users.
We achieved all of that with a relatively small team of 3–4 software engineers and 1 QA engineer on each platform (Android and iOS). Changes were introduced in small steps, quarter by quarter, thanks to deliberate goal management.
We never froze product development for more than 1–2 weeks.

While the results of our work were quite impressive, we also started facing new challenges. With hundreds of functional and UI tests, the tools we used (Fastlane with our custom plugin, Spoon, and others) were no longer well suited to our needs. Here are some of the problems we faced:

Our test suite was overgrown, to say the least. Because of that, it wasn’t possible to run all the tests at once: a single run would last more than 5 hours. And because the tests were very flaky, it was nearly impossible to get all of them to pass in one run. The test sharding solutions provided by Android didn’t work well for our needs (e.g., they didn’t balance the number of tests per shard properly).
Our testing stack was hard to debug: the test suite results, the test runner logic, ADB communication, and AVD management. We even defined some specific requirements for improving these. But because we lacked the in-house expertise (e.g., we had no Ruby engineers), in many cases we could only rely on the community’s help and on responses to the issues we reported.

Automation Tests Supervisor

With these problems in mind, we decided to build our own internal tool for test cycle management. AutomationTestSupervisor, which we developed from scratch in Python, gave us full control over:

Logging at many stages of the test run: from building and deploying the app, through running tests and analyzing results, to ADB communication.
Splitting tests into packages, sharding, and re-running failing tests.
AVD and device management, including setting up emulators with a simple config file.
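
To make the sharding point more concrete, here is a minimal sketch of duration-balanced sharding. It is not taken from AutomationTestSupervisor; the function name `balance_shards`, the test names, and the durations are all hypothetical. It assumes that an estimated runtime is known for every test (for example, from a previous run) and uses a simple greedy "longest test first" heuristic.

```python
import heapq


def balance_shards(durations, shard_count):
    """Split tests into shards with roughly equal total estimated runtime.

    durations: dict mapping a test name to its estimated runtime in seconds
    (hypothetical input, e.g. collected from an earlier run).
    Greedy heuristic: take tests from longest to shortest and always put
    the next one on the shard that is currently the lightest.
    """
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")

    # Heap entries are (total_seconds, shard_index); lightest shard on top.
    heap = [(0.0, index) for index in range(shard_count)]
    shards = [[] for _ in range(shard_count)]

    # Sort by duration (descending), then by name to keep the result stable.
    ordered = sorted(durations.items(), key=lambda item: (-item[1], item[0]))
    for name, seconds in ordered:
        total, index = heapq.heappop(heap)
        shards[index].append(name)
        heapq.heappush(heap, (total + seconds, index))
    return shards


# Hypothetical test classes and runtimes, for illustration only.
example_durations = {
    "LoginTest": 300,
    "TransferTest": 240,
    "SignupTest": 200,
    "ProfileTest": 60,
    "SettingsTest": 40,
}
example_shards = balance_shards(example_durations, shard_count=2)
```

With these example numbers, the two shards end up with 400 and 440 seconds of estimated work, whereas splitting by test count alone could put the three longest tests on the same shard.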

We wrote a separate blog post about AutomationTestSupervisor in which we cover our challenges and motivations in more detail. You can read it here: Story behind AutomationTestSupervisor — our custom made tool for Android automation tests.

Here is what we achieved thanks to building our custom tool:

We could run tests in parallel on many devices and emulators connected to a single computer. On a MacBook Pro with 32 GB of RAM, we were able to run 5–6 emulators simultaneously.
We reduced testing time by ~50% (from more than 5 hours to 2–3 hours for the full test suite).
It was much easier to debug our test runs thanks to logs and customized dashboards with information about flakiness, testing time, and shards (see the screenshot below).
We removed Ruby as a dependency of our tech stack. Python was already part of it anyway, as the scripting language for our other automation solutions.
Getting rid of Ruby and migrating to AutomationTestSupervisor was also a part of our goal of departing from Fastlane completely (which was replaced by Jenkins pipelines).
Thanks to the shorter testing time, we could move to a 1-week release train. Faster testing, smaller chunks of changes deployed to production, and speedier experimentation and feedback loops were also game-changing improvements in product development.
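
The combination of parallel devices, re-running failing tests, and flakiness reporting can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual AutomationTestSupervisor code: `run_on_devices`, `flaky_packages`, and the `run_package` callable are hypothetical names, and `run_package` stands in for whatever really starts a test package on a given device and reports whether it passed.

```python
import queue
import threading


def run_on_devices(devices, packages, run_package, max_attempts=2):
    """Run test packages on several devices in parallel, retrying failures.

    run_package(device, package) -> bool is a hypothetical callable supplied
    by the caller; it should run one test package on one device and return
    True when the package passed.
    Returns a dict mapping each package to the attempt number on which it
    passed, or to None if it failed on every attempt.
    """
    pending = queue.Queue()
    for package in packages:
        pending.put((package, 1))

    results = {}
    results_lock = threading.Lock()

    def worker(device):
        # Each device keeps pulling packages until the queue is empty.
        while True:
            try:
                package, attempt = pending.get_nowait()
            except queue.Empty:
                return
            if run_package(device, package):
                with results_lock:
                    results[package] = attempt
            elif attempt < max_attempts:
                # Put the failed package back so it is tried again.
                pending.put((package, attempt + 1))
            else:
                with results_lock:
                    results[package] = None

    threads = [threading.Thread(target=worker, args=(device,)) for device in devices]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


def flaky_packages(results):
    """Packages that passed, but only after at least one retry."""
    return sorted(name for name, attempt in results.items() if attempt and attempt > 1)
```

A package that passes only on a retry is a natural candidate for a flakiness dashboard, while a package that maps to `None` failed on every attempt and needs a closer look.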

Building a custom tool from scratch isn’t easy, especially if your engineering team is small. You take full responsibility for maintenance and development, and you get very little support from the outside world. That was one of the lessons we learned over time, but back then the benefits clearly outweighed the costs.
The 1-week release train was a game-changer for us in terms of product development. In the next blog post, we will tell you more about that. Stay tuned!

Written by Mirek Stanek