Luko’s mobile E2E testing approach

Théo Judas
Luko


At Luko, in our early days, we shipped React-Native app features very fast, without allocating significant time to writing automated tests. As the mobile guild grew, it became more and more important to adopt an automated testing approach while still being able to ship fast.

Good tests ensure that an application continues to work as expected, now and in the future. They give us confidence when refactoring and implementing new features. They help us spot regressions that can happen when ten developers work across different tribes and squads.

This article focuses on our End-to-End (E2E) test strategy. E2E testing sits at the top of the testing pyramid and is certainly the most difficult to set up and maintain. The main goal of E2E tests is to catch interaction errors a user may run into and to verify that all units work well together. Catching typos or verifying isolated logic is left to the other kinds of tests.

Testing pyramid from Static to End-to-End tests

How to choose what functionality to cover with E2E tests?

End-to-End tests are expensive to develop, maintain, and run. Hence, covering everything with automated E2E tests is not very practical. A good approach is to focus on the critical flows.

What is a critical flow? A flow is a sequence of user interactions with the app. A flow is critical when those interactions are essential for the user to get the job done.

For example, in a mail app, the ability to send an email will be considered more critical than the ability to create email group tags. It does not mean creating group tags is unimportant; it means that one is core to the purpose of the app and the other is not.

Critical flows offer the best ratio of added value to implementation effort, and this is how they are prioritized.

Determining whether a flow is critical is the result of a discussion between the engineers and product managers. All critical flows are recorded in a prioritized test plan.

Some of the critical flows for Luko’s mobile app include:

  • Sign-in and sign-up.
  • Access to contract details.
  • Adding and removing a payment method.
  • Declaration of a claim and access to its details.

The choice of which flows to cover with E2E tests shapes all the subsequent phases of the testing approach, as well as the time to allocate to automating the test plan and monitoring progress.

Which E2E testing approach did we choose, and why?

More than one approach can be adopted when it comes to E2E testing. The chosen approach determines the context in which a test is run and therefore how the implementation will unfold to meet our needs.

To make a choice, it was necessary to study and compare the options.

Testing with dependencies

Every part of the system is tested, including the services. It is the closest approach to an actual user experience, device in hand. However, it involves many potentially unstable variables that can affect the test outcome. For instance, if the network is down, the test will fail. If the service has issues, the test will fail. If the mobile application has active A/B tests, you can get a different result on each run and the test will fail, and so on. We cannot rely on those tests to be rock-solid.

Why would anyone choose to implement E2E testing with dependencies?

It reduces the amount of time needed for manual testing. Consider running a suite of 100 tests, where 10 fail. We can then investigate just those 10 scenarios (each failure may be an actual issue or simply a network error). Moreover, it is a powerful tool to discover client-server regressions that server-side tests were not able to prevent.

Testing with mocks

As opposed to testing with dependencies, in a mocked E2E test we set up a controlled and consistent environment by replacing all endpoints with mock services and data. We control the services and the requests, and thus all the responses the app receives during the test.

A mock E2E test is predictable and stable. It will keep working because the environment will stay the same on each run. On the other hand, it will not help us identify any integration issues that may occur when deploying with an actual service on production.

In our team, we prioritize the highest confidence and the fastest setup time, which is why we chose the testing-with-dependencies approach for our E2E tests.

The testing-with-mocks approach is still kept in consideration for the other tests in the testing pyramid, especially unit tests.

Nonetheless, we still mock some libraries or API calls on specific occasions, such as feature flags, SMS validation, or hardware device setup. These mocks are necessary because our automated pipelines run on Android emulators and iOS simulators rather than on actual devices.
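For instance, here is a hypothetical SMS-validation mock that could be substituted for the real client in E2E builds; the module and function names are illustrative, not our actual code:

// smsValidation.e2e.ts - hypothetical mock swapped in for the real
// SMS client in E2E builds, so tests never depend on an actual SMS gateway.
export async function requestSmsCode(phoneNumber: string): Promise<void> {
  // No-op: pretend the SMS was sent successfully.
}

export async function verifySmsCode(code: string): Promise<boolean> {
  // Accept a single well-known test code.
  return code === '000000';
}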

Implementing E2E tests with React-Native

Detox

In early 2020, when we initiated the implementation of E2E tests, we benchmarked and compared two solutions: Appium and Detox.

We considered feedback and insights on both tools. Our main decisive factor ended up being synchronization.

Traditionally, one of the most difficult aspects of E2E testing is synchronizing the test scenario with the mobile app.

What do we mean by “synchronization”? For example, when a user clicks a button, there can be loading time, a screen transition, animations… Before executing the rest of the test, good synchronization makes sure all the actions triggered by the button click are done.

Manual synchronization with a sleep() command is bad practice: it is flaky, complicates the tests, behaves differently on different machines, and makes the tests slower.
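To illustrate the difference (the testID here is illustrative):

// Fragile: wait a fixed two seconds and hope the screen is ready by then.
await new Promise((resolve) => setTimeout(resolve, 2000));

// Robust: wait until the element is actually visible, up to a timeout.
await waitFor(element(by.id('home-screen')))
  .toBeVisible()
  .withTimeout(5000);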

On one hand, Detox offers automated management of the synchronization between the test and the mobile app. Even when synchronization fails, the logs are easy to exploit for troubleshooting.

On the other hand, Appium’s management of synchronization was not praised, with one article highlighting it as a pain point:

“Tests are flaky, we got different results on different machines, frequent failures in CI, for which the only solution was the addition of sleeps, which slowed the tests down.”

In addition, a couple more points made us lean toward Detox definitively. First, it was built with the React-Native architecture in mind from the start; second, Detox is the E2E testing solution used to test the react-native library itself (see external references).

We chose Detox over Appium.

The following operations are monitored by Detox:

  • Network requests — in-flight requests over the network.
  • Main thread (native) — pending native operations on the main thread (main dispatch queue and main NSOperationQueue).
  • Layout of UI — UI layout operations. There’s also special support for React Native layout which includes the Shadow Queue where yoga runs.
  • Timers — timers (explicit asynchronous delays). There’s special support for JavaScript timers like setTimeout and setInterval.
  • Animations — active animations and transitions. There’s special support for React Native animations with the Animated library.
  • React Native JavaScript thread — pending operations on the JavaScript thread in React-Native apps.
  • React Native bridge — the React Native bridge and asynchronous messages sent on it.

Writing tests with Detox

Detox tests are pretty straightforward to write.

A. Test if a component is visible

// HomeScreen.tsx
<View testID="home-screen" />

// e2e/home.e2e.ts
it('should display home screen', async () => {
  await expect(element(by.id('home-screen'))).toBeVisible();
});

B. Access an item with custom matchers

You can easily access a specific element on the screen, either by its displayed label, by its testID, or by adding hierarchy granularity with the ancestor and descendant utils. All of these can be combined to narrow down to the desired element.
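For example, matchers can be combined to target a button inside a specific parent (the testIDs here are illustrative):

// Match the "Delete" button that sits inside a given payment-method card.
const deleteButton = element(
  by.text('Delete').withAncestor(by.id('payment-method-card')),
);
await deleteButton.tap();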

Since our mobile app supports multiple languages, we created a custom matcher, byTextKey, to help us find an element through its translation key. This matcher also handles variables that are interpolated into the translations.

// English translation is "Take a photo of a damage on this wall"
element(byTextKey('claim.surface.surface', { surface: 'wall' }));
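Under the hood, such a matcher can be a thin wrapper around by.text. A minimal sketch, assuming an i18next-style t() helper that resolves keys and interpolates variables (our actual implementation may differ):

// e2e/utils/matchers.ts
import { by } from 'detox';
import { t } from './i18n'; // hypothetical helper exposing the app's translations

export const byTextKey = (key: string, variables?: Record<string, unknown>) =>
  by.text(t(key, variables));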

Why not use only the testID matcher? For components displaying text managed by the app, we prefer to target elements based on the text the user actually sees instead of a potentially nested testID. We find this behavior closer to the user experience.

C. Simulate user interaction

Once a component is correctly matched, we can simulate user interaction by performing actions.
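For example, filling in and submitting a sign-in form (the testIDs here are illustrative):

await element(by.id('email-input')).typeText('user@example.com');
await element(by.id('password-input')).typeText('secret-password');
await element(by.id('sign-in-button')).tap();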

By combining some of them, it is easy, for instance, to scroll inside a ScrollView until an element becomes visible.

await waitFor(element(by.text('Last child component')))
  .toBeVisible()
  .whileElement(by.id('scroll-view'))
  .scroll(200, 'down');

Integrating E2E tests with our CI

Running E2E tests locally is time-consuming. Furthermore, it is not the most stable, scalable, or efficient way to proceed on a day-to-day basis. Hence, to push automation further, we rely on Continuous Integration (CI).

Running the E2E test jobs

In the mobile team, we use Bitrise to run E2E tests on a virtual machine that runs both Android emulators and iOS simulators.

These tests are launched automatically every weekday morning. Manual launches are also possible; they happen mostly when a mobile engineer has made changes to one of the critical flows covered by the E2E tests.

E2E test jobs schedule on Bitrise

Analyzing the results and troubleshooting issues

Once the tests have finished running, a result summary is sent to a dedicated channel on Slack.

Morning report with failures on Immo E2E tests

All mobile engineers can have a daily update and be aware of any issues.

To prevent a temporary network error from failing a test, we retry each failed test once. If the test then succeeds, we consider it to pass; otherwise, it is flagged as failed. The report indicates when a successful test has been retried, which allows us to monitor possible instability on the mobile app side.
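Since Detox runs on top of Jest, one way to get this behavior is Jest's built-in retry mechanism; this is a sketch, not necessarily our exact setup:

// e2e/setup.ts
// Retry each failed test once before flagging it as failed
// (requires the jest-circus runner, which recent Detox versions use by default).
jest.retryTimes(1);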

Below is our CI workflow, for both the iOS and Android platforms, from building the app to publishing the test results, including the retry conditions.

Summary of the E2E CI workflow run every weekday morning

When a test fails, the artifacts of the Bitrise job give us access to visual records of each test step:

  • a screenshot of the app when the test started,
  • a screenshot when the test failed,
  • a screenshot when the test ended,
  • a screen recording from the start to the end.

With screenshots, screen recording and logs, it is then quick to troubleshoot the origin of a failure and work to fix it.
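Detox's artifacts configuration is what produces these records; a sketch along these lines (paths and options are illustrative, not our exact config):

// .detoxrc.js
module.exports = {
  artifacts: {
    rootDir: './e2e/artifacts',
    plugins: {
      screenshot: 'failing', // keep screenshots only for failed tests
      video: 'failing', // keep screen recordings only for failed tests
      log: 'failing',
    },
  },
  // ... app and device configurations omitted
};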

Screenshot at the start of a test / Screenshot at the failure, where a hard crash is easy to spot

The keyboard pain point

At first, our CI integration did not provide screenshots or screen recordings, and we only had access to application logs. Hence, when an E2E test failed on Bitrise while passing locally, we had to rebuild the job with Remote Access. This gave us a live desktop rendering of the virtual machine, letting us watch the tests run on the iOS simulator or Android emulator and understand the differences with local runs. This way of troubleshooting required patience and vigilance: the build took approximately 20 minutes, and if we missed the error event, we had to run it again.

That is when we spotted the main difference between local and virtual tests: locally, when completing a form on an iOS simulator, the software keyboard did not appear on screen, unlike on a real iPhone.

Keyboard appearance locally VS Keyboard appearance on Bitrise virtual machine

We changed the settings locally to display the fully open software keyboard on our iOS simulators (by disabling the simulator’s “Connect Hardware Keyboard” option). This gave us a configuration closer to a real device and allowed us to correctly handle the interaction of our screens with an open keyboard.

This experience demonstrated the importance of having E2E tests run on iOS simulators and Android emulators whose interactions are as close as possible to those of real devices. Locally, we had been skipping keyboard management in our E2E tests, and Bitrise made us aware of it.

Since then, our E2E tests running on the CI have become more stable and reliable.

Conclusion

Adopting the testing-with-dependencies approach, choosing Detox, and configuring our CI allowed us to make significant progress on our E2E testing initiative in a timely manner. As of early 2022, we cover 73% of the critical flows.

With the improvements made, the increased stability, and the experience gained, the team is now comfortable adding and maintaining E2E tests, and we are satisfied with our approach.

Currently, we are exploring Behavior-Driven Development (BDD), which emerged from Test-Driven Development, and Gherkin, to write tests that match users’ real use of the app as closely as possible.

External references
