Testing a Flutter app: tools, pros, and cons

Surf · Published in Surf
11 min read · Oct 10, 2022

Hi! I’m Mary, and I’m a QA expert at Surf. Our company has been making native apps since 2011, and ever since 2018 we’ve also been building Flutter apps.

In this article, we’ll be comparing testing options available for native and cross-platform apps. I’ll share my experience working with Flutter and tell you what tools we use to test apps at Surf, what makes Flutter practical, and what pitfalls we’ve run into.

Testing options in Flutter are on par with the native ones

When your business changes its approach to development or introduces a new technology, you should try to minimize its effect on your testing options. Ultimately, QA experts should be able to keep using their familiar stack of trusted tools and technologies when handling a new language or framework.

When we test native apps at Surf, we use automation tests and also intercept and mock network traffic. You can’t do without automation tests today, especially in regression testing, and without a proxy, case coverage and the variety of app states you can test are lower.

We needed the familiar options to stay when testing Flutter apps.

Automation tests

To autotest native apps, Surf uses Calabash and Ruby. When Flutter first appeared, right off the bat we wanted to know if we could avoid using Calabash while having a full-fledged autotesting experience that we were all used to — or even cooler.

Turns out we not only could do it but were able to do so even without third-party services: Flutter makes integration and widget tests in the console available out of the box.

Automation tests in Flutter are both cross-platform and native: you write tests inside the app project, and they work on both platforms. And since you’re looking at the entire project, you can add missing ids or even catch and fix bugs along the way: another chance to improve your app.

Flutter also supports Behavior Driven Development (BDD), which we adopt in UI tests. Our language of choice is Gherkin: it supports feature files and scenarios written in English and Russian. It’s clear and readable, it lets you implement scenario steps with or without extra arguments, and it lets you customize autotest runs: for example, running specific scenarios by tags instead of running every available test in full.
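For illustration, here is what a tagged scenario might look like in a feature file (the @smoke and @auth tag names are hypothetical; any tags will do):

```gherkin
@smoke @auth
Scenario: Correct authorization in an app at the first try (short code)
  When I run the app
  And I see the short code input screen
  Then I enter the correct short code
  And I see the home screen
```

A runner can then be told to execute only scenarios matching a tag expression such as @smoke, leaving the rest of the suite untouched.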

To test Flutter apps with Gherkin, we use the open-source framework flutter_gherkin.

Once we realized that automation tests are available in Flutter, we needed to figure out how Calabash differs from Dart + Gherkin and which approach is better. Let’s compare them.

1. Feature files are absolutely identical in both cases.

For example, a PIN code authorization scenario will be interpreted correctly both in Dart and in Ruby with Calabash:

Scenario: Correct authorization in an app at the first try (short code)
  When I run the app
  And I see the short code input screen
  Then I enter the correct short code
  And I see the home screen

Both technologies support English and other languages.

2. Steps differ in implementation.

Dart + flutter_gherkin

class TapAnErrorButtonOnPinCodeScreen extends ThenWithWorld<FlutterWorld> {
  @override
  Future<void> executeStep() async {
    final elemFromReportAnErrorScreen = find.byValueKey('reportAnErrorButton');
    await FlutterDriverUtils.tap(world.driver, elemFromReportAnErrorScreen);
  }

  @override
  RegExp get pattern => RegExp(r"I press Report an error on the PIN code screen");
}

Calabash

When(/^I press Report an error on the PIN code screen$/) do
  wait_element("* id:'reportAnErrorButton'")
  tap_on("* id:'reportAnErrorButton'")
end

It’s hard to say which is more convenient: within each technology the structure stays consistent, and that’s what matters.

3. To configure and run tests, Flutter needs an extra .dart file. With Calabash, there’s no such single configuration file.

We can’t really call this a drawback of either Flutter or Calabash — it’s just a nuance of using a particular tool or technology.
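As a rough sketch, that extra configuration file might look like this (assuming the flutter_driver-based flutter_gherkin API that matches the step class above; the feature path, tag, and target app path are illustrative):

```dart
import 'package:flutter_gherkin/flutter_gherkin.dart';
import 'package:gherkin/gherkin.dart';
import 'package:glob/glob.dart';

Future<void> main() {
  final config = FlutterTestConfiguration()
    // Where the .feature files live (path is an example).
    ..features = [Glob(r'test_driver/features/**.feature')]
    ..reporters = [ProgressReporter()]
    // Register step implementations such as the one shown above.
    ..stepDefinitions = [TapAnErrorButtonOnPinCodeScreen()]
    // Run only scenarios tagged @smoke instead of the whole suite.
    ..tagExpression = '@smoke'
    ..restartAppBetweenScenarios = true
    ..targetAppPath = 'test_driver/app.dart';
  return GherkinRunner().execute(config);
}
```

One file collects features, step definitions, reporters, and run filters in one place, which is exactly the “single configuration file” Calabash lacks.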

4. To make handling elements inside an app easier, each element needs its own id. With automation tests in Calabash, you need to make sure in advance that the test builds have ids on both platforms. In Dart, you can add an id as you write autotests, without rebuilding the iOS and Android app files: that’s handy and saves time.

Our verdict: automation tests in Dart are just as good as those in Calabash.

Proxy

To expand the case coverage of an app, Surf turns to specific software to intercept and mock traffic — e.g., Charles. Analyzing client-server interactions helps:

  1. Find out whether there’s an actual backend interaction.
  2. Identify whose side an issue is on: the client’s or the server’s.
  3. Run tests on ready-made test data faster and with less effort, without involving server developers.
  4. Analyze the behavior of a mobile app under various networking conditions: request failures, delays, large payloads. Charles helps identify improperly generated client-server requests and avoid request loops that overheat the device or drain the battery.

Dart has its own networking client. Since all requests pass through it, you need to enter the necessary proxy settings in the app itself. To make life easier for QAs, all such settings are gathered on a separate screen: at Surf, we use a Debug Screen we’ve developed ourselves.

Debug Screen is an extra settings screen that aids testing and is only available in debug builds. On it, you can choose the server you need, enable intercepting and saving HTTP requests in the app, view the device’s FCM token, and much more: there’s plenty of testing options available.

Debug Screen is custom: developers add extra elements when testers request them — for example, fields for configuring a proxy in the app. As a result, we have access to all of Charles’s options: we can enable the proxy server from the Debug Screen without restarting the app.
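As a minimal sketch of what such proxy plumbing can look like under the hood (assuming plain dart:io rather than Surf’s actual implementation; the host and port are placeholders for wherever Charles is running):

```dart
import 'dart:io';

// Build an HTTP client that routes traffic through an interception proxy.
HttpClient buildDebugClient(String proxyHost, int proxyPort) {
  return HttpClient()
    // Send every request through the proxy; Charles listens on 8888 by default.
    ..findProxy = (uri) => 'PROXY $proxyHost:$proxyPort;'
    // Debug builds only: trust the proxy's self-signed certificate
    // so HTTPS traffic can be decrypted and inspected in Charles.
    ..badCertificateCallback = (cert, host, port) => true;
}
```

A Debug Screen then only needs to expose the host and port fields and rebuild the client, which is why the proxy can be toggled without restarting the app.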

As you can see, testing options in a Flutter app aren’t limited. Everything we’re used to having in a native app is there, and it’s easy to work with. That’s good news.

Cons: framework bugs, gaps in third-party libraries, expected native behavior

The setbacks we run into when we test Flutter apps occur in native apps as well. I can’t say that they’re inherent to Flutter: addressing an issue in any technology isn’t always clear and easy.

Let’s see what to look out for when testing a Flutter app. Forewarned is forearmed.

Bugs in Flutter framework

While testing, we ran into an issue that concerns presentation and styling of fonts in iOS: spacing on iOS was significantly wider than on Android. That led to a large number of visual bugs.

Turns out, the framework itself was to blame. When our mobile developers asked the Flutter community for help with this pesky bug, the framework was soon updated and text rendering on iOS was fixed.

Such cases will most likely come up again. But they’re hardly more than a hiccup: the Flutter community is quick to respond to issues and willing to support and improve the framework.

Gaps in third-party libraries

In iOS 10 and 11, we came across some faults in the implementation of third-party libraries. For instance, we fixed a bug where the permission request for notifications popped up as soon as the app opened, rather than after the user tapped a button, as the technical specs and design required.

Such things can come up in both cross-platform and native development. You can either fix them within your project or together with the authors of these libraries.

Handling expected native behavior

When you’ve used and tested native iOS and Android apps for a long time, it’s easy to predict what users will expect from this or that app behavior. For example, going back to the previous screen with a back swipe is a standard gesture on iOS; on Android, it isn’t expected.

System dialogs differ between the two platforms: on iOS you need to request permission to send notifications, while on Android this permission is granted by default.

It’s these OS-specific details that we often have to polish manually. Or sometimes yeet entirely, if behavior expected on iOS has made its way to Android, just as happened with the back swipe.

In native apps, issues like broken screen refresh, an app minimized incorrectly, or behavior atypical for a particular OS are pretty rare: tools and technologies designed for a specific platform are supposed to cover all the versions and capabilities of that system automatically.

While testing a Flutter app, we came across an interesting case: screen refresh didn’t work on iOS devices with a notch (iPhone X and newer). Meanwhile, iOS devices without a notch and Android devices worked as expected.

Another bug we ran into on Android 6: once minimized, the app was evicted from memory.

Such bugs were fixed by our developers within the projects.

Native iOS developers know their devices and systems inside out: what features are being added to each new OS version, which of them will no longer work in previous versions, and what to watch for when Swift is updated. Android developers understand that they need to target a wide range of devices with very different screen sizes, and know their specifics just as well.

We’d like cross-platform development to match all the implementation details of native development. Sure, Flutter has some small gaps, but that’s no big deal: you just need to find your way around them.

Pros: single code base, one team of developers

Single code base saves time on testing

Bugs can result from unclear technical specs, states missing from the design, or breaking changes in the backend. Reporting such bugs is easier: you create half as many tasks, which already saves time.

You can look for bugs and test features on one platform: since there’s a single implementation, issues are extremely likely to occur on both platforms.

The logic of new features is also equivalent on both platforms, since there’s a single code base: testing complex in-app processes boils down to testing them on one platform and confirming them on the other. We run a full cycle of activities on one platform: exploratory testing, feature testing, smoke/sanity/full tests, and feedback analysis. Then all we have to do is confirm the quality with exploratory tests on the other platform. This approach cuts logic-testing time by a factor of about 1.3.

Example

Say the analysts ran into a situation while testing events: according to the technical specs, an event must be sent to the corresponding analytics system, but it can’t be tracked. This gap in logic means the event won’t be sent on either platform.

Having fixed the bug and verified in various scenarios that the app gathers analytics correctly on one platform (e.g., iOS), we can reasonably assume we won’t see this bug (an extra event tracked, or a specific event missing from the analytics) on the other platform (Android).

If you need to hand in builds for both platforms on the same day, you may not have enough time to test them in native development: if they come out simultaneously and there’s only one QA engineer on the project, the test cycle has to be run on each platform separately. Testing cross-platform apps saves time: regression tests for both platforms run within a single cycle.

We’ve tried giving a rough estimate of the testing process on two similar projects — one on native Android and iOS, and the other one on Flutter — and compared them feature by feature.

The apps were analyzed and tested on one iOS and one Android device. As practice showed, Flutter actually gives you a head start, even though you’re not twice as fast. That’s logical: you can’t skip testing on the other platform completely. Whichever way you put it, the platforms have different specifics and different ways of targeting their users.

For an out-of-the-box feature that doesn’t depend on OS specifics and isn’t customized per platform, testing a Flutter app on two platforms is about 1.3–1.5 times faster. For example, authorization and password reset behave similarly on both platforms and take about 1.3 times less time to test in a Flutter app.

As for features that require custom behavior on each platform, don’t expect tests to go any faster. They’re expected to behave differently on iOS and Android, so both platforms have to be tested fully and separately. For example, push notifications often require a full test cycle on both platforms due to differences in permissions, enabling notifications, push settings on iOS and Android, and other implementation nuances: all so the app interacts with notifications the way the user expects while complying with the technical specs and design.

Communications inside the team are easier to arrange

When there are a lot of people on a project, it’s hard to arrange the process so that even the smallest nuances don’t slip through the cracks. Especially when you expect a lot of things to be polished later, new features to be implemented, and changes in general. Most of this is easier to handle when there’s only one team of developers.

First off, it’s easier to test one app implemented by a single team than an app implemented in two different ways on two platforms. A QA expert benefits from having all the information on the app’s status, for both iOS and Android, in one place. Flutter makes that easier.

Secondly, there’s an important nuance to native development: changes agreed on for one platform must be reported to the other platform’s team, and this sometimes gets forgotten or lost in the course of large or intensive changes. There’s no such problem in Flutter development: there’s only one team, so a single task covers both platforms.

We love testing Flutter apps

Being part of a cool community

A new framework is a positive thing for us: solving extraordinary cases makes us broaden our horizons. We find a lot of curious and unique bugs that boost our skills and expand our options in mobile testing.

That said, members of the Flutter community also provide swift feedback on the issues we find, improve their library, and let us contribute to the project: we’re moving forward together and that’s nice to know.

Being an expert

When you’re dealing with cross-platform apps, you need to keep in mind the differences between the operating systems and stay focused on specifics of each platform. Looking for and seeing the differences where they should normally be minimal, learning something you’ve never come across before — such things help you become a better expert.

When you develop or test native apps, you can’t build an iOS app in, say, Android Studio or Visual Studio Code. With Flutter, there’s one IDE for both Android and iOS. And that’s cool.

Taking on responsibility

Working with Flutter, Surf teams have their hands on all sorts of projects: from e-commerce to banking. As far as we’ve learned, a QA engineer can handle testing both platforms on their own. Adding another expert is only needed as we get closer to release, when work gets intense and the time left for tasks is running out.

Flutter — a step forward

Testing cross-platform apps is pretty easy. Sometimes it’s even faster and easier than testing native ones. Whatever hard parts you run into don’t outweigh its usability and pros. The experience of developing and testing Flutter apps has shown Surf that this framework is a great step forward.
