Native, simple and fast screenshot tests with Kaspresso

Published in Kaspersky · 12 min read · Jun 11, 2024

Hi everyone, my name is Maria and I’m an Android developer at Kaspersky. In this article we’re going to discuss screenshot tests and how to make them simple, fast and efficient using our open source framework for autotesting — Kaspresso.

What are screenshot tests and why use them?

Screenshot tests are mostly used for localization testing. Your app may have dozens of supported languages, and it can be laborious to manually ensure that all strings are properly translated and displayed in each of them. As the name suggests, screenshot tests involve taking screenshots of all the different screens and states of your app in all languages. Afterwards, these screenshots are saved to appropriate folders and can be reviewed by the localization team (Doc&Loc) to make sure every string is correct in each locale.

Screenshot tests can also be used for design review, allowing UX designers to check if the final look of the app matches the mock-ups they created.

Screenshot tests are different from other autotests you might write for your app. First, in each test we are only concerned with the strings on a particular screen, so we don’t need to replicate the whole process of getting to that screen from the app startup. Instead, we open the activity or fragment under test directly (although this is not always possible).

Second, we want to capture every possible state of the screen in each locale. However, we aren’t testing the app’s behavior, so checking UI elements and imitating user actions is unnecessary. Instead, all we need to do is:

1. Open the screen we’re testing

2. Set the state to test

3. Take a screenshot

4. If needed, set another state and repeat

5. Change locale and repeat steps 2–4
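The steps above can be sketched in plain Kotlin, independent of any test framework. FakeScreen and captureAllStates are hypothetical names used only for illustration; a real test would open an Android screen and save an image instead of returning a file name.

```kotlin
// Hypothetical stand-in for a real screen; not part of Kaspresso.
class FakeScreen {
    var locale: String = "en"
    var state: String = "Initial"

    // A real test would save an image; here we only return a file name.
    fun capture(): String = "$locale/$state.png"
}

fun captureAllStates(locales: List<String>, states: List<String>): List<String> {
    val shots = mutableListOf<String>()
    val screen = FakeScreen()            // step 1: open the screen
    for (locale in locales) {            // step 5: repeat for every locale
        screen.locale = locale
        for (state in states) {          // steps 2 and 4: set each state in turn
            screen.state = state
            shots += screen.capture()    // step 3: take a screenshot
        }
    }
    return shots
}
```

Note that nothing in this loop clicks buttons or asserts on behavior: the only inputs are the locale and the state.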

Let’s take a look at how you can implement this using the Kaspresso library.

First, let’s create a simple screenshot test. Then we’ll go into detail about the potential pitfalls and propose a solution for an app that follows the MVVM (Model, View, ViewModel) architecture pattern.

We’re going to use an example app created to demonstrate the features of Kaspresso. In the Kaspresso project, in the ‘tutorial’ folder, there’s an example of the application code for which tests will be written. The first lesson of the tutorial will tell you how to download it. In the TECH-tutorial-results branch, you can see the final implementation of all the tutorial tests.

A simple screenshot test

Open our tutorial app and click the “Login activity” button. You will see the login screen.

To write a screenshot test for it, we will first create a test class in the androidTest directory of the app. For convenience, all screenshot tests can be kept in a separate package — let’s name it screenshot_tests. Then, we create the class LoginActivityScreenshots in this package (“Screenshots” is a common name ending for screenshot test classes).

Note: in all the following code snippets, the imports section is omitted to keep the examples simple and concise. If you’re not sure which imports you should be using, refer to the full code in the repository linked above.

Note that we inherit the class from DocLocScreenshotTestCase. This is a Kaspresso class that has built-in support for running the tests in multiple locales and can automatically place the screenshots for each of them in a separate directory, making them easier to find and organize later.

The constructor of DocLocScreenshotTestCase takes a list of locales as a parameter. In this example, we’re going to test English and French locales, so we pass them to the constructor. The order of the locales does not matter; the test runs once per locale, one at a time.

As we mentioned before, we’re not going to replicate the entire process of navigating to the desired screen. Instead, we’ll create a Rule specifying that LoginActivity should be opened when the test is started.

Now we can create a test method with one step (if you’re not familiar with steps, take a look at this article or our tutorial) which will test the initial state of the screen. Let’s call it takeScreenshots.

Next, we need to call the captureScreenshot method which will take the screenshots and place them in appropriate directories. This method takes a file name, which can be any string, as a parameter. Later you can locate the screenshot file by that name.
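Put together, the test class at this point might look roughly like the sketch below. It rests on a few assumptions: the locales are passed as a comma-separated string (check the constructor in your Kaspresso version), LoginActivity lives in the package shown in the import, and the step description is arbitrary.

```kotlin
package com.kaspersky.kaspresso.tutorial.screenshot_tests

import androidx.test.ext.junit.rules.activityScenarioRule
import com.kaspersky.kaspresso.testcases.api.testcase.DocLocScreenshotTestCase
// Assumed location of the tutorial activity; adjust to the actual package.
import com.kaspersky.kaspresso.tutorial.login.LoginActivity
import org.junit.Rule
import org.junit.Test

// Assumption: locales are given as a comma-separated string.
class LoginActivityScreenshots : DocLocScreenshotTestCase(locales = "en,fr") {

    // Opens LoginActivity directly when the test starts,
    // so there is no need to navigate to it from the main screen.
    @get:Rule
    val activityRule = activityScenarioRule<LoginActivity>()

    @Test
    fun takeScreenshots() = run {
        step("Take initial state screenshot") {
            // The argument becomes the screenshot file name.
            captureScreenshot("initial_state")
        }
    }
}
```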

The base directory for the screenshots is sdcard/Documents/screenshots. Inside it, there will be subdirectories for each language we’re testing (in this case, “en” and “fr”). The next level directory is named using the package and class name, and finally, the last one has the method name. So the full path will look like this: sdcard/Documents/screenshots/en/com.kaspersky.kaspresso.tutorial.screenshot_tests.LoginActivityScreenshots/takeScreenshots/initial_state.png.
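To make the layout concrete, here is a small plain-Kotlin sketch that assembles such a path from its parts. screenshotPath is a hypothetical helper written for this article, not a Kaspresso function; the library builds the path for you.

```kotlin
// Hypothetical helper that mirrors the directory layout described above.
fun screenshotPath(
    locale: String,
    testClass: String,   // fully qualified test class name
    testMethod: String,
    name: String,        // the name passed to captureScreenshot
    baseDir: String = "sdcard/Documents/screenshots"
): String = listOf(baseDir, locale, testClass, testMethod, "$name.png").joinToString("/")
```

Calling it with "en", the LoginActivityScreenshots class name, "takeScreenshots" and "initial_state" yields the path shown above.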

At this point, we’ve done everything necessary to take screenshots and view our app in different locales, but there’s one more modification we should make.

Currently we’re opening the screen and taking the screenshot immediately, meaning there’s a possibility that some data on the screen won’t be completely loaded, and the screenshot won’t capture all the elements we want to see.

To deal with this, we’ll create a LoginScreen page object and add a method that waits for the UI to load (if you’re not familiar with Page Objects, you can read about them here). This method simply calls isVisible for every element on the screen. Because Kaspresso runs these checks through flakySafely, the test will wait a few seconds for an element that doesn’t load right away instead of failing immediately. You can read more about flakySafely and its usages here.

Let’s call this method waitForScreen.

And then, we’ll call this method in the test class before taking a screenshot.
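A page object with such a method could look like the following sketch. The view ids (input_username, input_password, login_btn) and the property names are assumptions about the tutorial layout; substitute the ids your screen actually uses.

```kotlin
import com.kaspersky.kaspresso.screens.KScreen
import com.kaspersky.kaspresso.tutorial.R
import io.github.kakaocup.kakao.edit.KEditText
import io.github.kakaocup.kakao.text.KButton

object LoginScreen : KScreen<LoginScreen>() {

    override val layoutId: Int? = null
    override val viewClass: Class<*>? = null

    // Assumed view ids; check them against the layout of the login screen.
    val inputUsername = KEditText { withId(R.id.input_username) }
    val inputPassword = KEditText { withId(R.id.input_password) }
    val loginButton = KButton { withId(R.id.login_btn) }

    // Each check is retried for a while before failing,
    // so this call returns only once every element is on screen.
    fun waitForScreen() {
        inputUsername.isVisible()
        inputPassword.isVisible()
        loginButton.isVisible()
    }
}

// Usage inside the test step, before the screenshot is taken:
//
// step("Take initial state screenshot") {
//     LoginScreen { waitForScreen() }
//     captureScreenshot("initial_state")
// }
```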

Now, if we run the test, it will pass successfully. The resulting files can be found using the Device File Explorer in Android Studio: in the sdcard/Documents/screenshots directory there will be subdirectories for each tested locale containing the screenshot files. We can review how our app looks in different languages.

These screenshots allow us to immediately notice that the French translation is missing, and the developer can fix the problem by adding the appropriate strings to the values-fr/strings.xml file.

Working with MVVM and setting states

In this example, we will be testing a screen that imitates data loading. To open it, click the “Load user activity” button in the tutorial app. When the user clicks the “Load user” button, the app starts the loading process and shows a progress bar. When the loading is complete, the screen displays either the successfully loaded data or an error message.

Let’s determine the states of this screen. First, when we open it, it displays a single button. Clicking this button will change the appearance and consequently, the state of the screen. We’ll call this first state Initial.

When we click the button, it becomes inactive, and the screen shows a progress bar indicating that the loading has started. This is already a different state from the Initial one. Let’s call it Progress.

After a few seconds, the loading completes, and the screen shows the name and surname of the user. The button becomes active again, and the progress bar disappears. We’ll call this next state Content.

While in this example we only imitate data loading, in real apps we could encounter network-related errors that should be handled in some way, for example, by displaying an error message. We have imitated this scenario in our test app as well. If you try to load a user several times, one of the attempts will result in an error.

This is the last possible screen state, and we’ll call it Error.
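These four states map naturally onto a sealed class. The sketch below is one plausible shape for it; the User fields and the describe function are assumptions made for illustration and may differ from the tutorial code.

```kotlin
// Assumed shape of the loaded data: a name and a surname.
data class User(val name: String, val lastName: String)

sealed class State {
    object Initial : State()                      // only the button is shown
    object Progress : State()                     // button disabled, progress bar shown
    data class Content(val user: User) : State()  // loaded user is shown
    object Error : State()                        // error message is shown
}

// Hypothetical helper showing how a View can branch on the state exhaustively.
fun describe(state: State): String = when (state) {
    State.Initial -> "button only"
    State.Progress -> "button disabled, progress bar visible"
    is State.Content -> "user ${state.user.name} ${state.user.lastName}"
    State.Error -> "error message"
}
```

Because the class is sealed, the compiler checks that every state is handled, which also gives us a complete list of screenshots to take.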

A naïve approach to testing this screen would be to imitate user actions by clicking the button and trying to capture all the states. There are, however, a number of issues with this method.

1) The Progress state depends on the network speed and server availability. If the data loads too fast, we’ll never capture the Progress state, and the test will fail.

2) We can turn off the internet to get the Error state, but in a real app there could be different error types leading to different states. For instance, if there’s no internet connection, the app will prompt the user to turn it on, but if the server returns an error, the app will just show a message describing it. In cases like these, emulating the error state becomes too complex.

3) The purpose of a screenshot test is to capture all possible screen states, not to test app behavior. However, if we encounter actual connection or server problems during the test, it will fail and we won’t get the screenshots.

4) Connecting to the internet or performing other complex tasks can cause our tests to run for too long, leading to potential failures due to timeouts and generally making the screenshot tests inconvenient to run.

View and ViewModel

For the approach we’re about to discuss, it’s important to understand the MVVM (Model, View, ViewModel) pattern. Briefly, it’s a technique that allows you to separate UI and business logic in your app.

The UI screens (Activities and Fragments) are responsible for displaying the UI elements (buttons, input fields, etc.) and reacting to user actions (clicks, swipes, etc.). This part of the app is called View.

ViewModel, meanwhile, is responsible for handling the logic behind the screens. It stores a state which determines what a screen should show to the user, and the View, upon receiving this state, modifies the UI elements accordingly.

Let’s consider an example from our app. When the “Load user” button is clicked, the View calls a ViewModel method that loads the data.

The LoadUserFragment class in the com.kaspersky.kaspresso.tutorial.user package is a View. In the following code fragment we set a button click listener that calls the loadUser method of the ViewModel.

The loading logic is implemented inside ViewModel, in the LoadUserViewModel class. The loadUser method changes the state of the screen, setting it to Progress before the loading starts and to Content or Error when it ends, depending on the result.
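A ViewModel of this kind might be structured as in the sketch below. It is not the tutorial’s exact code: imitateLoading is a hypothetical stand-in for the real loading logic, and State and User are assumed to be declared in the same package.

```kotlin
import androidx.lifecycle.ViewModel
import androidx.lifecycle.viewModelScope
import kotlinx.coroutines.delay
import kotlinx.coroutines.flow.MutableStateFlow
import kotlinx.coroutines.flow.StateFlow
import kotlinx.coroutines.flow.asStateFlow
import kotlinx.coroutines.launch

class LoadUserViewModel : ViewModel() {

    // Mutable and private: only the ViewModel itself can publish new states.
    private val _state = MutableStateFlow<State>(State.Initial)

    // Read-only view of the same flow, which the Fragment subscribes to.
    val state: StateFlow<State>
        get() = _state.asStateFlow()

    fun loadUser() {
        viewModelScope.launch {
            _state.value = State.Progress
            val user = imitateLoading()
            _state.value = if (user != null) State.Content(user) else State.Error
        }
    }

    // Hypothetical stand-in for a network call; returns null on failure.
    private suspend fun imitateLoading(): User? {
        delay(2_000)
        return User(name = "Test", lastName = "User")
    }
}
```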

The View subscribes to the state of the ViewModel and changes the screen UI depending on its value.

It follows that if we want to change the screen state, we can modify it in the ViewModel, causing the View to react. This is the approach we’re going to use for our screenshot tests.

Mocking ViewModel

Let’s create a LoadUserViewModel inside the test class that we’re going to use for setting states.

Then, let’s try to set the state variable to a new value, for example, State.Initial. It will, however, cause an error because the state variable in the ViewModel has an immutable StateFlow type, which means we can’t set it to a new object. If we examine the code of the LoadUserViewModel, we’ll notice that all new states are set to a variable named _state of the MutableStateFlow type. The _state variable is a mutable object that can be set to a new value, but it has the private access modifier, preventing access from outside the class.

What can we do in this situation? We need the Fragment to react to new state values we’ll be setting inside the test, and the Fragment subscribes to viewModel.state (without the underscore).

We can do something a little different by creating our own mutable state object inside the test class, which will allow us to set it to any value.

Now we need to substitute the real viewModel.state with our own object when the Fragment subscribes to it. This can be achieved with the MockK library (if you’ve never worked with it before, we recommend looking at the docs). To use it, we need to add the dependency to the build.gradle file.

Now we can stop caring about the internal implementation of the state in the ViewModel. All we need to do is return the state object we created when the Fragment subscribes to viewModel.state.

This technique is called mocking. We “mocked” the ViewModel so that if its state is referenced, it instead returns the _state object we created. The real implementation of the LoadUserViewModel will not be used in the tests.
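In code, the mock setup could be sketched as follows. The class name LoadUserScreenshots and the choice of a relaxed mock are assumptions. The MockK artifact for instrumented tests is io.mockk:mockk-android, declared as androidTestImplementation; pick the version that matches your project.

```kotlin
import com.kaspersky.kaspresso.testcases.api.testcase.DocLocScreenshotTestCase
import io.mockk.every
import io.mockk.mockk
import kotlinx.coroutines.flow.MutableStateFlow

// Assumption: locales are given as a comma-separated string.
class LoadUserScreenshots : DocLocScreenshotTestCase(locales = "en,fr") {

    // Our own mutable state, fully controlled by the test.
    private val _state = MutableStateFlow<State>(State.Initial)

    // relaxed = true lets calls we did not stub (such as loadUser) do nothing
    // instead of throwing.
    private val viewModel = mockk<LoadUserViewModel>(relaxed = true) {
        // Whenever viewModel.state is read, hand out our own flow.
        every { state } returns _state
    }
}
```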

And now, we don’t need to imitate the user’s action to change the screen state. Instead, we’ll do it directly by changing the _state variable, and take a screenshot after each change.

Modifying the Fragment code

Right now the viewModel object we created isn’t actually used anywhere. Let’s take a look at how the screen and the ViewModel interact and what changes we need to make so that the screen interacts with the mocked ViewModel instead of the real one.

The screen is launched using LoadUserActivity.

This activity barely has any code because most recent apps use the Single Activity approach. All screens are created using fragments while the activity simply serves as a container for them. So, the UI and interaction with the ViewModel are actually located in the LoadUserFragment class, and that’s where we’ll be making changes.

Let’s take a look at this class.

Note that this class has a private viewModel variable, and in the onViewCreated method a value is assigned to this variable by calling ViewModelProvider. We need to tweak the code so that when the fragment is used normally, the ViewModel is created through ViewModelProvider, but when it’s used in a screenshot test, the mocked ViewModel is passed as a parameter.

A fragment is created by calling the newInstance factory method.

This method simply creates a LoadUserFragment instance. Let’s add another method that will take a ViewModel as a parameter and set it to the variable in the Fragment. This method will only be used for tests, so we can call it newTestInstance.

Now, the activity calls the newInstance method like it did before, and our test code is going to call the newTestInstance method.

But currently we assign a value to viewModel in onViewCreated without differentiating between these cases. To fix that, we’ll add a boolean field isForScreenshots, set it to false by default, and change it to true in the newTestInstance method.

And in the onViewCreated method we’re going to create the ViewModel through the ViewModelProvider only if isForScreenshots is false.
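With these changes, the relevant parts of the fragment might look like the sketch below. View inflation, the click listener and the observeViewModel body are omitted, and the parameter name mockedViewModel is an assumption.

```kotlin
import android.os.Bundle
import android.view.View
import androidx.fragment.app.Fragment
import androidx.lifecycle.ViewModelProvider

class LoadUserFragment : Fragment() {

    private lateinit var viewModel: LoadUserViewModel

    // false in production; set to true only by newTestInstance.
    private var isForScreenshots = false

    override fun onViewCreated(view: View, savedInstanceState: Bundle?) {
        super.onViewCreated(view, savedInstanceState)
        if (!isForScreenshots) {
            // Normal launch: let the framework create the real ViewModel.
            viewModel = ViewModelProvider(this)[LoadUserViewModel::class.java]
        }
        // ... set the click listener and call observeViewModel() as before
    }

    companion object {

        // Used by LoadUserActivity.
        fun newInstance(): LoadUserFragment = LoadUserFragment()

        // Used only by screenshot tests: injects the mocked ViewModel.
        fun newTestInstance(mockedViewModel: LoadUserViewModel): LoadUserFragment =
            LoadUserFragment().apply {
                viewModel = mockedViewModel
                isForScreenshots = true
            }
    }
}
```

A flag in production code that exists only for tests is a trade-off; it keeps the example short, while a larger project might prefer injecting a ViewModel factory instead.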

In a screenshot test, the real ViewModel is therefore never created. However, the state variable is implemented in our mocked ViewModel, so referencing viewModel.state in the observeViewModel method does not cause an error; it simply reads the value stored in the _state object we created in the test class.

Testing the fragment

To finally write the screenshot test, we need to create a fragment instance and pass the mocked ViewModel as a parameter to it. But if you take a look at our current code, you’ll notice that we’re not creating any fragments at all.

Instead, we’re opening LoadUserActivity which creates a fragment inside itself, meaning we can’t pass any parameters to it.

If we’re testing fragments, we don’t actually need to launch the activity holding the fragment, and should test the fragment directly instead. To do that, we need to add the following dependencies to build.gradle.

Now we can delete the activityRule from the test class. To launch a fragment, we’ll call the launchFragmentInContainer method and create the fragment we need.

Let’s go over what’s happening here. We launch LoadUserFragment inside the takeScreenshots method. We use the newTestInstance method to create the fragment and pass the ViewModel we created in the test.
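A complete version of such a test could be sketched like this. The screenshot names, step descriptions and the User constructor arguments are assumptions; depending on your app theme, you may also need to pass a themeResId to launchFragmentInContainer.

```kotlin
import androidx.fragment.app.testing.launchFragmentInContainer
import com.kaspersky.kaspresso.testcases.api.testcase.DocLocScreenshotTestCase
import io.mockk.every
import io.mockk.mockk
import kotlinx.coroutines.flow.MutableStateFlow
import org.junit.Test

// Assumption: locales are given as a comma-separated string.
class LoadUserScreenshots : DocLocScreenshotTestCase(locales = "en,fr") {

    private val _state = MutableStateFlow<State>(State.Initial)
    private val viewModel = mockk<LoadUserViewModel>(relaxed = true) {
        every { state } returns _state
    }

    @Test
    fun takeScreenshots() = run {
        step("Launch the fragment with the mocked ViewModel") {
            launchFragmentInContainer {
                LoadUserFragment.newTestInstance(mockedViewModel = viewModel)
            }
        }
        step("Initial state") {
            _state.value = State.Initial
            captureScreenshot("state_initial")
        }
        step("Progress state") {
            _state.value = State.Progress
            captureScreenshot("state_progress")
        }
        step("Content state") {
            // Hypothetical user data; any values will do for a screenshot.
            _state.value = State.Content(User(name = "Test", lastName = "User"))
            captureScreenshot("state_content")
        }
        step("Error state") {
            _state.value = State.Error
            captureScreenshot("state_error")
        }
    }
}
```

If a screenshot is occasionally taken before the UI has updated, a waitForScreen-style check on a page object for this screen can be added before each capture.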

Now the fragment interacts with the mocked ViewModel, not the real one. The fragment displays the state it gets from the ViewModel and, since we mocked the state object, it will display whatever state we set in the test.

From now on, we don’t need to imitate the user’s actions, as we can simply set the necessary state and take a screenshot when the fragment displays it.

If you run the test now, you’ll see that screenshots of all the states are saved successfully, and it happens really fast since we’re not limited by the internet connection speed or any other external factors.

In this article, we’ve covered screenshot tests and how to write them efficiently if your app uses the MVVM pattern. We encourage you to try it out in your own app, as well as explore other features Kaspresso has to offer. If you found this article useful, or if you’re planning to use Kaspresso in your projects, please consider supporting us by starring our GitHub project and joining our community on Telegram. We hope to see you among our contributors!

Furthermore, we welcome you to read our previous publications about autotests on Android:

UI tests on Android. How do we test an app that requires permissions?

Break free from the instrumented test code — use ADB within the tests