Performance testing on the web
This article was updated on Nov 25, 2020 to use the web_benchmarks package.
Overview
During development, we often want to test an app’s performance in the browser. Performance testing is useful, as it reveals potential bugs that make an app slower.
This article describes a way to test an app’s performance in Chrome. This method is similar to how we test the new Flutter Gallery’s performance.
Example app
We use a simple app that contains an app bar, a floating action button, and an infinite list of items. The list also shows the number of times the button has been tapped.
The app has a second page containing some information.
You can clone the app here:
What to test?
We want to test the app’s performance in Chrome under the following usage scenarios:
- User scrolls through the infinite list.
- User switches between the two pages.
- User taps the floating action button.
Setting up the framework
Add the following to pubspec.yaml:
This dependency pulls in web_benchmarks, a minimal package that implements performance testing in Chrome.
This package is adapted from macrobenchmarks and devicelab, two packages used by Flutter for web performance testing on the Flutter Gallery. At the moment, these two packages are specialized for web performance testing within flutter/flutter, so it is easier to import the more general package, web_benchmarks.
Run flutter pub get to pull in this package.
Writing the first test
Add a benchmarks directory under lib, and add a new Dart file to it called runner.dart:
The contents of the file are as follows:
What is this test doing?
- When this app runs, a ScrollRecorder object is created, which drives the app by automatically making gestures. In this case, shortly after the app starts, it starts scrolling down the infinite list.
- The ScrollRecorder class extends the AppRecorder class, which extends the WidgetRecorder class, which also records performance data as it drives the app.
- runBenchmarks is a function defined in package:web_benchmarks/client.dart, which allows the user to select which benchmark to run, and displays the results in the browser.
- The method automate uses the flutter_test package, which provides methods to make gestures or find certain widgets in an app.
Running the first test
In the root directory of the project, run flutter run -d chrome -t lib/benchmarks/runner.dart. This tells Flutter to use runner.dart as the entry point, instead of main.dart.
We only have one benchmark so far, so click “scroll” to start it.
The test begins, and the list automatically scrolls down.
The test ends in a few seconds, showing the following screen:
This chart shows the time it took for the app to draw each (recorded) frame. The horizontal axis represents the flow of time; the vertical axis, the duration each frame took.
The first 2/3 of the chart has a gray background; these frames are considered “warm-up frames”, and are omitted from the statistics. Warm-up frames typically give the JIT compiler time to compile the code, and populate various caches, so that the measured frames produce numbers that reflect the “eventual” performance of the app, rather than the first few seconds of it. The warm-up phase should not always be ignored, though: it can provide valuable information about your app’s performance during the first few seconds, which can still influence the perception of the app’s quality.
Red frames are “outliers”: frames that take significantly longer to draw than other frames. Some outliers can be nearly unnoticeable. For example, a small amount of jank at the beginning or the end of an animation will not be visible. However, a janky frame in the middle of an animation will be very noticeable.
Outliers provide a good indicator of the jankiness of the app. By improving your app, you can lower the values of outliers or reduce the number of outliers, which shows that your app has become smoother.
Collecting data from Chrome’s DevTools
So far, the benchmark runs entirely inside Chrome. To also collect performance data from Chrome’s DevTools, add the following file as test/run_benchmarks.dart:
Then, run dart test/run_benchmarks.dart.
After about one minute, you should see the following results:
The exact benchmark values may vary depending on the machine.
What is this test doing?
- Running test/run_benchmarks.dart builds the app for the web. Then, it starts a Chrome instance and runs the app in it.
- test/run_benchmarks.dart connects to Chrome’s DevTools port, and listens for and collects relevant performance data from it.
What do the results mean?
- When rendering a frame, the layer tree is walked twice.
- “Preroll” is the first walk. It does not render anything, but it computes values that are later used for rendering. Examples include: transform matrices, the inverse of transforms, and clips.
- “Apply frame” is the second walk where the UI is actually rendered.
- “Draw frame” is the total time that the framework takes to render a frame. It includes “Preroll” and “Apply frame”, but it also includes the time spent on building and laying out the widgets.
- “Total UI frame” includes everything in “Draw frame”, but it also includes some hidden work that the browser performs, such as layer tree updates, style recalculations, and browser-side layout (not to be confused with Flutter’s own layout).
- When a dataset (a list of durations) is collected, the algorithm removes outliers.
- First, the mean and standard deviation of the data are computed, and any data point that is higher than (mean + 1 standard deviation) is considered an outlier.
- The mean and standard deviation of non-outliers (clean data) are used to compute the average and noise of the data set, which are then reported.
- The mean of all outliers, as well as the ratio of the “outlier mean” and the “non-outlier mean” are also reported.
- For each dataset, “outlierRatio” and “noise” are both good indicators of how much noise there is in the performance of the app. If the results are too noisy, it might indicate inconsistencies in performance (such as janky frames or GC pauses). By aiming to lower the noise, you can make your app perform more smoothly.
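The outlier-removal steps described above can be sketched in plain Dart as follows. This is an illustrative sketch, not the code used by web_benchmarks: the names Stats and summarize are hypothetical, the population standard deviation is assumed, and “noise” is assumed to mean the standard deviation of the clean data divided by its mean.

```dart
import 'dart:math' as math;

/// Summary of one dataset of durations (hypothetical type for illustration).
class Stats {
  Stats(this.average, this.noise, this.outlierAverage, this.outlierRatio);

  /// Mean of the clean (non-outlier) values.
  final double average;

  /// Assumed definition: standard deviation of clean values / their mean.
  final double noise;

  /// Mean of the outliers (equals [average] when there are no outliers).
  final double outlierAverage;

  /// Ratio of the outlier mean to the non-outlier mean.
  final double outlierRatio;
}

double _mean(List<double> values) =>
    values.reduce((a, b) => a + b) / values.length;

// Population standard deviation (an assumption; the article does not say
// whether the population or sample formula is used).
double _standardDeviation(List<double> values, double mean) {
  final sumOfSquares =
      values.map((v) => (v - mean) * (v - mean)).reduce((a, b) => a + b);
  return math.sqrt(sumOfSquares / values.length);
}

Stats summarize(List<double> durations) {
  if (durations.isEmpty) {
    throw ArgumentError('durations must not be empty');
  }

  // Step 1: anything above (mean + 1 standard deviation) is an outlier.
  final mean = _mean(durations);
  final cutoff = mean + _standardDeviation(durations, mean);
  final clean = durations.where((d) => d <= cutoff).toList();
  final outliers = durations.where((d) => d > cutoff).toList();

  // Step 2: average and noise come from the clean data only.
  // The minimum value is never above the cutoff, so clean is never empty.
  final cleanMean = _mean(clean);
  final cleanDeviation = _standardDeviation(clean, cleanMean);
  final noise = cleanMean == 0 ? 0.0 : cleanDeviation / cleanMean;

  // Step 3: outlier mean and its ratio to the non-outlier mean.
  final outlierMean = outliers.isEmpty ? cleanMean : _mean(outliers);
  final ratio = cleanMean == 0 ? 1.0 : outlierMean / cleanMean;

  return Stats(cleanMean, noise, outlierMean, ratio);
}
```

For example, with the durations 10, 10, 10, 10, and 50, the mean is 18 and the standard deviation is 16, so the cutoff is 34. The value 50 is the only outlier, the clean average is 10, and the outlier ratio is 5.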
Add more tests
Edit lib/benchmarks/runner.dart to add two more tests.
First, modify the main function:
Then, add two more classes that extend AppRecorder:
What are these tests doing?
- We have added the two remaining benchmark tests: one for switching between pages, and the other for tapping on the floating action button.
- animationStops repeatedly checks whether an animation is happening, and stops when all animation has stopped. This ensures, for example, a successful transition to the “about” page.
- In the “page” and “tap” benchmarks, the _completed boolean tracks whether the automated gestures have finished.
- In the “page” and “tap” benchmarks, overriding the shouldContinue method causes the AppRecorder to stop recording frames after all gestures have finished.
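The polling idea behind animationStops can be illustrated with a small, framework-independent helper. This is a sketch only: waitUntilIdle, its isBusy callback, and the default interval and timeout are hypothetical and are not taken from the example app.

```dart
import 'dart:async';

/// Hypothetical helper: polls [isBusy] until it returns false.
///
/// Throws a [TimeoutException] if the work is still in progress after
/// [timeout], so a stuck animation cannot hang the benchmark forever.
Future<void> waitUntilIdle(
  bool Function() isBusy, {
  Duration pollInterval = const Duration(milliseconds: 50),
  Duration timeout = const Duration(seconds: 10),
}) async {
  final stopwatch = Stopwatch()..start();
  while (isBusy()) {
    if (stopwatch.elapsed > timeout) {
      throw TimeoutException('Still busy after $timeout', timeout);
    }
    // Yield to the event loop so the animation can make progress.
    await Future<void>.delayed(pollInterval);
  }
}
```

In a Flutter benchmark, the callback would report whether an animation is still running, for instance by checking whether the framework still has a frame scheduled. How the example app performs that check is an assumption here.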
How to run these tests?
To run these tests (and see the animations) in Chrome, run:
flutter run -d chrome -t lib/benchmarks/runner.dart --profile
To run these tests and collect DevTools data, run:
dart test/run_benchmarks.dart
What next?
Once you have a way to collect performance data, you can use it however you want:
- You can set up a job in CI that runs these benchmark tests whenever someone submits a PR, to avoid introducing performance-heavy changes.
- You can also set up a dashboard that keeps track of the trend of performance benchmarks. This is what we are doing for the Flutter Gallery (see Flutter Dashboard).
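As a rough illustration of the CI idea, a job could compare the collected averages against a stored baseline and fail when a metric regresses. The sketch below assumes the results have already been reduced to a map from metric name to average duration. The function findRegressions, the metric names, and the 10% tolerance are all hypothetical, not part of web_benchmarks.

```dart
/// Hypothetical check: returns the names of metrics whose current value
/// exceeds the baseline by more than [tolerance] (0.10 means 10%).
List<String> findRegressions(
  Map<String, double> baseline,
  Map<String, double> current, {
  double tolerance = 0.10,
}) {
  final regressions = <String>[];
  for (final entry in baseline.entries) {
    final now = current[entry.key];
    // A metric missing from this run is skipped rather than failed.
    if (now == null) continue;
    if (now > entry.value * (1 + tolerance)) {
      regressions.add(entry.key);
    }
  }
  return regressions;
}
```

A CI script could call findRegressions with, say, a baseline of {'scroll.average': 20.0} and a current run of {'scroll.average': 25.0}, then exit with a non-zero code if the returned list is not empty. A larger tolerance makes the check less sensitive to machine-to-machine variation, at the cost of missing small regressions.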