Flutter Performance Updates in 2019
By Yuqian Li and Shams Zakhour
Being fast is a key pillar of Flutter. This article highlights performance improvements from the second half of 2019 implemented by folks from the Flutter community. (Yes, this is late, but it’s better late than never!)
If you’ve helped out with performance in 2020, we will cover that in a future post. We hope that sharing this with you, the Flutter community, inspires you to help us with this effort!
Q4 (Oct-Dec) 2019
Quantified improvements
70% memory reduction in fast scrolling through large images
contributors: liyuqian, dnfield, chinmaygarde
- PR 14265: Cleanup the IO thread GrContext
- PR 46184: Memory test on scrolling large images quickly
- Dashboard: 70% reduction (400MB to 120MB) in fast_scroll_large_images__memory diff-median
- Issue 19558: IO thread GrContext memory needs to be cleaned up
40% reduction in CPU/GPU usage for simple iOS animations
contributors: flar, liyuqian, hixie, chinmaygarde
- PR 14104: Rework simpler conditional offscreen for screen readback support
- PR 13976: Dynamically determine whether to use offscreen surface based on need
- Issue 31865: Simple animation costs high CPU/GPU/Power usages (~20%) on iPhone 6
- 40% reduction (23% CPU and 14% GPU down to 13% CPU and 8.5% GPU) in simple_animation_perf_iphonexs cpu_percentage, gpu_percentage
41% speedup for caret performance
contributors: garyqian, liyuqian, justinmc
- PR 46720: Pass _caretPrototype to prevent cache miss
- PR 46720: 41% speedup (6.709ms to 4.756ms) for 90th percentile frame build time
- Fixed Issue 24522: Caret performance is poor, high GPU time per frame
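The headline numbers in this article mix two conventions: "N% reduction" describes how much of the old value was removed, while "N% speedup" or "Nx speedup" compares the old time to the new time. The following is a minimal sketch of that arithmetic, using figures quoted above; the helper names are ours, not part of any Flutter tooling.

```python
def percent_reduction(old: float, new: float) -> float:
    """Share of the old value that was removed, in percent."""
    return (old - new) / old * 100.0


def speedup_factor(old: float, new: float) -> float:
    """How many times faster the new measurement is (old / new)."""
    return old / new


# Memory: 400MB -> 120MB is reported as a 70% reduction.
print(round(percent_reduction(400, 120)))      # 70

# Caret frame build time: 6.709ms -> 4.756ms is reported as a 41% speedup.
caret = speedup_factor(6.709, 4.756)
print(round((caret - 1) * 100))                # 41
# The same change expressed as a reduction in frame build time.
print(round(percent_reduction(6.709, 4.756)))  # 29
```

Note that the same measurement reads as "41% faster" or "29% less time" depending on the convention, so it helps to check which one a given headline uses.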
10% speedup for list scrolling by fixing raster cache throttling
contributors: liyuqian, chinmaygarde, flar, cyanglaz, zsunkun
- Issue 31865: Simple animation costs high CPU/GPU/Power usages (~20%) on iPhone 6
- PR 13710: Fix picture raster cache throttling
- PR 45050: Add a perf test for picture raster cache
- PR 13710 (Fix picture raster cache throttling) fixed a blocker issue, 43083: List scrolling is not smooth
37x speedup for cached benchmarks load time (Dashboard)
contributors: caseyhillers, tvolkert, digiter, jonahwilliams
- PR 494: Cache get-benchmarks
- PR 484: /api/public/get-status serves cached responses
- 37x speedup (37s to 1s) in benchmarks loading time
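To illustrate why serving cached responses can turn a 37s load into a 1s load, here is a minimal sketch of a time-based cache in Python. It is an assumption about the general technique, not the dashboard's actual implementation; `TtlCache`, `load_benchmarks`, and the 60-second lifetime are hypothetical.

```python
import time
from typing import Any, Callable, Optional


class TtlCache:
    """Caches one expensive result and reuses it until it expires."""

    def __init__(self, loader: Callable[[], Any], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._value: Any = None
        self._expires_at: Optional[float] = None

    def get(self) -> Any:
        now = self._clock()
        # Reload only on the first call or once the cached value has expired.
        if self._expires_at is None or now >= self._expires_at:
            self._value = self._loader()
            self._expires_at = now + self._ttl
        return self._value


calls = 0


def load_benchmarks() -> dict:
    # Hypothetical stand-in for the slow query behind a benchmarks endpoint.
    global calls
    calls += 1
    return {"benchmarks": ["example_metric"]}


cache = TtlCache(load_benchmarks, ttl_seconds=60.0)
cache.get()
cache.get()
print(calls)  # 1: the second request was served from the cache
```

The trade-off is staleness: a reader may see data up to one lifetime old, which is usually acceptable for a dashboard that updates once per commit.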
2.3x speedup for building APKs
contributors: jonahwilliams, blasten, zanderso, xster
- PR 44534: Improve performance of build APK (~50%) by running gen_snapshot concurrently
- 2.3x speedup (140s to 60s) in a release build of an APK for target platforms android-arm, android-arm64, and android-x64
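The gain comes from running independent per-architecture work side by side instead of one after another. The sketch below shows the idea in Python with a thread pool; `build_snapshot` is a hypothetical stand-in that sleeps instead of invoking gen_snapshot, and this is not the code from PR 44534.

```python
import time
from concurrent.futures import ThreadPoolExecutor

TARGETS = ["android-arm", "android-arm64", "android-x64"]


def build_snapshot(target: str) -> str:
    # Hypothetical stand-in for one gen_snapshot run; sleeps instead of compiling.
    time.sleep(0.2)
    return f"snapshot for {target}"


def build_sequentially(targets):
    return [build_snapshot(t) for t in targets]


def build_concurrently(targets):
    # The targets do not depend on each other, so they can run in parallel.
    # map() returns results in the same order as the inputs.
    with ThreadPoolExecutor(max_workers=len(targets)) as pool:
        return list(pool.map(build_snapshot, targets))


start = time.perf_counter()
sequential = build_sequentially(TARGETS)
sequential_time = time.perf_counter() - start

start = time.perf_counter()
concurrent = build_concurrently(TARGETS)
concurrent_time = time.perf_counter() - start

print(sequential == concurrent)  # same artifacts either way
print(f"sequential {sequential_time:.2f}s, concurrent {concurrent_time:.2f}s")
```

With three equally slow targets the concurrent version takes roughly the time of one target, which is why the benefit grows with the number of target platforms in the build.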
+103 performance metrics recorded per Flutter engine commit
contributors: liyuqian, digiter, keyonghan, godofredoc, cbracken
- PR 14556: Run and collect benchmarks
- Related: Issue 37434: Centralize performance metrics, provide universal alerts, and improve performance dashboard
20% app size reduction
contributors: mraleph, alexmarkov, rmacnak-google, mkustermann, sstrickl, aartbik
- 3% from Lift PC offsets out of StackMaps
- 2.58% from Further compress the information previously in StackMaps
- 1% from Canonicalize CompressedStackMaps payloads when possible
- 2% from Fully enable deduplication of instructions in bare mode
- 0.3% from Do not generate monomorphic prologue for functions which need args descriptor
- 1% from Drop redundant initializing stores of null
- 6% from Reduce alignment of Instructions and remove some debugging trap instructions
- 1.2% from Adjust CSP during the invocation stub instead of each function prologue
- 1% from ARM64: Block R22 to hold NullObject()
- 2.5% from Whole-program constant propagation
- 0.77% from Dead code elimination
108x speedup on Dart FFI performance
contributors: dcharkes, mkustermann, sjindel, alexmarkov
- Gerrit 120661: Optimize Pointer operations for statically known types
- Gerrit 119645: Pointer optimize indexed load and store
- Gerrit 121580: Allow inlining of force optimized functions in AoT
10–15% performance improvement in tight code
contributors: aartbik, mkustermann, mraleph
- Gerrit 117200: Loop analysis and BCE improvements
- 10–15% performance improvement in golem armv7 and TypedData Bench
2.2x speedup in flutter test with new incremental serializer
contributors: jensjoha, alexmarkov
- Gerrit 121121: Enable incremental serializer by default
- 2.2x speedup (3:38 to 1:39) in `flutter test`
10% faster Kernel binary serialization by giving inlining hints to Dart VM JIT
contributors: jensjoha, johnniwinther
30% performance improvement on async heavy code
contributors: cskau-g, mkustermann, mraleph
Other improvements
Fixed a memory leak when using PlatformView on iOS
Fixed a memory leak when animation is playing on iOS
- Gerrit 260538: Don’t allocate invalidation messages for generators that make uncacheable textures
- Memory Leak when animation is playing in iOS
Fixed more iOS memory leaks
- https://github.com/flutter/engine/pull/14275
- https://github.com/flutter/engine/pull/14326
- https://github.com/flutter/flutter/issues/35243
Started revamping Performance pages on flutter.dev and added instructions on measuring app size.
Corrected the first frame waiting logic and measurement
- PR 37192: Reland “Fix the first frame logic in tracing and driver (#35297)”
- Fixed Issue 47108: Memory Leak when animation is playing in iOS
DevTools added full timeline mode with support for async and recorded tracing.
IntelliJ plugin fixed 120FPS support
Many timeline tracing improvements thanks to ByteDance
- Gerrit 127920: [timeline] Add support for timeline asynchronous events in android platform trace
- Gerrit 128200: [timeline] support vm events available to systrace
- Gerrit 127921: support more sync event when use systrace to record timeline event
- PR 14323: Fix missing API stream when record event in systrace
- PR 14521: Support timeline can be enabled in release mode
- PR 14319: Fix missing timeline event of flutter engine’s startup time
- PR 47742: fix duration event of timeline summary
- Gerrit 131360: Support timeline conversion to iOS platform trace
- PR 16520: support endless trace buffer
- PR 47419: support endless recorder for timeline
Q3 (July-Sept) 2019
Quantified improvements
1.5–5x speedup for rect & point transformations
contributors: flar, yjbanov, dnfield
- PR 37275: Optimize the transformRect and transformPoint methods in matrix_utils
- 5.3x speedup (2300ms to 430ms) in MatrixUtils_affine_transformRect_iteration
- 1.5x speedup (466ms to 320ms) in MatrixUtils_affine_transformPoint_iteration
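The benchmark names suggest the win comes from treating affine matrices as a special case. As a sketch of the general idea (not the code from PR 37275), the Python below transforms 2D points with a 4x4 matrix and skips the perspective divide when the matrix is affine. The column-major 16-element layout and all function names are assumptions made for illustration.

```python
from typing import List, Tuple

# Assumed layout: 16 floats, column-major 4x4 matrix.
Matrix = List[float]


def is_affine_2d(m: Matrix) -> bool:
    # No perspective terms affect x and y, so w stays 1 for every 2D point.
    return m[3] == 0.0 and m[7] == 0.0 and m[15] == 1.0


def transform_point(m: Matrix, x: float, y: float) -> Tuple[float, float]:
    # Transform the vector (x, y, 0, 1) and drop the resulting z.
    rx = m[0] * x + m[4] * y + m[12]
    ry = m[1] * x + m[5] * y + m[13]
    rw = m[3] * x + m[7] * y + m[15]
    if rw == 1.0:
        return rx, ry  # fast path: no division needed
    return rx / rw, ry / rw


def transform_rect(m: Matrix, left: float, top: float,
                   right: float, bottom: float) -> Tuple[float, float, float, float]:
    corners = [(left, top), (right, top), (left, bottom), (right, bottom)]
    if is_affine_2d(m):
        # Affine case: two multiply-adds per coordinate, no w computed at all.
        points = [(m[0] * x + m[4] * y + m[12], m[1] * x + m[5] * y + m[13])
                  for x, y in corners]
    else:
        points = [transform_point(m, x, y) for x, y in corners]
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    # Bounding box of the four transformed corners.
    return min(xs), min(ys), max(xs), max(ys)


# Scale by 2, then translate by (10, 20).
scale_translate = [2.0, 0.0, 0.0, 0.0,
                   0.0, 2.0, 0.0, 0.0,
                   0.0, 0.0, 1.0, 0.0,
                   10.0, 20.0, 0.0, 1.0]
print(transform_rect(scale_translate, 0.0, 0.0, 5.0, 5.0))
# (10.0, 20.0, 20.0, 30.0)
```

Most UI transforms (translation, scale, rotation) are affine, so a cheap check up front lets the common case avoid work that only perspective transforms need.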
N/2–1 fewer missed frames on iPhone X/Xs scrolling
contributors: liyuqian, chinmaygarde, gaaclarke
- https://github.com/flutter/engine/pull/12385
- PR 12385: Reland “Smooth out iOS irregular input events delivery”
- Reduced the worst-case number of missed frames from N/2 to 1 in a scroll of N frames. In practice, N/10 frames were missed on average before the fix.
- Fixed one of our top-voted issues: Issue 31086: Scroll performance significantly degraded on iPhone X, Xs devices due to irregular input events delivery
15% faster engine start and shutdown with parallel initialization
contributors: gaaclarke, chinmaygarde, liyuqian
- PR 10182: Made flutter startup faster by allowing initialization to be parallelized
- 1.16x speedup (3829377 ns to 3286713 ns) in BM_ShellInitializationAndShutdown
14.57ms faster startup and 8MB smaller memory usage for shader warm-up
contributors: gaaclarke, liyuqian, dnfield
- PR 36482: Sped up shader warmup by only drawing on a 100x100 surface
- Saves 14.57ms (18.848ms to 4.279ms) in reading/converting pixels at startup
- Saves 8MB (39220KB to 31184KB) in start-median memory
- Saves 4MB (45034KB to 40980KB) in end-median memory
1.02%-8.04% reduction in code size
contributors: johnniwinther, aartbik, rmacnak-google, jensjoha, alexmarkov, mkustermann
- https://dart-review.googlesource.com/c/sdk/+/118280
- https://dart-review.googlesource.com/c/sdk/+/112758
- https://dart-review.googlesource.com/c/sdk/+/118181
- -8.04% (5.57MB to 5.13MB) in armv8 animation_bench_instructions_size
- -2.7% (2.10MB to 2.05MB) in armv7 flutter_gallery_readonlydata_size
- -1.22% (2.10MB to 2.05MB) in armv7 layout_bench_instructions_size
Up-to-2x increase for Flutter on Fuchsia FPS; improved frame scheduling
contributors: dreveman, amott, rosswang, mikejurka
- https://fuchsia-review.googlesource.com/c/topaz/+/280230
- https://fuchsia-review.googlesource.com/c/topaz/+/286735
- https://fuchsia-review.googlesource.com/c/topaz/+/300135
- https://fuchsia-review.googlesource.com/c/topaz/+/306773
- https://fuchsia-review.googlesource.com/c/topaz/+/306772
- https://fuchsia-review.googlesource.com/c/topaz/+/307953
Quantified regression fixes
3x speedup for BackdropFilter on iOS
contributors: lhkbob, liyuqian, flar
- https://skia-review.googlesource.com/c/skia/+/237904
- https://skia-review.googlesource.com/c/skia/+/234413
- https://github.com/flutter/flutter/pull/38814
- 3x speedup (110ms to 34ms) in GM_savelayer_with_backdrop
- Fixed regression https://github.com/flutter/flutter/issues/36064
Some of the largest improvements (3x, for example) owe as much to how poor the previous performance was as to the hard work done in Q3 (July-Sept) 2019. We also classified some improvements as non-trivial fixes of equally large regressions. We appreciate that work all the same: without it, the poor performance and the regressions would remain. We also don’t want the big improvements to overshadow the smaller ones. The smaller ones simply started from a better baseline, which in some sense is a good thing.
Other improvements
- DevTools supports variable display refresh rates (e.g. 60 FPS, 120 FPS, etc.)
- The VS Code plugin’s scanning for projects is now asynchronous, which should improve extension activation speed and reduce the chances of triggering VS Code’s “extension causes high CPU” warning. (#1840/#2003/#1961)
- iPhone Xs is added to Flutter device lab for benchmarking
Conclusion
Thanks to these contributions from our community, the proportion of users positively satisfied with Flutter’s mobile performance increased from 85% in Q3 2019 to 92% in 2020. Despite our best efforts, some performance contributions from Q3-Q4 2019 may have been missed in this update. Please don’t hesitate to let us know of any missing contributions, and we’ll include them in the next update.