Performance in Flutter: What is it? How is it measured? How is it interpreted? How is it improved?
Once I was watching Star Wars, and Yoda said, “If you want to control your performance in Flutter, measure it, you must.”
Now, seriously: Flutter has excellent performance, but sometimes, through development errors, we accidentally fall into practices that hurt the performance of our app.
We can perceive this when we notice latency while elements are being drawn on the screen, but in many other circumstances the problem is not perceptible to the human eye. For example, if your app drains more battery than expected, takes up more memory than expected, or behaves strangely before some animations, it is an indication that you should measure the performance of the app.
So let’s navigate this deep topic 🚤. I hope you like this article. Don’t forget to leave us some +50 applause at the end, and comments if you have any concerns.
What is performance?
We can understand performance as a set of quantifiable properties of an executable. In our context, it is not the execution of the action itself but how well it is performed.
I emphasize quantifiable because it is fundamental to have a number that represents the ability to execute the task. For example, the final size of your app is a measure you can express as a number, so if a new feature makes it go from 120 MB to 200 MB, it is a clear indication that something may be wrong and that you may have opportunities for improvement.
Therefore, all elements to measure performance must be quantifiable.
Why is it essential?
We must measure performance automatically to ensure that our applications give our customers the best possible experience. In most cases, a performance problem in an app is caused by the implementation, not by the SDK.
However, not everything is perceptible to the human eye. For example, FPS (frames per second) measures the number of frames that a device renders in one second. The goal of an app is to run at a minimum of 60 FPS. However, motion already looks continuous to the human eye at around 25 FPS, so it is very difficult to judge good performance using only our senses. A bad rendering time implies unnecessary work for the GPU, and this has an impact on the battery.
Therefore, if we do not measure our performance, we will dedicate ourselves to putting out fires instead of guaranteeing the quality of our developments before going to production.
Items to keep in mind
There are five items you should have on your radar when measuring performance:
- You must make metrics easy to understand 📈. Always reduce metrics to numbers, so developers can tell whether a change is an improvement or a regression in performance.
- Make metrics specific and unambiguous 🧐. By this, I mean that you should establish a unit and a standard for each measurement. For example, all our apps should run at a minimum of 60 FPS (there’s nothing ambiguous about that, and it leaves no room for developer interpretation).
- Make the performance easy to compare. For example, keeping the old metrics next to the new ones helps identify performance improvements or regressions.
- Make the performance monitoring cover as much of the app as possible, to ensure that nothing is left behind. This way we guarantee the good performance of the app.
- We should not optimize performance until a test shows us that we really have a problem. Sometimes we believe that certain changes will improve performance, but we shouldn’t rush to correct supposed failures until tests confirm that there really is a problem in a certain part of our implementation.
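To make the "easy to compare" point from the list above more concrete, here is a minimal Dart sketch. The `Metric` class, the `compare` function and the 5% tolerance are hypothetical names and values chosen for illustration; they are not part of Flutter or its tooling.

```dart
/// Hypothetical container for one quantified metric (not a Flutter API).
class Metric {
  const Metric(this.name, this.value, this.unit,
      {required this.higherIsBetter});

  final String name;
  final double value;
  final String unit;

  /// true for metrics like FPS, false for metrics like app size.
  final bool higherIsBetter;
}

enum Verdict { improved, unchanged, regressed }

/// Compares [current] against [baseline]. Relative changes within
/// [tolerance] (assumed to be 5% here) are treated as noise.
Verdict compare(Metric baseline, Metric current, {double tolerance = 0.05}) {
  final change = (current.value - baseline.value) / baseline.value;
  if (change.abs() <= tolerance) return Verdict.unchanged;
  final isBetter = baseline.higherIsBetter ? change > 0 : change < 0;
  return isBetter ? Verdict.improved : Verdict.regressed;
}

void main() {
  // The app size example from this article: 120 MB growing to 200 MB.
  const before = Metric('app size', 120, 'MB', higherIsBetter: false);
  const after = Metric('app size', 200, 'MB', higherIsBetter: false);
  print(compare(before, after)); // Verdict.regressed
}
```

Keeping the previous value next to the new one, with an explicit unit and tolerance, is what makes the verdict unambiguous.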
We can separate performance into two categories: time and space. In the time category, we find measurements such as FPS or the time elapsed until the first frame is drawn.
For the space category, we can use metrics such as app size or the memory consumed by a process, among others.
How can we measure performance in our Flutter apps?
Well, we have reached the moment that many of us have been waiting for. It’s time to apply the theory and learn about performance measurement tools.
To begin with, I must mention an important point: Flutter has good performance by default, so if we perceive a problem in our app, it is very probable that the cause is on our side.
To help with this, Flutter has given us the profile mode, a powerful tool to measure our performance and even diagnose the root of our problem. To use it you can execute the following command:
flutter run --profile
Note: This command must be executed on a physical device. Remember that, to execute this command, you must have your cell phone’s certificates and permissions properly configured.
If you have several devices connected, you must choose the one you want (flutter devices lists them, and the -d option followed by a device ID selects one).
When the process finishes you will see something like:
These URLs will take you to two tools. One is an Observatory instance set up on the configured phone:
In this observatory you can see the memory consumption and the highest peaks reached while using the application. However, Google gives you a second tool with which you can evaluate several interesting points, so let’s take a look at the other generated link, which takes us to our well-known developer tools.
These two tools are the ones we will use to measure our performance.
How is it interpreted?
Let’s test these tools with a demo application 🧐:
For this I designed a small example in which I drew ten thousand widgets on screen. In the first scenario I did it with bad code (Low Perfo), and in the second scenario with code that follows best practices (High Perfo).
Also, I will use a screen to test our internet consumption. Finally, I will generate two compilations and compare their sizes.
Let’s start by looking at the two codes. First low_perfo.dart:
As we see in this code, there are several bad practices. For example, we have a Column drawing 10000 widgets, which means that nothing can be shown on screen until Flutter has built and laid out all of these elements. Also, we have a function that returns widgets, and this is also a bad practice. All in all, this solution should perform poorly. I exaggerated the bad implementation a bit to make the problem more obvious.
On the other hand, we have the implementation high_perfo.dart:
In this implementation, we have a ListView.builder that helps us build everything lazily, which means that only the elements that are visible on screen (or about to be) are built. So, the difference from the previous code should be clearly evident, but will it be that way?
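As a rough illustration of the two approaches described above, here is a sketch of an eager and a lazy list. This is not the article's actual low_perfo.dart or high_perfo.dart; the class names `LowPerfoSketch`, `HighPerfoSketch` and `ItemTile` are hypothetical, and it assumes a Flutter version that supports `super.key`.

```dart
import 'package:flutter/material.dart';

// Same order of magnitude as the demo described in this article.
const int itemCount = 10000;

void main() => runApp(const MaterialApp(home: HighPerfoSketch()));

/// Eager version: every child is created up front, even off-screen ones.
class LowPerfoSketch extends StatelessWidget {
  const LowPerfoSketch({super.key});

  // A helper function that returns a widget: the pattern flagged above.
  Widget _buildItem(int index) => Text('Item $index');

  @override
  Widget build(BuildContext context) {
    return Scaffold(
      body: SingleChildScrollView(
        child: Column(
          children: [for (var i = 0; i < itemCount; i++) _buildItem(i)],
        ),
      ),
    );
  }
}

/// A small widget class used instead of a helper function.
class ItemTile extends StatelessWidget {
  const ItemTile({super.key, required this.index});

  final int index;

  @override
  Widget build(BuildContext context) => Text('Item $index');
}

/// Lazy version: itemBuilder only runs for items near the viewport.
class HighPerfoSketch extends StatelessWidget {
  const HighPerfoSketch({super.key});

  @override
  Widget build(BuildContext context) {
    return Scaffold(
      body: ListView.builder(
        itemCount: itemCount,
        itemBuilder: (context, index) => ItemTile(index: index),
      ),
    );
  }
}
```

Swapping `HighPerfoSketch` for `LowPerfoSketch` in `main` and running in profile mode should make the difference visible in the tools discussed in this article.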
With these two codes ready, I started my performance tests:
Let’s first see what our first tool (the observatory) shows us:
We can see in the observatory that memory consumption on the device is high when running the screen that we expected to perform badly. However, if we want to see more details, we must analyze it through the developer tools.
First, let’s look at the performance tab:
Some bar graphs are shown on the screen, but what do they mean?
The UI thread executes the Dart code in the Dart VM. This thread generates the layer tree. Remember that Flutter maintains the widget, element and render object trees at the same time to draw the details on the screen. (I suggest you read the following article if you don’t have this context.) The layer tree contains the elements to draw, but it doesn’t know how to draw them, so it sends its information to the raster thread.
The raster thread takes the layer tree produced by the UI thread and sends the necessary commands to the GPU to draw the elements on the screen.
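The split between UI and raster work can also be observed from code. The sketch below assumes a recent Flutter version (where `WidgetsBinding.instance` is non-nullable) and uses `addTimingsCallback`, which reports a `FrameTiming` for rendered frames. `logFrameTimings` is a hypothetical helper name.

```dart
import 'dart:ui' show FrameTiming;

import 'package:flutter/foundation.dart' show debugPrint;
import 'package:flutter/widgets.dart';

/// Logs how long the UI thread (build) and the raster thread spent on
/// each reported frame.
void logFrameTimings(List<FrameTiming> timings) {
  for (final timing in timings) {
    final uiMs = timing.buildDuration.inMicroseconds / 1000;
    final rasterMs = timing.rasterDuration.inMicroseconds / 1000;
    debugPrint('UI: ${uiMs.toStringAsFixed(1)} ms, '
        'raster: ${rasterMs.toStringAsFixed(1)} ms');
  }
}

void main() {
  WidgetsFlutterBinding.ensureInitialized();
  // Frame timings are delivered in batches while the app is rendering.
  WidgetsBinding.instance.addTimingsCallback(logFrameTimings);
  runApp(const Center(
    child: Text('Hello', textDirection: TextDirection.ltr),
  ));
}
```

As with the other measurements in this article, the numbers are only meaningful when the app runs in profile mode on a physical device.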
Jank (slow frame)
A frame that takes a long time to be drawn on the screen is considered jank. As a standard, any frame that takes more than about 16 ms to finish drawing (on devices running at 60 FPS) is considered jank.
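The 16 ms figure comes from dividing one second by the refresh rate. A minimal plain-Dart sketch of that arithmetic follows; `frameBudgetMs` and `isJank` are hypothetical helper names, not Flutter APIs.

```dart
/// Time available to produce one frame at [refreshRateHz], in milliseconds.
double frameBudgetMs(int refreshRateHz) => 1000 / refreshRateHz;

/// A frame is considered jank when it takes longer than the frame budget.
bool isJank(double frameTimeMs, {int refreshRateHz = 60}) =>
    frameTimeMs > frameBudgetMs(refreshRateHz);

void main() {
  print(frameBudgetMs(60).toStringAsFixed(2)); // 16.67
  print(frameBudgetMs(120).toStringAsFixed(2)); // 8.33
  print(isJank(12)); // false
  print(isJank(513)); // true
}
```

Note that the budget shrinks on devices with higher refresh rates, so a frame that is fine at 60 FPS can still be jank at 120 FPS.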
Frames that perform shader compilation are marked in dark red.
With these concepts in mind, we can understand the graphs displayed by the performance tab. As we can see when drawing on the screen, the high-performance development doesn’t present any anomaly to worry about, since no frame takes more than 17 ms to finish its process.
Now let’s see the low-performance experience:
As we see in this screen, we have several janky (slow) frames. From this, we can deduce that creating the layer tree was costly, so the thread involved stays busy for a long time.
If we click on one of the bars that indicates that there is a Jank, the following information is shown:
We can see a timeline of our events. Although these graphs are not very easy to understand, we can see that building the screen took the longest in the timeline. Also, the Summary tab shows that this frame took 513 ms to process 🤯.
I usually check the Bottom-up tab to understand what’s going on, so let’s take a look:
In the chart we can see a 345 ms delay in drawing the numbers (texts) on the screen, without going into detail on the other processes. This is not normal behavior, as Flutter typically does this efficiently. So where could the problem be? We can apply a filter to hide all the elements related to the core libraries and see if we find any issues in our own code.
As we can see on the screen, the heaviest processes are related to our own code, in this case building the screen and drawing the components of the list. This indicates that we have an error in our code that costs a total of (219 + 219 + 73) ms. This means that the app is using 511 ms of the 513 ms to draw our own code 😂.
You can also enable the Performance Overlay from the developer tools, showing the consumption of the Raster and UI thread on the cell phone.
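The overlay can also be switched on from code through the `showPerformanceOverlay` property of `MaterialApp`. A minimal sketch follows; the `overlayDemoApp` name and the placeholder home screen are made up for illustration.

```dart
import 'package:flutter/material.dart';

// Hypothetical demo app with the performance overlay enabled.
const MaterialApp overlayDemoApp = MaterialApp(
  // Draws the raster and UI thread graphs on top of the app.
  showPerformanceOverlay: true,
  home: Scaffold(
    body: Center(child: Text('Overlay demo')),
  ),
);

void main() => runApp(overlayDemoApp);
```

This is meant for profiling sessions only, so it is worth making sure the flag does not reach a production build.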
Let’s measure CPU consumption.
We can still use this example to analyze the following two tabs. Let’s start with the CPU profiler, which tells us how much computational effort our solution requires.
This functionality involves recording an execution period and will show us the results of the analysis:
We can see that, in the high-performance screen, none of the most expensive calls belong to our own code. In the low-performance screen, on the other hand, the most expensive calls are related to building the screen.
Let’s measure memory consumption.
On the other hand, there is the Memory tab. Let’s see what the analysis shows us:
As we can see, in the low-performance experience the memory consumption rises while the screen is being built and decreases once it has finished. In the high-performance screen, however, there is no evidence of a memory spike when drawing the experience.
This makes it even more evident that we have problems when building the low-performance screen.
Let’s measure internet consumption.
Well, let’s go to the Network tab. This screen allows us to measure response times, data types, and responses, among other interesting elements:
As we can see, this shows us all our network traffic. If we make unnecessary requests, or some of the services have high response times, it will show up in this report.
With the headers on the left and the response on the right, we can validate information that may be relevant to our developments.
Let’s measure storage space consumption
Before going to production, it is good to analyze the size of our applications. For this, we can use the available tools, in this case the App Size tab.
To analyze the size of your application, you can execute one of the following commands:
flutter build apk --analyze-size
flutter build appbundle --analyze-size
flutter build ios --analyze-size
flutter build linux --analyze-size
flutter build macos --analyze-size
flutter build windows --analyze-size
It will generate a JSON file. I recommend you always keep the last JSON generated because, as you will see shortly, you can compare between two analyses.
In this scenario, I generated two reports, one for an app with a video within its assets and the other without it. My idea is that this tab tells us where the increase in the space of our app occurred.
The left image is the report for the application without the video, and the right image is the report for the app with the video. The first analysis section lets us see in detail how our app uses storage space. There is a big difference between the two reports: one app weighs 54 MB and the other 68.3 MB. This indicates that something happened in the last version that increased our app’s weight. But how do we validate what happened? For this we will use the Diff tab:
On the left, the two reports are uploaded. On the right, you can see that the increase in the app’s size comes from a video called video.mov located inside the images folder 🤯.
This tool also lets us see whether our app’s size has decreased. Therefore, performing these analyses before deploying to production is very useful.
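Conceptually, this comparison is a per-entry difference between two size reports. The plain-Dart sketch below illustrates the idea only: it does not read the JSON produced by --analyze-size, `sizeDiff` is a hypothetical helper, and the per-entry sizes are made up (only the 54 MB and 68.3 MB totals and the video.mov name come from the example above).

```dart
/// Returns the entries whose size changed, largest increase first.
List<MapEntry<String, double>> sizeDiff(
  Map<String, double> before,
  Map<String, double> after,
) {
  final deltas = <String, double>{};
  for (final path in {...before.keys, ...after.keys}) {
    final delta = (after[path] ?? 0.0) - (before[path] ?? 0.0);
    if (delta != 0) deltas[path] = delta;
  }
  return deltas.entries.toList()
    ..sort((a, b) => b.value.compareTo(a.value));
}

void main() {
  // Hypothetical breakdown in MB that adds up to 54 MB.
  final withoutVideo = <String, double>{
    'dart code': 10.0,
    'images/logo.png': 0.5,
    'everything else': 43.5,
  };
  // Same app plus the video, adding up to 68.3 MB.
  final withVideo = <String, double>{
    ...withoutVideo,
    'images/video.mov': 14.3,
  };
  for (final entry in sizeDiff(withoutVideo, withVideo)) {
    final sign = entry.value > 0 ? '+' : '';
    print('${entry.key}: $sign${entry.value.toStringAsFixed(1)} MB');
  }
  // prints: images/video.mov: +14.3 MB
}
```

Unchanged entries are dropped, so the only thing left to read is what actually grew or shrank between the two builds.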
How is performance improved?
By now we have analyzed many elements. The only thing left to solve is the following: how do we fix our performance failures?
We must think about what we are doing during our development process and identify the best practices for this type of solution. We should also always check the official documentation.
For example, let’s think about the Low Perfo screen; we obviously have a problem there. As shown in our analysis, the problem is in the drawing of our screen. But what are we drawing there? A long list of widgets.
The official documentation contains a section that tells us how to work with lists in the best possible way (link)(link 2). If we follow the advice provided in these links, it will most likely improve our performance.
Therefore, when we want to improve performance, we must consider what we are doing and check it against the official documentation.
I hope you enjoyed this article. Forgive how long it was, but the level of detail of the subject made it necessary to show several points. If you liked the article, remember to leave +50 👏 as a sign of gratitude.
I am convinced that with this tool you will be able to identify the possible flaws of your app and attack the problems that you find. But … wait! I didn’t tell you that there is a way to measure some of these elements from automatic processes.
Follow us to see the next part of this article where I’ll tell you how we can do it 🧑🏻💻. I also leave you the link to the repository of the example (repo link).