Understanding Long Tasks data in the Real User Monitoring (RUM) world
The Long Tasks API is a new performance API for measuring the responsiveness of an application; it helps developers understand where users have a bad experience on a website. It detects CPU-intensive tasks that block the UI thread for extended periods (longer than 50 milliseconds) and keep other critical work, such as reacting to user input, from running.
I work at Elastic and am one of the core contributors to the Elastic APM Real User Monitoring Agent, which provides detailed performance metrics and error tracking for web applications. We recently added support for capturing Long Tasks from end users with this API, because it does not carry the performance overhead of timer-based approaches and helps users stay within the RAIL performance model budgets.
We can measure the Long Tasks in any application by subscribing to the PerformanceObserver interface.
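As a minimal sketch of that subscription (not the agent's actual code), the snippet below registers a PerformanceObserver for the `longtask` entry type and reduces each entry to a plain object. The helper names `summarizeLongTask` and `observeLongTasks`, and the hand-written `sampleEntry`, are illustrative assumptions; the sample only reuses the 187ms duration and the `child.html` iframe mentioned in this post, and its `name` and `startTime` values are made up.

```javascript
// Reduce a long task entry to the fields that matter for reporting.
function summarizeLongTask(entry) {
  const attribution = (entry.attribution || []).map((a) => ({
    containerType: a.containerType,
    containerSrc: a.containerSrc,
    containerId: a.containerId,
    containerName: a.containerName,
  }));
  return {
    name: entry.name,
    startTime: entry.startTime,
    duration: entry.duration,
    attribution,
  };
}

// Subscribe to long tasks. Returns the observer, or null when the
// environment does not support the 'longtask' entry type.
function observeLongTasks(onLongTask) {
  const supported =
    typeof PerformanceObserver !== 'undefined' &&
    Array.isArray(PerformanceObserver.supportedEntryTypes) &&
    PerformanceObserver.supportedEntryTypes.includes('longtask');
  if (!supported) return null;

  const observer = new PerformanceObserver((list) => {
    for (const entry of list.getEntries()) {
      onLongTask(summarizeLongTask(entry));
    }
  });
  observer.observe({ entryTypes: ['longtask'] });
  return observer;
}

// Hypothetical stand-in for a long task entry (values are illustrative).
const sampleEntry = {
  name: 'same-origin-descendant', // assumed for illustration
  entryType: 'longtask',
  startTime: 1023.4, // made-up value
  duration: 187,
  attribution: [
    {
      name: 'unknown',
      entryType: 'taskattribution',
      containerType: 'iframe',
      containerSrc: 'child.html',
      containerId: '',
    },
  ],
};

observeLongTasks((task) => console.log('long task', task));
console.log(summarizeLongTask(sampleEntry));
```

Checking `supportedEntryTypes` first keeps the script harmless in browsers that do not implement long tasks.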
Looking at the exampleEntry data above, we can see that the browser’s main thread was blocked for 187ms while performing the task. The attribution data identifies the type of work that contributed most to the long task and which browsing container is the culprit. Here, the long task originated from an iframe with src as child.html and name as
The current attribution data does not expose any information on the snippet of code responsible for the long task. As a result, we thought of a solution that combines long task data with the marking of slow application code using User Timing measures, which helps to narrow down the analysis.
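One way to picture this combination is the sketch below, which wraps suspect application code in User Timing marks and then lists the measures that overlap a long task's time window. The helpers `measured` and `measuresDuringLongTask` are hypothetical names chosen for illustration, not part of any library or of the agent.

```javascript
// Run fn and record a User Timing measure named `label` around it.
function measured(label, fn) {
  performance.mark(`${label}:start`);
  try {
    return fn();
  } finally {
    performance.mark(`${label}:end`);
    performance.measure(label, `${label}:start`, `${label}:end`);
  }
}

// Return the measures that overlap the long task window, with the
// overlapping time in milliseconds, largest overlap first.
function measuresDuringLongTask(longTask, measures) {
  const taskStart = longTask.startTime;
  const taskEnd = longTask.startTime + longTask.duration;
  return measures
    .filter((m) => m.startTime < taskEnd && m.startTime + m.duration > taskStart)
    .map((m) => ({
      name: m.name,
      overlap:
        Math.min(taskEnd, m.startTime + m.duration) -
        Math.max(taskStart, m.startTime),
    }))
    .sort((a, b) => b.overlap - a.overlap);
}
```

In a browser, the second argument would typically come from `performance.getEntriesByType('measure')`. The measure with the largest overlap is a reasonable first suspect, although it only narrows the search to code that was explicitly marked.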
We can configure the programmable Sampling Profiler to run at a specific interval (samplingInterval) and collect JS profiles from end users while the application is running.
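A rough sketch of starting such a profiler is shown below. It assumes the `Profiler` constructor and the `sampleInterval` / `maxBufferSize` options from the current JS Self-Profiling API draft, which may differ from the experimental version available when this post was written; `profileFor` is a hypothetical helper name.

```javascript
// Collect a JS profile for roughly durationMs milliseconds.
// Resolves to the collected trace, or null when profiling is unavailable.
async function profileFor(durationMs, sampleInterval = 10) {
  // Assumption: the global Profiler constructor from the JS Self-Profiling draft.
  if (typeof Profiler === 'undefined') return null;

  let profiler;
  try {
    profiler = new Profiler({ sampleInterval, maxBufferSize: 10000 });
  } catch (err) {
    // The constructor can throw, for example when the page is not
    // allowed to use profiling.
    return null;
  }

  await new Promise((resolve) => setTimeout(resolve, durationMs));
  return profiler.stop();
}
```

Returning `null` instead of throwing lets monitoring code degrade quietly for users whose browsers do not expose the profiler.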
From the captured samples, we can generate the relevant call stack information and map it onto the long task duration to find the code that is actually responsible for blocking the UI thread, similar to how Chrome and Firefox dev-tools show trace information under the performance tab. We can also calculate the self-time and total time of specific functions during the long task time frame.
The generated long task flame graph looks something like the one below.
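The self-time and total-time calculation can be sketched as follows. This is an illustration under stated assumptions, not the profiler's actual implementation: it assumes a trace shaped like the JS Self-Profiling draft (`frames`, `stacks` with `frameId` / `parentId`, `samples` with `timestamp` / `stackId`), and it assumes each sample stands for one sampling interval of work. `functionTimesDuringLongTask` is a hypothetical name.

```javascript
// Aggregate self and total time per function name for the samples that
// fall inside the long task window. Each sample is credited with
// sampleInterval milliseconds (an approximation).
function functionTimesDuringLongTask(trace, longTask, sampleInterval) {
  const start = longTask.startTime;
  const end = longTask.startTime + longTask.duration;
  const times = new Map(); // function name -> { self, total }

  const bump = (name, key) => {
    const t = times.get(name) || { self: 0, total: 0 };
    t[key] += sampleInterval;
    times.set(name, t);
  };

  for (const sample of trace.samples) {
    if (sample.stackId === undefined) continue; // no JS running at this sample
    if (sample.timestamp < start || sample.timestamp > end) continue;

    const seen = new Set(); // avoid double-counting recursive frames
    let stack = trace.stacks[sample.stackId];
    let isLeaf = true;
    while (stack) {
      const name = trace.frames[stack.frameId].name;
      if (isLeaf) {
        bump(name, 'self'); // the innermost frame was executing
        isLeaf = false;
      }
      if (!seen.has(name)) {
        bump(name, 'total'); // every frame on the stack was active
        seen.add(name);
      }
      stack = stack.parentId === undefined ? undefined : trace.stacks[stack.parentId];
    }
  }
  return times;
}
```

Functions with high self time are the ones doing the blocking work themselves, while high total time with low self time points at callers further up the stack.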
How to run the profiler
1. Start Chrome with the flag --enable-blink-features=ExperimentalJSProfiler. You can also head to chrome://flags/#enable-experimental-web-platform-features and enable it.
2. Copy the snippet of code below and paste it inside script tags in the head of any web page. The code must be placed in the head so that it starts observing long tasks as early as possible, because the buffered flag is not yet supported for long tasks.
3. Reload the web page and check your dev-tools console for the link to the generated trace report.
4. You can use the Chrome local overrides feature to insert the script into any website.
Example traces for some of the websites
Profiler code is available here https://github.com/vigneshshanmugam/rum-profiler
In addition to capturing the traces for long tasks, we can correlate the call stack information for all User Timing data.
An example trace from Netflix.com, which includes React hydration time recorded using the performance measure API.
That’s it. If you have any feedback, reach out to me on Twitter (_vigneshh)