Understanding Long Tasks data in the Real User Monitoring (RUM) world
Long Tasks is a new performance metric API that can be used to measure the responsiveness of an application and helps developers understand where users get a bad experience on a website. It detects CPU-intensive tasks that block the UI thread for extended periods (longer than 50 milliseconds) and keep other critical tasks, such as reacting to user input, from being executed.
Background
I work at Elastic and am one of the core contributors to the Elastic APM Real User Monitoring Agent, which provides detailed performance metrics and error tracking for web applications. We recently added support for capturing Long Tasks from end users with this API, since it does not carry the performance penalties of timer-based approaches and helps users stay within the RAIL performance model budgets.
We can measure the Long Tasks in any application by subscribing to the PerformanceObserver interface.
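A minimal sketch of such a subscription follows. The helper names (summarizeLongTaskEntry, observeLongTasks) are hypothetical, and the exampleEntry object is an illustrative value whose startTime and name are assumed, not captured from a real page. The sketch subscribes only when the browser reports support for the longtask entry type.

```javascript
// Hypothetical helper: keep only the long task fields discussed in this post.
function summarizeLongTaskEntry(entry) {
  const attribution = (entry.attribution || [])[0] || {};
  return {
    name: entry.name,
    startTime: entry.startTime,
    duration: entry.duration,
    containerType: attribution.containerType,
    containerSrc: attribution.containerSrc,
    containerName: attribution.containerName,
  };
}

// Hypothetical helper: subscribe to long tasks where the entry type is supported.
// Returns the observer, or null when long tasks cannot be observed.
function observeLongTasks(onLongTask) {
  if (typeof PerformanceObserver === 'undefined') return null;
  const supported = PerformanceObserver.supportedEntryTypes || [];
  if (!supported.includes('longtask')) return null;

  const observer = new PerformanceObserver((list) => {
    for (const entry of list.getEntries()) {
      onLongTask(summarizeLongTaskEntry(entry));
    }
  });
  observer.observe({ entryTypes: ['longtask'] });
  return observer;
}

// Illustrative entry only: startTime and name are assumed values.
const exampleEntry = {
  name: 'same-origin-descendant',
  entryType: 'longtask',
  startTime: 1023.4,
  duration: 187,
  attribution: [
    {
      name: 'unknown',
      entryType: 'taskattribution',
      startTime: 0,
      duration: 0,
      containerType: 'iframe',
      containerSrc: 'child.html',
      containerName: 'child1',
      containerId: '',
    },
  ],
};

observeLongTasks((task) => console.log('long task', task));
```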
Looking at the exampleEntry data above, we can see that the browser’s main thread was blocked for 187ms while performing this task. From the attribution data, we can identify the type of work that contributed significantly to the long task, as well as the culprit browsing container responsible for that work. Here, the long task originated from an iframe with src child.html and name child1.
The current attribution data does not expose any information on the snippet of code responsible for the long task. As a result, we thought of a solution that combines long task data with the marking of slow application code using User Timing measures, which helps to narrow down the analysis.
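As a rough sketch of that idea, suspect code can be wrapped in User Timing marks and measures, and the resulting measures can then be matched against a long task by checking for overlapping time ranges. The helper names (measureSync, measuresDuringLongTask) and the sample values are hypothetical and are not taken from the Elastic agent.

```javascript
// Hypothetical helper: wrap a synchronous function in a User Timing measure.
function measureSync(name, fn) {
  performance.mark(name + ':start');
  try {
    return fn();
  } finally {
    performance.mark(name + ':end');
    performance.measure(name, name + ':start', name + ':end');
  }
}

// Hypothetical helper: return the measures whose time range overlaps the long task.
function measuresDuringLongTask(longTask, measures) {
  const taskEnd = longTask.startTime + longTask.duration;
  return measures.filter((measure) => {
    const measureEnd = measure.startTime + measure.duration;
    return measure.startTime < taskEnd && measureEnd > longTask.startTime;
  });
}

// In a browser, the measures would come from
// performance.getEntriesByType('measure'); these values are made up.
const longTask = { startTime: 1000, duration: 187 };
const measures = [
  { name: 'render-list', startTime: 1010, duration: 150 },
  { name: 'fetch-config', startTime: 200, duration: 40 },
];
const culprits = measuresDuringLongTask(longTask, measures);
```

Here only render-list overlaps the long task window, so it is the first place to look.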
A while ago, I was playing around with the JavaScript Self-Profiling API proposal and used it to print stack traces to the dev-tools console while the application was loading.
We can configure the programmable Sampling Profiler to run at specific intervals (samplingInterval) to collect the JS profile from the end users on the running application.
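A minimal sketch of starting and stopping the profiler is shown below. It assumes the constructor-based shape from the current specification draft, new Profiler({ sampleInterval, maxBufferSize }), where the sampling interval is passed as sampleInterval. This may differ from what was exposed behind the flag in earlier Chrome versions. The helper name profileWhile and the buffer size are hypothetical.

```javascript
// Hypothetical helper: profile the given work if the experimental Profiler
// API is exposed; otherwise just run the work and resolve to null.
async function profileWhile(work, sampleInterval = 10) {
  if (typeof Profiler === 'undefined') {
    await work();
    return null;
  }

  // Assumed options from the spec draft: interval in milliseconds and
  // the maximum number of samples to keep in the buffer.
  const profiler = new Profiler({ sampleInterval, maxBufferSize: 10000 });
  await work();

  // Resolves to a trace object with frames, stacks, samples and resources.
  return profiler.stop();
}
```

The returned trace (or null in unsupported browsers) can then be handed to whatever code builds the call stacks.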
From the captured samples, we can generate the relevant call stack information and map it to the existing long task duration to find the true culprit code responsible for blocking the UI thread, similar to how the Chrome and Firefox dev-tools show trace information under the Performance tab. Besides that, we can also calculate the self-time and total time of specific functions during the long task time frame.
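The mapping can be sketched roughly as follows, assuming the trace format from the specification draft (frames, stacks linked by parentId, and samples carrying a timestamp and an optional stackId). Each sample is approximated as representing one sampling interval of time, and functions are keyed by name only, so this is an estimate rather than the exact logic used in the profiler. The function name and the sample trace are hypothetical.

```javascript
// Hypothetical helper: estimate self and total time per function for the
// samples that fall inside the long task window.
function summarizeLongTaskSamples(trace, taskStart, taskEnd, sampleInterval) {
  const stats = new Map();

  for (const sample of trace.samples) {
    if (sample.timestamp < taskStart || sample.timestamp > taskEnd) continue;
    if (sample.stackId === undefined) continue; // no JS stack for this sample

    const seen = new Set(); // count recursion once per sample for total time
    let isLeaf = true;
    let stackId = sample.stackId;

    // Walk from the leaf frame up to the root of the call stack.
    while (stackId !== undefined) {
      const stack = trace.stacks[stackId];
      const name = trace.frames[stack.frameId].name || '(anonymous)';
      if (!stats.has(name)) stats.set(name, { selfTime: 0, totalTime: 0 });
      const entry = stats.get(name);

      if (isLeaf) entry.selfTime += sampleInterval;
      if (!seen.has(name)) {
        entry.totalTime += sampleInterval;
        seen.add(name);
      }
      isLeaf = false;
      stackId = stack.parentId;
    }
  }
  return stats;
}

// Made-up trace: main calls render and parse.
const trace = {
  frames: [{ name: 'main' }, { name: 'render' }, { name: 'parse' }],
  stacks: [{ frameId: 0 }, { frameId: 1, parentId: 0 }, { frameId: 2, parentId: 0 }],
  samples: [
    { timestamp: 90, stackId: 0 },
    { timestamp: 100, stackId: 1 },
    { timestamp: 110, stackId: 1 },
    { timestamp: 120, stackId: 2 },
    { timestamp: 130 },
    { timestamp: 300, stackId: 1 },
  ],
};
const stats = summarizeLongTaskSamples(trace, 100, 287, 10);
```

In this made-up trace, render accounts for most of the self-time inside the long task window, while main only contributes total time through its callees.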
The generated long task flame graph would look something like the one below.
Demo — https://rum-profiler.now.sh/demo
How to run the profiler
The JavaScript Self-Profiling API is experimental. It is available only from Chrome 78 onwards, behind the flag --enable-blink-features=ExperimentalJSProfiler. You can also head to chrome://flags/#enable-experimental-web-platform-features and enable it.
1. Copy the snippet below and paste it inside script tags in the head of any web page. The code should be placed in the head so that it starts observing long tasks as early as possible, because the buffered flag is not yet supported for long tasks.
2. Reload the web page and check your dev-tools console for the link to the generated trace report.
3. You can use the Chrome local overrides feature to insert the script into any website.
Example traces for some of the websites
The profiler code is available at https://github.com/vigneshshanmugam/rum-profiler
More opportunities
In addition to capturing the traces for long tasks, we can correlate the call stack information for all User Timing data.
An example trace from Netflix.com, which includes the React hydration time recorded using the performance.measure API.
That’s it. If you have any feedback, reach out to me on Twitter (_vigneshh)
Some links…
- The buffered flag for long tasks will be available from Chrome 81 onwards
- Elastic APM Real User Monitoring Agent
- Long Task API
- JavaScript Self-Profiling API
Finally, thanks to Brandon Morelli for proofreading and to Brian Vaughn for open-sourcing react-flame-graph, which helped immensely.