Enhancing Web App Performance Measurements

David Dvora
AppsFlyer Engineering
Apr 17, 2023 · 8 min read

When measuring the performance of your web applications, performance analyzers such as Lighthouse can be used to obtain industry-standard performance scores. But is this approach sufficient for accurately measuring performance?

In this article, I will show you how to measure your application’s true performance.

Performance is Relative

As developers, we are accustomed to working with small data sets, fast machines, and quick network connections. Many of our customers, however, don’t have those kinds of machines or connections.

Furthermore, if you have an idea for how to improve your application, it’s essential to measure its performance before and after the change. Without doing so, you can’t be confident that you’ve made any improvement.

Testing on a few machines or a staging environment is likely insufficient to assess the impact of your changes. And asking your customers to run Lighthouse on their machines and send you the performance report isn’t a realistic option either.

Experiencing Your App as Your Customers Do

Knowing all of this, wouldn’t it be nice to measure performance directly from your clients’ browsers? Below, I’ll explain how you can do exactly that and better analyze your app’s real performance.

Making Magic on Your Own Might Not Be the Best Idea

One option you can choose is to build an in-house “performance measurements collector” yourself, based on native performance APIs. These types of APIs provide you with insights such as detailed measurements of every resource you download (scripts, XHR requests, images, and more), and you can use them to compute web vitals as well.

These measurements can be collected from your clients’ browsers, sent to a server, aggregated, and visualized in a user-friendly interface.
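To give a sense of what this involves, here is a minimal sketch of such a collector, built on the standard PerformanceObserver API. The /perf-metrics endpoint and the batching strategy are hypothetical:

```typescript
// Minimal in-house collector sketch: batch resource timings and flush
// them to a (hypothetical) backend endpoint when the page is hidden.
const buffer: object[] = [];

const observer = new PerformanceObserver((list) => {
  for (const entry of list.getEntries() as PerformanceResourceTiming[]) {
    buffer.push({
      url: entry.name,                  // the resource URL
      type: entry.initiatorType,        // script, img, fetch, xmlhttprequest...
      duration: entry.duration,         // load time in milliseconds
      transferSize: entry.transferSize, // bytes transferred over the network
    });
  }
});
observer.observe({ type: 'resource', buffered: true });

document.addEventListener('visibilitychange', () => {
  if (document.visibilityState === 'hidden' && buffer.length > 0) {
    // sendBeacon keeps working while the page is being unloaded.
    navigator.sendBeacon('/perf-metrics', JSON.stringify(buffer));
    buffer.length = 0;
  }
});
```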

However, this might be a significant undertaking that could divert attention from the company’s primary focus. It involves addressing issues such as high traffic, computing resources, and ongoing maintenance efforts.

Are There Ways to Simplify Things?

Measuring frontend performance is a common challenge, and there are several solutions in the market that target this issue. Popular tools for this task include Sentry, Dynatrace, New Relic, and Datadog RUM.

At AppsFlyer, we chose Datadog.

Datadog RUM (Real User Monitoring) measures the user experience of your website from the perspective of your clients. The measured KPIs include web vitals, page loading times, resource performance (files and requests to server), and JavaScript errors.

To embed Datadog RUM, your app needs to execute a small snippet, which then collects measurements and sends them to Datadog. The collected data can be visualized in neat dashboards, and can also be used for monitoring.
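For reference, embedding the snippet via the npm package looks roughly like this (option names vary slightly between SDK versions, so treat this as a sketch rather than copy-paste configuration):

```typescript
import { datadogRum } from '@datadog/browser-rum';

datadogRum.init({
  applicationId: '<DATADOG_APPLICATION_ID>',
  clientToken: '<DATADOG_CLIENT_TOKEN>',
  site: 'datadoghq.com',
  service: 'my-frontend-page', // placeholder service name
  env: 'production',
  version: '1.0.0',            // becomes filterable in dashboards
  sessionSampleRate: 100,      // lower this to reduce cost (see the pricing note later)
  trackResources: true,        // per-resource timing measurements
  trackLongTasks: true,        // long task analysis
  trackUserInteractions: true, // user actions
});
```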

Default Measurements

After traffic reaches Datadog, you can examine it using the default dashboards. The default performance dashboard allows you to view core web vitals, resource loading times, and long task analysis. You can also analyze XHR request performance from actual customers.

Default Performance Dashboard

For the first time, we were able to see the TRUE state of our frontend.

We used the Default Sessions dashboard, which allows you to see how many visitors each page receives and how much time they spend on each page. Using this information, you can understand which pages receive the most traffic, and which get the most unique visitors. This can help you prioritize which pages to improve next.

User Sessions Default Dashboard
Page Ranking Based on Traffic Volume, Taken from Session Data

User experience is not just a matter of speed. Errors thrown by your program can also negatively affect your users’ experience. Datadog RUM provides a default dashboard that displays these errors, with an option to view the stack traces.

A useful monitoring feature alerts us whenever a new type of error — one that has never been seen before — is detected.

Example: Time Series of JavaScript Error Count (with Drill-Down Option)
Example — Detailed Error with a Stack Trace Pointing to the Error’s Origin

After covering these basic measurements, I want to now discuss how AppsFlyer took this one step further.

Connecting the Dots

AppsFlyer’s Frontend Measurement Challenge

The AppsFlyer codebase contains hundreds of repositories, including dozens of frontend pages owned by multiple teams. These pages are deployed and operate independently of one another. Once loaded into the browser, the frontend code sends multiple requests to several microservices to gather the required data for the pages to operate.

AppsFlyer takes advantage of the ability to filter resource measurements by page name. This gives page owners a sense of ownership, allowing them to focus on improving their own micro-frontend’s performance without the noise of other teams’ pages. Reducing this noise is critical for developer engagement in performance improvement efforts. After all, would you rather spend your precious time improving other teams’ products instead of your own? I’m guessing not. :)
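One way to enable that kind of per-page filtering (a sketch; the exact setup depends on your SDK version and how your micro-frontends are deployed) is to attach identifying context to every RUM event a page emits:

```typescript
import { datadogRum } from '@datadog/browser-rum';

// Tag every event from this micro-frontend with its page and owning team.
// The property names ('page_name', 'team') and their values are our own
// illustrative choices here, not something Datadog prescribes.
datadogRum.setGlobalContextProperty('page_name', 'dashboard-overview');
datadogRum.setGlobalContextProperty('team', 'growth');
```

On older SDK versions, addRumGlobalContext served the same purpose. Another option is to give each micro-frontend its own service name at init time and scope dashboards by service.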

The next view shows the slowest and fastest pages, and is used for internal benchmarking and improved visibility. Needless to say, this view has made some developers more aware and engaged, and some began prioritizing performance tasks after realizing that their pages were at the bottom of the performant pages list.

Fastest Loading Pages on the Right; Slowest on the Left.

The following two widgets assess the performance of a single AppsFlyer frontend page from our customers’ perspective. Here, we measure how long fetch requests take to complete.

Performance Assessment of a Single AppsFlyer Server, as Experienced by the Frontend Page

The top graph shows the p75 latency of fetch requests to our servers, while the bottom one shows the same measurement for third-party servers.

Separating the traffic into two widgets helped us understand whether a performance issue originated from our servers or one of the third-party services we use. The top widget was the one we used to validate a performance improvement of one of our servers recently.
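The split itself is just two widgets over the same measure with complementary filters. Roughly (the facets come from Datadog’s RUM resource events; the hostname pattern is a placeholder):

```
Our servers:    @type:resource @resource.type:fetch @resource.url_host:*.example.com   → p75 of @resource.duration
Third parties:  @type:resource @resource.type:fetch -@resource.url_host:*.example.com  → p75 of @resource.duration
```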

Validating Our Server’s Performance

In the chart above, you can see the beautiful drop at around 17:50. That was the proof we needed that our clients were truly benefiting from the change we implemented.

By examining these kinds of measurements, we can confidently verify whether we are moving in the right direction. These crucial insights serve as the green light for further improvements.

Mix and Match

One of Datadog’s most useful features is its ability to create custom dashboards tailored to specific needs. One way you can take advantage of this is to place two widgets side by side to look for correlations.

Combining Git commit tagging and service (page) filtering has given us a powerful visualization tool. This technique allows us to quickly identify a commit that introduced a sudden change in page behavior, such as bundle size or page loading time. Once we’ve identified the problematic commit, we can make necessary code changes to resolve the issue.
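A sketch of how such commit tagging can work (our internal setup differs in its details): inject the current Git SHA at build time and report it as the RUM version, so every event becomes attributable to a commit.

```typescript
// webpack.config.js (sketch): COMMIT_SHA is assumed to be provided by CI.
const webpack = require('webpack');

module.exports = {
  plugins: [
    new webpack.DefinePlugin({
      // Exposes a compile-time constant that application code can read.
      __COMMIT_SHA__: JSON.stringify(process.env.COMMIT_SHA ?? 'local-dev'),
    }),
  ],
};
```

The app then passes __COMMIT_SHA__ as the version option in datadogRum.init(...), and dashboard widgets can be grouped or filtered by that version to pinpoint the offending commit.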

A Sudden Increase in Bundle Size
Actual Git Commit Introducing the Size Increase

One possible improvement in this example is code splitting; a subsequent decrease in page load times will tell you whether it helped. Still, it’s possible to get the wrong impression about a change you made. For example, reducing your bundle size by 100 KB may not have the impact on most of your customers that you initially expected, if the majority of them have fast networks and machines.
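For the code-splitting option, a minimal sketch looks like this ('./heavy-charts' stands in for any large dependency that isn’t needed on initial render):

```typescript
// Before: the heavy dependency is statically imported, inflating the main bundle.
// After (sketch): load it on demand, keeping the initial download small.
async function renderReport(container: HTMLElement): Promise<void> {
  // Dynamic import() tells the bundler to emit a separate chunk,
  // fetched only when this code path actually runs.
  const { drawChart } = await import('./heavy-charts');
  drawChart(container);
}
```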

We recently made an improvement that I had been anticipating for a long time. We reduced the size of static assets downloaded to browsers by a few hundred kilobytes. We created a performance widget to understand the impact of this change on every page. The numbers we obtained proved that our effort to improve performance was not in vain.

Here you can see the effect of such a performance improvement on the time it takes the DOM to reach its complete state (domComplete) in actual customers’ browsers.

Before and After the Bundle Reduction Change: 2X Improvement in DOM Complete Time

Performance is an Ongoing Effort

I think it’s worth mentioning that not everything was easy peasy, and that the struggle was real. For example, we needed to write a custom Webpack wrapper plugin to upload and correctly tag the source maps used for Datadog error monitoring, which was tricky to debug. And Datadog’s source mapping is by no means perfect.
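The shape of such a wrapper is roughly the following (a sketch, not our actual plugin; it shells out to Datadog’s datadog-ci CLI after the build is emitted, and the service name, CDN prefix, and COMMIT_SHA variable are placeholders):

```typescript
// A sketch of a Webpack plugin that uploads source maps to Datadog after emit.
// Assumes DATADOG_API_KEY is set in the environment for datadog-ci.
const { execSync } = require('child_process');

class DatadogSourcemapsPlugin {
  apply(compiler: any): void {
    compiler.hooks.afterEmit.tap('DatadogSourcemapsPlugin', () => {
      execSync(
        'npx datadog-ci sourcemaps upload ./dist ' +
          '--service=my-frontend-page ' +
          `--release-version=${process.env.COMMIT_SHA ?? 'local-dev'} ` +
          '--minified-path-prefix=https://cdn.example.com/assets/',
        { stdio: 'inherit' },
      );
    });
  }
}

module.exports = DatadogSourcemapsPlugin;
```

The important detail is that --release-version must match the version reported by the RUM snippet, otherwise errors won’t be matched to the right source maps.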

We also experienced issues with missing measurements for frontend pages that use Apollo Client (a known limitation that was eventually solved). In addition, we required an extra header manipulation on our CDN server to support asset size reporting.

So Should You Use Datadog in Your Company?

Overall, Datadog has met our main goals and is a valuable tool that gives us the insights we need. Datadog RUM is a paid solution, which may not suit every company’s budget. However, pricing is per session, and you can decrease the sample rate to reduce costs. I recommend this tool to organizations that understand the value of performance visibility as both a decision-making tool and a means of gaining frontend insights. In other words — only buy it if you’re going to utilize it!

Next Steps

Once we established a baseline for frontend measurements, we started building monitoring on top of it. For example, we set up an alert that triggers when a page starts loading more slowly than usual, and another that triggers when the error rate rises above normal.
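As an illustration, a slow-page alert can compare page loading time against a threshold over a rolling window. Treat this as a sketch only: the exact RUM monitor query syntax depends on your Datadog setup, and the service name and threshold are placeholders.

```
rum("@type:view @service:dashboard-overview")
  .rollup("avg", "@view.loading_time")
  .last("15m") > 5000000000
```

Note that RUM event durations are reported in nanoseconds, hence the large-looking threshold (five seconds).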

We also enriched our dashboard with these insightful widgets:

- This widget provides insights into the errors that a page receives from servers, grouped by HTTP status codes.
- This widget displays the heaviest resources on the page, providing immediate action items.
- This widget displays the server error rate, as experienced by your frontend page.
- This widget displays the percentile distribution for DOM completion.

We use our dashboard data for reporting to management, providing actual numbers (hopefully good ones!) for the performance features we’ve worked on. This helps justify future performance work.

Conclusion

Tools like Datadog RUM provide this kind of data and play an important role in shaping the way customers experience our products. Personally, understanding the true state of our frontend performance motivates me to take action, and I hope that I have motivated you to better understand your app’s performance as well.
