Published in Teads Engineering

The most accurate way to schedule a function in a web browser

TL;DR

Introduction

Hello and welcome!

I’m part of the team that develops the JavaScript SDK, a.k.a. the format, used to display curated ads on publishers’ pages (essentially news articles / editorial content). We use the setTimeout function at various locations in our code base. Some of the setTimeout callbacks are pretty critical, in the sense that delaying them by a few milliseconds could significantly impact business KPIs.

Since our script is integrated in thousands of contexts, we are exposed to various behaviors when it comes to using this function, such as:

As you may know, the timeout value provided to setTimeout is a theoretical one. The callback function is not guaranteed to be executed exactly X milliseconds later, as we’ll see in this article. Over the years, we discovered other ways to call a function after some timeout, involving the requestAnimationFrame function or a Web Worker.

This led us to do an experiment to answer the following question:

What is the most accurate way to schedule a function/callback in a web browser?

Below are the results of the analysis we made. We chose to test various scenarios with an arbitrary 250ms timeout value. Bear in mind that this analysis focuses on a single setTimeout call. We are not taking into account recursive calls to achieve frequent operations, such as animations.

Article sections

1 Methodology
2 Technical context
3 Detailed analysis
4 A few takeaways
5 Conclusion

1 — Methodology

We thought about 3 strategies to set a timeout: a plain setTimeout, a requestAnimationFrame loop, and a setTimeout run inside a Web Worker.

Each of these strategies was run twice: inside and outside the viewport, giving us a total of 6 scenarios to test. For a period of approximately 3 days, we exposed a share of the users (visiting a page containing our script) to these scenarios, and logged messages containing timing information.

On the left, the ad element is outside the viewport / the device screen, referred to as “out of viewport” in this article. On the right, a part of the ad element is inside the viewport, referred to as “in viewport” in this article.

We collected hundreds of millions of logs on BigQuery, then we created a dataset that we analyzed using Data Studio. If you’re interested in knowing how our analytics stack works, feel free to read our Give meaning to 100 billion analytics events a day article.

Out of these logs, 1 million came from environments where IntersectionObserver, Web Worker or requestAnimationFrame were not available, or raised an error. The remaining logs were equally divided among the 6 scenarios.

Scenario 1 — setTimeout

These were the easiest scenarios to test. We used the following simplified code snippet:
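A minimal sketch of such a measurement could look like the following. The names `measureSetTimeout` and `onDone` are hypothetical, chosen for illustration rather than taken from our code base:

```javascript
// Clock helper: performance.now when available, Date.now otherwise.
function now() {
  return typeof performance !== 'undefined' && typeof performance.now === 'function'
    ? performance.now()
    : Date.now();
}

// Hypothetical name. Schedules `onDone` after `timeout` ms and reports the
// effective (measured) delay next to the theoretical one.
function measureSetTimeout(timeout, onDone) {
  const start = now();
  setTimeout(function () {
    onDone({ theoretical: timeout, effective: now() - start });
  }, timeout);
}

// Usage: log how long a 250ms timeout really took.
measureSetTimeout(250, function (result) {
  console.log('effective timeout (ms):', result.effective);
});
```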

For the “outside the viewport” scenario, we called this function as soon as possible. For the “inside the viewport” one, we used an IntersectionObserver to trigger the function when the DOM element containing the ad first entered the viewport. For the 4 other scenarios below, we used the same logic to trigger in/out viewport cases. The now function is a helper using performance.now when available, or Date.now otherwise.
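The “inside the viewport” trigger described above could be sketched as follows. `runWhenFirstInViewport` is a hypothetical helper name, and the default IntersectionObserver options are an assumption:

```javascript
// Hypothetical helper: runs `callback` once, the first time `element`
// intersects the viewport, then stops observing.
function runWhenFirstInViewport(element, callback) {
  const observer = new IntersectionObserver(function (entries) {
    const visible = entries.some(function (entry) {
      return entry.isIntersecting;
    });
    if (visible) {
      observer.disconnect();
      callback();
    }
  });
  observer.observe(element);
  return observer;
}
```

The “outside the viewport” case needs no observer: the timing function is simply called right away.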

Scenario 2 — requestAnimationFrame

Here’s a simplified version of the code used to perform a timeout using the requestAnimationFrame function:

The idea was to make multiple iterations until reaching — or going slightly over — the timeout value.
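One possible shape for such a loop is sketched below. It is an assumption about the general approach, not a copy of the implementation we measured: the name `rafTimeout` is hypothetical, and this version compares clock time on every frame.

```javascript
// Clock helper: performance.now when available, Date.now otherwise.
function now() {
  return typeof performance !== 'undefined' && typeof performance.now === 'function'
    ? performance.now()
    : Date.now();
}

// Hypothetical name. Calls `callback` on the first animation frame at which
// at least `timeout` ms have elapsed since rafTimeout was called.
function rafTimeout(timeout, callback) {
  const start = now();
  function tick() {
    const elapsed = now() - start;
    if (elapsed >= timeout) {
      callback(elapsed);
    } else {
      requestAnimationFrame(tick);
    }
  }
  requestAnimationFrame(tick);
}
```

Because the check only happens once per frame, the effective delay depends directly on how often the browser fires animation frames in the current context.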

Scenario 3 — setTimeout in a Web Worker

This was the most complex scenario to put in place. Here’s a simplified version of the code we used:

This function was composed of 2 parts:

The createBlob function is a helper that uses Blob, or BlobBuilder when Blob is not available or its instantiation throws an exception.
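A minimal sketch of this approach, under a few assumptions: `workerTimeout` and `WORKER_SOURCE` are hypothetical names, `createBlob` is reduced to the plain Blob case (no BlobBuilder fallback), and a new worker is created on every call for brevity.

```javascript
// Part 1: code executed inside the worker. It waits for the delay received
// in the message, then notifies the page.
const WORKER_SOURCE =
  'self.onmessage = function (event) {' +
  '  setTimeout(function () { self.postMessage("done"); }, event.data);' +
  '};';

// Simplified stand-in for the createBlob helper (no BlobBuilder fallback).
function createBlob(source) {
  return new Blob([source], { type: 'application/javascript' });
}

// Part 2: page-side code. Hypothetical name; runs `callback` after
// `timeout` ms, with the timer itself living in the worker.
function workerTimeout(timeout, callback) {
  const url = URL.createObjectURL(createBlob(WORKER_SOURCE));
  const worker = new Worker(url);
  worker.onmessage = function () {
    worker.terminate();
    URL.revokeObjectURL(url);
    callback();
  };
  worker.postMessage(timeout);
}
```

Given the initialization cost discussed in section 3.4, a long-lived worker is usually preferable to the one-worker-per-call shortcut used here.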

2 — Technical context

Now let’s see in which environments we collected our logs.

OS and browser distribution

During this experiment, roughly 80% of the logs were collected on mobile devices.

Distribution of the logs we collected, by operating system.

When we look at the browsers, the top 5 account for more than 84% of all the logs, and 4/5 of them are mobile browsers. We consider “Facebook App” — in other words, Facebook webview — as a browser in our system, since it’s not a regular Android/iOS webview. Also, the “Google App” browser refers to the Google application, on iOS only. This app is detected as “Android webview” on Android devices by our user agent parser library.

Distribution of the logs we collected, by browser.

If you still have any doubts about the “mobile-first” design trends, then let me tell you something:

For the record, these were the main OS and browser versions used at the time of this experiment:

Operating Systems            | Browsers
-----------------------------|-----------------------------
Android 10, 9, 8.1, 8 and 7  | Chrome mobile 85 and 86
Windows 10                   | Chrome 85 and 86
iOS 13.6, 13.7 and 14        | Microsoft Edge 85 and 86
Mac OS 10.15                 | Firefox 81
                             | Safari mobile 13.1 and 14
                             | Safari 13.1
                             | Mobile Samsung Browser 12.1

Frame type distribution

Our script is used to display ads in specific slots available on the publishers' pages. Some of the websites are created using AMP (more on that later). Some of the slots are isolated from the rest of the page in a SafeFrame (a cross-origin iframe following a protocol defined by the Interactive Advertising Bureau, or IAB).

Sometimes, e.g. when we are integrated through Prebid, the slot is set in a friendly iframe with a fixed size. We may also be integrated in friendly iframes with the possibility to “move out” from it, to find a slot in the main article in the top frame.

This enabled us to run the 6 scenarios in different frame types for free:

Distribution of the logs we collected, by frame type.

Although the logs are not evenly distributed among the frame types, given the total number of logs we collected, it’s safe to say the results we are sharing in this article are pretty accurate, even for the “cross-origin iframe” case.

A word on AMP

AMP (Accelerated Mobile Pages) is a web component framework aimed at creating user-friendly websites. Since the user experience is at the core of AMP, it changes the behavior of some standard JavaScript functions, including setTimeout, which can heavily impact the performance of the page.

Our script is used to display ads, so publishers use the amp-ad component to integrate Teads on their AMP pages. This means that when a setTimeout function is called outside the viewport, 1 second is de facto added to our timeout values.

setTimeout(f, 250) in an amp-ad component, outside the viewport. The AMP framework adds 1s to the timeout value set by the developer (cf. the source code for third-party environments on the ampproject/amphtml repository).

Why am I mentioning AMP? Because it’s part of the contexts we have to deal with on a daily basis, and it has an impact on the distribution of effective timeout values as shown in the next section.

3 — Detailed analysis

We chose to share all the relevant angles of analysis we explored during our experiment. It’s a bit dense so you can jump to a specific analysis below, or directly go to the Takeaways:
3.1 - Distribution of effective timeout values
3.2 - Timeout percentiles by scenario, frame type and viewport
3.3 - Median by main browsers, frame type and viewport
3.4 - Web Worker initialization
3.5 - [bonus] requestAnimationFrame average timer duration

3.1 — Distribution of effective timeout values

In order to display a chart with the distribution of the effective timeout values, we aggregated the values inside buckets of 20ms. For example, the 0151–0170 bucket contains all the logs where the effective timeout was between 151ms and 170ms.
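The bucketing can be sketched as below. The function names are hypothetical, and rounding each value to the nearest millisecond before bucketing is an assumption. The bucket boundaries are aligned so that 151 to 170, 231 to 250 and 1251 to 1270 each form one bucket, matching the labels used in this section.

```javascript
// Left-pads a non-negative integer with zeros to 4 digits: 151 -> "0151".
function pad4(n) {
  return String(n).padStart(4, '0');
}

// Hypothetical helper: returns the label of the 20ms-wide bucket that
// contains an effective timeout value (in ms).
function bucketLabel(effectiveMs) {
  const value = Math.max(0, Math.round(effectiveMs)); // assumption: round to 1ms
  const lower = Math.floor((value - 11) / 20) * 20 + 11;
  const upper = lower + 19;
  // '\u2013' is the en dash used in the chart labels; the first bucket is
  // clamped so that it starts at 0.
  return pad4(Math.max(0, lower)) + '\u2013' + pad4(upper);
}

// Counts how many values fall into each bucket.
function countByBucket(values) {
  const counts = {};
  values.forEach(function (v) {
    const label = bucketLabel(v);
    counts[label] = (counts[label] || 0) + 1;
  });
  return counts;
}
```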

Distribution of the effective timeout values, by scenario type.
+--------------------------+-----------+-------------+-----------+
| Scenario                 | Mode (ms) | Median (ms) | Mean (ms) |
+--------------------------+-----------+-------------+-----------+
| setTimeout               | 251       | 321         | 685       |
| requestAnimationFrame    | 242       | 247         | 5241      |
| setTimeout in Web Worker | 251       | 287         | 510       |
+--------------------------+-----------+-------------+-----------+

Overall, the chart looks like a right-skewed distribution.

The requestAnimationFrame distribution is wider than the other 2, with the mode at a lower bucket value: 231–250. We can see a significant number of logs below the 250 theoretical value as well. In addition, its tail expands the farthest to the right. At this point, it’s safe to assume this solution is the least accurate of all 3.

There is a bump at 1251–1270 for the setTimeout scenarios. This is due to AMP which adds 1s to the timeout value, as explained in a previous section.

3.2 — Timeout percentiles by scenario, frame type and viewport

Effective timeout percentiles 10 to 90 by scenario, inside/outside the viewport and frame type.

This is perhaps the most important chart of the analysis. Here, we can see the percentiles 10 to 90 for every combination of:

In case you didn’t know, a percentile is a measure used in statistics indicating the value below which a given percentage of values — in a set of values — falls. For example, the 30th percentile (or percentile 30) is the value below which 30% of the values may be found. We already saw a special percentile in the previous section: the 50th one, also known as the median.

Percentiles are great to answer the question “What is the highest value found in X% of this (sorted) set of values?”. They are also used to eliminate the extreme values in our right-skewed distribution: by choosing the 90th percentile as the “top value”, we decided to ignore the last 10% of the set that contains extremely high values, which we considered irrelevant.
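For readers who want to compute percentiles themselves, here is a small sketch using the nearest-rank method. This is one common definition among several and not necessarily the one applied by our analytics tooling; the sample values are made up for illustration and are not measurements.

```javascript
// Nearest-rank percentile: the smallest value such that at least p% of the
// values are less than or equal to it.
function percentile(values, p) {
  if (values.length === 0) return undefined;
  const sorted = values.slice().sort(function (a, b) { return a - b; });
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100));
  return sorted[rank - 1];
}

// Made-up sample, not real data.
const sample = [100, 30, 70, 10, 50, 90, 20, 60, 40, 80];
console.log(percentile(sample, 30)); // 30
console.log(percentile(sample, 50)); // 50 (the median under this definition)
console.log(percentile(sample, 90)); // 90
```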

As an example, let’s take a look at the “setTimeout (in, top)” combination.

Percentiles for the “setTimeout (in, top)” combination.

We can make the following statement: for 30% of the visitors, the effective timeout while using a setTimeout in the top window inside the viewport was at most 256ms. This means that at least 30% of the users could run the function within 256ms (close to the theoretical value of 250ms). This also means that 70% of the users ran the function after 256ms or more, which is less neat.

This chart also shows that requestAnimationFrame is the least accurate scenario type as we assumed in the previous section, given the implementation we shared in the Methodology.

Percentiles for the “requestAnimationFrame (in, top)” combination.

At least 60% of the visitors executed the function before reaching the theoretical timeout of 250ms. But wait, there is more!

Percentiles for the “requestAnimationFrame (out, x-origin)” combination.

This is by far the worst combination of scenario type, viewport and frame type. Only ~10% of the users could run the function after a timeout close to 250ms. For the rest, this timeout could reach several seconds or even minutes for higher percentiles. These high values have an explanation though, more on that in the next section.

Thankfully, setTimeout in a Web Worker was there to save the day. In particular, the use of a cross-origin iframe seemed to offer the best timeout values.

Percentiles for the “setTimeout WebWorker (in, x-origin)” combination.

So far, the setTimeout in a Web Worker scenarios look promising.

3.3 — Median by main browsers, frame type and viewport

Next in our analysis, we wanted to know the influence of web browsers for each of these scenarios. We know there are some optimization strategies that involve throttling the execution of JS functions when the code is run outside the viewport and/or inside an iframe, particularly on mobile devices. But how much can these mechanisms affect our timeout function? We chose to focus on the median value (50th percentile).

Median value of effective timeouts by main browsers, broken down by frame type and viewport.

All scenarios included, we can see on this chart that the median is higher on cross-origin iframes whose code is run outside the viewport (dark blue bar), and the best frame type and viewport combination seems to be the cross-origin iframe inside the viewport (green bar). But is this true when we look at each scenario individually?

setTimeout
Let’s take the same chart, but this time we’re only using the logs from the setTimeout scenarios.

Median value of effective timeouts by main browsers, broken down by frame type and viewport, for the setTimeout scenarios.

Here, we can split the list of browsers into 2 groups:

We can put Safari browsers and Google App in the first group (“Apple browsers”), whilst Chromium browsers (Chrome, Microsoft Edge, Android webviews) go in the second group. Facebook App is in between since it’s used both on iOS and Android devices. As for Firefox, it shows some impact on top and friendly frames outside the viewport, but none on cross-origin iframes.

requestAnimationFrame
Now let’s see what happens with the logs from the requestAnimationFrame scenarios.

Median value of effective timeouts by main browsers, broken down by frame type and viewport, for the requestAnimationFrame scenarios.

We are using a logarithmic scale since some values are really high compared to the rest. We can see several things:

setTimeout in a Web Worker
Last but not least, let’s see how the chart looks when we filter only the logs from the setTimeout in a Web Worker scenarios.

Median value of effective timeouts by main browsers, broken down by frame type and viewport, for the setTimeout in a Web Worker scenarios.

These results look pretty sexy! Overall, cross-origin iframes seem to be the best type of frames to use the setTimeout in a Web Worker solution, at least for 50% of the visitors (we used the median value in these charts).

3.4 — Web Worker initialization

At this point, I think you’re starting to realize that the setTimeout in a Web Worker scenarios are clearly the winners in our little experiment. However, using Web Workers comes at a price…

The first constraint is related to Content-Security-Policies (CSP). If the website uses the worker-src CSP directive, then the source of the script that instantiates a Web Worker must be authorized in this directive, otherwise it won’t work.

Another cost we haven’t talked about yet is the Web Worker initialization. Communication with a Worker is made possible by the postMessage API. This function is pretty fast, once the Worker is ready. This is where our next analysis comes into play: how much time does a Web Worker take to get ready?

Here’s how we computed the initialization time (cf. the code snippet from the Methodology):

As you can see, we wait for the first postMessage response to consider the Web Worker as ready.
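A sketch of this measurement, under stated assumptions: `measureWorkerInit` is a hypothetical name, the clock is started right before the worker is instantiated, and the worker script is assumed to reply to any message it receives.

```javascript
// Clock helper: performance.now when available, Date.now otherwise.
function now() {
  return typeof performance !== 'undefined' && typeof performance.now === 'function'
    ? performance.now()
    : Date.now();
}

// Hypothetical helper. `createWorker` returns a Worker whose script answers
// every message; `onReady` receives the worker and the init duration in ms.
function measureWorkerInit(createWorker, onReady) {
  const start = now();
  const worker = createWorker();
  worker.onmessage = function () {
    // The first response marks the worker as ready.
    worker.onmessage = null;
    onReady(worker, now() - start);
  };
  worker.postMessage('ping');
}
```

Passing the worker to `onReady` lets the caller keep using the same instance once it is ready.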

Median by main browsers, OS, frame type and viewport
For this first chart, we are going to use the median value (50th percentile). Let’s see how many milliseconds it takes the Web Worker to respond to the first message.

Median value of Web Worker init duration by main browsers, broken down by frame type and viewport.

If we filter the logs on Apple browsers only (known to have the best JS performance), we can see that the initialization duration ranges from less than 10ms to 80ms.

Median value of Web Worker init duration by main browsers on Apple browsers, broken down by frame type and viewport.

Now, if we exclude both iOS and Mac OS, and check on the other operating systems, we get the following results:

Median value of Web Worker init duration by main browsers on non-Apple browsers, broken down by frame type and viewport.

This time, the initialization duration ranges from 25ms to 800ms in the worst case. Notice that desktop browsers have better performance than Android browsers. This is confirmed when we look at the main operating systems:

Median value of Web Worker init duration by main operating systems, broken down by frame type and viewport.

We can see that iOS offers the best timings while Android has the worst ones.

All browsers and operating systems included, we get the lowest initialization durations in cross-origin iframes. In some browsers, the duration is so low that we can consider the creation of a Web Worker as an operation with “no cost”. However, in the majority of cases, the duration is not negligible.

Until now, we focused on the median value. Let’s have a look at the other percentiles.

Percentiles by main browsers

Percentiles of Web Worker init duration by main browsers.

This chart shows that in some contexts (Android particularly), a Web Worker creation takes from ~30ms to more than 2s. This indicates that it’s better to create a single instance of Web Worker for the whole application lifespan rather than creating an instance for each setTimeout call.
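One way to share a single instance is to tag each request with an id and dispatch responses on the page side. The sketch below is illustrative only: `createWorkerScheduler`, `SCHEDULER_WORKER_SOURCE` and the message shape are assumptions, not the code we run in production.

```javascript
// Worker script: posts back the id of each request once its delay elapsed.
const SCHEDULER_WORKER_SOURCE =
  'self.onmessage = function (event) {' +
  '  var id = event.data.id;' +
  '  setTimeout(function () { self.postMessage(id); }, event.data.delay);' +
  '};';

// Hypothetical factory: wraps ONE worker (anything exposing postMessage and
// onmessage) and multiplexes any number of timeouts over it.
function createWorkerScheduler(worker) {
  const pending = new Map();
  let nextId = 1;
  worker.onmessage = function (event) {
    const callback = pending.get(event.data);
    if (callback) {
      pending.delete(event.data);
      callback();
    }
  };
  return function schedule(callback, delay) {
    const id = nextId++;
    pending.set(id, callback);
    worker.postMessage({ id: id, delay: delay });
    return id;
  };
}

// Browser usage (the worker is created once, then reused):
//   const url = URL.createObjectURL(new Blob([SCHEDULER_WORKER_SOURCE]));
//   const schedule = createWorkerScheduler(new Worker(url));
//   schedule(function () { console.log('about 250ms later'); }, 250);
```

With this design, the initialization cost is paid once, and every later call only costs a postMessage round trip.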

Percentiles by frame type and viewport

Percentiles of Web Worker init duration by frame type and viewport.

Earlier in this article, we learned that the setTimeout in a Web Worker scenarios were giving the best performance in cross-origin iframes. Remember that the initialization duration was not included in that effective timeout value. On this chart, we can see that this type of frame is also the most appropriate to initialize the Web Worker instance, with a duration ranging from 8ms to 670ms, versus the others where the initialization can last up to several seconds for the 90th percentile.

Using a Web Worker comes at the price of its initialization. This step is “negligible” on Apple devices, but it’s really significant on Android devices. As seen in the Technical context, Apple devices represented approximately 30% of all the contexts on which we delivered ads during this experiment. In other words, the best contexts are not the most used, so we can’t ignore this cost. Creating a Web Worker is an investment: the more you use this single instance over the course of the application lifespan, the more its cost shrinks.

3.5 — [bonus] requestAnimationFrame average timer duration

This is not really part of the experiment, but since we were logging messages for the requestAnimationFrame scenarios, we took the opportunity to log the average duration between 2 iterations. In the best conditions, requestAnimationFrame can be used to create smooth animations (i.e. at 60 FPS, or 60 iterations per second). However, how often are we in these “best conditions”?
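To relate the durations in this section to frame rates: 60 FPS corresponds to roughly 16.7ms between 2 iterations, and a duration of 220ms corresponds to about 4.5 FPS. The helpers below sketch that arithmetic; their names are hypothetical and the timestamps are made up for illustration.

```javascript
// Average duration (ms) between consecutive frame timestamps.
function averageFrameDuration(timestamps) {
  if (timestamps.length < 2) return undefined;
  const total = timestamps[timestamps.length - 1] - timestamps[0];
  return total / (timestamps.length - 1);
}

// Converts an average frame duration in ms to frames per second.
function toFps(frameDurationMs) {
  return 1000 / frameDurationMs;
}

// Made-up timestamps (ms), roughly one frame every 16.7ms.
const avg = averageFrameDuration([0, 16, 33, 50]);
console.log(avg.toFixed(1));        // "16.7"
console.log(toFps(avg).toFixed(0)); // "60"
console.log(toFps(220).toFixed(1)); // "4.5"
```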

Median by main browsers, frame type and viewport
For this first chart, we are going to use the median value of average duration between 2 iterations.

Median of average duration between 2 frames, by main browsers, broken down by frame type and viewport.

The first thing we notice is that we are below 30 FPS when using requestAnimationFrame outside the viewport, and it gets even worse in iframes. This is not really surprising since browsers strongly limit the frame rate of requestAnimationFrame when the frame is not visible in the viewport, to improve the performance of the page. When this function is used inside the viewport, we get better FPS, particularly on desktop and iOS browsers. Android is used on a wide range of devices, which means low-cost devices that offer limited performance are also taken into account in this analysis.

For 50% of the visitors, as long as requestAnimationFrame is used inside the viewport, we get between ~15 FPS and ~60 FPS, depending on the browser and the frame type. Let’s check out the other percentiles.

Percentiles by main browsers
The viewport is clearly important when it comes to the performance of requestAnimationFrame. This is why we decided to split this analysis into “inside the viewport” and “outside the viewport”.

We ran this function on a wide variety of websites. Some of them were probably “lite” while others could have been intense in terms of CPU / memory / network usage. Obviously, if your website is rather lite, it should offer better FPS than a website with a very busy main thread.

Inside the viewport

Percentiles of average duration between 2 frames, by main browsers, inside the viewport.

We can see several things on this chart:

Outside the viewport

Percentiles of average duration between 2 frames, by main browsers, outside the viewport.

When it comes to running requestAnimationFrame outside the viewport, we can see that until the ~60th percentile, Firefox has the best performance, but then Chromium desktop browsers show better results: Microsoft Edge and Chrome don’t go over 220ms (4.5 FPS) on the 90th percentile. For the rest, especially on mobile devices, we get closer to (or even below) 1–2 FPS the more we go up the percentiles.

The requestAnimationFrame function was introduced to create animations in the web browsers. Here, we used it to call a function after some timeout value. It’s not surprising that browsers chose to heavily limit the performance of this function outside the viewport: there’s no point in making smooth animations if the user cannot see them on their screens.

Nonetheless, requestAnimationFrame could be interesting to set a timeout if:

In this case, we recommend using a more robust implementation than ours since we got a significant number of logs before even reaching the timeout.

4 — A few takeaways

Safari suspends the execution of requestAnimationFrame for both friendly and cross-origin iframes outside the viewport, as long as there hasn’t been any user interaction with the frame. Chromium browsers only suspend the cross-origin iframes outside the viewport, while Firefox doesn’t suspend anything. This is why we got better results on the latter for the requestAnimationFrame scenarios.

Due to the wide variety of devices on Android, the instantiation of a Web Worker has a significant cost, and more generally, any JS execution is more costly on this OS than on any other platform. We encourage you to keep this in mind while developing a JS product for users on Android devices: don’t assume “it’s 2020, everyone’s got high-end devices”.

This article is relevant only for single timeouts, not for timeouts called recursively (e.g. for animations), because some browsers increase the timeout value after the first iterations, e.g. Safari when the code is executed in an iframe outside the viewport.

Safari 14.0 on Mac OS 10.14.6: after 5 iterations, the 250ms timeout is increased until reaching the 1s value. This happens for timeouts greater than 1s as well, on both friendly and cross-origin iframes, outside the viewport only.

5 — Conclusion

We conducted an experiment at scale, in a very wide variety of contexts. Since our code base contains some critical use of timeouts, we wanted to know what would be the most accurate way to schedule these functions.

The setTimeout function is okay, with an effective timeout value ranging from 251ms to 1.66s at the 90th percentile, for a 250ms theoretical timeout. Using setTimeout in a cross-origin iframe seemed to be the worst case. On Safari browsers, the value ranged from 251ms to 664ms, making it the most effective browser to use setTimeout (and JS code execution in general).

The use of requestAnimationFrame was the least accurate one, given our implementation shared in the Methodology section. The callback function could be called before reaching the timeout value, or worse, several minutes later in some conditions (iframe integrations outside the viewport, essentially). However, it’s worth noting that when this function is called inside the viewport, it offers the lowest 90th percentile value. It could therefore be a good fallback option if a Web Worker cannot be used, provided a more accurate implementation than ours is used.

The last one, setTimeout in a Web Worker, seemed to offer the best results. More particularly, using this solution inside a cross-origin iframe seemed to be the most accurate way of calling a function after some timeout. There is a cost, but once the Web Worker is set up, the communication with postMessage is blazing fast.

This experiment was focused on a single timeout call. As stated in the takeaways, in some contexts, calling a timeout function recursively (e.g. for making animations) can increase the delay after the first iterations.

As of October 2020, the big winner is setTimeout in Web Worker, in a cross-origin iframe!

Benoit Ruiz

Software Engineer at Datadog. Twitter: @Haaress