How Airbnb migrated from Webpack to Metro and made the development feedback loop nearly instantaneous, the largest production build 50% faster, with marginal end-user runtime improvements.
By: Rae Liu
What is Metro?
- Resolution deals with how to resolve import/require statements.
These three concepts are the fundamental building blocks to understand how Metro works. In the following sections, we highlight the key architectural differences between Metro and Webpack to provide deeper context into Metro’s strengths.
Key architectural differences between Metro and Webpack
Process JS bundles on demand in development
Figure 1: Differences between the JS bundle development setups for Metro and Webpack
In this example, there is a web project with three entry points: entryPageA.js, entryPageB.js, and entryPageC.js. A developer makes changes to Page A, which includes only the entryPageA.js bundle. As you can see in Figure 1, in both scenarios, the browser loads Page A (1), then requests the entryPageA.js file from the bundler (2), and finally the bundler responds to the browser with the appropriate bundles (4). With the Webpack bundler (1a), even though the browser only requests entryPageA.js, Webpack compiles all entry points on start-up before it can respond to the entryPageA.js request from the browser. On the other hand, with the Metro bundler (1b), we see that the development server does not spend any time compiling entryPageB.js or entryPageC.js, instead only compiling entryPageA.js before responding to the browser request.
Another powerful Metro feature we leverage is its multi-layered caching feature, which makes setting up both persistent and non-persistent caches straightforward. While Webpack 5 also comes with a disk persistent cache, it isn’t as flexible as Metro’s multi-layered cache. Webpack offers two distinct cache types: “filesystem” or “memory”, which is limited to memory or disk cache, no remote cache capability is possible. In comparison, Metro provides more flexibility by allowing us to define the cache implementation, including mixing different types of cache layers. If a layer has a cache miss, Metro attempts to retrieve the cache from the next layer and so on.
Figure 2: How Airbnb configures the multi-cache layers with Metro
The ordering of the caches determines the cache priority. When retrieving a cache, the first cache layer with a result will be used. In the setup illustrated in Figure 2, the fastest in-memory cache layer is prioritized at the top, followed by the file/disk cache, and lastly the remote read-only cache. Compared with the default Metro implementation without a cache, hitting a remote read-only cache resulted in a 56% faster server build in a project compiling 22k files.
One contributing factor to Metro’s performance is its built-in worker support which amplifies the effect of the multi-layer cache. While Webpack requires careful configuration to leverage workers via a third-party plugin, Metro by default spins up workers to offload expensive transforms, allowing for increased parallelization without configuration.
But why use a remote read-only cache instead of a regular remote cache (read & write)? We discovered that not writing to the remote cache saved an additional 17% build time in development for the same project with 22k files. Writing to the remote cache incurs network calls that can be costly, especially on a slower network. To populate the cache, instead of remote cache writes, we introduced a CI job that runs periodically on the default branch commit.
In the bundler context, serialization means combining the transformed source files into one or multiple bundles. In Webpack, the concept of serialization is encapsulated in the compilation hooks (Webpack’s public APIs). In Metro, a serializer function is responsible for combining source files into bundles.
For one example of the importance of serialization, let’s take a look at Internationalization support. We currently support Airbnb websites in around 70 locales, and in 2020, our internationalization platform served more than 1 million pieces of content. To support internationalization with JS bundles, we need to implement specific logic in the serialization step. Although we had to implement similar internationalization logic when serializing bundles for both Metro and Webpack, Webpack required lots of source code reading to find the appropriate compilation hooks for us to implement the support. On top of all that, it also required understanding the intricacies of concepts like what dependency templates are and how to write our own. Comparatively, it is a breath of fresh air to implement the same internationalization support with Metro. We only have to focus on how to serialize JS bundles with translation content and all the tasks are done in the single serializer function. The simplicity of Metro’s bundling concepts makes implementing any bespoke feature straightforward.
Challenges of Adopting Metro at Airbnb
In development, we had to create a Metro server with custom endpoints to handle building dependency graphs, translation, bundling JS & CSS files, and building source maps. For production builds, we ran Metro as a Node API to handle resolution, transformation, and serialization.
The surface area of the full migration was substantial, so we broke it down into two phases. Because the slow iteration speed of our Webpack setup incurred significant costs around developer productivity, we addressed the slow Webpack development experience with the Metro development server as our first priority. In the second phase, we brought Metro to feature parity with Webpack and ran an A/B test between Metro and Webpack in production. The two biggest challenges we faced along the way are outlined below.
The out-of-the-box Metro setup for development produced giant ~5MiB bundles per entry point, since a single bundle is the intended use case for React Native. For the Web, this bundle size was taxing on browser resources and network latency. Every code change resulted in a 5MiB bundle being processed and downloaded, which was inefficient and could not be HTTP-cached. Even if the changed code recompiled instantly, we still needed to reduce the size and improve browser cacheability.
To improve the performance of Metro in the Web environment, we split the bundles by dynamic import boundaries, a technique also known as code splitting. The code splitting boundaries enabled us to leverage HTTP caching effectively.
In Figure 3, import(‘./file’) represents the dynamic import boundaries. The bundle on the left hand side (3a) is broken down to three smaller bundles on the right (3b). The additional bundles are requested when the import(‘./file’) statements are executed.
In Figure 3a, suppose fileA.js has changed, the entire bundle needs to be re-downloaded for the browser to pick up the change in fileA.js. With bundles split by dynamic import illustrated in Figure 3b, a change in fileA.js only results in re-downloading of the fileA.js bundle. The rest of the bundles can reuse browser cache.
Figure 3: Splitting bundles by dynamic import boundaries. A bundle is represented by the rectangular boxes with a pink background.
When we began to think about production bundles, we wanted to optimize a bit differently than in development. It takes time to run the bundle splitting algorithm, and we didn’t want to waste time on optimizing bundle sizes in development. Instead, we prioritized the page load performance over minimizing bundle sizes.
Bundle splitting alone decreased bundle sizes significantly, however we were able to make bundles even smaller by deleting dead code. However, it is not always obvious to identify what is considered dead code in a project, as some “dead code” in a project may be “used code” in the other projects. This is where tree-shaking came into play. It relied on the consistent usages of ECMAScript modules (ESM) import/export statements in the code base. Based on the import/export usages in a project, we analyzed what specific export statements were not imported by any file in the project. Finally, the bundler removes the unused export statements, making the overall bundle sizes smaller.
One challenge we faced while implementing the tree-shaking algorithm for Metro production builds was the risk of mistakenly removing code that is executed at runtime. For example, we ran into multiple bugs related to re-export statements. Since Webpack handles ESM import/export statements in a different way, there was no comparable prior art for reference. After multiple iterations of tree-shaking algorithm implementation, the following table captures how much dead code we were finally able to drop given the project size.
The Metro migration brought forth some very significant improvements. The biggest Airbnb frontend project compiling ~48k modules (including server and browser compilations) saw a drop in the average build time by ~55% from 30.5 minutes to 13.8 minutes. Additionally, we saw improvements on the Airbnb Page Performance Scores with the pages built by Metro, ranging around +1%. The end user performance improvement was a nice surprise, as we initially aimed for achieving neutral experiment results.
The simplicity of Metro’s architecture has benefited us in many ways. Engineers from other teams have ramped up quickly to contribute to Airbnb’s Metro implementation, which means there is a lower barrier to entry for contributing to the bundling system. The multi-layered cache system is straightforward to work with, making experimentation with caching possible. The bespoke bundler feature integrations are made obvious and easier to implement.
Thank you everyone who has contributed to this multi-year project. We couldn’t have done it without any of you! Special shoutout to my lovely team Michael James and Noah Sugarman for driving the Metro production migration to the finish line. Thank you Brie Bunge, Dan Beam, Ian Myers, Ian Remmel, Joe Lencioni, Madison Capps, Michael James, Noah Sugarman for reviewing and giving great feedback on this blog post.
All product names, logos, and brands are property of their respective owners. All company, product and service names used in this website are for identification purposes only. Use of these names, logos, and brands does not imply endorsement.