Speed up source-map generation with WebAssembly: Google Summer of Code 2018

Try out the new WebAssembly-powered dependency for webpack: 60% faster with the source-map devtool and 20% faster with the cheap-source-map devtool.

Introduction

WebAssembly, or WASM, is a new technology that lets compiled code run faster in web and server applications. In this project, webpack-sources, one of webpack's core packages, and its dependencies source-list-map and source-map are rewritten in Rust and compiled to a WebAssembly binary to shorten webpack's bundling time. The new package is designed to be API-compatible with the previous JavaScript version, so you can substitute it for the old package directly, without any modifications to webpack or its plugins.

Although it has not yet been released as a default dependency of webpack, you can easily try it out with webpack-cli. See the GitHub project for more information.

The project repository and npm package:

Performance Benchmark

Our package wasm-webpack-sources brings a significant performance improvement when building a large project with the source-map and cheap-source-map devtools. For the other test items, including a normal build (without a devtool), performance is on par with the original package.

Benchmark project on GitHub

Project Details

With the help of wasm-bindgen and Rust, WebAssembly development is much more comfortable than starting from scratch. However, building a fully API-compatible WebAssembly module that replaces the existing JavaScript module while also keeping performance in mind is still no piece of cake. To achieve our goal, we made several architectural design decisions in this project.

wasm-bindgen

WebAssembly is powerful, yet it requires developers to program at a low level and to handle tedious issues such as memory management themselves. These issues also include how to pass data to functions in a WebAssembly instance and how to get their return values back.

wasm-bindgen is a Rust crate that facilitates high-level interactions between WebAssembly modules and JavaScript. In short, wasm-bindgen is a tool that creates JavaScript wrappers for your WebAssembly binary, so that developers can use the binary by simply loading the wrappers, without worrying about how to access the binary or how to transfer data between the WebAssembly instance and JavaScript. It can also create high-level JavaScript wrappers, such as class definitions, from a WebAssembly binary that consists only of functions, which is very helpful.

Module Dependencies

As mentioned above, our target webpack-sources has two major dependencies: source-list-map and source-map. They are maintained by different teams, webpack and Mozilla, and published on npm as separate packages. In our project, for performance reasons, these three packages are merged into one WebAssembly binary and released as a single large npm package.

When it comes to transferring data between a WebAssembly instance and JavaScript, developers can create a shared memory and exchange data by reading and writing it. However, communicating between different WebAssembly instances is not that easy. If we want to pass data from instance A to instance B, we need to copy the data from A's shared memory to B's shared memory, which costs performance. To avoid the copy, we pack everything into one binary so that all modules share the same memory. Thus, when passing data, we only need to pass the memory address, and functions in different modules can then access the data at that address directly.
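The effect of keeping everything in one binary can be sketched in plain Rust. The modules list_map and source_map below, and their functions, are hypothetical stand-ins rather than the project's real code. The sketch only shows that, inside a single binary, handing a string to another module passes an address and a length rather than a copy.

```rust
// Hypothetical stand-ins for two formerly separate packages that now live in
// one binary and therefore share one memory.
mod list_map {
    /// Produces some generated code, owned by the caller.
    pub fn generate() -> String {
        String::from("var a = 1;\nvar b = 2;\n")
    }
}

mod source_map {
    /// Receives a `&str`, which is just an address plus a length. Returns the
    /// number of lines and the address it was given, so that the caller can
    /// check that no copy was made on the way in.
    pub fn inspect(code: &str) -> (usize, *const u8) {
        (code.lines().count(), code.as_ptr())
    }
}

/// Passes the generated code from one module to the other and reports the
/// line count and whether both modules saw the same address.
pub fn pass_between_modules() -> (usize, bool) {
    let code = list_map::generate();
    let (lines, seen_address) = source_map::inspect(&code);
    (lines, seen_address == code.as_ptr())
}
```

Calling pass_between_modules() reports that both modules worked on the same address. With two separate WebAssembly instances, the same hand-off would require copying the bytes from one memory into the other.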

Additional JavaScript Wrapper

In this project, we wrote an additional JavaScript wrapper that works together with the wrapper created by wasm-bindgen to link JavaScript and WebAssembly. Even though wasm-bindgen creates useful wrappers for our binary, they are too simple on their own to achieve our final goal: API compatibility.

Some common JavaScript tricks are not that easy to implement in WebAssembly. For example, a function may take one object as its argument and perform different tasks depending on whether a certain key exists in that object. This is straightforward in JavaScript, since JavaScript doesn't require developers to specify the fields of an object when defining function parameters. Things are different in Rust, a statically typed language. Although we could pass a JSON string and decode it in WebAssembly, the encoding and decoding are not cheap. To overcome this issue, we split the different tasks into different functions in WebAssembly, and let the JavaScript wrapper inspect the polymorphic argument to decide which function to call.
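One way to picture this split on the Rust side is shown below. The type GeneratedChunks and the functions add_plain and add_mapped are hypothetical names chosen for illustration, not the package's actual exports, and the wasm-bindgen annotations are omitted so that the sketch compiles as plain Rust. Each function has a fixed, statically typed signature, so no JSON needs to be decoded.

```rust
/// Hypothetical accumulator for generated code and its line mappings.
#[derive(Default)]
pub struct GeneratedChunks {
    code: String,
    /// (generated line, original line) pairs, one per mapped chunk.
    mappings: Vec<(u32, u32)>,
    /// Index of the generated line on which the next chunk starts.
    current_line: u32,
}

impl GeneratedChunks {
    pub fn new() -> Self {
        Self::default()
    }

    /// Variant for arguments that carry no position information.
    pub fn add_plain(&mut self, chunk: &str) {
        self.code.push_str(chunk);
        self.current_line += chunk.matches('\n').count() as u32;
    }

    /// Variant for arguments that also carry an original line number.
    pub fn add_mapped(&mut self, chunk: &str, original_line: u32) {
        self.mappings.push((self.current_line, original_line));
        self.add_plain(chunk);
    }

    pub fn code(&self) -> &str {
        &self.code
    }

    pub fn mappings(&self) -> &[(u32, u32)] {
        &self.mappings
    }
}
```

Under these assumptions, the JavaScript wrapper would only have to test whether the incoming object has a line field and then call add_mapped or add_plain accordingly, so only primitive values and strings cross the boundary.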

Module Usability

Thanks to the API compatibility, the WebAssembly module can directly replace the old module without any changes to webpack or its plugins. However, doing the replacement by hand is still troublesome. To alleviate this problem, our module provides a register() function, which is called automatically when the module is required. When called, it overrides the resolution of the old package webpack-sources, so that webpack and any plugins that load webpack-sources are all "redirected" to our module. Moreover, the module can be used with webpack-cli with nothing more than an additional command-line argument.

Even though our package is still experimental, you can always give it a try with little effort and enjoy the performance enhancement from WebAssembly.

Performance Issues

Although WebAssembly has been shown to bring considerable performance improvements in various cases, naively rewriting everything in Rust and compiling it to WebAssembly will not necessarily make things faster, and sometimes even makes them slower. The performance-sensitive parts always need to be handled carefully.

Boundary Crossing

Boundary crossing means transferring data between WebAssembly and JavaScript. As mentioned previously, we need to create a shared memory and then move the data into it so that WebAssembly can access the data. These "copy" operations become a performance bottleneck when the amount of data to transfer is too large.

To minimize the performance loss from boundary crossing, we not only avoid passing unnecessary data to WebAssembly, but also implement lazy getters and caching in the JavaScript wrappers.
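The lazy getter with caching can be sketched as follows. The project implements this in its JavaScript wrappers; the sketch is written in Rust only to keep it self-contained, and WasmSide, LazySource, and the crossing counter are hypothetical stand-ins in which every call to fetch_source models one copy across the boundary.

```rust
/// Hypothetical stand-in for the WebAssembly side of the boundary.
pub struct WasmSide {
    source: String,
    crossings: u32,
}

impl WasmSide {
    /// Models one boundary crossing: the string is copied out on every call.
    fn fetch_source(&mut self) -> String {
        self.crossings += 1;
        self.source.clone()
    }
}

/// Hypothetical wrapper that fetches the value on first use and caches it.
pub struct LazySource {
    wasm: WasmSide,
    cached: Option<String>,
}

impl LazySource {
    pub fn new(source: &str) -> Self {
        LazySource {
            wasm: WasmSide { source: source.to_string(), crossings: 0 },
            cached: None,
        }
    }

    /// Lazy getter: crosses the boundary only if nothing is cached yet.
    pub fn source(&mut self) -> &str {
        if self.cached.is_none() {
            let value = self.wasm.fetch_source();
            self.cached = Some(value);
        }
        self.cached.as_deref().unwrap()
    }

    /// Number of simulated boundary crossings so far.
    pub fn crossings(&self) -> u32 {
        self.wasm.crossings
    }
}
```

Nothing is copied until source() is first called, and repeated calls reuse the cached value, so the simulated boundary is crossed at most once per object.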

Fast string split in Rust

When processing code in webpack-sources, there are tons of strings to split and concatenate. In JavaScript, creating a substring is relatively efficient. In Rust, however, if you want a substring to outlive the original string, you need to allocate new memory and copy the substring's content into it, which is in fact very slow. Inspired by the V8 JavaScript engine, we created a module called StringSlice to mitigate this problem.

In our implementation, each StringSlice is stored with a reference counter. When creating a substring from a StringSlice, instead of allocating memory for the new string, we create a new StringSlice instance that holds a pointer to the original string's address together with its length, and then increment the reference counter. This avoids redundant memory allocation and copying, and it keeps the underlying memory alive for as long as it is in use, so that Rust can free it at the right moment.
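A minimal sketch of this idea follows. It assumes the shared buffer is an Rc<str> from the Rust standard library and uses illustrative method names; the project's actual StringSlice may be built differently.

```rust
use std::rc::Rc;

/// A view into a shared, reference-counted string buffer.
#[derive(Clone, Debug)]
pub struct StringSlice {
    buffer: Rc<str>, // shared backing storage; cloning the Rc bumps the counter
    start: usize,    // byte offset of this slice within `buffer`
    len: usize,      // length of this slice in bytes
}

impl StringSlice {
    /// Copies `text` once into a new shared buffer.
    pub fn new(text: &str) -> Self {
        StringSlice { buffer: Rc::from(text), start: 0, len: text.len() }
    }

    /// Borrows the text this slice refers to.
    pub fn as_str(&self) -> &str {
        &self.buffer[self.start..self.start + self.len]
    }

    /// Creates a sub-slice for the byte range `from..to` (relative to this
    /// slice) without copying any text. Panics if the range is out of bounds
    /// or does not fall on character boundaries.
    pub fn substring(&self, from: usize, to: usize) -> StringSlice {
        assert!(from <= to && to <= self.len, "range out of bounds");
        let text = self.as_str();
        assert!(
            text.is_char_boundary(from) && text.is_char_boundary(to),
            "range splits a character"
        );
        StringSlice {
            buffer: Rc::clone(&self.buffer), // increments the reference counter
            start: self.start + from,
            len: to - from,
        }
    }

    /// Number of slices currently sharing the buffer.
    pub fn ref_count(&self) -> usize {
        Rc::strong_count(&self.buffer)
    }
}
```

In this sketch a substring can outlive the slice it was created from, because the buffer is freed only when the last StringSlice that refers to it is dropped.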

Future Work

Garbage Collection / Memory Deallocation

In the old package webpack-sources, we rely on JavaScript's garbage collection to deallocate unused data automatically. Unfortunately, there is no garbage collection in WebAssembly. To maintain API compatibility and avoid modifying code in webpack and its plugins, we did not add functions for deallocating objects. As a result, unused objects remain in memory and may cause memory leaks.

This issue can be solved by adding code to webpack and its plugins that deallocates objects, or by waiting until garbage collection is introduced to WebAssembly.

JavaScript Fallback

WebAssembly is only supported by Node.js 8 and later. However, many people still use webpack with Node.js 6, so a fallback is necessary for runtimes that do not support WebAssembly. To make this possible, we can compile our Rust code to asm.js and switch between the WebAssembly and JavaScript implementations depending on what the runtime supports. However, the current JavaScript wrapper is designed for WebAssembly: some of the tricks that speed up crossing the JavaScript/WebAssembly boundary may slow down the asm.js version. To overcome this issue, another wrapper would need to be written for asm.js. Another solution is simply to fall back to the original JavaScript version.

Since this package is still experimental, this issue is not urgent and can be addressed before the package is released as a default dependency of webpack.

Conclusion

With our new package wasm-webpack-sources, source-map generation is 60% faster and cheap-source-map generation is 20% faster than before. The package is now available on npm, and performance feedback and bug reports are welcome. Further improvements can make it even better and ready to become a default dependency of webpack.