Generating Sudoku Boards pt. 3: Rust for WebAssembly

Solar Eclipse, through ISO 12312–2 glasses, long exposure, handheld

Part my Sudoku Board Generation Series:
Part 1: Structure & Algorithm
Part 2: Implementation Comparison
Part 3: Rust for WebAssembly
Part 4: C++ for WebAssembly

As mentioned in the first post of this series, the initial purpose of creating a sudoku board generator was to look at the performance characteristics Web Assembly (WASM). This post covers how the WASM build was created for the Rust version of the generator.

At the time of the initial post, I had a working WASM build that could be run in a browser. However, this build was a simple translation of the binary. There was no API made available in the JS layer. This isn’t particularly interesting. This time around I created a library that exposes an API that can be used by a JavaScript runtime.

Additionally, the build tools and runtime environments have changed a bit. Node now supports WASM with no additional configuration. And the Rust and WebAssembly project has blossomed. This project is dedicated to making Rust and WASM “the best choice for fast code on the web”. Part of that project is creating tools for making the process as easy as possible. I’ll be using their wasm-pack tool in order to build a module that can be used in node.

I had to work around a lot of issues and hit a few dead-ends due to my lack of knowledge from both Rust and WASM. What is presented below is a cleaned up version of the steps I went through.

You can see the source code associated with this post here.

Installing Tools

These instructions are modified from the “Rust Wasm book” and "wasm-bindgen basic usage”, in order to use the rustup toolchain manager and the node runtime.

Node

Installing node is pretty trivial. You can get the latest binaries on their downloads page, use your package manager, or use NVM. Just use the latest stable version and you should be fine.

Rust

The first item to install is rustup. This application installs and manages the tools needed to build rust for different targets. Below shows how to install rustup and the wasm buildchain to the nightly channel.

Note: The nightly channel was used because I wasn't able to build the WASM dependencies on stable.

# Install rustup and add binaries to current shell $PATH
curl https://sh.rustup.rs -sSf | sh
source $HOME/.cargo/env
# Update rustup and add wasm target to the nightly buildchain
rustup update --all
rustup target add wasm32-unknown-unknown --toolchain nightly

I then added the below snippet to my ~/.profile file so the tools will be configured when I restart my computer.

export PATH="$HOME/.cargo/bin:$PATH"

Finally, I used cargo to install wasm-pack. This added the wasm-pack executable to my PATH.

cargo install wasm-pack 

Note: I was unable to install wasm-pack on OSX. There was an issue including the headers for libz. I had the headers and binaries installed via brew, as well as through xcode, but cargo wasn't able to find them. Rather than debug the issue, I just switch to my Linux machine, the install worked immediately after installing the dependencies.

Update: I was able to install wasm-pack on OSX after removing the brew versions of llvm and clang . Now make clean run-js runs as expected.

Defining the Library Interface

As mentioned before I wanted to created a module that could be executed from JavaScript, rather than simply including and running. The interface I settled on is as follows.

generate_and_fill_board creates a new board, fills it with a random valid sudoku solution, and returns the struct. generate_and_fill_boards loops the previous functions, creating and filling board_count number of boards. This function was implemented in order to test the impact of communication between WASM and JavaScript multiple times vs. a single long-running job.

Note: Ideally I would compare returning single SudokuBoard structs with a Vec<SudokuBoard>. However, this is not currently supported by wasm_bindgen .

Finally, serializeBoard was included so I could actually seem some results.

Adding Multiple Cargo Builds

Cargo allows for multiple builds. This is done by adding separate entries that target different source files. I added the following entries to the bottom of my Cargo.toml file.

The lib entry defines the library configuration. This will be used by wasm-pack. The crate-type property defines how this module will be linked. The cdylib value means that a dynamically linked library will be produced. The rslib value means that it can be included statically by the bin build.

By comparison, bin is simple, only defining the name and path.

Exposing functions with wasm_bindgen

These instructions are based on wasm_bindgen basic usage instructions. I also looked at the project generated in the hello world example for reference

Initially, I added all of the new dependencies to both builds.

The cfg-if is used by the Hello World boilerplate to conditionally use a different allocator. The time crate was used before. I had to update the rand package as they recently added wasm-bindgen support. The attributes we'll be using later are included in the wasm-bindgen package.

In order to support error logging and a different allocator, the console_error_panic_hook and wee_alloc packages are included by the Hello World example linked above.

To expose the library API to JavaScript the #[wasm_bindgen] attribute was added to the following functions and structs.

The wasm_bindgen attribute generates the code necessary to expose those functions and structs to JavaScript. All values exposed to JavaScript must be annotated in this way. This includes the types returned by the exposed functions. That is why the SudokuBoard struct was annotated as well.

Although wasm_bindgen supports exporting instance methods, I found it easier to compile when the method SudokuBoard::serialize was refactored to the function serializeBoard. If I wanted to support an interactive game with this library, adjusting the compilation rules would be the least of my concerns.

Forced to use (Much Slower?) OsRng

To shuffle the valid options for each cell (and generate random boards). I was the rand::thread_rng. Unfortunately, this is not supported by wasm_bindgen. Only OsRng is available as of rand version 0.6. To see what performance impact this had. I added a feature to my library that will allow me to test this out. The feature was implemented as follows.

You can see on lines 10 through 16 the difference ranges being used. The cfg! macro is used in combination with feature detection, creating two compile time branches. A quick comparison of the two methods can be seen below.

# Os Rng
> make run-native
cargo +nightly build --release
Finished release [optimized] target(s) in 0.03s
./target/release/runner 100
time: 0.04272449 s
board count 100
boards per second 2340.5779682800194

# Thread Rng
make run-native-thread-rng
cargo +nightly build --features="thread_rng" --release
Finished release [optimized] target(s) in 0.03s
./target/release/runner 100
time: 0.024462508 s
board count 100
boards per second 4087.8882901131806

Quite the difference! Almost twice the running time for 100 boards. I initially thought this might be due to allocating a new OsRng for each cell. However, using a shared OsRng instance didn't seem to have any effect. I'll be comparing these more thoroughly in a later post. For now, it seems like a bummer to have to use a slower library.

Optimizing Build for size

A concern of WASM builds is the size of the generated binary. This is also true for embedded use-cases and is supported by Cargo. There are two options for optimizing for size opt-level= "s" and opt-level = "z". The z option is more aggressive but takes longer to run.

The options had some strange results with my project. I built the library with z, s, and with optimization for speed (the O3 option in gcc). And got the following results for the module size in bytes:

Oz - 1502285
Os - 1485024
O3 - 1468290

These versions are all very large for use on the web (~1.5 MB). But strangely, the size reduction was best when optimizing for speed, and worst when using aggressive size optimization. This is the opposite of what I expected. This bothers me, but an answer will have to wait for another day.

Actually Running the Code

So far it’s been a lot of work with no payoff. Let’s take a look at how I got the WASM module to actually build and run. I structured my builds around a small Makefile. The entry for the WASM module looks as follows.

Then I load the generated module with a node script in the root of the project.

You can see that the generate_and_fill_board function is called, and a serialized representation of the board is logged to the console. But does it work?

We can run the script and see the output as follows.

make pkg && node hello.js 5

And the result?

1|2|4|5|8|9|3|7|6|8|6|9|7|2|3|1|4|5|5|3|7|6|1|4|8|2|9|9|7|5|4|6|1|2|3|8|3|1|2|8|7|5|9|6|4|4|8|6|9|3|2|7|5|1|7|5|8|2|9|6|4|1|3|6|9|3|1|4|7|5|8|2|2|4|1|3|5|8|6|9|7
4|9|7|1|3|6|5|8|2|3|2|5|7|4|8|1|9|6|6|1|8|5|9|2|3|4|7|8|3|4|6|7|5|2|1|9|7|6|9|4|2|1|8|5|3|1|5|2|9|8|3|7|6|4|2|7|6|8|5|9|4|3|1|5|4|1|3|6|7|9|2|8|9|8|3|2|1|4|6|7|5
2|1|6|8|9|7|3|4|5|3|5|9|1|2|4|8|6|7|8|4|7|6|3|5|1|2|9|5|6|8|3|1|9|2|7|4|9|3|2|7|4|6|5|8|1|4|7|1|5|8|2|6|9|3|1|2|3|4|7|8|9|5|6|7|8|5|9|6|3|4|1|2|6|9|4|2|5|1|7|3|8
5|2|8|3|6|9|4|1|7|7|9|6|1|2|4|5|8|3|3|1|4|5|8|7|2|6|9|2|3|9|8|5|6|1|7|4|6|5|1|7|4|3|9|2|8|8|4|7|9|1|2|6|3|5|9|7|5|2|3|1|8|4|6|1|6|3|4|9|8|7|5|2|4|8|2|6|7|5|3|9|1
1|8|7|9|2|6|5|3|4|4|2|5|8|3|7|1|6|9|6|3|9|4|1|5|8|2|7|8|7|4|6|5|2|9|1|3|2|5|1|7|9|3|6|4|8|9|6|3|1|4|8|7|5|2|5|9|6|2|7|4|3|8|1|7|4|8|3|6|1|2|9|5|3|1|2|5|8|9|4|7|6
3|8|1|9|2|5|7|4|6|6|5|2|3|4|7|8|1|9|9|4|7|1|6|8|3|5|2|8|2|5|6|3|1|4|9|7|4|1|3|8|7|9|2|6|5|7|6|9|2|5|4|1|3|8|1|3|6|7|9|2|5|8|4|2|9|4|5|8|3|6|7|1|5|7|8|4|1|6|9|2|3
7|3|1|9|2|6|5|8|4|9|2|4|5|3|8|7|1|6|5|6|8|7|4|1|2|3|9|1|4|9|8|6|2|3|5|7|2|5|6|1|7|3|9|4|8|3|8|7|4|5|9|6|2|1|4|1|3|2|9|7|8|6|5|8|7|2|6|1|5|4|9|3|6|9|5|3|8|4|1|7|2
1|3|9|4|7|6|8|2|5|7|6|5|3|8|2|4|9|1|4|8|2|9|1|5|6|3|7|3|1|8|7|9|4|2|5|6|6|2|4|1|5|8|3|7|9|9|5|7|6|2|3|1|8|4|5|7|6|2|3|1|9|4|8|2|9|1|8|4|7|5|6|3|8|4|3|5|6|9|7|1|2
6|2|5|7|3|9|4|8|1|9|7|1|6|4|8|3|2|5|3|4|8|1|2|5|7|9|6|8|3|7|5|6|4|2|1|9|4|1|2|9|8|3|6|5|7|5|6|9|2|7|1|8|4|3|1|8|4|3|9|6|5|7|2|7|9|6|8|5|2|1|3|4|2|5|3|4|1|7|9|6|8
6|8|1|2|9|3|5|7|4|4|2|9|1|5|7|6|3|8|7|5|3|8|4|6|2|1|9|1|4|8|7|6|2|9|5|3|9|3|7|4|1|5|8|2|6|5|6|2|3|8|9|7|4|1|2|1|5|9|3|8|4|6|7|8|7|4|6|2|1|3|9|5|3|9|6|5|7|4|1|8|2

Great! We’ve got the serialized output of 10 probably unique boards.

Making it work natively … again

Getting the library to work with WASM was pretty cool, but I inadvertently broke my native (Linux) builds. I received build errors like the following.

error[E0432]: unresolved import `wasm_bindgen`
--> src/sudoku_generator.rs:151:1
|
151 | #[wasm_bindgen]
| ^^^^^^^^^^^^^^^ did you mean `self::wasm_bindgen`?

error: cannot determine resolution for the macro `__wbindgen_if_not_std`
--> src/sudoku_generator.rs:151:1
|
151 | #[wasm_bindgen]
| ^^^^^^^^^^^^^^^
|
= note: import resolution is stuck, try simplifying macro imports

error[E0433]: failed to resolve: use of undeclared type or module `WasmRefCell`
--> src/sudoku_generator.rs:151:1
|
151 | #[wasm_bindgen]
| ^^^^^^^^^^^^^^^ use of undeclared type or module `WasmRefCell`

After trying a few workarounds, I settled on excluding wasm-bindgen from the native build. Firstly I excluded the WASM specific packages from the Cargo.toml native builds.

The [target.'cfg(target_arch = "wasm32")'.dependencies] section excludes the below packages from all build targets except for wasm32. They won't be included in any native builds.

Next, I had to exclude wasm_bindgen from the library source code. This included both the imports and the attributes. The imports can be handled in the same way that the Hello World example handled the allocator. Rust provides a mechanism excluding the attributes via the cfg_attr attribute. This attribute allows for compile-time rules to adjust what attributes are added (or not). An outline of the changes can be seen below.

Essentially all #[wasm_bindgen] entries were replaced with the more verbose #[cfg_attr(target_arch = "wasm32", wasm_bindgen)]. A little ugly, but it gets the job done.

Now when I run the native binary I get the expected results.

> make run-native

# compilation output

./target/release/runner 100
time: 0.025662673 s
board count 100
boards per second 3896.7102140918837

Summary

In this post, I went over how I modified my sudoku board generator to build a library for use as a web assembly module, and within a native Linux build. I ran into and worked around some compatibility issues. And came across some strange optimization behavior.

Next, I’ll outline a C++ implementation of a WASM library. After that, it’s time to benchmark!