Unit testing Rust using Chrome

Stretch is a cross-platform Flexbox engine written in Rust. At Visly we are building a design tool for front-end engineers and we needed to ensure components looked the same across web, iOS, and Android without making use of WebViews. This meant replicating the web layout system on mobile.

In this post, I’ll cover the test setup we use in Stretch, and how and why we generate unit tests dynamically. I’ll also walk through an example of contributing another test to Stretch, and finally show how we use the same system for benchmarking.

Test setup

Flexbox is a complicated specification and it’s easy to get the implementation wrong, so we knew from the beginning that we needed to take a test-driven approach to its development. We needed a novel way to ensure Stretch is compatible both with the spec and with the idiosyncrasies of various browsers that aren’t encoded in the spec. We also knew we needed to keep up to date with browsers as they evolved, improved, and fixed bugs. Writing a test suite like this by hand and keeping it up to date was quickly dismissed as impossible. Crazy as it sounds, we needed a way to automate writing automated tests.

We built a system we call gentest. With gentest, all test cases are described using regular old HTML files which contain a small layout marked with id="test-root".

Gentest loads this file into a headless browser using WebDriver, an API for scripting browsers. Once the file is loaded, gentest asks the browser, via WebDriver, for the style, size, and position of every DOM node. Using this information, gentest writes a Rust unit test which builds up the equivalent Flexbox tree using the Stretch API and asserts that the output of the Stretch layout calculation matches the layout of the DOM nodes in the browser.

Gentest performs this process for hundreds of test cases and, because the WebDriver protocol is standardized, can generate tests targeting any browser. For Stretch we have chosen to target Chrome, but it would be trivial to also test against Firefox and Safari.

Once the tests are written to the filesystem, we run them like any other Rust test suite using cargo. With this system, we generate hundreds of tests that are always up to date and ensure layout in Stretch results in the same output as the browser. We have also chosen to commit these generated tests to version control instead of re-generating them on every test run. The main reason for this is that we always want to be able to go back to a version which was known to be passing tests. If we did not commit the generated files, we could not make this promise, because there is an implicit dependency on the browser version used to generate the test cases: if Chrome introduces or fixes a bug, a version of Stretch where all tests passed may suddenly have a failing test.

Another benefit of gentest is that, because every test is just an HTML file, it can be visualized simply by opening the file in the browser. This was especially useful while we were developing Stretch: when a test was failing, we could open the browser’s developer tools to inspect the layout, understand how the browser behaved, and then try to replicate that behavior in Stretch. We even added a small stylesheet to every test which color codes nodes by depth for visual inspection.

Contributing tests

Gentest makes it trivial to contribute test cases to Stretch as well as report bugs. You don’t even need to know Rust unless there are code changes needed to make the test pass. Start by cloning the Stretch repo.

Now create a new test fixture inside the test_fixtures directory. Use a long and descriptive name, as this will be the unique name for the test. For this example we will call our test my_test.html, but this is a terrible name for a real test. The file has to contain a valid HTML page, which means it should include the DOCTYPE declaration as well as html and body tags. It must also include a reference to the test_helper script and the test_base_style stylesheet. Finally, your test layout should be a single DOM root inside of body with id="test-root".
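The requirements above can be read as a checklist. The sketch below embeds a hypothetical fixture in a Rust string and checks it against that list with plain substring matching. The fixture contents, the `path/to/...` locations of the helper files, and the `missing_requirements` function are assumptions for illustration and are not part of Stretch or gentest; copy the real script and stylesheet paths from an existing fixture in the repo.

```rust
/// Hypothetical fixture. The helper paths are placeholders, not the real ones.
const EXAMPLE_FIXTURE: &str = r#"<!DOCTYPE html>
<html lang="en">
<head>
  <script src="path/to/test_helper.js"></script>
  <link rel="stylesheet" type="text/css" href="path/to/test_base_style.css">
  <title>my_test</title>
</head>
<body>
  <div id="test-root" style="display: flex; width: 100px; height: 100px; align-items: center;">
    <div style="width: 10px; height: 10px;"></div>
  </div>
</body>
</html>
"#;

/// Return the checklist items the fixture does not satisfy.
/// This is a naive substring check, not an HTML parser.
fn missing_requirements(html: &str) -> Vec<&'static str> {
    let lower = html.to_lowercase();
    let checks: [(&'static str, bool); 6] = [
        ("DOCTYPE declaration", lower.contains("<!doctype html>")),
        ("html element", lower.contains("<html")),
        ("body element", lower.contains("<body")),
        ("test_helper script", lower.contains("test_helper")),
        ("test_base_style stylesheet", lower.contains("test_base_style")),
        ("exactly one id=\"test-root\"", lower.matches("id=\"test-root\"").count() == 1),
    ];
    checks
        .iter()
        .filter(|(_, ok)| !ok)
        .map(|(name, _)| *name)
        .collect()
}

fn main() {
    let missing = missing_requirements(EXAMPLE_FIXTURE);
    if missing.is_empty() {
        println!("fixture satisfies the checklist");
    } else {
        println!("fixture is missing: {:?}", missing);
    }
}
```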

Now we are ready to re-generate tests based on the new set of test fixtures. First, though, we need to install chromedriver so we can communicate with Chrome. We’ll use Homebrew for macOS, but chromedriver can also be installed manually on macOS and Linux. We currently do not support generating tests on Windows but hope to in the future (contributions welcome!).

Once chromedriver is installed and added to PATH we are ready to re-generate the tests.

We can verify that the test has indeed been generated using git status, which should reveal a single added test, tests/generated/my_test.rs. Finally, we run the test using cargo test. If the test doesn’t pass, we need to fix Stretch and re-run the tests to verify our fix worked. We don’t need to re-generate the tests every time.

Benchmarking

We don’t only use this system for unit testing. We also use it to generate a suite of benchmarks which we can use to compare performance between commits to Stretch. Because the unit tests are there to test every feature of Flexbox, they are also ideal candidates for benchmarks: they help make sure we don’t regress performance on some edge case we didn’t think to write a manual test case for. It’s important not to rely only on generated benchmarks, though, as they tend to represent fairly simple layouts, so we mix in some manually written ones as well.
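As a rough illustration of mixing generated and hand-written cases in one benchmark run, here is a minimal timing sketch using only the standard library. It is not how Stretch’s benchmarks are implemented: `BenchCase`, `average_time`, and both workload functions are hypothetical, and the workloads are arithmetic stand-ins for layout calculations.

```rust
use std::time::{Duration, Instant};

/// A named benchmark case, either generated from a fixture or hand-written.
struct BenchCase {
    name: &'static str,
    run: fn() -> u64,
}

/// Stand-in for a small generated layout (hypothetical workload).
fn simple_generated_case() -> u64 {
    (0..1_000u64).sum()
}

/// Stand-in for a larger hand-written layout (hypothetical workload).
fn complex_manual_case() -> u64 {
    (0..100_000u64).map(|n| n % 7).sum()
}

/// Average wall-clock time per iteration of one case.
fn average_time(case: &BenchCase, iterations: u32) -> Duration {
    assert!(iterations > 0, "need at least one iteration");
    let start = Instant::now();
    for _ in 0..iterations {
        // black_box keeps the optimizer from discarding the work.
        std::hint::black_box((case.run)());
    }
    start.elapsed() / iterations
}

fn main() {
    let cases = [
        BenchCase { name: "generated/simple", run: simple_generated_case },
        BenchCase { name: "manual/complex", run: complex_manual_case },
    ];
    for case in &cases {
        println!("{}: {:?} per iteration", case.name, average_time(case, 100));
    }
}
```

Running the same set of cases on two commits and comparing the per-case averages gives a crude regression signal; a dedicated benchmarking harness would give more reliable numbers.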

In conclusion

In summary, we run hundreds of Flexbox scenarios within Chrome and use WebDriver to query the results of those scenarios to generate unit tests which we can run with Cargo. This ensures Stretch matches the browser’s Flexbox behavior, and on top of that we get benchmarks for free.

Make sure to sign up for the Visly early access list to receive more information as we get closer to launch. Also follow Stretch on GitHub as we get closer to a first stable release. All contributions are welcome!