Curve Fitting in the Browser using Rust+wasm in Angular

Dominique Dresen
Aug 6

TL;DR: Played around a bit with Rust+wasm and wrote a little crate to perform Levenberg-Marquardt fits. Then compared the native binary, the wasm version, a JS implementation and a SciPy approach. To my delight, found that the wasm version is equivalent to SciPy in results & execution speed.

Motivation

In experimental science, one of the basic recurring tasks is to compare experimental data with a parametrized model: either to search for physics in the data beyond the considered model, or to characterize the data by fitting the model parameters to it. In the latter case, the fitting process is often repeated for each new sample batch, so clever students write short evaluation and plotting scripts that can be reused each time. As I have been looking into Rust & WebAssembly (wasm) lately, I was wondering how a curve fitting implementation along these lines compares to other approaches.

Now I know one could just use Django to write a web application that runs the NumPy/SciPy stack on a server and delivers the results to the browser… but really I was searching for a reason and a project to play around with Rust+wasm. So I went ahead, and thought I would write about my experience for anybody who is interested.

Levenberg-Marquardt Algorithm

The Levenberg-Marquardt algorithm is effectively the standard when it comes to fitting the parameters of a model to data. The basic idea is to use a combination of gradient descent and the Gauss-Newton algorithm to find a local minimum close to some given starting values. The quantity to minimize is the sum of squared distances between data and model, weighted by the measurement errors (the bigger the error, the less important the distance is):

χ²(p) = Σᵢ [ (yᵢ − f(xᵢ; p)) / σᵢ ]²

This is performed iteratively by updating p → p + δ, where the change δ is calculated by solving the system of linear equations

(JᵀWJ + λ diag(JᵀWJ)) δ = JᵀW (y − f(p))

where J is the Jacobian of the model with respect to the parameters, W is the weighting matrix, effectively a diagonal matrix with 1/σᵢ² as elements, and λ is an iteratively varied damping factor. A nice recent write-up I followed on implementing the algorithm can for example be found here (by H. P. Gavin, Duke University).
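
To make the update rule concrete, here is a minimal sketch of a single LM step in Rust. It is my own illustration, not rusfun's actual code: it assumes the Jacobian, residuals and weights are already computed, and it solves the damped normal equations with naive Gaussian elimination, which is fine for a handful of parameters.

// One Levenberg-Marquardt update step: solves
// (JᵀWJ + λ diag(JᵀWJ)) δ = JᵀW r for the parameter change δ.
// `jac` is the n×m Jacobian, `r` the residuals yᵢ − f(xᵢ; p),
// and `w` the weights 1/σᵢ².
fn lm_step(jac: &[Vec<f64>], r: &[f64], w: &[f64], lambda: f64) -> Vec<f64> {
    let (n, m) = (jac.len(), jac[0].len());
    let mut a = vec![vec![0.0; m]; m]; // JᵀWJ
    let mut b = vec![0.0; m]; // JᵀWr
    for i in 0..n {
        for j in 0..m {
            b[j] += jac[i][j] * w[i] * r[i];
            for k in 0..m {
                a[j][k] += jac[i][j] * w[i] * jac[i][k];
            }
        }
    }
    for j in 0..m {
        a[j][j] *= 1.0 + lambda; // damping of the diagonal
    }
    // Gaussian elimination with partial pivoting.
    for col in 0..m {
        let pivot = (col..m)
            .max_by(|&p, &q| a[p][col].abs().partial_cmp(&a[q][col].abs()).unwrap())
            .unwrap();
        a.swap(col, pivot);
        b.swap(col, pivot);
        for row in col + 1..m {
            let f = a[row][col] / a[col][col];
            for k in col..m {
                a[row][k] -= f * a[col][k];
            }
            b[row] -= f * b[col];
        }
    }
    // Back substitution.
    let mut delta = vec![0.0; m];
    for row in (0..m).rev() {
        let s: f64 = (row + 1..m).map(|k| a[row][k] * delta[k]).sum();
        delta[row] = (b[row] - s) / a[row][row];
    }
    delta
}

A full fit is then just a loop that recomputes J and r, tries a step, and accepts it if χ² decreases, adjusting λ up or down accordingly.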

The default approach: SciPy

To compare the different implementations, I start with the one I know as the default approach: just throw Python at it. But first we need some data to fit. Today, simulated data has to be enough. So, I choose a Gaussian curve with a constant offset as the model for the discussion

f(x) = A exp(−(x − μ)² / (2σ²)) + c

and quickly generated some data with NumPy, randomized with a small error to model the measurement uncertainty, using this script. The data is generated with the parameters A = 42, μ = 4.2, σ = 0.666, c = 10, and the random noise has a standard deviation of 3. The data points are rounded, so we can imagine them representing the count of some event with respect to an external parameter. Even though the standard deviation of the noise should then be the square root of the number of events, it is chosen constant for each data point for later comparison, as the only Javascript implementation I could find does not implement a weighting matrix.
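
The linked script is Python, but the same toy data could be produced in Rust along the following lines, using the rand and rand_distr crates. The parameters are the ones above; the x-range and number of points are assumptions for illustration.

use rand_distr::{Distribution, Normal};

fn main() {
    // Model parameters from the text: A = 42, μ = 4.2, σ = 0.666, c = 10.
    let (a, mu, sigma, c) = (42.0, 4.2, 0.666, 10.0);
    let noise = Normal::new(0.0, 3.0).unwrap(); // noise with standard deviation 3
    let mut rng = rand::thread_rng();
    for i in 0..100 {
        let x = 8.0 * i as f64 / 99.0; // assumed x-range [0, 8]
        let model = a * (-((x - mu) / sigma).powi(2) / 2.0).exp() + c;
        let y = (model + noise.sample(&mut rng)).round(); // rounded "event counts"
        println!("{:.3}\t{}\t3", x, y); // x, y, σ as three columns
    }
}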

Fitting with SciPy is quite straightforward. One just needs to define the distance function that is squared and summed in the χ² function, and then throw this together with initial parameters into the scipy.optimize.leastsq function. This is a wrapper around routines from the battle-tested Fortran package MINPACK and can be considered a reliable reference for a correct implementation of the LM algorithm. I uploaded a short script that calls scipy.optimize.leastsq here. The start parameters are chosen as A = 20, μ = 3, σ = 0.2, c = 0 in this and each subsequent fit.

SciPy Result
A = 41.3 +/- 1.6
μ = 4.200 +/- 0.028
σ = 0.630 +/- 0.030
c = 10.77 +/- 0.52

χ²/d.o.f = 0.922

Within the error bounds, this closely reproduces the parameters that were used for data generation. The χ² value divided by the degrees of freedom (number of data points minus number of varied parameters) is close enough to one, with the deviations coming from the randomization and rounding of the data. On my machine (a desktop with an AMD Ryzen 5 1600X), the execution of scipy.optimize.leastsq on said dataset takes 1.2 ms on average.

Curve Fitting in Rust

Rust is a young programming language similar in its intentions to C, but with a clever design that helps to avoid most of the pitfalls that can occur at runtime or when writing parallel code. Anybody not familiar with it is encouraged to take a look at The Rust Programming Language book, which is a great resource to learn about the whys and hows.

Using the ndarray crate as a basis, I wrote a small crate named rusfun with the sole purpose of defining one-dimensional functions, running the Levenberg-Marquardt algorithm over them, and packing the functions for wasm using wasm_bindgen. The crate is at an early stage, so don't expect a lot of documentation & further functionality yet.
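
To give an idea of the kind of one-dimensional functions the crate deals with, the Gaussian model from above can be written over ndarray arrays as follows. This is my own sketch of such a model function; the exact signature in rusfun may differ.

use ndarray::Array1;

// Gaussian with constant offset, f(x) = A exp(−(x − μ)² / (2σ²)) + c,
// with the parameters packed as p = [A, μ, σ, c].
fn gaussian(p: &Array1<f64>, x: &Array1<f64>) -> Array1<f64> {
    let (a, mu, sigma, c) = (p[0], p[1], p[2], p[3]);
    x.mapv(|xi| a * (-((xi - mu) / sigma).powi(2) / 2.0).exp() + c)
}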

Given the data as a three-column file and using the same starting parameters, running the fit in Rust is also just a few lines, which I uploaded here. Timing the initialization & execution of the LM algorithm, a total of 0.4 ms execution time is measured on my machine. Without having really thought about how to optimize the code, the Rust implementation is already faster, which makes sense, as all the overhead that comes from the Python↔Fortran interface in SciPy is saved. In a later step it will be interesting to see how these compare for more computationally intensive models. But one step at a time. The fit results obtained from rusfun are analogous to the ones from SciPy with
A = 41.3 +/- 1.6
μ = 4.200 +/- 0.027
σ = 0.630 +/- 0.030
c = 10.77 +/- 0.51

χ²/d.o.f = 0.922

and only differ at values below the parameter uncertainty. The deviation could come from some minor implementation difference between MINPACK and the resource I followed.

Packing it to the Browser

Now that I had a simple curve fit function in pure Rust, a wrapper module in rusfun was quickly written and the functions were made accessible on NPM as the rusfun package for Javascript using wasm-pack. Using the default build method, the npm package expects a browser application to run it… and I know it's overkill, but considering how far we already are, we can now go the last mile and build a small Angular app to load data, make plots and run the fit routine.
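
To illustrate what such a wrapper module looks like, here is a sketch of a wasm_bindgen export. The function itself is illustrative and not the actual rusfun wasm API; the relevant mechanism is that wasm-bindgen converts Vec<f64> to and from a Javascript Float64Array, so the Angular side can pass plain typed arrays.

use wasm_bindgen::prelude::*;

// Illustrative export (not the actual rusfun API): computes χ² for the
// Gaussian-with-offset model, given parameters and the data arrays.
#[wasm_bindgen]
pub fn chi2_gaussian(p: Vec<f64>, x: Vec<f64>, y: Vec<f64>, sy: Vec<f64>) -> f64 {
    let (a, mu, sigma, c) = (p[0], p[1], p[2], p[3]);
    x.iter()
        .zip(y.iter().zip(sy.iter()))
        .map(|(&xi, (&yi, &si))| {
            let model = a * (-((xi - mu) / sigma).powi(2) / 2.0).exp() + c;
            ((yi - model) / si).powi(2)
        })
        .sum()
}

Running wasm-pack build on such a module compiles it to wasm and generates the Javascript glue code plus a package.json, ready for publishing to npm.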

I have prepared an app I called FuncFit and, using Netlify, made it available within a few minutes here: https://funcfit.netlify.com/ . To load data one basically needs a two- or three-column file, similar in format to the one I generated before with NumPy. Then select one of the pre-defined models (the list could easily be extended if desired), and press Fit.
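
For reference, such an input file might look like this (illustrative numbers; the columns are x, y and optionally the measurement error σ):

0.000  11  3
0.081  10  3
0.162  13  3
…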

The result should then look a bit like this. So we see the same parameters are reproduced (yes, I didn't bother rounding them sensibly… sorry), and at 1.1 ms the execution time is on a similar level as scipy.optimize.leastsq. So we lost the edge we had in the pure Rust version, but it seems the Rust+wasm combination is on the same level in computational speed as the Python+Fortran combination.

This seems like a good observation on which to conclude for the day. I hope someone had fun reading this. Rust doesn't seem to have a large user base in the scientific community yet, but maybe this will change some day. It's definitely a nice feeling to have nearly no run-time errors in code execution once one is able to satisfy the compiler.

Appendix: LM algorithm in pure Javascript

As a last bonus section, I wanted to check how the Rust+wasm combination in Angular compares to a pure Javascript implementation. Searching around a bit, I found an implementation in the machine learning package ml.js. It does not seem to implement a weighting matrix, but as we chose a constant measurement error, this should have no effect on the result. A quick Node script using said package with the default configuration and the same start parameters takes, however, ~35 ms, about 30 times slower than both SciPy and Rust. Playing around a bit with the options, I noticed that the implemented algorithm probably doesn't have good convergence checks, as it always runs until maxIterations. I could reduce the number of iterations from 100 to 20 without greatly changing the parameters, which brought the execution time down to 12 ms… still slower, but at least not by as much. A quick search for a more complete implementation of the LM algorithm in pure Javascript that also includes data weighting was not successful… maybe somebody knows of one?
