About Rust’s Machine Learning Community

The conversations around the introduction of the latest Rust Machine Learning crate, which was also the birthplace of the new rust-machine-learning IRC channel (thanks for the setup, @Argorak), led to the essential question: “What is this IRC channel all about?” I thought it would be helpful to give a broad overview of the current state of Machine Learning in Rust, to help fill the void that exists right now and to push Machine Learning in Rust forward.

To us, Rust seems to be a worthy alternative to the big players in the field of Machine Learning, namely C++, Python and Lua. With its unusual combination of performance, low-level control and high-level convenience, Rust could allow Machine Learning tools that outperform the current state-of-the-art tools in C++ and Lua. While Rust is highly performant, it is at the same time comfortably accessible. At a certain stage, this could prove important for the adoption of Rust and its ML tools by scientific fields outside the Rust community. In fact, we spoke with a notable number of physicists and Machine Learning practitioners who would use Rust for their work, if only there were something like numpy.

But everyone interested in Rust for scientific computation, Machine Learning and Artificial Intelligence soon realizes that although the language, the community and the concepts are all sound, the right tools just don’t exist yet. The lack of mature crates for Machine Learning can discourage even those with the most long-term outlook from considering Rust as the platform for their work.

And since a Machine Learning framework is essentially nothing more than a layer on top of very performant data management, computation and mathematics libraries, it is very unfortunate that there are no solid building blocks for those either. This leads Machine Learning, Artificial Intelligence and other fields of scientific computation to the same dead end.

I hope that the Rust Machine Learning community attracts people from different scientific fields to create the tools that all parties in scientific computation will need.

To sum up the current Machine Learning field: there are around 15 Rust Machine Learning libraries on GitHub, with the lion’s share of those abandoned and none of them at a stage where it would be safe to consider them for serious applications (including our own Machine Intelligence Framework, Leaf). “Experimentation” would be an apt summary of the current state of Rust’s ML community. What follows is a quick overview of the status of the more active repositories, to provide some hard facts.

(170 commits, 1 Contributor, 22 Stars, 0.1.0, active)

(24 commits, 1 Contributor, 110 Stars, 0.2.0, active)

(68 commits, 5 Contributors, 835 Stars, 0.1.2, active)

(318 commits, 1 Contributor, 10 Stars, 0.0.5, active? (Nov.15))

(25 commits, 2 Contributors, 11 Stars, ?, active? (Oct.15))

(52 commits, 1 Contributor, 6 Stars, 0.1.0, active? (Sept.15))

The version numbers and the number of commits and contributors suggest that none of the frameworks are ready for prime time yet. But the situation is even worse, as all frameworks implement their own data structures (Vector, Matrix, Tensor, ndArray), because something like numpy doesn’t exist in Rust yet. Num seems to be the only crate the frameworks felt they could rely on.
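To make the duplication concrete, here is a minimal sketch of the kind of matrix type a framework ends up writing for itself when no shared array crate exists. The `Matrix` type and its methods are hypothetical and not taken from any of the repositories listed above; it only assumes a dense, row-major layout of `f32` values and a naive multiplication.

```rust
/// Hypothetical dense, row-major matrix of `f32` values.
/// Illustrative only; not the type used by any crate mentioned in this post.
#[derive(Debug, Clone, PartialEq)]
pub struct Matrix {
    rows: usize,
    cols: usize,
    data: Vec<f32>,
}

impl Matrix {
    /// Builds a matrix from a flat, row-major buffer.
    /// Returns `None` if the buffer length does not equal `rows * cols`.
    pub fn from_vec(rows: usize, cols: usize, data: Vec<f32>) -> Option<Matrix> {
        if data.len() == rows * cols {
            Some(Matrix { rows, cols, data })
        } else {
            None
        }
    }

    /// Reads the element at row `r`, column `c`.
    /// Panics if the computed index falls outside the buffer.
    pub fn get(&self, r: usize, c: usize) -> f32 {
        self.data[r * self.cols + c]
    }

    /// Naive triple-loop matrix product.
    /// Returns `None` if the inner dimensions do not match.
    pub fn matmul(&self, other: &Matrix) -> Option<Matrix> {
        if self.cols != other.rows {
            return None;
        }
        let mut out = vec![0.0; self.rows * other.cols];
        for i in 0..self.rows {
            for k in 0..self.cols {
                let a = self.get(i, k);
                for j in 0..other.cols {
                    out[i * other.cols + j] += a * other.get(k, j);
                }
            }
        }
        Some(Matrix { rows: self.rows, cols: other.cols, data: out })
    }
}
```

Multiplying the 2x2 matrices `[1, 2; 3, 4]` and `[5, 6; 7, 8]` with `matmul` yields `[19, 22; 43, 50]`. Every framework that carries a type like this also has to carry its own indexing, shape checks and arithmetic, and none of these types can be passed to another framework without copying and converting.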

Worth mentioning here is ndarray, which looks promising and could resolve the issue of a missing data structure crate.

(534 commits, 2 Contributors, 41 Stars, 0.3.0-alpha2, active)

On the scientific computation side, it hardly looks any better. There are many, many repositories that tried to kick something off but unfortunately went nowhere. What follows is a quick overview of the more active-looking projects in the field of scientific computation in Rust.

(94 commits, 5 Contributors, 113 Stars, 0.0.7, active)

(236 commits, 6 Contributors, 24 Stars, 0.4.25, active? (Nov.15))

(309 commits, 5 Contributors, 48 Stars, 0.0.5, active? (Sept.15))

(40 commits, 1 Contributor, 4 Stars, 0.1.0, active? (Sept.15))

As with the Machine Learning libraries, Num seems to be the only crate that the scientific crates considered relevant enough to rely on.

One crate we think is worth mentioning here, as an example of a successful and (I think) valuable scientific computation framework, is rust-bio.

(365 commits, 8 Contributors, 131 Stars, 0.3.20, active)

To conclude for now: ndarray seems to be moving in the right direction and would be highly valuable for Rust’s ML community. We at Autumn are looking forward to engaging with ndarray and hope that it does, in fact, turn into Rust’s very own numpy.

Many important crates for computation and data management are still missing, though. But done right, we believe that those Rust crates could offer significant improvements over similar projects in other languages.

But more on that in another article.

I hope that the machine learning IRC channel helps us, the Rust ML community, to move from the experimental stage to something we all feel confident backing.

Also, thanks to the initiative of @idadesub, @Argorak and @ttaubert there will be a Rust talk about Machine Learning, Leaf and Collenchyma in Berlin on February 3rd. This might be Rust’s first ML community get-together.
MJ and hobofan