Why Rust fails hard at scientific computing

1.5 years ago I started a computer go bot in Rust based on Monte Carlo Tree Search (MCTS).

MCTS is at the heart of all strong go programs, and of many AIs for various games and real-world competitions like RoboCup Soccer. Yes, even Google AlphaGo’s neural networks are just “suggesting” moves to the MCTS; it has the last word.
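The selection step at the core of MCTS can be sketched in a few lines: among a node’s children, pick the one maximizing the UCB1 score, which trades off exploitation (win rate) against exploration (rarely visited moves). Everything here (the `Child` struct, the exploration constant) is illustrative, not taken from the bot:

```rust
// Minimal sketch of MCTS child selection via UCB1.
struct Child {
    wins: f64,
    visits: f64,
}

const EXPLORATION: f64 = 1.414; // ~sqrt(2), the classic UCB1 constant

// Returns the index of the child with the highest UCB1 score.
fn select(children: &[Child], parent_visits: f64) -> usize {
    let ucb1 = |c: &Child| {
        c.wins / c.visits + EXPLORATION * (parent_visits.ln() / c.visits).sqrt()
    };
    (0..children.len())
        .max_by(|&a, &b| ucb1(&children[a]).partial_cmp(&ucb1(&children[b])).unwrap())
        .unwrap()
}

fn main() {
    let children = vec![
        Child { wins: 6.0, visits: 10.0 }, // strong move, already well explored
        Child { wins: 1.0, visits: 2.0 },  // weaker move, barely explored
    ];
    // The exploration bonus makes MCTS pick the under-explored child here.
    println!("{}", select(&children, 12.0));
}
```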

After weeks of fighting the borrow checker, like many beginners, I managed to program my way out and produce this, plus brain-dump material probably worth a PhD or two (check the README):

  • Compilation
  • Vectorization
  • Localization / Branch Prediction / Cache
  • Randomization / MCTS optimization
  • Data structure research
  • Machine Learning
  • Algorithms
  • Consumption (of trees and iterators/vectors)
  • Parallelism
  • Hashing
  • Heuristics
  • Memory
  • Unit tests

6 months ago, I found the time to dive into Data Science and Deep Learning, and 1 week ago I got the urge to write my own neural network library. Rust didn’t even enter my mind at the time; it had to be Nim.

4 Nim bugs later … after breaking a (Guinness?) record of 5 bugs reported in 12 hours to the core language tracker …

… and a discussion with a fellow data scientist, I still think it’s the language that best fits my needs. Those bugs are only flesh wounds.

Let’s go back to Rust

Rust appealed to me due to its speed, type safety and functional programming facilities. Why? Well, my first real programming language after Bash, SQL and Excel VBA was Haskell, yep, before even JavaScript and Python.

So why did it fail for me, and why is it still failing for scientific computing?

1. Too many symbols: & <> :: {} (your mileage may vary; C++ programmers will feel right at home)

// Comparing black and white scores and returning the winner. Komi is added
// to white's score to compensate for black's first-move advantage.
match PartialOrd::partial_cmp(&(black_score as f32), &(KOMI + (white_score as f32))) {
    Some(Ordering::Less) => Intersection::White,
    Some(Ordering::Greater) => Intersection::Black,
    _ => unreachable!(),
}

I’m not even talking about Rc, RefCell and Box, which seem like security through obscurity (though it can’t reach Haskell’s monadic levels).

// from https://stackoverflow.com/questions/30861295/how-to-i-pass-rcrefcellboxmystruct-to-a-function-accepting-rcrefcellbox

use std::cell::RefCell;
use std::rc::Rc;

trait MyTrait {
    fn trait_func(&self);
}

#[derive(Clone)]
struct MyStruct1;

impl MyStruct1 {
    fn my_fn(&self) {
        // do something
    }
}

impl MyTrait for MyStruct1 {
    fn trait_func(&self) {
        // do something
    }
}

fn my_trait_fn(t: Rc<RefCell<Box<dyn MyTrait>>>) {
    t.borrow_mut().trait_func();
}

fn main() {
    let my_str: Rc<RefCell<Box<MyStruct1>>> = Rc::new(RefCell::new(Box::new(MyStruct1)));
    my_trait_fn(Rc::new(RefCell::new(Box::new((**my_str.borrow()).clone()))));
    my_str.borrow().my_fn();
}

2. Arrays in Rust are second-class citizens; actually, I think they don’t even have their visas. I hear them laughing at me when I try to use them. You can’t even clone them.

Actually, I misspoke: you can, but only if the array has 32 elements or fewer. Here is the relevant excerpt from the standard library:

array_impls! {
     0  1  2  3  4  5  6  7  8  9
    10 11 12 13 14 15 16 17 18 19
    20 21 22 23 24 25 26 27 28 29
    30 31 32
}

// The Default impls cannot be generated using the array_impls! macro because
// they require array literals.

macro_rules! array_impl_default {
    {$n:expr, $t:ident $($ts:ident)*} => {
        #[stable(since = "1.4.0", feature = "array_default")]
        impl<T> Default for [T; $n] where T: Default {
            fn default() -> [T; $n] {
                [$t::default(), $($ts::default()),*]
            }
        }
        array_impl_default!{($n - 1), $($ts)*}
    };
    {$n:expr,} => {
        #[stable(since = "1.4.0", feature = "array_default")]
        impl<T> Default for [T; $n] {
            fn default() -> [T; $n] { [] }
        }
    };
}

array_impl_default!{32, T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T}

The consequences? You can’t use a plain Rust array to represent a matrix with more than 32 elements, i.e. anything bigger than 4x8. How useful is that?

Actually, you can’t even represent an 8x8 chessboard as a flat array without coding every property from scratch (copy, clone, print, indexing with [] …). I’m in luck: go has 9x9, 13x13 and 19x19 board sizes …
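For illustration, here is roughly what “coding every property from scratch” looks like for a flat 19x19 board stored as [u8; 361]; the `Board` type and its helpers are hypothetical stand-ins, not the bot’s actual code. Amusingly, [u8; 361] is Copy (the compiler implements that for every length) yet Clone couldn’t be derived past 32 elements, so you write it yourself:

```rust
use std::ops::Index;

const SIZE: usize = 19;

// A go board as a flat array: too big for the derived trait impls,
// so Clone, Default and indexing are all hand-written.
struct Board([u8; SIZE * SIZE]);

impl Clone for Board {
    // [u8; 361] is Copy, so cloning is just a bitwise copy,
    // but the impl still has to be spelled out by hand.
    fn clone(&self) -> Board {
        Board(self.0)
    }
}

impl Default for Board {
    fn default() -> Board {
        Board([0; SIZE * SIZE])
    }
}

impl Index<(usize, usize)> for Board {
    type Output = u8;
    fn index(&self, (row, col): (usize, usize)) -> &u8 {
        &self.0[row * SIZE + col]
    }
}

fn main() {
    let b = Board::default();
    assert_eq!(b[(3, 4)], 0); // 2D indexing over the flat storage
    let _copy = b.clone();
}
```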

You can work around it by using a Vec (an arbitrary-sized, growable sequence), but then your matrix is allocated on the heap instead of the stack, meaning slower operations. It also means you cannot use Rust’s wonderful type system to check that you multiply matrices with compatible dimensions, say a 2x2 matrix by a 2x1 matrix, without jumping through hoops.
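To see the hoops: one trick is to encode each dimension as a zero-sized marker type, so that multiplying matrices with mismatched dimensions fails to compile. This is a hypothetical sketch (the `Dim` trait and `D1`/`D2` markers are made up for illustration), not a real library’s API:

```rust
use std::marker::PhantomData;

// One marker type per dimension size we want to talk about.
struct D1;
struct D2;

trait Dim {
    const SIZE: usize;
}
impl Dim for D1 { const SIZE: usize = 1; }
impl Dim for D2 { const SIZE: usize = 2; }

// An M x N matrix: heap-allocated data, dimensions tracked in the type.
struct Matrix<M: Dim, N: Dim> {
    data: Vec<f32>, // row-major
    _dims: PhantomData<(M, N)>,
}

impl<M: Dim, N: Dim> Matrix<M, N> {
    fn new(data: Vec<f32>) -> Self {
        assert_eq!(data.len(), M::SIZE * N::SIZE);
        Matrix { data, _dims: PhantomData }
    }

    // (M x N) * (N x P) -> (M x P): the shared N appears in both
    // argument types, so a dimension mismatch is a compile error.
    fn mul<P: Dim>(&self, rhs: &Matrix<N, P>) -> Matrix<M, P> {
        let (m, n, p) = (M::SIZE, N::SIZE, P::SIZE);
        let mut out = vec![0.0; m * p];
        for i in 0..m {
            for k in 0..n {
                for j in 0..p {
                    out[i * p + j] += self.data[i * n + k] * rhs.data[k * p + j];
                }
            }
        }
        Matrix::new(out)
    }
}

fn main() {
    let a: Matrix<D2, D2> = Matrix::new(vec![1.0, 2.0, 3.0, 4.0]);
    let b: Matrix<D2, D1> = Matrix::new(vec![5.0, 6.0]);
    let c = a.mul(&b); // 2x2 * 2x1 -> 2x1
    println!("{:?}", c.data); // [17.0, 39.0]
    // b.mul(&b) would not compile: a 2x1 can't be multiplied by a 2x1.
}
```

It works, but every dimension needs its own marker type, which is exactly the kind of boilerplate integer type parameters would eliminate.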

That brings me to the third point.

3. Rust is still “discussing” integers as generic type parameters (since 2015), meaning a matrix type like Matrix[M, N, float] will not exist for a long, long time. The GitHub discussions on the subject are quite the read.
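For reference, this is the kind of type those discussions are after, sketched in the proposed const-generics syntax (the `Matrix` type here is illustrative): dimensions become part of the type, the storage sits on the stack, and none of the hand-written impls from point 2 are needed.

```rust
// Dimensions as integer type parameters, checked at compile time.
struct Matrix<const M: usize, const N: usize> {
    data: [[f32; N]; M], // stack-allocated, shape known statically
}

impl<const M: usize, const N: usize> Matrix<M, N> {
    fn zeros() -> Self {
        Matrix { data: [[0.0; N]; M] }
    }
}

fn main() {
    // A full 19x19 go board as a matrix, no per-size boilerplate.
    let m: Matrix<19, 19> = Matrix::zeros();
    assert_eq!(m.data.len(), 19);
}
```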

That’s it folks, hope you enjoyed the read.

PS: Would “3 reasons why Rust fails hard at scientific computing” be too clickbaity?


Originally published at Marie & Mamy’s Insights.
