How competitive are C++ standard random number generators

Oscar David Arbeláez
4 min readJan 6, 2016

--

As a scientist that uses a lot of Monte Carlo simulation for his work I’m always on the look for better random number generators, that’s why the inclusion of a standard random library for C++ was great news for me. I immediately jumped on the Mersenne Twister random number generator from the standard, as that generator has a huge period and its proof of randomess is often good enough for Monte Carlo algorithms, way better than the linear congruential engine that you would see in standard random number generators in different languages.

However, in high performance computing, you often figure out that languages are tailored a bit more for convenience rather than performance. Even C++ is still a high level language, widely used for commercial applications, where you can waste a couple nanoseconds while waiting for user input. But in some cases in Monte Carlo simulations, when your program runs for weeks in your cluster, every single nanosecond matter. This is why I decided to write this little piece comparing the performance of the standard random number generators, I performed a little experiment inspired in the code of this stack overflow question:

The author of the question finds that the Mersenne Twister generator from the standard library is at least two times faster than two other linear congruential random number generators, namely std::rand, and the std::minstd_rand engine in the standard random library.

I proceeded to do the same test using the gcc-5.3.0 C++ compiler as well as the clang-3.7.0 one in my Arch Linux machine, and adding to the mixture GSL’s Mersenne Twister implementation, and the 64 bit implementation from the standard.

The results where pretty nice, as expected the specific implementation of the compiler greatly affect the performance of the standard generators, using the gcc compiler, the results were:

# oscar at XPS13 in ~ [11:11:46]
→ g++ -std=c++14 -O3 -o rand rand.cc -lgsl -lcblas
# oscar at XPS13 in ~ [11:11:51]
→ ./rand
rand 12.1279 ns per rand
minstd 5.8301 ns per rand
mersenne 5.00588 ns per rand
mersenne 64 6.51923 ns per rand
gsl mersenne 8.60412 ns per rand
rand 11.9981 ns per rand
minstd 5.82702 ns per rand
mersenne 4.95445 ns per rand
mersenne 64 6.54196 ns per rand
gsl mersenne 8.59234 ns per rand
rand 12.0031 ns per rand
minstd 5.82867 ns per rand
mersenne 4.94615 ns per rand
mersenne 64 6.52307 ns per rand
gsl mersenne 8.58726 ns per rand

As described in the stack overflow question, the std::rand random generator is very slow, even compared with the 64 bit implementation of the Mersenne Twister algorithm in the standard library, however, it seems that the implementation of the std::minstd_rand generator has been improved in this version of the standard library because I find it to be only slightly slower than the MT algorithm implementation. Furthermore, I find surprising that the, GSL implementation, at least with my version of BLAS is not faster than the standard implementation of the MT algorithm.

Using the clang compiler is an slightly different story, in this case, the std::rand and GSL implementations keep the same speed, the GSL only because it’s the same object code, but as you can see bellow the standard implementations are close to be 5 times slower, keeping almost the same ratios between them as in the gcc case.

# oscar at XPS13 in ~ [11:15:16]
→ clang++ -std=c++14 -O3 -o rand rand.cc -lcblas -lgsl
rand.cc:65:58: warning: expression result unused [-Wunused-value]
/* static_cast<volatile void> */(std::rand() % 10);
~~~~~~~~~~~ ^ ~~
1 warning generated.
# oscar at XPS13 in ~ [11:15:41]
→ ./rand
rand 12.0737 ns per rand
minstd 32.6652 ns per rand
mersenne 33.0896 ns per rand
mersenne 64 28.6194 ns per rand
gsl mersenne 8.70143 ns per rand
rand 12.2732 ns per rand
minstd 33.0323 ns per rand
mersenne 33.2481 ns per rand
mersenne 64 28.6497 ns per rand
gsl mersenne 8.92259 ns per rand
rand 12.488 ns per rand
minstd 33.0516 ns per rand
mersenne 33.6526 ns per rand
mersenne 64 28.6328 ns per rand
gsl mersenne 8.66299 ns per rand

As scientist we usually don’t go as far as implementing our own random number generators, and that is good, because a bug in such a crucial part of our simulation algorithms would kill the validity of our results right away.

I hope this little article helps some people to wee what’s out there, of course the implementations studied here are not the fastest but my guess is that they are among the most available. Faster implementation’s would need a lot of platform specific code such as this one built for the Intel® Pentium® 4 Processor, or a faster algorithm such as the xorshift random number generator.

--

--