## Random Number Generators: what is the best one?

Building Customisable High Performance C++ Applications

### Random Number Generators: what is the best one?

This thread is a discussion on random number generation. For example, when should we use Box Muller instead of Mersenne Twister, distributed RNG and improving performance and accuracy.
Last edited by Cuchulainn on Sat Oct 10, 2009 4:44 pm, edited 1 time in total.

Cuchulainn

Posts: 677
Joined: Mon Dec 18, 2006 2:48 pm
Location: Amsterdam, the Netherlands

### Personal preference: distributed MT for U(0,1) then B-M

I am a big fan of using Mersenne Twister with its 2^19997-1 return period. It's also very very fast; as fast as Ran1 from Numerical Recipes. Which is a terrible RNG. I use the U(0,1) outputs to generate vectors as input to a templated BM function. The MT box-mueller implementation "randNorm" is naive, using two U(0,1)'s to generate a single N(0,1), and is not great for a performance critical section of code. Just implementing your own BM type code will speed things up by at least a factor of two (possibly more, if you consider function call overhead).

There are two variants on the Box-Mueller trick. One throws away bad variates (the so called polar rejection method). The other uses trigonometric functions. I have found that it is hard to guage which is faster a priori. If you can convince your compiler to vectorize a calculation on a big block of data, then the trig version is probably faster (vectorizers hate "if" statements). At least that's been my experience doing this on PowerPC and i686 cpus. If you don't operate on big blocks of data, I don't know if it makes a difference.

For distributed RNG the problem is always trying to make sure the different threads don't clash. You obviously have to use a vector of input seeds, one per thread. With a 2^19997-1 cycle period, Mersenne Twister offers acceptably close to zero chance of seed collision, which is longer than I need the program to run.
Guest

Here is a new version of Mersenne Twister.

http://www.math.sci.hiroshima-u.ac.jp/~ ... index.html

It is interesting to note that Box Muller is to be preferred to Polar Marsaglia on parallel machines. In the sequential case it is the other way aound.

Cuchulainn

Posts: 677
Joined: Mon Dec 18, 2006 2:48 pm
Location: Amsterdam, the Netherlands

### Very interesting

That's a great development! I'm curious how hand-optimized SIMD MT compares with the previous MT as compiled by a SIMD aware compiler.
Guest

### Prelim results

There are a lot of variants in the package. It is heavily optimized, assembly-level code.

I compared the SSE2 variants using gcc3.4 (cygwin) and icl9.1 (win32).

The gcc code was marginally faster in block, 64 bit mode. I found no difference between 32 and 64 bit performance with the optimizing intel compiler. The Intel compiler was 20-25% faster than the gcc code for the block code. Counterintuitively, the intel compiler produced sequential code which was twice as slow as the gcc code.

USING GCC (standard compile options)

32 bit BLOCK:281ms for 100000000 randoms generation

32 bit SEQUE:437ms for 100000000 randoms generation

64 bit BLOCK:265ms for 50000000 randoms generation

64 bit SEQUE:344ms for 50000000 randoms generation

USING ICL /Wcheck /O3 /QaxPB /Qprefetch

32 bit BLOCK:234ms for 100000000 randoms generation

32 bit SEQUE:906ms for 100000000 randoms generation

64 bit BLOCK:235ms for 50000000 randoms generation

64 bit SEQUE:625ms for 50000000 randoms generation
Guest

Lagged Fibonacci RNGs appear to perform better than Mersennne Twister (e.g. in Boost) and are use for parallel processing.

See ->

http://etd.lib.fsu.edu/theses/available ... thesis.pdf

Has anyone used these generators?

Cuchulainn

Posts: 677
Joined: Mon Dec 18, 2006 2:48 pm
Location: Amsterdam, the Netherlands

This is the random number generators in the Boost library. This is another option to use for Monte Carlo simulations.

http://www.boost.org/libs/random/index.html

Cuchulainn

Posts: 677
Joined: Mon Dec 18, 2006 2:48 pm
Location: Amsterdam, the Netherlands

There are two variants on the Box-Mueller trick. One throws away bad variates (the so called polar rejection method). The other uses trigonometric functions. I have found that it is hard to guage which is faster a priori.
harrybro

Posts: 1
Joined: Thu Apr 26, 2012 1:44 pm

These traditional RNGs can be profiled by creating a loop and examining the execution time.

One thing is sure: sin, cos and log functions are very expensive.

Cuchulainn

Posts: 677
Joined: Mon Dec 18, 2006 2:48 pm
Location: Amsterdam, the Netherlands

### Who is online

Users browsing this forum: No registered users and 0 guests