View previous topic :: View next topic |
Author |
Random Number Generators: what is the best one? |
Cuchulainn
Joined: 18 Dec 2006
Posts: 593
Location: Amsterdam, the Netherlands
|
Posted: Sat Apr 28, 2012 8:29 am Post subject: |
|
|
These traditional RNGs can be profiled by creating a loop and examining the execution time.
One thing is sure: sin, cos and log functions are very expensive. |
|
Back to top |
|
|
harrybro
Joined: 26 Apr 2012
Posts: 1
|
Posted: Fri Apr 27, 2012 5:07 am Post subject: |
|
|
There are two variants on the Box-Mueller trick. One throws away bad variates (the so called polar rejection method). The other uses trigonometric functions. I have found that it is hard to guage which is faster a priori.
_________________
Alex... |
|
Back to top |
|
|
Cuchulainn
Joined: 18 Dec 2006
Posts: 593
Location: Amsterdam, the Netherlands
|
Posted: Thu Aug 30, 2007 11:36 am Post subject: |
|
|
This is the random number generators in the Boost library. This is another option to use for Monte Carlo simulations.
http://www.boost.org/libs/random/index.html |
|
Back to top |
|
|
Cuchulainn
Joined: 18 Dec 2006
Posts: 593
Location: Amsterdam, the Netherlands
|
Posted: Sun Aug 26, 2007 10:17 am Post subject: |
|
|
Lagged Fibonacci RNGs appear to perform better than Mersennne Twister (e.g. in Boost) and are use for parallel processing.
See ->
http://etd.lib.fsu.edu/theses/available/etd-11082006-195809/unrestricted/final_ms_thesis.pdf
Has anyone used these generators? |
|
Back to top |
|
|
Guest
|
Posted: Tue Mar 20, 2007 5:28 pm Post subject: Prelim results |
|
|
There are a lot of variants in the package. It is heavily optimized, assembly-level code.
I compared the SSE2 variants using gcc3.4 (cygwin) and icl9.1 (win32).
The gcc code was marginally faster in block, 64 bit mode. I found no difference between 32 and 64 bit performance with the optimizing intel compiler. The Intel compiler was 20-25% faster than the gcc code for the block code. Counterintuitively, the intel compiler produced sequential code which was twice as slow as the gcc code.
USING GCC (standard compile options)
32 bit BLOCK:281ms for 100000000 randoms generation
32 bit SEQUE:437ms for 100000000 randoms generation
64 bit BLOCK:265ms for 50000000 randoms generation
64 bit SEQUE:344ms for 50000000 randoms generation
USING ICL /Wcheck /O3 /QaxPB /Qprefetch
32 bit BLOCK:234ms for 100000000 randoms generation
32 bit SEQUE:906ms for 100000000 randoms generation
64 bit BLOCK:235ms for 50000000 randoms generation
64 bit SEQUE:625ms for 50000000 randoms generation |
|
Back to top |
|
|
Guest
|
Posted: Tue Mar 20, 2007 3:50 pm Post subject: Very interesting |
|
|
That's a great development! I'm curious how hand-optimized SIMD MT compares with the previous MT as compiled by a SIMD aware compiler. |
|
Back to top |
|
|
Cuchulainn
Joined: 18 Dec 2006
Posts: 593
Location: Amsterdam, the Netherlands
|
Posted: Tue Mar 20, 2007 10:55 am Post subject: |
|
|
Here is a new version of Mersenne Twister.
http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/SFMT/index.html
It is interesting to note that Box Muller is to be preferred to Polar Marsaglia on parallel machines. In the sequential case it is the other way aound. |
|
Back to top |
|
|
Guest
|
Posted: Tue Mar 20, 2007 1:36 am Post subject: Personal preference: distributed MT for U(0,1) then B-M |
|
|
I am a big fan of using Mersenne Twister with its 2^19997-1 return period. It's also very very fast; as fast as Ran1 from Numerical Recipes. Which is a terrible RNG. I use the U(0,1) outputs to generate vectors as input to a templated BM function. The MT box-mueller implementation "randNorm" is naive, using two U(0,1)'s to generate a single N(0,1), and is not great for a performance critical section of code. Just implementing your own BM type code will speed things up by at least a factor of two (possibly more, if you consider function call overhead).
There are two variants on the Box-Mueller trick. One throws away bad variates (the so called polar rejection method). The other uses trigonometric functions. I have found that it is hard to guage which is faster a priori. If you can convince your compiler to vectorize a calculation on a big block of data, then the trig version is probably faster (vectorizers hate "if" statements). At least that's been my experience doing this on PowerPC and i686 cpus. If you don't operate on big blocks of data, I don't know if it makes a difference.
For distributed RNG the problem is always trying to make sure the different threads don't clash. You obviously have to use a vector of input seeds, one per thread. With a 2^19997-1 cycle period, Mersenne Twister offers acceptably close to zero chance of seed collision, which is longer than I need the program to run. |
|
Back to top |
|
|
Cuchulainn
Joined: 18 Dec 2006
Posts: 593
Location: Amsterdam, the Netherlands
|
Posted: Mon Mar 12, 2007 9:57 am Post subject: Random Number Generators: what is the best one? |
|
|
This thread is a discussion on random number generation. For example, when should we use Box Muller instead of Mersenne Twister, distributed RNG and improving performance and accuracy.
Last edited by Cuchulainn on Sat Oct 10, 2009 3:44 pm; edited 1 time in total |
|
Back to top |
|
|
|