# Fast, High-Quality, Parallel Random Number Generators

### Some Recommended 32-bit Combination Generators

Before listing more combination generators, I'll first recommend that you use a 64-bit generator if you can, because tests show that they are generally more robust (having fewer borderline passes in tests and a higher overall success rate with my trial generators). Even if your application only needs 32-bit numbers, you can call a 64-bit generator and use the upper or lower 32 bits. If you're running a 64-bit operating system, you have 64-bit instructions available to you, which are probably just as fast as 32-bit instructions, so the 64-bit generators will be just as fast.

For 32-bit generators, the Rand2CmrRsr and RandRsResCers generators shown previously are fine, and RandRsResCers is the fastest of all these generators that contain three subcycle generators. The following 32-bit generators are also recommended:

```void SeedResrRersLesr (uint32_t seed) {
unsigned n;
x = 254; y = 774; z = 1;
for (n=((seed>>22)&0x3ff)+20; n>0; n--) { x = rotl(x,21) - x;  x = rotl(x,26); }
for (n=((seed>>11)&0x7ff)+20; n>0; n--) { y = rotl(y,20) - rotl(y,9); }
for (n=((seed    )&0x7ff)+20; n>0; n--) { z = (z<<7) - z;  z = rotl(z,23); }
}

uint32_t RandResrRersLesr() {  // Combined period = 2^74.73
x = rotl(x,21) - x;  x = rotl(x,26); // RESR, period = 3808884 = 2*2*3*17*18671
y = rotl(y,20) – rotl(y,9);          // RERS, period = 1973321 = 7*19*37*401
z = (z<<7) - z;  z = rotl(z,23);     // LESR, period = 4164739213 = 29*2207*65071
return x ^ y ^ z;
}
```

The RESR and RERS generators above use two rotations, which mix bits well. The LESR generator uses a left-shift and subtract. This is similar to a multiply and is reversible, and thus LESR has some of the traits of CMR. Its period is close to 232, and its randomness is good. Here's a second combination generator based on CMR:

```void SeedCmfrCmrCers (uint32_t seed) {
x = (seed      & 0x001fffffu) + 4027999010u; // 21 bits of seed
y = ((seed>>7) & 0x0007ffffu) + 3993266363u; // 19 bits of seed
z = (seed >> 13)              + 3605298456u; // 19 bits of seed
}

uint32_t RandCmfrCmrCers() {  // Combined period = 2^95.999951
x = ~(2911329625u*x); x = rotl(x,17); // CMFR, period=4294951751 (prime)
y = 4031235431u * y;  y = rotl(y,15); // CMR,  period=4294881427 (prime)
z = 3286325185u - rotl(z,19);         // CERS, period=4294921861=19*89*2539871
return (x + y) ^ z;
}
```

This is perhaps the best 32-bit combination generator presented in this article. It uses a CMR and the similar CMFR subcycle generators for excellent randomness, and it'll be fast due to pipelining of the multiplies. It is directly seeded from overlapping fields in the seed, and its period is very close to 296 because all three of its generators are reversible.

### Some Recommended 64-bit Combination Generators

As previously mentioned, you probably should use a 64-bit combination generator if you are using a 64-bit OS, even if your application is 32-bits. For clarity, I use xx, yy, zz for 64-bit variables, and x, y, z for 32-bit:

```typedef unsigned long long uint64_t;  // or use <stdint.h> if available
uint64_t xx, yy, zz;

void SeedRersResrResdra (uint32_t seed) {
unsigned n;
xx =    914489ULL;
yy =   8675416ULL;
zz = 439754684ULL;
for (n=((seed>>22)&0x3ff)+20; n>0; n--) { xx = rotl(xx,8) - rotl(xx,29); }
for (n=((seed>>11)&0x7ff)+20; n>0; n--) { yy = rotl(yy,21) - yy;  yy = rotl(yy,20); }
for (n=((seed    )&0x7ff)+20; n>0; n--) { zz = rotl(zz,42) - zz;  zz = rotl(zz,14) + zz; }
}

uint64_t RandRersResrResdra() {  // Combined period = 2^116.23
xx = rotl(xx,8) – rotl(xx,29);                 //RERS,   period = 4758085248529 (prime)
yy = rotl(yy,21) - yy;  yy = rotl(yy,20);      //RESR,   period = 3841428396121 (prime)
zz = rotl(zz,42) - zz;  zz = zz + rotl(zz,14); //RESDRA, period = 5345004409 (prime)
return xx ^ yy ^ zz;
}
```

What's unusual about this combination generator is that the periods of all its subcycle generators are prime numbers, which could improve its robustness. Also, RESDRA has an extra add to improve bit mixing.

```void Seed2RersRs (uint32_t seed) {
unsigned n;
xx = 2257535ULL;
yy = 821507ULL;
zz = 819103680ULL;
for (n=((seed>>22)&0x3ff)+20; n>0; n--) xx = rotl(xx,52) - rotl(xx, 9);
for (n=((seed>>11)&0x7ff)+20; n>0; n--) yy = rotl(yy,24) - rotl(yy,45);
for (n=((seed    )&0x7ff)+20; n>0; n--) zz -= rotl(zz,38);
}

uint64_t Rand2RersRs() {  // Combined period = 2^113.7
xx = rotl(xx,52) - rotl(xx, 9); // RERS, period = 1157113674487 = 71*10067*1618891
yy = rotl(yy,24) – rotl(yy,45); // RERS, period = 1405504503483 = 3*17*27558911833
zz -= rotl(zz,38);              // RS,   period = 10483687178 = 2*23*47*251*19319
return xx ^ yy ^ zz;
}
```

This generator executes only 10 arithmetic operations per number, and is a good compromise between speed and randomness.

And the final generator:

```void Seed3Resr (uint32_t seed) {
unsigned n;
xx = 590009;
yy = 8675416;
zz = 46017471;
for (n=((seed>>22)&0x3ff)+20; n>0; n--) { xx = rotl(xx,43) - xx; xx = rotl(xx,27); }
for (n=((seed>>11)&0x7ff)+20; n>0; n--) { yy = rotl(yy,21) - yy; yy = rotl(yy,20); }
for (n=((seed    )&0x7ff)+20; n>0; n--) { zz = rotl(zz,51) - zz; zz = rotl(zz,26); }
}

uint64_t Rand3Resr() {  // Combined period = 2^123.32
xx = rotl(xx,43) - xx; xx = rotl(xx,27); // RESR, period =  9925159703554 = 2*53*93633582109
yy = rotl(yy,21) - yy; yy = rotl(yy,20); // RESR, period =  3841428396121 (prime)
zz = rotl(zz,51) - zz; zz = rotl(zz,26); // RESR, period =  348142888313 = 11*11*2877213953
return xx ^ yy ^ zz;
}
```

All three of the subcycle generators in Rand3Resr use RESR, yet this lack of variety does not prevent this combination generator from performing well in BigCrush.

By now, you're probably wondering how to choose which of all these combination generators to use. As mentioned earlier, use 64-bit with a 64-bit OS. Among the 64-bit generators above, I prefer RandRersResrResdra because the prime periods of its subcycle generators could add some safety. But in a later article, I'll discuss the outstanding CMRES generator for 64 bits.

Among the 32-bit generators, I recommend Rand2CmrCers because its multiplies provide superb mixing of bits and it has the maximum possible period. There's little difference among the other choices, so choose based on personal preference. Better yet, run tests on them and select the one that works best for your application.

### Constructing and Evaluating Generators

I mentioned that the constants used in these generators came from brute-force search. For each type of subcycle generator, I tried all useful values of rotations, traversed the cycles that resulted from several seeds, and created a table of the longest cycles. This took an especially long time for 64-bit irreversible generators because their periods are often more than 240. For CMR, the multipliers were found by trying random multipliers and traversing their cycles, running in an infinite loop. Each multiplier was evaluated based mainly on the maximum size of its seed range. That's why the CMR generators in Rand2CmrRsr and other reversible generators can be seeded with 16-bit numbers directly, whereas irreversible generators need to be stepped in a loop.

Having produced a table of long cycles for a given type of subcycle generator, I then created an experimental combination generator consisting of three of those subcycle generators, carefully selected to maximize the period. I ran that through BigCrush two or three times, and then evaluated the results. It's worth looking at the percentage of tests whose p values are .01 or smaller, or .99 or larger. Too many such values indicate that a generator is struggling to pass. By using three of the same type of generator (instead of mixing types as recommended), defects in the generator were more likely to be exposed. If it passed without struggling, the subcycle generator was put into my repertoire of good generators. Finally, I selected subcycle generators from that repertoire to assemble production-quality combination generators, such as those shown in this article. An exception is the CERS generator, which is not random enough to pass on its own, but is still useful as a fast (two clock cycle) way to boost the period.

If you wish to create your own 32-bit combination generators, I encourage you to start with CMR or CMFR because they are more random, have near-maximum periods, and are almost as fast as the other subcycle generators on modern CPUs. A combination of two CMR generators will pass BigCrush (whereas three of the other generators would be required to likewise pass). They can be combined with adds or XORs (I haven't tested other operators). I suggest trying a mixture of both, such as (x+y)z, as well as solely adds and XORs. To create your own 64-bit combination generators, you'll need to use only irreversible generators if you want a known period length.

You can obtain my files containing tables of cycles for the subcycle generators (both 32-bit and 64-bit), additional combination generators, source code for my cycle-finder program, and source for my command-line interface to the TestU01 library here.

### What's Next?

Part two of this article will be a deep dive into the many subcycle generators I've explored, and will also provide additional combination generators.

### References

[1] Jenkins, Bob, The testing and design of small state noncryptographic pseudorandom number generators, circa 2006. http://is.gd/fuctkC.

[2] Knuth, Donald, Seminumerical Algorithms, The Art of Computer Programming Vol. 2, Second Edition, pg. 69, Addison-Wesley, 1981.

[3] Marsaglia, George, The Marsaglia Random Number CDROM, with "The Diehard Battery of Tests of Randomness," produced at Florida State University under a grant from The National Science Foundation, 1995. Access available at http://is.gd/MUh3qT. A revised version of the Diehard tests is available here.

[4] Pierre L'Ecuyer, Richard Simard, TestU01: A C library for empirical testing of random number generators, ACM Transactions on Mathematical Software (TOMS), Volume 33, Issue 4, August 2007. http://is.gd/0XtlyC.

[5] Knuth, Donald, Seminumerical Algorithms, The Art of Computer Programming Vol. 2, Second Edition, pg. 90, Addison-Wesley, 1981.

[6] Marsaglia, George, "Xorshift RNGs," Journal of Statistical Software Vol.8 (Issue 14), July 2003. http://is.gd/RB6Ewf.

— Mark Overton is a lead Software Engineer at Northrop Grumman and has multiple patents in image processing. He can be contacted at [email protected].

Fast, High-Quality, Parallel Random-Number Generators: Comparing Implementations

### More Insights

 To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.