A Fast Pseudo Random Number Generator

The r250 algorithm uses a 250-element array to generate pseudo random numbers faster than better-known techniques.

May 01, 1991
URL:http://www.drdobbs.com/parallel/a-fast-pseudo-random-number-generator/184408549

Example 2(a)

Example 2(a)

Example 2(b)

Figure 1

Figure 2

MAY91: A FAST PSEUDO RANDOM NUMBER GENERATOR

A FAST PSEUDO RANDOM NUMBER GENERATOR

r250 for "better" random numbers

W. L. Maier

Bill Maier is a software engineer whose main interests are simulation, mathematical software, and computer graphics. He can be reached at 3808 Seven Gables, Fort Worth, TX 76133.

Computers have been required to generate pseudo random numbers from the earliest days of computing. The usual method used to generate random numbers is the linear congruent algorithm, which is implemented by repeated use of the formula shown in Example 1 where the notation "p mod q" signifies the remainder left over after dividing p by q.

Example 1: The linear congruent algorithm

  x' = (ax + c) mod m,

Although the linear congruent technique is by far the most common algorithm in use for generating pseudo random numbers, there are other methods available. One such method is the r250 algorithm, described by the physicists E. Stoll and S. Kirkpatrick in a 1981 issue of the Journal of Computational Physics. This method, named "r250" because of the 250-element array used in the algorithm, is quite effective in general and is particularly well-suited for use on PCs. I first became aware of the r250 algorithm several years ago after reading an article in which the physicist Per Bak used this method to generate random numbers for a Monte Carlo simulation of the Ising model. This model is well-known in the physics community, and its simulation requires thousands of independent random numbers to be generated. Although most simulations of this type were (at the time the article was written) run on large computers, Bak demonstrated that he could perform the same simulations on a PC.

The basic theory behind the r250 algorithm is that, under appropriate conditions, a pseudo random sequence of bits can be generated by using the formula in Example 2(a). In this expression, the a_i and c_i are bit values, and are therefore equal to either 0 or 1. The formula simply says that if we have a set of bits that have been previously generated (that is, a_i-1 through a_i-p), we can multiply them by the coefficients c_i and add them together to create a new pseudo random bit, a_k. We add this bit to our sequence, and then repeat the formula until we have as many random bits as necessary. The maximum period of this sequence is 2^p-1, which is achieved by choosing the polynomial in Example 2(a) to be primitive. To simplify our calculations, we choose most of the c_i to be 0--in fact, for the r250 method all but two of them are set to 0, and we wind up with the formula in Example 2(b).

In the r250 generator, the primitive polynomial chosen is q=103 and p=250. Because we are generating only a single bit, we don't have to worry about the carry when we add. We can simply use an XOR (exclusive or) operation, which is the same as addition without carry. Thus, to create a random bit, we go back to the 103rd and 250th bits that we previously generated and XOR those values together. Of course, we are really not interested in just generating a single random bit; in a real application we might want 16-bit values, for example. To accomplish this, we treat each of the 16 bits in the word with the above formula. Instead of keeping a single sequence of bits, we keep 16 sequences of bits, which is just a sequence of 16-bit words, and use them in the formula in Example 2(b), performing an XOR between the 103rd and 250th previous words.

If we compare the r250 method with the linear congruent method, we see that r250 must perform one XOR and two index calculations to create a random number, while the linear congruent method requires a multiplication, an addition, and a division. Although the division is usually circumvented by using automatic truncation of integers to register length, the linear congruent method still needs to perform a multiplication each time a new random number is created. Because multiplication can be a time-consuming instruction compared to XOR, the r250 method is often a faster algorithm. For example, on an 8086 processor an XOR instruction requires four clock cycles, compared with about 115 clock cycles needed for an integer multiply. The MUL instruction was sped up on the later editions of the 80x86 family, but even on the 80386, MUL can take as many as 41 clock cycles compared with the two needed for XOR on the same processor.

As noted previously, the period of this method is given by the expression 2^p-1; in the case of r250, p = 250, so the period of this implementation is 2^250-1, which is approximately 1.8e75. In comparison, the 16-bit linear congruent method repeats (in the best case) after 65,536 iterations. For applications where thousands of random numbers are required, r250 is clearly the superior choice.

Implementing r250

The r250 algorithm is implemented with two functions, one which initializes the generator and one which returns a random integer each time it is called. The initialization function sets up a buffer of 250 random numbers, created using some other available random number generator such as one employing the linear congruent method. A pointer into the buffer is also set up. After the system has been initialized, the generator routine creates a new random number by performing an XOR between the numbers in the buffer at the current index and at the current index plus 103. If adding 103 to the index would put the pointer beyond the end of the buffer, it is wrapped around to keep it within bounds. The new number produced by the XOR is stored at the current index, and is also returned as the function value. The index is incremented before returning in preparation for the next time r250 is called.

There is one potential problem concerning r250's initialization that must be dealt with. Certain combinations of bits in the initial buffer can cause the r250 algorithm to produce numbers which are too regular to be considered pseudo random. One example of this would be the situation where a given bit was 0 in each of the buffer words. Say, for example, that the buffer was initialized with bit 12 equal to 0 in all 250 words in the buffer. Because new words are created by XORing two previous buffer words together, and 0 XOR 0 = 0, bit 12 would be 0 in all subsequent words. This situation and other similar problems can be avoided by ensuring that the buffer words are linearly independent. Although a full explanation of this requires knowledge of linear algebra and is beyond the scope of this article, linear independence can be achieved by application of the following algorithm. For N-bit words, choose any N words from the 250 word buffer, and think of the bits in these words as forming an N x N square matrix. Set all of the bits along the diagonal from the upper left corner to the lower right corner to 1, and set all bits to the left of these diagonal bits to 0, leaving bits to the right of the diagonal unchanged. This ensures linear independence, and guarantees that the random numbers produced by r250 are pseudo random.

The code in Listing One (page 157) is a 16-bit implementation of the r250 algorithm in C, organized as a separate module that can be linked to other programs requiring random numbers. The code shown was compiled with the Turbo C++ compiler, but it does not use any elements of C++ and should compile under almost any standard C compiler. Three routines are provided: r250_init for initializing the system, r250 for generating random unsigned integers, and dr250 for generating floating point random numbers in the interval 0 to 1. The static variables r250_buffer and r250_index hold the random number buffer and the index to the current location in the buffer, respectively. (C++ devotees will recognize that these functions and data can be encapsulated into a class for r250.) The standard C function rand( ) is used to initialize the r250 buffer. However, rand( ) returns integers in the range 0 to 0x7fff, so I make the numbers in the r250 buffer true 16-bit values by adding a loop to turn on bit 15 according to whether rand( ) returns a value in the upper or lower half of its range, delimited by the value 16384. The last loop ensures that the buffer is correctly initialized by applying the linear independence algorithm given above. I use the formula k = 11 * j + 3 to spread out the selected words over the buffer, although any 16 words would in fact work just as well.

Because rand( ) is being used to initialize the r250 buffer, there is a possibility that overlapping sequences between runs will be produced, causing repeating sequences in the output of r250. For example, suppose that the seed value is chosen to be 687, and that rand( ) fills the r250 buffer with the values 687, 16857, 23139, 2104, 16876, and so on. Now suppose that a second run is made and that the seed chosen is 16857, which fills the r250 buffer with the values 16857, 23139, 2104, 16876, and so on. These buffers are nearly the same, so some of the output of r250 will be identical between runs. Although such a situation is possible, its probability of occurrence is low because there are only 250 values in the buffer from a possible 32768 produced by rand( ). However, if a great many runs are to be made and overlapping sequences would be a problem, it would be better to use r250 with elements larger than 16 bits. It would not be difficult to extend the code given to use 32-bit values and to initialize the r250 buffer with a coded linear congruent equation rather than rand( ).

The routine r250 implements the basic algorithm. The index j is the value of the current index plus 103, unless that addition would put it outside the buffer. Subtracting 147 from the index is the same as adding 103 and wrapping it around, because 103 + 147 = 250, the size of the buffer. The rest of the routine is straightforward. The routine dr250 returns a double in the range 0.0 to 1.0 using the r250 algorithm. It is an exact duplicate of r250 except for the last line, which divides by 0xFFFF to produce the floating point number. You could, of course, implement dr250 by performing a function call to r250 and then doing the division, but this adds the overhead of the function call even though it does save some space. Speed demon that I am, I chose the implementation shown.

A number of sophisticated tests, both theoretical and experimental, have been applied to the r250 algorithm to ensure that the numbers it is returning are valid, and the interested reader can look them up in the references. However, it is still necessary to check our algorithm to ensure that we have coded and implemented it correctly. The test I chose for this is an intuitive one that lends itself easily to graphical analysis. The basic concept behind this test is that if we divide the interval 0 to 1 into N equal bins (that is, subintervals), then a random number in the range 0 to 1 will be equally likely to appear in any of the N bins. Thus, if a large amount of random numbers are generated, say M, the number which appear in any given bin should be approximately M/N. We then run this test for various values of M and N, with various seed values for r250, and look for bins which contain too few or too many random numbers. On a single run there will be some variation in the number of values in each bin, but if we vary the seed value for a given M and N, we should not find any bins that are regularly over- or under-populated. A program which implements this test is given in Listing Two (page 157). I ran the test with output redirected to a file, then used the file to generate a graph showing the bin populations (see Figure 1). I could not find any evidence of regularity in the random numbers.

Another intuitive test that is easily performed is to generate a large set of random numbers in the range 0 to 1 and match them up in pairs. These pairs are then used as the X, Y coordinates for points to be plotted. This procedure is repeated for many random number pairs, all plotted on the same graph. The points so plotted should be uniformly spread over the area from x = 0 to 1 and y = 0 to 1. In addition, there should not be any regular features in the plot, such as lines or spirals of points, which would indicate a departure from randomness. I ran this test as well on r250 (see Figure 2), and again could not find any problems.

As noted, the r250 method should, in theory, be quite fast. To test this hypothesis I wrote a program to generate a large number of random numbers using both the r250 method and the standard rand( ) function from the C library. I ran this test on both an AT&T 3B2 computer running Unix and my 80386 PC under DOS, timing the results so that a speed comparison could be done. On the 3B2 machine the r250 method was consistently about 15 percent faster than rand( ). A similar test on the 80386 machine showed that r250 was also about 15 percent faster than rand( ) on that platform. In both cases I used r250 implemented in C as shown here; even more speed could be gained by writing r250 in assembly language.

The r250 random number generator is an attractive alternative to the usual methods. I have used it in a wide variety of applications and have been very satisfied with the results. R250 is faster than the standard rand( ) function, and will produce far longer sequences of random numbers without repeating than the linear congruent method. Although I chose to implement r250 with 16-bit random numbers, the method can be extended in a straightforward manner to produce any size random numbers desired. In fact, the only disadvantage to the r250 method that I can find is that is takes up more space than the linear congruent method, which can be implemented with a single line of C code. For applications where space is at a premium, the linear congruent method may be preferable, but in most situations r250 can save valuable CPU time and provide a better spectrum of random numbers at only a small cost in program size.

References

Bak, P. "Doing Physics with Microcomputers." Physics Today (December, 1983).

Kirkpatrick, S. and E. Stoll. "A Very Fast Shift-Register Sequence Random Number Generator." Journal of Computational Physics (vol. 40, 1981).

Fruit, Robert "A Pseudo Random Number Generator." The C User's Journal (May, 1990).

Knuth, Donald. The Art of Computer Programming, Volume 2: Seminumerical Algorithms. Reading, Mass.: Addison-Wesley, 1969.


_A FAST PSEUDO RANDOM NUMBER GENERATOR_
by W.L. Maier

[LISTING ONE]



/***************************************************************************
*  Module:  r250.cpp   Description: implements R250 random number generator,
*  from S. Kirkpatrick and E. Stoll, Journal of Computational Physics, 40,
*  p. 517 (1981). Written by:    W. L. Maier
***************************************************************************/

#include <stdlib.h>

/**** Static variables ****/
static unsigned int r250_buffer[250];
static int r250_index;

/**** Function prototypes  ****/
void r250_init(int seed);
unsigned int r250();
double dr250();

/**** Function: r250_init  Description: initializes r250 random number
generator. ****/
void r250_init(int seed)
{
/*------------------------------------------------------------------------*/
    int        j, k;
    unsigned int mask;
    unsigned int msb;
/*------------------------------------------------------------------------*/
    srand(seed);
    r250_index = 0;
    for (j = 0; j < 250; j++)     /* Fill the r250 buffer with 15-bit values */
        r250_buffer[j] = rand();
    for (j = 0; j < 250; j++)     /* Set some of the MS bits to 1 */
        if (rand() > 16384)
            r250_buffer[j] |= 0x8000;
    msb = 0x8000;       /* To turn on the diagonal bit   */
    mask = 0xffff;      /* To turn off the leftmost bits */
    for (j = 0; j < 16; j++)
        {
        k = 11 * j + 3;             /* Select a word to operate on        */
        r250_buffer[k] &= mask;     /* Turn off bits left of the diagonal */
        r250_buffer[k] |= msb;      /* Turn on the diagonal bit           */
        mask >>= 1;
        msb >>= 1;
        }
}
/**** Function:  r250  Description: returns a random unsigned integer. ****/
unsigned int r250()
{
/*------------------------------------------------------------------------*/
    register int    j;
    register unsigned int new_rand;
/*------------------------------------------------------------------------*/
    if (r250_index >= 147)
        j = r250_index - 147;      /* Wrap pointer around */
    else
        j = r250_index + 103;

    new_rand = r250_buffer[r250_index] ^ r250_buffer[j];
    r250_buffer[r250_index] = new_rand;
    if (r250_index >= 249)      /* Increment pointer for next time */
        r250_index = 0;
    else
        r250_index++;

    return new_rand;
}
/**** Function:  r250  Description: returns a random double in range 0-1. ****/
double dr250()
{
/*------------------------------------------------------------------------*/
    register int    j;
    register unsigned int new_rand;
/*------------------------------------------------------------------------*/
    if (r250_index >= 147)
        j = r250_index - 147;     /* Wrap pointer around */
    else
        j = r250_index + 103;

    new_rand = r250_buffer[r250_index] ^ r250_buffer[j];
    r250_buffer[r250_index] = new_rand;
    if (r250_index >= 249)      /* Increment pointer for next time */
        r250_index = 0;
    else
        r250_index++;
    return new_rand / (double)0xffff;   /* Return a number in 0.0 to 1.0 */
}

[LISTING TWO]



/***************************************************************************
*  Module: rtest.c   Description: tests R250 random number generator by
*  placing data in a set of bins.
***************************************************************************/

#include <stdio.h>
#include <stdlib.h>

/**** Constants ****/
#define NMR_RAND   5000
#define MAX_BINS   500

/**** Function prototypes *****/
unsigned int r250();
void   r250_init(int seed);

/**** Function:    main  ****/
void main(int argc, char *argv[])
{
/*------------------------------------------------------------------------*/
   int      j, k;
   int      nmr_bins;
   int      seed;
   int      bins[MAX_BINS];
   double   randm;
   double   bin_limit[MAX_BINS];
   double   bin_inc;
/*------------------------------------------------------------------------*/
   if (argc != 3)
      {
      printf("Usage -- rtest [nmr_bins] [seed]\n");
      exit(1);
      }
   nmr_bins = atoi(argv[1]);
   if (nmr_bins > MAX_BINS)
      {
      printf("Error -- maximum number of bins is %d\n", MAX_BINS);
      exit(1);
      }
   seed = atoi(argv[2]);
   r250_init(seed);
   bin_inc = 1.0 / nmr_bins;
   for (j = 0; j < nmr_bins; j++)
      {
      bins[j] = 0;      // Initialize bins to zero
      bin_limit[j] = (j + 1) * bin_inc;
      }
   bin_limit[nmr_bins-1] = 1.0e7;   // Make sure all others are in last bin
   for (j = 0; j < NMR_RAND; j++)
      {
      randm = r250() / (double)0xffff;
      for (k = 0; k < nmr_bins; k++)
         if (randm < bin_limit[k])
            {
            (bins[k])++;
            break;
            }
      }
   for (j = 0; j < nmr_bins; j++)
      printf("%d\n", bins[j]);
}