The supplied code (available online at www.ddj.com/code/) can be used to estimate the runtime optimization resulting from the use of SmartRange. The theoretical limit is 25 percent off the original runtime: Going down from the expected 2 calls to next() per each call to next(max) when the probability of rejection is 1/2, to 1.5 expected calls to next() when the probability of the first rejection is 1/2 , and after reduction of m the probability of another rejection is 0 (one call × probability of 1/2 , plus two calls × probability of 1/2).
This result is what happens with the chosen Mersenne Twister implementation, when defining, for instance, m=231+32 on 32-bit platforms. Actually, SmartRange favors even better than the aforementioned 25 percent, probably due to more information being available to the compiler in this case; see Table 1.
|SimpleRange||1.99976||1.30s (0.53s w/inlining)|
The computation of the remainder range size rem is carefully written to avoid overflows when value_t is an unsigned integer (as it is in the Random wrapper class). Mathematically, if you define m as the desired range exclusive maximum, M as the maximum number representable by value_t plus 1 (that is, making the previously used M greater by 1), and r as the remainder range size, then r=M mod m. Moreover, the GCD value that we compute in random_traits is actually:
g=gcd(m,r)=gcd(m,M mod m)=gcd(M,m)
Because on most architectures M is a power of 2 (do you know a platform on which it is not true?), g is actually 2 to the power of the number of trailing zeros in the binary representation of m. Thus, if we wish to forgo platform neutrality, the approach I present in this article might be efficient even when m is not known at compile time, especially if appropriate bit-twiddling hacks are employed (graphics.stanford.edu/~seander/bithacks.html).
Can REDUCE_MAX happen more than once for a single call to next(max)? The answer is no, but proving it is left as an exercise to interested readers.