Channels ▼

Mark Nelson

Dr. Dobb's Bloggers

A Big Problem That Doesn't Need a Bignum

February 15, 2011

When I finally wrapped up the mathematical modeling of the k-consecutive heads coin toss problem I had a concise formula:

To calculate the chances of k heads in a row appearing in n tosses, I just had to calculate a k-step Fibonacci number, and then divide it by an exponent of 2, and subtract the result from 1. Not complicated at all - but there was one catch. After a million coin tosses, the Fibonacci number and the exponent of 2 were approximately 300,000 digits long - way outside the scope of normal computations in languages like C++ and Java.

To get the exact result I needed, I used the BigInteger and BigDecimal Java classes to handle those big numbers, and had to wait about an hour to get the result. As it turns out, there is a much faster and easier way to do things

Simplifying the Computation

To solve this problem in a more reasonable matter, take a look at the part of the equation where all the real work is done:
In this equation, the p we are solving for is the probability that no runs of k consecutive heads will occur in n tosses of a fair coin. Although we might be able to find a closed-form equation that defines the k-step Fibonacci number, it is going to be a complicated polynomial, and the chances of us simplifying this equation are effectively nil.

So at first blush I seem to be stuck with calculating the top and bottom parts of the fraction independently. But a little inspection of the equation shows that I can actually calculate p in an iterative fashion, starting with n=1, and working my way up to the desired result.

Calculating the Fibonacci Number

The first step in the calculation is writing the code to calculate k-step Fibonacci numbers. Recall that the k-step Fibonacci number n is found by summing the values of the previous k numbers - a recursive definition. The base cases for all values of k are: Fibk(n) is 1 when n is 1, and 0 when n is less than 1.

To determine the value of Fibk in an iterative fashion, I make use of the the handy C++ container deque<T>. To calculate a k-step Fibonacci number, I make use of a deque<double> of size k that contains the previous k Fibonacci numbers. With the data structure set up like that, calculating the next Fibonacci number is simple:

  q.push_front( accumulate( q.begin(), q.end(), 0 ) );
  q.pop_back();
Note that after this calculation, q[ 0 ] contains the next new Fibonacci number, and the previous k-1 numbers are still stored in sequence in the deque. The call to pop_back() discards Fibk(n-k), which is not going to be needed for the next calculation.

Add the code to initialize the deque, and a loop, and we can print out Fibk(n) for values 1 through n:

	deque<double> q(k, 0);
	q[0] = 1.0;
	for ( int i = 1 ; i < n ; i++ ) {
            cout << q[ 0 ] << " ";
	    q.push_front( accumulate( q.begin(), q.end(), 0.0 ) );
            q.pop_back();
	}
	cout << q[ 0 ] << endl;
Using the familiar value of k=2, and n=20, I get the following output:
1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584 4181 6765
The deque is perfect for this class - it lets us simulate a rolling window through the Fibonacci sequence by pushing new values on one end and popping aged ones off the other end. All this while providing constant time access to indexed elements. (Admittedly this code could be written just as efficiently with a list<double>.)

Divide By Two

This is a nice simple calculation, but it only gets me halfway to the goal. I'm calculating the top part of the fraction that computes p, but I need to divide by the bottom value as well. As it turns out, I can accomplish this in a way that allows me to iterate to the next value: before I add the values of the previous Fibonacci numbers to q[0], I just divide all of them by two. The resulting loop ends up looking like this:
	deque<double> q(k, 0);
	q[0] = 1.0;
	q[1] = 1.0;
	for ( int i = 1 ; i < n ; i++ ) {
            q.push_front(0.0);
            for ( int j = 1 ; j <= k ; j++ ) {
                q[ j ] /= 2.0;
                q[ 0 ] += q[ j ];
            }
            q.pop_back();
	}
	cout << q[ 0 ] << endl;
Note that I've slightly changed the initialization - the deque<double> is initialized with Fibk(2) - this is because the top half of the fraction uses Fibk(n+2), and the bottom uses 2n - the value of n for the Fibonacci calculation always has to be two ahead of the exponent of 2.

This calculation allows me to calculate the probabilities for large numbers of tosses in a matter of seconds, instead of the hour-plus it took using BigInteger. Using this code I was able to get the final word on just when we could be certain that a certain number of coin tosses would produce a run of 20 heads. If we want to be modestly certain, with 90% probability I can say that 4,828,842 coin tosses will produce a run of 20 heads.

If my life depended on it, and I wanted to get out to five nines, I can say that 24,144,137 coin tosses will produce a run of 20 heads with 99.999% probability. The output of kfib.cpp is shown here:

There is a 90% probability of seeing 20 consecutive heads in 4828842 tosses of a fair coin
There is a 99% probability of seeing 20 consecutive heads in 9657666 tosses of a fair coin
There is a 99.9% probability of seeing 20 consecutive heads in 14486490 tosses of a fair coin
There is a 99.99% probability of seeing 20 consecutive heads in 19315313 tosses of a fair coin
There is a 99.999% probability of seeing 20 consecutive heads in 24144137 tosses of a fair coin
And that's the last thing I am going to say about it.

Note for DDJ Readers

I read all the comments posted on my DDJ blog articles, but you should know that unfortunately, this is done without the support of a notification system. So if you don't see a response right away, be patient - I'll be making a pass through soon.

Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.
 


Video