Listing Two implements the recursive variation. You can again define a function `B(p)` to represent the average number of fair bits you get per coin flip when the coin lands heads with probability `p`. For every two flips, you again get a fair bit with probability `2pq`; this adds `pq` fair bits per coin flip, on average. If you don't get a fair bit, you get a new flip. Recall the coin gave two heads with probability `p^2` and two tails with probability `q^2`, so on average you get `(p^2+q^2)/2` additional recursive coin flips per original coin flip. Also, each coin flip at the new level is heads with probability `p^2/(p^2+q^2)` and tails with probability `q^2/(p^2+q^2)`. (These values sum to 1, and preserve the proper ratio between two heads and two tails.)

###### Listing Two

```
Function ExtractBits ( Flips[0, NumFlips-1] )
{
    NumNewFlips = 0;
    for (j = 0; j < NumFlips/2; j++) {
        if ( Flips[2*j] == Heads ) and ( Flips[2*j+1] == Tails )
            print 0;
        if ( Flips[2*j] == Tails ) and ( Flips[2*j+1] == Heads )
            print 1;
        if ( Flips[2*j] == Heads ) and ( Flips[2*j+1] == Heads ) {
            NewFlips[NumNewFlips++] = Heads;
        }
        if ( Flips[2*j] == Tails ) and ( Flips[2*j+1] == Tails ) {
            NewFlips[NumNewFlips++] = Tails;
        }
    }
    if (NumNewFlips >= 2)
        ExtractBits ( NewFlips[0, NumNewFlips-1] );
}
```
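The pseudocode above can be sketched as a runnable function. This is a minimal Python version; the `'H'`/`'T'` string encoding, and returning the bits rather than printing them, are my own choices for testability:

```python
def extract_bits(flips):
    """Recursive von Neumann extraction (a sketch of Listing Two).

    flips: a string of 'H' and 'T'. Returns the extracted fair bits
    as a string of '0' and '1'.
    """
    bits = []
    new_flips = []
    for j in range(len(flips) // 2):
        a, b = flips[2 * j], flips[2 * j + 1]
        if a == 'H' and b == 'T':
            bits.append('0')          # heads-tails yields a 0
        elif a == 'T' and b == 'H':
            bits.append('1')          # tails-heads yields a 1
        else:
            new_flips.append(a)       # HH or TT: one flip for the next level
    if len(new_flips) >= 2:
        bits.append(extract_bits(''.join(new_flips)))
    return ''.join(bits)
```

For example, `extract_bits('HHTT')` yields one bit even though plain von Neumann extraction would discard both pairs: the HH/TT pair becomes the recursive sequence `'HT'`, which produces a 0.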

So now you can write the appropriate equation, as in Figure 1(a). While this equation for `B(p)` does not appear to have a simple closed form, you can use it to calculate specific values of `B(p)` recursively. Moreover, when `p=q=1/2`, you obtain `B(p)=1/3`, as shown in Figure 1(b). While this is an improvement for fair coins over the initial scheme, you are still far from the optimal of one fair bit per coin flip when the coin is fair. The recursive extraction removes some of the waste, but there is still more to be done.
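Figure 1 is not reproduced here, but from the probabilities worked out above, the recurrence and the fair-coin calculation presumably take the following form (a reconstruction, not the figure itself):

```latex
% Reconstruction of Figure 1(a): each pair of flips yields a fair bit with
% probability 2pq (pq bits per flip), plus (p^2+q^2)/2 recursive flips per
% flip, each heads with probability p^2/(p^2+q^2).
B(p) = pq + \frac{p^2+q^2}{2}\, B\!\left(\frac{p^2}{p^2+q^2}\right)

% Reconstruction of Figure 1(b): at p = q = 1/2 the recursive coin is again
% fair, so the recurrence closes on itself:
B(1/2) = \frac{1}{4} + \frac{1}{4}\,B(1/2)
\quad\Longrightarrow\quad
B(1/2) = \frac{1}{3}.
```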

###### Figure 1: Calculating B(p) for a simple recursive scheme.

### Making Even More Use of Symmetry

There is another symmetry you can take advantage of. Consider when the coin lands heads-heads-heads-tails and heads-tails-heads-heads. Both sequences produce one fair bit and one flip for the next sequence in the simple recursive scheme. But you have not taken advantage of the order in which these two events happened; in the first sequence, the fair bit was produced second, and in the second sequence, it was produced first. The symmetry between these two situations can yield another fair bit.

An easy way to extract this extra bit is to again use the idea of creating an additional sequence of coin flips. From the original sequence, you can derive a new sequence that gives you a biased coin flip for each consecutive pair of original flips. If the two flips are the same, you get heads; if the two flips are different, you get tails. You then apply von Neumann's scheme to the new sequence as well as the original sequence; see Table 3. You can again glue together the two outputs to obtain 01010010.
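As a sketch, deriving the same/different sequence and applying von Neumann's scheme to both sequences might look like this in Python; the eight-flip sequence at the bottom is a made-up example, not the one from Table 3:

```python
def von_neumann(flips):
    """Plain von Neumann extraction: HT -> 0, TH -> 1, HH/TT -> nothing."""
    out = []
    for j in range(len(flips) // 2):
        a, b = flips[2 * j], flips[2 * j + 1]
        if a != b:
            out.append('0' if a == 'H' else '1')
    return ''.join(out)

def derived_sequence(flips):
    """One flip per pair: H if the two flips match, T if they differ."""
    return ''.join('H' if flips[2 * j] == flips[2 * j + 1] else 'T'
                   for j in range(len(flips) // 2))

seq = 'HTHHTHTT'                    # a hypothetical original sequence
bits = von_neumann(seq) + von_neumann(derived_sequence(seq))
```

Here the original sequence alone yields `'01'`, but the derived sequence `'THTH'` contributes two more bits, for `'0111'` in total.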

###### Table 3: Introducing additional sequences yields more bits.

Now you can also use the other new sequence described previously. So when you start with an original sequence, use it to derive two further sequences as in Table 4. It turns out that if you glue all the bits together, you get independent and fair bits.

###### Table 4: Using the two sequences together.

Of course, you can go further by recursively extracting more bits from each sequence. That is, each new sequence should generate two further new sequences, and so on. This recursive construction is easy to code; see Listing Three.

###### Listing Three

```
Function ExtractBits ( Flips[0, NumFlips-1] )
{
    NumNewFlipsA = 0;
    NumNewFlipsB = 0;
    for (j = 0; j < NumFlips/2; j++) {
        if ( Flips[2*j] == Heads ) and ( Flips[2*j+1] == Tails ) {
            print 0;
            NewFlipsA[NumNewFlipsA++] = Tails;
        }
        if ( Flips[2*j] == Tails ) and ( Flips[2*j+1] == Heads ) {
            print 1;
            NewFlipsA[NumNewFlipsA++] = Tails;
        }
        if ( Flips[2*j] == Heads ) and ( Flips[2*j+1] == Heads ) {
            NewFlipsB[NumNewFlipsB++] = Heads;
            NewFlipsA[NumNewFlipsA++] = Heads;
        }
        if ( Flips[2*j] == Tails ) and ( Flips[2*j+1] == Tails ) {
            NewFlipsB[NumNewFlipsB++] = Tails;
            NewFlipsA[NumNewFlipsA++] = Heads;
        }
    }
    if (NumNewFlipsA >= 2)
        ExtractBits ( NewFlipsA[0, NumNewFlipsA-1] );
    if (NumNewFlipsB >= 2)
        ExtractBits ( NewFlipsB[0, NumNewFlipsB-1] );
}
```
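A runnable Python sketch of this doubly recursive construction follows. As before, the `'H'`/`'T'` encoding and the returned bit string are my choices; sequence A uses the text's convention that matching pairs give heads:

```python
def peres_extract(flips):
    """Recursive extraction with two derived sequences (a sketch of
    Listing Three).

    flips: a string of 'H' and 'T'. Returns fair bits as a string.
    Sequence A records same (H) / different (T) for each pair;
    sequence B records the shared value of each HH or TT pair.
    """
    bits = []
    new_a = []                        # same/different sequence
    new_b = []                        # values of matching pairs
    for j in range(len(flips) // 2):
        a, b = flips[2 * j], flips[2 * j + 1]
        if a != b:
            bits.append('0' if a == 'H' else '1')
            new_a.append('T')         # the pair differed
        else:
            new_b.append(a)           # HH or TT
            new_a.append('H')         # the pair matched
    if len(new_a) >= 2:
        bits.append(peres_extract(''.join(new_a)))
    if len(new_b) >= 2:
        bits.append(peres_extract(''.join(new_b)))
    return ''.join(bits)
```

The order symmetry discussed above is now visible in code: `peres_extract('HHHT')` and `peres_extract('HTHH')` produce different two-bit outputs, whereas the scheme of Listing Two gives one bit for both.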

All of the procedures to this point have been designed recursively, so the bits from one sequence are output before the bits from the derived sequences are output. Is this important? In fact, you must be somewhat careful to make sure that you do not introduce correlations by ordering the bits in an unusual fashion. The recursive approach is known to be safe, so I recommend sticking with it.

You can again write an equation that determines the long-term average number of bits produced per flip. The equation is similar to the previous case, except now, for every two flips, you also get an additional flip in one of the derived sequences. This flip is heads with probability `p^2+q^2`, since it comes up heads whenever the pair of flips is the same. This results in the equation in Figure 2(a). As shown in Figure 2(b), this equation yields `B(1/2)=1`. Now if you flip a fair coin, you expect in the long run to get one fair bit per flip using this recursive procedure. You are finally doing essentially as well as you could hope for.
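Again, Figure 2 is not reproduced here; a reconstruction of the recurrence and the fair-coin calculation consistent with the description above is:

```latex
% Reconstruction of Figure 2(a): the previous recurrence plus a term for the
% same/different sequence, which gains one flip per pair (1/2 per flip),
% heads with probability p^2+q^2.
B(p) = pq + \frac{p^2+q^2}{2}\, B\!\left(\frac{p^2}{p^2+q^2}\right)
          + \frac{1}{2}\, B(p^2+q^2)

% Reconstruction of Figure 2(b): at p = q = 1/2 both derived coins are fair:
B(1/2) = \frac{1}{4} + \frac{1}{4}\,B(1/2) + \frac{1}{2}\,B(1/2)
\quad\Longrightarrow\quad
B(1/2) = 1.
```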

###### Figure 2: Calculating B(p) for the recursive scheme with two new sequences generated per sequence.

This process extracts the maximum number of fair bits possible for every value of `p`. To prove this requires knowing some information theory. You need to know that the maximum rate at which fair bits can be produced from biased bits is given by the entropy function; see Figure 3(a). The complex-looking equation has a straightforward solution: `B(p)` equals the entropy `H(p)`. You can verify this by plugging `B(p)=H(p)` into the recurrence. It is easiest to first calculate term by term, and to use the fact that when `p+q=1`, you have `1-p^2-q^2=2pq`. Suppose `B(p)=H(p)`. Then you first calculate easier expressions for the terms on the right side; see Figures 3(b) and 3(c). Now check that the right side comes out to `H(p)=B(p)`; see Figure 3(d).
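Figure 3 is not reproduced here; one consistent way the verification may go uses the chain-rule decomposition of the entropy of a pair of flips:

```latex
% Entropy of a biased coin (Figure 3(a)):
H(p) = -p \log_2 p - q \log_2 q, \qquad q = 1 - p.

% Write s = p^2 + q^2, so 1 - s = 2pq. Reveal a pair of flips in two stages:
% first whether the pair matches (entropy H(s)); then, if it differs
% (probability 1-s), the order, which is a fair bit; if it matches
% (probability s), the shared value, a coin with heads probability p^2/s:
2H(p) = H(s) + (1-s)\cdot 1 + s\,H\!\left(\frac{p^2}{s}\right).

% Substituting B = H into the right side of the recurrence and applying the
% identity above:
pq + \frac{s}{2}\,H\!\left(\frac{p^2}{s}\right) + \frac{1}{2}\,H(s)
  = pq + \frac{1}{2}\bigl(2H(p) - (1-s)\bigr)
  = pq + H(p) - pq
  = H(p).
```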

###### Figure 3: Verifying that B(p)=H(p), the entropy function.

### Conclusion

Although I have presented some basic pseudocode, you might find it interesting to write more efficient versions on your own. There is a collection of implementations and related links at http://www.ciphergoth.org/software/unbiasing/, and Peres's work has been the subject of discussion on the sci.crypt newsgroup.

### Acknowledgment

Michael's work is supported by NSF grant numbers CCR-9983832, CCR-0118701, and CCR-0121154.

*Michael is a professor of computer science at Harvard University and can be contacted at michaelm@eecs.harvard.edu or http://www.eecs.harvard.edu/~michaelm/.*