The paradox of probability 1
Sometimes people will describe an algorithm by saying that it works "with probability 1." It is almost true, but nevertheless naive to believe, that such a claim is the same as saying that the algorithm always works. The difference is subtle, in much the same way as the difference is subtle between the notions of "infinite" and "unbounded."
Here is a simple example. Suppose you want to cause something to happen 1/3 of the time, and the only way you can generate random data is by tossing a coin. Is there an algorithm for causing something to happen with probability 1/3?
On the surface, the answer would seem to be no. If you toss a coin n times, you get 2^n possible outcomes, and there is no integer k such that k/(2^n) is exactly 1/3. So if you look at the set of all of the possible outcomes of tossing a coin n times, there is no subset of that set that contains exactly 1/3 of its elements.
But what we have really proven is not that the problem has no solution, but rather that it has no solution if the number of coin tosses is bounded in advance. If we drop that restriction, the solution is easy.
What we do is toss the coin twice. Doing so gives us four possible outcomes, which we can call A, B, C, and D. If the outcome is A, we will do whatever it is we wanted to do 1/3 of the time. If the outcome is B or C, we won't do it. Finally, if the outcome is D, we will repeat the experiment, tossing the coin twice more. We will keep repeating it until the outcome is A, B, or C.
What if the outcome is D every time? Well, that can't happen—not really. For example, if we assign D to two tails in a row, we're saying that tossing a coin often enough is assuredly going to come up heads eventually. We can't say when it will happen, but we know that it must happen eventually.
This phenomenon is what we mean when we say that something will happen with probability 1. The probability of getting heads on the first toss is 1/2; for the first or second toss, it's 3/4; for the first, second, or third, it's 7/8, and so on. The limit of 1/2, 3/4, 7/8, 15/16, ... is 1; so we say that the probability is 1. It will eventually happen; we just don't know when.
There are a number of algorithms out there that work with probability 1. In general, there's little practical difference between such algorithms and algorithms that "always work" in the more traditional sense. The biggest difference is that it is usually not possible to put an upper bound on how long such an algorithm might take to run. We can usually talk about the average performance, but the absolute worst case may be more slippery.
But an algorithm doesn't have to have "probability 1" behavior for its absolute worst case to be intolerably bad: There are other, widely used, algorithms with this property. My next post will discuss a common example of such an algorithm.