# The Million Random Digit Challenge

December 28, 2009

I've mentioned my Million Random Digit Challenge here before. In a nutshell, I've posted a file of a million random decimal digits, packed into binary form, and challenged all comers to compress it. The proof is required to be a Kolmogorov-style work: a program that when run creates a perfect copy of the original file. The only requirement is that the program (plus any associated data file) be smaller than the target million digit file.

People have been attempting to meet this challenge since 2002 with no luck. The file was specifically designed by geniuses at RAND not to have any recognizable statistical patterns, and apparently this goal was accomplished quite well. And what do most compression programs do? Look for statistical patterns. Fail.

### Abandon Patterns

Classic statistical techniques are just not going to do it for this problem. I think the only chance to win this prize is to use something I've often disparaged, which I call Magic Function Theory.

The idea behind Magic Function Theory is that we come up with some short but sweet generator function that can create a long sequence. Just as an example, I can create a magic function for any sequence imaginable using just three things:

• A program that generates the digits of pi. This program will be quite short.
• An offset into that string of digits.
• A length of the string starting at that offset.

I believe (IANAM) that this system can generate every finite sequence of digits, although that rests on pi being a normal number, which is widely conjectured but has not been proven. Of course, how far would we have to go to find the million random digits? Here's where it gets interesting. We might have to go quite a distance, but what if the offset to the million random digits turns out to be an easily compressible number? What if the million random digits appear at position 374,567,111^(14,127,269) - 623,557,570,925? If that were the case, we could represent the million random digits in a few hundred bytes - quite an accomplishment.
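To get a feel for the offset problem at a small scale, here is a minimal sketch that computes a few thousand decimals of pi with Machin's formula and reports where a short target first appears. The function names `arctan_inv`, `pi_decimals`, and `find_in_pi` are made up for this illustration, and the 4,000-digit search depth is an arbitrary choice that keeps the run fast.

```python
def arctan_inv(x, unity):
    """Return arctan(1/x) * unity, using integer arithmetic on the Taylor series."""
    term = unity // x
    total = term
    x_squared = x * x
    divisor = 3
    sign = -1
    while term:
        term //= x_squared            # next odd power of 1/x, still scaled by unity
        total += sign * (term // divisor)
        sign = -sign
        divisor += 2
    return total


def pi_decimals(count):
    """Return the first `count` decimals of pi (after the leading 3) as a string."""
    guard = 10                        # extra digits that absorb truncation error
    unity = 10 ** (count + guard)
    # Machin's formula: pi = 16*arctan(1/5) - 4*arctan(1/239)
    pi_scaled = 16 * arctan_inv(5, unity) - 4 * arctan_inv(239, unity)
    return str(pi_scaled // 10 ** guard)[1:]   # drop the leading "3"


def find_in_pi(target, search_depth=4000):
    """Return the 0-based offset of `target` in the decimals of pi, or -1 if absent."""
    return pi_decimals(search_depth).find(target)


for target in ("14159", "999999"):
    print(target, "found at offset", find_in_pi(target))
```

The run of six 9s is a lucky case: its offset is shorter than the target itself. Most six-digit targets cannot appear in the first 4,000 decimals at all, and if the digits of pi behave like random ones, the offset will on average need about as many digits as the string it points to.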

Another approach might be to look for expressions that generate the million digit number. What if there was some short expression of the form k^n + j that generated the number, where k, n, and j were representable using some nice compact format?
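Trying this idea on small numbers is easy. The sketch below reads the form as k raised to the power n, plus j. It uses the hypothetical helper names `iroot` and `best_power_form`, looks only at exponents n >= 2, takes k as the integer n-th root of the target, and counts the decimal digits needed to write k, n, and j. It is a brute-force illustration, not a search strategy that scales to a million digits.

```python
def iroot(x, n):
    """Return the largest integer r with r**n <= x, for x >= 0 and n >= 1."""
    lo, hi = 0, 1
    while hi ** n <= x:
        hi *= 2
    # Invariant: lo**n <= x < hi**n
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if mid ** n <= x:
            lo = mid
        else:
            hi = mid
    return lo


def best_power_form(target):
    """Return (cost, k, n, j) with k**n + j == target and the fewest total digits.

    Only exponents n >= 2 with 2**n <= target are tried, so target must be >= 4.
    """
    best = None
    n = 2
    while 2 ** n <= target:
        k = iroot(target, n)
        j = target - k ** n           # non-negative remainder
        cost = len(str(k)) + len(str(n)) + len(str(j))
        if best is None or cost < best[0]:
            best = (cost, k, n, j)
        n += 1
    return best


for target in (10 ** 10, 3141592653):
    cost, k, n, j = best_power_form(target)
    print(f"{target} = {k}^{n} + {j}: {cost} digits vs {len(str(target))}")
```

A round number such as 10^10 collapses to a handful of digits, but for an arbitrary target the remainder j tends to soak up whatever the power term fails to cover.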

The final suggestion I will toss out for consideration is to use prime numbers. Find the nth prime number, p(n), that is closest to the million digit number, then add in an offset. The simple formula p(n) + k has a shot at generating our target. (Note the downside to this is that n will be heartbreakingly large. It won't contain a million digits, but it will only be a bit shorter. There are a lot of primes.)
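The parenthetical can be made concrete with the prime number theorem, which says the number of primes below x is roughly x / ln x, and that primes near x are spaced about ln x apart on average. The sketch below is an estimate only, and `prime_index_digits` and `typical_gap` are names invented here.

```python
import math


def prime_index_digits(target_digits):
    """Estimate the decimal digits in n when p(n) is near 10**target_digits."""
    ln_x = target_digits * math.log(10)            # ln(10**target_digits)
    log10_n = target_digits - math.log10(ln_x)     # log10(x / ln x)
    return math.floor(log10_n) + 1


def typical_gap(target_digits):
    """Average spacing between primes near 10**target_digits, roughly ln x."""
    return target_digits * math.log(10)


digits = 1_000_000
print("digits in n:", prime_index_digits(digits))
print("typical prime gap:", round(typical_gap(digits)))
```

Under this estimate n comes out only about six digits shorter than the target, while the average gap between neighboring primes is about 2.3 million, so the offset k can easily cost back most of what n saved.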

### If Only...

These are the kinds of ideas that motivate a lot of dreamers out there, and it is at least a good intellectual exercise to think about how one would go about solving the problem this way. The prime number test, for example, appears to be unsolvable today - p(n) is only known for prime numbers up to something like 10^18.

As the prime number analysis shows, it is just very difficult to deal with the hunt when the target is a million digits long. To win at this I suspect raw computer power would be less important than theoretical and algorithmic foundations.

A little analysis shows one thing: you wouldn't just have to be lucky to win with any of these approaches. You would have to be staggeringly, incredibly lucky - as if you were the one elementary particle in the entire universe selected for a lottery prize. If you doubt it, try using some of these tests on, say, 10-digit numbers, and see what you would need to do to compress the key values to fewer than 10 digits.
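One way to put a number on that luck is a simple counting argument: a decoder maps each key to exactly one output, so there cannot be more compressible targets than there are shorter keys. The sketch below counts decimal strings. `count_strings` and `fraction_compressible` are illustrative names, and treating keys as plain digit strings is a simplifying assumption.

```python
from fractions import Fraction


def count_strings(max_len):
    """Number of decimal strings of length 0 through max_len."""
    return sum(10 ** i for i in range(max_len + 1))


def fraction_compressible(digits, saved):
    """Upper bound on the share of `digits`-long targets that can shrink
    by at least `saved` digits under any fixed decoding scheme."""
    return Fraction(count_strings(digits - saved), 10 ** digits)


for saved in (1, 3, 6):
    bound = fraction_compressible(10, saved)
    print(f"save >= {saved} digits: at most {float(bound):.7f} of 10-digit targets")
```

The bound shrinks by roughly a factor of ten for every extra digit saved, and nothing about it depends on which magic function is used.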

But despite the odds, many will still continue the hunt. Perhaps this post will give them some ideas on new approaches.
