A Tidal Wave of Spam

By Lincoln D. Stein, July 01, 2002

Why let junk mail infuriate you when it can entertain you instead?

I opened my email inbox this morning and found 120 messages waiting. These included:

Twenty-seven messages from several mailing lists to which I subscribe
Twenty-seven messages sent by the Klez virus, including two masquerading under subject lines that purported to tell me how to remove the Klez virus
Twelve messages in various non-English languages, most of which I can't read because I don't have the proper character set
Twelve messages from merchants, including three trying to sell me prescription and non-prescription pharmaceuticals, two trying to sell me computer hardware, and two advertising Web and email hosting services
Six messages trying to entice me to pornography sites or interest me in sex services, including an advertisement for penis enlargement techniques
Five messages offering make-money-quick schemes, including two variants on the Nigerian scam
Four bounced emails, mostly from mailing lists
Three political messages
Two offers of software or services for issuing mass mailings

I also received 22 personal messages, about 18 percent of the total email in my inbox. Even if we pool mailing lists and personal mail, a majority of my inbox (51 percent) is still spam.

Junk email used to make me furious. My first attempt to fight back was to junk the junk mail using the filtering software that was built into my mail reader. Each time I received a new piece of spam, I'd enter the sender's name and subject line into the filter so that I'd never receive that piece of spam again. This system never worked well because spammers vary their headers to avoid this type of filtering. For each piece of junk email the filter found, five slipped past.
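To see why this kind of filter leaks, here is a minimal sketch of exact-match blocking on sender and subject. It is not the mail reader's actual code; the function names, addresses, and subject lines are hypothetical and chosen only for illustration.

```python
# Hypothetical exact-match filter; the addresses and subjects are made up.
blocked = set()

def block(sender, subject):
    """Remember one sender/subject pair exactly as it was seen."""
    blocked.add((sender.strip().lower(), subject.strip().lower()))

def is_blocked(sender, subject):
    """True only when both fields match a stored pair exactly."""
    return (sender.strip().lower(), subject.strip().lower()) in blocked

block("deals@example.com", "Make money fast")
print(is_blocked("deals@example.com", "Make money fast"))     # True
print(is_blocked("deals2@example.net", "Make m0ney fast!!"))  # False
```

The second message is the same pitch with a slightly different sender and subject, and it passes straight through, which is the weakness described above.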

I turned to more sophisticated filtering using third-party software running under the Unix procmail facility. The filter that I was originally most enthusiastic about used a form of fuzzy logic to count the occurrences of a long list of spam-related phrases, assigning each piece of incoming mail a spam likelihood index. The index could then be used to sort mail into various folders. For example, if a piece of incoming mail contained a high frequency of the phrases "money," "make money," and "sure fire," it would be classified as a potential make-money-quick email and shunted to the junk mail pile.
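The phrase-counting idea can be sketched roughly as follows. This is a plain weighted count, not fuzzy logic and not the third-party software itself; the phrase weights, the per-100-words index, and the threshold of 5.0 are all assumptions made up for the example.

```python
import re

# Hypothetical phrase weights; a real filter would carry a much longer list.
SPAM_PHRASES = {
    "make money": 3.0,
    "sure fire": 2.0,
    "money": 1.0,
}

def spam_index(text, phrases=SPAM_PHRASES):
    """Weighted phrase hits per 100 words (an illustrative index only)."""
    lowered = text.lower()
    words = re.findall(r"[a-z']+", lowered)
    if not words:
        return 0.0
    # Substring counts overlap on purpose: "make money" also counts as "money".
    score = sum(weight * lowered.count(phrase)
                for phrase, weight in phrases.items())
    return 100.0 * score / len(words)

def sort_into_folder(text, threshold=5.0):
    """Route a message to 'junk' when its index reaches the assumed threshold."""
    return "junk" if spam_index(text) >= threshold else "inbox"

print(sort_into_folder("Make money fast with this sure fire plan to make money"))
print(sort_into_folder("Lunch at noon on Thursday?"))
```

With these made-up weights the first message lands in the junk folder and the second in the inbox. Where the threshold sits is the whole difficulty: it decides how much spam leaks through and how much legitimate mail gets thrown away.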

I was pretty happy with this software until it misclassified and junked a legitimate email that was sent to notify me that I had been awarded a large grant for my research. I guess a letter that says "you have won a grant" sounds too much like one that says "you may already have won the sweepstakes."

I tinkered with the settings for a while, but could never achieve a satisfactory balance. If I set the software conservatively enough that it would never misclassify a legitimate email, it let so much spam through that it wasn't worth the effort. Other filters that I experimented with had similar problems.

There are probably better filters out there, but filtering is a losing battle. I receive about 60 spam messages a day, so a filter would have to detect spam about 99 percent of the time to reduce the number of unsolicited messages to one a day. I get an equal number of legitimate messages daily, and to avoid missing more than one legitimate message per week, I need a filter that misclassifies less than 0.2 percent of mail. Maybe I can find a filter that has these characteristics, but consider what happens when the amount of junk mail I receive increases fivefold, which will likely happen sometime in 2004. To handle this tidal wave of spam, I'll need a filter that's more than 99.7 percent sensitive, but doesn't sacrifice specificity. These will be hard criteria to meet.
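The arithmetic behind these figures can be checked with a few lines. The function names are mine, and the inputs are the ones given above: 60 spam and 60 legitimate messages a day, one leaked spam per day, and one lost legitimate message per week.

```python
def required_sensitivity(spam_per_day, allowed_leaks_per_day=1):
    """Minimum fraction of spam the filter must catch."""
    return 1 - allowed_leaks_per_day / spam_per_day

def max_false_positive_rate(legit_per_day, allowed_losses_per_week=1):
    """Largest fraction of legitimate mail the filter may junk."""
    return allowed_losses_per_week / (legit_per_day * 7)

print(f"{required_sensitivity(60):.1%}")     # 98.3%
print(f"{required_sensitivity(300):.1%}")    # 99.7%
print(f"{max_false_positive_rate(60):.2%}")  # 0.24%
```

At 60 spam messages a day the requirement works out to roughly 98 to 99 percent. At five times that volume it climbs to about 99.7 percent, while the tolerance for junking legitimate mail stays at roughly a quarter of a percent.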

I could rail against spam, call on legislatures to criminalize it, encourage ISPs to block it, or propose radical strategies like imposing a charge for each Internet email sent. But I won't. Each of these proposals creates new problems, and many are worse than the one we're trying to solve. Instead, I've learned to stop worrying and to love the spam. A month ago I tossed out my mail filters. I like to think of my morning email sessions as "spam surfing." What new exciting opportunities are complete strangers offering me? What all-natural and completely safe herbal remedies will regrow my hair, boost my sexual stamina, and help me sleep better at night? What interesting attachments does Klez have for me today?

I'm happy and relaxed, and the only small cloud on the horizon is the thought of what I might be missing out on in all of those foreign-language messages. Maybe I should learn Korean.

Lincoln is an M.D. and Ph.D. who designs information systems for the Human Genome Project at Cold Spring Harbor Laboratory in Cold Spring Harbor, NY. You can contact him at [email protected].
