Tangled Up in Spam

Posted by archive 
February 9, 2003
[www.nytimes.com]

By JAMES GLEICK

I know what your in-box looks like, and it isn't pretty. It looks like mine: a babble of come-ons and lies from hucksters and con artists. To find your real e-mail, you must wade through the torrent of fraud and obscenity known politely as ''unsolicited bulk e-mail'' and colloquially as spam. In a perverse tribute to the power of the online revolution, we are all suddenly getting the same mail.

The spam epidemic has just a few themes and variations: phone cards, cable descramblers, vacation prizes. Easy credit, easy weight loss, free vacations, free Girlz. Inkjet cartridges and black-market Viagra, get-rich-quick schemes and every possible form of pornography. The crush of these messages on the world's networks is now numbered in billions per day. One anti-spam service measured more than five million unique spam attacks in December, almost three times as many as a year earlier. The well is poisoned.

Spam is not just a nuisance. It absorbs bandwidth and overwhelms Internet service providers. Corporate tech staffs labor to deploy filtering technology to protect their networks. The cost is now widely estimated (though all such estimates are largely guesswork) at billions of dollars a year. The social costs are immeasurable: people fear participating in the collective life of the Internet, they withdraw or they learn to conceal their e-mail addresses, identifying themselves as user@domain.invalid or someone@nospam.com. The signal-to-noise ratio nears zero, and trust is destroyed.

''Spam has become the organized crime of the Internet,'' said Barry Shein, president of the World, one of the original Internet service providers. ''Most people see it as a private mailbox problem. But more and more it's becoming a systems and engineering and networking problem.'' He told the 2003 Spam Conference in Cambridge, Mass., last month that his service is sometimes pounded by the same spam from 200 computer systems simultaneously. ''It's depressing. It's more depressing than you think. Spammers are gaining control of the Internet.''

If your own experience doesn't seem this bad, just wait. You may be a recent convert to e-mail; your address may not yet have percolated through the deep swamp of spammer databases (truly a land of no return). Quantity matters, psychologically. If five daily spams seem merely annoying, 20 or 30 will be maddening, creepy -- and chilling. Some avid Internet users report a hundred or more a day.

The harvesting of e-mail addresses by spammers is relentless and swift. Investigators for the Federal Trade Commission recently posted some freshly minted e-mail addresses in chat rooms and news groups to see what would happen; in one case, the first spam came in nine minutes. Addresses are sold and resold on CD-ROM's in batches of millions. If you have ever revealed your e-mail address in a public forum, or allowed it to appear on a Web page, or used it in buying merchandise online, your experience of the online world is sure to sound like this: ''Looking for love?'' ''$900 weekly at home.'' ''Fwd: Your winning lottery ticket.'' ''Biggee your penis 3 inches in 22 days.'' ''Have you received your cash?'' ''Hard core so intense it's sinful.'' ''Advanced degree = advanced career.'' ''Natural enlargement where you need it.'' ''Live chat room with real women!!''

Your correspondents claim to care about your health. Unfortunately, they care mainly about the length of your penis and the size of your breasts. They do not discriminate by sex; either way, you are assumed to feel inadequate. They offer implants and human growth hormone therapy. They are pharmaceutical enthusiasts, but we're not talking about penicillin; it's ''Viagra-Phentermine-Xenical-Propecia and MORE!'' They care about your finances. Here come the guaranteed paths to mortgage deals of a lifetime, cheap insurance, million-dollar prizes, hot stock tips and secrets of commodities trading.

Parents panic when they discover that their teenagers have mail from ''Mellisa'' at SexAffair.org: ''Hi there! I got your e-mail from Jennifer and I just wanted to tell you strait up, I really like -! She told me u're into -- too. Lets hookup for a juicy weekend.'' Does this mean the kids are checking out porn sites? (No, it does not.)

Another major spam category is ''fast cash -- work at home.'' More often than not this is a do-it-yourself kit -- you pay up front, and they send you everything you need to be a spammer from your own PC.

How this bane came to sully the greatest revolution in personal communication since the telephone makes for a complex and troubling story, with no promise of a happy ending. From the beginning, the Internet has tried to fight spam with grass-roots vigilantism. Software companies now routinely build spam-filtering technology into their e-mail programs, and independent programmers are struggling to devise more creative methods for separating wheat from chaff. Millions of individual e-mail users are trying to devise coping strategies of their own. Consumer advocates are working mostly in vain to persuade lawmakers to take action in what should, after all, be a popular cause.

Each in its own way, for different reasons, these efforts are failing.


Long, long ago, in a previous century, when the Internet was young, people discovered both the power and danger of mass e-mail. One online pioneer, Brad Templeton, says he believes he has pinpointed the first e-mail spam: in 1978, a Digital Equipment Corporation salesperson typed several hundred addresses by hand -- those of scientists and researchers on the Arpanet, the predecessor of the Internet -- and sent them an announcement of a product presentation. A small furor erupted. ''Where is the line to be drawn between this sort of thing (if it is to be allowed at all) and advertising?'' a recipient at Stanford University asked plaintively. The Net was a scientific and military enterprise -- most emphatically noncommercial.

The modern epidemic began 15 years later, coinciding with the explosive popularization of e-mail in 1993 and 1994. A chain letter began to spread, titled ''MAKE MONEY FAST.'' And a pair of Arizona immigration lawyers, Laurence Canter and Martha Siegel, bombarded the Internet with a notorious advertisement about the ''Green Card Lottery.'' Angry recipients counterattacked, overwhelming the lawyers' service provider with complaints. But these proto-spammers were unrepentant. Eventually they tried marketing a book, ''How to Make a Fortune on the Information Superhighway: Everyone's Guerrilla Guide to Marketing on the Internet and Other On-Line Services.''

Frauds and cons from the horse-and-buggy world quickly adapted to the new technologies. One of the most shameless is the so-called Nigerian spam. The subject is ''Urgent/Confidential'' or ''Assistance Required.'' The sender confides that he is a bank manager or political exile or son of the late commander in chief of the armed forces in Lagos. He explains that he needs your help in transferring $21.5 million in cash out of the country. ''I will like you as an foreigner to stand in as the next of kin. If only you will send your bank-account information, you can have a 40 percent cut.'' It's always the same letter, more or less. ''Should you not be in a position to assist, this deal has to remain a secret till the end of time.'' Sure. Some secret.

Few Internet users realize that this particular con began with handwritten letters and only recently grew into a mass-market business, in the category of World Gone Mad. Perhaps there's a reason for one con artist to ''solicit for your assistance in this mutually benefiting business transaction''; surely it defeats the purpose when they all send this same e-mail several times a day, week in and week out. Today it's from Dominic Tutu and Dominic Egbu, separately. Before that it was Dr. Lawrence Adu and Dr. Francis Oputa and Dr. Williams Ossai and Dr. Shuaibu Hamza. There can hardly be any unsuspecting victims left, so you have to wonder -- what's the point?

Early Internet users reacted so angrily to commercial mass mailings that fake return addresses became a necessity. America Online and other large service providers began closing accounts used for spam. The next big step -- indispensable to the spam epidemic -- was the rise of free mail services: Hotmail, now owned by Microsoft, and Yahoo. Two features of the modern Internet (both more or less accidental) make spamming easy: service providers desperate for market share at all costs; and an architecture of relatively open and insecure mail gateways. Together these enable hit-and-run e-mailers to create quick, disposable, false identities. It's why so many of your correspondents have addresses like ''buffy0412xxxmeb13mxy@hotmail.com'' -- though that one, offering me ''eBay insider secrets,'' day after day, turns out to be not just a pseudonym but also a forgery, not a real Hotmail account at all.

For that matter, anyone named Buffy who sends me e-mail is a spammer, judging from my experience. A suspicious number of my correspondents now seem to be called James. It seems safe to assume that a sender named NoMoreConstipation68487 is a spammer. Likewise for Persondude1 -- but no, that turns out to be my 11-year-old nephew. Sorting the good from the bad looks easy, but it's a real problem, both for humans trying to manage their in-boxes and for artificial intelligence.

A human being, looking at a sender called ''Ug56miZ5w@msn.com,'' can guess it's not a real person. Looking at subjects like ''Do your butt, hips and thighs embarrass you'' and ''My husband's not home, come and have me!'' and ''Make money fast'' (yes, this ancient artifact still makes the rounds), a human being knows enough to press the delete key. Why can't a computer be that smart?


Many programmers are working to automate the filtering of spam. The spammers, of course, are working to stay ahead. They forge identifying headers; some of your spam now appears to have been sent to you by you. They continually change the content, even adding random text. They make their subject lines sound real, or at least plausible:

• How are you?
• Thanks for requesting more information.
• Error in your favor.
• It is critical that your Internet connection is fixed.
• Someone is waiting for you.
• Listen to what people are saying about you.
• Pentagon readies war plans. (This one was an attempt to sell heating-oil options: ''$5,000.00 minimum investment.'')

Another gambit is to have their computers insert your name:

• James, is this still your e-mail?
• James, your paycheck has just increased!
• Hello, Gleick, darling.

All this in the name of tricking you into opening the message. ''Baby you're so strange'' -- can you resist? ''Do you remember me?''

Every major e-mail program now comes with filters meant to spare you the trouble. The idea is for the computer to detect spam and delete it, or at least move it into a separate folder where you can while away your time examining it later. Microsoft's e-mail clients use simple rules. If the first eight characters of the sender's name are digits, they guess the mail is spam. Likewise if the subject contains dollar signs and exclamation points, or if the message contains ''money back'' or ''check or money order'' or ''over 21'' or ''Dear friend.''
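
To make the idea concrete, here is a minimal sketch in Python of a yes-no filter built only from the rules named above. It is not Microsoft's code; the function name, the phrase list and the choice to require both a dollar sign and an exclamation point in the subject are assumptions made for illustration.

```python
import re

# Phrases quoted in the article. A real product's rule list would be longer
# and is not reproduced here.
SUSPECT_PHRASES = ("money back", "check or money order", "over 21", "dear friend")


def looks_like_spam(sender_name: str, subject: str, body: str) -> bool:
    """Return True if any one of the simple rules fires (hypothetical helper)."""
    # Rule 1: the first eight characters of the sender's name are all digits.
    if re.match(r"\d{8}", sender_name):
        return True
    # Rule 2 (assumed reading): the subject has both a dollar sign
    # and an exclamation point.
    if "$" in subject and "!" in subject:
        return True
    # Rule 3: the body contains one of the stock phrases, ignoring case.
    lowered = body.lower()
    return any(phrase in lowered for phrase in SUSPECT_PHRASES)
```

A rule set this blunt is easy to read and just as easy to dodge. It is also easy to trip by accident: a sincere note that opens with Dear friend is flagged exactly as a sales pitch would be.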

This approach catches less and less spam as time goes on, because the spammers buy the same software and test their mail in advance. If you're an enthusiastic computer user, you can add filters of your own. You can even download interesting filters from sites organized by other spam victims. You can't expect perfection. There will always be false negatives -- junk mail that manages to sneak through.

More troublesome, however, is the problem of false positives -- legitimate mail that is blocked. If your filter deletes what it thinks is spam, you may never see the message from your long-lost high-school sweetheart, who finally wants to make contact but uses too many exclamation points, or calls you a dear friend, or mentions sex.

Most corporate networks use spam filters, and many Internet service providers are also installing them, in response to pressure from their customers. The effectiveness varies widely. People use whitelists (friends) and blacklists (known spammers). People get frustrated and overcompensate, putting all of hotmail.com and yahoo.com and aol.com on their blacklists. ''Am I likely to miss important e-mail?'' writes Michael Fraase, a Minnesota Web consultant who goes to these extremes. ''Probably, but I have no way of knowing. Unfortunately the spam problem has become so bad that it's on the verge of rendering e-mail useless.''
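
A whitelist and a blacklist can be sketched in a few lines of Python. The function name, the three return labels and the decision to consult the whitelist first are assumptions for illustration, not a description of any particular product.

```python
def classify_sender(address, whitelist, blacklisted_domains):
    """Return "accept", "reject" or "unknown" for a sender address.

    Hypothetical helper: the whitelist (friends) is checked before the
    domain blacklist, so a known friend survives a blanket domain block.
    """
    address = address.strip().lower()
    if address in whitelist:
        return "accept"
    # Everything after the last "@" is treated as the domain.
    domain = address.rpartition("@")[2]
    if domain in blacklisted_domains:
        return "reject"
    return "unknown"
```

Blocking whole domains, as in the extreme case above, means every stranger at those services is rejected unseen. Only addresses already on the whitelist get through, which is precisely the trade-off Fraase describes.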

One of the best tools for network administrators is an ever-evolving program called SpamAssassin, which uses a range of tests and a point system to identify spam. This is subtler than simple yes-no filtering. Messages get points for capitalized words like AMAZING and GUARANTEE and PROFITS. They get points for mentioning Viagra -- especially ''natural'' or ''herbal'' -- or penises or breasts. They get points for requesting a credit-card number, for including a toll-free number and for offering a full refund. They get points for odd-looking dates: much spam appears to have been sent in 1941; much appears to have been sent from the future. They get points for lively font colors and embedded scripts or links.

In a delightful SpamAssassin irony, a message gets extra points for declaring that it is not spam. After all, such statements are invariably lies. So are the following:

• This is a one-time mailing.
• There is no catch.
• This message is sent in compliance with spam regulations.
• You're getting this message because you registered with one of our marketing partners.

Like so much of the online world, spam becomes circular and self-referential and tail-chasing. Mail titled ''Clear your in-box of spam'' and ''Say goodbye to junk e-mail'' has to be spam.

SpamAssassin deploys hundreds of rules, adding more all the time and readjusting the weighting scheme. If a message has enough points, it's probably spam. But false negatives and false positives cannot be ruled out.
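
The point system can be sketched in Python as follows. This is not SpamAssassin's code or its rule set: the five tests are taken from the description above, while every weight, the threshold of 5.0 and the cutoff year for an implausible date are invented numbers for illustration.

```python
import re
from datetime import datetime

# All weights and the threshold are made-up values for illustration only.
THRESHOLD = 5.0


def score_message(subject, body, sent, now):
    """Return (total points, names of the rules that fired)."""
    text = subject + "\n" + body
    lowered = text.lower()
    rules = [
        ("shouted sales word", 1.5,
         re.search(r"\b(AMAZING|GUARANTEE|PROFITS)\b", text) is not None),
        ("mentions viagra", 2.0, "viagra" in lowered),
        ("offers full refund", 1.0, "full refund" in lowered),
        ("claims not to be spam", 2.5, "this is not spam" in lowered),
        # Dated long ago or in the future (1990 is an arbitrary cutoff).
        ("implausible date", 2.0, sent.year < 1990 or sent > now),
    ]
    fired = [(name, weight) for name, weight, hit in rules if hit]
    total = sum(weight for _, weight in fired)
    return total, [name for name, _ in fired]


def is_spam(subject, body, sent, now):
    """A message is called spam once its points reach the threshold."""
    total, _ = score_message(subject, body, sent, now)
    return total >= THRESHOLD
```

No single test condemns a message. A letter that merely mentions a refund stays well under the line, while one that shouts, promises a refund, swears it is not spam and is dated 1941 sails over it. Readjusting the weighting scheme, in this sketch, means changing the numbers in the middle column.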

The newest approach to a technological solution is a different kind of filtering, based on statistics. A computer scientist, Paul Graham, caused a stir in the online world last summer with a proposal for adaptive, probabilistic filtering. The idea is to give up trying to list specific features of spam, to quit trying to get inside the mind of the spammers. ''Over the past six months, I've read literally thousands of spams, and it is really kind of demoralizing,'' Graham declared. ''Norbert Wiener said if you compete with slaves you become a slave, and there is something similarly degrading about competing with spammers.''

Instead, the new software keeps track of all the words in every e-mail, calculates their statistical probabilities as spam-indicators and adjusts these with experience. With his new method, Graham claims to be catching more than 99 percent of spams, with no false positives.

Graham's screed inspired many independent programmers to work on statistical filters. You train the software with a few hundred examples of good and bad mail, and the software starts to get smart. If you're curious, you can study the details. In my case, for example, the word ''Viagra'' appears almost exclusively in spam. The word ''vanilla'' appears almost exclusively in real mail. ''Awesome'' is bad; ''awful'' is good. ''Fraction'' is bad; ''fragment'' is good. I don't know why, and it doesn't matter.
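
For the curious, the training idea can be sketched in Python. This is a plain naive Bayes classifier with add-one smoothing, chosen because it is the simplest member of the family. It is not Graham's exact formula and not the code of any filter named here, and the class and method names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase word-like tokens."""
    return re.findall(r"[a-z0-9$'-]+", text.lower())


class WordFilter:
    """Toy statistical filter: naive Bayes over the words seen in training."""

    def __init__(self):
        self.counts = {"spam": Counter(), "good": Counter()}
        self.messages = {"spam": 0, "good": 0}

    def train(self, text, label):
        # Count each word once per message, under "spam" or "good".
        self.counts[label].update(set(tokenize(text)))
        self.messages[label] += 1

    def word_spamminess(self, word):
        # Add-one smoothing keeps an unseen word from scoring exactly 0 or 1.
        p_spam = (self.counts["spam"][word] + 1) / (self.messages["spam"] + 2)
        p_good = (self.counts["good"][word] + 1) / (self.messages["good"] + 2)
        return p_spam / (p_spam + p_good)

    def spam_probability(self, text):
        # Combine the per-word evidence by adding log-odds.
        log_odds = math.log((self.messages["spam"] + 1) / (self.messages["good"] + 1))
        for word in set(tokenize(text)):
            p = self.word_spamminess(word)
            log_odds += math.log(p / (1 - p))
        return 1 / (1 + math.exp(-log_odds))
```

Nothing in the sketch knows what any word means. A word counts against a message only because it turned up in mail that someone had already marked as spam, which is why the verdicts on individual words can look arbitrary.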

''I was skeptical of filters, too, but in my case there wasn't much of a choice,'' says Michael Tsai, the author of SpamSieve, one of the more effective statistical filters. ''I don't have time to categorize my mail by hand, and if I did I'd probably get trigger-happy with the delete key.''

If you wonder why words like ''Viag.a'' and ''pen-s'' have started appearing in your spam, this is why. It's the spammers trying to outwit the filters. They have also begun appending random chunks of innocent text.
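
A short Python sketch shows why the disguise works against a filter that looks words up exactly. The second function is one hypothetical countermeasure, a match that tolerates a single altered character. It is an assumption added for illustration, not a technique the filters described here are known to use, and it would produce false positives of its own.

```python
import re


def exact_hit(word, text):
    """True only if the word appears as a whole, unaltered token."""
    return word in re.findall(r"[a-z]+", text.lower())


def fuzzy_hit(word, text):
    """True if some token has the same length and at most one different character."""
    for token in text.lower().split():
        token = token.strip(".,!?\"'")  # trim punctuation at the ends only
        if len(token) == len(word):
            differences = sum(a != b for a, b in zip(token, word))
            if differences <= 1:
                return True
    return False
```

Against the exact lookup, one swapped character is enough to make a loaded word invisible, since the filter has never seen the new spelling in either good mail or bad.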

After weeks of training, my spam filter didn't see a problem with a letter titled ''See what hot girlz do behind closed doors!'' To a wary human, just the word ''girlz'' would be a giveaway. The filter didn't suspect anything unwelcome about a letter from Ivanna Come titled Obscene Facial Pictures. You can't say these spammers are brilliantly concealing their intentions. A human being gets the point. Yet none of these words, singly or in combination, triggered the alarm in my filtering software. The filter still thought this might be legitimate e-mail:

Looking for a good time?

Want some FREE ladies?

Then click here!


Deep inside the majestic Pennsylvania Avenue headquarters of the Federal Trade Commission lies the control center for the government's battle against spam, such as it is. This is the grandly named Internet Lab. It turns out to be eight PC's on desks.

''It's a great resource,'' says Brian Huseman, the F.T.C. lawyer assigned to tell me about the commission's spam-fighting. ''I don't know how we would do our spam investigations without it.'' While I watch, three young staff members surf the Web.

During the Clinton administration, the commission set up an e-mail address, uce@ftc.gov, for consumers to forward samples of their spam. The database now contains 27.5 million of these, and 85,000 more arrive daily. Every month or so, the commission files an enforcement action against someone, leading to a warning letter, or a promise by the spammer to cease and desist, or even, occasionally, a ''disgorgement of ill-gotten gains.'' No one really imagines any of this makes a dent.

The agency can't help noticing that, by and large, spam is not illegal. Its enforcers can go after only the most obvious forms of fraud, of which there is, after all, no shortage. The fact that you may not want to get this stuff is not their problem. ''From the F.T.C.'s point of view, whether it's wanted or unwanted, what we're concerned about is whether it's deceptive,'' Huseman says.

In reality almost all spam qualifies as deceptive. Junk messages typically come with false return addresses. The Internet headers are typically forged to falsify the mail servers, relays and other data that could help trace the source. Instructions for removing your name from their lists are typically false; your name won't be removed, and the spammer will know you're alive and reading the mail. In theory, any of these lies could justify action by the commission. ''Maybe there's a deceptive statement about how your name was acquired,'' Huseman says. Why, yes!

Here's a typical disclaimer: ''Best Net Offers never sends unsolicited e-mail. Best Net Offers has been given the right to market to you through our Web site partners and their privacy policies.'' That's a lie. Here's another: ''P.S. We never ever send spam we only send to people we think would enjoy this invitation! Psychic powers don't fail us now!''

In practice, it's hopeless. You can't prove that you never clicked a button marked ''O.K.'' after failing to read the fine print of a privacy agreement, thus unwittingly giving someone a sort of authorization to sell your e-mail address. The F.T.C. is a strong believer in these lawyerly privacy statements, but how many people read them?

You don't want to test the Click Here to Be Removed link, because you suspect it will just lead to more spam. This is just another in the series of spam Catch-22's.

The spammers are elusive and hidden -- one reason Barry Shein compares their operations to organized crime. ''These are not legitimate businesspeople,'' he said. ''These are people who are acting in a way that society does not accept.'' Yet it is often possible for a determined victim to track them down. Although headers will be forged, inspecting the HTML source of junk mail usually reveals a genuine domain name, because, after all, you are meant to visit some Web site. Domain names are required to list publicly a technical contact and an administrative contact. For example, SexAffair.org, which sends a torrent of spam from supposedly willing maidens called ''Erica'' and ''Jen'' and ''Katie,'' turns out to be Sobonito Investments Ltd., at 10800 Biscayne Boulevard in Miami. They handle Sex2go.com, too. I'd like to talk to them about this, but they don't return my calls. Anyway, until Congress sees fit to make their activities illegal, tracking down spammers provides little satisfaction.

Yet the perpetrators are not just misfits and pornographers. Real companies send spam, too -- because they don't know any better, or because they don't care, or just in rote obeisance to the gods of marketing.

E-mail marketers, from the sleazy to the near-legitimate, defend their behavior by citing postal junk mail and unsolicited telemarketing. These irritate consumers but are tolerated, up to a point. Spam is different. It is intrusive because, in the nature of e-mail, it arrives round the clock, demanding attention. It lacks even the modest checks and balances of traditional marketing: to print letters and send them through the post costs money; likewise to make telephone calls. A direct mailer can't afford a pitch so shabby and fruitless that it will produce a one-in-a-thousand rate of return. A spammer can, because sending a million more copies is practically free.


We citizens and consumers have more points of contact with the world than ever before; more points of exposure. Our front doors and mailboxes are one kind of interface; our telephones and fax machines another; our televisions and radios still another. Because networked computers open a pathway wider and faster and more fluid than all these combined, the spam epidemic will prove a need for new kinds of locks and new kinds of rules.

From an economic perspective, if the purpose of advertising is to get directly in my face, for the least cost, then spam is almost perfectly effective. It's in my face, all right. As a magazine reader, you might feel that advertisements are an intrusion, but you also know that the advertisers had to pay thousands of dollars and that this money supports whatever it is you like about the magazine. You've made a voluntary economic bargain -- as you do when you watch free broadcast television.

By contrast, advertising by e-mail is the ultimate free ride. The cost is borne by the recipients. This is a historical accident; no planners designed the economics of cyberspace to work this way. But the capitalists who laid the world's fiber-optic cable across continents and oceans get no return on their investment from the spammers; the Internet service providers whose computers send and receive these billions of messages get no compensation. Nor, of course, do we, the targets of spam -- we don't get the television program or magazine article or football game or service that other advertising dollars support.

Of course, if the purpose of advertising is to get me to buy the product, as opposed to turning me into a livid bundle of hatred and resentment, then spam is not perfectly effective. Yet the tide keeps rising.

Many people who hate spam believe, honorably enough, that it's protected as free speech. It is not. The Supreme Court has made clear that individuals may preserve a threshold of privacy. ''Nothing in the Constitution compels us to listen to or view any unwanted communication, whatever its merit,'' wrote Chief Justice Warren Burger in a 1970 decision. ''We therefore categorically reject the argument that a vendor has a right under the Constitution or otherwise to send unwanted material into the home of another.''

Odd forces have conspired to create paralysis in the government on the matter of spam. Corporate marketers and Internet traditionalists have found themselves in an accidental alliance. The marketers, particularly the powerful Direct Marketing Association, have lobbied hard to preserve their ability to send cheap messages to potential customers; ''self-regulation'' has been the industry's watchword. And the Internet's culture, too, lines up against government intervention. The word ''clueless'' might have been invented to describe legislators with an urge to meddle in the high-tech arena. In cyberspace, policy and rulemaking have come from the bottom up, and the benefits have been spectacular. Time and again, the online world has behaved like a self-healing organism, outwitting authorities who tried to impose structure from above.

But cyberspace belongs to the world of human beings, who rely on laws to discourage the worst behavior and to protect the powerless. The grass roots have not solved the problem -- indeed anti-spam posses have often caused trouble of their own, with blacklists that block mail from legitimate sources. And even the Direct Marketing Association has come to believe that spam hurts its members. ''I've got caught in the vicious cycle,'' says Regina Brady, an e-mail marketing consultant who works with the association. ''All of a sudden your e-mail address is spun out into cyberspace. This is a real problem for legitimate marketers trying to do the right thing.'' This fall, after fighting it for years, the association came out in favor of federal anti-spam legislation in some form.

Arguments about such legislation tend to focus on the issue of ''opt in'' versus ''opt out.'' In opt-out schemes, which the marketers favor, consumers have to take action to declare their unwillingness to receive unsolicited bulk mail. If the system is opt-in, then marketers have to be able to show that consumers have given their consent to receive solicitations. The European Parliament recently voted to adopt opt-in requirements, putting Europe far ahead of the United States in acting against spam.

''The reason it happens with impunity right now is that in fact it's not against the law,'' says James Love, director of the Consumer Project on Technology, a Washington-based advocacy group. He says he believes that making it possible to collect fines from spammers would change the balance of power significantly. He also says that the international nature of the problem need not be a barrier to effective action. ''Spam should be a good test case for cross-border consumer protection, because everybody thinks a spammer is lower than a sniper. Go into a treaty-type mode, where you try to get everybody to agree, from Korea to China to Turkey to Russia. You could build enormous pressure to comply with this.''

As remote as an effective solution seems, the spam problem might not be so intractable after all. The Telephone Consumer Protection Act of 1991 made it illegal to send unsolicited faxes; that law passed with strong backing from manufacturers of fax machines. It should be extended to include unsolicited bulk e-mail.

For free-speech reasons, any legislation should avoid considering e-mail's content; trying to define key words like ''commercial'' and ''pornographic'' only leads to trouble. And it isn't necessary. For that matter, even short of outlawing spam, two simple measures might be enough to stem the tide:

1) Forging Internet headers should be made illegal. The system depends on accurate information about senders and servers and relays; no one needs a right to falsify this information.

2) Unsolicited bulk mail should carry a mandatory tag. That alone would put consumers back in control; all the complex technological challenge of identifying the spam would vanish.

We need to be able to say no. No, I'm not looking for a good time. No, I don't want to ''e-mail millions of PayPal members.'' No, I don't want an anatomy-enlargement kit. No, I don't want my share of the Nigerian $25 million. I just want my in-box. It belongs to me, and I want it back.

James Gleick often writes for the magazine about technology. His next book, ''Isaac Newton,'' will be published in May.