Index Spammers and Google Bombing

SEARCH AND DESTROY
by James Surowiecki

From the New Yorker, Talk of the Town
Issue of 2004-05-31
Posted 2004-05-24

If you go to the Internet search engine Google, type in “miserable failure,” and click on the “I’m feeling lucky” icon, you will be directed not to an article about “Ishtar” or the 1962 Mets but, rather, to the White House Web site and the official biography of President George W. Bush. Congratulations. You’ve been Google-bombed.

A Google bomb goes off when people conspire to have a particular phrase (in this case, “miserable failure”) link to a given Web page, effectively tying the phrase to the page. Other famous Google bombs include one linking “more evil than Satan himself” to Microsoft’s home page and, currently, one that links “weapons of mass destruction” to a page that reads, “The weapons you are looking for are currently unavailable. . . . Click the Regime Change button, or try again later.”

Google bombing may be a party trick, something to amuse office workers as they trudge through the day, but it exemplifies one of the biggest challenges that Google faces as it heads toward its multibillion-dollar I.P.O. Google is as much a ranking system as a search engine. It is more efficient than any other site at analyzing information and making decisions about its importance. Google is successful not because if you search for “Enron” it will return 1.75 million pages that contain the word but because, of those 1.75 million, the most relevant are right at the top. In large part, Google does this by relying on the collective intelligence of the Web itself. At the core of Google’s technology is a voting system. Every link from one Web site to another is treated as a vote; sites that get more votes are considered more valuable and, in Google’s system, are weighted to have more influence. Google also takes hundreds of other factors into consideration, such as font size and the location of words on the page. But, fundamentally, the Web pages that Google says are best are the pages that the Web as a whole thinks are best.
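
For readers who want to see the voting idea in miniature, here is a rough Python sketch of link-based ranking. It is an illustration only, not Google's actual algorithm: the function name `rank_pages`, the damping value of 0.85, the iteration count, and the four-page toy Web are all assumptions made for the example, and the "hundreds of other factors" mentioned above are left out entirely.

```python
def rank_pages(links, damping=0.85, iterations=50):
    """Score pages by treating every link as a weighted vote.

    `links` maps each page name to the list of pages it links to.
    A simplified, hypothetical sketch of link-based ranking.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)  # fixed order keeps results reproducible
    n = len(pages)

    # Start with every page equally important.
    score = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page keeps a small baseline score.
        new_score = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its vote evenly among the pages it links to,
                # so a vote from a high-scoring page counts for more.
                share = damping * score[page] / len(targets)
                for target in targets:
                    new_score[target] += share
            else:
                # A page with no outgoing links spreads its vote over everyone.
                share = damping * score[page] / n
                for other in pages:
                    new_score[other] += share
        score = new_score
    return score


# A toy Web: three pages link to "c", and "c" links back to "a".
toy_web = {"a": ["c"], "b": ["c"], "d": ["c"], "c": ["a"]}
scores = rank_pages(toy_web)
for page in sorted(scores, key=scores.get, reverse=True):
    print(page, round(scores[page], 3))
```

In this toy Web, "c" collects the most votes and ranks first, and "a" outranks "b" and "d" because its single vote comes from the most valued page.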

Google’s success has created a problem, though: if you have a voting system, people are going to try to manipulate it. Google bombing is the innocent face of this. Less innocent is the industry dedicated to helping Web sites maximize their Google rankings—the racket known as “search engine optimization.” Some American companies have armies of programmers toiling away in Bangalore solely to boost their Google rankings. Much of what the “optimizers” do is reasonable, helping companies do a better job of presenting content, using keywords, and building pages to which others will want to link. (These are termed “white hat” tactics.) But there are also plenty of black hats—known as “index spammers”—who have simply adapted the methods and tricks of the old political machines.

In the days of Boss Tweed, people were encouraged to vote early and often, dead men were placed on the voting rolls, and citizens were paid for their votes. On the Web, companies “cloak,” which means, among other things, that they disguise the real content of their sites, in an attempt to fool Google into thinking that a page is relevant to a search. Deep-pocketed players pay other sites to link to their sites, to foster an illusion of popularity. Some companies set up “link farms”—a host of interconnected Web sites that exist primarily to link to each other. A big company with a major Internet presence, for instance, can buy thousands of domain names, set up Web sites, and effectively create thousands of links out of nothing.
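
The link-farm trick can be shown with a small, self-contained Python sketch. Everything in it is hypothetical: the page names, the farm size of fifty, and the simplified link-voting scheme in `rank_pages`, which stands in for whatever Google really computes. It compares the same tiny Web with and without a farm of pages that exist only to link to one another and to a single beneficiary.

```python
def rank_pages(links, damping=0.85, iterations=50):
    """Simplified link-vote ranking; an illustrative stand-in, not Google's."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)
    score = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_score = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            # Pages with no outgoing links spread their vote over everyone.
            targets = links.get(page) or pages
            share = damping * score[page] / len(targets)
            for target in targets:
                new_score[target] += share
        score = new_score
    return score


def build_web(farm_size):
    """Build a toy Web; `farm_size` is the number of made-up farm pages."""
    # Four independent blogs link to the good site, one links to the spammer.
    links = {f"blog{i}": ["good_site"] for i in range(4)}
    links["blog4"] = ["spam_site"]
    # Each farm page links to the spammer and to the next farm page.
    for i in range(farm_size):
        links[f"farm{i}"] = ["spam_site", f"farm{(i + 1) % farm_size}"]
    return links


honest = rank_pages(build_web(0))
farmed = rank_pages(build_web(50))

print("no farm:  ", honest["good_site"] > honest["spam_site"])  # good site leads
print("with farm:", farmed["spam_site"] > farmed["good_site"])  # spammer leads
```

Without the farm, the good site wins on honest votes. With it, the spammer's manufactured links outvote the four real ones, which is why a ranking system of this kind has to discount such links.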

Google, of course, knows about all this. In its recent I.P.O. filing, it said that the threat from index spammers was “ongoing and increasing,” and so it has embarked on a campaign to outsmart them. A couple of weeks ago, for instance, it essentially banned a company called WhenU because of its cloaking tactics. (WhenU’s Web site will no longer appear if you search for the company on Google.) To stymie the cheaters, Google issues periodic revisions to its algorithm, and companies breathlessly await the subsequent changes in their rankings. (They call this “the Google dance.”) These revisions are so important to Web sites that, like hurricanes, they are given names. Web masters still marvel at the havoc wreaked by the Florida revision, last November: “Denial, then anger, gradually changing to acceptance, and, finally, healing,” one wrote.

Google’s efforts to keep its rankings honest have not always been popular. Some people who run Web sites that depend on the traffic that Google sends their way have accused the company of being capricious and unjust. There have even been calls from critics for it to be regulated as a public utility. But attacks on Google are shortsighted. Google is treating index spammers the way Olympic officials treat athletes who use steroids. Think of the Web as a track meet. When the other runners are juiced, it’s hard to keep up with them unless you are, too. Likewise, when people start cloaking or link-farming, those who wish to remain competitive have to consider doing so themselves. This winds up hurting everyone; if Web users think Google isn’t a clean game, eventually they’ll stop playing.

Google works best when no one knows it’s there—when people are making their own decisions about which sites are useful or good. The more important Google becomes, the harder its job gets, because more and more people find themselves trying to game the system, and wind up undermining it instead. When Google purges dubious Web sites and rejects links from link farms, it is, in a sense, counteracting the consequences of its own success. Collective intelligence relies on a certain degree of innocence. Google is using guile to re-create a guileless world, under the assumption that what we don’t know should help us.