Anyone with a blog or Web site can attest that spam has increasingly become the bane of the blogosphere and search engines. As a result, researchers at Microsoft Labs have developed a tool to seek out and stymie the efforts of the spammers.
The tool, dubbed Strider Search Defender and designed by Microsoft's Cybersecurity and Systems Management Research Group, is an amalgamation of two previous research lab projects, Strider Honey Monkey and Strider URL Tracer. The combined technology is designed to locate and identify links to spammers' Web pages that tricksters post as comments on scores of blogs, and to tell them apart from links to legitimate Web sites.
Known as "comment spam" or "splog," the fake comments and phony blogs are used by spammers to boost a site's search engine rankings by making the linked site appear to be popularly linked from many other sites. Link popularity is one criterion used by search engines like Google to determine a site's importance or ranking. Microsoft lab researchers found that the so-called search spammers create "doorway pages" on legitimate domains and use those URLs to pepper blogs with comment spam.
"By identifying those domains that serve as target pages for a large number of doorway pages we can catch major spammers' domains together with all their doorway pages and doorway domains" the Microsoft researchers said.
Click With Caution
Blog Web sites such as blogspot.com, blogstudio.com, blogdrive.com, etc., are rife with doorway pages set up by spammers. These pages are a form of spam blogs, or "splogs." The researchers noted that spam blogs hosted by Google's blogspot "appear to be particularly widely spammed and effective against search engines."
Many of the spam-filled blogs use cloaking and redirection technology to fool search-engine crawlers by presenting a different page from the one an unsuspecting Web surfer will see.
The Strider Search Defender tool uses a two-pronged approach to combat spammers. Starting with a list of confirmed spam URLs, the "Spam Hunter" uses them as search terms to comb search engines and locate blogs, forums, and guest books that have been spammed.
The Strider URL Tracer component was released earlier this year as a tool to help trademark owners identify typosquatting domains of their Web sites. It helps researchers locate a spammer's domain and all its spam-linked pages on other sites.
"Our approach is to treat each spam page as a dynamic program rather than a static page and utilize a 'monkey program' to analyze the traffic resulting from visiting each page with an actual browser so that the program can be executed in full fidelity" the researchers said.
While testing the Strider Search Defender tool, researchers discovered a redirection network that used roughly Web pages hosted on Google's Blogger.com, of which percent directed unsuspecting Web surfers to just six spamming sites. Another test found more than spam-related sites on BlogEver. The majority of these sites used the same Google AdSense client ID, which partly explains what is known as "click fraud."
Advertisers place ads with Google, which then allows those ads to be displayed on "partner" sites. When a potential consumer clicks on the ads, the advertiser pays Google and its partner sites a per-click fee. Click fraud is the way in which spammers produce fake clicks on those ads to collect the fee.
Click fraud is a major problem for portal sites such as Google, MSN, and AOL, whose business models depend on advertising revenue, explained Avivah Litan, an analyst at Gartner research. The portals are spending significant resources to address this problem because it threatens to undermine their ability to sell to advertisers.
"We know have a method to identify spammers so that before they get indexed into search results we can block them" YiMin Wang manager of Microsoft's Cybersecurity and Systems Management Research Group said in a Washington Post interview.