spam

Link spammers try the subtle approach

Lately I've left a bunch of comments unpublished because they seem to reflect a new kind of spam.

Since Google's algorithm is known to value inbound links, many of the spam comments that get posted on the Web aren't actually intended for users of the site where they're posted. They're Google spider bait. And when they're posted on a website that has good Google karma -- like mine -- they can help elevate the target site in Google search results.

Until recently, the comment spam that I routinely delete without publication has been heavy-handed, obvious, and probably automated.

But now I'm seeing spam that includes a quote from the blog item and some innocuous question like "Is this practical?" or a meaningless comment like "All of the above? Could be." And the link to the spammer's website is subtly tied to the username.

I may respond to this by simply disallowing links of any kind, but that would also keep legitimate comments from linking to legitimate blogs, which is the whole point of allowing links. For the time being I'm simply going to be very suspicious of any comments that don't pass the "sniff test."

Drug spammers exploit newspaper site search

As newspapers work to improve their search experience and embrace Web search as well as on-site search, they're being exploited by a new round of automated blog spam that displays Internet drug listings right on the newspapers' websites.

This allows unscrupulous scammers to present their pitch under the "trusted information provider" brand of the newspaper. And it undoubtedly undermines the newspaper's brand.

Tribune Company and McClatchy sites in particular are being targeted. [Update: nytimes.com also is being exploited.]

Various "Canadian drugstore" sites are being promoted, but a minor bit of domain detective work traces much of this back to Israel, where several "businesses" registered to people with Russian surnames have registered a number of prescription-drug domains.

On the McClatchy sites, it's an Overture clickthrough tag that's being exploited. Here's an example, with the domain adjusted to my site in order not to promote the drug spammer:
http://www.miamiherald.com/cgi-bin/mi/overture/overture.pl?Keywords=site...

On the Tribune sites, the same trick looks like this:
http://www.orlandosentinel.com/search/dispatcher.front?Query=site:yelvin...

(Go ahead and click on those links; they're safe, and they will show you how the result set is presented.)

In both cases, a bit of checking of the HTTP request headers would probably allow the newspaper's search script to foil the spammer with minimal side effects.
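As a rough illustration of that kind of header check, here is a minimal Python sketch. It assumes the header worth checking is Referer, which is my guess rather than anything the newspapers' scripts are known to do, and the hostname in the allow-list and the function name are hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical allow-list; a real site would list its own hostnames here.
ALLOWED_REFERER_HOSTS = {"www.example-newspaper.com"}


def referer_is_onsite(headers, allowed_hosts=ALLOWED_REFERER_HOSTS):
    """Return True if the Referer header names one of our own hosts."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    referer = normalized.get("referer", "")
    try:
        # hostname is None when the header is missing or has no host part.
        host = urlsplit(referer).hostname
    except ValueError:
        # Malformed URLs (for example a broken IPv6 literal) count as off-site.
        return False
    return host is not None and host in allowed_hosts
```

A search script could run the query only when this returns True and otherwise show an empty search form. The side effect is that visitors whose browsers or proxies strip the Referer header would also land on the empty form.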

These blog spammers attack websites with automated scripts that attempt to post comments on blog entries. Typically several dozen comments containing little more than the Web links are posted at once.

There are several techniques blog sites can use to foil these attacks.

Registration-only commenting stops most of it, although a few blog spammers do register usernames, then return weeks or months later with scripts programmed to log in and post spam.

Requiring approval of the posting (as I do) prevents the spam from being made public, at the cost of some administrative overhead to delete the evil and promote the good.

CAPTCHA, a technique that requires users to answer a question in order to post, is the most effective technique. There are several variations. One uses a warped graphic image of a random password that the user has to type. Another asks the user to type the Nth word of a random sentence. And yet another asks the user to perform a calculation or answer a trivia question. They're all remarkably annoying to the innocent.
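Two of those variations, the calculation and the Nth-word question, are simple enough to sketch in a few lines of Python. This is only an illustration: the function names are hypothetical, and a real blog would also need to store the expected answer on the server between showing the form and receiving the post.

```python
import random
import string


def make_arithmetic_challenge(rng=random):
    """Return (question, expected_answer) for a small addition problem."""
    a = rng.randint(1, 9)
    b = rng.randint(1, 9)
    return f"What is {a} plus {b}?", str(a + b)


def make_nth_word_challenge(sentence, rng=random):
    """Return (question, expected_answer) asking for one word of the sentence."""
    words = sentence.split()
    index = rng.randrange(len(words))
    question = f"Type word number {index + 1} of this sentence: {sentence}"
    # Drop surrounding punctuation so "dog." is accepted as "dog".
    return question, words[index].strip(string.punctuation)


def check_answer(submitted, expected):
    """Compare answers, ignoring case and surrounding whitespace."""
    return submitted.strip().lower() == expected.strip().lower()
```

The comment form would display the question, keep the expected answer server-side, and accept the post only if check_answer returns True.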

Attack of the zombie bots

There hasn't been much press about it, but many websites (including this one) increasingly are under attack from zombie armies, clusters of Windows PCs that have been infected by viruses that allow them to be commanded and controlled remotely by spammers.

Typically a virus installs a "back door" on an infected PC that allows it to respond to remote commands. These commands are relayed through Internet chat systems in a chain designed to disguise the identity of the spammer.

Sometimes they're used to send email, but that's becoming increasingly difficult as Internet providers block direct outgoing email from their networks.

So the current hot ticket is to post blog spam. The trick involves gaming Google's search algorithm, which raises the "value" of any Web page depending on the number of links to that page. Links from other "valuable" pages are especially powerful, so spammers try to target well-regarded websites.

Earlier tonight I killed five or six hundred spam postings that were full of links to porn and scam sites. If you wonder why I don't allow immediate direct postings of comments, that's the reason.

These postings come in bursts, and the bursts can bring a website to its knees, particularly if they are malformed requests. Some of our servers at work have seen thousands of automated requests arrive simultaneously.

The latest twist is harvester bots that grab text from blog postings, insert links to spam sites, and post the result to fake blogs at Blogger.com. I have an RSS feed from Icerocket that helps me monitor blog postings that refer to me. Lately I've been finding my own words picked up and pasted together with blog spam links on fake blogs.

The irony is that none of this really helps the spammers. Google does a pretty good job of ignoring those spam blogs, and most legitimate blog software now uses a "rel=nofollow" tag on comment links that instructs Google to ignore the links.
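For the curious, here is roughly what that looks like when blog software renders a commenter's name as a link. It's a minimal Python sketch, not the code of any particular blog package, and the function name is hypothetical.

```python
from html import escape


def render_commenter_link(username, url):
    """Build the commenter's name as a link that search engines are told to ignore."""
    # Escape both values so a hostile username or URL cannot inject markup.
    safe_url = escape(url, quote=True)
    safe_name = escape(username)
    # rel="nofollow" asks search engines not to count this link as an endorsement.
    return f'<a href="{safe_url}" rel="nofollow">{safe_name}</a>'
```

The sketch only escapes the values; it doesn't check that the URL uses a sensible scheme such as http, which real software would also want to do.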

Blog spammers: Strangle them

I'm on vacation and really don't have time to clean up after the spammers who assault this site every night, so I'm disallowing anonymous comments for the time being.
