Darrell Norton's Blog [MVP]

Comment Spam - a solution?

As anyone can see from the DNJ weblogs home page or the ASP.NET weblogs home page, comment spam is a problem. Jay, we understand how you feel! Of course, feeling that I am obviously smarter (ha ha) than the hundreds of super-intelligent people working on this problem, I’d like to offer my two suggestions.

For a short-term fix, disable any comment with more than 1 link. These spammers are using 1 comment to boost some 20 or 30 of their web sites. If we make that only 1 for 1, it will massively cut their productivity for a while. This would give us enough time to implement the long-term fix.
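The short-term fix could be sketched roughly as follows. This is only an illustration: the function names, the URL pattern, and the `MAX_LINKS` setting are assumptions, not part of any blog engine. It counts distinct URLs, so a link whose visible text repeats its own address still counts once.

```python
import re

# Hypothetical setting: the post suggests allowing at most 1 link per comment.
MAX_LINKS = 1

# Rough heuristic for http/https URLs; not a full URL parser.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)


def count_links(comment_text: str) -> int:
    """Count the distinct URLs that appear in a comment."""
    urls = URL_PATTERN.findall(comment_text)
    # Normalise case and trailing punctuation so repeats of one URL count once.
    return len({url.lower().rstrip(".,)") for url in urls})


def allow_comment(comment_text: str, max_links: int = MAX_LINKS) -> bool:
    """Return True if the comment stays within the link limit."""
    return count_links(comment_text) <= max_links
```

A comment that plugs 20 or 30 sites fails the check, while an ordinary comment with a single reference passes.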

Long-term, why not use Bayesian filtering, similar to SpamBayes? It’s pretty much the same thing, comment spam or email spam, it’s all SPAM! And the benefit with Bayesian filtering would be that you could scale the effort in training the filter across all of the bloggers on a site! I have a feeling that the type of comments left in technical weblogs is vastly different than links to pharmacy and gambling sites.
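To make the long-term idea concrete, here is a minimal sketch of a word-based naive Bayes classifier with add-one smoothing. It is a plain textbook version, not SpamBayes's actual scoring scheme, and the class and method names are made up for illustration. A single shared instance could be trained by every blogger on a site, which is the scaling benefit described above.

```python
import math
import re
from collections import Counter


class NaiveBayesCommentFilter:
    """Toy multinomial naive Bayes filter (illustrative, not SpamBayes)."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = {"spam": 0, "ham": 0}

    @staticmethod
    def tokenize(text):
        return re.findall(r"[a-z0-9']+", text.lower())

    def train(self, text, label):
        """Record one comment that a blogger marked as 'spam' or 'ham'."""
        self.word_counts[label].update(self.tokenize(text))
        self.doc_counts[label] += 1

    def spam_probability(self, text):
        """Estimate the probability that a comment is spam."""
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_docs = sum(self.doc_counts.values())
        log_scores = {}
        for label in ("spam", "ham"):
            # Smoothed class prior.
            score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
            total_words = sum(self.word_counts[label].values())
            for word in self.tokenize(text):
                count = self.word_counts[label][word]
                # Add-one smoothing; the extra +1 leaves room for unseen words.
                score += math.log((count + 1) / (total_words + len(vocab) + 1))
            log_scores[label] = score
        # Turn the two log scores into a normalised probability.
        top = max(log_scores.values())
        weights = {k: math.exp(v - top) for k, v in log_scores.items()}
        return weights["spam"] / (weights["spam"] + weights["ham"])
```

With only a handful of training comments the estimates are crude, but pharmacy and gambling vocabulary separates from technical discussion quickly.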


Posted 11-14-2004 8:49 AM by Darrell Norton


Comments

Scott Galloway wrote re: Comment Spam - a solution?
on 11-14-2004 4:39 AM
I've noticed a recent trend of comment spam using some sort of summarised form of the original post (in some cases, for short posts, the whole original). The problem with that is that Bayesian filtering wouldn't really work - since there's very little to learn. I know Robert McLaws has attempted some sort of automated filtering system (http://weblogs.asp.net/rmclaws/archive/2004/11/05/252817.aspx) and the use of Captcha verification would seem to help.
Darrell wrote re: Comment Spam - a solution?
on 11-14-2004 4:52 AM
Captcha stops automated comment spam. The ones I've been plagued with lately seem to be copy-paste deals with a bunch of low-wage workers from Asia (based on IP address). These are easy in that they link to 20 or so web sites, and have nothing to do with the post, so Bayesian filtering should work.

If there is spam that is a summary of the post, and then the only link back to the spammers site is the URL field, then the only cost effective way for the spammers would be automated spamming, which captcha would presumably stop.

Blacklists like Robert is setting up have never worked with email spam, so I don't think that they will work for comment spam either. It's just too easy for spammers to change their IP address, and then if you blacklist legitimate IP addresses, people get pissed off. :)

Whatever the end result is, it will have to be a combination of things. Maybe captcha, bayesian filtering, limiting the number of links in a comment, and human moderation of comments could work. I don't know, I was just being funny saying that I was smarter than all those other people. :)
Scott Galloway wrote re: Comment Spam - a solution?
on 11-14-2004 5:00 AM
I don't know - what I've been doing is simply blocking the IP of any comment spam as quickly as possible - this seems to last about 2 weeks until a new bunch show up...The cool thing about comment spam is that for the most part it's traceable (very few seem to fake IP addresses - they're not that sophisticated - yet). One thing I've also noticed recently though is that many comment spams (to my blog at least) use the comment RSS system - which would lead me to believe it's an automated agent doing this.
Darrell wrote re: Comment Spam - a solution?
on 11-14-2004 5:31 AM
More interesting facts! Thanks Scott.
Underway Again! wrote SPAM SPAM SPAM SPAM!
on 11-14-2004 8:37 AM
SPAM SPAM SPAM SPAM!
Johannes Brodwall wrote re: Comment Spam - a solution?
on 11-14-2004 10:08 AM
I have been having a bit of a problem with comment spam as well. The solution I am currently attempting is to have a number as an image that the user has to type in to verify that they are not a robot. So far, I have gone down from roughly two to three spams per week to none. There seem to be plugins for blog software that do this.

I don't know if it affects the ability of people to comment effectively, as I don't get any real comments anyway :-)
Darrell wrote re: Comment Spam - a solution?
on 11-14-2004 10:35 AM
Johannes - I tried to comment on your blog using both Firefox and IE and I couldn't get the form to show up.
jkimble at gmail.com (Jay Kimble) wrote RE: Comment Spam - a solution?
on 11-14-2004 10:58 PM
I've seen a lot of suggestions, but nothing I can use right now. I looked around in the .Text admin and can't block IPs. Beyond Darrell's original idea.

Actually what would work for me right now is a new setting for comments... something like (comments must be approved).

Another thing that might work is turning off HTML in the comments (meaning no rich comments... only text). Spam me all you want... if you can't introduce more than, say, 1 link... I doubt it's worth it.

I've also considered turning off web service access to mysite, but I'm not sure what kind of problems that will cause.
miguel jimenez's coding blog wrote RE: Comment Spam - a solution?
on 11-14-2004 11:16 PM
I developed a captcha control to install in .Text a week ago that I'm using in my blog and that a lot of people have already installed. It stopped all my comment spam. You can check it at http://blogs.clearscreen.com/migs/archive/2004/11/10/575.aspx
Darrell wrote re: Comment Spam - a solution?
on 11-15-2004 12:33 AM
Jay - the 1 link per comment is exactly my point! Then the spammers would have to use automated methods to make it even worth it, and a captcha control is good enough, for now, to prevent the automated spam.
Darrell wrote re: Comment Spam - a solution?
on 11-15-2004 12:34 AM
Miguel - I'd love to use it, but we don't have control over things like that here at DNJ blogs.
Mark Levison wrote re: Comment Spam - a solution?
on 11-15-2004 4:30 AM
I've also had a real comment spam problem in the past few days. I get about 2-3 times as many spam comments as I get real comments.

Limiting comments to one or two links (I legitimately wrote one recently that used 4 links) sometimes makes it difficult for people providing feedback.

Instead why not give us an option to approve comments - via a link that was part of the original email. We just ignore the spam comment emails and have only one click to approve valid comments vs the three clicks to delete a comment.
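One way such a one-click approval link might work is to sign the comment ID so that only the blog owner's notification email carries a valid link. This is a sketch under assumptions: `SECRET_KEY`, the `https://blog.example/approve` URL, and the function names are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical server-side secret; a real deployment would load it from config.
SECRET_KEY = b"replace-with-a-real-secret"


def approval_token(comment_id: int) -> str:
    """Sign the comment ID so the link cannot be forged for other comments."""
    message = str(comment_id).encode("ascii")
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def approval_link(comment_id: int, base_url: str = "https://blog.example/approve") -> str:
    """Build the link to embed in the notification email (URL is made up)."""
    return f"{base_url}?comment={comment_id}&token={approval_token(comment_id)}"


def is_valid_approval(comment_id: int, token: str) -> bool:
    """Check a clicked link; the comment is published only if this returns True."""
    expected = approval_token(comment_id).encode("ascii")
    return hmac.compare_digest(expected, token.encode("utf-8"))
```

Spam notification emails are simply ignored, and the comment stays unpublished until the link is clicked.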

Added bonus: with a bit of luck, SpamBayes will begin to recognize that I'm not interested in Poker - no Texas hold 'em or any other kind.
Darrell wrote re: Comment Spam - a solution?
on 11-15-2004 4:40 AM
Mark - or "discount pharmaceuticals". :)
Johannes Brodwall wrote re: Comment Spam - a solution?
on 11-17-2004 3:56 AM
Darrell - Do you have any more information on what the problem with the comments on my blog might be? (I was able to post all right). Feel free to send me an email directly.
Greg Postlewait wrote re: Comment Spam - a solution?
on 11-19-2004 6:39 AM
I worked on a commercial anti-spam project, then stopped it once Outlook 2003 came out. It does the best job I know of to quickly sort out the good from the bad. Its issues are very minor (you do need to scan your spam folder occasionally and keep up with updates), but it's worth it.

Darrell wrote re: Comment Spam - a solution?
on 11-19-2004 7:41 AM
Greg - unfortunately I haven't been lucky enough to get a hold of any Office 2003!!! Arrgh! :)
Jim wrote re: Comment Spam - a solution?
on 11-21-2004 10:27 AM
I won it at a user group meeting. ;-)
Darrell wrote re: Comment Spam - a solution?
on 11-21-2004 12:00 PM
Oh yeah, that's right! I forgot we gave that away. :)
Elliott Back wrote re: Comment Spam - a solution?
on 11-30-2004 10:53 AM
I actually had some spam trouble that none of the WP plugins would catch, so I wrote one of my own. It's an automatic captcha that makes the client (in javascript) compute the md5 of another md5 the server gives it--that way you know with high probability that a person leaving comments was visiting your site...
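The server side of a scheme like the one Elliott describes might look like the sketch below. It is not his plugin: the function names and the in-memory store are assumptions, and `expected_response` stands in for what the client-side JavaScript would compute.

```python
import hashlib
import secrets

# Hypothetical in-memory store of challenges that have been issued but not used.
_outstanding = set()


def issue_challenge() -> str:
    """Create a random MD5 hex string to embed in the comment form."""
    challenge = hashlib.md5(secrets.token_bytes(16)).hexdigest()
    _outstanding.add(challenge)
    return challenge


def expected_response(challenge: str) -> str:
    """The MD5 of the challenge, as the browser script would compute it."""
    return hashlib.md5(challenge.encode("ascii")).hexdigest()


def verify(challenge: str, response: str) -> bool:
    """Accept a comment only if it answers a challenge this server issued."""
    if challenge not in _outstanding:
        return False
    _outstanding.discard(challenge)  # each challenge is single-use
    return response == expected_response(challenge)
```

A bot that posts straight to the comment endpoint without loading the page and running the script has no valid challenge and response pair to submit.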
Darrell wrote re: Comment Spam - a solution?
on 11-30-2004 2:59 PM
That's pretty damn good, Elliott!
Ian Davis wrote re: Comment Spam - a solution?
on 12-01-2004 11:03 PM
I implemented a solution for my WordPress weblog. I moderate all comments containing links and I require the commenter to type in a specific word of a passphrase. The word is chosen by a simple function of the post ID. The comment form has some text like "enter the 14th word of this phrase" which requires a human to read and perform the action. The word chosen by the system is always the same for each post so commenting multiple times is less onerous. Full writeup here: http://internetalchemy.org/2004/09/zero-comment-spam
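A rough sketch of the passphrase check Ian describes follows. The passphrase, the modulo rule, and the function names are assumptions made for illustration; his writeup may use a different function of the post ID.

```python
# Hypothetical passphrase; a real blog would pick its own.
PASSPHRASE = "the quick brown fox jumps over the lazy dog"


def challenge_word(post_id: int):
    """Pick a word by a simple function of the post ID (here, a modulo)."""
    words = PASSPHRASE.split()
    index = post_id % len(words)
    return index + 1, words[index]  # 1-based position and the word itself


def ordinal(n: int) -> str:
    """Format 1 as '1st', 2 as '2nd', 11 as '11th', and so on."""
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"


def prompt(post_id: int) -> str:
    """Text to show next to the comment form."""
    position, _ = challenge_word(post_id)
    return f"Enter the {ordinal(position)} word of this phrase: {PASSPHRASE}"


def check_answer(post_id: int, answer: str) -> bool:
    """Compare the typed word, ignoring case and surrounding spaces."""
    _, word = challenge_word(post_id)
    return answer.strip().lower() == word
```

Because the chosen word depends only on the post ID, a repeat commenter on the same post always sees the same question.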
Darrell wrote re: Comment Spam - a solution?
on 12-02-2004 3:00 AM
Ian - yet another good solution. These great ideas should be incorporated into the blogging engines.