Comment Spam and other Blog Spam Issues

Comment spam is one of the most common forms of blog spam these days. Here are some ways to avoid spam comments and other kinds of form spam that many bloggers suffer from today. Comments are just one channel for this pernicious activity, which can be upsetting and time-consuming to deal with, so we shall also briefly discuss other forms of spamming that can affect your blog.

Comment Spam

In case you are unsure exactly what comment spam is, we shall describe it here before explaining how and why it is generated. Only by understanding the how and why of an issue can we deal with it properly and prevent a recurrence. The problem is that the remedies tend to be inconvenient, and we get frustrated at having to go to such trouble to tackle an issue that search engines should be able to prevent … same old story!

What is Comment Spam?

This is the most common form of blog spam. Your comments box is filled in automatically by software with a generic sentiment such as ‘Nice Site,’ ‘Thanks for the useful information,’ or ‘Drop by and see me sometime.’ Whatever – it doesn’t even have to make sense. It is generally accompanied by a website or blog URL that is present for one of three major reasons.

1. PageRank and Web SEO

The spammer is hoping that you do nothing with the comment and simply leave it in place. If you do ignore it, the URL remains readable and Google may pass PageRank to the spammer’s page through your link to it. Yours is just one of potentially many millions of blog comments that have been software-generated, so the potential PageRank to the spammer is considerable.

2. Links to Sales Sites

Many such links lead to an offer for a popular blue pharmaceutical pill or some form of adult site – you know the kind of thing. The idea is that if they send out a million spam posts, then even a 0.1% CTR will result in 1000 potential customers visiting the URL. A million is low – think 100 million, and even 0.01% CTR provides 10,000 views of the spammer’s website. Spam works because it is offering in-demand products although to non-targeted readers.
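The arithmetic behind those figures is easy to check. The snippet below is only an illustration, and the function name `expected_visits` is made up for this sketch:

```python
def expected_visits(spam_posts: int, ctr_percent: float) -> int:
    """Visits a spammer can expect from a number of posts at a given click-through rate."""
    # Round to the nearest whole visitor to avoid floating-point noise.
    return round(spam_posts * ctr_percent / 100)


# One million posts at a 0.1% CTR, then 100 million posts at a 0.01% CTR.
small_campaign = expected_visits(1_000_000, 0.1)
large_campaign = expected_visits(100_000_000, 0.01)
print(small_campaign, large_campaign)
```

Even when the click-through rate drops by a factor of ten, sending a hundred times more posts still multiplies the visits by ten, which is why volume matters more to the spammer than relevance.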

What you must keep an eye open for are apparently innocuous links that are not what they seem to be. A link using the text ‘SEOMasters.com’ could actually be HTML coded to direct robots (and readers) to a more unsavory commercial web page.
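If you want to check for this kind of disguised link automatically, one possible approach is to compare the domain shown in the link text with the domain in the link's href. The sketch below uses only the Python standard library; the names `LinkCollector`, `host_of` and `mismatched_links` and the `pills.example` address are invented for illustration, and a real filter would need more care than this.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collects (href, visible text) pairs for every <a> tag in a comment."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href") or ""
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def host_of(value):
    """Return the lower-case host of a URL or bare domain, without 'www.'."""
    if "//" not in value:
        value = "//" + value  # lets urlparse treat a bare domain as a host
    host = (urlparse(value).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def mismatched_links(comment_html):
    """Return links whose visible text names one domain but whose href goes to another."""
    parser = LinkCollector()
    parser.feed(comment_html)
    suspicious = []
    for href, text in parser.links:
        # Only compare when the visible text itself looks like a domain name.
        if "." in text and " " not in text and host_of(text) != host_of(href):
            suspicious.append((text, href))
    return suspicious


comment = '<a href="http://pills.example/buy">SEOMasters.com</a>'
print(mismatched_links(comment))
```

A link whose text and destination agree is left alone, so this only highlights candidates for a manual check rather than deciding anything by itself.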

3. Googlebombing

A Googlebomb is where large numbers of links are generated from large numbers of sites to a single web page using the same anchor text, such as ‘Buy Blue Pills’. The target page then tends to improve in ranking for that keyword. Spamming is the most common means of achieving the links needed for success.

In fact, if they could, they would hide the anchor text by making it the same color as the background. They don’t want you to take any notice of it; they only want you not to delete it and for it to remain visible to search engine robots.

Here is an example of comment spam. The link led to a pharmaceutical site – it was not approved!

Comment Spam Example

Some are more obvious, such as this:

Comment Spam

If you see any comments like these that appear to be very general in nature or irrelevant to your niche, then it is highly likely that they are spam. Check the link.

Why Blog Spam is Damaging

You might think a comment such as that above (“Great site: I learned a lot from it”) is doing no harm. It seems like a fairly innocuous comment, but what if the link leads to an adult or pharmaceutical site? In fact, such comments can do you a lot of harm and here’s why:

Comment spam indicates to visitors that your blog is poorly moderated. If they see a number of meaningless comments that offer no value to the reader, then visitors are likely to regard your site as ‘just another spammy blog’ and move on.

Your readers might click on one of these links and find themselves on a web page they would rather not be on. They trusted you to offer only safe links, and what if they are kids? Spammers and comment spamming software do not distinguish between blogs intended for youngsters and those intended for adults.

Google’s Panda and Penguin algorithm updates penalize low-quality pages and manipulative links respectively, so links pointing to such unsavory websites can count against you. If you permit your blog to contain such links, then you might find it not only dropping down the rankings but deindexed altogether.

You are within your rights to take any reasonable steps to prevent it. So how do you prevent it?

Detection and Moderation Are No Cure

There is a big difference between prevention and detection. The usual first piece of advice is to follow each link provided in your comments to make sure it is genuine. It is often said that the best defense against blog comment spam is to moderate frequently.

It certainly pays to keep a close eye on your blog comments for spam and to moderate them regularly. However, that doesn’t stop spam appearing in the first place, and you can’t do that all day, every day! Moderation is no cure – it merely enables you to delete the spam comments after they arrive. If your spammers are human then perhaps they might eventually give up if they see all their comments are being deleted. But will they notice? Will they care? Absolutely not!

As we have already mentioned, the major problem is that the majority of blog spam today comes from software and auto-blogging systems. Spambots look for forms, including comment forms, and fill them in. Here are a few methods you can use to prevent spam reaching your comments box, not all of them free.

Comment Spam Prevention Techniques

A. WordPress Settings

WordPress includes some options in its ‘Settings’ area that will help you combat spam comments. Go to Dashboard -> Settings -> Discussion and you will see a number of checkboxes allowing you to control the way comments are handled. These are shown below:

Discussion Settings WordPress

Specifically note the following sections (ignore the existing settings displayed):

a) Before a comment appears: You have two options in this section, the first requiring an administrator to approve the comment before it is published. This won’t stop spam, but it will prevent it being seen until you have approved or deleted it.

The second option enables you to require the author of a comment to have had a previously approved comment before their comments can be automatically published. Selecting this option will reduce the amount of approval work you have to do. It also ensures that the author will need re-approval if they use a different email address.

b) Other comment settings: The first two options are the important ones. You should insist on the comment authors providing a name and email address. Otherwise you have no means of communication, and cannot tell whether they are using a generic name or a real one. You would not be as confident with “gary1278@mail” as you would with “rajesh@namase.com” for example.

The other important option requires users to be registered with your blog before they can comment. You will find many blogs with this requirement and also many without it. The benefit is that you know that those providing comments have already given you their email and name in the registration form, and are unlikely to be spammers.

This is assuming that you used a double opt-in system, where they had to confirm their registration via an email link. Spambots and autoblogging software cannot complete confirmation emails, and this is one way to prevent spammers accessing your site.

c) Time Limit on Blog Comments: Under ‘Other comment settings’ you will also see this section:

Close Comments after X Days

If you continue to permit comments on your old posts, spamming software is more likely than humans to post comments on them. You can prevent this by checking the box shown above. You can set the period to anything you want. Your readers can still access the posts, but spambots, as well as readers, cannot place comments.
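The rule itself is simple date arithmetic. Here is a minimal sketch of the idea, not WordPress's own code; the function name `comments_open` and the 14-day default are assumptions chosen for the example:

```python
from datetime import datetime, timedelta


def comments_open(published, now, close_after_days=14):
    """Return True while a post is still inside its commenting window."""
    # close_after_days stands in for the X you enter in the settings box.
    return now - published <= timedelta(days=close_after_days)


published = datetime(2024, 1, 1)
print(comments_open(published, datetime(2024, 1, 10)))  # inside the window
print(comments_open(published, datetime(2024, 2, 1)))   # window has closed
```

The post stays readable either way; only the ability to add new comments depends on the date check.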

d) Default article settings: While not directly relevant to blog comment spam, trackback spam can be an even more severe problem. Trackbacks are only really of use for finding out who is linking to you, and you can get this information from the Incoming Links area on your WordPress Dashboard. Uncheck this box to prevent trackbacks and pingbacks if you are getting excessive numbers of these.

e) Comment Moderation and Comment Blacklists: These are not shown in the screenshot, but appear on the ‘Discussion’ page beneath ‘Before a comment appears.’ Under ‘Comment Moderation’ you can set the maximum number of links included in a comment above which it will be sent for moderation:

Comment Moderation

Just beneath the above section you will see a reference to ‘words’ – this refers to words you can blacklist if they appear in the comment content, URL, email address or IP. You can choose either to send such content to the moderation queue, or to mark it as spam and reject the comment.
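To make the logic of these two settings concrete, here is a rough sketch of how such rules could be applied to a comment. It is not WordPress's implementation: the function `triage_comment`, its parameters and the sample values are all hypothetical, and the words are matched as plain character strings anywhere in the comment fields.

```python
import re

LINK_PATTERN = re.compile(r"https?://", re.IGNORECASE)


def triage_comment(fields, max_links=2, moderation_words=(), spam_words=()):
    """Return 'spam', 'moderate' or 'publish' for a comment.

    fields is a dict that may hold 'content', 'url', 'email' and 'ip'.
    """
    haystack = " ".join(
        fields.get(key, "") for key in ("content", "url", "email", "ip")
    ).lower()
    # Blacklisted words reject the comment outright.
    if any(word and word.lower() in haystack for word in spam_words):
        return "spam"
    # Moderation words hold the comment for a human to review.
    if any(word and word.lower() in haystack for word in moderation_words):
        return "moderate"
    # More links than the allowed maximum also sends it to the queue.
    if len(LINK_PATTERN.findall(fields.get("content", ""))) > max_links:
        return "moderate"
    return "publish"


many_links = "see http://a.example and http://b.example and http://c.example"
print(triage_comment({"content": many_links}, max_links=2))
```

Because the match is on character strings rather than whole words, a listed word also catches any longer word that happens to contain it, so choose the list with care.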

One potential problem is that you must use this feature intelligently to avoid ‘The Scunthorpe Problem.’ This will be discussed later, but in the meantime examine the name of this English town and see why it has been excluded from so many web pages, blog comments and other online publications using anti-spam systems.

B. Anti-Spam Plugins

If you use WordPress then you will have a large number of plugins to choose from, many of them relating to blog spam. The best known is Akismet, but there are others. This post is not a review of antispam plugins, but we shall mention some of those available.

Akismet Plugin

This is the most popular WordPress plugin and needs to be activated to begin working for you. It is free if your blog is personal and non-commercial; otherwise you must pay according to the level of service and size of your blog: generally $5 to $50/month. To activate Akismet, click on the link to the website:

Akismet

Then follow the instructions to get an API key. Once you have it, enter it into the box provided and Akismet will be activated.

Akismet will filter out what it believes to be spam and send it to a separate folder. You can then moderate these comments and either accept or reject each one.

WP-Hashcash Plugin

WP-Hashcash helps to eradicate comments spam on WordPress, and you can also use it to prevent trackback and pingback spam. It operates by forcing your visitors to pass through a JavaScript that detects whether they reached your website naturally using a web browser or by means of a robot.

If the JavaScript test fails, you get three options: a) send the comment for moderation, b) add it to the Akismet queue for investigation or c) simply delete it.

AVH First Defense Against Spam Plugin

This can block comment spammers before they even comment on your site. AVH First Defense uses Project Honey Pot to identify spammers. Project Honey Pot is a free service that maintains a database of reported and detected spammers and a directory of malicious IP addresses. Visitors who try to post a comment are also first checked against stopforumspam.com, which holds a database of reported spammers, such as:

AVH First Defense Against Spam Plugin

The date is the date the spam was reported. This service then works in conjunction with Akismet to keep your comments free of spam.

Cookies for Comments Plugin

Many people find Cookies for Comments to be a very effective anti-spam plugin that is specifically designed to tackle blog comment spam. The plugin does this through your HTML: it adds a stylesheet or an image to the HTML coding of your blog. This does not affect the functioning or the formatting of your blog in any way.

When somebody (or some thing) visits your blog, their browser naturally loads the blog coding, including the stylesheet or image. When that happens the software drops a cookie that should be picked up by the visitor’s browser. If the visitor leaves a comment on your blog, Cookies for Comments checks for the cookie. If it’s not there, then the comment is marked as spam.

Also, if the comment, including email and URL, is added much faster than a human can write, as happens with a machine-written comment, then it is also marked as spam. This has been reviewed as a very effective anti-spam plugin.
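The two checks described above can be sketched in a few lines. This is only an illustration of the idea and not the plugin's code; the cookie name `cfc_seen`, the five-second threshold and the function name `looks_like_spam` are all assumptions made for the example.

```python
def looks_like_spam(cookies, form_loaded_at, submitted_at,
                    cookie_name="cfc_seen", min_seconds=5.0):
    """Apply a cookie check and a typing-speed check to one comment.

    cookies is a dict of cookie names sent with the comment; the two
    timestamps are in seconds.
    """
    # A browser that loaded the stylesheet or image would hold the cookie.
    if cookie_name not in cookies:
        return True
    # A comment submitted faster than a person could type it is suspect.
    return (submitted_at - form_loaded_at) < min_seconds


print(looks_like_spam({}, 0.0, 60.0))                  # no cookie
print(looks_like_spam({"cfc_seen": "1"}, 0.0, 1.0))    # far too fast
print(looks_like_spam({"cfc_seen": "1"}, 0.0, 60.0))   # plausible human
```

Both tests are cheap to run on every submission, which is part of the appeal of this approach.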

C. Third-Party Anti-Spam Applications

You can use a third-party commenting application to tackle comment spam. Livefyre and Disqus are examples; they offer features such as requiring visitors to register with the service before commenting, support for multiple moderators, and the ability to moderate all of an individual’s comments on your blog in one go.

However, these services neither prevent spammers from making comments, nor are they necessarily user friendly. Your blog readers might object to having to register with a third-party site before they can comment on your blog. The idea is a good one, but it can antagonize some visitors.

Are they better than the above WP plugins? Perhaps, if comment spam is a very severe problem for you – maybe you are getting hundreds every day, or at least too many spammy comments for you to handle, even after Akismet has done its work. They might reduce your loading speed, but again you have to offset that against their benefits to you. In fact, many spambots seem not to be able to use some of these systems.

D. Using CAPTCHA

A very effective way of preventing spamming by software and robots is CAPTCHA. This is becoming so common these days that people are becoming used to it, and therefore less annoyed when they come across it. The term is an acronym for “Completely Automated Public Turing test to tell Computers and Humans Apart” and is designed to ensure that a human is involved and not a machine.

One modern form of CAPTCHA is designed to defeat OCR (optical character recognition) software:

reCAPTCHA: This is a free service offered by Google. It uses scanned images of old books and documents to provide a CAPTCHA image that OCR software cannot read accurately enough:

CAPTCHA Example

This will prevent robots filling in your comments form – and any other form such as your newsletter form – thus making sure that any comment spam has been manually inserted. This does not happen as frequently as spambot-generated comments.

E. CAPTCHA Alternative: The GASP Plugin

The Growmap Anti-Spambot Plugin (GASP) offers an alternative to CAPTCHA that is friendly to those who have difficulty deciphering the code. It is a one-click indication that you are not a spambot, and GASP even defeats those ‘learning bots’ that can work out which boxes to check. The reason is that it is based on JavaScript that bots cannot read (as yet!).

While commenting on your blog, your visitor will see this checkbox:

GASP Plugin

The comment will not be accepted until the box is checked. It’s as simple as that. To ensure that learning bots are defeated, dynamically named fields are added to the comment form so that the fields on each post will be differently named.

Check this system out at Growmap. It has been getting some great reviews. It is much easier and also quicker than using CAPTCHA – and while it might seem to be too simple, it works!
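To show why differently named fields trip up a bot that has memorized a form, here is a small server-side sketch. It is not GASP's actual code: the secret key, the `gasp_` prefix and both function names are hypothetical, and in the real plugin the checkbox itself is added in the visitor's browser by JavaScript.

```python
import hashlib
import hmac

SECRET_KEY = b"change-me"  # hypothetical per-site secret, for illustration only


def checkbox_field_name(post_id):
    """Derive a checkbox field name that is different for every post."""
    digest = hmac.new(SECRET_KEY, str(post_id).encode(), hashlib.sha256).hexdigest()
    return "gasp_" + digest[:12]


def passes_checkbox_test(post_id, form_data):
    """Accept the comment only if this post's own checkbox field was ticked."""
    return form_data.get(checkbox_field_name(post_id)) == "on"


learned_name = checkbox_field_name(1)
print(passes_checkbox_test(1, {learned_name: "on"}))  # ticked on the right post
print(passes_checkbox_test(2, {learned_name: "on"}))  # name replayed on another post
```

A bot that replays a field name it learned on one post fails on every other post, because each post expects a different name.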

Spam Blacklists: The Scunthorpe Problem

This is a genuine name given to what has been, and still is, a genuine problem. When selecting words to be blacklisted from a search, or even for entire websites and domains to be excluded from listings because they contain such terms, the Scunthorpe problem comes into play.

When you ban a word, you are banning that character string wherever it appears. This English town (and you can no doubt think of some in your own area) suffered because of letters 2-5 in its name.

Just to lighten the mood a little, here are a few more such issues:

  • Shitakemushrooms.com – could not be registered as a domain due to rules in place at the time (1998).
  • Scot Craig Cockburn was unable to get a Hotmail email address.
  • We won’t even begin to discuss Mr. & Mrs. Libshitz and their problems with Verizon, or why sprinter Tyson Gay had his name automatically switched to Tyson Homosexual by a word filter applied to an Associated Press story – or even the problems that the pantomime Dick Whittington has had!

So be careful of the words that you ban from your blog comments. You might also be banning some of your regular readers!
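The difference between banning a character string and banning a whole word can be seen in a short comparison. This is a sketch only, with made-up function names and a deliberately mild banned word:

```python
import re


def contains_substring(text, banned):
    """Naive filter: flags the text if a banned string appears anywhere."""
    lowered = text.lower()
    return any(word.lower() in lowered for word in banned)


def contains_whole_word(text, banned):
    """Stricter filter: flags the text only if a banned word stands alone."""
    if not banned:
        return False
    pattern = r"\b(?:" + "|".join(re.escape(word) for word in banned) + r")\b"
    return re.search(pattern, text, re.IGNORECASE) is not None


banned = ["pill"]
innocent = "I bought a new pillow yesterday"
print(contains_substring(innocent, banned))   # the naive filter flags it
print(contains_whole_word(innocent, banned))  # the whole-word filter does not
```

Whole-word matching has its own cost: a spammer writing "pills" would slip past a list containing only "pill", so you would need to list the variants you care about.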

Conclusions

Blog spam, particularly in the form of comment spam, can be a very serious problem. No blogger, whether a hobbyist or a professional, wants their visitors to see masses of spammy comments. The above advice offers four ways to reduce this type of spam:

  • A. WordPress settings on the Discussion page
  • B. WordPress anti-spam plugins
  • C. Third party anti-spam solutions
  • D. Using CAPTCHA with your comments form

Each of these has its own benefits and deficiencies. However, if you use a combination of A, B and D, you should feel much more secure from blog spam in general, and comment spam in particular.

Comments

  1. Nithin Upendran says

    Wow great. If your blogging is having a good pr and if its a do follow one then brace yourself your blogging is going to get hell lot of comments. Nowadays bloggers are paying for commenting services. Some of them are using scraping tools for commenting but its not that effective like paying for commenting services. Using captcha is one of the best ways to avoid scrapping of comments. You had pointed out some awesome tips to stay against spamming and spam comments. Thanks a lot for sharing this with us.
    Nithin Upendran recently posted – 5 Best Social Bookmarking Sites For Building Backlinks

  2. Stephan says

    Very nice article
    I think we have to kind of spam. One is Auto spam and we have some solutions to pretend like you mentioned above, the second one is people just don’t care about the article’s information and they just want to do the Backlink, and the comment’s really flat and not contribute anything to your issue.
    thanks for your sharing and looking forward to your next articles.

    stephan
    Stephan recently posted – Why you should start your business with Opencart?

  3. Dangem Piodena says

    I don’t really mind spam comment but if their comment is not related to your post
    or something that is beyond your imagination, then its a big NO. I dont like it specially
    if it was posting many comment with the same content of words. its a big no no!
    that is why i am thankful for this advice. Thank you this is very useful!
    Dangem Piodena recently posted – PAUTAKAN GAME 7 – The Warriors Way

  4. Adrienne says

    Hi Rajesh,

    I have to admit that this is one of the most thorough posts on spam I’ve read to date. Not only did you let people know what it is and how harmful it can be but gave us suggestions for alternative measures to stop more.

    I use CommentLuv Premium and I love this plugin. Like everyone else though, I still get spam. If I install the AVH First Defense Against Spam plugin do you think that would help as well? I know they’re human but they’ve got to be on that spam list for sure because of the crap they leave.

    I so appreciate this post so thank you for taking the time to share this with us and all the research you did.

    Have a great week.

    ~Adrienne
    Adrienne recently posted – How Bloggers Help Bloggers Increase Their Incomes

  5. Kumar Gauraw says

    Hi Rajesh,

    Awesome coverage on almost every aspect of blog comment spam and yes, some of the plugins you have listed do a fantastic job of identifying spam and blocking them. I use commentluv and that helps doing the job it is supposed to do with GASP in addition to Akismet to cover my bases.

    However, one thin I noticed is (and probably you would write a post about it if you find it interesting), comment spambots do not even visit your website and therefore, even if their comments go in SPAM queue, they still consume your resources by inserting that junk in your MYSQL. They bypass CAPTCHA or GASP check easily and only Akismet is able to work on the comments to put them in SPAM. But that doesn’t help saving our server resources.

    Is there anyway to block spambots from inserting comments? Yes, there is a .htaccess rules method but somehow that doesn’t seem to be working. If you have some insight in that area, it will really be very helpful.

    Thanks.
    Kumar
    Kumar Gauraw recently posted – 9 Characteristics Of Highly Successful Entrepreneurs

    • says

      In Akismet Settings there is an option – ” Auto-delete spam submitted on posts more than a month old.” You can use this option so that there will not be any junk in your site database. Hope this helps.

      • says

        Rajesh,

        Well, it helps for sure but it’s not exactly what I am looking for.

        What I am trying to find is, a way to block spambots from being able to insert junk comments into my database and thus save the CPU resources my site uses to insert that junk.

        I am not sure if that makes sense but my goal is to reduce operational load on my database by blocking spambots. Any ideas?

        Thanks,
        Kumar
        Kumar Gauraw recently posted – 9 Characteristics Of Highly Successful Entrepreneurs

 Comment Policy

Your words are your own, so be nice and helpful if you can. Please, only use your REAL NAME, not your business name or keywords. Using business name or keywords instead of your real name will lead to the comment being deleted. Anonymous commenting is not allowed either. Limit the amount of links submitted in your comment. We accept clean XHTML in comments, but don't overdo it please. You can wrap code in [lang-name][/lang-name] tags.

