Using .htaccess to Block Spam Bots

Following on from my attempts to block trackback spam, here’s another way to block those pesky spam bots. This only works if your server uses Apache.

Because this only works on Apache, you may have already guessed that it uses the .htaccess file to prevent the spam bot from reaching your page. To block a spam bot, you’ll need to know its User Agent, the identifying string the bot sends with each request. You can find it in your web stats package. If you’re using WordPress, then you can use any of the available stats packages to find the user agents hitting your site. (I use FireStats and StatPress.)

Before you edit your .htaccess, a word of warning:

Be very sure of what you’re doing.

Because .htaccess controls access to your website, a mistake in it could leave you locked out of your own website. If that happens, delete or replace the file over FTP. If you can’t, you’ll need to contact your hosting provider and ask them to make the changes for you.

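Once you’re ready to edit your .htaccess file, open your preferred text editor and save a blank file as htaccess. (If your site already has a .htaccess file, as many WordPress installs do for permalinks, download it and work on a copy of that instead, so that you don’t overwrite the existing rules.) Notice that there is no dot in front of the name. Many operating systems, including Mac OS X and Linux, hide files whose names start with a dot by default, and you’ll need to be able to find the file later on.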

At the top of your new htaccess file type the following:

SetEnvIfNoCase User-Agent "^User Agent To Be Blocked" bad_bot

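If you want to block more than one User Agent, add a copy of the line above for each spam bot, changing the text in quotes each time. The quoted text is a regular expression, and the leading ^ means the User Agent must start with that text.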
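As a rough sketch of what several of these lines might look like together: the bot names below are made up for illustration, so swap in the User Agents you actually see in your own stats.

```apache
# Hypothetical User Agent strings, for illustration only.
# Replace them with the ones from your own stats package.

# Matches any User Agent that starts with "ExampleSpamBot".
SetEnvIfNoCase User-Agent "^ExampleSpamBot" bad_bot

# No leading ^, so this matches "EvilHarvester" anywhere in the string.
SetEnvIfNoCase User-Agent "EvilHarvester" bad_bot

# The dot is escaped because the pattern is a regular expression.
SetEnvIfNoCase User-Agent "^FakeCrawler/1\.0" bad_bot
```

Every line sets the same bad_bot variable, so a single Deny rule can catch all of them. Note that each comment sits on its own line, since Apache does not allow a comment at the end of a directive.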

Under this, you’ll need to add the following lines:

<Limit GET POST>
Order Allow,Deny
Allow from all
Deny from env=bad_bot
</Limit>
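The Order, Allow and Deny directives belong to Apache 2.2 and earlier. On Apache 2.4 they only work if the host has loaded mod_access_compat. If your host runs 2.4 without it, something along these lines should do the same job. This is a sketch, assuming your host lets .htaccess files set authorization rules (AllowOverride AuthConfig), and it leaves out the Limit wrapper so that the rule applies to every request method rather than just GET and POST.

```apache
# Sketch for Apache 2.4 and later (mod_authz_core).
# Assumes bad_bot has been set by the SetEnvIfNoCase lines at the top of the file.
<RequireAll>
    # Let everyone in by default...
    Require all granted
    # ...except requests where bad_bot has been set.
    Require not env bad_bot
</RequireAll>
```

Use one style or the other, not both, as mixing the old and new directives in the same file can give confusing results.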

Save the changes you’ve just made, and fire up your preferred FTP program. Upload the file to the root directory of your website or blog using ASCII transfer mode. Once the file has been uploaded, use your FTP program to rename the file to .htaccess.

Check your website to ensure that you can access it, and you’re good to go!