Apache and PHP vs. the Spambots Continued
So now you know how to stop a spambot. You can either force it to download the same useless page each time it tries to get something from your site, or have it run through several hundred pages of links to nowhere.
If you're intent on feeding fake email addresses to the spammers, you can get Wpoison, a Perl program, and either port it to PHP or run it as-is.
You should also keep in mind that using PHP and Apache in the way I describe here isn't foolproof. If a spambot obeys the robots exclusion protocol (that is, it reads your robots.txt file and stays out of the directories you name), it won't be caught by this method, and it will still look for email addresses in the directories it is allowed to visit. What you should do, then, is keep robots out of the sections of your site that have lots of email addresses (bulletin boards, guest books, etc.). Using the robots exclusion protocol, you can let specific good robots (from legitimate search engines) into those areas and keep the rest out. You can also use the Robo-Cop program I talked about earlier as a back-up to detect robots that obey your robots.txt file, but which you'd still like to keep off your site.
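As a rough sketch of that last idea, a robots.txt file along the following lines would admit one named search-engine robot everywhere and keep every other compliant robot out of the address-heavy areas. The directory names /guestbook/ and /forum/ are hypothetical stand-ins for wherever your own site keeps such pages, and Googlebot is used only as an example of a robot you might trust. Remember that this restrains only robots that choose to obey the protocol.

```
# Example only: the paths below are hypothetical.
# An empty Disallow line means this robot may fetch everything.
User-agent: Googlebot
Disallow:

# All other robots: stay out of the pages full of email addresses.
User-agent: *
Disallow: /guestbook/
Disallow: /forum/
```

A robot that finds a record naming it specifically follows that record instead of the catch-all "*" record, which is why the trusted robot is listed separately.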
Edward Piou is an ahref.com producer and runs ep Productions, Inc., a development company based in the Washington, D.C. area.