ROBOTS & SPIDERS
SITES
BotSpot
A metasite for everything to do with bots (aka robots), spiders, and intelligent agents. Includes links to various bot resources and technical papers, and profiles working bots from around the Net.
Protect Your Webserver From Spam Harvesters
A list of known spam harvesting programs, and instructions on using Apache's mod_rewrite module to divert spambots.
Spambot Beware!
Information about spambots, detecting them, and avoiding them. Talks about ways to encode email addresses so they don't look like email addresses.
The Web Robots Pages
Information on robots (aka spiders), programs that traverse the Web automatically. Includes an FAQ and information on the Robots Exclusion protocol, a way to keep robots off your site.
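
As a rough illustration of the Robots Exclusion protocol mentioned above, the sketch below uses Python's standard urllib.robotparser module to check whether a robot may fetch a URL. The robots.txt content, the robot names (ExampleBot, BadBot), and the example.com URLs are made up for illustration and do not come from any of the listed sites.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: every robot is kept out of /private/,
# and a robot calling itself "BadBot" is kept off the whole site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: BadBot
Disallow: /
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A well-behaved robot asks before fetching each URL.
print(parser.can_fetch("ExampleBot", "http://example.com/index.html"))
print(parser.can_fetch("ExampleBot", "http://example.com/private/data.html"))
print(parser.can_fetch("BadBot", "http://example.com/index.html"))
```

Note that the protocol is advisory: it only keeps out robots that choose to honor it, which is why the spambot resources in this list rely on server-side measures instead.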
ARTICLES
A Clever Crawler: The Human Brain, Metadata And Smarter Search Engines
The future of information retrieval (i.e. searching) may lie in clever use of metadata. (1/18/2001 at SF Gate)
Save Your Site From Spambots
Using Apache's mod_rewrite module to stop spambots from scraping email addresses from your website. (7/14/2001 at Web Techniques)
Search Engine Friendly PHP Pages
Tricks for writing dynamically generated PHP pages that look static, and so are more likely to be indexed by search engines. (4/24/2001 at Zend)
To Cloak or Not to Cloak?
A look at the issue of cloaking - altering the content your website sends back to search engines based on the engines' IP addresses, to manipulate search engine ranking. (9/13/2001 at SF Gate)
Tough Times for Data Robots
A court has ordered Verio Inc. to stop using robots to extract data from Register.com's Whois database, under the theory that the robots use Register.com's computer resources, thus causing the company harm. (1/12/2001 at The New York Times)
Where Did All the Bots Go?
Web robots that were supposed to help you search for information or hunt for low prices have not taken off as expected. (1/17/2001 at ClickZ Network)
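
The spambot resources listed above ("Protect Your Webserver From Spam Harvesters" and "Save Your Site From Spambots") describe diverting harvesters with Apache's mod_rewrite. The sketch below is not mod_rewrite configuration; it is a Python stand-in for the general idea of matching a request's User-Agent header against a list of harvester names and refusing the request. The names in HARVESTER_NAMES are illustrative assumptions, not the list those resources publish, and is_harvester and status_for are hypothetical helpers.

```python
import re

# Illustrative harvester names only (an assumption, not the published list).
HARVESTER_NAMES = ["EmailSiphon", "EmailWolf", "ExtractorPro"]

# One case-insensitive pattern that matches any listed name
# anywhere in the User-Agent string.
HARVESTER_RE = re.compile(
    "|".join(re.escape(name) for name in HARVESTER_NAMES),
    re.IGNORECASE,
)


def is_harvester(user_agent: str) -> bool:
    """Return True if the User-Agent string contains a listed harvester name."""
    return bool(HARVESTER_RE.search(user_agent or ""))


def status_for(user_agent: str) -> int:
    """Pick an HTTP status: 403 Forbidden for harvesters, 200 OK otherwise."""
    return 403 if is_harvester(user_agent) else 200


print(status_for("EmailSiphon"))
print(status_for("Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"))
```

A User-Agent header is trivial to forge, so a check like this only stops harvesters that identify themselves honestly.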



World Internet Alliance Home Page
Home page for a coalition that provides information about issues of Internet governance, and hopes to bring the various Internet "stakeholders" into an environment where they can make decisions together.