The Web Robots Pages

The Web Robots Database

The List of Active Robots has been changed to a new format, called The Web Robots Database. This format will allow more information to be stored, updates to happen faster, and the information to be more clearly presented.

Note that now that robot technology is being used in an increasing number of end-user products, this list is becoming less useful and less complete.

For general information on robots, see the Web Robots Pages.

The robot information is now stored in individual files, with several HTML tables providing different views of the data.

Browsers without support for tables can consult the overview of text files.
The combined raw data in machine-readable format is available in a text file.

Feel free to email any feature requests (such as better sorting, or a plain HTML view) -- the more I get, the more likely I am to implement them.

To add a new robot, fill in this empty template, using this schema description, and email it to m.koster@webcrawler.com


Others

There are robots out there on which the database contains no details. If and when I get those details they will be added; until then they will remain on the list below, as unresponsive or unknown sites.

Services with no information

These services must use robots, but haven't replied to requests for an entry...
Magellan
User-agent field: Wobot/1.00
From: mckinley.mckinley.com (206.214.202.2) and galileo.mckinley.com (206.214.202.45)
Honors "robots.txt": yes
Contact: cedeno@mckinley.mckinley.com (or possibly: spider@mckinley.mckinley.com)
Purpose: Resource discovery for Magellan (http://www.mckinley.com/)
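
An entry like the one above is just a name followed by "Field: value" lines, so it is easy to read into a dictionary. The following is only a rough sketch: `parse_entry` is a hypothetical helper, the field names are simply those shown in the Magellan entry, and none of this is taken from the database's actual schema description. Lines with no "Field: " prefix are assumed to be continuations of the previous field.

```python
def parse_entry(text):
    """Read a name line followed by 'Field: value' lines into a dict.

    Assumption: a line without ': ' is the entry name if no field has been
    seen yet, otherwise a continuation of the previous field's value.
    """
    entry = {"name": None}
    last_key = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        key, sep, value = line.partition(": ")
        if sep:
            # A new field starts here.
            last_key = key
            entry[key] = value
        elif last_key is None:
            # No field seen yet, so this is the service name.
            entry["name"] = line
        else:
            # Wrapped line: append it to the field above.
            entry[last_key] += " " + line
    return entry


SAMPLE = """Magellan
User-agent field: Wobot/1.00
Honors "robots.txt": yes
Contact: cedeno@mckinley.mckinley.com (or possibly:
spider@mckinley.mckinley.com)
Purpose: Resource discovery for Magellan (http://www.mckinley.com/)
"""

entry = parse_entry(SAMPLE)
print(entry["name"], "-", entry["User-agent field"])
```

Splitting on the first ": " (colon plus space) rather than on a bare colon keeps the URL in the Purpose field in one piece.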

User Agents

These look like new robots, but have no contact info...
BizBot04 (kirk.overleaf.com)
HappyBot (gserver.kw.net)
CaliforniaBrownSpider
EI*Net/0.1  libwww/0.1
Ibot/1.0 libwww-perl/0.40
Merritt/1.0
StatFetcher/1.0
TeacherSoft/1.0  libwww/2.17
WWW Collector
processor/0.0ALPHA libwww-perl/0.20
wobot/1.0 from 206.214.202.45
Libertech-Rover (www.libertech.com?)
WhoWhere Robot
ITI Spider
w3index
MyCNNSpider
SummyCrawler
OGspider
linklooker
CyberSpyder (amant@www.cyberspyder.com)
SlowBot
heraSpider
Surfbot
Bizbot003
WebWalker
SandBot
EnigmaBot

Hosts

These have no known user-agent, but have requested /robots.txt repeatedly or exhibited crawling patterns.
205.252.60.71
194.20.32.131
198.5.209.201
acke.dc.luth.se
dallas.mt.cs.cmu.edu
darkwing.cadvision.com
waldec.com
www2000.ogsm.vanderbilt.edu
unet.ca
murph.cais.net (rapid fire... sigh)
spyder3.microsys.com
www.freeloader.com.
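
If you want to spot hosts like these in your own server logs, counting /robots.txt requests per host is a reasonable first pass. This is a minimal sketch, assuming the log is in Common Log Format; the function name `robots_txt_requesters`, the `min_requests` threshold, and the example hosts are all hypothetical, and this is not how the list above was compiled.

```python
from collections import Counter


def robots_txt_requesters(log_lines, min_requests=3):
    """Return (host, count) pairs for hosts that fetched /robots.txt
    at least min_requests times, most frequent first.

    Assumption: each line is in Common Log Format, i.e.
    host ident user [date] "METHOD path protocol" status bytes
    """
    counts = Counter()
    for line in log_lines:
        parts = line.split('"')
        if len(parts) < 3:
            continue  # no quoted request field, skip the line
        prefix = parts[0].split()
        request = parts[1].split()
        if prefix and len(request) >= 2 and request[1] == "/robots.txt":
            counts[prefix[0]] += 1
    return [(host, n) for host, n in counts.most_common() if n >= min_requests]


# Made-up log lines for illustration only.
SAMPLE_LOG = [
    'crawler.example.com - - [10/Oct/1996:13:55:36 -0700] "GET /robots.txt HTTP/1.0" 200 120',
    'crawler.example.com - - [10/Oct/1996:13:55:37 -0700] "GET /index.html HTTP/1.0" 200 2326',
    'crawler.example.com - - [10/Oct/1996:14:10:02 -0700] "GET /robots.txt HTTP/1.0" 200 120',
    'browser.example.org - - [10/Oct/1996:14:11:40 -0700] "GET /robots.txt HTTP/1.0" 200 120',
    'crawler.example.com - - [10/Oct/1996:15:02:11 -0700] "GET /robots.txt HTTP/1.0" 200 120',
]

suspects = robots_txt_requesters(SAMPLE_LOG)
print(suspects)  # [('crawler.example.com', 3)]
```

A host that asks for /robots.txt once may just be a curious person with a browser; repeated requests are the more telling pattern, which is why the sketch applies a threshold.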

Some other robots are mentioned in a list of Japanese Search Engines.