Heritrix
Patterns
archive-it
archive.org_bot
- archive.org_bot
- Mozilla/5.0 (compatible; archive.org_bot/1.5.0-200506132127 +http://crawler.archive.org) Hurricane Katrina
- Mozilla/5.0 (compatible;archive.org_bot/1.7.1; collectionId=316; Archive-It; +http://www.archive-it.org)
- Mozilla/5.0 (compatible;archive.org_bot/heritrix-1.9.0-200608171144 +http://pandora.nla.gov.au/crawl.html)
heritrix
- Mozilla/5.0 (compatible; heritrix/1.8.0 http://wiki.office.aol.com/wiki/SEO)
- mozilla/5.0 (compatible; heritrix/1.0.4 +http://accelovation.com)
- Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; heritrix/1.3.0 http://www.cs.washington.edu/research/networking/websys/)
- mozilla/5.0 (compatible; heritrix/1.0.4 http://innovationblog.com)
- Mozilla/5.0 (compatible; heritrix/1.10.2 +http://i.stanford.edu/)
- mozilla/5.0 (compatible; heritrix/1.3.0 http://archive.crawler.org)
- Mozilla/5.0 (compatible; heritrix/1.4.0 +http://www.chepi.net)
- Mozilla/5.0 (compatible; heritrix/1.4t http://www.truveo.com/)
- Mozilla/5.0 (compatible; heritrix/1.5.0 http://www.l3s.de/~kohlschuetter/projects/crawling/)
- Mozilla/5.0 (compatible; heritrix/1.5.0-200506231921 http://pandora.nla.gov.au/crawl.html)
- Mozilla/5.0 (compatible; heritrix/1.6.0 http://www.worio.com/)
- Mozilla/5.0 (compatible; Heritrix/1.8.0 http://www.hanzoarchives.com)
- Mozilla/5.0 (compatible; heritrix/1.x.x +http://www.accelobot.com)
http://archive.crawler.org
http://crawler.archive.org
http://i.stanford.edu
http://innovationblog.com
http://pandora.nla.gov.au/crawl.html
http://wiki.office.aol.com/wiki/seo
http://worio.com
http://www.accelobot.com
http://www.archive-it.org
http://www.chepi.net
http://www.cs.washington.edu/research/networking/websys
http://www.hanzoarchives.com
http://www.l3s.de/~kohlschuetter/projects/crawling
http://www.truveo.com
http://www.worio.com
hurricane katrina
internetarchive
worio bot heritrix
worio heritrix bot
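
The patterns above appear in lower case, while the example user-agent strings use mixed case, which suggests case-insensitive substring matching. The following is a minimal sketch under that assumption; the function name `matching_patterns` and the choice of pattern subset are hypothetical and only for illustration.

```python
# Subset of the patterns listed above. The matching rule
# (case-insensitive substring search) is an assumption, not a documented rule.
PATTERNS = ["archive-it", "archive.org_bot", "heritrix", "internetarchive"]


def matching_patterns(user_agent, patterns=PATTERNS):
    """Return the patterns that occur in user_agent, ignoring case."""
    ua = user_agent.lower()
    return [p for p in patterns if p in ua]


ua = ("Mozilla/5.0 (compatible;archive.org_bot/1.7.1; collectionId=316; "
      "Archive-It; +http://www.archive-it.org)")
print(matching_patterns(ua))  # ['archive-it', 'archive.org_bot']
```

Lower-casing the user-agent once before comparison keeps strings such as `Heritrix/1.8.0` and `heritrix/1.8.0` under the same pattern.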