How do I find a list of bad spiders and crawlers?
Plusnet Community : Forum : Help with my Plusnet services : Everything else
01-01-2008 10:43 PM
The tutorials on Plusnet usertools are good value.
http://usertools.plus.net/tutorials/id/5 and http://usertools.plus.net/tutorials/id/48
make good sense for keeping unwanted site traffic down. Fine, I now know how to keep GoogleImage's paws off my pics, how to stop other sites using my bandwidth to serve my binaries, and quite a bit more. But can anyone tell me, please, where I can find a list of ill-behaved spiders so that I can put them into my robots.txt and .htaccess files to keep them out?
Wikipedia has a list in its robots.txt file, but I imagine that someone somewhere keeps an up-to-date and reasonably reliable list.
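Whichever list ends up being used, turning it into robots.txt entries is mechanical. As a rough sketch in Python (the bot names are made-up placeholders rather than real crawlers, and `build_robots_txt` is a hypothetical helper, not something from the tutorials), each unwanted user agent gets its own record asking it to stay out of the whole site:

```python
# Placeholder names for illustration only; substitute entries from a real list.
BAD_AGENTS = ["ExampleBadBot", "AnotherScraper"]


def build_robots_txt(bad_agents):
    """Return robots.txt text that asks each named agent to stay out entirely."""
    blocks = [f"User-agent: {name}\nDisallow: /\n" for name in bad_agents]
    # Joining with a newline leaves one blank line between records.
    return "\n".join(blocks)


if __name__ == "__main__":
    print(build_robots_txt(BAD_AGENTS))
```

This covers only the robots.txt half of the question. It depends on the crawler choosing to obey the file, so blocking in .htaccess would still be a separate step.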
Message 1 of 5
(2,019 Views)
4 REPLIES
Re: How do I find a list of bad spiders and crawlers?
05-01-2008 10:06 AM
There are some useful links at the end of http://en.wikipedia.org/wiki/Spambot.
Message 2 of 5
(583 Views)
Re: How do I find a list of bad spiders and crawlers?
09-01-2008 10:40 AM
Why would you want to keep Google away? It's one of the top search engines and can bring you traffic, yet you want to close the doors on it.
I personally use a sitemap and upload it to Google Webmaster Tools, which tells Google when I've made changes.
Message 3 of 5
(583 Views)
Re: How do I find a list of bad spiders and crawlers?
09-01-2008 6:57 PM
I haven't read the articles that Deepee referred to, but I assume that he wants to use .htaccess to stop Google (and anybody else) using up his bandwidth. I assume he's not using robots.txt to keep Google away completely.
Google seems to have made some improvements over Christmas - I've found that Google indexes my site within a minute of me making any changes.
Message 4 of 5
(583 Views)
Re: How do I find a list of bad spiders and crawlers?
17-01-2008 7:18 PM
The idea was indeed to let Google and others into the HTML files but to keep the GoogleImage bot, and others that would trough the images, out. robots.txt will work with well-behaved bots, such as Google, but apparently not with the badly behaved ones.
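To illustrate why robots.txt only helps with cooperative crawlers, here is a small sketch using Python's standard `urllib.robotparser`. The robots.txt content is an assumed example, not my actual file: it asks Google's image crawler (user agent `Googlebot-Image`) to stay out and leaves everyone else unrestricted. The `allowed` helper is a hypothetical name.

```python
from urllib.robotparser import RobotFileParser

# Assumed example rules: keep the image crawler out, allow everything else.
ROBOTS_TXT = """\
User-agent: Googlebot-Image
Disallow: /

User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent, path):
    """Answer what a polite crawler would conclude from the rules above."""
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    for agent in ("Googlebot", "Googlebot-Image"):
        print(agent, allowed(agent, "/images/photo.jpg"))
```

The check runs on the crawler's side: a bot that never consults the file is not stopped by it, which is why the badly behaved ones have to be refused by the server itself.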
While keeping the bad bots out with .htaccess, it's also possible to prevent 'hotlinking'. http://usertools.plus.net/tutorials/id/48 gives a good introduction.
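The decision behind hotlink protection can be sketched in a few lines of Python. This is not the tutorial's code, which works in .htaccess. It is only an illustration of the usual logic, assuming the check is made on the request's Referer header. The host names, the extension list and the `is_hotlink` function are all hypothetical.

```python
from urllib.parse import urlparse

# Hypothetical values: the site's own host names and the file types to protect.
ALLOWED_HOSTS = {"example.com", "www.example.com"}
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif")


def is_hotlink(path, referer):
    """Return True if an image request appears to come from another site's page."""
    if not path.lower().endswith(IMAGE_EXTENSIONS):
        return False  # only images are protected in this sketch
    if not referer:
        return False  # a missing Referer is let through; some clients strip it
    host = urlparse(referer).hostname
    return host not in ALLOWED_HOSTS


if __name__ == "__main__":
    print(is_hotlink("/pics/cat.jpg", "http://other-site.example.net/page.html"))
    print(is_hotlink("/pics/cat.jpg", "http://www.example.com/gallery.html"))
```

Letting requests with an empty Referer through is a design choice: it avoids breaking direct visits and privacy proxies, at the cost of missing any client that hides the header.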
Message 5 of 5
(583 Views)