Search engine crawlers are essential for bringing an audience to your website, but if they hit it continuously they can hurt its performance. If you want to limit what search engine crawlers crawl, the first and most basic way to do that is robots.txt.
robots.txt is a plain-text file that tells crawlers which parts of your website they may crawl and which parts or directories they may not. It follows a simple written format, and it can also block or allow individual bots by their names.
The steps below show how robots.txt works.
1. If your website is yourdomain.com, then robots.txt should be placed in the root of the site, here:
http://yourdomain.com/robots.txt
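Crawlers look for the file only at the root of the host, not next to the page they are visiting. As a small sketch of that rule, the Python snippet below derives the robots.txt address from any page address. The helper name robots_url and the sample page URL are made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url):
    """Return the robots.txt URL for the host that serves page_url.

    Hypothetical helper: keeps the scheme and host, replaces the path
    with /robots.txt and drops any query string or fragment.
    """
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url("http://yourdomain.com/blog/post?id=1"))
```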
2. These are some of the top bots:
Googlebot, Slurp (Yahoo), bingbot, AhrefsBot, Baiduspider, Ezooms, MJ12bot, YandexBot
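3. robots.txt has two main directives
- User-agent - names the bot that the following rules apply to (* means all bots)
- Disallow - gives a path the bot is not allowed to crawl (/ means the whole site, an empty value means nothing is disallowed)
Examples: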
1) If you want to allow only Googlebot
User-agent: Googlebot
Disallow:
User-agent: *
Disallow: /
2) If you want to allow all bots
User-agent: *
Disallow:
3) If you want to disallow a particular bot
User-agent: *
Disallow:
User-agent: AhrefsBot
Disallow: /
4) If you want to prevent some directories from being crawled
User-agent: *
Disallow: /scripts/
Disallow: /themes/
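Before uploading a robots.txt, it can help to check how a parser reads it. The sketch below uses Python's standard urllib.robotparser module on a rule set that blocks AhrefsBot entirely and keeps all other bots out of /scripts/ and /themes/. The rule text and the sample URLs are assumptions chosen for illustration, and real crawlers may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# Sample rules (illustrative): AhrefsBot is blocked everywhere,
# every other bot is blocked only from /scripts/ and /themes/.
RULES = """\
User-agent: AhrefsBot
Disallow: /

User-agent: *
Disallow: /scripts/
Disallow: /themes/
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())  # parse from text, no network request


def allowed(bot, url):
    """Return True if the named bot may fetch the URL under RULES."""
    return parser.can_fetch(bot, url)


for bot, url in [
    ("AhrefsBot", "http://yourdomain.com/index.html"),
    ("Googlebot", "http://yourdomain.com/index.html"),
    ("Googlebot", "http://yourdomain.com/scripts/app.js"),
]:
    print(bot, url, "allowed" if allowed(bot, url) else "blocked")
```

With these rules, only the second request (Googlebot fetching index.html) is reported as allowed.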
Please share your thoughts and feedback in the comments below, and add anything useful you have found elsewhere to help others.
Hit the Like button if you liked the post.
Many thanks