What is a robots.txt file?


A "robot", "bot", or "spider" is a computer program that crawls the internet for a variety of purposes.  These bots can be useful (for example, Googlebot crawling your website to add it to the Google search engine), or they can be undesirable (for example, bots crawling your website looking for forms and email addresses to send spam to).


A robots.txt file is a small text file that tells bots which areas of a website the website's owner would like them to crawl and which areas should be off limits. Please note that bots are not required to obey the file: nefarious bots may ignore it completely and continue to crawl pages of your website that you disallowed.
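As an illustration of how a well-behaved bot reads these rules, here is a minimal Python sketch using the standard library's urllib.robotparser module. The file contents, the /admin/ and /private/ directories, and the bot name "ExampleBot" are made up for this example and are not taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents: every bot ("*") is asked to stay out of
# two made-up directories; everything else remains open to crawling.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /private/
"""

sample_parser = RobotFileParser()
sample_parser.parse(SAMPLE_ROBOTS_TXT.splitlines())


def is_allowed(bot_name: str, url: str) -> bool:
    """Return True if the sample rules permit bot_name to crawl url."""
    return sample_parser.can_fetch(bot_name, url)


# "ExampleBot" is a hypothetical bot name used only for illustration.
print(is_allowed("ExampleBot", "http://www.YourDomain.com/index.html"))   # True
print(is_allowed("ExampleBot", "http://www.YourDomain.com/admin/login"))  # False
```

This only shows what a bot that honors the file would do; a bot that ignores robots.txt never performs this check.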


For more detailed information about bots, what a robots.txt file is, and the information to add to it, please see this website: http://www.robotstxt.org



How to upload a custom robots.txt file


Uploading a robots.txt file works exactly the same as uploading any other file to the root of your website.  Here are the steps again for convenience:

  1. Log in to the Admin area of your site
  2. Navigate to Developers -> (FTP) File Manager
  3. Click on “Upload”
  4. Click on “Choose File” or “Browse” (the label depends on the browser you are using)
  5. Choose the robots.txt file on your hard drive
  6. Click “Open” and the file will now be uploaded automatically
  7. Once the upload is complete, return to the File Manager by clicking the "Back to..." link at the bottom of the screen, or by closing the browser tab you used for the upload
  8. The file will now be accessible at http://www.YourDomain.com/robots.txt
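
If you would like to confirm the result of the steps above from a script rather than a browser, the following Python sketch is one way to do it. The function names robots_url and fetch_robots are hypothetical helpers written for this example, and http://www.YourDomain.com is the same placeholder used in step 8.

```python
from urllib.parse import urljoin
from urllib.request import urlopen


def robots_url(site_address: str) -> str:
    """Return the address of robots.txt at the root of the given site."""
    # A leading slash makes urljoin replace any path in site_address,
    # so the result always points at the root of the site.
    return urljoin(site_address, "/robots.txt")


def fetch_robots(site_address: str, timeout: float = 10.0) -> str:
    """Download robots.txt and return its text.

    Raises an error from urllib if the file is missing or the site
    cannot be reached.
    """
    with urlopen(robots_url(site_address), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

Calling print(fetch_robots("http://www.YourDomain.com")) with your own domain in place of the placeholder should print the contents of the file you uploaded.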


Save bandwidth with a robots.txt file


Every time a bot crawls a page on your website, it must load the page, which counts towards your site's monthly bandwidth usage.  The only reliable way to stop a bot from crawling your website completely is to block its IP address.  Even so, a robots.txt file that disallows all bots except the ones you want can prevent some bots (those that honor the file) from crawling your website.

At the bottom of this article there is an example robots.txt file attached which allows the bots from the popular search engines Google, Yahoo, and Bing to crawl your site, but attempts to block ALL others (including other search engines and other potentially beneficial bots).

Please note that this robots.txt file is an example only, and it is highly recommended that you only add this robots.txt file to your website if you have a clear understanding of the potential impact it can have on your website.
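
The attached file is not reproduced here. As a rough sketch of what rules of this kind might look like, the Python example below embeds a hypothetical robots.txt and checks it with the standard library's urllib.robotparser module. It assumes the user-agent names Googlebot (Google), Slurp (Yahoo), and bingbot (Bing); please check each search engine's own documentation for the current names. "SomeOtherBot" is a made-up name standing in for any other bot.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical allowlist rules (an assumption, not the attached file):
# an empty "Disallow:" permits everything for the named bot, while
# "Disallow: /" asks every remaining bot to stay off the whole site.
ALLOWLIST_ROBOTS_TXT = """\
User-agent: Googlebot
Disallow:

User-agent: Slurp
Disallow:

User-agent: bingbot
Disallow:

User-agent: *
Disallow: /
"""

allowlist_parser = RobotFileParser()
allowlist_parser.parse(ALLOWLIST_ROBOTS_TXT.splitlines())


def may_crawl(bot_name: str, url: str = "http://www.YourDomain.com/") -> bool:
    """Return True if the sample allowlist permits bot_name to crawl url."""
    return allowlist_parser.can_fetch(bot_name, url)


# The three named search engine bots are allowed; any other bot is not.
for bot in ("Googlebot", "Slurp", "bingbot", "SomeOtherBot"):
    print(bot, may_crawl(bot))
```

Running a check like this before uploading can help you see which bots a set of rules would turn away, including other search engines and potentially beneficial bots.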