What Is Robot.txt?

27th April 2010, 09:45 AM
Hello Friends,

Can anyone explain what robots.txt is and how it can help with SEO?


28th April 2010, 11:22 PM
Hi Jen,

robots.txt is used to ask search engine spiders, or "bad bots" (yes, we can create our own bots), not to crawl and index certain files and directories on our sites.

However, bots can disregard robots.txt and still crawl your site, and robots.txt is a double-edged sword: the file is publicly viewable.

It's like saying, "these are my protected files and folders, please don't crawl them."

You're better off using .htaccess if you're hosting on an Apache server (common on UNIX-based hosts); it has many more features than robots.txt.

In terms of SEO, you can use .htaccess for 301 redirects, rewriting URLs into SEO-friendly URLs, custom 404 pages, hiding files and folders, preventing hotlinking of your files and images, etc.

Search Google for ".htaccess SEO" to find out more.

1st May 2010, 08:07 AM
You can use robots.txt to tell search engines not to crawl certain files, which usually keeps them out of the index too. This helps you control which parts of your website search engines consider for showing in search results.

Shopping cart pages or members' pages gain little from being indexed, so you may want to exclude them. This can also be done using the meta robots tag.
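
As a rough sketch of the exclusion described above, the snippet below feeds a made-up robots.txt to Python's standard urllib.robotparser module and asks whether a compliant crawler may fetch a couple of URLs. The /cart/ and /members/ paths, the example.com domain, the "ExampleBot" agent name and the may_fetch helper are all assumptions for illustration, not taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the paths are illustrative assumptions.
CART_RULES = """\
User-agent: *
Disallow: /cart/
Disallow: /members/
"""

def may_fetch(rules_text, agent, url):
    """Return True if a crawler that honours rules_text may fetch url."""
    parser = RobotFileParser()
    parser.parse(rules_text.splitlines())
    return parser.can_fetch(agent, url)

cart_allowed = may_fetch(CART_RULES, "ExampleBot", "https://example.com/cart/checkout")
product_allowed = may_fetch(CART_RULES, "ExampleBot", "https://example.com/products/widget")
print(cart_allowed, product_allowed)  # False True
```

Keep in mind this only models a bot that chooses to obey the file; nothing here stops a bot that ignores it.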

Or you may want to stop Google indexing your images for its image results. Images that make it into Google Images tend to get stolen more than private ones.

Another SEO tip is that you can declare where your XML sitemap is. Some search engines pick up on this and may index your site faster.
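
To illustrate the sitemap hint, here is a minimal sketch that parses a made-up robots.txt containing a Sitemap line and reads it back with urllib.robotparser. The sitemap URL is an assumption, and site_maps() needs Python 3.8 or newer.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: nothing is disallowed, and a sitemap is declared.
SITEMAP_RULES = """\
User-agent: *
Disallow:

Sitemap: https://example.com/sitemap.xml
"""

sitemap_parser = RobotFileParser()
sitemap_parser.parse(SITEMAP_RULES.splitlines())

# site_maps() returns a list of declared sitemap URLs, or None if there are none.
sitemap_urls = sitemap_parser.site_maps()
print(sitemap_urls)  # ['https://example.com/sitemap.xml']
```

An empty "Disallow:" line means nothing is blocked, so the file here only advertises the sitemap.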

You can also attempt to block or slow down some spiders, though this is quite hit and miss. I've found the ones that cause trouble need to be blocked at the server, e.g. with an .htaccess rule that blocks certain user agents from accessing any page on the website.
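
A small sketch of what "block or slow down" could look like in robots.txt terms, again using urllib.robotparser. The "BadBot" and "GoodBot" names and the 10 second delay are made up for the example, and Crawl-delay is a non-standard directive that not every search engine honours.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: shut out one named bot, ask everyone else to slow down.
THROTTLE_RULES = """\
User-agent: BadBot
Disallow: /

User-agent: *
Crawl-delay: 10
"""

throttle_parser = RobotFileParser()
throttle_parser.parse(THROTTLE_RULES.splitlines())

bad_bot_allowed = throttle_parser.can_fetch("BadBot", "https://example.com/page")
good_bot_allowed = throttle_parser.can_fetch("GoodBot", "https://example.com/page")
good_bot_delay = throttle_parser.crawl_delay("GoodBot")  # seconds, or None if unset

print(bad_bot_allowed, good_bot_allowed, good_bot_delay)  # False True 10
```

This only describes what a well-behaved bot would do after reading the file, which is why the troublesome ones still need a server-side block.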

10th May 2010, 03:27 AM
"Robots.txt" is a regular text file that, through its name, has special meaning to the majority of "honorable" robots on the web. By defining a few rules in this text file, you can instruct robots not to crawl and index certain files or directories within your site, or the site as a whole. For example, you may not want Google to crawl the /images directory of your site, as it's both meaningless to you and a waste of your site's bandwidth. "Robots.txt" lets you tell Google just that.

19th August 2014, 08:07 PM
A robots.txt file is a text file that asks web crawler software, such as Googlebot, not to crawl certain pages of your site. The file is essentially a list of directives, such as Allow and Disallow, that tell web crawlers which URLs they can or cannot retrieve.
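
Here is a hedged sketch of Allow and Disallow working together, checked with Python's urllib.robotparser. The /private/ paths are invented for the example. Note that Python's parser applies the first rule that matches, in file order, so the Allow line is placed before the broader Disallow; other crawlers may resolve overlapping rules differently (for instance by the most specific path).

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: block a directory but let one file inside it through.
ALLOW_RULES = """\
User-agent: Googlebot
Allow: /private/public-report.html
Disallow: /private/
"""

allow_parser = RobotFileParser()
allow_parser.parse(ALLOW_RULES.splitlines())

report_allowed = allow_parser.can_fetch("Googlebot", "https://example.com/private/public-report.html")
secret_allowed = allow_parser.can_fetch("Googlebot", "https://example.com/private/secret.html")
home_allowed = allow_parser.can_fetch("Googlebot", "https://example.com/index.html")

print(report_allowed, secret_allowed, home_allowed)  # True False True
```

URLs that match no rule at all, like the home page here, are treated as allowed.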

20th August 2014, 01:43 AM
Robots.txt is a text file where you can define rules for search engine bots. You can define which URLs they may follow and which they may not.

20th August 2014, 05:26 AM
It's a kind of exclusion protocol that tells search engines which pages on your site can be visited and which cannot.

23rd August 2014, 01:13 AM
A robots.txt file is useful in the following ways:

1. If your website has identical pages and you want search engines to overlook them.
2. To tell search engines where your sitemap is located.
3. To control search engine indexing of various files on your website, such as PDFs, images, etc.

23rd August 2014, 02:06 AM
robots.txt is a small text file that tells a search bot whether or not it may index your website.

23rd August 2014, 07:08 AM
In simple words, robots.txt is a text file that regulates the movement of search engine spiders: it grants them access to some of your site's content so they can index it, and instructs them not to index other parts.

25th August 2014, 12:29 AM
When Google wants to index your website, it first reads the robots.txt file, so robots.txt is important for managing what ends up in the Google index.