THE ROBOTS.TXT FILE
Search engines were created to help people find information quickly on the Internet, and they acquire much of that information through robots (also known as spiders or crawlers) that look for web pages on their behalf.
These robots explore the web, looking for and recording all kinds of information. They usually start from URLs submitted by users, from links they find on web sites, from sitemap files, or from the top level of a site.
Once a robot accesses the home page, it recursively accesses all pages linked from that page. A robot can also check out all the pages it can find on a particular server.
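The link-following behaviour described above can be sketched in a few lines of Python. This is only an illustration, not how any particular search engine works: the PAGES dictionary is a made-up site held in memory so the example runs without network access, and the names LinkExtractor and crawl are invented for this sketch. It uses a queue instead of literal recursion, but the effect is the same: every page reachable from the start page is visited once.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Hypothetical site (assumption): URL -> HTML, used instead of real HTTP requests.
PAGES = {
    "http://example.com/": '<a href="/about.html">About</a> <a href="/e-books/">E-books</a>',
    "http://example.com/about.html": '<a href="/">Home</a>',
    "http://example.com/e-books/": '<a href="/e-books/one.html">One</a>',
    "http://example.com/e-books/one.html": "No links here.",
}


def crawl(start_url, fetch=PAGES.get):
    """Visit every page reachable from start_url and return the visit order."""
    seen = {start_url}
    queue = deque([start_url])
    order = []
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:  # unknown page: nothing to record or follow
            continue
        order.append(url)
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            absolute = urljoin(url, href)  # resolve relative links
            if absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return order
```

Calling crawl("http://example.com/") walks the home page first and then each linked page in turn. A real robot would replace the fetch argument with an HTTP request.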
After a robot finds a web page, it indexes the title, the keywords, the text, and so on. But sometimes you might want to prevent search engines from indexing some of your web pages, such as news postings and specially marked web pages (for example, affiliate pages). There are conventions for requesting this, but whether individual robots comply with them is purely voluntary.
ROBOTS EXCLUSION PROTOCOL
So if you want robots to keep out of some of your web pages, you can ask them to ignore the pages that you don’t want indexed. To do that, place a robots.txt file in the root directory of your web site.
For example, if you have a directory called e-books and you want to ask robots to keep out of it, your robots.txt file should read:
User-agent: *
Disallow: /e-books/
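To see how a well-behaved robot would read such a rule, here is a minimal sketch using Python's standard urllib.robotparser module. The example.com URLs and the is_allowed helper are assumptions made for illustration, and the rule is written with a leading slash (/e-books/), since paths in robots.txt are given relative to the site root.

```python
from urllib.robotparser import RobotFileParser

# The rules from the article, one directive per line.
RULES = """\
User-agent: *
Disallow: /e-books/
"""


def is_allowed(url, user_agent="*", rules=RULES):
    """Return True if the given robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())  # parse text directly, no network access
    return parser.can_fetch(user_agent, url)
```

With these rules, is_allowed("http://example.com/e-books/intro.html") returns False, while a page outside the e-books directory is allowed. Remember that this only shows what a compliant robot would do; nothing forces a robot to run such a check.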
If you don’t have enough control over your server to set up a robots.txt file, you can try adding a META tag to the head section of any HTML document.
For example, a tag like the following tells robots not to index a particular page and not to follow the links on it:
<meta name="ROBOTS" content="NOINDEX, NOFOLLOW">
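As a rough illustration of how a robot might read this tag, the following Python sketch pulls the directives out of a page with the standard html.parser module. The RobotsMetaParser class and the robots_directives function are invented names for this example, and real robots may treat the tag differently.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        # The name is matched case-insensitively, so ROBOTS and robots both work.
        if (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            for part in content.split(","):
                if part.strip():
                    self.directives.add(part.strip().lower())


def robots_directives(html):
    """Return the set of lower-cased robots directives declared in html."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives
```

A robot that honours the tag would skip indexing when "noindex" is in the returned set and would not queue the page's links when "nofollow" is present.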
Support for the META tag among robots is not as widespread as support for the Robots Exclusion Protocol, but most major web indexes currently support it.
NEWS POSTINGS
If you want to keep the search engines out of your news postings, you can add an “X-no-archive” line to your postings’ headers:
X-no-archive: yes
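If your news client does not offer this option but you compose postings with a script, the header can be set by hand. The sketch below uses Python's standard email.message module. The build_posting function, the address and the newsgroup name are hypothetical, and the example only builds the message text without sending anything.

```python
from email.message import EmailMessage


def build_posting(sender, newsgroup, subject, body):
    """Build a news posting that asks archives not to keep a copy."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["Newsgroups"] = newsgroup
    msg["Subject"] = subject
    msg["X-no-archive"] = "yes"  # the header line described above
    msg.set_content(body)
    return msg


# Hypothetical values, for illustration only.
posting = build_posting(
    "jane@example.com", "misc.test", "Hello", "Please do not archive this."
)
```

Printing posting.as_string() shows the X-no-archive line among the headers, followed by the body text.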
Although common news clients allow you to add an X-no-archive line to the headers of your news postings, some of them don’t permit it.
The problem is that most search engines assume that all information they find is public unless it is marked otherwise.
So be careful: although the robot and archive exclusion standards may help keep your material out of the major search engines, there are others that respect no such rules.
If you’re highly concerned about the privacy of your e-mail and Usenet postings, you should use anonymous remailers and PGP. You can read about them here:
http://www.well.com/user/abacard/remail.html
http://www.io.com/~combs/htmls/crypto.html
http://world.std.com/~franl/pgp/
Even if you are not particularly concerned about privacy, remember that anything you write may be indexed and archived somewhere for eternity, so use the robots.txt file whenever you need it.
Written by Dr. Roberto A. Bonomi
Originally posted 2010-07-13 02:26:42.