<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Getting to Know Your Bots: Robots.txt 101</title>
	<atom:link href="http://outspokenmedia.com/seo/getting-to-know-your-bots-robots-txt-101/feed/" rel="self" type="application/rss+xml" />
	<link>http://outspokenmedia.com/seo/getting-to-know-your-bots-robots-txt-101/</link>
	<description></description>
	<lastBuildDate>Fri, 08 Mar 2013 16:32:43 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.5.1</generator>
<xhtml:meta xmlns:xhtml="http://www.w3.org/1999/xhtml" name="robots" content="noindex" />
	<item>
		<title>By: ashok</title>
		<link>http://outspokenmedia.com/seo/getting-to-know-your-bots-robots-txt-101/#comment-96152</link>
		<dc:creator>ashok</dc:creator>
		<pubDate>Tue, 17 Jul 2012 12:55:34 +0000</pubDate>
		<guid isPermaLink="false">http://outspokenmedia.com/?p=14352#comment-96152</guid>
		<description><![CDATA[Hello Rhea, 

Thanks for the great post. Can you please help me with the following? I have some dynamic pages. How should I write my robots.txt file so that search engines do not read these pages:

http://www.abc.com/index.php?option=com_virtuemart&amp;page=shop.pdf_output&amp;showpage=shop.product_details&amp;pop=1&amp;output=pdf&amp;product_id=34&amp;category_id=1&amp;pop=1&amp;vmcchk=1&amp;Itemid=2

Is this fine: 
User-agent: *
Noindex:/index.php/*? option=*

Thanks for your help in advance...

Regards
Ashok]]></description>
		<content:encoded><![CDATA[<p>Hello Rhea, </p>
<p>Thanks for the great post. Can you please help me with the following? I have some dynamic pages. How should I write my robots.txt file so that search engines do not read these pages:</p>
<p><a href="http://www.abc.com/index.php?option=com_virtuemart&#038;page=shop.pdf_output&#038;showpage=shop.product_details&#038;pop=1&#038;output=pdf&#038;product_id=34&#038;category_id=1&#038;pop=1&#038;vmcchk=1&#038;Itemid=2" rel="nofollow">http://www.abc.com/index.php?option=com_virtuemart&#038;page=shop.pdf_output&#038;showpage=shop.product_details&#038;pop=1&#038;output=pdf&#038;product_id=34&#038;category_id=1&#038;pop=1&#038;vmcchk=1&#038;Itemid=2</a></p>
<p>Is this fine:<br />
User-agent: *<br />
Noindex:/index.php/*? option=*</p>
<p>Thanks for your help in advance&#8230;</p>
<p>Regards<br />
Ashok</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Lawrence</title>
		<link>http://outspokenmedia.com/seo/getting-to-know-your-bots-robots-txt-101/#comment-81118</link>
		<dc:creator>Lawrence</dc:creator>
		<pubDate>Fri, 18 May 2012 14:21:06 +0000</pubDate>
		<guid isPermaLink="false">http://outspokenmedia.com/?p=14352#comment-81118</guid>
		<description><![CDATA[While some search providers actually read and respect the robots.txt file, _not all do_.

A recent example I found was Bing.
We found it was explicitly indexing items that were excluded. Not only that, but over a period of time it was consuming substantial bandwidth, sucking up disallowed content over multiple simultaneous IP connections.

A check of the IP ranges used verified that they were indeed owned by Microsoft.
We eventually blocked the ranges in use by Bing for abuse.

More here - 
http://www.computersolutions.cn/blog/2012/05/msn-bing-crawler-spider-madness/]]></description>
		<content:encoded><![CDATA[<p>While some search providers actually read and respect the robots.txt file, _not all do_.</p>
<p>A recent example I found was Bing.<br />
We found it was explicitly indexing items that were excluded. Not only that, but over a period of time it was consuming substantial bandwidth, sucking up disallowed content over multiple simultaneous IP connections.</p>
<p>A check of the IP ranges used verified that they were indeed owned by Microsoft.<br />
We eventually blocked the ranges in use by Bing for abuse.</p>
<p>More here &#8211;<br />
<a href="http://www.computersolutions.cn/blog/2012/05/msn-bing-crawler-spider-madness/" rel="nofollow">http://www.computersolutions.cn/blog/2012/05/msn-bing-crawler-spider-madness/</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jason</title>
		<link>http://outspokenmedia.com/seo/getting-to-know-your-bots-robots-txt-101/#comment-76706</link>
		<dc:creator>Jason</dc:creator>
		<pubDate>Fri, 04 May 2012 10:25:17 +0000</pubDate>
		<guid isPermaLink="false">http://outspokenmedia.com/?p=14352#comment-76706</guid>
		<description><![CDATA[I was unaware of robots.txt until around 2 weeks ago, when I found Google was indexing my test site! Thankfully, due to various research and this post, I&#039;m now a robots guru!]]></description>
		<content:encoded><![CDATA[<p>I was unaware of robots.txt until around 2 weeks ago, when I found Google was indexing my test site! Thankfully, due to various research and this post, I&#8217;m now a robots guru!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Halvorsen</title>
		<link>http://outspokenmedia.com/seo/getting-to-know-your-bots-robots-txt-101/#comment-73135</link>
		<dc:creator>Halvorsen</dc:creator>
		<pubDate>Mon, 23 Apr 2012 20:19:36 +0000</pubDate>
		<guid isPermaLink="false">http://outspokenmedia.com/?p=14352#comment-73135</guid>
		<description><![CDATA[My favorite:

User-agent: *
Crawl-delay: 10]]></description>
		<content:encoded><![CDATA[<p>My favorite:</p>
<p>User-agent: *<br />
Crawl-delay: 10</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Wissam Dandan</title>
		<link>http://outspokenmedia.com/seo/getting-to-know-your-bots-robots-txt-101/#comment-73114</link>
		<dc:creator>Wissam Dandan</dc:creator>
		<pubDate>Mon, 23 Apr 2012 18:21:35 +0000</pubDate>
		<guid isPermaLink="false">http://outspokenmedia.com/?p=14352#comment-73114</guid>
		<description><![CDATA[It&#039;s been there for a while and it&#039;s kinda funny (last three disallows):

http://www.last.fm/robots.txt]]></description>
		<content:encoded><![CDATA[<p>It&#8217;s been there for a while and it&#8217;s kinda funny (last three disallows):</p>
<p><a href="http://www.last.fm/robots.txt" rel="nofollow">http://www.last.fm/robots.txt</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Rhea Drysdale</title>
		<link>http://outspokenmedia.com/seo/getting-to-know-your-bots-robots-txt-101/#comment-73111</link>
		<dc:creator>Rhea Drysdale</dc:creator>
		<pubDate>Mon, 23 Apr 2012 18:18:08 +0000</pubDate>
		<guid isPermaLink="false">http://outspokenmedia.com/?p=14352#comment-73111</guid>
		<description><![CDATA[Amen. (Now to go remove my credit card numbers from my robots.txt...)]]></description>
		<content:encoded><![CDATA[<p>Amen. (Now to go remove my credit card numbers from my robots.txt&#8230;)</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Alex Czarto</title>
		<link>http://outspokenmedia.com/seo/getting-to-know-your-bots-robots-txt-101/#comment-73108</link>
		<dc:creator>Alex Czarto</dc:creator>
		<pubDate>Mon, 23 Apr 2012 18:12:52 +0000</pubDate>
		<guid isPermaLink="false">http://outspokenmedia.com/?p=14352#comment-73108</guid>
		<description><![CDATA[Important note to add: Be aware that your robots.txt file is public for anyone to view. So don&#039;t rely on robots.txt to hide sensitive information from view, because if a malicious user visits your site, one of the first places he will look is your robots.txt file, to see if any sensitive or vulnerable files are listed there.

For example, if he sees this:

&lt;blockquote&gt;
Disallow: /secret/important-hidden-secret-stuff.html
Disallow: /secret/credit-card-numbers.txt
&lt;/blockquote&gt;

...and you haven&#039;t taken precautions in securing those files, then you may be in trouble.]]></description>
		<content:encoded><![CDATA[<p>Important note to add: Be aware that your robots.txt file is public for anyone to view. So don&#8217;t rely on robots.txt to hide sensitive information from view, because if a malicious user visits your site, one of the first places he will look is your robots.txt file, to see if any sensitive or vulnerable files are listed there.</p>
<p>For example, if he sees this:</p>
<blockquote><p>
Disallow: /secret/important-hidden-secret-stuff.html<br />
Disallow: /secret/credit-card-numbers.txt
</p></blockquote>
<p>&#8230;and you haven&#8217;t taken precautions in securing those files, then you may be in trouble.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Rhea Drysdale</title>
		<link>http://outspokenmedia.com/seo/getting-to-know-your-bots-robots-txt-101/#comment-73103</link>
		<dc:creator>Rhea Drysdale</dc:creator>
		<pubDate>Mon, 23 Apr 2012 18:08:40 +0000</pubDate>
		<guid isPermaLink="false">http://outspokenmedia.com/?p=14352#comment-73103</guid>
		<description><![CDATA[Mark, great point about snoopers. Most 101 types wouldn&#039;t be too worried about that, but it absolutely matters when it comes to competitive fields and sites. LOVE SEER&#039;s and Rishi&#039;s robots, thanks for including them. And it&#039;s always hilarious to see a robots.txt file indexed. Like ours! Let&#039;s fix that. :D]]></description>
		<content:encoded><![CDATA[<p>Mark, great point about snoopers. Most 101 types wouldn&#8217;t be too worried about that, but it absolutely matters when it comes to competitive fields and sites. LOVE SEER&#8217;s and Rishi&#8217;s robots, thanks for including them. And it&#8217;s always hilarious to see a robots.txt file indexed. Like ours! Let&#8217;s fix that. :D</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Mark Ginsberg</title>
		<link>http://outspokenmedia.com/seo/getting-to-know-your-bots-robots-txt-101/#comment-73098</link>
		<dc:creator>Mark Ginsberg</dc:creator>
		<pubDate>Mon, 23 Apr 2012 17:48:20 +0000</pubDate>
		<guid isPermaLink="false">http://outspokenmedia.com/?p=14352#comment-73098</guid>
		<description><![CDATA[Thanks for the post. As netmeg pointed out, to prevent pages and folders from being indexed, I&#039;m much more a fan of the noindex meta tag. It&#039;s also less of a tipoff to some pesky SEO looking for secret internal info. I love visiting a robots.txt file and seeing stuff the site wanted to hide. They&#039;re pointing you right to it!

I was actually doing some research and putting together some fun robots.txt files with cute comments in them, an idea I had after the whole SEER rankings fiasco. I checked their robots file, and they&#039;ve got a funny line in there. It&#039;s worth checking out: seerinteractive.com/robots.txt. Probably the best one I&#039;ve seen is Rishi&#039;s at explicitly.me/robots.txt. I love how these files are themselves indexed, so you need to find a workaround to keep them out of the index.

Anyway, thanks for the post - will definitely reference it when I write about fun robots.txt files.

UPDATE FROM RHEA: Mark also shared this on Twitter: http://www.seomoz.org/robots.txt]]></description>
		<content:encoded><![CDATA[<p>Thanks for the post. As netmeg pointed out, to prevent pages and folders from being indexed, I&#8217;m much more a fan of the noindex meta tag. It&#8217;s also less of a tipoff to some pesky SEO looking for secret internal info. I love visiting a robots.txt file and seeing stuff the site wanted to hide. They&#8217;re pointing you right to it!</p>
<p>I was actually doing some research and putting together some fun robots.txt files with cute comments in them, an idea I had after the whole SEER rankings fiasco. I checked their robots file, and they&#8217;ve got a funny line in there. It&#8217;s worth checking out: seerinteractive.com/robots.txt. Probably the best one I&#8217;ve seen is Rishi&#8217;s at explicitly.me/robots.txt. I love how these files are themselves indexed, so you need to find a workaround to keep them out of the index.</p>
<p>Anyway, thanks for the post. I will definitely reference it when I write about fun robots.txt files.</p>
<p>UPDATE FROM RHEA: Mark also shared this on Twitter: <a href="http://www.seomoz.org/robots.txt" rel="nofollow">http://www.seomoz.org/robots.txt</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Rhea Drysdale</title>
		<link>http://outspokenmedia.com/seo/getting-to-know-your-bots-robots-txt-101/#comment-73084</link>
		<dc:creator>Rhea Drysdale</dc:creator>
		<pubDate>Mon, 23 Apr 2012 16:35:43 +0000</pubDate>
		<guid isPermaLink="false">http://outspokenmedia.com/?p=14352#comment-73084</guid>
		<description><![CDATA[YES! Robots.txt has so many caveats. There should be a giant checklist of &quot;warning!&quot;]]></description>
		<content:encoded><![CDATA[<p>YES! Robots.txt has so many caveats. There should be a giant checklist of &#8220;warning!&#8221;</p>
]]></content:encoded>
	</item>
</channel>
</rss>

