Robots.txt and Sitemap Error

by @akhtar (155), 6 months ago Search engine optimization Search console

Google Search Console is indicating that my site has been blocked by the robots.txt file. When I test it with the robots.txt testing tool, however, it says ALLOWED.

I am wondering why Google says the sitemap contains URLs that are blocked by robots.txt. How can I resolve this issue?


Best Answer

@ms (237), 6 months ago

Your robots.txt file looks good to me, even though the Allow directive isn't defined in the original standard. Major crawlers do understand it, however.

See (from http://tools.seochat.com/tools/robots-txt-validator):

The official standard does not include Allow directive even though major crawlers (Google and Bing) support it. If both Disallow and Allow clauses apply to a URL, the most specific rule - the longest rule - applies. To be on the safe side, in order to be compatible to all robots, if one wants to allow single files inside an otherwise disallowed directory, it is necessary to place the Allow directive(s) first, followed by the Disallow. It is still nonstandard.
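To see why the order of the rules can matter, here is a minimal sketch using Python's standard `urllib.robotparser`. As far as I know that parser applies the first rule that matches rather than the longest one, so it behaves like one of the stricter robots the quote warns about. The rule sets, the `example.com` URL and the `ExampleBot` name are made up for illustration; they are not taken from your site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rule sets: same rules, different order.
ALLOW_FIRST = [
    "User-agent: *",
    "Allow: /private/public.html",
    "Disallow: /private/",
]
DISALLOW_FIRST = [
    "User-agent: *",
    "Disallow: /private/",
    "Allow: /private/public.html",
]


def is_allowed(robots_lines, url, agent="ExampleBot"):
    """Return True if `agent` may fetch `url` under the given robots.txt lines."""
    parser = RobotFileParser()
    parser.parse(robots_lines)  # parse in-memory lines, no network access
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    url = "https://example.com/private/public.html"
    print("Allow first:   ", is_allowed(ALLOW_FIRST, url))
    print("Disallow first:", is_allowed(DISALLOW_FIRST, url))
```

With a first-match parser like this one, the file is only reachable when the Allow line comes first. Google and Bing pick the longest matching rule instead, so for them both orderings should give the same result.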

I guess an empty robots.txt would work just fine, just like having no robots.txt at all.

5 replies
@ms (237), 6 months ago

Post your robots.txt and the exact URLs you're getting errors for in Search Console. I can't really tell without seeing your robots.txt file.
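If it helps to narrow down which URLs to post, here is a rough sketch that lists the sitemap entries a robots.txt would block. It assumes a standard sitemaps.org `<urlset>` file and uses Python's `urllib.robotparser`, which is not Google's matcher (no wildcard support, first matching rule wins), so treat the output as a hint only. The function name and the sample data are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.robotparser import RobotFileParser

# Namespace used by standard sitemaps.org sitemap files.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def blocked_sitemap_urls(sitemap_xml, robots_txt, agent="Googlebot"):
    """Return the sitemap URLs that robots.txt disallows for `agent`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())

    root = ET.fromstring(sitemap_xml.strip())
    urls = [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]
    return [u for u in urls if not parser.can_fetch(agent, u)]


# Made-up sample data, only to show the expected input shape.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /drafts/
"""

SAMPLE_SITEMAP = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/drafts/post-1</loc></url>
</urlset>
"""

if __name__ == "__main__":
    for url in blocked_sitemap_urls(SAMPLE_SITEMAP, SAMPLE_ROBOTS):
        print("blocked:", url)
```

Paste in your own robots.txt and sitemap contents in place of the samples; any URL it prints is a candidate for the warning in Search Console.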

@akhtar (155), 6 months ago

Thanks Martin for this help and information.

@ms (237), 6 months ago

No problem, glad to help @akhtar. You can accept the answer if it answered your question.