robots.txt Validator
Lint a robots.txt for syntax and directive order issues.
Overview
The robots.txt Validator lints a pasted robots.txt for syntax and directive-order issues — unknown directives, mistyped user-agents, wildcards in unsupported positions, Allow/Disallow rules outside a group, missing colons, and incorrect line endings. It applies the RFC 9309 grammar and Google's well-documented extensions.
Useful for SEO practitioners and webmasters who need to validate robots.txt syntax or debug robots.txt rules. Reach for it after editing the file, especially when a wildcard-heavy path pattern unexpectedly blocks the wrong URLs or fails to block the intended ones.
How it works
A robots.txt file is parsed line by line into field: value pairs. An Allow or Disallow rule outside a group (no preceding User-agent:) is ignored by Google and is a common source of bugs. The validator groups directives by user-agent and reports each group's effective rule set, plus warnings for typos, dangerous patterns (Disallow: / followed by attempted Allow: overrides), and missing essentials (no Sitemap, no production user-agent rule).
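The grouping step can be sketched in a few lines of Python. This is a minimal illustration, not the validator's actual implementation: the function name parse_robots, the KNOWN_FIELDS set, and the warning strings are all assumptions made for the example.

```python
# Hypothetical set of recognised fields; a real linter would likely know more.
KNOWN_FIELDS = {"user-agent", "allow", "disallow", "sitemap"}

def parse_robots(text):
    """Group Allow/Disallow rules by user-agent and collect (line, message) warnings."""
    groups = []            # each group: {"agents": [...], "rules": [(field, value), ...]}
    warnings = []
    current = None         # the group currently being filled, if any
    last_was_agent = False # consecutive User-agent lines share one group
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if not line:
            continue
        if ":" not in line:
            warnings.append((lineno, "missing colon"))
            continue
        field, value = line.split(":", 1)
        field, value = field.strip().lower(), value.strip()
        if field not in KNOWN_FIELDS:
            warnings.append((lineno, f"unknown directive '{field}'"))
            continue
        if field == "user-agent":
            if not last_was_agent:
                current = {"agents": [], "rules": []}
                groups.append(current)
            current["agents"].append(value.lower())
            last_was_agent = True
        elif field in ("allow", "disallow"):
            if current is None:
                warnings.append((lineno, f"{field} outside a user-agent group"))
            else:
                current["rules"].append((field, value))
            last_was_agent = False
        # Sitemap lines are valid anywhere and do not belong to a group.
    return groups, warnings
```

Calling parse_robots on a pasted file returns the groups and a list of warnings keyed by line number, which is enough to report a rule placed before the first User-agent: line or a misspelled field name.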
For each tested path, the validator can show which rule matches and which user-agent group applies, mirroring the precedence Google's documentation describes: specific user-agent over *, longest match wins among Allow/Disallow.
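The precedence described above can be sketched as follows. The helper names select_group and decide and the sample data are hypothetical. The sketch uses plain prefix matching (no wildcards) and picks the first matching group rather than merging duplicate groups, so treat it as an approximation of the documented behaviour.

```python
def select_group(groups, crawler):
    """Prefer a group that names the crawler; otherwise fall back to the '*' group."""
    crawler = crawler.lower()
    fallback = None
    for group in groups:
        if crawler in group["agents"]:
            return group
        if "*" in group["agents"] and fallback is None:
            fallback = group
    return fallback

def decide(rules, path):
    """Return (allowed, winning_rule): longest match wins, Allow wins ties."""
    best = None
    for field, pattern in rules:
        if not pattern or not path.startswith(pattern):
            continue  # an empty pattern matches nothing
        key = (len(pattern), field == "allow")  # length first, then Allow over Disallow
        if best is None or key > best[0]:
            best = (key, (field, pattern))
    if best is None:
        return True, None  # no rule matched, so the path is allowed
    return best[1][0] == "allow", best[1]

# Hypothetical parsed groups, agent names already lowercased.
groups = [
    {"agents": ["*"], "rules": [("disallow", "/")]},
    {"agents": ["googlebot"],
     "rules": [("disallow", "/private"), ("allow", "/private/public")]},
]

group = select_group(groups, "Googlebot")
print(decide(group["rules"], "/private/public/page"))
print(decide(group["rules"], "/private/notes"))
```

Returning the winning rule alongside the verdict mirrors what the validator reports for each tested path: which rule matched and which group it came from.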
Examples
- A typo, Disalow: /admin/ (missing 'l'), is flagged as an unknown directive.
- A Disallow: /private followed by Allow: /private/public is reported as a precedence note: Allow wins for longer specific paths.
- Tab indentation is reported as suspect because some parsers misread it.
- A file with only User-agent: Googlebot but no User-agent: * is warned: other crawlers fall back to permissive behaviour.
FAQ
Are wildcards supported in robots.txt?
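The original 1994 convention did not define them, but RFC 9309 specifies * (any sequence of characters) and $ (end of the path), and Google and Bing support both. The validator marks these as Google-extension rules, since they may still be ignored by smaller crawlers.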
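For crawlers that do honour them, the two characters can be interpreted by translating the path pattern into a regular expression. The sketch below is an illustration under that assumption; pattern_to_regex and matches are hypothetical helper names, and percent-encoding normalisation is left out.

```python
import re

def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regular expression.

    '*' matches any run of characters; a trailing '$' anchors the end of the path.
    Everything else is matched literally and case-sensitively.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' where '*' appeared.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))

def matches(pattern, path):
    """True if the pattern matches starting at the beginning of the path."""
    return pattern_to_regex(pattern).match(path) is not None

print(matches("/*.pdf$", "/docs/report.pdf"))      # True
print(matches("/*.pdf$", "/docs/report.pdf?v=2"))  # False
print(matches("/*.pdf", "/docs/report.pdf?v=2"))   # True
```

Without the trailing $, a pattern behaves as a prefix match, which is why /*.pdf also catches URLs that carry a query string after the extension.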
Does Allow override Disallow?
When both match the same URL, the longer (more specific) wins per Google's documented behaviour. Equal-length matches favour Allow.
Why does Google ignore my rule?
Most often because the rule is not inside a user-agent group, or the user-agent name is mistyped (case-sensitive in some implementations). Validate to find the missing group anchor.
Should I include a BOM?
No — UTF-8 with a byte-order mark causes some parsers to fail on the first line. Save as plain UTF-8 without BOM.
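A quick way to check a file before uploading is to look for the three-byte UTF-8 BOM at the start of its raw bytes. This is a small sketch; strip_bom is a hypothetical helper, not part of the validator.

```python
import codecs

def strip_bom(data: bytes):
    """Return (data without a leading UTF-8 BOM, whether a BOM was found)."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):], True
    return data, False

cleaned, had_bom = strip_bom(b"\xef\xbb\xbfUser-agent: *\nDisallow: /tmp/\n")
print(had_bom)   # True
print(cleaned.decode("utf-8").splitlines()[0])  # User-agent: *
```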