robots.txt Validator

Lint a robots.txt for syntax and directive order issues.

Overview

The robots.txt Validator lints a pasted robots.txt for syntax and directive-order issues — unknown directives, mistyped user-agents, wildcards in unsupported positions, Allow/Disallow rules outside a group, missing colons, and incorrect line endings. It applies the RFC 9309 grammar and Google's well-documented extensions.

Useful for SEO practitioners and webmasters who need to validate robots.txt syntax or debug robots.txt rules. Reach for it after editing the file, especially when a wildcard-heavy path pattern unexpectedly blocks URLs it should allow, or fails to block URLs it should. (robots.txt paths support the * and $ wildcards, not regular expressions.)

How it works

A robots.txt file is parsed line by line into field: value pairs. A rule outside a group (an Allow or Disallow with no preceding User-agent: line) is ignored by Google and is a common source of bugs. The validator groups directives by user-agent and reports each group's effective rule set, plus warnings for typos, dangerous patterns (Disallow: / followed by attempted Allow: overrides), and missing essentials (no Sitemap line, no rule for a production user-agent).
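The parsing and grouping step described above could be sketched roughly as follows. This is a minimal illustration, not the validator's actual code: the names `Group`, `KNOWN_RULES`, and `parse_robots` are hypothetical, and the warning texts are invented for the example.

```python
from dataclasses import dataclass, field

KNOWN_RULES = {"allow", "disallow"}  # assumption: only these two rule directives


@dataclass
class Group:
    user_agents: list = field(default_factory=list)
    rules: list = field(default_factory=list)  # (directive, path) tuples


def parse_robots(text):
    """Return (groups, warnings) for a robots.txt body."""
    groups, warnings = [], []
    current = None
    last_was_agent = False
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if ":" not in line:
            warnings.append((number, "missing colon"))
            continue
        name, value = (part.strip() for part in line.split(":", 1))
        name = name.lower()
        if name == "user-agent":
            # Consecutive User-agent lines share one group.
            if current is None or not last_was_agent:
                current = Group()
                groups.append(current)
            current.user_agents.append(value.lower())
            last_was_agent = True
        elif name in KNOWN_RULES:
            if current is None:
                warnings.append((number, f"{name} outside a group"))
            else:
                current.rules.append((name, value))
            last_was_agent = False
        elif name == "sitemap":
            pass  # Sitemap lines do not belong to any group
        else:
            warnings.append((number, f"unknown directive '{name}'"))
    return groups, warnings
```

Calling `parse_robots` on a file that starts with `Disallow: /tmp/` before any `User-agent:` line would yield a warning for line 1 and leave that rule out of every group.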

For each tested path, the validator can show which rule matches and which user-agent group applies, mirroring the precedence Google's documentation describes: specific user-agent over *, longest match wins among Allow/Disallow.
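As a rough sketch of that precedence, assuming groups are already parsed into `(user_agents, rules)` pairs and treating rule paths as literal prefixes (wildcards are left out to keep the example short). The function names `select_group` and `decide` are hypothetical, and matching the crawler name by exact equality is a simplification.

```python
def select_group(groups, crawler):
    """Pick the rules for `crawler`: a named user-agent group beats '*'."""
    crawler = crawler.lower()
    fallback = []
    for agents, rules in groups:
        if crawler in agents:
            return rules
        if "*" in agents:
            fallback = rules
    return fallback


def decide(rules, path):
    """Return (allowed, matching_rule) using longest-match precedence."""
    best = None
    for directive, prefix in rules:
        if not prefix or not path.startswith(prefix):
            continue  # an empty value matches nothing
        longer = best is None or len(prefix) > len(best[1])
        tie_for_allow = (
            best is not None
            and len(prefix) == len(best[1])
            and directive == "allow"
        )
        if longer or tie_for_allow:
            best = (directive, prefix)
    if best is None:
        return True, None  # no rule matched, so crawling is allowed
    return best[0] == "allow", best
```

With a Googlebot group containing `Disallow: /private` and `Allow: /private/public`, `decide` reports `/private/public/page.html` as allowed and `/private/secret` as blocked.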

Examples

  • A typo Disalow: /admin/ (missing 'l') is flagged as an unknown directive.
  • A Disallow: /private followed by Allow: /private/public is reported with a precedence note: the longer, more specific Allow wins for URLs under /private/public.
  • Tab indentation is reported as suspect because some parsers misread it.
  • A file that has a User-agent: Googlebot group but no User-agent: * group triggers a warning: with no matching group, other crawlers fall back to permissive behaviour.
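Two of the checks above, the typo and the tab indentation, might look something like this in code. It is only a sketch: `lint_line`, the `KNOWN` list, and the 0.8 similarity cutoff are assumptions made for illustration, and the "did you mean" hint is an addition that the tool itself may not offer.

```python
import difflib

KNOWN = ["user-agent", "allow", "disallow", "sitemap"]  # assumed directive list


def lint_line(line):
    """Return a list of warning strings for one robots.txt line."""
    warnings = []
    if line.startswith("\t"):
        warnings.append("tab indentation")
    body = line.split("#", 1)[0]  # ignore anything after a comment marker
    if ":" not in body:
        return warnings
    name = body.split(":", 1)[0].strip().lower()
    if name and name not in KNOWN:
        # Suggest the closest known directive, if one is similar enough.
        guess = difflib.get_close_matches(name, KNOWN, n=1, cutoff=0.8)
        hint = f", did you mean '{guess[0]}'?" if guess else ""
        warnings.append(f"unknown directive '{name}'{hint}")
    return warnings
```

For the first example in the list, `lint_line("Disalow: /admin/")` flags `disalow` as unknown and suggests `disallow`.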

FAQ

Are wildcards supported in robots.txt?

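Not in the original 1994 convention, but RFC 9309 defines * (any sequence of characters) and $ (end of the URL path) as special characters, and Google and Bing support both. The validator marks these as Google-extension rules because smaller or older crawlers may ignore them.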
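To make the meaning of the two wildcards concrete, here is a small sketch that translates a robots.txt path pattern into a regular expression. The helper names `pattern_to_regex` and `matches` are hypothetical, and the sketch only treats `$` as special when it is the last character of the pattern.

```python
import re


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape every literal piece, then let each '*' match any characters.
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))


def matches(pattern, path):
    """True if `pattern` matches `path`, anchored at the start of the path."""
    return pattern_to_regex(pattern).match(path) is not None
```

Under these assumptions, `/*.pdf$` matches `/docs/file.pdf` but not `/docs/file.pdf?x=1`, while a plain `/private` still behaves as a prefix match.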

Does Allow override Disallow?

When both match the same URL, the longer (more specific) rule wins, per Google's documented behaviour. When the matching rules are the same length, Allow wins.

Why does Google ignore my rule?

Most often because the rule is not inside a user-agent group, or the user-agent name is mistyped (case-sensitive in some implementations). Run the validator to find rules that lack a preceding User-agent: line.

Should I include a BOM?

No. A UTF-8 byte-order mark causes some parsers to fail on the first line. Save the file as plain UTF-8 without a BOM.
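If you want to check a file for a BOM yourself before uploading it, a few lines along these lines would do. The function name `strip_bom` is hypothetical, and decoding invalid bytes with replacement characters is a choice made for this sketch.

```python
import codecs


def strip_bom(raw: bytes):
    """Return (had_bom, text) for the raw bytes of a robots.txt file."""
    had_bom = raw.startswith(codecs.BOM_UTF8)
    if had_bom:
        raw = raw[len(codecs.BOM_UTF8):]  # drop the three BOM bytes
    return had_bom, raw.decode("utf-8", errors="replace")
```

Reading the file in binary mode and passing the bytes to `strip_bom` tells you whether a BOM was present and gives back the text without it.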
