Question 1

What is a robots.txt checker?

Accepted Answer

A robots.txt checker fetches the robots.txt file at the root of a domain, parses the directives, and tells you whether a specific path is allowed or disallowed for a specific crawler. It is the first thing to run when a page that should rank suddenly disappears from Google — a wrong Disallow rule is one of the most common causes.
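As a rough illustration of the parse-and-test step, here is a minimal sketch using Python's standard-library urllib.robotparser. The robots.txt content and the example.com URLs are made up for the example, and the rules are parsed from an inline string rather than fetched live. This is not necessarily how the checker itself is implemented, and the standard-library parser may resolve conflicting Allow/Disallow rules differently from Google.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real checker would download it
# from https://<host>/robots.txt instead of using an inline string.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if user_agent may fetch url under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(is_allowed(ROBOTS_TXT, "Googlebot", "https://example.com/private/report.html"))  # False
print(is_allowed(ROBOTS_TXT, "Googlebot", "https://example.com/blog/"))  # True
```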

Question 2

How do I check if Googlebot is blocked from my site?

Accepted Answer

Enter your domain and the path you want to test (e.g. /), set User-agent to Googlebot, and click Test. The tool fetches the live robots.txt, finds the rule group that applies to Googlebot, and reports whether the path is allowed or disallowed, including which specific rule won the precedence match.
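The "find the rule group that applies" step can be sketched roughly as follows. The function names parse_groups and select_group are hypothetical, and the matching is simplified: it looks for an exact (case-insensitive) user-agent token and otherwise falls back to the * group. Real crawlers may apply looser matching, so treat this as an illustration only.

```python
def parse_groups(robots_txt: str) -> dict[str, list[tuple[str, str]]]:
    """Map each lower-cased user-agent token to its (directive, path) rules."""
    groups: dict[str, list[tuple[str, str]]] = {}
    current_agents: list[str] = []
    in_rules = False  # True once the current group has seen a rule line
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:  # a rule line closed the previous group
                current_agents, in_rules = [], False
            current_agents.append(value.lower())
            groups.setdefault(value.lower(), [])
        elif field in ("allow", "disallow"):
            in_rules = True
            for agent in current_agents:
                groups[agent].append((field, value))
    return groups


def select_group(groups, crawler: str):
    """Pick the rules for crawler: exact token match first, then '*'."""
    token = crawler.lower()
    if token in groups:
        return token, groups[token]
    return "*", groups.get("*", [])


# Hypothetical file: everything is blocked except for Googlebot.
ROBOTS_TXT = """\
User-agent: *
Disallow: /

User-agent: Googlebot
Allow: /
Disallow: /tmp/
"""

groups = parse_groups(ROBOTS_TXT)
print(select_group(groups, "Googlebot"))
print(select_group(groups, "Bingbot"))
```

Here Googlebot gets its own group, while any crawler without a dedicated group inherits the * rules.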

Question 3

Does robots.txt block AI crawlers like GPTBot, ClaudeBot, and PerplexityBot?

Accepted Answer

It depends on your robots.txt, and on the crawler: robots.txt is advisory, so it only stops bots that choose to honor it. The checker has GPTBot and other AI bots in the User-agent dropdown. Pick one and run the test against / to see whether the entire site is disallowed for that bot. Many sites accidentally block more than they intended because they copy-pasted a "no AI training" rule that also blocks the crawlers used for live retrieval and citation (ChatGPT-User, OAI-SearchBot, PerplexityBot). Result: the site may no longer be cited by ChatGPT search, Perplexity, or Claude. Google AI Overviews are a separate case: they draw on Googlebot's crawl, so test Googlebot for those.
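To see at a glance which AI crawlers a given file shuts out, a small loop over user-agent names is enough. The sketch below again uses urllib.robotparser with a made-up robots.txt that opts out of GPTBot only; the AI_AGENTS list is an illustrative selection, not an exhaustive one, and blocked_agents is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: opts out of one training crawler only.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

# Illustrative selection of AI-related user agents to test.
AI_AGENTS = ["GPTBot", "ClaudeBot", "ChatGPT-User", "OAI-SearchBot", "PerplexityBot"]


def blocked_agents(robots_txt: str, agents: list[str], path: str = "/") -> list[str]:
    """Return the agents that may not fetch path under robots_txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, path)]


print(blocked_agents(ROBOTS_TXT, AI_AGENTS))  # ['GPTBot']
```

If the list that comes back includes ChatGPT-User, OAI-SearchBot, or PerplexityBot and you only meant to opt out of training, the rule is broader than intended.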

Question 4

How does robots.txt rule precedence work?

Accepted Answer

Google picks the most specific (longest-matching) rule for the given path. If an Allow and a Disallow are equally specific, Allow wins. The checker applies the same rule and shows you which directive actually decides whether the URL is crawlable.
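The precedence rule can be written down in a few lines. This is a simplified sketch, not the checker's actual code: decide is a hypothetical function, rules are plain path prefixes, and the * and $ wildcards that robots.txt patterns may contain are not handled.

```python
def decide(rules, path):
    """Return (allowed, winning_rule) using longest-match precedence.

    rules is a list of (directive, pattern) pairs such as
    ("disallow", "/shop/"). Patterns are treated as plain path prefixes.
    """
    best = None
    for directive, pattern in rules:
        if not pattern or not path.startswith(pattern):
            continue  # empty patterns and non-matching rules are ignored
        if best is None:
            best = (directive, pattern)
            continue
        longer = len(pattern) > len(best[1])
        tie_for_allow = len(pattern) == len(best[1]) and directive == "allow"
        if longer or tie_for_allow:
            best = (directive, pattern)
    if best is None:
        return True, None  # no rule matches: crawling is allowed by default
    return best[0] == "allow", best


rules = [
    ("disallow", "/shop/"),
    ("allow", "/shop/sale/"),
    ("disallow", "/tmp"),
    ("allow", "/tmp"),
]
print(decide(rules, "/shop/sale/item"))  # (True, ('allow', '/shop/sale/'))
print(decide(rules, "/shop/cart"))       # (False, ('disallow', '/shop/'))
print(decide(rules, "/tmp"))             # (True, ('allow', '/tmp'))
```

The first call shows the longer Allow overriding the shorter Disallow; the last shows Allow winning when both rules are the same length.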

Question 5

Does each subdomain need its own robots.txt?

Accepted Answer

Yes. robots.txt applies only to the exact host from which it is served, and more precisely to that combination of protocol, host, and port. www.example.com/robots.txt does not govern shop.example.com; each subdomain needs its own file at /robots.txt.
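In practice this means the robots.txt location is derived from the page URL's scheme and host, never from the parent domain. A minimal sketch (robots_url is a hypothetical helper name):

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt URL that governs page_url."""
    parts = urlsplit(page_url)
    # Keep scheme and host (including any port); replace everything else.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url("https://www.example.com/blog/post?id=1"))
# https://www.example.com/robots.txt
print(robots_url("https://shop.example.com/cart"))
# https://shop.example.com/robots.txt
```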

Question 6

Does robots.txt prevent indexing?

Accepted Answer

No. robots.txt only blocks crawling. A URL blocked there can still appear in Google's search results if other sites link to it, just without a description. To prevent indexing, use a noindex meta tag on the page itself (or an X-Robots-Tag: noindex response header for non-HTML files). Do not block a noindex page in robots.txt: Googlebot has to crawl the page to see the noindex directive, and a blocked page is never crawled.
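Checking whether a page actually carries a noindex signal is a separate test from the robots.txt check. The sketch below looks for a robots meta tag in the HTML and, optionally, a noindex value in an X-Robots-Tag response header. The helper name has_noindex is hypothetical, and the header handling is simplified (user-agent-prefixed header values are not parsed).

```python
from html.parser import HTMLParser


class _RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots"|"googlebot" content="...">."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            content = attr.get("content", "")
            self.directives += [d.strip().lower() for d in content.split(",")]


def has_noindex(html: str, x_robots_tag: str = "") -> bool:
    """Return True if the HTML or the X-Robots-Tag header value says noindex."""
    parser = _RobotsMetaParser()
    parser.feed(html)
    header = [d.strip().lower() for d in x_robots_tag.split(",")]
    return "noindex" in parser.directives or "noindex" in header


page = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
print(has_noindex(page))                           # True
print(has_noindex("<html><body>Hi</body></html>")) # False
```

A page that returns True here should not also be disallowed in robots.txt, for the reason given above.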
