Seoxpert.io
Free tool

llms.txt validator & AI search readiness checker

Paste a URL. We validate /llms.txt against the llmstxt.org spec AND cross-check /robots.txt for whether the major AI crawlers (GPTBot, ClaudeBot, PerplexityBot, Google-Extended, OAI-SearchBot, ChatGPT-User) are blocked — the most common reason a well-formed llms.txt does nothing.

Free. No signup. Rate-limited to 30 checks per IP per hour. Server-side fetch, SSRF-guarded.

We fetch /llms.txt and /robots.txt from the origin and cross-check. A site can have a great llms.txt but still be invisible to AI search because robots.txt blocks the crawlers.

Why this matters now

AI search is real traffic now

ChatGPT search, Perplexity, Claude, and Google AI Overviews account for between 5% and 15% of search traffic for many SaaS / B2B sites as of 2026. Sites that aren't cited in those answers lose visibility no matter how well they rank in classic Google search.

The two things AI engines look at first: does this site allow my user-agent in robots.txt, and does it have an llms.txt with curated canonical pages. Both are five-minute fixes that most sites still haven't made.

The biggest mistake we see: a site copies a blanket "block all AI bots" robots.txt snippet thinking it's opting out of AI training, and in the process also blocks ChatGPT-User and OAI-SearchBot, the citation crawlers that fetch pages at query time so ChatGPT can cite them in its answers. The result: the site disappears from ChatGPT's answers entirely, even though its "no training" intent only required blocking GPTBot. This tool catches that pattern.
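For illustration, here is what the bad pattern and its fix look like in robots.txt (the intent assumed is "opt out of training, stay citable"):

```text
# BAD: blocks the training crawler AND the citation crawlers.
# The site vanishes from ChatGPT answers.
User-agent: GPTBot
User-agent: ChatGPT-User
User-agent: OAI-SearchBot
Disallow: /

# GOOD: opt out of training only; citation crawlers stay allowed.
User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
User-agent: OAI-SearchBot
Allow: /
```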

FAQ

Common llms.txt questions

What is llms.txt?

llms.txt is an emerging convention (proposed by Jeremy Howard at llmstxt.org) for telling AI search engines how to navigate your site. Think robots.txt + sitemap.xml + an executive summary, all in one Markdown file at the root of your domain. The format: an H1 with the site name, a > blockquote summary, optional paragraphs, then ## Section headings each containing a curated bullet list of [Page title](URL): description.
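A sketch of that format with hypothetical page names and URLs:

```markdown
# Example Project

> One-line summary of what the site or project is about.

Optional free-form paragraph with extra context for AI engines.

## Docs

- [Quickstart](https://example.com/docs/quickstart): install and first run
- [API reference](https://example.com/docs/api): endpoints and authentication

## Blog

- [Launch post](https://example.com/blog/launch): why we built this
```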

Which AI engines actually read llms.txt?

As of 2026 the file is being adopted by ChatGPT search, Perplexity, Claude, Anthropic's WebSearch, and various AI-coding tools (Cursor, Continue.dev, etc.). Google AI Overviews don't yet read it but the signal is moving that direction. Even when an engine doesn't read llms.txt directly, having one gives you a single Markdown file you can paste into any LLM-based research workflow.

Why check robots.txt at the same time?

The most common failure mode: a site has a great llms.txt but its robots.txt blocks GPTBot, ClaudeBot, or PerplexityBot — making the llms.txt invisible to the engines that would read it. Even worse, sites often add a generic "no AI training" robots.txt rule that ALSO blocks the citation crawlers (ChatGPT-User, OAI-SearchBot), opting themselves out of being CITED in AI search results. This tool flags that inconsistency.
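A minimal sketch of the cross-check using Python's standard urllib.robotparser; the robots.txt content and URL below are illustrative, not fetched from a real site:

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt: a "no AI training" rule that also
# catches the citation crawlers.
robots_txt = """\
User-agent: GPTBot
User-agent: ChatGPT-User
User-agent: OAI-SearchBot
Disallow: /
"""

AI_CRAWLERS = ["GPTBot", "ClaudeBot", "PerplexityBot",
               "Google-Extended", "OAI-SearchBot", "ChatGPT-User"]

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

# Which AI user-agents are denied the site root?
blocked = [ua for ua in AI_CRAWLERS
           if not rp.can_fetch(ua, "https://example.com/")]
print("Blocked AI crawlers:", blocked)
```

Here the citation crawlers (ChatGPT-User, OAI-SearchBot) end up blocked alongside GPTBot, which is exactly the inconsistency the tool flags.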

What's the minimum valid llms.txt?

Per the spec: just an H1 line with the site/project name. Everything else is optional. But a useful llms.txt has the H1 + a > blockquote one-line summary + at least one ## section with 3-10 curated links. Don't list every URL — that's what sitemap.xml is for. llms.txt is for the canonical landing pages, docs, and most important blog posts.

Does llms.txt replace sitemap.xml or robots.txt?

No — they coexist. sitemap.xml is the complete URL list for search-engine crawlers. robots.txt is access rules. llms.txt is a curated, human-readable summary for AI engines. Keep all three.

Where should I host llms.txt?

At the root of your origin, served as text/markdown or text/plain. Example: https://example.com/llms.txt. Each subdomain needs its own llms.txt; the root file doesn't cover subdomains. Per the spec, also offer an optional /llms-full.txt with the FULL content of each linked page concatenated, for engines that want everything in one fetch.
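If your web server maps the .txt extension to the wrong MIME type, a location override fixes it. A hypothetical nginx sketch (the empty types block clears the extension mapping so default_type applies):

```nginx
location = /llms.txt {
    types { }                      # clear extension-based mapping
    default_type text/markdown;
}

location = /llms-full.txt {
    types { }
    default_type text/markdown;
}
```

text/plain also satisfies the spec, so this override is optional if your server already serves .txt correctly.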

Want the full AI-readiness audit?

A Seoxpert scan checks llms.txt + robots.txt + structured-data + answer-first content patterns, plus 230+ other signals across SEO, security, and EU privacy compliance. Free first scan.

Security headers · Hreflang checker · robots.txt Checker · How to get cited by ChatGPT