Do your robots meta tag and X-Robots-Tag header agree with each other?
Contradictory robots directives force AI crawlers to apply the strictest one and silently drop your pages.
What this signal tests
We check the two places your site can give per-page instructions to crawlers - the robots meta tag in the HTML head and the X-Robots-Tag in the HTTP response header - and confirm they agree with each other. We also check for typos and unknown directive names that crawlers will refuse to parse, and we confirm that non-HTML files (like PDFs) use the HTTP header form because they cannot include a meta tag.
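To make the two sources concrete, here is a minimal Python sketch of reading them from one response. It is an illustration, not the scanner's actual code: the names `RobotsMetaParser` and `read_robots_sources` are hypothetical, and it assumes you already have the HTML body as a string and the response headers as a plain dict.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the content attribute of every <meta name="robots"> tag.

    Simplification: crawler-specific names such as name="googlebot" are
    ignored, and the tag is accepted anywhere in the document, not only
    inside <head>.
    """

    def __init__(self):
        super().__init__()
        self.contents = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None for valueless attributes.
        attr_map = {name.lower(): (value or "") for name, value in attrs}
        if attr_map.get("name", "").strip().lower() == "robots":
            self.contents.append(attr_map.get("content", ""))


def read_robots_sources(html, headers):
    """Return (meta_values, header_values) as raw, unparsed strings."""
    parser = RobotsMetaParser()
    parser.feed(html)
    header_values = [
        value for name, value in headers.items()
        if name.lower() == "x-robots-tag"
    ]
    return parser.contents, header_values


# Made-up sample: the HTML allows indexing, the header forbids it.
sample_html = '<html><head><meta name="robots" content="index, follow"></head></html>'
sample_headers = {"Content-Type": "text/html", "X-Robots-Tag": "noindex"}
print(read_robots_sources(sample_html, sample_headers))
```

For a non-HTML file such as a PDF, the first list is always empty, so the header is the only place a directive can live.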
Why it matters for your visibility in AI
When the meta tag says one thing and the HTTP header says another - for example, the HTML allows indexing but the header sets noindex - AI crawlers do not try to pick a winner. They apply the most restrictive directive and silently drop the page from their index. The page disappears from AI answers, but neither tag looks wrong in isolation, so the cause can take weeks to find. Typos in directive names cause a similar silent failure. A directive of nofllow (nofollow with the first o missing) is treated as unknown: some crawlers ignore it, others apply it conservatively. Inconsistent crawler behaviour is exactly what you do not want when trying to debug missing pages. Keeping one source of truth and using only the well-known directive names eliminates an entire category of mysterious AI invisibility.
Pass criteria at a glance
| Criterion | Passes when |
|---|---|
| Robots directive consistency | Zero conflicts and zero unknown tokens across the sample. |
How we test it
For each sampled page we read the robots meta tag from the HTML head and the X-Robots-Tag from the HTTP response header. We normalise the directives - index, noindex, follow, nofollow, noarchive, nosnippet, and so on - and compare the two sources. We flag any direct contradiction and any unknown or misspelled directive token that compliant crawlers would silently ignore.
Technical detection method
Parse both sources; normalise the directives; flag conflicts and unknown directive tokens.
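The three steps above can be sketched in Python as follows. This is a simplified illustration rather than the scanner's real implementation: the names `KNOWN`, `OPPOSITES`, `normalise`, and `check` are hypothetical, the known-directive list is the one given on this page, and the sketch assumes the header value carries no user-agent prefix (such as `googlebot: noindex`).

```python
# Directive names listed on this page as the standard set.
KNOWN = {
    "index", "noindex", "follow", "nofollow", "noarchive", "nosnippet",
    "noimageindex", "max-snippet", "max-image-preview", "max-video-preview",
}

# Pairs that contradict each other when split across the two sources.
OPPOSITES = [("index", "noindex"), ("follow", "nofollow")]


def normalise(raw):
    """Split a comma-separated directive string into lowercase names.

    Values after a colon (e.g. "max-snippet:-1") are dropped so that only
    the directive name is compared.
    """
    names = set()
    for token in raw.split(","):
        token = token.strip().lower()
        if token:
            names.add(token.split(":", 1)[0].strip())
    return names


def check(meta_raw, header_raw):
    """Return a list of human-readable issues; an empty list means a pass."""
    meta, header = normalise(meta_raw), normalise(header_raw)
    issues = []
    for allow, deny in OPPOSITES:
        if (allow in meta and deny in header) or (deny in meta and allow in header):
            issues.append(f"conflict: {allow} vs {deny}")
    for token in sorted((meta | header) - KNOWN):
        issues.append(f"unknown token: {token}")
    return issues


print(check("index, follow", "noindex, nofllow"))
```

Note that the misspelled `nofllow` is reported as an unknown token rather than as a conflict with `follow`, because after the typo it no longer matches any known directive.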
If your site fails: how to fix it
- Decide where your robots directives will live and stick to it. For HTML pages, the meta tag is the conventional location and the easiest to audit visually. For files like PDFs, you must use the X-Robots-Tag header because there is no HTML to contain a meta tag.
- If your CMS or CDN currently adds an X-Robots-Tag header that nobody asked for, find the setting and disable it - some security plugins ship with this on by default and can accidentally noindex your whole site.
- Audit your templates for stale or copy-pasted meta robots tags. A noindex left over from a staging environment is one of the most common ways production sites accidentally hide themselves.
- Confirm every directive name is from the standard list: index, noindex, follow, nofollow, noarchive, nosnippet, noimageindex, max-snippet, max-image-preview, max-video-preview. Anything else is a typo or a vendor-specific directive that may be ignored.
- Re-run the AI Ready Test to confirm zero contradictions and zero unknown tokens across your sample.
Quick facts
| Attribute | Value |
|---|---|
| Maturity | Established |
| Weight | Medium |
| Category | Crawlability |
Frequently asked questions
What is the X-Robots-Tag header?
It is a response header that lets you apply the same noindex/follow/etc. directives to a URL as the robots meta tag, but at the HTTP layer rather than inside the HTML. It is essential for non-HTML files like PDFs and images, where you cannot embed a meta tag in the document.
How does a conflict actually behave?
Most major crawlers, including AI ones, apply the union of all noindex-style directives - that is, the most restrictive interpretation. If the meta tag allows indexing but the header says noindex, the page is treated as noindex. The reverse also holds: a noindex in the meta tag wins over a header that allows indexing. Conflicting signals always lean restrictive.
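As a rough model of that behaviour, the sketch below merges both sources and lets any restrictive directive win. The function name `effective_policy` is hypothetical, the inputs are assumed to be already-normalised directive names, and real crawlers may differ in the details.

```python
def effective_policy(meta_directives, header_directives):
    """Combine both sources; a restrictive directive in either one wins."""
    combined = {d.strip().lower() for d in meta_directives}
    combined |= {d.strip().lower() for d in header_directives}
    return {
        "index": "noindex" not in combined,
        "follow": "nofollow" not in combined,
    }


# The meta tag allows indexing, the header forbids it: the page is not indexed.
print(effective_policy(["index", "follow"], ["noindex"]))
```

Swapping the two arguments gives the same result, which is why an allow in one source never rescues a page from a noindex in the other.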
Can I noindex specific file types with the X-Robots-Tag?
Yes - that is one of the main use cases. For example, you can add an X-Robots-Tag header globally for all PDFs to prevent them being indexed without modifying each file. Web server config (nginx, Apache) and CDNs all support content-type-based header rules.
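Server and CDN rules are the usual place for this, but the same idea can be shown at the application layer. The following is a minimal Python WSGI middleware sketch, assuming your application is a WSGI app; `add_pdf_noindex`, `pdf_app`, and `fake_start_response` are hypothetical names used only for illustration.

```python
def add_pdf_noindex(app):
    """Wrap a WSGI app so every PDF response carries X-Robots-Tag: noindex."""

    def wrapped(environ, start_response):
        def patched_start_response(status, headers, exc_info=None):
            content_type = ""
            for name, value in headers:
                if name.lower() == "content-type":
                    # Drop parameters such as "; charset=..." before comparing.
                    content_type = value.split(";", 1)[0].strip().lower()
            has_tag = any(name.lower() == "x-robots-tag" for name, _ in headers)
            # Leave an existing header alone to avoid sending two values.
            if content_type == "application/pdf" and not has_tag:
                headers = list(headers) + [("X-Robots-Tag", "noindex")]
            return start_response(status, headers, exc_info)

        return app(environ, patched_start_response)

    return wrapped


# Tiny demonstration with a stand-in app and a stand-in server callback.
def pdf_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "application/pdf")])
    return [b"%PDF-1.4"]


captured = {}


def fake_start_response(status, headers, exc_info=None):
    captured["headers"] = headers


body = add_pdf_noindex(pdf_app)({}, fake_start_response)
print(captured["headers"])
```

Because the rule keys off the content type, individual PDF files never need to be touched, and HTML responses pass through unchanged.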
What are common typos I should watch for?
The most common: 'no-index' or 'no index' instead of noindex (the hyphen and the space both break parsing), 'no follow' instead of nofollow, and 'noarcive' instead of noarchive. These are all silently ignored by some crawlers and may be conservatively applied by others, causing unpredictable behaviour.
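A small helper can catch typos like these during a template audit. The sketch below is one possible approach, not part of the test itself: `suggest_fix` is a hypothetical name, the known list is the one given on this page, and the 0.8 similarity cutoff passed to the standard library's `difflib.get_close_matches` is an arbitrary assumption you may want to tune.

```python
import difflib

# Directive names listed on this page as the standard set.
KNOWN = [
    "index", "noindex", "follow", "nofollow", "noarchive", "nosnippet",
    "noimageindex", "max-snippet", "max-image-preview", "max-video-preview",
]


def suggest_fix(token):
    """Return the directive the author probably meant, or None if unsure.

    A token that is already valid is returned unchanged.
    """
    cleaned = token.strip().lower()
    if cleaned in KNOWN:
        return cleaned
    # Stray hyphens and spaces are the most common breakage.
    collapsed = cleaned.replace("-", "").replace(" ", "")
    if collapsed in KNOWN:
        return collapsed
    # Otherwise fall back to fuzzy matching against the known names.
    matches = difflib.get_close_matches(collapsed, KNOWN, n=1, cutoff=0.8)
    return matches[0] if matches else None


for bad in ["no-index", "no follow", "noarcive", "nofllow"]:
    print(bad, "->", suggest_fix(bad))
```

Treat a suggestion as a prompt for a human to check the template, since a fuzzy match cannot know what the author intended.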
Run your own scan
Run a free scan and see how your site grades across all 155 AI-readiness signals.