AI Crawler Access

8% weight

Learn how GEOAudit validates robots.txt rules for AI bots, sitemap presence, noai meta tags, TDM Protocol, and AI crawler access control.

What We Check

GEOAudit validates how your site controls AI crawler access. We check robots.txt rules for specific AI bots (GPTBot, ClaudeBot, Google-Extended, PerplexityBot, Applebot-Extended), crawl delay directives, sitemap declaration in robots.txt, noai and noimageai meta tags, X-Robots-Tag HTTP headers, TDM (Text and Data Mining) protocol support, C2PA content credentials, and overall crawler access consistency. This category ensures you're not accidentally blocking AI agents while maintaining desired control.
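
For reference, here is a minimal sketch of a robots.txt that would satisfy these checks with granular, per-bot rules (example.com and the /drafts/ path are placeholders):

User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: Google-Extended
Allow: /

User-agent: PerplexityBot
Disallow: /drafts/

User-agent: *
Crawl-delay: 10

Sitemap: https://example.com/sitemap.xml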

How We Score

AI Crawler Access carries an 8% weight in the overall score. Each check produces a pass, warn, or fail result. Key assessments include robots.txt existence, individual AI bot access rules, sitemap presence, absence of unintended blocking (such as stray noai tags), and proper header configuration. Blocking all AI bots scores as a major warning, since it eliminates AI discoverability. Granular control with per-bot rules scores best.
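
For contrast, this is the blanket rule that draws the major warning, since it shuts out AI bots along with every other crawler:

User-agent: *
Disallow: /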

Why It Matters

If AI bots can't crawl your site, nothing else matters for AI discoverability. Robots.txt is the gatekeeper — incorrectly configured rules can block major AI agents like GPTBot (ChatGPT) or ClaudeBot (Anthropic). Sitemap presence helps AI crawlers find your most important pages. The noai meta tag can inadvertently prevent AI citation. Understanding and correctly configuring AI crawler access is the prerequisite for all other AI optimization.

How to Improve

Review your robots.txt for rules affecting AI-specific user agents: GPTBot, ClaudeBot, Google-Extended, PerplexityBot, Applebot-Extended. Explicitly allow the AI bots you want to index your content. Add your sitemap URL to robots.txt. Remove noai meta tags unless you specifically want to prevent AI training use. Check X-Robots-Tag headers for unintended restrictions. Consider implementing the TDM protocol for granular AI rights management. Test your robots.txt against each bot's user agent.
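
One way to run that test is Python's standard-library robots.txt parser; a minimal sketch, where example.com and the sample URL are placeholders:

from urllib.robotparser import RobotFileParser
import urllib.request

AI_BOTS = ["GPTBot", "ClaudeBot", "Google-Extended",
           "PerplexityBot", "Applebot-Extended"]

SITE = "https://example.com"  # placeholder: use your own domain

# Fetch and parse the live robots.txt.
rp = RobotFileParser(f"{SITE}/robots.txt")
rp.read()

# Ask whether each AI user agent may fetch a representative page.
for bot in AI_BOTS:
    verdict = "allowed" if rp.can_fetch(bot, f"{SITE}/") else "blocked"
    print(f"{bot}: {verdict}")

# Also inspect X-Robots-Tag on a HEAD request for unintended restrictions.
req = urllib.request.Request(f"{SITE}/", method="HEAD")
with urllib.request.urlopen(req) as resp:
    print("X-Robots-Tag:", resp.headers.get("X-Robots-Tag"))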

Frequently Asked Questions

Which AI bots should I allow in robots.txt?

For maximum AI discoverability, allow GPTBot (ChatGPT), ClaudeBot (Anthropic Claude), PerplexityBot (Perplexity), Google-Extended (the robots.txt token Google checks before using content for Gemini), and Applebot-Extended (its Apple Intelligence counterpart). You can selectively block specific bots if needed while allowing others.
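
A sketch of that selective pattern, blocking one bot (PerplexityBot here, purely as an illustration) while leaving the rest open:

User-agent: PerplexityBot
Disallow: /

User-agent: *
Allow: /

Bots without a group of their own fall through to the * rules, so the other AI crawlers stay allowed.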

What does the noai meta tag do?

The noai meta tag (<meta name="robots" content="noai">) tells AI systems not to use your content for AI training or generation. It's important to know this tag exists and to use it only intentionally — many sites carry it without realizing it limits their AI discoverability.
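
In full, the tag and its image counterpart look like this; some sites also mirror the directive in an X-Robots-Tag response header (shown last), though honoring of these signals varies by AI system:

<meta name="robots" content="noai">
<meta name="robots" content="noimageai">

X-Robots-Tag: noai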

Is a sitemap really necessary for AI crawlers?

Yes — a sitemap tells AI crawlers exactly which pages to index and when they were last updated. Without a sitemap, AI bots must discover pages through links alone, potentially missing important content. Always reference your sitemap in robots.txt.
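
A minimal sitemap carrying that last-updated signal looks like this (the URL and date are placeholders); declare it with a Sitemap: line in robots.txt, as in the example under What We Check:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
    <lastmod>2025-01-15</lastmod>
  </url>
</urlset>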

What is TDM Protocol?

The TDM (Text and Data Mining) Reservation Protocol, often abbreviated TDMRep, is an emerging W3C Community Group specification that lets website owners declare permissions for AI text and data mining. It provides more granular control than robots.txt, allowing you to permit certain AI uses while restricting others.
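
As a sketch under the TDMRep draft conventions, a reservation can be sent as HTTP response headers (the policy URL is a placeholder):

tdm-reservation: 1
tdm-policy: https://example.com/tdm-policy.json

or declared site-wide in a /.well-known/tdmrep.json file:

[
  {
    "location": "/*",
    "tdm-reservation": 1,
    "tdm-policy": "https://example.com/tdm-policy.json"
  }
]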

Ready to optimize for AI?

Start scanning your pages for free — no account required for the Chrome extension. Or sign up for the full dashboard experience.