| robots_txt.present | A reachable robots.txt exists at the site root. | 1 |
| robots_txt.allows_known_agents | robots.txt does not disallow well-known agent user-agents (GPTBot, ChatGPT-User, PerplexityBot, ClaudeBot, Google-Extended, Applebot-Extended). | 2 |
| robots_txt.content_signals | robots.txt advertises Content-Signal directives (e.g. `Content-Signal: ai-train=no, ai-summarize=yes`) that let agents respect granular usage preferences. | 2 |
| llms_txt.present | An llms.txt file is present at the site root, helping agents locate your most important content. | 1 |
| sitemap.present | A sitemap.xml is reachable and referenced from robots.txt. | 1 |
| discoverability.link_headers | HTTP `Link:` response headers carry useful relations (canonical, alternate, describedby) that agents consume without parsing HTML. | 1 |
| markup.canonical | A canonical URL is declared and resolves to the same page. | 1 |
| discoverability.mcp_server_card | An MCP Server Card is reachable at `/.well-known/mcp/server-card.json` (or referenced from a Link header) and validates against the Model Context Protocol spec. | 2 |
| discoverability.oauth_authorization_server | `/.well-known/oauth-authorization-server` is reachable and conforms to RFC 8414, letting agents discover authorization endpoints without out-of-band setup. | 1 |
| discoverability.oauth_protected_resource | `/.well-known/oauth-protected-resource` is published per RFC 9728 so agents can discover required scopes and authorization servers programmatically. | 1 |
| discoverability.api_catalog | An API catalog is published at `/.well-known/api-catalog` (RFC 9727) or an OpenAPI document is reachable at `/openapi.json` or `/openapi.yaml`. | 1 |
| discoverability.web_bot_auth | The site signals support for the Web Bot Auth IETF draft (HTTP Message Signatures over a known JWKS) so well-behaved agents can prove identity. | 1 |
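
The robots.txt checks listed above can be approximated offline. The following is a minimal sketch, not the scanner's actual implementation: it assumes Python's standard `urllib.robotparser` is an acceptable stand-in for the real matching logic, and the function names (`blocked_agents`, `content_signals`, `sitemaps`), the `SAMPLE` file, and the `example.com` URLs are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Agent names taken from the robots_txt.allows_known_agents description.
KNOWN_AGENTS = [
    "GPTBot", "ChatGPT-User", "PerplexityBot",
    "ClaudeBot", "Google-Extended", "Applebot-Extended",
]


def _parse(robots_txt: str) -> RobotFileParser:
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def blocked_agents(robots_txt: str, url: str = "https://example.com/") -> list[str]:
    """Return the known agents that may not fetch `url` (hypothetical helper)."""
    parser = _parse(robots_txt)
    return [agent for agent in KNOWN_AGENTS if not parser.can_fetch(agent, url)]


def content_signals(robots_txt: str) -> dict[str, str]:
    """Collect key=value pairs from Content-Signal lines.

    Assumption: the directive looks like the example in the table,
    `Content-Signal: ai-train=no, ai-summarize=yes`.
    """
    signals: dict[str, str] = {}
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "content-signal":
            for pair in value.split(","):
                key, eq, val = pair.partition("=")
                if eq:
                    signals[key.strip()] = val.strip()
    return signals


def sitemaps(robots_txt: str) -> list[str]:
    """Return Sitemap URLs referenced from robots.txt (Python 3.8+)."""
    return _parse(robots_txt).site_maps() or []


SAMPLE = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
Content-Signal: ai-train=no, ai-summarize=yes

Sitemap: https://example.com/sitemap.xml
"""

print(blocked_agents(SAMPLE))
print(content_signals(SAMPLE))
print(sitemaps(SAMPLE))
```

In this sample only GPTBot is blocked, so a check like robots_txt.allows_known_agents would presumably fail while robots_txt.content_signals and the robots.txt half of sitemap.present would pass.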
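
To illustrate how an agent might consume the `Link:` relations mentioned in discoverability.link_headers without parsing HTML, here is a simplified sketch. It is an assumption-laden parser, not a full RFC 8288 implementation: it does not handle commas or semicolons inside quoted parameter values, and `parse_link_header` and the sample header value are hypothetical.

```python
import re

# <target> followed by zero or more ";param=value" segments.
LINK_RE = re.compile(r'<([^>]*)>((?:\s*;\s*[^;,]+)*)')
# One ";name=value" or ';name="value"' parameter.
PARAM_RE = re.compile(r';\s*([\w-]+)\s*=\s*"?([^";,]*)"?')


def parse_link_header(value: str) -> dict[str, list[str]]:
    """Map each relation type (lowercased) to the targets that declare it."""
    relations: dict[str, list[str]] = {}
    for target, params in LINK_RE.findall(value):
        attrs = {name.lower(): val for name, val in PARAM_RE.findall(params)}
        # A single rel parameter may carry several space-separated relation types.
        for rel in attrs.get("rel", "").split():
            relations.setdefault(rel.lower(), []).append(target)
    return relations


# Hypothetical header value for illustration only.
HEADER = (
    '<https://example.com/page>; rel="canonical", '
    '<https://example.com/page.md>; rel="alternate"; type="text/markdown", '
    '</openapi.json>; rel="describedby"'
)

links = parse_link_header(HEADER)
print(links)
```

A check along the lines of discoverability.link_headers could then test whether any of `canonical`, `alternate`, or `describedby` appears among the keys of `links`.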
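
The `/.well-known/` checks above all follow the same pattern: resolve a fixed path against the site origin, fetch it, and validate the result. The sketch below shows one way that could look. The paths come from the table; everything else (`candidate_urls`, `is_reachable`, `missing_required_metadata`, the `example.com` origin) is hypothetical. The metadata check covers only the two fields RFC 8414 marks as unconditionally required, `issuer` and `response_types_supported`, so it is far from full conformance validation.

```python
import urllib.error
import urllib.request
from urllib.parse import urljoin

# Paths as listed in the check descriptions.
WELL_KNOWN_PATHS = {
    "discoverability.mcp_server_card": ["/.well-known/mcp/server-card.json"],
    "discoverability.oauth_authorization_server": [
        "/.well-known/oauth-authorization-server"
    ],
    "discoverability.oauth_protected_resource": [
        "/.well-known/oauth-protected-resource"
    ],
    "discoverability.api_catalog": [
        "/.well-known/api-catalog", "/openapi.json", "/openapi.yaml"
    ],
}


def candidate_urls(page_url: str) -> dict[str, list[str]]:
    """Resolve each well-known path against the origin of `page_url`."""
    return {
        check: [urljoin(page_url, path) for path in paths]
        for check, paths in WELL_KNOWN_PATHS.items()
    }


def is_reachable(url: str, timeout: float = 5.0) -> bool:
    """Return True if a GET succeeds (performs network I/O)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, TimeoutError):
        return False


def missing_required_metadata(metadata: dict) -> list[str]:
    """List the unconditionally required RFC 8414 fields absent from `metadata`."""
    required = ["issuer", "response_types_supported"]
    return [field for field in required if field not in metadata]


urls = candidate_urls("https://example.com/docs/page")
print(urls["discoverability.api_catalog"])
print(missing_required_metadata({"issuer": "https://example.com"}))
```

Because the paths begin with `/`, `urljoin` discards the page's own path and resolves them against the site root, which matches the "at the site root" wording used throughout the table.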