# Sensia / Mneme-AI: robots.txt
# Public marketing surface is crawlable. The authenticated app (workspace,
# admin, portal, spaces) is intentionally walled off via the Disallow rules
# below + client-side auth redirects. NOTE: there is NO edge `X-Robots-Tag:
# noindex` header today; robots.txt + auth are the only boundary. If a
# noindex header is added later (Caddy, app subdomains only), document it here.
#
# RFC 9309 reminder: a named User-agent group FULLY OVERRIDES the `*` group
# for that bot; it does NOT inherit `*`'s rules. So every named group below
# carries the SAME Disallow list as `*`. Keep them in sync.
# Last updated: 2026-06-08.

User-agent: *
Allow: /
# Authenticated app: no SEO value, sensitive paths
Disallow: /workspace/
Disallow: /admin/
Disallow: /superadmin/
Disallow: /portal/
Disallow: /api/
Disallow: /partners/
Disallow: /profile
Disallow: /settings
Disallow: /spaces/
Disallow: /checkout-redirect
Disallow: /checkout-success
Disallow: /invite/
Disallow: /devices-profiles
Disallow: /signup/finalize
Disallow: /signup/cancelled
Disallow: /signup/check-email
Disallow: /verify-email

# ─────────────────────────────────────────────────────────────────────────────
# LLM / AI crawlers: explicitly welcomed on the marketing surface so Sensia
# is discoverable via ChatGPT, Claude, Perplexity, Google AI Overviews / Gemini,
# Bing Copilot, Meta AI, Apple Intelligence, DuckAssist, Mistral, Cohere, etc.
# One shared rule block = same Disallow list as `*` (named groups don't inherit
# it; see RFC note above). Vendors covered:
#   OpenAI ........ GPTBot (training), OAI-SearchBot (ChatGPT Search), ChatGPT-User (live fetch)
#   Anthropic ..... ClaudeBot, Claude-SearchBot, Claude-User, anthropic-ai
#   Perplexity .... PerplexityBot, Perplexity-User
#   Google ........ Google-Extended (Gemini training/grounding, opt-IN)
#   Meta .......... Meta-ExternalAgent, FacebookBot
#   Apple ......... Applebot-Extended (Apple Intelligence training, opt-IN)
#   Microsoft ..... bingbot (Bing / Copilot)
#   Common Crawl .. CCBot (feeds many open LLM datasets)
#   DuckDuckGo .... DuckAssistBot
#   Cohere ........ cohere-ai
#   Mistral ....... MistralAI-User
#   Diffbot ....... Diffbot (Knowledge-Graph extraction)
#   You.com ....... YouBot
# To opt OUT of model training while keeping AI search, move Google-Extended
# and Applebot-Extended into their own group with `Disallow: /`.
# ─────────────────────────────────────────────────────────────────────────────
User-agent: GPTBot
User-agent: OAI-SearchBot
User-agent: ChatGPT-User
User-agent: ClaudeBot
User-agent: Claude-SearchBot
User-agent: Claude-User
User-agent: anthropic-ai
User-agent: PerplexityBot
User-agent: Perplexity-User
User-agent: Google-Extended
User-agent: Meta-ExternalAgent
User-agent: FacebookBot
User-agent: Applebot-Extended
User-agent: bingbot
User-agent: CCBot
User-agent: DuckAssistBot
User-agent: cohere-ai
User-agent: MistralAI-User
User-agent: Diffbot
User-agent: YouBot
Allow: /
Disallow: /workspace/
Disallow: /admin/
Disallow: /superadmin/
Disallow: /portal/
Disallow: /api/
Disallow: /partners/
Disallow: /profile
Disallow: /settings
Disallow: /spaces/
Disallow: /checkout-redirect
Disallow: /checkout-success
Disallow: /invite/
Disallow: /devices-profiles
Disallow: /signup/finalize
Disallow: /signup/cancelled
Disallow: /signup/check-email
Disallow: /verify-email

# Bytedance / TikTok / Doubao: kept in its own group (low ICP relevance) so it
# can be blocked outright in one line: replace `Allow: /` with `Disallow: /`.
User-agent: Bytespider
Allow: /
Disallow: /workspace/
Disallow: /admin/
Disallow: /superadmin/
Disallow: /portal/
Disallow: /api/
Disallow: /partners/
Disallow: /profile
Disallow: /settings
Disallow: /spaces/
Disallow: /checkout-redirect
Disallow: /checkout-success
Disallow: /invite/
Disallow: /devices-profiles
Disallow: /signup/finalize
Disallow: /signup/cancelled
Disallow: /signup/check-email
Disallow: /verify-email

# ─────────────────────────────────────────────────────────────────────────────
# Sitemap on the apex host (matches the canonical strategy: www/sensia have no
# 301, the self-canonical collapses them onto the apex). The sitemap's entries
# are all apex too, so keep this on the apex for a consistent signal.
# ─────────────────────────────────────────────────────────────────────────────
Sitemap: https://mneme-ai.com/sitemap.xml
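
# ─────────────────────────────────────────────────────────────────────────────
# Illustrative sketch only (commented out, NOT active): the training opt-out
# described in the AI-crawler note above, written out as its own group. This
# assumes both tokens are first removed from the shared AI group; RFC 9309
# parsers combine groups that name the same token, so a leftover `Allow: /`
# there could cancel the block.
# ─────────────────────────────────────────────────────────────────────────────
# User-agent: Google-Extended
# User-agent: Applebot-Extended
# Disallow: /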