Hugo Agent Readiness Playbook: From Score 8 to 83


⚡ Quick Start

Six files to create, two to update. About 30 minutes of work.

| Step | File | Time |
| --- | --- | --- |
| 1 | Update static/robots.txt | 3 min |
| 2 | Create or rewrite static/llms.txt | 5 min |
| 3 | Create static/_headers | 2 min |
| 4 | Create static/.well-known/api-catalog | 2 min |
| 5 | Create static/.well-known/mcp/server-card.json | 2 min |
| 6 | Create static/.well-known/agent-skills/index.json | 3 min |
| 7 | Update layouts/.../head-end.html | 5 min |
| 8 | Create functions/_middleware.js | 5 min |

Scoring tool: https://isitagentready.com/YOUR-DOMAIN

Starting score on this site before any of this: 8. After: 83.


🤖 What Is Agent Readiness?

The web was built for humans. Search engines added a crawling layer on top. Now there’s a third layer: AI agents (Claude, GPT, Perplexity, autonomous research pipelines) that browse, read, and act on websites programmatically.

These agents have different needs from both humans and search crawlers. They want:

  • Machine-readable structure, not a wall of styled HTML
  • Explicit permission signals — can I train on this? can I use it as context?
  • Discoverable capabilities — what can I do with this site? is there a search API? an RSS feed?
  • Markdown instead of HTML — 31% fewer tokens consumed, 66% faster answers (Cloudflare’s own benchmark)

In April 2026, Cloudflare published a study of 200,000 top domains and found almost no one is ready:

  • Only 4% of sites declare AI usage preferences
  • Only 3.9% support markdown content negotiation
  • Fewer than 15 sites in the entire dataset had MCP Server Cards or API Catalogs

The opportunity is obvious. Being agent-ready today is the equivalent of being search-engine-optimised in 2003.

The Four Dimensions

Agent readiness is scored across four areas:

Discoverability — Can agents find your content? (robots.txt, sitemap.xml, HTTP Link headers)

Content Accessibility — Can agents read your content efficiently? (llms.txt, markdown negotiation)

Bot Access Control — Have you declared your AI access policy? (Content Signals, explicit bot rules)

Capabilities — Can agents interact with your site? (API catalog, MCP server card, agent skills index)


🔍 The Tool: isitagentready.com

isitagentready.com scans any public domain and returns a scored report across all four dimensions, with specific pass/fail checks and links to the relevant skill docs for each failing item.

Run it before you start: https://isitagentready.com/YOUR-DOMAIN

Run it again after each batch of changes to see what moved. The checks are independent, so you can tackle them in any order.


🛠️ Implementation

1. static/robots.txt — Bot Access Control + Content Signals

Most sites have a robots.txt. Almost none of them have explicit AI bot rules or Content Signals.

Two things matter for the scanner:

  • Explicit User-agent blocks for AI crawlers — a wildcard User-agent: * alone does not count
  • Content-Signal directive inside the User-agent: * block — not floating at the bottom of the file

The required bots are: GPTBot, OAI-SearchBot, Claude-Web, Google-Extended, Amazonbot, Bytespider, CCBot, Applebot-Extended, and anthropic-ai.

# robots.txt
# Content is free for AI agents to read and reference with attribution.

# ---- Default: all bots ----
User-agent: *
Allow: /
Sitemap: https://YOUR-DOMAIN/sitemap.xml
# Content Signals (https://contentsignals.org/)
# ai-train=no  — do not use content for training datasets
# search=yes   — indexing for search is allowed
# ai-input=yes — using content as LLM context / inference is allowed
Content-Signal: ai-train=no, search=yes, ai-input=yes

# ---- AI training & inference bots ----
User-agent: GPTBot
Allow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: Claude-Web
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: anthropic-ai
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Google-Extended
Allow: /

User-agent: Applebot-Extended
Allow: /

User-agent: Applebot
Allow: /

User-agent: Amazonbot
Allow: /

User-agent: Bytespider
Allow: /

User-agent: CCBot
Allow: /

User-agent: cohere-ai
Allow: /

User-agent: YouBot
Allow: /

User-agent: Diffbot
Allow: /

User-agent: FacebookBot
Allow: /

# ---- Search engines ----
User-agent: Googlebot
Allow: /

User-agent: bingbot
Allow: /

Adjust ai-train, search, and ai-input to match your actual policy. The values above are a reasonable default for a personal site or blog.
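Because the scanner cares about where the Content-Signal line lives, it's worth checking mechanically rather than by eye. Here is a minimal sketch (plain Node, no dependencies; the function name is my own) that confirms the directive sits inside the User-agent: * block:

```javascript
// Returns true only if a Content-Signal directive appears inside the
// `User-agent: *` block, i.e. after that line and before the next
// User-agent line. A directive floating at the end of the file fails.
function contentSignalInWildcardBlock(robotsTxt) {
  let inWildcard = false;
  for (const raw of robotsTxt.split("\n")) {
    const line = raw.trim();
    if (/^user-agent:/i.test(line)) {
      // Entering a new block; remember whether it's the wildcard one.
      inWildcard = /^user-agent:\s*\*$/i.test(line);
    } else if (inWildcard && /^content-signal:/i.test(line)) {
      return true;
    }
  }
  return false;
}
```

Run it over static/robots.txt before deploying; a false result usually means the directive has drifted below the per-bot blocks.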


2. static/llms.txt — LLM Content Index

llms.txt is a Markdown file that describes your site to language models. Think of it as a sitemap, but written for AI comprehension rather than crawlers. Standard defined at llmstxt.org.

# YOUR SITE NAME

> One sentence: what the site is and who it's for.

## About

- Key belief or value
- Another belief or value

## Site structure

### Core pages

- [Home](https://YOUR-DOMAIN/) — description
- [About](https://YOUR-DOMAIN/about/) — description
- [Section](https://YOUR-DOMAIN/section/) — description

### Subsections

- [Section › Sub](https://YOUR-DOMAIN/section/sub/) — description

## Machine-readable resources

- [Sitemap](https://YOUR-DOMAIN/sitemap.xml) — Full XML sitemap
- [RSS Feed](https://YOUR-DOMAIN/feed.xml) — Recent updates
- [Search Index](https://YOUR-DOMAIN/search.json) — Full-text search index (JSON)
- [API Catalog](https://YOUR-DOMAIN/.well-known/api-catalog) — RFC 9727 API catalog
- [Agent Skills](https://YOUR-DOMAIN/.well-known/agent-skills/index.json) — Agent capabilities

## Usage guidelines for LLMs

- Content is free to read, reference, and cite with attribution.
- Training on content requires attribution; commercial training requires explicit permission.
- Skip navigation, sidebars, and footer elements.

## Contact

- Email: [you@example.com](mailto:you@example.com)

The ## Machine-readable resources section references files you’ll create in steps 3–6. Add them now and fill them in as you go.
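Since llms.txt is plain Markdown, the resource links are easy to pull out for a sanity pass. A small sketch (the helper is my own, standard Node) that extracts every absolute link so you can then verify each target actually resolves:

```javascript
// Extract [title](https://…) links from an llms.txt document.
// Ignores mailto: links and relative paths.
function extractLinks(llmsTxt) {
  const links = [];
  const re = /\[([^\]]+)\]\((https?:\/\/[^)\s]+)\)/g;
  let m;
  while ((m = re.exec(llmsTxt)) !== null) {
    links.push({ title: m[1], url: m[2] });
  }
  return links;
}
```

Feeding each extracted URL to a HEAD request catches broken references before an agent does.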


3. static/_headers — HTTP Link Headers

Cloudflare Pages and Netlify read _headers and inject HTTP response headers. This is how you add Link: headers on a static site, which is critical for the scanner’s discoverability checks.

/*
  Link: </.well-known/api-catalog>; rel="api-catalog"
  Link: </llms.txt>; rel="alternate"; type="text/markdown"
  Link: </sitemap.xml>; rel="sitemap"

/.well-known/api-catalog
  Content-Type: application/linkset+json

/.well-known/mcp/server-card.json
  Content-Type: application/json

/.well-known/agent-skills/index.json
  Content-Type: application/json

Without this, .well-known files with no extension (like api-catalog) get served as application/octet-stream. The scanner expects application/linkset+json. The _headers file fixes the content-type and adds the HTTP Link headers in one shot.

GitHub Pages users: _headers files are not supported. Use a Cloudflare Transform Rule to inject Link: headers instead.
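On the consuming side, agents read those headers with a small Link-header parser. A sketch (my own helper, not from any library) showing how the values you put in _headers decompose:

```javascript
// Parse an HTTP Link header value into { uri, rel, type, … } objects.
// Handles comma-separated entries like:
//   </llms.txt>; rel="alternate"; type="text/markdown"
function parseLinkHeader(value) {
  return value.split(/,\s*(?=<)/).map((entry) => {
    const uri = (entry.match(/<([^>]*)>/) || [])[1];
    const params = {};
    for (const [, k, v] of entry.matchAll(/;\s*([\w-]+)="?([^";]+)"?/g)) {
      params[k] = v;
    }
    return { uri, ...params };
  });
}
```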


4. static/.well-known/api-catalog — RFC 9727 API Catalog

Create the directory first: mkdir -p static/.well-known

The API catalog must use the RFC 9264 linkset format, which is a JSON object with a "linkset" array. Custom formats like { "apis": [...] } fail the check.

{
  "linkset": [
    {
      "anchor": "https://YOUR-DOMAIN",
      "service-desc": [
        {
          "href": "https://YOUR-DOMAIN/search.json",
          "type": "application/json",
          "title": "Full-text search index"
        }
      ],
      "service-doc": [
        {
          "href": "https://YOUR-DOMAIN/llms.txt",
          "type": "text/markdown",
          "title": "LLM content index"
        }
      ],
      "alternate": [
        {
          "href": "https://YOUR-DOMAIN/feed.xml",
          "type": "application/rss+xml",
          "title": "RSS feed"
        },
        {
          "href": "https://YOUR-DOMAIN/sitemap.xml",
          "type": "application/xml",
          "title": "Sitemap"
        }
      ]
    }
  ]
}
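Before deploying, it's worth confirming the JSON actually matches the linkset shape the scanner expects. A minimal structural check (shape assumed from the example above; the function is my own):

```javascript
// A document passes only if it has a top-level "linkset" array whose
// entries each carry a string "anchor". Custom formats like
// { "apis": [...] } fail, mirroring the scanner behaviour noted above.
function looksLikeLinkset(doc) {
  return (
    Array.isArray(doc.linkset) &&
    doc.linkset.length > 0 &&
    doc.linkset.every((ctx) => typeof ctx.anchor === "string")
  );
}
```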

5. static/.well-known/mcp/server-card.json — MCP Server Card

The MCP Server Card (SEP-1649) describes your site to agents that understand the Model Context Protocol. The critical required field is serverInfo.name: omitting it, or putting name at the top level (outside serverInfo), fails the check.

{
  "serverInfo": {
    "name": "YOUR SITE NAME",
    "version": "1.0.0"
  },
  "description": "One paragraph describing what the site is.",
  "url": "https://YOUR-DOMAIN",
  "transport": {
    "type": "http",
    "endpoint": "https://YOUR-DOMAIN/.well-known/agent-skills/index.json"
  },
  "capabilities": {
    "resources": true,
    "tools": false,
    "prompts": false
  },
  "resources": [
    {
      "name": "Search Index",
      "uri": "https://YOUR-DOMAIN/search.json",
      "description": "Full-text search index of all site content (JSON)",
      "mimeType": "application/json"
    },
    {
      "name": "LLM Index",
      "uri": "https://YOUR-DOMAIN/llms.txt",
      "description": "Structured Markdown overview for LLM ingestion",
      "mimeType": "text/markdown"
    }
  ],
  "content_policy": {
    "ai_inference": "allow",
    "ai_search": "allow",
    "ai_training": "disallow",
    "attribution_required": true
  }
}
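The same idea applies to the server card: the one structural mistake the check catches is a missing or misplaced name. A quick guard (my own helper):

```javascript
// Valid only when `name` lives under `serverInfo`; a top-level `name`
// alone does not count.
function hasServerInfoName(card) {
  return !!(card.serverInfo && typeof card.serverInfo.name === "string");
}
```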

6. static/.well-known/agent-skills/index.json — Agent Skills Index

This file tells agents what they can actually do with your site. Each skill needs name, description, endpoint, method, response_type, and usage.

{
  "schema_version": "1.0",
  "site": "https://YOUR-DOMAIN",
  "skills": [
    {
      "id": "search-content",
      "name": "Search Site Content",
      "description": "Search all content using a full-text index.",
      "endpoint": "https://YOUR-DOMAIN/search.json",
      "method": "GET",
      "response_type": "application/json",
      "usage": "Fetch the JSON array. Each entry has: uri, title, content (up to 1000 chars), description, tags. Filter client-side."
    },
    {
      "id": "browse-feed",
      "name": "Browse Recent Updates",
      "description": "Get the latest content updates in RSS format.",
      "endpoint": "https://YOUR-DOMAIN/feed.xml",
      "method": "GET",
      "response_type": "application/rss+xml",
      "usage": "Standard RSS 2.0. Parse <item> elements for title, link, description, pubDate."
    },
    {
      "id": "list-pages",
      "name": "List All Pages",
      "description": "Full list of all pages with last-modified dates.",
      "endpoint": "https://YOUR-DOMAIN/sitemap.xml",
      "method": "GET",
      "response_type": "application/xml",
      "usage": "XML sitemap. Each <url> has <loc> and <lastmod>."
    },
    {
      "id": "read-llms-index",
      "name": "Read LLM Content Index",
      "description": "Structured Markdown overview of the site for LLM comprehension.",
      "endpoint": "https://YOUR-DOMAIN/llms.txt",
      "method": "GET",
      "response_type": "text/markdown",
      "usage": "Fetch and parse as Markdown. Contains site description, section links, and usage guidelines."
    }
  ]
}
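A skills index is only useful if every entry carries the fields agents rely on, so a lint pass before deploying is cheap insurance. A sketch (field list taken from the requirement above; the helper name is mine):

```javascript
// Report any skill missing one of the six required fields.
const REQUIRED_FIELDS = [
  "name", "description", "endpoint", "method", "response_type", "usage",
];

function missingSkillFields(index) {
  const problems = [];
  for (const skill of index.skills || []) {
    for (const field of REQUIRED_FIELDS) {
      if (!(field in skill)) {
        problems.push(`${skill.id || "unknown"}: missing ${field}`);
      }
    }
  }
  return problems;
}
```

An empty array means every skill is complete; anything else names the offending skill and field.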

7. Head Meta Tags + WebMCP — layouts/.../head-end.html

Your Hugo theme determines the exact path: look for whatever partial handles the end of <head>, often layouts/partials/custom/head-end.html or layouts/_partials/head/extra.html, and add the following at the top.

Agent discovery <link> and <meta> tags:

<!-- Agent Discoverability -->
<link rel="alternate" type="text/markdown" title="LLM-friendly content index" href="{{ "llms.txt" | absURL }}" />
<link rel="sitemap" type="application/xml" title="Sitemap" href="{{ "sitemap.xml" | absURL }}" />
<link rel="api-catalog" href="{{ "/.well-known/api-catalog" | absURL }}" />
<meta name="robots" content="index, follow" />
<meta name="ai-inference" content="allow" />
<meta name="ai-search" content="allow" />
<meta name="ai-training" content="disallow" />

WebMCP script — exposes site tools to agents via the browser’s navigator.modelContext API:

<!-- WebMCP: expose site tools to AI agents -->
<script>
  (function () {
    if (!("modelContext" in navigator)) return;

    navigator.modelContext.provideContext({
      tools: [
        {
          name: "search_SITENAME",
          description:
            "Search site content. Returns matching pages with title, URL, and snippet.",
          inputSchema: {
            type: "object",
            properties: {
              query: { type: "string", description: "Search term" },
            },
            required: ["query"],
          },
          execute: async function ({ query }) {
            const res = await fetch("/search.json");
            const data = await res.json();
            const q = query.toLowerCase();
            return data
              .filter(function (item) {
                return (
                  (item.title && item.title.toLowerCase().includes(q)) ||
                  (item.content && item.content.toLowerCase().includes(q)) ||
                  (item.description &&
                    item.description.toLowerCase().includes(q)) ||
                  (item.tags &&
                    item.tags.some(function (t) {
                      return t.toLowerCase().includes(q);
                    }))
                );
              })
              .slice(0, 10)
              .map(function (item) {
                return {
                  title: item.title,
                  url: "https://YOUR-DOMAIN" + item.uri,
                  description:
                    item.description ||
                    (item.content ? item.content.slice(0, 200) + "…" : ""),
                  tags: item.tags || [],
                };
              });
          },
        },
        {
          name: "get_recent_updates",
          description: "Get recently updated pages, sorted by date.",
          inputSchema: {
            type: "object",
            properties: {
              limit: {
                type: "number",
                description: "Max results (default 10)",
              },
            },
          },
          execute: async function ({ limit }) {
            const res = await fetch("/feed.xml");
            const text = await res.text();
            const parser = new DOMParser();
            const doc = parser.parseFromString(text, "application/xml");
            return Array.from(doc.querySelectorAll("item"))
              .slice(0, limit || 10)
              .map(function (item) {
                return {
                  title: item.querySelector("title")?.textContent || "",
                  url: item.querySelector("link")?.textContent || "",
                  date: item.querySelector("pubDate")?.textContent || "",
                };
              });
          },
        },
      ],
    });
  })();
</script>

The if (!("modelContext" in navigator)) return; guard makes this a complete no-op in browsers that don’t support WebMCP, so there is effectively no performance impact for regular visitors.


8. functions/_middleware.js — Markdown for Agents

This is the most impactful single change. When any request arrives with an Accept: text/markdown header, the middleware intercepts it, strips the chrome (nav, header, footer, scripts, cookie banners), converts the main content to clean Markdown, and returns Content-Type: text/markdown with an x-markdown-tokens token-count estimate.

Browsers never send Accept: text/markdown, so they receive normal HTML responses. This is completely transparent to human visitors.

Create the file at functions/_middleware.js in the root of your Hugo project (alongside content/, static/, etc., not inside public/). Cloudflare Pages picks it up automatically on the next deploy.

/**
 * Cloudflare Pages Middleware — Markdown Content Negotiation
 *
 * When a request includes `Accept: text/markdown`, fetches the HTML page,
 * strips chrome (nav, header, footer, scripts, styles, cookie banners),
 * converts the main content to clean Markdown, and returns it with
 * Content-Type: text/markdown and x-markdown-tokens headers.
 *
 * Browsers receive normal HTML (no Accept: text/markdown header sent).
 *
 * Spec: https://llmstxt.org/
 * Docs: https://developers.cloudflare.com/fundamentals/reference/markdown-for-agents/
 */

export async function onRequest(context) {
  const { request, next } = context;

  const accept = request.headers.get("Accept") || "";
  if (!accept.includes("text/markdown")) {
    return next();
  }

  const method = request.method;
  if (method !== "GET" && method !== "HEAD") {
    return next();
  }

  const url = new URL(request.url);
  const path = url.pathname;

  // Skip assets
  if (
    /\.(css|js|png|jpg|jpeg|gif|svg|webp|ico|woff2?|ttf|eot|json|xml|txt|pdf)$/i.test(
      path,
    )
  ) {
    return next();
  }

  const response = await next();

  const contentType = response.headers.get("Content-Type") || "";
  if (!contentType.includes("text/html")) {
    return response;
  }

  const html = await response.text();
  const markdown = htmlToMarkdown(html, url.href);
  const tokenEstimate = Math.ceil(markdown.length / 4);

  return new Response(markdown, {
    status: response.status,
    headers: {
      "Content-Type": "text/markdown; charset=utf-8",
      "x-markdown-tokens": String(tokenEstimate),
      "Cache-Control": "no-store",
      Vary: "Accept",
    },
  });
}

function htmlToMarkdown(html, pageUrl) {
  html = stripElements(html, [
    "script",
    "style",
    "noscript",
    "nav",
    "header",
    "footer",
    "aside",
    /<div[^>]*(?:cookie|consent|gdpr|banner|notice|overlay|modal)[^>]*>[\s\S]*?<\/div>/gi,
  ]);

  const main =
    extractTag(html, "main") ||
    extractTag(html, "article") ||
    extractTag(html, "body") ||
    html;

  const titleMatch = html.match(/<title[^>]*>([\s\S]*?)<\/title>/i);
  const pageTitle = titleMatch ? decodeEntities(titleMatch[1].trim()) : "";

  let md = convertToMarkdown(main);

  const header = [
    `<!-- Source: ${pageUrl} -->`,
    pageTitle ? `# ${pageTitle}\n` : "",
  ]
    .filter(Boolean)
    .join("\n");

  md = header + "\n\n" + md;
  md = md.replace(/\n{3,}/g, "\n\n").trim();

  return md;
}

function stripElements(html, targets) {
  for (const target of targets) {
    if (typeof target === "string") {
      const re = new RegExp(
        `<${target}(\\s[^>]*)?>([\\s\\S]*?)<\\/${target}>`,
        "gi",
      );
      html = html.replace(re, "");
    } else if (target instanceof RegExp) {
      html = html.replace(target, "");
    }
  }
  return html;
}

function extractTag(html, tag) {
  const re = new RegExp(`<${tag}(?:\\s[^>]*)?>([\\s\\S]*?)<\\/${tag}>`, "i");
  const m = html.match(re);
  return m ? m[1] : null;
}

function convertToMarkdown(html) {
  let md = html;

  md = md.replace(/<h1[^>]*>([\s\S]*?)<\/h1>/gi, (_, t) => `\n# ${clean(t)}\n`);
  md = md.replace(
    /<h2[^>]*>([\s\S]*?)<\/h2>/gi,
    (_, t) => `\n## ${clean(t)}\n`,
  );
  md = md.replace(
    /<h3[^>]*>([\s\S]*?)<\/h3>/gi,
    (_, t) => `\n### ${clean(t)}\n`,
  );
  md = md.replace(
    /<h4[^>]*>([\s\S]*?)<\/h4>/gi,
    (_, t) => `\n#### ${clean(t)}\n`,
  );
  md = md.replace(
    /<h5[^>]*>([\s\S]*?)<\/h5>/gi,
    (_, t) => `\n##### ${clean(t)}\n`,
  );
  md = md.replace(
    /<h6[^>]*>([\s\S]*?)<\/h6>/gi,
    (_, t) => `\n###### ${clean(t)}\n`,
  );

  md = md.replace(
    /<blockquote[^>]*>([\s\S]*?)<\/blockquote>/gi,
    (_, t) =>
      clean(t)
        .split("\n")
        .map((l) => `> ${l}`)
        .join("\n") + "\n",
  );

  md = md.replace(
    /<pre[^>]*><code(?:\s[^>]*)?>([\s\S]*?)<\/code><\/pre>/gi,
    (_, t) => `\n\`\`\`\n${decodeEntities(t)}\n\`\`\`\n`,
  );
  md = md.replace(
    /<code[^>]*>([\s\S]*?)<\/code>/gi,
    (_, t) => `\`${decodeEntities(t)}\``,
  );

  md = md.replace(
    /<(strong|b)[^>]*>([\s\S]*?)<\/(strong|b)>/gi,
    (_, _t, t) => `**${clean(t)}**`,
  );
  md = md.replace(
    /<(em|i)[^>]*>([\s\S]*?)<\/(em|i)>/gi,
    (_, _t, t) => `_${clean(t)}_`,
  );

  md = md.replace(
    /<a\s[^>]*href="([^"]*)"[^>]*>([\s\S]*?)<\/a>/gi,
    (_, href, text) => {
      const t = clean(text);
      return t ? `[${t}](${href})` : "";
    },
  );

  md = md.replace(/<img\s[^>]*alt="([^"]*)"[^>]*\/?>/gi, (_, alt) =>
    alt ? `_[Image: ${alt}]_` : "",
  );

  md = md.replace(/<hr[^>]*\/?>/gi, "\n---\n");
  md = md.replace(/<table[^>]*>([\s\S]*?)<\/table>/gi, convertTable);
  md = md.replace(
    /<ul[^>]*>([\s\S]*?)<\/ul>/gi,
    (_, inner) => convertList(inner, false) + "\n",
  );
  md = md.replace(
    /<ol[^>]*>([\s\S]*?)<\/ol>/gi,
    (_, inner) => convertList(inner, true) + "\n",
  );
  md = md.replace(/<li[^>]*>([\s\S]*?)<\/li>/gi, (_, t) => `- ${clean(t)}\n`);

  md = md.replace(/<p[^>]*>([\s\S]*?)<\/p>/gi, (_, t) => {
    const text = clean(t);
    return text ? `\n${text}\n` : "";
  });
  md = md.replace(/<div[^>]*>([\s\S]*?)<\/div>/gi, (_, t) => {
    const text = clean(t);
    return text ? `\n${text}\n` : "";
  });

  md = md.replace(/<br\s*\/?>/gi, "\n");
  md = md.replace(/<[^>]+>/g, "");
  md = decodeEntities(md);

  return md;
}

function convertList(inner, ordered) {
  const items = [];
  const re = /<li[^>]*>([\s\S]*?)<\/li>/gi;
  let i = 1,
    m;
  while ((m = re.exec(inner)) !== null) {
    const text = clean(m[1]);
    if (text) {
      items.push(ordered ? `${i}. ${text}` : `- ${text}`);
      i++;
    }
  }
  return items.join("\n");
}

function convertTable(_, inner) {
  const rows = [];
  const rowRe = /<tr[^>]*>([\s\S]*?)<\/tr>/gi;
  let rm;
  while ((rm = rowRe.exec(inner)) !== null) {
    const cells = [];
    const cellRe = /<t[hd][^>]*>([\s\S]*?)<\/t[hd]>/gi;
    let cm;
    while ((cm = cellRe.exec(rm[1])) !== null) {
      cells.push(clean(cm[1]).replace(/\|/g, "\\|"));
    }
    rows.push(`| ${cells.join(" | ")} |`);
  }
  if (rows.length === 0) return "";
  const cols = (rows[0].match(/\|/g) || []).length - 1;
  rows.splice(1, 0, `| ${Array(cols).fill("---").join(" | ")} |`);
  return "\n" + rows.join("\n") + "\n";
}

function clean(html) {
  return html
    .replace(/<[^>]+>/g, " ")
    .replace(/\s+/g, " ")
    .trim();
}

function decodeEntities(str) {
  return str
    .replace(/&amp;/g, "&")
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&quot;/g, '"')
    .replace(/&#39;/g, "'")
    .replace(/&apos;/g, "'")
    .replace(/&nbsp;/g, " ")
    .replace(/&#(\d+);/g, (_, n) => String.fromCharCode(parseInt(n, 10)))
    .replace(/&#x([0-9a-f]+);/gi, (_, h) =>
      String.fromCharCode(parseInt(h, 16)),
    );
}

How to test it locally:

curl -H "Accept: text/markdown" https://YOUR-DOMAIN/

A correct response returns Content-Type: text/markdown with clean Markdown: no HTML tags, no <div>, no nav, no scripts.
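If you want to script that check instead of eyeballing curl output, a tiny predicate works (my own heuristic, not part of the scanner):

```javascript
// True when a response looks like the middleware did its job:
// a markdown content-type and no HTML tags left in the body.
// The `<!-- Source: … -->` comment the middleware prepends is allowed,
// since it does not start with a letter after the angle bracket.
function looksLikeMarkdownResponse(contentType, body) {
  return (
    contentType.startsWith("text/markdown") &&
    !/<\/?[a-z][^>]*>/i.test(body)
  );
}
```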

Alternative: On Cloudflare Pro/Business plans, you can skip the Worker entirely. Go to Cloudflare Dashboard → AI Crawl Control → enable “Markdown for Agents”. Cloudflare handles the conversion automatically.


📂 Folder Structure After All Changes

static/
  _headers                           ← HTTP Link headers + content-types
  robots.txt                         ← AI bot rules + Content Signals
  llms.txt                           ← LLM content index
  .well-known/
    api-catalog                      ← RFC 9727 linkset catalog
    agent-skills/
      index.json                     ← Agent skills index
    mcp/
      server-card.json               ← MCP Server Card (SEP-1649)

layouts/
  ...head-end.html                   ← Agent meta tags + WebMCP JS

functions/
  _middleware.js                     ← Markdown content negotiation

📊 Results: 8 → 83

Before: a score of 8. The site had a robots.txt and a basic llms.txt, both of which the scanner found too thin to pass.

After applying all eight steps: 83.

The remaining gap to 100 comes from checks that are genuinely not applicable to a read-only static site:

| Check | Why it’s N/A |
| --- | --- |
| OAuth / OIDC discovery | No protected APIs |
| OAuth Protected Resource | No authenticated endpoints |
| Web Bot Auth | Requires key-signing infrastructure |
| Commerce protocols (x402, UCP, ACP) | E-commerce only |

These aren’t failures; they’re correct absences. A static blog has no business publishing /.well-known/openid-configuration, and the scanner knows it: these checks don’t count against the score.


✅ Quick Checklist

  • robots.txt has explicit AI bot entries (GPTBot, Claude-Web, Google-Extended, Amazonbot, Bytespider, CCBot, Applebot-Extended, anthropic-ai)
  • Content-Signal directive is inside the User-agent: * block, not floating at the end of the file
  • llms.txt has site description, section structure, machine-readable resource links, and usage guidelines
  • _headers injects Link: headers and application/linkset+json content-type for .well-known/api-catalog
  • .well-known/api-catalog uses { "linkset": [...] } RFC 9264 format
  • .well-known/mcp/server-card.json has serverInfo.name (not just name at top level)
  • .well-known/agent-skills/index.json lists all available skills
  • Head partial has agent discovery <link> and <meta> tags + WebMCP script
  • functions/_middleware.js exists and handles Accept: text/markdown

🔗 Resources


Crepi il lupo! 🐺