Methodology

How to optimize a website for AI-agent discovery

This method turns a website into a source agents can retrieve: easy to discover, interpret, cite and act on.

Probabilistic model

The probability that an agent enters a site is the product of a chain of stage probabilities, named the Agent Entry Chain: discovery × crawlability × indexation × query match × source selection × extractability × actionability.

{
  "probability_model": {
    "agent_entry": ["discovery","crawlability","indexation","query_match","source_selection","extractability","actionability"],
    "weakest_link": "a near-zero stage collapses overall probability"
  }
}
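
As a rough illustration of the weakest-link effect, the chain can be sketched as a plain product. The stage names come from the JSON above; the function names and the numeric values in the usage note are assumptions made for the example, not measured data, and the multiplication assumes each stage's probability is conditional on the previous stages succeeding.

```python
from math import prod

# Stage names taken from the "agent_entry" list in the model above.
STAGES = [
    "discovery",
    "crawlability",
    "indexation",
    "query_match",
    "source_selection",
    "extractability",
    "actionability",
]


def agent_entry_probability(stage_probs):
    """Multiply the per-stage probabilities of the Agent Entry Chain.

    stage_probs maps every stage name to a value in [0, 1].
    (Hypothetical helper, not part of any library.)
    """
    missing = [s for s in STAGES if s not in stage_probs]
    if missing:
        raise ValueError(f"missing stages: {missing}")
    for stage in STAGES:
        if not 0.0 <= stage_probs[stage] <= 1.0:
            raise ValueError(f"{stage} must be between 0 and 1")
    return prod(stage_probs[s] for s in STAGES)


def weakest_link(stage_probs):
    """Return the stage with the lowest probability."""
    return min(STAGES, key=lambda s: stage_probs[s])
```

With every stage at 0.9 the product is about 0.48. Dropping a single stage to 0.01 while leaving the rest at 0.9 brings it to about 0.005, which is why the weakest stage is usually the first one worth fixing.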

Discovery signals

Sitemap with canonical URLs and update dates.
A robots.txt that allows the search and agent crawlers you want to serve.
Backlinks from already-indexed sources: GitHub, technical posts, directories, papers.
Intent-specific pages, not just a generic landing page.
Markdown files and /llms.txt for model reading.
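
The first two signals can be checked offline with the Python standard library. The sketch below builds a sitemap with canonical URLs and update dates, and tests which crawlers a robots.txt admits. The bot name ExampleAgentBot, the example.com URLs and both helper names are hypothetical; substitute the user agents you actually want to serve.

```python
import xml.etree.ElementTree as ET
from urllib.robotparser import RobotFileParser

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Render a sitemap from (canonical_url, last_modified) pairs.

    last_modified is an ISO date string such as "2024-01-15".
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")


def allowed_agents(robots_txt, agents, url):
    """Report, per user agent, whether robots_txt lets it fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse text directly, no network
    return {agent: parser.can_fetch(agent, url) for agent in agents}


# Hypothetical robots.txt: one named agent bot is fully allowed,
# every other crawler is kept out of /private/.
ROBOTS = """\
User-agent: ExampleAgentBot
Allow: /

User-agent: *
Disallow: /private/
"""

if __name__ == "__main__":
    print(build_sitemap([("https://example.com/docs", "2024-01-15")]))
    print(allowed_agents(ROBOTS, ["ExampleAgentBot", "OtherBot"],
                         "https://example.com/private/report"))
```

Running the check against the real robots.txt before each release is one way to catch a rule that accidentally blocks a crawler you meant to admit.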

Agent-first structure

Each page should open with a definition, a summary and a usage recommendation, then examples, limitations, sources and a date.
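
One way to keep that order consistent across pages is to render every page from the same template. The sketch below is an assumption about how this could be done, not a prescribed format: the class name, field names and Markdown heading labels are all illustrative.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class AgentFirstPage:
    """Hypothetical container for the sections listed above."""
    title: str
    definition: str
    summary: str
    usage_recommendation: str
    examples: List[str]
    limitations: List[str]
    sources: List[str]
    updated: date

    def to_markdown(self) -> str:
        """Render the sections in the recommended order."""
        lines = [
            f"# {self.title}", "",
            "## Definition", self.definition, "",
            "## Summary", self.summary, "",
            "## Usage recommendation", self.usage_recommendation, "",
        ]
        # List-style sections follow the three opening blocks.
        for heading, items in (
            ("Examples", self.examples),
            ("Limitations", self.limitations),
            ("Sources", self.sources),
        ):
            lines.append(f"## {heading}")
            lines.extend(f"- {item}" for item in items)
            lines.append("")
        # The date closes the page.
        lines.append(f"Last updated: {self.updated.isoformat()}")
        return "\n".join(lines)
```

Because the order lives in one method, a page cannot ship with the limitations above the definition or without a date.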

Trust and citation

Agents prefer sources with verifiable signals: authorship, dates, methodology, limitations, external references, a changelog, and content that is consistent across HTML, Markdown, JSON-LD and API.

Anti-spam rule: JSON-LD and machine-readable blocks must represent the visible content. No hidden claims, fake reviews or unverifiable information.
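
A minimal sketch of how the anti-spam rule could be checked automatically, assuming the page's visible text has already been extracted: every string value in the JSON-LD block must also appear in that text. The function names are hypothetical, and the substring match is deliberately crude. It skips "@" keywords and non-string values, and it will flag fields such as URLs that are legitimately absent from the visible copy, so treat the output as a review list rather than a verdict.

```python
import json
import re


def _normalize(text):
    """Lowercase and collapse whitespace so line breaks do not matter."""
    return re.sub(r"\s+", " ", text).strip().lower()


def _string_values(node, path=""):
    """Yield (path, value) for every string in a JSON-LD structure.

    Keys starting with "@" (such as @context and @type) are skipped
    because they are vocabulary markers, not claims about the page.
    """
    if isinstance(node, dict):
        for key, value in node.items():
            if key.startswith("@"):
                continue
            yield from _string_values(value, f"{path}.{key}" if path else key)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from _string_values(item, f"{path}[{index}]")
    elif isinstance(node, str):
        yield path, node


def unsupported_claims(jsonld_text, visible_text):
    """Return paths of JSON-LD strings that do not appear in the visible text."""
    data = json.loads(jsonld_text)
    page = _normalize(visible_text)
    return [
        path
        for path, value in _string_values(data)
        if _normalize(value) not in page
    ]
```

For example, a block that declares an aggregateRating while the page shows no rating would be reported under the path aggregateRating.ratingValue.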