---
name: agent-readiness
description: Use to ASSESS, HARDEN, SCAFFOLD, or DIAGNOSE a repository's agent-readiness for coding-agent harnesses such as Claude Code, Cursor, Codex, Copilot, Windsurf, and Aider. Covers AGENTS.md, SKILL.md, and MCP scaffolding, PreToolUse and other hooks, CI gates, sandbox and approval design, and repo-level agent failure modes; harness-agnostic and portable-first (AGENTS.md, MCP, OpenTelemetry); ships an opinionated checklist grounded in published research. Triggers on 'make this repo work with Claude Code / Cursor / Codex', 'scaffold our AGENTS.md', 'add an agent hook or gate', 'why did the agent trip on our repo', 'score our agent-readiness'. Do NOT use to design agent-facing products like SDKs or llms.txt docs (use agent-experience), to promote recurring failures into rules from a feedback signal (use agent-rules), or to instrument an AI product's eval loops (use agent-evals).
license: MIT
---

# Agent Readiness

Assess, harden, scaffold, and diagnose a repository's agent-readiness.
Harness-agnostic; portable-first (AGENTS.md, SKILL.md, MCP, OpenTelemetry).
Grounding sources live in `skill.json`; this file is runtime routing only.

**Produces:** intent-specific report — `assess-report.md` / `harden-recommendation.md` / `scaffold-bundle.md` / `diagnose-runbook.md`; tracked assessments also emit `agent-readiness-findings-ledger-<date>-<slug>.md` + `agent-readiness-workflow-state-<date>-<slug>.json`. Rendered in chat as fixed-width text (no Markdown pipe tables).

## Core principle

**Make the repository legible to a stateless, context-window-bound LLM that must rediscover the project on every turn** — through deterministic structure, hand-curated instructions (never autogenerated), executable boundaries, and replayable feedback loops. Three load-bearing claims surface in every playbook:

1. **Don't `/init`.** Hand-curate from observed failures.
2. **Token budget is the dominant scarcity.** Prefer on-demand → on-trigger → always-loaded.
3. **Hard gates over soft prose.** Hooks at 100% vs prose at 70%.

This skill is the **repo-hardening arm** of the `agent-experience` discipline — use `agent-experience` for the cross-cutting AX framing and for AI/Agent SDK or agent-readable-docs *review*; it routes here to harden a repo.

## Quickstart

The 90% case is `assess` × the `legibility` layer — "are AGENTS.md, specs, and the docs-index doing their jobs?" Run it like this:

> `Assess the legibility layer of this repo.`

The skill loads `references/intent-router.csv` → `assess`, scopes to the three `legibility` surfaces (instruction-surface, specs, docs-index), dispatches the four pre-write lenses against each surface, scores the layer 1–5 (layer score = min across assessed surfaces, per the maturity rubric), and renders the assess report in chat by default. Ask "save the report" to also write the rendered output to a Markdown file (formatted per `templates/assess-report.md`, saved under `docs/audits/` or a user-specified path — never into `templates/`). With 7+ findings or any severity 3–4, the ledger + workflow-state pair under `docs/audits/` is written automatically (filenames in Workflow step 10).

For `harden`, `scaffold`, or `diagnose`, pick the intent first; the routing taxonomy below explains the surface choices for each. **Scope** (which repos this skill fits, why, and when to pair with `agent-rules`) lives at [`references/scope.md`](./references/scope.md).

## Activation

- **Bare invocation** (`"agentify this repo"`, `"agent-readiness audit"`, `"use agent-readiness"`): load `references/intent-router.csv`, show the intent menu, wait. No file inspection, no network calls, no writes.
- **Concrete invocation** with intent and surface inferable: skip to step 3 of the workflow.
- **Concrete invocation with ambiguous scope**: ask one blocker question identifying the intent or surface; do not inspect private systems first.

## Workflow

1. **Pick intent.** Load `references/intent-router.csv`. Match the prompt to `assess` / `harden` / `scaffold` / `diagnose`. Ambiguous → ask once.
2. **Pick surface(s).** Load `references/surface-router.csv`. Match the prompt to one or more surfaces. For `assess`, `all` is a valid surface choice that fans out across the whole surface list. Ambiguous → ask once with the menu.
3. **Load grounded context.** Load the playbook(s) for the chosen surfaces, `references/empirical-warnings.md`, and **`references/core/severity-rubric.md` always** (every finding carries a severity per step 7). For `assess`, also load `references/core/maturity-rubric.md` (the `additional_rubric` column in `intent-router.csv` names the assess-only addition). Project-agentification playbooks are larger than the DX/test playbooks because they include cross-harness implementation tables and scaffold template pointers; static checks cap and structure them so they remain bounded, not free-form knowledge dumps.
4. **For `scaffold`: collect project knowledge.** Ask the user for (a) tech stack and runtimes, (b) repo layout / monorepo scope (which directories the scaffold applies to), (c) build / test / lint commands, (d) any top-level invariants the agent should not break. Hand-curate from project knowledge; do not autogenerate from boilerplate (W9: `/init` and equivalents produce surface-plausible scaffolds with low fitness). The content quality bar is "specific to this project, verifiable by reading the file" — not "measured against a benchmark."
4.5. **For `scaffold` / `harden` touching `instruction-surface` or `gates`: collect harness inventory.** Ask the user which harnesses are in use on this repo: Claude Code, Cursor, Codex, Copilot, Aider, Windsurf, or AGENTS.md-compatible-only (Jules, Amp, etc.). **Do not infer scope from filesystem signals alone** — the presence of `.claude/` tells you Claude Code is used, not that it is the only one. Default to producing per-harness equivalents for every harness named; absence-of-dotfile is not absence-of-use.
5. **Spawn lens sub-agents in parallel.** Load `references/lenses.md`. Dispatch four agents — cold-context-agent / maintainer / adversarial / auditor — each loading the playbook(s) and applying its lens prompt. For `assess + all` invert the topology: one agent per surface, four lenses sequentially inside.
6. **Apply the playbook.** Use heuristics tagged for the chosen intent. For `assess`, score 1–5 per layer using the maturity rubric (layer score = min across assessed surfaces). For `harden` / `diagnose` / `scaffold`, rank hypotheses or recommendations before naming actions.
7. **Apply severity and IDs** to every finding using
   `references/core/severity-rubric.md` and
   `references/trackable-findings.md`. Assessment findings use stable IDs
   like `AG-<surface>-NNN`.
8. **For `scaffold`: present write-to-disk confirmation.** List `{path, action}` pairs; wait for user reply (`all` / `none` / specific list). Write only confirmed files. Report what was written, skipped, and what to validate.
8.5. **For `scaffold`: post-write audit.** After writes complete, dispatch **one fresh-context sub-agent** (see `references/lenses.md` §Post-write auditor) that loads the chosen surface playbook(s) and inspects the actual diff/repo state. It enumerates every `harden` heuristic in those playbooks and reports each as `applied | skipped-because-X | deferred`. External verification, not self-attestation: the writer cannot rubber-stamp its own checklist because the auditor does not see the writer's reasoning. Surface unapplied harden heuristics to the user as a follow-up list before claiming the scaffold is done.
9. **Emit output.** Default to a rendered, TUI-friendly view in chat. If the user asks to save a report to disk, also emit a saved-report form.
   - **Rendered (chat/TUI):** do **not** emit Markdown pipe tables. Pretty-print tabular sections as fixed-width text (ASCII) in a fenced `text` block or as bullet lists. Keep each row single-line.
   - **Saved report (Markdown file):** use the intent-specific template verbatim (Markdown tables are OK, but table rows must be single-line; no hard-wrapped newlines inside cells):
     - `assess` → `templates/assess-report.md`
     - `harden` → `templates/harden-recommendation.md`
     - `scaffold` → `templates/scaffold-bundle.md`
     - `diagnose` → `templates/diagnose-runbook.md`
10. **Create tracking state / closeout.** For `assess` outputs with 7+
    findings, any severity 3–4, or a save/track/workflow-state/closeout
    request, load `references/trackable-findings.md` and write both artifacts
    now: Markdown ledger at
    `docs/audits/agent-readiness-findings-ledger-<YYYY-MM-DD>-<scope-slug>.md`
    and workflow state at
    `docs/audits/agent-readiness-workflow-state-<YYYY-MM-DD>-<scope-slug>.json`.
    If the target is not a repo or `docs/audits/` is not writable, use
    `audit-artifacts/agent-readiness-{findings-ledger|workflow-state}-<YYYY-MM-DD>-<scope-slug>.{md|json}`.
    Report both paths. Never create roadmaps, external issues, or edit
    non-tracking project files without confirmation. Check boxes only after the
    finding's verification rule passes.

## Modes

Three shared modes — Guided Draft (default), Autopilot, Grill Me — set
depth-vs-speed up front. Canonical contract in
[`references/modes.md`](./references/modes.md) (sourced from
`skills/_shared/modes.md`). Project-agentification specifics:

- **Guided Draft (default):** one optionized question at a time, 3–4 likely choices plus a freeform path.
- **Autopilot:** proceed from available context; state assumptions when the task is clear and low-risk.
- **Grill Me:** open-ended questions, one at a time, when audience, constraints, or trade-offs materially change the result.

## Output requirements

Every output includes:

- Intent + surface(s) + lenses dispatched.
- Playbook(s) applied.
- Intent-specific load-bearing section: maturity scores + gaps (`assess`), recommendation + verification (`harden`), file previews + confirmation gate (`scaffold`), hypothesis ranking + fix + prevention (`diagnose`).
- Severity per finding.
- Stable finding IDs for assessment findings.
- Sources cited (skill.json `inspired_by` entries).
- If printing to chat/TUI, avoid Markdown pipe tables; pretty-print as fixed-width text or bullet lists. If writing a Markdown file, keep table rows single-line (no wrapped newlines inside cells).

## Lens dispatch

Independent perspectives anchor on different concerns and catch issues a single pass misses. **Four pre-write lenses, dispatched in parallel at step 5:**

- **cold-context-agent** — does a fresh-context LLM trip on this?
- **maintainer** — is the agent surface itself drifting?
- **adversarial** — injection, exfiltration, tool abuse, sandbox escape?
- **auditor** — can a human reconstruct what the agent did and why?

**One post-write dispatch, at step 8.5 (`scaffold` only):**

- **post-write auditor** — fresh-context sub-agent that compares the chosen playbook(s) to the actual diff and reports every unapplied `harden` heuristic. Catches execution misses the pre-write lenses cannot see because nothing was on disk yet.

Load `references/lenses.md` for the per-lens persona prompts, the dispatch template, and the synthesis step. Hosts without a delegation primitive: run the lenses sequentially in the same head, switching role between passes. Skip dispatch for tiny copy edits, deterministic command checks, or tasks requiring privileged context.

## Empirical warnings

Cross-cutting warnings W2–W10 live in `references/empirical-warnings.md` (symlink to `skills/_shared/empirical-warnings.md`). Every playbook backlinks to the warnings it touches; every template invokes them in its "Empirical warnings invoked" section. The most load-bearing four for this skill's audience:

- **W2** — AGENTS.md ≤ 200 lines.
- **W3** — Hard gates over soft prose.
- **W6** — Token budget is the dominant scarcity.
- **W9** — Auto-init commands lie; hand-curate from project knowledge.

> **W1** ("≥3 observed failures before scaffolding") is the failure-driven floor and lives in `agent-rules`. It does not apply to `agent-readiness` — most repos have no feedback signal to observe failures against. If your repo does, pair the two skills.

## Reference map

- `references/intent-router.csv` — level-1 router (intent → template + rubric).
- `references/surface-router.csv` — level-2 router (surface → playbook). CSV columns retain `layer,sub_surface` for grouping.
- `references/playbooks/<surface>.md` — 11 surface playbooks (legibility: instruction-surface, specs, docs-index; action: skills, tools, sandbox; control: gates, ci-runners, telemetry, evals, governance).
- `references/lenses.md` — four lens prompts + dispatch template + synthesis (symlink to `skills/_shared/lenses.md`).
- `references/core/maturity-rubric.md` — Levels 1–3 (Engineering Agents). Levels 4–5 live in `agent-rules`.
- `references/core/severity-rubric.md` — 0–4 severity scale.
- `references/empirical-warnings.md` — W2–W10 (symlink to `skills/_shared/empirical-warnings.md`). W1 lives in `agent-rules`.
- `references/agent-friendly-architecture.md` — shared note on repo structure graphs + boundary enforcement; points to `architecture-design` (and `architecture-audit`) for boundary design.
- `references/trackable-findings.md` — shared ledger, roadmap, GitHub issue, workflow-state, and verification closeout workflow.
- `references/modes.md` — Guided Draft / Autopilot / Grill Me contract (symlink to `skills/_shared/modes.md`).
- `references/starter-scenarios.csv` — named worked examples for bare invocation.
- `templates/*.md` — four intent-specific output templates (what the skill itself emits).
- `templates/{findings-ledger,roadmap,github-issue}.md` and `templates/workflow-state.json` — shared follow-through artifacts for tracked findings.
- `templates/artifacts/<surface>/` — skeletons for the files `scaffold` writes to the target repo (AGENTS.md, hooks, CODEOWNERS, etc.). See `templates/artifacts/README.md`. Required: every scaffold-bundle Proposed-files row cites a template path; the post-write auditor enforces shape compliance.
- `evals/activation-cases.md` — activation and behavioral cases.
- `evals/run-static-checks.sh` — structural / schema gates run in CI.
- `evals/trigger-evals.json` — queries for the description-optimization loop.
- `skill.json` — provenance, grounding sources, version, status.
