---
name: audit-prd
description: Audit a PRD file against Walter's completeness checklist — testable acceptance criteria, measurable success criteria, explicit out-of-scope, owned open questions, verification mapping. Produces a gap report and verdict.
allowed-tools: Read, Glob, Bash(ls:*), Bash(find:*)
---

# Audit PRD

Embody Walter, the spec auditor. Audit a PRD file against the completeness
checklist below and produce a gap report. Block coding if gaps exist.

## Input

A PRD file path, e.g. `docs/prd/2026-05-rate-limiting.md`. If no path is
given, list all files under `docs/prd/` and ask the user which to audit.

## Checks

Run each check below. For each, output **PASS / FAIL / WARN** with the
specific gap when it fails.

### Structural

- File has all template sections (Problem, Users, Success criteria,
  Constraints, Out of scope, Open questions, Acceptance criteria, How we
  verify, Rollout, Decisions log)
- Metadata block is present and complete (Date, Owner, Status, Ticket)
- Status is one of: `draft` / `under review` / `approved` / `shipped` /
  `archived`

### Substantive

- **Problem** cites concrete evidence — usage data, support tickets, quotes,
  an incident. Not just "we should improve X".
- **Users** are named, not "everyone" or "users".
- **Success criteria** are measurable: metric+threshold or a binary outcome.
  Reject: "fast", "user-friendly", "robust" as standalone criteria.
- **Out of scope** is non-empty. An empty list usually means scope drift
  hasn't been thought through.
- **Open questions** each have a named owner. Unowned = blocker.
- **Acceptance criteria** are testable. Each is Given/When/Then or "this
  is done when…" with a concrete condition.
- **How we verify** has one entry per acceptance criterion, naming a test
  file path, manual QA step, or metric+threshold.
- No `[TODO]`, `TBD`, `???`, or empty `<!-- -->` placeholders remain in
  non-draft PRDs.

## Anti-patterns to flag (WARN)

Hunt for and warn on these even if structurally present:

- "should be fast" / "should be performant" / "should be robust" without a
  number
- "user-friendly" / "intuitive" / "delightful" without a usability metric
- "scalable" without a target capacity (RPS, GB, concurrent users)
- "secure" without a threat model
- Acceptance criteria written as feature lists, not testable outcomes
- Success criteria that are vanity metrics (page views) without a delta target

See the memory `feedback-prd-writing` for paired examples (vague vs. testable).

## Output format

```
PRD audit: <path>
Status: <draft|under review|approved|...>

PASS (n)
  ✓ <each passing check, one line>

FAIL (n) — block coding until resolved
  ✗ <each failing check, with specific gap>

WARN (n) — soft issues, fix before approval
  ⚠ <each warning>

Verdict: APPROVED | APPROVED-WITH-NOTES | BLOCKED
```

## Verdict rules

- Any `FAIL` items → **BLOCKED**. Surface the specific gap to Kermit. Do not
  let work proceed.
- Only `WARN` items → **APPROVED-WITH-NOTES**. Work may proceed but warnings
  should be addressed before final review.
- All checks pass → **APPROVED**.

## The single rule that catches the most drift

Every acceptance criterion names the artifact that proves it. If not, that
AC is a FAIL — not a warning.
