---
name: tidy-code
description: >-
  Invoked by /tidy-code. Reviews source code for structural quality violations
  (hidden dependencies, god functions, silent failures, deep nesting) across
  any language. Produces a read-only findings report with concrete refactoring
  suggestions.
allowed-tools: Read Write Grep Bash(scripts/scan-source-files.sh:*) TaskCreate TaskGet TaskUpdate TaskOutput TaskList
---

# tidy-code Review

Review source code against 10 language-agnostic structural quality principles and produce a findings report with concrete refactoring suggestions.

**Important: This skill produces a report. Do not modify any reviewed files.**

---

## Activation

This skill activates **only** when the user explicitly invokes it via the `/tidy-code` slash command. Do NOT auto-activate on natural-language requests such as "review my code," "audit the code," "clean this up," "find code smells," "make this more maintainable," or "reduce complexity" — those phrasings must not trigger this skill.

---

## Review Workflow

1. **Select files** — Use the user's specified files. If none specified, run `scripts/scan-source-files.sh <project-directory>` to discover source files. The `<project-directory>` argument is **required** — it must be the root of the user's project, NOT the skill's own directory.
2. **Load rules** — Read `references/principles-quick-ref.md` for the full checklist with detection signals and thresholds.
3. **Review files in parallel** — Spawn file-review sub-agents via the Task tool, using a fast, cheap model (e.g., Claude Haiku 4.5, Gemini 2.5 Flash) at medium effort. Batch files into groups of 3–5 per sub-agent, keeping related files (same module or directory) together so a sub-agent can detect cross-file violations within its batch. Run up to 5 sub-agents concurrently. For large repos (more than 50 app files), raise the batch size to as many as 10 files rather than exceeding 5 concurrent sub-agents, and dispatch batches in waves of 5 until all have run. Each sub-agent receives its file list, the principles from `references/principles-quick-ref.md`, and instructions to report findings in the Output Format below, loading detailed reference files on demand as it detects violations. If a sub-agent fails, log the error and continue; do not block the rest of the review.
4. **Collect and deduplicate findings** — Once all sub-agents complete, gather their findings. Remove exact duplicates, which can occur when batches share related files. Then check for cross-file violations that no single sub-agent could see (e.g., a dependency injected in one file but hardcoded in a file from a different batch).
5. **Classify severity** — Use `references/severity-rubric.md` to assign high/medium/low.
6. **Verify suggestions** — For each suggested rewrite, confirm it resolves the flagged violation, does not introduce a new violation of any other principle, and preserves the original behavior. If a suggestion introduces a new violation, revise it before including it.
7. **Assemble report** — Write findings to `.agents/tidy/code/tidy-code-findings-YYYYMMDD.md` (create the directory if it doesn't exist; use today's date). Group findings by file, then by severity (high first). End with the summary block.
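
The batching rules in the workflow above can be sketched as a small planning function. This is an illustrative sketch only, not part of the skill: the function name `plan_batches` and its parameters are hypothetical, and it uses a fixed batch size of 5 (the upper end of the 3–5 range), so the final batch may hold fewer than 3 files.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def plan_batches(app_files, small=5, large=10, large_repo_threshold=50, wave_size=5):
    """Group files by directory, split them into batches, and arrange batches into waves.

    Hypothetical helper: names and defaults mirror the workflow text, not real skill code.
    """
    # Large repos get bigger batches instead of more concurrent sub-agents.
    batch_size = large if len(app_files) > large_repo_threshold else small

    # Keep files from the same directory adjacent so they tend to share a batch.
    by_dir = defaultdict(list)
    for path in app_files:
        by_dir[str(PurePosixPath(path).parent)].append(path)
    ordered = [p for d in sorted(by_dir) for p in sorted(by_dir[d])]

    # Slice into batches, then into waves of at most `wave_size` concurrent batches.
    batches = [ordered[i:i + batch_size] for i in range(0, len(ordered), batch_size)]
    return [batches[i:i + wave_size] for i in range(0, len(batches), wave_size)]
```

Each inner list is one sub-agent's file list, and each outer list is one wave of concurrent sub-agents.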

---

## Gotchas

- The script outputs `--- test files ---` as a literal line in stdout — strip this separator before passing the file list to sub-agents and note which files are test files for the lighter-touch rules.
- If `scan-source-files.sh` returns an empty app-files section, abort with a user-facing message rather than spawning sub-agents with empty batches.
- Helper modules in `/tests/` that are not test files themselves should be reviewed as application code, not under the test light-touch rules. Test fixtures and factories are skipped entirely (see Scope Rules).
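
A minimal sketch of handling the separator and the empty-result case follows. It assumes the script prints one path per line, with app files before the `--- test files ---` line and test files after it; that ordering is an assumption, not something this document confirms. The function name `split_scan_output` is hypothetical.

```python
SEPARATOR = "--- test files ---"


def split_scan_output(stdout):
    """Return (app_files, test_files) parsed from the scanner's stdout.

    Assumes app files precede the separator and test files follow it.
    """
    app_files, test_files = [], []
    current = app_files
    for line in stdout.splitlines():
        line = line.strip()
        if not line:
            continue
        if line == SEPARATOR:
            # Drop the separator itself and start collecting test files.
            current = test_files
            continue
        current.append(line)
    if not app_files:
        # Abort early rather than spawning sub-agents with empty batches.
        raise ValueError("No application source files found; nothing to review.")
    return app_files, test_files
```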

---

## Model & Effort Guidance

This skill does not require frontier-class reasoning for typical codebases. The 10 principles have concrete detection signals and named refactorings that reduce the task to structured pattern matching.

- **Orchestration / deduplication:** use a mid-tier model (e.g., Claude Sonnet 4.5, Gemini 2.5 Pro) at high effort.
- **File-review sub-agents** (structured pattern matching against 10 named signals): use a fast, cheap model (e.g., Claude Haiku 4.5, Gemini 2.5 Flash) at medium effort.
- **Optional escalation** for very large or architecturally complex codebases: upgrade the orchestrator to a frontier reasoning model (e.g., Claude Opus 4, Gemini 2.5 Pro).

**Recommended optimization — two-pass sub-agent architecture:** For large codebases or when token efficiency matters, consider splitting file review into two passes: (1) a detection pass in which cheap-model sub-agents match the 10 detection signals and output a structured list of candidate violations, then (2) a refactor-suggestion pass in which a mid-tier model generates concrete rewrites only for confirmed violations. This limits expensive generation to the smaller set of confirmed findings. The optimization is optional; the workflow above does not require it.

---

## Output Format

Use this exact structure for each finding:

    ## [file path]

    ### Finding [N] — [Smell name] [ID] (severity: [high|medium|low])
    - **Line [N]:** `[original code snippet]`
    - **Principle:** [One-sentence explanation of the violated principle]
    - **Refactoring:** [Named refactoring technique]
    - **Suggested:**
      [concrete rewrite as a fenced code block]

**Example:**

    ## src/services/order_service.py

    ### Finding 1 — Hidden Dependency TC-02 (severity: high)
    - **Line 8:** `self.db = PostgresConnection("prod:5432")`
    - **Principle:** Dependencies created internally are invisible, untestable, and tightly coupled to a specific implementation.
    - **Refactoring:** Inject via constructor parameter
    - **Suggested:**
      ```python
      class OrderService:
          def __init__(self, db, mailer):
              self.db = db
              self.mailer = mailer
      ```

    ### Finding 2 — Nested Pyramid TC-03 (severity: medium)
    - **Line 34:** 3 levels of nesting in `process_order()`
    - **Principle:** Each nesting level forces the reader to maintain a mental stack. Guard clauses flatten the logic.
    - **Refactoring:** Replace Nested Conditional with Guard Clauses
    - **Suggested:**
      ```python
      def process_order(order):
          if not order:
              return None
          if not order.items:
              return None
          if not order.payment:
              raise ValueError("Missing payment")
          # happy path — no nesting
      ```

If a file has no findings, omit it from the report entirely.

End the report with:

    ## Summary
    - **Files reviewed:** [N]
    - **Total findings:** [N] ([N] high, [N] medium, [N] low)
    - **Top issues:** [List the 2-3 most frequent violations]
    - **Highest-leverage fix:** [The single change that would most improve the codebase]
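
The report path and summary block could be produced along these lines. This is a sketch under stated assumptions: the function names and the finding shape (a dict with `smell` and `severity` keys) are hypothetical, and the highest-leverage fix is passed in because it is a judgment call, not a computed value.

```python
from collections import Counter
from datetime import date


def report_path(today=None):
    """Build the dated findings path described in the workflow."""
    today = today or date.today()
    return f".agents/tidy/code/tidy-code-findings-{today:%Y%m%d}.md"


def render_summary(findings, files_reviewed, highest_leverage_fix):
    """Render the summary block; `findings` is a list of dicts (hypothetical shape)."""
    by_severity = Counter(f["severity"] for f in findings)
    # The three most frequent smell names, most common first.
    top = [name for name, _ in Counter(f["smell"] for f in findings).most_common(3)]
    return "\n".join([
        "## Summary",
        f"- **Files reviewed:** {files_reviewed}",
        f"- **Total findings:** {len(findings)} ({by_severity['high']} high, "
        f"{by_severity['medium']} medium, {by_severity['low']} low)",
        f"- **Top issues:** {', '.join(top)}",
        f"- **Highest-leverage fix:** {highest_leverage_fix}",
    ])
```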

---

## When to Load Reference Files

Load references on demand to conserve context:

| File | When to load |
|------|-------------|
| `references/principles-quick-ref.md` | Always — load at start of every review |
| `references/severity-rubric.md` | When classifying findings |
| `references/composition-over-inheritance.md` | TC-01 candidate detected |
| `references/dependency-injection.md` | TC-02 candidate detected |
| `references/guard-clauses.md` | TC-03 candidate detected |
| `references/single-responsibility.md` | TC-04 candidate detected |
| `references/fail-fast.md` | TC-05 candidate detected |
| `references/least-surprise.md` | TC-06 candidate detected |
| `references/tell-dont-ask.md` | TC-07 candidate detected |
| `references/immutability.md` | TC-08 candidate detected |
| `references/naming.md` | TC-09 candidate detected |
| `references/functional-core-imperative-shell.md` | TC-10 candidate detected |
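
The table above amounts to a lookup from principle ID to reference file. A minimal sketch of on-demand loading, where the helper name `references_to_load` is hypothetical:

```python
REFERENCE_BY_ID = {
    "TC-01": "references/composition-over-inheritance.md",
    "TC-02": "references/dependency-injection.md",
    "TC-03": "references/guard-clauses.md",
    "TC-04": "references/single-responsibility.md",
    "TC-05": "references/fail-fast.md",
    "TC-06": "references/least-surprise.md",
    "TC-07": "references/tell-dont-ask.md",
    "TC-08": "references/immutability.md",
    "TC-09": "references/naming.md",
    "TC-10": "references/functional-core-imperative-shell.md",
}


def references_to_load(candidate_ids):
    """Return the reference files for the detected IDs, first-seen order, no duplicates."""
    paths = []
    for candidate_id in candidate_ids:
        path = REFERENCE_BY_ID[candidate_id]  # KeyError signals an unknown ID
        if path not in paths:
            paths.append(path)
    return paths
```

Loading each file at most once per review keeps context use proportional to the kinds of violations actually found.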

---

## Scope Rules

- **Review:** application source code — functions, classes, modules, components
- **Skip:** test fixtures/factories, generated code, migration files, configuration files (JSON/YAML/TOML), vendor/third-party code, single-use scripts under 20 lines, type declaration files (.d.ts)
- **Light touch:** test files — apply naming (TC-09) and guard clauses (TC-03) but do not enforce DI (TC-02) or functional core (TC-10), since test setup is inherently side-effectful
- **Do not modify reviewed files** — produce recommendations only
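
The path-detectable part of these rules can be sketched as a classifier. It is deliberately incomplete: generated code, short single-use scripts, and test fixtures cannot be recognized from a path alone and are not handled here. The directory names in `SKIP_DIRS`, the `.yml` suffix, and the function name `classify_scope` are assumptions for illustration.

```python
from pathlib import PurePosixPath

# Configuration and type-declaration suffixes from the Skip rule (".yml" is an assumption).
SKIP_SUFFIXES = (".json", ".yaml", ".yml", ".toml", ".d.ts")
# Hypothetical directory names for vendor code and migrations; adjust per project.
SKIP_DIRS = {"vendor", "node_modules", "migrations"}


def classify_scope(path, is_test_file=False):
    """Return 'skip', 'light', or 'review' for a file path."""
    parent_dirs = set(PurePosixPath(path).parts[:-1])
    if path.endswith(SKIP_SUFFIXES) or SKIP_DIRS & parent_dirs:
        return "skip"
    if is_test_file:
        # Test files get only the naming (TC-09) and guard-clause (TC-03) checks.
        return "light"
    return "review"
```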

Comment-prose quality is out of scope. If a user wants prose review of source comments, run `plain-language` on the file directly.

Stale TODO/FIXME/HACK markers older than 12 months are out of scope here — `tidy-project` (TP-10 STALE MARKER) owns them because the age signal needs git history.