---
name: loom-test-driven-development
description: "Use when implementation should be driven by test-first verification: a focused executable check can fail before the change, pass after it, and provide evidence for a behavior, bug, logic, edge case, or acceptance claim."
---

# loom-test-driven-development

Test-driven development is a Loom execution playbook for behavioral claims.

It turns the intended behavior into a failing check, drives the smallest change to
green, and preserves the red/green story as evidence when the ticket will rely on
it.

## Core Dependency

Use `loom-core` first. This playbook composes `loom-specs`, `loom-tickets`,
`loom-ralph`, `loom-evidence`, and `loom-audit`.

TDD is usually expressed as `Verification Posture: test-first` in a Ralph packet.

## Use This Playbook When

Use this playbook when:

- implementing new behavior or changing existing behavior
- fixing a bug that can be reproduced with a check
- adding edge-case handling
- changing logic that could regress
- a ticket acceptance criterion can be proved with an executable check

Skip it for prose-only edits, static content, or configuration changes with no
behavioral claim.

## Route

Use this route:

```text
contract -> red -> green -> refactor -> evidence -> audit-ready
```

## Contract

Start from the behavior contract:

- spec `REQ-*` and `SCN-*` when the behavior is durable
- ticket `ACC-*` when the claim is scoped to the work unit
- bug report, reproduction, or evidence record when fixing a defect

If expected behavior is unclear, route to `loom-specs` before writing tests.

## Red

Write or identify the smallest check that fails for the expected reason.

Good red checks:

- exercise public behavior, not incidental implementation details
- are scoped to one concept
- include the meaningful edge or failure state when that is the bug
- are named so the claim is visible
- fail before the implementation change

For bug fixes, the red check should reproduce the bug. If the check passes before
the fix, tighten the reproduction or reclassify the issue.

## Green

Make the smallest change that passes the check.

Keep the change inside the ticket or packet write scope. If passing the check
requires broader behavior, update the ticket or route back to specs/plans before
continuing.
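
One way to picture a green step, assuming a hypothetical `chunk` helper whose
loop bound previously dropped the trailing partial chunk: the fix touches only
the bound, and a second check confirms that the even-split behavior is
unchanged. All names here are illustrative.

```python
def chunk(items, size):
    """Hypothetical helper after the fix: visit every start index."""
    chunks = []
    for start in range(0, len(items), size):  # was: len(items) - size + 1
        chunks.append(items[start:start + size])
    return chunks


def test_chunk_keeps_trailing_partial_chunk():
    # Green: the check that reproduced the bug now passes.
    assert chunk([1, 2, 3, 4, 5], 2) == [[1, 2], [3, 4], [5]]


def test_chunk_even_split_is_unchanged():
    # Broader check: behavior that already worked still works.
    assert chunk([1, 2, 3, 4], 2) == [[1, 2], [3, 4]]
```

Nothing else changes in this step: no new parameters, no validation, no
renaming. Those belong to a later refactor or a separate ticket.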

## Refactor

Refactor only with tests green.

Useful cleanup:

- clearer names
- simpler control flow
- removed duplication
- better boundary placement
- smaller helpers where they clarify the concept

Run the relevant checks after refactor steps that can affect behavior.
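
A small sketch of a refactor step under green, again assuming a hypothetical
`chunk` helper: the manual accumulator loop becomes a comprehension, and the
same checks are rerun unchanged to confirm that behavior did not move.

```python
def chunk(items, size):
    """Hypothetical helper after refactor: same behavior, simpler control flow."""
    return [items[start:start + size] for start in range(0, len(items), size)]


def test_chunk_keeps_trailing_partial_chunk():
    assert chunk([1, 2, 3, 4, 5], 2) == [[1, 2], [3, 4], [5]]


def test_chunk_even_split_is_unchanged():
    assert chunk([1, 2, 3, 4], 2) == [[1, 2], [3, 4]]
```

The tests are not edited during the refactor. If a test has to change to stay
green, the step changed behavior and is no longer only a refactor.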

## Test Shape

Prefer checks that match the claim:

- unit tests for pure logic and small data transforms
- integration tests for APIs, persistence, queues, filesystem, or service boundaries
- browser or end-to-end checks for critical user flows and runtime UI behavior
- snapshot or visual evidence only when it supports the exact UI claim

Prefer fakes or real implementations over interaction-heavy mocks when practical.
Mocks are useful for slow, nondeterministic, or side-effecting dependencies.

Good interfaces make tests natural:

- accept dependencies instead of constructing hard-coded external services inside
  the behavior under test
- return observable results when possible instead of hiding all behavior in side
  effects
- keep the interface surface small enough that acceptance can be tested without
  reaching into implementation details
- test through the same seam callers use
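
The interface points above can be sketched with a hypothetical session-expiry
check. `Session`, `is_expired`, and the `now` parameter are assumptions made for
illustration. The clock is accepted as a dependency and the result is returned,
so the test needs only a fake clock and no patching:

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Session:
    started_at: float
    ttl_seconds: float


def is_expired(session: Session, now: Callable[[], float]) -> bool:
    """Return an observable result; the clock is passed in, not constructed."""
    return now() - session.started_at >= session.ttl_seconds


def test_session_expires_exactly_at_ttl():
    fake_now = lambda: 160.0  # fake clock: deterministic and side-effect free
    session = Session(started_at=100.0, ttl_seconds=60.0)
    assert is_expired(session, fake_now)


def test_session_is_live_just_before_ttl():
    session = Session(started_at=100.0, ttl_seconds=60.0)
    assert not is_expired(session, lambda: 159.0)
```

Production callers would pass a real clock such as `time.time`, so the test goes
through the same seam they use.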

If the only available test seam is too shallow to reproduce the behavior honestly,
record that limitation and route to `loom-api-and-interface-design` or
`loom-architecture-deepening` instead of writing a confidence-theater test that
passes without exercising the behavior.

## Evidence

Preserve evidence when ticket closure, audit, or future recovery depends on it.

Useful evidence includes:

- failing command output before the fix
- passing command output after the fix
- test file and test name
- source state or branch/worktree when relevant
- what the test does not cover

Use `loom-evidence` for durable red/green records, especially for bugs, acceptance
claims, and high-risk behavior.
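
As a rough sketch of what a red/green record could hold, the snippet below runs
a check command twice and keeps the exit code and output from each run. The
field names, the file path, and the stand-in commands are assumptions for
illustration. This is not the `loom-evidence` format, which this playbook does
not define.

```python
import json
import subprocess
import sys


def run_check(command):
    """Run a check command and capture its exit code and combined output."""
    result = subprocess.run(command, capture_output=True, text=True)
    return {
        "command": command,
        "exit_code": result.returncode,
        "output": (result.stdout + result.stderr).strip(),
    }


def red_green_record(test_file, test_name, red, green, not_covered):
    """Assemble a record; field names are illustrative, not a Loom schema."""
    return {
        "test_file": test_file,
        "test_name": test_name,
        "red": red,
        "green": green,
        "not_covered": not_covered,
    }


# Stand-in commands so the sketch runs anywhere. A real pass would run the
# project's own test command before and after the change.
red = run_check([sys.executable, "-c", "assert 1 + 1 == 3, 'simulated red'"])
green = run_check([sys.executable, "-c", "assert 1 + 1 == 2"])

record = red_green_record(
    test_file="tests/test_chunk.py",  # hypothetical path
    test_name="test_chunk_keeps_trailing_partial_chunk",
    red=red,
    green=green,
    not_covered=["size <= 0", "non-list iterables"],
)
print(json.dumps(record, indent=2))
```

Recording `not_covered` alongside the outputs lets a ticket cite the check
without implying more coverage than it provides.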

## Done Means

The TDD pass is done when:

- expected behavior is grounded in a spec, ticket, or bug reproduction
- a check failed before the change for the expected reason
- the check passes after the scoped change
- relevant broader checks were run, or explicitly skipped with the limits stated
- evidence is recorded when downstream review will rely on it
- the ticket can cite the check without overstating coverage
