---
name: verify-worker-output
description: "Use when a worker has written its report and exit-file. This is the Critic step — re-run tests yourself, trust nothing. Triggers whenever reports/<task-id>.md and reports/<task-id>.exit both exist."
---

# Verify Worker Output (Independent Critic)

Core rule: **do not trust the worker's PASS claim.** Re-run `test_commands` yourself against the worktree's actual diff.

## Precondition

Both files exist:
- `$BRAID_ROOT/reports/<task-id>.md`
- `$BRAID_ROOT/reports/<task-id>.exit`

If only `.md` exists: the worker is mid-write. Wait.

## Process

### 1. Load documents

- Contract: `$BRAID_ROOT/queue/<task-id>.yaml`
- Report: `$BRAID_ROOT/reports/<task-id>.md`
- Exit code: `cat $BRAID_ROOT/reports/<task-id>.exit`

If `EXIT != 0`: worker-crash or manual kill. Automatic FAIL, skip further tests.

### 2. Scope compliance (cheap, do first)

```bash
cd $BRAID_WORKERS_DIR/<task-id>
git diff <target-branch> --stat
git diff <target-branch> --name-only
```

For every file touched:
- Matches a glob in `scope.allowed_paths`?
- Does NOT match any `scope.forbidden_paths`?

Any violation → FAIL, rework contract naming the violating path.

### 3. Independent test execution

In the worktree (not the target repo):

```bash
cd $BRAID_WORKERS_DIR/<task-id>
# Run EACH test_command from the contract yourself
```

Do NOT copy-paste the worker's test output — run fresh. Workers have been observed to:

- Paste stale output from earlier runs
- Ignore failures and report COMPLETED
- Hide stderr

### 4. Acceptance-criteria evidence

For each criterion, find concrete evidence:
- A specific line in the diff
- A specific assertion in a new/modified test
- A specific field in actual test stdout

Vague claims in the report ("function now works correctly") are NOT evidence. If you can't point to a line, the criterion is not met.

### 5. Decision

**PASS** — all of these:
- Exit code 0
- Scope compliance clean
- Every `test_command` green on independent re-run
- Every `acceptance_criterion` has concrete evidence

→ `braid merge <task-id>` (which runs its own post-merge verify)

**FAIL** — any failure
→ Write `$BRAID_ROOT/queue/<task-id>.rework.yaml`:

```yaml
rework_for: "<task-id>"
rework_iteration: <n>
failures:
  - criterion: "<which check failed>"
    evidence: "<what you observed, exact quote or line>"
    fix_hint: "<specific next action>"
```

## Escalation

If `rework_iteration >= 2` already and the next cycle would be #3: **STOP.**
Write `$BRAID_ROOT/reports/<task-id>.decomposed.md` explaining why the task is too large and propose 2-3 smaller replacement contracts. Do NOT dispatch a third attempt.

## Worker failure modes to watch for

- COMPLETED status with failing test output included — worker self-deceived
- All criteria marked ✅ but diff doesn't match any of them
- Forbidden path touched with handwave ("had to fix a small thing")
- Test commands altered in the contract copy — check original `queue/<task-id>.yaml`, not the worktree copy

## Rework-Critic: iter0-preserve diff-check (mandatory)

Triggered when the rework contract's `non_goals` contain phrases like:
- "Do NOT modify the iteration-0 files: X, Y, Z"
- "preserve from iteration 0"
- "leave as iteration 0 left it"

Then the critic MUST run a targeted diff-check BEFORE approving the merge. The static-read-critic is NOT sufficient alone — silent refactors of iter0 helpers inside preserve-files are invisible without a diff against the prior state.

### Check 1: commit-count per preserve-file

For each file explicitly named in the non_goals as preserve-target:

```bash
cd "$BRAID_WORKERS_DIR/<task-id>"
for f in <preserve-file-1> <preserve-file-2>; do
    count=$(git log --oneline "$(git merge-base HEAD main)..HEAD" -- "$f" | wc -l)
    if [[ "$count" -gt 1 ]]; then
        echo "VIOLATION: $f has $count commits since base (expected 0 or 1 auto-commit)"
    fi
done
```

### Check 2: helper-function survival

iter1 must NOT silently drop helpers that existed in iter0. The signature-set of private methods and top-level functions in each preserve-file is compared base-vs-rework:

```bash
cd "$BRAID_WORKERS_DIR/<task-id>"
BASE_COMMIT=$(git merge-base HEAD main)
for f in <preserve-file-1> <preserve-file-2>; do
    iter0_helpers=$(git show "$BASE_COMMIT:$f" 2>/dev/null \
        | grep -E '^\s*(private|public|function)\s+\w+\s*\(' \
        | sed -E 's/^\s*(private|public|function)\s+([a-zA-Z_][a-zA-Z0-9_]*)\s*\(.*/\2/')
    rework_helpers=$(git show "HEAD:$f" 2>/dev/null \
        | grep -E '^\s*(private|public|function)\s+\w+\s*\(' \
        | sed -E 's/^\s*(private|public|function)\s+([a-zA-Z_][a-zA-Z0-9_]*)\s*\(.*/\2/')
    missing=$(comm -23 <(echo "$iter0_helpers" | sort -u) <(echo "$rework_helpers" | sort -u))
    if [[ -n "$missing" ]]; then
        echo "HIGH_PRIORITY_VIOLATION: $f — helpers removed during rework: $missing"
    fi
done
```

Tune the regex per language if the target is not TypeScript/JavaScript (e.g. `^def ` for Python, `^func ` for Go).

### Verdict semantics

- **Any VIOLATION from Check 1 or HIGH_PRIORITY_VIOLATION from Check 2 → REWORK_REQUESTED.** Write to `reports/<task-id>.md` with the offending file + missing helpers + quote of the non_goal that was violated. Do not merge.
- **No violations → proceed** with the rest of the critic pass (test_commands, scope-compliance, acceptance-criteria).

### Why this exists

A real target-project run merged with a silent violation: a rework iteration dropped a helper from the initial attempt despite the non_goal explicitly naming that file as preserve-target. The test suite flagged it post-merge. Static-read critic had no signal because it didn't diff the rework against the initial attempt — the worker's diff-against-main showed the new code but hid the removed helper behind the broader refactor.

The fix was to make rework-vs-initial diffing an explicit critic habit instead of relying on memory. Without this check, the pattern repeats.
