--- name: verify-worker-output description: "Use when a worker has written its report and exit-file. This is the Critic step — re-run tests yourself, trust nothing. Triggers whenever reports/.md and reports/.exit both exist." --- # Verify Worker Output (Independent Critic) Core rule: **do not trust the worker's PASS claim.** Re-run `test_commands` yourself against the worktree's actual diff. ## Precondition Both files exist: - `$BRAID_ROOT/reports/.md` - `$BRAID_ROOT/reports/.exit` If only `.md` exists: the worker is mid-write. Wait. ## Process ### 1. Load documents - Contract: `$BRAID_ROOT/queue/.yaml` - Report: `$BRAID_ROOT/reports/.md` - Exit code: `cat $BRAID_ROOT/reports/.exit` If `EXIT != 0`: worker-crash or manual kill. Automatic FAIL, skip further tests. ### 2. Scope compliance (cheap, do first) ```bash cd $BRAID_WORKERS_DIR/ git diff --stat git diff --name-only ``` For every file touched: - Matches a glob in `scope.allowed_paths`? - Does NOT match any `scope.forbidden_paths`? Any violation → FAIL, rework contract naming the violating path. ### 3. Independent test execution In the worktree (not the target repo): ```bash cd $BRAID_WORKERS_DIR/ # Run EACH test_command from the contract yourself ``` Do NOT copy-paste the worker's test output — run fresh. Workers have been observed to: - Paste stale output from earlier runs - Ignore failures and report COMPLETED - Hide stderr ### 4. Acceptance-criteria evidence For each criterion, find concrete evidence: - A specific line in the diff - A specific assertion in a new/modified test - A specific field in actual test stdout Vague claims in the report ("function now works correctly") are NOT evidence. If you can't point to a line, the criterion is not met. ### 5. Decision **PASS** — all of these: - Exit code 0 - Scope compliance clean - Every `test_command` green on independent re-run - Every `acceptance_criterion` has concrete evidence → `braid merge ` (which runs its own post-merge verify) **FAIL** — any failure → Write `$BRAID_ROOT/queue/.rework.yaml`: ```yaml rework_for: "" rework_iteration: failures: - criterion: "" evidence: "" fix_hint: "" ``` ## Escalation If `rework_iteration >= 2` already and the next cycle would be #3: **STOP.** Write `$BRAID_ROOT/reports/.decomposed.md` explaining why the task is too large and propose 2-3 smaller replacement contracts. Do NOT dispatch a third attempt. ## Worker failure modes to watch for - COMPLETED status with failing test output included — worker self-deceived - All criteria marked ✅ but diff doesn't match any of them - Forbidden path touched with handwave ("had to fix a small thing") - Test commands altered in the contract copy — check original `queue/.yaml`, not the worktree copy ## Rework-Critic: iter0-preserve diff-check (mandatory) Triggered when the rework contract's `non_goals` contain phrases like: - "Do NOT modify the iteration-0 files: X, Y, Z" - "preserve from iteration 0" - "leave as iteration 0 left it" Then the critic MUST run a targeted diff-check BEFORE approving the merge. The static-read-critic is NOT sufficient alone — silent refactors of iter0 helpers inside preserve-files are invisible without a diff against the prior state. ### Check 1: commit-count per preserve-file For each file explicitly named in the non_goals as preserve-target: ```bash cd "$BRAID_WORKERS_DIR/" for f in

; do count=$(git log --oneline "$(git merge-base HEAD main)..HEAD" -- "$f" | wc -l) if [[ "$count" -gt 1 ]]; then echo "VIOLATION: $f has $count commits since base (expected 0 or 1 auto-commit)" fi done ``` ### Check 2: helper-function survival iter1 must NOT silently drop helpers that existed in iter0. The signature-set of private methods and top-level functions in each preserve-file is compared base-vs-rework: ```bash cd "$BRAID_WORKERS_DIR/" BASE_COMMIT=$(git merge-base HEAD main) for f in

; do iter0_helpers=$(git show "$BASE_COMMIT:$f" 2>/dev/null \ | grep -E '^\s*(private|public|function)\s+\w+\s*\(' \ | sed -E 's/^\s*(private|public|function)\s+([a-zA-Z_][a-zA-Z0-9_]*)\s*\(.*/\2/') rework_helpers=$(git show "HEAD:$f" 2>/dev/null \ | grep -E '^\s*(private|public|function)\s+\w+\s*\(' \ | sed -E 's/^\s*(private|public|function)\s+([a-zA-Z_][a-zA-Z0-9_]*)\s*\(.*/\2/') missing=$(comm -23 <(echo "$iter0_helpers" | sort -u) <(echo "$rework_helpers" | sort -u)) if [[ -n "$missing" ]]; then echo "HIGH_PRIORITY_VIOLATION: $f — helpers removed during rework: $missing" fi done ``` Tune the regex per language if the target is not TypeScript/JavaScript (e.g. `^def ` for Python, `^func ` for Go). ### Verdict semantics - **Any VIOLATION from Check 1 or HIGH_PRIORITY_VIOLATION from Check 2 → REWORK_REQUESTED.** Write to `reports/.md` with the offending file + missing helpers + quote of the non_goal that was violated. Do not merge. - **No violations → proceed** with the rest of the critic pass (test_commands, scope-compliance, acceptance-criteria). ### Why this exists A real target-project run merged with a silent violation: a rework iteration dropped a helper from the initial attempt despite the non_goal explicitly naming that file as preserve-target. The test suite flagged it post-merge. Static-read critic had no signal because it didn't diff the rework against the initial attempt — the worker's diff-against-main showed the new code but hid the removed helper behind the broader refactor. The fix was to make rework-vs-initial diffing an explicit critic habit instead of relying on memory. Without this check, the pattern repeats.