---
name: verify-work
description: >
  (What) Verify results against CONTRACTs — determine STATE (SUCCESS|FAILURE) per unit.
  (When) After each unit completes (inline from /proceed-work) or after all units complete (batch).
  (Why) Structural drift detection between CONTRACT.B and Result. (How) Result vs CONTRACT.B → State + log.
argument-hint: "[WP path or work title] [unit_index]"
metadata:
  author: David Seo of GEM².AI
  version: 14.0.0
allowed-tools:
  - Read
  - Write
  - Edit
  - Bash(date *)
  - Bash(mkdir *)
---

(* TPMN SKILL — verify-work *)

(* === Modes === *)
(* verify-work operates in two modes, determined by unit_index input:        *)
(*   SINGLE: verify one unit (called inline by /proceed-work after each UC)  *)
(*   BATCH:  verify all units (called at end-of-WP or manually by human)     *)
MODE ≜ unit_index ≠ ⊥ ? SINGLE : BATCH

(* === Input === *)
A ≜ [
  project_slug: 𝕊,
  wp_path: Path?,                      (* direct path to WP file, or ⊥ *)
  work_title: 𝕊?,                     (* search term if wp_path not given *)
  unit_index: ℕ?                      (* ⊥ = BATCH (all units). ≠ ⊥ = SINGLE (one unit) *)
]

(* === Output === *)
B ≜ [
  project_slug: 𝕊,
  wp_path: Path,
  wp_id: 𝕊,
  mode: {SINGLE, BATCH},
  overall_state: {SUCCESS, FAILURE},   (* SINGLE: state of that unit. BATCH: aggregate *)
  unit_results: Seq([
    unit_index: ℕ,
    unit_title: 𝕊,
    state: {SUCCESS, FAILURE},
    detail: 𝕊                         (* what passed or what specifically failed *)
  ]),
  failures: Seq(𝕊)?,                  (* failed unit titles + reasons, ⊥ if all SUCCESS *)
  log_path: Path,                      (* .gem-squared/verify-work-logs/{wp_id}.md *)
  verified_at: 𝕊                      (* ISO8601 *)
]

(* === Precondition === *)
P ≜ project_slug ≠ ⊥
    ∧ (wp_path ≠ ⊥ ∨ work_title ≠ ⊥)
    ∧ (MODE = SINGLE
       ⟹ unit[unit_index].STATUS = COMPLETED)          (* only that unit must be done *)
    ∧ (MODE = BATCH
       ⟹ all unit STATUS ∈ {COMPLETED, ABORTED})       (* all units must be terminal *)

(* === Transform === *)
F ≜ <<
  1. Record verified_at timestamp. Determine mode.
     Identify WP:
       IF wp_path provided → read that WP.
       IF only work_title → search work-plan/ for matching WP.
     Verify precondition:
       SINGLE: unit[unit_index].STATUS = COMPLETED (that unit has a Result to verify).
         WP may be IN_PROGRESS — this is expected during inline verification.
       BATCH: all unit STATUS ∈ {COMPLETED, ABORTED} (all units terminal).
         IF not met → STOP, report: "WP not ready for batch verification."
  2. Select units to verify:
       SINGLE: verify only unit[unit_index].
       BATCH: verify all COMPLETED units. Skip ABORTED units (no result to verify).
     Extract each target unit's CONTRACT (A, B, P) and Result.
  3. FOR each target unit, evaluate STATE by formal verification:
       Let b = actual Result (recorded by /proceed-work).
       Let B = CONTRACT.B (expected output specification).
       Let P = CONTRACT.P (preconditions and constraints).

       (* Verification predicate *)
       unit.state = SUCCESS ⟺
         (∀ field ∈ B: b[field] ≠ ⊥ ∧ type(b[field]) = B[field].type)
         ∧ P(a, b) holds

       In practice:
         - Field coverage: ∀ field ∈ B → b[field] ≠ ⊥ (every contracted output field exists in result)
         - Type conformance: type(b[field]) = B[field].type (result types match contract types)
         - Constraint satisfaction: P(a, b) (preconditions and invariants hold against actual input/output)
       IF any predicate fails → unit.state = FAILURE. Record which predicate failed and why.
  4. Determine overall STATE:
       SINGLE: overall_state = unit[unit_index].state.
       BATCH: overall_state = SUCCESS iff all verified units SUCCESS, else FAILURE.
     Edit WP file:
       Write `- State: SUCCESS` or `- State: FAILURE` per verified unit-work.
       Do NOT write WP-level STATE in header — that is /archive-work mandate.
  5. Write verification log:
       mkdir -p .gem-squared/verify-work-logs/
       SINGLE: append section to .gem-squared/verify-work-logs/{wp_id}.md:
         IF file does not exist → create with WP header, then append unit section.
         IF file exists → append unit section at end (before Summary if present).

         ## Unit {N}: {title} — {STATE} (verified {verified_at})
         ### CONTRACT.B (expected)
         {full CONTRACT.B text}
         ### Result (actual)
         {full Result text}
         ### Judgment
         {detailed comparison — what matched, what didn't, why STATE was determined}

       BATCH: overwrite .gem-squared/verify-work-logs/{wp_id}.md with full log:
         # Verification Log: {wp_id}
         **WP:** {wp_title} | **Verified:** {verified_at}
         **Overall:** {overall_state} | **Units verified:** {count} | **Skipped (ABORTED):** {count}

         ## Unit 1: {title} — {STATE}
         ### CONTRACT.B (expected)
         {full CONTRACT.B text}
         ### Result (actual)
         {full Result text}
         ### Judgment
         {detailed comparison}

         (repeat for each verified unit)

         ## Summary
         {overall assessment}

     Output B.
>>

(* === Constraint === *)
CONSTRAINT ≜ [
  ⊢ NEVER fix failures — report only,
  ⊢ NEVER re-execute work — that is /proceed-work mandate,
  ⊢ NEVER modify work output or Result fields — verify against what exists,
  ⊢ NEVER archive — that is /archive-work mandate,
  ⊢ NEVER apply subjective judgment — verify against CONTRACTs only,
  ⊢ Verification is binary per unit: SUCCESS or FAILURE — no partial credit,
  ⊢ SINGLE mode verifies ONE unit — does not touch other units' State fields
]

(* === Invariant === *)
INV ≜ [
  ⊢ STATE is written to WP file alongside STATUS — separate concerns (STATUS = did it run, STATE = did it pass),
  ⊢ BATCH: overall_state = FAILURE if ANY unit fails — no partial success,
  ⊢ SINGLE: overall_state reflects only the target unit,
  ⊢ Failures include specific reasons — not just pass/fail,
  ⊢ ABORTED units are skipped (no result to verify) — they do not affect overall STATE,
  ⊢ WP file stores the verdict (State field) — routing signal,
  ⊢ Log file stores the full reasoning (CONTRACT.B vs Result comparison) — decision support,
  ⊢ One log file per WP — SINGLE appends, BATCH overwrites,
  ⊢ WP file is the source of truth for STATE
]

(* === Post-Execution Routing === *)
Routing ≜ [
  MODE = SINGLE → return to caller (/proceed-work handles next step based on state),
  MODE = BATCH ∧ overall_state = SUCCESS  → /archive-work (persist the win),
  MODE = BATCH ∧ overall_state = FAILURE  → present failures to human:
    → human may /proceed-work to retry specific unit,
    → human may /archive-work with FAILURE state accepted
]
