---
name: text-orchestrator
description: Automated text pipeline orchestrator — run through all stages sequentially, launching each stage skill automatically until completion or a blocker is hit.
allowed-tools: Bash(go:*), Bash(jq:*), Bash(find:*), Bash(ls:*), Bash(git:*), Read, Write, Edit, Grep, Glob
---

# Text Orchestrator (Automated Pipeline Runner)

Automatically run through text pipeline stages by invoking subskills in sequence.

This skill is the **automated orchestrator** for the text pipeline.
It coordinates:
- workspace initialization via `text-intake`
- sequential stage execution via stage-specific skills
- state tracking and advancement
- blocker detection and graceful stopping

It does **not** own:
- any stage's artifact generation directly
- semantic evaluation decisions
- human approval (Stage 10 always requires explicit approval)

## Environment

Text data lives in the `texts` repository. Set `LYCEUM_TEXTS_DIR` to point to this directory (defaults to `output/texts` for local development).

## Quick Status

Pipeline workspaces: !`find ${LYCEUM_TEXTS_DIR:-output/texts} -mindepth 1 -maxdepth 1 -type d 2>/dev/null | wc -l`
Workspaces in progress: !`find ${LYCEUM_TEXTS_DIR:-output/texts} -path '*/state.json' -exec grep -l '"status": "pending"\|"status": "in_progress"' {} \; 2>/dev/null | wc -l`

## Commands

- `/text-orchestrator run [work] [--mode add|apply] [--profile <profile>]` — Run the full pipeline for a text
- `/text-orchestrator replay [work] --profile <profile>` — Replay specific stages on an existing workspace
- `/text-orchestrator resume [work]` — Resume an in-progress pipeline from where it stopped
- `/text-orchestrator step [work]` — Execute only the next pending stage, then stop
- `/text-orchestrator status [work]` — Show pipeline progress and next stage
- `/text-orchestrator abort [work]` — Mark the pipeline as blocked and stop

Target: $ARGUMENTS

---

## Stage-to-Skill Mapping

| Stage ID | Stage Name | Skill to Invoke |
|---|---|---|
| `0-intake` | Intake | `text-intake` |
| `1-source-discovery` | Source Discovery | `source-hunt` |
| `2-acquisition-extraction` | Acquisition | `source-extract` |
| `3-cleaning-normalization` | Cleaning | `text-cleaning` |
| `4-structural-segmentation` | Segmentation | `segmentation` |
| `5-witness-normalization-ranking` | Witness Ranking | `translation-witness` |
| `6-versification-alignment` | Versification | `versify try-witness` → (if fail) `translation-synthesis` → `versify` |
| `7-transliteration` | Transliteration | `transliteration` |
| `8-interlinear-morphology` | Interlinear | `treebank` → `alignment` → `gloss-review` |
| `9-reader-reliability` | Reader QA | `reader-reliability` |
| `10-human-review` | Ship Gate | `new-text-ship` (requires human approval) |

---

## Orchestration Protocol

### Before Running Any Stage

**Step 0. Create workspace backup** (MANDATORY):
```bash
SLUG="your-text-slug"
STAGE="N-stage-name"
cp -r "${LYCEUM_TEXTS_DIR:-output/texts}/${SLUG}" "${LYCEUM_TEXTS_DIR:-output/texts}/${SLUG}.backup-${STAGE}-$(date +%Y%m%d-%H%M%S)"
```

1. Load workspace state:
   ```bash
   nix-shell -p jq --run "jq '.' ${LYCEUM_TEXTS_DIR:-output/texts}/${SLUG}/state.json"
   ```

2. Check current stage status and blockers:
   ```bash
   nix-shell -p jq --run "jq '{current: .current_stage, invalidated: .invalidated_stages, stages: [.stages[] | select(.requested) | {stage: .stage, status: .status}]}' ${LYCEUM_TEXTS_DIR:-output/texts}/${SLUG}/state.json"
   ```

3. Check manifest for blockers:
   ```bash
   nix-shell -p jq --run "jq '.blockers' ${LYCEUM_TEXTS_DIR:-output/texts}/${SLUG}/manifest.json"
   ```

### Stage Execution Loop

For each stage, the orchestrator must:

1. **Back up** the workspace (see "Defense Mechanisms" section)
2. **Announce** the stage being executed
3. **Read** the stage skill to understand its requirements
4. **Invoke** the skill's `run` command with safety constraints and timeout
5. **Verify workspace integrity** (CRITICAL — see "Defense Mechanisms"):
   - Workspace directory still exists
   - manifest.json exists and is valid JSON
   - state.json exists and is valid JSON
6. **If integrity check fails**: restore from backup, mark failed, STOP
7. **Verify** the stage completed successfully by checking:
   - Stage status updated to `done` or `blocked`
   - Expected outputs exist
   - No new blockers were added
8. **Update** workspace state if the skill didn't do so
9. **Advance** to the next stage or stop if blocked
10. **Clean up** old backups (keep last 3)
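
Step 10 has no accompanying command in this document. The following is a minimal sketch, assuming backups use the `${SLUG}.backup-*` naming from Step 0 and that directory modification time reflects backup age. `prune_backups` is a hypothetical helper name, not an existing pipeline script.

```bash
# Hypothetical helper (assumption, not part of the pipeline scripts):
# keep the newest KEEP backups of a workspace and delete the rest.
prune_backups() {
  local root="$1" slug="$2" keep="${3:-3}" count=0 dir

  # List backups newest first by modification time, then delete every
  # entry after the first KEEP. The workspace itself never matches the glob.
  ls -1td "${root}/${slug}".backup-* 2>/dev/null | while IFS= read -r dir; do
    count=$((count + 1))
    if [ "$count" -gt "$keep" ]; then
      rm -rf -- "$dir"
      echo "Pruned old backup: $dir"
    fi
  done
}
```

A call such as `prune_backups "${LYCEUM_TEXTS_DIR:-output/texts}" "${SLUG}" 3` would run after a stage passes verification, so the backup taken for that stage is still among the three retained.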

### Stopping Conditions

The orchestrator MUST stop when:
- A stage sets status to `blocked`
- A stage fails verification
- Manifest blockers are non-empty
- Stage 10 is reached (requires human approval)
- An unrecoverable error occurs
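
The stopping conditions that are visible in workspace files can be checked mechanically. The sketch below is illustrative only: `should_stop` is a hypothetical helper, it assumes the `state.json` and `manifest.json` fields queried earlier in this document (`.stages[].status`, `.current_stage`, `.blockers`), and it calls `jq` directly rather than through `nix-shell`.

```bash
# Hypothetical helper: return 0 (and print a reason) when the orchestrator
# should stop, 1 when it may continue. Field names are assumptions taken
# from the jq queries used elsewhere in this document.
should_stop() {
  local workspace="$1"
  local state="${workspace}/state.json"
  local manifest="${workspace}/manifest.json"

  # Missing or unparseable files are treated as an unrecoverable error.
  jq -e . "$state" >/dev/null 2>&1 || { echo "stop: state.json unreadable"; return 0; }
  jq -e . "$manifest" >/dev/null 2>&1 || { echo "stop: manifest.json unreadable"; return 0; }

  if jq -e 'any(.stages[]?; .status == "blocked" or .status == "failed")' "$state" >/dev/null; then
    echo "stop: a stage is blocked or failed"
    return 0
  fi
  if jq -e '(.blockers // []) | length > 0' "$manifest" >/dev/null; then
    echo "stop: manifest blockers present"
    return 0
  fi
  if jq -e '.current_stage == "10-human-review"' "$state" >/dev/null; then
    echo "stop: Stage 10 requires human approval"
    return 0
  fi
  return 1
}
```

Verification failures detected by a stage's own verification script are not covered here; they would be reported by that script's exit status.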

---

## Workflows

## `/text-orchestrator run`

Full automated pipeline execution from intake through ship gate.

### Steps

1. **Initialize workspace** (if not exists):
   ```
   Load skill: text-intake
   Execute: /text-intake init [work] --mode [mode] --profile [profile]
   Verify: manifest.json and state.json exist
   ```

2. **Enter stage loop**:
   ```
   WHILE next_stage exists AND next_stage != "10-human-review":
     stage = determine_next_stage()
     skill = map_stage_to_skill(stage)
     
     ANNOUNCE: "=== Executing Stage: {stage} via {skill} ==="
     
     Load skill file for {skill}
     Execute: /{skill} run [work]
     
     IF stage has sub-skills (e.g., Stage 6, Stage 8):
       Execute each sub-skill in order
     
     Verify stage completion
     
     IF blocked OR failed:
       STOP with report
   ```

3. **Reach ship gate**:
   ```
   ANNOUNCE: "Pipeline reached Stage 10 (Human Review)"
   ANNOUNCE: "Automatic execution complete. Human approval required."
   Execute: /new-text-ship preview [work]
   STOP (do not auto-promote)
   ```

### Multi-Skill Stages

**Stage 6 (Versification)** — conditional fast path:
1. `/versify try-witness [work]` — Attempt to versify a `versification-candidate` witness directly
2. **If try-witness succeeds**: Stage 6 is done. Skip Stage 6a generation entirely. Run `verify_stage_6b.sh` (the versified witness edition satisfies the same structural contract as a generated translation).
3. **If try-witness fails** (no candidate, or all candidates fail versification):
   a. `/translation-synthesis run [work]` — Generate translation (Stage 6a)
   b. Run `verify_stage_6a.sh`
   c. `/versify run [work]` — Validate and import (Stage 6b)
   d. Run `verify_stage_6b.sh`

**Stage 8 (Interlinear)** — with sub-stage tracking:

Stage 8 consists of 4 canonical sub-stages:
- `treebank` — Generate/import treebank (optional)
- `llm-interlinear` — Generate contextual glosses via LLM
- `ground-truth-benchmark` — Benchmark against external corpora
- `gloss-review` — Audit and promote

**Progress announcement**:
```
=== Stage 8: Interlinear & Morphology ===
  → Sub-stage 1/4: Treebank import (optional)
  → Sub-stage 2/4: LLM interlinear generation (est. ~5min/100 words in agent mode)
  → Sub-stage 3/4: Ground-truth benchmark
  → Sub-stage 4/4: Gloss review & promotion
```

**Sub-stage 1: Treebank** (optional)
```bash
SLUG="your-text-slug"
WORKSPACE="${LYCEUM_TEXTS_DIR:-output/texts}/${SLUG}"

# Check if already done
STATUS=$(nix-shell -p jq --run "jq -r '.stages[] | select(.stage == \"8-interlinear-morphology\") | .sub_stages.treebank.status // \"not_started\"' ${WORKSPACE}/state.json")

if [[ "$STATUS" == "done" ]]; then
  echo "Sub-stage treebank already done, skipping"
else
  echo "→ Sub-stage 1/4: Treebank import"
  START_TIME=$(date +%s)
  
  # Mark in progress
  nix-shell -p jq --run "jq '(.stages[] | select(.stage == \"8-interlinear-morphology\") | .sub_stages.treebank) = {status: \"in_progress\", started_at: \"$(date -Iseconds)\"}' ${WORKSPACE}/state.json" > tmp.json && mv tmp.json ${WORKSPACE}/state.json
  
  # Run treebank skill (may skip if no treebank available)
  /treebank run "${SLUG}"
  
  END_TIME=$(date +%s)
  ELAPSED=$((END_TIME - START_TIME))
  
  # Mark done (merge with += so the started_at value is preserved)
  nix-shell -p jq --run "jq '(.stages[] | select(.stage == \"8-interlinear-morphology\") | .sub_stages.treebank) += {status: \"done\", completed_at: \"$(date -Iseconds)\", elapsed_seconds: ${ELAPSED}, notes: [\"Treebank import completed\"]}' ${WORKSPACE}/state.json" > tmp.json && mv tmp.json ${WORKSPACE}/state.json
fi
```

**Sub-stage 2: LLM Interlinear** (DO NOT use text_pipeline_alignment.go)
```bash
# Check if already done
STATUS=$(nix-shell -p jq --run "jq -r '.stages[] | select(.stage == \"8-interlinear-morphology\") | .sub_stages.\"llm-interlinear\".status // \"not_started\"' ${WORKSPACE}/state.json")

if [[ "$STATUS" == "done" ]]; then
  echo "Sub-stage llm-interlinear already done, skipping"
else
  echo "→ Sub-stage 2/4: LLM interlinear generation"
  START_TIME=$(date +%s)
  
  # Mark in progress
  nix-shell -p jq --run "jq '(.stages[] | select(.stage == \"8-interlinear-morphology\") | .sub_stages.\"llm-interlinear\") = {status: \"in_progress\", started_at: \"$(date -Iseconds)\"}' ${WORKSPACE}/state.json" > tmp.json && mv tmp.json ${WORKSPACE}/state.json
  
  # Use API mode if key available (5-10x faster), fall back to agent mode
  if [[ -n "${ANTHROPIC_API_KEY:-}" ]]; then
    INTERLINEAR_MODE="api"
  else
    INTERLINEAR_MODE="agent"
  fi
  python3 scripts/generate_workspace_interlinear.py --workspace ${WORKSPACE} --mode ${INTERLINEAR_MODE} -v
  
  END_TIME=$(date +%s)
  ELAPSED=$((END_TIME - START_TIME))
  
  # Mark done (merge with += so the started_at value is preserved)
  nix-shell -p jq --run "jq '(.stages[] | select(.stage == \"8-interlinear-morphology\") | .sub_stages.\"llm-interlinear\") += {status: \"done\", completed_at: \"$(date -Iseconds)\", elapsed_seconds: ${ELAPSED}, notes: [\"LLM interlinear generation completed\"]}' ${WORKSPACE}/state.json" > tmp.json && mv tmp.json ${WORKSPACE}/state.json
fi
```

**Sub-stage 3: Ground-truth Benchmark**
```bash
# Check if already done
STATUS=$(nix-shell -p jq --run "jq -r '.stages[] | select(.stage == \"8-interlinear-morphology\") | .sub_stages.\"ground-truth-benchmark\".status // \"not_started\"' ${WORKSPACE}/state.json")

if [[ "$STATUS" == "done" ]]; then
  echo "Sub-stage ground-truth-benchmark already done, skipping"
else
  echo "→ Sub-stage 3/4: Ground-truth benchmark"
  START_TIME=$(date +%s)
  
  # Mark in progress
  nix-shell -p jq --run "jq '(.stages[] | select(.stage == \"8-interlinear-morphology\") | .sub_stages.\"ground-truth-benchmark\") = {status: \"in_progress\", started_at: \"$(date -Iseconds)\"}' ${WORKSPACE}/state.json" > tmp.json && mv tmp.json ${WORKSPACE}/state.json
  
  # Benchmark against external corpora
  go run ./scripts/text_pipeline_ground_truth.go benchmark -root ${LYCEUM_TEXTS_DIR:-output/texts} -work ${SLUG}
  
  END_TIME=$(date +%s)
  ELAPSED=$((END_TIME - START_TIME))
  
  # Mark done (merge with += so the started_at value is preserved)
  nix-shell -p jq --run "jq '(.stages[] | select(.stage == \"8-interlinear-morphology\") | .sub_stages.\"ground-truth-benchmark\") += {status: \"done\", completed_at: \"$(date -Iseconds)\", elapsed_seconds: ${ELAPSED}, notes: [\"Ground-truth benchmark completed\"]}' ${WORKSPACE}/state.json" > tmp.json && mv tmp.json ${WORKSPACE}/state.json
fi
```

**Sub-stage 4: Gloss Review**
```bash
# Check if already done
STATUS=$(nix-shell -p jq --run "jq -r '.stages[] | select(.stage == \"8-interlinear-morphology\") | .sub_stages.\"gloss-review\".status // \"not_started\"' ${WORKSPACE}/state.json")

if [[ "$STATUS" == "done" ]]; then
  echo "Sub-stage gloss-review already done, skipping"
else
  echo "→ Sub-stage 4/4: Gloss review & promotion"
  START_TIME=$(date +%s)
  
  # Mark in progress
  nix-shell -p jq --run "jq '(.stages[] | select(.stage == \"8-interlinear-morphology\") | .sub_stages.\"gloss-review\") = {status: \"in_progress\", started_at: \"$(date -Iseconds)\"}' ${WORKSPACE}/state.json" > tmp.json && mv tmp.json ${WORKSPACE}/state.json
  
  # Run benchmark and promote
  go run ./scripts/text_pipeline_gloss_review.go benchmark -root ${LYCEUM_TEXTS_DIR:-output/texts} -work ${SLUG}
  go run ./scripts/text_pipeline_gloss_review.go promote -root ${LYCEUM_TEXTS_DIR:-output/texts} -work ${SLUG}
  
  END_TIME=$(date +%s)
  ELAPSED=$((END_TIME - START_TIME))
  
  # Mark done (merge with += so the started_at value is preserved)
  nix-shell -p jq --run "jq '(.stages[] | select(.stage == \"8-interlinear-morphology\") | .sub_stages.\"gloss-review\") += {status: \"done\", completed_at: \"$(date -Iseconds)\", elapsed_seconds: ${ELAPSED}, notes: [\"Gloss review and promotion completed\"]}' ${WORKSPACE}/state.json" > tmp.json && mv tmp.json ${WORKSPACE}/state.json
fi
```

**On resume**: The orchestrator checks each sub-stage's status before executing. If a sub-stage is marked `done`, it skips it entirely.
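
The four sub-stage blocks share one pattern: check status, mark `in_progress`, run, then mark the outcome. A minimal sketch of that pattern as a reusable wrapper follows. `set_sub_stage` and `run_sub_stage` are hypothetical helper names, the script calls `jq` directly instead of through `nix-shell`, and a non-zero exit from the wrapped command is treated as `blocked`.

```bash
STAGE_8="8-interlinear-morphology"

# Hypothetical helper: merge a status update into one Stage 8 sub-stage entry.
# Writes to a temp file next to state.json and only replaces it on success.
set_sub_stage() {
  local state="$1/state.json" sub="$2" status="$3" note="${4:-}" elapsed="${5:-0}" tmp
  tmp=$(mktemp "${state}.XXXXXX") || return 1
  if jq --arg stage "$STAGE_8" --arg sub "$sub" --arg status "$status" \
        --arg note "$note" --argjson elapsed "$elapsed" \
        --arg now "$(date -u +%Y-%m-%dT%H:%M:%SZ)" '
      (.stages[] | select(.stage == $stage) | .sub_stages[$sub]) |=
        if $status == "in_progress"
        then {status: $status, started_at: $now}
        else (. // {}) + {status: $status, completed_at: $now,
                          elapsed_seconds: $elapsed, notes: [$note]}
        end' "$state" > "$tmp"; then
    mv "$tmp" "$state"
  else
    rm -f "$tmp"
    return 1
  fi
}

# Hypothetical helper: skip a sub-stage that is already done, otherwise run
# the given command and record the outcome with elapsed time.
run_sub_stage() {
  local workspace="$1" sub="$2" status start
  shift 2
  status=$(jq -r --arg stage "$STAGE_8" --arg sub "$sub" \
    '.stages[] | select(.stage == $stage) | .sub_stages[$sub].status // "not_started"' \
    "$workspace/state.json")
  if [ "$status" = "done" ]; then
    echo "Sub-stage $sub already done, skipping"
    return 0
  fi
  set_sub_stage "$workspace" "$sub" in_progress || return 1
  start=$(date +%s)
  if "$@"; then
    set_sub_stage "$workspace" "$sub" done "$sub completed" "$(( $(date +%s) - start ))"
  else
    set_sub_stage "$workspace" "$sub" blocked "$sub command failed" "$(( $(date +%s) - start ))"
    return 1
  fi
}
```

For example, the benchmark sub-stage could be run as `run_sub_stage "$WORKSPACE" ground-truth-benchmark go run ./scripts/text_pipeline_ground_truth.go benchmark -root "${LYCEUM_TEXTS_DIR:-output/texts}" -work "$SLUG"`.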

**On failure**: Mark the failed sub-stage as `blocked` or `failed` with notes:
```bash
nix-shell -p jq --run "jq '(.stages[] | select(.stage == \"8-interlinear-morphology\") | .sub_stages.\"llm-interlinear\") = {status: \"blocked\", completed_at: \"$(date -Iseconds)\", notes: [\"LLM API rate limit exceeded\"]}' ${WORKSPACE}/state.json" > tmp.json && mv tmp.json ${WORKSPACE}/state.json
```

**Clearing sub-stages** (e.g., when replaying Stage 8):
```bash
nix-shell -p jq --run "jq '(.stages[] | select(.stage == \"8-interlinear-morphology\") | .sub_stages) = {}' ${WORKSPACE}/state.json" > tmp.json && mv tmp.json ${WORKSPACE}/state.json
```

**CRITICAL for Stage 8**: The LLM pipeline (`generate_workspace_interlinear.py`) produces
contextual glosses like "wrath", "him", "walking". The Go-based pipeline produces dictionary-style
glosses like "to be", "he, she, it; self". ALWAYS use the Python LLM pipeline. Also, `gloss-review`
expects the workspace ground-truth benchmark artifact to already exist, so run
`text_pipeline_ground_truth.go benchmark` before `text_pipeline_gloss_review.go benchmark`/`promote`.

**Performance expectations**:
- **Agent mode** (`--mode agent`): ~5 minutes per 100 words (subprocess cold-start dominates)
- **API mode** (`--mode api`): ~30 seconds per 100 words (5-10x faster, requires `ANTHROPIC_API_KEY`)
- **Section splitting**: Sections >200 words are automatically split by `llm_interlinear.py` to maintain reasonable batch sizes
- **Recommendation**: For texts with >500 total words, set `ANTHROPIC_API_KEY` and use `--mode api` to avoid excessive pipeline overhead
- **Typical wall time** (agent mode): A 1600-word text (e.g., Meditations Book 1) takes ~90 minutes
- **Sub-stage timing**: Track elapsed time for each sub-stage to identify bottlenecks
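
The per-100-word figures above give a quick lower-bound estimate for sub-stage 2. The sketch below only applies that arithmetic. `estimate_minutes` is a hypothetical helper, and it ignores section splitting and start-up overhead, so real runs can take longer.

```bash
# Hypothetical helper: estimate LLM interlinear wall time in whole minutes
# from a word count, using the per-100-word rates listed above.
estimate_minutes() {
  local words="$1" mode="${2:-agent}" secs_per_100
  case "$mode" in
    agent) secs_per_100=300 ;;  # ~5 minutes per 100 words
    api)   secs_per_100=30 ;;   # ~30 seconds per 100 words
    *) echo "unknown mode: $mode" >&2; return 1 ;;
  esac
  # Total seconds, rounded up to whole minutes.
  echo $(( (words * secs_per_100 / 100 + 59) / 60 ))
}
```

`estimate_minutes 1600 agent` prints `80`, somewhat under the ~90 minutes quoted for a 1600-word text, and `estimate_minutes 1600 api` prints `8`.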

---

## `/text-orchestrator replay`

Replay specific stages on an existing workspace by invalidating and re-executing them according to a replay profile.

### Purpose

Use when you need to:
- Upgrade source files for a shipped text
- Redo cleaning with improved rules
- Regenerate segmentation after fixing bugs
- Refresh witnesses or versification
- Rebuild transliteration with new standards
- Regenerate interlinear with improved LLM prompts
- Re-run reader QA after fixes
- Completely rehab a text

**Key difference from `run`**: Replay expects the workspace to already exist and uses replay profiles to determine which stages to invalidate and re-execute.

**Key difference from `resume`**: Resume picks up where a pipeline stopped. Replay explicitly invalidates stages via a profile before executing, even if those stages were previously marked `done`.

### Available Replay Profiles

| Profile | Stages Replayed | Downstream Invalidated | Use When |
|---------|----------------|----------------------|----------|
| `source-upgrade` | 0,1,2,3,4 | 5,6,7,8,9,10 | Better Greek source found |
| `cleaning` | 0,2,3 | 4,5,6,7,8,9,10 | Cleaning rules improved |
| `segmentation` | 0,4 | 5,6,7,8,9,10 | Segmentation logic fixed |
| `witness-upgrade` | 0,1,5 | 6,8,9,10 | New PD witnesses available |
| `versification` | 0,4,5,6,9,10 | - | Versification logic changed |
| `transliteration` | 0,7,9,10 | - | Transliteration standard updated |
| `interlinear` | 4,5,6,7,8 | 5,6,7,8,9,10 | Regenerate stages 4-8 from existing source data |
| `treebank-enrichment` | 0,8,9,10 | - | Treebank data upgraded |
| `reliability` | 0,9,10 | - | Reader QA criteria changed |
| `full-rehab` | All stages | All stages | Complete text rehabilitation |

**Note on invalidation cascades**: The table shows the *requested* replay stages. Invalidation rules in `textpipeline.go` may mark additional downstream stages as `invalidated`, making them actionable too. For example, `segmentation` profile requests stage 4, but stage 4 invalidates 5,6,7,8,9,10 — so all of those become actionable and will execute during the replay.

### Steps

1. **Verify workspace exists**:
   ```bash
   SLUG="your-text-slug"
   WORKSPACE="${LYCEUM_TEXTS_DIR:-output/texts}/${SLUG}"
   
   if [[ ! -d "${WORKSPACE}" ]]; then
     echo "ERROR: Workspace not found at ${WORKSPACE}"
     echo "Use '/text-orchestrator run' to create a new workspace"
     exit 1
   fi
   
   if [[ ! -f "${WORKSPACE}/manifest.json" ]] || [[ ! -f "${WORKSPACE}/state.json" ]]; then
     echo "ERROR: Workspace missing critical files (manifest.json or state.json)"
     exit 1
   fi
   ```

2. **Plan the replay** via `text-replay`:
   ```
   Load skill: text-replay
   Execute: /text-replay plan [work] --profile [profile]
   
   Verify:
   - Profile matches one of the available profiles
   - Requested stages are identified
   - Downstream invalidations are computed
   - Current state is understood
   ```

3. **Apply the replay plan**:
   ```
   Execute: /text-replay run [work] --profile [profile]
   
   This updates:
   - state.json (marks replay stages as 'pending', downstream as 'invalidated')
   - replay/stage-history.json (appends replay event)
   - Sets current_stage to the first actionable stage
   ```

4. **Enter the stage execution loop** (same as `run`, but starting from first actionable stage):
   ```
   WHILE next_actionable_stage exists AND next_actionable_stage != "10-human-review":
     stage = determine_next_actionable_stage()  # Any stage with status 'pending' or 'invalidated'
     skill = map_stage_to_skill(stage)
     
     ANNOUNCE: "=== Replaying Stage: {stage} via {skill} ==="
     
     # Follow exact same protocol as /text-orchestrator run:
     # 1. Create backup (MANDATORY)
     # 2. Load skill file
     # 3. Spawn subagent with safety constraints and timeout
     # 4. Verify workspace integrity
     # 5. Run verification script
     # 6. Update state
     # 7. Advance or stop
     
     IF blocked OR failed:
       STOP with report
   ```

5. **Reach ship gate or stop**:
   ```
   IF current_stage == "10-human-review":
     ANNOUNCE: "Replay reached Stage 10 (Human Review)"
     ANNOUNCE: "Automatic execution complete. Human approval required."
     Execute: /new-text-ship preview [work]
     STOP (do not auto-promote)
   ELSE IF blocked:
     ANNOUNCE: "Replay stopped at stage {stage}: {blocker_reason}"
     STOP
   ELSE:
     ANNOUNCE: "Replay complete. All actionable stages executed."
     STOP
   ```

### Actionable Stage Logic

A stage is actionable if its status is:
- `pending` (explicitly requested by the replay profile)
- `invalidated` (downstream of a replayed stage)

The orchestrator executes actionable stages in canonical order (0 → 10), regardless of which profile triggered them.

This logic is already implemented in `internal/textpipeline/textpipeline.go` via `nextActionableStage()`.
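
For inspecting a workspace by hand, the same selection can be approximated from the shell. This is a sketch only: `next_actionable_stage` here is a hypothetical shell function, it assumes each `.stages[]` entry carries `stage` and `status` fields as shown elsewhere in this document, and the Go implementation remains authoritative.

```bash
# Hypothetical helper: print the first stage whose status is pending or
# invalidated, ordered by the numeric prefix of the stage ID (0 through 10).
# Prints nothing when no stage is actionable.
next_actionable_stage() {
  jq -r '
    [ .stages[]
      | select(.status == "pending" or .status == "invalidated")
      | .stage ]
    | sort_by(split("-")[0] | tonumber)
    | first // empty
  ' "$1/state.json"
}
```

Sorting on the numeric prefix matters because a plain string sort would place `10-human-review` before `4-structural-segmentation`.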

### Safety Protocols

Replay follows **ALL** the same defense mechanisms as `run`:
- Pre-stage backup (MANDATORY)
- Post-stage integrity verification
- Recovery from backup on integrity failure
- Subagent safety constraints
- Stage timeouts
- Verification script execution
- State updates and history logging
- Backup cleanup (keep last 3)

See "Defense Mechanisms" section for full details.

### Example Replay Transcript

```
User: /text-orchestrator replay "plato-republic-book-1" --profile interlinear

Orchestrator: === Verifying Workspace ===
Workspace found: ${LYCEUM_TEXTS_DIR:-output/texts}/plato-republic-book-1
Manifest: plato-republic-book-1 (add mode)
Current state: Stage 8 done, Stage 9 done, Stage 10 pending

=== Planning Replay ===
Loading skill: text-replay
Executing: /text-replay plan "plato-republic-book-1" --profile interlinear

Replay Profile: interlinear
Requested stages: 4, 5, 6, 7, 8
Invalidated stages: 5, 6, 7, 8, 9, 10 (downstream from Stage 4)
Actionable stages: 4 (pending), 5 (invalidated), 6 (invalidated), 7 (invalidated), 8 (invalidated)

=== Applying Replay Plan ===
Executing: /text-replay run "plato-republic-book-1" --profile interlinear

Updated state.json:
  - Stages 4-8 marked as pending/invalidated
  - current_stage set to 4-structural-segmentation

Appended to replay/stage-history.json:
  - Replay event: profile=interlinear, at=2026-03-22T14:30:00Z

=== Entering Stage Execution Loop ===

=== Replaying Stage: 4-structural-segmentation via segmentation ===
Creating backup: ${LYCEUM_TEXTS_DIR:-output/texts}/plato-republic-book-1.backup-4-structural-segmentation-20260322-143000
Loading skill: segmentation
Executing: /segmentation run "plato-republic-book-1"

Stage 4 complete.
Integrity check: PASSED
Verification script: verify_stage_4.sh PASSED

=== Replaying Stage: 5-witness-normalization-ranking via translation-witness ===
Creating backup: ${LYCEUM_TEXTS_DIR:-output/texts}/plato-republic-book-1.backup-5-witness-normalization-ranking-20260322-143100
Loading skill: translation-witness
Executing: /translation-witness run "plato-republic-book-1"

Stage 5 complete.
Integrity check: PASSED
Verification script: verify_stage_5.sh PASSED

=== Replaying Stage: 6-versification-alignment via versify ===
Creating backup: ${LYCEUM_TEXTS_DIR:-output/texts}/plato-republic-book-1.backup-6-versification-alignment-20260322-143200

Stage 6 complete.
Integrity check: PASSED
Verification script: verify_stage_6b.sh PASSED

=== Replaying Stage: 7-transliteration via transliteration ===
Creating backup: ${LYCEUM_TEXTS_DIR:-output/texts}/plato-republic-book-1.backup-7-transliteration-20260322-143300

Stage 7 complete.
Integrity check: PASSED
Verification script: verify_stage_7.sh PASSED

=== Replaying Stage: 8-interlinear-morphology via treebank → alignment → gloss-review ===
Creating backup: ${LYCEUM_TEXTS_DIR:-output/texts}/plato-republic-book-1.backup-8-interlinear-morphology-20260322-143400

[Stage 8 multi-skill execution: treebank, LLM interlinear, ground-truth benchmark, gloss-review]

Stage 8 complete.
Integrity check: PASSED
Verification script: verify_stage_8.sh PASSED

=== Replay Complete ===
All actionable stages (4-8) executed successfully.
Stages 9-10 remain invalidated — run full pipeline or reliability profile to re-verify.
```

### Common Replay Scenarios

**Scenario 1: Source Upgrade**
```
Problem: Found a better Greek source for an existing text
Solution: /text-orchestrator replay [work] --profile source-upgrade
Effect: Stages 0-4 re-run, stages 5-10 invalidated and re-executed
```

**Scenario 2: Interlinear Improvement**
```
Problem: Improved LLM prompt for glossing
Solution: /text-orchestrator replay [work] --profile interlinear
Effect: Stages 4-8 regenerated from existing source data, stages 9-10 invalidated but not executed
```

**Scenario 3: Transliteration Standard Change**
```
Problem: Updated transliteration rules repo-wide
Solution: /text-orchestrator replay [work] --profile transliteration
Effect: Stage 7 re-runs, stages 9-10 invalidated (but stages 5-6 and 8 are NOT invalidated)
```

**Scenario 4: Complete Rehab**
```
Problem: Text needs to be rebuilt from scratch (but keep original sources)
Solution: /text-orchestrator replay [work] --profile full-rehab
Effect: All stages re-execute from 0 through 10
```

### Pre-Replay Checks

Before starting a replay, verify:

1. **Workspace integrity**:
   ```bash
   bash scripts/verify_workspace_integrity.sh "${SLUG}"
   ```

2. **No concurrent operations**:
   ```bash
   nix-shell -p jq --run "jq -r '.stages[] | select(.status == \"in_progress\") | .stage' ${LYCEUM_TEXTS_DIR:-output/texts}/${SLUG}/state.json"
   ```
   If this prints any stage ID, that stage is `in_progress`; resume or abort it before replaying.

3. **Profile appropriateness**:
   - Reserve `full-rehab` for texts where every stage must be rebuilt; it re-executes all stages
   - Use the most specific profile that matches your intent
   - Check invalidation cascade: `text-replay plan` shows what *will* be invalidated

### Error Handling

Same as `run`, plus:

**Workspace Not Found**:
```
ERROR: Workspace not found at ${LYCEUM_TEXTS_DIR:-output/texts}/[slug]
ACTION: Use '/text-orchestrator run' to create a new workspace, not replay.
```

**Invalid Profile**:
```
ERROR: Unknown replay profile '[profile]'
ACTION: Use one of: source-upgrade, cleaning, segmentation, witness-upgrade, 
        versification, transliteration, interlinear, treebank-enrichment, 
        reliability, full-rehab
```

**Stage Currently In Progress**:
```
ERROR: Stage [stage] is currently in_progress
ACTION: Resume or abort the current operation before replaying.
```

---

## `/text-orchestrator resume`

Resume a pipeline that was stopped.

### Steps

1. Load workspace and determine current state
2. Find the next actionable stage
3. Continue the stage loop from that point
4. Follow same stopping conditions as `run`

### Pre-Resume Checks
- Verify workspace exists
- Check if any stages are `blocked` (require manual intervention)
- Check if pipeline is already complete

---

## `/text-orchestrator step`

Execute exactly one stage, then stop.

### Use Cases
- Testing a single stage
- Manual intervention between stages
- Debugging pipeline issues

### Behavior
Same as `run` but exits after one stage completes (success or failure).

---

## `/text-orchestrator status`

Report pipeline progress without executing anything.

### Output Format
```
Pipeline Status: [work]
==================
Workspace: ${LYCEUM_TEXTS_DIR:-output/texts}/[slug]
Mode: [add|apply]
Profile: [profile]

Stage Progress:
  [✓] 0-intake (done)
  [✓] 1-source-discovery (done)
  [→] 2-acquisition-extraction (in_progress)
  [ ] 3-cleaning-normalization (pending)
  ...

Next Stage: 2-acquisition-extraction
Next Skill: source-extract
Blockers: [none | list]
```

---

## `/text-orchestrator abort`

Mark pipeline as blocked and stop.

### Steps
1. Update state.json to mark current stage as `blocked`
2. Add note explaining why
3. Append history event

---

## State Management

### After Each Stage

Update `state.json` to reflect:
- Stage status: `done`, `blocked`, or `failed`
- `last_run_at` timestamp
- Outputs produced
- Notes from execution

Individual pipeline scripts (e.g., `text_pipeline_segmentation.go`) handle state updates internally. For manual state updates, use jq or direct file editing.

### Appending History

Every stage execution should append to `replay/stage-history.json`:
```json
{
  "at": "2026-03-17T19:00:00Z",
  "stage": "1-source-discovery",
  "action": "stage_completed",
  "notes": "Found 3 Greek candidates, 2 English witnesses"
}
```
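
Appending an event can be done with jq. The sketch below assumes `replay/stage-history.json` holds a JSON array of such events, which this document does not state explicitly. `append_history` is a hypothetical helper, and it calls `jq` directly rather than through `nix-shell`.

```bash
# Hypothetical helper: append one event to replay/stage-history.json.
# ASSUMPTION: the history file is a JSON array of event objects.
append_history() {
  local workspace="$1" stage="$2" action="$3" notes="$4"
  local file="$workspace/replay/stage-history.json" tmp

  mkdir -p "$workspace/replay"
  # Start a new history when the file is missing or empty.
  [ -s "$file" ] || echo '[]' > "$file"

  tmp=$(mktemp "${file}.XXXXXX") || return 1
  if jq --arg at "$(date -u +%Y-%m-%dT%H:%M:%SZ)" --arg stage "$stage" \
        --arg action "$action" --arg notes "$notes" \
        '. + [{at: $at, stage: $stage, action: $action, notes: $notes}]' \
        "$file" > "$tmp"; then
    mv "$tmp" "$file"
  else
    # Leave the existing history untouched if it is not a valid array.
    rm -f "$tmp"
    return 1
  fi
}
```

Passing values through `--arg` keeps quotes and other special characters in the notes from breaking the JSON.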

### Sub-Stage Tracking

Complex stages (e.g., Stage 6, Stage 8) consist of multiple sub-stages that can be tracked independently. The state model supports sub-stage tracking via the `sub_stages` map in each stage's state.

**Sub-stage state structure**:
```json
{
  "stages": [
    {
      "stage": "8-interlinear-morphology",
      "status": "in_progress",
      "sub_stages": {
        "treebank": {
          "status": "done",
          "started_at": "2026-03-23T10:00:00Z",
          "completed_at": "2026-03-23T10:05:00Z",
          "elapsed_seconds": 300,
          "notes": ["Treebank import completed"]
        },
        "llm-interlinear": {
          "status": "in_progress",
          "started_at": "2026-03-23T10:05:00Z"
        }
      }
    }
  ]
}
```

**Canonical sub-stage names**:
- **Stage 6**: `try-witness`, `translation-synthesis`, `versify`
- **Stage 8**: `treebank`, `llm-interlinear`, `ground-truth-benchmark`, `gloss-review`

**Updating sub-stage status**: no `scripts/text_pipeline_state.go` helper exists, so update sub-stage state directly with jq.

Mark sub-stage in progress:
```bash
SLUG="your-text-slug"
STAGE="8-interlinear-morphology"
SUB="treebank"
WORKSPACE="${LYCEUM_TEXTS_DIR:-output/texts}/${SLUG}"

nix-shell -p jq --run "jq '(.stages[] | select(.stage == \"${STAGE}\") | .sub_stages.\"${SUB}\") = {status: \"in_progress\", started_at: \"$(date -Iseconds)\"}' ${WORKSPACE}/state.json" > tmp.json && mv tmp.json ${WORKSPACE}/state.json
```

Mark sub-stage done:
```bash
nix-shell -p jq --run "jq '(.stages[] | select(.stage == \"${STAGE}\") | .sub_stages.\"${SUB}\") += {status: \"done\", completed_at: \"$(date -Iseconds)\", elapsed_seconds: ${ELAPSED}, notes: [\"Sub-stage completed\"]}' ${WORKSPACE}/state.json" > tmp.json && mv tmp.json ${WORKSPACE}/state.json
```

Mark sub-stage blocked/failed:
```bash
nix-shell -p jq --run "jq '(.stages[] | select(.stage == \"${STAGE}\") | .sub_stages.\"${SUB}\") = {status: \"blocked\", completed_at: \"$(date -Iseconds)\", notes: [\"Reason for blocking\"]}' ${WORKSPACE}/state.json" > tmp.json && mv tmp.json ${WORKSPACE}/state.json
```

Clear all sub-stages (when replaying a stage):
```bash
nix-shell -p jq --run "jq '(.stages[] | select(.stage == \"${STAGE}\") | .sub_stages) = {}' ${WORKSPACE}/state.json" > tmp.json && mv tmp.json ${WORKSPACE}/state.json
```

**Resume logic**: Before executing a sub-stage, check its status:
```bash
STATUS=$(nix-shell -p jq --run "jq -r '.stages[] | select(.stage == \"${STAGE}\") | .sub_stages.\"${SUB}\".status // \"not_started\"' ${WORKSPACE}/state.json")

if [[ "$STATUS" == "done" ]]; then
  echo "Sub-stage ${SUB} already done, skipping"
  # Skip to next sub-stage
else
  # Execute sub-stage
fi
```

**Benefits**:
- Fine-grained progress tracking within long-running stages
- Ability to resume from within a stage after interruption
- Detailed timing data for performance analysis
- Clear visibility into which sub-step is blocking

---

## Error Handling

### Stage Skill Not Found
```
ERROR: No skill mapping for stage [stage]
ACTION: Stop and report. This indicates a gap in the skill inventory.
```

### Stage Execution Failed
```
ERROR: Stage [stage] failed with: [error]
ACTION: Mark stage as blocked, record error, stop orchestration.
```

### Verification Failed
```
WARNING: Stage [stage] claims done but verification failed
ACTION: Mark as blocked with verification failure note, stop.
```

### Human Intervention Required
```
INFO: Stage [stage] requires human decision
ACTION: Stop and report what decision is needed.
```

---

## Defense Mechanisms

### Pre-Stage Backup

Before invoking ANY stage skill, the orchestrator MUST create a backup:

```bash
# Create timestamped backup before stage execution
SLUG="your-text-slug"
STAGE="N-stage-name"
TIMESTAMP=$(date +%Y%m%d-%H%M%S)
BACKUP_DIR="${LYCEUM_TEXTS_DIR:-output/texts}/${SLUG}.backup-${STAGE}-${TIMESTAMP}"

# Full workspace backup
cp -r "${LYCEUM_TEXTS_DIR:-output/texts}/${SLUG}" "${BACKUP_DIR}"

# Verify backup was created
if [[ ! -d "${BACKUP_DIR}" ]]; then
  echo "FATAL: Failed to create backup at ${BACKUP_DIR}"
  exit 1
fi

echo "Backup created: ${BACKUP_DIR}"
```

### Post-Stage Integrity Verification

After EVERY stage completes (success, failure, or timeout), verify workspace integrity:

```bash
SLUG="your-text-slug"
WORKSPACE="${LYCEUM_TEXTS_DIR:-output/texts}/${SLUG}"

# Critical checks
INTEGRITY_OK=true

# 1. Workspace directory exists
if [[ ! -d "${WORKSPACE}" ]]; then
  echo "CRITICAL: Workspace directory deleted!"
  INTEGRITY_OK=false
fi

# 2. manifest.json exists and is valid JSON
if [[ ! -f "${WORKSPACE}/manifest.json" ]]; then
  echo "CRITICAL: manifest.json missing!"
  INTEGRITY_OK=false
elif ! nix-shell -p jq --run "jq '.' '${WORKSPACE}/manifest.json'" > /dev/null 2>&1; then
  echo "CRITICAL: manifest.json is not valid JSON!"
  INTEGRITY_OK=false
fi

# 3. state.json exists and is valid JSON
if [[ ! -f "${WORKSPACE}/state.json" ]]; then
  echo "CRITICAL: state.json missing!"
  INTEGRITY_OK=false
elif ! jq empty "${WORKSPACE}/state.json" > /dev/null 2>&1; then
  echo "CRITICAL: state.json is not valid JSON!"
  INTEGRITY_OK=false
fi

if [[ "${INTEGRITY_OK}" != "true" ]]; then
  echo "INTEGRITY CHECK FAILED - INITIATING RECOVERY"
fi
```

### Recovery Protocol

If integrity check fails after a stage:

1. **Stop immediately** — do not continue to next stage
2. **Locate most recent backup**:
   ```bash
   ls -td "${LYCEUM_TEXTS_DIR:-output/texts}/${SLUG}".backup-* | head -1
   ```
3. **Restore from backup**:
   ```bash
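   BACKUP=$(ls -td "${LYCEUM_TEXTS_DIR:-output/texts}/${SLUG}".backup-* 2>/dev/null | head -1)
   # Never remove the workspace unless a backup actually exists
   [[ -d "${BACKUP}" ]] || { echo "FATAL: no backup found for ${SLUG}"; exit 1; }
   rm -rf "${LYCEUM_TEXTS_DIR:-output/texts}/${SLUG}"  # Remove corrupted workspace
   cp -r "${BACKUP}" "${LYCEUM_TEXTS_DIR:-output/texts}/${SLUG}"
   echo "Restored from ${BACKUP}"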
   ```
4. **Mark current stage as failed** in state.json
5. **Report the failure** with:
   - Which stage was running
   - What integrity check failed
   - Which backup was restored
   - Any error output captured
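
Step 4 could look like the sketch below. The `state.json` schema is not shown in this section, so the `.stages[<id>].status` and `.error` layout is an assumption, as is the helper name `mark_stage_failed`. Adjust the jq filter to the real state model. The function writes to a temporary file and renames it, so a jq error cannot leave `state.json` truncated.

```bash
# Sketch only: the `.stages[<id>].status` / `.error` layout is an ASSUMPTION,
# not the schema defined by the pipeline. Adjust the jq filter to match.
mark_stage_failed() {
  local state_file="$1" stage="$2" reason="$3"
  local tmp
  tmp="$(mktemp "${state_file}.XXXXXX")" || return 1

  # Write to a temp file first so a jq error never truncates state.json.
  if jq --arg stage "${stage}" --arg reason "${reason}" \
       '.stages[$stage].status = "failed" | .stages[$stage].error = $reason' \
       "${state_file}" > "${tmp}"; then
    mv "${tmp}" "${state_file}"
  else
    rm -f "${tmp}"
    return 1
  fi
}
```

A call might look like `mark_stage_failed "${WORKSPACE}/state.json" "${STAGE}" "workspace integrity violated"`.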

### Subagent Sandboxing

When spawning subagents via `pi -p --no-session`, the orchestrator MUST prepend these safety instructions to the prompt:

```
CRITICAL SAFETY CONSTRAINTS:
1. DO NOT delete the workspace directory (${LYCEUM_TEXTS_DIR:-output/texts}/SLUG)
2. DO NOT run `rm -rf` on any parent directory of the workspace
3. DO NOT run destructive commands outside the workspace
4. ONLY modify files WITHIN the workspace directory
5. If you encounter an error, report it - do not attempt cleanup by deletion
6. The workspace directory MUST still exist when you complete

Workspace path: ${LYCEUM_TEXTS_DIR:-output/texts}/SLUG
This directory and its contents must remain intact.
```
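
One way to avoid hand-editing the `SLUG` placeholder is to render the constraints from the resolved workspace path. This is a minimal sketch; the helper name `build_safety_preamble` is hypothetical and not part of the pipeline.

```bash
# Hypothetical helper: renders the safety constraints for one workspace path.
# Backticks are escaped because the here-document is unquoted (so that
# ${workspace} expands).
build_safety_preamble() {
  local workspace="$1"
  cat <<EOF
CRITICAL SAFETY CONSTRAINTS:
1. DO NOT delete the workspace directory (${workspace})
2. DO NOT run \`rm -rf\` on any parent directory of the workspace
3. DO NOT run destructive commands outside the workspace
4. ONLY modify files WITHIN the workspace directory
5. If you encounter an error, report it - do not attempt cleanup by deletion
6. The workspace directory MUST still exist when you complete

Workspace path: ${workspace}
This directory and its contents must remain intact.
EOF
}
```

The result can then be prepended to the prompt, for example `PROMPT="$(build_safety_preamble "${WORKSPACE}")"` followed by the skill invocation line.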

### Timeout Configuration

Default timeouts by stage complexity:

| Stage Type | Timeout | Notes |
|---|---|---|
| Simple (intake, transliteration) | 300s | Minimal external dependencies |
| Medium (cleaning, segmentation) | 600s | File processing |
| Complex (source-hunt, versification) | 900s | May involve web lookups or LLM calls |
| Very Complex (interlinear) | 1800s | Minimum for small texts (<500 words). Use 3600s for texts >1000 words in agent mode |

**Stage 8 timeout guidance**: In agent mode, Stage 8 requires ~5 minutes per 100 words due to subprocess overhead.
A 1600-word text takes ~90 minutes (about 5400s), which exceeds the 3600s (1 hour) value. Treat 1800s for small texts
and 3600s for larger works as minimums, and scale the timeout to the word count. API mode is 5-10x faster and can use shorter timeouts.
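
The table and the per-word estimate can be folded into a small lookup, sketched below. The function names, the mapping from stage IDs to table rows, the 900s fallback for stages the table does not name, and the 25% headroom are all assumptions made for illustration. The headroom is there because the plain estimate (3 seconds per word) gives 4800s for 1600 words, which is below the ~90 minutes quoted for a text of that size.

```bash
# Default timeout in seconds by stage ID, following the table.
# ASSUMPTION: stages the table does not name fall back to 900s.
stage_timeout() {
  case "$1" in
    0-intake|7-transliteration)          echo 300 ;;
    3-cleaning|4-segmentation)           echo 600 ;;
    1-source-discovery|6b-versification) echo 900 ;;
    8-interlinear)                       echo 1800 ;;
    *)                                   echo 900 ;;
  esac
}

# Stage 8 in agent mode: ~5 minutes per 100 words is 3 seconds per word.
# ASSUMPTION: 25% headroom on top of the estimate; never below the 1800s floor.
stage8_agent_timeout() {
  local words="$1"
  local estimate=$(( words * 3 * 125 / 100 ))
  if (( estimate < 1800 )); then
    estimate=1800
  fi
  echo "${estimate}"
}
```

With these numbers, any text over 1000 words gets more than 3600s, which matches the table's guidance for larger works.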

### Subagent Invocation Pattern

The safe pattern for invoking stage skills:

```bash
SLUG="your-text-slug"
STAGE="N-stage-name"
SKILL="skill-name"
WORKSPACE="${LYCEUM_TEXTS_DIR:-output/texts}/${SLUG}"
TIMEOUT=900

# 1. Create backup
BACKUP="${LYCEUM_TEXTS_DIR:-output/texts}/${SLUG}.backup-${STAGE}-$(date +%Y%m%d-%H%M%S)"
cp -r "${WORKSPACE}" "${BACKUP}" || { echo "FATAL: Failed to create backup at ${BACKUP}"; exit 1; }

# 2. Spawn subagent with safety constraints and timeout
mkdir -p "${WORKSPACE}/logs"
timeout "${TIMEOUT}" pi -p --no-session "
CRITICAL SAFETY CONSTRAINTS:
- DO NOT delete ${WORKSPACE}
- DO NOT run rm -rf on parent directories
- Workspace MUST exist when you complete

/skill:${SKILL} run ${SLUG}
" 2>&1 | tee "${WORKSPACE}/logs/${STAGE}.log"

# $? would report tee's status; PIPESTATUS[0] is the exit code of `timeout`
SUBAGENT_EXIT=${PIPESTATUS[0]}

# 3. Verify integrity regardless of exit code
if [[ ! -d "${WORKSPACE}" ]] || [[ ! -f "${WORKSPACE}/manifest.json" ]] || [[ ! -f "${WORKSPACE}/state.json" ]]; then
  echo "INTEGRITY CHECK FAILED"
  echo "Restoring from backup: ${BACKUP}"
  rm -rf "${WORKSPACE}" 2>/dev/null
  cp -r "${BACKUP}" "${WORKSPACE}"
  # Mark stage failed
  echo "Stage ${STAGE} failed: workspace integrity violated"
  exit 1
fi

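# 4. Check subagent exit status
if [[ ${SUBAGENT_EXIT} -eq 124 ]]; then
  echo "WARNING: Subagent timed out after ${TIMEOUT}s"
  # Workspace intact, but check if stage completed
elif [[ ${SUBAGENT_EXIT} -ne 0 ]]; then
  echo "ERROR: Subagent exited with status ${SUBAGENT_EXIT}"
  # Mark stage as blocked, record error, stop orchestration
fi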

# 5. Clean up old backups (keep last 3)
ls -td "${LYCEUM_TEXTS_DIR:-output/texts}/${SLUG}".backup-* 2>/dev/null | tail -n +4 | while IFS= read -r old; do rm -rf "${old}"; done
```

### Orchestrator Loop with Defenses

Updated stage execution loop:

```
FOR each stage in pipeline:
  1. ANNOUNCE: "=== Pre-Stage ${stage}: Creating Backup ==="
  2. CREATE BACKUP of workspace
  3. VERIFY backup exists
  
  4. ANNOUNCE: "=== Executing Stage: ${stage} via ${skill} ==="
  5. SPAWN subagent with:
     - Safety constraints prepended to prompt
     - Appropriate timeout
     - Output logging
  
  6. ANNOUNCE: "=== Post-Stage ${stage}: Integrity Check ==="
  7. VERIFY workspace integrity:
     - Directory exists
     - manifest.json valid
     - state.json valid
  
  8. IF integrity check FAILED:
     - RESTORE from backup
     - MARK stage as failed
     - STOP orchestration
     - REPORT failure details
  
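  9. IF stage failed, or timed out without completing (workspace intact):
     - MARK stage as blocked, record error
     - STOP orchestration

  10. IF stage completed successfully:
     - RUN VERIFICATION SCRIPT (see below)
     - IF verification FAILS: MARK stage blocked, STOP
     - IF verification WARNS: record warnings, ADVANCE
     - IF verification PASSES: ADVANCE to next stage

  11. CLEANUP old backups (keep last 3)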
```

---

## Verification Scripts

**CRITICAL**: After every stage completes, the orchestrator MUST run the corresponding
verification script before advancing. These scripts verify quality, not just file existence.

### Stage-to-Script Mapping

| Stage | Script | What it verifies |
|-------|--------|-----------------|
| 0-intake | `scripts/verify_stage_0.sh` | URNs, manifest, directories |
| 1-source-discovery | `scripts/verify_stage_1.sh` | Source provenance, recommendations |
| 2-acquisition | `scripts/verify_stage_2.sh` | Greek/English content, file sizes |
| 3-cleaning | `scripts/verify_stage_3.sh` | No HTML, no bleed-through, report quality |
| 4-segmentation | `scripts/verify_stage_4.sh` | Segment counts, DB import, cross-check |
| 5-witnesses | `scripts/verify_stage_5.sh` | Catalog roles, provenance, file existence |
| 6a-translation | `scripts/verify_stage_6a.sh` | Coverage, reviews, metadata, DB cross-check |
| 6b-versification | `scripts/verify_stage_6b.sh` | Generated edition, segment match, no empties |
| 7-transliteration | `scripts/verify_stage_7.sh` | Artifact existence, no Greek leakage |
| 8-interlinear | `scripts/verify_stage_8.sh` | Gloss quality, completeness, dictionary-style detection, ground-truth benchmark presence |
| 9-reader-reliability | `scripts/verify_stage_9.sh` | Full reader stack, feature matrix, server build |
| 10-ship-gate | `scripts/verify_stage_10.sh` | Runs ALL prior verifications, final cross-check |
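
Every row in the table follows the same naming rule: the script suffix is the part of the stage ID before the first hyphen. A minimal sketch of that lookup, with the hypothetical helper name `verify_script_for_stage`:

```bash
# Maps a stage ID to its verification script by taking the prefix
# before the first hyphen, e.g. "6a-translation" -> scripts/verify_stage_6a.sh
verify_script_for_stage() {
  local stage="$1"
  echo "scripts/verify_stage_${stage%%-*}.sh"
}
```

This relies only on the naming pattern visible in the table. It does not check that the script exists, so the caller should still treat a missing file as a verification failure.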

### Invocation Pattern

```bash
SLUG="your-text-slug"
STAGE_SCRIPT="scripts/verify_stage_N.sh"  # or 6a, 6b, etc.

bash "${STAGE_SCRIPT}" "${SLUG}"
VERIFY_EXIT=$?

case $VERIFY_EXIT in
  0) echo "VERIFICATION PASSED" ;;
  1) echo "VERIFICATION FAILED — blocking stage advancement"
     # Mark stage as blocked with verification failure
     ;;
  2) echo "VERIFICATION WARNINGS — proceeding with notes"
     # Record warnings in stage history
     ;;
  *) echo "VERIFICATION ERROR: unexpected exit code ${VERIFY_EXIT}; treating as failure"
     # Mark stage as blocked; the script may be missing or may have crashed
     ;;
esac
```

### Exit Code Semantics

| Exit Code | Meaning | Action |
|-----------|---------|--------|
| 0 | PASS — all checks passed | Advance to next stage |
| 1 | FAIL — critical check failed | Block stage, stop orchestration |
| 2 | WARN — non-critical issues | Advance with warning notes in history |

### Special Cases

**Stage 6** verification depends on the path taken:
- **Fast path** (witness versification succeeded): run only `verify_stage_6b.sh` (the versified witness satisfies the same structural contract)
- **Standard path** (generation required): run `verify_stage_6a.sh` after `translation-synthesis`, then `verify_stage_6b.sh` after `versify`

**Stage 8** verifies gloss quality, not just existence:
- Contextual glosses (not dictionary-style comma lists)
- No Greek characters in English glosses
- Lemma and transliteration completeness
- Workspace ground-truth benchmark/report artifacts exist before promotion
- Benchmark status in `qa/interlinear-report.md` is not `not_available`
- Samples the first segment's word table for human review

**Stage 9** simulates the reader view:
- Assembles what the `/read/` handler would show for segment 1
- Verifies all reader features are available (row view, glosses, popup, sidebar)
- Prints a feature support matrix

**Stage 10** is a full cross-check:
- Runs ALL prior stage verifications end-to-end
- Reports a final SHIP / WARN / NOT READY verdict
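
The sketch below shows how per-stage exit codes could fold into that final verdict. The mapping (all 0 gives SHIP, any 2 gives WARN, anything else gives NOT READY) is an assumption inferred from the exit code table, and `final_verdict` is a hypothetical name; `verify_stage_10.sh` may decide differently.

```bash
# Folds per-stage verification exit codes into one verdict.
# ASSUMPTION: 0 -> SHIP, 2 -> WARN, any other code -> NOT READY.
final_verdict() {
  local verdict="SHIP" code
  for code in "$@"; do
    case "${code}" in
      0) ;;  # pass: verdict unchanged
      2)
        # A warning downgrades SHIP, but never upgrades NOT READY.
        if [[ "${verdict}" == "SHIP" ]]; then
          verdict="WARN"
        fi
        ;;
      *) verdict="NOT READY" ;;
    esac
  done
  echo "${verdict}"
}
```

For example, `final_verdict 0 0 2 0` prints `WARN`, and any single `1` among the codes yields `NOT READY` regardless of order.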

---

## Example Run Transcript

```
User: /text-orchestrator run "Plato Allegory of the Cave"

Orchestrator: === Initializing Pipeline ===
Loading skill: text-intake
Executing: /text-intake init "Plato Allegory of the Cave" --mode add --profile add

[text-intake runs, creates workspace]

Workspace created: ${LYCEUM_TEXTS_DIR:-output/texts}/plato-allegory-of-the-cave
Mode: add
Profile: add
Requested stages: 0-intake through 10-human-review

=== Stage 1: Source Discovery ===
Loading skill: source-hunt
Executing: /source-hunt run "plato-allegory-of-the-cave"

[source-hunt discovers Greek sources from Perseus, English from Jowett]

Stage 1 complete. Found:
- Greek: Perseus TEI (recommended)
- English: Jowett 1892 (recommended)

=== Stage 2: Acquisition and Extraction ===
Loading skill: source-extract
Executing: /source-extract run "plato-allegory-of-the-cave"

[... continues through stages ...]

=== Stage 10: Human Review ===
Pipeline reached the ship gate.
Automatic execution complete.

Review pack available at: ${LYCEUM_TEXTS_DIR:-output/texts}/plato-allegory-of-the-cave/qa/final-review-pack.md

To approve and ship:
  /new-text-ship promote "plato-allegory-of-the-cave"

Human approval required. Orchestrator stopping.
```

---

## Key Files

| File | Purpose |
|---|---|
| `internal/textpipeline/textpipeline.go` | Stage definitions and state model |
| `scripts/init_text_pipeline_workspace.go` | Workspace initializer |
| `docs/text-pipeline-master-plan-2026-03-13.md` | Canonical stage model |
| `docs/text-pipeline-skill-architecture-2026-03-13.md` | Skill ownership |
| `.pi/skills/*/SKILL.md` | Individual stage skills |

---

## Implementation Notes

### This Is an Agent Skill, Not a Script

The orchestrator works by:
1. Reading this skill file to understand the protocol
2. Sequentially loading and executing stage skills
3. Checking state between stages
4. Making decisions about continuation

The "automation" comes from the agent following these instructions without waiting for human prompts between stages.

### Subagent Invocation

When the orchestrator "invokes" a skill, it means:
1. **Create backup** of the workspace FIRST
2. Read the skill's SKILL.md file
3. **Prepend safety constraints** to the invocation prompt
4. Execute the skill's `run` command with appropriate timeout, following the skill's instructions
5. **Verify workspace integrity** immediately after completion
6. If integrity check fails, **restore from backup** and stop
7. Return to the orchestrator protocol

This is effectively the same agent wearing different "hats" for each stage.

### Known Gotcha: nix-shell Banner Pollution

**Do NOT pipe `nix-shell -p <pkg> --run "command"` output into data files or DB imports.**

The project's shellHook prints a banner to stdout that contaminates captured output.
Redirecting stderr does not help, because the banner arrives on stdout. Use `jq` directly (available in the dev shell) instead.

**Fixed**: The flake.nix shellHook now only prints in interactive shells, but avoid
the pattern anyway and use tools from the dev shell directly.

### CRITICAL: Subagent Trust Model

**Subagents are NOT sandboxed by the system.** The `allowed-tools` field in skill files is documentation only — it is NOT enforced. Subagents spawned via `pi -p --no-session` have full unrestricted tool access.

This means:
- A subagent CAN delete the entire workspace
- A subagent CAN run arbitrary destructive commands
- A subagent CAN corrupt state files

The orchestrator's defenses (backup, integrity check, recovery) exist because we CANNOT trust subagent behavior. Always:
1. Backup before spawning
2. Verify after completion
3. Be prepared to restore and abort

### Parallelization

The current model is strictly sequential. Future enhancements could allow parallel execution of independent stages, but the dependency graph (see `invalidationRules` in `internal/textpipeline/textpipeline.go`) must be respected.
