---
name: text-intake
description: Stage 0 intake orchestrator — resolve the requested work, choose add vs apply mode, select the replay profile, and create or refresh the canonical text-pipeline workspace manifest/state.
allowed-tools: Bash(go:*), Bash(jq:*), Bash(find:*), Bash(ls:*), Bash(git:*), Read, Write, Edit, Grep, Glob
---

# Text Intake (Stage 0 Orchestrator)

Resolve what text is being worked on and initialize its canonical workspace.

This skill is the **Stage 0 owner** for intake and disambiguation.
It should answer:
- what exact work does the user mean?
- is this `add` mode or `apply` mode?
- which replay profile fits the request?
- what should the pipeline produce?
- what risks, blockers, and open questions need to be recorded now?

It does **not** own:
- source discovery
- extraction/cleaning/segmentation execution
- replay execution itself
- final promotion/import

## Quick Status

Existing workspaces: !`find ${LYCEUM_TEXTS_DIR:-output/texts} -mindepth 1 -maxdepth 1 -type d 2>/dev/null | wc -l`
Bootstrap script: !`ls scripts/init_text_pipeline_workspace.go 2>/dev/null | wc -l`
Profile constant references (line count, a rough proxy for defined profiles): !`grep -c "Profile[A-Z]" internal/textpipeline/textpipeline.go 2>/dev/null`

## Commands

- `/text-intake init [work] [--mode add|apply] [--profile <profile>]` — Create or refresh a canonical workspace
- `/text-intake inspect [work]` — Inspect likely mode, scope, and ambiguity before initialization
- `/text-intake gap [work]` — Compare requested outcome against current workspace/repo state and identify missing assets
- `/text-intake status [work]` — Summarize intake state, profile, blockers, and next stage

Target: $ARGUMENTS

---

## Owned Responsibilities

### Owns
- work disambiguation
- add vs apply mode selection
- replay profile selection
- success-criteria capture
- initial risk/blocker/open-question capture
- creation or refresh of `manifest.json` and initial `state.json`

### Does not own
- source hunting itself
- extraction or cleaning execution
- replay execution
- final review/ship

---

## ⚠️ URN Convention (Critical)

**Do NOT manually set `english_edition_urn`, `generated_edition_urn`, or `greek_edition_urn` in manifest.json.**

The import script generates URNs containing `versified`, which the reader uses to detect row view.
Manually overriding these fields breaks the reader layout:

- The reader checks the English edition URN for `"versified"`, `"verse"`, or `"gen-eng"`
- If none of these substrings is present, row view will not activate
- Users will see a broken/incomplete reading experience

**What to do instead:**
- Leave these fields blank in `manifest.json`
- The import script will auto-generate correct URNs at Stage 10
- Generated URNs follow the pattern: `<work_urn>.workspace-versified-eng1`

**When you might be tempted to set them:**
- Stage 6b (versification): DON'T set `generated_edition_urn`
- Stage 10 (import): DON'T override the auto-generated URNs
- External edition reuse: record the edition in `existing_edition_urn`, not in the edition URN fields listed above

The validation in `scripts/import_workspace.go` will warn if this convention is violated.
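
The reader-side rule described above can be sketched as a small shell check. This is an illustration only: `urn_enables_row_view` is a hypothetical helper that does not exist in the repo, the sample URNs are made up for the demo, and the sketch assumes the reader does a plain case-sensitive substring match on the three markers.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the repo): succeeds when an English edition
# URN contains one of the markers the reader looks for.
# Assumption: the reader performs a plain, case-sensitive substring match.
urn_enables_row_view() {
  case "$1" in
    *versified*|*verse*|*gen-eng*) return 0 ;;
    *) return 1 ;;
  esac
}

# Illustrative URNs: the first follows the generated pattern from this section,
# the second stands in for a hand-written override.
for urn in \
  "urn:cts:greekLit:tlg0562.tlg001.workspace-versified-eng1" \
  "urn:cts:greekLit:tlg0562.tlg001.my-custom-eng1"; do
  if urn_enables_row_view "$urn"; then
    echo "row view:    $urn"
  else
    echo "NO row view: $urn"
  fi
done
```

A check like this is only a local sanity aid; the warning emitted by `scripts/import_workspace.go` remains the authoritative validation.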

---

## Current Repo Implementation

The current scaffold for this skill consists of:
- `scripts/init_text_pipeline_workspace.go`
- `internal/textpipeline/textpipeline.go`
- `internal/textpipeline/textpipeline_test.go`

### Canonical init command
```bash
nix-shell -p go --run "go run ./scripts/init_text_pipeline_workspace.go \
  -work 'WORK NAME' \
  [-mode add|apply] \
  [-profile PROFILE] \
  [-edition-urn URN]"
```

### Example: add mode
```bash
nix-shell -p go --run "go run ./scripts/init_text_pipeline_workspace.go \
  -work 'Sophocles Antigone'"
```

### Example: apply mode
```bash
nix-shell -p go --run "go run ./scripts/init_text_pipeline_workspace.go \
  -work 'Meditations' \
  -mode apply \
  -profile interlinear \
  -edition-urn 'urn:cts:greekLit:tlg0562.tlg001.perseus-grc2'"
```

---

## Workflows

## `/text-intake inspect`

Use before initializing when the request is ambiguous or the existing text state is unclear.

### Decide
1. exact work identity
2. ambiguity in author/title/recension
3. whether the request is `add` or `apply`
4. likely replay profile
5. expected target output
6. known blockers or open questions

### Inspect using
- existing pipeline workspaces in `$LYCEUM_TEXTS_DIR/<slug>/` (defaults to `output/texts/`)
- existing editions/alignment artifacts in the repo
- pipeline profile definitions in `internal/textpipeline/textpipeline.go`

---

## `/text-intake init`

Create or refresh the canonical workspace.

### Required decisions
- `mode`: `add` or `apply`
- `profile`: one of the defined replay profiles
- `work`: canonical request label / work name

### Useful optional data to record
- existing edition URN
- target output label
- base Greek edition strategy
- target reference system
- success criteria
- risks
- blockers
- open questions

### Profiles currently encoded
- `add`
- `source-upgrade`
- `cleaning`
- `segmentation`
- `witness-upgrade`
- `versification`
- `transliteration`
- `interlinear`
- `treebank-enrichment`
- `reliability`
- `full-rehab`
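
Before calling the initializer, the mode and profile decisions can be sanity-checked against the lists above. The following is a minimal sketch: `is_known_mode` and `is_known_profile` are hypothetical helpers, and the profile names are copied from this section, so they may drift from `internal/textpipeline/textpipeline.go`, which stays the source of truth.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeeds only for the two modes this skill supports.
is_known_mode() {
  case "$1" in
    add|apply) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: succeeds only when $1 exactly matches a profile listed
# in this section. The list is a copy and may lag behind textpipeline.go.
is_known_profile() {
  case "$1" in
    add|source-upgrade|cleaning|segmentation|witness-upgrade|versification|\
transliteration|interlinear|treebank-enrichment|reliability|full-rehab)
      return 0 ;;
    *)
      return 1 ;;
  esac
}

# Example: reject a misspelled profile before invoking the initializer.
for profile in interlinear interlinea; do
  if is_known_profile "$profile"; then
    echo "ok:      $profile"
  else
    echo "unknown: $profile"
  fi
done
```

Matching on exact names keeps a typo from silently falling through to whatever default the initializer applies.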

### Expected outputs
```text
$LYCEUM_TEXTS_DIR/<slug>/
├── manifest.json
├── state.json
├── provenance.md
├── sources/
├── raw/
├── extracted/
├── clean/
├── structured/
├── witnesses/
├── versification/
├── interlinear/
├── qa/
└── replay/stage-history.json
```

**Note:** `LYCEUM_TEXTS_DIR` defaults to `output/texts` for local development.
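
To confirm that an initialized workspace matches the tree above, a check along these lines may help. It is a sketch under stated assumptions: `missing_workspace_entries` is a hypothetical helper, and the entry names are copied from the tree in this section rather than read from the initializer.

```bash
#!/usr/bin/env bash
# Hypothetical helper: prints every expected entry that is missing under the
# workspace directory passed as $1. Prints nothing when the layout is complete.
missing_workspace_entries() {
  local ws="$1" f d
  # Files expected directly in the workspace or under replay/.
  for f in manifest.json state.json provenance.md replay/stage-history.json; do
    [ -f "$ws/$f" ] || echo "$f"
  done
  # Stage directories expected in the workspace.
  for d in sources raw extracted clean structured witnesses versification interlinear qa; do
    [ -d "$ws/$d" ] || echo "$d/"
  done
}

# Usage (SLUG is a placeholder for a real workspace slug):
#   missing_workspace_entries "${LYCEUM_TEXTS_DIR:-output/texts}/SLUG"
```

Empty output means every listed file and directory is present; it says nothing about whether their contents are valid.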

---

## `/text-intake gap`

Use for apply-mode planning or when resuming a text.

### Compare
- requested profile vs current stage states
- requested output vs existing artifacts
- known repo assets vs missing assets
- current blockers vs new request

### Typical questions
- does the workspace already exist?
- which stages are already requested / done / invalidated?
- is there already an edition URN or prior artifact set to reuse?
- which next stage should run?

### Useful checks
```bash
# Quote the expansion so a texts directory containing spaces is handled correctly.
find "${LYCEUM_TEXTS_DIR:-output/texts}" -mindepth 1 -maxdepth 2 | sort
```

```bash
nix-shell -p jq --run "jq '.' ${LYCEUM_TEXTS_DIR:-output/texts}/SLUG/manifest.json"
```

```bash
nix-shell -p jq --run "jq '.' ${LYCEUM_TEXTS_DIR:-output/texts}/SLUG/state.json"
```
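
The first gap question, whether the workspace already exists, can be answered with a small wrapper around the same path convention. This is a minimal sketch: `workspace_dir` and `intake_files_present` are hypothetical helpers, and `antigone` in the usage comment is a made-up slug.

```bash
#!/usr/bin/env bash
# Hypothetical helper: prints the workspace path for a slug, honoring
# LYCEUM_TEXTS_DIR and falling back to output/texts.
workspace_dir() {
  printf '%s/%s\n' "${LYCEUM_TEXTS_DIR:-output/texts}" "$1"
}

# Hypothetical helper: reports whether each intake file exists and is non-empty.
intake_files_present() {
  local ws f
  ws="$(workspace_dir "$1")"
  for f in manifest.json state.json; do
    if [ -s "$ws/$f" ]; then
      echo "present: $f"
    else
      echo "missing: $f"
    fi
  done
}

# Usage (made-up slug):
#   intake_files_present antigone
```

A `missing:` line for either file suggests the workspace needs `/text-intake init` before any gap analysis is meaningful.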

---

## `/text-intake status`

Summarize:
- workspace path
- mode and profile
- requested stages
- current stage
- blockers/open questions
- next recommended skill to run

---

## Outputs

### Canonical outputs
- `$LYCEUM_TEXTS_DIR/<slug>/manifest.json`
- initial `state.json`
- initialized workspace directory tree
- initial `replay/stage-history.json`
- intake notes in `provenance.md` and/or `qa/final-review-pack.md`

### Current source of truth
- `internal/textpipeline/textpipeline.go`

---

## Verification Contract

This skill follows the Stage 0 contract from `docs/text-pipeline-skill-verification-2026-03-13.md`.

### Verify
- requested work was resolved correctly
- add vs apply mode is correct
- replay profile/stage scope is correct
- manifest fields are complete
- initial state marks only the intended stages as in scope

### Minimum evidence
- `$LYCEUM_TEXTS_DIR/<slug>/manifest.json`
- `$LYCEUM_TEXTS_DIR/<slug>/state.json`
- intake notes with risks/blockers/open questions

### Pass criteria
- manifest identifies the intended work/edition target
- mode and profile match the request
- requested stages match the selected profile
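- required manifest fields are non-empty (the edition URN fields covered by the URN Convention are deliberately left blank and do not count as required)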
- state initializes the correct stages without accidental extra scope

### Failure examples
- wrong work chosen for ambiguous title
- apply request initialized as full add
- wrong replay profile selected
- workspace exists but manifest/state do not match the request

### Required next steps
After a successful intake run, the next owner is usually:
- `source-hunt` for add mode or source-oriented apply work
- `text-replay` when the user asked for a targeted replay profile

---

## Verification

After completing this stage, run the automated verification script:

```bash
bash scripts/verify_stage_0.sh "${SLUG}"
```

Exit codes: 0=PASS (advance), 1=FAIL (block), 2=WARN (advance with notes).
The orchestrator runs this automatically; when running it manually, check the output for `[FAIL]` or `[WARN]` lines.
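
When running the script by hand, the exit-code contract above can be mapped to an action as follows. This is a sketch: `stage0_action` is a hypothetical helper, and treating any code other than 0, 1, or 2 as a block is an assumption, since the contract does not define those codes.

```bash
#!/usr/bin/env bash
# Hypothetical helper: maps a verify_stage_0.sh exit code to the action
# described in this section.
# Assumption: codes outside 0/1/2 (e.g. 127, script not found) also block.
stage0_action() {
  case "$1" in
    0) echo "advance" ;;
    2) echo "advance-with-notes" ;;
    *) echo "block" ;;
  esac
}

# Example wiring (commented out because it needs the repo checkout):
#   code=0
#   bash scripts/verify_stage_0.sh "${SLUG}" || code=$?
#   echo "Stage 0 verification: $(stage0_action "$code")"
```

Capturing the code with `|| code=$?` keeps the wrapper usable under `set -e`, where a bare non-zero exit would otherwise abort the calling script before the code could be inspected.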

---

## Key Files

| File | Purpose |
|---|---|
| `scripts/init_text_pipeline_workspace.go` | Current workspace initializer |
| `internal/textpipeline/textpipeline.go` | Stage/profile definitions and invalidation rules |
| `internal/textpipeline/textpipeline_test.go` | Bootstrap tests |
| `docs/text-pipeline-bootstrap-2026-03-13.md` | Bootstrap behavior |
| `docs/text-pipeline-master-plan-2026-03-13.md` | Canonical stage model |
| `docs/text-pipeline-skill-architecture-2026-03-13.md` | Ownership and command surface |
| `docs/text-pipeline-skill-verification-2026-03-13.md` | Verification contract |