---
name: talk-stage1-extract
description: "Extracts and structures source material (articles, transcripts, notes) into a talk summary with narrative arc, themes, metrics, and gaps. Auto-detects REX vs Concept type. Use when starting a new talk from any source material or auditing existing material before committing to a talk."
tags: [talk, pipeline, presentation, stage-1]
allowed-tools: "Write, Read, AskUserQuestion"
effort: medium
---

# Talk Stage 1: Extract

Transforms raw material (article, transcript, notes, or a mix) into a structured summary ready for the pipeline's downstream stages. Auto-detects source type.

## When to Use This Skill

- Starting a new talk from any source material
- First step of the talk pipeline (always run before other stages)
- Auditing existing source material before committing to a talk

## What This Skill Does

1. **Collects metadata** — asks for slug, event, date, duration, audience, mode if not provided
2. **Reads the source** — loads the source file or inline content
3. **Detects source type** — REX (real-world proof) vs Concept (ideas/thesis) based on content signals
4. **Extracts the narrative arc** — chronological for REX, thematic for Concept
5. **Extracts metrics** — every measurable number with its source
6. **Identifies main themes** — 3-7 themes
7. **Flags gaps** — what's missing for a complete talk
8. **Writes `talks/{YYYY}-{slug}-summary.md`**

## Input

Required:
- Source file path or inline content (article `.mdx`, transcript `.md`, notes)
- Metadata: `slug`, `event`, `date`, `duration`, `audience`, `type` (`--rex` or `--concept`)

If metadata is missing → `AskUserQuestion` before proceeding.
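The missing-metadata check can be sketched as below. This is a minimal illustration, not part of the skill: the field names come from the list above, while `missing_metadata` and the sample values are hypothetical.

```python
# Field names are taken from the Input section; the helper itself is hypothetical.
REQUIRED_FIELDS = ("slug", "event", "date", "duration", "audience", "type")


def missing_metadata(metadata: dict) -> list[str]:
    """Return the required fields that are absent or blank, in declaration order."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = metadata.get(field)
        if value is None or str(value).strip() == "":
            missing.append(field)
    return missing


# Made-up sample input: three fields provided, three still to ask for.
provided = {"slug": "sse-migration", "event": "Example Conf", "duration": 30}
print(missing_metadata(provided))  # ['date', 'audience', 'type']
```

Each name in the returned list would correspond to one question to raise through `AskUserQuestion` before extraction starts.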

## Output

`talks/{YYYY}-{slug}-summary.md`
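
One way the path could be assembled is sketched below. It assumes `{YYYY}` is the year of the talk `date` and that `date` arrives as an ISO string; neither is stated in this skill. `summary_path` is a hypothetical helper.

```python
from datetime import date
from pathlib import PurePosixPath


def summary_path(slug: str, talk_date: str) -> PurePosixPath:
    """Build talks/{YYYY}-{slug}-summary.md from a slug and an ISO date string."""
    # Assumption: {YYYY} is the year of the talk date, zero-padded to four digits.
    year = date.fromisoformat(talk_date).year
    return PurePosixPath("talks") / f"{year:04d}-{slug}-summary.md"


print(summary_path("sse-migration", "2025-03-14"))  # talks/2025-sse-migration-summary.md
```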

## Source Type Detection

| REX signals | Concept signals |
|-------------|-----------------|
| Specific dates | Theses, arguments |
| Measured metrics | General observations |
| Project/tool names | Trend observations |
| Commits, releases, PRs | Analogies, metaphors |
| "I shipped", "We built" | "I think", "In my opinion" |

If hybrid → note both components in the summary.
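
The signal table can be approximated with a rough keyword heuristic, sketched below. Detection in practice is a judgment call over the whole source, so treat this as an illustration only: the regex patterns, the "weaker side has at least half the hits" rule for Hybrid, and the `"Unknown"` fallback are all assumptions, not part of this skill.

```python
import re

# Hypothetical patterns loosely mirroring the signal table.
REX_PATTERNS = [
    r"\b\d{4}-\d{2}-\d{2}\b",                    # specific dates (ISO style only)
    r"\b\d[\d,.]*\s*(%|commits|PRs|releases)",   # measured metrics, commits, releases, PRs
    r"\b(I|we) (shipped|built)\b",               # "I shipped", "We built"
]
CONCEPT_PATTERNS = [
    r"\bI think\b",
    r"\bin my opinion\b",
    r"\b(thesis|argument|analogy|metaphor|trend)\b",
]


def count_hits(patterns: list[str], text: str) -> int:
    """Count every match of every pattern, case-insensitively."""
    return sum(len(re.findall(p, text, flags=re.IGNORECASE)) for p in patterns)


def detect_type(text: str) -> str:
    rex = count_hits(REX_PATTERNS, text)
    concept = count_hits(CONCEPT_PATTERNS, text)
    if rex == 0 and concept == 0:
        return "Unknown"  # assumption: no signal at all, ask the user instead
    # Assumption: Hybrid when the weaker side has at least half the hits of the stronger.
    if min(rex, concept) * 2 >= max(rex, concept):
        return "Hybrid"
    return "REX" if rex > concept else "Concept"


print(detect_type("We built the service and reached 1,200 commits by 2024-01-15."))  # REX
```

Whatever the heuristic returns, the summary should still state which signals justified the detected type.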

## Output Format

```markdown
# Talk Summary — {Provisional Title}

**Slug** : {slug}
**Event** : {event}
**Date** : {date}
**Duration** : {duration} min
**Audience** : {audience description}
**Type detected** : REX | Concept | Hybrid
**Source** : {source file path}

---

## Narrative Arc

{Arc description: 3-5 sentences. Chronological if REX, thematic if Concept.}

## Main Themes

| # | Theme | Short description | Weight |
|---|-------|------------------|--------|
| 1 | {theme} | {description} | High/Medium/Low |
...

## Key Metrics Extracted

{All measurable numbers found in the source}

Format: `{value}` — {context} — Source: {section/page/git}

Examples:
- `1,200 commits` over 7 months — Source: "acceleration" section
- `-97% traffic` after SSE migration — Source: CHANGELOG v1.1.0

If none → "No verifiable metrics found (Concept mode)"

## Narrative Potential

{3-5 sentences on the strengths and possible narrative angles.
What makes this talk potentially strong. What might be missing.}

## Gaps Identified

- [ ] {gap 1} — {how to fill it}
- [ ] {gap 2} — {how to fill it}

If no obvious gaps → "No major gaps identified."

## Recommendations for next stages

- **Research**: {recommended / not applicable (Concept mode)} — {why}
- **Concepts**: {priority themes to explore}
- **Position**: {angles already visible from the source material}

---

*Generated by talk-stage1-extract — {date}*
*Source: {source path}*
```

## Metric Extraction Rules

- Do not round without indicating it
- Always include the metric's source
- If two sources contradict → flag both, do not pick one
- No invented metrics to fill gaps
- Use `{before} → {after}` format for evolutions
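
Two of these rules lend themselves to a small sketch: the `{before} → {after}` format and flagging contradictions without picking a side. The helper names, the `(name, value, source)` tuple shape, and the sample values are illustrative assumptions, not data from any real source.

```python
from collections import defaultdict


def evolution(before: str, after: str) -> str:
    """Render a change using the `{before} → {after}` convention."""
    return f"{before} → {after}"


def find_contradictions(claims):
    """Group (name, value, source) claims and return names with conflicting values.

    Every conflicting entry is kept with its source; nothing is chosen automatically.
    """
    by_name = defaultdict(list)
    for name, value, source in claims:
        by_name[name].append((value, source))
    return {
        name: entries
        for name, entries in by_name.items()
        if len({value for value, _ in entries}) > 1
    }


# Made-up sample claims, for illustration only.
claims = [
    ("commits", "1,200", "article, 'acceleration' section"),
    ("commits", "1,150", "git log"),
    ("latency", evolution("12 s", "3 s"), "CHANGELOG"),
]
for name, entries in find_contradictions(claims).items():
    print(name, entries)
```

Values stay as strings so they are reported exactly as written in the source, with no silent rounding.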

## Anti-patterns

- Vague summary ("This text is about AI...")
- Omitting metrics: extract even approximate ones, with their source
- Hiding gaps — naming them is better than pretending they don't exist
- Changing the detected type without justification
- Inventing a narrative arc not present in the source

## Validation Checklist

- [ ] Source type detected and justified
- [ ] Narrative arc in 3-5 clear sentences
- [ ] All measurable metrics extracted with their source
- [ ] Main themes listed (3-7)
- [ ] Gaps explicitly identified
- [ ] File saved to `talks/{YYYY}-{slug}-summary.md`
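
Part of this checklist can be checked mechanically against the generated file, as in the sketch below. It covers only the structural items (sections present, theme count, type line) and assumes the summary follows the Output Format template exactly. `check_summary` is a hypothetical helper, and judging whether the arc is clear or the type is well justified still needs a human read.

```python
import re

# Section headings taken from the Output Format template.
REQUIRED_HEADINGS = [
    "## Narrative Arc",
    "## Main Themes",
    "## Key Metrics Extracted",
    "## Gaps Identified",
]


def check_summary(markdown: str) -> list[str]:
    """Return a list of structural problems; an empty list means the checks passed."""
    problems = []
    lines = [line.strip() for line in markdown.splitlines()]
    for heading in REQUIRED_HEADINGS:
        if heading not in lines:
            problems.append(f"missing section: {heading}")
    # Theme rows look like "| 1 | theme | description | High |".
    theme_rows = [line for line in lines if re.match(r"^\|\s*\d+\s*\|", line)]
    if not 3 <= len(theme_rows) <= 7:
        problems.append(f"expected 3 to 7 themes, found {len(theme_rows)}")
    type_line = r"^\*\*Type detected\*\*\s*:\s*(REX|Concept|Hybrid)\b"
    if not any(re.match(type_line, line) for line in lines):
        problems.append("type not detected")
    return problems


print(check_summary("## Narrative Arc\n| 1 | Theme | Description | High |"))
```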

## Tips

- Run this before the orchestrator if you want to verify the source material is usable
- The summary is the foundation — every downstream stage reads it
- Hybrid sources (part REX, part Concept) are fine — name both components clearly

## Related

- [Stage 2: Research](../stage-2-research/SKILL.md) — git archaeology (REX mode)
- [Stage 3: Concepts](../stage-3-concepts/SKILL.md) — reads this summary
- [Orchestrator](../orchestrator/SKILL.md) — runs all stages in sequence