---
name: hypotheses
description: Manages the project's testable hypotheses — surfacing new ones, refining existing ones, updating status, reviewing the full set, and assessing hypothesis state based on evidence gathered so far. Use when the conversation touches assumptions, risks, what to validate, hypotheses, interview prep, assessing which hypotheses are confirmed or invalidated, reviewing overall hypothesis health, or when the user questions whether something about their idea is true.
---

# Hypotheses

Helps the user surface, refine, and manage testable assumptions about their idea. These hypotheses are the foundation for interview scripts, surveys, and validation experiments.

## Before you start

Read `startup/core.md` to load project context (name, seed description, and all fields under `## Core`).

Check if `startup/hypotheses/` contains any `.md` files.

---

## When hypotheses already exist

Load and understand them for context. Infer intent from the conversation — don't mechanically ask "what do you want to do?" If the user is:

- **Talking about a specific assumption** — help them refine or update the matching hypothesis
- **Questioning new assumptions** — help shape new hypotheses following the conventions below
- **Reviewing the full set** — summarize what exists, grouped by tag type, with status
- **Updating status** — read the file, propose the status change, get confirmation, write it back
- **Archiving** — when a hypothesis is no longer relevant (e.g., after a pivot), set `status: archived` and add `archived_reason` to frontmatter with a one-line explanation. Archived hypotheses stay in place — they can be restored by flipping the status back
- **Wanting web research on a hypothesis** — dispatch the `web-researcher` agent (fast model) with a focused prompt: include the hypothesis statement, the problem space from `core.md`, and ask for community signals (Reddit, forums), existing workarounds, and willingness-to-pay evidence. Incorporate key findings into the hypothesis `## Notes` section. Save the full web-researcher output to `startup/research/{YYYY-MM-DD}-{hypothesis-slug}-research.md` with frontmatter `date`, `topic`, and `source_skill: hypotheses`

When adding or updating hypotheses, follow the file conventions:
- YAML frontmatter with `status` (`untested`, `confirmed`, `invalidated`, or `archived`)
- Optional `last_assessed` ISO date (`YYYY-MM-DD`) — set by assessments, not on creation
- H1 heading: the hypothesis as a testable statement
- Obsidian tag on the next line: `#problem`, `#solution`, `#willingness_to_pay`, `#urgency`, or `#other`
- Description: what the assumption is, why it matters, what changes if it's wrong
- Optional `## Notes` section
- Optional `## Next Action` section — the smallest observable next validation move. Advisory and generated by assessments (see below), not authored by hand. No required internal structure: a tight one-sentence directive is the norm. It is overwritten on each assessment, always reflecting the latest reasoning — not an append-only log.

**Slug convention:** lowercase the title, replace spaces and non-alphanumeric characters with hyphens, collapse multiple hyphens.

Read before writing, propose before saving, get confirmation.

---

## When no hypotheses exist

Check if `startup/core.md` has at minimum **Audience** and **Problem** defined under `## Core`.

- **If not:** Mention that fleshing out the core idea first would help — strongly suggest to do it first, and if the user agrees, invoke the `whats-next` skill to initialize the project, and then you can come back to hypotheses. But do not block: if the user insists to work on hypotheses now, proceed.
- **If yes (or user insists):** Load the reference file for the guided first-time conversation:

```
.claude/skills/hypotheses/references/initial-hypotheses.md
```

The reference file's instructions take over from this point.

---

## Assessing hypothesis state

State assessment is one of this skill's core capabilities. When hypothesis state is in question, dispatch the `hypotheses-manager` subagent rather than evaluating evidence inline — it's bias-isolated, reads across interview evidence independently, and returns structured recommendations.

**When to dispatch:**
- The user asks directly about hypothesis health ("how are my hypotheses looking?", "what do we know so far?", "any hypothesis ready to confirm?", "which of these are we still guessing on?")
- Right after an interview has been analyzed and new `[[slug]]` backlinks have arrived (this is wired in automatically by the `interviews` skill's post-transcript flow — you'll see that path in the interviews skill)
- You're orienting the user on what to do next and hypothesis state is material to the decision
- The user questions whether a specific hypothesis still holds, or whether a cross-interview pattern deserves a new hypothesis

**What to pass:**
- `slugs`: the keyword `all` (or a specific list if the conversation is about particular hypotheses)
- `scope`: include instruction to also synthesize candidate new hypotheses from unlinked statements across interview files

**What comes back:** a structured block of state recommendations — each with a `What changed` line, reasoning, evidence pointers, and a `Next action` — plus a single cross-hypothesis `Top pick` and any candidate new hypotheses. The subagent never edits files.

**What to do with the result:**
- **First (no user confirmation needed) — eager bookkeeping:** for every hypothesis the subagent actually evaluated, update two things in its file:
  - Its `last_assessed` frontmatter to today's date.
  - Its `## Next Action` section to the subagent's suggested next action (create the section if absent, overwrite it if present).
  Both are mechanical, advisory bookkeeping — a factual record of what the subagent recommended against current evidence. Neither changes `status` or the hypothesis body, so neither needs per-item approval. Writing the next action eagerly keeps hypothesis files a live dashboard (the `whats-next` skill reads these sections) and keeps the stability anchor accurate even if the user nods and closes the chat, or if the assessment ran as a byproduct of another flow. (Read the file before writing, so you only touch frontmatter and the `## Next Action` section.)
- Then surface the state recommendations conversationally — lead each touched hypothesis with **what changed → the next action**, not just a status. Surface the `Top pick` as the single most pressing move. Then any candidate new hypotheses.
- For each recommended status change, use this skill's normal propose-before-writing flow — confirm per-item before flipping `status`.
- For each candidate new hypothesis, get the user's go-ahead before creating a file following the conventions above.
- Do not flip statuses or create new hypothesis files without explicit per-item confirmation.

This is the same subagent dispatched by the `interviews` skill after a transcript is analyzed — so state assessment is centralized here regardless of which entry point triggers it.

---

## Ad-hoc web search vs. dispatching web-researcher

Not every web question needs a subagent:

- **Inline `WebSearch` / `WebFetch` (you, the main agent):** single-fact lookups ("is this Reddit thread still active?", "what's the latest on X pricing model?"), quick verification of a claim, one data point asked about in flow. Stays in conversation, no persistence needed.
- **Dispatch `web-researcher`:** multi-source validation passes for a specific hypothesis (scanning community signals across Reddit/HN/forums, surveying workarounds, gathering willingness-to-pay evidence). Output is structured and gets saved to `startup/research/{YYYY-MM-DD}-{hypothesis-slug}-research.md` for later reference — and the findings feed into the hypothesis's `## Notes` section.

Rough rule: one fact in flow → inline. Multi-source or results-should-persist → dispatch.
