---
name: search-history
description: Search past Claude Code conversation transcripts by topic or keyword. Use when the user asks "what did we discuss about X", "do you remember when we talked about Y", "find past conversations about Z", or wants to recall something from a previous session.
---

# Search History

Search past conversation transcripts stored in `~/.claude/projects/` using **semantic (embedding) search** with automatic fallback to regex.

## Steps

1. **Run the search script** with the user's query:
   ```sh
   node <skill-dir>/search-history.mjs "<query>" [--project <name>] [--limit <n>] [--since <Nd|Nh>] [--reindex] [--no-embed]
   ```
   - `<query>`: natural language phrase or regex (semantic mode ignores regex syntax)
   - `--project`: filter results to a project (partial match, e.g. `transAct`)
   - `--limit`: max results (default 10)
   - `--since`: time window like `7d`, `30d`, `24h`
   - `--reindex`: wipe and rebuild the full index (run once after `npm install`, or to fix a corrupt index)
   - `--no-embed`: force regex mode (useful for exact keyword matching)

   Replace `<skill-dir>` with the absolute directory of this SKILL.md file.

2. **Parse the JSON output.** Key fields per result:
   - `mode`: `"semantic"` or `"regex"` — which engine was used
   - `score`: 0–1 blended score (80% semantic similarity + 20% recency decay)
   - `userSnippet`: the user's message (or the matching excerpt)
   - `assistantSnippet`: first ~400 chars of the assistant reply
   - `sessionId`: for reference if the user wants to dig deeper
   - `timestamp`: ISO date of the exchange

3. **Present results** to the user:
   - Date + project name
   - The user's message snippet
   - The assistant's reply snippet
   - Session ID

4. **If no results:** try a broader or differently phrased query, check if `--project` is too narrow, or try `--no-embed` for exact keyword matching.

## How the index works

- On first run, the script auto-installs `@huggingface/transformers` and downloads the model (~22MB) — no manual setup needed
- Index stored at `~/.claude/search-index.json` (vectors encoded as base64 Float32Array, ~3x smaller than JSON arrays)
- **Incremental**: only files modified since last run are re-embedded — subsequent searches are fast (~500ms)
- **Model**: `Xenova/all-MiniLM-L6-v2` — 384-dim, ~22MB, fully offline after first download
- Only user messages are embedded (clearest signal of intent); assistant text is stored for display only
- If the index exists but has no embeddings (e.g. was built in regex mode), it re-embeds automatically on next run

## Notes
- Thinking blocks, tool call content, and subagent internals (`isSidechain`) are always excluded
- Minimum semantic score threshold: 0.35 — clearly unrelated results are filtered out
- Recency decay uses a 30-day half-life blended at 20% weight
- `--no-embed` regex mode searches both user and assistant text; semantic mode scores user text only