---
name: embedding-dedup
description: Embed an idea's title and summary via Hugging Face inference, then check cosine similarity against the embeddings store.
---

# embedding-dedup

## Inputs
- `text`: title + 1-2 sentence summary of the candidate idea.
- `store_path`: path to `state/embeddings.jsonl`.
- `threshold` (optional): minimum cosine similarity for a stored idea to count as a duplicate. Default: 0.85.

## Behavior
- Embed `text` via HF inference (`BAAI/bge-small-en-v1.5`, 384-dim).
- Compute cosine similarity against every stored vector.
- Return the highest-scoring match above `threshold`, or `None` if no stored vector clears it.
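
The comparison step can be sketched as follows. This is a minimal illustration and not necessarily the skill's actual code: the record fields `id` and `vector`, the helper names `cosine`, `best_match`, and `load_store`, and the strict `>` comparison are all assumptions. The query vector is taken as an argument, so the HF call itself is out of scope here.

```python
import json
import math
from typing import Iterable, List, Optional, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(
    query: Sequence[float],
    records: Iterable[dict],
    threshold: float = 0.85,
) -> Optional[Tuple[str, float]]:
    """Return (id, score) of the highest-scoring record above threshold, else None.

    Assumes each record looks like {"id": ..., "vector": [...]} (hypothetical schema).
    Whether the real comparison is strict (>) or inclusive (>=) is also an assumption.
    """
    best: Optional[Tuple[str, float]] = None
    for rec in records:
        score = cosine(query, rec["vector"])
        if score > threshold and (best is None or score > best[1]):
            best = (rec["id"], score)
    return best


def load_store(store_path: str) -> List[dict]:
    """Read one JSON object per non-empty line from the JSONL store."""
    with open(store_path, encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

A call would then look like `best_match(query_vector, load_store("state/embeddings.jsonl"))`, where `query_vector` is the 384-dim embedding of `text`. A linear scan keeps the sketch simple and is likely adequate for a store small enough to live in a single JSONL file.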

## Fallback (caller responsibility)
If the HF call raises (rate limit, cold start, or 5xx), the caller falls back to title-substring dedup and labels the PR `degraded-dedup`.
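
One way a caller might wire up this fallback is sketched below. Everything here is hypothetical: the function name `dedup_with_fallback`, the `embed_check` callable standing in for the skill, and the case-insensitive, two-way substring test are assumptions about what "title-substring dedup" means, not a description of the real caller.

```python
from typing import Callable, Iterable, Optional, Tuple


def dedup_with_fallback(
    title: str,
    existing_titles: Iterable[str],
    embed_check: Callable[[], Optional[str]],
) -> Tuple[Optional[str], bool]:
    """Return (match, degraded).

    embed_check is a hypothetical zero-argument callable that runs the
    embedding-based check and returns a matching title or None. If it raises,
    fall back to a substring comparison and report degraded=True so the caller
    can label the PR `degraded-dedup`.
    """
    try:
        return embed_check(), False
    except Exception:
        # Broad catch for illustration only; a real caller would likely
        # narrow this to the HF client's rate-limit and HTTP error types.
        needle = title.strip().lower()
        for other in existing_titles:
            hay = other.strip().lower()
            if needle and hay and (needle in hay or hay in needle):
                return other, True
        return None, True
```

Returning the `degraded` flag alongside the match keeps the labeling decision with the caller, in line with the split of responsibility described above.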

## Tests
`pytest .claude/skills/embedding-dedup/tests/`
