---
name: vc-signals
description: "VC signal-to-thesis skill for Marathon-style deal discovery. Weekly all-sector radar, OSS radar, agent-native research workbench, theme drill-downs, company backtrace, and GitHub trending repos."
argument-hint: 'vc-signals radar all, vc-signals workbench docs/radar-runs/current, vc-signals oss ai-infra, vc-signals theme "agent evals", vc-signals setup'
allowed-tools: Bash, Read, Write, WebSearch, AskUserQuestion
user-invocable: true
---

# VC Signals: Emerging Theme Discovery for Venture Capital

Turn noisy public internet chatter into ranked, investor-oriented theme briefs with company mapping.

**This is NOT a generic research tool.** It is a signal-to-thesis skill built for a VC workflow.

## Argument Parsing

Parse the user's input to determine the mode and arguments:

- `/vc-signals setup` → Setup wizard mode
- `/vc-signals radar <sector|all> [time]` → Company-first weekly radar (sectors: `devtools`, `cybersecurity`, `ai-infra`, `vertical-ai`, `data-infra`, `oss`; `all` runs one unified brief with sector sections)
- `/vc-signals weekly <sector|all> [time]` → Alias for `radar` (kept for backward compatibility)
- `/vc-signals theme "<topic>" [time]` → Theme drill-down
- `/vc-signals company "<name>" [time]` → Company backtrace
- `/vc-signals oss <sector> [time]` → OSS-native radar using repo velocity, community signal, contributor quality, and company-formation likelihood
- `/vc-signals github <sector>` → GitHub trending repos (sectors: `devtools`, `cybersecurity`, `ai-infra`, `vertical-ai`, `data-infra`, `oss`, `all`)
- `/vc-signals workbench <run-dir>` → Agent-native research workbench for weak-signal synthesis without promoting unverified leads
- `/vc-signals add-sector <name>` → Add a new sector (guided)
- `/vc-signals compare "<company1>" "<company2>"` → Head-to-head comparison (stretch)

**Time window (optional):** Users can append a time window like `7d`, `14d`, `30d`, `60d`, `90d` to control how far back to search. Examples:
- `/vc-signals weekly devtools` → default 14 days
- `/vc-signals weekly devtools 7d` → last 7 days only
- `/vc-signals weekly devtools 30d` → last 30 days
- `/vc-signals theme "agent evals" 90d` → last 90 days of evidence

**Defaults by mode:**
- `radar` / `weekly`: 14 days (focused on recent signal)
- `theme`: 30 days (broader context for deep analysis)
- `company`: 30 days

If there are no arguments, or the arguments are not recognized, show this help and ask the user what they'd like to do.
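
The parsing rules above can be sketched in Python. This is a minimal illustration, not the skill's actual parser: the function name `parse_command` and its return shape are hypothetical, and modes with no stated default (for example `oss` or `github`) return `None` for the window unless the user supplies one.

```python
import re
import shlex

# Defaults stated in this document; `radar` shares the `weekly` default.
DEFAULT_DAYS = {"radar": 14, "weekly": 14, "theme": 30, "company": 30}
WINDOW_RE = re.compile(r"^(\d+)d$")


def parse_command(text):
    """Split '/vc-signals <mode> <args...> [Nd]' into (mode, args, lookback_days)."""
    tokens = shlex.split(text)  # keeps quoted topics like "agent evals" together
    if tokens and tokens[0].lstrip("/") == "vc-signals":
        tokens = tokens[1:]
    if not tokens:
        return None, [], None  # caller should show help
    mode, args = tokens[0], tokens[1:]
    if mode == "weekly":
        mode = "radar"  # alias kept for backward compatibility
    days = DEFAULT_DAYS.get(mode)
    # A trailing token such as 7d or 90d overrides the mode default.
    if args and (match := WINDOW_RE.match(args[-1])):
        days = int(match.group(1))
        args = args[:-1]
    return mode, args, days
```

For example, `parse_command('/vc-signals theme "agent evals" 90d')` yields the mode `theme`, the single argument `agent evals`, and a 90-day window.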

## Environment Detection (ALWAYS check this FIRST)

Before anything else, detect whether you're running in a sandboxed web environment (Claude.ai, Co-Work web) or a local environment (Claude Code CLI, Co-Work desktop with terminal access).

**Quick test:** Try running a simple Bash command:
```bash
echo "env_check"
```

If Bash is not available, or if you get network errors (403 Forbidden) when running Python scripts, you are in a **web sandbox**. In this case:

**Web sandbox mode:**
- Use ONLY WebSearch for retrieval (built-in, always works)
- Do NOT run any Python scripts (persistence.py, github_trending.py, last30days_adapter.py) — they will fail with network errors
- Do NOT run setup wizard — API keys can't be used in the sandbox
- Skip GitHub trending entirely
- Skip persistence (save/load) — just print the full brief inline
- Tell the user: "Running in web mode — using web search only. For full coverage (Reddit, HN, X, GitHub trending), install Claude Code locally and run from there."

**Local mode:**
- Full access — run all Python scripts, use last30days if configured, save results locally
- Proceed with the normal flow below

## First-Run Detection (check this after environment detection)

**Before executing ANY mode** (including setup), check if this is a first run:

```bash
grep -q "SETUP_COMPLETE=true" ~/.config/last30days/.env 2>/dev/null && echo "configured" || echo "not_configured"
```

**If NOT configured — and the user did NOT explicitly run `/vc-signals setup`:**

Say this to the user:

> "Welcome to VC Signals! This is your first run. I need about 2 minutes to set things up — I'll install the research engine and ask you for a couple of optional API keys. You can skip any you don't have."
>
> "Want me to run setup now, or would you rather jump straight in with basic web search? (Setup gives you Reddit, Hacker News, X/Twitter, YouTube, and GitHub trending coverage.)"

If they choose setup → run the Setup Wizard (below), then continue to their original command.
If they choose to skip → proceed with their command using WebSearch path. Note: "Running with basic web search. You can run `/vc-signals setup` anytime to unlock more sources."

**If the user explicitly runs `/vc-signals setup`:**

Always run the setup/update wizard, even when `SETUP_COMPLETE=true` already exists. Treat setup as "add or update provider keys", not only "first install." First read `~/.config/last30days/.env`, detect which keys are already present, and ask only for missing keys unless the user says they want to replace one. Never print existing secret values; show only configured/missing status.

**Auto-install last30days during setup:**

As part of setup Step 4 (last30days Research Engine), ALWAYS clone last30days if it's not already installed. Don't ask the user first; just do it:

```bash
# Determine where to clone last30days
if [ -d ".claude/skills/vc-signals" ]; then
  # Project-level install — clone to vendor/
  mkdir -p vendor
  git clone --quiet --depth 1 https://github.com/mvanhorn/last30days-skill.git vendor/last30days-skill 2>/dev/null || true
elif [ -d "$HOME/.claude/skills/vc-signals" ]; then
  # Global install (Co-Work) — clone to ~/.claude/vendor/
  mkdir -p "$HOME/.claude/vendor"
  git clone --quiet --depth 1 https://github.com/mvanhorn/last30days-skill.git "$HOME/.claude/vendor/last30days-skill" 2>/dev/null || true
else
  # Marketplace plugin install — keep shared vendor under ~/.claude/vendor/
  mkdir -p "$HOME/.claude/vendor"
  git clone --quiet --depth 1 https://github.com/mvanhorn/last30days-skill.git "$HOME/.claude/vendor/last30days-skill" 2>/dev/null || true
fi
```

Then install requests into the same Python runtime the adapter will use:
```bash
RUNTIME_PY="${CODEX_BUNDLED_PYTHON:-}"
if [ -z "$RUNTIME_PY" ] || ! "$RUNTIME_PY" -c 'import sys; raise SystemExit(0 if sys.version_info >= (3, 12) else 1)' >/dev/null 2>&1; then
  RUNTIME_PY="$(
    for py in python3.14 python3.13 python3.12 python3; do
      command -v "$py" >/dev/null 2>&1 || continue
      "$py" -c 'import sys; raise SystemExit(0 if sys.version_info >= (3, 12) else 1)' >/dev/null 2>&1 && { command -v "$py"; break; }
    done
  )"
fi
if [ -n "$RUNTIME_PY" ]; then
  "$RUNTIME_PY" -m pip install 'requests>=2.32,<3' 2>/dev/null || "$RUNTIME_PY" -m pip install --user 'requests>=2.32,<3' 2>/dev/null || true
  "$RUNTIME_PY" -c 'import requests' 2>/dev/null && echo "requests ready for $RUNTIME_PY" || echo "requests missing for $RUNTIME_PY"
else
  echo "Python 3.12+ missing; install with: brew install python@3.13"
fi
```

Then check Node.js for cookie-based X. last30days uses its bundled `bird-search.mjs` backend for `AUTH_TOKEN`/`CT0`, and that backend requires `node` on PATH. If X cookies are configured and Node is missing, install it with Homebrew when available:

```bash
ENV_FILE="$HOME/.config/last30days/.env"
if [ -f "$ENV_FILE" ] && grep -Eq '^(AUTH_TOKEN|TWITTER_AUTH_TOKEN)=' "$ENV_FILE" && grep -Eq '^(CT0|TWITTER_CT0)=' "$ENV_FILE"; then
  if ! command -v node >/dev/null 2>&1; then
    if command -v brew >/dev/null 2>&1; then
      brew install node || true
    elif [ -x /opt/homebrew/bin/brew ]; then
      /opt/homebrew/bin/brew install node || true
    else
      echo "Node.js missing; install with: brew install node"
    fi
  fi
  command -v node >/dev/null 2>&1 && node --version || true
fi
```

If any of these steps fails, continue; the skill still works without them (WebSearch fallback, no GitHub trending). Tell the user what succeeded and what didn't.

## Script Paths

Before running any script, determine the skill directory. Check these locations in order:
1. `.claude/skills/vc-signals/` (relative to the current project root — check if it exists)
2. `~/.claude/skills/vc-signals/` (global installation for Co-Work)

Once found, use that path for all script commands. For example:
- If found at `.claude/skills/vc-signals/`, run: `python3 .claude/skills/vc-signals/scripts/<script>.py`
- If found at `~/.claude/skills/vc-signals/`, run: `python3 ~/.claude/skills/vc-signals/scripts/<script>.py`

Store the resolved path and reuse it for all script calls in this session.
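
A minimal sketch of that lookup order, assuming only the two locations listed above. The helper name `resolve_skill_dir` and its `project_root` / `home` parameters are hypothetical and exist only to make the example testable.

```python
from pathlib import Path


def resolve_skill_dir(project_root=".", home=None):
    """Return the first existing vc-signals skill directory, or None."""
    home = Path(home) if home is not None else Path.home()
    candidates = [
        Path(project_root) / ".claude" / "skills" / "vc-signals",  # project-level install
        home / ".claude" / "skills" / "vc-signals",  # global install (Co-Work)
    ]
    for candidate in candidates:
        if candidate.is_dir():
            return candidate
    return None
```

Resolve once, keep the returned path, and build every script command from it for the rest of the session.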

Scripts:
- `<skill_dir>/scripts/persistence.py` — save/load briefings, diffs, theme index
- `<skill_dir>/scripts/github_trending.py` — GitHub star velocity search
- `<skill_dir>/scripts/last30days_adapter.py` — last30days integration
- `<skill_dir>/scripts/attio.py` — read-only Attio CRM matching from `ATTIO_ACCESS_TOKEN`
- `<skill_dir>/scripts/radar_run.py` — weekly evidence collection, quality filtering, Attio merge, and partner-preview rendering
- `<skill_dir>/scripts/provider_ab_test.py` — dry-run-first paid search provider comparison

Config:
- `<skill_dir>/config/sectors.json` — sector taxonomy
- `<skill_dir>/config/company_aliases.json` — company seed map

**For free-text identifiers** like company names or theme topics, pass `--name "Free Text"` to `save-markdown`; it slugifies internally (lowercases, replaces non-alphanumeric runs with hyphens, strips leading/trailing hyphens). Use `--slug` only when the value is already a known-safe identifier (sector slug, etc.).
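
The slug rule described above can be illustrated as follows. This is a sketch of the stated behavior, not the code in `persistence.py`; the function name `slugify` is an assumption, and treating only ASCII letters and digits as alphanumeric is also an assumption.

```python
import re


def slugify(name):
    """Lowercase, collapse non-alphanumeric runs to '-', strip edge hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")
```

Under these assumptions, `slugify("Agent Evals!")` gives `agent-evals`.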

---

## Mode: Setup Wizard

**Trigger:** `/vc-signals setup`

Walk the user through setup one step at a time. Use plain, non-technical language. Check what's already configured and skip completed steps. If setup was previously completed, do not exit early; run the provider-key update flow below so newly added sources like Exa and Product Hunt can be configured.

Before prompting, load existing config:

```bash
mkdir -p ~/.config/last30days
touch ~/.config/last30days/.env
chmod 600 ~/.config/last30days/.env
```

Use this file as the shared local config:

```text
~/.config/last30days/.env
```

When a user provides a key, save it as the exact env var name shown below. Preserve existing keys unless the user explicitly asks to replace them.

### Step 1: Python Check

```bash
RUNTIME_PY="${CODEX_BUNDLED_PYTHON:-}"
if [ -z "$RUNTIME_PY" ] || ! "$RUNTIME_PY" -c 'import sys; raise SystemExit(0 if sys.version_info >= (3, 12) else 1)' >/dev/null 2>&1; then
  RUNTIME_PY="$(
    for py in python3.14 python3.13 python3.12 python3; do
      command -v "$py" >/dev/null 2>&1 || continue
      "$py" -c 'import sys; raise SystemExit(0 if sys.version_info >= (3, 12) else 1)' >/dev/null 2>&1 && { command -v "$py"; break; }
    done
  )"
fi
[ -n "$RUNTIME_PY" ] && "$RUNTIME_PY" --version || true
```

If Python 3.12+ is available, say: "Python is ready at `<runtime path>`." and move on.
If not, say: "You need Python 3.12 or newer. Here's how to install it:" and provide instructions for macOS:
```bash
brew install python@3.13
```

### Step 2: Install Python Dependencies

```bash
"$RUNTIME_PY" -m pip install 'requests>=2.32,<3' 2>/dev/null || "$RUNTIME_PY" -m pip install --user 'requests>=2.32,<3'
"$RUNTIME_PY" -c 'import requests'
```

Say: "Installed requests into the Python runtime vc-signals will actually use."

### Step 3: Provider Keys

Ask for these provider keys explicitly. Every key is optional, but do not hide Exa or Product Hunt behind a generic "web search" prompt.

**Recommended now**

**Exa API Key (`EXA_API_KEY`) -- recommended for source-yield work:**
> "Exa helps us resolve Product Hunt and X launch chatter into official domains and richer page evidence. Paste your Exa API key, or type 'skip'."

If provided, save:
```bash
EXA_API_KEY=<value>
```

**Product Hunt API Token (`PRODUCT_HUNT_TOKEN`) -- recommended for launch discovery:**
> "Product Hunt gives us structured launch data: products, makers, topics, launch copy, and Product Hunt URLs. Paste your Product Hunt token, or type 'skip'."

If provided, save:
```bash
PRODUCT_HUNT_TOKEN=<value>
```

**GitHub Token (`GITHUB_TOKEN`) -- recommended for OSS market radar:**
First check:
```bash
gh auth status 2>&1
```

If `gh` is not authenticated and `GITHUB_TOKEN` is missing:
> "GitHub helps us find OSS momentum and company/market signals. Paste a GitHub Personal Access Token with the `public_repo` scope, or type 'skip'."

If provided, save:
```bash
GITHUB_TOKEN=<value>
```

**Attio Access Token (`ATTIO_ACCESS_TOKEN`) -- recommended for Marathon owner/status checks:**
> "Attio lets the packet show whether a company is already known, stale, passed, or missing an owner. Paste an Attio access token, or type 'skip'."

If provided, save:
```bash
ATTIO_ACCESS_TOKEN=<value>
```

**X launch radar (`XAI_API_KEY`, `AUTH_TOKEN` + `CT0`, or `TWITTER_AUTH_TOKEN` + `TWITTER_CT0`) -- optional:**
> "X is useful as launch radar, not identity truth. Paste an `XAI_API_KEY` if you use xAI access for this lane, or paste browser cookies `AUTH_TOKEN` and `CT0`. Cookie-based X also needs Node.js for the bundled bird backend; setup can install it with `brew install node` if missing. `TWITTER_AUTH_TOKEN` and `TWITTER_CT0` also work if your setup already uses those names. You can skip X."

If `XAI_API_KEY` is provided, save:
```bash
XAI_API_KEY=<value>
```

If browser cookies are provided, save:
```bash
AUTH_TOKEN=<value>
CT0=<value>
```

If the user already has Twitter-prefixed cookie names, preserve them too:
```bash
TWITTER_AUTH_TOKEN=<value>
TWITTER_CT0=<value>
```

After saving cookie-based X values, verify Node.js because the last30days X backend is the bundled `bird-search.mjs` script:

```bash
if ! command -v node >/dev/null 2>&1; then
  if command -v brew >/dev/null 2>&1; then
    brew install node || true
  elif [ -x /opt/homebrew/bin/brew ]; then
    /opt/homebrew/bin/brew install node || true
  else
    echo "Node.js missing; cookie-based X needs Node. Install with: brew install node"
  fi
fi
command -v node >/dev/null 2>&1 && node --version || true
```

Then re-run `<skill_dir>/scripts/last30days_adapter.py check`. If `x_backend.ready` is false with cookie keys present, report the missing dependency instead of saying X is active.

**Optional search alternatives**

Ask only if the user wants lower-cost provider trials or already has keys:

- `SERPER_API_KEY`
- `DATAFORSEO_LOGIN` and `DATAFORSEO_PASSWORD`
- `BRAVE_API_KEY` (guarded; do not auto-enable broad last30days grounding)
- `YOU_API_KEY` or `YDC_API_KEY`
- `PERPLEXITY_API_KEY`

Make the cost rule clear:
> "Normal weekly runs keep broad paid last30days grounding disabled. Exa can still be used for targeted hard-evidence resolution. To intentionally enable broad grounding, use `VC_SIGNALS_ALLOW_LAST30DAYS_GROUNDING=1` and a dollar cap."

**Optional deep research**

**OpenRouter API Key (`OPENROUTER_API_KEY`) -- optional:**
> "OpenRouter gives access to Perplexity-style deep research for theme drill-downs. Paste an OpenRouter key, or type 'skip'."

**Optional social/video**

**ScrapeCreators API Key (`SCRAPECREATORS_API_KEY`) -- optional:**
> "ScrapeCreators unlocks TikTok, Instagram, and YouTube through last30days. Paste a key, or type 'skip'."

**Manual-mode structured providers**

Do not block setup on these. Say:
> "Crunchbase, Coresignal, and LinkedIn stay manual-mode for now unless you already have compliant API access. You can add these later, but they are not required for this sprint."

Accepted optional direct-provider vars:

- `CORESIGNAL_API_KEY`
- `CRUNCHBASE_API_KEY` or `CRUNCHBASE_TOKEN`
- `LINKEDIN_ACCESS_TOKEN` or `LINKEDIN_API_KEY`

**Direct LLM API fallback**

Do not ask for OpenAI, Gemini, or xAI as normal reasoning providers. Say:
> "Claude Code/Codex is the reasoning layer. Direct OpenAI/Gemini/xAI LLM APIs are only for standalone non-harness runs and stay disabled unless `VC_SIGNALS_ALLOW_DIRECT_LLM_API=1` is set."

### Step 4: last30days Research Engine

Check availability:
```bash
"$RUNTIME_PY" <skill_dir>/scripts/last30days_adapter.py check
```

If not installed, tell the user:
> "The last30days research engine gives us much better results by searching Reddit, Hacker News, X/Twitter, and YouTube independently. Without it, we'll use web search which still works but gives less detailed results."
>
> "Want me to set it up? It takes about 5 minutes and requires a few API keys. Or we can skip this and use web search for now."

If they want to proceed:
```bash
git clone https://github.com/mvanhorn/last30days-skill.git vendor/last30days-skill
```

### Step 5: Save Configuration

Save all collected keys to `~/.config/last30days/.env`:
```bash
mkdir -p ~/.config/last30days
```

Write the .env file with all provided keys. Preserve existing keys, append missing keys, and never echo secret values back to the user.

Add `SETUP_COMPLETE=true` at the end.

Then lock down the file permissions:
```bash
chmod 600 ~/.config/last30days/.env
```

Say: "Setting secure file permissions so only your user can read the keys."
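
The save rules in this step (preserve existing keys, append missing ones, end with `SETUP_COMPLETE=true`, restrict permissions, never echo secrets) could be implemented along these lines. It is a hedged sketch: `merge_env` is a hypothetical helper, and it assumes a plain `KEY=value` file with no quoting or `export` prefixes.

```python
import os
from pathlib import Path


def merge_env(path, new_values):
    """Append missing keys, keep existing ones, and put SETUP_COMPLETE=true last."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    lines = path.read_text().splitlines() if path.exists() else []
    # Drop any earlier marker so it can be re-added as the final line.
    lines = [ln for ln in lines if not ln.startswith("SETUP_COMPLETE=")]
    existing = {ln.split("=", 1)[0] for ln in lines if "=" in ln and not ln.startswith("#")}
    added = []
    for key, value in new_values.items():
        if key not in existing:  # preserve keys that are already set
            lines.append(f"{key}={value}")
            added.append(key)
    lines.append("SETUP_COMPLETE=true")
    path.write_text("\n".join(lines) + "\n")
    os.chmod(path, 0o600)  # owner read/write only
    return added  # key names only, never secret values
```

Returning only the names of the added keys makes it easy to report configured/missing status without printing a secret.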

### Step 6: Verify

Run a quick test:
```bash
"$RUNTIME_PY" <skill_dir>/scripts/last30days_adapter.py check
```

Then run source access detection:
```bash
"$RUNTIME_PY" <skill_dir>/scripts/source_access.py
```

Optionally verify key presence without printing secrets:
```bash
grep -q '^EXA_API_KEY=' ~/.config/last30days/.env && echo "Exa configured" || echo "Exa missing"
grep -q '^PRODUCT_HUNT_TOKEN=' ~/.config/last30days/.env && echo "Product Hunt configured" || echo "Product Hunt missing"
grep -q '^ATTIO_ACCESS_TOKEN=' ~/.config/last30days/.env && echo "Attio configured" || echo "Attio missing"
```

### Step 7: Summary

Print what's configured and what each unlocks:

> **Setup complete. Here's what's active:**
> - [x/skip] Exa -- targeted hard-evidence search and official-domain resolution
> - [x/skip] Product Hunt -- structured product launch source
> - [x/skip] GitHub API -- trending repo and OSS market-radar discovery
> - [x/skip] Attio -- owner/status/pass/stale context
> - [x/skip] last30days -- Reddit, HN, X/Twitter, YouTube
> - [x] Harness LLM reasoning -- Claude/Codex plans searches and validates evidence
> - [x/skip] Deep research (OpenRouter) -- Perplexity synthesis for theme drill-downs
> - [x/skip] X -- launch radar
> - [manual] Crunchbase/Coresignal/LinkedIn -- manual-mode unless direct API keys were provided
>
> (Show [x] for configured items, [ ] for skipped items, based on what the user actually set up.)
>
> **You can run `/vc-signals setup` again anytime to add more capabilities.**
> **Try it out: `/vc-signals radar all`**

---

## Mode: Radar (Weekly Marathon Scan)

**Triggers:** `/vc-signals radar <sector|all>` or `/vc-signals weekly <sector|all>` (alias)

**Sectors:** `devtools`, `cybersecurity`, `ai-infra`, `vertical-ai`, `data-infra`, `oss`. Use `all` for the default Marathon weekly artifact.

If sector is not recognized, say so and list valid sectors.

**Default Marathon output:** one unified all-sector brief delivered Monday 8:00 AM ET when scheduled. Put a cross-sector summary and top 10-15 companies first, then compact sections for each sector. Do not dump 50 rows into Slack; Slack gets the teaser/digest and a link or artifact for the full brief. If no Slack channel is configured yet, produce the Markdown artifact and include instructions to set the channel later.

### Step 1: Load Configuration

Read the sector taxonomy:
```bash
cat <skill_dir>/config/sectors.json
```

Read the company alias map:
```bash
cat <skill_dir>/config/company_aliases.json
```

For deterministic weekly runs, use the orchestration helper as the canonical packet writer. Do not replace its generated report with a hand-written web-research memo:

```bash
python3 <skill_dir>/scripts/radar_run.py weekly --sectors all --output-dir <output_dir>
```

This saves raw evidence JSON, normalized signals, scored candidates, a weekly preview, a weekly focus packet, and `quality-gate.json`. The default weekly command is the full-quality safe path:

- It uses Product Hunt, YC, GitHub, and X/social sources where configured, plus direct Exa-first hard-evidence resolution for Product Hunt/X rows by default.
- Broad `last30days` paid grounding is disabled by default even when Brave, Exa, Serper, or Parallel keys exist; enable it only for intentional deep validations with `VC_SIGNALS_ALLOW_LAST30DAYS_GROUNDING=1` or `--allow-last30days-grounding`.
- Disable targeted hard evidence only with `--no-hard-evidence-live` or `VC_SIGNALS_HARD_EVIDENCE_DISABLE=1`.
- Signal investigation is runtime-capped by default (`--signal-investigation-max-runtime-seconds` overrides it), so weekly runs finish with explicit partial investigation instead of hanging quietly.

After every weekly run, read `<output_dir>/quality-gate.json` before summarizing. The gate is the product truth:
- `passing` means the generated packet cleared the canonical weekly shape.
- `partial` means the row counts passed but some source coverage needs caveats.
- `thin` means the packet is a partial review queue, not a full-quality weekly radar.
- `smoke` means setup/artifact validation only; never judge Marathon output quality from it.

If `quality-gate.json` says `do_not_freestyle_final_report: true`, obey it. Summarize the generated `weekly-preview.md` and `weekly-focus.md`; do not create a replacement radar with ad hoc web research. Extra Claude intelligence is allowed only as a clearly labeled supplement, such as "Supplemental leads to verify", and must not promote unsupported companies into the canonical rows.
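
As an illustration of reading the gate before summarizing, here is a small sketch. Only the `do_not_freestyle_final_report` key and the four status values come from this document; the `status` field name, the helper name `read_quality_gate`, and the choice to treat a missing flag as "do not freestyle" are assumptions.

```python
import json
from pathlib import Path

# Guidance text paraphrases the gate meanings listed above.
GATE_GUIDANCE = {
    "passing": "Packet cleared the canonical weekly shape.",
    "partial": "Row counts passed, but some source coverage needs caveats.",
    "thin": "Partial review queue, not a full-quality weekly radar.",
    "smoke": "Setup/artifact validation only; do not judge output quality from it.",
}


def read_quality_gate(output_dir):
    """Load quality-gate.json and return the fields needed before summarizing."""
    gate = json.loads((Path(output_dir) / "quality-gate.json").read_text())
    status = gate.get("status")  # ASSUMPTION: the real field name may differ
    return {
        "status": status,
        "guidance": GATE_GUIDANCE.get(status, "Unknown gate status; report it verbatim."),
        # ASSUMPTION: a missing flag is treated conservatively as True.
        "must_not_freestyle": bool(gate.get("do_not_freestyle_final_report", True)),
    }
```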

Official-site crawling is targeted/opt-in because it is useful but slow. Enable it only for focused evidence-completion passes with `VC_SIGNALS_OFFICIAL_SITE_CRAWL_ENABLE=1`; normal weekly runs rely on hard-evidence search and targeted manual enrichment first.

Paid-search guardrails are always expected for local weekly runs:
- Shared provider cache: `~/.cache/vc-signals/provider-search-cache` (`VC_SIGNALS_PROVIDER_CACHE_DIR` overrides it)
- Spend ledger: `~/.cache/vc-signals/paid-search-ledger.jsonl` (`VC_SIGNALS_PAID_SEARCH_LEDGER_PATH` overrides it)
- Default caps: smoke/dev `$0.50`, manual enrichment `$2`, weekly `$8`, deep validation `$25`
- Override cap: `VC_SIGNALS_PAID_SEARCH_MAX_USD=<amount>`
- `last30days` passes `--web-backend=none` by default when paid web keys exist; after `VC_SIGNALS_ALLOW_LAST30DAYS_GROUNDING=1`, it can use `VC_SIGNALS_LAST30DAYS_WEB_BACKEND`, Exa, Serper, Parallel, or Brave with `VC_SIGNALS_ALLOW_BRAVE_AUTO=1`
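
The cap logic above can be pictured with a short sketch. The dollar amounts and the `VC_SIGNALS_PAID_SEARCH_MAX_USD` override come from this section; the run-type labels used as dictionary keys and both helper names are hypothetical, and the real ledger format is not shown here.

```python
import os

# Default caps from this section; the dictionary keys are illustrative labels.
DEFAULT_CAPS_USD = {
    "smoke": 0.50,
    "manual_enrichment": 2.0,
    "weekly": 8.0,
    "deep_validation": 25.0,
}


def paid_search_cap(run_type, env=None):
    """Return the dollar cap for a run, honoring the override variable."""
    env = os.environ if env is None else env
    override = env.get("VC_SIGNALS_PAID_SEARCH_MAX_USD")
    if override:
        return float(override)
    return DEFAULT_CAPS_USD[run_type]


def within_budget(spent_usd, next_cost_usd, cap_usd):
    """True when the next paid call would keep total spend at or under the cap."""
    return spent_usd + next_cost_usd <= cap_usd
```

Checking `within_budget` before each paid call, rather than after, keeps a run from overshooting its cap by one request.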

Before a validation sprint or any repeated run, preview paid-search cost and exit before live collection:

```bash
python3 <skill_dir>/scripts/radar_run.py weekly --sectors all --output-dir <output_dir> --paid-search-dry-run
```

To compare search providers without spending, use the A/B scaffold:

```bash
python3 <skill_dir>/scripts/provider_ab_test.py --queries-file queries.json --providers brave,exa,serper,dataforseo
```

Add `--live --max-usd 1` only when the user intentionally wants a paid provider trial.

For a lightweight smoke test, and only when the user wants to try the flow quickly, run:

```bash
python3 <skill_dir>/scripts/radar_run.py weekly --sectors all --output-dir <output_dir> --first-pass
```

`--first-pass` uses one query per sector and a 45-second per-query cap. Do not use first-pass mode to judge final Marathon radar quality.

The weekly artifact must not silently omit sectors or source quality. If a sector has no qualified candidates, render a sector coverage note explaining whether the cause was no source evidence, source-not-candidate-eligible evidence, weak evidence, or missing grounded web/company enrichment. If Product Hunt, YC, X, hard evidence, Attio, or non-OSS company rows are missing or thin, rely on the generated quality gate instead of writing around the gap.

For lower-level debugging, run collection and preview separately:

```bash
python3 <skill_dir>/scripts/radar_run.py collect --output-dir <output_dir>
```

This writes raw evidence JSON, filters obvious GitHub/Reddit noise, uses the bundled Python 3.12 runtime for last30days when available, and keeps collection separate from investor judgment.

To build an automatic scored preview from saved evidence:

```bash
python3 <skill_dir>/scripts/radar_run.py preview --from-evidence <output_dir>/<YYYY-MM-DD>-raw-evidence.json --output <output_dir>/<YYYY-MM-DD>-auto-scored-preview.md
```

The automatic preview is intentionally conservative: it extracts candidate companies/projects, applies first-pass Investment Interest and Evidence Confidence scores, merges Attio context when `ATTIO_ACCESS_TOKEN` is present, filters low-interest candidates, and renders only Medium/High-interest rows. If it underfills the table, label the run as thin/partial through the generated quality gate. Do not lower the threshold, and do not replace the canonical packet with freestyled synthesis.

The preview schema includes LinkedIn, Founders, and X columns. Fill them only from evidence, Attio/CRM fields, structured seed input, or explicit user-provided data. Do not invent LinkedIn or founder links. Without grounded web or LinkedIn-capable evidence, leave those cells blank and treat them as next diligence.

### Step 2: Check for Previous Briefing (Week-over-Week)

```bash
python3 <skill_dir>/scripts/persistence.py load-previous --sector <SECTOR> --before "$(date +%Y-%m-%d)"
```

If a previous briefing exists, save it for comparison in Step 7.

Also load the theme index to check for durable themes:
```bash
cat <skill_dir>/data/history/theme_index.json 2>/dev/null || echo "{}"
```

Use this to identify themes that have appeared in 3+ consecutive scans — these are candidates for the "Durable" section of the week-over-week comparison.
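
One way to read "3+ consecutive scans" is a streak that ends at the most recent scan. The sketch below takes that reading and works on per-scan theme sets that have already been extracted; the layout of `theme_index.json` is not shown in this document, so the input shape and the name `durable_themes` are assumptions.

```python
def durable_themes(scans, min_streak=3):
    """Return themes present in each of the latest `min_streak` scans.

    `scans` is a list of collections of theme names, oldest first.
    """
    if len(scans) < min_streak:
        return set()
    recent = [set(scan) for scan in scans[-min_streak:]]
    return set.intersection(*recent)
```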

### Step 3: Select Retrieval Path

```bash
python3 <skill_dir>/scripts/last30days_adapter.py check
```

If `installed` is true -> use **last30days path**. The engine has useful zero-config free sources (`reddit`, `hackernews`, `github`, `polymarket`) even when no optional keys are configured.
Otherwise -> use **WebSearch path**.

Tell the user which path you're using:
- "Using last30days for multi-source research. Free sources are active; optional keys unlock X, YouTube, TikTok/Instagram, and grounded web."
- "Using web search for research. For deeper coverage across Reddit, HN, X, and YouTube, run `/vc-signals setup`."
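
The path selection can be sketched as a small function over the adapter's `check` output. This assumes the adapter prints a JSON object containing an `installed` boolean, which matches how this step refers to it but is not confirmed here; the helper name `choose_retrieval_path` is hypothetical.

```python
import json


def choose_retrieval_path(check_output):
    """Pick 'last30days' only when the check output clearly says it is installed."""
    try:
        status = json.loads(check_output)
    except (json.JSONDecodeError, TypeError):
        return "websearch"  # unreadable output falls back to WebSearch
    if not isinstance(status, dict):
        return "websearch"
    return "last30days" if status.get("installed") is True else "websearch"
```

Falling back to WebSearch on any unreadable output keeps a broken adapter from blocking the scan.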

### Step 4: Retrieve Evidence

**WebSearch path:**

Generate 8-12 search queries from the taxonomy. Use the sector's `discovery_queries` plus queries built from each subcategory's `seed_queries`. Run each query using the WebSearch tool and collect all results.

Example query generation for devtools:
1. Use the sector's `discovery_queries` directly (6 queries)
2. Pick the most important seed query from each subcategory (6 queries)
3. Total: ~12 queries

For each query, use WebSearch. Collect titles, URLs, snippets.
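
The query generation above might look like this in Python. It is a sketch under assumptions: `subcategories` is taken to be a mapping from name to an object with a `seed_queries` list, the first seed query stands in for "most important", and `build_queries` is a hypothetical name.

```python
def build_queries(sector_cfg, max_queries=12):
    """Combine discovery queries with one seed query per subcategory."""
    queries = list(sector_cfg.get("discovery_queries", []))
    for sub in sector_cfg.get("subcategories", {}).values():
        seeds = sub.get("seed_queries", [])
        if seeds:
            queries.append(seeds[0])  # ASSUMPTION: first seed is the most important
    # Deduplicate case-insensitively while preserving order.
    seen, unique = set(), []
    for query in queries:
        key = query.strip().lower()
        if key and key not in seen:
            seen.add(key)
            unique.append(query.strip())
    return unique[:max_queries]
```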

**Filtering noise:** Check the sector's `negative_terms` list from the taxonomy config. Skip or deprioritize results that are clearly tutorial content, beginner guides, or consumer product reviews. These terms exist to reduce noise — use them when evaluating search results.
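
A simple way to apply `negative_terms` is a case-insensitive substring check over each result's title and snippet, as sketched below. The result field names `title` and `snippet` and the helper name `split_by_noise` are assumptions; matched results are set aside rather than deleted so they can still be reviewed.

```python
def split_by_noise(results, negative_terms):
    """Partition results into (keep, deprioritized) by negative-term matches."""
    terms = [term.lower() for term in negative_terms]
    keep, noisy = [], []
    for result in results:
        text = f"{result.get('title', '')} {result.get('snippet', '')}".lower()
        bucket = noisy if any(term in text for term in terms) else keep
        bucket.append(result)
    return keep, noisy
```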

**last30days path (if available):**

Run 3-5 queries through the adapter with auto-resolve enabled. Auto-resolve discovers the right subreddits, X handles, and GitHub context automatically — no hardcoded lists needed.

Use `--lookback-days` to control the time window. The default for radar/weekly scans is 14 days. If the user appended a time window like `7d` or `30d` to the command, pass its number of days instead (for example, `7d` becomes `--lookback-days 7`).

Curated subreddits from `sectors.json` are passed alongside `--auto-resolve` so the known-good list is always covered while auto-resolve discovers new sources.

```bash
# Set LOOKBACK_DAYS once for this scan, then reuse across all queries below.
LOOKBACK_DAYS=14   # radar/weekly default; override with the user's time-window digits if specified

# Read curated subreddits from the sector taxonomy.
SUBREDDITS=$(python3 -c "import json; print(','.join(json.load(open('<skill_dir>/config/sectors.json'))['<SECTOR>'].get('subreddits', [])))")

python3 <skill_dir>/scripts/last30days_adapter.py query --topic "<specific theme query>" --sources "reddit,hackernews,x,youtube,github,polymarket,grounding" --subreddits "${SUBREDDITS}" --auto-resolve --store --lookback-days ${LOOKBACK_DAYS}
```

Run each of the sector's `hn_queries` from the config, plus 2-3 discovery queries. Auto-resolve handles subreddit and handle targeting.

For `vertical-ai` and prosumer/creator-heavy workflows, add TikTok/Instagram only when they plausibly carry customer pull:

```bash
python3 <skill_dir>/scripts/last30days_adapter.py query --topic "<vertical AI query>" --sources "reddit,hackernews,x,youtube,tiktok,instagram,github,grounding" --tiktok-hashtags "<tags>" --ig-creators "<handles>" --auto-resolve --store --lookback-days ${LOOKBACK_DAYS}
```

For infra-heavy sectors (`devtools`, `cybersecurity`, `ai-infra`, `data-infra`, `oss`), keep TikTok/Instagram off by default unless the user asks.

**IMPORTANT: Query strategy matters.** Hacker News search works best with specific, concise queries — NOT broad category dumps. Use 2-4 focused keywords per query.

Good queries (specific, get results):
- "AI coding agent Cursor Claude Code"
- "supply chain attack secrets security"
- "AI code review tools"
- "platform engineering internal developer portal"

Bad queries (too broad, return 0 results):
- "CI CD testing observability platform engineering" (too many unrelated keywords)
- "emerging developer tools trends 2026" (too generic)

After collecting results, filter by engagement:
- HN: prioritize items with points > 20 or comments > 10
- Reddit: prioritize items with score > 10 or num_comments > 5
- X: prioritize items with high engagement
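
The engagement thresholds above can be expressed as a stable sort that moves high-engagement items to the front without dropping anything. The item keys (`source`, `points`, `comments`, `score`, `num_comments`) are assumed field names, and because no numeric threshold is given for X, items from other sources are left in the high-priority group.

```python
def is_high_engagement(item):
    """Apply the HN and Reddit thresholds from this step."""
    source = item.get("source")
    if source == "hackernews":
        return item.get("points", 0) > 20 or item.get("comments", 0) > 10
    if source == "reddit":
        return item.get("score", 0) > 10 or item.get("num_comments", 0) > 5
    return True  # no numeric threshold is stated for other sources


def prioritize(items):
    """High-engagement items first; original order is kept within each group."""
    return sorted(items, key=lambda item: not is_high_engagement(item))
```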

If auto-resolve fails or last30days results are thin, supplement with WebSearch using `site:` targeting:
- `"<topic> site:news.ycombinator.com"`
- `"<topic> site:reddit.com/r/programming"`

### Step 5: Retrieve GitHub Trending Data

```bash
python3 <skill_dir>/scripts/github_trending.py --sector <SECTOR> --limit 15
```

This runs in addition to the general retrieval. GitHub data feeds into momentum scoring and company mapping.

### Step 6: Synthesize Themes

Now you have all the evidence. This is where your reasoning does the heavy work.

**Extract candidate themes:**
- Read through ALL retrieved evidence (search results, last30days items, GitHub repos)
- Identify recurring topics, problems, technologies, or shifts mentioned across multiple sources
- Name each candidate theme concisely (e.g., "AI-Powered Code Review", "Browser Sandboxing for AI Agents")
- Tag each with the relevant subcategory from the taxonomy

**Cluster and deduplicate:**
- Merge near-duplicate themes. Examples:
  - "AI code review" + "LLM-powered code review" + "automated PR review" -> "AI-Powered Code Review"
  - "shift-left security" + "developer-first security" -> "Developer-First Security Tooling"
- Pick canonical names that are specific enough to be useful, generic enough to cover the cluster

**Filter low-yield themes (Phase 1 radar requirement):**

After clustering, drop any theme that produces fewer than 3 mappable companies in Step 7. A theme without investable companies is signal noise — better to surface 6 themes × 8 companies than 12 themes × 2 companies.

Operationally: do a quick first pass through Step 7's company mapping for each candidate theme. If a theme yields fewer than 3 companies (across seed map, evidence, and GitHub data combined), drop it from the radar before scoring momentum. Note the dropped themes in the persistence record so future scans can detect when a previously-dropped theme starts producing companies.
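
The low-yield filter can be sketched as below. It assumes the quick mapping pass has already produced a dictionary from theme name to candidate company names; `filter_low_yield` is a hypothetical helper, and company names are compared case-insensitively so duplicates across sources count once.

```python
def filter_low_yield(theme_companies, min_companies=3):
    """Keep themes with at least `min_companies` distinct companies."""
    kept, dropped = {}, []
    for theme, companies in theme_companies.items():
        unique = {name.strip().lower() for name in companies if name.strip()}
        if len(unique) >= min_companies:
            kept[theme] = sorted(unique)
        else:
            dropped.append(theme)  # record these in the persistence notes
    return kept, dropped
```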

**Score each theme -- Momentum (1-10):**

Assign a transparent momentum score. For each theme, weigh these factors:

| Factor | Weight | What to look for |
|--------|--------|-----------------|
| Recency | High | Are discussions from the last 1-2 weeks? |
| Source diversity | High | Does it appear across multiple independent sources? |
| Repetition density | Medium | How many distinct mentions vs one viral post? |
| Engagement signals | Medium | High upvotes/comments/stars on related content? |
| Novelty | Medium | Is this a NEW conversation or evergreen background chatter? |
| Technical specificity | Low | Specific tools/approaches mentioned vs vague hand-waving? |
| GitHub velocity | Low | Are related repos showing star acceleration? |

Rubric:
- **8-10:** Breakout -- multiple sources, very recent, specific tools named, high engagement, new conversation
- **5-7:** Rising -- clear signal but fewer sources, or overlaps with existing known trends
- **3-4:** Ambient -- mentioned but not clearly accelerating, could be background noise
- **1-2:** Faint -- single source, low engagement, or possibly stale

You MUST explain how you arrived at each score in 1-2 sentences.
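One way to keep the score transparent is to rate each factor from 0.0 to 1.0 and combine the ratings with fixed weights. The sketch below is an assumption, not a prescribed formula: the numeric weights (High = 3, Medium = 2, Low = 1), the factor key names, and the linear mapping onto 1-10 are illustrative choices. The written 1-2 sentence explanation is still required.

```python
from typing import Dict

# Assumed numeric weights: High = 3, Medium = 2, Low = 1.
WEIGHTS: Dict[str, int] = {
    "recency": 3,
    "source_diversity": 3,
    "repetition_density": 2,
    "engagement": 2,
    "novelty": 2,
    "technical_specificity": 1,
    "github_velocity": 1,
}


def momentum_score(ratings: Dict[str, float]) -> int:
    """Map per-factor ratings in [0, 1] to an integer score in [1, 10]."""
    total_weight = sum(WEIGHTS.values())
    weighted = sum(
        weight * min(max(ratings.get(name, 0.0), 0.0), 1.0)  # clamp to [0, 1]
        for name, weight in WEIGHTS.items()
    )
    return max(1, min(10, round(1 + 9 * weighted / total_weight)))


def momentum_band(score: int) -> str:
    """Translate a score into the rubric band names used above."""
    if score >= 8:
        return "Breakout"
    if score >= 5:
        return "Rising"
    if score >= 3:
        return "Ambient"
    return "Faint"
```

Missing factors count as 0.0, so a theme with no GitHub data is penalized only by the low weight of that factor.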

**Rate confidence (low / medium / high):**
- **High:** Multiple independent sources, specific evidence, corroborated
- **Medium:** Clear signal but limited sources or partially inferred
- **Low:** Thin evidence, single source, or extrapolated

**Assess investment timing (early / mid / late):**
- **Early:** Problem is being discussed, OSS projects emerging, no clear commercial winners
- **Mid:** Commercial players exist, some funding, category is forming
- **Late:** Well-known category, established players, late-stage rounds

**Hype vs Durable verdict:**
One blunt sentence. Example: "Durable -- real pain point with multiple well-funded solutions." or "Likely hype -- single viral post driving most of the signal, unclear staying power."

**Theme tags are computed in Step 9, from a snapshot of the theme index taken BEFORE the index is updated** (computing them after the update would make every first-time theme miss its NEW tag). Do not pre-compute tags here.

### Step 7: Map Companies

For each surviving theme (after Step 6 filtering), identify 8-12 relevant companies using three sources:

1. **Seed map:** Check `company_aliases.json` — does any known company map to this theme?
2. **Evidence:** Were any companies/projects mentioned in the search results for this theme?
3. **GitHub data:** Do any trending repos from Step 5 relate to this theme?

**Target: 30-50 total companies across the radar.** This is the centerpiece of the output, not an accessory. For `all`, spread coverage across sector sections and avoid letting AI infra crowd out every other sector.

**Deduplicate across themes:**

A company that appears in evidence for multiple themes (e.g., CodeRabbit in both "AI Code Review" and "Agentic Coding") must appear in the company list ONCE, not twice. Pick the strongest theme as `primary_theme`, list the others in `secondary_themes`. The output table shows only `primary_theme`.

To pick `primary_theme`:
- Prefer the theme with the most direct evidence
- If tied, prefer the theme where the company is `direct_solver` over `beneficiary`
- If still tied, pick the theme with higher momentum
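The tie-break order above maps naturally onto a compound sort key. In this sketch, `pick_primary_theme` and the candidate fields (`evidence_count` as a proxy for "most direct evidence", `role`, `momentum`) are hypothetical names; ranking `adjacent` and `unclear` below `beneficiary` is an assumed extension of the stated rule.

```python
from typing import Dict, List, Tuple

# Lower rank wins. Only direct_solver over beneficiary is stated in the rule;
# the positions of the other two roles are an assumption.
ROLE_RANK: Dict[str, int] = {
    "direct_solver": 0,
    "beneficiary": 1,
    "adjacent": 2,
    "unclear": 3,
}


def pick_primary_theme(candidates: List[dict]) -> Tuple[str, List[str]]:
    """Return (primary_theme, secondary_themes) for one company."""
    ranked = sorted(
        candidates,
        key=lambda c: (
            -c["evidence_count"],          # 1. most direct evidence first
            ROLE_RANK.get(c["role"], 3),   # 2. then the stronger role
            -c["momentum"],                # 3. then higher momentum
        ),
    )
    return ranked[0]["theme"], [c["theme"] for c in ranked[1:]]
```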

**For each company, capture:**

| Field | Source | Notes |
|-------|--------|-------|
| `name` | Display form | "MintMCP", "Anysphere (Cursor)" |
| `primary_theme` | Picked above | The theme this company most clearly rides |
| `secondary_themes` | Other themes the company touches | List of theme names |
| `why_on_radar` | One sentence | The single most concrete reason this is investable. Specific signal, not generic. |
| `evidence_url` | Best source | URL of the strongest piece of evidence |
| `role` | direct_solver / beneficiary / adjacent / unclear | Same taxonomy as before |
| `confidence` | confirmed / likely / inferred / speculative | Same as before |
| `stage` | null | Phase 2 fills this; leave null for now |
| `raised` | null | Phase 2 fills this |
| `headcount` | null | Phase 2 fills this |
| `founders` | null | Phase 2 fills this |
| `investment_interest` | high / medium / low or 0-100 | How much Marathon should care if the evidence is true |
| `evidence_confidence` | high / medium / low or 0-100 | How well-supported the claim is |
| `attio_status` | no_match / stale / no_owner / active / passed / unknown | Phase 4 fills this; use unknown until Attio is connected |
| `action` | assign owner / refresh note / monitor only / flag quietly / ignore | Use judgment; passed companies get flag quietly |
| `why_this_may_be_noise` | One sentence | Skeptical counter-case |
| `source_links` | List | 1-3 supporting URLs |

**The `why_on_radar` field is critical.** It's what Alex reads in the radar table. Bad: "AI testing company". Good: "AI-native test gen, launched 3 weeks ago, ex-Datadog founders". One sentence, specific signal.

**Company tags are computed in Step 9, from a snapshot of the company index taken BEFORE the index is updated.** Do not pre-compute tags here.

**Do NOT:**
- Pretend to know things you don't (especially funding amounts — leave null until Phase 2)
- Map a company to a theme just because the names sound related
- Duplicate companies across themes — pick one primary

### Step 7.5: Match Against Attio

If `ATTIO_ACCESS_TOKEN` is available, enrich the company rows before formatting:

```bash
python3 <skill_dir>/scripts/attio.py check
```

Then pass the deduplicated company list:

```bash
printf '%s' '<companies-json>' | python3 <skill_dir>/scripts/attio.py enrich
```

Input shape:

```json
{"companies":[{"name":"Cascade","domain":"runcascade.com"}]}
```

Use the returned fields:
- `attio_status`: `no_match`, `no_owner`, `active`, `stale`, `passed`, or `unknown`
- `attio_action`: `assign owner`, `refresh note`, `monitor only`, `flag quietly`, etc.
- `attio_lists`: human-readable list names
- `attio_attributes`: selected CRM fields such as status, last interaction, round, headcount, and amount raised when present

Matching rule: prefer company domain over name. The Attio client verifies stored domains before accepting a domain-based fuzzy search result, because Attio search can return similarly named but wrong companies. If no domain is known, use company name but mark weak matches conservatively.

Status rule:
- `no_match` -> new discovery; action `assign owner`
- `no_owner` -> already in Attio but needs assignment; action `assign owner`
- `stale` -> old pipeline or stale interaction context; action `refresh note`
- `passed` or `Deprioritized` -> action `flag quietly` and explain what changed
- `active` -> avoid duplicate outreach; action `monitor only` or enrich existing context
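The status rule reads as a lookup table. A minimal sketch follows, assuming a hypothetical helper `default_action`. It returns `None` for `unknown` or unrecognized statuses so the action is still chosen by judgment, and it picks `monitor only` for `active` even though enriching existing context is an equally valid choice.

```python
from typing import Dict, Optional

STATUS_TO_ACTION: Dict[str, str] = {
    "no_match": "assign owner",
    "no_owner": "assign owner",
    "stale": "refresh note",
    "passed": "flag quietly",
    "deprioritized": "flag quietly",
    "active": "monitor only",
}


def default_action(attio_status: Optional[str]) -> Optional[str]:
    """Return the default action for an Attio status, or None if undecided."""
    key = (attio_status or "unknown").strip().lower()
    return STATUS_TO_ACTION.get(key)
```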

### Step 8: Format Output

Output uses the radar format: themes are short headers, the company table is the centerpiece, week-over-week diff lives at the top.

**Begin with the week-over-week diff (only if a previous briefing exists):**

```bash
echo '{"current": <CURRENT_COMPANIES_JSON>, "previous": <PREV_COMPANIES_JSON>}' | \
  python3 <skill_dir>/scripts/persistence.py company-diff
```

The output gives `new_companies` and `faded_companies`. Use these to populate the "New To Radar" and "Faded Off Radar" sections below.

**Output template:**

````markdown
## VC Radar: {Sector Display Name} — Week of {YYYY-MM-DD}

### What's Moving

[For each theme, exactly 2 lines, as shown:]
- **{Theme Name}** — {TAG (Nw)}. {one-line why-now in <120 chars}.
  Companies riding this: {N}

[6-8 themes max. TAG is one of NEW, ACCELERATING (#prev → #curr), PERSISTENT (Nw), or omitted if no tag. (Nw) means "N weeks active".]

### Company Radar ({N} companies)

| Company | Sector | Theme | Interest | Evidence | Attio | Action | Why On Radar | Why This May Be Noise | Sources |
|---------|--------|-------|----------|----------|-------|--------|--------------|------------------------|---------|
| MintMCP | AI Infra | MCP Infra | High | Medium | unknown | assign owner | First SOC2-compliant MCP gateway, picking up enterprise pilots | New category; buyer urgency unproven | [1](...) |
| CodeRabbit | Devtools | AI Code Review | Medium | High | unknown | monitor only | 2M repos, 13M PRs reviewed | Crowded category; likely consensus | [1](...) |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |

[30-50 rows. For `all`, group by sector after the cross-sector top 10-15. Include NEW / RETURNING / PERSISTENT tags inside Why On Radar when useful instead of adding a separate tag column.]

### New To Radar This Week ({N} companies)

[Bulleted list, max 10. If more than 10 are new, show top 10 by best evidence and note "+N more in the table above".]
- {Company} — {primary_theme}. {why_on_radar}
- ...

### Faded Off Radar ({N} companies)

[Bulleted list, max 10. From company-diff faded_companies.]
- {Company} — last seen {date}, was in {primary_theme}.

### Theme Detail

[For each theme, 2-3 lines max. NOT the long-form analysis from v1. Pointer to deep-dive.]
- **{Theme}** ({company_count} companies) — {2-sentence context}.
  Run `/vc-signals theme "{Theme}"` for full evidence and subthemes.
````

**Sorting rules:**
- Themes in "What's Moving" are sorted by tag (NEW → ACCELERATING → PERSISTENT → no-tag), then by momentum descending.
- Company Radar rows are sorted by primary_theme (matching the order themes appear in "What's Moving"), then alphabetically by company name within each theme.
- New To Radar entries follow the same primary_theme order as the Company Radar rows.
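The sorting rules can be sketched with two key functions. The field names (`name`, `tag`, `momentum`, `primary_theme`) mirror the fields used elsewhere in this document, but the helper functions are hypothetical. Any tag not listed in the rule is treated as no-tag, which is an assumption.

```python
from typing import Dict, List, Optional

TAG_ORDER: Dict[Optional[str], int] = {
    "NEW": 0,
    "ACCELERATING": 1,
    "PERSISTENT": 2,
    None: 3,
}


def sort_themes(themes: List[dict]) -> List[dict]:
    """Order themes by tag, then by momentum descending."""
    return sorted(
        themes,
        key=lambda t: (TAG_ORDER.get(t.get("tag"), 3), -t["momentum"]),
    )


def sort_companies(companies: List[dict], ordered_themes: List[dict]) -> List[dict]:
    """Order companies by their theme's position, then alphabetically."""
    position = {theme["name"]: i for i, theme in enumerate(ordered_themes)}
    return sorted(
        companies,
        key=lambda c: (
            position.get(c["primary_theme"], len(position)),  # unknown themes last
            c["name"].lower(),
        ),
    )
```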

**Length budget:** the entire radar should fit in roughly 150-250 lines. If it's longer, the per-theme detail is too verbose — tighten it.

### Slack Delivery Instructions

If the user asks to schedule or post to Slack:
- Default schedule: Monday 8:00 AM ET.
- Channel is intentionally not hardcoded. Ask for or read the configured channel name at scheduling time.
- Slack teaser should include only the top 10-15 cross-sector companies/projects, their sector, Investment Interest, Evidence Confidence, Attio status, action, and one-line noise caveat.
- Attach or link the full Markdown briefing. Do not paste the full 30-50 row table into Slack by default.

To modify later: change the channel name in the scheduler/Slack connector configuration, not in the radar logic. If no channel is configured, save the briefing and print: "Slack channel not configured yet; set `VC_SIGNALS_SLACK_CHANNEL` or update the scheduler prompt with the desired channel."

### Step 9: Persist Results

**Order matters.** Snapshot the OLD indices, compute tags from the snapshots (so NEW tags fire correctly for first-time entries), THEN update the indices, THEN save. Computing tags after the update breaks NEW: the entry exists in the index by then.
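A toy model shows why the order matters. It is not the logic in `persistence.py`: the index shape (name mapped to a list of dates seen) and both helper functions are hypothetical, and only the NEW tag is modeled.

```python
import copy
from typing import Dict, List, Optional


def tag_new(names: List[str], index: Dict[str, List[str]]) -> Dict[str, Optional[str]]:
    """Tag a name NEW only if the given index has never seen it."""
    return {name: ("NEW" if name not in index else None) for name in names}


def update_index(
    index: Dict[str, List[str]], names: List[str], date: str
) -> Dict[str, List[str]]:
    """Return a copy of the index with this week's sighting appended."""
    updated = copy.deepcopy(index)
    for name in names:
        updated.setdefault(name, []).append(date)
    return updated


old_index = {"CodeRabbit": ["2026-01-05"]}
this_week = ["CodeRabbit", "MintMCP"]

# Correct order: tag from the snapshot, then update.
tags_from_snapshot = tag_new(this_week, old_index)

# Wrong order: after the update every name is already in the index.
new_index = update_index(old_index, this_week, "2026-01-12")
tags_after_update = tag_new(this_week, new_index)
```

Here `tags_from_snapshot` marks MintMCP as NEW, while `tags_after_update` marks nothing as NEW because the update already recorded it.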

**Step 9a: Snapshot the OLD indices** (before any updates)

```bash
OLD_THEME_INDEX=$(cat <skill_dir>/data/history/theme_index.json 2>/dev/null || echo "{}")
OLD_COMPANY_INDEX=$(python3 <skill_dir>/scripts/persistence.py load-company-index)
```

**Step 9b: Compute tags from the OLD snapshots**

```bash
echo '{"themes": <THEMES_JSON>, "companies": <COMPANIES_JSON>, "theme_index": '"$OLD_THEME_INDEX"', "company_index": '"$OLD_COMPANY_INDEX"'}' | \
  python3 <skill_dir>/scripts/persistence.py compute-tags
```

The output enriches each theme and company with a `tag` field (NEW / ACCELERATING / RETURNING / PERSISTENT / null). Use these tags when rendering the markdown for Step 8 — re-render that section using the tagged data before saving.

**Step 9c: Update the theme index**

```bash
cat <<'THEMES_EOF' | python3 <skill_dir>/scripts/persistence.py update-index --sector <SECTOR> --date $(date +%Y-%m-%d)
[the JSON themes array goes here]
THEMES_EOF
```

**Step 9d: Update the company index** (NEW in Phase 1)

```bash
cat <<'COMPANIES_EOF' | python3 <skill_dir>/scripts/persistence.py update-company-index --sector <SECTOR> --date $(date +%Y-%m-%d)
[the JSON companies array goes here]
COMPANIES_EOF
```

**Step 9e: Save the briefing JSON** (themes + companies in one call)

```bash
cat <<'BRIEFING_EOF' | python3 <skill_dir>/scripts/persistence.py save-briefing --sector <SECTOR> --retrieval-path <websearch|last30days> --date $(date +%Y-%m-%d)
{"themes": [the tagged themes array], "companies": [the tagged companies array]}
BRIEFING_EOF
```

**Step 9f: Save the rendered markdown**

```bash
cat <<'MD_EOF' | python3 <skill_dir>/scripts/persistence.py save-markdown --subdir briefings --slug <SECTOR> --date $(date +%Y-%m-%d)
[the rendered radar markdown content goes here]
MD_EOF
```

If any persistence step fails, warn the user but still display the full briefing inline. Do not crash.

---

## Mode: Agent-Native Research Workbench

**Trigger:** `/vc-signals workbench <run-dir>`

Use this when the user wants Codex/Claude's own LLM judgment over a weekly run, especially when grounded company discovery is unavailable or the radar is OSS-heavy.

This mode is a verification workbench, not a canonical radar writer:
- It may synthesize source gaps, theme hypotheses, possible companies requiring verification, and next searches.
- It must not add rows to `candidates.json`.
- It must not claim company domains, funding, headcount, founders, customers, or stage unless those facts appear in the supplied evidence.
- Possible companies stay "requiring verification" until grounded source URLs support them.

When the user types `/vc-signals workbench <run-dir>`, do the whole flow. Do not make the user run a second prompt.

First create the machine-readable pack:

```bash
python3 <skill_dir>/scripts/radar_run.py workbench --from-run <run-dir> --output-dir <run-dir>-workbench
```

Then read both generated files yourself:

- `<run-dir>-workbench/research-workbench-prompt.md`
- `<run-dir>-workbench/research-workbench-input.json`

Use the JSON as the only factual source and write `<run-dir>-workbench/research-workbench.md` with:

1. Partner Notes
2. Source Gap Diagnosis
3. Theme Hypotheses
4. Possible Companies Requiring Verification
5. Recommended Next Searches

Finally tell the user where the summary was saved. If the requested run directory is missing, ask them to run `/vc-signals radar all` first.

If the user wants a lead promoted into the weekly radar, first run or request real verification searches that return credible company/product URLs.

---

## Mode: Theme Drill-Down

**Trigger:** `/vc-signals theme "<topic>"`

### Step 1: Load Config

Read `sectors.json` and `company_aliases.json`.

Check if the topic maps to a known subcategory. If yes, use its seed_queries as starting points. If no, generate queries from scratch.

### Step 2: Retrieve Evidence

Use the same retrieval path selection as weekly scan (check last30days availability).

Run 5-8 targeted queries about the specific theme. Include:
- "[topic] trends emerging"
- "[topic] startups companies"
- "[topic] open source projects github"
- "[topic] hacker news discussion"
- "[topic] why now 2026"
- "[topic] problems challenges"

**last30days path (if available):**

Run 3-5 queries through the adapter with auto-resolve enabled. Default lookback for theme drill-downs is 30 days. If the user appended a time window like `7d` or `30d` to the command, use that number of days instead (for example, `7d` sets `LOOKBACK_DAYS=7`).

```bash
# Set LOOKBACK_DAYS once for this drill-down, then reuse across queries below.
LOOKBACK_DAYS=30   # theme drill-down default; override with the user's time-window digits if specified

python3 <skill_dir>/scripts/last30days_adapter.py query --topic "<topic-specific query>" --sources "reddit,hackernews,x" --auto-resolve --lookback-days ${LOOKBACK_DAYS}
```

Auto-resolve discovers the right subreddits, X handles, and GitHub context automatically for the theme.
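Turning the optional time-window argument into `LOOKBACK_DAYS` could look like the sketch below. `parse_lookback_days` is a hypothetical helper, and falling back to the default on malformed or zero input is an assumption about the desired behavior.

```python
import re
from typing import Optional


def parse_lookback_days(window: Optional[str], default: int = 30) -> int:
    """Convert a window such as '7d' into a day count, else use the default."""
    if not window:
        return default
    match = re.fullmatch(r"(\d+)d", window.strip().lower())
    if not match:
        return default  # malformed input: keep the mode's default
    days = int(match.group(1))
    return days if days > 0 else default
```

For example, `parse_lookback_days("14d")` yields 14, and a missing window yields the 30-day drill-down default.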

**Deep research (if OPENROUTER_API_KEY is configured):**

For theme drill-downs, use the deep research mode for comprehensive synthesis with 50+ citations:

```bash
python3 <skill_dir>/scripts/last30days_adapter.py query --topic "<topic> emerging trends companies" --deep-research --auto-resolve --lookback-days ${LOOKBACK_DAYS}
```

This uses Perplexity Sonar Pro (~$0.90 per query) to produce a structured research report with citations. If deep research is not available (no OPENROUTER_API_KEY), fall back to the standard multi-query approach above.

Also run GitHub trending for related keywords:
```bash
python3 <skill_dir>/scripts/github_trending.py --sector all --limit 10
```
(Filter results to those matching the theme in post-processing.)

### Step 3: Synthesize and Output

```markdown
## Theme Deep-Dive: [Topic]

### What Is This?
[2-3 sentence explanation of the theme for someone unfamiliar]

### Why Now?
[What changed recently that made this theme emerge? Be specific -- new tech, new problem, regulatory shift, etc.]

### Key Subthemes
- **Subtheme A:** [1-2 sentences]
- **Subtheme B:** [1-2 sentences]
- **Subtheme C:** [1-2 sentences]

### Evidence
- [Source 1: title, URL, key insight]
- [Source 2: title, URL, key insight]
- [Source 3+]

### Adjacent Categories
- [Related theme 1 -- how it connects]
- [Related theme 2 -- how it connects]

### Companies Solving the Problem
| Name | What They Do | Confidence | Source |
|------|-------------|------------|--------|

### Companies Benefiting From the Trend
| Name | How They Benefit | Confidence |
|------|-----------------|------------|

### Relevant OSS Projects
| Project | Stars | Velocity | Commercial Entity |
|---------|-------|----------|------------------|

### Durable vs Hype
[Blunt, honest assessment. 2-3 sentences. What could make this fade? What would confirm it's real?]

### Investment Implications
- **Timing:** Early/Mid/Late
- **What to watch for:** [Specific signals that would confirm or invalidate this theme]
- **Biggest risk:** [One sentence]
```

### Step 4: Persist

```bash
cat <<'MD_EOF' | python3 <skill_dir>/scripts/persistence.py save-markdown --subdir themes --name "<topic>" --date $(date +%Y-%m-%d)
[the markdown content goes here]
MD_EOF
```

---

## Mode: Company Backtrace

**Trigger:** `/vc-signals company "<name>"`

### Step 1: Check Seed Map

Read `company_aliases.json`. Check if the company exists. If yes, note its known themes, sectors, and OSS projects.

### Step 1.5: Select Retrieval Path

Check if last30days is available (same as weekly scan Step 3). If available, use it for source-specific searches in Step 2.

### Step 2: Retrieve Evidence

Run 4-6 queries:
- "[company name] trends news"
- "[company name] product updates"
- "[company name] competitors market"
- "[company name] open source projects"
- "[company name] funding investment"

**last30days path (if available):**

Search for the company across sources with auto-resolve. Default lookback for company backtrace is 30 days. If the user appended a time window like `7d` or `30d` to the command, use that number of days instead (for example, `7d` sets `LOOKBACK_DAYS=7`).
```bash
# Set LOOKBACK_DAYS once for this backtrace, then reuse across queries below.
LOOKBACK_DAYS=30   # company backtrace default; override with the user's time-window digits if specified

python3 <skill_dir>/scripts/last30days_adapter.py query --topic "<company name>" --sources "hackernews,reddit,x" --auto-resolve --quick --lookback-days ${LOOKBACK_DAYS}
```

If the company has known OSS projects from the seed map, also search for those.

**GitHub deep search (if company has known GitHub org or repos):**

If the company has known OSS projects from the seed map, search for the repo directly:
```bash
python3 <skill_dir>/scripts/last30days_adapter.py query --topic "<company name>" --github-repo "<owner/repo>" --auto-resolve --quick --lookback-days ${LOOKBACK_DAYS}
```

If you can identify a founder's GitHub username, search their activity:
```bash
python3 <skill_dir>/scripts/last30days_adapter.py query --topic "<founder name>" --github-user "<username>" --quick
```

**X/Twitter deep search:**

If the company or founder has a known X handle, search their timeline:
```bash
python3 <skill_dir>/scripts/last30days_adapter.py query --topic "<company name>" --x-handle "<handle>" --sources "x" --quick
```

Check GitHub:
```bash
python3 <skill_dir>/scripts/github_trending.py --sector all --limit 30
```
Filter for repos owned by or related to the company.

### Step 3: Map to Rising Themes

Cross-reference the evidence against:
1. Known themes from previous weekly scans (load recent briefings)
2. Themes apparent from current evidence
3. Seed map themes

### Step 4: Output

```markdown
## Company Backtrace: [Company Name]

### Overview
[What the company does, 2-3 sentences]

### Theme Exposure
| Rising Theme | Role | Confidence | Evidence |
|-------------|------|------------|----------|
| Theme A | Direct solver | Confirmed | [Brief evidence] |
| Theme B | Beneficiary | Likely | [Brief evidence] |

### OSS / Ecosystem Signals
- [Project 1: stars, velocity, relevance]
- [Project 2 if applicable]

### Competitive Context
[Brief note on who else operates in the same themes. NOT a full competitive analysis -- just enough for context.]

### Confidence Notes
[What you're confident about, what's uncertain, what you couldn't verify]
```

### Step 5: Persist

```bash
cat <<'MD_EOF' | python3 <skill_dir>/scripts/persistence.py save-markdown --subdir companies --name "<company name>" --date $(date +%Y-%m-%d)
[the markdown content goes here]
MD_EOF
```

---

## Mode: OSS Radar

**Trigger:** `/vc-signals oss <sector> [time]`

Use this when the user asks for OSS startup discovery, fast-growing repos, open-source companies, or GitHub-first market signal. This mode is conceptually inspired by Gokul Rajaram's OSS Startup Radar, with permission to reuse ideas/code, but VC Signals should adapt it to Marathon's Seed-to-Series-B workflow.

### Step 1: Scope

Valid sectors: `ai-infra`, `devtools`, `cybersecurity`, `data-infra`, `vertical-ai`, `all`. If no sector is provided, default to `ai-infra` for OSS radar.

### Step 2: Collect Repo Candidates

Use GitHub trending and last30days together:

```bash
python3 <skill_dir>/scripts/github_trending.py --sector <SECTOR> --limit 50
```

Then search community signal:

```bash
python3 <skill_dir>/scripts/last30days_adapter.py query --topic "<repo or theme>" --sources "reddit,hackernews,x,youtube,github" --github-repo "<owner/repo>" --quick --store --lookback-days 30
```

### Step 3: Rank OSS Signal

Do not rank by total stars alone. Score these separately:

- 30/60/90-day star velocity and stars/day
- age-adjusted momentum
- community signal from Reddit, HN, X, YouTube, and GitHub discussions/issues where available
- contributor/maintainer quality
- repo quality: installability, docs, releases, issue health, package/demo path
- license and commercialization path
- company formation probability
- star authenticity/noise risk
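The first two signals can be made concrete with a small calculation. This is a sketch under assumptions: "age-adjusted momentum" is interpreted here as the recent stars/day rate divided by the repo's lifetime stars/day rate, so a value above 1.0 suggests acceleration. Both function names are hypothetical.

```python
from datetime import date


def stars_per_day(stars_gained: int, window_days: int) -> float:
    """Average stars gained per day over a window."""
    if window_days <= 0:
        raise ValueError("window_days must be positive")
    return stars_gained / window_days


def age_adjusted_momentum(
    total_stars: int, stars_gained_30d: int, created: date, today: date
) -> float:
    """Ratio of the last-30-day star rate to the lifetime star rate."""
    age_days = max((today - created).days, 1)
    lifetime_rate = total_stars / age_days
    if lifetime_rate <= 0:
        return 0.0
    # For repos younger than 30 days, use their actual age as the window.
    recent_rate = stars_per_day(stars_gained_30d, min(30, age_days))
    return recent_rate / lifetime_rate
```

A one-year-old repo with 3,650 total stars and 600 stars in the last 30 days scores 2.0: it is currently growing at twice its lifetime pace.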

Use the Alex/Gokul OSS email as the output bar: top themes, top 25 fast-growing projects, 30/60/90 velocity, community signal, funding/stage label, founder/contributor profiles, and clear methodology. Improve it by adding Investment Interest, Evidence Confidence, action, and "Why This May Be Noise."

### Step 4: Assign Repo Type And Action

Repo type must be one of:
- `company-backed`
- `founder-likely`
- `ecosystem signal`
- `portfolio-support relevant`
- `high-noise`

Action must be one of:
- `watch`
- `contact maintainer`
- `map ecosystem`
- `track company formation`
- `ignore`

### Step 5: Output

```markdown
## OSS Radar: {Sector} -- {YYYY-MM-DD}

### Top Themes
- **{Theme}** — {2 sentence why-now}. Key projects: `{repo}`, `{repo}`, `{repo}`.

### Top 25 OSS Projects

| Repo | Type | Action | Interest | Evidence | +30d | +60d | +90d | Community | Why Now | Why This May Be Noise |
|------|------|--------|----------|----------|------|------|------|-----------|---------|------------------------|
| owner/repo | founder-likely | contact maintainer | High | Medium | +1.5k | +4.9k | +4.9k | 2r+3hn | Agent memory is emerging | Star spike may be launch hype |

### Founder & Contributor Profiles
- **owner/repo** — top contributors, GitHub profiles, LinkedIn if found, caveat that identity should be verified before outreach.

### Methodology
Explain star velocity windows, community source weighting, funding/stage labeling, filtering, and star-authenticity caveats.
```

---

## Mode: GitHub Trending

**Trigger:** `/vc-signals github <sector>`

### Step 1: Run GitHub Trending Script

```bash
python3 <skill_dir>/scripts/github_trending.py --sector <SECTOR> --limit 15
```

If `sector` is `all`, run for each sector and merge results.

### Step 2: Enrich with Company Mapping

For each repo in the results:
1. Check `company_aliases.json` -- is the owner/org a known company?
2. Check the repo owner type -- is it an organization (likely a company)?
3. Use your knowledge + the repo description to identify if there's a commercial entity behind it

### Step 3: Map to Themes

For each repo, identify which sector themes it relates to. Use the taxonomy subcategories and your judgment.

### Step 4: Output

```markdown
## GitHub Trending: [Sector] -- YYYY-MM-DD

Repos ranked by star velocity (recent growth rate, not absolute count).

| # | Repo | Stars | 7d Growth | 30d Growth | Language | Commercial Entity |
|---|------|-------|-----------|------------|----------|------------------|
| 1 | owner/name | 12,500 | +450 | +1,800 | Rust | Company Name (Confirmed) |
| 2 | ... | | | | | |

### Standout Repos

**[Repo 1: owner/name]**
- **Description:** [What it does]
- **Why it's interesting:** [1-2 sentences -- what's driving the growth?]
- **Theme mapping:** [Which sector themes this relates to]
- **Commercial entity:** [Company behind it, if any. Monetization status if known.]

**[Repo 2: owner/name]**
...

### Acceleration Alerts
[Repos with unusually high velocity relative to their size -- the "0 to 10k in a week" signals]

### Theme Patterns
[Do multiple trending repos point to the same emerging theme? Call it out.]
```

### Step 5: Persist

```bash
cat <<'MD_EOF' | python3 <skill_dir>/scripts/persistence.py save-markdown --subdir github --slug <SECTOR> --date $(date +%Y-%m-%d)
[the markdown content goes here]
MD_EOF
```

---

## Mode: Add Sector

**Trigger:** `/vc-signals add-sector <name>` (e.g., `/vc-signals add-sector fintech`)

This mode lets users add a new sector without editing JSON manually.

### Step 1: Get Sector Details

Ask the user for:
- Sector name (slug, lowercase, hyphens ok): use what they provided or ask
- Display name: suggest one based on the slug, let them confirm

### Step 2: Generate Taxonomy

Based on the sector name, propose 4-6 subcategories. For example, if the user says "fintech":

> "Here are the subcategories I'd suggest for **Fintech**:
> 1. Payments Infrastructure
> 2. Neobanking & Digital Banking
> 3. Lending & Credit Platforms
> 4. Regtech & Compliance
> 5. Embedded Finance
> 6. Crypto & DeFi Infrastructure
>
> Want to add, remove, or rename any of these?"

For each subcategory, generate:
- `name`: display name
- `aliases`: 4-6 relevant search terms
- `seed_queries`: 2-3 search queries

Also generate:
- `subreddits`: 5-8 relevant subreddits for the sector
- `hn_queries`: 4-6 HN-optimized short queries
- `discovery_queries`: 4-6 broad discovery queries
- `negative_terms`: 3-5 noise filters

### Step 3: Confirm and Save

Show the user the complete sector JSON and ask for confirmation. Then:

```bash
cat <skill_dir>/config/sectors.json
```

Read the current config, add the new sector, and write it back:

```bash
python3 -c "
import json
from pathlib import Path
config_path = Path('<skill_dir>/config/sectors.json')
config = json.loads(config_path.read_text())
config['<sector_slug>'] = <new_sector_dict>
config_path.write_text(json.dumps(config, indent=2))
print('Sector added successfully')
"
```
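Before writing the config back, it may help to validate the proposed sector. The sketch below assumes the sector dict stores its subcategories under a `subcategories` key, which is a hypothetical name. The list fields are the ones generated in Step 2. `validate_sector` is not an existing script function.

```python
import re
from typing import Any, Dict, List

SLUG_PATTERN = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")
SECTOR_LIST_KEYS = ("subreddits", "hn_queries", "discovery_queries", "negative_terms")
SUBCATEGORY_LIST_KEYS = ("aliases", "seed_queries")


def validate_sector(slug: str, sector: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the sector looks valid."""
    problems: List[str] = []
    if not SLUG_PATTERN.fullmatch(slug):
        problems.append(f"slug {slug!r} must use lowercase letters, digits, hyphens")
    for key in SECTOR_LIST_KEYS:
        if not isinstance(sector.get(key), list) or not sector[key]:
            problems.append(f"sector field {key!r} must be a non-empty list")
    subcategories = sector.get("subcategories")  # assumed key name
    if not isinstance(subcategories, list) or not subcategories:
        problems.append("sector needs at least one subcategory")
        return problems
    for i, sub in enumerate(subcategories):
        if not isinstance(sub, dict):
            problems.append(f"subcategory {i} must be an object")
            continue
        if not sub.get("name"):
            problems.append(f"subcategory {i} is missing 'name'")
        for key in SUBCATEGORY_LIST_KEYS:
            if not isinstance(sub.get(key), list) or not sub[key]:
                problems.append(f"subcategory {i} field {key!r} must be a non-empty list")
    return problems
```

If the returned list is non-empty, show the problems to the user instead of saving.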

### Step 4: Confirm

> "Added **<Display Name>** with <N> subcategories. Try it out:
> `/vc-signals radar <sector-slug>`"

---

## Graceful Degradation

At every step, handle failures gracefully:

1. **last30days not available:** Use WebSearch. Say so. Still produce useful output.
2. **GitHub API rate limited:** Use partial results. Warn the user. Suggest running again later.
3. **GitHub token missing:** Say "GitHub trending requires a token. Run `/vc-signals setup` to configure one." Still run the rest of the scan.
4. **Persistence fails:** Display full output inline. Warn that it wasn't saved. Do not crash.
5. **WebSearch returns thin results:** Note limited coverage. Still extract what themes you can. Be honest about confidence.
6. **Unknown sector:** List valid sectors. Don't guess.
7. **No previous briefing for comparison:** Skip the week-over-week section. Say "This is your first scan for [sector]. Future scans will include week-over-week comparisons."

**Never crash. Never pretend you have data you don't. Always tell the user what worked and what didn't.**
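The degradation pattern above amounts to running each step independently and collecting warnings instead of aborting. A minimal sketch, with a hypothetical `run_steps` helper and illustrative step names:

```python
from typing import Any, Callable, Dict, List, Tuple


def run_steps(
    steps: List[Tuple[str, Callable[[], Any]]]
) -> Tuple[Dict[str, Any], List[str]]:
    """Run every step; record failures as warnings rather than raising."""
    results: Dict[str, Any] = {}
    warnings: List[str] = []
    for name, step in steps:
        try:
            results[name] = step()
        except Exception as exc:  # deliberately broad: one failure must not stop the scan
            warnings.append(f"{name} failed: {exc}")
    return results, warnings
```

The caller renders whatever is in `results` and reports every entry in `warnings`, so the user sees both what worked and what did not.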
