---
name: research-companies
description: Research a batch of companies in parallel using background subagents. Demonstrates the file-based result passing pattern to avoid context overflow.
argument-hint: [count]
allowed-tools: Read, Bash, Task, WebFetch, WebSearch
---

# Research Companies (Example Skill)

Research companies in parallel by launching background subagents that write results to files.
This skill demonstrates the **file-based result passing** pattern that prevents context overflow
when coordinating multiple background agents.

> **Why this pattern?** Claude Code's `TaskOutput` returns the _entire_ agent transcript (every
> web page, search result, and reasoning step). One agent = fine. Ten agents = 100-500K tokens
> dumped into your context = session instability. File-based passing avoids this completely.

## Arguments

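- `$ARGUMENTS` - Number of companies to research (default: 5). The shell snippets in Step 1 read this value from a `count` variable, so set `count` from `$ARGUMENTS` before running them.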

## Workflow

### Step 1: Get Work Items

Get the list of companies that need research. Replace this with your own data source
(database query, CSV file, API call, etc.).

```bash
# Example: read from a simple text file, one domain per line
# (count falls back to 5 when unset)
head -n "${count:-5}" companies.txt
```

Or from a database:

```bash
python3 -c "
import sqlite3
conn = sqlite3.connect('mydata.db')
rows = conn.execute('''
    SELECT domain, full_url FROM companies
    WHERE status = 'pending'
    LIMIT ${count:-5}
''').fetchall()
for r in rows:
    print(f'{r[0]}|{r[1]}')
"
```

Save the list. You'll iterate over it in Step 3.
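
If you prefer to build the list in Python, the text-file variant could look like the sketch below. The helper name `load_domains` is hypothetical, and the lowercasing and de-duplication are assumptions about how you want the list cleaned, not part of the skill itself.

```python
from pathlib import Path


def load_domains(path, count=5):
    """Return up to `count` unique, non-empty domains from a one-per-line file."""
    domains = []
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        if len(domains) >= count:
            break
        domain = raw.strip().lower()  # domain names are case-insensitive
        if domain and domain not in domains:
            domains.append(domain)
    return domains
```

Calling `load_domains("companies.txt", 5)` would give you the list to iterate over in Step 3, and its length is the `EXPECTED` value for Step 4.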

### Step 2: Create Output Directory

```bash
mkdir -p /tmp/results/research-companies
```

Use a unique subdirectory name per skill. This prevents file collisions if multiple
skills run concurrently.

### Step 3: Launch Background Subagents

For each company, launch a background subagent using the Task tool with these parameters:

```
Tool: Task
subagent_type: general-purpose
model: haiku                    # Use haiku for cost-efficiency
run_in_background: true         # MUST be true — don't block on each agent
max_turns: 20                   # Prevent runaway agents
```

**Launch all agents in a single message** (parallel tool calls) for maximum concurrency.

#### Subagent Prompt Template

Customize the research goal, but keep the Output Instructions and CRITICAL sections exactly as shown:

```
Research [COMPANY_NAME] ([DOMAIN]).

Find the following information:
1. What the company does (one sentence)
2. Company size (employees, if available)
3. Key decision-maker name and title
4. Contact email (from website, not guessed)

## Research Strategy

1. Visit [FULL_URL] — check About page, Team page, Contact page
2. If the website doesn't have enough info, use WebSearch: "[COMPANY_NAME]" "[DOMAIN]" founder CEO
3. Look for LinkedIn profiles, press mentions, or directory listings

## Output Instructions

ALWAYS write a result file, even if nothing was found. Use this exact Bash command:

cat <<'RESULT' > /tmp/results/research-companies/[DOMAIN].txt
DOMAIN: [domain]
COMPANY: [company name or "Not found"]
DESCRIPTION: [one sentence description or "Not found"]
SIZE: [employee count/range or "Not found"]
CONTACT_NAME: [name or "Not found"]
CONTACT_TITLE: [title or "Not found"]
CONTACT_EMAIL: [email or "Not found"]
SOURCE: [where you found the info or "Not found"]
RESULT

## CRITICAL: Your Response

After writing the file, your ENTIRE response must be ONLY these words:
Done: [DOMAIN]

Do NOT include a summary, explanation, or any other text. Just "Done: [DOMAIN]".
```

#### Why each part of the prompt matters

| Prompt element | Purpose |
|---|---|
| `ALWAYS write a result file, even if nothing was found` | Ensures the file count reaches the expected count; otherwise polling waits out the full timeout |
| `Use this exact Bash command` | Removes ambiguity — agent doesn't improvise a different format |
| `your ENTIRE response must be ONLY these words` | If anything accidentally reads the response, it's only a few tokens |
| `"Not found"` as default | Makes parsing deterministic — every field always exists |
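
Filling the template once per company can be sketched as below. This is a minimal illustration, assuming the placeholders to substitute are exactly the upper-case bracketed names (`[COMPANY_NAME]`, `[DOMAIN]`, `[FULL_URL]`); the function name `fill_template` is hypothetical. Lower-case placeholders such as `[domain]` are left alone because the subagent fills those in itself.

```python
import re

# Matches [UPPER_CASE] placeholders only; [domain] and similar are untouched.
PLACEHOLDER = re.compile(r"\[([A-Z_]+)\]")


def fill_template(template, **values):
    """Replace each [NAME] placeholder with values[NAME]; fail on a missing value."""
    def substitute(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value for placeholder [{name}]")
        return values[name]

    return PLACEHOLDER.sub(substitute, template)
```

Raising on a missing value means a half-filled prompt cannot reach a subagent, which would otherwise write its result to a literal `[DOMAIN].txt`.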

### Step 4: Wait for Results

**IMPORTANT — DO NOT use TaskOutput.** TaskOutput returns the full agent transcript
(all web searches, page content, reasoning) into your context, causing context overflow.
The results are already in the files — just wait for them to appear.

Poll for result files using Bash. Set EXPECTED to the number of agents you launched:

```bash
EXPECTED=5
TIMEOUT=180
ELAPSED=0
while [ $(ls /tmp/results/research-companies/*.txt 2>/dev/null | wc -l) -lt $EXPECTED ] && [ $ELAPSED -lt $TIMEOUT ]; do
    sleep 15
    ELAPSED=$((ELAPSED + 15))
    echo "$(ls /tmp/results/research-companies/*.txt 2>/dev/null | wc -l)/$EXPECTED complete (${ELAPSED}s)"
done
echo "Proceeding with $(ls /tmp/results/research-companies/*.txt 2>/dev/null | wc -l)/$EXPECTED files"
```

**Polling parameters:**

| Parameter | Default | When to adjust |
|---|---|---|
| `sleep 15` | 15 seconds | Shorter (5s) for fast tasks, longer (30s) for slow web research |
| `TIMEOUT=180` | 3 minutes | Increase for agents doing extensive web crawling |
| `2>/dev/null` | Always use | Suppresses "No such file" when directory is empty |
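
The same polling logic can be written in Python if that is easier to extend. The sketch below is an equivalent under the same defaults, not a replacement for the Bash loop; the function name `wait_for_results` and its parameters are hypothetical.

```python
import glob
import os
import time


def wait_for_results(result_dir, expected, timeout=180, interval=15):
    """Poll until `expected` .txt files exist or `timeout` seconds pass.

    Returns the number of result files seen on the last check.
    """
    deadline = time.monotonic() + timeout
    while True:
        done = len(glob.glob(os.path.join(result_dir, "*.txt")))
        remaining = deadline - time.monotonic()
        if done >= expected or remaining <= 0:
            return done
        # Never sleep past the deadline.
        time.sleep(min(interval, remaining))
```

Like the Bash version, it only counts files and never reads agent output, so nothing from the transcripts enters the context.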

### Step 5: Read and Parse Results

Read the result files directly:

```bash
cat /tmp/results/research-companies/*.txt
```

Then parse them into structured data:

```bash
python3 -c "
import glob

results = []
not_found = []

for filepath in sorted(glob.glob('/tmp/results/research-companies/*.txt')):
    with open(filepath) as f:
        lines = f.read().strip().split('\n')

    entry = {}
    for line in lines:
        if ':' in line:
            key, value = line.split(':', 1)
            value = value.strip()
            if value and value != 'Not found':
                entry[key.strip()] = value

    domain = entry.get('DOMAIN', 'unknown')
    if entry.get('CONTACT_NAME'):
        results.append(entry)
        print(f'Found: {domain} -> {entry[\"CONTACT_NAME\"]} ({entry.get(\"CONTACT_TITLE\", \"?\")})')
    else:
        not_found.append(domain)
        print(f'Not found: {domain}')

print(f'\nSummary: {len(results)} found, {len(not_found)} not found')
"
```

After parsing, update your database, write to a file, send notifications — whatever
your workflow needs.
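
The per-file parsing step can also be pulled out into a small function so it is easy to test on its own. The sketch below mirrors the logic above; the name `parse_result` is hypothetical. Splitting on the first colon only is what keeps values such as URLs intact.

```python
def parse_result(text):
    """Parse KEY: VALUE lines into a dict, dropping empty and 'Not found' values."""
    entry = {}
    for line in text.splitlines():
        # partition splits on the first colon only, so "https://..." survives.
        key, sep, value = line.partition(":")
        value = value.strip()
        if sep and value and value != "Not found":
            entry[key.strip()] = value
    return entry
```

A file whose fields are all `Not found` yields a dict with only `DOMAIN`, so `entry.get('CONTACT_NAME')` is `None` and the company lands in the not-found list.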

### Step 6: Clean Up

```bash
rm -f /tmp/results/research-companies/*.txt
```

Always clean up. Leftover files from a previous run will confuse the next batch
(the polling loop will see them as "already complete").
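
Because stale files are counted as completed work, it can help to run the same cleanup before launching agents as well as after. A minimal sketch, assuming the result files are the only `.txt` files in the directory; the name `clear_results` is hypothetical.

```python
from pathlib import Path


def clear_results(result_dir):
    """Delete *.txt result files in result_dir and return how many were removed."""
    directory = Path(result_dir)
    if not directory.is_dir():
        return 0
    # Materialize the listing first so files are not deleted mid-iteration.
    stale = list(directory.glob("*.txt"))
    for path in stale:
        path.unlink()
    return len(stale)
```

A non-zero return value before a run is a sign that the previous batch skipped Step 6.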

### Step 7: Report Summary

After processing, report:

- How many agents completed vs timed out
- What was found (with key details)
- What wasn't found
- Suggested next steps

Example output:

```
Researched 5 companies:
- 4 completed, 1 timed out
- 2 contacts found with email
- 1 contact found, no email
- 1 not found (site appears abandoned)

Next: Run /draft-email for the 2 with contact emails.
```
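
The "completed vs timed out" figure can be derived from the file names alone, since each agent writes `[DOMAIN].txt`. The sketch below assumes you still have the list of domains from Step 1 and that it runs before the Step 6 cleanup; the name `split_completed` is hypothetical.

```python
from pathlib import Path


def split_completed(result_dir, domains):
    """Return (completed, timed_out) by checking for <domain>.txt in result_dir."""
    # Path.stem drops only the final ".txt", so "foo.co.uk.txt" -> "foo.co.uk".
    present = {path.stem for path in Path(result_dir).glob("*.txt")}
    completed = [d for d in domains if d in present]
    timed_out = [d for d in domains if d not in present]
    return completed, timed_out
```

The `timed_out` list names the domains to retry.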

---

## Adapting This Skill

To use this pattern for your own use case:

1. **Change the data source** (Step 1) — database, CSV, API, whatever you have
2. **Change the output directory name** (Step 2) — use something unique to your skill
3. **Change the research goal** (Step 3) — rewrite the prompt for your specific task
4. **Change the result fields** (Step 3) — define the KEY: VALUE pairs you need
5. **Change the parsing logic** (Step 5) — extract the fields you defined
6. **Keep everything else the same** — the polling, cleanup, and anti-overflow patterns are universal
