---
name: voice-setup
description: Complete voice configuration in chat - PTT key, microphone permissions, ElevenLabs TTS, and troubleshooting
compatibility: "Designed for Vellum personal assistants"
metadata:
  emoji: "🎙️"
  vellum:
    display-name: "Voice Setup"
    includes: ["elevenlabs-voice"]
    activation-hints:
      - "Guided setup or troubleshooting (walkthrough, PTT not working, mic issues, ElevenLabs/TTS)"
      - "Simple voice setting changes (PTT key, wake word) -> use voice_config_update directly"
    avoid-when:
      - 'If "voice" is in a Twilio/phone context, load phone-calls instead'
---

You are helping the user set up and troubleshoot voice features (push-to-talk, text-to-speech) entirely within this conversation. Do NOT direct the user to the Settings page for initial setup - handle everything in-chat using the tools below.

## Available Tools

- `voice_config_update` - Change any voice setting (PTT key, conversation timeout, TTS voice ID)
- `open_system_settings` - Open macOS System Settings to a specific privacy pane
- `navigate_settings_tab` - Open the Vellum settings panel to the Voice tab
- `credential_store` - Collect API keys securely (for ElevenLabs TTS)

## Setup Flow

Walk the user through each section in order. Skip sections they don't need. Ask before proceeding to the next section.

### 1. Microphone Permission

Check `<channel_capabilities>` for `microphone_permission_granted`.

**If `false` or missing:**

1. Explain that macOS requires microphone permission for voice features.
2. Use `open_system_settings` with `pane: "microphone"` to open the right System Settings pane.
3. Tell the user: "I've opened System Settings to the Microphone section. Please toggle **Vellum Assistant** on, then come back here."
4. After they confirm, verify by checking capabilities on the next turn.

**If `true`:** Tell them microphone is already granted and move on.

### 2. Push-to-Talk Activation Key

Present common PTT key options:

- **Right Option** - Default, good general choice
- **Fn** - Dedicated key on most Mac keyboards
- **Right Command** - Easy to reach
- **Right Control** - Familiar from gaming

Ask which key they prefer, then use `voice_config_update` with `setting: "activation_key"` and the chosen value.

**Common issues to mention:**

- If they pick a key that conflicts with their emoji picker (Fn or Globe on newer Macs), warn them and suggest an alternative.
- If they use a terminal app heavily, warn that some keys may be captured by the terminal.

### 3. Text-to-Speech / ElevenLabs (Optional)

Ask if they want high-quality text-to-speech voices via ElevenLabs (optional - standard TTS works without it).

If yes, the included **ElevenLabs Voice** skill (automatically appended below via `includes`) provides the full setup flow: curated voice list, API key collection, advanced voice selection, and tuning parameters. Follow the instructions there.

Note: The config key `services.tts.providers.elevenlabs.voiceId` controls the voice for both in-app TTS and phone calls. If the user sets up phone calls later, they will automatically use the same voice for a consistent experience.

### 4. Verification

After setup is complete:

1. Summarize what was configured.
2. Suggest they test by pressing their PTT key and speaking.
3. Offer to open the Voice settings tab if they want to review: use `navigate_settings_tab` with `tab: "Voice"`.

## Troubleshooting Decision Trees

When the user reports a problem, follow the appropriate decision tree:

### "PTT isn't working" / "Can't record"

1. **Microphone permission** - Check `microphone_permission_granted` in capabilities. If false, guide through granting it.
2. **Key check** - Ask what key they're using. Confirm it matches their configured PTT key.
3. **Emoji picker conflict** - On newer Macs, Fn/Globe opens the emoji picker. If they're using Fn, suggest switching to Right Option or Right Command.
4. **Speech Recognition permission** - Some voice features need this. Use `open_system_settings` with `pane: "speech_recognition"`.
5. **App focus** - PTT may not work when Vellum is not the frontmost app or if another app has captured the key.

### "Recording but no text" / "Transcription not working"

1. **Speech Recognition permission** - Must be granted for transcription.
2. **Microphone input** - Ask if they see the recording indicator. If yes, the mic works but transcription is failing.
3. **Locale/language** - Speech recognition works best with the system language. Ask if they're speaking in a different language.
4. **Background noise** - Excessive noise can prevent transcription. Suggest a quieter environment or a closer microphone.

### "Changed a setting but it didn't work"

1. **Event broadcast** - The setting should take effect immediately. If it didn't, suggest restarting the assistant.
2. **Verify** - Open the Voice settings tab with `navigate_settings_tab` to confirm the setting was persisted.

## Deep Debugging

For persistent issues, suggest checking system logs:

```bash
log stream --predicate 'subsystem == "com.vellum.assistant"' --level debug
```

Key log categories:

- `voice` - PTT activation, recording state
- `speech` - Speech recognition results

## Rules

- Always handle setup conversationally in-chat. Do NOT tell the user to go to Settings for initial configuration.
- Use `navigate_settings_tab` only for review/verification after in-chat setup, not as the primary setup method.
- Be concise. Don't explain every option exhaustively - present the most common choices and let the user ask for more.
- If a permission is denied, acknowledge it gracefully and explain what features won't work without it.
