---
name: kokoro-tts
description: Generate high-quality text-to-speech audio using Kokoro, a neural TTS model running locally on Apple Silicon via MLX. Use when any agent needs to synthesize speech, generate audio from text, read content aloud, or produce WAV/MP3 files from text. Triggers on "text to speech", "TTS", "speak this", "generate audio", "read aloud", or any request to convert text into spoken audio. Also used as the TTS backend for other skills (e.g., audio-review).
---

# Kokoro TTS

Local neural TTS on Apple Silicon via MLX. No API key, no cloud, no cost.

## Quick Start

```bash
speak.sh "Hello Captain"                        # default voice (riker)
speak.sh "Save this" -o ~/output.mp3             # save as MP3
speak.sh -f /tmp/script.txt -o ~/narration.wav   # from file
cat text.txt | speak.sh -o ~/out.wav             # from stdin
```

`speak.sh` is shorthand for `~/workspace/bin/speak.sh`. Stdout carries only the output file path; nothing else is printed unless `--verbose` is passed.
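
Because stdout carries only the output path, a calling script can capture it with command substitution. The following is a minimal sketch, not part of the skill: `SPEAK_BIN` and `synthesize` are hypothetical names introduced for illustration, and it assumes `speak.sh` exits non-zero on failure.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper: SPEAK_BIN and synthesize are illustrative names, not part of the skill.
SPEAK_BIN="${SPEAK_BIN:-speak.sh}"

# synthesize TEXT OUTFILE: render TEXT to OUTFILE and print the path reported on stdout.
synthesize() {
  local text="$1" outfile="$2" out_path
  # Stdout is the output file path, so command substitution captures it.
  # Assumption: a failed run returns a non-zero exit status.
  out_path="$("$SPEAK_BIN" "$text" -o "$outfile")" || return 1
  # Guard against an empty or missing result before handing the path on.
  [ -n "$out_path" ] && [ -f "$out_path" ] || return 1
  printf '%s\n' "$out_path"
}
```

A caller could then write `wav="$(synthesize "Hello Captain" /tmp/hello.wav)" && echo "Saved to $wav"`.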

## Ship Voice

Default voice is `riker` — a custom blend tuned to Commander Riker's vocal profile via spectral analysis.

Recipe: `voice-blend.py --output riker am_fenrir:0.36 bm_daniel:0.24 am_onyx:0.40`

## Custom Blends

```bash
# Create a new blended voice (weights must sum to ~1.0)
~/workspace/bin/voice-blend.py --output my_voice am_fenrir:0.5 bm_daniel:0.3 am_onyx:0.2

# Use it
speak.sh "Testing" -v my_voice
```
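
Since the weights must sum to roughly 1.0, it can help to check a recipe before blending. The helper below is a sketch under stated assumptions: `check_blend_weights` is a hypothetical name, and the 0.01 tolerance is an illustrative choice, not necessarily the rule `voice-blend.py` applies.

```bash
#!/usr/bin/env bash
# check_blend_weights SPEC...: succeed if the weights in name:weight specs sum to ~1.0.
# Hypothetical helper; the 0.01 tolerance is an assumption, not voice-blend.py's rule.
check_blend_weights() {
  printf '%s\n' "$@" | awk -F: '
    { sum += $2 }                    # second field of name:weight is the weight
    END {
      diff = sum - 1.0
      if (diff < 0) diff = -diff     # absolute distance from 1.0
      exit (diff > 0.01)             # status 0 when within tolerance, 1 otherwise
    }'
}
```

For example, `check_blend_weights am_fenrir:0.36 bm_daniel:0.24 am_onyx:0.40` succeeds for the `riker` recipe, while dropping one of the three components makes it fail.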

## Options

| Flag | Description |
|------|-------------|
| `-v` | Voice preset (default: `riker`) |
| `-o` | Output file (.wav or .mp3) — skips playback |
| `-f` | Read text from file |
| `-s` | Speed multiplier (default: 1.0) |
| `--play` | Play even when `-o` is set |
| `--voices` | List all available voices |
| `--verbose` | Print additional output besides the output file path |
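
The flags compose, so a batch job can combine `-f`, `-v`, `-s`, and `-o` in one call. The sketch below narrates every `.txt` file in a directory. `SPEAK_BIN` and `narrate_dir` are hypothetical names introduced for illustration, and the sketch assumes `speak.sh` exits non-zero on failure.

```bash
#!/usr/bin/env bash
# Hypothetical batch helper: SPEAK_BIN and narrate_dir are illustrative names.
SPEAK_BIN="${SPEAK_BIN:-speak.sh}"

# narrate_dir DIR [VOICE] [SPEED]: render every DIR/*.txt to a .wav beside it.
narrate_dir() {
  local dir="$1" voice="${2:-riker}" speed="${3:-1.0}" txt
  for txt in "$dir"/*.txt; do
    [ -e "$txt" ] || continue   # no matches: the glob stays literal, so skip it
    # -f reads the text from a file; -o writes the audio and skips playback.
    "$SPEAK_BIN" -f "$txt" -v "$voice" -s "$speed" -o "${txt%.txt}.wav" || return 1
  done
}
```

Calling `narrate_dir ~/scripts riker 1.2` would write `~/scripts/<name>.wav` for each text file and print one output path per file.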