---
name: subtitle-translator
description: Use when translating subtitle files (SRT, ASS, SSA, VTT, SUB, SBV) for TV series or movies using AI, when the user has subtitle files in a folder and wants natural, context-aware translations with glossary-based consistency
---

# Subtitle Translator

## Overview

This skill runs a multi-phase AI subtitle translation workflow that produces natural, context-aware translations: it first builds a translation glossary, then translates each file with that glossary as context, and finally runs consistency and technical checks. Claude Code acts as the orchestrator and Gemini CLI as the translation engine.

## Prerequisites

- **Claude Code** installed and authenticated
- **Gemini CLI** installed and authenticated (`gemini` command available in terminal)
- Subtitle files placed in a working folder (supported formats: SRT, ASS, SSA, VTT, SUB, SBV)

## Gemini CLI Usage

Pass content to Gemini via pipe:

```bash
# Send a file with a prompt
cat file.srt | gemini "Your prompt here"

# Send multiple files
cat glossary.md file.srt | gemini "Your prompt here"

# Send a plain prompt
echo "Your prompt here" | gemini
```

## Workflow

```dot
digraph subtitle_flow {
    rankdir=TB;
    node [shape=box];

    A [label="1. Glossary Creation\nGemini reads all subtitles\n+ user context → glossary.md"];
    R [label="1b. Glossary Review\nUser reviews glossary\nbefore proceeding"];
    B [label="2. Translation\nEach subtitle sent to Gemini\nwith glossary (chunked if large)"];
    V [label="2b. Verification\nBlock count + timecode check"];
    C [label="3. Consistency Check\nAll translations combined\n→ Gemini reviews → fix errors"];
    D [label="4. Technical Check (optional)\nLine length, line count\nformatting rules"];

    A -> R -> B -> V -> C -> D;
}
```

### Phase 1: Glossary Creation

**Before starting, ask the user (if not already provided):**
- What is the target language?
- What show/movie is this? (title, genre, setting, time period)
- Any character details? (names, relationships, bios)
- Any specific terminology, slang, or invented words?
- What tone? (formal, casual, comedic, dark)
- Timecode mode: **preserve** (keep original timecodes) or **retimed** (adjust timecodes for the target language)?
- Should song lyrics be translated or left in the original language?

Use whatever context the user provides to enrich the glossary prompt.

Send each subtitle file to Gemini individually for term extraction (do not concatenate all files into one prompt). Then send all extracted terms to Gemini to produce a unified glossary.
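
The two-step extraction described above could be scripted roughly as follows. This is a minimal sketch, not a fixed part of the workflow: the `terms/` folder, the function names, and both prompts are hypothetical, and it only looks at `.srt` files for brevity. It assumes `gemini` accepts a prompt argument with the file content on stdin, as in the usage section.

```bash
# Sketch: extract candidate terms per file, then merge them into one glossary.
# Folder layout, function names, and prompts are illustrative assumptions.

extract_terms() {
    # $1 = working folder, $2 = target language
    mkdir -p "$1/terms"
    for f in "$1"/*.srt; do
        [ -e "$f" ] || continue            # the glob matched nothing
        name=$(basename "$f" .srt)
        # One request per file, so no single prompt holds the whole season.
        gemini "List character names, recurring terms, and slang in this subtitle file that need a consistent $2 translation." \
            < "$f" > "$1/terms/$name.terms.md"
    done
}

build_glossary() {
    # $1 = working folder, $2 = target language
    cat "$1"/terms/*.terms.md |
        gemini "Merge these term lists into one $2 translation glossary in Markdown. Remove duplicates." \
        > "$1/glossary.md"
}
```

Calling `extract_terms . Turkish` followed by `build_glossary . Turkish` would leave the per-file term lists in `terms/` and the merged result in `glossary.md`, ready for the review in Phase 1b.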

Save the glossary as `glossary.md` in the working folder. The glossary should contain:
- Character name spellings and pronunciation notes
- Recurring terms with agreed translations
- Tone and style guidelines
- Show-specific context

### Phase 1b: Glossary Review

**Present the glossary to the user for review before proceeding.** The user may want to correct character names, adjust term translations, or add missing context. Fix any issues before moving to Phase 2.

### Phase 2: Translation

Send each subtitle file to Gemini one by one, along with the glossary.

**Chunking strategy:** For files with more than 300 subtitle blocks, split into chunks of ~250 blocks each. Translate each chunk separately with the glossary, then combine the results in their original order. Long responses can be cut off when they reach Gemini's output limit, and smaller chunks keep each response well under it.
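
One way to do the split for SRT files is awk's paragraph mode, which treats each blank-line-separated block as one record. The sketch below is an assumption about how chunking could be implemented, not a required tool: the `split_srt` name and the `.chunkNNN.srt` naming are hypothetical. It strips carriage returns first, because CRLF line endings would hide the blank lines between blocks.

```bash
# Sketch: split an SRT file into chunks of N subtitle blocks.
split_srt() {
    # $1 = source .srt file, $2 = blocks per chunk (default 250)
    # Writes <name>.chunk001.srt, <name>.chunk002.srt, ... next to the source.
    base=${1%.srt}
    tr -d '\r' < "$1" | awk -v size="${2:-250}" -v base="$base" '
        BEGIN { RS = ""; ORS = "\n\n" }     # paragraph mode: one record per block
        {
            if ((NR - 1) % size == 0) {     # first block of a new chunk
                if (out != "") close(out)
                out = sprintf("%s.chunk%03d.srt", base, ++n)
            }
            print > out
        }'
}
```

For example, `split_srt Show.en.srt 250` on a 600-block file would produce three chunk files. Because the zero-padded chunk numbers sort in order, the translated chunks can be concatenated back with a plain `cat` over the glob.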

**Key translation principles:**
- Natural, fluent target language
- Context-aware (follows glossary terms)
- Easy to read on screen (subtitle-friendly phrasing)
- Preserve all formatting tags (`{\an8}`, `<i>`, `<b>`, `<font>`, etc.) exactly as-is
- Song lyrics: follow the user's preference from Phase 1

**Timecode modes:**
- **Preserve**: Keep original timecodes exactly as-is. Translate text only. Best when the source timing already works well.
- **Retimed**: Rewrite timecodes to fit the target language. Allows splitting or merging subtitle blocks when the translation is significantly shorter or longer than the original. Best for languages with very different sentence lengths.

**Output file naming:** Replace the source language code with the target language code in the filename. Example: `Show.en.srt` → `Show.tr.srt`
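
The renaming rule can be expressed with shell parameter expansion. In this sketch the `target_name` helper is hypothetical, and the fallback for filenames that carry no source language code (inserting the target code before the extension) is an assumption rather than something the rule above specifies.

```bash
# Sketch: derive the output filename from the source filename.
target_name() {
    # $1 = source filename, $2 = source language code, $3 = target language code
    ext=${1##*.}       # e.g. srt
    stem=${1%.*}       # e.g. Show.en
    case "$stem" in
        *".$2") printf '%s.%s.%s\n' "${stem%".$2"}" "$3" "$ext" ;;  # Show.en.srt -> Show.tr.srt
        *)      printf '%s.%s.%s\n' "$stem" "$3" "$ext" ;;          # assumed fallback: append target code
    esac
}
```

With this helper, `target_name Show.en.srt en tr` prints `Show.tr.srt`.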

**Rate limits:** Gemini CLI may return 429 errors when its rate limit or capacity is exhausted. This is normal: wait a few seconds, then retry automatically.
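
A generic retry wrapper with exponential backoff is one way to automate this. The sketch below assumes that a failed request makes the `gemini` command exit with a non-zero status, which is worth confirming for the installed version. It also retries on any failure, not only on 429 responses. The `retry` name and its arguments are hypothetical.

```bash
# Sketch: retry a command with exponential backoff.
retry() {
    # Usage: retry <max_attempts> <initial_delay_seconds> <command> [args...]
    max=$1; delay=$2; shift 2
    attempt=1
    until "$@"; do
        if [ "$attempt" -ge "$max" ]; then
            echo "giving up after $attempt attempts" >&2
            return 1
        fi
        sleep "$delay"
        delay=$((delay * 2))        # double the wait after each failure
        attempt=$((attempt + 1))
    done
}

# Example (commented out): wrap the whole pipeline in sh -c so the input
# file is re-read on every attempt.
# retry 5 2 sh -c 'cat glossary.md Show.en.srt | gemini "Your prompt here" > Show.tr.srt'
```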

### Phase 2b: Verification

After translating each file, verify:
- Subtitle block count matches the original (in preserve mode)
- All timecodes are present and correctly formatted
- No formatting tags were removed or corrupted
- No empty subtitle blocks

Fix any issues before proceeding.
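
For SRT files in preserve mode, the checks above can be approximated with `grep` and `awk`. This is a sketch under stated assumptions: it only understands SRT timecodes (`HH:MM:SS,mmm --> HH:MM:SS,mmm`), treats anything in `<...>` or `{...}` as a formatting tag, and uses a hypothetical `verify_srt` function name.

```bash
# Sketch: compare a translated SRT file against its original (preserve mode).
TC='^[0-9]{2}:[0-9]{2}:[0-9]{2},[0-9]{3} --> [0-9]{2}:[0-9]{2}:[0-9]{2},[0-9]{3}'

verify_srt() {
    # $1 = original .srt, $2 = translated .srt
    # Prints one line per problem and returns 1 if any were found.
    status=0
    orig_n=$(grep -c -E "$TC" "$1")
    new_n=$(grep -c -E "$TC" "$2")
    [ "$orig_n" -eq "$new_n" ] ||
        { echo "block count: $orig_n in original, $new_n in translation"; status=1; }
    # Every line containing an arrow should be a well-formed timecode.
    [ "$(grep -c -e '-->' "$2")" -eq "$new_n" ] ||
        { echo "malformed timecode line(s) in translation"; status=1; }
    # Timecodes must be identical and in the same order.
    [ "$(grep -E "$TC" "$1" | tr -d '\r')" = "$(grep -E "$TC" "$2" | tr -d '\r')" ] ||
        { echo "timecodes differ from original"; status=1; }
    # The sequence of formatting tags must be unchanged.
    tags='<[^>]+>|\{[^}]*\}'
    [ "$(grep -oE "$tags" "$1")" = "$(grep -oE "$tags" "$2")" ] ||
        { echo "formatting tags differ from original"; status=1; }
    # A timecode followed directly by a blank line (or end of file) is an empty block.
    empty=$(awk '{ sub(/\r$/, "") }
                 prev ~ /-->/ && $0 == "" { print prev }
                 { prev = $0 }
                 END { if (prev ~ /-->/) print prev }' "$2")
    [ -z "$empty" ] || { echo "empty block at: $empty"; status=1; }
    return "$status"
}
```

In retimed mode the block count and timecode comparisons would not apply, since blocks may be split or merged. The malformed-timecode and empty-block checks still make sense there.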

### Phase 3: Consistency Check

Combine all translated subtitle files and send them back to Gemini for review:
- Term consistency across episodes
- Character voice consistency
- Glossary adherence
- Contextual accuracy

Automatically correct the errors Gemini finds in the translated files.
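
When combining files for review, it helps to label each one so that reported issues can be traced back to an episode. A minimal sketch, assuming SRT output named with the target language code; the `=== filename ===` marker format and the `combine_translations` name are hypothetical.

```bash
# Sketch: concatenate translated files with a header line before each one.
combine_translations() {
    # $1 = working folder, $2 = target language code (e.g. tr)
    for f in "$1"/*."$2".srt; do
        [ -e "$f" ] || continue                      # the glob matched nothing
        printf '=== %s ===\n' "$(basename "$f")"     # marks where each file starts
        cat "$f"
        printf '\n'
    done
}

# Example (commented out): send the glossary and the combined files for review.
# { cat glossary.md; combine_translations . tr; } | gemini "Your review prompt here"
```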

### Phase 4: Technical Check (Optional)

If the user asks for it, check the translated files against common subtitle technical rules:
- Max characters per line (e.g., 25-42 depending on platform)
- Max 2 lines per subtitle block
- Minimum display duration (typically 1 second)
- No orphan words on second line
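
The first two rules are easy to check mechanically for SRT files. The sketch below uses a hypothetical `check_layout` helper and a default limit of 42 characters. It ignores formatting tags when measuring length. Note that `length()` counts bytes rather than characters in some awk implementations and locales, so counts for non-ASCII text may be too high there.

```bash
# Sketch: report blocks that exceed the line-count or line-length limits.
check_layout() {
    # $1 = .srt file, $2 = max characters per line (default 42)
    # Prints one line per violation and nothing if the file passes.
    tr -d '\r' < "$1" | awk -v max="${2:-42}" '
        BEGIN { RS = ""; FS = "\n" }       # one record per block, one field per line
        {
            # Field 1 is the block index, field 2 the timecode, the rest is text.
            if (NF - 2 > 2)
                printf "block %s: %d lines\n", $1, NF - 2
            for (i = 3; i <= NF; i++) {
                text = $i
                gsub(/<[^>]*>|\{[^}]*\}/, "", text)    # tags are not displayed
                if (length(text) > max)
                    printf "block %s: line %d has %d characters\n", $1, i - 2, length(text)
            }
        }'
}
```

Minimum display duration and orphan words are not covered here. The former needs timecode arithmetic and the latter a judgment call that is better left to review.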

## Quick Reference

| Phase | What | Tool | Output |
|-------|------|------|--------|
| 1. Glossary | Read all subtitles + context → build term dictionary | Gemini CLI | `glossary.md` |
| 1b. Review | User reviews and corrects glossary | User | Updated `glossary.md` |
| 2. Translate | Each subtitle + glossary → translated subtitle (chunked) | Gemini CLI | Translated files (same format) |
| 2b. Verify | Block count, timecodes, tags check | Claude Code | Verified files |
| 3. Consistency | All translations → review & fix | Gemini CLI | Fixed subtitle files |
| 4. Technical | Format checks (line length, count) | Claude Code | Validated subtitle files |

## Tips

- **Batch by season**: Keep one folder per season for manageable glossary scope
- **Glossary is key**: The better your context input (character bios, show details), the better the translations
- **Iterate glossary**: After Phase 3, update the glossary with any new terms discovered during consistency check

## Common Mistakes

| Mistake | Fix |
|---------|-----|
| Skipping glossary phase | Always build glossary first - it prevents inconsistent character names and terms across episodes |
| Sending all subtitles at once for translation | Send one by one with glossary - prevents context window overflow and improves quality |
| Not chunking large files | Files over 300 blocks should be split into ~250-block chunks to avoid output truncation |
| Not providing show context | Character bios, setting, tone info dramatically improve translation quality |
| Stripping formatting tags | Preserve `{\an8}`, `<i>`, `<b>` and other tags - they control subtitle positioning and style |
| Ignoring technical limits | Platform subtitle rules (char limits, line counts) affect readability |
| Not verifying output | After translation, always check that timecodes are well-formed and, in preserve mode, that block count and timecodes match the original |
| Ignoring rate limits | Gemini 429 errors are normal - retry after a few seconds |
