Claude Code Skills·Claude Skills·The open SKILL.md registry for Claude

Audio Podcast (Page 5 of 5)

268 Claude Code skills in the Audio Podcast sub-category of Content.

268 skills · updated 2026-06-01 · showing 241–268 of 268 by quality score


Use when the user has audio or video and wants a timestamped transcript (SRT) in the source language.
Extracts narration transcripts with sentence-level and word-level timestamps; the script currently uses OpenAI Whisper or a compatible transcription API.
Use when a user needs YouTube subtitles or transcript text from a URL using yt-dlp only, with no video download and no browser automation fallback.
The new frontier of audio: AI-generated music with Suno and Udio, AI sound effects with ElevenLabs, AI voice cloning, and AI audio enhancement.
AI voice creation skill supporting speech recognition (ASR) and text-to-speech (TTS). Uses qwen3-asr-flash-filetrans, qwen-tts and other models.
Tools, patterns, and utilities for creating music with code. Output as a .mp3 file with realistic instrument sounds.
Expert developer skill for implementing real-time voice and video interactions using the Google Gemini Live API.
Discover and install third-party skills from external registries when the user needs a capability that no currently-active skill covers.
This skill should be used when the user asks to "implement voice input", "add speech recognition", "use SFSpeechRecognizer", "configure microphone permissions", "音声入力を実装したい",…
The `marmot` CLI bundles AI generation (text, image, video, speech, transcription), web retrieval (search, scrape, answer, map, crawl, research, findall), and data lookup (enrich,…
Moonshine Voice is a fast on-device speech recognition library for interactive voice applications. This skill helps agents install the Python package, load supported language…
MUST read this skill BEFORE entering generate mode for music tasks. Covers prompt crafting framework, structure syntax, and multi-clip strategy.
Handle audio messages from Telegram and send voice responses. Trigger when the user sends an audio file (e.g., .ogg), requests a voice response, or when the conversation context…
This skill should be used when the user asks to "clean up a transcript", "fix speech artifacts", "edit interview quotes", "polish transcription", "clean up quotes from a…
Use this skill when the user wants to transcribe a Google Meet recording, generate a commercial proposal from a meeting transcript, or push a proposal to Notion.
Interactive text-to-speech audio generation using the gemini-media MCP (Google Gemini TTS). Use this skill whenever the user asks to convert text to speech, generate spoken audio,…
Whishper is an open source self-hosted web app for speech-to-text, translation, and subtitle workflows built around Whisper models.
Extract, transcribe, and summarize audio or video files using OpenAI Whisper. Use this skill whenever the user wants to transcribe audio or video, extract what was said in a…
Control Yulu (语录), the local-first macOS meeting recorder. Use this skill when the user asks to start or stop recording a meeting, check recording status, look up past…
Speech-to-text (audio transcription) via flow_router /v1/audio/transcriptions. OpenAI-compat shape (multipart form), routes to active STT providers.
Add text-to-speech narration to Claude Code on macOS
Transcribe audio verbatim with speaker attribution
When extracting hardcoded data to JSON, field values silently drift:
Agent D2 - Data Collection Specialist - Interviews, Focus Groups & Observation. Covers protocol development, question design, probing strategies, transcription conventions, and…
Transcribe audio via OpenAI Audio Transcriptions API (Whisper).
Alibaba Cloud Bailian Qwen TTS with voice/mood presets
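A skill that transcribes video/audio files with timestamps and then summarizes them. Use for requests such as "transcribe this video", "summarize this audio", "summarize the meeting recording", "transcribe and summarize the mp4", or "create minutes from the interview audio". OpenAI…
Provides a local TTS workflow built around Qwen3-TTS. Supports single-sentence speech generation (CustomVoice/VoiceDesign/VoiceClone) and batch voice-over generation for long scripts (article → voice-over script JSON → batch TTS → merge). Use cases: speech generation, audiobook narration, video narration, multi-character dialogue reading, and voice cloning.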