Audio Podcast — Content Claude Skills (Page 5 of 9)

hyperframes-creative

Non-animation creative direction for HyperFrames videos. Use for design spec (frame.md / design.md) handling, palettes, typography, narration, beat planning, audio-reactive…

hyperframes-media

Asset preprocessing for HyperFrames compositions — text-to-speech narration (Kokoro), audio/video transcription (Whisper), and background removal for transparent overlays (u2net).

iflytek-speed-transcription

Ultra-fast speech transcription using the iFLYTEK Speed Transcription API. Transcribes audio files (WAV/PCM/MP3) up to 5 hours long, at roughly 20 seconds of processing per hour of audio.

improve-transcript

Improve a lecture transcript: confidence-aware LLM correction (ASR-conf x LLM-conf decision matrix) of typos, proper names, and book titles, plus topic-paragraph restructuring in one…

insanely-fast-whisper-gpu-transcription

Insanely Fast Whisper is a CLI tool that transcribes audio at extreme speeds using OpenAI Whisper models with Hugging Face Transformers, Flash Attention 2, and batched inference.

integration-openai-tools

Connect OpenAI's non-chat capabilities (DALL-E / GPT-image image generation, Whisper transcription, embeddings for mid-conversation RAG builds, Batch API for 50%-cost overnight…

interview-transcription

Interview management, transcription workflows, and source note-taking for journalists. Use when preparing for interviews, managing recordings, transcribing audio/video, organizing…

ios-audio-dsp

Expert knowledge for iOS audio processing, pitch detection algorithms (HPS, YIN, FFT), DSP implementation, and AudioKit integration.

ipa-transcription-phonological

Transcribe speech using International Phonetic Alphabet and analyze sound systems including phonotactics and phonological rules

jaspar-api

Access JASPAR database for transcription factor binding profiles (matrices), collections, and species via REST API.

jaspar-database

Query the JASPAR database for Transcription Factor (TF) binding profiles. Use when retrieving Position Frequency Matrices (PFMs) or Position Weight Matrices (PWMs) for specific…

kg-bootstrap

Creates and bootstraps Knowledge Graph projects from video transcripts. Extracts entities (people, organizations, concepts) and relationships into searchable graphs.

kmer-annotation-matrix-assembly

Use when you have filtered peak counts from ATAC or DNase-seq data (with GC bias correction and sample/peak filtering applied) and want to annotate peaks by k-mer content rather…

kokoro-tts

Generate high-quality text-to-speech audio using Kokoro, a neural TTS model running locally on Apple Silicon via MLX.

kramme:markdown-converter

Convert documents and files to Markdown using markitdown. Use when converting PDF, Word (.docx), PowerPoint (.pptx), Excel (.xlsx, .xls), HTML, CSV, JSON, XML, images (wi — from…

Insert and Configure Images, Audio, and Interactive Media

Insert a media object on the intended slide, optionally configure click behavior, and verify the requested result before leaving the slide.

librosa-python-audio-music-analysis-library

librosa is a Python library for audio and music analysis. It provides tools for feature extraction, spectral analysis, beat tracking, onset detection, and audio visualization,…

meeting-sdk/linux

Zoom Meeting SDK for Linux - C++ headless meeting bots with raw audio/video access, transcription, recording, and AI integration for server-side automation

lipsync

Lip-sync a face to a specific audio track on RunComfy via the `runcomfy` CLI. Routes across ByteDance OmniHuman (audio-driven full-body avatar from a portrait + audio), S — from…

liter-llm

Universal LLM API client for 142+ providers with native bindings for 11 languages. Use when writing code that calls LLM APIs via liter-llm in Python, TypeScript, Rust, Go, Java,…

live-co-dm

Real-time co-DM for the Shattered Sea campaign while a session is actively being played. Invoke for: "co-DM the session", "live DM help", "improv help", "/co-dm".

live-stream-audio-monitor

Monitors live audio streams from RTMP, HLS, or Icecast sources using FFmpeg stream capture and real-time chunked transcription via Deepgram's streaming API or Whisper.cpp.

livekit-stt-selfhosted

Build self-hosted speech-to-text APIs using Hugging Face models (Whisper, Wav2Vec2) and create LiveKit voice agent plugins.

local-tts-ptbr

Generate local text-to-speech in Brazilian Portuguese with Piper or Kokoro. Use when the user wants offline TTS, reading in pt-BR, fast audio generation, natural narration, or…

local-whisper

Local speech-to-text using OpenAI Whisper. Runs fully offline after model download. High quality transcription with multiple model sizes.

markdown-converter

Convert documents and files to Markdown using markitdown. Use when converting PDF, Word (.docx), PowerPoint (.pptx), Excel (.xlsx, .xls), HTML, CSV, JSON, XML, images (wi — from…

marker-gene-identification-by-clustering

Use when you have a processed single-cell expression matrix (AnnData object) with pre-computed cluster assignments (e.g., leiden or louvain clusters in adata.obs) and want to…

markitdown

Convert files and office documents to Markdown. Supports PDF, DOCX, PPTX, XLSX, images (with OCR), audio (with transcription), HTML, CSV, JSON, XML, ZIP, YouTube URLs, EP — from…

mastering-prep

Use when the user is finishing a track and wants to check it's ready to send to a mastering engineer or for self-mastering.

media-analysis

Content analysis for video and audio — YouTube, TikTok, podcasts, audio files. Transcription-first pipeline (captions API, user transcript, or Whisper opt-in).

metabolomics-data-integration-with-metabolic-networks

Use when you have measured intracellular metabolite concentrations (e.g., via LC–MS/MS) across multiple cell lines or samples and want to predict which metabolic reactions are…

mock-voice

Clone a voice from YouTube audio and synthesize custom dialogue using OmniVoice (k2-fsa, diffusion-LM TTS), with gemma4:e2b (Ollama) for fast reference transcription.

moltflow-ai

AI-powered WhatsApp features: auto-replies, voice transcription, RAG knowledge base, and style profiles.

multimodal-ai

Patterns for building multimodal AI applications that combine text, images, audio, and video. Covers vision APIs, audio transcription, and unified pipelines.

audiocraft-audio-generation

PyTorch library for audio generation including text-to-music (MusicGen) and text-to-sound (AudioGen).

multimodal-extraction

Given a local video or video URL, downloads the media if needed, extracts slide frames and key moments, transcribes the audio, and writes a Markdown timeline that interleaves…

whisper

OpenAI's general-purpose speech recognition model. Supports 99 languages, transcription, translation to English, and language identification.

music-generation

Tools, patterns, and utilities for generating professional music with realistic instrument sounds. Write custom compositions using music21 or learn from existing MIDI files.

music-producer-pro

Senior music producer. Logic, Ableton, FL Studio, Pro Tools. Mixing, mastering, sound design.

music-prompt-engineering

Optimize and format prompts specifically for AI music generation platforms like Suno and Udio, including platform-specific syntax and tag optimization

nederlands-proof-and-analysis-toolkit

Use when a claim needs proof instead of trust: porting or replacing an algorithm (FSRS-6 vs py-fsrs parity), deciding whether ~225 reviews can train 21 FSRS weights, a read-db.py…

nowa-lekcja

Turn a new Italian-tutoring recording (.mov) into a transcript and a published Hugo blog post with lesson notes and a vocab list.

oma-voice

Local-first text-to-speech and speech-to-text via the Voicebox MCP server. Generates speech from cloned or preset voice profiles for agent notifications, content voiceovers, and…

omniroute-stt

Speech-to-text via OmniRoute using OpenAI /v1/audio/transcriptions format with auto-fallback across Whisper, AssemblyAI, Deepgram, Azure STT.

openai--hyperframes--hyperframes

Create video compositions, animations, title cards, overlays, captions, voiceovers, audio-reactive visuals, and scene transitions in HyperFrames HTML.

openai-api

OpenAI API integration for building AI-powered applications. Use when working with OpenAI's Chat Completions API, Python SDK (openai), TypeScript SDK (openai), tool use/function…

openai-whisper-api-transcription

API-based speech-to-text transcription through OpenAI. No local model downloads, no GPU, no Python ML stack — just an API key and a shell script.

openclaudio-update-advisor

Caustic, fact-based analysis of the stability of OpenClaudio releases. Use before every 'openclaw update' to avoid regressions or log leaks.

openrouter-audio

Audio transcription and text-to-speech generation using OpenRouter API. Use when the user needs to transcribe audio files to text or generate speech/audio from text.

openrouter-transcribe

Transcribe audio files via OpenRouter using audio-capable models (Gemini, GPT-4o-audio, etc).

orion-audio-pipeline

Use when working with speech recognition, text-to-speech, wake word detection, clap detection, VAD, audio preprocessing, or any audio I/O - covers the full audio pipeline from…

otter

Otter.ai transcription CLI - list, search, download, and sync meeting transcripts to CRM.

p5js

Production pipeline for interactive and generative visual art using p5.js. Creates browser-based sketches, generative art, data visualizations, interactive experiences, 3D scenes,…

padlet-spalte-4-workflow

Builds the fourth Padlet column as the counterpart to tab 4 of the Step-Plan Excel workbook. Workflow cards with numbered checkbox steps, legal basis, tags for signatories and…

pdf2audio-minimax

Convert PDF files to MP3 audio using MiniMax MCP Server's text-to-audio tool. Use when user wants to convert a PDF to audio/MP3, create audiobook from PDF, or text-to-speech for…

pedalboard-spotify-audio-effects-python

Pedalboard is a Python library built by Spotify for working with audio: reading, writing, rendering, and adding studio-quality effects.

phonetics-phonology

Sound systems of human language -- phoneme inventories, the International Phonetic Alphabet, articulatory and acoustic phonetics, phonological rules, suprasegmental features…
