Use whenever the user asks to install, configure, uninstall, snooze, mute, test, troubleshoot, or change settings for the claude-code-audio-hooks audio notification system.
Test Bob The Skull with virtual audio injection instead of speaking. Use when testing wake word detection, STT accuracy, full conversation pipeline, or automated testing.
Audio generation skill — jingles, beds, voiceover, and sound effects. Routes music requests to Suno V5 / Udio / Lyria, speech to MiniMax TTS / FishAudio / ElevenLabs V3, and SFX…
Create memorable sonic logos using design principles from Intel, Netflix, and McDonald's—crafting 2-5 second audio signatures that achieve instant brand recognition.
You are the on-device audio ML specialist for Modcaster's AI-driven audio processing.
Use when asked to normalize audio volume, match loudness, or apply peak/RMS normalization to audio files.
Audio playback using Tone.js including players, transport, scheduling, and loading audio. Use when implementing background music, sound effects, audio synchronization, or timed…
Converts and processes audio files using ffmpeg. Supports format conversion, sample rate changes, mono/stereo conversion, and segment splitting.
Professional audio production for music, podcasts, and sound design. Use when working with audio recording, mixing, mastering, or sound design for any medium.
Analyze audio recording quality - echo detection, loudness, speech intelligibility, SNR, spectral analysis.
Analyze the WaveCap-SDR audio stream to assess tuning quality, detect silence, noise, proper audio, or distortion.
Binding audio analysis data to visual parameters including smoothing, beat detection responses, and frequency-to-visual mappings.
Router for audio domain including playback, analysis, and audio-reactive visuals. Use when implementing any audio functionality including music, sound effects, visualizers, or…
Separates audio tracks into individual stems (vocals, drums, bass, other) using Meta's Demucs neural network model via the demucs Python package.
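Audio storytelling skill for the podcast scriptwriter (scriptwriter) and show notes editor (shownote-editor). Provides narrative structure, pacing, and sound direction methods that maximize listener immersion in an audio-only medium.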
Implements audio systems including sound management, music systems, positional audio, and audio effects.
Game audio systems, music, spatial audio, sound effects, and voice implementation. Build immersive audio experiences with professional middleware integration.
End-to-end audio production workflow with stems, effects, archiving, and verification
Step-by-step audio production with per-stem verification, timing alignment, and incremental quality gates
Incremental audio production with duration alignment handling, per-stem verification, and adaptive extension strategies
Incremental audio production with duration mismatch handling, adaptive stem extension, and pre-mix alignment verification
Audio production with diagnostic analysis, timecode parsing from documents, and verified export workflow
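Convert audio/video to text using Whisper, with word-level timestamps. Use when the user wants to transcribe audio, convert speech, audio, or video to text, generate subtitles, or recognize speech (trigger phrases: 语音转文字, 音频转文字, 视频转文字, 字幕生成, 识别语音).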
Transform audio recordings into professional Markdown documentation with intelligent summaries using LLM integration
Cut, trim, and edit audio segments with fade effects, speed control, concatenation, and basic audio manipulations.
Audio and video processing with FFmpeg, WebRTC, and streaming. Covers transcoding, format conversion, real-time communication, and media pipelines.
PyTorch library for audio generation including text-to-music (MusicGen) and text-to-sound (AudioGen).
audioFlux is a deep learning tool library for audio and music analysis and feature extraction, supporting dozens of time-frequency transforms and hundreds of feature combinations…
Transcribe audio verbatim with speaker attribution and chronological visual context
Use when implementing haptic feedback, Core Haptics patterns, audio-haptic synchronization, or debugging haptic issues - covers UIFeedbackGenerator, CHHapticEngine, AHAP patterns,…
Azure AI Transcription SDK for Python. Use for real-time and batch speech-to-text transcription with timestamps and diarization.
De novo motif discovery and known motif enrichment analysis using HOMER and MEME-ChIP. Identify transcription factor binding motifs in ChIP-seq, ATAC-seq, or other genomic peak…
Use Screenpipe when an agent needs private, local-first memory of what you saw or heard on your computer, including searchable screen text, app context, and transcripts, instead…
Automatically integrates processed media (audio transcriptions and image summaries) into chat.md files at the correct timestamp position.
Access ChEA3 and Harmonizome ChEA data for transcription factor enrichment analysis and metadata retrieval.
Bad jokes in the style of Claudio: puns about AI, tech, and work, in Italian
Analyze a competitor's recent social content — extract what's working, what's not, their posting cadence, content mix, and voice patterns — feeds directly into brand-voice-system,…
Guides users through saving generated content (summaries, notes, key points) to professionally formatted and themed files.
This skill acts as an articulatory and prosodic phonetics laboratory. Use this skill WHENEVER the user sends a YouTube video link or a video file…
Convert documents and files to Markdown using markitdown with Windows/WSL path handling. Supports PDF, Word (.docx), PowerPoint (.pptx), Excel (.xlsx, .xls), HTML, CSV, JSON, XML,…
An agent skill built on Coqui TTS, the open-source deep learning toolkit for text-to-speech synthesis.
Debugs and profiles Apple Audio Unit v3 (AUv3) plugins using auval validation tool, the AUAudioUnit Swift API, and Instruments AudioUnit trace template for latency measurement and…
Create a TTS sound pack with speech synthesis, using Qwen3-TTS voice cloning or built-in voices. Use when the user says 'TTS 팩', 'TTS 사운드', 'tts pack', '음성 합성 팩', '보이스 클로닝', 'voice…
Diagnose and fix common Deepgram errors and issues. Use when troubleshooting Deepgram API errors, debugging transcription failures, or resolving integration issues.
Implement production pre-recorded speech-to-text with Deepgram. Use when building audio transcription, batch processing, or implementing diarization and intelligence features.
Implement real-time streaming transcription with Deepgram WebSocket. Use when building live transcription, voice interfaces, real-time captioning, or voice AI applications.
Optimize Deepgram costs and usage for budget-conscious deployments. Use when reducing transcription costs, implementing usage controls, or optimizing pricing tier utilization.
Implement audio data handling best practices for Deepgram integrations. Use when managing audio file storage, implementing data retention, or ensuring GDPR/HIPAA compliance for…
Create a minimal working Deepgram transcription example. Use when starting a new Deepgram integration, testing your setup, or learning basic Deepgram API patterns.
Deep dive into migrating to Deepgram from other transcription providers. Use when migrating from AWS Transcribe, Google Cloud STT, Azure Speech, OpenAI Whisper, AssemblyAI, or…
Real-time speech-to-text using Deepgram Nova-2 API with streaming WebSocket connections. Supports diarization, punctuation, and language detection via the Deepgram Python SDK for…
Optimize Deepgram API performance for faster transcription and lower latency. Use when improving transcription speed, reducing latency, or optimizing audio processing pipelines.
Streams live audio to Deepgram's WebSocket API at wss://api.deepgram.com/v1/listen for real-time speech-to-text.
Receive and verify Deepgram webhooks (callbacks). Use when setting up Deepgram webhook handlers, processing transcription callbacks, or handling asynchronous transcription results.
Implement Deepgram callback and webhook handling for async transcription. Use when implementing callback URLs, processing async transcription results, or handling Deepgram event…
Analyze, summarize, and extract insights from DeLive transcription sessions. Use when: user mentions DeLive, transcription, meeting transcripts, live captions, audio…
Specialist in creating size-optimized real-time audio-visual demos and procedural art. Use when "demoscene, size coding, 64k intro, 4k intro, 1k intro, tiny code, shader golf,…
Demucs is Meta's open-source music source separation project for splitting songs into stems such as vocals, drums, bass, and accompaniment.
Receive and verify ElevenLabs webhooks. Use when setting up ElevenLabs webhook handlers, debugging signature verification, or handling call transcription events.
Implement ElevenLabs webhook HMAC signature verification and event handling. Use when setting up webhook endpoints for transcription completion, call recording, or agent…