Audio studio: mixing, mastering, pro export. Use when: podcast, audiobook, jingle, mixing, mastering, audio editing, voiceover + music.
Guides audio mastering for streaming platforms including loudness optimization and tonal balance. Use when the user has approved tracks and wants to master audio files.
Polishes raw Suno audio by processing per-stem WAVs (vocals, backing_vocals, drums, bass, guitar, keyboard, strings, brass, woodwinds, percussion, synth, other) with targeted…
Coordinates album release including QA, distribution prep, and platform uploads. Use when mastering and album art are complete and the user is ready to release.
Converts mastered audio to sheet music and creates printable songbooks. Use after mastering when the user wants sheet music or a songbook for their album.
Local transcription of audio files without sending them to the cloud. Use when the user asks to transcribe a recording, transcribe audio, produce meeting notes, convert…
Multimodal AI processing via Google Gemini API (2M tokens context). Capabilities: audio (transcription, 9.5hr max, summarization, music analysis), images (captioning, OCR, object…
Process and generate multimedia content using Google Gemini API. Capabilities include analyze audio files (transcription with timestamps, summarization, speech understanding,…
Access macOS productivity apps (Calendar, Contacts, Mail, Messages, Reminders, Voice Memos). Use when user asks about calendar events, contacts, emails, iMessages, reminders, or…
Game audio design patterns for creating sound effects and UI audio. Use when designing sounds for games, writing AI audio prompts (ElevenLabs, etc.), creating feedback sounds, or…
Mix a music / audio track onto an existing video via ffmpeg. Three modes: replace (drop original audio), overlay (mix both audible), duck (sidechain-compressor lowers music when…
Automates Apple Voice Memos (Mac Catalyst, no dictionary) via JXA using filesystem/SQLite access and System Events UI scripting.
Guide for implementing Google Gemini API audio capabilities - analyze audio with transcription, summarization, and understanding (up to 9.5 hours), plus generate speech with…
Write prompts for 10+ frontier AI music generators (Suno v5.5, Udio v4, Google Lyria 3 Pro, ElevenLabs Music, Stable Audio 2.5, MusicGen, Tencent SongGeneration, Sonauto v2,…
Transcribe audio / video to SRT / WebVTT / JSON / plain text via OpenAI Whisper. Auto-detects language or accepts --lang ISO-639-1 hint. ~$0.006/min.
End-to-end aside session processing — transcribe, align memo + transcript, distill into a structured vault note via Enzyme.
Transcribes audio and video files to text using Qwen3-ASR. Supports two modes — local MLX inference on macOS Apple Silicon (no API key, 15-27x realtime) and remote API via…
Build features for Ayeeeen (عين), an AI-powered accessibility app for blind users. Use when: implementing screens, adding ML features, creating audio-first UX flows, handling…
Transcribe a multi-talk conference livestream or long YouTube video into separate per-talk transcripts.
Download videos from social media URLs (X/Twitter, YouTube, Instagram, TikTok, etc.) using yt-dlp. Use when saving a video locally, extracting content for transcription, or…
Transcribes audio/video files using ElevenLabs Scribe v2 API. Use when transcribing audio files, generating transcripts, or converting speech to text.
Generate subtitles (SRT/VTT) and plain text transcripts from video or audio files using AWS Transcribe.
Use this skill whenever a user has an audio file (.wav/.mp3/.flac/etc.) that needs to loop as background on a website or web page — hero ambience, landing-page atmosphere,…
Expert in 2000s-era music visualization (Milkdrop, AVS, Geiss) and modern WebGL implementations. Specializes in Butterchurn integration, Web Audio API AnalyserNode FFT data, GLSL…
The differential-region-analysis pipeline identifies genomic regions exhibiting significant differences in signal intensity between experimental conditions using a count-based…
The TF-differential-binding pipeline performs differential transcription factor (TF) binding analysis from ChIP-seq datasets (TF peaks) using the DiffBind package in R.
Speech-to-text via 9Router /v1/audio/transcriptions using OpenAI Whisper / Groq / Gemini / Deepgram / AssemblyAI / NVIDIA / HuggingFace models.
Implement Abridge ambient clinical documentation capture-to-note pipeline. Use when building the primary encounter workflow: audio capture, real-time transcription, AI note…
Add voice message transcription to Deus using OpenAI's Whisper API. Automatically transcribes WhatsApp voice notes so the agent can read and respond to them.
Infer gene regulatory networks (GRNs) from gene expression data using scalable algorithms (GRNBoost2, GENIE3).
Extract structured intelligence from audio using the AssemblyAI API with sentiment analysis, entity detection, topic modeling, and auto-chapter generation.
Diagnose and fix AssemblyAI common errors and exceptions. Use when encountering AssemblyAI errors, debugging failed transcriptions, or troubleshooting streaming and LeMUR issues.
Execute AssemblyAI primary workflow: async transcription with audio intelligence. Use when transcribing audio/video files, enabling speaker diarization, sentiment analysis, entity…
Execute AssemblyAI streaming transcription and LeMUR workflows. Use when implementing real-time speech-to-text, live captions, voice agents, or LLM-powered audio analysis with…
Optimize AssemblyAI costs through model selection, feature budgeting, and usage monitoring. Use when analyzing AssemblyAI billing, reducing transcription costs, or implementing…
Create a minimal working AssemblyAI transcription example. Use when starting a new AssemblyAI integration, testing your setup, or learning basic transcription patterns.
Optimize AssemblyAI API performance with caching, parallel processing, and model selection. Use when experiencing slow transcriptions, implementing caching strategies, or…
Execute AssemblyAI production deployment checklist and rollback procedures. Use when deploying AssemblyAI integrations to production, preparing for launch, or implementing go-live…
Streams audio from Twilio Media Streams over WebSocket to AssemblyAI real-time transcription, extracting speaker-diarized transcripts with word-level timestamps.
Transcribes audio and generates auto-chapters with summaries using AssemblyAI's /v2/transcript endpoint with auto_chapters=true.
Implement AssemblyAI webhook handling for transcription completion events. Use when setting up webhook endpoints, handling transcription callbacks, or processing async…
Audio analysis with Tone.js and Web Audio API including FFT, frequency data extraction, amplitude measurement, and waveform analysis.
Comprehensive audio analysis with waveform visualization, spectrogram, BPM detection, key detection, frequency analysis, and loudness metrics.
Generate a concise, audio-ready executive briefing script from data across your systems. Use when the user wants a morning brief, situation update, or a summary they can listen to…
Convert audio files between formats (MP3, WAV, FLAC, OGG, M4A) with bitrate and sample rate control. Batch processing supported.
Automatically helps debug Web Audio API issues, audio playback problems, pitch preservation, and caching issues in the VSSK-shadecn music practice app.
Cleans up WAV/MP3 voice recordings using a two-phase semantic workflow: the AI rewrites the content without repetitions into TOML, then aligns a keep flag for each token to render the final audio.
Expert in digital signal processing for audio applications. Validates biquad filter implementations, frequency response calculations, and audio algorithms.
Master the essential audio post-production techniques—normalization, compression, EQ, and noise reduction—using the correct processing order to achieve professional-quality audio.
FFmpeg audio processing, batch editing, normalization, mixing, and automated audio production workflows.
Create standard SuperCollider audio effects for Bice-Box (delays, reverbs, filters, distortions). Provides templates, ControlSpecs, common patterns, and MCP workflow for safely…
Audio engineering — mastering, mixing, EQ, compression, loudness standards, synthesis, podcast production, music theory, spectrum analysis.
Audio production concepts, DSP fundamentals, mixing/mastering techniques, and DAW workflows. Bridges modular synthesis philosophy with practical audio engineering.
Extract audio from video files. Use when the user wants to extract audio, convert video to audio, export audio, or get the audio out of a video.
FFmpeg patterns for extracting audio from video files and transcoding between formats.
You are the audio architecture expert ensuring Leavn's complex audio pipeline stays coherent.
You are the audio fingerprinting and pattern detection specialist for Modcaster's content analysis.
Identifies audio content using Chromaprint/AcoustID fingerprinting, Shazam API recognition, and ACRCloud monitoring.
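Unified audio generation skill. Supports ElevenLabs MCP-based TTS (32 languages), voice cloning (1-minute sample), multilingual dubbing (lip sync), and sound effect generation. Use for requests such as "voice generation", "TTS", "speech synthesis", "voice cloning", "dubbing", "narration", "sound effects", or "AI voice".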