The SKILL.md for Multimodal LLM includes worked examples, pricing or quota commentary, and at least one code block. It runs to about 1,007 words, in the catalog's typical mid-range.
Multimodal LLM sits in the General category under the integrations sub-topic in the ClaudSkills catalog. There are 10 related skills indexed alongside it; comparing a few before installing usually reveals which fits your workflow best.
These notes are auto-generated from features detected in the SKILL.md file and from this catalog's structure — they aren't part of the source repository.
Integrate vision, audio, and video generation capabilities from leading multimodal models. Covers image analysis, document understanding, real-time voice agents, speech-to-text, text-to-speech, and AI video generation (Kling v3, Sora 2, Veo 3.1 std/lite/fast tiers, Runway Gen-4.5 via gen4_turbo).
Multimodal LLM is a community-contributed Claude Code skill in the integrations sub-category. It ships as a SKILL.md file that Claude Code auto-discovers under ~/.claude/skills/multimodal-llm/ and loads when your prompt matches the skill's trigger.
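To see which skills are laid out the way the paragraph above describes, a small shell helper can list every directory under the skills folder that contains a SKILL.md. This is only a sketch of the on-disk layout: `list_skills` is a hypothetical name, and it does not reproduce Claude Code's own discovery or trigger-matching logic.

```shell
# Hypothetical helper: print the name of each directory under the skills
# root that contains a SKILL.md file.
# Usage: list_skills [skills-root]   (defaults to ~/.claude/skills)
list_skills() {
  root="${1:-$HOME/.claude/skills}"
  for f in "$root"/*/SKILL.md; do
    # Skip the unexpanded glob pattern when nothing matches.
    [ -f "$f" ] || continue
    basename "$(dirname "$f")"
  done
}
```

Running `list_skills` with no argument checks the default location; after installation, `multimodal-llm` should appear in the output.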
When to invoke it: Use when processing images, transcribing audio, generating speech, generating AI video (Kling v3, Sora 2, Veo 3.1 std/lite/fast, Runway Gen-4.
The Multimodal LLM Claude Code skill is built for Claude Code users and developers across all disciplines looking for general-purpose AI assistance. It's part of ClaudSkills (also referred to as Claude Skills or Claude Code Skills) — the open community-curated registry of 115,000+ SKILL.md files for Anthropic's Claude Code agent and the wider Claude ecosystem (Claude API, Claude Agent SDK).
mkdir -p ~/.claude/skills/multimodal-llm
curl -L https://claudskills.com/skills/multimodal-llm/SKILL.md \
  -o ~/.claude/skills/multimodal-llm/SKILL.md
Or just download SKILL.md directly and drop it into ~/.claude/skills/multimodal-llm/. Claude Code auto-discovers it on next session.
Skills live at ~/.claude/skills/multimodal-llm/SKILL.md on macOS/Linux, or %USERPROFILE%\.claude\skills\multimodal-llm\SKILL.md on Windows. See the full install guide for step-by-step instructions.
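After a manual install on macOS/Linux, it can help to confirm the file landed at the path given above. The following is a minimal sketch, assuming a POSIX shell; `check_skill` is a hypothetical helper name, not part of Claude Code or ClaudSkills.

```shell
# Hypothetical helper: report whether a skill's SKILL.md exists and is non-empty.
# Usage: check_skill <skill-name> [skills-root]   (root defaults to ~/.claude/skills)
check_skill() {
  name="$1"
  root="${2:-$HOME/.claude/skills}"
  file="$root/$name/SKILL.md"
  if [ -s "$file" ]; then
    echo "installed: $file"
  else
    echo "missing or empty: $file"
    return 1
  fi
}
```

Calling `check_skill multimodal-llm` returns a non-zero status when the download failed or wrote an empty file, which makes it usable in a setup script.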
Open @claudskills_bot on Telegram, tap Open Desktop App, and the desktop app installs this skill for you. Or share the bot link with a colleague; they get the same one-tap install.
The ClaudSkills desktop app installs any skill directly into ~/.claude/skills/ with one click — no terminal required. Pro starts at $9/mo or $149 lifetime.
Install it with the ClaudSkills desktop app, or copy SKILL.md from the source repository to ~/.claude/skills/multimodal-llm/SKILL.md and restart Claude Code. Both flows are detailed at claudskills.com/install/.
A Claude Code skill is a SKILL.md file that lives under ~/.claude/skills/<name>/ and tells the Claude Code CLI agent how to perform a specific task (instructions, prompts, allowed tools). Skills are auto-discovered at session start. Multimodal LLM is one of 67,000+ skills indexed in the open ClaudSkills catalog, classified under the General category. Learn more at /learn/what-is-a-claude-skill/.
If you reference this skill in a blog post, paper, or documentation, you can cite it as:
@misc{multimodal-llm-2026,
author = {OrchestKit},
title = {Multimodal LLM [Claude Code skill]},
year = {2026},
publisher = {ClaudSkills},
url = {https://claudskills.com/skills/multimodal-llm/}
}
Grade A · scanned 2026-06-13: free static scan against the OWASP Agentic Skills Top 10.
No risk patterns were found in any of the ten OWASP-aligned categories.
ClaudSkills is a community-curated registry of SKILL.md files, not affiliated with, endorsed by, or sponsored by Anthropic.