
vLLM Speculative Decoding

Category: General  ·  Sub-category: general-misc  ·  Last updated:
ai:llm
Pick, configure, tune, and monitor vLLM speculative decoding in production. Covers the eleven SpeculativeMethod options (ngram, ngram_gpu, medusa, mlp_speculator, draft_model, suffix, eagle, eagle3, dflash, mtp, extract_hidden_states); the `--speculative-config` JSON schema; which methods pair with which target model family; the Prometheus acceptance-metric surface; version gates (v0.11.1 EAGLE-3 preamble fix, v0.16 parallel drafting, v0.18 ngram_gpu, v0.19 dflash and zero-bubble); composability with chunked prefill, pipeline parallelism (PP), LoRA, FP8, and structured outputs; the Arctic Inference plugin; and where speculative decoding stops paying off at high batch sizes.
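As a rough illustration of the `--speculative-config` flag mentioned above, the sketch below assembles a `vllm serve` command line with a JSON config for the ngram method. The model name is a placeholder, and the field names (`method`, `num_speculative_tokens`, `prompt_lookup_max`) are assumptions based on commonly documented vLLM options; check them against the schema of the vLLM version you actually run.

```python
import json
import shlex


def build_serve_command(model: str, speculative_config: dict) -> list:
    """Return an argv list for `vllm serve` carrying a JSON speculative config."""
    # Compact JSON keeps the flag value on a single line.
    config_json = json.dumps(speculative_config, separators=(",", ":"))
    return ["vllm", "serve", model, "--speculative-config", config_json]


# Illustrative ngram settings. Field names are assumptions to verify
# against your installed vLLM version.
ngram_config = {
    "method": "ngram",
    "num_speculative_tokens": 5,
    "prompt_lookup_max": 4,
}

# "my-org/my-target-model" is a hypothetical placeholder, not a real model.
command = build_serve_command("my-org/my-target-model", ngram_config)

# shlex.join quotes the JSON so the line can be pasted into a POSIX shell.
print(shlex.join(command))
```

Building the value with `json.dumps` rather than hand-writing the string avoids the quoting mistakes that tend to creep in when JSON is embedded in shell commands.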

From the source SKILL.md

This skill is for production vLLM operators who are deciding which speculative method fits a given model and workload, configuring it correctly, wiring the acceptance metrics into their dashboards, and diagnosing why a deployment is not seeing the expected speedup.
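When diagnosing a missing speedup, two derived numbers are usually the first thing to look at: the fraction of drafted tokens the target model accepts, and the mean number of tokens produced per verification step. The sketch below computes both from three counter deltas over a time window. The parameter names are hypothetical and do not claim to match vLLM's exact Prometheus metric names; map them to whatever counters your version exports. The "+1" assumes each verification step always yields one token from the target model in addition to any accepted draft tokens.

```python
def acceptance_stats(num_drafts: int, num_draft_tokens: int,
                     num_accepted_tokens: int) -> dict:
    """Derive acceptance figures from counter deltas over one time window.

    All three arguments are hypothetical names for counter increases:
    drafting rounds, tokens proposed, and proposed tokens accepted.
    """
    if num_drafts <= 0 or num_draft_tokens <= 0:
        # No speculation happened in this window, so nothing can be derived.
        return {"acceptance_rate": None, "mean_acceptance_length": None}
    return {
        # Share of proposed tokens that the target model kept.
        "acceptance_rate": num_accepted_tokens / num_draft_tokens,
        # Accepted draft tokens per round, plus the one token the target
        # model is assumed to contribute on every verification step.
        "mean_acceptance_length": 1 + num_accepted_tokens / num_drafts,
    }


stats = acceptance_stats(num_drafts=1000, num_draft_tokens=4000,
                         num_accepted_tokens=2500)
print(stats)
```

A low acceptance rate points at a poor draft/target pairing, while a healthy rate with no end-to-end speedup suggests the deployment is running at a batch size where speculation no longer pays.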

What this skill does

vLLM Speculative Decoding is a community-contributed Claude Code skill in the general-misc sub-category. It ships as a SKILL.md file that Claude Code auto-discovers under ~/.claude/skills/vllm-speculative-decoding/ and loads when your prompt matches the skill's trigger.

Who uses this skill

The vLLM Speculative Decoding skill is built for Claude Code users and developers across all disciplines looking for general-purpose AI assistance. It is part of the open ClaudSkills registry, a community-curated catalog of 56,000+ capabilities you can install for Claude Code, the Claude CLI agent.

How to install

Free

Manual install (2 steps)

mkdir -p ~/.claude/skills/vllm-speculative-decoding
curl -L https://claudskills.com/skills/vllm-speculative-decoding/SKILL.md \
  -o ~/.claude/skills/vllm-speculative-decoding/SKILL.md

Or just download SKILL.md directly and drop it into ~/.claude/skills/vllm-speculative-decoding/. Claude Code auto-discovers it on next session.

Skills live at ~/.claude/skills/vllm-speculative-decoding/SKILL.md on macOS/Linux, or %USERPROFILE%\.claude\skills\vllm-speculative-decoding\SKILL.md on Windows. See the full install guide for step-by-step instructions.
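To confirm the manual install landed in the right place, a small check like the following can help. It is a minimal sketch that only tests for a non-empty SKILL.md at the path given above; the helper names are hypothetical, and it does not verify that Claude Code has actually loaded the skill.

```python
from pathlib import Path
from typing import Optional

SKILL_NAME = "vllm-speculative-decoding"


def skill_path(home: Optional[Path] = None) -> Path:
    """Return the expected SKILL.md location under the given home directory."""
    # Path.home() resolves to %USERPROFILE% on Windows and $HOME elsewhere.
    base = home if home is not None else Path.home()
    return base / ".claude" / "skills" / SKILL_NAME / "SKILL.md"


def is_installed(home: Optional[Path] = None) -> bool:
    """True if SKILL.md exists at the expected location and is not empty."""
    path = skill_path(home)
    return path.is_file() and path.stat().st_size > 0


print(skill_path(), "installed" if is_installed() else "missing")
```

An empty file is treated as missing because a failed download can leave a zero-byte SKILL.md behind.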

Pro

One-click install via the desktop app

The ClaudSkills desktop app installs any skill directly into ~/.claude/skills/ with one click — no terminal required. Pro starts at $9/mo or $149 lifetime.



Part of Acreator Store — Adam Lankamer's AI tools: GifPerfect · AspectPerfect · SlomoPerfect · Ucaption · UTagger · AutoXPoster · TestYourSkills