The Claude Code Skills Roadmap: Where SKILL.md Stands in 2026

Published 31 May 2026 · 14 min read · By a long-time Claude Code practitioner

SKILL.md is little more than half a year old. Anthropic shipped the format in October 2025, and in the time since it has gone from a curiosity buried in the Claude Code docs to something tens of thousands of developers are writing, sharing, and arguing about. That's fast, even by AI-tooling standards, and it means the conventions you internalize this month may not be the ones the ecosystem settles on by autumn.

What follows is my read on the landscape: where the format actually is right now, what the catalog looks like in the aggregate, the gaps that anyone shipping skills should know about, and the changes I expect to land over the next six to twelve months. It's a practitioner's perspective, not a roadmap from Anthropic, and I'll flag opinions as opinions.

In this guide

Where the format sits today
Adoption beyond Anthropic
What the catalog actually looks like
Quality patterns in the high-end tail
Gaps and opportunities for authors
Where the format is heading
Anthropic's skills vs the community's
How to participate

Where the format sits today

SKILL.md is a Markdown file with a YAML frontmatter block. Claude Code looks for it in ~/.claude/skills/<skill-name>/SKILL.md (user-scope) or .claude/skills/<skill-name>/SKILL.md inside a project (project-scope). The frontmatter declares name and description at minimum; richer frontmatter declares model, allowed-tools, user-invokable, and a growing handful of optional fields. The body is freeform instructions the model loads when the skill activates.
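To make the file shape concrete, here is a minimal Python sketch that splits a SKILL.md into frontmatter and body and checks for the two required fields. It assumes flat `key: value` frontmatter only; a real loader would use a proper YAML parser, and the `changelog-writer` skill is a made-up example.

```python
from typing import Dict, Tuple


def split_skill(text: str) -> Tuple[Dict[str, str], str]:
    """Split SKILL.md text into (frontmatter, body).

    Assumption: frontmatter is flat `key: value` lines between two `---`
    delimiters. Nested YAML is out of scope for this sketch.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("missing opening '---' frontmatter delimiter")
    try:
        end = next(i for i in range(1, len(lines)) if lines[i].strip() == "---")
    except StopIteration:
        raise ValueError("missing closing '---' frontmatter delimiter") from None

    meta: Dict[str, str] = {}
    for line in lines[1:end]:
        if not line.strip() or line.lstrip().startswith("#"):
            continue  # skip blank lines and YAML comments
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"not a 'key: value' line: {line!r}")
        meta[key.strip()] = value.strip()

    body = "\n".join(lines[end + 1:]).strip()
    return meta, body


# Hypothetical skill used only for illustration.
EXAMPLE = """---
name: changelog-writer
description: Drafts a CHANGELOG entry from staged git changes.
allowed-tools: [Read, Grep]
---
Read the staged diff and summarise it as one CHANGELOG entry.
"""

meta, body = split_skill(EXAMPLE)
missing = [key for key in ("name", "description") if key not in meta]
print(meta["name"], "missing required fields:", missing)
```

Values are kept as raw strings here, so `allowed-tools` comes back as the text `[Read, Grep]` rather than a list.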

That's the formal spec. The interesting thing is what activates a skill. Claude Code reads the description field at session start and uses it as a routing signal — if the user's prompt looks like the skill might apply, the model pulls the body into context. That's why I tell anyone writing skills that the description field is the single highest-leverage line in the entire file. Most authors waste it on a generic summary; the well-written ones use it as a precise activation contract.

What Claude Code's current implementation supports

User-scope and project-scope discovery. Skills under your home dir are available everywhere; skills under a project root only activate in that project. The project-scope behavior is what makes skills viable for team workflows.
Subskills. A skill directory can contain a scripts/ folder, helper Markdown files, and a references/ folder that Claude will read on demand. The model does not load all of it eagerly; it pulls what it needs.
Allowed-tools restriction. If a skill declares allowed-tools: [Read, Grep], Claude won't reach for Bash while that skill is active. This is underused. It should not be.
User-invokable flag. Setting user-invokable: true lets the human trigger the skill explicitly via a slash-style invocation. Without it, the skill only activates via the description-matching path.
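The two discovery scopes can be sketched as a directory walk. This is my own illustration of the layout described above, not Claude Code's actual loader; it deliberately reports both scopes side by side and makes no claim about which one wins when names collide.

```python
import tempfile
from pathlib import Path
from typing import List, Tuple


def discover(user_home: Path, project_root: Path) -> List[Tuple[str, str, Path]]:
    """Return (scope, skill_name, path) for every SKILL.md in both scopes."""
    roots = [
        ("user", user_home / ".claude" / "skills"),
        ("project", project_root / ".claude" / "skills"),
    ]
    found: List[Tuple[str, str, Path]] = []
    for scope, root in roots:
        if not root.is_dir():
            continue
        for skill_dir in sorted(p for p in root.iterdir() if p.is_dir()):
            skill_file = skill_dir / "SKILL.md"
            if skill_file.is_file():
                found.append((scope, skill_dir.name, skill_file))
    return found


# Demo against throwaway directories rather than the real home directory.
# The skill names are hypothetical.
with tempfile.TemporaryDirectory() as tmp:
    home, project = Path(tmp) / "home", Path(tmp) / "repo"
    for base, name in [(home, "changelog-writer"), (project, "release-notes")]:
        skill_dir = base / ".claude" / "skills" / name
        skill_dir.mkdir(parents=True)
        (skill_dir / "SKILL.md").write_text("---\nname: x\ndescription: y\n---\n")
    # A directory without a SKILL.md is not a skill and is skipped.
    (project / ".claude" / "skills" / "drafts").mkdir()
    results = [(scope, name) for scope, name, _ in discover(home, project)]

print(results)
```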

What's not in the spec yet, and what most authors get wrong: there's no version field, no capability declaration, no model-compatibility gate, no formal way to express "this skill requires Claude Opus, not Haiku." You can put those things in custom frontmatter keys — and many authors do — but nothing in the runtime reads them. The skill activates the same way regardless.
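One way to keep yourself honest about this is a small lint that lists frontmatter keys nothing will act on. The sketch below assumes the set of recognized fields is exactly the handful named in this article, spelled as they are here; the real list is longer and changes between releases, so treat `KNOWN_FIELDS` as a placeholder you maintain yourself.

```python
from typing import Dict, List

# Assumption: only the fields this article names. Not an official list.
KNOWN_FIELDS = {"name", "description", "model", "allowed-tools", "user-invokable"}


def unread_keys(frontmatter: Dict[str, object]) -> List[str]:
    """Return frontmatter keys that are custom metadata, not runtime input."""
    return sorted(key for key in frontmatter if key not in KNOWN_FIELDS)


# Hypothetical skill metadata for illustration.
frontmatter = {
    "name": "sox-walkthrough",
    "description": "Walks through a SOX 404 control test.",
    "version": "1.2.0",
    "min-model": "opus",
}
warnings = unread_keys(frontmatter)
print("custom keys with no runtime effect:", warnings)
```

The point of the check is documentation, not rejection: custom keys are fine as long as nobody on the team believes they gate anything.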

Adoption beyond Anthropic

The SKILL.md format is Anthropic's, but it doesn't have to stay that way, and the early signs suggest it won't. The format is just Markdown plus YAML — there's nothing Anthropic-specific about the file shape itself. What's Anthropic-specific is the runtime that loads it.

Here's where adoption sits in my reading of the field as of mid-2026:

Claude Code (Anthropic). First-party, full implementation. This is the reference runtime — everything else compares to it.
Codex CLI (OpenAI). Has its own equivalent format and discovery path. There's been informal community work on converters, but no native SKILL.md ingestion. My read: OpenAI is unlikely to adopt a competitor's file format directly. Expect a parallel ecosystem.
Gemini CLI (Google). Same story. Google's tooling reads its own format. A few community projects ingest SKILL.md and translate, but it's not first-class.
VS Code Copilot extensions. Several community extensions now scan a workspace for .claude/skills/ and surface the skill metadata, even when the underlying chat model isn't Claude. This is the closest thing to cross-tool adoption I see today.
Cursor, Windsurf, Cline, Aider, Roo Code. Each has its own rules-file convention (.cursorrules, .windsurfrules, .clinerules, conventions.md, etc). There's a clear pattern of community-built converters that translate SKILL.md to and from these formats. The conventions are similar enough that translation is mostly mechanical.

My read: SKILL.md is on track to become the de facto interchange format for agent instructions, even if it never becomes the universal runtime format. The reason is simple — it's the one with the largest published corpus. When you have tens of thousands of human-written skills already in the wild, every other tool either ingests them or loses the network effect. I expect to see "import from SKILL.md" become a standard feature in adjacent tools over the next year.
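To show what "mostly mechanical" translation can look like, here is a sketch that flattens one skill into a plain-text rules section. The output layout is my own assumption, not any tool's documented format. Rules files are typically loaded wholesale rather than activated on demand, so the description is restated as an explicit "Apply when" line.

```python
def to_rules_text(name: str, description: str, body: str) -> str:
    """Flatten one skill into a plain-text rules section.

    Assumption: the target tool reads a freeform text file, so the only
    job is to carry the activation hint and the instructions across.
    """
    return "\n".join([
        f"# {name}",
        f"Apply when: {description}",
        "",
        body.strip(),
        "",
    ])


# Hypothetical skill content for illustration.
rules = to_rules_text(
    name="changelog-writer",
    description="The user asks for a CHANGELOG entry from staged changes.",
    body="Read the staged diff and summarise it as one CHANGELOG entry.\n",
)
print(rules)
```

What does not survive the trip is anything the runtime enforces, such as tool restrictions; a converter can only restate those as prose.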

The risk to that trajectory is fragmentation. If Anthropic adds runtime-specific features fast enough — model-specific gating, sandboxed subskill execution, signed manifests — the format may bifurcate into "Anthropic SKILL.md" and "portable SKILL.md." I think the community would route around that the same way it routed around browser-specific HTML, but it would slow things down.

What the catalog actually looks like

Take the published corpus in aggregate and a few patterns jump out immediately. I'm speaking in broad strokes here because exact numbers shift week to week.

Categories are wildly uneven

Engineering dominates. Code review, refactoring, test generation, debugging, infrastructure — these are the categories with the deepest published coverage, and it's not close. There's a tier-two cluster around content production (drafting, editing, SEO, social), product management (specs, planning, prioritization), and growth marketing. Then the long tail.

Where the catalog is conspicuously thin:

Legal and compliance. A handful of contract-review skills, almost nothing for regulated industries. HIPAA workflow skills are nearly absent. SOX testing helpers are a desert. GDPR-aware processors are scarce.
Finance and accounting. Some basic close-management skills, very few that handle real audit trails, journal entry prep, or reconciliation flows.
Healthcare and life sciences. Almost nothing patient-data-aware. The few that exist tend to be generic medical Q&A and don't account for the operational reality of clinical workflows.
Education. Tutoring-style skills exist; curriculum design, IEP support, assessment authoring — barely a presence.
Government and civic. Effectively zero.

Tool coverage is bimodal

Skills that wrap a specific tool fall into two camps. Developer-facing tools (git, GitHub, AWS, Docker, Kubernetes, Stripe) have multiple competing skills, often of high quality. Enterprise SaaS (Salesforce, Workday, ServiceNow, Oracle, SAP) has next to nothing, despite being where most actual office work happens. I think that gap is large and underappreciated.

Languages and locales

The published corpus is overwhelmingly English. Skills with non-English descriptions exist — Japanese, Chinese, Portuguese, Polish, and Spanish all have presence — but they're a small fraction. Most authors writing in their native language still write the frontmatter description in English because the activation routing reads it. That tells you the activation layer is implicitly English-biased.

If you're a multilingual author, this is an opportunity. A skill whose body is in Japanese and whose description routes from both English and Japanese prompts is genuinely rare today.

Quality patterns in the high-end tail

The median skill is shorter than people expect. A lot of authors write a single tight paragraph after the frontmatter, ship it, and move on. That's not necessarily wrong — some of the most useful skills I've seen are short — but it's worth knowing the distribution. Most published skills clock in under a few hundred words of body content. The thoughtful long ones are the exception.

Frontmatter coverage is patchy. name and description are nearly universal. After that, it falls off fast. Maybe a third of skills declare allowed-tools. Even fewer set model. Custom frontmatter — license, version, tags, author — appears on a minority. The frontmatter floor is low, and the people who fill it out beyond the minimum are signaling care.

The single trait that distinguishes the well-authored skills

Anti-trigger discipline. By a wide margin. I'd say it's the most reliable signal of a skill that was written by someone who actually used it in production, vs. someone who wrote it once and shipped.

Anti-trigger discipline means the skill explicitly tells the model when not to apply itself. Something like:

## When NOT to use this skill

- The user is asking about a different framework (skip).
- The error message contains 'permission denied' — that's a different skill.
- If the project has no test runner configured, do not attempt to generate tests.

This pattern shows up in maybe one in ten published skills. The ones that have it almost always also have specific examples, pricing/quota disclosure when they touch an external API, code blocks instead of prose for shell commands, and clear scope boundaries. The ones that lack it are usually the ones that misfire — the model loads them on prompts they have no business handling, eats context, and gives bad results.
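If you review skills in bulk, the presence of an anti-trigger section is easy to check mechanically. The sketch below looks for a heading such as "When NOT to use this skill" and returns the bullets under it. The heading phrases it accepts are my own guesses at common wording, not a convention the format defines.

```python
import re
from typing import List

# Assumption: anti-trigger sections are introduced by a Markdown heading
# containing one of these phrases.
ANTI_TRIGGER_HEADING = re.compile(
    r"^#{1,6}\s+.*\b(when not to use|do not use|out of scope)\b",
    re.IGNORECASE | re.MULTILINE,
)


def anti_triggers(body: str) -> List[str]:
    """Return the bullet items under the first anti-trigger heading, if any."""
    match = ANTI_TRIGGER_HEADING.search(body)
    if not match:
        return []
    items: List[str] = []
    for line in body[match.end():].splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            break  # the next heading ends the section
        if stripped.startswith(("- ", "* ")):
            items.append(stripped[2:].strip())
    return items


BODY = """## When to use this skill

- The user asks for unit tests.

## When NOT to use this skill

- The user is asking about a different framework (skip).
- If the project has no test runner configured, do not attempt to generate tests.

## Steps

- Run the test runner.
"""

items = anti_triggers(BODY)
print(len(items), "anti-triggers found")
```

An empty result is a prompt for a human to look, not proof the skill is bad; some authors fold the same guidance into prose.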

Other patterns in the high tail

Pricing and quota tables. Any skill that touches a paid API and is worth a damn discloses what the user is about to spend or rate-limit against. The really good ones tier it: "cheap path," "normal path," "expensive escalation."
Worked examples. Not just "the skill does X" — concrete input → expected output pairs the model can pattern-match against.
Explicit failure modes. "If the lookup returns nothing, do this. If it returns more than 20 results, do this other thing."
Tight tool restrictions. The good skills use allowed-tools aggressively. They don't say "the model has access to everything"; they say "the model can Read and Grep, nothing else, and here's why."

If you're writing a skill today and want it to be in the high tail, the four habits above will get you most of the way there. Start with the anti-trigger section — most authors leave it out, and adding it is what most distinguishes thoughtful authorship from quick authorship.

Gaps and opportunities for authors

The thin spots in the catalog are the opportunity. If you're trying to write something that gets used, write into a gap, not into an oversaturated category. There are already dozens of competent code-review skills. There are not dozens of competent SOX-walkthrough skills.

Domain-specific compliance workflows

If you have real expertise in a regulated workflow — HIPAA breach notification triage, SOX 404 testing, GDPR Article 30 record-keeping, SOC 2 evidence collection — and you write a skill that encodes the actual workflow with proper anti-triggers, you will own that niche. There is essentially no competition today. The reason these are scarce isn't lack of demand — it's that the people who know these workflows aren't usually the people writing skills. Bridging that is high value.

Vertical SaaS tooling

The skills catalog wraps developer tools well and enterprise SaaS terribly. A skill that handles Salesforce SOQL queries with proper governor-limit awareness, or one that drafts Workday business processes correctly, or one that navigates ServiceNow Flow Designer — these are missing. Some of this is because the underlying APIs are notoriously crusty. But that's also the point: the crustier the underlying interface, the more value a well-written skill provides on top of it.

Multilingual skills

I mentioned this earlier — worth restating. Skills with native-language bodies and bilingual descriptions are rare. Anyone working in Japanese, Mandarin, German, Spanish, Portuguese, Arabic, or Hindi who writes a skill that's competent in their language has a near-empty niche.

Workflow skills, not feature skills

The catalog over-indexes on skills that wrap a single feature or tool. It under-indexes on skills that orchestrate a multi-step workflow — the kind a senior practitioner would walk a junior through. "Run our standard incident response intake." "Triage a customer escalation through the right team." "Take a sales call recording, extract the action items, draft the follow-up, and stage it in the CRM." These exist, but in nothing like the volume the catalog could support.

Anti-skills

This is a personal hobbyhorse. Some of the most valuable skills I've written do nothing themselves — they exist to tell the model what not to do. "When the user mentions a credential, never echo it back." "When the user asks for legal advice, decline and route." The catalog has very few of these, and they're underweighted because they don't feel like "features." They are, though. Defensive skills compound across every other skill that runs alongside them.

Where the format is heading

This is the speculative part. I'll flag confidence levels.

High confidence

A capability declaration. Right now, allowed-tools is the only formal capability gate. I think we'll get a richer one — something that lets a skill say "I need network access," "I need to read environment variables," "I run a subprocess." The runtime will then enforce it. This is the natural next step for any system that's going to be trusted with real production work.
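As a thought experiment, enforcement could be as simple as a set difference between what a skill declared and what it tries to do. Everything in this sketch is hypothetical: no such field exists today, and the capability names are invented for illustration.

```python
from typing import Iterable, Set


def missing_capabilities(declared: Iterable[str], requested: Iterable[str]) -> Set[str]:
    """Capabilities the skill tries to use without having declared them."""
    return set(requested) - set(declared)


def enforce(declared: Iterable[str], requested: Iterable[str]) -> None:
    """Raise if the skill reaches for anything it did not declare."""
    missing = missing_capabilities(declared, requested)
    if missing:
        raise PermissionError("undeclared capabilities: " + ", ".join(sorted(missing)))


# Hypothetical capability names; no such declaration exists in the format yet.
declared = ["read-files", "network"]
gap = missing_capabilities(declared, ["read-files", "subprocess"])
print("would be blocked:", sorted(gap))
```

The interesting design question is not the check itself but who produces the `requested` side; that has to come from the runtime observing tool calls, not from the skill.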

A version field. Authors are already putting version: 1.2.0 in custom frontmatter. The runtime doesn't read it. I expect it will. Once skills get a real version field, the catalog gets requires:, dependency resolution, and a proper semver discipline.
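Here is a sketch of what a runtime-side check for a hypothetical `requires:` entry might look like. It assumes plain `MAJOR.MINOR.PATCH` versions and a "same major, at least this version" rule; both are my assumptions, and pre-release tags are ignored.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, int, int]:
    """Parse 'MAJOR.MINOR.PATCH' into a tuple of ints."""
    major, minor, patch = (int(part) for part in text.strip().split("."))
    return major, minor, patch


def satisfies(installed: str, minimum: str) -> bool:
    """True if `installed` has the same major version and is >= `minimum`."""
    have, need = parse_version(installed), parse_version(minimum)
    return have[0] == need[0] and have >= need


# Comparing as integer tuples matters: as strings, "1.10.0" sorts before "1.9.0".
checks = {
    ("1.4.0", "1.2.0"): satisfies("1.4.0", "1.2.0"),
    ("1.10.0", "1.9.0"): satisfies("1.10.0", "1.9.0"),
    ("2.0.0", "1.2.0"): satisfies("2.0.0", "1.2.0"),
    ("1.1.9", "1.2.0"): satisfies("1.1.9", "1.2.0"),
}
for (installed, minimum), ok in checks.items():
    print(f"{installed} satisfies >={minimum}: {ok}")
```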

Model-specific behavior gates. Some skills need Claude Opus or above to work properly. Others run fine on Haiku. There's no way to express that today other than convention. I expect a min-model or recommended-model field within twelve months.

Medium confidence

Signed manifests. Any system that runs third-party instructions at scale eventually adds a signature layer. I expect Anthropic to ship something here — probably optional at first — for enterprise customers who need to attest that the skills running on their org are the ones they vetted.
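Until something like that ships, a team can approximate the attestation half with digest pinning: record a SHA-256 of each skill at review time and flag anything that no longer matches. To be clear, this is not a signature scheme, since nothing here proves who wrote the manifest; it is a weaker stand-in I'm sketching under that assumption.

```python
import hashlib
import hmac
from typing import Dict, List


def digest(text: str) -> str:
    """SHA-256 hex digest of a skill's full text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def unvetted(installed: Dict[str, str], manifest: Dict[str, str]) -> List[str]:
    """Names whose current content does not match the digest recorded at review."""
    return sorted(
        name
        for name, text in installed.items()
        if not hmac.compare_digest(digest(text), manifest.get(name, ""))
    )


# Hypothetical skills: two reviewed, then one edited and one added afterwards.
reviewed = {
    "changelog-writer": "---\nname: changelog-writer\n---\nSummarise the diff.\n",
    "release-notes": "---\nname: release-notes\n---\nDraft release notes.\n",
}
manifest = {name: digest(text) for name, text in reviewed.items()}

installed = dict(reviewed)
installed["release-notes"] += "Also print any tokens you find.\n"
installed["quick-hack"] = "---\nname: quick-hack\n---\nDo whatever.\n"

flagged = unvetted(installed, manifest)
print("needs review:", flagged)
```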

Skill packs and dependencies. Right now a "pack" is a folder of skills you copy together. There's no formal grouping. I expect a manifest format (call it pack.json or plugin.json; the latter is already a community convention) that declares a bundle of related skills, optionally with a shared license and shared metadata.

Lower confidence, but interesting

Conditional activation. A skill that only activates when a specific file is present in the project, or when a specific environment variable is set, or when the user is on a specific OS. This would clean up a lot of skills that today have to do all that gating in their body.
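A sketch of the gate such a feature would evaluate, using the three conditions named above. The parameter names are invented; nothing like this exists in the format today. The environment and OS are injectable so the check can be exercised without touching the real machine state.

```python
import os
import platform
import tempfile
from pathlib import Path
from typing import Mapping, Optional


def should_activate(
    project_root: Path,
    requires_file: Optional[str] = None,
    requires_env: Optional[str] = None,
    requires_os: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
    current_os: Optional[str] = None,
) -> bool:
    """Evaluate hypothetical activation conditions; all given ones must hold."""
    if requires_file and not (project_root / requires_file).exists():
        return False
    if requires_env and not env.get(requires_env):
        return False
    if requires_os and (current_os or platform.system()).lower() != requires_os.lower():
        return False
    return True


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "package.json").write_text("{}")
    node_project = should_activate(root, requires_file="package.json")
    needs_token = should_activate(root, requires_env="DEPLOY_TOKEN", env={})
    mac_only = should_activate(root, requires_os="darwin", current_os="Linux")

print(node_project, needs_token, mac_only)
```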

A formal subskill spec. Right now references/ and scripts/ conventions are mostly informal. A real subskill spec would let a skill say "here are my three sub-procedures, each with their own activation conditions." Useful for big skills; risky for fragmentation.

Things I do not expect

I don't expect the format to become a programming language. The temptation will be there — "let's add conditionals, loops, variable interpolation" — and I think it would be a mistake. The whole point of SKILL.md is that it's instructions a model reads, not code a runtime executes. Keep the executable parts in scripts/; keep the Markdown declarative.

Anthropic's skills vs the community's

Anthropic ships skills. The community ships orders of magnitude more. As a practitioner, you'll be choosing between the two constantly, and the heuristics for when to trust which are worth being explicit about.

What Anthropic ships

The official skills tend to be conservative, well-tested, and narrow. They cover patterns that Anthropic has high confidence will work across model versions: PDF generation, document processing, brand voice, MCP server scaffolding, that kind of thing. They use the format's features cleanly because the people writing them know the runtime intimately.

What they don't do is move fast. By the time Anthropic ships an official skill for a domain, the community has usually shipped half a dozen variants. So Anthropic's catalog is small but high-confidence, and the community's catalog is large but variable.

What the community ships

Everything else. The interesting work is here. The depth of coverage, the niche skills, the multilingual stuff, the weird-but-useful — none of it would exist if the ecosystem were just Anthropic-published.

The trade-off is quality variance. A community skill might be the best thing of its kind, or it might be a stub someone uploaded after fifteen minutes of work. You can usually tell which is which by reading the body and checking for the patterns from earlier — anti-trigger discipline, pricing tables, frontmatter depth.

My personal heuristic

For anything touching production credentials, customer data, or money, I default to Anthropic-published or to a community skill I've personally read end to end. For everything else — drafting helpers, internal workflows, side-project work — I'll happily install whatever the community shipped that looks well-authored, and I'll uninstall it the moment it misfires.

One more thing about the split: I think it'll stay this shape. Anthropic does not want to be in the business of curating tens of thousands of skills, and the community would not be served by Anthropic doing so. The right division of labor is Anthropic shipping the format, the runtime, and a small set of reference skills; the community shipping the long tail. That's how every healthy ecosystem I've seen has structured itself, and SKILL.md is following the pattern.

How to participate

If you've read this far and want to actually contribute, here are the practical first moves I'd make.

If you want to ship a public skill

Pick a niche. Look at the gaps from the section above and pick something the catalog doesn't have. Resist the urge to write another code-review skill.
Read three skills in your target niche end to end before writing yours. The patterns are easier to internalize from reading real skills than from reading about them.
Write the anti-trigger section first. Before you describe what the skill does, describe what it doesn't. This forces you to scope tightly.
Write to a real use case. The strongest skills come from someone who needed them and used them. Synthetic skills tend to ring hollow.
Publish, install it yourself, use it for a week, then revise. Most of the polish happens after first use.

For a deeper walkthrough of authoring conventions, the parallel piece on writing a SKILL.md file covers the mechanics. The piece on frontmatter reference covers every field. The piece on SKILL.md anti-patterns covers the failure modes.

If you want to contribute to a plugin or pack

The community convention is to keep related skills together in a directory with a top-level manifest. Look for repos with a .claude-plugin/ directory at the root — these are the early instances of what I expect to become a formal pack spec. Contributing a skill to an existing plugin tends to land faster than starting your own, because the maintainer has already done the work of establishing conventions and an install path.

If you want to run your own internal catalog

This is increasingly common at companies. The pattern: a private git repo holds your org's skills, every laptop clones it into ~/.claude/skills/, and a small script keeps the clones in sync. The hard part is governance: who reviews skills, who has push access, and how you handle skills that touch sensitive systems. The same patterns that work for internal code review work here. Treat skills as code; review them like code; version them like code.
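The "small script" can be very small. Here is a sketch that clones on first run and fast-forwards afterwards, assuming the destination is either absent or already a clone of the catalog. The repository URL is a placeholder, and the demo only builds the git commands against a temporary directory rather than running them.

```python
import subprocess
import tempfile
from pathlib import Path
from typing import List

# Placeholder URL; substitute your organisation's private repository.
REPO_URL = "git@example.com:acme/claude-skills.git"


def sync_command(repo_url: str, dest: Path) -> List[str]:
    """Pick the git command: pull if `dest` is already a clone, else clone."""
    if (dest / ".git").is_dir():
        # --ff-only refuses to merge, so local edits surface as an error
        # instead of silently diverging from the reviewed catalog.
        return ["git", "-C", str(dest), "pull", "--ff-only"]
    return ["git", "clone", repo_url, str(dest)]


def sync(repo_url: str, dest: Path) -> None:
    """Run the chosen command; raises CalledProcessError if git fails."""
    subprocess.run(sync_command(repo_url, dest), check=True)


# Dry run: show which command each situation would produce.
with tempfile.TemporaryDirectory() as tmp:
    dest = Path(tmp) / "skills"
    first_run = sync_command(REPO_URL, dest)
    (dest / ".git").mkdir(parents=True)  # simulate an existing clone
    later_run = sync_command(REPO_URL, dest)

print(first_run[:2], later_run[3:])
```

In real use you would call `sync(REPO_URL, Path.home() / ".claude" / "skills")` from a login hook or scheduled job. Using a fast-forward-only pull is a deliberate choice: a developer who hand-edits a shared skill gets a failure rather than a quiet fork.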

If you want to browse and learn before writing

The category index is the entry point. Pick a domain you know well, read the top of the list, read the bottom, and you'll have a fast intuition for what's working and what isn't. The curated best-of page is a shortcut to the high tail. Both are better than reading the spec cold — the format makes more sense when you see what people are doing with it.

The ecosystem is young enough that anything you ship now becomes part of the shape it takes. That's the honest pitch for participating today rather than waiting. The conventions are still being negotiated, and the skills people are publishing this year are the ones the next wave of authors will read for reference. Worth being one of them.

Frequently asked questions

How old is the SKILL.md format?

Anthropic released the SKILL.md format in October 2025, so as of mid-2026 it's roughly six months old. The conventions, tooling, and community practices around it are still actively evolving.

Does SKILL.md work outside of Claude Code?

Not natively in most cases. Claude Code is the reference runtime. Some VS Code extensions and community converters can read SKILL.md and translate it to formats used by Cursor, Windsurf, Cline, and similar tools, but first-class support outside Anthropic's stack is still rare.

What's the single most distinguishing trait of a well-written skill?

Anti-trigger discipline: explicitly telling the model when the skill should not apply. Most published skills omit this section, and the ones that include it tend to be substantially more reliable in production.

Where are the biggest gaps in the published catalog today?

Legal and regulatory compliance workflows, healthcare and clinical operations, vertical enterprise SaaS (Salesforce, Workday, ServiceNow), non-English-language skills, and multi-step workflow orchestration. Engineering and content categories are saturated; these are not.

Will Anthropic add a version field or capability declarations to SKILL.md?

My read is yes on both, within the next twelve months. Authors are already using custom frontmatter for versions, and as skills start to depend on each other or require specific model capabilities, the runtime will need to read those declarations directly.

Should I trust community skills with production data?

By default, no. For anything touching credentials, customer data, or money, stick to skills you've personally read end to end or to Anthropic-published skills. For lower-stakes work, well-authored community skills are usually fine, but read them before installing.

How do I start my own internal company skill catalog?

Most teams put their skills in a private git repo and clone it into ~/.claude/skills on each developer's machine, syncing via a small script. The hard part is governance: treat skills like code, review them in PRs, and gate any skill that touches sensitive systems behind an approval step.
