Claude Code Skills·Claude Skills·The open SKILL.md registry for Claude
ClaudSkillsLearn › Security scan

We scanned 97,000 Claude Code skills for security risks

Data journalism · ClaudSkills · 2026-06-12 · reproducible from the public dataset

A Claude Code skill is just a Markdown file — but its instructions run inside your agent, with your permissions. So we ran a static security scan over all 97,633 skills in the catalog, grading each against the ten threat categories of the OWASP Agentic Skills Top 10. The headline is good news: the open SKILL.md corpus is overwhelmingly clean.

92.1%
completely clean — zero findings of any category
99.6%
earned an A grade (90–100)
31
skills with a Critical pattern (0.03%)
1,168
skills with a High pattern (1.2%)
Why this matters. The agent-skill supply chain has a documented malware problem. A widely-cited 2026 supply-chain study of a low-barrier agent-skill registry found prompt-injection or unsafe patterns in roughly a third of listings. Against that backdrop, an open, GitHub-sourced corpus with a structural admission gate landing at 92% completely clean is a strong result — and a signal that curation and an admission threshold do real work.

The full breakdown

Each skill was scanned for ten categories of risk pattern. A skill can trigger more than one, so the percentages below sum to more than the ~8% of skills that have any finding. Here's every category across the whole catalog:

Threat categorySkillsShareReading
Filesystem ops3,6373.7%Mostly benign — writing files is ordinary work
Network references2,1452.2%Hardcoded IPs/URLs; low severity
Supply chain9321.0%Mostly curl … | sh install lines
Execution8130.8%eval/subprocess/os.system
Obfuscation3410.3%Base64/hex — often legitimate encoding
Persistence3170.3%Cron / launchctl / shell-rc writes
Prompt injection1460.1%Instruction-override / jailbreak framing
Credentials1400.1%SSH / AWS / keychain reads
Data exfiltration980.1%Secret-source read piped to network
Reverse shell110.01%The genuinely scary one

The pattern: scary categories are vanishingly rare

The two biggest categories — filesystem and network — are the least alarming. A skill that writes a config file or references an API endpoint is doing exactly what skills are for. The categories that genuinely matter — reverse shells, data exfiltration, credential reads — sit at the very bottom of the table, together accounting for fewer than 250 skills out of 97,633. The danger in the agent-skill ecosystem is real, but in the open corpus it is concentrated in a tiny tail, not spread across the catalog.

Prose vs. code fences: the insight that shaped the grade

An early version of the scan over-flagged. The reason: a security skill that documents a reverse shell inside a ``` code fence (as an example of what to detect) is not dangerous — but a skill whose prose instructs the agent to open one is. The agent acts on the prose; the fence is illustration. So the grader penalises a pattern inside a code fence at half weight. That single distinction moved hundreds of defensive-security and CTF skills out of the false-positive bucket and is the main reason the final numbers are trustworthy rather than alarmist.

The 31 Criticals

Thirty-one skills carry a Critical-severity pattern in their prose — a reverse shell, an environment dump piped to a URL, or a read of ~/.ssh sent outbound. Some are legitimate red-team and detection-engineering skills whose job is to discuss these exact techniques; others warrant a closer look. They're being hand-reviewed before any are gated out of the one-click install client — a static scan flags candidates, a human makes the call. Every one of them is still browsable, and every one shows its grade openly so you can judge for yourself.

Methodology & reproducibility

The scanner is a pure static pass over each skill's SKILL.md — no network, no execution. Rules are regular-expression patterns grouped into the ten categories above, each tagged with a severity (Critical −25 / High −15 / Medium −8 / Low −3) and a confidence level. Penalties are halved inside code fences and halved again for low-confidence rules, summed, and subtracted from a starting score of 100; the result maps to A–F. Full methodology, including the grade scale and the honest limits of static analysis, lives at claudskills.com/security/. Every grade is published on its skill page and in the open dataset (CC BY 4.0), so the figures in this article are independently reproducible. Counts reflect the catalog as of 2026-06-12 and shift slightly as the catalog grows.

Share on X Share on LinkedIn Submit to HN

FAQ

How many Claude Code skills have security risks?
Of 97,633 scanned, 92.1% were completely clean and 99.6% earned an A. Only 31 (0.03%) had a Critical pattern; 1,168 (1.2%) had a High pattern, most commonly a curl … | sh install line.
What was the most common finding?
Filesystem operations (3.7%) and network references (2.2%) — both mostly benign. The dangerous categories (reverse shells, exfiltration, credential reads) are at the bottom of the table.
Is the open corpus safer than other registries?
On these numbers, substantially. Low-barrier registries studied in 2026 showed unsafe patterns in ~a third of listings; the curated open corpus shows 92% completely clean.
Can I see the grade before installing?
Yes — a free A–F grade sits next to every skill's title. Methodology at /security/.

Related reading