Claude Code Skills·Claude Skills·The open SKILL.md registry for Claude

RAG Evaluator

Category: Security  ·  Sub-category: red-team  ·  Last updated:
ai:rag
Generates tailored giskard.checks evaluation suites for RAG (Retrieval-Augmented Generation) systems. Use whenever a user describes a Q&A bot grounded in documents, a knowledge-base chatbot, a retrieval system, or wants to evaluate answer groundedness, faithfulness, hallucination, retrieval quality, citation accuracy, or out-of-scope handling. Triggers on phrases like "evaluate my RAG", "test my retrieval", "check groundedness", "build a RAG eval suite", "eval my chatbot answers from docs", "test if my agent hallucinates", "check if my answers are faithful to the sources", or any evaluation task involving an agent that answers from documents, FAQs, wikis, or a knowledge base. Use this skill even when the user does not explicitly say "RAG" but describes an agent grounded in documents. For adversarial / red-teaming evaluation, use the `scenario-generator` skill instead. This skill focuses on quality, not safety.

From the source SKILL.md

You are an expert RAG evaluation engineer. Your job is to help users build comprehensive, quality-focused evaluation suites for RAG (Retrieval-Augmented Generation) systems using the giskard.checks Python library.

What this skill does

RAG Evaluator is a community-contributed Claude Code skill in the red-team sub-category. It ships as a SKILL.md file that Claude Code auto-discovers under ~/.claude/skills/rag-evaluator/ and loads when your prompt matches the skill's trigger.

When to invoke it: whenever a user describes a Q&A bot grounded in documents, a knowledge-base chatbot, or a retrieval system, or wants to evaluate answer groundedness, faithfulness, hallucination, retrieval quality, citation accuracy, or out-of-scope handling. The skill also applies when the user never says "RAG" but describes an agent that answers from documents, FAQs, wikis, or a knowledge base. For adversarial or red-teaming evaluation, use the `scenario-generator` skill instead.

Who uses this skill

The RAG Evaluator Claude Code skill is built for security engineers, penetration testers, DevSecOps practitioners, and development teams hardening codebases and infrastructure. It's part of ClaudSkills (also referred to as Claude Skills or Claude Code Skills) — the open community-curated registry of 69,000+ SKILL.md files for Anthropic's Claude Code agent and the wider Claude ecosystem (Claude API, Claude Agent SDK).

How to install

Free

Manual install (2 steps)

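# Step 1: create the skill directory
mkdir -p ~/.claude/skills/rag-evaluator
# Step 2: download SKILL.md into it (-f fails on HTTP errors instead of saving an error page)
curl -fL https://claudskills.com/skills/rag-evaluator/SKILL.md \
  -o ~/.claude/skills/rag-evaluator/SKILL.md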

Alternatively, download SKILL.md directly and place it in ~/.claude/skills/rag-evaluator/. Claude Code auto-discovers it in the next session.

Skills live at ~/.claude/skills/rag-evaluator/SKILL.md on macOS/Linux, or %USERPROFILE%\.claude\skills\rag-evaluator\SKILL.md on Windows. See the full install guide for step-by-step instructions.
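
To confirm that the manual install put the file where Claude Code looks for it on macOS/Linux, a small check along these lines may help. This is a minimal sketch: the function name skill_installed is hypothetical (it is not part of Claude Code or ClaudSkills), and it only tests that a non-empty file exists at the given path, not that the skill actually loads.

```shell
# Hypothetical helper (not part of Claude Code): report whether a skill file
# exists and is non-empty. Defaults to the rag-evaluator path given above.
skill_installed() {
  skill_file="${1:-$HOME/.claude/skills/rag-evaluator/SKILL.md}"
  if [ -s "$skill_file" ]; then
    echo "installed: $skill_file"
  else
    echo "missing or empty: $skill_file" >&2
    return 1
  fi
}

# Usage (uncomment to check the default location):
# skill_installed
```

A non-zero exit status means the file is absent or empty, in which case re-running the two install steps is the simplest fix.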

Pro

One-click install via the desktop app

The ClaudSkills desktop app installs any skill directly into ~/.claude/skills/ with one click — no terminal required. Pro starts at $9/mo or $149 lifetime.
