Claude Code Skills · Claude Skills · The open SKILL.md registry for Claude
Testing (Page 22 of 41)

2,448 Claude Code skills in the Testing sub-category of Engineering.

2,448 skills · updated 2026-06-12 · showing 1261–1320 of 2,448 by quality score

Use when adding or enforcing lint rules as part of a test or verification plan. Extends testing-strategy with lint-specific guidance: rule selection, gate placement, failure…
Testing patterns for litefs-py and litefs-django. Use when writing tests, setting up fixtures, understanding test organization, or configuring pytest marks.
Use when building real-service end-to-end tests with fixtures, cleanup, rate limits, and evidence. Triggers:
Comprehensive guide for building functional tools for LiveKit voice agents using the @function_tool decorator.
LLM evaluation harness for accuracy benchmarking. MMLU/HumanEval/MATH eval runners, model-graded scoring, prompt regression testing, and per-skill accuracy tracking.
Master comprehensive evaluation strategies for LLM applications, from automated metrics to human evaluation and A/B testing.
Implement comprehensive evaluation strategies for LLM applications using automated metrics, human feedback, and benchmarking.
Implement comprehensive evaluation strategies for LLM applications using automated metrics, human feedback, and benchmarking.
Implement comprehensive evaluation strategies for LLM applications using automated metrics, human feedback, and benchmarking.
Comprehensive guide to using LLMs throughout the game development lifecycle, from design to implementation to testing. Use when "ai game development, llm game dev, claude game, gpt…
LLM gateway and routing configuration using OpenRouter and LiteLLM. Invoke when: setting up multi-model access (OpenRouter, LiteLLM); configuring model fallbacks and…
LLM inference load testing for throughput and concurrency limits. Token/s benchmarks, concurrent request sweeps, latency-vs-throughput curves, and breaking-point identification.
Creates test documentation (testing-strategy.md, tests/README.md) with Risk-Based Testing philosophy. Use when setting up test strategy for a project.
Executes test tasks (label 'tests') through Todo to To Review with risk-based limits. Use for test task execution. Not for implementation tasks.
Orchestrates test planning pipeline (research → manual → auto tests). Coordinates ln-511, ln-512, ln-513. Invoked by ln-500-story-quality-gate.
Performs manual testing of Story AC via executable bash scripts saved to tests/manual/. Creates reusable test suites per Story. Worker for ln-510.
Plans automated tests (E2E/Integration/Unit) using Risk-Based Testing after manual testing. Calculates priorities, delegates to ln-301-task-creator. Worker for ln-510.
Analyzes application logs: classifies errors, checks log quality, maps stack traces to source. Use when logs need review after test runs or during development.
Orchestrates test planning pipeline: research, manual testing, automated test planning. Use when Story needs comprehensive test coverage planning.
Performs manual testing of Story AC via executable bash scripts in tests/manual/. Use when Story implementation needs hands-on AC verification.
Plans automated tests (E2E/Integration/Unit) using Risk-Based Testing after manual testing. Use when Story needs a test task with prioritized scenarios. — from engineering/testing
Use when auditing the test surface through the evaluation platform with mandatory research, coordinated test audit workers, and structured summaries.
Detects tests validating framework/library behavior instead of project code. Use when auditing test business logic focus.
Validates E2E coverage for critical paths (money, security, data integrity). Risk-based prioritization. Use when auditing E2E test coverage.
Scores each test by Impact x Probability, returns KEEP/REVIEW/REMOVE decisions. Use when auditing test value and pruning low-value tests.
Identifies missing tests for critical paths (money, security, data integrity, core flows). Use when auditing test coverage gaps.
Checks test isolation (API/DB/FS/Time/Network), determinism, flaky tests, order-dependency, anti-patterns. Use when auditing test isolation.
Checks manual test scripts for harness adoption, golden files, fail-fast, config sourcing, idempotency. Use when auditing manual test quality.
Checks test file organization, directory layout, test-to-source mapping, domain grouping, co-location. Use when auditing test structure.
Audits assertion strength and test oracles that prove real defects. Use when finding weak tests that execute code but prove little.
Sets up test infrastructure with Vitest, xUnit, and pytest. Use when adding testing frameworks and sample tests to a project.
Executes all test suites and reports results with coverage. Use when verifying that test infrastructure works after bootstrap.
Executes optimization hypotheses with keep/discard testing loop. Use when applying validated performance improvements.
Replaces custom modules with OSS packages using atomic keep/discard testing. Use when migrating custom code to established libraries.
Plans and runs load and performance tests. Triggers on "test de charge", "load test", "stress test", "performance test", "k6", "JMeter", "Gatling", "be — from…
Creates comprehensive load test plans with realistic scenarios, traffic models, k6 scripts, and success criteria.
Load Test Scenario Planner - Auto-activating skill for Performance Testing. Triggers on: load test scenario planner. Part of the Performance Testing…
Test performance under load. Use when measuring system capacity or optimizing response times.
Execute comprehensive load and stress testing to validate API performance and scalability. Use when validating API performance under load.
Write a load and performance testing plan for a service. Use when asked to create a performance test plan, write load testing documentation, define stress or soak test sc — from…
Manage local Ollama LLM models for development and testing. Use when: running local models, configuring Ollama, switching between fast/quality models, optimizing VRAM usage,…
Local testing setup - start dev server with mock Claude and run tests (unit tests, CLI E2E)
Internationalization (i18n) and localization (l10n) testing for global products including translations, locale formats, RTL languages, and cultural appropriateness.
Use Lockplane for safe database schema management - define schemas in .lp.sql files, validate, and apply with shadow DB testing
Locust Test Creator - Auto-activating skill for Performance Testing. Triggers on: locust test creator. Part of the Performance Testing skill category.
Pure logic and math testing with Vitest. Use for single-point assertions on functions, state transitions, and physics calculations.
Create a minimal working Lokalise example. Use when starting a new Lokalise integration, testing your setup, or learning basic Lokalise API patterns.
Use when a browser-run UI or client claim needs real runtime observation: visual state, DOM, accessibility tree, console, network, CORS, viewport, screenshot, or browser…
End-to-end testing for web applications with Playwright, Cypress, Selenium, and Puppeteer. Use for setting up E2E tests, debugging failures, improving reliability, and…
Performance and load testing with k6, locust, JMeter, Gatling, and artillery. Use for load/stress/spike/soak tests, API and database benchmarking, profiling, p95/p99 latency…
Use when implementation should be driven by test-first verification: a focused executable check can fail before the change, pass after it, and provide evidence for a behavior,…
Test strategy guidance — test pyramid design, coverage goals, categorization, flaky test diagnosis, infrastructure architecture, and risk-based prioritization.
Call Apex methods imperatively from LWC — on button click, lifecycle hooks, or conditional logic. Covers import syntax, cacheable vs non-cacheable, async/await patterns, error…
Use when setting up or reviewing Lightning Web Component unit tests with Jest, including `@salesforce/sfdx-lwc-jest`, wire adapter mocks, imperative Apex mocks, async rerender…
One-shot XCUITest scaffolding for macOS SwiftUI apps. Audits the project, generates ranked TIER-1/2/3 test stubs, suggests accessibility identifiers with batch confirmation, and…
Multi-perspective deliberation (Logos/Pathos/Sophia) for architecture arbitration, trade-offs, Go/No-Go, and strategic decisions. Does not write code.
MailDev is a local SMTP server with a web UI and REST API for capturing application email during development.
MailDev is a local SMTP server with a browser UI for viewing test emails during development. It catches outgoing mail, exposes a REST API, supports attachments and relay options,…
Uses MailHog to capture outbound email in development and test environments through a local SMTP server, browser UI, and JSON API.
Integrate MaintainX API testing into CI/CD pipelines. Use when setting up automated testing, configuring CI workflows, or implementing continuous integration for MaintainX…