Automated E2E testing and error resolution using Playwright MCP integration (Codex skill for /ensemble:playwright-test)
Use when dispatched to author the QA test plan for a game feature. Translates a feature spec into player-journey test cases (happy paths, edge cases, adversarial inputs), grouped…
Domain-specific testing patterns for episodic memory operations. Use when testing episode lifecycle, pattern extraction, reward scoring, or memory retrieval.
Build a standards-compliant EPUB 3 file from a manuscript markdown, cover image, and book_manifest.json using the bundled pandoc-based script.
ERNE — Test-driven development workflow with Jest and React Native Testing Library
Generates Espresso UI tests for Android apps in Kotlin or Java. Espresso runs inside the app process for fast, reliable UI testing.
Bias detection and mitigation, fairness metrics, privacy frameworks, consent models, transparency requirements, and accountability structures for data science practice.
EU AI Act (Regulation EU 2024/1689) compliance specialist. Use when classifying AI systems by risk tier, assessing provider or deployer obligations, evaluating GPAI model…
Evaluation harness for testing agent and skill quality through structured benchmarks, regression tests, and quality scoring.
Run Microsoft's eval-recipes benchmarks to validate amplihack improvements against baseline agents. Auto-activates when testing improvements, running evals, or benchmarking…
Git branch cleanup utility. Lists and deletes branches that have been merged to main. Use when user wants to clean up old branches, delete merged branches, or tidy up their git…
Evaluate RAG systems with hit rate, MRR, faithfulness metrics and compare retrieval strategies. Use when testing retrieval quality, generating evaluation datasets, comparing…
Evaluate skills by executing them across sonnet, opus, and haiku models using sub-agents. Use when testing if a skill works correctly, comparing model performance, or finding the…
Build evaluation frameworks for agent systems. Use when testing agent performance systematically, validating context engineering choices, or measuring improvements over time.
Evaluate agent systems with quality gates and LLM-as-judge. Use when you need to measure component quality or implement quality gates.
Builds repeatable evaluation systems with golden datasets, scoring rubrics, pass/fail thresholds, and regression reports.
Create a minimal working Evernote example. Use when starting a new Evernote integration, testing your setup, or learning basic Evernote API patterns.
Set up efficient local development workflow for Evernote integrations. Use when configuring dev environment, setting up sandbox testing, or optimizing development iteration speed.
Configure Exa CI/CD integration with GitHub Actions and automated testing. Use when setting up automated testing for Exa integrations, configuring CI pipelines, or adding Exa…
Configure Exa CI/CD integration with GitHub Actions and testing. Use when setting up automated testing, configuring CI pipelines, or integrating Exa tests into your build process.
Create a minimal working Exa search example with real results. Use when starting a new Exa integration, testing your setup, or learning basic search, searchAndContents, and…
Implement Exa load testing, capacity planning, and scaling strategies. Use when running performance tests, planning capacity for Exa integrations, or designing high-throughput…
Configure Exa local development with hot reload, testing, and mock responses. Use when setting up a development environment, writing tests against Exa, or establishing a fast…
Configure Exa local development with hot reload and testing. Use when setting up a development environment, configuring test workflows, or establishing a fast iteration cycle with…
Use when running a previously designed distributed-systems test plan against a real or simulated cluster — driving fault injection, workload, chaos scenarios, linearizability /…
Use after the approval gate to dispatch each item in an ApprovedPlan to the pr-test-executor subagent and assemble TestResults. Third stage of the PRoctor pipeline.
Analyzes the variety and depth of assertions across .NET test suites. Use when the user asks to evaluate assertion quality, find shallow testing, identify tests with only — from…
Reference data for .NET test framework detection patterns, assertion APIs, skip annotations, setup/teardown methods, and common test smell indicators across MSTest, xUnit — from…
Performs pseudo-mutation analysis on .NET production code to find gaps in existing test suites. Use when the user asks to find weak tests, discover untested edge cases, c — from…
Detects duplicate boilerplate, copy-paste tests, and structural maintainability issues across .NET test suites.
Deep formal test smell audit based on academic research taxonomy (testsmells.org). Detects 19 categorized smell types — conditional logic, mystery guests, sensitive equal — from…
Analyzes test suites and tags each test with a standardized set of traits (e.g., positive, negative, critical-path, boundary, smoke, regression).
Guides taking or defending U.S. expert witness depositions with Daubert/Frye methodology testing, Rule 26(a)(2) compliance, and Rule 702/703 foundations.
Execute multi-level exploratory testing of the app covering basic functionality, complex operations, adversarial testing, and cross-cutting scenarios. Deeper than /smoke-test.
Advanced exploratory testing techniques with Session-Based Test Management (SBTM), RST heuristics, and test tours.
Choose optimal external AI models for code analysis, bug investigation, and architectural decisions. Use when consulting multiple LLMs via claudish, comparing model perspectives,…
Extract and summarize test failures from logs. Use to quickly understand what tests failed and why.
A framework for discovering non-obvious product solutions by rapidly building and testing radical, opposing versions of a feature's core attributes.
Execute and generate ExUnit tests for Elixir projects with setup callbacks, describe blocks, and async testing support — from FortiumPartners/ensemble
Execute and generate ExUnit tests for Elixir projects with setup callbacks, describe blocks, and async testing support — from engineering/testing
Automation & tooling specialist: browser automation, CI/CD, monorepo, performance testing, feature flags. 23 methodologies.
Write test cases (TCs) on FARE from an existing spec: each AC of a user_story / each flow of a use_case → n TCs following ISTQB techniques (positive / negative / boundary / equivalence_class /…
Run verification for test cases on FARE: write `verify_history` atomically via `update_test_case(verify={...})`, update `verify_status` (passed/failed/blocked/skipped); cross-check task…
Advanced testing patterns: unit, integration, E2E, mocking, fixtures. Triggers: "scrivi test" ("write tests"), "testing strategy", "coverage", "test E2E"
Guidelines for building high-performance APIs with Fastify and TypeScript, covering validation, Prisma integration, and testing best practices
Build production-ready MCP servers using FastMCP framework with proven patterns for tools, resources, prompts, OAuth authentication, and comprehensive testing.
Guide to building stress tests and scenarios for a going-concern forecast (Fortbestehensprognose): base, stress, and worst case, KPIs, triggers for action.
Guide to writing unit tests for Pinia stores and Vue components in PomoHaven using Vitest and Vue Test Utils.
Local E2E debug and test framework for clawd-feishu plugin development. Use when debugging message flow, testing bot responses, verifying Feishu web UI interactions, or performing…
Feynman Technique for learning a topic deeply: explain a concept simply, identify gaps, fill them, then refine. Use when learning something new, testing understanding, or preparing to teach.
Expert guidance for ffuf web fuzzing during penetration testing, including authenticated fuzzing with raw requests, auto-calibration, and result analysis — from ffuf/ffuf
Testing toolkit for the FHIR Writing Clinical Notes specification at connectathons. Use when the user needs to test FHIR DocumentReference write operations, validate conformance…
Develop custom FiftyOne plugins (operators and panels) from scratch. Use when user wants to create a new plugin, extend FiftyOne with custom operators, build interactive panels,…
Make your first Figma REST API call to fetch a file and inspect its node tree. Use when starting a new Figma integration, testing API connectivity, or learning the Figma document…
Load test Figma API integrations and plan for scale. Use when benchmarking API throughput, testing rate limit behavior, or planning capacity for high-volume Figma integrations.
Generate Pest tests for FilamentPHP v4 resources, forms, tables, and authorization
Use when checking if a file is covered by rules, testing path patterns, verifying rules coverage, or when the user asks about rule applicability.
File a GitHub issue for local integration test failures. TRIGGERS: file test bug, report test failure, create bug for test, integration test failed, test failure issue, junit…