Implement a new custom `tally/*` Dockerfile lint rule end-to-end (rule code, overlap research, fix coordination, realistic tests/fixtures, snapshots, and docs).
Multi-source research on a customer question or topic with source attribution. Use when a customer asks something that needs to be verified, investigating whether a bug has been…
Manipulate CSV files - view, filter, sort, convert to JSON/Markdown, get statistics. Use when working with CSV data files.
Multi-source research on a customer question or topic with source attribution. Use when a customer asks something you need to look up, investigating whether a bug has been…
Academic paper review pipeline for computer vision LaTeX papers with professor, peer, and official reviewers
Post-session documentation chain — propagate changes through lab-notebook, journal, architecture, MEMORY, snapshot, and commit.
Research the user's arXiv/repo knowledge base using an MCP retrieval tool called lodestone to answer questions with paper-grounded and repo-grounded citations.
Enhance a plan with parallel research agents for each section to add depth, best practices, and implementation details
Performs thorough deep research on any topic using web search, Context7 docs, and GitHub CLI. Saves structured results to docs/ directory.
Synthesize user research into themes, insights, and recommendations. Use when you have interview transcripts, survey results, usability test notes, support tickets, or NP — from…
Research external context for a dataset — domain background, history, related studies, and why this data matters.
End-to-end workflow for taking a new tool idea from research to working MVP. Use when the user has an idea for a CLI tool, library, or small project and wants to go from concept…
Add a physics lesson — particle dynamics, rigid bodies, collisions, constraints, rendered with SDL GPU
Master orchestrator for the AI-driven development workflow. Use when starting a new User Story / Issue implementation.
Generate NGM Commons directory listings with iterative research and quality review. Uses Perplexity for deep research and Claude Opus for content generation with embedded SVG…
Have an interactive discussion about a topic, approach, or feature. Researches the codebase as needed, talks through options, and updates ./tmp/context.md with decisions.
Use when posting a GitHub investigation issue for an unverified finding, potential gap, or anomaly that needs root-cause analysis before any action is taken.
Use when creating a presentation about a feature, concept, or system. Researches the topic, structures content for accessibility, generates diagrams, and exports polished Marp…
Set up or audit documentation health for any repo. Use 'init' to bootstrap a docs/ structure (plans/, design/, research/) with YAML front matter, agent discovery scripts, and…
This skill should be used when the user asks to "optimize with SIMBA", "use Bayesian optimization", "optimize agents with custom feedback", mentions "SIMBA optimizer", "mini-batch…
Fetch a Bug work item from Azure DevOps/Jira, find the affected component in the codebase, and save triage findings.
Generate an implementation plan with status-tracked steps. Creates implement.md from explain.md + research.md. Uses extended thinking for deep reasoning.
Full requirements pipeline — fetch ADO/Jira story, validate DoR, distill requirements, research codebase, generate team summary.
Run eggNOG-mapper on genome protein FASTA files to generate functional annotations (COG, GO, KEGG, EC, PFAMs).
Select the right price-elasticity estimation method (historical regression / survey / experimental) given data availability, and produce an implementation plan with required N.
Use when the user invokes /evolve-loop or asks to run autonomous improvement cycles, self-evolving development, compound discovery, or multi-cycle code improvement with research,…
Experiment verdict gate — Review LLM independently judges results → 4 verdict paths → auto-update the linked idea's status / failure_reason and graph edges
Full experiment execution pipeline: prepare code → deploy (confirm with the user before operating and ask the requester to perform a manual inspection) → monitor → collect results,…
Audit experiment integrity before claiming results. Uses cross-model review (GPT-5.4) to check for fake ground truth, score normalization fraud, phantom results, and insufficient…
Turn a product or growth idea into a rigorous experiment brief and post-test readout format.
SSH job queue for multi-seed/multi-config ML experiments with OOM-aware retry, stale-screen cleanup, and wave-transition race prevention.
Analyse A/B test results — significance, CIs, segment cuts, novelty/primacy check, SRM, decision matrix application, and follow-up experiments.
Evaluates judge scoring fairness in competitions by detecting systematic bias (leniency/strictness) and contestant-specific anomalies (favoritism/prejudice) using statistical…
Systematic ablation study runner. After research:run finds improvements, fortify identifies component candidates from git diff + diary, creates isolated git worktrees per ablation…
Incremental installer for frontend tools and libraries in existing projects. Needs-driven flow with curated tier-1 modules and Context7 + WebSearch research fallback for the long…
Search research papers via Gemini for broad literature discovery. Use when user says "gemini search", "gemini papers", "search with gemini", or wants AI-powered literature…
Researches domain experts and generates standardized persona files for use with /persona
Generate a comprehensive summary report of the latest experiment including metrics, plots, and comparison with baseline.
Run metric-driven iterative optimization loops. Define a measurable goal, build measurement scaffolding, then run parallel experiments that try many approaches, measure e — from…
Create structured plans for any multi-step task -- software features, research workflows, events, study plans, or any goal that benefits from structured breakdown.
[BETA] Execute work with external delegate support. Same as gh:work but includes experimental Codex delegation mode for token-conserving code implementation.
Lightweight version control for agent-generated document changes. Tracks iterative refinement without creating v1, v2, v3 copies.
Show GPU availability across all SSH servers listed in this project's CLAUDE.md. Use when user says "check GPUs", "which GPUs are free", "gpu status", "GPU 状态", or needs to know…
Analyse Google Search Console performance data — clicks, impressions, CTR, and position deltas with statistical significance bands, winners/losers segmentation, and recommended…
Orchestrate milestone planning with parallel research for subsequent project phases
QA specialist of haipipe-probe. Three complementary checks. (1) STRUCTURAL: audits run quality (per-run sanity) and probe quality (statistical claim integrity) via checklists,…
Produce a reliable, high-stakes-grade result for ANY task — code, writing, research, ops — by treating the work as guilty until independent skeptics fail to break it.
Turn a signal into testable hunt hypotheses, scope, datasets, and success criteria
Structured hypothesis formulation, experiment design, and results interpretation for Product Managers.
Research Idea Generator - Generate novel research ideas based on mentor skills + latest arXiv papers, with automatic novelty verification.
Pushes interfaces past conventional limits with technically ambitious implementations — shaders, spring physics, scroll-driven reveals, 60fps animations.
Generate a monthly incident recap page in Notion, following the standard Productboard SRE Guild format.
Mad House project lifecycle manager. Create, list, stage, promote, ship, and archive projects in ~/dev/mad-house/lab.
Industrial AI literature research with mandatory intake questions, venue-aware source prioritization, structured report outputs, and survey draft generation.
Research and tier industries by fit with your profile. Identifies where your skills are most valued and hiring trends are strongest.
Design a single validation experiment. One assumption, one experiment, one numeric threshold defined upfront.
Run the Vibe Innovation Framework red team protocol on any artifact (problem statement, concept, business model, experiment design, prototype, decision).
Use when the user invokes /inspirer or asks to brainstorm creatively, think outside the box, explore unconventional approaches, break out of stagnation, or generate…
Writes academic prose interpreting regression output. Use when describing estimation results in manuscript-ready language.
Generate a long-form Chinese interview-prep cheat sheet on a specific ML/LLM topic: formulas with derivations, from-scratch PyTorch code, comparison tables, and 25 high-frequency interview questions (L1 must-know /…