Run eval scenarios to benchmark Mycelium's effectiveness. Executes tasks using a reflexion loop, validates them against success criteria, and records metrics.
Use to evaluate whether current work aligns with Better Value Sooner Safer Happier. Run at diamond completion and periodically.
Evaluate Mycelium's own process effectiveness. Measures cycle velocity, discard trends, confidence calibration, gate effectiveness, and regression rate.
Record findings from completed offline human tasks (interviews, observations, outreach) back into the canvas. The re-entry point after /handoff.
Use when starting implementation on a new or unfamiliar codebase. Auto-detects tech stack and sets up development context.
Guide for conducting Torres-style story-based user interviews with bias mitigation and JTBD lens.
Update canvas sections with new evidence. Ensures canvas stays current as the single source of truth.
Progress a diamond from one phase to the next. Runs all required theory gate checks, validates evidence, and at Deliver->Complete runs the executable Definition of Done checklist.
Use when real user interviews aren't possible (solo/hobby/dogfood projects) but persona work is still needed.
Use to assess Privacy by Design compliance and GDPR/data protection alignment for a feature or system.
Evaluate user-facing interfaces against Nielsen's 10 Usability Heuristics. Complements /service-check (Downe = service-level quality, Nielsen = interface-level quality).
Map user needs independently of solutions using Allen's User Needs Mapping methodology. Identifies underserved needs that feed into the Opportunity Solution Tree.
Use before any research activity or significant decision. Reviews cognitive biases relevant to the current stage.
Pull snapshots from all configured metric sources, compute deltas against prior snapshots, flag unexplained signals, and draft evidence entries for canvas files.
Parallel agent orchestration for OST exploration. Fans out multiple solution explorations, then fans in the results to compare and select winners.
Design the smallest viable test to validate or invalidate a critical assumption. Based on Torres's assumption testing framework, organized by Gilad's AFTER model (Assessment →…
Assess delivery health metrics. For software: DORA + APEX. For content/AI/service products: product-type-appropriate metrics.
GIST planning workflow. Structure goals into ideas, steps, and tasks using Gilad's evidence-guided framework.
Synchronize canvas state across team sessions via git. Ensures all team members see the same product knowledge.
Use to build or update an Opportunity Solution Tree from research data, never from brainstorming.
Create or update a Wardley Map of the value chain. Maps user needs, components, evolution stages, and strategic gameplay.
Lint canvas files for staleness, missing fields, inconsistent evidence types, and orphaned references. Run periodically or before major transitions.
Use to evaluate the current state of a diamond. Checks theory gates, confidence levels, and recommends next action.
Classify releases into launch tiers and plan go-to-market. Based on Lauchengco's Loved framework.
Migrate a Mycelium project from legacy install (npx-degit, framework files in .claude/) to plugin install (framework lives in plugin cache, .claude/ holds project state only).
Aggregate feedback signals across all active loops. Reports health, trajectory, overdue checks, regression warnings, and Goodhart's Law violations.
Assess team structure against Skelton's Team Topologies. Evaluate cognitive load, interaction modes, and Conway's Law alignment.
Use to prioritize solutions or opportunities using ICE scoring with evidence-backed confidence.
Map user Jobs to be Done across functional, emotional, and social dimensions. Based on Christensen's JTBD theory.
Use when facing a new problem to classify its domain (Clear, Complicated, Complex, Chaotic, Confused) and select appropriate methods.
Use to assess regulatory applicability for products that may fall under AI regulation (EU AI Act, Article 50 transparency).
Explainability (XAI) audit for products containing AI components. Five-stage tier-scaled check: risk classification, stakeholder×question matrix, fidelity audit, system card,…
Use to evaluate a service or feature against Downe's 15 principles of good services.
Detect which external metric sources apply to this product (GitHub, Plausible, Stripe, etc.) and configure adapters.
Use to analyze correction trends, surface recurring patterns, and graduate repeat corrections to guardrails or anti-patterns.
Accessibility audit against WCAG 2.1 AA. Checks semantic HTML, ARIA, keyboard navigation, color contrast, screen reader compatibility.
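
Of the checks listed for the accessibility audit, color contrast is the one that reduces to a fixed formula. The sketch below computes the WCAG 2.1 contrast ratio from two hex colors and compares it with the AA thresholds (4.5:1 for normal text, 3:1 for large text). The function names are illustrative assumptions and are not taken from Mycelium; how the audit itself performs this check is not stated here.

```python
# Illustrative sketch of the WCAG 2.1 contrast-ratio formula.
# Function names are hypothetical, not part of Mycelium.

def _linearize(channel: int) -> float:
    """Convert an 8-bit sRGB channel to its linear value (WCAG 2.1 definition)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """Relative luminance of a '#rrggbb' color."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linearize(r) + 0.7152 * _linearize(g) + 0.0722 * _linearize(b)


def contrast_ratio(foreground: str, background: str) -> float:
    """Contrast ratio (lighter + 0.05) / (darker + 0.05), from 1 to 21."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa(foreground: str, background: str, large_text: bool = False) -> bool:
    """True if the pair meets the WCAG 2.1 AA contrast threshold."""
    threshold = 3.0 if large_text else 4.5
    return contrast_ratio(foreground, background) >= threshold


if __name__ == "__main__":
    for fg in ("#000000", "#767676", "#777777"):
        ratio = contrast_ratio(fg, "#ffffff")
        print(f"{fg} on #ffffff: {ratio:.2f}:1, AA normal text: {passes_aa(fg, '#ffffff')}")
```

A contrast check like this covers only one item of the audit; semantic HTML, ARIA, keyboard navigation, and screen reader compatibility need inspection of the rendered interface and cannot be reduced to a single formula.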