---
name: prosperity-research-session
description: Use when task is to run or resume an iterative IMC Prosperity research loop: generate or refine hypotheses, check prior attempts, evaluate candidate signals or portfolio ideas, compare against baselines, and produce a patch plan or ranked next steps. Trigger for non-trivial research, backtesting, diversification analysis, or "what should we try next?" questions. Do not trigger for tiny local code edits that do not require research or evaluation.
---

Use this skill to run a durable research session for IMC Prosperity work.

Public-release note: this skill is the workflow glue for the repo. It tells an
agent how to use the local MCP server, how to recover from long-running work,
and how to avoid mistaking stale or cached evidence for fresh research.

Core behavior:
- Treat invocation of this skill as an explicit request to use subagents for non-trivial work.
- For non-trivial tasks, spawn bounded subagents early for parallel exploration, review, or verification.
- Keep final synthesis and implementation decisions in the main thread.
- Use the `prosperityResearch` MCP server as the primary interface for session orchestration when available.
- If the MCP server is not configured yet, fall back to `python scripts/prosperity_research_cli.py ...`.
- If skill discovery misses this repo-local skill, use this file at `.agents/skills/prosperity-research-session/SKILL.md` directly, and note that a user-level install may still be needed.

When this skill is active:
1. Smoke-test the MCP server with `tools/list` or `prosperityResearch.start_or_resume_session`; verify that a session manifest exists before falling back to the CLI.
2. Inspect current frontier with `prosperityResearch.get_session_status` and `prosperityResearch.get_top_candidates`.
3. Before long alpha work, read `prosperityResearch://alpha_autoresearch_protocol`; use it as source of truth for OOS splits, keep/discard policy, lane policy, and continuous agent budget loop.
4. If frontier is weak or stale, continue loop with `prosperityResearch.continue_session` or `prosperityResearch.run_alpha_autoresearch_loop`.
5. For alpha autoresearch, pass explicit budget controls:
   - `iteration_runtime_ms`
   - `max_total_runtime_seconds`
   - `stale_operation_timeout_seconds`
   - `response_mode: "compact"` or `"artifact_only"` for long sweeps
6. MCP auto-detaches `continue_session` and `run_alpha_autoresearch_loop` when `detach` is omitted. Poll with `prosperityResearch.get_compact_session_status` or `prosperityResearch.get_session_status`; inspect `active_operation`, `output_path`, `cancel_path`, and artifact links before relaunching.
7. For ROUND_3/ROUND_4, replay concrete generated candidate labels only. Do not spend research budget replaying the current `algorithm.py` as an implicit fallback.
8. If there is a strong candidate, request a patch plan with `prosperityResearch.request_patch_plan`.
9. Summarize:
   - what was tried
   - what looks promising
   - what was rejected
   - what should be implemented next
   - what should be verified before implementation
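The budget controls in step 5 can be sketched as a small payload builder. The field names come from this skill's own list; the validation logic and any default values are illustrative assumptions, not documented server behavior.

```python
def build_autoresearch_payload(
    iteration_runtime_ms: int,
    max_total_runtime_seconds: int,
    stale_operation_timeout_seconds: int,
    response_mode: str = "compact",
) -> dict:
    """Assemble explicit budget controls for run_alpha_autoresearch_loop.

    Field names follow step 5 above; concrete values are caller-chosen.
    """
    # "compact" and "artifact_only" are the two response modes named in step 5.
    if response_mode not in ("compact", "artifact_only"):
        raise ValueError("response_mode must be 'compact' or 'artifact_only'")
    return {
        "iteration_runtime_ms": iteration_runtime_ms,
        "max_total_runtime_seconds": max_total_runtime_seconds,
        "stale_operation_timeout_seconds": stale_operation_timeout_seconds,
        "response_mode": response_mode,
    }
```

Passing all four fields explicitly on every call keeps long sweeps bounded even if server-side defaults change.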

CLI fallback:
- Use `python scripts/prosperity_research_cli.py <tool> --input-file <payload.json> --output <result.json>`.
- Prefer explicit input/output files for long loops so transport timeouts do not hide completed artifacts.
- Include the same budget fields as in MCP payloads. Use stable input/output files, and inspect session artifacts before retrying after a timeout.
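A minimal sketch of the fallback invocation, assuming only the CLI shape shown above: the payload is written to a stable input file so that, after a timeout, a retry can inspect the prior output file instead of rerunning blindly. The file-naming convention here is an illustrative choice, not a repo requirement.

```python
import json
from pathlib import Path


def build_cli_invocation(tool: str, payload: dict, workdir: Path) -> list[str]:
    """Write the payload to a stable input file and return the argv for the
    research CLI fallback described above."""
    # Stable, tool-derived paths (naming scheme assumed) survive retries.
    input_file = workdir / f"{tool}.input.json"
    output_file = workdir / f"{tool}.output.json"
    input_file.write_text(json.dumps(payload, indent=2))
    return [
        "python", "scripts/prosperity_research_cli.py", tool,
        "--input-file", str(input_file),
        "--output", str(output_file),
    ]
```

The returned argv can be passed to `subprocess.run`; if the process times out, check whether `output_file` was still written before relaunching.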

Autoresearch/time-budget behavior:
- Treat the wall-clock budget as real. Do not stop early merely because a current best candidate exists; stop only when the user budget is exhausted, the user cancels, or an explicit stop rule fires.
- `run_alpha_autoresearch_loop` is fixed-budget orchestration. It persists `iteration_runtime_ms` as per-iteration subprocess budget and respects `max_total_runtime_seconds`.
- Generator lane should synthesize or evaluate fresh generated labels before replay.
- Evaluation lane should run R3/R4 replay batches against explicit candidate labels.
- Algorithm experiment lane is optional and bounded. Use `run_algorithm_autoresearch_experiments` with `agent_recipes[]`, `research_program`, and `fixed_time_budget_seconds` when the agent proposes direct `algorithm.py` changes.
- Continuous Karpathy-style autoresearch requires the outer agent to repeatedly propose new `agent_recipes[]`, validate under a fixed budget, read the artifact/ratchet, update the hypothesis, and repeat until the remaining budget is below the next validation slice. A static recipe catalog is not enough.
- Minimum loop for any multi-hour budget:
  1. Set a deadline from user time budget.
  2. Draft fresh non-duplicate `agent_recipes[]` from current artifacts, losses, and lessons.
  3. Call `prosperityResearch.run_algorithm_autoresearch_experiments` with `fixed_time_budget_seconds <= remaining_time`.
  4. Read returned `artifact_path`; inspect `ratchet`, `best_variant`, `errors`, and discarded variants.
  5. Generate next recipe from evidence. Do not repeat recipe after a dedup/cache hit.
  6. Continue until remaining time is below one validation slice or user cancels.
- Stop/relaunch when generated labels are empty, replay fingerprints repeat, or algorithm-cache fingerprints repeat; widen parents, recipes, or candidate generation before spending more replay budget.
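The minimum loop and stop rules above can be sketched as an outer driver. Both callables are stand-ins: `draft_recipes` represents the agent drafting fresh `agent_recipes[]` from evidence, and `run_experiments` represents a call to `prosperityResearch.run_algorithm_autoresearch_experiments`; the artifact's `fingerprint` key is an assumed shape for the dedup/cache check, not a documented field.

```python
import time


def autoresearch_loop(draft_recipes, run_experiments,
                      total_budget_seconds, slice_seconds):
    """Fixed-budget outer loop per the steps above.

    draft_recipes(evidence, seen) -> list of fresh recipe dicts (agent-supplied)
    run_experiments(recipes, budget_seconds) -> artifact dict; a 'fingerprint'
        key is assumed here to model dedup/cache hits.
    """
    deadline = time.monotonic() + total_budget_seconds
    seen_fingerprints, evidence = set(), []
    # Continue only while a full validation slice still fits in the budget.
    while time.monotonic() + slice_seconds <= deadline:
        recipes = draft_recipes(evidence, seen_fingerprints)
        if not recipes:
            break  # no fresh recipes: widen parents/candidates, then relaunch
        remaining = deadline - time.monotonic()
        artifact = run_experiments(recipes, min(slice_seconds, remaining))
        fp = artifact.get("fingerprint")
        if fp in seen_fingerprints:
            break  # dedup/cache hit: stop rather than repeat a recipe
        seen_fingerprints.add(fp)
        evidence.append(artifact)  # keep negative evidence too
    return evidence
```

Each iteration re-reads the returned artifact before drafting the next recipe, so the hypothesis update in step 5 is driven by evidence rather than the static catalog.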

Subagent policy:
- Spawn one bounded subagent per clearly separable investigative workstream for non-trivial work.
- Good subagent roles:
  - history reviewer
  - candidate reviewer
  - verification reviewer
  - code impact reviewer
- Each subagent must return:
  - scope inspected
  - key findings
  - risks or caveats
  - recommended next step

Output contract:
- Always return concise research summary.
- If there is a strong candidate, include:
  - candidate name
  - why it is promising
  - expected edge or diversification contribution
  - major risks
  - implementation outline
- If there is not yet a strong candidate, include:
  - top rejected ideas and why
  - top open questions
  - recommended next experiment batch

Guardrails:
- Do not claim a strategy is good without evaluation evidence from the research session.
- Do not recommend implementation before checking prior attempts and current frontier.
- Prefer small, testable iterations over broad rewrites.
- Do not give a final answer while a user-provided research time budget remains and the MCP is still able to validate more `agent_recipes[]`.
- After an MCP transport timeout, inspect the session manifest and artifacts before retrying; long runs may have completed and persisted evidence.
- Keep negative evidence. Discard means no robust frontier advance, not deletion.
