---
name: prosperity-research-session
description: Use when task is to run or resume an iterative IMC Prosperity research loop: generate or refine hypotheses, check prior attempts, evaluate candidate signals or portfolio ideas, compare against baselines, and produce a patch plan or ranked next steps. Trigger for non-trivial research, backtesting, diversification analysis, or "what should we try next?" questions. Do not trigger for tiny local code edits that do not require research or evaluation.
---

Use this skill to run a durable research session for IMC Prosperity work.

Public-release note: this skill is the workflow glue for the repo. It tells an
agent how to use the local MCP server, how to recover from long-running work,
and how to avoid mistaking stale or cached evidence for fresh research.

Core behavior:
- Treat invocation of this skill as an explicit request to use subagents for non-trivial work.
- For non-trivial tasks, spawn bounded subagents early for parallel exploration, review, or verification.
- Keep final synthesis and implementation decisions in the main thread.
- Use the `prosperityResearch` MCP server as the primary interface for session orchestration when available.
- If the MCP server is not configured yet, fall back to `python scripts/prosperity_research_cli.py ...`.
- If skill discovery misses this repo-local skill, use this file at `.agents/skills/prosperity-research-session/SKILL.md` directly, and note that a user-level install may still be needed.

When this skill is active:
1. Smoke-test the MCP server with `tools/list` or `prosperityResearch.start_or_resume_session`; verify that a session manifest exists before falling back to the CLI.
2. Inspect current frontier with `prosperityResearch.get_session_status` and `prosperityResearch.get_top_candidates`.
3. Before long alpha work, read `prosperityResearch://alpha_autoresearch_protocol`; use it as source of truth for OOS splits, keep/discard policy, lane policy, and continuous agent budget loop.
4. If frontier is weak or stale, continue loop with `prosperityResearch.continue_session` or `prosperityResearch.run_alpha_autoresearch_loop`.
5. For alpha autoresearch, pass explicit budget controls:
   - `iteration_runtime_ms`
   - `max_total_runtime_seconds`
   - `stale_operation_timeout_seconds`
   - `response_mode: "compact"` or `"artifact_only"` for long sweeps
6. MCP auto-detaches `continue_session` and `run_alpha_autoresearch_loop` when `detach` is omitted. Poll with `prosperityResearch.get_compact_session_status` or `prosperityResearch.get_session_status`; inspect `active_operation`, `output_path`, `cancel_path`, and artifact links before relaunching.
7. For ROUND_3/ROUND_4, replay concrete generated candidate labels only. Do not spend research budget replaying the current `algorithm.py` as an implicit fallback.
8. If there is a strong candidate, request a patch plan with `prosperityResearch.request_patch_plan`.
9. Summarize:
   - what was tried
   - what looks promising
   - what was rejected
   - what should be implemented next
   - what should be verified before implementation
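The budget controls in step 5 can be sketched as a small payload builder. The field names come from this skill's own list; the validation logic and any default values are illustrative assumptions, not documented server behavior.

```python
def build_autoresearch_payload(
    iteration_runtime_ms: int,
    max_total_runtime_seconds: int,
    stale_operation_timeout_seconds: int,
    response_mode: str = "compact",
) -> dict:
    """Assemble explicit budget controls for run_alpha_autoresearch_loop.

    Field names follow step 5 above; concrete values are caller-chosen.
    """
    # "compact" and "artifact_only" are the two response modes named in step 5.
    if response_mode not in ("compact", "artifact_only"):
        raise ValueError("response_mode must be 'compact' or 'artifact_only'")
    return {
        "iteration_runtime_ms": iteration_runtime_ms,
        "max_total_runtime_seconds": max_total_runtime_seconds,
        "stale_operation_timeout_seconds": stale_operation_timeout_seconds,
        "response_mode": response_mode,
    }
```

Passing all four fields explicitly on every call keeps long sweeps bounded even if server-side defaults change.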

CLI fallback:
- Use `python scripts/prosperity_research_cli.py <tool> --input-file <payload.json> --output <result.json>`.
- Prefer explicit input/output files for long loops so transport timeouts do not hide completed artifacts.
- Include the same budget fields as in MCP payloads. Use stable input/output files, and inspect session artifacts before retrying after a timeout.
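A minimal sketch of the fallback invocation, assuming only the CLI shape shown above: the payload is written to a stable input file so that, after a timeout, a retry can inspect the prior output file instead of rerunning blindly. The file-naming convention here is an illustrative choice, not a repo requirement.

```python
import json
from pathlib import Path


def build_cli_invocation(tool: str, payload: dict, workdir: Path) -> list[str]:
    """Write the payload to a stable input file and return the argv for the
    research CLI fallback described above."""
    # Stable, tool-derived paths (naming scheme assumed) survive retries.
    input_file = workdir / f"{tool}.input.json"
    output_file = workdir / f"{tool}.output.json"
    input_file.write_text(json.dumps(payload, indent=2))
    return [
        "python", "scripts/prosperity_research_cli.py", tool,
        "--input-file", str(input_file),
        "--output", str(output_file),
    ]
```

The returned argv can be passed to `subprocess.run`; if the process times out, check whether `output_file` was still written before relaunching.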

Autoresearch/time-budget behavior:
- Treat the wall-clock budget as real. Do not stop early merely because a current best candidate exists; stop only when the user budget is exhausted, the user cancels, or an explicit stop rule fires.
- `run_alpha_autoresearch_loop` is fixed-budget orchestration. It persists `iteration_runtime_ms` as per-iteration subprocess budget and respects `max_total_runtime_seconds`.
- Generator lane should synthesize or evaluate fresh generated labels before replay.
- Evaluation lane should run R3/R4 replay batches against explicit candidate labels.
- Algorithm experiment lane is optional and bounded. Use `run_algorithm_autoresearch_experiments` with `agent_recipes[]`, `research_program`, and `fixed_time_budget_seconds` when the agent proposes direct `algorithm.py` changes.
- Continuous Karpathy-style autoresearch requires the outer agent to repeatedly propose new `agent_recipes[]`, validate under a fixed budget, read the artifact/ratchet, update the hypothesis, and repeat until the remaining budget is below the next validation slice. A static recipe catalog is not enough.
- Minimum loop for any multi-hour budget:
  1. Set a deadline from user time budget.
  2. Draft fresh non-duplicate `agent_recipes[]` from current artifacts, losses, and lessons.
  3. Call `prosperityResearch.run_algorithm_autoresearch_experiments` with `fixed_time_budget_seconds <= remaining_time`.
  4. Read returned `artifact_path`; inspect `ratchet`, `best_variant`, `errors`, and discarded variants.
  5. Generate next recipe from evidence. Do not repeat recipe after a dedup/cache hit.
  6. Continue until remaining time is below one validation slice or user cancels.
- Stop/relaunch when generated labels are empty, replay fingerprints repeat, or algorithm-cache fingerprints repeat; widen parents, recipes, or candidate generation before spending more replay budget.
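The minimum loop and stop rules above can be sketched as an outer driver. Both callables are stand-ins: `draft_recipes` represents the agent drafting fresh `agent_recipes[]` from evidence, and `run_experiments` represents a call to `prosperityResearch.run_algorithm_autoresearch_experiments`; the artifact's `fingerprint` key is an assumed shape for the dedup/cache check, not a documented field.

```python
import time


def autoresearch_loop(draft_recipes, run_experiments,
                      total_budget_seconds, slice_seconds):
    """Fixed-budget outer loop per the steps above.

    draft_recipes(evidence, seen) -> list of fresh recipe dicts (agent-supplied)
    run_experiments(recipes, budget_seconds) -> artifact dict; a 'fingerprint'
        key is assumed here to model dedup/cache hits.
    """
    deadline = time.monotonic() + total_budget_seconds
    seen_fingerprints, evidence = set(), []
    # Continue only while a full validation slice still fits in the budget.
    while time.monotonic() + slice_seconds <= deadline:
        recipes = draft_recipes(evidence, seen_fingerprints)
        if not recipes:
            break  # no fresh recipes: widen parents/candidates, then relaunch
        remaining = deadline - time.monotonic()
        artifact = run_experiments(recipes, min(slice_seconds, remaining))
        fp = artifact.get("fingerprint")
        if fp in seen_fingerprints:
            break  # dedup/cache hit: stop rather than repeat a recipe
        seen_fingerprints.add(fp)
        evidence.append(artifact)  # keep negative evidence too
    return evidence
```

Each iteration re-reads the returned artifact before drafting the next recipe, so the hypothesis update in step 5 is driven by evidence rather than the static catalog.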

Subagent policy:
- Spawn one bounded subagent per clearly separable investigative workstream for non-trivial work.
- Good subagent roles:
  - history reviewer
  - candidate reviewer
  - verification reviewer
  - code impact reviewer
- Each subagent must return:
  - scope inspected
  - key findings
  - risks or caveats
  - recommended next step

Output contract:
- Always return concise research summary.
- If there is a strong candidate, include:
  - candidate name
  - why it is promising
  - expected edge or diversification contribution
  - major risks
  - implementation outline
- If there is not yet a strong candidate, include:
  - top rejected ideas and why
  - top open questions
  - recommended next experiment batch

Guardrails:
- Do not claim a strategy is good without evaluation evidence from the research session.
- Do not recommend implementation before checking prior attempts and current frontier.
- Prefer small, testable iterations over broad rewrites.
- Do not give a final answer while a user-provided research time budget remains and the MCP is still able to validate more `agent_recipes[]`.
- After an MCP transport timeout, inspect the session manifest and artifacts before retrying; long runs may have completed and persisted evidence.
- Keep negative evidence. Discard means no robust frontier advance, not deletion.
