---
name: algo-reproduce
description: "Reproduce core algorithms from research papers. Reads paper from Zotero, extracts key algorithms, implements in Python, runs numerical simulation verification. Use when user says \"复现\", \"reproduce\", \"baseline\", \"算法实现\", \"数值仿真\", \"implement algorithm\", or wants to verify a paper's core method."
argument-hint: [paper-title-or-zotero-key]
allowed-tools: Bash(*), Read, Write, Edit, Grep, Glob, Skill, mcp__zotero-mcp__*
---

# Algorithm Reproduction Workflow

Reproduce the core algorithm from the paper: **$ARGUMENTS**

## Overview

This skill takes a research paper from Zotero, extracts its core algorithm (formulas, pseudocode, key equations), implements it in Python, and runs a numerical simulation to verify correctness.

```
Zotero paper → extract algorithm → Python implementation → numerical verification → comparison report
```

## Constants

- **FRAMEWORK = auto** — auto | pytorch | numpy. Auto-detect from paper context (PyTorch for ML papers, numpy for signal processing).
- **VERIFY_TOY = true** — Run toy example first to verify correctness before full-scale.
- **SAVE_DIR = ./reproduced/<series>/<paper-slug>/** — Output path; the series is determined by the session context.
- **COMPARISON_MODE = table** — table | plot | both. How to present comparison with paper results.

> Override: `/algo-reproduce "FlatQuant" — framework: pytorch`

## Series Context

Output paths follow the `reproduced/<series>/<paper-slug>/` structure. See [.claude/rules/reproduced.md](../rules/reproduced.md) for the detailed specification.

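- Each series is an independent git submodule
- The series is specified by the user during the session; if none is set, the skill must remind the user to choose one
- Paper slug format: `<short-name>-<venue><year>` (e.g., `flatquant-iclr24`)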
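
The slug and output path convention above can be sketched as follows. This is a minimal illustration, not part of the skill itself: the helper names `make_paper_slug` and `save_dir` are hypothetical, and the two-digit year is an assumption inferred from the `flatquant-iclr24` example.

```python
import re
from pathlib import Path


def make_paper_slug(short_name: str, venue: str, year: int) -> str:
    """Build a slug such as 'flatquant-iclr24' (hypothetical helper)."""
    # Lowercase and drop anything that is not a letter or digit so the
    # slug is safe to use as a directory name.
    name = re.sub(r"[^a-z0-9]+", "", short_name.lower())
    ven = re.sub(r"[^a-z0-9]+", "", venue.lower())
    # Assumption: the year is abbreviated to two digits, as in the example.
    return f"{name}-{ven}{year % 100:02d}"


def save_dir(series: str, slug: str, root: Path = Path("reproduced")) -> Path:
    """Return reproduced/<series>/<paper-slug> as a Path; nothing is created."""
    return root / series / slug


slug = make_paper_slug("FlatQuant", "ICLR", 2024)
print(slug)
print(save_dir("llm-quant", slug).as_posix())
```

Keeping path construction in one place makes it harder for later phases to drift away from the `reproduced/<series>/<paper-slug>/` layout.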

## Workflow

### Phase 1: Read Paper from Zotero

1. Search for the paper in Zotero:
   - If `$ARGUMENTS` looks like a Zotero key (e.g., `SQQG5IVT`), use `mcp__zotero-mcp__get_item_details` directly
   - Otherwise, use `mcp__zotero-mcp__search_library` with the title/topic

2. Get full paper content:
   - `mcp__zotero-mcp__get_content` with `mode: complete` for full text
   - `mcp__zotero-mcp__get_item_abstract` for quick overview
   - `mcp__zotero-mcp__get_annotations` for any existing highlights/notes

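3. **Confirm the series**: If no series is set in the current session, remind the user to choose one (e.g., "No series is currently specified; please choose a research series (e.g., fp-cim, llm-quant)"). The user can switch series at any time during the conversation.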

4. Extract key information:
   - **Core algorithm**: Look for sections named "Method", "Approach", "Algorithm", "Formulation"
   - **Key equations**: LaTeX formulas that define the algorithm
   - **Pseudocode**: Any algorithm boxes or procedure descriptions
   - **Baseline results**: Reported numbers in tables/figures for comparison
   - **Implementation details**: Hyperparameters, training settings, hardware specs
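
The "looks like a Zotero key" decision in step 1 can be sketched with a simple pattern check. The pattern is an assumption based on the example key `SQQG5IVT` (eight uppercase letters or digits); an eight-letter all-caps title would be misclassified, so a failed key lookup should fall back to search. The function names are hypothetical; only the two tool names come from the workflow above.

```python
import re

# Assumption: item keys are exactly 8 uppercase letters/digits, like SQQG5IVT.
_ZOTERO_KEY = re.compile(r"[A-Z0-9]{8}")


def looks_like_zotero_key(argument: str) -> bool:
    """Heuristic check; hypothetical helper, not part of the Zotero MCP server."""
    return _ZOTERO_KEY.fullmatch(argument.strip()) is not None


def choose_lookup_tool(argument: str) -> str:
    """Pick which tool from step 1 to call first."""
    if looks_like_zotero_key(argument):
        return "mcp__zotero-mcp__get_item_details"
    return "mcp__zotero-mcp__search_library"


print(choose_lookup_tool("SQQG5IVT"))
print(choose_lookup_tool("FlatQuant"))
```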

### Phase 2: Algorithm Analysis

1. Parse the core algorithm into discrete steps:
   - Input/output specification
   - Mathematical operations (matrix ops, quantization, transforms, etc.)
   - Data structures (codebooks, lookup tables, etc.)
   - Control flow (iterations, conditionals)

2. Identify dependencies:
   - Standard libraries (numpy, torch, scipy)
   - Custom operations that need implementation
   - Pre-trained weights or data (if needed)

3. Create an implementation plan:
   - Break into modular functions
   - Identify which parts to implement first (core logic vs. utilities)
   - Estimate complexity

### Phase 3: Python Implementation

1. Create the project structure:
   ```
   reproduced/<series>/<paper-slug>/
   ├── README.md          # Paper summary + reproduction notes (includes type: python)
   ├── src/
   │   ├── algorithm.py   # Core algorithm implementation
   │   └── utils.py       # Helper functions
   ├── test/
   │   └── test_algorithm.py  # Numerical verification script
   ├── scripts/           # Helper scripts (optional)
   └── requirements.txt   # Dependencies
   ```

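   If the series directory does not exist, create it automatically and initialize a git repo (`git init` + `.gitignore`).
   If the series directory already has a README.md, append the new paper entry; otherwise create the series README.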

2. Implement `algorithm.py`:
   - One class or function per algorithmic component
   - Docstrings with paper equation references (e.g., "Eq. 3 from Section 4.1")
   - Type hints for all public interfaces
   - Configurable parameters matching paper specifications

3. Implement `test/test_algorithm.py`:
   - Toy example: small synthetic inputs to verify correctness
   - Comparison with paper's reported results (if available)
   - Numerical error analysis (absolute/relative error)
   - Output: comparison table in markdown
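
As an illustration of the conventions in steps 2 and 3, the sketch below pairs one typed, documented component with a toy check. The algorithm is a generic symmetric uniform quantizer chosen only as a stand-in; it is not taken from any particular paper, and `QuantConfig`, `quantize_symmetric`, and `toy_check` are hypothetical names. It uses only the standard library so it runs anywhere; a real reproduction would typically use NumPy or PyTorch per `FRAMEWORK`.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class QuantConfig:
    """Hypothetical config; real fields should mirror the paper's hyperparameters."""

    num_bits: int = 4


def quantize_symmetric(x: Sequence[float], cfg: QuantConfig) -> list[float]:
    """Symmetric uniform quantize-dequantize (illustrative stand-in).

    In a real reproduction, cite the source here, e.g. "Eq. 3 from Section 4.1".
    """
    if cfg.num_bits < 2:
        raise ValueError("num_bits must be at least 2")
    qmax = 2 ** (cfg.num_bits - 1) - 1  # largest integer level
    max_abs = max((abs(v) for v in x), default=0.0)
    if max_abs == 0.0:  # edge case: empty or all-zero input
        return [0.0 for _ in x]
    scale = max_abs / qmax  # step size between adjacent levels
    return [max(-qmax, min(qmax, round(v / scale))) * scale for v in x]


def toy_check() -> float:
    """Run a small deterministic input and return the max absolute error."""
    x = [0.0, 0.3, -1.0, 0.75, -0.42]
    cfg = QuantConfig(num_bits=4)
    y = quantize_symmetric(x, cfg)
    max_err = max(abs(a - b) for a, b in zip(x, y))
    scale = max(abs(v) for v in x) / (2 ** (cfg.num_bits - 1) - 1)
    # Rounding to the nearest level bounds the error by half a step.
    assert max_err <= scale / 2 + 1e-12
    return max_err


print(f"max abs error: {toy_check():.4f}")
```

The toy check asserts a property that follows from the math (error at most half a step) rather than a hard-coded output, which tends to survive refactoring better.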

### Phase 4: Numerical Verification

1. Run toy example:
   ```bash
   cd reproduced/<series>/<paper-slug> && python test/test_algorithm.py --mode toy
   ```
   - Use small, deterministic inputs
   - Verify intermediate steps match paper's descriptions
   - Check edge cases (zeros, negative values, large inputs)

2. Run comparison (if paper provides reference numbers):
   ```bash
   python test/test_algorithm.py --mode comparison
   ```
   - Compare with paper's Table/Figure results
   - Compute relative error: `|ours - paper| / |paper|`
   - Flag any discrepancy > 5%

3. Generate results report:
   - Summary table: metric | paper | reproduced | error
   - Key observations: what matches, what differs
   - Potential explanations for discrepancies
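
The relative-error check and summary table from steps 2 and 3 can be sketched as below. The function names and the extra `flag` column are assumptions added for illustration, and the numbers in `example` are placeholders, not results from any paper.

```python
def relative_error(ours: float, paper: float) -> float:
    """Compute |ours - paper| / |paper|; undefined when the paper value is 0."""
    if paper == 0:
        raise ValueError("relative error is undefined when the paper value is 0")
    return abs(ours - paper) / abs(paper)


def comparison_table(
    rows: dict[str, tuple[float, float]], threshold: float = 0.05
) -> str:
    """Render metric | paper | reproduced | error as a markdown table.

    `rows` maps a metric name to (paper_value, reproduced_value).
    Rows whose relative error exceeds `threshold` are marked CHECK.
    """
    lines = [
        "| metric | paper | reproduced | error | flag |",
        "|---|---|---|---|---|",
    ]
    for metric, (paper, ours) in rows.items():
        err = relative_error(ours, paper)
        flag = "CHECK" if err > threshold else ""
        lines.append(f"| {metric} | {paper:g} | {ours:g} | {err:.2%} | {flag} |")
    return "\n".join(lines)


# Placeholder numbers for illustration only.
example = {"metric_a": (10.0, 10.2), "metric_b": (50.0, 56.0)}
print(comparison_table(example))
```

Raising on a zero reference value, rather than silently returning infinity, forces the report to handle that metric explicitly (for example by reporting absolute error instead).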

### Phase 5: Documentation

1. Write the paper `README.md` (following the template in [.claude/rules/reproduced.md](../rules/reproduced.md)):
   - Paper title, authors, venue, year, Zotero key
   - `type: python`
   - Reproduction scope, directory structure overview, run commands, and differences from the paper

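2. Create/update the series `README.md`:
   - If the series README already exists, append the new paper entry to the reproduction list
   - If it does not exist, create a series README containing a description of the research direction and the reproduction list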

3. Add inline comments in code:
   - Reference paper sections/equations
   - Explain non-obvious implementation choices
   - Note any deviations from the paper

## Output

After completion, report:
1. **Reproduction status**: success | partial | failed
2. **Verification results**: comparison table
3. **Key findings**: what works, what doesn't, why
4. **Code location**: path to `reproduced/<series>/<paper-slug>/`
5. **Series**: name of the current series
6. **Next steps**: potential improvements, experiments to try

## Error Handling

- **Paper not found in Zotero**: Ask user for DOI or title keywords to search
- **Full text unavailable**: Fall back to abstract + annotations, note "full text unavailable"
- **Algorithm too complex**: Break into sub-components, reproduce core first
- **Missing dependencies**: List required packages, offer to install
- **Numerical instability**: Add epsilon guards, use float64, document issues
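
A minimal sketch of an epsilon guard, assuming a vector normalization step as the unstable operation. `safe_normalize` and the default `eps` are hypothetical choices; the right guard value depends on the scale of the data. Python floats are already double precision; in NumPy or PyTorch the equivalent precaution is to request a float64 dtype explicitly.

```python
import math

EPS = 1e-12  # assumed guard value; tune to the scale of the data


def safe_normalize(x: list[float], eps: float = EPS) -> list[float]:
    """Scale x to unit L2 norm, guarding against division by zero."""
    # math.fsum reduces accumulated rounding error in the sum of squares.
    norm = math.sqrt(math.fsum(v * v for v in x))
    # The guard only takes effect when the norm is smaller than eps.
    return [v / max(norm, eps) for v in x]


print(safe_normalize([3.0, 4.0]))
print(safe_normalize([0.0, 0.0]))
```

Using `max(norm, eps)` instead of `norm + eps` leaves well-conditioned inputs unchanged, so the guard does not bias results that were already stable.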
