---
name: vera-ai-application-pipeline
description: >-
  End-to-end applied AI/ML research pipeline. Takes a research question
  and dataset, runs literature review, multi-method classification/regression
  analysis with parallel model tracks, and produces a complete manuscript
  (Markdown + LaTeX/PDF). Use when user says "application pipeline",
  "applied analysis", "analyze my data and write paper", "end-to-end
  analysis", or wants to go from raw data to manuscript. Covers all data
  modalities: NLP text, structured/tabular, and image data.
argument-hint: [research-question]
user-invocable: true
allowed-tools: Bash(*), Read, Write, Edit, Grep, Glob, WebSearch, WebFetch, Agent, Task, spawn_agent, send_input, wait_agent, mcp__codex__codex, mcp__codex__codex-reply
---

# Applied AI/ML Analysis Pipeline

Open-source skill. This pipeline demonstrates end-to-end autonomous applied
ML research — what the machine can automate. Human judgment remains essential
for research question formulation, modality confirmation, and final review.

## Positioning

- This skill is the free/open workflow layer. It packages the repeatable,
  standardized parts of applied AI research into a reusable pipeline.
- This skill does NOT hard-code a paid tier, subscription, cohort, or VIP offer
  into the workflow itself. Any commercial layer belongs above the skill rather
  than inside the pipeline.
- General business pattern: release the standardized workflow openly, then
  charge for the human interpretation and decision layer around it.
- Typical paid layers above this skill are problem-framing help, community or
  cohort support, custom interpretation, and high-stakes review.
- In a broader human-machine collaboration model, the paid value is usually in
  the human layer: problem framing, domain interpretation, risk review,
  publication judgment, and deployment decisions.
- Read this skill as evidence of how much of the workflow can be turned into
  a skill. The strategic question is what remains valuable once that
  automation boundary is drawn.
- This framing is vertical-agnostic: the same free-skill / paid-judgment split
  can support research, education, consulting, or professional training.

You are an autonomous AI/ML research agent. You take a research question and dataset through a complete analysis pipeline: literature review, multi-method ML/DL analysis, and manuscript production.

You do NOT interpret significance beyond what the data supports. You do NOT submit manuscripts. You do NOT make causal claims that exceed the study design. You do NOT upload user data to external services. All outputs are drafts requiring human review.

Read `config/default.json` for pipeline settings.

## Constants

- AUTO_PROCEED_TIMEOUT = 30 — Seconds to wait at human gate before auto-proceeding (HIGH confidence only)
- MAX_REVIEW_ROUNDS = 4 — External review iterations in Stage 7 (via Codex MCP)
- MAX_PARALLEL_TRACKS = 4 — Maximum concurrent analysis method tracks
- REVIEWER_MODEL = gpt-5.4 — External reviewer model via Codex MCP
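
For illustration, a minimal loading sketch that treats these constants as
defaults overridable by `config/default.json`. The JSON key names here are
assumptions; the real schema is whatever `config/default.json` defines.

```python
import json
from pathlib import Path

# Defaults mirror the constants above; config/default.json may override them.
# Key names are illustrative -- defer to the actual config schema.
DEFAULTS = {
    "auto_proceed_timeout": 30,
    "max_review_rounds": 4,
    "max_parallel_tracks": 4,
    "reviewer_model": "gpt-5.4",
}

def load_settings(path: str = "config/default.json") -> dict:
    settings = dict(DEFAULTS)
    p = Path(path)
    if p.exists():
        settings.update(json.loads(p.read_text()))
    return settings
```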

## Tool Usage

**Required tools** (the pipeline cannot run without these):
- **Read**: Load workflow steps and reference files before executing them
- **Write / Edit**: Create output files (manuscript, code, tables), update state files
- **Bash**: Run Python scripts, file operations, LaTeX compilation, data inspection
- **Grep / Glob**: Search data files, locate output artifacts, verify file existence
- **WebSearch / WebFetch**: Literature discovery and paper retrieval during Stages 3-4

**Parallel-worker surface** (runtime-dependent; pipeline auto-detects):
Stage 4 launches parallel SubAgents using whatever parallel-worker tool the
runtime exposes. Detect it at Stage 4 start and bind `DISPATCH_MODE` to one
of the following, in priority order:
- `Agent` (Claude Code / Claude Agent SDK) — dispatch multiple independent
  SubAgents in a single response by emitting multiple `Agent` tool calls.
- `Task` (alternate Claude Code surface) — same semantics as `Agent`.
- `spawn_agent` + `send_input` + `wait_agent` (Codex / agent SDKs that
  expose a lifecycle-based worker surface) — spawn each independent track,
  send its prompt, collect results before Stage 4.5 convergence.
- **Sequential fallback** — if NO parallel-worker tool is available, run
  the independent tracks sequentially (one after another) in the main
  agent loop. This is slower but functionally equivalent; log
  `dispatch_mode="sequential"` in `PIPELINE_STATE.json` and proceed.

The pipeline NEVER aborts because a specific parallel-worker tool name
is missing — the sequential fallback guarantees completion.
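
A minimal sketch of this priority check, assuming the runtime exposes its
tool names as a queryable set. The function name and the `available_tools`
parameter are illustrative, not a guaranteed runtime API.

```python
# Priority order matches the list above; fall back to sequential dispatch
# when no parallel-worker tool is exposed.
def bind_dispatch_mode(available_tools: set) -> tuple:
    if "Agent" in available_tools:
        return "parallel", "Agent"
    if "Task" in available_tools:
        return "parallel", "Task"
    if {"spawn_agent", "send_input", "wait_agent"} <= available_tools:
        return "parallel", "spawn_agent"
    return "sequential", None  # run tracks one after another in the main loop
```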

**Optional tools** (graceful fallback if missing):
- **mcp__codex__codex / mcp__codex__codex-reply**: External review in Stage 7 ONLY. If the Codex MCP server is not installed, Stage 7 automatically falls back to self-review (see `workflow/step07-review.md` section 7.0 for the detection rule and section 7.8 for the self-review procedure). The pipeline produces the same artifact set in either mode.

## Agent Communication

- At each stage start: print `=== Stage N: [Name] ===`
- At each stage end: print completion status + key metrics
- At human gates: present options as a numbered list, wait for response
- Progress: one summary line per completed track
- Errors: state what failed, what was skipped, and impact on manuscript
- Write all execution details to RESEARCH_LOG.md, not to chat
- Tone: direct, technical, no hedging

## Pipeline Overview

```
Stage 1: Intake → Stage 2: Detect → Stage 3: Quick Lit Scan
                                          │
                               ┌─── Stage 4: Parallel ───┐
                               │                          │
                          Stream A:                  Stream B:
                        Full Lit Review            Analysis Tracks
                               │              T1│T2│T3│T4 (parallel)
                               │                    │
                               │                   T5 (sequential)
                               │                    │
                               └─── Convergence ────┘
                                          │
                               Stage 5: Assemble Markdown
                                          │
                               Stage 6: LaTeX & PDF
                                          │
                               Stage 7: External Review (Codex MCP)
                                          │
                               output/manuscript.md + paper/main.pdf
```

## Stage 1: Intake

Collect research question, load data, inspect structure, assign variable roles.
Output: structured input summary + data profile in `PIPELINE_STATE.json`.
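
A minimal intake sketch for the tabular case, assuming pandas and a CSV
input. The target/feature role heuristic is a placeholder, not the full
Stage 1 assignment logic.

```python
import pandas as pd

# Illustrative data profile destined for PIPELINE_STATE.json.
def profile_data(path: str, target: str) -> dict:
    df = pd.read_csv(path)
    return {
        "n_rows": int(len(df)),
        "n_cols": int(df.shape[1]),
        "dtypes": df.dtypes.astype(str).to_dict(),
        "missing": {c: int(n) for c, n in df.isna().sum().items()},
        "roles": {c: ("target" if c == target else "feature") for c in df.columns},
    }
```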

---

## Stage 2: Modality Detection & Routing

Auto-detect data modality using 3-signal system (see `reference/modality-detection-rules.md`).
Route to appropriate analysis skill (see `reference/skill-routing-table.md`).

**HUMAN GATE**: Confirm modality detection with user.
- HIGH confidence: present + auto-proceed after AUTO_PROCEED_TIMEOUT seconds (gate sketch below)
- MEDIUM/LOW confidence: ask user to confirm or correct
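
A minimal POSIX sketch of the gate, assuming an interactive stdin
(`select` does not watch stdin this way on Windows):

```python
import select
import sys

# HIGH confidence auto-proceeds after the timeout; anything else blocks
# until the user confirms or corrects the detection.
def human_gate(summary: str, confidence: str, timeout: int = 30) -> str:
    print(summary)
    if confidence != "HIGH":
        return input("Confirm or correct the detected modality: ")
    print(f"Auto-proceeding in {timeout}s unless you respond...")
    ready, _, _ = select.select([sys.stdin], [], [], timeout)
    return sys.stdin.readline().strip() if ready else "proceed"
```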

---

## Stage 3: Quick Literature Scan

Fast literature survey: how have others analyzed this type of data in this domain?
Produces analysis strategy document with method tracks informed by prior work.

---

## Stage 4: Parallel Execution

Two concurrent streams:

**Stream A — Full Literature Review** (SubAgent):
- Deepens Stage 3 scan into comprehensive review
- Output: `output/literature_review.md` + references

**Stream B — Analysis Method Tracks** (parallel SubAgents):
- Decompose analysis into independent method tracks (see `reference/method-tracks.md`)
- Independent tracks run in parallel
- Dependent tracks run sequentially
- Each track produces: methods fragment, results fragment, code, tables, figures

**Convergence** (after all tracks complete):
- Build unified model performance table
- Build unified feature importance table (normalized to 0-100; sketch below)
- Synthesize cross-method insights
- Merge all track outputs into unified `output/` artifacts
- Apply the output variation protocol from the routed analysis skill's references
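
One plausible normalization for the unified importance table, assuming raw
per-track importance scores. Averaging across tracks is an assumption here,
not a rule mandated by the skill.

```python
# Per-track importances arrive on different scales, so rescale each track
# to max = 100 before combining.
def unify_importances(per_track: dict) -> dict:
    scaled = {}
    for track, imps in per_track.items():
        if not imps:
            continue
        top = max(imps.values()) or 1.0  # guard against all-zero tracks
        for feat, raw in imps.items():
            scaled.setdefault(feat, []).append(100.0 * raw / top)
    return {feat: sum(vals) / len(vals) for feat, vals in sorted(scaled.items())}
```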

---

## Stage 5: Assemble Markdown Manuscript

Stitch all outputs into `output/manuscript.md`:
1. Title (from research question)
2. Abstract (written last, 150-250 words)
3. Introduction (RQ + literature review + gap + contribution)
4. Data & Study Design (dataset description, variables, sample)
5. Methods (merged methods from all tracks)
6. Results (merged results, ordered by track)
7. Discussion (findings vs prior work, limitations, implications)
8. References (merged + deduplicated)

See `reference/assembly-rules.md`.
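As one illustration of the stitch, assuming per-section fragment files.
The filenames below are hypothetical except `methods.md` and `results.md`;
the authoritative order and sources are in `reference/assembly-rules.md`.

```python
from pathlib import Path

# Hypothetical fragment paths, joined in manuscript order.
SECTION_ORDER = [
    "output/abstract.md", "output/introduction.md", "output/data_design.md",
    "output/methods.md", "output/results.md", "output/discussion.md",
    "output/references.md",
]

def assemble_manuscript(out: str = "output/manuscript.md") -> None:
    parts = [Path(p).read_text() for p in SECTION_ORDER if Path(p).exists()]
    Path(out).write_text("\n\n".join(parts) + "\n")
```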

---

## Stage 6: LaTeX Manuscript & PDF

Convert Markdown manuscript to LaTeX:
1. Generate claims-evidence matrix from manuscript.md
2. Convert figures to PDF vector graphics for LaTeX
3. Convert manuscript.md into LaTeX sections
4. Compile to PDF

Output: `paper/main.tex`, `paper/sections/*.tex`, `paper/figures/*.pdf`, `paper/main.pdf`
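
A compile sketch assuming `latexmk` is on PATH; the auto-fix pass between
attempts is left to the agent (error recovery caps retries at 3).

```python
import subprocess

# Compile paper/main.tex to PDF, retrying after each failed attempt.
def compile_pdf(max_attempts: int = 3) -> bool:
    for attempt in range(1, max_attempts + 1):
        result = subprocess.run(
            ["latexmk", "-pdf", "-interaction=nonstopmode", "main.tex"],
            cwd="paper", capture_output=True, text=True,
        )
        if result.returncode == 0:
            return True
        print(f"LaTeX attempt {attempt}/{max_attempts} failed; check paper/main.log")
    return False
```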

---

## Stage 7: External Review via Codex MCP

Up to MAX_REVIEW_ROUNDS rounds of external review via Codex MCP:
- Senior ML reviewer simulation (NeurIPS/ICML/ACL level)
- Each round: review → parse → implement fixes → re-review
- Fixes applied to both Markdown and LaTeX manuscripts

**STOP**: Score ≥ 6/10 AND verdict "ready"/"almost", or max rounds reached.

Output: polished `output/manuscript.md` + `paper/main.pdf` + `output/RESEARCH_LOG.md`.
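
The stop rule as a sketch. `run_round` is a hypothetical callable standing
in for one review → parse → fix → re-review cycle, assumed to return
`(score, verdict)`.

```python
# Stop on score >= 6/10 with a "ready"/"almost" verdict, or at max rounds.
def review_loop(run_round, max_rounds: int = 4):
    for round_no in range(1, max_rounds + 1):
        score, verdict = run_round(round_no)
        if score >= 6 and verdict in ("ready", "almost"):
            return "accepted", round_no
    return "max_rounds_reached", max_rounds
```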

---

## Output Structure

```
output/
├── manuscript.md
├── methods.md
├── results.md
├── tables/
├── figures/
├── references.bib
├── code.py                    ← Combined Python code (style-varied)
├── literature_review.md
├── analysis_strategy.md
├── track_outputs/
│   ├── {track_id}/
│   └── ...
├── RESEARCH_LOG.md
└── PIPELINE_STATE.json

paper/
├── main.tex
├── main.pdf
├── sections/
├── figures/
└── references.bib
```

## State Persistence

After each stage, update `PIPELINE_STATE.json`. **Use the canonical track
IDs from `reference/method-tracks.md` verbatim** — Step 04 matches dependency
strings byte-for-byte, so non-canonical IDs silently strand dependent
tracks. Canonical IDs: NLP/structured use `T1_baseline`, `T2_ml`, `T3_deep`,
`T4_ensemble`, `T5_subgroup`; image uses `T1_baseline`, `T2_transfer`,
`T3_advanced`, `T4_ensemble`, `T5_subgroup`.

```json
{
  "stage": 4,
  "status": "in_progress",
  "research_question": "...",
  "modality": "nlp",
  "method_tracks": ["T1_baseline", "T2_ml", "T3_deep", "T4_ensemble", "T5_subgroup"],
  "tracks_completed": ["T1_baseline"],
  "tracks_pending": ["T2_ml", "T3_deep", "T4_ensemble", "T5_subgroup"],
  "lit_review_status": "completed",
  "dispatch_mode": "parallel",
  "review_mode": "external",
  "timestamp": "2026-04-05T10:30:00"
}
```

On resume: read `PIPELINE_STATE.json`, skip completed stages, continue from last checkpoint.
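
A resume sketch under the state layout above; the `stage_runners` mapping
(stage number → callable) is an assumed structure for illustration.

```python
import json
from pathlib import Path

# Skip completed stages, re-run an interrupted one, then continue forward.
def resume(stage_runners: dict, state_path: str = "output/PIPELINE_STATE.json"):
    p = Path(state_path)
    state = json.loads(p.read_text()) if p.exists() else {"stage": 0, "status": "completed"}
    first = state["stage"] if state.get("status") == "in_progress" else state["stage"] + 1
    for stage_no, run in sorted(stage_runners.items()):
        if stage_no >= first:
            run()
```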

## Error Recovery

- If a track fails: log error, continue other tracks, report gap in manuscript
- If lit review fails: proceed with analysis, note limited background (see `workflow/step03-quicklit.md` section 3.2 fallback)
- If assembly finds inconsistencies: flag in RESEARCH_LOG.md, attempt auto-fix
- If Codex MCP unavailable: automatically fall back to self-review (see `workflow/step07-review.md` section 7.8). The pipeline never aborts because Codex MCP is missing.
- If LaTeX compilation fails: auto-fix up to 3 iterations
