---
name: jpe-replication-package
description: Use when assembling the data and code replication package for a Journal of Political Economy (JPE) manuscript to DCAS / JPE Data Editor standards for conditionally accepted papers, with deposit to the JPE Dataverse. Builds the package and README; it does not run the analysis itself.
---

# Replication Package (jpe-replication-package)

## When to trigger

- The paper is **conditionally accepted** and the JPE Data Editor needs a deposit
- You want the package to pass the JPE Data Editor's reproducibility check on the first attempt
- Some data are proprietary or restricted and you must request an exemption and document access
- You are setting up the project early so reproducibility is not a last-minute scramble

> Verify the current policy at journals.uchicago.edu/journals/jpe/datapolicy before depositing. JPE publishes a paper only if its data and code are documented and available for replication. JPE **endorses DCAS (the Data and Code Availability Standard v1.0)** and runs its own **JPE Data Editor** (jpedataeditor.github.io) — distinct from the AEA Data Editor that serves AER/AEJ. The package is verified at the **conditional-accept** stage and, once it passes, deposited to the **JPE Dataverse**. The deposit must carry a license allowing **unrestricted access and use for replication**. Exemptions for non-shareable data must be **requested at first submission**, and exempted authors must **preserve the data and code for at least five years** after publication.

## What a passing package contains

1. **README** (the centerpiece) following the DCAS / Social Science Data Editors README template:
   - Overview of what the code does and the mapping from code → exhibits.
   - Data availability statement: source, terms, whether each dataset is public / restricted / proprietary, and exact access steps (including registrations, memberships, monetary and time costs). State clearly if data cannot be shared and why, referencing the exemption requested at first submission.
   - Computational requirements: software + versions, packages + versions, OS, memory, and approximate run time.
   - Instructions to run: a single master script ordering everything end to end.
   - List of every table/figure with the script and line that produces it.
2. **Data**: raw inputs (when license permits) and the code that builds analysis files from them. If raw data are restricted, include construction code plus a synthetic or pseudo dataset that lets the pipeline run.
3. **Code**: a `master` script that reproduces every number, table, and figure from raw inputs, with relative paths and fixed seeds.
4. **Output**: log files and generated exhibits, so the editor can diff against the paper.
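
The exhibit list in the README can be generated rather than typed by hand, which keeps it in sync with the code. The sketch below is one way to do that, not a DCAS requirement. The `EXHIBITS` entries, script names, line numbers, and output paths are hypothetical placeholders.

```python
from pathlib import Path

# Hypothetical exhibit map: exhibit label -> (script, line number, output file).
# All paths are relative to the package root and are illustrative only.
EXHIBITS = {
    "Table 1": ("code/analysis/01_summary.py", 42, "output/tables/table1.tex"),
    "Figure 2": ("code/analysis/02_event_study.py", 88, "output/figures/figure2.pdf"),
}


def render_exhibit_table(exhibits):
    """Return a Markdown table mapping each exhibit to its script, line, and output."""
    rows = ["| Exhibit | Script | Line | Output |", "|---|---|---|---|"]
    for label, (script, line, output) in exhibits.items():
        rows.append(f"| {label} | `{script}` | {line} | `{output}` |")
    return "\n".join(rows)


def missing_files(exhibits, root="."):
    """List scripts or outputs named in the map that do not exist under root."""
    root = Path(root)
    missing = []
    for script, _line, output in exhibits.values():
        for rel in (script, output):
            if not (root / rel).is_file():
                missing.append(rel)
    return missing
```

Pasting the result of `render_exhibit_table(EXHIBITS)` into the README and confirming that `missing_files(EXHIBITS)` returns an empty list catches exhibits whose script or output was renamed after the map was written.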

## Reproducibility discipline

- One master script; no manual steps, no hard-coded absolute paths, no "run cell 4 then cell 2."
- Set and record random seeds for any simulation, bootstrap, or ML step.
- Pin software and package versions; record them in the README and, where possible, in a lockfile/environment file.
- Every exhibit in the paper is regenerated by the code — no hand-edited tables.
- Directory layout is clean: `data/` (raw, derived), `code/` (build, analysis), `output/` (tables, figures, logs).
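
A minimal sketch of a master script that follows the rules above, assuming a Python pipeline. The step file names, the `SEED` value, and the convention that each step reads the seed from its first command-line argument are all assumptions made for illustration. The log records the interpreter and OS only; package versions would still come from a lockfile or environment file.

```python
import platform
import subprocess
import sys
from pathlib import Path

SEED = 20240101  # hypothetical seed; record the value you actually use in the README

# Hypothetical pipeline steps, listed in run order, relative to the package root.
STEPS = [
    "code/build/01_clean_raw.py",
    "code/build/02_merge.py",
    "code/analysis/01_tables.py",
    "code/analysis/02_figures.py",
]


def environment_record(seed=SEED):
    """Lines describing the interpreter, OS, and seed, for the run log."""
    return [
        f"python: {platform.python_version()}",
        f"platform: {platform.platform()}",
        f"seed: {seed}",
    ]


def run_all(root, steps=STEPS, seed=SEED):
    """Run every step in order from the package root and write output/logs/master.log."""
    root = Path(root)
    log_dir = root / "output" / "logs"
    log_dir.mkdir(parents=True, exist_ok=True)
    lines = environment_record(seed)
    for step in steps:
        # Assumed convention: each step takes the seed as its first argument.
        # check=True stops the pipeline at the first failing step.
        subprocess.run([sys.executable, step, str(seed)], cwd=root, check=True)
        lines.append(f"ok: {step}")
    log_path = log_dir / "master.log"
    log_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return log_path
```

Calling `run_all(Path.cwd())` from the package root runs the whole pipeline. Because every path is resolved against `root`, the same call works wherever the package is unpacked.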

## Restricted / proprietary data (the JPE exemption route)

- Request the exemption **at first submission**, not at acceptance — JPE requires this timing.
- You may not need to deposit the data, but you must deposit the **code** and document a precise access path so a third party with the same license can reproduce results.
- Provide a Data Availability Statement and, where feasible, a small simulated dataset matching the schema so the pipeline is executable.
- Commit to **preserving the data and code for at least five years** after publication, since the restricted data cannot go to the JPE Dataverse.
- Confidential-data results may require a verification arrangement with the JPE Data Editor; document it.
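
The simulated dataset mentioned above can be produced from a declared schema alone, so no restricted values ever enter the package. The following is a sketch under stated assumptions: the `SCHEMA` columns, ranges, and default seed are hypothetical, and the draws are independent uniforms chosen only so the pipeline executes.

```python
import csv
import random
from pathlib import Path

# Hypothetical schema for a restricted file: column name -> (type, lower, upper).
# Replace it with the real variable names and plausible ranges; no real values are used.
SCHEMA = {
    "firm_id": ("int", 1, 500),
    "year": ("int", 2000, 2020),
    "log_wage": ("float", 1.5, 5.0),
}


def synthetic_rows(schema, n_rows, seed):
    """Draw n_rows of independent uniform values per column, reproducibly from seed."""
    rng = random.Random(seed)  # local generator, so the global RNG state is untouched
    rows = []
    for _ in range(n_rows):
        row = {}
        for name, (kind, low, high) in schema.items():
            if kind == "int":
                row[name] = rng.randint(low, high)
            else:
                row[name] = round(rng.uniform(low, high), 4)
        rows.append(row)
    return rows


def write_synthetic_csv(path, schema=SCHEMA, n_rows=100, seed=12345):
    """Write a simulated CSV with the same columns as the restricted file."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    rows = synthetic_rows(schema, n_rows, seed)
    with path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=list(schema))
        writer.writeheader()
        writer.writerows(rows)
    return path
```

Estimates computed on such a file are meaningless by construction, so the README should say that the synthetic data only demonstrate that the code runs and do not reproduce the paper's numbers.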

## Checklist

- [ ] README follows the DCAS template (overview, data availability, requirements, run instructions, exhibit map)
- [ ] Deposit license allows unrestricted access and use for replication
- [ ] Single master script reproduces every table and figure from raw inputs
- [ ] Software and package versions pinned and recorded
- [ ] Random seeds set and documented
- [ ] Relative paths only; runs on a clean machine in a fresh directory
- [ ] Data availability statement covers each dataset (public / restricted / proprietary) with access steps and costs
- [ ] Restricted data: exemption requested at first submission + 5-year preservation commitment
- [ ] Package re-run from scratch and output diffed against the paper, ready for the JPE Data Editor
- [ ] Current JPE data policy (DCAS, JPE Dataverse) verified on the official page
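
For the re-run item in the checklist, one possible way to compare a from-scratch run against the outputs already in the package is a checksum manifest. This is a sketch, not a JPE Data Editor tool; the function names and the idea of keeping two output directories side by side are assumptions.

```python
import hashlib
from pathlib import Path


def checksum_manifest(output_dir):
    """Map each file under output_dir (by relative path) to its SHA-256 digest."""
    output_dir = Path(output_dir)
    manifest = {}
    for path in sorted(output_dir.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[path.relative_to(output_dir).as_posix()] = digest
    return manifest


def compare_runs(reference_dir, rerun_dir):
    """Return relative paths that are missing from, or differ in, the re-run."""
    reference = checksum_manifest(reference_dir)
    rerun = checksum_manifest(rerun_dir)
    missing = sorted(set(reference) - set(rerun))
    changed = sorted(k for k in reference if k in rerun and reference[k] != rerun[k])
    return {"missing": missing, "changed": changed}
```

Byte-level comparison is deliberately strict. Files that embed timestamps, such as some PDFs and logs, will be reported as changed even when the numbers match, so treat a "changed" entry as a prompt to inspect the file rather than as proof of a discrepancy.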

## Anti-patterns

- A zip of scripts with no README and no code-to-exhibit mapping
- Absolute paths (`/Users/me/...`) that break on any other machine
- Unset seeds so bootstrap/simulation numbers do not reproduce
- "Data available on request" with no construction code and no access detail
- Requesting a restricted-data exemption only at acceptance instead of at first submission
- Assuming the AEA Data Editor handles JPE — JPE runs its **own** Data Editor and Dataverse
- Hand-edited tables that the code does not actually generate
- Submitting without re-running the package on a clean environment
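
The absolute-path anti-pattern is easy to catch mechanically before deposit. The scan below is a heuristic sketch: the regular expression and the list of file extensions are assumptions, and it will miss paths assembled at run time.

```python
import re
from pathlib import Path

# Heuristic: a quote followed by /Users/, /home/, or a Windows drive letter.
ABSOLUTE_PATH = re.compile(r"""["'](/(?:Users|home)/|[A-Za-z]:[\\/])""")

# File extensions to scan; an assumption about which languages the package uses.
CODE_SUFFIXES = {".py", ".do", ".R", ".m", ".jl"}


def find_absolute_paths(root):
    """Return (relative file, line number, line text) for each suspicious line under root."""
    root = Path(root)
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in CODE_SUFFIXES:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if ABSOLUTE_PATH.search(line):
                hits.append((path.relative_to(root).as_posix(), number, line.strip()))
    return hits
```

Running `find_absolute_paths(".")` from the package root and getting an empty list is a cheap pre-check, though it does not replace the clean-machine re-run.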

## Output format

```
【Policy verified】JPE data policy (DCAS, JPE Dataverse) checked on official page [y/n]
【README】DCAS template sections present? [y/n each]
【License】unrestricted-access / replication license attached? [y/n]
【Master script】reproduces all exhibits from raw inputs? [y/n]
【Versions + seeds】pinned/documented? [y/n]
【Data status】public / restricted (exemption + 5-yr preservation) + access path
【Clean-machine test】passed, ready for JPE Data Editor? [y/n]
【Next】jpe-submission
```
