---
name: deduplicate-security-issue
description: |
  Merge two <tracker> tracking issues that describe the same
  root-cause vulnerability (typically discovered independently by two
  reporters, arriving via different channels), preserving every
  reporter's credit, every mailing-list thread reference, and every
  independent attack-vector description. Updates the kept issue's body
  in place, closes the duplicate with the `duplicate` label, and
  regenerates the CVE JSON attachment so both finders land in
  `credits[]`.
when_to_use: |
  Invoke when a security team member says "dedupe #NNN and #MMM",
  "merge #MMM into #NNN", "#MMM is a duplicate of #NNN", or when the
  import-security-issue skill's Step 2a surfaces a STRONG match (GHSA
  ID collision) between a new report and an existing tracker. Also
  appropriate as a periodic cleanup action when a triager spots two
  open trackers describing the same bug from different angles.
---

<!-- Placeholder convention (see AGENTS.md#placeholder-convention-used-in-skill-files):
     <project-config> → adopting project's `.apache-steward/` directory
     <tracker>        → value of `tracker_repo:` in <project-config>/project.md
                       (example: airflow-s/airflow-s for the Apache Airflow security team)
     <upstream>       → value of `upstream_repo:` in <project-config>/project.md
                       (example: apache/airflow)
     Before running any bash command below, substitute these with the
     concrete values from the adopting project's <project-config>/project.md. -->

# deduplicate-security-issue

Merges two `<tracker>` tracking issues that describe the
same underlying vulnerability. The output is a single tracker
("the **kept** issue") that carries every reporter's credit, every
mailing-list thread, and every independent report's body, with the
other tracker ("the **dropped** issue") closed and labelled
`duplicate`.

This is **one of the few places in the security workflow** where a
piece of reporter-supplied content (the dropped issue's body) moves
from one tracker to another. Since the target tracker is private to
`<tracker>`, no confidentiality boundary is crossed, but
the skill must still preserve every reporter's credit verbatim and
surface the merge in a status comment on both trackers so the audit
trail stays complete.

**Golden rule — propose before applying.** Every merge is a
proposal: the skill computes the merged body, the two status
comments, the label/close-issue actions, and the CVE-JSON regen
command, and shows all of them to the user. Nothing is applied
until the user confirms. There is no fast-path.

**Golden rule — never merge across scopes.** Two trackers with
different **scope labels** (`airflow` vs. `providers`, `airflow`
vs. `chart`, etc.) must not be merged. If an external reporter
rediscovers the same bug in two different products' surfaces, that
is a multi-scope report and the resolution is a
**scope split** handled by the `sync-security-issue` skill, not a
dedupe. This skill refuses to operate when the two candidate
trackers have different scope labels, and the proposal says so
explicitly.

---

## Inputs

| Selector | Resolves to |
|---|---|
| `dedupe #<keep> <drop>` | merge the `<drop>` tracker into `<keep>`; `<keep>` stays open, `<drop>` closes as duplicate |
| `dedupe <keep> <drop>` | same, without the `#` |
| `dedupe #NNN` (single argument) | ambiguous — ask the user which one is kept; do not guess |

Picking which is kept vs. dropped is a user decision; the skill
does **not** auto-pick. Practical guidance to offer when asked:

- If one tracker has a **CVE allocated** and the other does not,
  keep the one with the CVE (preserves the allocation).
- If one tracker is older, keep the older one (preserves the
  audit-trail timestamp).
- If one tracker has richer body content (more attack vectors,
  CVSS scoring, PoC code), merge *into* the one with the CVE but
  keep all the rich content via the "Second independent report"
  section described in Step 3 below.

---

## Prerequisites

- **`gh` CLI authenticated** with collaborator access to
  `<tracker>` — the skill reads both trackers, edits
  the kept tracker's body, closes the dropped tracker, and adds
  / removes labels.
- **`uv` installed** — the Step 5 CVE-JSON regeneration is a
  `uv run` call.

See
[Prerequisites for running the agent skills](../../../README.md#prerequisites-for-running-the-agent-skills)
in `README.md`.

---

## Step 0 — Pre-flight check

1. `gh api repos/<tracker> --jq .name` returns
   `<tracker>`.
2. Both issue numbers resolve —
   `gh issue view <kept> --repo <tracker> --json number`
   and the same for `<dropped>` — before any write.
3. `uv --version` returns.

If any check fails, stop. A partial dedup (body merged but
dropped tracker left open, or CVE JSON not regenerated) is worse
than no dedup.

---

## Step 1 — Fetch and classify both trackers

```bash
gh issue view <keep>  --repo <tracker> --json number,title,state,body,labels,milestone,assignees,author,comments
gh issue view <drop>  --repo <tracker> --json number,title,state,body,labels,milestone,assignees,author,comments
```

Verify:

- Both trackers are in state `open` (merging into or out of a closed
  tracker is almost always a mistake; surface as a blocker if
  either side is already closed and ask the user to confirm).
- Both have the **same scope label** — `airflow` vs. `airflow`,
  or `providers` vs. `providers`, or `chart` vs. `chart`. If the
  scope labels differ, refuse the merge and tell the user this is
  a multi-scope report to be handled by `sync-security-issue`'s
  scope-split flow instead.
- Neither tracker is already labelled `duplicate` (that would
  indicate a partial-merge already happened and someone left it
  half-done; surface as a blocker and let the user decide how to
  recover).

---

## Step 2 — Extract the per-field values from both

For each tracker, extract the template fields:

- *The issue description* — typically the reporter's full message.
  In older trackers the field may not have an explicit heading
  (everything above *"Short public summary for publish"* is the
  description by convention).
- *Short public summary for publish*
- *Affected versions*
- *Security mailing list thread*
- *Public advisory URL*
- *Reporter credited as*
- *PR with the fix*
- *CWE*
- *Severity*
- *CVE tool link*

Also capture:

- Each tracker's **labels** (scope, `cve allocated`, `pr *`,
  `announced - emails sent`, etc.).
- Each tracker's **milestone** (Airflow version / Providers wave /
  Chart version).
- Each tracker's **assignees**.
- Whether each tracker has a **CVE JSON attachment** comment (from
  `generate-cve-json --attach`) — only the kept side's attachment
  will be regenerated in Step 5.

---

## Step 3 — Build the merged body proposal

The output is a single body that preserves both reporters' content
verbatim. The body-field schema (role names, empty-field convention,
body-field-surgery pattern) is documented in
[`tools/github/issue-template.md`](../../../tools/github/issue-template.md);
the concrete field names for the adopting project live in
[`<project-config>/project.md`](../../../<project-config>/project.md#issue-template-fields).
Structure:

```markdown
### The issue description

<keep.issue_description verbatim>

---

**Second independent report: [<tracker>#<drop>](https://github.com/<tracker>/issues/<drop>) — merged on <YYYY-MM-DD>.** <one-sentence headline: same root-cause bug, different attack vector / affected process.>

<details>
<summary>Full report from <drop.reporter> (click to expand)</summary>

<one-paragraph summary of WHY the two reports are the same root-cause
bug — same function, same file, same allowlist fix — but describe
different attack vectors / affected processes / threat-model
boundaries. This paragraph is the skill's own analysis, written
for a future triager who wants to understand why the two were
merged; write it so it reads naturally even after the duplicate
tracker has been closed for months.

<drop.issue_description verbatim>

</details>

### Short public summary for publish

<merged summary covering both vectors; if either side was `_No
response_`, use the populated side; if both were populated,
combine them with a leading sentence that covers both attack
vectors explicitly — the release manager will refine at Step 13>

### Affected versions

<widen the range to the broader of the two — take the lower `version
`-bound and the higher `lessThan` upper bound from both sides>

### Security mailing list thread

<keep.reporter> (<keep.context>): <keep's thread URL or Gmail threadId note>
<drop.reporter> (<drop.context>): <drop's thread URL or Gmail threadId note>
```
(one line per reporter; keep them in chronological order of the
original report, earliest first)

```markdown
### Public advisory URL

<keep's value; normally _No response_ at the time of merge>

### Reporter credited as

<keep.credit line verbatim>
<drop.credit line verbatim>
```
(one line per credit; preserve the *exact* form each reporter
confirmed, or the placeholder form when unconfirmed; the merge
does not silently re-synthesize credits)

```markdown
### PR with the fix

<keep's value, or merge if both are populated>

### CWE

<the more specific of the two values; if they disagree on the
primary CWE, surface the disagreement as a blocker for the
triager rather than silently picking one>

### Severity

<keep's value; do NOT propagate a reporter-supplied CVSS from the
dropped tracker into the kept tracker's Severity field — the
independent-scoring rule in AGENTS.md applies to merged content
the same way it applies to a single reporter's content>

### CVE tool link

<keep's value>
```

The **Second independent report** block is the load-bearing part of
the merge. It lets every future triager read both reports in one
place without having to chase the closed duplicate's content.
Append the drop side's body **verbatim** inside the `<details>`
disclosure — preserve the reporter's wording, code blocks, and PoC
text. Do not paraphrase; paraphrasing a security report is how
credits get subtly wrong before publication. The short headline that
stays visible at the top of the `<details>` block is a one-sentence
summary for scroll-readers; clicking expands to the full verbatim
report. This is the same short-headline-over-collapsed-details
pattern the status-change comments use, applied to the body so a
long secondary report does not push every other body field below
the fold.

If the drop-side body already had a *"Second independent report"*
`<details>` block (chain-merge case — rare), nest its content
inside the new outer block (or append as a sibling sub-block) so
the chain of merges stays visible. Never flatten or rewrite earlier
merges.

---

## Step 4 — Build the rollup-entry proposals

Two rollup-comment entries, one per tracker — **not** two new
top-level comments. The entries are appended to each tracker's
existing status-rollup comment (created by `import-security-issue`)
via the upsert recipe in
[`tools/github/status-rollup.md`](../../../tools/github/status-rollup.md#upsert-recipe--append-to-an-existing-rollup-or-create-one).
When either tracker does not yet carry a rollup (legacy tracker
pre-dating the convention), the upsert recipe's Step 2b creates
one and folds any pre-existing legacy bot comments in on the way.

Each entry is a single `<details>` block. Follow the zero-whitespace
rules from the shared spec — no leading spaces inside the block,
one blank line after `<summary>…</summary>`, one blank line
before `</details>`.

### Entry appended to the kept tracker's rollup

```markdown
<details><summary><YYYY-MM-DD> · @<author-handle> · Merge (kept) (from #<drop>)</summary>

**Merged [<tracker>#<drop>](https://github.com/<tracker>/issues/<drop>) into this tracker.** <one-sentence headline: same root-cause bug, different attack vector / affected process.>

- Body: <keep.reporter>'s original report preserved; <drop.reporter>'s report appended as *"Second independent report"*.
- Credits: **<keep credit>** + **<drop credit>**.
- Mailing threads: both listed.
- CVE: [<CVE-N>-<M>](https://cveprocess.apache.org/cve5/<CVE-N>-<M>) stays allocated here; [<tracker>#<drop>](...) being closed as duplicate.

**Next:** <one-line next step — e.g. credit-preference confirmation for both, or Step 6 CVE refinement>.

<Reporter-notification line — one of the four canonical options from the sync skill.>

Full analysis of why the two reports are the same root-cause bug (same function, same file, same allowlist fix) but describe different attack vectors / affected processes / threat-model boundaries. Per-field hand-off details:

- *Reporter credited as*: <full before → after>.
- *Security mailing list thread*: <full before → after, including PonyMail URLs and Gmail thread IDs>.
- *Short public summary for publish*: <kept as-is | seeded with a merged draft starting "..."/>.
- *CWE*: <set to <value> | kept as _No response_ | BLOCKER: conflict between <keep.cwe> and <drop.cwe> — triager to resolve>.
- *Affected versions*: widened to <value>.
- CVE JSON attachment regenerated: <comment URL>.

</details>
```

### Entry appended to the dropped tracker's rollup

```markdown
<details><summary><YYYY-MM-DD> · @<author-handle> · Merge (dropped) (into #<keep>)</summary>

**Closing as duplicate of [<tracker>#<keep>](https://github.com/<tracker>/issues/<keep>).** <one-sentence headline.>

Full content merged into [<tracker>#<keep>](...) as *"Second independent report"*; <drop.reporter> credited alongside <keep.reporter> there.

All triage and advisory work continues on [#<keep>](...).

<one-paragraph analysis matching the kept-side details>.

Specific artifacts merged: <CVSS scoring, attack chain, PoC, remediation options, etc.>.

See [the merge entry on <tracker>#<keep>](…) for the full hand-off record.

<Reporter-notification line — one of the four canonical options from the sync skill.>

</details>
```

Both entries must render every cross-issue reference as a
clickable markdown link per the *Linking `<tracker>` issues and
PRs* convention in [`AGENTS.md`](../../../AGENTS.md). No
six-line visible cap — the entire entry is already collapsed
inside `<details>`; write what the auditor needs. Do not pad.

---

## Step 5 — Confirm with the user, then apply sequentially

Present the proposal:

- Numbered items for the body update, each status comment, the
  `duplicate` label application on the dropped side, the
  close-issue action on the dropped side, and the CVE-JSON regen
  on the kept side.
- The resulting merged body rendered in full (not a diff), so the
  user can proofread end to end before confirming.

Confirmation forms:

- `all` — apply every proposed action.
- `1,3,5` — apply selected items only (for example, *"apply body
  update and status comment but don't close the duplicate yet — I
  want to triple-check"*).
- `none` / `cancel` — bail.
- Free-form edits — regenerate only the specified item and
  re-confirm.

After confirmation, apply **sequentially** (never in parallel):

1. `gh issue edit <keep> --body-file <tmpfile>` — updated body
2. Rollup-comment upsert on the kept tracker per
   [`tools/github/status-rollup.md`](../../../tools/github/status-rollup.md#upsert-recipe--append-to-an-existing-rollup-or-create-one)
   — append the `Merge (kept)` entry (`gh api -X PATCH
   repos/<tracker>/issues/comments/<id> --input …`) or create
   the rollup if none exists yet. The same step folds any legacy
   bot comments on the kept tracker into the rollup first, per
   the fold-legacy sub-step in
   [`sync-security-issue`](../sync-security-issue/SKILL.md).
3. Rollup-comment upsert on the dropped tracker — append the
   `Merge (dropped)` entry (same recipe; fold legacy comments
   first when needed).
4. `gh issue edit <drop> --repo <tracker> --add-label duplicate`
5. `gh issue close <drop> --repo <tracker> --reason "not planned"`
   (GitHub's `duplicate` close-reason is not exposed by `gh` on
   all versions; `not planned` combined with the `duplicate` label
   carries the same signal)
6. `uv run --project <framework>/tools/vulnogram/generate-cve-json generate-cve-json <keep> --attach`
   — the *Remediation developer* body field is the source of truth
   for remediation-developer credits (populated by the
   `sync-security-issue` skill from the linked PR's author); no CLI
   flag needed
7. For each legacy bot comment folded in steps 2 / 3, delete the
   original with `gh api -X DELETE
   repos/<tracker>/issues/comments/<id>` — only after the
   matching rollup PATCH succeeded.

If any step fails, stop and ask the user how to proceed — do not
guess. Partial merges are recoverable as long as the body update
(step 1) succeeded; the rest is bookkeeping on top.

---

## Step 6 — Recap

After the apply loop, print a short recap:

- The kept tracker as a clickable
  [`<tracker>#<keep>`](...) link with a short summary of
  its new state (label set, credit list, both threads).
- The dropped tracker as a clickable link with its new closed
  state.
- The regenerated CVE JSON attachment URL.
- Any blockers surfaced during the merge (CWE conflict, unconfirmed
  credits, stale drafts, etc.) repeated here so the user does not
  have to scroll.

Apply the `<tracker>` link-form self-check to the entire
recap before presenting.

---

## Hard rules

- **Never merge across scopes.** Different scope labels → scope
  split (via `sync-security-issue`), not dedupe.
- **Never re-synthesize credits.** Copy each reporter's credit line
  verbatim from their tracker.
- **Never propagate a reporter-supplied CVSS** from the dropped
  tracker into the kept tracker's `Severity` field. The
  independent-scoring rule in [`AGENTS.md`](../../../AGENTS.md)
  applies to merged content.
- **Never paraphrase a reporter's body.** Paraphrasing is how
  credits and vulnerability details go subtly wrong before
  publication; append verbatim under the *Second independent
  report* heading.
- **Never close the wrong side.** The kept issue stays open; the
  dropped issue closes. Before running the `close` command,
  re-check the mapping one last time.
- **Never delete the dropped tracker.** GitHub issues are
  effectively immutable audit trail; closing + labelling as
  `duplicate` is the right ending state.

---

## When dedupe is **not** appropriate

- The two trackers are in **different scopes** → use the scope-split
  flow in `sync-security-issue` instead.
- The two trackers describe the same code surface but **different
  bugs** with **different fixes** (for example, two separate
  allowlist gaps in the same file, each requiring its own
  advisory) → leave them as separate trackers and cross-link in
  comments, but do not merge.
- One tracker has already moved past Step 13 (advisory sent) — the
  advisory has already gone out citing one reporter; retroactively
  merging a second reporter into the sent advisory requires an
  errata announcement via the missing-credits follow-up (Step 16
  of the handling process), not a tracker-body merge.

---

## References

- [`README.md`](../../../README.md) — the handling process;
  duplicates are resolved here at various steps rather than at a
  single numbered step.
- [`import-security-issue`](../import-security-issue/SKILL.md) —
  Step 2a surfaces potential duplicates before a new tracker is
  even created, so in the ideal case this skill is never needed
  on a fresh import.
- [`sync-security-issue`](../sync-security-issue/SKILL.md) — runs
  on the kept tracker after the merge to reconcile labels /
  milestone / credit-preference drafts for both reporters.
- [`generate-cve-json`](../../../tools/vulnogram/generate-cve-json/SKILL.md) —
  regenerates the kept tracker's CVE JSON attachment so both
  finders land in `credits[]`.
