---
name: generate-cve-json
description: |
  Generate a CVE 5.x JSON document from a <tracker> tracking
  issue, ready to paste into the Vulnogram `#source` tab of the ASF CVE tool
  at https://cveprocess.apache.org/cve5/<CVE-ID>#source. The conversion is
  deterministic: same issue in, same JSON bytes out. Handles multiple
  credits (one per line) and multiple references (URLs extracted from the
  issue's "Public advisory URL" and "PR with the fix" fields; the
  "Security mailing list thread" field is treated as internal-only and
  never exported).
when_to_use: |
  Invoke when a security team member says "generate CVE JSON for NNN",
  "update the CVE tool entry for NNN", "paste-ready CVE for NNN", or is
  about to publish the advisory for a tracking issue and wants the CVE
  record filled in from the issue body in one paste-and-save step. Not
  appropriate before the CVE has been allocated (the script needs a CVE
  ID either from the issue body's `CVE tool link` field or from a
  `--cve-id` override).
---

# generate-cve-json

This skill produces a CVE 5.x JSON document from a tracking issue in
[`<tracker>`](https://github.com/<tracker>), ready to
paste into the Vulnogram **"#source"** tab of the ASF CVE tool. The goal is
to eliminate the manual "copy each field from the issue into the right
Vulnogram form input" step when you are preparing to publish an advisory.

> **Project-agnostic by design.** All project-specific values
> (vendor, top-level product / package name, project display map,
> CNA org id, generator tag, …) are loaded from a TOML config the
> adopting project ships at `<project-config>/tools/vulnogram/cve-json-config.toml`.
> Concrete `apache-foo-project-*` strings appearing in this
> document are **illustrative examples** of how a project with a
> multi-package layout would configure things; mentally replace
> them with the adopter's own package taxonomy. The
> schema is documented in the package [README](README.md).

**Golden rule:** the script generates a *proposal* JSON document. It
parses a handful of structured fields from the issue body, but it cannot
read the security team member's mind. Always review the generated JSON
before pasting, and always do the final review inside Vulnogram before
moving the CVE from DRAFT → REVIEW → READY → PUBLIC.

**Determinism:** the same input issue body produces exactly the same JSON
bytes on every run. The script uses only the Python standard library, has
no timestamps or machine-dependent values in its output, sorts JSON keys,
and sorts references alphabetically. This lets you paste the result into
Vulnogram, tweak fields in the tool, re-run the script later, and cleanly
diff the two to see what the tool has added / what you changed by hand.
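
A minimal sketch of how this kind of byte-for-byte determinism can be achieved with the standard library. The function name `serialize` and the flat record with a top-level `references` list are assumptions made for illustration; the real record nests references under `containers.cna`, and the script's internals may differ.

```python
import json


def serialize(record: dict) -> str:
    """Serialize a record so that equal inputs always yield identical text."""
    # Shallow copy so the caller's record is not mutated.
    stable = dict(record)
    # Sort references by URL so their order in the issue body does not matter.
    stable["references"] = sorted(record.get("references", []), key=lambda r: r["url"])
    # sort_keys removes any dependence on dict insertion order.
    return json.dumps(stable, indent=2, sort_keys=True, ensure_ascii=False) + "\n"


a = {"title": "x", "references": [{"url": "https://b.example"}, {"url": "https://a.example"}]}
b = {"references": [{"url": "https://a.example"}, {"url": "https://b.example"}], "title": "x"}
assert serialize(a) == serialize(b)
```

Because nothing in the output depends on time, host, or insertion order, a plain text diff between two runs shows only real changes.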

---

## Inputs

- **Issue number** (required) — e.g. `232`.
- Optional CLI overrides:
  - `--cve-id CVE-YYYY-NNNN+` — override the CVE ID if the issue body's
    `CVE tool link` field has not yet been filled in, or retarget the
    JSON to a different CVE ID.
  - `--title "<vendor>: <product>: …"` — override the CVE title.
    Default is the GitHub issue title with the project's
    `<vendor>: <product>:` prefix (sourced from the TOML config) when
    it does not already start with that phrase.
  - `--version-start X.Y.Z` — override the start of the affected version
    range (the `affected[].versions[].version` field). Default is the
    lower bound parsed from the Affected versions field when it uses
    `>= X, < Y` syntax, otherwise `"0"`.
  - `--remediation-developer "Name"` — append a `type: "remediation
    developer"` credit on top of whatever the body's *Remediation
    developer* field already lists (auto-populated by the
    `sync-security-issue` skill from the linked PR's author). Repeat
    the flag to add multiple developers; duplicates between the
    body field and CLI flags are dropped silently. The reporter
    credit(s) from the *Reporter credited as* field are always
    emitted with `type: "finder"`.
  - `--vendor` / `--product` / `--package-name` / `--collection-url` —
    override the product identity fields. The defaults come from the
    project's TOML config (`product.vendor`, `product.default_product`,
    `product.default_package_name`, `product.default_collection_url`).
    They are used as the identity for *Affected versions* lines that
    don't start with a recognisable per-package directory name
    (see the multi-product note below).
  - `--product-for PACKAGE=PRODUCT` — override the CVE product display
    name for a specific `packageName`. Repeat to override multiple
    packages. Useful when a package is not in the project's
    `project_display_map` config, or when an acronym needs different
    casing from the title-cased fallback. Example:
    `--product-for apache-foo-project-baz='Apache Foo Project Baz'`.
  - `--org-id <uuid>` — override the CNA assigner org id (defaults to
    the ASF org id).
  - `--discovery <word>` — override `source.discovery` (default
    `"UNKNOWN"`; valid CVE 5.x values include `UNKNOWN`, `INTERNAL`,
    `EXTERNAL`, `USER`).
  - `--no-envelope` — emit only the inner `cna` container instead of
    the full CVE 5.x record (envelope is the default).
  - `--attach` — after generating the JSON, embed it at the end of
    the tracking issue's **body** (after the *CVE tool link* field),
    wrapped in a collapsible `<details>` block. The block is bracketed
    by HTML-comment markers
    (``<!-- generate-cve-json: cve=CVE-YYYY-NNNN+ version=v1 -->`` …
    ``<!-- generate-cve-json:end cve=CVE-YYYY-NNNN+ version=v1 -->``)
    that the script uses on later runs to find the existing block and
    **replace it in place**, so re-runs update the embedded attachment
    instead of duplicating it or breaking other body fields. The
    attachment lives in the body — not as a comment — so it stays
    above every status-change comment in the timeline (effectively
    "pinned" without needing any pin mechanism). Requires the
    positional issue argument; incompatible with `--stdin`.
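
The repeatable `--product-for PACKAGE=PRODUCT` flag and the title-cased fallback described above could be handled roughly as follows. This is a sketch, not the script's actual code: `parse_product_overrides` and `display_name` are hypothetical helper names, and the exact fallback casing rule is an assumption.

```python
import argparse


def parse_product_overrides(pairs: list) -> dict:
    """Turn ['pkg=Display Name', ...] into {'pkg': 'Display Name', ...}."""
    overrides = {}
    for pair in pairs:
        package, sep, product = pair.partition("=")
        if not sep or not package or not product:
            raise ValueError(f"expected PACKAGE=PRODUCT, got {pair!r}")
        overrides[package] = product
    return overrides


def display_name(package: str, overrides: dict) -> str:
    """Explicit override first, otherwise split on dashes and title-case."""
    if package in overrides:
        return overrides[package]
    return " ".join(part.capitalize() for part in package.split("-"))


parser = argparse.ArgumentParser()
# action="append" collects every occurrence of a repeated flag into a list.
parser.add_argument("--product-for", action="append", default=[], metavar="PACKAGE=PRODUCT")
args = parser.parse_args(["--product-for", "apache-foo-project-baz=Apache Foo Project BAZ"])
overrides = parse_product_overrides(args.product_for)
```

Here `display_name("apache-foo-project-baz", overrides)` uses the override, while a package without one falls back to the title-cased form.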

---

## Prerequisites on the tracking issue

For the generated JSON to be useful, the issue body should already be
filled in through a prior `sync-security-issue` run. In particular:

- **Short public summary for publish** — becomes the CVE description.
- **Affected versions** — becomes the CVE `affected[]` list. The script
  understands the common version-expression shapes (`< 3.2.2`,
  `>= 2.0.0, < 3.2.2`, `<= 3.2.1`, a bare version like `3.1.5`, and a
  bare lower bound like `>= 2.0.0`).
  **Multi-product CVEs are supported** — put one package per line,
  prefixing each with the package directory name as it appears in
  the adopter's repo, and the script emits one `affected[]` entry
  per line with the right `product` / `packageName`. Example
  (illustrative — using a hypothetical `apache-foo` project's
  sub-project layout):

      apache-foo-project-alpha <=6.5.0
      apache-foo-project-beta <=1.9.0

  Known package directory names are resolved to the vendor-preferred
  display casing via the project's `packages.project_display_map`
  config table; unknown packages fall back to the package name split
  on dashes and title-cased, and can be overridden with
  `--product-for`. A line without a package prefix (or a single-line
  field) falls back to the `--product` / `--package-name` defaults,
  which preserves the single-product behaviour.

  **`< NEXT VERSION` placeholder** — multi-package trackers don't
  know which package version will ship the fix until the wave's
  release manager picks it during a release cut. Until then, the
  *Affected versions* lines use the literal token `NEXT VERSION` as
  the upper bound, e.g.:

      apache-foo-project-alpha < NEXT VERSION
      apache-foo-project-beta < NEXT VERSION

  The generator strips `< NEXT VERSION` before parsing each line and
  emits a `versions[]` entry without `lessThan` (open-ended upper
  bound — *"affected from \<low\> onwards, no fix released yet"*).
  When the wave ships and the version is known, the
  `sync-security-issue` skill replaces each `NEXT VERSION` with the
  actual `< X.Y.Z` and the next regen produces a fully-bounded entry.
  Case-insensitive; combines with a lower bound (e.g.
  `>= 2.0.0, < NEXT VERSION` becomes `{version: "2.0.0", status: "affected"}`).
- **Security mailing list thread** — internal navigation reference
  only; the script **does not** export URLs from this field into
  `references[]`. Keep whatever the reporter or triager put there.
- **Public advisory URL** — each URL in this field is extracted and
  added to `references[]` with `tags: ["vendor-advisory"]`. Populated
  by the release manager (or the `sync-security-issue` skill) once
  the advisory is archived on `<users-list>`. The
  `--advisory-url` CLI flag still exists for ad-hoc overrides.
- **PR with the fix** — each URL in this field becomes a reference URL.
  **Multiple URLs are supported**: paste them on separate lines, as a
  bullet list, or comma-separated — the script extracts every
  `https?://…` token it finds.
- **Reporter credited as** — each line becomes one CVE credit entry
  with `type: "finder"`. **Multiple credits are supported**: put each
  person on their own line. `Full Name, Affiliation` on a single line
  is treated as **one** credit, not two, so the common
  `Jed Cunningham, Astronomer` pattern works as expected. Bullets
  (`- `, `* `, `1. `) are stripped. Blank lines are ignored. If you
  need to credit many people:

      Jed Cunningham
      Saurabh Banawar
      selen (Huntr bounty 3e88d364-5047-4768-a52c-6568f21ef35b)

- **Remediation developer** — each line becomes one CVE credit entry
  with `type: "remediation developer"`. Same parsing rules as
  *Reporter credited as* (newline-separated, `Full Name, Affiliation`
  is one credit, bullets stripped). Auto-populated by the
  `sync-security-issue` skill from the linked PR's author the first
  time *PR with the fix* is set; manual edits survive subsequent
  syncs (the skill only proposes appending names that aren't already
  there). The `--remediation-developer` CLI flag adds further names
  on top of whatever the body already lists.

- **CWE** — `CWE-285: Improper Authorization` style works; so does a bare
  `CWE-285` or a plain sentence. The script extracts the `CWE-\d+` token
  for the `cweId` field and uses the rest as the human-readable
  description.
- **Severity** — `None`, `Low`, `Medium`, `High`, `Critical`
  (case-insensitive) are emitted as the text content of a `metrics[].other`
  block. Vulnogram lets you replace this with a CVSS vector in its form if
  you want a numeric score.
- **CVE tool link** — the ASF CVE tool URL, e.g.
  `https://cveprocess.apache.org/cve5/CVE-2026-40913`. The script extracts
  the `CVE-YYYY-NNNN+` token from this field. If the field is still
  `_No response_`, pass the CVE ID with `--cve-id`.
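
The version-expression shapes listed under *Affected versions* can be parsed with a small amount of code. The sketch below is illustrative only: `split_package` and `parse_version_expr` are hypothetical names, mapping `<=` to the CVE 5.x `lessThanOrEqual` key is an assumption about the script's behaviour, and so is the rule that package names start with a letter.

```python
import re
from typing import Optional, Tuple

# Matches the "< NEXT VERSION" placeholder, with an optional leading comma.
NEXT_VERSION = re.compile(r",?\s*<\s*NEXT\s+VERSION\s*", re.IGNORECASE)


def split_package(line: str) -> Tuple[Optional[str], str]:
    """Split an optional leading package name off an 'Affected versions' line."""
    head, _, rest = line.strip().partition(" ")
    # Assumption: package names start with a letter, while version
    # clauses start with an operator or a digit.
    if head[:1].isalpha():
        return head, rest.strip()
    return None, line.strip()


def parse_version_expr(expr: str, default_start: str = "0") -> dict:
    """Map one version expression to a versions[] entry."""
    entry = {"status": "affected", "versionType": "semver"}
    # Strip the placeholder first: it means "no fixed release yet".
    cleaned = NEXT_VERSION.sub("", expr).strip()
    lower = None
    for clause in filter(None, (c.strip() for c in cleaned.split(","))):
        match = re.fullmatch(r"(>=|<=|<)?\s*(\S+)", clause)
        if match is None:
            raise ValueError(f"unrecognised version clause: {clause!r}")
        op, version = match.groups()
        if op == "<":
            entry["lessThan"] = version
        elif op == "<=":
            entry["lessThanOrEqual"] = version
        else:
            # ">= X" or a bare version both set the start of the range.
            lower = version
    entry["version"] = lower if lower is not None else default_start
    return entry
```

For example, `parse_version_expr(">= 2.0.0, < NEXT VERSION")` yields an entry with `version` set to `2.0.0` and no upper bound, and passing `default_start="3.0.0"` plays the role of `--version-start` when the field has no lower bound.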

If one of these fields is missing, the JSON still generates, but the
reviewer will need to fill the gap in Vulnogram. The skill surfaces any
empty field in the proposal so nothing is silently skipped.
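
The credit, CWE, and CVE ID parsing rules above can be sketched as follows. All function names are hypothetical, treating `_No response_` as an empty credit field is an assumption, and `lang: "en"` is filled in because the CVE 5.x schema requires a language on each credit.

```python
import re
from typing import Optional, Tuple

# Leading "- ", "* " or "1. " list markers.
BULLET = re.compile(r"^\s*(?:[-*]|\d+\.)\s+")


def parse_credits(field: str, credit_type: str) -> list:
    """One credit per non-blank line; commas stay inside the credit."""
    credits = []
    for line in field.splitlines():
        name = BULLET.sub("", line).strip()
        if name and name != "_No response_":
            credits.append({"lang": "en", "type": credit_type, "value": name})
    return credits


def merge_credits(*groups: list) -> list:
    """Concatenate credit lists, silently dropping exact duplicates."""
    seen, merged = set(), []
    for group in groups:
        for credit in group:
            key = (credit["type"], credit["value"])
            if key not in seen:
                seen.add(key)
                merged.append(credit)
    return merged


def parse_cwe(field: str) -> Tuple[Optional[str], str]:
    """Return (cweId or None, remaining human-readable description)."""
    match = re.search(r"CWE-\d+", field)
    if match is None:
        return None, field.strip()
    rest = field[: match.start()] + field[match.end():]
    return match.group(0), rest.strip(" :-\t\n")


def parse_cve_id(field: str) -> Optional[str]:
    """Extract the CVE-YYYY-NNNN+ token from the CVE tool link, if any."""
    match = re.search(r"CVE-\d{4}-\d{4,}", field)
    return match.group(0) if match else None
```

Splitting only on newlines is what keeps `Jed Cunningham, Astronomer` as a single credit.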

---

## Prerequisites

- **`gh` CLI authenticated** with collaborator access to
  `<tracker>` — the script reads the tracker via `gh`.
- **`uv` installed** — the script is a small `uv`-managed Python
  project and is invoked as `uv run --project
  tools/vulnogram/generate-cve-json generate-cve-json <N>`.

See
[Prerequisites for running the agent skills](../../../README.md#prerequisites-for-running-the-agent-skills)
in `README.md`.

---

## Step 0 — Pre-flight check

Before reading the tracker:

1. `gh api repos/<tracker> --jq .name` returns the adopter's
   tracker repo name (per `<project-config>/project.md`), **and**
2. `uv --version` exits successfully and prints a version.

If either fails, stop and tell the user what to install or log
in to.

---

## Step 1 — Verify the issue has the required fields

Fetch the issue body and check every template field the script reads. If
a field is missing or still `_No response_`, either run
[`sync-security-issue`](../../../.claude/skills/sync-security-issue/SKILL.md) first to fill it
in, or override it on the command line.

```bash
gh issue view <N> --repo <tracker> --json body --jq .body \
  | grep -E '^###|^_No response_'
```

Ask the user whether to proceed if any critical field is empty
(description, affected versions, CVE tool link, credits). Do not silently
generate a JSON with placeholder values.
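
The same check can be done programmatically. The sketch below assumes the issue body uses `### <Field name>` headings followed by the field value, which is what the `grep` pattern above also relies on; `empty_fields` is a hypothetical helper, not part of the script.

```python
def empty_fields(body: str) -> list:
    """Return the headings of `### ` sections that are blank or `_No response_`."""
    sections = {}
    current = None
    for line in body.splitlines():
        if line.startswith("### "):
            current = line[4:].strip()
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return [
        heading
        for heading, lines in sections.items()
        if "\n".join(lines).strip() in ("", "_No response_")
    ]


body = "### CWE\n\nCWE-285: Improper Authorization\n\n### CVE tool link\n\n_No response_\n"
missing = empty_fields(body)
```

In this example `missing` contains only `CVE tool link`, which is the list of fields to raise with the user before generating anything.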

---

## Step 2 — Run the generator

Run the project's console script through `uv run --project`, which
prepares the (cached) virtualenv on first use and reuses it on later
runs:

```bash
uv run --project <framework>/tools/vulnogram/generate-cve-json generate-cve-json <N> \
  --output /tmp/<CVE-ID>.json \
  --version-start <earliest-affected-version>
```

`--version-start` is the one flag the tracking issue body almost never
contains and that Vulnogram expects filled in (the body field usually
encodes only the upper bound). The remediation developer credit comes
from the body's *Remediation developer* field, populated by the
`sync-security-issue` skill from the linked PR's author — no CLI flag
needed in the normal flow. For a vulnerability that was first introduced
in `3.0.0` and fixed in `3.2.2`, for example:

```bash
uv run --project <framework>/tools/vulnogram/generate-cve-json generate-cve-json 232 \
  --output /tmp/CVE-2026-40913.json \
  --version-start 3.0.0
```

Pass `--remediation-developer "Name"` only when you need to add a
developer credit on top of what the body already
contains — for example a co-author who didn't end up as the PR's
GitHub author.

Additional flags, all optional:

- `--cve-id CVE-YYYY-NNNN+` — override the CVE ID if the *CVE tool link*
  field is empty.
- `--title "<vendor>: <product>: …"` — override the title.
- `--vendor` / `--product` / `--package-name` / `--collection-url` —
  override product identity (defaults sourced from the project's TOML
  config under `[product]`).
- `--org-id <uuid>` — override the CNA assigner org id (defaults to the
  ASF org id).
- `--discovery UNKNOWN|INTERNAL|EXTERNAL|USER` — override
  `source.discovery`.
- `--no-envelope` — emit only the `cna` container (no `cveMetadata`,
  no `dataType`/`dataVersion` wrapper). Use this if Vulnogram's `#source`
  tab is in "inner block only" mode.
- `--stdin` — read the issue body from stdin instead of calling `gh`.
  Useful for offline iteration and for drafting by hand.

The script is deterministic — re-running it with the same flags and the
same tracking-issue body produces the same JSON bytes.

### Output shape (in brief)

The generated record matches what Vulnogram exports after a save,
minus editor cruft. Notable fields:

- `containers.cna.affected[]` — `vendor`, `product`, `collectionURL`,
  `packageName` and a `versions[]` entry with `version`, `lessThan`,
  `status: "affected"`, `versionType: "semver"`.
- `containers.cna.descriptions[]` — both a plain `value` and an HTML
  `supportingMedia` alternative (Vulnogram's WYSIWYG mode needs both).
- `containers.cna.problemTypes[].descriptions[]` — `cweId`,
  human-readable `description`, `type: "CWE"`.
- `containers.cna.metrics[].other` — `type: "Textual description of
  severity"` and `content.text` = the severity word.
- `containers.cna.credits[]` — one entry per *Reporter credited as*
  line (type `"finder"`), plus one entry per *Remediation developer*
  body line and per `--remediation-developer` CLI override (type
  `"remediation developer"`); duplicates between the body field and
  CLI flags are dropped silently.
- `containers.cna.references[]` — URLs with auto-tagged `tags`:
  - GitHub `pull/` or `commit/` URLs → `["patch"]`;
  - `lists.apache.org` / `security.apache.org` → `["vendor-advisory"]`;
  - everything else → no tag.
- `containers.cna.source.discovery` — `"UNKNOWN"` by default.
- `containers.cna.providerMetadata.orgId` — ASF assigner org id.
- `cveMetadata` — `assignerOrgId`, `cveId`, `serial`, `state: "PUBLISHED"`.
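
The host-based tagging rules for `references[]` listed above amount to a short function. This is a sketch under the assumption that "GitHub" means the `github.com` host; `tag_reference` is a hypothetical name. It does not cover the separate rule that URLs from the *Public advisory URL* field are always tagged `vendor-advisory`.

```python
from urllib.parse import urlparse

ADVISORY_HOSTS = {"lists.apache.org", "security.apache.org"}


def tag_reference(url: str) -> dict:
    """Build a references[] entry, adding `tags` only when a rule matches."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    ref = {"url": url}
    if host == "github.com" and ("/pull/" in parsed.path or "/commit/" in parsed.path):
        ref["tags"] = ["patch"]
    elif host in ADVISORY_HOSTS:
        ref["tags"] = ["vendor-advisory"]
    # Everything else is emitted without a `tags` key.
    return ref
```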

---

## Step 3 — Surface the output to the user

After the script finishes, print these three things in order:

1. **The output file path**, with a one-line `cat` suggestion so the user
   can review the JSON in the terminal:

       ```
       Wrote /tmp/CVE-2026-40913.json
       cat /tmp/CVE-2026-40913.json
       ```

2. **A clipboard-copy command** appropriate to the user's platform. On
   Linux with `xclip` installed:

       ```
       xclip -selection clipboard < /tmp/CVE-2026-40913.json
       ```

   On Wayland: `wl-copy < …`. On macOS: `pbcopy < …`. If
   `xclip` / `wl-copy` / `pbcopy` is not on PATH, skip the clipboard
   command and tell the user to copy manually.

3. **The Vulnogram `#source` paste URL**, as a clickable link rendered per
   the "Linking CVEs" rule in [`AGENTS.md`](../../../AGENTS.md):

       ```
       Paste the JSON into the Vulnogram #source tab:
         [CVE-2026-40913](https://cveprocess.apache.org/cve5/CVE-2026-40913#source)
       ```

The #source tab on the ASF CVE tool is the direct "paste raw JSON" view of
the Vulnogram form. The page loads the current record, you paste the
script output over the top, click Save, and the form view reflects the
new values.
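
Choosing the clipboard command in item 2 can be done by probing `PATH`. The sketch below uses `shutil.which` from the standard library; the probe order and the helper name `clipboard_command` are assumptions.

```python
import shlex
import shutil
from typing import Optional

# (executable, command template) pairs, probed in order.
CLIPBOARD_TOOLS = [
    ("wl-copy", "wl-copy < {path}"),
    ("xclip", "xclip -selection clipboard < {path}"),
    ("pbcopy", "pbcopy < {path}"),
]


def clipboard_command(path: str, which=shutil.which) -> Optional[str]:
    """Return a copy-to-clipboard command line, or None if no tool is on PATH."""
    for tool, template in CLIPBOARD_TOOLS:
        if which(tool):
            # Quote the path so unusual file names stay a single shell word.
            return template.format(path=shlex.quote(path))
    return None
```

A `None` result corresponds to the "tell the user to copy manually" branch. The `which` parameter exists only so the lookup can be stubbed when testing.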

### Optional: `--attach` to embed (or refresh) the JSON in the issue body

If the user also wants the JSON *attached* to the tracking issue itself
(so it is discoverable from the issue without needing the local file),
add `--attach` to the invocation:

```bash
uv run --project <framework>/tools/vulnogram/generate-cve-json generate-cve-json 232 \
  --output /tmp/CVE-2026-40913.json \
  --version-start 3.0.0 \
  --attach
```

What `--attach` does:

- After generating the JSON (exactly the same bytes as without `--attach`),
  edits the tracking **issue's body** to embed the full JSON inside a
  four-backtick fenced code block, collapsed behind a `<details>`
  disclosure so long records don't bloat the issue view. The block is
  appended *after* the existing template fields, right after the
  *CVE tool link* field, so it lives at the end of the body.
- Brackets the attachment with a pair of hidden HTML-comment markers
  (``<!-- generate-cve-json: cve=CVE-YYYY-NNNN+ version=v1 -->`` …
  ``<!-- generate-cve-json:end cve=CVE-YYYY-NNNN+ version=v1 -->``) so
  subsequent runs can find the existing embedded block and **replace it
  in place**, without spawning duplicates and without touching any
  other text in the body.
- Re-running with `--attach` is safe and idempotent: same issue body →
  same JSON → the script patches the body, leaving you with one and only
  one embedded attachment per CVE id. If the current body already
  matches what the script would write, the PATCH is skipped entirely,
  so a no-op run does not bump the issue's "updated" timestamp.
- The script prints `Embedded CVE JSON in issue body on
  <tracker>#NNN` on first run and `Replaced CVE JSON in
  issue body on <tracker>#NNN` on subsequent runs, followed
  by a URL that deep-links to the `## CVE JSON — paste-ready for …`
  heading anchor inside the body.
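
The marker-based replace-in-place behaviour can be sketched as below. `embed_block` is a hypothetical helper, and `payload` stands in for the whole `<details>` section with its fenced JSON, whose exact layout is not reproduced here.

```python
import re


def embed_block(body: str, cve_id: str, payload: str) -> str:
    """Append the marker-bracketed block, or replace an existing one in place."""
    start = f"<!-- generate-cve-json: cve={cve_id} version=v1 -->"
    end = f"<!-- generate-cve-json:end cve={cve_id} version=v1 -->"
    block = f"{start}\n{payload}\n{end}"
    pattern = re.compile(re.escape(start) + r".*?" + re.escape(end), re.DOTALL)
    if pattern.search(body):
        # A function replacement keeps backslashes in the JSON from being
        # interpreted as regex escape sequences.
        return pattern.sub(lambda _match: block, body, count=1)
    return body.rstrip("\n") + "\n\n" + block + "\n"


body = "### CVE tool link\n\nhttps://cveprocess.apache.org/cve5/CVE-2026-40913\n"
once = embed_block(body, "CVE-2026-40913", '{"a": 1}')
twice = embed_block(once, "CVE-2026-40913", '{"a": 1}')
assert once == twice  # unchanged body, so the PATCH could be skipped
```

Comparing the new body with the current one before sending the PATCH is what makes a re-run with identical input a true no-op.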

**Why embedded in the body and not as a comment?** Two reasons:

1. **Natural "pinning" without an API for it.** GitHub has no
   pin-comment API. A separate comment ends up buried below every
   status-change comment — so a newcomer looking at the issue sees a
   long comment timeline with no obvious way to find the CVE JSON.
   The issue body always renders *above* the entire comment timeline,
   so anything embedded in the body is effectively pinned.
2. **One place to read the tracker.** The reporter-template fields,
   the CVE metadata, and the paste-ready JSON are all in one place —
   no hunting through the timeline to reconstruct the current state.

(GitHub also does not expose its `user-attachments` file-upload
pipeline to the REST API — only the web UI drag-and-drop uses it — so
a real file attachment isn't available to automation anyway. Embedding
as body text is the closest automatable equivalent and is directly
visible without a download round-trip.)

**Confidentiality.** The embedded block lives inside the private repo,
so it inherits the repo-wide confidentiality rules. Linking CVE
references inside the block follows the "Linking CVEs" rule in
[`AGENTS.md`](../../../AGENTS.md): before publication the block links
the ASF CVE tool; after publication, re-running the script includes a
`cve.org` link as well.

---

## Step 4 — Propose an issue comment recording the update

Per the "Keeping the reporter informed" rule in
[`README.md`](../../../README.md), any status change on an issue must be
recorded in an issue comment. Pasting a new version of the CVE record is
a status change. Propose (and, on confirmation, post) a short comment
like:

> **CVE entry regenerated from the tracking issue** — generated paste-ready
> JSON for [`CVE-2026-40913`](https://cveprocess.apache.org/cve5/CVE-2026-40913)
> from the current body fields (description, affected versions `< 3.2.2`,
> CWE-285, Low severity, N credits, M references). Pasted into the
> Vulnogram `#source` tab; the record is now in sync with the tracking
> issue.

Include a count of credits and references so a later reviewer can
sanity-check that nothing was dropped.

---

## Step 5 — Never edit the JSON in place after pasting

Once the JSON has been pasted into Vulnogram and saved, **do not** edit
the local JSON file to match tool-side changes. Re-run the script instead
(it is deterministic — you will get a clean baseline), diff the new
output against the current Vulnogram state, and paste the merged JSON
back. This keeps the tracking issue as the single source of truth: if
Vulnogram shows something different from the generated JSON, either the
issue body is out of date and needs a `sync-security-issue` run, or the
tool-side difference is intentional and the reviewer will keep it.

---

## Guardrails

- **Confidentiality.** The script deliberately **drops** any URL that
  points at `cveprocess.apache.org` or the project's `<tracker>` repo
  from the references list before serialising. Those URLs are private
  ASF-internal links and should not appear in a published CVE record.
  See the "Confidentiality of `<tracker>`" section of
  [`AGENTS.md`](../../../AGENTS.md).
- **CVE IDs are always linked** per the "Linking CVEs" rule in
  [`AGENTS.md`](../../../AGENTS.md). When the skill mentions the CVE in
  proposals, recaps or comments on the `<tracker>` issue, it must
  render the ID as a markdown link — before publication to the ASF
  CVE tool, and additionally to `cve.org` after publication.
- **`<tracker>` references are always linked** per the "Linking
  `<tracker>` issues and PRs" rule in
  [`AGENTS.md`](../../../AGENTS.md). When the skill mentions the
  tracking issue in its own comments, render it as a markdown link.
- **Deterministic output is a feature.** Do not introduce timestamps,
  random UUIDs, ordering dependencies on dict iteration, or other sources
  of non-determinism into the script. If you need to add a new field,
  make sure the output still hashes the same across runs on the same
  input.
- **Multi-entry fields.** Credits are split on **newlines only** to
  preserve the `Full Name, Affiliation` pattern. References are extracted
  from URL tokens in the field value. Do not reintroduce comma-splitting
  on credits.
- **No envelope means no metadata.** `--no-envelope` drops the
  `cveMetadata` block which includes the CVE ID; the JSON is pure CNA
  content. Make sure the user knows they will have to set the CVE ID by
  hand in Vulnogram in that mode.
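
The URL extraction and confidentiality filter described in the guardrails above could look roughly like this. It is a sketch: `public_urls` is a hypothetical name, `tracker` is assumed to be an `owner/repo` slug hosted on `github.com`, and the URL pattern is a simplification.

```python
import re
from urllib.parse import urlparse

# Stops at whitespace, brackets and commas so comma-separated lists split cleanly.
URL = re.compile(r"https?://[^\s<>()\[\],]+")


def public_urls(field: str, tracker: str) -> list:
    """Extract every URL from a field, dropping private ones, sorted and unique."""
    private_prefix = "/" + tracker.lower()
    kept = set()
    for raw in URL.findall(field):
        url = raw.rstrip(".;")  # trailing sentence punctuation is not part of the URL
        parsed = urlparse(url)
        host = (parsed.hostname or "").lower()
        path = parsed.path.lower()
        if host == "cveprocess.apache.org":
            continue  # CVE tool links are internal
        if host == "github.com" and (path == private_prefix or path.startswith(private_prefix + "/")):
            continue  # links into the private tracker repo
        kept.add(url)
    return sorted(kept)
```

Filtering before serialisation means a stray tracker link pasted into *PR with the fix* never reaches the published record.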

---

## References

- [`AGENTS.md`](../../../AGENTS.md) — repo-wide conventions (confidentiality,
  Linking CVEs, Linking `<tracker>` issues and PRs, release-branch
  defaults).
- [`README.md`](../../../README.md) — handling process, in particular
  step 13 (fill in CVE tool fields and send advisory from the tool)
  and step 15 (paste the attached JSON into Vulnogram's #source tab,
  move the CVE to PUBLIC, close the issue).
- [`sync-security-issue`](../../../.claude/skills/sync-security-issue/SKILL.md) — the sibling
  skill that populates the tracking issue fields this skill consumes.
- [`fix-security-issue`](../../../.claude/skills/fix-security-issue/SKILL.md) — the other
  sibling skill that opens a public PR and updates the tracking issue
  with the fix URL.
