---
name: import-security-issue-from-md
description: |
  Open one or more `<tracker>` tracking issues from a markdown file
  containing a batch of security findings (typically the output of an
  AI security review or a third-party scanner). Each finding in the
  file becomes one tracker, landing in the `Needs triage` board
  column with the standard issue-template body fields populated from
  the markdown sections. Unlike `import-security-issue` (Gmail) and
  `import-security-issue-from-pr` (public PR), there is no inbound
  reporter to reply to and no PR to inspect — the file itself is the
  full report.
when_to_use: |
  Invoke when a security team member says "import findings from
  <path>", "import this scan output", "load these issues from a
  markdown file", or hands the agent a `.md` file containing one or
  more issue blocks separated by `---`. Typical sources: the output
  of a `/security-review`-style AI pass over an upstream branch, a
  third-party SAST report exported as markdown, or a security
  consultant's findings document. Not appropriate when a single
  inbound report is best handled through the Gmail path
  (`import-security-issue`) or when there is a public PR to anchor
  the import on (`import-security-issue-from-pr`).
---

<!-- Placeholder convention (see AGENTS.md#placeholder-convention-used-in-skill-files):
     <project-config> → adopting project's `.apache-steward/` directory
     <tracker>        → value of `tracker_repo:` in <project-config>/project.md
                       (example: airflow-s/airflow-s for the Apache Airflow security team)
     <upstream>       → value of `upstream_repo:` in <project-config>/project.md
                       (example: apache/airflow)
     Before running any bash command below, substitute these with the
     concrete values from the adopting project's <project-config>/project.md. -->

# import-security-issue-from-md

This skill is the **batch on-ramp** of the security-issue handling
process for the case where the security team has a markdown file
containing one or more pre-formatted security findings — typically
the output of an AI security review run against an `<upstream>`
branch, or a third-party scanner exporting in a similar shape. It
parses each finding in the file and creates one `<tracker>` tracking
issue per finding, landing them in `Needs triage` so the standard
validity discussion (Step 3 of [`README.md`](../../../README.md))
can run.

It is the third on-ramp variant alongside the two existing import
skills:

| | `import-security-issue` | `import-security-issue-from-pr` | `import-security-issue-from-md` |
|---|---|---|---|
| Source | `<security-list>` Gmail / PonyMail thread | `<upstream>` PR URL or number | Markdown file with one or more findings |
| Reporter | External researcher | None (PR author = remediation developer) | None (the file is the report; usually AI- or scanner-generated) |
| Receipt-of-confirmation reply | Drafted on the inbound thread | Skipped — no reporter to reply to | Skipped — no reporter to reply to |
| Validity assessment | Hosted on the tracker after import | Already done informally before invocation | Hosted on the tracker after import |
| Initial board column | `Needs triage` | `Assessed` | `Needs triage` |
| Cardinality | One thread → one tracker | One PR → one tracker | One file → N trackers |

**Golden rule — every finding lands as `Needs triage`.** A
markdown file (especially an AI-generated one) is a *proposal* of
findings, not an assessment. Each tracker created by this skill
must go through the same Step 3 validity discussion as a Gmail-
imported tracker. The skill must not pre-assess findings based on
their `**Severity:**` tag, must not skip the validity step for
findings tagged `HIGH`, and must not auto-allocate CVEs.

**Golden rule — confidentiality.** The input markdown file is
private security-team material. Treat it the same as
`<security-list>` content per the
[Confidentiality of `<tracker>`](../../../AGENTS.md#confidentiality-of-the-tracker-repository)
rule: paste verbatim into the (private) tracker is fine; **never**
paste into a public surface — not into `<upstream>`, not into a
public GHSA, not into any comment on a public repo. The `## Location`
URL fields commonly point at public branches / files; that is fine
to render as-is in the tracker (the URL is already public), but do
not propagate the surrounding security framing to the public
surface the URL points at.

**Golden rule — propose every finding individually before applying.**
Even when the input is a 50-finding file, the skill surfaces a
proposal table listing every finding and waits for explicit
confirmation. The default disposition mirrors `import-security-issue`:
*import all unless rejected upfront* (`skip N` to drop a specific
candidate). A bare `go` / `proceed` / `yes, all` imports every
non-rejected candidate. The skill must still render each candidate
in the proposal so the user can scan and override.

---

## Prerequisites

Before running, the skill needs:

- **`gh` CLI authenticated** with collaborator access to
  `<tracker>`. The skill calls `gh issue create`,
  `gh search issues`, and `gh issue edit`.
- **Project-board write access** for the `addProjectV2ItemById` /
  `updateProjectV2ItemFieldValue` mutations from
  [`tools/github/project-board.md`](../../../tools/github/project-board.md).
- **Read access to the markdown file** — the skill expects an
  absolute path or a path relative to `cwd`.

No Gmail, no PonyMail, no `<upstream>` access. There is no inbound
thread to read and no reporter to draft a reply to.

See [Prerequisites for running the agent skills](../../../README.md#prerequisites-for-running-the-agent-skills)
in `README.md` for overall setup.

---

## Step 0 — Pre-flight check

Before parsing the file, verify:

1. **`gh` is authenticated and has access.** Run
   `gh api repos/<tracker> --jq .name`; on 401 / 403 / 404, stop
   and tell the user to log in or get added.
2. **The input path is readable.** `Read` the file. If it does not
   exist or is empty, stop and surface a one-line ask for the
   correct path.
3. **The file is markdown of the expected shape.** Quick sanity
   check: at least one `# ` (title) heading and at least one
   `**Severity:**` metadata line. If neither is present, stop
   and surface: *"This does not look like a findings file. Expected
   format: per-finding `# Title`, `## Details`, `## Location`,
   `## Impact`, `## Reproduction steps`, `## Recommended fix`
   sections, then a `**Severity:** … **Status:** … **Category:**
   … **Repository:** … **Date created:** …` metadata block; blocks
   separated by `---` on their own line."*

If any check fails, do **not** proceed.

---

## Step 1 — Parse the file into findings

The expected per-finding shape:

```markdown
# <Title — one short imperative phrase>

## Details
<Multi-paragraph technical description. May reference file paths,
line numbers, function names. Often the longest section.>

## Location
[<file/line label>](<URL into the public source>)

## Impact
<One sentence. The threat actor's gain: arbitrary code execution,
data exfiltration, privilege escalation, etc.>

## Reproduction steps
1. <numbered list>
2. ...

## Recommended fix
<Suggested remediation. Free-form prose.>

---
**Severity:** HIGH|MEDIUM|LOW|UNKNOWN
**Status:** Open
**Category:** <free-text — Insecure Deserialization / RCE, SSRF, Broken Access Control, etc.>
**Repository:** <owner>/<repo>
**Branch:** <ref>
**Date created:** YYYY-MM-DD
```

Findings are separated by `---` on its own line (with blank lines
around it). The metadata block at the end of each finding is
itself preceded by `---`.

Parsing recipe:

1. Read the whole file.
2. Split on the regex `(?m)^---\s*$` to get raw blocks.
3. Drop blocks that are pure whitespace.
4. Group adjacent blocks: a "finding" is the block ending in the
   `**Severity:**` metadata line, plus the immediately preceding
   block (which carries `# Title` through `## Recommended fix`).
   Equivalently: walk blocks pairwise, treating
   `(narrative-block, metadata-block)` as one finding.
5. For each finding, extract the per-section payload:
   - `# Title` → the line after `# ` until newline.
   - Each `## <Section>` → everything until the next `## ` heading
     or the end of the narrative block.
   - Metadata: per-line `**Field:** value` extraction.
6. Validate per finding:
   - `# Title` is non-empty.
   - `**Severity:**` is one of `HIGH`, `MEDIUM`, `LOW`, `UNKNOWN`
     (case-insensitive); anything else → record as `UNKNOWN` and
     surface a one-line warning.
   - `**Repository:**` matches `<owner>/<repo>` shape; if absent,
     fall back to `<upstream>` (from `<project-config>/project.md`)
     and warn.
   - `## Details`, `## Impact`, and `## Reproduction steps` are
     present and non-empty. If any are missing, surface a warning
     but do not skip the finding (the importer can fill in
     `_No response_` for the corresponding tracker body field).

Record into the observed-state bag a list of `findings`, each with:

- `index` (1-based, matches the proposal table number).
- `title` (raw).
- `details`, `location_url`, `location_label`, `impact`,
  `repro_steps`, `recommended_fix` (string payloads).
- `severity`, `status`, `category`, `repository`, `branch`,
  `date_created` (metadata).

---

## Step 2 — Duplicate-tracker guard

For each parsed finding, search `<tracker>` for an existing tracker
with overlapping content so the skill does not silently land a
duplicate:

```bash
gh search issues "<title-keyword>" --repo <tracker> --json number,title,state,url
```

Pick `<title-keyword>` as the most distinctive 3-5 word substring
from the finding's title (drop common security words like *"in"*,
*"the"*, *"via"*). Hits with high title overlap, or hits whose body
mentions the same `## Location` URL, are surfaced inline in the
proposal as *"possible duplicate of `<tracker>#NNN`"* — they do not
auto-skip; the user decides during Step 4.

The duplicate guard is a soft signal, not a hard gate. Many AI scans
re-discover findings already tracked; surfacing the overlap lets the
user `skip N` for those candidates without parsing the full file by
hand.

---

## Step 3 — Build proposed tracker contents (per finding)

For each finding, prepare the tracker fields:

### 3a — Title

The tracker title is the finding's `# Title` with the standard
`[ Security Report ]` prefix prepended (per the issue-template
convention; see
[`tools/github/issue-template.md`](../../../tools/github/issue-template.md)):

```
[ Security Report ] <finding title>
```

The title is left otherwise untouched — this skill does not run the
title-normalisation cascade (that lives in `allocate-cve`, by which
point the validity of the report is established).

### 3b — Issue body

Map markdown sections to the standard `<tracker>` issue-template
body fields (per
[`tools/github/issue-template.md`](../../../tools/github/issue-template.md);
the role → concrete-name mapping comes from
[`<project-config>/project.md`](../../../<project-config>/project.md#issue-template-fields)):

| Markdown source | Tracker body field | Shape |
|---|---|---|
| `## Details` + `## Impact` + `## Reproduction steps` | `The issue description` | Verbatim, in that order, separated by blank lines and a `**Impact**`/`**Reproduction steps**` sub-heading line. |
| (auto) | `Short public summary for publish` | `_No response_` (the public summary is sanitised separately at Step 13). |
| `**Repository:**` + `**Branch:**` | `Affected versions` | Literal text *"`<owner>/<repo>` @ `<branch>` — versions to be confirmed during triage."* The release-train mapping happens at allocation. |
| (auto) | `Security mailing list thread` | `N/A — imported from markdown file <basename>; no security@ thread.` |
| (auto) | `Public advisory URL` | `_No response_`. |
| (auto) | `Reporter credited as` | `_No response_`. The credit decision happens at triage; if the file is AI-generated, there is typically no human finder to credit. |
| `## Location` URL (when it points at a `<upstream>` PR) | `PR with the fix` | The URL. Otherwise `_No response_` — the location commonly references a vulnerable file, not a fix. |
| (auto) | `Remediation developer` | `_No response_`. |
| `**Category:**` | `CWE` | Literal value (free text); the actual CWE assignment happens at triage / allocation. |
| `**Severity:**` | `Severity` | `HIGH` / `MEDIUM` / `LOW` / `UNKNOWN` from the metadata block. Surface in the body as-is; the CVSS scoring happens independently per [`AGENTS.md`](../../../AGENTS.md). |
| (auto) | `CVE tool link` | `_No response_`. |

Also append a *"Recommended fix (per the source markdown)"*
collapsible block at the end of the body. The recommended fix is
useful triage context but does not belong in any of the standard
template fields; a `<details>` block at the end of the body keeps it
out of the per-field surgery the other skills perform.

### 3c — Labels

Apply at creation:

- **`needs triage`** — every finding from this skill enters the
  standard validity-assessment flow.
- **`security issue`** — required for the `<tracker>` *Auto-add to
  project* workflow filter (`is:issue label:"security issue"`);
  without it the issue will not appear on the board.

Do **not** apply a scope label. Scope labels are assigned at
Step 5 of the handling process, after the validity assessment.

### 3d — Project board

Target column: `Needs triage`. The *Auto-add to project* workflow
adds the issue automatically once `security issue` is applied; the
skill still calls
`updateProjectV2ItemFieldValue` to set the `Status` to `Needs
triage` explicitly, so the column lands deterministically (per the
orphan-issue path in
[`tools/github/project-board.md`](../../../tools/github/project-board.md#orphan-issue-path)).

### 3e — Status-rollup comment

The first entry on the tracker's status rollup. Shape per
[`tools/github/status-rollup.md`](../../../tools/github/status-rollup.md):

```markdown
<!-- <tracker> status rollup v1 — all bot-authored status updates fold into this single comment. -->
<details><summary><YYYY-MM-DD> · @<author-handle> · Import from markdown (<basename>, finding <K>/<N>)</summary>

**Imported from markdown file `<basename>` on <YYYY-MM-DD>** (severity: `<severity>`, category: `<category>`).

This tracker was deliberately opened by the security team from a batch findings file. The validity of the report has **not** been assessed yet — the tracker landed in the `Needs triage` column accordingly. Standard Step 3 discussion applies.

**Source:** `<basename>` (finding `<K>` of `<N>` in the file).
**Location reference:** <location_url>
**Severity (from source):** `<severity>` (informational; CVSS scoring happens at allocation).
**Category (from source):** `<category>` (informational; CWE assignment happens at allocation).
</details>
```

Zero-whitespace rules from
[`status-rollup.md`](../../../tools/github/status-rollup.md#the-rollup-comment-shape)
apply: no leading spaces on any line inside the `<details>`
block, exactly one blank line after `<summary>…</summary>`,
exactly one blank line before `</details>`.

---

## Step 4 — Surface the proposal and wait for confirmation

Render a single proposal covering every parsed finding:

```
<file-basename> — N findings parsed.

| # | Severity | Category                       | Title                                              | Possible duplicate |
|---|----------|--------------------------------|----------------------------------------------------|--------------------|
| 1 | HIGH     | Insecure Deserialization / RCE | Arbitrary callable invocation during serialized…  | <tracker>#NNN      |
| 2 | HIGH     | Insecure Deserialization / RCE | Arbitrary import in custom deadline-reference…    | (none)             |
| 3 | MEDIUM   | Server-Side Request Forgery    | SSRF from API server via worker-supplied hostname | (none)             |
| 4 | MEDIUM   | Broken access control          | Import-error per-DAG authorization check is a no-op | (none)             |
| 5 | LOW      | Open redirect                  | Open-redirect validator accepts backslash-prefix… | (none)             |
| 6 | LOW      | Xss                            | DAG-author-controlled hrefs rendered without…     | (none)             |

Default disposition: import all 6 as `Needs triage`.
Reply with one of:
  - `go` / `proceed` / `yes, all`     — import every finding above.
  - `skip 4`                          — drop finding 4; import the rest.
  - `skip 4,6`                        — drop multiple.
  - `cancel` / `none`                 — bail; no trackers created.
```

Confirmation forms:

- `go` / `proceed` / `yes, all` — import every finding.
- `skip <N>` (or `skip <N>,<M>,…`) — drop the listed findings;
  import the remaining ones. The dropped findings get **no
  tracker** (no audit-trail draft, no follow-up — the markdown
  file itself is the audit trail).
- `cancel` / `none` / `hold off` — bail; no trackers created.

If a possible-duplicate flag is non-empty for a finding, the user
typically `skip`s it after a quick eyeball of the cited tracker; the
skill should not auto-skip on duplicate signal alone.

The proposal is a single round-trip even for a 50-finding file. The
skill must not stream per-finding confirmations.

---

## Step 5 — Apply (per kept finding, in order)

For each finding the user did not `skip`, run Steps 5a-5f
sequentially. The whole batch is a serial loop, **not** parallel —
per-finding `gh` calls and project-board mutations interleave with
GitHub rate limits cleanly when serialised.

### 5a — Create the tracker via `gh api`

Bypasses the form so the `Security mailing list thread`
required-field check does not fire. Same pattern as
[`import-security-issue-from-pr`'s](../import-security-issue-from-pr/SKILL.md#7a--create-the-tracker-via-gh-api) Step 7a.

Write the body to a temp file (per finding):

```bash
cat > /tmp/import-md-<basename>-<index>-body.md <<'EOF'
### The issue description

> **Imported from markdown file `<basename>` (finding <K>/<N>)** — there is no inbound `security@` report; the markdown sections below are the verbatim source.

**Details:**

<## Details payload, verbatim>

**Impact:**

<## Impact payload, verbatim>

**Reproduction steps:**

<## Reproduction steps payload, verbatim>

### Short public summary for publish

_No response_

### Affected versions

`<owner>/<repo>` @ `<branch>` — versions to be confirmed during triage.

### Security mailing list thread

N/A — imported from markdown file `<basename>`; no security@ thread.

### Public advisory URL

_No response_

### Reporter credited as

_No response_

### PR with the fix

<location_url if it points at a <upstream> PR, else _No response_>

### Remediation developer

_No response_

### CWE

<category from metadata; free-text — actual CWE assigned at triage>

### Severity

<severity from metadata>

### CVE tool link

_No response_

<details><summary>Recommended fix (per the source markdown)</summary>

<## Recommended fix payload, verbatim>
</details>
EOF
```

Create:

```bash
gh api repos/<tracker>/issues \
  -f title='[ Security Report ] <finding title>' \
  -F body=@/tmp/import-md-<basename>-<index>-body.md \
  --jq '.number, .node_id, .html_url'
```

Capture `number`, `node_id`, `html_url` from the response.

### 5b — Apply labels

```bash
gh issue edit <new-issue-number> \
  --repo <tracker> \
  --add-label 'needs triage' \
  --add-label 'security issue'
```

No scope label, no `pr created` / `pr merged` — those come later
in the lifecycle.

### 5c — Pin to the `Needs triage` board column

Run the orphan-issue path from
[`tools/github/project-board.md`](../../../tools/github/project-board.md#orphan-issue-path):

```bash
gh api graphql -f query='
  mutation($pid:ID!,$nid:ID!) {
    addProjectV2ItemById(input: { projectId: $pid, contentId: $nid }) {
      item { id }
    }
  }' \
  -F pid=<project-node-id> \
  -F nid=<issue-node-id> \
  --jq '.data.addProjectV2ItemById.item.id'
```

Capture the returned item ID, then set `Status` to `Needs triage`:

```bash
gh api graphql -f query='
  mutation($pid:ID!,$iid:ID!,$fid:ID!,$oid:String!) {
    updateProjectV2ItemFieldValue(input: {
      projectId: $pid,
      itemId: $iid,
      fieldId: $fid,
      value: { singleSelectOptionId: $oid }
    }) { projectV2Item { id } }
  }' \
  -F pid=<project-node-id> \
  -F iid=<item-id> \
  -F fid=<status-field-id> \
  -f oid=<needs-triage-option-id>
```

The `pid` / `fid` / `oid` values come from
[`<project-config>/project.md`](../../../<project-config>/project.md#github-project-board);
re-fetch them via the introspection query in
[`project-board.md`](../../../tools/github/project-board.md) if
either mutation returns `not found`.

### 5d — Post the status-rollup comment

```bash
gh issue comment <new-issue-number> \
  --repo <tracker> \
  --body-file /tmp/import-md-<basename>-<index>-rollup.md
```

The rollup body is the one drafted in Step 3e with placeholders
filled.

### 5e — Cleanup (per finding)

Delete `/tmp/import-md-<basename>-<index>-body.md` and
`/tmp/import-md-<basename>-<index>-rollup.md`. They served their
purpose for this finding and would otherwise accumulate.

### 5f — Loop progress

After every finding lands, print a short one-liner so the user can
see progress on long batches:

```
[K/N] <tracker>#NNN — <finding title>
```

If a single finding's `gh api` call fails (rate limit, transient
network error, schema mismatch), surface the failure with the
finding's index and continue with the rest. Do **not** abort the
batch on the first failure — the user can re-invoke for the failed
indices once the cause is fixed.

---

## Step 6 — Recap

Print a one-screen recap:

- File imported (`<basename>`, `<N>` findings parsed).
- For each kept finding: `<tracker>#NNN` (clickable), title.
- For each `skip`-ped finding: index, title, reason if surfaced
  (`possible duplicate`, `user skip`, etc.).
- For each failed finding: index, title, failure cause (so the
  user can re-invoke).

Then a one-line hand-off:

> Next: triage each new tracker per Step 3 of the handling
> process. Run [`sync-security-issue`](../sync-security-issue/SKILL.md)
> on `<tracker>#NNN` once the validity discussion progresses.

Do **not** auto-invoke `sync-security-issue` — these trackers are
freshly created in `Needs triage` and have nothing to sync until
the validity discussion produces signal.

---

## What this skill does **not** do

- **Does not run the validity discussion.** Every finding lands as
  `Needs triage`; Step 3 of the handling process happens in tracker
  comments after import.
- **Does not draft a reporter reply.** There is no reporter — the
  markdown file is the report, and any clarification questions the
  team has about a finding are recorded as comments on the
  resulting tracker, not on a Gmail thread.
- **Does not allocate CVEs.** A finding tagged `**Severity:** HIGH`
  in the source markdown is *still* unassessed from the security
  team's perspective; the CVE-allocation gate (per
  [`allocate-cve`](../allocate-cve/SKILL.md)) requires the team's
  own validity decision first.
- **Does not parse markdown formats other than the one documented
  in Step 1.** If the input file uses a different shape (e.g.
  `### Title` instead of `# Title`, or a YAML front-matter block
  instead of `**Field:**` lines), surface a one-line ask for the
  user to either reformat the file or open the trackers manually.
  The skill must not silently best-effort parse a divergent shape;
  the resulting trackers would be subtly malformed and confuse the
  rest of the lifecycle.
- **Does not characterise the source as authoritative.** The
  status-rollup line `Severity (from source): HIGH (informational;
  CVSS scoring happens at allocation)` is the standard wording —
  the source's tags are recorded, not adopted.

---

## Failure modes

| Symptom | Likely cause | Fix |
|---|---|---|
| File parse yields zero findings | The file uses a different heading level or no `**Severity:**` metadata block | Stop; surface the expected shape from Step 1 and ask the user to reformat. |
| `gh api repos/<tracker>/issues` returns 422 | Title or body field shape doesn't match the issue template | Re-check the body against the eleven `### <field>` headings; the heading text is case-sensitive. |
| `addProjectV2ItemById` returns `not found` for the project | Project-board node ID changed in `<project-config>/project.md` | Re-run the introspection query in [`project-board.md`](../../../tools/github/project-board.md) and update `<project-config>/project.md`. |
| Many possible-duplicate hits surfaced for every finding | The file is a re-scan against an already-triaged branch | Pause; consider whether the right action is `skip` for every finding (the existing trackers cover this) rather than landing duplicates. |
| `gh api` rate-limits mid-batch | Large file (50+ findings) hits the per-minute limit | The skill surfaces the partial-success recap from Step 6; re-invoke against the same file later for the failed indices (the duplicate-guard at Step 2 will catch the already-imported ones). |

---

## Examples

### Example 1 — A six-finding AI-scan output

```
import findings from /tmp/scan-michaelwinser-airflow-2026-04-28.md
```

The skill parses six findings (severities: HIGH×2, MEDIUM×2,
LOW×2). The duplicate guard flags one HIGH as a possible
duplicate of an already-tracked deserialization finding; the user
replies `skip 1`, accepting the duplicate hint. The remaining five
land as `<tracker>#NNN..#NNN+4` in `Needs triage`. Recap shows
the five new tracker URLs and one skip with the duplicate
reference.

### Example 2 — A single-finding scanner export

```
import findings from ~/Downloads/sast-export.md
```

The file contains one finding (a SAST report exported as
markdown). The skill parses, surfaces a one-row proposal, the
user replies `go`, the tracker lands. The cardinality is the same
as a Gmail import; the only difference is the source format.

### Example 3 — Malformed input

```
import findings from /tmp/notes.md
```

`/tmp/notes.md` is a free-form scratch file — no `**Severity:**`
lines, no `---`-separated blocks. Step 0's sanity check fires;
the skill stops with the expected-shape ask and does not create
any tracker.
