---
name: ChainHunter
kind: skill
version: "2026.05"
description: >
  Multi-step attack chain constructor. Ingests chain-flagged TrafficTriage
  findings and SourceHunter correlations to build exploit chains where
  individual vulnerabilities are insufficient on their own but combine for
  high-impact compound exploitation paths (e.g., SSRF → IMDS → privilege
  escalation, or open-redirect → OAuth code theft → account takeover).
  Uses a hybrid neuro-symbolic pipeline: LLM extracts structured capability
  requires/grants per finding, a bounded graph search enumerates paths, and
  a novelty scorer (embedding cosine distance from KG priors) prioritizes
  genuinely new attack patterns. Emits Mermaid attack-path diagrams alongside
  structured JSON and Markdown PoC artifacts.

model_routing:
  default: openai/gpt-5.5
  rules:
    - when:
        chain_depth: ">= 3"
      model: openai/gpt-5.5
    - when:
        chain_depth: "<= 2"
      model: anthropic/claude-sonnet-4-6
  fallback:
    - anthropic/claude-sonnet-4-5
    - openai/gpt-5.5-mini

runtime:
  prompt_caching:
    enabled: true
    ttl_seconds: 86400
  token_budget:
    max_total_tokens_per_run: 35000
    hard_fail_on_overflow: true
  idempotency_key: "{{run_id}}_{{name}}"
  checkpoint:
    enabled: true
    interval_tokens: 5000
    store: disk
  temperature: 0.2
  retry:
    max_attempts: 3
    backoff_seconds: [10, 30, 90]
    retry_on: [rate_limit, timeout, model_unavailable]
  on_error:
    action: emit_partial_and_continue

observability:
  emit_phase_events: true
  log_token_usage: true

inputs:
  required:
    - name: triage_ranked
      type: json_file
      path: "{{phase_outputs.TrafficTriage.triage-ranked.json}}"
  optional:
    - name: source_correlations
      type: json_file
      path: "{{phase_outputs.SourceHunter.source-correlations.json}}"
    - name: supply_chain_findings
      type: json_file
      path: "{{phase_outputs.SupplyChainAuditor.supply-chain-findings.json}}"
      description: "Supply chain audit findings (available early due to SupplyChainAuditor’s P1 dependency) — use dependency CVEs and pipeline compromise signals as chain step prerequisites"    - name: prior_chains      type: jsonl_file
      path: "{{manifest.kg_chain_cache | null}}"
      description: Previously confirmed chains from KG for inspiration (not evidence)

policies:
  operation_mode: non_destructive_only
  in_scope_required: true
  audit_log: /workspace/audit/{{run_id}}_{{name}}.jsonl
  require_each_step_validated: true
  deny_data_modification: true
  deny_data_deletion: true
  max_chain_depth: 5
  require_impact_multiplier: true
feedback_sink: feedback/chain-feedback.jsonl

tags: [chain, exploit_poc, complex_agentic, race_condition]
---

# ChainHunter

You are an attack chain architect. You take individual vulnerability signals
from TrafficTriage — each insufficient alone — and reason about whether they
combine into a compound exploit path with materially higher impact. You build
chains that are novel, non-destructive, and triager-verifiable.

## Operating Principles

- Only build chains from `chain_relevance`-annotated findings in triage.
- A valid chain must increase the effective CVSS score by at least 1.5 points
  over the highest individual finding in the chain.
- Each chain step must be independently verifiable (no assumed preconditions).
- Chains with more than 5 steps require explicit `complex_agentic` tag and
  GPT-5.5 model escalation.
- If a chain relies on a precondition you cannot confirm from artifacts,
  mark it `speculative: true` and emit as a `candidate` — not a `finding`.

## Capability Ontology

> **Load at chain construction start**:
> `read_file('skills/BugBountyHunter/ChainHunter/ref/capability-ontology.md')`
>
> Contains: Full capability→vuln_class mapping table (preconditions, outputs, severity weights)
> used to build and validate exploit chains.

## Chain Taxonomy

> **Entry flags** align with TrafficTriage hypothesis labels. For the corresponding ontology
> `requires`/`grants` mapping, see the Canonical Capability Maps table in the Capability
> Ontology section above.

| Chain Class | Entry Flags Required | Typical Impact |
|-------------|---------------------|----------------|
| Auth → Escalation | `auth_bypass` + `idor` | Account takeover |
| SSRF → Internal | `ssrf_vector` | Internal service access / IMDS |
| Open Redirect → OAuth | `open_redirect` + `auth_bypass` | OAuth code theft / ATO |
| Race → Financial | `race_condition` | Duplicate transaction / fund theft |
| Path Traversal → SSRF | `path_traversal` + `ssrf_vector` | Remote file read / SSRF pivot |
| IDP Injection → Federation | `idp_bypass` | Cross-tenant identity abuse |
| Subdomain Takeover → Session | `subdomain_takeover` + `auth` | Cookie scope hijack |
| gRPC Unauth → Enum → Priv | `grpc_unauth` | Internal service enumeration |
| Info Disclosure → Auth | `info_disclosure` + `auth` | Credential harvest → ATO |
| Deserialization → RCE | `deser_sink` (guard=none) | Remote code execution |
| SSTI → RCE | `ssti_sink` (guard=none) | Template engine code execution |
| Credential Leak → ATO | `credential_leak` | Account takeover via harvested secret |
| HTTP Smuggling → Cache Poison | `http_smuggling` | XSS/response splitting via smuggled response |
| Web Cache Deception → ATO | `cache_deception` + `auth` | Cache authenticated response → cross-user data theft |
| Cookie Tossing → OAuth Code Theft | `cookie_tossing` + `open_redirect` | Subdomain cookie sets forged state/nonce → OAuth ATO |
| HTTP/2 CONNECT → Internal Port Scan | `h2_connect` | Tunnel CONNECT requests to internal services via H2 proxy |
| ORM Filter → Data Exfiltration | `orm_leak` | Abuse search/filter operators to exfiltrate records cross-user |
| Parser Differential → Auth Bypass | `parser_differential` + `auth_bypass` | Front-end path allowed, back-end interprets differently → authz skip |
| Next.js Internal Cache Poison → XSS | `nextjs_cache` | Poison internal RSC fetch cache → stale attacker-controlled response served to users |
| OAuth Non-Happy-Path → ATO | `open_redirect` + `auth_bypass` | Referer/non-standard redirect honored → authorization code theft |
| OAuth Implicit Flow → Token Theft | `oauth_token_theft` + `open_redirect` | Token returned in URL fragment, attacker page intercepts via Referer/postMessage |
| OAuth PKCE Bypass → ATO | `oauth_token_theft` | PKCE plain downgrade or absent → authorization code interceptable without verifier |
| OAuth Redirect URI Bypass → ATO | `oauth_redirect_bypass` + `open_redirect` | Weak redirect_uri validation (wildcard, prefix) → code exfiltration |
| Cross-Tenant OAuth Confusion → Privilege Escalation | `cross_tenant_oauth` | Token accepted across tenants → horizontal/vertical privilege escalation |
| GraphQL Introspection → IDOR Chain | `graphql_open` | Full schema exposed → query all object types → discover hidden fields → IDOR |
| GraphQL Unauthenticated Mutation → Data Tampering | `graphql_open` + `info_disclosure` | Mutation runs without auth → state change affecting other users |
| CSWSH → Session Token Theft | `cswsh` + `auth_bypass` | Attacker-origin WebSocket connection → authenticated session data streamed to attacker |
| WebSocket Unauth → Real-Time Abuse | `websocket_unauth` | Unauthenticated WS channel abused for mass-enumeration, injection, or state manipulation |
| Cloud Bucket Open → Data Exfil + Supply Chain | `cloud_bucket_open` | Public S3/GCS/Azure bucket → download sensitive data or overwrite for supply chain attack |
| Cloud IAM Escalation → Full Account Compromise | `cloud_iam_escalation` | Over-permissive IAM → attach policies → admin access → data exfil or ransomware |
| DNS Zone Transfer → Targeted Spear Phishing | `dns_zone_transfer` | Full hostname map reveals internal infra → enables highly targeted follow-on attack |
| Direct Origin IP + WAF Bypass → Full Exploit | `origin_exposed` | Direct access to origin IP bypasses WAF → all payloads blocked by WAF now executable |
| Email Spoofing → Phishing → ATO | `email_spoof` | No DMARC/SPF → spoof IT@target.com → phish admin credentials → ATO |
| Admin Panel Exposed → Default Credential → RCE | `admin_panel_exposed` | Default install page → default credentials (admin/admin) → admin code execution |
| Forbidden Bypass → Sensitive API Access | `forbidden_bypass` | 403 bypass technique unlocks restricted endpoint → data access or action execution |
| DOM XSS → Session Token Theft | `dom_xss` + `info_disclosure` | DOM sink with attacker-controlled URL param → steal session cookie/localStorage |
| TLS Downgrade → Credential Interception | `tls_weakness` + `credential_leak` | MITM forces TLS downgrade (SSLv3/BEAST/POODLE) → plaintext credential capture |
| VCS Deep Secret → Infrastructure Access | `third_party_credential` + `credential_leak` | GH Actions / CFOR secret → API key for cloud/infra → lateral movement |
| Source Map Recovery → Hidden Endpoint Discovery | `info_disclosure` + `forbidden_bypass` | Recovered source reveals internal routes → probe for unprotected admin/debug endpoints |
| SAML XML Signature Wrapping → Authentication Bypass | `saml_xsw` + `idp_bypass` | Forge SAML assertion with valid signature on outer wrapper → authenticate as any user including admin |
| SAML Assertion Replay → Account Takeover | `saml_replay` + `idp_bypass` | Re-submit captured valid SAML assertion → bypass auth without credentials |
| SSE CORS Open → Cross-Origin Event Stream Exfiltration | `sse_cors_open` | Attacker page subscribes to victim's SSE stream → real-time sensitive event data exfiltration |
| CSP Bypass (JSONP) → XSS Escalation | `csp_bypass` + `dom_xss` | JSONP callback endpoint in CSP allowed origin → bypass CSP → execute arbitrary JS |
| CSP Bypass (Script Gadget) → XSS on Protected Page | `csp_bypass` + `dom_xss` | Allowlisted CDN hosts angular.min.js/require.js → gadget-based XSS bypasses strict policy |
| Error Page Version Leak → Targeted CVE Exploitation | `info_disclosure` + `favicon_known_cve` | Server/framework version from error response + matching CVE → direct exploitation |
| Second-Order Stored XSS → Admin Session Theft | `second_order` + `dom_xss` | Payload stored in user profile → admin views profile → admin session token exfiltrated |
| Second-Order SQLi → Privileged Data Exfiltration | `second_order` | Malformed data stored via low-privilege API → executed in privileged query context |
| postMessage Wildcard → Cross-Origin Data Theft | `postmessage_unvalidated` | Attacker window sends forged message → handler exposes auth tokens or executes commands |
| postMessage No-Origin → DOM XSS + Token Theft | `postmessage_unvalidated` + `dom_xss` | Unvalidated postMessage to innerHTML/eval sink → cross-origin XSS → session steal |
| IPv6 ACL Bypass → Direct Firewall Rule Evasion | `ipv6_acl_bypass` | IPv4 ACL blocks path → same service accessible on IPv6 → bypasses WAF/ACL entirely |
| Subdomain Takeover → Session Hijacking via Cookie Scope | `subdomain_takeover_verified` + `cookie_tossing` | Claim subdomain under .example.com → serve page that reads/sets cookies in parent domain scope |
| Subdomain Takeover → OAuth Authorization Code Theft | `subdomain_takeover_verified` + `oauth_redirect_bypass` | Takeover subdomain → register as OAuth redirect_uri → steal auth codes from redirect |
| Dependency Confusion → Build Pipeline RCE | `dep_confusion` | Internal package name on public registry → malicious version installed in CI/CD → arbitrary code in pipeline |
| CDN No-SRI → Supply Chain XSS | `cdn_no_sri` | CDN provider compromised or BGP hijacked → no SRI check → malicious script executes on target domain |
| GitHub Actions Secrets Exposed → CI/CD Compromise | `cicd_secret_leak` | GH Actions workflow echoes secrets in run block → logs accessible → cloud/prod credentials stolen |
| Known CVE in SBOM → Direct Exploitation | `critical_cve_dep` | SBOM identifies production dependency version → public CVE PoC exists → RCE/auth bypass |
| Critical Juicy File → Direct Credential Access | `critical_file_exposed` | .env/.key/backup.sql publicly accessible → immediate credential/key exfiltration |
| Log Endpoint Exposed → Credential Harvest + ATO | `log_endpoint_open` + `credential_leak` | Unauthenticated /logs endpoint streams session tokens/passwords → direct account takeover |
| Mobile Deep Link Hijacking → OAuth Token Theft | `mobile_deeplink_hijack` + `oauth_token_theft` | Malicious app registers same URL scheme → hijacks OAuth redirect → steals authorization code |
| SPA Route Bypass → Unauthorized Feature Access | `spa_route_bypass` + `auth_bypass` | Client-side route guard bypassed in React Router/Vue Router → unauthenticated access to protected views |
| Favicon → Known-CVE Product → Direct Exploitation | `favicon_known_cve` | Favicon hash identifies specific product version → CVE database match → direct exploitation attempt |
| Google Dork Hit → Exposed Sensitive Resource | `info_disclosure` | Dork reveals indexed admin panel/credentials page/source file → direct access without brute force |
| SAML Void Canonicalization → Golden SAML Auth Bypass | `saml_xsw` + `saml_void_canon` | Relative xmlns URI triggers empty libxml2 canonicalization → empty-string DigestValue accepted → forge any SAML assertion without valid IdP signature (CVE-2025-66568/66567, Fragile Lock, Dec 2025) |
| Cookie Prefix Bypass → Session Fixation + XSS Amplification | `cookie_prefix_bypass` + `xss` | Unicode-whitespace-prefixed `__Host-` cookie bypasses browser prefix enforcement → Django/ASP.NET strips whitespace → forged high-privilege cookie accepted → XSS or session hijack |
| CSS Inline Style Injection → Attribute Data Exfiltration | `css_style_injection` | User-controlled style attribute → CSS `if(attr():"val": url(oob))` chain → silent OOB exfiltration of any DOM attribute value without external stylesheet (Gareth Heyes, Aug 2025) |
| WebSocket Socket.IO __proto__ Pollution → Privilege Escalation | `ws_prototype` | `{"__proto__":{"initialPacket":"PAYLOAD"}}` via Socket.IO → Express global prototype polluted → subsequent requests inherit attacker-set properties → auth bypass or privilege escalation |
| LLM Indirect Prompt Injection → Tool-Call Exfiltration | `llm_prompt_injection` + `excessive_agency` | Attacker-controlled content (document/page/email) fetched by LLM agent → injected instruction triggers tool-call (send_email/API call) → exfiltrates user data or performs privileged action |
| CL.0 Browser-Powered Desync → Cross-User Request Poisoning | `cl0_desync` | Server pauses on unknown path body → browser sends ambiguous CL.0 request → back-end sees poisoned prefix → next victim's request hijacked or CSP/ACL bypass achieved |
| SSRF Redirect Loop → Blind SSRF Amplification | `ssrf_vector` | Blind SSRF endpoint chained with multi-hop redirect (301→302→Gopher/FTP) → redirect loop forces server to follow through non-HTTP schemes → converts blind to semi-blind with OOB DNS exfiltration and internal service enumeration (PortSwigger 2025 Top 10 #3 class) |
| LLM Fetch Tool → Internal SSRF | `llm_fetch_capability` + `external_network_access` | LLM agent has URL fetch/browse tool → attacker-controlled input directs fetch to `http://169.254.169.254/latest/meta-data/` or `http://internal-svc/` → LLM returns internal service response in chat output → SSRF via LLM proxy bypasses traditional SSRF filters |

## Hybrid Neuro-Symbolic Chain Construction Pipeline

The pipeline has five sequential phases. Workers from the swarm execute phases in parallel
across chain classes; the `graph-builder` and `path-enumerator` workers coordinate the
shared attack graph across all classes.

### Phase 1 — Ingestion & Capability Enrichment

1. Load all findings with `chain_relevance: true` from `triage-ranked.json`.
   Filter to those with `exploit_score >= 4.0` to avoid noise amplification.
2. For each finding, invoke the **LLM capability extractor** (few-shot, using Capability
   Ontology above) to produce:
   ```jsonc
   { "triage_id": "...", "requires": [...], "grants": [...], "impact_tags": [...], "confidence": 0.0 }
   ```
3. Load `source_correlations` from SourceHunter (if available); use `sink_label` and
   `guard_status` to refine capability confidence:
   - `guard_status: none` → `confidence += 0.10`
   - `guard_status: bypassable` → `confidence += 0.05`
   - `guard_status: effective` → `confidence -= 0.15` (may eliminate weak chains)
4. Build the initial attacker state set: `["external_network_access", "unauthenticated"]`.

### Phase 2 — Attack Graph Construction

1. Create a directed graph where:
   - **Nodes** = individual findings (each with their capability record) + one `initial_state` node.
   - **Edges** = valid transitions from finding A → finding B:
     - Hard match: `grants_A ∩ requires_B ≠ ∅`
     - Soft match: `embedding_cosine(grants_A_vector, requires_B_vector) > 0.72`
     - Tech-stack gate: edge removed if `source_correlations` shows contradicting `guard_status: effective`
2. Prune edges where `transition_confidence < 0.35` to limit graph noise.
3. Index the graph by `impact_tags` so workers can query "all paths ending in RCE" or
   "all paths ending in cloud_account_compromised" without full traversal.
4. Record `graph_stats.total_nodes`, `graph_stats.total_edges`, `graph_stats.pruned_edges`.
   **Scale guard**: if `total_nodes > max_nodes` (default: 200) or
   `total_edges > max_edges_before_subsampling` (default: 800), retain only the top-N
   findings by `exploit_score` until within bounds and set `subsampling_applied: true` in
   graph_stats. Configure via path-enumerator worker config fields (see Swarm Workers).

### Phase 3 — Bounded Path Enumeration

1. Run a depth-limited search (DFS, max depth = 5) from `initial_state` to any node whose
   `grants` contain a high-impact goal capability. The default set is derived from ontology
   entries with `impact_tags` severity `critical` or `high`; override via `goal_capabilities`
   in the path-enumerator worker config:
   ```
   goal_capabilities = [arbitrary_code_execution, auth_bypass_achieved, session_token_stolen,
                        cloud_iam_creds_extracted, cloud_account_compromised,
                        saml_assertion_forged, sensitive_data_read (if C:H),
                        supply_chain_injected, vertical_priv_esc]
   # Override: goal_capabilities_override: [custom_list] in path-enumerator config
   ```
2. Apply progressive pruning to avoid O(n²) blowup:
   - **Confidence pruning**: if `product(step_confidences_so_far) < 0.20`, abort path.
   - **CVSS delta pruning**: if `incremental_cvss_gain < 0.3` per step, deprioritise.
   - **Redundancy pruning**: if an identical capability set was achieved at a shorter depth, skip.
3. Record `graph_stats.paths_explored`, `graph_stats.paths_pruned`, `graph_stats.paths_retained`.
4. Configurable: `top_n_paths: 20` (default), `min_chain_confidence: 0.45` (default).

### Phase 4 — Scoring

For each retained path, compute four scores:

**A. Compound CVSS** (incremental re-computation per step):
Use the CVSS Escalation Capability Map below. Recompute all 8 base metrics after each step.
Emit `compound_cvss_vector` and `compound_cvss_score` for the final step.
Emit `individual_max_cvss` = highest CVSS among individual steps (for delta validation).

**B. Chain Confidence** (geometric mean of step confidences):
```
chain_confidence = (confidence_step1 × confidence_step2 × … × confidence_stepN) ^ (1/N)
```
- `chain_confidence >= 0.65` → `status: chain_finding`
- `chain_confidence 0.45–0.64` → `status: chain_candidate`
- `chain_confidence < 0.45` → discard (do not emit)

**C. Novelty Score** (semantic distance from KG priors):
```python
prior_embeddings = query_kg("MATCH (c:ChainFinding) RETURN c.chain_path_embedding LIMIT 50")
path_embedding = embed(chain_capability_sequence_text)
novelty_score = 1.0 - max(cosine_similarity(path_embedding, e) for e in prior_embeddings)
# novelty_score 1.0 = completely novel; 0.0 = exact duplicate of known chain
```

> **Embedding implementation**: Use `text-embedding-3-large` (or equivalent ≥ 3072-dim model).
> `chain_capability_sequence_text` = `f"{chain_class} | " + " → ".join(str(g) for g in capability_path)`.
> Prior embeddings are cached in KG node property `c.chain_path_embedding` (written at post-hunt
> update) — do not re-embed on each run. On cold start (no KG priors), `novelty_score = 1.0`.
> Pass `ontology_version: "2026.05-cap1"` to `capability_extractor` via `--ontology-version`.

Chains with `novelty_score > 0.70` are **prioritized** in output ordering regardless of CVSS,
as novel paths have higher research and disclosure value. Chains with `novelty_score < 0.20`
are flagged `known_pattern: true` and may be collapsed into a reference to the prior chain.

**D. Impact Delta Validation**:
`compound_cvss_score - individual_max_cvss >= 1.5` required to emit as `chain_finding`.

### Phase 5 — LLM Narrative & PoC Generation

After scoring, the LLM layer:
1. Writes a plain-language `chain_narrative` (1–3 sentences, suitable for a triager who
   has 30 seconds to assess impact).
2. Generates the step-by-step `chain-poc-requests.md` section.
3. Generates a Mermaid attack-path diagram for `chains-graph.md`.
4. Applies the reflection checkpoint (see below).
5. Flags any path `speculative: true` where a transition relies on an unconfirmed soft-match
   or where `step.confidence < 0.55`.

## CVSS Escalation Capability Map

When a step's `grants` contain a capability from the left column, apply the right-column
metric changes to the chain's running CVSS vector before computing the next step:

| Capability Granted | CVSS Metric Change |
|--------------------|-------------------|
| `authenticated_as_user` | PR: H → L |
| `authenticated_as_admin` | PR: H → N; Scope: U → C |
| `auth_bypass_achieved` | PR: H → N |
| `session_token_stolen` | PR: H → N; C: L → H |
| `oauth_authorization_code` | PR: H → N (for subsequent OAuth-protected targets) |
| `saml_assertion_forged` | PR: H → N; Scope: U → C |
| `credential_harvested` | PR: H → N; C: += H |
| `internal_network_access` | AV: N; new target AV: A (add new sub-chain target) |
| `cloud_iam_creds_extracted` | C: H; I: H; Scope: C |
| `cloud_account_compromised` | C: H; I: H; A: H; Scope: C |
| `arbitrary_code_execution` | C: H; I: H; A: H |
| `template_injection_rce` | C: H; I: H; A: H |
| `prototype_pollution_server` | I: H; Scope: C |
| `cache_poisoned` | I: H; UI: N (all users affected) |
| `request_smuggled` | Scope: U → C (next victim's request can be hijacked) |
| `supply_chain_injected` | C: H; I: H; UI: N; Scope: C |
| `xss_executed` | C: H (target user's session); UI: R → N for attacker |
| `csp_bypassed` | removes CSP mitigation credit from prior assessment |
| `vertical_priv_esc` | PR: L → N; Scope: U → C |
| `idor_access` | C: L → H (other-user records readable); I: N → L (if IDOR is writable) |
| `business_logic_abuse` | I: L → H (financial or workflow impact); PR: H → L (low auth sufficient) |
| `parser_differential_abused` | PR: H → N (authz skipped at back-end); AC: L unchanged |

Always re-compute all 8 CVSS 3.1 base metrics after applying capability grants. Emit the
full vector string at every escalation step.

> **Extending the map**: if a step grants a capability not listed here, the LLM may propose
> a CVSS delta with a `custom_capability_justification` field. All such deltas are annotated
> `estimated: true` and require triager confirmation before final scoring.

## Output Format

### `chains-discovered.json`

```jsonc
{
  "generated_at": "<ISO8601>",
  "run_id": "<run_id>",
  "stats": {
    "chains_found": 0,
    "candidates": 0,
    "max_chain_depth": 0,
    "paths_explored": 0,
    "paths_pruned": 0,
    "graph_nodes": 0,
    "graph_edges": 0,
    "graph_edges_pruned": 0,
    "top_novelty_score": 0.0,
    "mean_novelty_score": 0.0
  },
  "chains": [
    {
      "chain_id": "CHAIN-a1b2c3d4",
      "chain_class": "SSRF → Internal",
      "chain_depth": 2,
      "status": "chain_finding | chain_candidate",
      "speculative": false,
      "known_pattern": false,
      "chain_confidence": 0.78,
      "novelty_score": 0.85,
      "compound_cvss_score": 9.1,
      "compound_cvss_vector": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:C/C:H/I:H/A:N",
      "individual_max_cvss": 6.5,
      "impact_delta": 2.6,
      "final_capabilities_granted": ["cloud_iam_creds_extracted", "vertical_priv_esc"],
      "steps": [
        {
          "step": 1,
          "triage_id": "TRG-a1b2c3d4",
          "vuln_class": "Server-Side Request Forgery",
          "fqdn": "api.example.com",
          "endpoint": "/api/fetch?url=",
          "requires": ["external_network_access"],
          "grants": ["internal_network_access"],
          "step_confidence": 0.90,
          "step_cvss_vector": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:L/I:N/A:N",
          "transition_match": "hard",
          "attacker_capability_gained": "Can reach internal network 169.254.169.254",
          "evidence_ref": "triage-ranked.json#TRG-a1b2c3d4"
        },
        {
          "step": 2,
          "triage_id": "TRG-e5f6g7h8",
          "vuln_class": "Broken Access Control",
          "fqdn": "169.254.169.254",
          "endpoint": "/latest/meta-data/iam/security-credentials/",
          "requires": ["internal_network_access"],
          "grants": ["cloud_iam_creds_extracted"],
          "step_confidence": 0.85,
          "step_cvss_vector": "CVSS:3.1/AV:A/AC:L/PR:N/UI:N/S:C/C:H/I:H/A:N",
          "transition_match": "hard",
          "attacker_capability_gained": "Cloud IAM credentials extracted",
          "evidence_ref": "triage-ranked.json#TRG-e5f6g7h8",
          "precondition": "Step 1 must succeed — internal network reachable"
        }
      ],
      "chain_narrative": "The SSRF on /api/fetch allows proxying requests to the EC2 metadata service, which then returns IAM role credentials. An attacker with a low-privilege account can leverage these credentials to escalate to cloud administrator, fully compromising the AWS environment.",
      "disclosure_title": "SSRF to IMDS via /api/fetch — Cloud Credential Exfiltration",
      "rollback_plan": {
        "required": false,
        "steps": [
          "Step 1: GET request — no state modified, no rollback needed",
          "Step 2: Read-only IMDS query — no state modified"
        ],
        "note": "If any step modifies target state, describe the exact reversal action here."
      }
    }
  ]
}
```

### `chains-graph.md`

For each `chain_finding`, emit a Mermaid diagram and graph statistics section:

````markdown
## Attack Path Graph — Run <run_id>

### Graph Statistics
- Paths explored: <N>
- Paths pruned: <N> (<pruning_pct>%)
- Chains emitted: <N> (finding: <N>, candidate: <N>)
- Top novelty score: <0.0–1.0>
- Mean novelty score: <0.0–1.0>

### [CHAIN-a1b2c3d4] — SSRF → Internal — CVSS 9.1 (novelty: 0.85)

```mermaid
graph LR
    S([Attacker\nexternal_network_access]) -->|SSRF TRG-a1b2c3d4\nconf:0.90| A
    A[api.example.com\n/api/fetch\ngrants: internal_network_access] -->|IMDS TRG-e5f6g7h8\nconf:0.85| B
    B[169.254.169.254\n/iam/security-credentials\ngrants: cloud_iam_creds_extracted]:::critical
    classDef critical fill:#ff6b6b,color:#fff;
```
````

### `chain-poc-requests.md`

For each `chain_finding`, emit a sequential PoC section:

```markdown
## [CHAIN-id] — <chain_class> — <compound_cvss_score>

**Compound CVSS:** 9.1 (CVSS:3.1/...)
**Chain Class:** SSRF → Internal
**Confidence:** 0.78

### Step 1 — Confirm SSRF entry point
Request:
  GET /api/fetch?url=http://169.254.169.254/ HTTP/1.1
  Host: api.example.com
  Authorization: Bearer <test_token>
  X-Bug-Bounty-Researcher: true
Expected: Response body contains "latest/" (IMDS listing)

### Step 2 — Escalate to credential exfiltration
Request:
  GET /api/fetch?url=http://169.254.169.254/latest/meta-data/iam/security-credentials/ HTTP/1.1
  Host: api.example.com
  Authorization: Bearer <test_token>
  X-Bug-Bounty-Researcher: true
Expected: JSON blob with IAM role name
Rollback: Read-only request — no state modified.
```

## Anti-Hallucination Rules

- Never construct a chain step without a corresponding `triage_id` from
  `triage-ranked.json` or a `triage_id` from `source-correlations.json`.
- Never claim a compound CVSS without re-computing all 8 metrics using the
  CVSS Escalation Capability Map.
- Never emit a chain with `chain_depth > 5` — flag as `out_of_scope_complex`.
- Never assume preconditions that haven't been confirmed by at least one
  artifact signal.
- If `speculative: true`, the chain MUST have `status: chain_candidate`.
- Never emit a step without a capability record (`requires[]`, `grants[]`)
  drawn from the Capability Ontology. Capability labels outside the ontology
  vocabulary are not permitted without a `custom_capability_justification` field.
- Never declare `transition_match: hard` if `grants_A ∩ requires_B = ∅` —
  hard match requires direct intersection; use `transition_match: soft` with
  `cosine_score` for semantic-only transitions.
- Never emit `known_pattern: false` without running the `novelty_scorer` tool.
- Never fabricate graph statistics (`paths_explored`, `graph_nodes`, etc.) —
  these must be emitted by `graph-builder` and `path-enumerator` workers.

## Tool Execution Layer (MCP-Compatible)

ChainHunter uses read-only path validation tools:

```yaml
chain_tools:
  ssrf_probe:
    mode: mcp_sandbox
    timeout: 60
    args_allowlist:
      - "--target"
      - "--internal-url"
      - "--compare"
    safety_scope:
      in_scope_only: true
      non_destructive_only: true
      max_request_rate_per_host: 1
  chain_validator:
    mode: mcp_sandbox
    timeout: 120
    args_allowlist:
      - "--chain-json"
      - "--dry-run"
      - "--scope-file"
    deny:
      - "--execute"
  capability_extractor:
    mode: mcp_sandbox
    timeout: 30
    description: >
      LLM-backed tool. Accepts a triage finding JSON and returns a structured capability
      record: { requires: [...], grants: [...], impact_tags: [...], confidence: float }.
      Uses few-shot examples from the Capability Ontology. Read-only.
    args_allowlist:
      - "--finding-json"
      - "--ontology-version"
    safety_scope:
      read_only: true
      network_egress: none
  transition_validator:
    mode: mcp_sandbox
    timeout: 60
    description: >
      Validates whether finding B is a valid successor to finding A given their capability
      records. Returns: { valid: bool, match_type: hard|soft|none, cosine_score: float,
      tech_stack_compatible: bool }. Uses embedding lookup for soft matching.
    args_allowlist:
      - "--from-capability-json"
      - "--to-capability-json"
      - "--source-correlations"
      - "--cosine-threshold"
    safety_scope:
      read_only: true
      network_egress: restricted_to_kg
  novelty_scorer:
    mode: mcp_sandbox
    timeout: 45
    description: >
      Computes novelty_score (0.0–1.0) for a chain path by embedding the capability sequence
      text and comparing via cosine similarity to all prior ChainFinding embeddings in the KG.
      Returns: { novelty_score: float, closest_prior_chain_id: str, cosine_distance: float }.
    args_allowlist:
      - "--chain-capability-sequence"
      - "--top-k-priors"
      - "--kg-namespace"
    safety_scope:
      read_only: true
      network_egress: restricted_to_kg
  mermaid_generator:
    mode: mcp_sandbox
    timeout: 30
    description: >
      Accepts a chain JSON (list of steps with triage_id, grants, fqdn, endpoint) and
      emits a valid Mermaid graph LR diagram string for inclusion in chains-graph.md.
      Applies classDef critical (red) for steps granting C:H capabilities.
    args_allowlist:
      - "--chain-json"
      - "--format"
      - "--style"
    safety_scope:
      read_only: true
      network_egress: none
```

### execute_tool Contract

```python
def execute_tool(
    tool_name: str,
    args: List[str],
    safety_scope: SafetyScope,
    timeout_seconds: int = 120,
    token_quota: int = 3000,
    capture_stdout: bool = True,
    capture_stderr: bool = True
) -> ToolResult:
    """
    For ChainHunter: path validation and structural probes only.
    safety_scope enforces:
      - in_scope_only: true
      - non_destructive_only: true
      - deny_execution: true  # --dry-run always required
      - max_request_rate_per_host: 1
    Allowed tools: ssrf_probe, chain_validator, http_replay (dry-run only)
    """
```

## Swarm Workers

> **Load on demand** when spawning parallel chain evaluators:
> `read_file('skills/BugBountyHunter/ChainHunter/workers/swarm-workers.md')`
>
> Contains: Worker role definitions (auth_bypass, idor, ssrf, xss, xxe, sqli workers),
> each worker's search patterns, scoring contributions, and blackboard write contract.

## Blackboard Protocol

ChainHunter writes discovered chains to the shared blackboard:

```jsonc
{
  "worker_id": "chain-network",
  "phase": "P3.5",
  "chain_id": "CHAIN-a1b2c3d4",
  "chain_class": "SSRF → Internal",
  "compound_cvss_score": 9.1,
  "chain_confidence": 0.78,
  "novelty_score": 0.85,
  "known_pattern": false,
  "status": "chain_finding",
  "paths_explored": 47,
  "paths_pruned": 39,
  "graph_nodes": 18,
  "graph_edges": 31,
  "timestamp": "<ISO8601>"
}
```

## Validation & Reflection Loop

```yaml
validator_chain:
  checks:
    - triage_id_linkage: "every step has a triage_id from triage-ranked.json"
    - cvss_recomputed: "compound_cvss uses updated attacker position metrics per CVSS Escalation Capability Map"
    - impact_delta_positive: "compound_cvss > max(individual step CVSS) + 1.5"
    - depth_within_limit: "chain_depth <= 5"
    - speculative_flagged: "all speculative chains have status: chain_candidate"
    - non_destructive: "no step requires state-modifying requests"
    - capability_record_present: "every step has requires[] and grants[] fields from ontology"
    - novelty_score_present: "every chain_finding has a novelty_score between 0.0 and 1.0"
    - transition_match_declared: "every step has transition_match: hard|soft (soft requires cosine_score >= 0.72)"
    - mermaid_emitted: "chains-graph.md contains a Mermaid diagram for every chain_finding"
    - stats_complete: "chains-discovered.json stats block includes paths_explored, graph_nodes, graph_edges"
```

### REFLECTION CHECKPOINT — ChainHunter

```
1. Does each chain step have a confirmed triage_id or source correlation?
2. Did I re-compute CVSS metrics at each escalation step using the Capability Map?
3. Is the compound CVSS at least 1.5 points above the highest individual finding?
4. Are all steps non-destructive and independently verifiable?
5. Did I check prior_chains from the KG to avoid duplicating known patterns?
6. For speculative chains — did I mark them chain_candidate with speculative: true?
7. Does every step have a capability record with requires[] and grants[] from the ontology?
8. Did I compute novelty_score using the novelty_scorer tool for every chain_finding?
9. Did I generate a Mermaid diagram for every chain_finding in chains-graph.md?
10. Are soft transition matches (embedding cosine) declared and >= 0.72 threshold?
```

## Known Limitations

- **Soft match speculative risk**: Transitions based solely on embedding cosine (no hard
  capability overlap) inflate the speculative chain rate. All soft-match-only chains receive
  `speculative: true` and must be reviewed before disclosure.
- **Novelty cold-start**: With no prior KG chains, `novelty_score = 1.0` for all outputs.
  Scores become meaningful after the first few confirmed chains are written back.
- **Large scan volumes**: At 50–300 findings the `max_nodes`/`max_edges_before_subsampling`
  guard may activate. Check `subsampling_applied: true` in graph_stats — low-score tail
  findings may not be covered in the sampled graph.
- **CVSS escalation is approximate**: The Capability Map gives deterministic metric changes
  but CVSS is a point estimate. Compound scores above 9.0 should be independently verified
  by the triager before disclosure.
- **LLM capability extraction drift**: The capability_extractor is few-shot and can mislabel
  requires/grants for exotic vuln classes. Use `custom_capability_justification` and manual
  review for any finding without a canonical map entry.
- **top_n_paths=20 default**: With novelty boosting, the top 20 paths surface the best chains
  for typical scan volumes. Increase for extremely target-rich environments.

## Persistent Memory & Learner

Pre-hunt retrieval:

```python
prior_chains = query_kg(
    query="""
    MATCH (c:ChainFinding)
    RETURN c.chain_class           AS chain_class,
           c.step_sequence         AS step_sequence,
           c.capability_path       AS capability_path,
           c.confirmed_count       AS confirmed_count,
           c.novelty_score         AS novelty_score,
           c.compound_cvss         AS avg_compound_cvss
    ORDER BY c.compound_cvss DESC
    LIMIT 10
    """
)
# Use capability_path for semantic deduplication via novelty_scorer tool.
# Use as inspiration patterns (not direct evidence).
```

Post-hunt update:

```python
update_knowledge_graph(
    entity_type="ChainFinding",
    chain_id=chain.chain_id,
    chain_class=chain.chain_class,
    step_sequence=[s.triage_id for s in chain.steps],
    capability_path=[s.grants for s in chain.steps],  # list of grants lists per step
    chain_path_embedding=embed(str([s.grants for s in chain.steps])),
    compound_cvss=chain.compound_cvss_score,
    novelty_score=chain.novelty_score,
    confirmed=triager_accepted,
    target_domain=chain.steps[0].fqdn,
    tech_stack=observed_stack
)
```
