---
name: pentest
description: "Static-analysis penetration test that hunts for exploitable vulnerabilities with proof-of-concept payloads and fix code. Covers SQL and NoSQL injection (string concatenation, raw queries, operator injection), XSS (reflected, stored, DOM-based, template injection, dangerouslySetInnerHTML), authentication bypass (missing auth middleware, JWT algorithm confusion, predictable tokens, OAuth state CSRF), authorization flaws (IDOR, mass assignment, horizontal/vertical privilege escalation), path traversal and file inclusion (unsanitized file paths, upload validation, LFI), command injection (exec, system, subprocess with user input), CSRF and SSRF (missing SameSite, user-supplied URLs, open redirects), hardcoded secrets (AWS keys, private keys, JWT secrets, connection strings, .env in git), and insecure deserialization (pickle, yaml.load, XXE, ObjectInputStream). Maps full attack surface with route inventory. Use for pre-release security validation, finding exploitable bugs, or generating penetration test evidence."
version: "2.0.0"
category: security
platforms:
  - CLAUDE_CODE
---

You are in AUTONOMOUS MODE. Do NOT ask questions. Hunt for exploitable vulnerabilities methodically.

TARGET:
$ARGUMENTS

If no arguments are provided, perform a full code-level penetration test of the project in the current working directory. If arguments specify a scope (e.g., "auth module", "API routes"), focus the analysis on that area.

============================================================
PHASE 0: RECONNAISSANCE
============================================================

Map the application's attack surface:

TECH STACK:
- Detect framework, language, database, auth mechanism
- Identify entry points (HTTP routes, WebSocket handlers, CLI commands, queue consumers)
- Map middleware chain (auth, validation, rate limiting, logging)

ROUTE INVENTORY:
- List ALL defined routes/endpoints with their HTTP methods
- Note which routes have authentication middleware
- Note which routes accept user input (body, query, params, headers, files)
- Flag routes that perform privileged operations

DATA FLOW MAPPING:
- Trace user input from entry point through processing to output/storage
- Identify where input is validated, sanitized, or escaped
- Identify where input reaches dangerous sinks (database, shell, filesystem, HTML)

Record the full attack surface map before beginning targeted analysis.
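
One possible way to hold the attack surface map in a structured form is sketched below in Python. The `Route` record, its field names, and the sample routes are illustrative assumptions, not part of any framework; the four counts correspond to the Attack Surface Summary in the report.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Route:
    """One row of the route inventory (hypothetical structure)."""
    method: str
    path: str
    authenticated: bool
    input_sources: List[str] = field(default_factory=list)    # body, query, params, headers, files
    dangerous_sinks: List[str] = field(default_factory=list)  # database, shell, filesystem, html


def summarize_attack_surface(routes: List[Route]) -> dict:
    """Count the figures reported under Attack Surface Summary."""
    return {
        "total_routes": len(routes),
        "unauthenticated_routes": sum(1 for r in routes if not r.authenticated),
        "routes_accepting_user_input": sum(1 for r in routes if r.input_sources),
        "routes_with_dangerous_sinks": sum(1 for r in routes if r.dangerous_sinks),
    }


# Sample inventory; the paths are made up for illustration.
SAMPLE_ROUTES = [
    Route("GET", "/health", authenticated=False),
    Route("POST", "/login", authenticated=False,
          input_sources=["body"], dangerous_sinks=["database"]),
    Route("GET", "/files/:name", authenticated=True,
          input_sources=["params"], dangerous_sinks=["filesystem"]),
]
```

Routes that are unauthenticated, accept input, and reach a dangerous sink (here `POST /login`) are natural first targets for the later phases.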

============================================================
PHASE 1: SQL / NOSQL INJECTION
============================================================

Hunt for injection points in database interactions:

SQL INJECTION:
- Search for string concatenation in SQL: `"SELECT * FROM " + table`
- Search for template literals in SQL: `` `SELECT * FROM ${table}` ``
- Search for format strings in SQL: `"SELECT * FROM %s" % table`
- Check ORM raw query methods for unsanitized input
- Check stored procedures called with dynamic parameters
- Verify parameterized queries are used consistently

NOSQL INJECTION:
- MongoDB: Search for `$where`, `$regex`, `$gt`, `$ne` with user input
- MongoDB: Check for object injection via `JSON.parse(userInput)` in queries
- Firestore: Check for user-controlled field paths in queries
- Redis: Check for user input in command strings
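
A hedged Python illustration of operator injection follows. It assumes a JSON request body is copied into a MongoDB-style filter dictionary; `build_reset_filter_unsafe`, `build_reset_filter_safe`, `require_str`, and the field names are hypothetical, and no database driver is involved.

```python
import json


def build_reset_filter_unsafe(body: dict) -> dict:
    # VULNERABLE: whatever JSON value the client sent becomes part of the filter.
    return {"email": body.get("email"), "reset_token": body.get("reset_token")}


def require_str(value, name: str) -> str:
    # Reject objects and arrays so operators such as $ne cannot be smuggled in.
    if not isinstance(value, str):
        raise TypeError(f"{name} must be a string")
    return value


def build_reset_filter_safe(body: dict) -> dict:
    return {
        "email": require_str(body.get("email"), "email"),
        "reset_token": require_str(body.get("reset_token"), "reset_token"),
    }


# Proof-of-concept body: {"$ne": null} matches any stored token.
MALICIOUS_BODY = json.loads('{"email": "admin@example.com", "reset_token": {"$ne": null}}')
```

With the unsafe builder the filter contains `{"$ne": None}` as the token condition; the safe builder raises `TypeError` before any query is issued.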

For each finding:
- **Location:** file:line
- **Severity:** Critical
- **Proof-of-concept:** Example malicious input that would exploit the flaw
- **Fix:** Parameterized query replacement code
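
As a reference for the expected shape of a proof-of-concept and fix, here is a minimal Python sketch using the standard-library `sqlite3` module. The `users` table, the function names, and the in-memory database are assumptions for illustration; the target project's driver and placeholder syntax may differ.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    # In-memory database with two rows, used only for this demonstration.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
    conn.executemany("INSERT INTO users (username) VALUES (?)", [("alice",), ("bob",)])
    return conn


def find_user_unsafe(conn: sqlite3.Connection, username: str) -> list:
    # VULNERABLE: user input is concatenated into the SQL text.
    query = "SELECT id, username FROM users WHERE username = '" + username + "'"
    return conn.execute(query).fetchall()


def find_user_safe(conn: sqlite3.Connection, username: str) -> list:
    # FIXED: the value is bound as a parameter and never parsed as SQL.
    return conn.execute(
        "SELECT id, username FROM users WHERE username = ?", (username,)
    ).fetchall()


# Proof-of-concept input: turns the WHERE clause into a tautology.
PAYLOAD = "' OR '1'='1"
```

Against the demo database, the payload returns every row through `find_user_unsafe` and no rows through `find_user_safe`.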

============================================================
PHASE 2: CROSS-SITE SCRIPTING (XSS)
============================================================

Hunt for XSS vectors:

REFLECTED XSS:
- User input reflected in HTML responses without encoding
- Search parameters displayed on results pages
- Error messages containing user input
- URL path segments rendered in page content

STORED XSS:
- User-generated content (comments, profiles, messages) rendered without escaping
- File names displayed without sanitization
- User-uploaded HTML/SVG files served inline

DOM-BASED XSS:
- `document.write()` with URL fragments or query parameters
- `innerHTML` assignment with user-controlled data
- `eval()` or `Function()` with user input
- `location.href` / `window.open()` with user-controlled URLs
- jQuery `.html()` with unescaped data

TEMPLATE INJECTION:
- Server-side template injection (SSTI): user input in template strings
- Client-side: `dangerouslySetInnerHTML`, `v-html`, `[innerHTML]`, `{!! !!}`
- Missing auto-escape: `<%- %>` (EJS), `| safe` (Jinja2), `{{{ }}}` (Handlebars)

For each finding:
- **Location:** file:line
- **Severity:** High-Critical
- **Proof-of-concept:** `<script>alert(1)</script>` or context-specific payload
- **Fix:** Proper encoding/escaping code
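
A minimal Python sketch of a reflected XSS finding and its fix, using the standard-library `html.escape`. The render functions are hypothetical; the fix shown applies to HTML text and quoted-attribute contexts only, and other contexts (inline script, URLs, CSS) need their own encoding.

```python
import html


def render_results_unsafe(search_term: str) -> str:
    # VULNERABLE: the term is placed into the page as markup.
    return "<p>Results for: " + search_term + "</p>"


def render_results_safe(search_term: str) -> str:
    # FIXED: &, <, >, " and ' are converted to HTML entities.
    return "<p>Results for: " + html.escape(search_term, quote=True) + "</p>"


PAYLOAD = "<script>alert(1)</script>"
```

In a real project the preferred fix is usually to rely on the template engine's auto-escaping rather than on manual calls.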

============================================================
PHASE 3: AUTHENTICATION BYPASS
============================================================

Hunt for authentication weaknesses:

MISSING AUTH:
- Routes handling sensitive data without auth middleware
- Admin endpoints accessible without admin role check
- API endpoints that should require auth but do not
- WebSocket connections without authentication handshake

AUTH LOGIC FLAWS:
- JWT verification that can be bypassed (algorithm confusion, missing expiry check)
- Session tokens predictable or not cryptographically random
- Password reset tokens reusable or long-lived
- OAuth state parameter missing (CSRF on OAuth flow)
- Remember-me tokens without proper validation

CREDENTIAL ATTACKS:
- No rate limiting on login endpoint
- No account lockout after failed attempts
- User enumeration via different error messages (valid vs invalid email)
- Timing attacks on authentication (measurably different response times)

For each finding:
- **Location:** file:line of the route definition or handler
- **Severity:** High-Critical
- **Proof-of-concept:** Steps to bypass authentication
- **Fix:** Middleware addition or logic correction

============================================================
PHASE 4: AUTHORIZATION FLAWS
============================================================

Hunt for privilege escalation:

HORIZONTAL PRIVILEGE ESCALATION:
- Accessing other users' resources by changing ID parameters
- Missing ownership checks: `findById(req.params.id)` without `where: { userId: req.user.id }`
- Batch endpoints that do not filter by the requesting user
- File access without ownership verification

VERTICAL PRIVILEGE ESCALATION:
- Admin functions accessible by regular users
- Role changes possible without admin authorization
- Missing role checks on mutation endpoints (only checked on read)
- Client-side-only role enforcement (no server-side check)

MASS ASSIGNMENT:
- Request body spread directly into database updates: `Model.update(req.body)`
- User can set `role`, `isAdmin`, `balance` via API request
- GraphQL mutations accepting arbitrary fields

For each finding:
- **Location:** file:line
- **Severity:** High-Critical
- **Proof-of-concept:** Example API call exploiting the flaw
- **Fix:** Authorization middleware or ownership check code
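
A small Python sketch of the two most common fixes in this phase: an ownership check for IDOR and a field allowlist for mass assignment. The `Document` record, the in-memory `DOCUMENTS` store, and `ALLOWED_PROFILE_FIELDS` are hypothetical stand-ins for the project's models.

```python
from dataclasses import dataclass


@dataclass
class Document:
    id: int
    owner_id: int
    title: str


# Stand-in for a database table.
DOCUMENTS = {
    1: Document(id=1, owner_id=10, title="alice notes"),
    2: Document(id=2, owner_id=20, title="bob notes"),
}


def get_document_unsafe(doc_id: int, current_user_id: int):
    # VULNERABLE (IDOR): any authenticated user can read any document by ID.
    return DOCUMENTS.get(doc_id)


def get_document_safe(doc_id: int, current_user_id: int):
    doc = DOCUMENTS.get(doc_id)
    # Same result for "missing" and "not yours", so IDs cannot be enumerated.
    if doc is None or doc.owner_id != current_user_id:
        return None
    return doc


# Only these fields may be set by the user; role, isAdmin, balance are not listed.
ALLOWED_PROFILE_FIELDS = {"display_name", "bio"}


def filter_profile_update(body: dict) -> dict:
    # FIXED (mass assignment): drop every key that is not explicitly allowed.
    return {key: value for key, value in body.items() if key in ALLOWED_PROFILE_FIELDS}
```

A proof-of-concept for the unsafe variant is simply user 10 requesting document 2, or a profile update body that includes `"role": "admin"`.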

============================================================
PHASE 5: PATH TRAVERSAL AND FILE INCLUSION
============================================================

Hunt for file system attacks:

PATH TRAVERSAL:
- User input in `fs.readFile()`, `open()`, `File()` without path sanitization
- File download endpoints using user-supplied filenames
- Pattern: `path.join(baseDir, userInput)` without `../` prevention
- Image/document serving with user-controlled paths

FILE UPLOAD:
- Missing or weak file type validation (extension check only, no magic-byte check)
- Missing file size limits
- Uploaded files stored in web-accessible directories
- Uploaded filenames not sanitized (may contain `../` or special chars)
- No virus/malware scanning on uploads

LOCAL FILE INCLUSION:
- Dynamic `require()` or `import()` with user input
- Template file paths from user input
- Configuration file paths from user input

For each finding:
- **Location:** file:line
- **Severity:** High-Critical
- **Proof-of-concept:** `../../etc/passwd` or equivalent payload
- **Fix:** Path sanitization code
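
One way to express the path sanitization fix in Python is to resolve the joined path and confirm it is still inside the base directory. `resolve_inside` is a hypothetical helper and `uploads` is an assumed directory name.

```python
from pathlib import Path


def resolve_inside(base_dir: str, user_path: str) -> Path:
    """Return the resolved path, or raise if it escapes base_dir."""
    base = Path(base_dir).resolve()
    # Joining with an absolute user_path discards base, so the check below is required.
    candidate = (base / user_path).resolve()
    try:
        candidate.relative_to(base)
    except ValueError:
        raise PermissionError(f"path escapes base directory: {user_path!r}")
    return candidate
```

Because `resolve()` also follows symlinks that exist on disk, the check is applied to the real location rather than to the textual path. Rejecting strings that contain `../` is not sufficient on its own, since absolute paths and symlinks bypass it.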

============================================================
PHASE 6: COMMAND INJECTION
============================================================

Hunt for command injection:

DIRECT INJECTION:
- `child_process.exec()` with user input: `exec("convert " + filename)`
- `os.system()`, `subprocess.run(shell=True)` with user arguments
- `Runtime.exec()` with unsanitized parameters
- Backtick execution with user data

INDIRECT INJECTION:
- User input in environment variables passed to subprocesses
- User-controlled arguments in CLI tool invocations
- Cron expressions or scheduled job commands built from user input
- Git commands with user-supplied branch/repo names

For each finding:
- **Location:** file:line
- **Severity:** Critical
- **Proof-of-concept:** `; cat /etc/passwd` or `$(whoami)` payload
- **Fix:** Use array-form exec, escape arguments, or avoid shell execution
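
The array-form fix can be sketched in Python as follows. The child process here is the current Python interpreter, used only as a stand-in for a real CLI tool so the example runs anywhere; `run_with_argv` and `quote_for_shell` are hypothetical helper names.

```python
import shlex
import subprocess
import sys


def run_with_argv(user_value: str) -> str:
    # Array-form exec: no shell is started, so ; | $() and backticks are plain characters.
    completed = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", user_value],
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout.rstrip("\r\n")


def quote_for_shell(user_value: str) -> str:
    # Fallback for the case where a POSIX shell string cannot be avoided.
    return shlex.quote(user_value)
```

Array form prevents shell metacharacter injection but not option injection: a value such as `--output=/etc/cron.d/x` is still passed to the tool as an argument, so user values that may start with `-` need separate validation.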

============================================================
PHASE 7: CSRF AND REQUEST FORGERY
============================================================

Hunt for CSRF, SSRF, and open-redirect vulnerabilities:

MISSING CSRF PROTECTION:
- State-changing POST/PUT/DELETE endpoints without CSRF tokens
- Missing `SameSite` cookie attribute
- CORS that allows credentials (`Access-Control-Allow-Credentials: true`) together with a reflected or overly broad `Access-Control-Allow-Origin`
- Form submissions without anti-CSRF tokens

SSRF:
- User-supplied URLs passed to HTTP clients (fetch, axios, requests)
- HTTP clients that follow redirects without re-validating the destination host
- Webhook URLs from user input fetched server-side
- Image/preview URLs fetched without allowlist

OPEN REDIRECTS:
- Redirect URLs from query parameters without validation
- Login redirect (`?next=` or `?redirect=`) without allowlist
- OAuth callback URLs not properly validated

For each finding:
- **Location:** file:line
- **Severity:** Medium-High
- **Proof-of-concept:** Example malicious URL or forged request
- **Fix:** CSRF token implementation, URL validation code
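
Two of the URL validation fixes can be sketched with the Python standard library. Both helpers are hypothetical: `is_safe_redirect` accepts only same-site paths, and `is_public_address` is meant to be applied to the IP address the server actually connects to, after DNS resolution, which this sketch does not perform.

```python
import ipaddress
from urllib.parse import urlparse


def is_safe_redirect(target: str) -> bool:
    # Accept only same-site paths such as "/dashboard".
    # "//host" and "/\host" are treated by browsers as links to another host.
    if not target.startswith("/") or target.startswith(("//", "/\\")):
        return False
    parsed = urlparse(target)
    return parsed.scheme == "" and parsed.netloc == ""


def is_public_address(ip_text: str) -> bool:
    # Reject loopback, private, link-local (cloud metadata), and other special ranges.
    address = ipaddress.ip_address(ip_text)
    return not (
        address.is_private
        or address.is_loopback
        or address.is_link_local
        or address.is_multicast
        or address.is_reserved
        or address.is_unspecified
    )
```

Checking the hostname string alone is not enough for SSRF, because a public-looking name can resolve to an internal address.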

============================================================
PHASE 8: SECRETS AND CREDENTIALS
============================================================

Hunt for hardcoded secrets:

PATTERNS TO SEARCH:
- API keys: strings matching `[A-Za-z0-9_-]{20,}` near key-related variables
- AWS keys: `AKIA[0-9A-Z]{16}`
- Private keys: `-----BEGIN (RSA |EC |DSA |OPENSSH |)PRIVATE KEY-----`
- JWT secrets: hardcoded strings in JWT sign/verify calls
- Database passwords: connection strings with embedded credentials
- OAuth client secrets: hardcoded in source (not env vars)
- Encryption keys: hardcoded byte arrays or strings used as keys

CHECK CONFIGURATION:
- `.env` files committed (not in `.gitignore`)
- Secrets in `docker-compose.yml` (not using Docker secrets)
- Secrets in CI/CD config files (not using secret variables)
- Secrets in client-side/frontend code (visible to users)

For each finding:
- **Location:** file:line
- **Severity:** Critical
- **Secret type:** [what kind of credential]
- **Fix:** Move to environment variable or secret manager
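
A minimal Python sketch of pattern-based secret detection with redaction, keeping only the first 4 characters of each match as the reporting rules require. `SECRET_PATTERNS`, `redact`, and `scan_text` are hypothetical names; the private-key pattern is written to also cover the DSA and OPENSSH header variants, and the sample key is the placeholder value used in AWS documentation.

```python
import re

SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"AKIA[0-9A-Z]{16}"),
    "private_key": re.compile(r"-----BEGIN (?:RSA |EC |DSA |OPENSSH |)PRIVATE KEY-----"),
}


def redact(secret: str, keep: int = 4) -> str:
    # Keep the first `keep` characters and mask the rest.
    return secret[:keep] + "*" * max(len(secret) - keep, 0)


def scan_text(text: str) -> list:
    """Return (line_number, secret_type, redacted_match) for every match."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for secret_type, pattern in SECRET_PATTERNS.items():
            for match in pattern.finditer(line):
                findings.append((line_number, secret_type, redact(match.group(0))))
    return findings


SAMPLE = 'region = "us-east-1"\naws_key = "AKIAIOSFODNN7EXAMPLE"\n'
```

Regex hits are candidates only; each one still needs to be read in context to rule out test fixtures and documentation placeholders.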

============================================================
PHASE 9: INSECURE DESERIALIZATION
============================================================

Hunt for deserialization attacks:

- `JSON.parse()` on untrusted input used to construct queries or commands
- `pickle.loads()` on user-supplied data (Python)
- `yaml.load()` without `SafeLoader` (Python)
- `unserialize()` on user input (PHP)
- `ObjectInputStream` with untrusted data (Java)
- `Marshal.load()` on untrusted data (Ruby)
- XML parsing without disabling external entities (XXE)

For each finding:
- **Location:** file:line
- **Severity:** High-Critical
- **Proof-of-concept:** Malicious serialized payload structure
- **Fix:** Safe deserialization method or input validation
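
The risk with `pickle.loads()` can be shown with a harmless payload: unpickling calls whatever callable the payload names. In this Python sketch the callable is `len`, chosen so nothing dangerous runs; `CallOnLoad`, `load_session_unsafe`, and `load_session_safe` are hypothetical names.

```python
import json
import pickle


class CallOnLoad:
    """Stand-in for a malicious payload: unpickling invokes a chosen callable."""

    def __reduce__(self):
        # A real attack would name os.system or similar here instead of len.
        return (len, ("attacker-controlled",))


def load_session_unsafe(blob: bytes):
    # VULNERABLE: the payload decides which callable runs during loading.
    return pickle.loads(blob)


def load_session_safe(blob: bytes) -> dict:
    # FIXED: JSON can only produce plain data, and the shape is checked.
    data = json.loads(blob.decode("utf-8"))
    if not isinstance(data, dict):
        raise ValueError("session must be a JSON object")
    return data
```

The unsafe loader returns the result of the call rather than an object, which demonstrates that code ran during deserialization. The safe loader rejects the same bytes with a `ValueError`.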


============================================================
SELF-HEALING VALIDATION (max 2 iterations)
============================================================

After producing the security analysis, validate thoroughness:

1. Verify every phase of the audit was actually checked (not skipped).
2. Verify every finding has a specific file:line location.
3. Verify severity ratings are justified by impact assessment.
4. Verify no false positives by re-reading flagged code in context.

IF VALIDATION FAILS:
- Re-audit skipped categories or vague findings
- Verify or remove false positives
- Repeat up to 2 iterations

============================================================
OUTPUT
============================================================

## Penetration Test Report (Code Analysis)

**Project:** [name]
**Stack:** [detected technologies]
**Scope:** [full project or specified scope]
**Date:** [date]

### Executive Summary

| Category | Critical | High | Medium | Low |
|----------|----------|------|--------|-----|
| Injection (SQL/NoSQL) | N | N | N | N |
| XSS | N | N | N | N |
| Auth Bypass | N | N | N | N |
| Authorization Flaws | N | N | N | N |
| Path Traversal | N | N | N | N |
| Command Injection | N | N | N | N |
| CSRF/SSRF | N | N | N | N |
| Hardcoded Secrets | N | N | N | N |
| Insecure Deserialization | N | N | N | N |
| **Total** | **N** | **N** | **N** | **N** |

### Findings

For each finding, in severity order:

#### [SEVERITY] — [Finding Title]

- **Category:** [e.g., SQL Injection]
- **Location:** `path/to/file.ts:42`
- **Description:** [What the vulnerability is]
- **Confidence:** [Confirmed | Likely | Possible]
- **Proof-of-concept:**
  ```
  [Example exploit payload or steps]
  ```
- **Impact:** [What an attacker could achieve]
- **Fix:**
  ```
  [Code showing the remediation]
  ```

### Attack Surface Summary
- **Total routes:** N
- **Unauthenticated routes:** N
- **Routes accepting user input:** N
- **Routes with dangerous sinks:** N

### Remediation Priority
[Ordered list: Critical findings first, with effort estimates]

============================================================
NEXT STEPS
============================================================

After reviewing the pentest report:
- "Fix all Critical findings immediately — these are exploitable."
- "Run `/owasp` for a systematic OWASP Top 10 compliance check."
- "Run `/encryption` to address cryptographic weaknesses."
- "Run `/dependency-scan` to fix vulnerable component findings."
- "Run `/secure` after remediation to verify the security posture improved."


============================================================
SELF-EVOLUTION TELEMETRY
============================================================

After producing output, record execution metadata for the /evolve pipeline.

Check if a project memory directory exists:
- Look for the project path in `~/.claude/projects/`
- If found, append to `skill-telemetry.md` in that memory directory

Entry format:
```
### /pentest — {{YYYY-MM-DD}}
- Outcome: {{SUCCESS | PARTIAL | FAILED}}
- Self-healed: {{yes — what was healed | no}}
- Iterations used: {{N}} / {{N max}}
- Bottleneck: {{phase that struggled or "none"}}
- Suggestion: {{one-line improvement idea for /evolve, or "none"}}
```

Only log if the memory directory exists. Skip silently if not found.
Keep entries concise — /evolve will parse these for skill improvement signals.

============================================================
DO NOT
============================================================

- Do NOT execute actual exploits — this is static code analysis only.
- Do NOT send network requests to test vulnerabilities.
- Do NOT expose full secret values — redact all but first 4 characters.
- Do NOT modify any code — report findings with suggested fixes only.
- Do NOT report theoretical vulnerabilities without code evidence — every finding must cite a specific file and line.
- Do NOT conflate "possible" with "confirmed" — clearly state confidence level.
- Do NOT skip any phase — even if the project seems small, check all categories.
- Do NOT install external tools — analyze the source code directly.
