--- name: setup-isolated-setup-doctor description: | Probe the secure-agent setup for in-session functional restrictions that block legitimate workflows. Three live probes — SSH agent / Yubikey reachability, localhost port binding, docker / podman runtime socket — each pointing the user at the matching numbered troubleshooting entry and its settings.json remediation (see body). Read-only — never modifies settings.json, never invokes the sandbox bypass. when_to_use: | Invoke when the user says "doctor my sandbox", "diagnose sandbox friction", "why is the sandbox blocking X", "check whether ssh / docker / port-bind works inside the sandbox", or after the user reports a workflow failure that smells sandbox-shaped (agent unreachable, socket errors, port permission errors). Also a good periodic check after every Claude Code upgrade — the sandbox profile evolves and a previously-working call may have moved into deny. capability: - capability:setup - capability:reassess license: Apache-2.0 --- # setup-isolated-setup-doctor The **diagnostic** layer over the secure agent setup. Complements the existing setup skills: - [`setup-isolated-setup-install`](../setup-isolated-setup-install/SKILL.md) installs the secure setup. - [`setup-isolated-setup-verify`](../setup-isolated-setup-verify/SKILL.md) answers *"is the secure setup **installed** correctly?"* — static checks on settings.json shape, hook wiring, pinned tool versions. Catches drift and missing pieces. - [`setup-isolated-setup-update`](../setup-isolated-setup-update/SKILL.md) surfaces drift against the framework's latest. - **`setup-isolated-setup-doctor` (this skill)** answers *"are common workflows **functionally** blocked by the current sandbox?"* — live probes of SSH agent, port binding, docker / podman socket. Catches over-restrictive allowlists. Run `verify` first when the install is in question (fresh machine, recent framework upgrade, sandbox-state surprise). Run `doctor` when the install is known good but a workflow fails in a sandbox-shaped way — agent unreachable, socket error, port permission error. Every probe maps to a numbered entry in [`docs/setup/sandbox-troubleshooting.md`](../../docs/setup/sandbox-troubleshooting.md); the doctor's job is to identify *which* entry applies right now, not to re-explain the remediation. If a fail surfaces a failure mode not catalogued there, propose appending a new entry per the catalog's *Adding a new entry* section. ## Golden rules - **Read-only.** Each probe runs a small, deterministic, side-effect-free check. The skill never edits any settings file, never runs a command with `dangerouslyDisableSandbox`, never installs anything. If a check fails, surface the failure and point at the catalog entry; do not auto-fix. - **Run every probe, even on early failure.** Do not stop at the first ✗. The value of the report is in the full picture — a user may have one of three independent restrictions, or all three, and discovering them one re-run at a time is annoying. - **Distinguish ✗ (failing) from ⊘ (not applicable).** ✗ means the probe ran and the sandbox blocked it. ⊘ means the probe was skipped because the prerequisite is absent (e.g. no `docker` / `podman` on `PATH` → docker probe ⊘, not ✗). - **Surface evidence.** Each report line names the probe command, the exit code, and the relevant stderr snippet. "Looks blocked" is not a useful report; "ssh-add -l → rc=2 → `Could not open a connection to your authentication agent`" is. - **Map each ✗ to a catalog entry.** The fail report includes a direct link to the matching section of [`docs/setup/sandbox-troubleshooting.md`](../../docs/setup/sandbox-troubleshooting.md). Do not paraphrase the remediation — the catalog is the single source of truth. ## The 3 probes The current set covers the three failure modes the catalog documents. New probes are added when new entries land in the catalog; the two stay in lock-step. ### Probe 1 — SSH agent / Yubikey reachable Tests whether `ssh-agent` is reachable from inside the sandbox. Failure mode: `SSH_AUTH_SOCK` is passed through `claude-iso`'s env whitelist but the socket file is not in `sandbox.filesystem.allowRead`, so the agent's `ssh` / `git push` subprocesses cannot `connect(2)` to the socket. **Command:** ```bash if [ -z "$SSH_AUTH_SOCK" ]; then echo "PROBE: ssh-agent → ⊘ (SSH_AUTH_SOCK not set in env)" elif [ ! -S "$SSH_AUTH_SOCK" ]; then echo "PROBE: ssh-agent → ✗ (socket file at SSH_AUTH_SOCK not stat-able from inside sandbox)" echo " SSH_AUTH_SOCK=$SSH_AUTH_SOCK" else ssh-add -l > /tmp/ssh-add.out 2>&1; rc=$? case "$rc" in 0) echo "PROBE: ssh-agent → ✓ ($(wc -l < /tmp/ssh-add.out | tr -d ' ') identities listed)" ;; 1) echo "PROBE: ssh-agent → ✓ (agent reachable, no identities configured)" ;; 2) echo "PROBE: ssh-agent → ✗ (agent unreachable: $(head -1 /tmp/ssh-add.out))" ;; *) echo "PROBE: ssh-agent → ⚠ (unexpected rc=$rc: $(head -1 /tmp/ssh-add.out))" ;; esac fi ``` **Interpretation:** | Result | Status | Meaning | |---|---|---| | `✓ N identities listed` | Pass | `ssh-add -l` returned the key list. | | `✓ agent reachable, no identities` | Pass | `ssh-add -l` returned rc=1 (the documented "no keys" exit). | | `✗ socket not stat-able` | Fail | Sandbox blocks `stat(2)` on the socket file. | | `✗ agent unreachable` | Fail | Sandbox blocks `connect(2)` to the socket. | | `⊘ SSH_AUTH_SOCK not set` | Skip | Either the user does not run `ssh-agent`, or `claude-iso`'s env whitelist dropped it (separate bug — verify). | **On ✗ → remediation:** [`docs/setup/sandbox-troubleshooting.md` — SSH agent / Yubikey appears unreachable from inside the sandbox](../../docs/setup/sandbox-troubleshooting.md#ssh-agent--yubikey-appears-unreachable-from-inside-the-sandbox). ### Probe 2 — Localhost port bind Tests whether a process inside the sandbox can bind to a loopback port AND then talk to itself over loopback. The failure mode the catalog documents is the second half (egress proxy blocks `127.0.0.1`). **Command:** ```bash python3 - <<'PY' 2>&1 || true import socket, urllib.request, threading, http.server, sys # 1. Can we bind? try: s = socket.socket() s.bind(("127.0.0.1", 0)) port = s.getsockname()[1] s.listen(1) except OSError as e: print(f"PROBE: localhost-bind → ✗ (bind: {e})") sys.exit(0) # 2. Can we GET from our own server over loopback? class Handler(http.server.BaseHTTPRequestHandler): def do_GET(self): self.send_response(200); self.end_headers(); self.wfile.write(b"ok") def log_message(self, *_): pass server = http.server.HTTPServer(("127.0.0.1", 0), Handler) threading.Thread(target=server.serve_forever, daemon=True).start() try: with urllib.request.urlopen(f"http://127.0.0.1:{server.server_port}/", timeout=5) as r: body = r.read() print(f"PROBE: localhost-bind → ✓ (bound + loopback GET → HTTP {r.status}, body={body!r})") except Exception as e: print(f"PROBE: localhost-bind → ✗ (bind ok, loopback GET: {type(e).__name__}: {e})") finally: server.shutdown() s.close() PY ``` **Interpretation:** | Result | Status | Meaning | |---|---|---| | `✓ bound + loopback GET → HTTP 200` | Pass | Both bind and loopback HTTP work. | | `✗ bind: ...` | Fail | The sandbox blocks `bind(2)` on `127.0.0.1`. Rare. | | `✗ bind ok, loopback GET: ...` | Fail | Bind works but the sandbox egress proxy refuses `127.0.0.1` as a destination. Common shape. | **On ✗ → remediation:** [`docs/setup/sandbox-troubleshooting.md` — Test cannot bind to a localhost port](../../docs/setup/sandbox-troubleshooting.md#test-cannot-bind-to-a-localhost-port). ### Probe 3 — Docker / Podman runtime socket Tests whether the runtime CLI can talk to its daemon. Run for each of `docker` / `podman` that is on `PATH`; ⊘ each that is not installed (this is not a sandbox failure, just an absent prerequisite). **Command:** ```bash for rt in docker podman; do if ! command -v "$rt" > /dev/null 2>&1; then echo "PROBE: ${rt}-runtime → ⊘ ($rt not on PATH)" continue fi out=$("$rt" info > /dev/null 2>&1 && echo ok || echo "fail:$?") case "$out" in ok) echo "PROBE: ${rt}-runtime → ✓ (${rt} info returned)" ;; fail:*) err=$("$rt" info 2>&1 >/dev/null | head -2 | tr '\n' ' ') echo "PROBE: ${rt}-runtime → ✗ ($out: $err)" ;; esac done ``` **Interpretation:** | Result | Status | Meaning | |---|---|---| | `✓ info returned` | Pass | The CLI reached the daemon successfully. | | `✗ fail:1: Cannot connect to the Docker daemon …` | Fail | Daemon socket not readable from inside the sandbox. | | `✗ fail:1: connect: permission denied` | Fail | Same root cause, different stderr (Linux variant). | | `⊘ not on PATH` | Skip | Runtime not installed; not a sandbox restriction. | **On ✗ → remediation:** [`docs/setup/sandbox-troubleshooting.md` — Docker / Podman command fails with a socket error](../../docs/setup/sandbox-troubleshooting.md#docker--podman-command-fails-with-a-socket-error). ## After the report If every probe is ✓ or ⊘: > All three probes pass (or are not applicable). The sandbox is > not currently blocking the known failure modes catalogued in > `docs/setup/sandbox-troubleshooting.md`. If you hit a different > sandbox-shaped failure, follow the catalog's *Adding a new > entry* section and (optionally) extend this skill with a fourth > probe so future runs catch the same shape automatically. If any probe is ✗: 1. Surface every fail in one report (do not stop at the first). 2. For each fail, print the troubleshooting-doc anchor link from the probe's *On ✗ → remediation* row above. 3. Suggest the user open the catalog entry to read the symptom → root cause → fix shape, then apply the settings.json widening themselves. Do **not** propose to apply the widening from this skill — settings.json widenings are sandbox-bypass-adjacent and need an explicit user-driven edit. 4. After the user has applied the widening (in a separate flow), re-run `setup-isolated-setup-doctor` to confirm the probe now passes. If a probe surfaces a fail shape not catalogued in [`docs/setup/sandbox-troubleshooting.md`](../../docs/setup/sandbox-troubleshooting.md): 1. Report the fail with the literal probe command + exit code + stderr. 2. Suggest the user add a new entry to the catalog per its *Adding a new entry* section (symptom verbatim, root cause, fix, notes). 3. Once the catalog has the new entry, extend this skill with a matching probe in the same shape so the next doctor run catches it automatically. ## Extending the skill with a new probe When the catalog grows a new entry, add a matching probe section following the shape above: 1. **Command** — a short, deterministic, side-effect-free one-liner (or short Python heredoc) that triggers the failure mode reliably. 2. **Interpretation** — a 3–5-row table mapping result strings to ✓ / ✗ / ⊘ / ⚠. 3. **On ✗ → remediation** — a direct link to the matching section of `docs/setup/sandbox-troubleshooting.md`. Keep probes narrowly scoped: each probe tests **one** failure mode, not a bundle. A probe that conflates two restrictions makes the report ambiguous when the result is ✗.