---
name: llm-self-loop
description: 'Restructure Web-UI / human-triggered tasks into CLI + file-output loops the LLM can iterate alone. Open LLM-side observability — structured logs, file dumps, addressable scratchpads. Apply the trap-or-abandon decision: if a step cannot be looped, improve the harness rather than babysit. Trigger when the user mentions iterative grunt-work, "I have to push a button in a web UI to trigger this", monitoring dashboards, designing Claude-driven automation, or any workflow whose inner loop currently requires a human in the middle.'
---

The job: turn workflows that need a human in the inner loop into workflows the LLM closes itself. The two halves are *removing the trigger gate* and *opening observability*.

## Surface the gate first

Before proposing changes, name the trigger gate explicitly:

- *What action requires a human right now?* (button click, screenshot inspection, terminal interaction, web-form submission)
- *What signal does the human provide that the LLM cannot get on its own?* (visual confirmation, copy-paste, secret value, eyeball verdict)
- *Where does the result go?* (chat memory, screenshot, mental note)

Most loops have one or two gates whose removal collapses the cycle to seconds. Pick the smallest gate first.

## Structural fixes

### Web-UI trigger → CLI trigger

If the workflow is gated by clicking in a web app, find or build the equivalent CLI command. Webhooks, REST endpoints, `gh` / `aws` / `gcloud` CLI subcommands, internal `just` targets — anything programmatically invokable. The LLM can then loop without leaving its session.
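A minimal sketch of wrapping a CLI trigger so the loop gets a structured result back instead of a click. The function name `run_trigger` and the result fields are illustrative assumptions, and the command here is a stand-in that uses the Python interpreter itself; a real trigger (a `gh`, `aws`, or `gcloud` subcommand, or a `just` target) would be passed the same way.

```python
import json
import subprocess
import sys


def run_trigger(cmd: list[str], timeout_s: float = 60.0) -> dict:
    """Run a CLI trigger and return a machine-readable result (hypothetical shape)."""
    proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout_s)
    return {
        "cmd": cmd,
        "exit_code": proc.returncode,
        "ok": proc.returncode == 0,
        "stdout": proc.stdout.strip(),
        "stderr": proc.stderr.strip(),
    }


if __name__ == "__main__":
    # Stand-in command: replace with the real programmatic trigger.
    result = run_trigger([sys.executable, "-c", "print('deploy queued')"])
    print(json.dumps(result, indent=2))
```

Returning the exit code alongside captured output lets the loop branch on `ok` without parsing free text.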

### Stdout-only output → file-based output

If the workflow's result lives only in stdout, chat memory, or a screenshot, redirect it to a file the LLM can read back: structured JSON dumps, markdown reports, append-only logs with addressable offsets. *Why:* file outputs survive compaction, support diffing, and can be inspected by future sessions without replaying context.
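One way to get an append-only log with addressable offsets is a JSON Lines file where each append returns the byte offset of the record it wrote. This is a sketch under that assumption; `append_record` and `read_record` are hypothetical helper names, not part of any existing tool.

```python
import json
import os


def append_record(path: str, record: dict) -> int:
    """Append one JSON line and return the byte offset where it starts."""
    line = (json.dumps(record, sort_keys=True) + "\n").encode("utf-8")
    with open(path, "ab") as f:
        f.seek(0, os.SEEK_END)  # make the reported position explicit
        offset = f.tell()
        f.write(line)
    return offset


def read_record(path: str, offset: int) -> dict:
    """Read back the single record that starts at the given byte offset."""
    with open(path, "rb") as f:
        f.seek(offset)
        return json.loads(f.readline().decode("utf-8"))
```

A later session can then be handed just a path and an offset, for example `read_record("run.jsonl", 0)`, rather than the full transcript that produced the record.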

### Dashboards → structured logs

If verification requires eyeballing a Grafana / Datadog dashboard, surface the same metrics through a CLI query (PromQL, the Datadog API, a log-aggregation tail). Anything that produces a `pass`/`fail`/`warn` verdict the LLM can read will do.
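As a sketch of turning a metric into a verdict, the function below maps a Prometheus-style instant-query response to `pass`/`warn`/`fail`. The payload shape follows the vector result format of the Prometheus HTTP API; the thresholds, the "no data means fail" policy, and the names `verdict_from_prometheus` and `make_payload` are assumptions for illustration. Fetching the payload is left out on purpose.

```python
def verdict_from_prometheus(payload: dict, warn_at: float, fail_at: float) -> str:
    """Map the worst sample in an instant-query vector to pass/warn/fail."""
    if payload.get("status") != "success":
        return "fail"
    samples = payload["data"]["result"]
    if not samples:
        return "fail"  # assumption: missing data is treated as a failure
    # Each sample's "value" is [unix_timestamp, "<number as string>"].
    worst = max(float(s["value"][1]) for s in samples)
    if worst >= fail_at:
        return "fail"
    if worst >= warn_at:
        return "warn"
    return "pass"


def make_payload(value: str) -> dict:
    """Build a one-sample response for local experiments (illustrative fixture)."""
    return {
        "status": "success",
        "data": {
            "resultType": "vector",
            "result": [{"metric": {}, "value": [1700000000, value]}],
        },
    }
```

Keeping the threshold logic separate from the query means the verdict can be unit-tested without a live metrics backend.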

### Eyeball verdicts → contract assertions

If the human's role is "looks right to me", encode the criterion as a test, schema, or assertion. The contract becomes the loop's done-criterion (pair with `strict-validation-setup` for the bootstrap of those gates).
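A "looks right to me" check can often be written down as a list of explicit violations, where an empty list is the done-criterion. The fields below (`rows_processed`, `error_count`, `duration_s`) and the 30-second budget are hypothetical; the real criteria come from whatever the human was actually checking.

```python
def check_contract(result: dict, max_duration_s: float = 30.0) -> list[str]:
    """Return human-readable violations; an empty list means the contract passes."""
    violations: list[str] = []

    rows = result.get("rows_processed")
    if not isinstance(rows, int) or rows <= 0:
        violations.append("rows_processed must be a positive integer")

    if result.get("error_count") != 0:
        violations.append("error_count must be 0")

    duration = result.get("duration_s")
    if not isinstance(duration, (int, float)) or duration > max_duration_s:
        violations.append(f"duration_s must be a number <= {max_duration_s}")

    return violations
```

Returning the violations rather than raising on the first one gives the LLM the full list to fix in a single iteration.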

## Trap-or-abandon decision

After the structural fixes above, some steps still cannot be made autonomous — they involve genuine human judgment, external compliance, or capability the LLM lacks. For each remaining gate, apply this rule:

- **Trap** — if the step can be wrapped in a verification-and-iteration loop where the LLM proposes, the human approves once, and the LLM iterates until the contract passes, keep it. The human is at the *outer* loop, not the inner.
- **Abandon** — if a step requires the human in the *inner* loop and resists wrapping (e.g., new SOC2 review per iteration, real-time customer chat, hardware-mediated test), do not babysit. Either remove the step from the LLM's loop entirely (escalate to the human as a discrete handoff) or improve the harness so the step disappears (e.g., automate the SOC2 documentation pipeline).
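The trap case can be sketched as a loop with a single human checkpoint outside it. Everything here is illustrative: `run_trapped` and its callback parameters are hypothetical names, and the iteration cap is an assumed safeguard so that a loop which never satisfies the contract escalates instead of spinning.

```python
from typing import Any, Callable


def run_trapped(
    propose: Callable[[], Any],
    approve: Callable[[Any], bool],
    attempt: Callable[[Any, int], Any],
    contract: Callable[[Any], bool],
    max_iters: int = 5,
) -> dict:
    """Propose once, get one human approval, then iterate until the contract passes."""
    plan = propose()
    if not approve(plan):  # the only human checkpoint, in the outer loop
        return {"status": "rejected", "iterations": 0}
    for i in range(1, max_iters + 1):
        result = attempt(plan, i)  # inner loop: no human involved
        if contract(result):
            return {"status": "passed", "iterations": i}
    # Contract never passed: hand off as a discrete escalation.
    return {"status": "escalate", "iterations": max_iters}
```

If `approve` would have to be called inside the `for` loop to make the step work, that is the signal the step belongs in the abandon bucket.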

Naming the rule: babysitting an unloopable step is the failure mode this skill exists to prevent. Pre-existing chat consensus: "what can't be looped — abandon firmly and improve the harness."

## What this skill does not do

- It does not author project rules — defer to `init` for AGENTS.md.
- It does not bootstrap strict-mode validation gates — defer to `strict-validation-setup`.
- It does not pick the test framework — defer to `test-driven` or the language's idiomatic tester.

## Cross-references

- `strict-validation-setup` — bootstrap the gates this skill verifies against. Pair: bootstrap once, run many.
- `odin:duet` — adjacent two-party working posture. Use `duet` when preserving the human as inner-loop director is the goal; use this skill when removing the human from the inner loop is the goal. Different ends of the same axis.

## Posture

Surgical, not architectural. Remove one gate at a time. After each fix, re-evaluate whether the loop now closes — sometimes one trigger removal is enough. Resist the temptation to redesign the whole system.