---
name: rule-refactor
description: "Use when the rule set is over the Augment budget, when a new rule would breach it, or when asked to audit / merge / prune rules — runs the audit pipeline and proposes a verdict per rule."
source: package
domain: process
workspaces:
  - agent-config-maintainer
packs:
  - meta
lifecycle: active
trust:
  level: core
  confidence: high
  human_review_required: false
install:
  default: true
  removable: false
---

<!-- cloud_safe: degrade -->

# rule-refactor

## When to use

* `measure_augment_budget --check` fails (utilisation ≥ 0.95)
* A new rule would push the budget over 0.95 — caught by the budget
  gate in [`rule-writing`](../rule-writing/SKILL.md)
* User says "audit rules", "rule cleanup", "rules over budget",
  "prune rules", "merge rules", "rule system review"
* Periodic governance pass after a batch of rule additions

Do NOT use this skill for:

* Editing a single rule's content → [`rule-writing`](../rule-writing/SKILL.md)
* Picking always vs auto for one new rule → [`rule-writing`](../rule-writing/SKILL.md)

## Iron Law

**Threshold-lift is forbidden.** When the budget breaches, the
content must shrink — not the gate. Loosening `FAIL_THRESHOLD` in
`scripts/measure_augment_budget.py` to make CI pass is an explicit
anti-pattern. The only valid budget-growth move is an ADR that
raises `TOTAL_CAP`.

## Procedure

### 1. Inspect the current budget state

```bash
python3 scripts/measure_augment_budget.py --json > /tmp/budget-before.json
python3 scripts/measure_rule_budget.py --json > /tmp/rule-budget-before.json
```

### 2. Run the audit pipeline

The audit infrastructure already exists — compose it:

```bash
python3 scripts/audit_auto_rules.py      # → agents/runtime/reports/auto-rules-audit.{json,md}
python3 scripts/audit_overlap.py         # → appends overlap pairs to the MD
python3 scripts/audit_likelihood.py      # → agents/runtime/reports/auto-rules-likelihood.json
```

Then read `agents/runtime/reports/auto-rules-audit.md` end-to-end.

### 3. Categorise every flagged rule

For each rule the audit surfaces (overlap pair, low-likelihood, oversized,
or the new addition that triggered this skill), assign exactly one verdict:

| Verdict | Test |
|---|---|
| **keep** | Iron-Law / always-on safety net, no overlap, fires often |
| **merge** | ≥ 2 rules same domain, near-identical triggers, overlap ≥ 0.4 |
| **delete** | Never fires (low-likelihood + no path/keyword hit in 30 days), or fully subsumed by a skill |
| **move-to-context** | Body is reference material (tables, mechanics, examples) — the obligation is short, the rest is lookup |
| **promote-to-skill** | Body has numbered steps / a workflow — not a constraint |

### 4. Present the verdict table to the user

One Markdown table, one row per flagged rule, **before** any file
change. User approves the list. No silent edits.

### 5. Apply approved changes

For each approved verdict:

* **merge** → rewrite the surviving rule to cover both domains;
  delete the absorbed one; update any `routes_to:` references.
* **delete** → remove the file from `.agent-src.uncompressed/rules/`
  and the corresponding `.agent-src/rules/` projection.
* **move-to-context** → extract the body into
  `.agent-src.uncompressed/contexts/<area>/<name>.md`, replace the
  rule body with the obligation + a `load_context:` pointer.
* **promote-to-skill** → create
  `.agent-src.uncompressed/skills/<name>/SKILL.md`, replace the rule
  with an auto-trigger stub that routes to it (or delete the rule
  entirely if the skill's own trigger suffices).

### 6. Re-validate

```bash
bash scripts/compress.sh --sync
python3 scripts/compress.py --generate-tools
python3 scripts/measure_augment_budget.py --check   # must exit 0
python3 scripts/skill_linter.py --all               # 0 FAIL
```

Then run your package's full CI pipeline (see `Taskfile.yml` for the
canonical sequence) before pushing.

### 7. Record the delta

Append a snapshot to `agents/runtime/.augment-budget-history.jsonl`:

```bash
python3 scripts/measure_augment_budget.py --trend-append
```

Commit the cleanup as a separate chunk from any rule-add commits so
the history shows "added X" + "cleaned up Y" as distinct steps.

## Output format

1. Verdict table (approved by user) at the top of the cleanup PR description
2. Per-verdict commits (one per merge / delete / move / promote group)
3. Final `measure_augment_budget --check` output showing utilisation < 0.95
4. Trend snapshot recorded

## Gotchas

* Do NOT raise `FAIL_THRESHOLD` to dodge the audit
* Do NOT delete a rule that has a `routes_to:` pointer without
  updating the pointer's source
* Do NOT merge rules across tier boundaries (e.g. tier-1 always
  with a tier-3 stub) without surfacing the tier collapse to the user
* Do NOT skip the trend-append — the history is what tells future
  agents how the cap was managed

## Do NOT

* Do NOT loosen the budget gate
* Do NOT touch the cap (`TOTAL_CAP`) without an ADR
* Do NOT apply changes before user approves the verdict table
* Do NOT delete the rule-refactor audit reports — they're the
  artifact reviewers cite

## Cloud Behavior

On cloud surfaces, the audit scripts are not reachable. The skill
still applies — prose-only:

* Inspect the rule list (frontmatter + descriptions) and propose the
  verdict table from reading alone.
* Tell the user to run the audit scripts locally before applying.
* Do not attempt to call any script.
