---
name: agentfactory
description: Create governed Hermes agents as auditable enterprise operating units with manifest, capability stack, memory seed, deployment plan, verification contract, registry entry, and kill switch.
argument-hint: [Hermes agent intent]
disable-model-invocation: true
---

# /agentfactory

Create one governed Hermes agent for:

$ARGUMENTS

A Hermes agent is not a generic prompt. It is a purpose-built, self-contained autonomous operating unit with identity, scope, capabilities, memory, verification, deployment, handoff, audit trail, and a tested kill switch.

Hermes agents may be used as single workflow executors, department-specific assistants, or a coordinated enterprise operating layer. Even in a whole-company system, each Hermes agent must keep a narrow ownership boundary and escalate across boundaries instead of silently expanding authority.

## Non-Negotiable Contract

- Follow the phase sequence exactly: taste gate -> intent intake -> deep research -> manifest draft -> capability stack -> `HERMES-{SLUG}-SPEC.md` -> file generation -> introspect -> verify -> closeout and registry.
- Start every new Hermes agent with zero permissions. Add only the tools, MCP scopes, APIs, files, and workflows proven necessary by research and the manifest.
- Do not create or deploy an agent whose purpose conflicts with `taste.md`, `taste.vision`, or `hermes-factory.taste.md`.
- Do not generate agent files before writing `HERMES-{SLUG}-SPEC.md`.
- Do not mark an agent production-ready until the kill switch has executable test evidence.
- Do not accept contradictory memory seeds. Resolve or remove contradictions before file generation so every Hermes agent remains memory-coherent.
- Do not claim independent verification unless the verifier metadata proves a separate agent, process, model, workspace, or explicitly isolated same-session pass.
- Keep all output reproducible: the same intent answers and repo evidence must produce a functionally equivalent Hermes agent.
- Treat `revcli` or any other business runtime as the control plane when it already owns authorization, approval, audit, or system-of-record writes. Hermes agents call the runtime's governed actions instead of bypassing it.
- Treat runtime compatibility as a machine contract, not prose. Every Hermes agent must include `hermes.runtime.json` with parseable invocation, authority, approval, audit, kill-switch, and fixture evidence fields.
- Treat runtime capacity as a machine contract, not optimism. Every Hermes agent or fleet must include `development_host_profile`, `target_runtime_profile`, `host_capacity_profile`, `capacity_binding`, `concurrency_budget`, maximum parallel runs, queue/backpressure behavior, and `degrade_policy`; never assume the developer machine and production host have the same specs, and never assume either can run every Hermes agent at once.
- Any command capability must pin `cwd`, allowed argv shape, allowed config paths, env allowlist, max input size, input schema, denied flags, and expected exit/status behavior.
- Do not mark an enterprise or REVCLI-facing agent `active` when `verification_status` is `operator_exception`, when runtime evidence is missing, or when read-write/destructive authority is paired with exception-based verification.
- Keep `.minimaxing/state/CURRENT.md` updated enough that a `/compact` can resume without losing the active phase, open questions, generated paths, or verification status.

## Quality Constraints

| ID | Constraint | Factory Enforcement |
|----|------------|---------------------|
| C1 | Reproducible | Kernel questions, manifest schema, spec, and file formats are deterministic. |
| C2 | Auditable | Every capability grant has a manifest justification and verification evidence. |
| C3 | Malleable | Operators may override defaults, but every override is recorded in the manifest and spec. |
| C4 | Least privilege | Agents start with zero permissions; each permission is explicitly granted. |
| C5 | Killable | The kill switch must be documented, executable, and tested before production status. |
| C6 | Memory-coherent | Memory seeds require contradiction checks across semantic, procedural, error-solution, episodic, and causal graph tiers. |
| C7 | Taste-aligned | The purpose must pass project taste and factory taste gates. |
| C8 | Compaction-safe | Current phase, decisions, pending gates, and paths are recorded in `.minimaxing/state/CURRENT.md` or a workflow artifact before risky transitions. |
| C9 | Failure-cataloged | Agent Factory ships with a failure-mode catalog and seeds relevant error-solution entries for each agent. |
| C10 | Zero-trust verification | Readiness is decided by verification evidence, not executor confidence. |
| C11 | Runtime-bound | Production readiness requires a parseable runtime contract and executable runtime evidence. |
| C12 | Side-effect-safe | Side effects require argument constraints, approval gates, audit events, and rollback or compensation proof. |
| C13 | Capacity-aware | Development host capacity, target runtime capacity, concurrency budget, queue behavior, and degradation policy must be declared and verified before production readiness. |

## Agent Factory Workflow Artifact

Agent Factory is a workflow on its own, not a template generator. Every invocation must create and maintain a durable artifact before the Hermes manifest or spec is accepted:

```bash
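# Create the run directory and a timestamped artifact path for this invocation.
mkdir -p .taste/workflow-runs
STAMP="$(date +%Y%m%d-%H%M%S)"
AGENT_FACTORY_ARTIFACT=".taste/workflow-runs/${STAMP}-agentfactory.md"
# Create the file now so later phases can append to it.
touch "$AGENT_FACTORY_ARTIFACT"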
```

## OpusWorkflow Inheritance

For mutating AgentFactory work, `/opusworkflow` is the default outer route and
`/agentfactory` is the inner contract. Direct `/agentfactory` invocation remains
valid, but it must inherit the same Claude/Opus planner-reviewer plus
MiniMax-M2.7-highspeed executor policy before files change.

```text
outer_route: opusworkflow
inner_contract: agentfactory
```

Required section order for the Agent Factory workflow artifact:

```markdown
# Agent Factory Run: {agent intent}

## Task
## Taste Gate
## Intent Intake
## Deep Research Brief
## Source Ledger
## Runtime Audit
## Capacity-Aware Runtime
## Manifest Draft
## Capability Stack
## Research Sufficiency Introspection
## Hermes SPEC Decision
## File Generation Notes
## Readiness Introspection
## Independent Verification Evidence
## Registry And Memory Closeout
## Outcome
```

Required behavior:
- `## Deep Research Brief` must follow the same effectiveness-first shape as `/workflow`: collaborative research plan, search -> read -> refine loop, source ledger, contradiction handling, and follow-up research before freezing the manifest.
- `## Source Ledger` must separate cited sources, reviewed-but-not-cited sources, and rejected/downweighted sources.
- `## Runtime Audit` must name the target runtime's auth, approval, audit, state, irreversible actions, and kill switch surfaces.
- `## Capacity-Aware Runtime` must record both the development host profile and the target runtime profile. Use `bash scripts/parallel-capacity.sh --json` only for the development host or for a truly local target runtime. When the agent will run on a cloud server, container host, VPS, CI runner, managed queue, or another fleet runtime, record target runtime evidence from infrastructure config, provider limits, runtime telemetry, deployment docs, or an explicit operator-provided capacity contract. Concurrency budgets derive from the target runtime, not the dev PC, unless `capacity_binding.target_equals_development_host` is true.
- `## Independent Verification Evidence` must record executor/verifier metadata and never overclaim isolation.
- `## Outcome` must state whether the agent is `draft`, `experimental`, `active`, `paused`, or `blocked`.
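
The required section order can be checked mechanically. The following is a minimal sketch, not part of the factory itself: the function name `check_section_order` is hypothetical, and it assumes each required heading appears once as a level-two Markdown heading.

```python
import re

# Required headings, in the order the workflow artifact must use.
REQUIRED_SECTIONS = [
    "Task", "Taste Gate", "Intent Intake", "Deep Research Brief",
    "Source Ledger", "Runtime Audit", "Capacity-Aware Runtime",
    "Manifest Draft", "Capability Stack",
    "Research Sufficiency Introspection", "Hermes SPEC Decision",
    "File Generation Notes", "Readiness Introspection",
    "Independent Verification Evidence", "Registry And Memory Closeout",
    "Outcome",
]


def check_section_order(artifact_text: str) -> list[str]:
    """Return a list of problems; an empty list means the order is valid."""
    # Collect every level-two heading in document order.
    found = re.findall(r"^## (.+?)\s*$", artifact_text, flags=re.MULTILINE)
    problems = [
        f"missing section: {name}"
        for name in REQUIRED_SECTIONS
        if name not in found
    ]
    # Compare the required headings that are present against the required
    # order. A duplicated heading also shows up as an ordering problem.
    present = [name for name in found if name in REQUIRED_SECTIONS]
    expected = [name for name in REQUIRED_SECTIONS if name in present]
    if present != expected:
        problems.append("sections are out of order")
    return problems
```

Extra headings that are not in the required list are ignored, so an artifact may carry additional notes without failing the check.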

## Compaction Safety

Claude Code re-attaches only a bounded slice of invoked skill content after compaction. Treat this skill as compaction-sensitive:

- Before leaving any phase, write the current phase, pending gate, generated paths, source ledger status, and unresolved risks into `AGENT_FACTORY_ARTIFACT`.
- If resuming after `/compact`, `/resume`, or a stale `CURRENT.md`, re-read `.claude/skills/agentfactory/SKILL.md`, `AGENT_FACTORY_ARTIFACT`, `SPEC.md`, and `hermes-registry.md` before continuing.
- Never rely on memory of the skill body alone after compaction. Reload the file from disk when any later phase depends on the manifest schema, generated file formats, or verification requirements.

## Phase 0: Taste Gate

1. Read `taste.md` and `taste.vision`.
2. Read `hermes-factory.taste.md`. If it does not exist, create it before continuing with these required sections:
   - `principles`: least privilege, auditability, reproducibility, killability, bounded autonomy
   - `enterprise_operating_model`: Hermes agents are workflow-bounded operating units that may compose into department or company systems
   - `non_goals`: no omnipotent agent, no hidden credentials, no unmanaged business writes, no unverified production readiness
   - `approval_philosophy`: destructive, external, financial, legal, credential, or customer-visible actions require explicit approval unless the manifest proves bounded policy authorization
3. Check whether the proposed agent purpose aligns with `taste.md`, `taste.vision`, and `hermes-factory.taste.md`.
4. Record a taste decision:
   - `PASS`: proceed
   - `NEEDS_ALIGNMENT`: ask focused questions or update taste with explicit operator approval
   - `BLOCKED`: stop because the purpose contradicts taste
5. Update the workflow artifact with taste evidence and the decision.

Hard gate: `BLOCKED` stops the factory. Do not continue to intent intake.

## Phase 1: Hermes Intent Intake

Ask these 12 kernel questions verbatim before any research, planning, or agent file generation:

1. What is this Hermes agent's exact purpose in one sentence, with no weasel words?
2. What is the hard scope boundary: what will this agent NEVER do?
3. What decision authority level does it have: `read-only`, `read-write`, or `destructive-allowed`?
4. What escalation trigger makes it stop and ask a human?
5. What success metric will prove in 30 days that it is working?
6. What failure mode describes what a bad version of this agent looks like?
7. What target runtime environment will run it?
8. What memory must be pre-seeded before the first run?
9. What tools, MCP servers, APIs, files, and workflows is it explicitly authorized to use?
10. Who is the operator and accountability owner?
11. What deployment lifecycle does it use: `ephemeral`, `persistent`, or `scheduled`?
12. What is the kill switch, and how is it tested?

Rules:
- If any answer is missing, mark intake `INCOMPLETE`.
- If tool authorization says "whatever it needs", reject it and request an explicit list.
- If decision authority is `destructive-allowed`, require a separate approval policy and rollback proof.
- If the kill switch cannot be tested, the agent cannot be production-ready.
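
The intake rules above can be sketched as a small check. This is an illustration under stated assumptions: `check_intake`, the `COMPLETE` status, and the two boolean flags are hypothetical names, and every failure is folded into `INCOMPLETE` for simplicity. The kill-switch testability rule is left out because it needs operator judgment.

```python
AUTHORITY_LEVELS = {"read-only", "read-write", "destructive-allowed"}


def check_intake(
    answers: dict[int, str],
    has_approval_policy: bool = False,
    has_rollback_proof: bool = False,
) -> tuple[str, list[str]]:
    """Return (status, reasons). Keys of `answers` are question numbers 1-12."""
    reasons = []

    # Rule: any missing answer marks the intake INCOMPLETE.
    for number in range(1, 13):
        if not answers.get(number, "").strip():
            reasons.append(f"question {number} is unanswered")

    # Question 3 must name one of the three authority levels.
    authority = answers.get(3, "").strip()
    if authority and authority not in AUTHORITY_LEVELS:
        reasons.append("question 3 must name a valid authority level")

    # Rule: "whatever it needs" is not an acceptable tool authorization.
    if "whatever it needs" in answers.get(9, "").lower():
        reasons.append("question 9 must be an explicit authorization list")

    # Rule: destructive authority needs an approval policy and rollback proof.
    if authority == "destructive-allowed" and not (
        has_approval_policy and has_rollback_proof
    ):
        reasons.append(
            "destructive-allowed requires an approval policy and rollback proof"
        )

    return ("INCOMPLETE" if reasons else "COMPLETE", reasons)
```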

## Phase 2: Deep Research

Research before designing the Hermes agent. Use the smallest effective research budget that resolves material unknowns.

Required research branches:

| Branch | Required Evidence |
|--------|-------------------|
| Repo overlap | Existing agents, workflows, scripts, policies, profiles, skills, or runtime modules that overlap the intended purpose. |
| Runtime integration | How the target runtime invokes actions, stores state, handles auth, logs audit events, and applies approval policy. |
| Development host capacity | Local CPU/RAM class, configured `MAX_PARALLEL_AGENTS`, Codex `max_threads`, and local verification ceiling from `scripts/parallel-capacity.sh` or equivalent. |
| Target runtime capacity | Cloud/server/VPS/container/CI/managed-runtime CPU/RAM class, provider limits, autoscaling behavior, production concurrency limit, queue/backpressure behavior, maximum parallel runs, and degrade policy. |
| Runtime authority chain | Which system owns policy, which system owns durable state, which system is the system of record, and which writes are forbidden bypasses. |
| Approval and side-effect matrix | Every external, customer-visible, financial, legal, destructive, credential, workflow-state, or system-of-record action and its approval gate. |
| Identity and credentials | Runtime principal, auth mode, credential owner, vault/provider, env var names, expiry, rotation, revocation, and secret redaction. |
| Observability and evidence | Trace IDs, audit event schema, runtime evidence path, retention, redaction, and replay/recovery surfaces. |
| Failure modes | Relevant existing error-solution memories plus newly identified failure modes for this agent category. |
| External best practices | Current official docs or primary sources for agent guardrails, MCP, auth, approvals, observability, and deployment patterns when the design depends on them. |
| Taste contradictions | Any mismatch between intended behavior and `taste.md`, `taste.vision`, or `hermes-factory.taste.md`. |

Research output must include:
- collaborative research plan
- iterative search -> read -> refine loop log
- effective research budget and why it was not inflated
- source ledger with cited, reviewed-but-not-cited, and rejected/downweighted sources
- repo evidence with file paths and line references when available
- contradictions and how they were resolved
- implications for manifest, capabilities, memory seed, verification, deployment, and kill switch
- follow-up research performed or an explicit reason it was not needed

Research sufficiency gate:
- `PASS`: evidence is enough to draft the manifest.
- `FIX_REQUIRED`: run another research loop before design.
- `REPLAN_REQUIRED`: the intended agent should be split, narrowed, or blocked.
- `BLOCKED`: unresolved runtime, auth, approval, data, legal, or destructive-action ambiguity prevents safe design.

Hard gate: unresolved auth, approval, destructive action, or system-of-record ambiguity blocks manifest drafting.

## REVCLI Readiness Overlay

Apply this overlay whenever `target_runtime` includes `revcli`, `revis`, `odoo`, a CRM, a sales workflow, or any customer/company operating workflow. If any required item cannot be proven from repo evidence or operator intake, return `BLOCKED` or `REPLAN_REQUIRED`.

| Requirement | Contract |
|-------------|----------|
| Role-scoped profile | Declare exactly one primary REVCLI role or narrowly justified extension: `signal-ingestor`, `opening-seller`, `qualified-seller`, `manager-approver`, `owner-auditor`, or `delivery-owner`. |
| REVCLI policy authority | Hermes may interact, draft, classify, call bounded tools, or prepare handoffs, but workflow state transitions must go through REVCLI/Revis policy surfaces. |
| System-of-record boundary | Odoo, CRM, or Postgres writes must be mediated by REVCLI domain service, runtime API, MCP bridge, or approved adapter; direct unmanaged writes are forbidden. |
| Correlation fields | Every material run must carry `agent_profile_id`, `agent_session_id`, `agent_run_id`, `trace_id`, `workflow_run_id`, `approval_id` when applicable, `opportunity_id` or target object ID, `actor_id`, and audit event hash when the runtime supports it. |
| Auth mode | Declare `seat-attached` for human-attended sessions or `fleet-commercial` for unattended/service agents. Consumer subscription/session credentials are forbidden for shared autonomous fleets. |
| Approval gate map | First outbound touch, proposal send, close won/lost, owner reassignment, pricing/redline, evidence export, profile suspension, production workflow execution, and external side effects must map to explicit approval gates. |
| Tool and egress allowlist | Tools, MCP servers, APIs, commands, domains, and network destinations start denied and are allowlisted per workflow/profile. |
| Trace and audit evidence | The runtime contract must name the audit sink and prove redacted runtime events, tool calls, approvals, business-object mutations, and evidence exports are recorded. |
| Kill-switch compatibility | Generated agents must honor tenant, workflow, profile, provider, queue, credential, and sandbox/network kill switches where the runtime exposes them. |
| Closed-loop terminal state | Sales/prospecting agents must route work to `closed-won`, `closed-lost`, `nurture-active`, or `archived`, never open-ended consultation limbo. |
| No unmanaged execution channel | Governed production execution must use Revis web console, approved enterprise portal, controlled Slack/Teams notification-approval surfaces, or runtime API/MCP. WhatsApp, Telegram, personal Signal, SMS, and personal email execution are forbidden. |

Hard gate: if a REVCLI-facing agent needs a runtime bridge that does not exist yet, the generated status is `experimental` or `blocked`; it cannot be `active`.
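
As an illustration of the correlation-field requirement, the sketch below reports which fields a run record is missing. The function name and the `target_object_id` key are assumptions made for this example; the overlay only says "target object ID". The audit event hash is omitted because it applies only when the runtime supports it.

```python
# Correlation fields that every material run must carry.
REQUIRED_CORRELATION_FIELDS = (
    "agent_profile_id",
    "agent_session_id",
    "agent_run_id",
    "trace_id",
    "workflow_run_id",
    "actor_id",
)


def missing_correlation_fields(run: dict, approval_required: bool) -> list[str]:
    """Return the correlation fields that are absent or empty in `run`."""
    required = list(REQUIRED_CORRELATION_FIELDS)
    if approval_required:
        # approval_id is only required when an approval gate applies.
        required.append("approval_id")
    missing = [field for field in required if not run.get(field)]
    # Either an opportunity ID or another target object ID must be present.
    # `target_object_id` is a hypothetical key name.
    if not (run.get("opportunity_id") or run.get("target_object_id")):
        missing.append("opportunity_id or target_object_id")
    return missing
```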

## Phase 3: Hermes Manifest Drafting

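Draft the contents of `hermes.manifest.md` from the intake and research, recording the draft under `## Manifest Draft` in the workflow artifact; the file itself is generated in Phase 6, after the Hermes spec exists. The manifest is the agent identity and authority contract.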

### Manifest Schema

Use Markdown with YAML front matter followed by required sections. Field names are lowercase snake_case.

| Field | Required | Type | Valid Values | Invalid When |
|-------|----------|------|--------------|--------------|
| `manifest_version` | yes | string | semantic version of manifest schema, default `"1.0"` | missing, empty, non-string |
| `name` | yes | string | human-readable name, 3-80 chars | vague, duplicate, includes secret |
| `slug` | yes | string | lowercase kebab-case, unique in registry | not kebab-case, duplicate |
| `version` | yes | string | semver like `0.1.0` | not semver |
| `status` | yes | enum | `experimental`, `active`, `paused`, `deprecated` | anything else |
| `created_at` | yes | string | ISO date `YYYY-MM-DD` | missing or invalid date |
| `operator` | yes | string | person or team that created the agent | missing |
| `accountability_owner` | yes | string | accountable human owner | missing |
| `purpose` | yes | string | one sentence, no "help with" or broad verbs alone | more than one sentence, vague |
| `scope_boundary` | yes | list[string] | explicit non-actions | empty |
| `non_goals` | yes | list[string] | excluded outcomes | empty |
| `decision_authority` | yes | enum | `read-only`, `read-write`, `destructive-allowed` | mismatch with capabilities |
| `target_runtime` | yes | string | local, CI, server, revcli, MCP, cloud, or named runtime | missing |
| `runtime_control_plane` | yes | object | includes `name`, `owner`, `policy_authority`, `invocation_surface`, `state_store`, `audit_sink` | missing owner, policy authority, or audit sink |
| `system_of_record` | yes | object | includes `name`, `write_policy`, `adapter_or_api`, `direct_write_allowed` | direct write allowed without approval proof |
| `runtime_identity` | yes | object | includes `principal`, `auth_mode`, `tenant_scope`, `revocation_path` | shared hidden user credentials or missing revocation |
| `deployment_lifecycle` | yes | enum | `ephemeral`, `persistent`, `scheduled` | anything else |
| `capability_stack` | yes | list[object] | each object includes `type`, `name`, `scope`, `justification`, `risk`, `approval_required` | any permission lacks justification |
| `mcp_servers` | yes | list[object] | each object includes `name`, `transport`, `scopes`, `justification`, `approval_model` | broad scopes or missing approval model |
| `api_access` | yes | list[object] | each object includes `service`, `env_vars`, `allowed_methods`, `denied_methods`, `justification` | secret values included |
| `file_access` | yes | list[object] | each object includes `path`, `mode`, `purpose` | write access without reason |
| `workflow_access` | yes | list[object] | each object includes `workflow`, `allowed_actions`, `denied_actions`, `approval_required` | external side effects unapproved |
| `action_authority_matrix` | yes | list[object] | each object includes `action`, `side_effect_level`, `runtime_action`, `approval_gate`, `rollback_or_compensation`, `idempotency_key`, `audit_event` | side effect lacks approval or audit event |
| `credential_strategy` | yes | object | includes `env_var_names`, `vault_or_provider`, `scope`, `expiry`, `rotation`, `revocation`, `redaction` | raw secret value included |
| `egress_policy` | yes | object | includes `default`, `allowed_domains`, `denied_domains`, `proxy_or_filter`, `ssrf_controls` | default is allow-all for production |
| `durable_orchestration` | yes | object | includes `workflow_id`, `state_store`, `retry_policy`, `timeout_policy`, `resume_policy`, `idempotency` | persistent/scheduled agent lacks durable state |
| `development_host_profile` | yes | object | includes `source`, `hardware_class`, `cpu_cores`, `ram_gb`, `recommended_ceiling`, `codex_max_threads`, `max_parallel_agents`, `measured_at`, `evidence` | missing, stale, or presented as production capacity without binding proof |
| `target_runtime_profile` | yes | object | includes `environment`, `provider`, `region`, `instance_type`, `hardware_class`, `cpu_cores`, `ram_gb`, `gpu`, `runtime_limit`, `autoscaling`, `queue_limit`, `measured_at`, `evidence`, `confidence` | missing for non-local runtime, lacks evidence, or copies dev-host values without proof |
| `host_capacity_profile` | yes | object | compatibility summary that includes `development_host`, `target_runtime`, and `effective_runtime_limit` | flattens dev and target into one ambiguous host or lacks effective target limit |
| `capacity_binding` | yes | object | includes `target_equals_development_host`, `budget_basis`, `promotion_rule`, `minimum_target_evidence`, `stale_after`, `unknown_target_policy` | budget_basis is dev host for cloud/server target without explicit proof; unknown target can become active |
| `concurrency_budget` | yes | object | includes `max_parallel_runs`, `max_parallel_tools`, `queue_policy`, `backpressure_policy`, `degrade_policy`, `supervisor_review_capacity`, `verification_capacity` | exceeds capacity profile, lacks queue/backpressure, or has no degrade policy |
| `observability_contract` | yes | object | includes `trace_id`, `events`, `sink`, `redaction`, `retention`, `evidence_path` | no action-level attribution |
| `memory_seed` | yes | list[object] | each object includes `tier`, `id`, `content`, `source`, `contradiction_check` | contradiction unresolved |
| `success_criteria` | yes | list[string] | at least one objective, machine-checkable criterion | all criteria require human taste |
| `escalation_triggers` | yes | list[string] | concrete stop conditions | missing failure coverage |
| `kill_switch` | yes | object | includes `mechanism`, `owner`, `test_command`, `last_tested`, `expected_result` | untested for active status |
| `audit_logging` | yes | object | includes `events`, `sink`, `redaction`, `retention` | no runtime outcome logging |
| `handoff_protocol` | yes | object | includes `when`, `to_whom`, `payload`, `timeout` | missing owner or payload |
| `verification_status` | yes | enum | `draft`, `verified`, `failed`, `operator_exception` | `verified` without evidence; `operator_exception` with active/read-write/destructive status |
| `constraints` | yes | list[string] | references C1-C13 and project-specific constraints | empty |
| `source_ledger` | yes | list[object] | repo, memory, and external sources used | missing for non-trivial design |

Optional fields:

| Field | Required | Type | Valid Values | Invalid When |
|-------|----------|------|--------------|--------------|
| `parent_agent` | no | string | slug of supervisor agent | missing referenced registry entry |
| `child_agents` | no | list[string] | slugs of delegated agents | cyclic delegation |
| `schedule` | no | string | cron, interval, or event trigger | lifecycle is `scheduled` and schedule missing |
| `rollback_plan` | no | list[string] | concrete rollback steps | destructive authority and rollback missing |
| `cost_budget` | no | object | token, API, time, or spend limits | unbounded for persistent agent |
| `data_classification` | no | enum | `public`, `internal`, `confidential`, `restricted` | secrets treated as public |
| `operator_exception` | no | object | includes `owner`, `reason`, `expires_at`, `compensating_controls`, `approved_by` | used to bypass active production readiness |

Hard gate: no manifest field may contain raw secrets, passwords, tokens, customer-sensitive payloads, or hidden credential instructions.
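
Several scalar rows of the schema are simple enough to validate in code. The sketch below covers only `name`, `slug`, `version`, `created_at`, and four enum fields, and assumes the front matter has already been parsed into a dictionary. The function name is hypothetical, and the semver pattern is deliberately simplified to `MAJOR.MINOR.PATCH`.

```python
import re
from datetime import date

KEBAB_CASE = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")
SEMVER_CORE = re.compile(r"[0-9]+\.[0-9]+\.[0-9]+")  # simplified semver
ISO_DATE = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")

ENUMS = {
    "status": {"experimental", "active", "paused", "deprecated"},
    "decision_authority": {"read-only", "read-write", "destructive-allowed"},
    "deployment_lifecycle": {"ephemeral", "persistent", "scheduled"},
    "verification_status": {"draft", "verified", "failed", "operator_exception"},
}


def validate_scalar_fields(manifest: dict) -> list[str]:
    """Return validation errors for a subset of manifest fields."""
    errors = []

    name = manifest.get("name")
    if not isinstance(name, str) or not 3 <= len(name) <= 80:
        errors.append("name: must be a string of 3-80 characters")

    slug = manifest.get("slug")
    if not isinstance(slug, str) or not KEBAB_CASE.fullmatch(slug):
        errors.append("slug: must be lowercase kebab-case")

    version = manifest.get("version")
    if not isinstance(version, str) or not SEMVER_CORE.fullmatch(version):
        errors.append("version: must be semver like 0.1.0")

    created = manifest.get("created_at")
    if not isinstance(created, str) or not ISO_DATE.fullmatch(created):
        errors.append("created_at: must be YYYY-MM-DD")
    else:
        try:
            date.fromisoformat(created)  # rejects dates such as 2024-02-30
        except ValueError:
            errors.append("created_at: not a real calendar date")

    for field, allowed in ENUMS.items():
        value = manifest.get(field)
        if not isinstance(value, str) or value not in allowed:
            errors.append(f"{field}: must be one of {sorted(allowed)}")

    return errors
```

Registry uniqueness, secret detection, and the object-typed fields need more context than a single dictionary and are out of scope for this sketch.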

### Status Transition Matrix

| Manifest Status | Verification Status | Registry Section | Authority Allowed | Required Evidence |
|-----------------|---------------------|------------------|-------------------|-------------------|
| `experimental` | `draft` | Experimental Agents | `read-only` or bounded `read-write` with no production side effects | manifest, spec, runtime contract, residual risk |
| `experimental` | `failed` | Experimental Agents or Paused Agents | no production authority | failed verification evidence and blocker |
| `experimental` | `operator_exception` | Experimental Agents only | `read-only` only | exception owner, expiry, compensating controls, no production writes |
| `active` | `verified` | Active Agents | authority exactly as manifest | passing verify, runtime evidence, kill-switch evidence, audit evidence, no unresolved production blocker |
| `paused` | `failed` or `operator_exception` | Paused Agents | no new work | pause reason, kill-switch or disable evidence |
| `deprecated` | any non-active status | Deprecated Agents | none | replacement or deprecation reason |

Hard gate: `operator_exception` is not a waiver for enterprise production. It blocks `active`, `read-write`, and `destructive-allowed` authority.
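
The matrix and its hard gate can be expressed as a lookup plus two rules. This sketch is one reading of the table, not a definitive implementation: `check_status` is a hypothetical name, the `deprecated` row is interpreted as accepting any verification value, and the rule that experimental agents never hold `destructive-allowed` authority is inferred from the three experimental rows.

```python
# Verification statuses that each manifest status may pair with.
VERIFICATION_BY_STATUS = {
    "experimental": {"draft", "failed", "operator_exception"},
    "active": {"verified"},
    "paused": {"failed", "operator_exception"},
    # Interpretation: the matrix says "any non-active status"; this sketch
    # accepts every verification value for deprecated agents.
    "deprecated": {"draft", "verified", "failed", "operator_exception"},
}


def check_status(
    manifest_status: str, verification_status: str, decision_authority: str
) -> list[str]:
    """Return problems with a status/verification/authority combination."""
    allowed = VERIFICATION_BY_STATUS.get(manifest_status)
    if allowed is None:
        return [f"unknown manifest status: {manifest_status}"]

    problems = []
    if verification_status not in allowed:
        problems.append(
            f"{manifest_status} cannot pair with "
            f"verification_status {verification_status}"
        )
    # Hard gate: operator_exception blocks read-write and destructive authority.
    if verification_status == "operator_exception" and decision_authority != "read-only":
        problems.append("operator_exception allows read-only authority only")
    # Inferred from the matrix: no experimental row grants destructive authority.
    if manifest_status == "experimental" and decision_authority == "destructive-allowed":
        problems.append("experimental agents cannot hold destructive-allowed authority")
    return problems
```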

## Phase 4: Capability Stack Design

Design capabilities using least privilege.

1. Start with an empty capability list.
2. Add a capability only when a specific success criterion cannot be satisfied without it.
3. For each tool, MCP server, API, file path, and workflow action, record:
   - exact name and scope
   - required use case
   - denied use cases
   - approval requirement
   - failure mode it introduces
   - verification test
4. Prefer runtime-owned actions over direct system-of-record writes when a governed runtime exists.
5. Prefer local/private MCP connections when the runtime must own filtering, credentials, approvals, or audit.
6. Use hosted MCP only when the remote tool is already policy-bounded and the model-level connection is appropriate.
7. For command or shell capabilities, require:
   - exact `entrypoint`
   - pinned `cwd`
   - `argv_allowlist`
   - `denied_flags`
   - `allowed_config_paths`
   - `env_allowlist`
   - `input_schema`
   - `max_input_bytes`
   - expected exit codes or statuses
   - negative tests for config escape, oversized input, actor override, and forbidden side effect
8. For MCP capabilities, require OAuth/PKCE or equivalent runtime auth when applicable, token audience validation, HTTPS for remote auth endpoints, explicit `allowed_tools`, no query-string tokens, secure token storage, and SSRF/egress controls.
9. Place approval checks beside the tool or runtime action that creates the side effect. Do not rely only on top-level prompt instructions or outer guardrails.
10. Design capacity before designing fleet behavior:
    - run `bash scripts/parallel-capacity.sh --json` for `development_host_profile`
    - use the local result for `target_runtime_profile` only when `target_runtime` is local and `capacity_binding.target_equals_development_host` is true
    - record target runtime capacity evidence when the target host differs: IaC files, deployment manifests, provider instance class, container limits, queue limits, runtime telemetry, or an explicit operator capacity contract
    - set `concurrency_budget.max_parallel_runs` no higher than the measured or declared target runtime limit, not the development host ceiling
    - define `queue_policy`, `backpressure_policy`, and `degrade_policy`
    - if target runtime capacity is unknown, set status to `experimental` or `blocked`; never `active`
    - require human approval before increasing concurrency for customer-visible, financial, legal, destructive, or system-of-record actions
11. Build the memory scaffold:
    - semantic tier: known decisions and operating boundaries
    - procedural tier: known runtime invocation patterns
    - error-solution tier: failure modes and mitigations
    - episodic tier: creation event and verification run
    - causal graph tier: relationships between agent purpose, capabilities, risk, and success metrics
12. Build the system prompt from the manifest, not freeform improvisation.
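
The command-capability requirements in step 7 lend themselves to a mechanical check. The sketch below assumes a capability is a dictionary keyed by the names listed in that step; `expected_exit_codes` and both function names are hypothetical, and the flag names in the usage note are placeholders.

```python
# Keys a command or shell capability must pin before it is granted.
REQUIRED_COMMAND_KEYS = (
    "entrypoint",
    "cwd",
    "argv_allowlist",
    "denied_flags",
    "allowed_config_paths",
    "env_allowlist",
    "input_schema",
    "max_input_bytes",
    "expected_exit_codes",  # hypothetical key for "expected exit codes or statuses"
)


def missing_command_pins(capability: dict) -> list[str]:
    """Return the required keys that the capability does not declare."""
    return [key for key in REQUIRED_COMMAND_KEYS if key not in capability]


def uses_denied_flag(capability: dict, argv: list[str]) -> bool:
    """Return True when any argument is a denied flag, with or without =value."""
    denied = set(capability.get("denied_flags", []))
    return any(arg.split("=", 1)[0] in denied for arg in argv)
```

For example, a capability that denies `--actor` would reject both `--actor` and `--actor=root`, which is the shape a negative test for actor override could take.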

Hard gate: any capability without a manifest justification is removed before `HERMES-{SLUG}-SPEC.md`. Any Hermes fleet, cloud/server deployment, or scheduled agent without target runtime capacity evidence, concurrency budget, and degrade policy is blocked from `active` status.
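
The capacity rules in step 10 can be sketched as a check over the parsed manifest. This is an illustration under assumptions: `check_capacity` is a hypothetical name, and it treats `target_runtime_profile.runtime_limit` as the limit that `max_parallel_runs` must not exceed.

```python
def check_capacity(manifest: dict) -> list[str]:
    """Return capacity problems that should block `active` status."""
    problems = []
    binding = manifest.get("capacity_binding", {})
    target = manifest.get("target_runtime_profile", {})
    budget = manifest.get("concurrency_budget", {})

    # The dev host may be the budget basis only when it is also the target.
    same_host = binding.get("target_equals_development_host") is True
    if binding.get("budget_basis") == "development_host_profile" and not same_host:
        problems.append(
            "budget_basis is the development host but the target is a different host"
        )

    # The concurrency budget must fit inside the target runtime limit.
    limit = target.get("runtime_limit")
    runs = budget.get("max_parallel_runs")
    if not isinstance(limit, int) or limit < 1:
        problems.append(
            "target runtime limit is unknown; status must stay experimental or blocked"
        )
    elif not isinstance(runs, int) or runs < 1:
        problems.append("max_parallel_runs must be a positive integer")
    elif runs > limit:
        problems.append(f"max_parallel_runs {runs} exceeds target runtime limit {limit}")

    # Queue, backpressure, and degrade behavior must all be declared.
    for key in ("queue_policy", "backpressure_policy", "degrade_policy"):
        if not budget.get(key):
            problems.append(f"concurrency_budget.{key} is missing")
    return problems
```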

## Phase 5: SPEC.md For The Hermes Agent

Write `.taste/hermes-agents/{slug}/HERMES-{SLUG}-SPEC.md` before generating agent files.

Required sections:

```markdown
# HERMES SPEC: {name}

## Purpose Contract
## Taste Alignment
## Runtime And Integration Surface
## Runtime Contract
## Runtime Capacity Contract
## Authority Model
## Capability Grants
## Action Authority Matrix
## Memory Seed Contract
## Durable Orchestration Contract
## Verification Contract
## Escalation And Handoff Contract
## Kill Switch Contract
## Audit And Observability Contract
## Security And Credential Contract
## Success Criteria
## Non-Goals
## Failure Modes
## Implementation Plan
## Independent Verification Plan
## Rollback Plan
## Constraint Trace
```

Hard gate: do not generate `hermes.manifest.md`, prompts, memory seed, deployment docs, or registry entries until the Hermes spec exists and matches the manifest draft.

## Phase 6: Agent File Generation

Generate files under `.taste/hermes-agents/{slug}/`.

Required files:

| File | Purpose |
|------|---------|
| `hermes.manifest.md` | Identity, authority, capabilities, owners, and source ledger. |
| `hermes.system-prompt.md` | Runtime behavioral kernel derived from the manifest. |
| `hermes.taste.md` | Agent-specific operating principles and non-goals. |
| `hermes.memory-seed.json` | Pre-seeded memory entries with contradiction checks. |
| `hermes.runtime.json` | Machine-readable runtime contract for invocation, authority, approvals, audit, fixtures, and kill switch. |
| `hermes.deploy.md` | Invocation, runtime, env vars, auth, schedules, observability, and rollback. |
| `hermes.verify.md` | Smoke, boundary, escalation, memory, and kill-switch tests. |
| `hermes.kill-switch.md` | Concrete disable mechanisms and last test evidence. |
| `HERMES-{SLUG}-SPEC.md` | Formal contract written before the generated files. |
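
A quick completeness check over the generated directory might look like the sketch below. Both function names are hypothetical, and uppercasing the slug to fill `{SLUG}` is an assumption about the naming convention.

```python
from pathlib import Path

# Fixed file names from the required-files table.
FIXED_FILES = [
    "hermes.manifest.md",
    "hermes.system-prompt.md",
    "hermes.taste.md",
    "hermes.memory-seed.json",
    "hermes.runtime.json",
    "hermes.deploy.md",
    "hermes.verify.md",
    "hermes.kill-switch.md",
]


def expected_files(slug: str) -> list[str]:
    """Return every required file name for an agent with this slug."""
    # Assumption: {SLUG} is the slug in upper case.
    return FIXED_FILES + [f"HERMES-{slug.upper()}-SPEC.md"]


def missing_files(agent_dir: Path, slug: str) -> list[str]:
    """Return the required files that do not exist under `agent_dir`."""
    return [name for name in expected_files(slug) if not (agent_dir / name).is_file()]
```

Calling `missing_files(Path(".taste/hermes-agents") / slug, slug)` before closeout would list anything Phase 6 has not yet produced.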

### File Format: hermes.manifest.md

```markdown
---
manifest_version: "1.0"
name: "{Name}"
slug: "{slug}"
version: "0.1.0"
status: "experimental"
created_at: "YYYY-MM-DD"
operator: "{operator}"
accountability_owner: "{owner}"
purpose: "{one sentence}"
decision_authority: "read-only|read-write|destructive-allowed"
target_runtime: "{runtime}"
runtime_control_plane:
  name: "{runtime control plane}"
  owner: "{runtime owner}"
  policy_authority: "{policy authority}"
  invocation_surface: "{cli|api|mcp|workflow|sandbox}"
  state_store: "{state store}"
  audit_sink: "{audit sink}"
system_of_record:
  name: "{system of record}"
  write_policy: "{runtime-mediated|read-only|direct-with-approval}"
  adapter_or_api: "{adapter or API}"
  direct_write_allowed: false
runtime_identity:
  principal: "{agent principal}"
  auth_mode: "seat-attached|fleet-commercial|local-dev"
  tenant_scope: "{tenant/team/scope}"
  revocation_path: "{how identity is revoked}"
action_authority_matrix: []
credential_strategy:
  env_var_names: []
  vault_or_provider: "{vault/provider/runtime-injected}"
  scope: "{credential scope}"
  expiry: "{expiry policy}"
  rotation: "{rotation policy}"
  revocation: "{revocation path}"
  redaction: "{redaction policy}"
egress_policy:
  default: "deny"
  allowed_domains: []
  denied_domains: []
  proxy_or_filter: "{proxy/filter}"
  ssrf_controls: []
durable_orchestration:
  workflow_id: "{workflow id}"
  state_store: "{state store}"
  retry_policy: "{retry policy}"
  timeout_policy: "{timeout policy}"
  resume_policy: "{resume policy}"
  idempotency: "{idempotency policy}"
development_host_profile:
  source: "scripts/parallel-capacity.sh|operator-contract"
  hardware_class: "low|standard|high|workstation"
  cpu_cores: 0
  ram_gb: 0
  recommended_ceiling: 1
  codex_max_threads: 1
  max_parallel_agents: 1
  measured_at: "YYYY-MM-DD"
  evidence: "{local capacity command output or contract path}"
target_runtime_profile:
  environment: "local|ci|vps|cloud-vm|container|serverless|managed-workflow|revcli"
  provider: "{local|aws|gcp|azure|vercel|railway|fly|render|private-vps|custom}"
  region: "{region-or-n/a}"
  instance_type: "{instance/container/runtime class}"
  hardware_class: "low|standard|high|workstation|managed|unknown"
  cpu_cores: 0
  ram_gb: 0
  gpu: "none|declared"
  runtime_limit: 1
  autoscaling: "{none|horizontal|vertical|provider-managed|unknown}"
  queue_limit: 1
  measured_at: "YYYY-MM-DD"
  evidence: "{IaC, provider limit, runtime telemetry, deploy docs, or operator contract}"
  confidence: "high|medium|low"
host_capacity_profile:
  development_host: "{summary of development_host_profile}"
  target_runtime: "{summary of target_runtime_profile}"
  effective_runtime_limit: 1
capacity_binding:
  target_equals_development_host: false
  budget_basis: "target_runtime_profile|development_host_profile"
  promotion_rule: "active requires target runtime evidence unless target_equals_development_host is true"
  minimum_target_evidence: "{required evidence source}"
  stale_after: "{duration or date}"
  unknown_target_policy: "experimental-or-blocked"
concurrency_budget:
  max_parallel_runs: 1
  max_parallel_tools: 1
  queue_policy: "{fifo|priority|runtime-managed|none}"
  backpressure_policy: "{pause|queue|shed|escalate}"
  degrade_policy: "{downgrade-to-local|read-only|pause-new-work|escalate}"
  supervisor_review_capacity: 1
  verification_capacity: 1
observability_contract:
  trace_id: "{trace id field}"
  events: []
  sink: "{audit/trace sink}"
  redaction: "{redaction policy}"
  retention: "{retention policy}"
  evidence_path: "{evidence path}"
deployment_lifecycle: "ephemeral|persistent|scheduled"
verification_status: "draft"
---

# Hermes Manifest: {Name}

## Purpose
## Scope Boundary
## Non-Goals
## Capability Stack
## MCP Servers
## API Access
## File Access
## Workflow Access
## Action Authority Matrix
## Credential Strategy
## Egress Policy
## Durable Orchestration
## Development Host Profile
## Target Runtime Profile
## Host Capacity Profile
## Capacity Binding
## Concurrency Budget
## Observability Contract
## Memory Seed Summary
## Success Criteria
## Escalation Triggers
## Kill Switch
## Audit Logging
## Handoff Protocol
## Source Ledger
## Constraint Trace
```

### File Format: hermes.system-prompt.md

```markdown
# System Prompt: {Name}

## Identity
## Mission
## Authority
## Operating Rules
## Tool Policy
## Memory Policy
## Runtime Policy
## Runtime Control Plane
## System Of Record Boundary
## Escalation Policy
## Refusal Policy
## Output Contract
## Audit Logging Requirements
```

### File Format: hermes.taste.md

```markdown
---
agent: "{slug}"
version: "0.1.0"
---

# Hermes Taste: {Name}

## Principles
## Enterprise Operating Model
## Decision Style
## Scope Discipline
## Human Handoff
## Observability
## Non-Goals
```

### File Format: hermes.memory-seed.json

```json
{
  "schema_version": "1.0",
  "agent_slug": "{slug}",
  "generated_at": "YYYY-MM-DD",
  "contradiction_check": {
    "status": "pass",
    "method": "semantic/procedural/error-solution/episodic/causal graph cross-check",
    "inspected_entry_ids": [],
    "unresolved_entry_ids": [],
    "notes": []
  },
  "entries": [
    {
      "id": "{slug}-semantic-001",
      "tier": "semantic",
      "content": "Decision or boundary to preload.",
      "source": "manifest|repo|operator|research",
      "source_ref": "file-or-url-or-operator-note",
      "tags": ["hermes", "{slug}"],
      "supersedes": [],
      "contradicts": []
    }
  ]
}
```
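
The `contradiction_check` block and the per-entry `contradicts` arrays can be cross-checked mechanically before file generation. The following is a minimal sketch, not the factory's actual checker. The function names and the rule that a contradiction counts as resolved only when one entry lists the other in `supersedes` are assumptions made for illustration.

```python
def find_unresolved_contradictions(seed: dict) -> list:
    """Return (entry_id, contradicted_id) pairs that no `supersedes` link resolves."""
    entries = {e["id"]: e for e in seed.get("entries", [])}
    unresolved = []
    for entry_id, entry in entries.items():
        for other_id in entry.get("contradicts", []):
            other = entries.get(other_id, {})
            # Assumption: resolved only when one side supersedes the other.
            resolved = (other_id in entry.get("supersedes", [])
                        or entry_id in other.get("supersedes", []))
            if not resolved:
                unresolved.append((entry_id, other_id))
    return unresolved


def seed_is_coherent(seed: dict) -> bool:
    """Pass only if the declared check status and the entries agree."""
    check = seed.get("contradiction_check", {})
    return (check.get("status") == "pass"
            and not check.get("unresolved_entry_ids")
            and not find_unresolved_contradictions(seed))


# Hypothetical seed: the second entry contradicts the first without superseding it.
example_seed = {
    "contradiction_check": {"status": "pass", "unresolved_entry_ids": []},
    "entries": [
        {"id": "demo-semantic-001", "supersedes": [], "contradicts": []},
        {"id": "demo-semantic-002", "supersedes": [],
         "contradicts": ["demo-semantic-001"]},
    ],
}
print(find_unresolved_contradictions(example_seed))
# [('demo-semantic-002', 'demo-semantic-001')]
print(seed_is_coherent(example_seed))  # False
```

A seed that declares `"status": "pass"` while an entry still lists an unresolved `contradicts` target fails this check, which matches the rule that contradictions are resolved or removed before file generation.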

### File Format: hermes.runtime.json

```json
{
  "schema_version": "1.0",
  "agent_slug": "{slug}",
  "runtime_control_plane": {
    "name": "{revcli|revis|local|ci|custom}",
    "policy_authority": "{runtime policy file/API}",
    "state_store": "{durable state location}",
    "audit_sink": "{audit sink path/API}",
    "system_of_record": "{CRM/Odoo/Postgres/etc}"
  },
  "identity": {
    "principal": "{agent principal}",
    "auth_mode": "seat-attached|fleet-commercial|local-dev",
    "tenant_scope": "{scope}",
    "credential_source": "env|vault|oidc|runtime-injected",
    "env_var_names": [],
    "revocation_path": "{disable credential or principal}"
  },
  "entrypoint": {
    "type": "cli|api|mcp|workflow|sandbox",
    "command": "{command or endpoint}",
    "cwd": "{absolute or repo-relative cwd}",
    "argv_allowlist": [],
    "denied_flags": [],
    "allowed_config_paths": [],
    "env_allowlist": [],
    "input_schema": "{path or inline schema}",
    "max_input_bytes": 32768
  },
  "actions": {
    "allowed": [
      {
        "name": "{runtime action}",
        "side_effect_level": "none|internal-write|external-effect|destructive",
        "approval_gate": "{none|approval id|human gate}",
        "idempotency_key": "{required key}",
        "rollback_or_compensation": "{rollback path}",
        "audit_event": "{event name}"
      }
    ],
    "denied": []
  },
  "egress_policy": {
    "default": "deny",
    "allowed_domains": [],
    "denied_domains": [],
    "ssrf_controls": ["block-private-ip", "validate-redirects"]
  },
  "durable_orchestration": {
    "workflow_id": "{runtime workflow id}",
    "state_store": "{state store}",
    "retry_policy": "{retry policy}",
    "timeout_policy": "{timeout policy}",
    "resume_policy": "{resume policy}"
  },
  "development_host_profile": {
    "source": "scripts/parallel-capacity.sh|operator-contract",
    "hardware_class": "low|standard|high|workstation",
    "cpu_cores": 0,
    "ram_gb": 0,
    "recommended_ceiling": 1,
    "codex_max_threads": 1,
    "max_parallel_agents": 1,
    "measured_at": "YYYY-MM-DD",
    "evidence": "{local capacity command output or contract path}"
  },
  "target_runtime_profile": {
    "environment": "local|ci|vps|cloud-vm|container|serverless|managed-workflow|revcli",
    "provider": "{local|aws|gcp|azure|vercel|railway|fly|render|private-vps|custom}",
    "region": "{region-or-n/a}",
    "instance_type": "{instance/container/runtime class}",
    "hardware_class": "low|standard|high|workstation|managed|unknown",
    "cpu_cores": 0,
    "ram_gb": 0,
    "gpu": "none|declared",
    "runtime_limit": 1,
    "autoscaling": "{none|horizontal|vertical|provider-managed|unknown}",
    "queue_limit": 1,
    "measured_at": "YYYY-MM-DD",
    "evidence": "{IaC, provider limit, runtime telemetry, deploy docs, or operator contract}",
    "confidence": "high|medium|low"
  },
  "host_capacity_profile": {
    "development_host": "{summary of development_host_profile}",
    "target_runtime": "{summary of target_runtime_profile}",
    "effective_runtime_limit": 1
  },
  "capacity_binding": {
    "target_equals_development_host": false,
    "budget_basis": "target_runtime_profile|development_host_profile",
    "promotion_rule": "active requires target runtime evidence unless target_equals_development_host is true",
    "minimum_target_evidence": "{required evidence source}",
    "stale_after": "{duration or date}",
    "unknown_target_policy": "experimental-or-blocked"
  },
  "concurrency_budget": {
    "max_parallel_runs": 1,
    "max_parallel_tools": 1,
    "queue_policy": "fifo|priority|runtime-managed|none",
    "backpressure_policy": "pause|queue|shed|escalate",
    "degrade_policy": "downgrade-to-local|read-only|pause-new-work|escalate",
    "supervisor_review_capacity": 1,
    "verification_capacity": 1
  },
  "observability": {
    "trace_id_field": "trace_id",
    "events": [],
    "redaction": "{redaction rule}",
    "retention": "{retention rule}",
    "evidence_path": "{path}"
  },
  "kill_switch": {
    "mechanism": "{env/profile/workflow/tenant/provider/queue/sandbox}",
    "test_command": "{command}",
    "expected_exit_code": 0,
    "expected_status": "disabled",
    "expected_audit_event": "{event}",
    "evidence_path": "{path}"
  },
  "fixtures": [
    {
      "name": "{fixture name}",
      "path": "{fixture path}",
      "expected_status": "{expected status}",
      "expected_audit_event": "{event}"
    }
  ],
  "expected_statuses": ["processed", "skipped", "escalated", "denied", "disabled"]
}
```
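
Several rules stated elsewhere in this document can be checked directly against a parsed `hermes.runtime.json`: egress stays default-deny, command grants carry argument and input limits, side-effecting actions name an approval gate, and destructive actions name a rollback path. The sketch below is one possible check under those assumptions. The function name, the finding strings, and the example action are hypothetical.

```python
def check_runtime_contract(contract: dict) -> list:
    """Return human-readable findings; an empty list means no finding was raised."""
    findings = []
    if contract.get("egress_policy", {}).get("default") != "deny":
        findings.append("egress default is not deny")

    entry = contract.get("entrypoint", {})
    if entry.get("type") == "cli" and not entry.get("argv_allowlist"):
        findings.append("cli entrypoint has no argv allowlist")
    if not entry.get("max_input_bytes"):
        findings.append("entrypoint has no input size limit")

    for action in contract.get("actions", {}).get("allowed", []):
        name = action.get("name", "<unnamed>")
        level = action.get("side_effect_level", "none")
        if not action.get("audit_event"):
            findings.append(f"{name}: missing audit event")
        # Assumption: the literal value "none" means no approval gate is declared.
        if level != "none" and action.get("approval_gate", "none") == "none":
            findings.append(f"{name}: side effect without approval gate")
        if level == "destructive" and not action.get("rollback_or_compensation"):
            findings.append(f"{name}: destructive without rollback")
    return findings


example_contract = {
    "egress_policy": {"default": "deny"},
    "entrypoint": {"type": "cli", "argv_allowlist": ["run"], "max_input_bytes": 32768},
    "actions": {"allowed": [{
        "name": "update-ticket",  # hypothetical action name
        "side_effect_level": "internal-write",
        "approval_gate": "none",
        "audit_event": "ticket.updated",
    }]},
}
print(check_runtime_contract(example_contract))
# ['update-ticket: side effect without approval gate']
```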

### File Format: hermes.deploy.md

```markdown
# Deploy: {Name}

## Runtime
## Runtime Control Plane
## Development Host Capacity
## Target Runtime Capacity
## Capacity Binding
## Concurrency Budget
## Queue Backpressure And Degrade Policy
## Invocation
## Environment Variables
## Authentication
## Credential Source And Rotation
## System Of Record Boundary
## Authorized Network/API Surface
## Egress Policy
## Approval Gates
## Schedule Or Trigger
## Durable State And Resume
## Observability
## Rollback
## Operational Runbook
## Production Readiness Checklist
```

### File Format: hermes.verify.md

```markdown
# Verify: {Name}

## Verification Metadata
## Success Criteria Matrix
| test_id | command | fixture | expected_result | actual_result | evidence_path | verifier | status |
|---------|---------|---------|-----------------|---------------|---------------|----------|--------|
## Smoke Test
## Runtime Control Plane Test
## Development Host Capacity Test
## Target Runtime Capacity Test
## Capacity Binding Test
## Concurrency Budget Test
## Behavioral Boundary Test
## Memory Integrity Check
## Escalation Test
## Capability Authorization Test
## Argument Constraint Test
## Approval Gate Test
## Egress Policy Test
## Durable Resume Test
## Kill Switch Test
## Audit Log Test
## Trace And Evidence Test
## Result
```

### File Format: hermes.kill-switch.md

```markdown
# Kill Switch: {Name}

## Owner
## Disable Mechanisms
## Test Procedure
## Test Command
## Fixture
## Expected Exit Code
## Expected Runtime Status
## Expected Audit Event
## Evidence Path
## Expected Result
## Last Tested At
## Last Test Result
## Last Test Evidence
## Recovery Procedure
## Failure Escalation
```
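
The `Test Command`, `Expected Exit Code`, `Expected Runtime Status`, and `Expected Audit Event` sections mirror the `kill_switch` block in `hermes.runtime.json`, so a recorded test run can be compared against the contract mechanically. This is a minimal sketch, assuming the test harness captures an `observed` record with `exit_code`, `status`, and `audit_events` keys. That record shape, the event name, and the evidence path are hypothetical.

```python
def evaluate_kill_switch(expected: dict, observed: dict) -> list:
    """Compare one recorded kill-switch run against the runtime contract."""
    failures = []
    if observed.get("exit_code") != expected.get("expected_exit_code"):
        failures.append("exit code mismatch")
    if observed.get("status") != expected.get("expected_status"):
        failures.append("runtime status mismatch")
    # The audit event must come from the audit sink, not from stdout.
    if expected.get("expected_audit_event") not in observed.get("audit_events", []):
        failures.append("audit event not found in audit sink")
    if not expected.get("evidence_path"):
        failures.append("no evidence path recorded")
    return failures


expected = {
    "expected_exit_code": 0,
    "expected_status": "disabled",
    "expected_audit_event": "agent.disabled",       # hypothetical event name
    "evidence_path": "evidence/kill-switch.json",   # hypothetical path
}
observed = {"exit_code": 0, "status": "disabled", "audit_events": []}
print(evaluate_kill_switch(expected, observed))
# ['audit event not found in audit sink']
```

A run that exits cleanly and reports `disabled` still fails here when the audit sink holds no matching event, which is the case the `Audit mirage` failure mode describes.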

### File Format: HERMES-{SLUG}-SPEC.md

Use the exact Phase 5 section list. The file must be created before the other generated files and archived to `.taste/specs/` during closeout.

## Phase 6.5: Introspect Hard Gate

Run introspection inline before declaring the agent ready.

Required checks:
- Is every tool authorization justified by a specific use case in the manifest?
- Is every runtime action represented in `hermes.runtime.json` with allowed/denied actions, argument constraints, input limits, and audit events?
- Does `development_host_profile` describe the machine used to build and verify, without being treated as production capacity by default?
- Does `target_runtime_profile` have real evidence when the agent will run in cloud, CI, VPS, container, serverless, REVCLI, or another non-local runtime?
- Does `capacity_binding` prove whether the target equals the development host, and does `concurrency_budget` stay at or below the measured or declared target runtime limit?
- Does the agent define queue/backpressure behavior and `degrade_policy` for overload, host downgrade, or runtime throttling?
- Do the escalation triggers cover every identified failure mode?
- Is there at least one testable success criterion that does not require human judgment?
- Does the kill switch actually work, with test evidence, or is it only documented?
- Are there any permissions that contradict the declared decision authority level?
- Does the system prompt introduce authority not present in the manifest?
- Do memory seeds contradict each other across tiers?
- Does the deployment plan bypass the runtime's approval, audit, or system-of-record rules?
- Does any MCP/API/command access bypass the declared credential, egress, approval, or audit policy?
- Is `operator_exception` being used to bypass production readiness? If yes, block active status.

Decision:
- `PASS`: continue to independent verification.
- `FIX_REQUIRED`: correct files and re-run introspection.
- `REPLAN_REQUIRED`: revise manifest/spec/capability stack before regenerating.
- `BLOCKED`: stop and report the blocker.

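The capacity checks in the list above lend themselves to a mechanical pass over the parsed manifest front matter. The sketch below is illustrative only. It assumes the front matter has already been loaded into a dictionary, treats `target_runtime_profile.runtime_limit` as the ceiling for both run and tool concurrency, and uses hypothetical finding strings.

```python
def check_capacity(manifest: dict) -> list:
    """Flag capacity findings named in the introspection checklist."""
    findings = []
    binding = manifest.get("capacity_binding", {})
    target = manifest.get("target_runtime_profile", {})
    budget = manifest.get("concurrency_budget", {})

    same_host = binding.get("target_equals_development_host", False)
    if not same_host and binding.get("budget_basis") == "development_host_profile":
        findings.append("dev-host leakage: non-local target budgeted from development host")
    if not same_host and (not target.get("evidence") or target.get("confidence") == "low"):
        findings.append("target capacity fiction: evidence missing or low confidence")

    # Simplification: the target runtime limit is used even when both hosts are the same.
    limit = target.get("runtime_limit", 1)
    for key in ("max_parallel_runs", "max_parallel_tools"):
        if budget.get(key, 1) > limit:
            findings.append(f"{key} exceeds target runtime_limit {limit}")
    return findings


example_manifest = {
    "capacity_binding": {"target_equals_development_host": False,
                         "budget_basis": "development_host_profile"},
    "target_runtime_profile": {"runtime_limit": 2, "confidence": "high",
                               "evidence": "infra/main.tf"},  # hypothetical path
    "concurrency_budget": {"max_parallel_runs": 4, "max_parallel_tools": 2},
}
for finding in check_capacity(example_manifest):
    print(finding)
# dev-host leakage: non-local target budgeted from development host
# max_parallel_runs exceeds target runtime_limit 2
```

Any finding from a check like this would map to `FIX_REQUIRED` or `BLOCKED` rather than `PASS`; which one applies is left to the decision rules above.
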
## Phase 7: Independent Verification

Verify the Hermes agent against its spec.

Required tests:

| Test | What It Proves |
|------|----------------|
| Smoke test | The agent can be invoked in the target runtime and returns a scoped response. |
| Behavioral boundary test | The agent refuses or escalates tasks outside its scope. |
| Memory integrity check | Seed entries are coherent and non-contradictory across tiers. |
| Escalation test | The agent stops and hands off when a defined trigger occurs. |
| Capability authorization test | The agent cannot use tools or workflows outside its manifest. |
| Runtime control plane test | The agent can only perform side effects through the declared runtime control plane. |
| Development host capacity test | `development_host_profile` is current and evidenced for the build/verification host. |
| Target runtime capacity test | `target_runtime_profile` is current, evidenced, and matches the cloud/server/runtime where the agent will actually run. |
| Capacity binding test | The manifest proves whether target and development host are the same; non-local targets do not inherit local PC capacity. |
| Concurrency budget test | `max_parallel_runs` and tool concurrency do not exceed target runtime capacity; overload follows `degrade_policy`. |
| Argument constraint test | Disallowed argv, config paths, env vars, oversized inputs, and actor overrides are denied. |
| Approval gate test | Side-effecting actions pause or route through the declared approval gate. |
| Egress policy test | Unlisted network/API destinations are denied or escalated. |
| Durable resume test | Scheduled/persistent agents preserve workflow ID, state, idempotency, and approval wait state across interruption. |
| Kill switch test | Disabling the agent prevents new work and leaves audit evidence. |
| Audit log test | Material decisions, actions, escalations, and outcomes are recorded. |

Required adversarial stress cases:

| Stress Case | Expected Result |
|-------------|-----------------|
| Purpose is too broad or spans multiple departments | Factory splits the intent or blocks as `Enterprise monolith`. |
| Tool authorization says "whatever it needs" | Factory rejects intake as incomplete. |
| Manifest contains raw secret material | Factory blocks before file generation. |
| `read-only` authority includes any write tool | Factory returns `FIX_REQUIRED`. |
| `destructive-allowed` lacks approval and rollback proof | Factory returns `BLOCKED`. |
| Runtime has governed action but manifest grants direct system-of-record write | Factory removes grant or blocks as runtime bypass. |
| Command grant pins binary but not argv/config/input limits | Factory returns `FIX_REQUIRED` for argument escape risk. |
| MCP server exposes write/security/financial tools without `allowed_tools` narrowing | Factory blocks capability grant. |
| Audit test only checks stdout, not runtime audit sink | Factory returns `FIX_REQUIRED` for audit mirage. |
| Concurrency budget exceeds host or runtime capacity | Factory lowers the budget, queues work, or blocks active status. |
| Host capacity profile is missing or stale | Factory reruns capacity detection or requires production runtime evidence. |
| Development host profile is used as cloud runtime capacity without binding proof | Factory blocks active status until target runtime evidence exists. |
| Memory seed contradicts another tier | Factory blocks until superseded or resolved. |
| Kill switch is documented but untested | Agent cannot become `active`. |
| Registry row claims `active` without verification metadata | Factory downgrades status or blocks closeout. |
| Verifier is same session but reported as separate | Factory corrects metadata and downgrades confidence. |
| `operator_exception` is paired with `active`, `read-write`, or `destructive-allowed` | Factory blocks promotion. |

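The authority-related stress cases can be expressed as a small decision function over the `actions.allowed` entries of `hermes.runtime.json`. This is a sketch under stated assumptions, not the factory's implementation: the function name is hypothetical, and returning `FIX_REQUIRED` when a non-destructive authority level carries a destructive action is an assumption that extends the `read-only` rule in the table.

```python
WRITE_LEVELS = {"internal-write", "external-effect", "destructive"}


def authority_decision(decision_authority: str, actions: list) -> str:
    """Return an introspection decision for the authority stress cases."""
    for action in actions:
        level = action.get("side_effect_level", "none")
        if decision_authority == "read-only" and level in WRITE_LEVELS:
            return "FIX_REQUIRED"  # read-only authority includes a write capability
        if level == "destructive":
            if decision_authority != "destructive-allowed":
                return "FIX_REQUIRED"  # assumption: authority mismatch, see lead-in
            gated = action.get("approval_gate", "none") != "none"
            if not (gated and action.get("rollback_or_compensation")):
                return "BLOCKED"  # destructive action lacks approval or rollback proof
    return "PASS"


print(authority_decision("read-only", [{"side_effect_level": "internal-write"}]))
# FIX_REQUIRED
print(authority_decision("destructive-allowed",
                         [{"side_effect_level": "destructive", "approval_gate": "none"}]))
# BLOCKED
```
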
Verification metadata must record:
- executor identity/model/workspace
- verifier identity/model/workspace
- isolation status: `proved separate`, `separate process`, `same session independent pass`, or `unknown`
- files inspected
- commands run
- criteria passed/failed
- residual risk

Hard gate: verification failure prevents `active` status. Use `experimental` only when residual risk is explicit and no production authority is granted.

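The hard gate can be sketched as a predicate over the recorded verification metadata. The key names `criteria_passed`, `criteria_failed`, and `isolation_status` are hypothetical stand-ins for the metadata fields listed above, and treating `unknown` isolation as failing the gate is an assumption.

```python
ACCEPTED_ISOLATION = {
    "proved separate",
    "separate process",
    "same session independent pass",
}


def may_be_active(metadata: dict, kill_test_passed: bool) -> bool:
    """True only when verification, isolation, and kill-switch evidence all pass."""
    verified = (bool(metadata.get("criteria_passed"))
                and not metadata.get("criteria_failed")
                and metadata.get("isolation_status") in ACCEPTED_ISOLATION)
    return verified and kill_test_passed


metadata = {
    "criteria_passed": ["smoke", "kill-switch"],
    "criteria_failed": [],
    "isolation_status": "unknown",
}
print(may_be_active(metadata, kill_test_passed=True))  # False
```

An agent that fails this predicate stays out of `active`; whether it may be registered as `experimental` still depends on explicit residual risk and the absence of production authority.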
## Phase 8: Closeout And Registry

1. Write the completed agent directory to `.taste/hermes-agents/{slug}/`.
2. Update `hermes-registry.md` at the project root.
3. Register each agent row with:
   - name
   - slug
   - purpose
   - version
   - status: `active`, `deprecated`, `experimental`, or `paused`
   - decision authority
   - runtime control plane
   - system of record
   - lifecycle
   - operator
   - created date
   - last verified date
   - manifest link
   - spec link
   - verify link
   - kill switch link
   - runtime evidence link
   - verification isolation
   - last kill test result
4. Log creation to semantic memory:

```bash
bash scripts/memory.sh add semantic "Hermes agent created: {slug}. Purpose: {purpose}. Status: {status}. Owner: {owner}." --tags "hermes,agentfactory,{slug}"
```

5. Log Agent Factory failure modes relevant to this agent to error-solution memory.
6. After verified closeout, copy `HERMES-{SLUG}-SPEC.md` to `.taste/specs/` and keep the canonical agent-local spec in `.taste/hermes-agents/{slug}/`. If the spec is moved instead of copied, update the registry link in the same closeout.
7. Update `.minimaxing/state/CURRENT.md` or the workflow artifact with final paths, verification result, and residual risks.

Hard gate: no registry entry may claim `active` unless `hermes.verify.md`, `hermes.kill-switch.md`, `hermes.runtime.json`, and registry columns all record passing runtime, verification, kill-switch, and isolation evidence.

## hermes-registry.md Schema

Use this exact root file format:

```markdown
# Hermes Registry

## Registry Contract
- Source of truth for Hermes agents created by /agentfactory.
- Status values: active, deprecated, experimental, paused.
- Every active agent must link to manifest, spec, verification, kill-switch evidence, and runtime evidence.
- Registry updates require /agentfactory or an explicit operator-approved maintenance change.

## Active Agents

| Name | Slug | Purpose | Version | Authority | Runtime | System Of Record | Lifecycle | Operator | Created | Last Verified | Verification Isolation | Last Kill Test | Manifest | Spec | Verify | Kill Switch | Runtime Evidence | Status |
|------|------|---------|---------|-----------|---------|------------------|-----------|----------|---------|---------------|------------------------|----------------|----------|------|--------|-------------|------------------|--------|

## Experimental Agents

| Name | Slug | Purpose | Version | Authority | Runtime | System Of Record | Lifecycle | Operator | Created | Last Verified | Verification Isolation | Last Kill Test | Manifest | Spec | Verify | Kill Switch | Runtime Evidence | Status |
|------|------|---------|---------|-----------|---------|------------------|-----------|----------|---------|---------------|------------------------|----------------|----------|------|--------|-------------|------------------|--------|

## Paused Agents

| Name | Slug | Purpose | Version | Authority | Runtime | System Of Record | Lifecycle | Operator | Created | Paused Reason | Manifest | Spec | Verify | Kill Switch | Runtime Evidence | Status |
|------|------|---------|---------|-----------|---------|------------------|-----------|----------|---------|---------------|----------|------|--------|-------------|------------------|--------|

## Deprecated Agents

| Name | Slug | Purpose | Version | Authority | Runtime | System Of Record | Lifecycle | Operator | Created | Deprecated Reason | Manifest | Spec | Verify | Kill Switch | Runtime Evidence | Status |
|------|------|---------|---------|-----------|---------|------------------|-----------|----------|---------|-------------------|----------|------|--------|-------------|------------------|--------|

## Change Log

| Date | Operator | Change | Evidence |
|------|----------|--------|----------|
```
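
Because the Active and Experimental tables share one fixed column order, a row can be rendered from a dictionary keyed by column name. The helper below is a sketch, not part of /agentfactory. It raises `KeyError` on a missing column instead of writing a blank cell, and the agent values are hypothetical.

```python
ACTIVE_COLUMNS = [
    "Name", "Slug", "Purpose", "Version", "Authority", "Runtime",
    "System Of Record", "Lifecycle", "Operator", "Created", "Last Verified",
    "Verification Isolation", "Last Kill Test", "Manifest", "Spec", "Verify",
    "Kill Switch", "Runtime Evidence", "Status",
]


def registry_row(agent: dict) -> str:
    """Render one table row in the Active/Experimental column order."""
    # Escape pipes so a cell value cannot break the table layout.
    cells = [str(agent[col]).replace("|", "\\|") for col in ACTIVE_COLUMNS]
    return "| " + " | ".join(cells) + " |"


agent = {col: "n/a" for col in ACTIVE_COLUMNS}
agent.update(Name="Invoice Triage", Slug="invoice-triage", Status="experimental")
print(registry_row(agent))
```

The Paused and Deprecated tables use a shorter column list, so a real helper would need one column list per table.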

## Failure Mode Catalog For Agent Factory

Seed these entries into the Agent Factory error-solution tier and copy relevant entries into each agent's memory seed:

| Failure Mode | Trigger | Blast Radius | Detection Signal | Mitigation |
|--------------|---------|--------------|------------------|------------|
| Permission creep | Tool list expands beyond manifest use cases | Agent can act outside intended boundary | Capability without success-criterion trace | Remove capability; require manifest justification and verifier check. |
| Prompt-only agent | Factory writes only a system prompt | No deployable or auditable runtime | Missing manifest, spec, deploy, verify, or kill-switch file | Block closeout until all required files exist. |
| Unkillable agent | Kill switch documented but untested | Persistent bad automation continues | `Last Tested At` is empty or the kill-switch test failed | Keep status experimental/paused; test disable path before active. |
| Memory contradiction | Seeds conflict across tiers | Agent follows stale or opposing policy | Contradiction check reports unresolved entries | Resolve, supersede, or delete seed before generation. |
| Runtime bypass | Hermes writes directly to system of record | Approval/audit policies are skipped | Direct API scope exists while runtime action exists | Route through runtime-owned action; deny direct write. |
| Paper runtime contract | Generated files describe runtime but lack `hermes.runtime.json` | Agent cannot be invoked or verified reproducibly | Missing entrypoint, fixtures, or evidence path | Block file generation until runtime contract exists. |
| Argument escape | Command grant pins binary but not args/config/input | Agent runs approved command in unapproved mode | Disallowed argv/config path accepted in verification | Require argv allowlist, config allowlist, env allowlist, input schema, and negative tests. |
| Audit mirage | Stdout is treated as audit evidence | Incident reconstruction is impossible | No runtime audit sink event or trace ID | Require audit sink evidence and trace/action attribution. |
| Capacity hallucination | Agent or fleet assumes a fixed 10-agent host | Host overload, timeouts, noisy verification, missed SLAs | Missing/stale `host_capacity_profile` or budget exceeds measured ceiling | Require capacity evidence; cap runs; queue, degrade, or pause new work. |
| Dev-host leakage | Local development PC capacity is copied into a cloud/server agent budget | Production runtime overload or underprovisioned fleet | `budget_basis` is `development_host_profile` while target runtime is non-local | Require `target_runtime_profile`; block active status until target evidence exists. |
| Target capacity fiction | Operator names a cloud/server target but provides no instance, container, queue, or provider limit evidence | Agent system cannot be sized or verified | Target runtime profile confidence is low or evidence is empty | Mark experimental/blocked; require IaC, provider class, telemetry, or operator capacity contract. |
| Autoscaling mirage | Manifest assumes autoscaling means unlimited parallel agents | Cost spikes, rate limits, queue storms, verifier backlog | Autoscaling field lacks max replicas, queue limit, or review capacity | Cap concurrency by target limit and supervisor verification capacity; require backpressure. |
| Backpressure gap | Runtime accepts more work than the supervisor can review | Bad outputs pile up faster than verification | Queue length grows while verification is stale | Lower `max_parallel_runs`; enforce `degrade_policy`; escalate overload. |
| Exception laundering | `operator_exception` is used to ship production authority | Unverified agent becomes trusted | Active/read-write/destructive status with exception | Block active and write authority until verification passes. |
| Authority mismatch | `read-only` agent gets write tool | Business data changes despite read-only contract | Permission contradicts `decision_authority` | Remove permission or change authority with approval and spec update. |
| Missing escalation | Failure mode has no stop condition | Agent guesses through ambiguous/high-risk cases | Failure catalog entry lacks escalation trigger | Add trigger and test it. |
| Registry drift | Agent files change without registry update | Operators cannot tell what is running | Manifest version differs from registry | Block closeout; update registry and changelog. |
| Verification theater | Executor claims readiness without independent evidence | Unsafe agent becomes trusted | Missing verifier metadata or tests | Run verification; record isolation; downgrade status if not separate. |
| Enterprise monolith | One Hermes agent is scoped to run everything | Unbounded autonomy and unclear accountability | Purpose spans multiple departments or systems | Split into bounded agents plus supervisor registry/orchestration contract. |

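The `Registry drift` detection signal (manifest version differs from registry) can be checked with a plain text comparison. The sketch below assumes the only line in `hermes.manifest.md` that starts with `version:` is the front matter field, and that the registry version has already been read from the agent's row. Both function names are hypothetical.

```python
import re
from typing import Optional


def front_matter_version(manifest_text: str) -> Optional[str]:
    """Read the `version` field; `manifest_version` lines are not matched."""
    match = re.search(r'^version:\s*"?([^"\n]+)"?\s*$', manifest_text,
                      flags=re.MULTILINE)
    return match.group(1).strip() if match else None


def has_registry_drift(manifest_text: str, registry_version: str) -> bool:
    """True when the manifest and the registry row disagree on the version."""
    return front_matter_version(manifest_text) != registry_version


manifest_text = "\n".join([
    "---",
    'manifest_version: "1.0"',
    'slug: "invoice-triage"',   # hypothetical agent
    'version: "0.2.0"',
    "---",
])
print(front_matter_version(manifest_text))         # 0.2.0
print(has_registry_drift(manifest_text, "0.1.0"))  # True
```
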
## Closeout Format

Report:
- agent slug and path
- status and authority
- capability count and highest-risk capability
- verification result and isolation status
- kill-switch test result
- registry update result
- memory entries written or skipped
- residual risks