---
name: ask-hermes
description: Consult Hermes (the user's multi-provider LLM CLI, currently routing to GPT-5.5 / codex-class) as a strategic advisor (军师) via a `hermes chat -q` shell-out, then synthesize Hermes's response into Claude's own answer with explicit attribution. Use when the user explicitly asks for Hermes's view ("问 hermes", "ask hermes", "军师怎么说", "second opinion", "另一个 AI 怎么看", "让 hermes 也说一下"), or proactively when Claude faces a judgment / recommendation / architecture / trade-off / tool-selection decision and a second-AI cross-check materially reduces risk — including areas Claude tends to over-hedge. Claude stays in the lead and synthesizes; it is never a relay. Replaces ask-grok as the default 军师 (Grok sanctioned 2026-05); Grok remains an optional fallback.
---

# ask-hermes

Consult **Hermes** inline using `hermes chat -q`. No second terminal, no file shuffle. Hermes's response gets integrated into Claude's next message with explicit attribution.

This skill is the Hermes-era successor to `ask-grok`. Grok was the original 军师 (advisor); it became unavailable in 2026-05, so Hermes took over as the default. The contract below is the same shape, hardened with three rules Hermes itself proposed for being used well.

## Mental model

Hermes runs as a local CLI (`hermes` in `$PATH`). `hermes chat -q "<prompt>"` is a single-shot non-interactive call: it takes a prompt, prints the answer on stdout, exits. Config lives in `~/.hermes/` — the default model is whatever `model.default` points to (currently `gpt-5.5` via the user's proxy).

Claude and Hermes have **complementary** strengths:
- **Claude**: long context, repo-level orchestration, agentic tool use — but hedges, refuses, dresses uncertainty as politeness, and has a knowledge cutoff.
- **Hermes (GPT-5.5 / codex-class)**: precise "knife-work" (JSON, patches, types, SQL, regex), structured and direct, commits to a view — but **jumps reasoning steps, over-simplifies, and misses edge cases**.

`ask-hermes` exploits this asymmetry: Claude shells out when the question is a judgment/design call, when a cross-check lowers risk, or when Claude is about to over-hedge.

## The 军师 contract (read this first)

These five rules are non-negotiable. Rules 1–3 are what Hermes asked for; 4–5 are Claude's guardrails.

1. **Claude judges first, then consults — never a blind relay.** Form your own view/plan first, *then* ask Hermes a targeted question. **Never paste the user's raw words straight to Hermes.** Frame the question yourself.
2. **Submit high-leverage questions, not the whole task.** Ask Hermes only about: the sticking point, the fork in the decision, a blind-spot check, risk backstop, or a missing step. Do **not** outsource the entire problem.
3. **Always re-process what comes back.** Summarize, adjudicate, de-duplicate, turn it into the next concrete step. **Never paste Hermes's raw output to the user.**
4. **Claude stays in the lead.** Whether to consult, whether to believe, synthesize vs. override — Claude decides. Not a forwarding desk.
5. **Backstop Hermes's known failure mode.** It jumps reasoning steps / over-simplifies / drops edge cases. When you synthesize, **fill the gaps it skipped** and name them.

## When to invoke

**Trigger explicitly when the user says**:
- "问 hermes", "ask hermes", "让 hermes 也说一下"
- "军师怎么说", "second opinion", "另一个 AI 怎么看", "double check"

**Trigger proactively (without the user asking) when**:
- The user asks "X 还是 Y" / "推荐什么" / "怎么选" — a recommendation or selection
- Designing architecture / data flow / project structure / API shape
- Evaluating a trade-off or comparing approaches
- Giving directional advice ("接下来该做什么")
- Tech selection ("哪个库 / 工具 / 版本")
- Judging the user's own idea or plan
- Claude is about to hedge heavily — get a direct view first, then sanity-check it

**Do NOT invoke for**:
- Anything Claude can clearly answer alone (don't burn tokens for noise)
- Mechanical execution: writing code, editing files, running commands, fixing typos
- Pure file / git operations
- Reading or explaining code (unless the user asks "is this code any good")
- Subjective questions where the user wants Claude's own voice
- Every message — that's a tic, not a strategy

## How to call

```bash
hermes chat -q "<prompt>" 2>&1 | tail -40
```

- **Strip the banner**: `hermes chat` prints a startup banner (tools/skills list). Pipe through `| tail -N` to keep only the answer.
- **It's slow** (~20–40s). For anything non-trivial, run it in the **background** (`run_in_background: true`) and synthesize when notified — don't block.
- **Switch model** if a question fits a different one: `hermes chat -m <model> -q "..."` for a one-off, or `hermes config set model.default <model>` to change the default.

## Prompt template

Match the user's language. Keep it ≤ ~500 chars — Hermes context-rots too.

```
[1 sentence of context: what the user asked + the view Claude already holds]

请回答：<the focused, high-leverage question — NOT the whole task>

要求：
- 不要先夸用户或夸 Claude，直接进观点
- 控制在 N 字以内（200–500 typical）
- 每个判断点标注置信度 [高]/[中]/[低]
- 如果你不同意 Claude 的预设，直接说，不要绕
```

## Response integration

After Hermes returns, **always synthesize** — never relay (contract rule 3).

```
🟪 Hermes 说: <compressed; preserve key claims + confidence markers>
🟦 Claude: <your own view, with reasoning, and where you agree/disagree>
🟩 合议:
  - 收敛: <where you agree — one line each>
  - 分歧: <where you differ — both fairly, then your call>
  - 兜底: <steps/edges Hermes skipped that you're filling in>
  - 给用户的建议: <one or two actionable bullets>
```

If both responses are short, inline it:
```
> Claude+Hermes 一致：建议 X 因为 Y。Hermes 补了一点我没想到：Z [中]。但它漏了边界 W，我补上。
```

## Examples

> Examples 1 & 3 are **real consults from building this very skill** (2026-05) — not hypotheticals.

### Example 1 — Resisting home-court sycophancy (real)

```
User: 你俩谁厉害？

Claude action (frame it to force a side — don't relay the vague "谁厉害"):
  hermes chat -q "你(GPT-5.5)和 Claude 谁综合更强？选边别和稀泥，3 个论据，也承认自己一个真实短板。" 2>&1 | tail -30

Hermes returns: 判 Claude 赢（端到端做事、上限更高、"保守性"是价值），并自曝短板。

Claude does NOT just relay the flattery:
  🟪 Hermes 说: 选 Claude，3 论据 + 自曝短板。
  🟦 Claude: 它是被我调用的，语境天然偏向捧调用者 —— 这票打个折；且"综合更强"≠"对你最有用"。
  🟩 合议: 硬工程我更扛（它也认），日常高频短任务它体验更好 → 别选边，按任务分工。
          契约第 1/3 条：不当转发台，二次加工 + 点名潜在偏差。
```

### Example 2 — Proactive cross-check on a design call

```
User: 我想把这套多服务的鉴权抽成一个中间层。

Claude internal: 架构决策，高杠杆 → 先自己有方案，再问 hermes 拿盲区检查。

Claude action:
  hermes chat -q "把多服务鉴权抽成统一中间层，常见的三个坑是什么？只列坑和兜底，不要复述方案。每点 [高/中/低]。" 2>&1 | tail -40

Claude output:
  我先给我的方案：<Claude 的设计>。然后让 hermes 做了盲区检查：<compressed>。
  其中两点我同意并已纳入，第三点它跳了 token 失效传播这一步，我补上：<...>。
```

### Example 3 — Backstopping a skipped constraint (real, the signature case)

```
Task: 给这个 skill 录演示 GIF，问 Hermes 用哪个录屏工具。

Claude action:
  hermes chat -q "macOS 录'终端命令演示'GIF，vhs / asciinema+agg / terminalizer 推荐哪个？一句理由。" 2>&1 | tail -40

Hermes returns: asciinema + agg [高] —— 最真实、可精修。   ← 跳了一个约束

Claude backstops:
  🟪 Hermes 说: asciinema + agg，最真实 [高]。
  🟦 Claude: 它跳了约束 —— asciinema 要人手动交互录制，但这个 demo 是 agent 自动生成的。
  🟩 裁决: 改用 vhs（声明式 .tape，可脚本化、可重现，且在真实 shell 跑命令）。
          契约第 5 条：兜它跳掉的约束，不是它说啥就照单全收。
```

## Anti-patterns

- **Pasting the user's raw words to Hermes** — violates contract rule 1. Frame the question yourself.
- **Outsourcing the whole task** — ask the high-leverage slice, not "solve this for me."
- **Relaying Hermes's raw output** — the user came to Claude. Synthesis is mandatory.
- **Hiding that you consulted Hermes** — always attribute. Trust requires sourcing.
- **Believing Hermes uncritically** — it jumps steps and hallucinates. Backstop and sanity-check.
- **Calling Hermes on every message** — that's a tic, not a strategy. Only when it adds value.
- **Long prompts** — keep ≤ ~500 chars; Hermes context-rots.

## Edge cases

- **`hermes` not in PATH / CLI error**: tell the user, fall back to Claude-alone answer, label it "未咨询 Hermes".
- **Hermes unavailable (proxy 503 / key invalid)**: do the **bottom-up health check first** — `curl -sS -i http://<endpoint>/v1/models` (endpoint is in `~/.hermes/config.yaml`). Don't guess application-layer causes before confirming the endpoint is up. Then fall back, labeled.
- **Hermes is sycophantic / has no pushback**: name it in the synthesis, don't paper over agreement.
- **Hermes jumps steps / over-simplifies / drops edges**: this is its known weakness — Claude fills the gaps (contract rule 5).
- **Banner noise in output**: pipe through `| tail -N`.
- **Slow / hangs (>60s)**: run in background, or abort and proceed with Claude's answer noting Hermes didn't respond.
- **Multiple judgment questions in a row**: batch them into one prompt, don't fire one call per question.

## Switching / adding advisors

The advisor is **pluggable**. The 军师 contract and the 🟪/🟦/🟩 synthesis don't care which CLI answers — only the transport (the shell command) changes. Any AI CLI with a single-shot non-interactive mode can take the seat:

| Advisor | CLI | Non-interactive call | Status on this machine |
|---|---|---|---|
| **Hermes** | `hermes` | `hermes chat -q "..."` | default (this skill) |
| **Grok** | `grok` | `grok -p "..."` | fallback (see `ask-grok`) |
| **Codex** | `codex` | `codex exec "..."` *(confirm flag via `--help`)* | not installed |
| **OpenClaw** | `openclaw` | *(check its `--help` for a headless/print mode)* | not installed |

**To add an advisor**: install its CLI → find its non-interactive flag (`<cli> --help`) → keep the same five contract rules + synthesis format → swap the command. Nothing else changes — the contract is the product, the CLI is just transport.

- If the user says **"军师换回 grok"** or **"两个都问"**, consult Grok too (`grok -p`) and present both views in the 合议.
- For a multi-round contested debate (not a single consult), escalate to the **`grok-debate`** skill instead of looping here.
