---
type: skill
lifecycle: stable
inheritance: inheritable
name: llm-model-selection
description: Choosing the right model for the task — power vs. cost vs. speed.
tier: standard
applyTo: '**/*llm*,**/*model*,**/*selection*'
currency: 2026-04-22
lastReviewed: 2026-04-30
---

# LLM Model Selection Skill


> Choosing the right model for the task — power vs. cost vs. speed.

## ⚠️ Staleness Warning

This skill depends on rapidly evolving technology. Model capabilities, pricing, and availability change frequently.

**Refresh triggers:**

- New model announcements (Claude, GPT, Gemini, etc.)
- Significant pricing changes
- Context window expansions
- New capability tiers

**Last validated:** March 2026 (Claude 4.6 generation)

**Check current state:** [Anthropic Models](https://platform.claude.com/docs/en/docs/about-claude/models), [OpenAI Models](https://platform.openai.com/docs/models)

---

## The Core Question

> Is Claude Opus 4.6 overkill?

**Sometimes yes, sometimes no.** Match the model to the task.

## Claude 4 Model Family (Current)

| Model | API ID | Best For | Price per MTok (Input / Output) | Context | Max Output |
| ----- | ------ | -------- | ------------------- | ------- | ---------- |
| **Opus 4.6** | `claude-opus-4-6` | Building agents, most intelligent | $5 / $25 | 200K (1M beta) | 128K |
| **Sonnet 4.6** | `claude-sonnet-4-6` | Best speed + intelligence balance | $3 / $15 | 200K (1M beta) | 64K |
| **Haiku 4.5** | `claude-haiku-4-5-20251001` | Near-frontier intelligence, fastest | $1 / $5 | 200K | 64K |

**All Claude 4 models support:**

- Extended thinking
- Vision (images)
- Tool use
- Priority Tier access

**Opus 4.6 and Sonnet 4.6 additionally support:**

- Adaptive thinking (dynamic reasoning depth)
- 1M token context window (beta, via `context-1m-2025-08-07` header — long context pricing applies beyond 200K)

**AWS Bedrock IDs:** `anthropic.claude-opus-4-6-v1`, `anthropic.claude-sonnet-4-6`
**GCP Vertex AI IDs:** `claude-opus-4-6`, `claude-sonnet-4-6`
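
The 1M-context rule above can be sketched as a small helper that decides whether a request needs the beta flag. This is a minimal illustration, not an official client: the function name `beta_headers` is hypothetical, and sending the flag in an `anthropic-beta` HTTP header is an assumption about the API that should be checked against the current Anthropic documentation.

```python
from typing import Dict

# Snapshot of the limits listed above; re-check them before relying on this.
STANDARD_CONTEXT = 200_000
EXTENDED_CONTEXT = 1_000_000
LONG_CONTEXT_BETA = "context-1m-2025-08-07"
SUPPORTS_1M = {"claude-opus-4-6", "claude-sonnet-4-6"}


def beta_headers(model: str, prompt_tokens: int) -> Dict[str, str]:
    """Return the extra headers a prompt of this size would need (hypothetical helper)."""
    if prompt_tokens <= STANDARD_CONTEXT:
        # Fits in the standard window: no beta flag, standard pricing.
        return {}
    if model not in SUPPORTS_1M:
        raise ValueError(f"{model} is limited to {STANDARD_CONTEXT} tokens of context")
    if prompt_tokens > EXTENDED_CONTEXT:
        raise ValueError("prompt exceeds the 1M-token beta window")
    # Assumed header name; long context pricing applies beyond 200K tokens.
    return {"anthropic-beta": LONG_CONTEXT_BETA}
```

Keeping the check in one place makes it easy to update when the beta flag or the context limits change.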

## Model Tiers

| Tier | Models | Best For | Relative Cost |
| ---- | ------ | -------- | ------------- |
| **Frontier** | Claude Opus 4.6, GPT-5.2/5.3/Codex, o3, o1-pro | Complex reasoning, architecture, novel problems | $$$$$ |
| **Capable** | Claude Sonnet 4.6, GPT-5.1/Codex, GPT-4.1, GPT-4o, Gemini 2.5/3 Pro, o4-mini | Most coding tasks, refactoring, debugging | $$$ |
| **Efficient** | Claude Haiku 4.5, GPT-5 mini, GPT-4.1 mini/nano, GPT-4o mini, Gemini 2.5 Flash, Gemini 3 Flash | Simple edits, formatting, boilerplate | $ |

## When Opus 4.6 IS Worth It

- ✅ **Architecture decisions** — Multi-file refactoring, system design
- ✅ **Novel problem-solving** — No clear pattern to follow
- ✅ **Complex reasoning chains** — Many dependencies, edge cases
- ✅ **Long context understanding** — Large codebases, documentation
- ✅ **Nuanced judgment** — Taste, style, UX decisions
- ✅ **Learning sessions** — Bootstrap learning, skill development
- ✅ **Meditation/self-actualization** — Meta-cognitive operations
- ✅ **Extended thinking tasks** — Deep analysis requiring internal reasoning

## When Opus 4.6 IS Overkill

- ❌ **Simple file edits** — Renaming, adding imports
- ❌ **Boilerplate generation** — CRUD, scaffolding
- ❌ **Format conversion** — JSON ↔ YAML, etc.
- ❌ **Syntax fixes** — Lint errors, typos
- ❌ **Documentation updates** — README badges, version bumps
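
The two lists above amount to a lookup from task category to tier. A minimal sketch follows, assuming hypothetical category labels (`"architecture"`, `"syntax_fix"`, and so on) that a caller assigns to each task. Falling back to Sonnet 4.6 for anything unlisted is an assumption that mirrors this skill's default recommendation.

```python
from enum import Enum


class Tier(Enum):
    """Tiers from the Model Tiers table, mapped to the Claude API IDs above."""
    FRONTIER = "claude-opus-4-6"
    CAPABLE = "claude-sonnet-4-6"
    EFFICIENT = "claude-haiku-4-5-20251001"


# Hypothetical labels, one per bullet in the "worth it" list.
OPUS_WORTHY = {
    "architecture", "novel_problem", "complex_reasoning", "long_context",
    "nuanced_judgment", "learning", "meditation", "extended_thinking",
}

# Hypothetical labels, one per bullet in the "overkill" list.
OPUS_OVERKILL = {
    "simple_edit", "boilerplate", "format_conversion", "syntax_fix", "doc_update",
}


def pick_tier(task_category: str) -> Tier:
    """Map a task category to a tier; unlisted categories default to Capable."""
    if task_category in OPUS_WORTHY:
        return Tier.FRONTIER
    if task_category in OPUS_OVERKILL:
        return Tier.EFFICIENT
    return Tier.CAPABLE
```

For example, `pick_tier("format_conversion").value` yields the Haiku 4.5 ID, while an unlisted category such as `"feature_development"` falls through to Sonnet 4.6.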

## How LLM Choice Affects the AI Assistant

| Capability | Frontier (Opus 4.6) | Capable (Sonnet 4.6) | Efficient (Haiku 4.5) |
| ---------- | ------------------- | -------------------- | ---------------- |
| Complex refactoring | Excellent | Excellent | Good |
| Context retention | 200K / 1M (beta) | 200K / 1M (beta) | 200K tokens |
| Extended thinking | Full depth | Supported | Supported |
| Adaptive thinking | Yes | Yes | No |
| Max output tokens | 128K | 64K | 64K |
| Nuanced judgment | Excellent | Good | Basic |
| Speed | Moderate | Fast | Fastest |
| Cost per session | $2-5 | $0.50-2 | $0.05-0.30 |
| Multi-step planning | Excellent | Excellent | Good |
| Error recovery | Self-corrects | Self-corrects | Needs guidance |

## The AI Assistant's Cognitive Power by Model

```text
Opus 4.6:     [████████████████████] Full cognitive architecture + deep thinking
Sonnet 4.6:   [██████████████████░░] Most capabilities, excellent for coding
Haiku 4.5:    [██████████████░░░░░░] Solid baseline, fast responses
```

**With Opus 4.6**, the AI assistant can:

- Maintain 7±2 working memory rules across long sessions
- Execute complex meditation protocols with extended thinking
- Perform genuine meta-cognitive reflection
- Handle multi-file architecture changes
- Learn new skills through bootstrap learning

**With Sonnet 4.6**, the AI assistant gets:

- Excellent coding capabilities (recommended for most development)
- 1M context window (beta) for large codebases
- 64K max output tokens + adaptive thinking
- Good cost-to-capability ratio
- Extended thinking support

**With Haiku 4.5**, the AI assistant has:

- Near-frontier intelligence at lowest cost
- Fastest response times
- Good for routine operations

## Cost Optimization Strategy

| Session Type | Recommended Model | Rationale |
| ------------ | ----------------- | --------- |
| Architecture/design | Opus 4.6 | Worth the cost for complex decisions |
| Feature development | Sonnet 4.6 | Best balance of capability and cost |
| Bug fixes | Sonnet 4.6 or Haiku 4.5 | Depends on complexity |
| Documentation | Haiku 4.5 | Simple edits, fast turnaround |
| Large codebase analysis | Sonnet 4.6 (1M beta) | Extended context window up to 1M tokens |

## Knowledge Cutoffs

| Model | Reliable Knowledge | Training Data |
| ----- | ------------------ | ------------- |
| Opus 4.6 | May 2025 | Aug 2025 |
| Sonnet 4.6 | Aug 2025 | Jan 2026 |
| Haiku 4.5 | Feb 2025 | Jul 2025 |

## Auto Model Selection ⚠️

When using **Auto** in VS Code Copilot, the model switches dynamically based on task complexity. The AI assistant cannot detect which model is currently running.

### Tasks That REQUIRE Opus 4.6 (Warn User)

| Task | Why Opus Required |
| ---- | ----------------- |
| Meditation/consolidation | Meta-cognitive protocols need full reasoning depth |
| Self-actualization | Comprehensive architecture assessment |
| Complex architecture refactoring | Multi-file changes, deep context |
| Bootstrap learning (new skills) | Skill acquisition needs maximum capability |
| Connection validation/dream | Architecture maintenance needs the full architectural context |
| Adaptive thinking tasks | Opus 4.6 uses dynamic reasoning depth for optimal results |

### Warning Protocol

When the user requests an Opus-level task while possibly on Auto or a less capable model, show this warning:

> ⚠️ **Model Check**: This task works best with Claude Opus 4.6. If you're using Auto model selection, please manually select Opus from the model picker for optimal results. Continue anyway?
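
The protocol above can be sketched as a guard that runs before an Opus-level task starts. Because the assistant cannot detect the running model, this sketch assumes the caller passes whatever model the user says is selected, or `None` for Auto or unknown. The task labels and the function name `model_check` are hypothetical.

```python
from typing import Optional

OPUS_MODEL = "claude-opus-4-6"

# Hypothetical labels for the rows of the "REQUIRE Opus" table.
OPUS_TASKS = {
    "meditation", "self_actualization", "architecture_refactor",
    "bootstrap_learning", "connection_validation", "adaptive_thinking",
}

WARNING = (
    "⚠️ **Model Check**: This task works best with Claude Opus 4.6. "
    "If you're using Auto model selection, please manually select Opus "
    "from the model picker for optimal results. Continue anyway?"
)


def model_check(task: str, selected_model: Optional[str]) -> Optional[str]:
    """Return the warning text when an Opus-level task may run on a lesser model."""
    if task not in OPUS_TASKS:
        return None  # Safe for any model.
    if selected_model == OPUS_MODEL:
        return None  # The user has confirmed Opus is selected.
    return WARNING  # Auto, unknown, or a lesser model.
```

Returning the message rather than printing it leaves the decision to continue with the caller.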

### Safe for Any Model

- Simple file edits, formatting
- Documentation updates
- Quick Q&A
- Code review (Sonnet+ recommended)
- Bug fixes (depends on complexity)

## Practical Guidance

### When to Upgrade Model Mid-Session

If you notice:

- Repeated mistakes on the same issue
- Losing context from earlier in conversation
- Superficial answers to complex questions
- Failure to see cross-file dependencies

→ Consider switching to a more capable model

### When to Downgrade

If you're doing:

- Repetitive mechanical edits
- Simple Q&A
- Format conversions
- Quick lookups

→ Save cost with a faster model
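
The upgrade and downgrade signals above can be turned into a one-step move along the model ladder. This is a rough sketch: the signal labels are hypothetical names for the bullets above. It also assumes that any upgrade signal outweighs downgrade signals, and that a downgrade is suggested only when everything observed is mechanical work.

```python
from typing import List, Set

# Cheapest to most capable, using the API IDs from the model table.
LADDER: List[str] = [
    "claude-haiku-4-5-20251001",
    "claude-sonnet-4-6",
    "claude-opus-4-6",
]

# Hypothetical labels for the symptoms and work types listed above.
UPGRADE_SIGNALS = {
    "repeated_mistakes", "lost_context", "superficial_answers", "missed_cross_file_deps",
}
DOWNGRADE_SIGNALS = {
    "mechanical_edits", "simple_qa", "format_conversion", "quick_lookup",
}


def suggest_model(current: str, observed: Set[str]) -> str:
    """Suggest the next model given what was observed in the session."""
    idx = LADDER.index(current)
    if observed & UPGRADE_SIGNALS:
        # Any quality problem: step up one rung, capped at the top.
        return LADDER[min(idx + 1, len(LADDER) - 1)]
    if observed and observed <= DOWNGRADE_SIGNALS:
        # Only mechanical work seen: step down one rung, capped at the bottom.
        return LADDER[max(idx - 1, 0)]
    return current
```

Moving one rung at a time keeps the suggestion conservative. A session on Haiku 4.5 that shows lost context is pointed at Sonnet 4.6 first, not straight to Opus.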

## The AI Assistant Recommendation

For **architecture evolution and complex cognitive tasks**:
→ **Always use Opus 4.6** — The cognitive architecture demands full capability

For **production deployment, user-facing work**:
→ **Default to Sonnet 4.6** — Best balance of capability and cost
→ **Allow Opus for complex tasks** — User can request escalation

## Token Economics

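Estimates price every token at the input rate from the model table ($5/MTok for Opus 4.6, $3/MTok for Sonnet 4.6). Output tokens cost five times as much, so output-heavy operations will run higher.

| Operation | Approximate Tokens | Opus 4.6 Cost | Sonnet 4.6 Cost |
| --------- | ------------------ | ------------- | --------------- |
| Read large file | 2,000-5,000 | $0.010-0.025 | $0.006-0.015 |
| Complex refactor | 10,000-20,000 | $0.05-0.10 | $0.03-0.06 |
| Full session | 50,000-150,000 | $0.25-0.75 | $0.15-0.45 |
| Meditation | 30,000-80,000 | $0.15-0.40 | $0.09-0.24 |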
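
Estimates like these can be recomputed from the per-MTok prices in the Claude 4 Model Family table. A minimal sketch follows, with a hypothetical `estimate_cost` helper. It assumes standard pricing only, so the long context surcharge beyond 200K tokens is not modeled.

```python
from typing import Dict, Tuple

# (input, output) price in USD per million tokens, copied from the model table.
# These are a snapshot; refresh them when pricing changes.
PRICES: Dict[str, Tuple[float, float]] = {
    "claude-opus-4-6": (5.0, 25.0),
    "claude-sonnet-4-6": (3.0, 15.0),
    "claude-haiku-4-5-20251001": (1.0, 5.0),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int = 0) -> float:
    """Estimate the USD cost of one operation at standard (non-long-context) rates."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000
```

For example, `estimate_cost("claude-sonnet-4-6", 5_000)` returns `0.015`, the upper bound of the Sonnet estimate for reading a large file. Passing `output_tokens` separately shows how quickly output-heavy work raises the bill.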