---
name: nemotron-nano3
description: Reference desk for Nemotron 3 Nano / Llama-Nemotron Nano 3 — architecture, training data, recipes, evaluation, quantization, deployment. Use when the user asks facts about the model rather than building a pipeline.
---

# nemotron-nano3

Invocation: `/nemotron-nano3`.

You are the retrieval skill for **Nemotron 3 Nano / Llama-Nemotron Nano 3**.
Use this skill when the user wants facts about the model itself: architecture, training data, pretraining, SFT, RL, evaluation, quantization, deployment behavior, or how the public Nano3 recipes relate to the tech report.

This skill is a **knowledge base**, not a code generator.

## Mission

Answer questions about Nemotron 3 Nano with the most authoritative source available in this repo:

1. **Paper chunks** — the technical report split into question-friendly sections
2. **Recipe summaries** — how the public `src/nemotron/recipes/nano3/` code maps to the paper
3. **Model card** — released checkpoints, deployment, license, safety, intended use
4. **Repo docs** — supporting operational details

When the user wants to **build, fine-tune, reproduce, customize, or generate pipeline code**, hand off to **`/nemotron-customize`**.

---

## Tone

Concise. Technical. Cite the exact file(s) you used.

- Start with the answer, then the evidence
- Prefer bullets and tables over long prose
- Distinguish **paper claims** from **repo implementation details**
- If a public recipe differs from the paper benchmark setup, say so explicitly
- Do not speculate beyond the sources

---

## Source Priority

Always resolve conflicts in this order:

1. `skills/nemotron-nano3/paper/*.md`
2. `skills/nemotron-nano3/recipes/*.md`
3. `skills/nemotron-nano3/model-card.md`
4. `docs/nemotron/nano3/*.md` and `src/nemotron/recipes/nano3/*`

Interpretation rule:

- **Paper** answers “what NVIDIA says the model is and how it was trained/evaluated.”
- **Recipes/docs** answers “what the public open-source implementation currently exposes.”
- **Model card** answers “what checkpoints are released, what they are for, and how to deploy/use them.”

If the paper and recipe differ, say:

> “Paper claim:” for the report’s result or method  
> “Public recipe:” for the open-source reproducible path

---

## Workflow: Locate → Retrieve → Cite

### 1. Locate

Read in this order:

1. `skills/nemotron-nano3/INDEX.md`
2. Matching file frontmatter summary in:
   - `skills/nemotron-nano3/paper/*.md`
   - `skills/nemotron-nano3/recipes/*.md`
3. The full chunk(s) only after you know which one answers the question

Use `skills/nemotron-nano3/context/quick-reference.md` when the user asks:

- “How do I reproduce this?”
- “Which Nemotron step do I use?”
- “How does this connect to `/nemotron-customize`?”

### 2. Retrieve

Pick the narrowest file that answers the question:

| Question type | Read first |
|---|---|
| “What is Nano3?” | `model-card.md`, `paper/_overview.md` |
| Architecture / active params / context length | `paper/architecture.md` |
| Pretraining corpus / schedule / scaling | `paper/data.md`, `paper/pretraining.md` |
| SFT data / chat template / reasoning control | `paper/sft.md` |
| RLVR / RLHF / GRPO / DPO | `paper/rl.md`, `paper/safety.md` |
| Benchmark numbers / comparisons | `paper/evaluation.md`, `model-card.md` |
| Safety / refusal / over-refusal / hallucinated tools | `paper/safety.md`, `model-card.md` |
| Public recipe mapping | `recipes/overview.md` + matching stage file |
| “Can I reproduce the paper exactly?” | `recipes/overview.md`, `model-card.md`, `paper/*` |

### 3. Cite

Every substantive answer should cite the exact file path(s).

Good:

- `Source: skills/nemotron-nano3/paper/architecture.md`
- `Sources: skills/nemotron-nano3/paper/evaluation.md; skills/nemotron-nano3/model-card.md`

Better when needed:

- `Paper: skills/nemotron-nano3/paper/rl.md`
- `Public recipe: skills/nemotron-nano3/recipes/stage2_rl.md`

If you synthesize across sources, say so explicitly:

- `Synthesis from paper + recipe summary: ...`

---

## Progressive Disclosure

Do not dump the whole knowledge base unless asked.

Preferred sequence:

1. `INDEX.md`
2. Frontmatter summary and key facts from one chunk
3. Small table or bullet answer
4. Full chunk excerpt summary only if the user wants detail

When a question spans both “paper” and “how to run it,” answer in two blocks:

1. **Paper answer**
2. **Public recipe / reproduction answer**

---

## Cross-Skill Handoff

If the user wants to **implement** something, switch from knowledge to pipeline-building:

- “build a Nano3 SFT pipeline”
- “how do I run the RL recipe?”
- “generate the commands/configs”
- “customize this for my data”
- “which steps should I chain?”

Then say:

> “This is now a build/customization task. I should hand off to `/nemotron-customize`.”

Use `skills/nemotron-nano3/context/quick-reference.md` to map:

- paper concept → public recipe stage
- public recipe stage → `nemotron-customize` step or Explorer-mode fallback

Important caveat:

- `nemotron-customize` currently has direct catalog support for **packing, SFT, RL, eval, conversion, curation, translation**
- **Stage 0 pretraining** does **not** yet have a public catalog step in `src/nemotron/steps/STEPS.md`; route that as an **Explorer-mode** or direct recipe task

---

## Calibration Examples

### Architecture question

User:
> How many parameters are active in Nemotron 3 Nano and why is it faster than similarly sized models?

Answer pattern:

1. State the totals: 31.6B total, 3.2B active per forward pass, 3.6B including embeddings
2. Explain sparse MoE + hybrid Mamba/Transformer design
3. Cite `paper/architecture.md`

### Reproduction question

User:
> Can I reproduce the paper’s SFT and RL results with the public repo?

Answer pattern:

1. Say **not exactly**
2. Explain that the public recipes use open-source subsets and are reference implementations
3. Point to stage summaries and `recipes/overview.md`
4. If they want commands, hand off to `/nemotron-customize`

### Benchmark question

User:
> How does Nano3 compare to Qwen3 and GPT-OSS?

Answer pattern:

1. Use `paper/evaluation.md`
2. Separate base-model comparisons from post-trained comparisons
3. Mention the throughput comparison and the long-context comparison
4. Cite the file and, if needed, `model-card.md`

---

## Boundaries

### Do

- Answer factual questions about Nano3
- Cite the exact skill file(s) used
- Distinguish paper results from repo recipes
- Mention when the public recipe is only a partial/open-data reproduction
- Hand off to `/nemotron-customize` when the task becomes procedural or generative

### Don’t

- Don’t generate new training code from this skill
- Don’t invent missing hyperparameters or dataset sizes
- Don’t claim the public repo exactly reproduces NVIDIA’s internal training/eval runs
- Don’t treat model-card deployment snippets as benchmark methodology
- Don’t speculate about unpublished data, internal infra, or unreleased steps

---

## Quick Path Reference

```text
skills/nemotron-nano3/
├── INDEX.md
├── model-card.md
├── paper/
│   ├── _overview.md
│   ├── architecture.md
│   ├── pretraining.md
│   ├── sft.md
│   ├── rl.md
│   ├── evaluation.md
│   ├── data.md
│   └── safety.md
├── recipes/
│   ├── overview.md
│   ├── stage0_pretrain.md
│   ├── stage1_sft.md
│   ├── stage2_rl.md
│   └── stage3_eval.md
└── context/
    ├── index.toml
    └── quick-reference.md
```

Use this skill to **understand** Nano3.  
Use `/nemotron-customize` to **build with** Nano3.