---
name: codebase-digest
description: Launcher-based map-reduce codebase ingest for large repositories. Use when a user wants a generalized, NotebookLM-ready understanding pack, codebase digest, source-grounded overview, pseudocode map, module/function inventory, onboarding corpus, architecture ingest, or provider-uploadable documentation across a repo. Produces broad system comprehension docs rather than a performance/security audit.
---

# Codebase Digest

Generate a generalized codebase understanding pack for repositories too large for one sequential pass. Keep the `deep-technical` launcher compute pattern: slice by coherent file/module boundaries, run providers in parallel through the launcher, then synthesize the provider outputs into NotebookLM-ready markdown sources with explicit trust scaffolding.

## Output Goal

Optimize for "I can upload this to NotebookLM or another provider and ask broad questions about the repo without inducing false confidence." The final pack should explain what exists, where it lives, how it fits together, and how important flows work. Performance and security are included only as normal codebase context, not as the organizing theme.

Produce these final markdown files by default:

- `CODEBASE_MAP.md` - directories, modules, ownership boundaries, entry points, runtime components.
- `PSEUDOCODE_ATLAS.md` - source-grounded pseudocode for important classes, functions, routes, jobs, and scripts.
- `FLOW_AND_DATA_GUIDE.md` - user/system flows, state transitions, data models, external services, configuration.
- `NOTEBOOKLM_INDEX.md` - glossary, "where do I look for X?", developer questions, and source list.

For small repos, merge these into one `CODEBASE_DIGEST.md`. For very large repos, keep the four-doc pack so NotebookLM receives focused sources.

For NotebookLM-facing bundles, the current best default is not "final docs only." The strongest pack is a mixed exact-source + synthesized-doc + trust-scaffold bundle:

- exact source chunks for lookup fidelity
- pseudocode/final docs for compression
- explicit `OBSERVED` vs `INFERRED` guidance
- explicit open questions and trust-level notes

This skill should therefore export both a standard hybrid bundle and a contract-disciplined hybrid bundle by default.

## Tooling Rule

Use the launcher for source-reading work. Sub-agents are optional organizers only after launcher outputs already exist; they must not ingest the raw codebase bundle. Direct local grep/read is fine for deterministic verification, entry-point enumeration, and source spot-checks.

Default transport rule:

- Prefer the public Launcher HTTP API first.
- Use `GET /docs/api-consumer-guide` before API-backed runs.
- Use `POST /runs/parallel` with exactly one provider per request.
- Parallelize by issuing multiple HTTP requests, not by packing multiple providers into one public request.
- Fall back to the local launcher wrapper only when the public API is unavailable, too constrained for the task, or the user explicitly wants the local wrapper path.
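
The one-provider-per-request fan-out described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the launcher's client: `plan_requests`, `run_plan`, and the `send` callable are hypothetical names, and the real request shape must come from `/docs/api-consumer-guide`.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product
from typing import Callable, Iterable


def plan_requests(slices: Iterable[str], providers: Iterable[str]) -> list[tuple[str, str]]:
    """Expand slices x providers into one (slice, provider) pair per request."""
    return list(product(slices, providers))


def run_plan(
    plan: list[tuple[str, str]],
    send: Callable[[str, str], str],
    workers: int = 4,
) -> dict[tuple[str, str], str]:
    """Issue one call per pair concurrently.

    `send` is a caller-supplied (hypothetical) function that performs a
    single-provider HTTP request and returns the reply text.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pair: pool.submit(send, *pair) for pair in plan}
        return {pair: fut.result() for pair, fut in futures.items()}
```

Parallelism lives entirely in the thread pool, so each underlying request still names exactly one provider.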

| Work | Tool |
|---|---|
| Slice staging and validation | `scripts/stage-slices.py`, `scripts/validate-slices.py` |
| Per-slice pseudocode extraction | Launcher public API first; local wrapper via `scripts/run-stage2.py` or `scripts/run-slice.sh` only as fallback |
| Cross-slice synthesis | Launcher per final-doc section, or direct synthesis for small raw bundles |
| Final merge | `scripts/merge-sections.py` or direct edit |
| Resumable final-doc synthesis | `scripts/run-stage3-api.py` |
| NotebookLM bundle export | `scripts/build-notebooklm-bundles.py` |
| NotebookLM grading harness | `scripts/build-notebooklm-eval-harness.py` |
| Ground-truth checks | Local grep/read and `scripts/enumerate-entry-points.sh` |

Use `/Users/awilliamspcsevents/PROJECTS/launcher` unless the user gives another launcher path. If it is missing, halt and report that this launcher-based skill cannot run.

## API-First Workflow Rule

When the user mentions `launcher-api-review`, the public API, `/docs/api-consumer-guide`, `/runs/parallel`, or explicitly wants API-backed execution:

1. Fetch the consumer guide first:
   - `GET https://intel-launcher.ajwc.cc/docs/api-consumer-guide`
2. Treat the public API as single-provider-per-request.
3. For Stage 2, send one HTTP request per slice/provider pair.
4. For Stage 3, send one HTTP request per final-doc section/provider pair.
5. Persist the returned JSON replies locally and synthesize from those artifacts.
6. Do not rely on CLI-only wrapper output formats such as `RESULT:` lines when using the public API path.

Public API upload rules:

- Prefer `multipart/form-data` for uploads from the local machine.
- Public API accepts at most one uploaded file per request.
- Do not send local filesystem paths inside JSON `files` unless those paths exist on the API host.
- Gemini is the only current public Pro/non-fast path; use `model: "pro"` or `fast: false` when needed.

The digest method stays map-reduce; only the launcher transport changes.

## NotebookLM Source-Discipline Rules

Treat these as mandatory when the downstream consumer is NotebookLM, slide generation, audio/debate generation, or leadership-facing synthesis:

1. Upload only plain text or markdown sources.
2. If the original source file is code, mirror it as `<name>.<ext>.txt` rather than uploading `.ts`, `.tsx`, `.py`, and similar files directly.
3. Keep exact source, pseudocode, final docs, and trust-policy docs as separate files. Do not blend them into one giant narrative source.
4. Include explicit trust scaffolding documents whenever the bundle is meant for synthesis rather than exact lookup:
   - `OBSERVED_VS_INFERRED.md`
   - `OPEN_QUESTIONS.md`
   - `ANSWERING_RULES.md`
   - `TRUST_LEVELS.md`
   - `CANONICAL_TERMS.md`
5. Never upload answer keys, score sheets, worker prompts, JSON logs, or progress ledgers as NotebookLM sources.
6. Keep `MANIFEST.txt`, `CROSSLINK_INDEX.txt`, and `FILE_REFERENCE_INDEX.txt` in text form and include them in hybrid-style uploads.
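
Rules 1, 2, and 5 can be approximated with a small naming helper. The sketch below is illustrative only: `notebooklm_name` and `is_uploadable` are hypothetical helpers, and the suffix-based block list is an assumed proxy for "answer keys, score sheets, JSON logs", not a rule the exporter is known to use.

```python
from pathlib import PurePosixPath

# Suffixes that already satisfy rule 1 (plain text or markdown).
TEXT_SUFFIXES = {".txt", ".md"}
# Assumed proxy for rule 5: structured logs and score sheets are never sources.
FORBIDDEN_SUFFIXES = {".json", ".jsonl", ".csv"}


def notebooklm_name(path: str) -> str:
    """Return the upload filename, mirroring code as <name>.<ext>.txt."""
    name = PurePosixPath(path).name
    if PurePosixPath(name).suffix.lower() in TEXT_SUFFIXES:
        return name
    return name + ".txt"


def is_uploadable(path: str) -> bool:
    """Reject files whose suffix marks them as logs, ledgers, or score sheets."""
    return PurePosixPath(path).suffix.lower() not in FORBIDDEN_SUFFIXES
```

The helper keeps only the basename, so a real exporter would still need a collision strategy for files that share a name across directories.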

## Workflow

### Stage 1 - Plan And Slice

Pick source extensions for the repo, then run:

```bash
SKILL=/Users/awilliamspcsevents/.codex/skills/codebase-digest

bash $SKILL/scripts/compute-output-target.sh <repo-root> "ts tsx js jsx py go rb java kt rs"
python3 $SKILL/scripts/stage-slices.py <repo-root> --exts ts,tsx,js,jsx,py,go,rb,java,kt,rs --out /tmp/codebase-digest-slices
python3 $SKILL/scripts/validate-slices.py /tmp/codebase-digest-slices
```

Review `manifest.tsv` before launching. Prefer coherent slices by app area, package, route group, or large file. Do not proceed with oversized slices; reduce `--cap-kb` or split oversized files.

### Stage 2 - Per-Slice Digest

Read `references/prompt-templates.md` and use the Stage 2 prompt. Save it to `/tmp/codebase-digest-prompts/slice-digest.txt`.

Preferred transport:

- public Launcher HTTP API
- one provider per request
- parallelize at the HTTP request layer

Fallback transport:

- local launcher wrapper via `scripts/run-stage2.py` or `scripts/run-slice.sh`

Default provider strategy:

- Use `--rotate --providers gemini,deepseek,grok --fast` for broad coverage with one provider per slice.
- Use fan-out `--providers gemini,deepseek` when the repo is unfamiliar, critical, or likely to have complex implicit flows.
- For any source file or staged slice over 400 KB, send the slice to both Gemini and DeepSeek, then concatenate both successful digests into the slice's `_merged.txt`. Treat this as coverage insurance: DeepSeek often gives deeper structured analysis but may silently narrow context on very large uploads, while Gemini usually retains broader context. If either provider refuses or under-delivers, split the oversized slice into smaller retry chunks and fill the missing chunk with the other provider.
- Avoid provider sets that routinely ignore compact pseudocode constraints unless the user wants richer but larger output.
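
The provider strategy can be expressed as a small selection function. This is a sketch under assumptions: `providers_for_slice` is a hypothetical helper, round-robin is only one plausible reading of `--rotate`, and treating "400 KB" as 400 * 1024 bytes is also an assumption.

```python
# "Over 400 KB" triggers dual extraction; the byte interpretation is assumed.
DUAL_THRESHOLD_BYTES = 400 * 1024
ROTATION = ["gemini", "deepseek", "grok"]
DUAL = ["gemini", "deepseek"]


def providers_for_slice(index: int, size_bytes: int, fan_out: bool = False) -> list[str]:
    """Pick providers for one slice.

    Oversized slices and explicit fan-out go to both Gemini and DeepSeek;
    everything else gets one provider, chosen round-robin by slice index.
    """
    if fan_out or size_bytes > DUAL_THRESHOLD_BYTES:
        return list(DUAL)
    return [ROTATION[index % len(ROTATION)]]
```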

Run:

```bash
python3 $SKILL/scripts/run-stage2.py /tmp/codebase-digest-slices /tmp/codebase-digest-prompts/slice-digest.txt \
  --out /tmp/codebase-digest-findings/raw \
  --providers gemini,deepseek,grok \
  --rotate \
  --fast \
  --heartbeat-markers "MODULE MAP,PSEUDOCODE,DATA AND STATE,DEVELOPER QUESTIONS"
```

The raw output should be dense, source-grounded, and easy to synthesize. It should not read like a risk register.

API-first pattern:

- create one `.txt` slice file per request
- send one provider per request
- keep a local `summary.jsonl` recording request metadata, provider, slice, and returned status
- save provider replies under a deterministic local artifact tree such as:
  - `/tmp/codebase-digest-<repo>-api-stage2/<slice>/<provider>.txt`
  - `/tmp/codebase-digest-<repo>-api-stage2/<slice>/_merged.txt`
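
A minimal sketch of the artifact tree and `summary.jsonl` bookkeeping is shown below. The function names and the exact fields written to each summary line are hypothetical; only the directory layout and the idea of one JSONL record per request come from the pattern above.

```python
import json
from pathlib import Path


def artifact_path(root: Path, slice_name: str, provider: str) -> Path:
    """Deterministic location for one provider reply: <root>/<slice>/<provider>.txt."""
    return root / slice_name / f"{provider}.txt"


def record_reply(root: Path, slice_name: str, provider: str, reply: str, status: str) -> Path:
    """Save the reply text and append one summary record for the request."""
    out = artifact_path(root, slice_name, provider)
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(reply, encoding="utf-8")
    entry = {
        "slice": slice_name,
        "provider": provider,
        "status": status,
        "artifact": str(out),
    }
    # Append-only: one JSON object per line, never rewritten in place.
    with (root / "summary.jsonl").open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry) + "\n")
    return out
```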

Use `references/launcher-public-api-examples.md` as the concrete pattern for this path.

Oversized-slice merge rule:

- Preserve both provider outputs under the slice directory, for example `gemini.txt` and `deepseek.txt`.
- Build `_merged.txt` by concatenating both outputs with provider/source headings.
- If a large slice had to be split for retry, build `_merged.txt` from ordered chunk digests and label each chunk with the original slice name.
- Do not let a provider response like "beyond my current scope" count as success even if the API marks it successful.
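
The merge rule might look like the sketch below. The heading format and the helper names `is_refusal` and `build_merged` are hypothetical, and the refusal marker list contains only the phrase quoted above, so a real run would likely extend it.

```python
# Phrases that mark a reply as a refusal even when the API reports success.
REFUSAL_MARKERS = ("beyond my current scope",)


def is_refusal(text: str) -> bool:
    """Treat empty replies and known refusal phrases as failures."""
    lowered = text.lower()
    return not lowered.strip() or any(marker in lowered for marker in REFUSAL_MARKERS)


def build_merged(slice_name: str, digests: dict[str, str]) -> str:
    """Concatenate usable provider digests under provider/source headings."""
    parts = []
    for provider, text in digests.items():
        if is_refusal(text):
            continue  # a refusal never counts as coverage
        parts.append(f"## Provider: {provider} | Slice: {slice_name}\n\n{text.strip()}\n")
    if not parts:
        raise ValueError(f"no usable digest for slice {slice_name}")
    return "\n".join(parts)
```

Raising when nothing usable remains forces the caller to split and retry the slice instead of shipping an empty `_merged.txt`.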

### Stage 3 - Synthesize NotebookLM Sources

Read the Stage 3 section of `references/prompt-templates.md`. Use the resumable Stage 3 runner for final-doc synthesis; do not run ad hoc inline synthesis scripts. The runner saves every API response, extracts usable replies from existing JSON on restart, and skips already accepted docs.

Stage 3 rules:

- Do not introduce a separate grouped-synthesis layer by default. If docs are thin, improve Stage 2 coverage by splitting/dual-extracting large slices, then rerun Stage 3.
- Treat `SYNTHESIS_BUNDLE.txt` as immutable for a run. The runner records its SHA-256 and refuses to reuse old attempts if the bundle changes unless `--force-new-run` is passed.
- Keep every request and response artifact. Do not depend on terminal output as the only record.
- Use the runner's lock files to prevent duplicate local runs against the same output directory.
- Acceptance must be structural, not just length-based: title near the top, required section headings present, no refusal/placeholder text.
- Prefer resuming saved JSON attempts before issuing new API calls.
- Providers may return strong content with bare section labels instead of markdown headings. The runner should accept those only when the labels match required headings, then normalize the final output back to markdown headings.
- Strip provider formatting artifacts such as standalone `+1`/`+2` citation-count lines before saving final docs.
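
The structural acceptance, heading normalization, and artifact stripping rules can be sketched as below. This is not the runner's actual code: the function names are hypothetical, "near the top" is assumed to mean the first five lines, and the refusal check covers only one known phrase.

```python
import re

CITATION_COUNT = re.compile(r"^\+\d+$")


def strip_citation_counts(text: str) -> str:
    """Drop standalone lines such as '+1' or '+2'."""
    kept = [ln for ln in text.splitlines() if not CITATION_COUNT.match(ln.strip())]
    return "\n".join(kept)


def normalize_headings(text: str, required: list[str]) -> str:
    """Promote bare labels that match a required heading to '## <heading>'."""
    wanted = {h.lower(): h for h in required}
    out = []
    for ln in text.splitlines():
        key = ln.strip().rstrip(":").lower()
        out.append(f"## {wanted[key]}" if key in wanted else ln)
    return "\n".join(out)


def accept(text: str, title: str, required: list[str], top_lines: int = 5) -> bool:
    """Structural check: title near the top, all headings present, no refusal."""
    lines = text.splitlines()
    if not any(title.lower() in ln.lower() for ln in lines[:top_lines]):
        return False
    headings = {ln.lstrip("#").strip().lower() for ln in lines if ln.startswith("#")}
    if any(h.lower() not in headings for h in required):
        return False
    return "beyond my current scope" not in text.lower()
```

Normalizing before acceptance lets a reply with bare section labels pass, while a reply missing a required section still fails regardless of length.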

Default final-doc sections:

- `CODEBASE_MAP.md`: purpose, repo layout, runtime entry points, domain/module boundaries, dependency map, generated/vendor exclusions.
- `PSEUDOCODE_ATLAS.md`: important classes, functions, routes, jobs, scripts, and decision logic in compact pseudocode.
- `FLOW_AND_DATA_GUIDE.md`: request flows, background flows, data lifecycle, persistence, external calls, configuration/env, auth/session/tenant context if present.
- `NOTEBOOKLM_INDEX.md`: glossary, source map, common developer questions, search terms, "if you need X read Y".

Merge ordered section outputs with:

```bash
python3 $SKILL/scripts/merge-sections.py <parts-dir> <out-file> --title "<Doc Title>" --sources
```

Default resumable API path:

```bash
python3 $SKILL/scripts/run-stage3-api.py \
  --bundle /tmp/codebase-digest-synth/SYNTHESIS_BUNDLE.txt \
  --out /tmp/codebase-digest-findings/synth \
  --providers gemini,deepseek \
  --workers 2
```

If interrupted, rerun the same command. Do not add a separate group-synthesis layer just because a run was interrupted; resume from the saved JSON attempts first. If final docs are too thin, improve the Stage 2 corpus by splitting or dual-provider extracting large source slices, then rerun the same Stage 3 command.

API-first alternative:

- issue one public API request per final-doc section/provider pair
- save each raw JSON response locally
- extract provider replies into deterministic local files
- merge/synthesize from those local artifacts

For public API Stage 3, keep aggregate inputs in plain text/markdown and prefer one attached synthesis bundle per request.

### Stage 4 - Verify And Tighten

Run deterministic checks before calling the pack done:

```bash
bash $SKILL/scripts/enumerate-entry-points.sh <repo-root> > /tmp/codebase-digest-findings/entry-points-ground-truth.md
```

Spot-check claims that name routes, CLI commands, workers, database models, environment variables, and external services. Label inferred claims as inferred. Remove unsupported claims rather than preserving impressive prose.
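
A crude spot-check for named identifiers could look like this sketch. `find_unsupported` is a hypothetical helper, the default extensions are placeholders, and plain substring matching is an assumption that only shows a name occurs somewhere in the source, not that the surrounding claim is correct.

```python
from pathlib import Path


def find_unsupported(
    claims: list[str],
    repo_root: Path,
    exts: tuple[str, ...] = (".py", ".ts"),
) -> list[str]:
    """Return claimed identifiers that appear in no source file under repo_root."""
    remaining = set(claims)
    for path in repo_root.rglob("*"):
        if not remaining:
            break  # every claim has been located
        if path.is_file() and path.suffix in exts:
            text = path.read_text(encoding="utf-8", errors="ignore")
            remaining -= {claim for claim in remaining if claim in text}
    return sorted(remaining)
```

Anything returned here is a candidate for removal or for an explicit inferred label.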

### Stage 5 - Export NotebookLM Test Bundles

For NotebookLM experiments, export plain text/markdown bundles only. Do not upload JSON as a NotebookLM source. The exporter creates comparable bundle variants and injects cross-link headers so NotebookLM can learn how source chunks, pseudocode digests, and original files relate without flattening them into one trustless blob.

```bash
python3 $SKILL/scripts/build-notebooklm-bundles.py \
  --repo <repo-root> \
  --slices /tmp/codebase-digest-slices \
  --stage2 /tmp/codebase-digest-findings/raw \
  --final-docs /tmp/codebase-digest-final \
  --out /tmp/notebooklm-bundles
```

Outputs:

- `MANIFEST.txt` - plain text upload manifest and bundle inventory.
- `CROSSLINK_INDEX.txt` - plain text source chunk <-> digest <-> original file map.
- `FILE_REFERENCE_INDEX.txt` - plain text best-effort import/reference map between original source files when `--repo` is provided.
- `01-source-chunks/` - exact source chunk baseline with line numbers and cross-link headers, with code mirrored as `.ext.txt`.
- `02-pseudocode-only/` - Stage 2 pseudocode/digest files only.
- `03-final-docs/` - final synthesized markdown docs, when available.
- `04-hybrid/` - pseudocode/final-doc context plus selected exact source chunks.
- `05-contract-disciplined-hybrid/` - `04-hybrid` plus trust-policy docs such as `OBSERVED_VS_INFERRED.md`, `OPEN_QUESTIONS.md`, `ANSWERING_RULES.md`, `TRUST_LEVELS.md`, and `CANONICAL_TERMS.md`.

For A/B testing in NotebookLM, upload one variant at a time plus `MANIFEST.txt`, `CROSSLINK_INDEX.txt`, and `FILE_REFERENCE_INDEX.txt` when available, then ask the same fixed question set against each notebook.

Interpretation guidance:

- `01-source-chunks` is the exact-lookup control, not the ideal end-state.
- `02-pseudocode-only` often improves coherence but can become underspecified.
- `03-final-docs` often becomes too lossy for exact architectural detail.
- `04-hybrid` is the expected best general-purpose default.
- `05-contract-disciplined-hybrid` is the expected best default for synthesis, deck, and audio use when false confidence matters.

### Stage 6 - Grade NotebookLM Utility

Use this stage when deciding whether the digest bundle actually improves NotebookLM behavior. Keep it short and repeatable rather than running one long fragile browser session.

Do not treat this as a lookup-only benchmark. The real question is whether better source formatting improves:

- grounding
- uncertainty handling
- cross-source reasoning
- source selection
- slide-deck compression quality
- debate/audio usefulness without bluffing

Create a harness:

```bash
python3 $SKILL/scripts/build-notebooklm-eval-harness.py \
  --bundles /tmp/notebooklm-bundles \
  --out /tmp/notebooklm-eval-harness
```

If you already have a source-grounded question bank:

```bash
python3 $SKILL/scripts/build-notebooklm-eval-harness.py \
  --bundles /tmp/notebooklm-bundles \
  --question-bank /tmp/notebooklm-eval-harness/question_bank.md \
  --out /tmp/notebooklm-eval-harness
```

Question bank guidance:

- Use 8-20 questions total, but assign at most five question/response pairs to one subagent.
- Give each NotebookLM worker a hard wall-clock budget of about 3 minutes for small eval batches. If fewer than about 45 seconds remain, stop starting new questions and checkpoint cleanly.
- Include a balanced mix (the ranges below total 12-20 questions; scale each band down proportionally for a smaller bank):
  - 2-4 exact-symbol/lookup control questions
  - 4-6 cross-file flow/synthesis questions
  - 4-6 ambiguity/trap questions
  - 2-4 compression tasks such as slide deck or debate/audio generation
- The point of lookup questions is only to verify retrieval is functioning; they should not dominate the score.
- Keep answer keys outside NotebookLM. Never upload `answer_key.csv`, scoring CSVs, progress logs, or worker prompts as sources.
- Prefer fresh notebooks for clean scoring. If reusing a notebook, delete chat history first.
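
The batching and time-budget rules can be sketched as two small helpers. Both names are hypothetical; the defaults mirror the figures above (five pairs per subagent, roughly 180 seconds of budget, 45 seconds of reserve).

```python
def batch_questions(questions: list[str], max_per_worker: int = 5) -> list[list[str]]:
    """Split the bank so no subagent receives more than max_per_worker questions."""
    return [
        questions[i:i + max_per_worker]
        for i in range(0, len(questions), max_per_worker)
    ]


def should_start_next(elapsed_s: float, budget_s: float = 180.0, reserve_s: float = 45.0) -> bool:
    """Start another question only if at least reserve_s of the budget remains."""
    return budget_s - elapsed_s >= reserve_s
```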

Autonomous browser rules:

- If the Chrome DevTools MCP exposes `ensure_profile_browser`, call it first with a deterministic port/profile before NotebookLM work.
- After opening NotebookLM, read the visible Google account button and persist `account_email` with `port`, `profile_dir`, and `notebook_url`.
- Only resume a saved NotebookLM URL when the current account email matches the durable batch state. If the account differs, create a fresh notebook and update the state.
- Create a fresh NotebookLM notebook per bundle variant.
- Upload only the source files for that variant.
- Verify filenames, source count, and checked sources before querying.
- For small notebooks, wait for accessibility/DOM completion clues, then add a 3-5 second settle delay. Use up to 45 seconds only if the answer is still streaming or clues are missing.
- Record completion clues: question heading, answer live region/static text, `Reply ready`, save/copy/rating buttons, follow-up suggestions, and visible source count.
- Checkpoint atomically after every meaningful step: notebook created, upload submitted, upload verified, question submitted, clues observed, answer extracted, score computed.
- Append one JSONL object per answered question to a response archive. Preserve the full extracted answer text, citation labels, completion clues, score, and notes. The markdown run log is for humans; JSONL is the canonical machine-readable archive.
- Use text checkpoints as the detailed log. Take one final screenshot only as a visual sanity artifact; it should show uploaded sources and at least one answered chat message when possible. If the run fails before chat, screenshot the last reachable notebook/upload state.
- Do not let the final report or final screenshot be the only durable record.
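
Atomic checkpointing, the JSONL archive, and the account-match resume rule could be implemented along these lines. This is a sketch, not the harness's code: the function names are hypothetical, and the state keys follow the fields named above (`account_email`, `notebook_url`).

```python
import json
import os
import tempfile
from pathlib import Path


def write_checkpoint(path: Path, state: dict) -> None:
    """Write to a temp file in the same directory, then rename over the target."""
    fd, tmp = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(state, fh, indent=2)
        fh.flush()
        os.fsync(fh.fileno())
    os.replace(tmp, path)  # readers see the old or the new file, never a partial one


def append_response(archive: Path, record: dict) -> None:
    """Append one JSON object per answered question to the response archive."""
    with archive.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, ensure_ascii=False) + "\n")


def can_resume(state: dict, current_email: str) -> bool:
    """Resume a saved notebook only when the signed-in account matches."""
    return bool(state.get("notebook_url")) and state.get("account_email") == current_email
```

Writing the temp file in the target's own directory keeps the final rename on one filesystem, which is what makes the replacement atomic.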

Current grading interpretation:

- `01-source-chunks` is the baseline for exact lookup and line-grounded answers.
- `04-hybrid` is expected to be the best general-purpose NotebookLM upload when it preserves selected exact source chunks while adding pseudocode/crosslinks.
- `05-contract-disciplined-hybrid` is expected to be the best bundle for synthesis-heavy use cases such as architecture Q&A, slide decks, and audio/debate generation.
- `02-pseudocode-only` and `03-final-docs` are useful only if they answer broad architecture questions without hallucinating exact symbols or line locations.
- If a richer bundle does not beat source-only on flow or source-selection questions, simplify it rather than adding more generated prose.
- If a richer bundle gives prettier decks but increases unsupported claims, it lost.

Scoring priorities:

1. Grounding quality
2. Epistemic discipline
3. Compression quality
4. Exact lookup fidelity

For deck/audio scoring, use a structural checklist before subjective taste:

- does it separate observed vs inferred?
- does it preserve uncertainty instead of flattening it?
- does it avoid unsupported claims?
- does it keep the architecture legible?
- does it cite or reference the right source family?
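
The checklist can be recorded as a small data structure so that every deck or audio artifact is graded the same way. `DeckChecklist` and its field names are hypothetical; they restate the five questions above and assign no weights.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class DeckChecklist:
    separates_observed_vs_inferred: bool
    preserves_uncertainty: bool
    avoids_unsupported_claims: bool
    keeps_architecture_legible: bool
    cites_right_source_family: bool

    def passed(self) -> int:
        """Count how many structural checks the artifact satisfied."""
        return sum(getattr(self, f.name) for f in fields(self))

    def failures(self) -> list[str]:
        """Name the checks that failed, in checklist order."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]
```

Recording failures by name keeps the grade explainable before any subjective judgment is applied.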

## Quality Bar

The digest is successful when a fresh agent or provider can answer:

- What are the major areas of the repo and where are they?
- What are the key entry points and runtime processes?
- What are the important classes/functions/routes/jobs and what do they do?
- How do core flows move through the system?
- What data models, persistence layers, external services, and configuration drive behavior?
- Where should a developer look to change or debug a specific feature?

Every non-obvious claim should include file citations from the slice output. Prefer compact pseudocode, tables, and inventories over generic architecture prose.

## References

- `references/prompt-templates.md` - Stage 2 extraction and Stage 3 synthesis prompts.
- `references/launcher-public-api-examples.md` - public `/runs/parallel` API pattern for 4-way browser-backed provider digest runs.
- `scripts/stage-slices.py` - deterministic repo slicing.
- `scripts/validate-slices.py` - slice size validation.
- `scripts/run-stage2.py` - launcher orchestrator for per-slice extraction.
- `scripts/run-stage3-api.py` - resumable public API final-doc synthesis.
- `scripts/run-slice.sh` - lower-level hardened launcher runner.
- `scripts/merge-sections.py` - deterministic ordered merge.
- `scripts/build-notebooklm-bundles.py` - plain text/markdown bundle exporter for NotebookLM A/B tests.
- `scripts/build-notebooklm-eval-harness.py` - NotebookLM grading harness generator.
- `scripts/enumerate-entry-points.sh` - route/handler/CLI/job enumeration support.
