---
allowed-tools: Glob, Grep, Read, MultiEdit, Edit, TodoWrite, Task, Bash
argument-hint: [directory-or-pattern]
description: Generate code map headers: add dependency analysis and data flow documentation to each file
---

# Generate Code Map Headers

## Step 1: Find all code files
Use Glob to find files matching the pattern `${1:-**/*.{js,ts,jsx,tsx,py}}` (the first argument, defaulting to all JS/TS/Python files)
IMPORTANT:
- Exclude: docs/, node_modules/, dist/, build/, .next/, __pycache__, *.md files
- Exclude: *.min.js, *.bundle.js, auto-generated files
- Skip files with: // @generated, /* auto-generated */, # Generated by
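
A minimal sketch of the generated-file check (the helper name `is_generated` and the five-line window are assumptions; marker strings vary by toolchain):

```python
# Hypothetical helper: skip files whose first lines carry a generated-file marker.
GENERATED_MARKERS = ("// @generated", "/* auto-generated */", "# Generated by")

def is_generated(path: str, lines_to_check: int = 5) -> bool:
    """Return True if any known marker appears near the top of the file."""
    with open(path, encoding="utf-8", errors="ignore") as f:
        for _, line in zip(range(lines_to_check), f):
            if any(marker in line for marker in GENERATED_MARKERS):
                return True
    return False
```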

## Step 2: Create task list
Use TodoWrite to track all files that need documentation:
1. Create todo items for each file found
2. Group by file type and directory
3. Mark as 'pending', 'in_progress', or 'completed'
4. Track which files were skipped (already documented)

## Step 3: Build dependency graph
Analyze all files to create dependency order:
1. Build import graph for all files
2. Sort files by dependency depth (leaf nodes first)
3. Process files from least dependencies to most
This ensures child components are documented before their parents.
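
A minimal ordering sketch using the standard library's `graphlib` (Python 3.9+), assuming the import graph is already built as a dict mapping each file to the project files it imports. Note that `static_order()` raises `CycleError` on circular imports, which feeds the warnings in Step 9:

```python
from graphlib import TopologicalSorter  # stdlib, Python 3.9+

# Hypothetical import graph: file -> set of project files it imports.
import_graph = {
    "app/page.tsx": {"components/Header.tsx", "lib/api.ts"},
    "components/Header.tsx": {"lib/api.ts"},
    "lib/api.ts": set(),
}

# static_order() yields dependencies before dependents, so leaf files
# (here lib/api.ts) are documented first and top-level pages last.
processing_order = list(TopologicalSorter(import_graph).static_order())
# -> ['lib/api.ts', 'components/Header.tsx', 'app/page.tsx']
```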

## Step 4: Smart dependency analysis
For each file found:
1. ALWAYS read the file first
2. Check if documentation already exists; if it is accurate, SKIP
3. Detect file type and framework:
   - **JS/TS**: React Component | Next.js Route | Express API | Vue Component | Hook | Service | Config
   - **Python**: Django Model/View | FastAPI Router | Flask Blueprint | SQLAlchemy Model | Pydantic Schema | Celery Task | pytest Test
4. Read imports based on language:
   - **JS/TS**: import/require statements, dynamic imports, lazy loading
   - **Python**: import/from statements, __import__(), importlib
5. Use Grep to find who imports this file (illustrative patterns follow this list):
   - **JS/TS**: search for filename in import/require statements
   - **Python**: search for module name in: `import module`, `from module import`,
     `from . import`, `from ..package import`, relative imports
6. Detect circular dependencies and note them as warnings (see the detection sketch in Step 9)
7. Check for environment variables:
   - **JS/TS**: process.env.X
   - **Python**: os.environ, os.getenv(), python-dotenv
8. Find associated test files:
   - **JS/TS**: *.test.js, *.spec.ts, __tests__/
   - **Python**: test_*.py, *_test.py, tests/, pytest files
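
For item 5, illustrative reverse-import regexes; the helper and the patterns are assumptions, not exhaustive (path aliases like `@/...`, aliased re-exports, and string-built dynamic imports need extra handling):

```python
import re

def importer_patterns(module: str) -> dict[str, re.Pattern]:
    """Illustrative regexes for finding files that import `module` (e.g. "payments")."""
    return {
        # Python: `import pkg.module`, `from module import X`,
        # `from .module import X`, `from . import module`
        "python": re.compile(
            rf"^\s*(?:import\s+(?:[\w.]+\.)?{module}\b"
            rf"|from\s+\.{{0,2}}(?:[\w.]+\.)?{module}\s+import"
            rf"|from\s+\.+\s+import\s+.*\b{module}\b)",
            re.MULTILINE,
        ),
        # JS/TS: `from './module'`, `require('./module')`, dynamic `import('./module')`
        "js": re.compile(
            rf"""(?:from|require\(|import\()\s*['"][^'"]*{module}(?:\.[jt]sx?)?['"]"""
        ),
    }
```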

## Step 5: Generate or update file header documentation
Based on full context from related files, create/update headers.

**IMPORTANT: Be as comprehensive as possible.** Include all relevant details that help LLMs and developers understand:
- What the file does
- How it fits into the system
- What data flows through it
- What side effects it has
- How to use it

### Python File Headers - STANDARD Format

Use this format for most files:
```python
"""
Purpose: [What problem this solves - one line]
LLM-Note:
  Dependencies: imports from [file1.py, file2.py] | imported by [caller1.py, caller2.py] | tested by [tests/test_*.py]
  Data flow: receives X from caller → processes via Y → returns Z
  State/Effects: modifies self.X | writes to file/database | calls external API
  Integration: exposes func_a(), func_b(), ClassC | uses @decorator | FastAPI dependency
  Performance: caching strategy | async patterns | known bottlenecks
  Errors: raises ErrorType | handles X | fallback behavior
"""
```

### Python File Headers - DETAILED Format (for core/complex files)

Use this for files with many relationships, architectural decisions, or complex logic:

**Real Example: connectonion/agent.py**
```python
"""
Purpose: Orchestrate AI agent execution with LLM calls, tool execution, and automatic logging
LLM-Note:
  Dependencies: imports from [llm.py, tool_factory.py, prompts.py, decorators.py, logger.py, tool_executor.py, tool_registry.py] | imported by [__init__.py, debug_agent/__init__.py] | tested by [tests/test_agent.py, tests/test_agent_prompts.py, tests/test_agent_workflows.py]
  Data flow: receives user prompt: str from Agent.input() → creates/extends current_session with messages → calls llm.complete() with tool schemas → receives LLMResponse with tool_calls → executes tools via tool_executor.execute_and_record_tools() → appends tool results to messages → repeats loop until no tool_calls or max_iterations → logger logs to .co/logs/{name}.log and .co/sessions/{name}_{timestamp}.yaml → returns final response: str
  State/Effects: modifies self.current_session['messages', 'trace', 'turn', 'iteration'] | writes to .co/logs/{name}.log and .co/sessions/ via logger.py
  Integration: exposes Agent(name, tools, system_prompt, model, log, quiet), .input(prompt), .execute_tool(name, args), .add_tool(func), .remove_tool(name), .list_tools(), .reset_conversation() | tools stored in ToolRegistry with attribute access (agent.tools.tool_name) and instance storage (agent.tools.gmail) | tool execution delegates to tool_executor module | log defaults to .co/logs/ (None), can be True (current dir), False (disabled), or custom path | quiet=True suppresses console but keeps session logging | trust enforcement moved to host() for network access control
  Performance: max_iterations=10 default (configurable per-input) | session state persists across turns for multi-turn conversations | ToolRegistry provides O(1) tool lookup via .get() or attribute access
  Errors: LLM errors bubble up | tool execution errors captured in trace and returned to LLM for retry

Architecture:
    ┌──────────────────────────────────────────────────────────────────────┐
    │                           Agent.input(prompt)                        │
    └───────────────────────────────────┬──────────────────────────────────┘
                                        │
                                        ▼
    ┌──────────────────────────────────────────────────────────────────────┐
    │  Initialize/Restore Session                                          │
    │  - Create current_session dict if None                               │
    │  - Or restore from passed session (stateless API)                    │
    │  - Add user message to messages[]                                    │
    └───────────────────────────────────┬──────────────────────────────────┘
                                        │
                                        ▼
    ┌──────────────────────────────────────────────────────────────────────┐
    │  _run_iteration_loop(max_iterations)                                 │
    │  ┌────────────────────────────────────────────────────────────────┐  │
    │  │  while iteration < max_iterations:                             │  │
    │  │      1. _get_llm_decision()                                    │  │
    │  │         → llm.complete(messages, tools)                        │  │
    │  │         → returns LLMResponse with content/tool_calls          │  │
    │  │                                                                │  │
    │  │      2. if no tool_calls: return response.content              │  │
    │  │                                                                │  │
    │  │      3. _execute_and_record_tools(tool_calls)                  │  │
    │  │         → tool_executor executes each tool                     │  │
    │  │         → adds assistant message + tool results to messages    │  │
    │  │         → fires before_tools/after_each_tool/after_tools       │  │
    │  │                                                                │  │
    │  │      4. continue loop (LLM sees tool results)                  │  │
    │  └────────────────────────────────────────────────────────────────┘  │
    └───────────────────────────────────┬──────────────────────────────────┘
                                        │
                                        ▼
    ┌──────────────────────────────────────────────────────────────────────┐
    │  Return final response + log to YAML session                         │
    └──────────────────────────────────────────────────────────────────────┘

File Relationships:
    connectonion/
    ├── agent.py           # THIS FILE - orchestrates execution
    ├── llm.py             # LLM abstraction (OpenAI, Anthropic, Gemini)
    ├── tool_factory.py    # Convert functions to tool schemas
    ├── tool_registry.py   # Store and lookup tools (O(1))
    ├── tool_executor.py   # Execute tools with xray context
    ├── logger.py          # Terminal + file + YAML logging
    ├── prompts.py         # Load system prompts
    └── events.py          # Event system (before_llm, after_tools, etc.)

    Flow: agent.py → llm.py → provider API
                   → tool_executor.py → user functions
                   → logger.py → .co/logs/, .co/sessions/

Event System:
    after_user_input  → fires after user prompt added to messages
    before_llm        → fires before each LLM call (can modify messages)
    after_llm         → fires after LLM response (can inspect tool_calls)
    before_tools      → fires ONCE before all tools execute
    before_each_tool  → fires before EACH tool
    after_each_tool   → fires after EACH tool (don't add messages!)
    after_tools       → fires ONCE after all tools (safe to add messages)
    on_error          → fires on tool error
    on_complete       → fires when task completes

Session Structure:
    current_session = {
        'session_id': str,           # For API continuation
        'messages': List[Dict],      # OpenAI format messages
        'trace': List[Dict],         # Execution history
        'turn': int,                 # Conversation turn counter
        'iteration': int,            # Current loop iteration
        'user_prompt': str,          # Original user input
        'result': str                # Final response
    }

Design Notes:
    - Functions as tools: Pass any callable, auto-converts to schema
    - Class instances: Methods become tools, instance accessible via agent.tools.classname
    - Stateless API support: Pass session dict to continue conversations
    - Event-driven: Plugins hook into lifecycle via events
    - Fail-fast: Tool errors returned to LLM for retry, not silently ignored
"""
```

**Another Example: connectonion/llm.py**
```python
"""
Purpose: Unified LLM abstraction supporting OpenAI, Anthropic, Gemini, and managed keys
LLM-Note:
  Dependencies: imports from [openai, anthropic, os, usage.py] | imported by [agent.py, llm_do.py] | tested by [tests/test_llm.py, tests/real_api/]
  Data flow: messages: List[Dict] + tools: List[Schema] → provider-specific format → API call → normalize to LLMResponse(content, tool_calls, usage)
  State/Effects: reads API keys from environment | no persistent state | usage tracked per call
  Integration: exposes LLM base class, create_llm() factory, OpenAILLM, AnthropicLLM, GeminiLLM, ManagedKeysLLM | factory routes by model prefix (gpt-, claude-, gemini-, co/)
  Performance: synchronous API calls | no caching | provider timeouts respected
  Errors: raises provider-specific errors | ValueError for missing API keys

Architecture:
    ┌─────────────────────────────────────────────────────────────┐
    │                    create_llm(model, api_key)               │
    │  Routes to appropriate LLM class based on model prefix      │
    └─────────────────────────┬───────────────────────────────────┘
                              │
              ┌───────────────┼───────────────┬───────────────┐
              │               │               │               │
              ▼               ▼               ▼               ▼
    ┌─────────────┐  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐
    │ OpenAILLM   │  │AnthropicLLM │  │  GeminiLLM  │  │ManagedKeys  │
    │ gpt-*, o1-* │  │ claude-*    │  │ gemini-*    │  │ co/* prefix │
    └──────┬──────┘  └──────┬──────┘  └──────┬──────┘  └──────┬──────┘
           │                │                │                │
           ▼                ▼                ▼                ▼
    ┌─────────────────────────────────────────────────────────────┐
    │                    LLMResponse                              │
    │  content: str | tool_calls: List[ToolCall] | usage: Usage   │
    └─────────────────────────────────────────────────────────────┘

Model Routing:
    - gpt-*, o1-*, o4-*  → OpenAILLM (OPENAI_API_KEY)
    - claude-*           → AnthropicLLM (ANTHROPIC_API_KEY)
    - gemini-*, models/* → GeminiLLM (GOOGLE_API_KEY)
    - co/*               → ManagedKeysLLM (OPENONION_API_KEY)
"""
```

### JavaScript/TypeScript File Headers

```javascript
/**
 * @purpose [What problem this solves - one line]
 * @llm-note
 *   Dependencies: imports from [lib/api.js, utils/format.js] | imported by [app/page.tsx, components/Header.tsx] | tested by [__tests__/module.test.js]
 *   Data flow: receives {user: User, settings: Settings} from page.tsx → validates → fetches from api.js → transforms → returns {formattedData: DataResponse}
 *   State/Effects: updates Redux store.user | calls API POST /users | emits 'user-updated' event | writes to localStorage
 *   Integration: exposes {getUserData, updateUser} | uses parent's onUpdate callback | implements AuthMiddleware interface
 *   Performance: caches user data 5min | debounces API calls 300ms | lazy loads ProfileImage
 *   Errors: throws UserNotFoundError | handles network timeout | fallback to cache
 *
 * Architecture:
 *     ┌─────────────────────────────────────────────────────────┐
 *     │  Component Tree                                         │
 *     │  App → AuthProvider → UserProfile → UserService (this) │
 *     └─────────────────────────────────────────────────────────┘
 *
 * File Relationships:
 *     features/user/
 *     ├── UserProfile.tsx   # UI component
 *     ├── userService.ts    # THIS FILE
 *     └── userTypes.ts      # TypeScript types
 */
```

The LLM-Note MUST include ALL of these aspects:
1. **File Dependencies**:
   - Imports: [actual file paths this imports from]
   - Imported by: [use Grep to find who imports this file]
   - Test files: [any test files that test this module]

2. **Data Flow**:
   - Input: [what data comes in, from which files/sources]
   - Processing: [key transformations or business logic]
   - Output: [what data goes out, to which files/destinations]

3. **State & Side Effects**:
   - State mutations: [what global/shared state it modifies]
   - Side effects: [API calls, file I/O, DOM updates, database writes]
   - Event emissions: [events triggered, subscribers affected]

4. **Integration Points**:
   - APIs exposed: [public methods/functions other files use]
   - Callbacks used: [callbacks it receives from parent components]
   - Hooks/middleware: [lifecycle hooks, middleware it provides/uses]
   - External services: [third-party APIs, databases, services it connects to]

## Step 6: Smart header insertion/update
- If NO header exists: Add new header at file top
- If header EXISTS: Update it based on new analysis
- If header is ACCURATE: Skip the file
- Always use MultiEdit or Edit to preserve existing code
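
When adding a new header to a Python file, a minimal sketch for choosing the insertion line (the helper name is hypothetical); it assumes the header must not displace a shebang or a PEP 263 encoding comment. For JS/TS, check whether directive prologues like `'use strict'` or `'use client'` must stay above the header before inserting:

```python
import re

ENCODING_RE = re.compile(r"^#.*coding[:=]\s*[-\w.]+")  # PEP 263 encoding comment

def header_insert_line(source: str) -> int:
    """Return the 0-based line where a new module docstring header goes.

    Skips a shebang and an encoding comment (which PEP 263 restricts to
    the first two lines) so the header never displaces them.
    """
    lines = source.splitlines()
    idx = 0
    if idx < len(lines) and lines[idx].startswith("#!"):
        idx += 1
    if idx < len(lines) and ENCODING_RE.match(lines[idx]):
        idx += 1
    return idx
```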

## Step 7: Update todo list and track progress
Use TodoWrite throughout the process:
1. Mark each file as 'in_progress' when starting
2. Mark as 'completed' when documentation added/updated
3. Mark as 'completed' (noting it was skipped) if the file already has accurate docs
4. Provide final summary:
   - Total files processed
   - Files updated vs skipped
   - Circular dependencies found
   - Missing test coverage
   - Critical warnings discovered
DO NOT create any new documentation files or modify any markdown files

## Step 8: File-type specific focus
**JavaScript/TypeScript:**
- **Components**: Props/State types, events emitted, context consumed, render conditions
- **API Routes**: Request/response types, middleware chain, status codes, error responses
- **Utilities**: Input/output types, pure functions, memoization, edge cases
- **Hooks**: useEffect deps, cleanup returns, custom hook composition
- **Services**: Axios/fetch configs, interceptors, retry logic, caching strategy

**Python:**
- **Classes**: __init__ params, @property decorators, inheritance (ABC, Protocol), __slots__
- **API Endpoints**: @app.route or @router decorators, Pydantic models, HTTPException handling
- **Async Functions**: async/await, asyncio.create_task(), gather(), event loops
- **Data Classes**: @dataclass fields, field(default_factory), __post_init__, asdict()
- **Decorators**: functools.wraps, *args/**kwargs handling, closure state
- **Type Hints**: Generic[T], Union, Optional, TypeVar, Protocol definitions

## Step 9: Critical warnings to include
Add warnings in the LLM-Note when detected (a detection sketch for circular dependencies follows the list):
- ⚠️ Circular dependency with [file.js]
- ⚠️ Performance: expensive operation in render/hot path
- ⚠️ Security: handles sensitive data/credentials
- ⚠️ Deprecated: scheduled for removal in v2.0
- ⚠️ TODO: unfinished implementation at line X
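
Where the Step 3 import graph is available, a minimal recursive-DFS sketch for flagging the circular-dependency warning (the helper name `find_cycles` is hypothetical; fine for project-sized graphs):

```python
def find_cycles(graph: dict[str, set[str]]) -> list[list[str]]:
    """Return import cycles in a file -> imports graph via DFS back edges."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {node: WHITE for node in graph}
    stack: list[str] = []
    cycles: list[list[str]] = []

    def visit(node: str) -> None:
        color[node] = GRAY
        stack.append(node)
        for dep in graph.get(node, ()):
            if color.get(dep, WHITE) == GRAY:          # back edge -> cycle
                cycles.append(stack[stack.index(dep):] + [dep])
            elif color.get(dep, WHITE) == WHITE:
                visit(dep)
        stack.pop()
        color[node] = BLACK

    for node in graph:
        if color[node] == WHITE:
            visit(node)
    return cycles
```

Each returned cycle becomes a ⚠️ Circular dependency entry in every affected file's header.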

## Step 10: Process checklist
Before marking a file as completed, verify:
- [ ] Read all imported files for context
- [ ] Found all files that import this one (used Grep)
- [ ] Identified correct file type (Component/Class/API/etc)
- [ ] Documented actual file paths, not generic names
- [ ] Included data types in data flow
- [ ] Listed all side effects and state changes
- [ ] Added performance notes if relevant
- [ ] Added warnings for issues found
- [ ] Checked for associated test files
- [ ] Updated TodoWrite status

## Step 11: Important principles
- ONLY modify code files (.js/.ts/.jsx/.tsx/.py) - never .md files
- NEVER touch docs/ folder or any documentation files
- ONLY add/update comment headers at the top of code files
- ALWAYS read related files before documenting
- Keep docs minimal but include LLM context hints
- Focus on relationships and data flow over implementation
- Update stale docs when code has changed
- Skip files with accurate existing documentation