---
name: nocap-efficient-file-operations
description: Use for ALL file creation, modification, and editing tasks. Defines operation tiers, smart file targeting, large file strategy, multi-file dependency ordering, and verification. Invoke before any file operation. Re-invoke every 5 messages during file-heavy work.
---

Author: HyperWorX (https://github.com/HyperWorX)
License: MIT

# Efficient File Operations Protocol

Read this entire document before performing any file operation.
Process each section. These are operational rules, not suggestions.

---

## 1. Core Principle

Never regenerate content that already exists unchanged.

If a file exists and the task is to modify it, the output should
contain only the changes, not a reproduction of unchanged content.
The tools available (Edit, Bash, Write, Read, Grep, Glob) have
different costs. Default to the cheapest tool that accomplishes
the task. Escalate only when a cheaper tool is insufficient.

---

## 2. Operation Tiers

Tier 1 (lowest cost): Bash with append/insert commands.
- echo >> file, sed line insertion, tee -a.
- Use for: adding content to end of file, inserting at a known
  line position, simple line-level additions.

Tier 2 (low cost): Edit tool targeted edits.
- Use for: changing specific passages, fixing values, swapping
  sections, correcting errors.
- The Edit tool specifies old text and new text within a file.
  Ensure the old text is unique in the file. Include enough
  surrounding context (3-5 lines) to guarantee uniqueness.
- Use replace_all parameter when all instances need the same change.
- This is Claude Code's primary editing mechanism and the default
  for most modifications.

Tier 3 (medium cost): Bash with Python or scripting.
- Use for: programmatic bulk changes (regex patterns across many
  locations), section replacement by line range, structured format
  manipulation (JSON, XML, YAML, docx, pptx, xlsx), many small
  scattered changes where individual Edit tool calls would exceed
  the cost of a script.

Tier 4 (medium cost): copy source file, then targeted edits.
- Bash: cp source destination, then Tier 1-3 edits on copy.
- Use for: creating a new version based on an existing file where
  the majority of content is preserved.

Tier 5 (highest cost): Write tool with full content.
- Use for: genuinely new files with no existing source, OR when
  more than 60% of content changes, OR when the file format
  changes entirely (e.g., txt to structured md, restructuring
  document flow).
- This tier requires justification in the process trace.
- The Write tool requires that the file has been Read first if
  it already exists.

Default to the lowest applicable tier. Escalate only when the
lower tier is demonstrably insufficient.

---

## 3. Mandatory Pre-Edit Steps

Before any file modification:

### 3.0 Identify Target Files

Do not broadly search when context narrows the target. Before
reaching for Grep tool across a directory, use what is already
known:

1. **User's words.** If the user names a file, references a
   section, or describes content, that is the primary signal.
   Go directly to that file.

2. **Conversation history.** Files already viewed, edited, or
   discussed in this conversation are likely targets. Check
   recent tool calls before searching.

3. **Project structure.** If working in a known directory layout,
   use the structure: config changes go in config files, style
   changes go in stylesheets, content changes go in content
   files. Use Bash ls or Glob tool on the directory once early,
   then use that knowledge.

4. **File type narrows tool choice.** The user says "update the
   README" -- that is one file, one view, one or more edits. Not
   a recursive search.

5. **Dependency inference.** If the user asks to rename a function,
   the target files are: the file defining it, and files importing
   or calling it. Use the definition file to find the identifier,
   then targeted Grep tool search for that specific identifier
   across the project. Not a broad content search.

Broad search (Grep tool across directory) is a fallback when
context does not narrow the target, not a default first step.
Every broad search costs tokens reading irrelevant results.

Search narrowing order:
- Named file > described content > project structure inference >
  file type filtering > targeted Grep on specific pattern >
  broad recursive Grep.

Use Glob for file pattern matching (e.g., find all *.py files).
Use Grep for content search (e.g., find where a function is
called). These are dedicated Claude Code tools preferred over
bash equivalents.

### 3.1 View Before Edit

Always Read the file (or the relevant section) immediately before
editing. Do not edit based on memory of prior views if any
modifications have been made to the file in the current
conversation.

For files under 500 lines: Read the full file.
For files over 500 lines: Read only the section(s) being edited,
using the offset and limit parameters, and search (Grep tool)
for any other content the change might affect.

### 3.2 Backup Before Destructive Edits

Before operations that cannot be trivially undone (sed on the
original, multi-step edits to a critical file, format
conversion), copy the file first:

  cp file file.bak

or work on a copy:

  cp file working-copy

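If the project is under git version control, a backup copy is
unnecessary for tracked files whose current state is committed
(git checkout -- file or git restore file recovers the last
committed version). For untracked files, tracked files with
uncommitted changes worth keeping, or files outside a git repo,
back up before destructive edits.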

### 3.3 Assess Scope

Determine what percentage of the file changes:
- Under 10%: Tier 1-2.
- 10-40%: Tier 2-3.
- 40-60%: Tier 3-4.
- Over 60% or format conversion: Tier 5 (justified).
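
The thresholds above can be sketched as a small helper. This is a minimal illustration, not part of the protocol: `changed_fraction` and `suggested_tiers` are hypothetical names, estimating the changed share with `difflib` assumes a draft of the new text is available, and the handling of exact boundary values (10%, 40%, 60%) is an assumption because the list does not specify it.

```python
import difflib


def changed_fraction(old_text: str, new_text: str) -> float:
    """Estimate the share of original lines that do not survive unchanged."""
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    if not old_lines:
        return 1.0  # no existing content: everything is new
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    unchanged = sum(block.size for block in matcher.get_matching_blocks())
    return (len(old_lines) - unchanged) / len(old_lines)


def suggested_tiers(fraction: float) -> str:
    """Map a changed fraction (0.0-1.0) onto the tier bands listed above."""
    if fraction < 0.10:
        return "Tier 1-2"
    if fraction < 0.40:
        return "Tier 2-3"
    if fraction <= 0.60:  # assumption: exactly 60% stays below Tier 5
        return "Tier 3-4"
    return "Tier 5 (justified)"
```

Pure appends leave every original line intact, so they score 0.0 and land in Tier 1-2, which matches the append guidance in Section 2.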

### 3.4 Search Before Propagating

If a change could affect other locations in the file (renames,
signature changes, value changes referenced elsewhere), search
the entire file for related patterns before committing any edit.
Use: Grep tool on the file.

For multi-file changes, search across all affected files:
Grep tool with directory path.

### 3.5 Announce Scope If Ambiguous

If the user's request uses ambiguous terms (update, rewrite,
revise, redo, rework), state the planned approach before
executing. Example: "I'll modify sections 3 and 5 with the new
data. The rest of the document stays as-is."

Default: announce scope before proceeding. Skip announcement
only when all of the following hold:
(1) user named the specific file,
(2) user specified the exact change,
(3) no other files are implied or required by the change.
If any condition is not met, or if uncertain, announce scope.

---

## 4. Edit-Family Verb Interpretation

These verbs default to TARGETED CHANGES, not full regeneration:
- update, edit, modify, change, fix, correct, revise, adjust,
  tweak, patch, amend.

These verbs may justify broader changes (assess on context):
- rewrite, redo, rework, overhaul, restructure, convert.

These verbs justify full generation:
- create, write, generate, draft, build, make (when no source
  file exists).

When "rewrite" is used but the actual changes are minor (e.g.,
"rewrite this paragraph"), apply it only to the specified scope.
"Rewrite the document" with no source file means generate new.
"Rewrite the document" with an existing source file means assess
what actually needs to change, default to targeted edits unless
changes exceed 60%.

---

## 5. Structured File Formats

For files where raw text manipulation would break structure:

### 5.1 JSON, XML, YAML
Use Python with the appropriate library (json,
xml.etree.ElementTree, PyYAML's yaml module) to parse, modify,
and write back. Do not use text-level sed/Edit on structured
data unless the change is trivially safe (e.g., replacing a
single known value).
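
A minimal sketch of the parse, modify, write-back cycle for JSON, using only the standard library. The helper name `set_json_value` and the key path in the usage note are hypothetical. Note one assumption baked in: `json.dumps` re-serialises the whole document with a 2-space indent, so the file's original formatting is not preserved.

```python
import json
from pathlib import Path


def set_json_value(path: Path, keys: list[str], value) -> None:
    """Set one nested value in a JSON file without touching other data."""
    data = json.loads(path.read_text(encoding="utf-8"))
    node = data
    for key in keys[:-1]:
        node = node[key]  # raises KeyError if the path does not exist
    node[keys[-1]] = value
    # Assumption: 2-space indent and a trailing newline are acceptable output.
    path.write_text(
        json.dumps(data, indent=2, ensure_ascii=False) + "\n",
        encoding="utf-8",
    )
```

For example, `set_json_value(Path("config.json"), ["server", "port"], 9090)` would change only that value in the parsed data. If the formatting change is unwanted and the value is a single known literal, a targeted Edit may be the cheaper choice.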

### 5.2 docx, pptx, xlsx
Use python-docx, python-pptx, openpyxl respectively.
When modifying an existing file: load it, make targeted changes
to the document object, save. Do not rebuild the document from
scratch unless it is genuinely a new document.

### 5.3 Code Files
The Edit tool is the primary tool. For many related changes, a
Python script using re.sub or ast manipulation may be more
efficient. Always grep for all instances of changed identifiers
before and after editing.
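
As an illustration of the re.sub route, the sketch below renames an identifier on whole-word matches only and returns the replacement count, which can be compared against the grep count taken beforehand. `rename_identifier` is a hypothetical name. The regex has no notion of syntax, so it will also rewrite matches inside strings and comments; treat that as a known limitation of this sketch.

```python
import re


def rename_identifier(source: str, old: str, new: str) -> tuple[str, int]:
    """Rename whole-word occurrences of `old`; return (new_source, count)."""
    pattern = re.compile(rf"\b{re.escape(old)}\b")
    # A function replacement avoids backslash escapes in `new` being interpreted.
    return pattern.subn(lambda _match: new, source)
```

Because of the `\b` boundaries, renaming `load` leaves `load_all` alone.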

---

## 6. Common Wasteful Patterns (Do Not Do These)

### 6.1 Regeneration for Append
WRONG: Read file, then Write tool with all existing content
plus new content at the end.
RIGHT: Bash with append operation, or Edit tool anchored
to the last unique section.

### 6.2 Regeneration for Small Fix
WRONG: Read file, then Write tool rewriting everything with
the fix included.
RIGHT: Edit tool targeting only the broken passage.

### 6.3 Regeneration for "Consistency"
WRONG: "I'll rewrite the file to make sure my changes are
consistent with the rest of the content."
RIGHT: Search for all related instances. Make targeted Edit
tool calls to each. Verify by viewing affected sections after
editing.

### 6.4 Regeneration Across Multiple Files
WRONG: User asks for changes to 3 files. Claude rewrites all 3.
RIGHT: Targeted edits to each file independently.

### 6.5 Broad Search When Context Narrows
WRONG: User says "fix the typo in the README." Claude runs
Grep tool across the entire project looking for typos.
RIGHT: Read the README directly. It was named.

### 6.6 Viewing Entire Large Files
WRONG: User asks to fix line 45. Claude Reads all 800 lines.
RIGHT: Read lines 40-55 (using offset and limit). Fix line 45.
Read lines 40-55 to verify.

---

## 7. Handling Edit Tool Failures

If an Edit tool call fails (non-unique text or text not found):

1. Read the file to diagnose. The content may have changed from
   a prior edit, or the old text may have a whitespace mismatch.
2. Adjust the old text with more surrounding context or corrected
   content to ensure uniqueness.
3. Retry.
4. If the passage genuinely appears multiple times and all
   instances need the same change, use the Edit tool with
   replace_all parameter, or use Bash with sed or Python re.sub.
5. Do NOT escalate to full file regeneration (Write tool) because
   the Edit tool failed. Diagnose and fix.
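
The diagnosis in steps 1-4 can be sketched as a check run against the file text just read. This is an illustrative helper only: `diagnose_old_text` is a hypothetical name, and comparing with whitespace collapsed is an assumed heuristic for spotting tab/space or line-wrap mismatches.

```python
def diagnose_old_text(file_text: str, old_text: str) -> str:
    """Classify why an exact-match edit would succeed or fail."""

    def collapse(text: str) -> str:
        # Reduce every run of whitespace to a single space.
        return " ".join(text.split())

    count = file_text.count(old_text)
    if count == 1:
        return "unique"
    if count > 1:
        return f"ambiguous ({count} matches): add context or use replace_all"
    if collapse(old_text) and collapse(old_text) in collapse(file_text):
        return "whitespace mismatch: copy the passage exactly from a fresh Read"
    return "not found: the content has probably changed since the last Read"
```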

---

## 8. Section Replacement for Large Blocks

When a contiguous section (20+ lines) needs complete replacement
but the rest of the file is unchanged:

Option A: Edit tool using the first 3-5 lines of the section
as old text anchor. Replace with new content. This works if the
opening lines are unique.

Option B: Python script to read file, identify section by
markers (headings, comments, delimiters), replace section
content, write back.

Option C: sed with line ranges (if line numbers are confirmed
via recent Read): sed -i '200,350c\new content' file.
Note: this one-line form is GNU sed syntax (BSD/macOS sed needs
-i '' and a newline after c\). For multi-line replacement, a
heredoc or Python is cleaner.

When replacing a section, work bottom-up if multiple sections
are being replaced, so line numbers of earlier sections are not
shifted by later replacements.
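
Option B might look like the sketch below for a Markdown file, where a section runs from its heading to the next heading of the same or a higher level. `heading_level` and `replace_section` are hypothetical names. The sketch assumes ATX-style `#` headings and does not skip fenced code blocks, so a `# comment` line inside a fence would be mistaken for a heading.

```python
def heading_level(line: str) -> int:
    """Return the ATX heading level of a line, or 0 if it is not a heading."""
    hashes = len(line) - len(line.lstrip("#"))
    return hashes if hashes and line[hashes:hashes + 1] == " " else 0


def replace_section(text: str, heading: str, new_body: str) -> str:
    """Replace everything under `heading` up to the next same-level heading."""
    level = heading_level(heading)
    if level == 0:
        raise ValueError(f"{heading!r} is not a '#'-style heading")
    lines = text.splitlines(keepends=True)
    starts = [i for i, line in enumerate(lines) if line.rstrip("\r\n") == heading]
    if len(starts) != 1:
        raise ValueError(f"expected one {heading!r} heading, found {len(starts)}")
    start = starts[0]
    end = len(lines)
    for i in range(start + 1, len(lines)):
        other = heading_level(lines[i])
        if other and other <= level:
            end = i  # first heading that closes the section
            break
    if not new_body.endswith("\n"):
        new_body += "\n"
    return "".join(lines[: start + 1]) + new_body + "".join(lines[end:])
```

The heading line itself is kept, and the function refuses to guess when the marker is missing or duplicated, which mirrors the uniqueness requirement of Option A.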

---

## 9. Verification After Edits

After completing targeted edits:

- For small changes (1-3 edits): Read the edited sections to
  confirm correctness.
- For larger changes (4+ edits): Read a broader range or the
  full file to verify no unintended side effects.
- For structured formats: run a validation step if available
  (python -m json.tool, xmllint, python import test).
- For files with a backup: diff original against modified to
  confirm only intended changes were made:
  diff file.bak file
  This is cheaper than re-Reading the full file and catches
  unintended side effects from sed or scripted edits.

Verification is cheaper than debugging a corrupted file.
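
Where a shell diff is not convenient, the backup comparison can be sketched with the standard library. `diff_against_backup` and `changed_lines` are hypothetical names; the sketch assumes both versions fit in memory as text.

```python
import difflib


def diff_against_backup(original: str, modified: str, name: str = "file") -> list[str]:
    """Return a unified diff (no context lines) from backup to current text."""
    return list(
        difflib.unified_diff(
            original.splitlines(),
            modified.splitlines(),
            fromfile=f"{name}.bak",
            tofile=name,
            lineterm="",
            n=0,  # no context: only the changed lines appear
        )
    )


def changed_lines(diff: list[str]) -> list[str]:
    """Keep only the added/removed lines, dropping the file and hunk headers."""
    return [
        line
        for line in diff
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]
```

If `changed_lines` lists anything beyond the intended edits, a scripted or sed step touched more than planned.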

---

## 10. Decision Log

Before executing any file operation, include a brief decision
entry in the process trace:

[File op: {filename}
 Tier: {1-5}
 Reason: {why this tier}
 Changes: {summary of what changes}
 Rewrite justified: {yes/no, why if yes}]

This log serves two purposes:
1. Forces the planning step before execution.
2. Creates an auditable record of efficiency decisions.

---

## 11. When Full Rewrite IS Correct

Full rewrite (Tier 5) is the correct choice when:
- No source file exists (genuinely new content).
- The file format changes entirely (plain text to structured
  document, CSV to formatted report).
- More than 60% of content changes substantively (not just
  formatting, but actual content replacement).
- The file is very short (under 30 lines) and the overhead of
  planning targeted edits exceeds the cost of regeneration.
- The user explicitly requests "start from scratch" or "write
  new" or equivalent.

When Tier 5 is selected, note it in the decision log with the
justification.

### 11.1 Rewrite Bias Interaction

When the rewrite bias (nocap Section 14.7) indicates a full
rewrite of a component, reassess the file operation tier using
this procedure:

1. Take the tier §3.3 originally selected based on scope
   percentage (call this the "scope tier").
2. Check the Tier 5 justification conditions above (§11):
   (1) no source file exists, (2) format changes entirely,
   (3) more than 60% substantive content change, (4) very short
   file (<30 lines) where planning overhead exceeds regen cost,
   (5) user explicitly requested "start from scratch".
3. If ANY Tier 5 condition holds, escalate to Tier 5. The
   rewrite is both structural AND large-scale; regenerating
   the whole component is efficient.
4. If NO Tier 5 condition holds, stay at the scope tier. The
   rewrite is structural but localised (e.g., redesigning a
   15-line function within a 500-line file). Use the scope
   tier's tooling (Edit for Tier 2, scripted for Tier 3) and
   mark the work in the decision log as "rewrite within tier".
   The rewrite bias FCP result informs HOW you approach the
   edit (design the replacement coherently as a whole rather
   than accumulating patches) but does not force Tier 5 when
   the scope is genuinely narrow.

Log the tier decision in the decision log (Section 10) with
reference to the rewrite bias classification AND the Tier 5
condition-check result: `Tier: [N] | Rewrite bias: yes | Tier 5
conditions: [list which hold, or 'none']`.
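
The procedure above reduces to a short decision function. The sketch below is illustrative only: `Tier5Conditions` and `resolve_tier` are hypothetical names, and the field names are paraphrases of the five conditions in §11.

```python
from dataclasses import dataclass


@dataclass
class Tier5Conditions:
    no_source_file: bool = False
    format_changes_entirely: bool = False
    over_60_percent_substantive: bool = False
    very_short_file: bool = False
    user_requested_from_scratch: bool = False

    def holding(self) -> list[str]:
        """Names of the conditions that are true, in declaration order."""
        return [name for name, value in vars(self).items() if value]


def resolve_tier(scope_tier: int, conditions: Tier5Conditions) -> tuple[int, str]:
    """Apply steps 3-4: escalate to Tier 5 only if some condition holds."""
    held = conditions.holding()
    tier = 5 if held else scope_tier
    listed = ", ".join(held) if held else "none"
    return tier, f"Tier: {tier} | Rewrite bias: yes | Tier 5 conditions: {listed}"
```

A 15-line function redesign inside a 500-line file would pass an empty `Tier5Conditions()` and stay at its scope tier.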

---

## 12. Drift Mitigation

This skill's influence will degrade over long conversations.
Mitigations:

- The decision log (Section 10) forces re-engagement with the
  protocol at each file operation.
- If you notice yourself reaching for Write tool when an
  existing file is being modified, that is a drift signal. Stop.
  Re-read Section 2. Select the correct tier.
- The user may say "re-read skills" or "refresh protocol."
  When they do, re-read this document in full.
- Before any file operation in a conversation that has exceeded
  10 messages, re-read Section 6 (Common Wasteful Patterns)
  to recalibrate.

---

## 13. Large File Strategy

Files over 500 lines require different handling than small files.
Do not attempt to hold the entire file in working memory.

### 13.1 Reading

- Use Glob or Bash ls to orient within the directory, then Read
  the file header or Grep for heading and definition patterns to
  understand structure first (headings, section markers, function
  names).
- Use Grep tool to locate specific content by pattern.
- Use Read with offset and limit parameters to read only the
  relevant sections.
- Build a mental map: "Section A is lines 1-80, Section B is
  lines 81-200" etc. Record this if the file will be edited
  multiple times.

### 13.2 Editing

- For scattered changes: use Grep tool to find all locations
  first, note line numbers, then edit from bottom to top (so
  line numbers of earlier edits are not shifted).
- For section rewrites in large files: use line-range sed or
  Python scripts rather than Edit tool with large old text
  blocks (which are fragile and token-expensive).
- After editing, verify only the changed sections plus their
  immediate context, not the entire file.
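
The bottom-to-top rule can be sketched as follows. Every edit is expressed in line numbers taken from the original file, and applying them in descending order means no earlier range is shifted. `apply_line_edits` is a hypothetical name, and the overlap check is an added safeguard rather than something the protocol requires.

```python
def apply_line_edits(
    lines: list[str],
    edits: list[tuple[int, int, list[str]]],
) -> list[str]:
    """Apply (first_line, last_line, replacement) edits, 1-based inclusive."""
    ordered = sorted(edits, key=lambda edit: edit[0])
    for (_, prev_last, _), (next_first, _, _) in zip(ordered, ordered[1:]):
        if next_first <= prev_last:
            raise ValueError("edits overlap; line numbers would be ambiguous")
    result = list(lines)
    for first, last, replacement in reversed(ordered):  # bottom of file first
        result[first - 1:last] = replacement
    return result
```

Replacing line 2 with two lines and lines 4-5 with one line works in a single pass, even though the first edit changes the file length.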

### 13.3 When a File Is Too Large

If a single file exceeds ~2000 lines and requires extensive
modification, apply FCP to determine the operation tier:
(a) Tier 3 scripted operation: more reliable for bulk
    transformation across a large file.
(b) Interactive editing with targeted section reads: viable
    when changes are localised to known sections.
The biased default is (b). Generate the case for (a) first.

---

## 14. Multi-File Operations

### 14.1 Dependency Ordering

When changes span multiple files, determine the dependency
order before making any edits:

1. Identify which files define and which files consume the
   thing being changed (a function, a variable, a config key,
   a section heading referenced elsewhere).
2. Edit definition files first, consumption files second.
3. If a change in file B depends on a change in file A being
   complete, finish A (including verification) before starting B.
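
One way to derive the edit order is a topological sort over a "consumes" mapping. The sketch uses `graphlib.TopologicalSorter` from the standard library (Python 3.9+). `edit_order` and the file names in the usage note are hypothetical.

```python
from graphlib import TopologicalSorter


def edit_order(consumes: dict[str, set[str]]) -> list[str]:
    """Order files so each definition file comes before its consumers.

    `consumes` maps a file to the files whose definitions it uses.
    Raises graphlib.CycleError if the files depend on each other in a loop.
    """
    return list(TopologicalSorter(consumes).static_order())
```

For example, `edit_order({"app.py": {"utils.py"}, "utils.py": {"config.py"}})` yields `config.py`, then `utils.py`, then `app.py`. A `CycleError` is a signal that the files cannot be edited strictly one after another and the change needs to be planned as a unit.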

### 14.2 Failure Handling

If an edit fails partway through a multi-file operation:
- Stop. Do not continue to downstream files.
- Diagnose and fix the failed edit.
- Verify the fix.
- Then continue to the next file.

If a multi-file operation cannot be completed (tool failure,
unexpected file state), report what was completed and what
remains. Do not leave files in an inconsistent state without
noting it.

### 14.3 Cross-File Propagation

When a change in one file affects references in other files:

1. Complete the primary edit.
2. Search for all references: Grep tool with directory path
3. Edit each reference file.
4. Verify: Grep tool should return no results for old pattern
   (or only intentionally unchanged instances).

This is the file-operations equivalent of nocap-robust-review 2.5
(cross-reference propagation). If nocap-robust-review is active,
its procedures take precedence for the review step.
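
Inside a Tier 3 script, where the Grep tool is not available, the step 4 check can be approximated as below. This is a stand-in sketch, not a replacement for the Grep tool: `remaining_references` is a hypothetical name, and the default `**/*.py` glob is an assumption about the project.

```python
import re
from pathlib import Path


def remaining_references(
    root: Path,
    identifier: str,
    glob: str = "**/*.py",
) -> list[tuple[str, int, str]]:
    """List (relative_path, line_number, line) for leftover whole-word matches."""
    pattern = re.compile(rf"\b{re.escape(identifier)}\b")
    hits = []
    for path in sorted(root.glob(glob)):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                hits.append((path.relative_to(root).as_posix(), number, line.strip()))
    return hits
```

An empty list means the old pattern is gone. Any remaining hits should be either fixed or confirmed as intentionally unchanged.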

---

## 15. Encoding and Line Endings

### 15.1 Detection

Before editing a file with unexpected characters or behaviour:
  file filename          (reports encoding/type)
  head -c 3 filename | xxd  (checks for UTF-8 BOM)

### 15.2 Common Issues

- UTF-8 BOM (EF BB BF): some editors add this. Edit tool may
  fail on the first line if the BOM is present but not included
  in old text. Use (GNU sed): sed -i '1s/^\xEF\xBB\xBF//' filename
- Windows line endings (\r\n): cause visible ^M characters and
  can break scripts. Convert (GNU sed): sed -i 's/\r$//' filename
- Mixed encoding: if a file has non-UTF-8 characters, use Python
  with explicit encoding handling rather than text tools.

### 15.3 Preservation

When editing a file, preserve its existing encoding and line
ending convention. Do not convert unless the user requests it
or the encoding is causing errors.
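
For scripted edits, detection and preservation can be combined. The sketch below works on raw bytes, records whether a UTF-8 BOM and CRLF endings are present, and restores both after the edit. `detect_conventions` and `edit_preserving` are hypothetical names. The sketch assumes UTF-8 content and deliberately refuses files with mixed line endings rather than normalising them.

```python
import codecs
from typing import Callable


def detect_conventions(raw: bytes) -> dict:
    """Report BOM presence and the line-ending convention of raw file bytes."""
    crlf = raw.count(b"\r\n")
    lf_only = raw.count(b"\n") - crlf
    if crlf and lf_only:
        newline = "mixed"
    elif crlf:
        newline = "crlf"
    else:
        newline = "lf"
    return {"utf8_bom": raw.startswith(codecs.BOM_UTF8), "newline": newline}


def edit_preserving(raw: bytes, transform: Callable[[str], str]) -> bytes:
    """Apply `transform` to the text and re-encode with the original conventions."""
    found = detect_conventions(raw)
    if found["newline"] == "mixed":
        raise ValueError("mixed line endings: handle this file explicitly")
    # utf-8-sig strips a BOM if present; non-UTF-8 input raises UnicodeDecodeError.
    text = raw.decode("utf-8-sig").replace("\r\n", "\n")
    edited = transform(text)
    if found["newline"] == "crlf":
        edited = edited.replace("\n", "\r\n")
    encoded = edited.encode("utf-8")
    return codecs.BOM_UTF8 + encoded if found["utf8_bom"] else encoded
```

Read the file with `Path.read_bytes()` and write the result with `Path.write_bytes()` so that Python's own newline translation does not interfere.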

---

## 16. Binary and Non-Text Files

The tier system (Section 2) applies to text files. Binary files
require different handling.

### 16.1 Identification

If the file extension or content suggests binary (images, PDFs,
compiled files, archives, database files), do not attempt
text-level editing. Use:
  file filename
to confirm the type.

### 16.2 Binary Operations

- Images: use Python PIL/Pillow or ImageMagick CLI.
- PDFs: Claude Code can read PDF files directly via the Read tool.
- Archives: use zip/unzip, tar.
- Office formats (docx, pptx, xlsx): use the appropriate
  Python library (Section 5.2).
- Compiled/executable files: cannot be meaningfully edited.
  Report this to the user.

### 16.3 The Tier System for Binary Files

Tier 1: CLI tools (convert, mogrify, pdftk) for single
  transformations.
Tier 2: not applicable (Edit tool does not work on binary).
Tier 3: Python scripts with appropriate libraries.
Tier 4: copy source, then Tier 1 or 3 operations on copy.
Tier 5: generate new binary from scratch (e.g., creating a
  new image, building a new PDF).

---

## 17. Cleanup

### 17.1 Working Files

After completing a multi-step operation, clean up intermediate
files:
- Remove .bak files if the operation succeeded and the user
  has the final output.
- Remove temporary working copies after the final file is in
  place.
- Do not leave partial or failed outputs without noting them.

### 17.2 When Not to Clean Up

- If the user might need to revert, keep backups until the
  conversation ends or the user confirms.
- If intermediate files have diagnostic value (showing what
  went wrong), note their existence rather than deleting.
