---
name: flightplanner
description: Framework-agnostic E2E testing principles, spec-driven test generation, and maintenance workflows
version: 0.1.0
---

# Flightplanner Skill

You are an expert at writing, maintaining, and reasoning about end-to-end (E2E) tests. You follow spec-driven testing practices where `E2E_TESTS.md` files are the single source of truth, and test code is generated and maintained from those specifications.

## Core Principles

### 1. Specs Are the Source of Truth

All E2E test behavior is defined in `E2E_TESTS.md` specification files. Tests are generated from specs, not the other way around. When specs and tests disagree, the spec wins.

- Root-level `docs/E2E_TESTS.md` or `E2E_TESTS.md` defines project-wide testing philosophy
- Package-level `E2E_TESTS.md` files define specific test cases
- Never modify specs to match broken tests — fix the tests

### 2. Complete Test Isolation

Every test must be independent. No shared state, no ordering dependencies.

- Each test gets its own temporary directory
- Environment variables are saved and restored
- Git repositories are created fresh per test
- Background processes are terminated in cleanup
- See: `reference/isolation.md`
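
The isolation rules above can be sketched in Node-flavored TypeScript as follows. This is a minimal illustration, not the project's actual helper: `createIsolatedContext` and the `e2e-` prefix are hypothetical names, and fresh git repositories and background processes are left out for brevity.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical helper: captures the process state a test may mutate.
interface IsolatedContext {
  dir: string;
  restore: () => void;
}

function createIsolatedContext(prefix = "e2e-"): IsolatedContext {
  const originalCwd = process.cwd();
  const originalEnv = { ...process.env };
  // realpathSync resolves symlinked temp roots so later path comparisons are stable.
  const dir = fs.realpathSync(fs.mkdtempSync(path.join(os.tmpdir(), prefix)));
  process.chdir(dir);

  return {
    dir,
    // Restores CWD and env vars only; removing `dir` is left to the cleanup helper.
    restore: () => {
      process.chdir(originalCwd);
      // Drop variables the test added.
      for (const key of Object.keys(process.env)) {
        if (!(key in originalEnv)) delete process.env[key];
      }
      // Put back the values that were present before the test.
      for (const key of Object.keys(originalEnv)) {
        const value = originalEnv[key];
        if (value !== undefined) process.env[key] = value;
      }
    },
  };
}
```

In a vitest-style suite, a per-test setup hook would create the context and the matching teardown hook would call `restore()` before the directory is removed.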

### 3. Resilient Cleanup

Cleanup failures must never fail tests. Use best-effort cleanup with retries.

- Always use `safeCleanup()` — never raw recursive delete
- Clean up in reverse creation order
- Restore process state (CWD, env vars) before removing files
- See: `reference/cleanup.md`
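
One way `safeCleanup()` could look is sketched below. The body is an assumption (the real helper is described in `reference/cleanup.md`); it relies on the built-in retry options of Node's `fs.rmSync`, and `cleanupInReverse` is a hypothetical companion that applies the reverse-creation-order rule.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical implementation of safeCleanup(): best-effort, never throws.
// Returns true when the directory is gone, false when removal failed.
function safeCleanup(dir: string, maxRetries = 3, retryDelay = 100): boolean {
  try {
    // force: true treats a missing path as success;
    // maxRetries/retryDelay retry transient errors such as EBUSY or ENOTEMPTY.
    fs.rmSync(dir, { recursive: true, force: true, maxRetries, retryDelay });
    return true;
  } catch (error) {
    // Log and continue: a cleanup failure must not fail the test.
    console.warn("safeCleanup: could not remove " + dir, error);
    return false;
  }
}

// Removes directories in reverse creation order and reports the ones left behind.
function cleanupInReverse(dirsInCreationOrder: string[]): string[] {
  const failed: string[] = [];
  for (const dir of [...dirsInCreationOrder].reverse()) {
    if (!safeCleanup(dir)) failed.push(dir);
  }
  return failed;
}
```

Callers would restore CWD and environment variables first, then pass the tracked directories to `cleanupInReverse`.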

### 4. Mock Only at System Boundaries

Prefer real implementations. Mock only external, slow, expensive, or non-deterministic dependencies.

- Use real file systems and git repositories
- Mock external CLI tools via PATH injection (not framework mocking)
- Use conditional skip for tests requiring real external services
- See: `reference/mocking.md`

### 5. Local Tests Must Always Be Runnable

The default E2E test suite must be fully self-contained and runnable without access to any remote or live services. Tests that depend on remote services (external APIs, live backends, cloud infrastructure, real AI agents) must be skippable so that the completely local test suite can be run at all times — in CI, offline, and during development. Remote-dependent tests are opt-in, never opt-out.

- Prefer the test framework's native filtering or tagging mechanism (e.g., tags, groups, categories) to separate local from remote-dependent tests
- If the framework lacks native filtering, use environment variables to control skipping — and those variables must be documented in `CONTRIBUTING.md` or equivalent project contributor documentation
- See: `reference/mocking.md`

### 6. Setup-Execute-Verify

Every test follows three phases:

```
Setup   → prepare the specific state for this test
Execute → perform the single action under test
Verify  → assert the expected outcomes
```
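
As a rough illustration of the three phases, the sketch below uses only Node built-ins. `initProject` is a hypothetical stand-in for the code under test, and a real suite would wrap the body in its framework's test construct and use its cleanup helper for teardown.

```typescript
import { strict as assert } from "node:assert";
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical code under test: writes a default config file into a project directory.
function initProject(projectDir: string): void {
  fs.writeFileSync(path.join(projectDir, "config.json"), JSON.stringify({ version: 1 }));
}

function testInitCreatesConfig(): void {
  // Setup: prepare only the state this test needs.
  const projectDir = fs.mkdtempSync(path.join(os.tmpdir(), "e2e-init-"));
  try {
    // Execute: perform the single action under test.
    initProject(projectDir);

    // Verify: assert the expected outcomes.
    const raw = fs.readFileSync(path.join(projectDir, "config.json"), "utf8");
    assert.equal(JSON.parse(raw).version, 1);
  } finally {
    // Teardown is kept trivial here; a real suite would call its cleanup helper.
    fs.rmSync(projectDir, { recursive: true, force: true });
  }
}
```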

### 7. Autogenerated Tests

Test files include headers/footers indicating they are autogenerated. Manual modifications are overwritten on regeneration. To change tests, update the spec.

### 8. Execute Before Trusting

Never assume generated test code works until it has been executed. Every test generation or modification must be followed by actually running the tests. If a test passes but the underlying feature is broken, the test is wrong. When feasible, also exercise the code under test directly (run the CLI, curl the API, open the UI) to verify behavior beyond what automated tests cover.

### 9. Run Tests First

Before modifying any test code, run the existing test suite to establish a known baseline. This reveals pre-existing failures, confirms which tests currently pass, and prevents conflating new breakage with old. If existing tests fail, note them so they are not confused with regressions introduced by your changes.

## Spec Format Summary

Each `E2E_TESTS.md` contains suites with this structure:

```markdown
## <Suite Name>

### Preconditions
- Required setup (maps to per-test or per-suite setup hooks)

### Features

#### <Feature Name>
<!-- category: core|edge|error|side-effect|idempotency -->
- Assertion 1
- Assertion 2

### Postconditions
- Verifiable end states
```

### Feature Categories

| Category      | Purpose                                       |
|---------------|-----------------------------------------------|
| `core`        | Happy-path, primary functionality             |
| `edge`        | Boundary conditions, unusual-but-valid inputs |
| `error`       | Failure modes, error handling                 |
| `side-effect` | External interactions, hooks, notifications   |
| `idempotency` | Safe repetition of operations                 |

### Metadata Comments

```markdown
<!-- category: core -->            Required: test category
<!-- skip: requires-real-agent --> Optional: generates skipped test
<!-- tags: slow, docker -->        Optional: arbitrary tags
```
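
Because the metadata lives in HTML comments, it can be extracted with a small parser. The sketch below is one possible approach, not the actual tooling: `parseMetadata` and `FeatureMetadata` are hypothetical names, and only the three keys listed above are recognized.

```typescript
interface FeatureMetadata {
  category?: string;
  skip?: string;
  tags: string[];
}

// Extracts <!-- key: value --> comments from the markdown of one feature.
function parseMetadata(markdown: string): FeatureMetadata {
  const metadata: FeatureMetadata = { tags: [] };
  // A fresh regex per call avoids sharing lastIndex state between invocations.
  const pattern = /<!--\s*([a-z-]+)\s*:\s*(.*?)\s*-->/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(markdown)) !== null) {
    const key = match[1];
    const value = match[2];
    if (key === "category") {
      metadata.category = value;
    } else if (key === "skip") {
      metadata.skip = value;
    } else if (key === "tags") {
      // Tags are comma-separated; surrounding whitespace is ignored.
      metadata.tags = value
        .split(",")
        .map((tag) => tag.trim())
        .filter((tag) => tag.length > 0);
    }
  }
  return metadata;
}
```

A generator could then treat a missing `category` as a spec validation error and emit a skipped test whenever `skip` is set.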

Full format specification: `reference/spec-format.md`

## Test Organization

### File Naming
```
<feature>.e2e.test.<ext>
```

E2E tests MUST live in their own dedicated files, separate from unit tests, integration tests, or manually-written tests. This prevents merge conflicts between autogenerated E2E files and hand-maintained test files, and avoids accidental overwrites when `fp-update` regenerates E2E test code. See `reference/organization.md` for details.

### Directory Layout
```
package/
├── src/commands/__tests__/
│   ├── e2e-utils.ts          # Shared helpers
│   ├── init.e2e.test.ts      # One file per suite
│   ├── task.e2e.test.ts
│   └── fixtures/             # Test data
├── E2E_TESTS.md              # Spec file
└── vitest.e2e.config.ts      # E2E runner config
```

### Mapping: Spec → Test

| Spec             | Test Construct                                                        |
|------------------|-----------------------------------------------------------------------|
| Suite (`##`)     | Suite/group block (e.g., `describe()` in vitest) + test file          |
| Preconditions    | Per-test setup hook (e.g., `beforeEach` in vitest)                    |
| Feature (`####`) | Individual test case (e.g., `it()` / `test()` in vitest)             |
| Bullets          | Assertion statements (e.g., `expect()` / `assert` in vitest)         |
| Postconditions   | Final assertions + per-test teardown hook (e.g., `afterEach` in vitest) |
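
To make the mapping concrete, the sketch below renders a vitest-style skeleton from an already-parsed suite. It is illustrative only: `SuiteSpec`, `FeatureSpec`, `renderSuite`, and the header wording are assumptions, not the generator's real types or output.

```typescript
interface FeatureSpec {
  name: string;
  assertions: string[];
}

interface SuiteSpec {
  name: string;
  preconditions: string[];
  features: FeatureSpec[];
  postconditions: string[];
}

// Renders one test file skeleton: suite -> describe, feature -> it, bullets -> assertion stubs.
function renderSuite(suite: SuiteSpec): string {
  const lines: string[] = [
    "// AUTOGENERATED from E2E_TESTS.md. Do not edit; update the spec instead.",
    'import { afterEach, beforeEach, describe, it } from "vitest";',
    "",
    "describe(" + JSON.stringify(suite.name) + ", () => {",
    "  beforeEach(() => {",
    ...suite.preconditions.map((item) => "    // Precondition: " + item),
    "  });",
    "",
    "  afterEach(() => {",
    ...suite.postconditions.map((item) => "    // Postcondition: " + item),
    "  });",
  ];
  for (const feature of suite.features) {
    lines.push("", "  it(" + JSON.stringify(feature.name) + ", () => {");
    for (const assertion of feature.assertions) {
      lines.push("    // TODO assert: " + assertion);
    }
    lines.push("  });");
  }
  lines.push("});", "");
  return lines.join("\n");
}
```

The output is a starting point only: each stub still has to be replaced with real setup code and assertions, and the resulting file has to be executed before it is trusted.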

Full organization guide: `reference/organization.md`

## Mock Strategy Summary

**Decision order:**
1. Can I use the real thing? → Use it
2. Can I use a local substitute? → Use it
3. Is the external dependency itself what is being tested? → Use the real thing or a high-fidelity substitute
4. Is the real dependency too slow, expensive, or non-deterministic? → Mock it

**PATH-based mocking** for CLI tools:
```pseudocode
createMockTool("docker", exitCode=0, output="Docker version 24.0.0")
env.PATH = mockBinDir + ":" + originalPath
```
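
The pseudocode above might translate to Node-flavored TypeScript roughly as follows. This is a POSIX-only sketch under stated assumptions: `createMockTool` here takes an explicit `binDir` argument, `runWithMockDocker` is a hypothetical wrapper, and Windows would need a `.cmd` shim instead of a shell script.

```typescript
import { spawnSync } from "node:child_process";
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical helper mirroring createMockTool() from the pseudocode.
// Writes an executable shell script that prints `output` and exits with `exitCode`.
function createMockTool(binDir: string, name: string, exitCode: number, output: string): string {
  const toolPath = path.join(binDir, name);
  // Single-quote the output for the shell, escaping embedded single quotes.
  const quoted = "'" + output.replace(/'/g, "'\\''") + "'";
  const script = ["#!/bin/sh", "printf '%s\\n' " + quoted, "exit " + exitCode, ""].join("\n");
  fs.writeFileSync(toolPath, script);
  fs.chmodSync(toolPath, 0o755);
  return toolPath;
}

// Runs `docker --version` with the mock directory placed first on PATH.
function runWithMockDocker(): string {
  const mockBinDir = fs.mkdtempSync(path.join(os.tmpdir(), "mock-bin-"));
  try {
    createMockTool(mockBinDir, "docker", 0, "Docker version 24.0.0");
    // Only the child's environment is changed; the test process keeps its own PATH.
    const env = { ...process.env, PATH: mockBinDir + path.delimiter + (process.env.PATH ?? "") };
    const result = spawnSync("docker", ["--version"], { env, encoding: "utf8" });
    return result.stdout;
  } finally {
    fs.rmSync(mockBinDir, { recursive: true, force: true });
  }
}
```

Passing the modified `PATH` through the child's `env` rather than mutating `process.env` keeps the mock from leaking into other tests.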

**Conditional skip** for optional dependencies:
```pseudocode
SKIP_REAL_AGENT = env.E2E_REAL_AGENT != "true"
suite.skipIf(SKIP_REAL_AGENT) "real agent tests":
  ...
```
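
A TypeScript rendering of the same idea is sketched below. `shouldSkipRealAgent` is a hypothetical helper; the vitest call in the trailing comment shows how the flag would typically be consumed, assuming the project uses vitest.

```typescript
// Remote-dependent tests are opt-in: they run only when E2E_REAL_AGENT is exactly "true".
function shouldSkipRealAgent(env: Record<string, string | undefined>): boolean {
  return env["E2E_REAL_AGENT"] !== "true";
}

const SKIP_REAL_AGENT = shouldSkipRealAgent(process.env);

// With vitest, the flag can gate an entire suite:
//   describe.skipIf(SKIP_REAL_AGENT)("real agent tests", () => { /* ... */ });
```

Comparing against the literal `"true"` means an unset or mistyped variable leaves the remote tests skipped, which keeps the default run fully local.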

Full mocking guide: `reference/mocking.md`

## Commands

| Command          | Description                                                                 | Modifies Code? |
|------------------|-----------------------------------------------------------------------------|----------------|
| `fp-init`        | Bootstrap E2E specs for a project from release history and source analysis  | Yes            |
| `fp-audit`       | Analyze spec-to-test coverage gaps                                          | No             |
| `fp-review-spec` | Validate spec completeness and format                                       | No             |
| `fp-generate`    | Generate tests from spec (full suite)                                       | Yes            |
| `fp-add`         | Add feature or suite to spec + generate tests                               | Yes            |
| `fp-update`      | Sync tests with current spec state                                          | Yes            |
| `fp-fix`         | Fix failing tests (never modifies specs)                                    | Yes            |
| `fp-smoke-test`  | Exercise the application directly to verify behavior beyond automated tests | No             |
| `fp-add-spec`    | Create new E2E_TESTS.md for a package                                       | Yes            |
| `fp-update-spec` | Update spec from git log / new features                                     | Yes            |

## Workflow

### Starting Fresh (no specs exist)
1. Run `fp-init` to bootstrap `E2E_TESTS.md` files across the project from release history and source analysis
2. Run `fp-review-spec` to validate completeness
3. Run `fp-generate` to create test files

### Adding Specs to a Single Package
1. Run `fp-add-spec` to create `E2E_TESTS.md` by analyzing the package
2. Run `fp-review-spec` to validate completeness
3. Run `fp-generate` to create test files

### Adding New Features
1. Run `fp-add` with a description of the feature
2. The command detects whether the feature belongs in an existing suite or needs a new one
3. It then updates the spec and generates or updates the affected tests

### Maintaining Tests
1. Run `fp-audit` to check coverage
2. Run `fp-update` to sync tests with spec changes
3. Run `fp-fix` to repair failing tests

### After Code Changes
1. Run `fp-update-spec` to reflect new functionality in specs
2. Run `fp-update` to regenerate tests from updated specs

### Verifying Beyond Tests
Run `fp-smoke-test` to exercise the application directly and verify that features work end-to-end in a real environment, not just in isolated test cases.

## Key Conventions

- **Examples are illustrative**: most use pseudocode, and where vitest or TypeScript appears it is one concrete case; adapt to the project's actual language and test framework
- **Specs use HTML comments for metadata**: machine-parseable, invisible when rendered
- **Tests are autogenerated**: never hand-edit generated test files
- **Cleanup never fails tests**: best-effort with retries
- **Real over mock**: prefer real file systems, real git, real processes
- **Sequential execution**: E2E tests run one at a time in a single worker process (a single fork in vitest) to avoid resource conflicts

## Reference Documents

- `reference/spec-format.md` — Complete guide to E2E_TESTS.md format
- `reference/isolation.md` — Test isolation and state leak patterns
- `reference/cleanup.md` — Resilient cleanup and retry patterns
- `reference/mocking.md` — Mock decision framework and patterns
- `reference/organization.md` — File naming, structure, and spec-to-test mapping
- `reference/manual-verification.md` — Manual verification patterns by application type
