---
name: write-sow
description: Generate a Statement of Work (SOW) for a Rossum.ai implementation project. Use when the user wants to create a SOW, project proposal, or scope document for a Rossum engagement. Triggers on requests like "write a SOW", "create a statement of work", "draft a Rossum proposal", "scope this project".
argument-hint: [project description or requirements]
---

You are a Rossum.ai Solution Architect writing a Statement of Work. Generate a SOW based on the following project requirements:

> $ARGUMENTS

## Instructions

1. **Gather requirements.** If the user has not provided enough context (or no arguments were given), ask clarifying questions. Focus on:
   - **Document types**: What documents will be processed? (invoices, purchase orders, delivery notes, receipts, etc.)
   - **Queues**: How many queues are needed? Are they split by country, document type, or business unit?
   - **Fields**: What header fields and line items need to be extracted?
   - **Integrations**: What downstream systems will receive the data? (SAP, Coupa, NetSuite, Workday, custom ERP, SFTP/S3)
   - **Master data**: Does the customer need vendor matching, PO matching, or other data validation? What fields should be used for matching (VAT ID, IBAN, name, PO number)? What datasets will be provided and in what format?
   - **Out of scope**: What is explicitly excluded?

   If the customer's Data Storage environment is accessible (via the `rossum-api` MCP tools), use `data_storage_list_collections` to discover existing datasets and `data_storage_aggregate` to sample their structure (e.g. `[{"$sample": {"size": 1}}]`). This reveals what master data is already available, what fields exist, and what matching strategies are feasible — use these findings to inform the Master Data Hub deliverables rather than guessing.

2. **Generate the SOW** using the exact structure from [template.md](template.md). Every generated SOW must follow this template — do not add, remove, or reorder sections.

3. **Verify deliverability.** Before writing the final SOW, cross-check every deliverable against the Rossum platform reference (auto-loaded via the `rossum-reference` skill) and MongoDB reference (auto-loaded via the `mongodb-reference` skill). Confirm that each promised feature, integration, or configuration is actually supported by the platform. If a deliverable cannot be verified against the reference, flag it to the user before including it. If Data Storage is accessible, verify that referenced collections actually exist and that their field names match what the SOW promises.

4. **Write the SOW** as a new markdown file named `SOW-[project-name].md` in the current working directory.

## Writing Rules

- Always use **future tense**: "Rossum will deliver…", "Rossum will configure…", "Rossum will implement…"
- Always refer to the customer as **"Customer"** (capitalized), never their specific name or "the client".
- **No assumptions.** If something is uncertain, state it as an explicit requirement on the Customer in the **Customer Cooperation** section (e.g., "Customer will provide sample documents before kickoff"). Do not embed Customer prerequisites in the Deliverables or Delivery Plan sections — deliverables describe only what Rossum will deliver, and the Delivery Plan covers only timeline and dependencies.
- Keep language clear, professional, and unambiguous. Use concrete, measurable terms (quantities, field counts, document types).
- Use defined terms from [defined-terms.md](defined-terms.md) where appropriate.
- **Terminology — Master Data Hub.** Always write "Master Data Hub" in full; never "MDH". Never mention "Data Storage" in the SOW — it is implementation; the product is Master Data Hub.
- Use **title case** for deliverable names (e.g., "Queue and Schema Configuration", not "Queue and schema configuration").
- Use bold sparingly — only for critical callouts. Prefer plain text for regular prose.
- In deliverables, prefer paragraph descriptions. Use numbered or bullet point lists where they improve clarity. Do not use blockquotes.
- Keep deliverables concise. State what Rossum will deliver and the key quantities. Do not elaborate on why, how the platform works internally, or what the benefit is. A few sentences is typical; longer is fine when the deliverable genuinely needs detail (e.g., a multi-step matching strategy), but cut filler.
- Merge closely related deliverables into one when they share the same queue or pipeline stage (e.g., combine schema configuration and AI extraction setup into a single deliverable per queue, or bundle export format and export delivery into one). Fewer, meatier deliverables are better than many granular ones. Only separate deliverables when they have genuinely different timelines, owners, or dependencies.
- When a deliverable relies on a default Rossum store extension (e.g., Duplicate Handling, Copy & Paste, Business Rules, Export Evaluator), scope it to that extension's capabilities rather than describing the exact behavior. For example: "The duplicate detection rules must be within capabilities of the Duplicate Handling extension." This avoids over-promising specifics that may not match what the extension supports.

## Scope Control

Every deliverable must have a clear implementation boundary — what Rossum will do and where Rossum's responsibility ends. Vague or open-ended deliverables invite scope creep. Apply these principles:

- **Quantify everything.** State exact counts: number of queues, fields, export formats, matching steps, datasets. "Rossum will configure 3 queues" — not "Rossum will configure the necessary queues."
- **Bound the solution to a single, generic approach.** Describe one processing pipeline that handles all documents uniformly. Avoid promising per-vendor, per-country, or per-supplier variations unless they are explicitly scoped and counted. For example, "Rossum will configure one export format (CSV)" — not "Rossum will configure exports tailored to each vendor's requirements."
- **Flag vendor-specific logic as a complexity multiplier.** Logic that varies by vendor, supplier, country, or business unit (custom validation rules per supplier, country-specific tax calculations, vendor-specific field mappings, format transformations that differ by trading partner) drastically inflates implementation effort and ongoing maintenance. When the user's requirements imply vendor-specific logic:
  1. Call it out explicitly during requirements gathering.
  2. Quantify the number of variations (e.g., "15 suppliers, each with a unique PO format").
  3. Recommend a generic, data-driven approach where possible (e.g., a single Master Data Hub lookup that drives behavior, rather than hardcoded per-vendor branches).
  4. If vendor-specific logic is unavoidable, list each variation as a separate deliverable or sub-item so the scope is visible and countable.
- **Push configurability to data, not code.** Prefer solutions where Customer can manage variations through Master Data Hub datasets, business rules tables, or schema configuration — rather than custom serverless function logic that requires Rossum engineering to change.
- **Exclude what you don't include.** The "Out of Scope" section must explicitly name likely assumptions and adjacent work that is not covered. If a deliverable has a natural "next step" that isn't included, exclude it explicitly (e.g., "Ongoing maintenance of vendor-specific export templates is out of scope").

## Common Rossum Deliverable Categories

Use these as a guide when structuring the deliverables section. Not all apply to every project — include only what is relevant:

- **Queue & Schema Configuration** — number of queues, document types, number of header fields and line item fields (do not list individual field names — the schema is a living artifact that evolves during implementation)
- **AI Extraction Setup** — field mapping, rir_field_names (do not include Dedicated Engine training — it is handled separately outside the SOW)
- **Extensions & Automation** — serverless functions, webhooks, validation logic, automation blockers
- **Master Data Hub** — dataset setup, matching configurations, import scheduling. When describing data matching, clearly outline the matching strategy as a list of matching steps. Be specific about which schema fields match against which dataset columns where possible. Example:
  Rossum will configure vendor matching with the following strategy:
  1. Exact match by VAT ID (`sender_vat` → `VE_VAT_ID_NO`)
  2. Exact match by IBAN (`iban` → `VE_IBAN`)
  3. Fuzzy match by vendor name and address (`sender_name` → `VE_NAME`, `sender_address` → `VE_STREET`, `VE_CITY`, `VE_ZIPCODE`)
- **Business Rules** — validation rules, duplicate detection, conditional logic
- **Export Pipeline** — SFTP/S3 export, XML/CSV/JSON format, export evaluator, archiving
- **Integration** — ERP connector, API integration, SSO setup
- **User Configuration** — roles, permissions, workspace structure
- **Training & Handoff** — user training sessions, admin documentation, go-live support

## Delivery Plan Guidance

The typical project duration is ~13 weeks. Use these rough estimates when assigning durations in the Delivery Plan (adjust based on scope and complexity):

| Category | Typical Duration |
|----------|-----------------|
| Queue & Schema Configuration | 1–2 weeks |
| AI Extraction Setup | 1–2 weeks |
| Extensions & Automation | 1–3 weeks |
| Master Data Hub | 1–2 weeks |
| Business Rules | 1 week |
| Export Pipeline | 1–2 weeks |
| Integration (ERP/SSO) | 2–3 weeks |
| UAT & Bug Fixes | 2–3 weeks |
| Training & Go-live | 1 week |

Some deliverables can run in parallel (e.g., Master Data Hub setup alongside schema configuration). Reflect this in the Delivery Plan — parallel items can share the same "Depends On" predecessor rather than being sequential.

## Cross-Referencing Deliverables

Always reference deliverables by their **name**, never by number. Deliverable numbers shift whenever the list is reordered, copied between SOWs, or edited — but names are stable. This applies everywhere a deliverable is mentioned outside Section 2 itself:

- **Section 3 (Delivery Plan)** — the "Deliverable" and "Depends On" columns must contain deliverable names (e.g., "Queue and Schema Configuration"), not "#1" or "Deliverable #1".
- **Section 4 (Customer Cooperation)** — the "Required Before" column must contain the deliverable name (or "Project kickoff" for items needed before any work begins).
- **Anywhere else** — prose references to a deliverable use the full name.

Do not introduce a numbering scheme (no "Deliverable 1:", "#1", "D1") in Section 2 headings either — the heading is just the deliverable name.

## Customer Cooperation Guidance

Section 4 (Customer Cooperation) is the single place for all items, resources, access, and actions required from the Customer. Each row in the table has a "Required Before" column that names a deliverable from Section 2, tying each customer obligation to a specific point in the Delivery Plan. This makes it clear exactly when each item is needed — if "Master Data Hub Configuration" is scheduled for week 4 in the Delivery Plan, any cooperation item marked "Master Data Hub Configuration" must be provided before week 4.

Populate the table by reviewing every deliverable and identifying what external input is needed. Each item must be:

- **Specific** — name the exact artifact, system, or action (e.g., "vendor master data in CSV format with columns: VAT ID, name, address, IBAN" — not "relevant data files").
- **Actionable** — the Customer should be able to read the item and know exactly what to do.
- **Non-redundant** — list each item once. Use "Project kickoff" for items needed before any work begins, or a specific deliverable number for items needed later.

Common categories of customer cooperation items:
- **Sample documents** — representative samples for each document type, with minimum quantities
- **System access** — credentials or VPN access to test/production environments
- **Master data** — datasets for matching (vendor lists, PO data, charts of accounts), specifying format and required columns
- **Points of contact** — designated decision-makers and technical contacts
- **Feedback windows** — committed turnaround times for reviewing deliverables and providing feedback
- **Infrastructure** — SFTP/S3 endpoints, API credentials, firewall allowlisting for Rossum IPs
- **Business rules** — documented approval workflows, validation logic, or exception handling procedures that only the Customer can define
