---
name: lee-tenants-in-market
description: Ingests Triangle Pairlist tenant-requirement emails. On a schedule, reads the running user's inbox for [Triangle Pairlist] messages, screens each as a tenant requirement (a broker seeking space/investment) vs a listing (a broker marketing a property), extracts the requirement fields, and writes every screened email to the shared tenant-requirements store via lee_tenant_requirement_write. Screens from small per-message extracts (never full threads) and processes strictly sequentially with write-as-you-go, so a run completes in one session for the rolling 2-day Pairlist corpus. Run on a Cowork Scheduled Task (any cadence from hourly to daily converges; each run drains the rolling 2-day window), pinned to Haiku. Reading uses the Gmail connector; writing uses lee-raleigh-mcp.
---

# /lee-tenants-in-market

Scheduled ingest of Triangle Pairlist emails into the shared tenant-requirements D1 store. Capture-everything: store both requirements and listings (audit). A `queryable` flag gates the future broker query surface.

## Prerequisites (one-time, per runner)
- **Gmail connector** enabled in Cowork (Settings -> Connectors -> Gmail). This reads the runner's own inbox server-side; it is the ONLY supported way to read mail from a Cowork session.
- **lee-raleigh-mcp** connector enabled and the runner's email on the LEE_TENANT_WRITERS allowlist.

## Every run starts at Step 1
This is a routine ingest. Do **not** run a smoke check and do **not** write any throwaway / `smoke-*` record — go straight to Step 1. (One-time deploy validation is a separate manual gesture; see the appendix at the bottom. It is never part of a scheduled or routine run.)

## The two context rules (why past runs died)
Full Pairlist digest threads run 60-85KB; pulling even a few into the session, or fetching many in parallel, overflows the context window mid-run, the session resets, and the run restarts from scratch forever. Two rules prevent that, and they govern every step below:

1. **Never load a full thread into context.** Screen from a small extract — subject + sender + snippet, plus at most the first ~2KB of the individual message body when the extract isn't enough. Never call a get-thread tool; fetch single messages only.
2. **Strictly sequential, write-as-you-go.** One message at a time: screen -> resolve -> write, finishing the write before touching the next message. Never batch-fetch bodies in parallel. Work already written survives any interruption.

## Step 1 - Build the manifest
Via the Gmail connector, **search** for `subject:"[Triangle Pairlist]" newer_than:2d` and build a numbered manifest from the search results alone — per message: id, date, sender, subject, snippet. Order it **oldest first**. Do not fetch any bodies in this step. (Rolling 2-day window; UPSERT makes re-reads harmless.) Safety valve: if the manifest somehow exceeds 150 messages, process only the oldest 150 and report the overflow count in Step 5 — that volume is an anomaly an operator needs to see, not something to push through.

## Step 2 - Screen each email (the judgement call)
Work the manifest in order, in **batches of 5** — a batch is a progress-reporting checkpoint only; within a batch you still go strictly one message at a time. For each message:

**Get the screening extract.** Decide `record_type` from subject + snippet first — most Pairlist subjects are self-describing (e.g. "ISO: 5,000-7,000 SF Medical | Garner"). Fetch a body only when (a) subject + snippet cannot decide record_type, or (b) the message is a **requirement** (you need the requirement fields and broker signature) — clear listings need no body. When you do fetch, fetch **that single message only** (never its thread) and use **at most the first ~2KB of the body**, discarding the rest. Budget: after ~25 body fetches in one run, lean on subject + snippet for the remainder and mark them truncated — finishing the manifest beats enriching it. If a fetch fails as oversized (or the budget is spent), screen from subject + snippet alone: set `"truncated": true` in `raw_json` (Step 4), and on requirements also put `"body unavailable (oversized)"` in `additional_details` (listings record the gap via `raw_json` only — their requirement-only fields stay null).

**Decide `record_type`:**
- **requirement** - the sender is representing a tenant/buyer/investor SEEKING space or an investment. Tells: "ISO" (In Search Of), "seeking", "client looking for", "we need", an explicit requirement list.
- **listing** - the sender is MARKETING a property they have. Tells: "For Lease", "For Sale", "Now Leasing", "Available", "New to Market", "development opportunity", "reduced price".
Judge by who-wants-what, not by stray keywords ("Now Leasing 2,000 SF available" is a LISTING even though it has a size).

Then set, **on EVERY record (requirement AND listing)**:
- `reason`: **always required** — a one-clause rationale for the requirement-vs-listing call (e.g. "broker marketing a property they have" for a listing, "broker representing a tenant seeking space" for a requirement). Set it on listings too; it is the audit trail. A null `reason` is a bug.
- `is_investment`: true only if it seeks an INVESTMENT / $-budget property rather than space.
- `queryable`: true for space requirements; **false** for `is_investment` requirements (audit-only) and for all listings.

For **requirements only**, also extract (leave these null on listings): `tenant`, `requirement_sf` (verbatim), `sf_min` (int or null), `budget` (verbatim $ or null), `preferred_location`, `asset_type`, `tenure` (lease/purchase/both/null), `additional_details`.

Worked anchors (from the validated Phase-1 corpus):
- "ISO: 5,000-7,000 SF Medical | Garner" -> requirement, queryable, asset_type=medical, tenure=both, sf_min=5000.
- "ISO Investment $1.5M" -> requirement, is_investment=true, queryable=false, budget="$1.5M".
- "ISO 2nd Gen Restaurant Garner" -> requirement, queryable (sf=null, matchable by type+location).
- "North Graham Business Center Now Leasing" -> listing, queryable=false.

## Step 3 - Resolve broker contact
Triangle Pairlist runs on gaggle.email and encodes the original sender as `triangle-pairlist+<name>_at_<domain>@gaggle.email` -> `<name>@<domain>`.
- If the message's envelope `From:` is a gaggle alias (reading Pairlist directly, e.g. on Will's machine), decode it.
- Else parse the forwarded body's `From:` header (David reading Will's forwards now).
Set `broker_name`, `broker_email`, `broker_phone` (from the signature, if it was inside the extract you fetched; null otherwise).

## Step 4 - Write (immediately, before the next message)
For EACH screened message, call `lee_tenant_requirement_write` **as soon as it is screened** — do not accumulate a batch of pending writes:
- `source_message_id`: the email's RFC822 Message-ID if the connector exposes it; else the connector's stable per-message id. (This is the dedup key.)
- `received_date`: the email date (YYYY-MM-DD).
- all screened fields from Step 2-3, **including `reason` on every call** (a listing row carries `record_type`, `reason`, broker contact, and `raw_json`; the requirement-only fields stay null).
- `raw_json`: a JSON string of the **screening extract envelope** — headers (date, from, subject) + the body extract you screened from (capped at ~2KB) + `"truncated": true` whenever the body was capped or unavailable. Never put a full email body here: the write call transits the context window too, and an unbounded `raw_json` re-creates the overflow the extract rule exists to prevent.
Re-running is safe (UPSERT on source_message_id).

After each batch of 5, emit a one-line progress marker (`batch 3/11 done — 15/51 written`) and continue to the next batch until the manifest is drained.

## If a run is interrupted
Do not try to resume from remembered state, and **never reuse message or thread IDs from before a session reset** (they go stale and return "Requested entity was not found"). The next run simply repeats Steps 1-4 from a fresh search: writes are UPSERTs, extract-based re-screening is cheap, and the oldest-first order means the run converges instead of thrashing.

(Operator notes — for whoever maintains the Scheduled Task, not actions for the running agent:) If scheduled runs still fail to produce a Step 5 report after this structure, repin the task to Sonnet — the loop is model-agnostic, but Sonnet recovers better from tool errors. And after an outage longer than 2 days, messages have aged out of the `newer_than:2d` window unwritten — run the skill once manually with `newer_than:` widened to cover the gap (e.g. `newer_than:5d`).

## Step 5 - Report
Summarize: N in manifest, M requirements (K queryable), (N-M) listings, write errors, how many fell back to subject+snippet screening, and the manifest-overflow count if the Step 1 safety valve fired. Do not collapse requirements and listings.

---

## Appendix — one-time deploy validation (MANUAL only, never on scheduled/routine runs)
Run this by hand ONLY right after deploying a brand-new build, to confirm the runtime path before trusting a scheduled run. Skip it entirely on every normal run.
1. **Gmail reachable:** list 1 message matching `subject:"[Triangle Pairlist]"`. If it errors → `Gmail connector: BLOCKED`, stop.
2. **Write path reachable:** do ONE real Step 1–4 pass on a single recent email and confirm the row lands in D1. Use a REAL email — do not invent a throwaway `smoke-*` record (it would pollute the table). If the write returns `forbidden`, the runner isn't on `LEE_TENANT_WRITERS` — escalate to David.
