---
name: chatbot-saas
description: Build chatbot-dashboard SaaS apps — authenticated dashboard where user manages many projects, each project a container of stuff, with an agentic chatbot that reads/mutates same Supabase tables UI reads. Four non-negotiables — agentic (not workflow), UI+bot shared state, server is dumb (spec.md + rules/*.md in Supabase via /admin), security baseline applied. Covers full 7-item widget UI contract (streaming, pills, reset, stop, placeholder, sticky scroll, full markdown), SSE wiring, EAV knowledge management, full security code patterns (Helmet CSP, CORS, body cap, rate limiter, AUTH_TOKEN middleware, sanitizeSearch, SSE-safe error handling), Phase 0 tool-suggestions workflow.
---

# Chatbot-dashboard SaaS

A reusable skeleton for the class of web apps shipped repeatedly: an authenticated **dashboard** where the user manages many **projects**, each project is a **container of stuff** (generated HTML, images, audio, `.md` docs, kanban cards), and an **agentic chatbot** drives the whole thing — reads the same Supabase tables the UI reads, mutates them through tools, behaves according to `spec.md` + per-vertical `rules/*.md` files. The user can edit by hand. The bot can edit. Both converge on the same state.

This is a **guide, not a recipe**. Four non-negotiables:

1. **The chatbot is agentic, not a workflow** (§2)
2. **UI and chatbot share one source of truth** (§3)
3. **Server is dumb — behaviour lives in `spec.md` + `rules/*.md`** (§4)
4. **Security baseline applies to every vertical — always** (see "Security baseline" section)

Everything else — schema, tools, billing, galleries, phases — is a design decision per-usecase. When in doubt, check with Jon before committing.

---

## The end-user is NOT Jon

These apps are deployed for real customers — real estate agents, sales reps, course authors, clients. The `spec.md` + `rules/*.md` must be written for **that end-user**, not for Jon.

- **Identity and tone set for the end-user's context**, not Jon's
- **Onboarding addresses the user by their real name** (learned on first contact or from auth profile) — never hardcoded as "Jon"
- **Examples and copy in `spec.md` reflect the end-user's domain**
- **No references to internal tooling, LOG.md discipline, CLAUDE.md rules, or "Jon's preferences"** in anything the end-user can see
- **Multi-tenant assumptions apply from day one** — every query, every tool call, every storage path scoped to authenticated user

When a vertical has multiple classes of end-user (agents / brokers / clients):
- **>70% shared capabilities + tone** → one `spec.md` with role-specific sections (`## For Agents`, `## For Clients`)
- **Significant divergence** (different tools, tone, hard rules) → separate spec files: `spec-agent.md`, `spec-broker.md`, `spec-client.md`. Server picks which to load based on authenticated user's role.
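
A minimal sketch of the role-based spec selection described above. The role names mirror the agents / brokers / clients example; the `pickSpecFile` helper and the fallback to the shared `spec.md` are assumptions for illustration, not part of any reference build.

```typescript
// Hypothetical helper: map the authenticated user's role to the spec file to load.
const SPEC_BY_ROLE = new Map<string, string>([
  ["agent", "spec-agent.md"],
  ["broker", "spec-broker.md"],
  ["client", "spec-client.md"],
]);

function pickSpecFile(role?: string): string {
  // Assumption: unknown or missing roles fall back to the shared spec.
  // A stricter build could reject the request instead.
  return (role && SPEC_BY_ROLE.get(role)) || "spec.md";
}
```

The lookup stays a plain data table on purpose: choosing a file by role is routing, not business logic, so it does not violate "server is dumb".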

---

## When to use this rule

- *"Build me a [vertical] version of ai-content-gen"*
- *"A dashboard where the chatbot controls the UI"*
- *"A chatbot that generates a gallery of stuff for each [customer / property / topic / quote]"*
- *"Like PGE quoteflow-lean but for [X]"*

---

## §1. The 7-item widget UI contract (NON-NEGOTIABLE)

Every chatbot ships with **all seven** below. No exceptions, no "we'll add it later." If any one is missing at deploy time, the chatbot is incomplete — treat as blocker.

**Scope**: full-page dashboards AND same-origin inline widgets. Only exempt surface: truly cross-origin `<iframe>` embed where SSE is hostile to host CSP. "It's just a widget" is not an exemption.

| # | Requirement | Why | Notes |
|---|-------------|-----|-------|
| 1 | **Stream responses** (SSE `/api/chat/stream`) | 4s blank typing dot during 3 tool calls = broken UX | Exempt only for genuine cross-origin iframes |
| 2 | **Suggested-reply pills** after bot turns soliciting choice | Shows kinds of answers that are useful; cuts dead-end conversations | Bot emits inline `[Pill Text]` markers (see "Autopilot pattern"; the older `[OPTIONS: A \| B \| C]` line format is deprecated); frontend parses → clickable buttons. Disable after one click. |
| 3 | **Reset-chat button** | Users need an out when conversation derails | Tiny icon-button in header. Confirms once, then drops DOM messages + server-side session/history. Do NOT skip confirmation. |
| 4 | **Send button morphs to stop button while streaming** | User needs to abort long tool-use chain | Gold send-arrow → red stop-square when SSE connects. Click = `AbortController.abort()` + close SSE reader + push `[stopped]` marker. Reverts to send once stopped. |
| 5 | **Placeholder text never overflows input** | Clipped placeholder screams "prototype" | Max ~28 chars (*"Ask a question…"* not *"Ask about pricing, platform, case studies…"*). Longer context as hint above input. Test on 320px viewports. |
| 6 | **Sticky scroll (never yank)** | Yanking down mid-read is hostile | Wrap every mutation in `withStickyScroll(fn)`. Show a jump-to-latest pill when the user has scrolled out of range. |
| 7 | **Full markdown rendering** | Bot messages are always markdown | Headers, bold, italic, strikethrough, inline `code`, fenced ``` blocks, ordered + unordered lists (nested), `>` blockquotes, `\|` tables, `+++…+++` collapsibles, horizontal rules, auto-linked URLs, `[text](url)`, inline images. Custom ~100-line regex parser, no library. |

**Before marking a chatbot "done," walk this checklist out loud.** Paste table into LOG.md with ticks next to each.

### Chat input — Enter sends, Shift+Enter newline

```html
<small class="chat-hint">Enter to send, Shift+Enter for new line</small>
<textarea id="chat-input" rows="1" placeholder="Type a message…"></textarea>
```

```js
const chatTextarea = document.getElementById('chat-input');
chatTextarea.addEventListener('keydown', e => {
  if (e.key === 'Enter' && !e.shiftKey && !e.isComposing) {
    e.preventDefault();
    sendChat();
  }
});
chatTextarea.addEventListener('input', () => {
  chatTextarea.style.height = 'auto';
  chatTextarea.style.height = Math.min(chatTextarea.scrollHeight, 200) + 'px';
});
```

`!e.isComposing` matters for IME users (Chinese/Japanese/Korean) — without it, Enter during composition sends a half-composed message.

### Sticky scroll wiring

```js
const STICKY_THRESHOLD_PX = 80;
function isNearBottom(log) {
  return (log.scrollHeight - log.scrollTop - log.clientHeight) < STICKY_THRESHOLD_PX;
}
// Look both elements up once; the scroll listener below needs them at module scope
const log = document.getElementById('chat-log');
const jumpPill = document.getElementById('jump-to-latest');
function withStickyScroll(fn) {
  const wasStuck = isNearBottom(log);  // capture BEFORE the append
  const result = fn();
  if (wasStuck) log.scrollTop = log.scrollHeight;
  else jumpPill.classList.add('visible');
  return result;
}
log.addEventListener('scroll', () => {
  if (isNearBottom(log)) jumpPill.classList.remove('visible');
});
```

Wrap **every** function that mutates the chat log. User's own sent message is force-scroll (they just hit Enter — show it). Property-switch replay jumps to bottom (fresh view).

---

## §2. Agentic, not workflow (the most important rule)

Do **not** build as phase pipeline (`phase1_intake() → phase2_research() → phase3_build()`). That's a workflow — brittle, hard to change.

Build as **agent**:
- Bot has `spec.md` (identity + tone + hard rules) + set of `rules/*.md` it reads on demand
- Bot has **tools** it can call whenever it decides appropriate
- On each turn, bot looks at current project state + user message and **decides** what to do — call tool, ask question, summarize, wait. No preset order.
- `rules/*.md` **recommend** sequences ("for new listing, usually: intake → research → images → build"); bot uses judgment

**Signs you're drifting into workflow territory** (stop and refactor):
- Writing `if currentPhase == 'research': ...` in `server.ts`
- `rules/*.md` contain literal code or JSON step definitions
- Bot can't skip a phase even when obviously should
- Adding new capability requires editing server code, not just `.md` file

Reference: PGE quoteflow-lean is the cleanest example.

### §2a. Fat, discrete tools — the model picks parameters, not the tool

The agent principle above only works if the tool surface is **fat and parametric**, not a basket of mono-use functions. Build the model a **small set of broad, parameter-rich tools** and let it choose what to do — don't pre-decide for it by shipping 50 narrow ones.

**Hard rule.** Default ceiling: **~10–15 tools** for a mature build. If you're past 20, you're shipping a menu, not an agent.

**Fat over thin — the test.**

Bad (mono-use, hides judgment from the model):
- `create_linkedin_post`, `create_instagram_post`, `create_facebook_post`, `update_post_body`, `update_post_status`, `update_post_schedule`, `delete_post`, `archive_post`, `schedule_post`, `unschedule_post` …

Good (fat, parametric, one decision surface):
- `upsert_post(set_id, platform, body?, media_ids?, status?, scheduled_at?, mode='create'|'update'|'delete')` — covers all 10 above. The model writes one call with the parameters that matter for the user's actual ask.

Reference (social-copilot):
- `upsert_post` — create / update / delete / schedule across all 4 platforms
- `manage_images` — `mode: search | generate | select | attach | remove`
- `read_state` — broad context fetch with filters (project_id, set_id, kind, status, since, limit)
- `upsert_post_set`, `upsert_brand`, `update_brand_guide` — same pattern: one verb per entity, mode/level as a parameter

**Why this matters.**
1. **The model is smarter than your tool router.** It picks `mode='update'` correctly when the user says "change the caption" — you don't need to ship `update_caption` as its own function.
2. **Tool count is cognitive load on the model.** 40 narrow tools = the model spends turns deciding which to call. 10 fat tools = it spends turns deciding what to *do*.
3. **Schema drift stays contained.** New post-status? Add to the `status` enum on one tool, not a new `archive_post` function + handler + audit-log entry + UI button.
4. **The dashboard mirrors it.** UI verbs (drag-card, edit-inline, bulk-schedule) all hit the same fat tool the bot uses → one code path, one place for bugs.

**Anti-patterns to refuse:**
- One tool per HTTP verb (`get_X`, `post_X`, `put_X`, `delete_X`) — collapse to `upsert_X` + filter-rich `read_state`
- One tool per UI button — let the bot call the same fat tool the button hits
- One tool per "phase" of a workflow (`start_intake`, `complete_intake`, `start_research`) — phases are conversation moves, not tool calls
- "Convenience" tools that wrap one existing tool with hardcoded params — make the model pass the params

**When a narrow tool is actually justified:** truly distinct side-effect domain that doesn't compose with anything else (e.g., `send_whatsapp_message` is its own thing — not a mode of `upsert_post`). The test: would merging this into a fat tool require an `if mode == X` branch with ZERO shared code below it? Then keep it separate.
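
A fat tool such as `upsert_post` might be declared and dispatched roughly as below. The parameter list follows the signature quoted above and the `name` / `description` / `input_schema` layout is the Anthropic tool format; the `post_id` field, the `"draft"` default status, the in-memory `posts` map standing in for the Supabase table, and the `handleUpsertPost` name are all assumptions for illustration.

```typescript
// Sketch of one fat, parametric tool. Storage and defaults are hypothetical.
type Mode = "create" | "update" | "delete";

interface UpsertPostInput {
  set_id: string;
  platform: string;
  mode: Mode;
  post_id?: string;      // assumed: targets an existing post on update/delete
  body?: string;
  media_ids?: string[];
  status?: string;
  scheduled_at?: string; // ISO timestamp
}

interface Post {
  id: string;
  set_id: string;
  platform: string;
  body: string;
  media_ids: string[];
  status: string;
  scheduled_at: string | null;
}

// Tool definition handed to the model: one decision surface, many parameters.
const upsertPostTool = {
  name: "upsert_post",
  description: "Create, update, delete, or schedule a post. Pass only the fields that change.",
  input_schema: {
    type: "object",
    properties: {
      set_id: { type: "string" },
      platform: { type: "string" },
      mode: { type: "string", enum: ["create", "update", "delete"] },
      post_id: { type: "string" },
      body: { type: "string" },
      media_ids: { type: "array", items: { type: "string" } },
      status: { type: "string" },
      scheduled_at: { type: "string" },
    },
    required: ["set_id", "platform", "mode"],
  },
};

const posts = new Map<string, Post>(); // stand-in for the Supabase table
let nextId = 1;

function handleUpsertPost(input: UpsertPostInput): Post | { deleted: string } {
  if (input.mode === "create") {
    const post: Post = {
      id: `post_${nextId++}`,
      set_id: input.set_id,
      platform: input.platform,
      body: input.body ?? "",
      media_ids: input.media_ids ?? [],
      status: input.status ?? "draft",
      scheduled_at: input.scheduled_at ?? null,
    };
    posts.set(post.id, post);
    return post;
  }
  // update and delete share the lookup: the "shared code below the branch" test
  const existing = input.post_id ? posts.get(input.post_id) : undefined;
  if (!existing) throw new Error("post_id not found");
  if (input.mode === "delete") {
    posts.delete(existing.id);
    return { deleted: existing.id };
  }
  // update: merge only the fields the model actually passed
  const updated: Post = {
    ...existing,
    body: input.body ?? existing.body,
    media_ids: input.media_ids ?? existing.media_ids,
    status: input.status ?? existing.status,
    scheduled_at: input.scheduled_at ?? existing.scheduled_at,
  };
  posts.set(updated.id, updated);
  return updated;
}
```

"Change the caption" becomes one call with `mode: "update"` and `body`; "schedule it" is the same call with `status` and `scheduled_at`. A dashboard button can call the same handler, which keeps the single code path from §3.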

---

## §3. Manual + chatbot co-editing on shared state

What makes these apps feel alive.

1. **One source of truth per UI element.** Kanban card, project tile, form field = row (or JSON path) in Supabase. No separate "UI state" and "bot state".
2. **Every manual action is expressible as a tool call.** If user can drag card to "Done", bot has tool that does same `UPDATE`. Both code paths hit same SQL.
3. **Every bot action appears in UI immediately.** After tool call, server emits realtime event (Supabase Realtime or short polling) → dashboard re-renders without refresh.
4. **Bot narrates what it's doing.** Blockquote for process, headings for findings, bold for decisions. User can interrupt mid-task.
5. **Last-write-wins, but warn.** If user edits card while bot mid-operation: *"Assistant was updating this 3 seconds ago — your edit overrode it."* Don't try to merge.

The *shape* (kanban / project grid / table / timeline) is a usecase decision. The *principle* is not.
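
The last-write-wins warning from point 5 could be produced by a check like the one below. It assumes each row records who wrote it last and when; the `RowMeta` shape, the 10-second window, and the `overrideWarning` name are hypothetical.

```typescript
// Hypothetical row metadata: who wrote the row last, and when (epoch ms).
interface RowMeta {
  updated_by: "user" | "assistant";
  updated_at: number;
}

const WARN_WINDOW_MS = 10000; // assumed window; tune per vertical

// Returns a warning string when a manual edit lands right after a bot write, else null.
function overrideWarning(previous: RowMeta, now: number): string | null {
  if (previous.updated_by !== "assistant") return null;
  const ageMs = now - previous.updated_at;
  if (ageMs > WARN_WINDOW_MS) return null;
  const seconds = Math.max(1, Math.round(ageMs / 1000));
  const unit = seconds === 1 ? "second" : "seconds";
  return `Assistant was updating this ${seconds} ${unit} ago. Your edit overrode it.`;
}
```

The write itself still goes through unconditionally; the function only decides whether to show the notice, which keeps the "don't try to merge" rule intact.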

---

## SSE streaming — text deltas + tool events on one channel

Server-Sent Events on `/api/chat/stream` endpoint. Single typed event stream:

- `text_delta` — partial assistant text (re-render markdown on each delta)
- `tool_start` — input JSON, render streaming card immediately
- `tool_done` — output JSON + duration, swap card to success state
- `tool_error` — swap to error state
- `done` — final assistant message complete; show usage/cost footer if relevant
- `error` — request-level failure

**Why streaming matters here**: agentic loop often makes 3-6 tool calls per turn. Without streaming, user stares at typing indicator for 15+s. With streaming, watches each tool fire and response build live.
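
One way to keep the single typed stream honest is a discriminated union plus a serializer and a frame parser. The event names come from the list above; the payload field names (`id`, `name`, `input`, `output`, `duration_ms`, `usage`, `message`) are assumptions, as are the `formatSse` and `parseSseFrame` helper names.

```typescript
// Event names follow the list above; payload fields are illustrative assumptions.
type ChatEvent =
  | { type: "text_delta"; text: string }
  | { type: "tool_start"; id: string; name: string; input: unknown }
  | { type: "tool_done"; id: string; output: unknown; duration_ms: number }
  | { type: "tool_error"; id: string; message: string }
  | { type: "done"; usage?: { input_tokens: number; output_tokens: number } }
  | { type: "error"; message: string };

// Server side: one SSE frame per event. JSON.stringify never emits raw newlines,
// so the payload always fits on a single `data:` line.
function formatSse(event: ChatEvent): string {
  const { type, ...payload } = event;
  return `event: ${type}\ndata: ${JSON.stringify(payload)}\n\n`;
}

// Client side: parse one frame (the text between blank lines) back into name + data.
function parseSseFrame(frame: string): { event: string; data: unknown } | null {
  let event = "message";
  const dataLines: string[] = [];
  for (const line of frame.split("\n")) {
    if (line.startsWith(":")) continue; // comment line, e.g. a keep-alive
    if (line.startsWith("event:")) event = line.slice(6).trim();
    else if (line.startsWith("data:")) dataLines.push(line.slice(5).replace(/^ /, ""));
  }
  if (dataLines.length === 0) return null;
  return { event, data: JSON.parse(dataLines.join("\n")) };
}
```

A frontend reading the stream with `fetch` splits the buffered text on blank lines and passes each chunk to `parseSseFrame`, then switches on `event` to update the message bubble or the tool card.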

### Three traps that eat half your day

1. **`req.on('close')` is the WRONG event.** In Node ≥16 it fires when request body finishes uploading, not when connection closes. Watch `res.on('close')` instead — that's response-side socket close, what "user clicked stop / closed tab" maps to.
2. **Prime connection on first byte.** After `res.flushHeaders()`, write a single SSE comment line (`": ok\n\n"`) before doing anything else, then call `res.flush()` if it exists. Without prime, intermediaries (Cloudflare, nginx, Railway proxies) can hold the response buffer.
3. **Call `res.flush()` after every emit.** `.flush()` is added by the `compression` middleware and does not exist on a plain Express response, so guard it: `if (typeof res.flush === "function") res.flush();`. Without `compression` the guard skips the call, which is harmless. With `compression`, skipping the flush means events queue server-side and arrive in batches.
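
The three traps above, sketched against a minimal structural type rather than Express itself so the snippet runs standalone. `SseResponse` and `openSseStream` are hypothetical names; an Express `res` is expected to fit this shape, but check against the Express version in use.

```typescript
// Minimal shape covering only the response methods this sketch touches.
interface SseResponse {
  setHeader(name: string, value: string): void;
  flushHeaders(): void;
  write(chunk: string): boolean;
  flush?: () => void; // present only when compression middleware is installed
  on(event: "close", listener: () => void): void;
}

function openSseStream(res: SseResponse) {
  let closed = false;

  res.setHeader("Content-Type", "text/event-stream");
  res.setHeader("Cache-Control", "no-cache");
  res.flushHeaders();

  // Trap 1: listen on the RESPONSE. This is what "user clicked stop / closed tab" maps to.
  res.on("close", () => { closed = true; });

  // Trap 3 helper: guarded flush, safe with or without compression.
  const flush = (): void => {
    if (typeof res.flush === "function") res.flush();
  };

  // Trap 2: prime the connection with a comment line so proxies release the buffer.
  res.write(": ok\n\n");
  flush();

  return {
    isClosed: (): boolean => closed,
    emit(event: string, data: unknown): void {
      if (closed) return; // client went away, stop writing
      res.write(`event: ${event}\ndata: ${JSON.stringify(data)}\n\n`);
      flush();
    },
  };
}
```

The agent loop can poll `isClosed()` between tool calls and bail out early, which is what makes the stop button actually stop spend.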

### The flex-column scroll trap (the chat doesn't scroll!)

If your chat layout is `display: grid` → `.chat-side { display: flex; flex-direction: column }` → `.chat-log { flex: 1; overflow-y: auto }`, chat-log grows to fit content and `overflow-y: auto` never triggers.

```css
.chat-side  { display: flex; flex-direction: column; min-height: 0; overflow: hidden; }
.chat-head  { flex-shrink: 0; }
.chat-log   { flex: 1 1 0; min-height: 0; overflow-y: auto; }
.chat-input-wrap { flex-shrink: 0; }
```

Crucial: `min-height: 0` on **both** chat-side container AND chat-log child. Without it, flex child's intrinsic minimum size is its content height — always grows, never scrolls.

---

## Tool-call rendering — collapsed cards with icons

Every tool call = first-class chat element, NOT hidden in console:
- **States**: `streaming` (spinner + accent border), `success` (icon + duration), `error` (red icon + bg)
- **Collapsed by default** — click to expand
- **Expanded view**: `Request` (input JSON) + `Response` (truncated to ~1000 chars)
- **Per-tool icon** via `toolIcon(name)` map → SVG
- **Inline action affordances**: if tool result includes `download_url` or `external_url`, render button under card

```css
.tool-call { padding: 4px 8px; cursor: pointer; border-radius: 4px;
             background: rgba(var(--accent-rgb), 0.1); }
.tool-call.collapsed .tool-detail { display: none; }
.tool-call.streaming { border-left: 2px solid var(--accent); }
.tool-call.error { color: var(--danger); background: rgba(var(--danger-rgb), 0.1); }
```

---

## §4. Server is dumb; `spec.md` + `rules/*.md` are the brain

Server's only job: route chat messages to Claude with right system prompt, execute tools bot calls, persist state to Supabase, stream response back. That's it.

- **Zero conditional business logic** in `server.ts` / `main.py`. No `if user_says_hi: ...`. No onboarding state machines.
- **Onboarding, tone, upsells, refusals, phase recommendations, error messaging — all in `spec.md` + `rules/*.md`**. Server reads them, glues them into system prompt, gets out of way.
- **Adding new behaviour = editing `.md` file**, not writing code.

Minimum `spec.md` coverage (≤ ~2000 tokens):
- Identity, tone, mission
- Hard rules the bot must never break
- When to ask vs when to act (autopilot defaults)
- When to call which category of tool
- Error handling and refusals
- Upsell / billing copy (if billing)

Long-form domain rules go in `rules/*.md`, loaded on demand via `read_rule` tool.
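
A sketch of what "glue them into the system prompt, then get out of the way" can look like. The `RuleFile` shape, the rule index appended after the spec, and the `buildSystemPrompt` / `readRule` names are assumptions; the only behaviour taken from the text is that `spec.md` is always loaded and rules are fetched on demand through a `read_rule` tool.

```typescript
// Hypothetical in-memory view of spec.md + rules/*.md.
interface RuleFile {
  name: string;
  kind: "spec" | "rule";
  body: string;
  description?: string;
}

// The spec goes in whole; rules are only listed, so the bot knows what it can load.
function buildSystemPrompt(files: RuleFile[]): string {
  const spec = files.find((f) => f.kind === "spec");
  if (!spec) throw new Error("spec is missing");
  const index = files
    .filter((f) => f.kind === "rule")
    .map((f) => `- ${f.name}: ${f.description ?? "(no description)"}`)
    .join("\n");
  return `${spec.body}\n\n## Available rules (load with read_rule)\n${index}`;
}

// Handler behind the read_rule tool: returns the rule body, or a readable miss.
function readRule(files: RuleFile[], name: string): string {
  const rule = files.find((f) => f.kind === "rule" && f.name === name);
  return rule ? rule.body : `No rule named "${name}".`;
}
```

Neither function branches on what the user said. All behaviour stays in the `.md` bodies, so a new capability is a new rule file plus a one-line description.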

---

## §4.5. Live-editable `spec.md` + `rules/*.md` via Supabase + `/admin` panel (HARD RULE)

Every new chatbot's system prompt + on-demand rule files live in a Supabase table, NOT just on deployed filesystem. Edited live via `/admin` route on FastAPI service. Edits propagate to bot in ≤60s without redeploy.

**Why non-negotiable**:
- Stale customer names, deprecated emails, wrong pricing happen *between deploys*. Filesystem-only = every copy edit triggers deploy cycle + code review. Friction high enough that bot says out-of-date things for days.
- Separate content from code, the way a CMS did for marketing pages 15 years ago.

### Architecture

```
admin browser ──HTTP Basic──▶ FastAPI /admin ──▶ kb_files (Supabase)
                                                       │
chat turn ──/api/chat──▶ kb.get_spec() ──▶ 60s in-memory cache ──▶ kb_files
                          │                              ▲
                          └──── filesystem fallback ─────┘
                                (chatbot/spec.md, rules/*.md)
                                only kicks in if Supabase unreachable
```

- **Source of truth**: `kb_files` table
- **Filesystem `.md`**: seed-only. Useful for `git diff` of initial state. Live edits don't sync back to git.
- **Cache**: 60s in-process LRU per pod. A save busts the cache on the pod that handled it; other pods catch up within 60s.
- **Auth**: HTTP Basic, single `ADMIN_PASSWORD` env var, username `admin` (or `ADMIN_USER`)
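
The read path above (60s cache, Supabase first, filesystem only when Supabase is unreachable) can be sketched as a small loader. The guide's reference service is FastAPI with `kb.get_spec()`; this TypeScript version only illustrates the logic. `createCachedLoader`, the injected `primary` / `fallback` functions, and the injectable clock are hypothetical.

```typescript
type Loader = () => Promise<string>;

// primary  = read the row from kb_files
// fallback = read the seed .md file from disk
function createCachedLoader(
  primary: Loader,
  fallback: Loader,
  ttlMs: number = 60000,
  now: () => number = Date.now,
) {
  let cached: { value: string; at: number } | null = null;

  return {
    async get(): Promise<string> {
      // Serve from memory while the entry is younger than the TTL.
      if (cached && now() - cached.at < ttlMs) return cached.value;
      try {
        const value = await primary();
        cached = { value, at: now() };
        return value;
      } catch {
        // Supabase unreachable: serve the seed copy. It is not cached,
        // so the next turn retries Supabase.
        return fallback();
      }
    },
    // Called by the admin save handler on the pod that handled the save.
    bust(): void {
      cached = null;
    },
  };
}
```

Leaving the fallback uncached is a design choice in this sketch: the bot returns to live content as soon as Supabase recovers, at the cost of one failed request per turn during an outage.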

### Schema (idempotent — same in every project)

```sql
create table if not exists kb_files (
  name        text primary key,        -- 'spec', '00-onboarding', '01-pricing', etc.
  kind        text not null check (kind in ('spec', 'rule')),
  body        text not null,
  description text,
  updated_at  timestamptz not null default now(),
  updated_by  text
);
create table if not exists kb_history (
  id         bigserial primary key,
  name       text not null references kb_files(name) on delete cascade,
  body       text not null,
  saved_at   timestamptz not null default now(),
  saved_by   text
);
create index if not exists kb_history_name_idx on kb_history(name, saved_at desc);
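-- RLS on with NO policies: anon / authenticated keys get nothing through PostgREST,
-- while the server's service-role key bypasses RLS entirely.
alter table kb_files   enable row level security;
alter table kb_history enable row level security;
```

The bot's FastAPI server talks to these tables with the service-role key, which bypasses RLS. RLS stays enabled with no policies so the browser-exposed anon key cannot read or write `kb_files` through Supabase's REST API; admin auth still happens at the route layer.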

### Which Supabase project for `kb_files`

**Do NOT create new Supabase project just to host `kb_files`.**

1. **If chatbot already uses a Supabase project** (leads, conversations, claude_messages, auth) → put `kb_files` + `kb_history` in that same project. Multiple chatbot tables coexist fine.
2. **If chatbot doesn't use a Supabase project yet** → default to `SUPABASE_URL_EAVCHATBOT` (`ssbfuqjsgistzvqgnfvu`). Land kb tables there along with any leads / conversations tables.
3. **Only create new Supabase project after explicit user permission.** Each project is billable.

### Required endpoints (FastAPI side)

| Method | Path | Purpose |
|--------|------|---------|
| `GET`  | `/admin` | Static HTML editor (auth-gated) |
| `GET`  | `/admin/api/kb` | List all kb_files rows |
| `GET`  | `/admin/api/kb/{name}` | Get one file's body |
| `PUT`  | `/admin/api/kb/{name}` | Update body, snapshot prior to kb_history |
| `GET`  | `/admin/api/kb/{name}/history` | List historical versions |
| `POST` | `/admin/api/kb/{name}/restore/{history_id}` | Roll back to prior version |

CORS `allow_methods` must include `PUT` for `/admin/api/kb/{name}`.

### Deploy steps (new chatbot)

1. Apply migration to chosen Supabase project (Management API: `POST /v1/projects/{ref}/database/query`)
2. Run seed script — populates `kb_files` from on-disk `spec.md` + `rules/*.md`
3. Set `ADMIN_PASSWORD` on Railway (and optionally `ADMIN_USER`)
4. Push code → Railway auto-deploys
5. Verify: `GET /admin` returns 401 without auth, 200 with auth; `GET /admin/api/kb` lists every seeded row

Bot reads via `kb.get_spec()` / `kb.get_rule(name)` instead of filesystem reads.

### When NOT to use

- One-off bots that won't outlive demo (internal dashboard chat, debug widget). Migration + admin panel is ~10min setup; for ephemeral tools it's overkill.
- Bots whose system prompt is generated dynamically per request (user-personalised system prompts assembled from row data). No static `spec.md` to edit.

---

## EAV knowledge management (shared schema for demos / MVPs)

For demos / pre-production prototypes / pitches that need a remote DB but don't have their own data lifecycle yet, use the **shared EAV chatbot Supabase** (`SUPABASE_URL_EAVCHATBOT`):

- Entity-Attribute-Value schema: `entities`, `attributes`, `values`
- JSONB column on entity table for arbitrary metadata
- Multi-tenant by `tenant_id` or `app_id`
- Use for client demos, MVPs, internal tools

**Server is dumb — EAV holds the structure**. All flow / rules / onboarding live in `spec.md`, NOT in `server.js`.

For full production with own users / billing / compliance scope → new dedicated Supabase project (with explicit permission).
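
To make the EAV layout concrete, here is a sketch of turning rows from the three tables into one flat object for the UI or a tool result. The table names come from the list above; every column name (`tenant_id`, `metadata`, `entity_id`, `attribute_id`, `value`) and the `hydrate` helper are assumptions about the shared schema, not its actual definition.

```typescript
// Assumed row shapes for the three EAV tables.
interface Entity {
  id: string;
  tenant_id: string;
  metadata: Record<string, unknown>; // the JSONB column
}
interface Attribute {
  id: string;
  name: string;
}
interface Value {
  entity_id: string;
  attribute_id: string;
  value: string;
}

// Pivot attribute/value rows into { attributeName: value } for one entity.
function hydrate(entity: Entity, attributes: Attribute[], values: Value[]): Record<string, unknown> {
  const nameById = new Map<string, string>();
  for (const a of attributes) nameById.set(a.id, a.name);

  const out: Record<string, unknown> = { ...entity.metadata, id: entity.id };
  for (const v of values) {
    if (v.entity_id !== entity.id) continue; // ignore rows belonging to other entities
    const name = nameById.get(v.attribute_id);
    if (name !== undefined) out[name] = v.value;
  }
  return out;
}
```

Tenant scoping belongs in the query that fetches these rows (filter on `tenant_id` or `app_id`), not in the pivot.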

---

## Security baseline (apply to EVERY vertical)

These are not usecase decisions. Every chatbot-dashboard vertical ships with all of them. The canonical reference is `SimplaDocs/clients/PGE/pge-quoteflow-lean/` — `src/server.ts` (helmet, CORS, rate limit, auth middleware) and `src/db.ts` (sanitizeSearch, service-role client).

### Non-negotiables

1. **Helmet with a real CSP.** Lock `script`, `style`, `img`, `font`, `connect` sources to `'self'` plus only the specific external fonts/CDNs this vertical actually loads. No wildcards.
2. **CORS same-origin only** — `cors({ origin: false })`. The API is not a public endpoint.
3. **Explicit JSON body cap** — `express.json({ limit: "1mb" })` (or up to 10mb if uploads). Never unlimited.
4. **Rate-limit every route that calls a paid LLM API.** `/api/chat` and any tool endpoint that hits Anthropic / OpenAI / ElevenLabs gets `express-rate-limit` (or slowapi on FastAPI). Default: 30 req/min per IP. This is a spend control as much as security.
5. **Auth gate on `/api/*`.** Either a Supabase OTP session (preferred for customer-facing) or a `Bearer ${AUTH_TOKEN}` env-gated middleware for internal demos. **Never ship with `/api/*` fully open.**
6. **Supabase keys split correctly.** `SUPABASE_SERVICE_ROLE_KEY` is server-side only and never reaches the browser. Only `SUPABASE_ANON_KEY` is exposed to the frontend.
7. **Sanitize every string that flows into a PostgREST `.or()` / `.ilike()` filter.** Strip `,.()"'\` before interpolation. Tool-call arguments come from the LLM — treat them as untrusted user input.
8. **RLS on every user-owned table.** Multi-tenant from day one. Every row is `user_id`-scoped and every policy is enforced in SQL. Defence in depth.
9. **Never commit secrets.** `.env` gitignored, `.env.example` lists keys without values, Railway holds production env. Grep before every commit.
10. **SSE-safe error handling.** Once `res.flushHeaders()` is called on a streaming route, errors must be emitted as an SSE `error` event, not `res.status(500).json(...)`. Always check `res.headersSent` before responding with an error.

### Reference patterns — copy verbatim, adjust per vertical

#### Helmet + CSP (Express)

```ts
import helmet from "helmet";

app.use(helmet({
  contentSecurityPolicy: {
    directives: {
      defaultSrc:    ["'self'"],
      scriptSrc:     ["'self'", "'unsafe-inline'"],   // only if vertical renders inline <script> blocks
      scriptSrcAttr: ["'unsafe-inline'"],              // only if vertical uses inline onclick= handlers
      styleSrc:      ["'self'", "'unsafe-inline'", "https://fonts.googleapis.com"],
      fontSrc:       ["'self'", "https://fonts.gstatic.com"],
      imgSrc:        ["'self'", "data:"],
      connectSrc:    ["'self'"],                      // add the Supabase project URL if the browser talks to Supabase directly
    },
  },
}));
```

`'unsafe-inline'` on `scriptSrc` / `scriptSrcAttr` is a concession for server-rendered dashboards that still use inline handlers. Drop it the moment the vertical can live without them.

#### CORS + body cap

```ts
app.use(cors({ origin: false }));
app.use(express.json({ limit: "1mb" }));
```

#### Rate limiter on the chat route

```ts
import rateLimit from "express-rate-limit";

const chatLimiter = rateLimit({
  windowMs: 60 * 1000, // 1 minute
  max: 30,             // 30 requests per minute per IP
  standardHeaders: true,
  legacyHeaders: false,
  message: { error: "Too many requests — please wait a moment" },
});

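app.post("/api/chat", chatLimiter, async (req, res) => { /* ... */ });
app.post("/api/chat/stream", chatLimiter, async (req, res) => { /* ... */ }); // the streaming route spends tokens too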
```

Tune `max` per vertical, but never remove the limiter. A loop bug in the frontend + an open `/api/chat` = a five-figure Anthropic bill in an afternoon.

#### Auth middleware (internal-demo pattern)

```ts
const AUTH_TOKEN = process.env.AUTH_TOKEN?.trim() || null;

app.use("/api", (req, res, next) => {
  if (!AUTH_TOKEN) return next();
  const header = req.headers.authorization;
  if (header === `Bearer ${AUTH_TOKEN}`) return next();
  res.status(401).json({ error: "Unauthorized" });
});
```

For customer-facing verticals, replace this with a Supabase OTP session check. The `Bearer ${AUTH_TOKEN}` pattern is fine for internal demos, client PoCs, and single-operator tools — but `AUTH_TOKEN` must always be set in production. The "if unset, routes are open" fallback is a development convenience only.
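
If the open-when-unset fallback is a worry, the decision can be pulled into a pure function that fails closed in production. This is a sketch, not the reference implementation: `decideAuth`, `safeEqual`, and the `"misconfigured"` outcome are hypothetical. In Node, `crypto.timingSafeEqual` on equal-length buffers is the usual choice for the comparison; the hand-rolled loop below only keeps the snippet dependency-free.

```typescript
type AuthDecision = "allow" | "deny" | "misconfigured";

// Compare without stopping at the first differing character.
// The early length check reveals only the expected length.
function safeEqual(a: string, b: string): boolean {
  if (a.length !== b.length) return false;
  let diff = 0;
  for (let i = 0; i < a.length; i++) diff |= a.charCodeAt(i) ^ b.charCodeAt(i);
  return diff === 0;
}

function decideAuth(
  authHeader: string | undefined,
  token: string | null,
  isProduction: boolean,
): AuthDecision {
  // Unset token: open in development, refused in production.
  if (!token) return isProduction ? "misconfigured" : "allow";
  if (authHeader !== undefined && safeEqual(authHeader, `Bearer ${token}`)) return "allow";
  return "deny";
}
```

In the middleware, `"deny"` maps to the existing 401, and `"misconfigured"` can map to a 500 or, better, a refusal to boot when `NODE_ENV` is `production` and `AUTH_TOKEN` is empty.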

#### PostgREST filter sanitizer

```ts
// src/db.ts
/** Strip characters that could inject PostgREST filter operators */
export function sanitizeSearch(input: string): string {
  return input.replace(/[,.()"'\\]/g, "").trim();
}

// usage
const q = `%${sanitizeSearch(query)}%`;
supabase.from("quotes").select("*").or(`customer_name.ilike.${q},customer_ref.ilike.${q}`);
```

Every tool handler that takes a `search` / `query` / name-fragment string must pass it through `sanitizeSearch()` before interpolation. The LLM generates these arguments — `,` and `)` in a tool input are a PostgREST injection, not a typo.

#### Supabase client split

```ts
// src/db.ts — SERVICE ROLE, server-only
import { createClient } from "@supabase/supabase-js";

export const supabase = createClient(
  process.env.SUPABASE_URL!,
  process.env.SUPABASE_SERVICE_ROLE_KEY!
);
```

The service-role key bypasses RLS, which is why non-negotiable #8 (RLS on every table) still applies: RLS catches the day a tool handler forgets to filter by `user_id`. That day is coming.

Frontend code uses `SUPABASE_ANON_KEY` (and only the anon key) when it talks to Supabase directly for auth.

#### SSE-safe error handling

```ts
app.post("/api/chat/stream", async (req, res) => {
  try {
    res.setHeader("Content-Type", "text/event-stream");
    res.flushHeaders();
    // ... emit events
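  } catch (err) {
    // `err` is `unknown` under strict TypeScript, so narrow it before reading `.message`
    const message = err instanceof Error ? err.message : String(err);
    if (res.headersSent) {
      // already streaming: emit an SSE error event, don't try to write a JSON status
      res.write(`event: error\ndata: ${JSON.stringify({ error: message })}\n\n`);
      res.end();
    } else {
      res.status(500).json({ error: message });
    }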
  }
});
```

### Quick-reference table

| Area | Rule |
|------|------|
| **Auth** | Supabase OTP (email/phone). No passwords. Per `supabase-railway-stack` skill. |
| **RLS** | Every user-owned table has `auth.uid() = user_id` policy. No exceptions. |
| **Service-role key** | Server-only. Never exposed to client. Use only for tool execution. |
| **Anon key** | Client-side auth flows only. Never grants table access without RLS. |
| **Admin route** | HTTP Basic + single `ADMIN_PASSWORD` env var. No public access. |
| **CORS** | `cors({ origin: false })`. Do not `allow_origins: ["*"]` in production. |
| **Helmet + CSP** | Lock script/style/img/font/connect sources. No wildcards. |
| **Body cap** | `express.json({ limit: "1mb" })`. Never unlimited. |
| **Rate limit** | `express-rate-limit` 30/min on every paid-LLM route. |
| **PostgREST sanitize** | `sanitizeSearch()` on every string into `.or()` / `.ilike()`. |
| **Tool execution** | Server validates user has access to entity being mutated before calling tool. |
| **PII** | Never log full user messages with PII to shared log. Per-tenant logging. |
| **SSE errors** | Check `res.headersSent` before responding with error. |
| **Secret leakage** | Never put any secret in system prompt or `rules/*.md`. Env vars only. |

### Things the LLM will try to skip (anti-patterns to refuse)

- **"It's just a demo, skip rate limits."** No. Demos run on Jon's Anthropic bill. The 30 req/min limiter is a spend control.
- **"CORS `origin: '*'` is easier."** No. Same-origin only.
- **"I'll sanitize the search strings later."** No. Every `.or()` / `.ilike()` with an interpolated string is a PostgREST injection unless sanitized at the call site.
- **"Tool results go straight into `innerHTML` because the bot wouldn't generate anything bad."** No. If any tool result is ever rendered as HTML, it goes through a sanitizer (DOMPurify or equivalent) regardless of source.
- **"Let me put `SUPABASE_SERVICE_ROLE_KEY` in a `NEXT_PUBLIC_*` / `VITE_*` var."** No. The service-role key never leaves the server. Stop the moment the LLM suggests this.
- **"RLS isn't needed because the server always filters by `user_id`."** No. Until one tool handler forgets — and then it's a cross-tenant leak.
- **"AUTH_TOKEN can stay unset in prod, we'll add auth later."** No. `/api/*` is never open in production.
- **"Skip Helmet because it's just a dev API."** No. Set it from the first deploy.
- **"Wrap SSE writes in try/catch and write `res.status(500).json(...)` from the catch."** No. Check `res.headersSent` first — after `flushHeaders()` you can't send a JSON status, only an SSE `error` event.

---

## Other flourishes worth borrowing

- **Animated thinking dot** before first event arrives, removed when text/tool starts
- **Smart scroll threshold** of ~150px (not 80px) for streaming chats — more room for user to read history
- **Token/cost footer** on final message in dev mode (hide in prod unless user is operator)
- **Per-message timestamp** as hover affordance, not always visible
- **Click tool card → toggle `.collapsed`** — pure CSS class swap, no JS animation library

## Autopilot pattern (revised May 2026, from PGE QuoteFlow)

**The default for any workflow-driving dashboard chatbot.** Reference impls: `clients/PGE/pge-quoteflow-lean` (RFQ agent) and `projects/social-copilot/` (social-media management).

### 1. Welcome card on empty chat-log
Title + tagline + 2–4 clickable starter suggestions (`<button class="suggestion" data-suggestion="...">...</button>`). Replaces a cold blinking cursor. See `pge-quoteflow-lean/server.ts:315`.

### 2. Inline `[Pill Text]` syntax in bot messages
Any `[text]` not followed by `(` is a pill. Frontend regex `/\[([^\]]+)\](?!\()/g` → `<span class="option-pill" data-pill="text">text</span>`. Click sends pill text back as a chat message. **Avoid the older `[OPTIONS: A | B | C]` on-its-own-line format** — inline is more flexible: *"Want me to draft the post or check the brand voice first? [Draft post] [Check brand voice]"*.
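
The pill regex above, wired into two small helpers. The regex and the `option-pill` / `data-pill` markup are taken from the text; `extractPills`, `renderPills`, and `escapeHtml` are hypothetical names, and the escaping step is an added precaution rather than something the reference frontend is known to do.

```typescript
// Regex from the text: any [text] NOT followed by "(" (which would make it a markdown link).
const PILL_RE = /\[([^\]]+)\](?!\()/g;

function escapeHtml(s: string): string {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// List the pill labels found in a bot message.
function extractPills(message: string): string[] {
  const pills: string[] = [];
  message.replace(PILL_RE, (match: string, text: string) => {
    pills.push(text);
    return match;
  });
  return pills;
}

// Swap each pill marker for the clickable span.
function renderPills(message: string): string {
  return message.replace(PILL_RE, (_match: string, text: string) => {
    const safe = escapeHtml(text);
    return `<span class="option-pill" data-pill="${safe}">${safe}</span>`;
  });
}
```

Because the pattern is this loose, anything bracketed becomes a pill, including `arr[0]` in prose or brackets inside code blocks. Running the pill pass only on text outside code spans avoids that.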

### 3. Spec rule: every turn ends with pills when there's a next action
Quote from PGE spec.md: *"ALWAYS end your response with clickable option pills when there are next actions. Never write 'Would you like me to proceed?' without also providing pills. The pills ARE the options."*

### 4. Three-state processing_mode (NOT binary)

```
workspace.settings.processing_mode = 'ask' | 'supervised' | 'auto_process'
```

Read at the top of every turn, injected into the system prompt header. Resolved in code (e.g. `instructions.ts:resolveProcessingMode`).

- **`ask`** — confirm every action, wait for pill approval before any mutation. Use for new users / delicate campaigns.
- **`supervised` (default)** — explain + proceed on routine actions (drafting, querying, generating). Pause for approval on risky (publishing, sending real emails, > $1 API spend in one turn).
- **`auto_process`** — execute and narrate. Make forward progress, don't block on unknowns. Apply best-guess defaults + flag for review + keep moving. Pause only for irreversible operations the user hasn't pre-approved or > $5 single-turn API spend.
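
A sketch of the "read at the top of every turn, inject into the system prompt header" step. The three mode values and the `supervised` default come from the list above; the header wording, the `MODE_SUMMARY` text, and `processingModeHeader` are assumptions, and this `resolveProcessingMode` only imitates the role of the function named in `instructions.ts`.

```typescript
type ProcessingMode = "ask" | "supervised" | "auto_process";

// One-line reminders for the prompt header. The full policy stays in spec.md.
const MODE_SUMMARY: Record<ProcessingMode, string> = {
  ask: "Confirm every action and wait for pill approval before any mutation.",
  supervised: "Explain and proceed on routine actions; pause for approval on risky ones.",
  auto_process: "Execute and narrate; pause only for irreversible or high-spend operations.",
};

// Unknown or missing values fall back to the documented default, `supervised`.
function resolveProcessingMode(raw: unknown): ProcessingMode {
  if (raw === "ask") return "ask";
  if (raw === "auto_process") return "auto_process";
  return "supervised";
}

// Builds the block prepended to the system prompt on each turn.
function processingModeHeader(
  settings: { processing_mode?: unknown } | null | undefined,
): string {
  const mode = resolveProcessingMode(settings?.processing_mode);
  return `Processing mode: ${mode}\n${MODE_SUMMARY[mode]}`;
}
```

The server only resolves the value and states it. Whether a given action counts as "risky" remains the bot's call under `spec.md`, which keeps this inside the server-is-dumb rule.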

### 5. Raw thinking, not narration prose

Use Anthropic's `thinking: { type: 'enabled', budget_tokens: 4000 }` parameter (a budget of roughly 4,000 tokens). Stream `content_block_delta` events with `delta.type === 'thinking_delta'` to the frontend as a collapsed-by-default **"💭 Reasoning"** card. `spec.md` should **explicitly forbid** the bot writing `> blockquote` narration about what it's about to do — thinking does that work for free. The bot's final text response goes straight to **findings → confidence dots → pills**.

Trade-offs:
- ✅ Authentic — what the model actually thought, not a polished retell
- ✅ Better tool-use decisions in agentic loops (more reasoning per iteration)
- ✅ Skimmable — one collapsed card per turn instead of N blockquote lines
- ⚠️ Thinking tokens are billed; usually offset by shorter prose output
- ⚠️ Requires Anthropic SDK ≥ 0.97 + `claude-opus-4-7` (or any model with thinking support)

Implementation in `chat.ts`:
```ts
let thinkingOpen = false;                                  // true while a thinking block is streaming
const stream = anthropic.messages.stream({
  model: env.CLAUDE_MODEL,
  max_tokens: 16000,                                       // must exceed budget_tokens
  thinking: { type: 'enabled', budget_tokens: 4000 },
  ...
});
stream.on('streamEvent', (event) => {
  if (event.type === 'content_block_start' && event.content_block?.type === 'thinking') {
    thinkingOpen = true;
    emit({ type: 'thinking_start' });
  } else if (event.type === 'content_block_delta' && event.delta?.type === 'thinking_delta') {
    emit({ type: 'thinking_delta', text: event.delta.thinking });
  } else if (event.type === 'content_block_stop' && thinkingOpen) {
    thinkingOpen = false;                                  // only close the card for a thinking block
    emit({ type: 'thinking_done' });
  }
});
```

### 6. Narration discipline (when bot DOES write prose)

PGE spec.md is the canonical reference. Required elements:
- **Headings** (`### Section`) labelling every part of the response
- **Bold** for key identifiers (client names, dollar amounts, statuses, dates)
- **`inline code`** for slugs / IDs / part numbers / hashtags
- **Italics** for supplementary context
- **Confidence dots** 🟢 (high) / 🟡 (medium) / 🔴 (low) on every key finding so the user can skim
- **Collapsible `+++Summary\nbody\n+++`** — if a section has a `###` heading, make it collapsible. Exceptions: opening summary, 1-3 sentence recommendation, pill row, errors.
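
One way the frontend might handle the `+++` blocks is sketched below. Rendering them as native `<details>` elements is an assumption (the spec only fixes the `+++Summary\nbody\n+++` syntax), and `renderCollapsibles` is a hypothetical name.

```typescript
// Assumption: collapsibles render as native <details> elements.
// The opening line is "+++Summary", the closing line is a bare "+++".
const COLLAPSIBLE_RE = /^\+\+\+(.+)\n([\s\S]*?)\n\+\+\+$/gm;

// Input is expected to be HTML-escaped already.
function renderCollapsibles(escaped: string): string {
  return escaped.replace(
    COLLAPSIBLE_RE,
    (_match: string, summary: string, body: string) =>
      `<details><summary>${summary.trim()}</summary>\n${body}\n</details>`,
  );
}
```

The body is left untouched so later passes (headings, bold, pills) can still run over it.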

### 7. Two-part response structure (was three parts before raw thinking)

1. **Findings** — heading + bullets + confidence dots
2. **Decision** — pills

(In the old narration-prose pattern there was a 3rd part: opening narration blockquote. Killed by raw thinking.)

### 8. Deep links

When referencing a specific entity (client, post, post-set, render job, etc.), use `[[entity:id:label]]` syntax. Frontend regex captures it, renders as clickable, opens the relevant tab/modal. Examples:
- `[[client:irvines:View Irvines]]`
- `[[post:UUID:View Post]]`
- `[[set:UUID:View Set]]`

Backend doesn't need to know about this — purely a frontend rendering convention.
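
A sketch of the frontend side, assuming links render as anchors carrying `data-entity` / `data-id` attributes that a click handler uses to open the right tab or modal. The regex and the `renderDeepLinks` name are illustrative; the source only fixes the `[[entity:id:label]]` syntax.

```typescript
// Matches [[entity:id:label]]; the label may contain colons, the id may not.
const DEEP_LINK_RE = /\[\[([a-z][a-z0-9_-]*):([^:\]]+):([^\]]+)\]\]/g;

// Input is expected to be HTML-escaped already.
function renderDeepLinks(escaped: string): string {
  return escaped.replace(
    DEEP_LINK_RE,
    (_match: string, entity: string, id: string, label: string) =>
      `<a href="#" class="deep-link" data-entity="${entity}" data-id="${id}">${label}</a>`,
  );
}
```

This pass should run before the single-bracket pill regex, since `/\[([^\]]+)\](?!\()/g` would otherwise match the inner `[entity:id:label]` of a deep link and turn it into a pill.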

### 9. Storing learnings — `save_instruction` tool

When the user corrects you ("actually do Y instead") or gives a new rule:
1. Acknowledge + apply immediately
2. Ask "Should I save this so I remember next time? [Yes] [No]"
3. If yes → `save_instruction({ category, title, body, project_id? })`

Categories: `edge_case`, `client_note`, `vendor_note`, `user_preference`, `process_note`.

Implementation: workspace-scoped (or project-scoped) `chatbot_instructions` table. `buildInstructionContext(workspaceId, projectId)` is called at the top of every chat turn and appends an "## Persistent instructions" block to the system prompt. New instructions take effect on the next turn — no restart, no redeploy.
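
The prompt-building half of that can be sketched as a pure formatter. The row shape and the name `formatInstructionContext` are assumptions for illustration; the real `buildInstructionContext` would first load the rows for the workspace from Supabase.

```typescript
type InstructionCategory =
  | 'edge_case'
  | 'client_note'
  | 'vendor_note'
  | 'user_preference'
  | 'process_note';

// Hypothetical row shape for the chatbot_instructions table.
interface InstructionRow {
  category: InstructionCategory;
  title: string;
  body: string;
  project_id: string | null; // null = applies to the whole workspace
}

// Keeps workspace-wide rows plus rows for the current project, then
// formats them as the block appended to the system prompt.
function formatInstructionContext(
  rows: InstructionRow[],
  projectId: string | null,
): string {
  const relevant = rows.filter(
    (r) => r.project_id === null || r.project_id === projectId,
  );
  if (relevant.length === 0) return '';
  const lines = relevant.map((r) => `- (${r.category}) **${r.title}**: ${r.body}`);
  return `\n\n## Persistent instructions\n${lines.join('\n')}`;
}
```

Because the block is rebuilt from the table on every turn, a row saved mid-conversation shows up in the very next system prompt.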

Reference impl: `projects/social-copilot/app/src/tools/instructions.ts` (Social Studio) + `pge-quoteflow-lean` `get_instructions`/`create_instruction` tools.

### 10. Sticky scroll + jump pill

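Wrap every update to the chat log in `withStickyScroll(fn)` so the view stays pinned to the latest message only when the user was already at the bottom. A jump-to-latest pill appears when the user has scrolled up.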
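
A minimal sketch of the sticky-scroll helper. It differs from the one-argument `withStickyScroll(fn)` named above by taking the scroll container explicitly, which is a choice made here to keep the example testable; the 40px threshold and the return value are likewise assumptions.

```typescript
// Minimal subset of HTMLElement used here, so the helper also works in tests.
interface ScrollBox {
  scrollTop: number;
  scrollHeight: number;
  clientHeight: number;
}

const STICKY_THRESHOLD_PX = 40; // assumed tolerance for "at the bottom"

function isNearBottom(box: ScrollBox): boolean {
  return box.scrollHeight - box.scrollTop - box.clientHeight <= STICKY_THRESHOLD_PX;
}

// Runs fn (an update to the chat log). If the user was at the bottom
// beforehand, re-pins to the bottom; otherwise leaves the scroll position
// alone and reports that the jump-to-latest pill should be shown.
function withStickyScroll(box: ScrollBox, fn: () => void): { showJumpPill: boolean } {
  const wasPinned = isNearBottom(box);
  fn();
  if (wasPinned) {
    box.scrollTop = box.scrollHeight - box.clientHeight;
    return { showJumpPill: false };
  }
  return { showJumpPill: true };
}
```

The key detail is measuring the position before `fn` runs: once new content is appended, the user always looks "scrolled up" relative to the new height.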

## What NOT to bake in

- Don't render markdown via `innerHTML` from arbitrary text without escaping. Custom parser must escape HTML before applying markdown transforms.
- Don't put raw tool outputs in assistant text. Use the tool card.

---

## Phase 0 — tool/script brainstorming with end-user

Triggered when (any one):
- User row created in last 24h AND zero prior messages
- New project / property / quote / topic row created with no associated artifacts yet
- User explicitly types *"what can you do?"* / *"help"* / *"getting started"*

Bot's `rules/00-onboarding.md` instructs it to:
1. **Ask what user's current manual workflow looks like.** What do they do today? What do they hate doing?
2. **Surface existing tool inventory in plain language** — *"here's what I can currently do for you: generate listings, enhance photos, build scroll-story pages…"*
3. **Explicitly ask for gaps** — *"is there anything you wish I could do that I haven't mentioned?"*
4. **Record each suggestion** — write to `tool_suggestions` table with user, project, request text, status (`new`, `reviewed`, `built`, `rejected`)
5. **Never promise new capability on the spot.** *"Noted — I've logged this as a suggestion."* NOT *"I'll build that for you now."*
6. **Flag gaps mid-workflow too** — any time during session when bot lacks a tool for current task, log as implicit suggestion automatically

Operator reviews `tool_suggestions` regularly, decides which to build, adds script, wires as tool, redeploys.
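
The trigger conditions and the suggestion record can be sketched as follows. Field names, the `Phase0Input` shape, and the exact help-phrase matching are assumptions; "new project" is simplified to "project in scope with zero artifacts".

```typescript
// Hypothetical inputs; in practice these would come from Supabase queries.
interface Phase0Input {
  userCreatedAt: Date;
  priorMessageCount: number;
  projectArtifactCount: number | null; // null when no project is in scope
  latestUserMessage: string;
  now: Date;
}

const DAY_MS = 24 * 60 * 60 * 1000;
// Matches only when the whole message is one of the help phrases.
const HELP_RE = /^\s*(what can you do|help|getting started)[\s?.!]*$/i;

// Any one of the three triggers is enough.
function shouldRunPhase0(i: Phase0Input): boolean {
  const newUser =
    i.now.getTime() - i.userCreatedAt.getTime() < DAY_MS && i.priorMessageCount === 0;
  const emptyProject = i.projectArtifactCount === 0;
  const askedForHelp = HELP_RE.test(i.latestUserMessage);
  return newUser || emptyProject || askedForHelp;
}

type SuggestionStatus = 'new' | 'reviewed' | 'built' | 'rejected';

interface ToolSuggestion {
  user_id: string;
  project_id: string | null;
  request_text: string;
  status: SuggestionStatus;
}

// Every logged suggestion starts as 'new'; the operator moves it on after review.
function newSuggestion(
  userId: string,
  projectId: string | null,
  text: string,
): ToolSuggestion {
  return { user_id: userId, project_id: projectId, request_text: text.trim(), status: 'new' };
}
```

Anchoring the help regex to the whole message keeps ordinary requests such as "help me draft a listing" from re-triggering onboarding.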

---

## Reference implementations

- **PGE quoteflow-lean** (`SimplaDocs/clients/PGE/`) — cleanest agentic example
- **orchestratum-bot** ([projects/orchestratum-ai/chatbot/](../../../projects/orchestratum-ai/chatbot/)) — reference for Supabase-backed kb_files + /admin panel
- **Obscura Films demo-1-project-tracker** ([projects/Obscura Films/demo-1-project-tracker/](../../../projects/Obscura%20Films/demo-1-project-tracker/)) — canonical pure-JS Express SSE implementation
- **ZimRoots chatbot** ([projects/zimroots/chatbot/](../../../projects/zimroots/chatbot/)) — canonical pill widget (`parseOptions()`, older `[OPTIONS: A | B]` syntax, disabled-after-click)

---

## Pairs with

- `supabase-railway-stack` — auth + DB + RLS + deploys
- `claude-code-habits` — every paid Claude API call logs via `ai-billing-log`
- `elevenlabs-tts` — if chatbot has voice output (use turbo for real-time)
- `qa-chatbot-saas` — verifies all 7 widget contract items + Pattern 27 architecture (kb_files in Supabase via /admin) + security baseline
- `html-deliverable-qa` — for the dashboard HTML
- Project-local skills — when client has own brand / aesthetic for their chatbot UI
