---
name: postgres-core-schema-design
description: >
  Use when starting a new PostgreSQL database, picking naming conventions, deciding multi-schema layout, or hardening search_path against SECURITY DEFINER injection.
  Prevents using SERIAL where IDENTITY is required (v10+), shipping a quoted-identifier mess ("camelCase" forces quotes everywhere), leaving public schema writable, and SECURITY DEFINER search_path hijack.
  Covers snake_case naming, IDENTITY vs SERIAL (v10+), multi-schema layout patterns, search_path semantics + injection risks, v15 public-schema default lockdown, multi-tenant via schema vs RLS decision tree.
  Keywords: schema design, naming convention, IDENTITY, GENERATED AS IDENTITY, SERIAL deprecated, search_path, SECURITY DEFINER, public schema, snake_case, multi-tenant, where should I put my tables, can I use camelCase, why does serial cause sequence issues, search path hijack, what schema layout, public schema permission denied, function ignoring search_path, identity vs serial which one
license: MIT
compatibility: "Designed for Claude Code. Requires PostgreSQL 15, 16, or 17."
metadata:
  author: OpenAEC-Foundation
  version: "1.0"
---

# postgres-core-schema-design

## Quick Reference :

PostgreSQL schema design fixes four classes of mistake on day one : identifier-casing chaos (`"camelCase"` forces every consumer to quote every reference), the `serial` foot-gun (an implicit sequence with broken privilege semantics, deprecated in favor of `GENERATED ... AS IDENTITY` since v10), a writable `public` schema that lets any logged-in role create objects in the trust-by-default namespace, and `SECURITY DEFINER` functions whose unqualified references can be hijacked by a caller-controlled `search_path`. These four decisions are made once and ALWAYS at table-creation time : retrofitting them later means a coordinated migration and a recompile of every dependent function. This skill encodes the decisions you make BEFORE the first `CREATE TABLE`.

A PostgreSQL database is one logical container with N schemas inside it. A schema is a namespace for tables, indexes, views, functions, types, sequences, and operators. Connection-level access is per-database (one `CONNECT` privilege per database, no cross-database queries except via FDW or `dblink`). Inside a database, name resolution walks the `search_path` GUC left-to-right and resolves the first match. The default `search_path` is `"$user", public` : a per-user schema (created on demand, never auto-created), then the shared `public` schema. PostgreSQL 15 changed the security default : on a freshly initialized cluster, `public` no longer grants `CREATE` to role `PUBLIC` : pre-existing databases upgraded to v15 keep their pre-v15 permissive grant. Therefore : on any database that PRE-DATES v15 you MUST `REVOKE CREATE ON SCHEMA public FROM PUBLIC` explicitly. Every `SECURITY DEFINER` function MUST pin `search_path` via `SET search_path = ...` in its definition, with `pg_temp` last, otherwise a malicious caller can substitute objects via their own temporary schema.

## When To Use This Skill :

ALWAYS use this skill when :
- Starting a new database : naming conventions, schema layout, identity-column choice are decided once
- Designing a multi-tenant database : schema-per-tenant vs RLS vs database-per-tenant tradeoff
- Writing or auditing any `SECURITY DEFINER` function : `search_path` must be pinned
- Onboarding an existing database whose `public` schema is wide-open and you suspect v14-or-earlier defaults
- Choosing between `serial` / `bigserial` and `GENERATED ... AS IDENTITY` for a new primary key column
- Picking singular vs plural table names, FK column naming, primary-key column naming

NEVER use this skill for :
- Row-Level Security policy authoring : see `postgres-core-rls-policies`
- Role + privilege model (CREATE ROLE, GRANT) : see `postgres-core-roles-privileges`
- Index naming + design : see `postgres-core-indexing-strategy`
- Schema migration execution (lock timing, NOT VALID + VALIDATE) : see `postgres-impl-zero-downtime-ddl`

## Decision Trees :

### Identity column : IDENTITY or SERIAL? :

```
PostgreSQL version >= 10?
├── No (v9.6 or earlier) : SERIAL / BIGSERIAL is the only option
└── Yes (v10+) : ALWAYS GENERATED ... AS IDENTITY. NEVER serial in new code.
    │
    Should application code be allowed to supply its own id?
    ├── Never (PK is system-issued, app must not override) :
    │       id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY
    │       (override requires INSERT ... OVERRIDING SYSTEM VALUE)
    └── Sometimes (data migration loads supply ids, app does not) :
            id bigint GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY
            (app INSERT without id : sequence allocates ; with id : honored)
```

### Multi-tenancy : schema-per-tenant or RLS? :

```
Tenant count and isolation strength?
├── < 50 tenants AND tenants must NEVER share table-level access :
│   schema-per-tenant (one schema per tenant, identical DDL)
│   + Pros : strong isolation at GRANT level, per-tenant DDL freedom,
│            simple "DROP SCHEMA tenant_x CASCADE" for offboarding
│   + Cons : N copies of every DDL change, sequences/types duplicated,
│            connection pool needs SET search_path per request
├── 50 - 10,000 tenants AND DDL must stay singular :
│   RLS (one schema, tenant_id column, FORCE ROW LEVEL SECURITY policies)
│   + Pros : one DDL, planner sees one statistics set, one set of indexes,
│            CONNECT-time SET app.current_tenant
│   + Cons : every query carries tenant_id implicitly, ROLE must not BYPASSRLS,
│            cross-tenant aggregate requires a privileged role
└── > 10,000 tenants OR regulated isolation (HIPAA, GDPR DPA per-tenant) :
    database-per-tenant (one database per tenant in the cluster, or N clusters)
    + Pros : independent backup / restore / drop / encryption per tenant
    + Cons : no cross-tenant query at all without FDW, max_connections explodes
```

### Should this name be quoted? :

```
Identifier matches ^[a-z_][a-z0-9_]*$ AND is not a reserved word?
├── Yes : write it unquoted. PostgreSQL folds unquoted identifiers to lowercase.
└── No : you are forcing every consumer to quote it forever.
    ├── Contains uppercase ("CustomerId") : FIX, rename to snake_case
    ├── Contains hyphen / space / special : FIX, rename to snake_case
    ├── Reserved word ("user", "order", "table") : FIX, rename
    │   (e.g. "user" -> app_user, "order" -> sales_order)
    └── Truly external system requirement (FDW to MS SQL) : quote at the
        FDW boundary, snake_case everywhere else.
```

### Does my SECURITY DEFINER function need `SET search_path`? :

```
Function uses SECURITY DEFINER?
├── No (default SECURITY INVOKER) : caller's search_path is fine
└── Yes : ALWAYS add SET search_path = pg_catalog, <trusted_schemas>, pg_temp
    │
    Does the function reference unqualified table/operator/function names?
    ├── Yes : MANDATORY to pin search_path AND schema-qualify the references
    │        (defense-in-depth : pin search_path AND write admin.pwds, not pwds)
    └── No (every reference is schema-qualified) : pin search_path anyway as
        operators/casts also resolve via search_path. Cost : zero.
```

### Where do I put my tables? :

```
Single application, one logical domain?
├── Yes : public schema is fine AFTER `REVOKE CREATE ON SCHEMA public
│         FROM PUBLIC` (v14-or-earlier databases : MANDATORY).
│         New v15+ databases : already revoked at initdb time.
└── No : multiple modules / layers / privilege boundaries?
    ├── Module separation (app, auth, billing, analytics) :
    │       one schema per module, dedicated role per module with
    │       USAGE on its own schema + SELECT on read-only schemas it needs
    ├── ETL staging vs serving :
    │       schemas etl_staging (UNLOGGED tables, BYPASSRLS service role),
    │       serving (final tables, RLS, app role has SELECT only)
    └── Multi-tenant : see "schema-per-tenant or RLS" decision above
```

## Patterns :

### Pattern 1 : IDENTITY column for every new primary key

ALWAYS use `GENERATED ALWAYS AS IDENTITY` for new surrogate primary keys on v10+ unless the application has a documented need to supply ids.
NEVER use `serial` / `bigserial` in new code : it creates an implicit sequence whose ownership and privilege semantics surprise everyone.

```sql
-- Preferred (system-assigned, app cannot override accidentally) :
CREATE TABLE customers (
  id        bigint  GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  email     citext  NOT NULL UNIQUE,
  full_name text    NOT NULL,
  created_at timestamptz NOT NULL DEFAULT now()
);

-- Migration loads need to override : use OVERRIDING SYSTEM VALUE
INSERT INTO customers (id, email, full_name)
  OVERRIDING SYSTEM VALUE
  VALUES (42, 'alice@example.com', 'Alice');

-- App is the legitimate id-owner (rare, e.g. distributed id generator) :
CREATE TABLE events (
  id   bigint GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
  ...
);
-- Now both work :
INSERT INTO events DEFAULT VALUES;            -- sequence allocates
INSERT INTO events (id) VALUES (9999);        -- app-supplied id honored
```

WHY : `serial` is documented as a "shorthand for creating a sequence and integer column" but the sequence is owned independently, ALTER COLUMN TYPE breaks the dependency, INSERT-grants leak the sequence implicitly, and the column type stays `integer` (not `bigint`) which exhausts at 2.1B rows. `GENERATED AS IDENTITY` is the SQL-standard spelling, is privilege-clean (the sequence is owned by the identity column, not visible to non-owners), and chooses bigint by default in modern style. Source : postgresql.org/docs/17/sql-createtable.html (Identity Columns section), Anti-Pattern 1 in vooronderzoek §19.

### Pattern 2 : snake_case identifiers, never quoted

ALWAYS use lowercase snake_case for every table, column, index, sequence, and function name.
NEVER ship `"CamelCase"` identifiers : PostgreSQL folds unquoted names to lowercase, so a quoted `"CustomerId"` is a DIFFERENT identifier than an unquoted `CustomerId` (which resolves to `customerid`).

```sql
-- Correct
CREATE TABLE sales_order (
  id            bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  customer_id   bigint NOT NULL REFERENCES customers(id),
  ordered_at    timestamptz NOT NULL DEFAULT now(),
  total_amount  numeric(12,2) NOT NULL
);

CREATE INDEX sales_order_customer_id_idx ON sales_order (customer_id);

-- Wrong (forces every consumer to quote forever, breaks ORM model classes,
-- breaks pg_dump --schema-only diffs across environments)
CREATE TABLE "SalesOrder" (
  "ID"          bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  "CustomerID"  bigint NOT NULL,
  ...
);
```

Conventions enforced project-wide :
- Tables : singular OR plural, pick one project-wide and document. Both are legitimate.
- Primary key : `id` (preferred — works with `JOIN ... USING(id)` mechanically) OR `<entity>_id` (preferred — self-describing in joins). NEVER mix both styles.
- Foreign keys : `<referenced_entity>_id` (e.g. `customer_id` in `sales_order`).
- Junction tables : `<entity_a>_<entity_b>` alphabetical (e.g. `customer_tag`, not `tag_customer`).
- Indexes : `<table>_<column(s)>_<type>_idx` (e.g. `sales_order_customer_id_idx`, `sales_order_search_gin_idx`).
- Sequences (when not implicit) : `<table>_<column>_seq` (PostgreSQL's auto-generated default).

WHY : an unquoted identifier is case-folded to lowercase at parse time, so `SELECT * FROM Customers` and `SELECT * FROM customers` resolve to the same relation. The MOMENT you write `CREATE TABLE "Customers"` (with quotes and capitals), every subsequent reference must be quoted too, and ANY tool that lowercases generates a "relation does not exist" error. Source : postgresql.org/docs/17/sql-syntax-lexical.html (Identifiers and Key Words).

### Pattern 3 : Lock down `public` schema on v14-and-earlier databases

ALWAYS run `REVOKE CREATE ON SCHEMA public FROM PUBLIC` on any database that was initialized before PostgreSQL 15 (or upgraded in place from v14 or earlier).
NEVER assume `public` is locked just because the cluster runs v15+ : pre-existing databases keep their old permissive grant. Only databases freshly created by initdb on v15+ have it revoked by default.

```sql
-- Detect the state on every database :
SELECT n.nspname,
       has_schema_privilege('public', n.nspname, 'CREATE') AS public_can_create
FROM pg_namespace n
WHERE n.nspname = 'public';

-- Lock it down (idempotent ; safe to run on already-revoked databases)
REVOKE CREATE ON SCHEMA public FROM PUBLIC;

-- Verify
SELECT has_schema_privilege('public', 'public', 'CREATE');
-- Returns f after revoke.

-- Application roles need explicit GRANT :
GRANT USAGE  ON SCHEMA public TO app_reader, app_writer;
GRANT CREATE ON SCHEMA public TO app_writer;
```

WHY : the historical default let any logged-in role create objects in `public`. That meant a low-privilege role could shadow tables, install malicious functions, or set type defaults that get picked up by `search_path`. CVE-2018-1058 used exactly this vector against `pg_dump`. The v15 change closed the default but cannot retroactively fix old databases. Source : postgresql.org/docs/17/ddl-schemas.html (Schemas section, "Public Schema"), PostgreSQL 15 release notes.

### Pattern 4 : Pin `search_path` on every `SECURITY DEFINER` function

ALWAYS set `search_path` explicitly on `SECURITY DEFINER` functions, ending with `pg_temp` so the per-session temp schema cannot shadow trusted objects.
NEVER ship a `SECURITY DEFINER` function without `SET search_path` : the caller controls their own `search_path`, and the function inherits it.

```sql
-- Correct : search_path is pinned, schemas are pre-qualified anyway
CREATE FUNCTION admin.check_password(uname text, pass text)
RETURNS boolean
LANGUAGE plpgsql
SECURITY DEFINER
SET search_path = pg_catalog, admin, pg_temp
AS $$
DECLARE passed boolean;
BEGIN
  SELECT (pwd = $2) INTO passed
  FROM admin.pwds
  WHERE username = $1;
  RETURN passed;
END;
$$;

-- Restrict execute privileges atomically with creation :
BEGIN;
  ALTER FUNCTION admin.check_password(text, text) OWNER TO admin_owner;
  REVOKE ALL ON FUNCTION admin.check_password(text, text) FROM PUBLIC;
  GRANT EXECUTE ON FUNCTION admin.check_password(text, text) TO app_role;
COMMIT;

-- Wrong : caller's search_path determines what `pwds` means
CREATE FUNCTION check_password(uname text, pass text)
RETURNS boolean
LANGUAGE plpgsql
SECURITY DEFINER  -- runs as owner, but resolves names via caller's path
AS $$
BEGIN
  -- attacker : SET search_path = malicious, public ;
  -- then `pwds` resolves to attacker's malicious.pwds
  RETURN (SELECT pwd = $2 FROM pwds WHERE username = $1);
END;
$$;
```

WHY : a `SECURITY DEFINER` function executes with the OWNER's privileges, but name resolution walks the CALLER's `search_path` unless the function pins its own. Putting `pg_temp` last forbids the temporary schema (always writable) from shadowing trusted names. Pinning `pg_catalog` first additionally protects against operator/cast hijacking. Source : postgresql.org/docs/17/sql-createfunction.html (Writing SECURITY DEFINER Functions Safely), CVE-2007-2138.

### Pattern 5 : Multi-schema layout for module separation

ALWAYS use one schema per module when the application has distinct privilege boundaries (auth vs application vs analytics).
NEVER throw everything into `public` and try to enforce module boundaries with role-by-table grants : that scales poorly and forgets new tables.

```sql
-- One database, four schemas, four roles :
CREATE SCHEMA auth      AUTHORIZATION auth_owner;
CREATE SCHEMA app       AUTHORIZATION app_owner;
CREATE SCHEMA etl       AUTHORIZATION etl_owner;
CREATE SCHEMA reporting AUTHORIZATION reporting_owner;

-- Per-role search_path so each service sees its own schema first :
ALTER ROLE auth_service      SET search_path = auth, public;
ALTER ROLE app_service       SET search_path = app, public;
ALTER ROLE etl_service       SET search_path = etl, app, public;
ALTER ROLE reporting_service SET search_path = reporting, app, public;

-- Default privileges so newly-created objects inherit grants automatically :
ALTER DEFAULT PRIVILEGES FOR ROLE app_owner IN SCHEMA app
  GRANT SELECT, INSERT, UPDATE, DELETE ON TABLES TO app_service;
ALTER DEFAULT PRIVILEGES FOR ROLE app_owner IN SCHEMA app
  GRANT USAGE ON SEQUENCES TO app_service;

-- Cross-schema FK is fine inside one database :
CREATE TABLE app.session (
  id         bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  user_id    bigint NOT NULL REFERENCES auth.user(id),
  expires_at timestamptz NOT NULL
);
```

WHY : `search_path` + `ALTER DEFAULT PRIVILEGES` together make module separation cheap : new tables in `app` schema are automatically grantable to `app_service`, the service role's `search_path` resolves bare names to its own schema, and cross-schema references stay legible (`auth.user`). Source : postgresql.org/docs/17/ddl-schemas.html, postgresql.org/docs/17/sql-alterdefaultprivileges.html.

### Pattern 6 : Set `search_path` correctly at every level

ALWAYS understand the four levels at which `search_path` can be set and which one applies.
NEVER assume `SET search_path` inside one session affects another session.

| Level | How to set | Survives reconnect? | Use case |
|---|---|---|---|
| Cluster-wide default | `postgresql.conf : search_path = ...` | Yes (server-wide) | Last-resort default ; rarely customized |
| Database default | `ALTER DATABASE x SET search_path = ...` | Yes (per database) | Multi-schema app database |
| Role default | `ALTER ROLE r SET search_path = ...` | Yes (per role) | Service-role isolation (Pattern 5) |
| Session | `SET search_path = ...` | Until reconnect | Ad-hoc psql work |
| Transaction | `SET LOCAL search_path = ...` | Until COMMIT/ROLLBACK | Migration scripts, multi-tenant request handlers |
| Function | `SET search_path = ...` in CREATE FUNCTION | Until function exits | `SECURITY DEFINER` hardening (Pattern 4) |

```sql
-- Show effective resolution chain
SHOW search_path;

-- Inspect the role/database defaults that build the session value
SELECT rolname, rolconfig FROM pg_roles      WHERE rolname = current_user;
SELECT datname, datconfig FROM pg_database   WHERE datname = current_database();
```

WHY : a `SET search_path` issued in a SQL console only affects the open session. Server-side connection pools (PgBouncer transaction mode) make session `SET` unreliable because the next transaction may arrive on a different server connection. Use `ALTER ROLE` for durable per-role config, `SET LOCAL` inside a transaction for per-request multi-tenancy. Source : postgresql.org/docs/17/config-setting.html, postgresql.org/docs/17/runtime-config-client.html.

## Anti-Patterns :

(One-liners ; full diagnosis + fix in `references/anti-patterns.md`.)

- `serial` / `bigserial` in new code : implicit-sequence privilege foot-gun, type stuck at `integer`. Fix : `GENERATED ALWAYS AS IDENTITY`.
- Quoted `"CamelCase"` identifiers : every consumer must quote forever, ORM round-tripping breaks. Fix : snake_case, no quotes.
- `public` schema left writable on pre-v15 databases : low-privilege roles can shadow objects, CVE-2018-1058 risk. Fix : `REVOKE CREATE ON SCHEMA public FROM PUBLIC`.
- `SECURITY DEFINER` function without `SET search_path` : caller can hijack name resolution. Fix : pin `pg_catalog, <trusted>, pg_temp` in CREATE FUNCTION.
- Reserved-word identifiers (`user`, `order`, `table`) : every statement must quote, breaks tooling. Fix : `app_user`, `sales_order`.
- Mixing `id` and `<entity>_id` PK styles in one schema : joins ambiguous, ORMs misalign. Fix : pick one project-wide, document in CLAUDE.md.
- Schema-per-tenant for > ~50 tenants : DDL changes become an N-pass loop, query planner stats explode. Fix : RLS (see `postgres-core-rls-policies`).
- Cross-database joins via `dblink` for normal queries : connection overhead per call, no transactional integrity. Fix : merge into one database with schemas, or use `postgres_fdw` carefully.
- Missing `pg_temp` last in pinned `search_path` of `SECURITY DEFINER` : temp-schema injection vector still open. Fix : ALWAYS end with `pg_temp`.

## Reference Links :

- [references/methods.md](references/methods.md) : IDENTITY column reference (ALWAYS vs BY DEFAULT, OVERRIDING SYSTEM/USER VALUE, sequence ownership), search_path resolution levels, GRANT matrix for schemas, naming-convention table.
- [references/examples.md](references/examples.md) : End-to-end multi-schema setup, SECURITY DEFINER hardening walk-through, IDENTITY migration from serial, public-schema lockdown audit script, schema-per-tenant vs RLS comparison.
- [references/anti-patterns.md](references/anti-patterns.md) : Each anti-pattern with symptom, cause, SQLSTATE (where applicable), detection query, and fix migration.

## See Also :

- `postgres-core-architecture` : MVCC, isolation, WAL durability (the runtime model your schema lives in)
- `postgres-core-version-matrix` : v15 / v16 / v17 feature deltas (including v15 public-schema default)
- `postgres-core-rls-policies` : RLS for multi-tenant, FORCE ROW LEVEL SECURITY, BYPASSRLS audits
- `postgres-core-roles-privileges` : CREATE ROLE, GRANT, ALTER DEFAULT PRIVILEGES details
- `postgres-impl-zero-downtime-ddl` : safe execution of the schema changes you design here
- Vooronderzoek section : §2 (Full SQL Surface : IDENTITY), §13 (Security Model : search_path), §19 (Anti-Patterns : serial, public schema, search_path injection)
- Official docs : https://www.postgresql.org/docs/17/ddl-schemas.html, https://www.postgresql.org/docs/17/sql-createtable.html (Identity Columns), https://www.postgresql.org/docs/17/sql-createfunction.html (Writing SECURITY DEFINER Functions Safely)
