---
name: zero-trust-patterns
description: "Zero-Trust security patterns — mTLS between microservices (Istio/SPIFFE), SPIRE workload identity, OPA/Envoy authorization, NetworkPolicy default-deny-all, short-lived credentials, service mesh security, and Kubernetes RBAC hardening."
---
# Zero-Trust Patterns

## When to Activate

- Designing service-to-service communication in Kubernetes or cloud environments
- Implementing mTLS between microservices
- Setting up SPIFFE/SPIRE for workload identity
- Configuring Istio or Linkerd service mesh
- Writing Kubernetes NetworkPolicies
- Reviewing east-west traffic security
- Building BeyondCorp-style access controls
- Auditing existing cluster network policies for trust boundary gaps

---

## Core Principles (NIST SP 800-207)

Zero-Trust Architecture is commonly summarized as four principles (NIST SP 800-207 itself spells these out as seven tenets):

1. **Never trust, always verify** — even traffic from within the private network is untrusted. Every connection requires authentication and authorization.
2. **Explicit verification** — identity + device + context checked at every request (not just at the perimeter).
3. **Least Privilege Access** — minimal rights, just enough to complete the task, scoped to the operation.
4. **Assume Breach** — design to minimize lateral movement when an attacker is already inside the network.

---

## Service Identity with SPIFFE/SPIRE

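**SPIFFE** (Secure Production Identity Framework for Everyone) is the standard for cryptographic service identities. Each workload gets a SPIFFE ID (a URI), which is delivered inside an SVID (SPIFFE Verifiable Identity Document): either an X.509 certificate or a JWT.

### SPIFFE ID Format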

```
spiffe://trust-domain/path/to/workload
# Example:
spiffe://prod.example.com/ns/payments/sa/checkout-service
```
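The ID format above can be checked mechanically. The following is a minimal sketch using only the Go standard library; the `SPIFFEID` type and both function names are hypothetical, and it performs structural checks only (the SPIFFE specification has additional character-set rules that are skipped here). In production code, `spiffeid.FromString` from go-spiffe is the better choice.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// SPIFFEID holds the two components of a SPIFFE ID (hypothetical helper type).
type SPIFFEID struct {
	TrustDomain string
	Path        string
}

// ParseSPIFFEID performs structural validation only: spiffe scheme,
// a bare host as trust domain, and no query or fragment.
func ParseSPIFFEID(raw string) (SPIFFEID, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return SPIFFEID{}, fmt.Errorf("parse SPIFFE ID: %w", err)
	}
	switch {
	case u.Scheme != "spiffe":
		return SPIFFEID{}, fmt.Errorf("scheme must be spiffe, got %q", u.Scheme)
	case u.Host == "" || u.User != nil || u.Port() != "":
		return SPIFFEID{}, fmt.Errorf("trust domain must be a bare host name")
	case u.RawQuery != "" || u.Fragment != "":
		return SPIFFEID{}, fmt.Errorf("query and fragment are not allowed")
	}
	return SPIFFEID{TrustDomain: u.Host, Path: u.Path}, nil
}

// NamespaceAndServiceAccount extracts the parts of a path that follows the
// /ns/<namespace>/sa/<service-account> convention used in this document.
func (id SPIFFEID) NamespaceAndServiceAccount() (ns, sa string, ok bool) {
	parts := strings.Split(strings.TrimPrefix(id.Path, "/"), "/")
	if len(parts) != 4 || parts[0] != "ns" || parts[2] != "sa" {
		return "", "", false
	}
	return parts[1], parts[3], true
}

func main() {
	id, err := ParseSPIFFEID("spiffe://prod.example.com/ns/payments/sa/checkout-service")
	if err != nil {
		fmt.Println("invalid:", err)
		return
	}
	ns, sa, _ := id.NamespaceAndServiceAccount()
	fmt.Println(id.TrustDomain, ns, sa)
}
```

The `/ns/.../sa/...` layout is a naming convention, not part of the SPIFFE standard, so the second helper only applies where registration entries follow it.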

### SPIRE Server + Agent Setup

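```yaml
# spire-server.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: spire-server
  namespace: spire
spec:
  replicas: 1
  serviceName: spire-server   # required for StatefulSets; the headless Service is not shown
  selector:
    matchLabels:
      app: spire-server
  template:
    metadata:
      labels:
        app: spire-server     # must match spec.selector
    spec:
      serviceAccountName: spire-server   # assumed ServiceAccount; k8s_psat needs TokenReview access
      containers:
        - name: spire-server
          image: ghcr.io/spiffe/spire-server:1.9.0
          args:
            - -config
            - /run/spire/config/server.conf
          volumeMounts:
            - name: spire-config
              mountPath: /run/spire/config
              readOnly: true
            - name: spire-data
              mountPath: /run/spire/data
      volumes:
        - name: spire-config
          configMap:
            name: spire-server   # assumed ConfigMap holding server.conf
  volumeClaimTemplates:
    - metadata:
        name: spire-data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi
```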

```hcl
# server.conf
server {
  bind_address = "0.0.0.0"
  bind_port    = "8081"
  trust_domain = "prod.example.com"
  data_dir     = "/run/spire/data"
  log_level    = "INFO"

  # JWT SVIDs for service-to-service auth
  jwt_issuer = "https://spire-server.spire.svc.cluster.local"
}

plugins {
  DataStore "sql" {
    plugin_data {
      database_type = "sqlite3"
      connection_string = "/run/spire/data/datastore.sqlite3"
    }
  }

  NodeAttestor "k8s_psat" {
    plugin_data {
      clusters = {
        "my-cluster" = {
          service_account_allow_list = ["spire:spire-agent"]
        }
      }
    }
  }

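  # "memory" keeps signing keys in RAM only, so they are lost on restart;
  # consider the "disk" KeyManager for anything long-lived.
  KeyManager "memory" {
    plugin_data {}
  }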
}
```

### Workload Registration

```bash
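# Register a workload (checkout-service in payments namespace).
# -parentID must be the SPIFFE ID of the SPIRE agent (or a node alias entry)
# on the node that runs the workload; the value below is a placeholder.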
kubectl exec -n spire spire-server-0 -- \
  /opt/spire/bin/spire-server entry create \
  -spiffeID spiffe://prod.example.com/ns/payments/sa/checkout-service \
  -parentID spiffe://prod.example.com/k8s-node/node1 \
  -selector k8s:ns:payments \
  -selector k8s:sa:checkout-service
```

### Retrieving SVIDs from Workload API (Go)

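```go
import (
    "context"
    "crypto/tls"
    "fmt"

    "github.com/spiffe/go-spiffe/v2/spiffeid"
    "github.com/spiffe/go-spiffe/v2/spiffetls/tlsconfig"
    "github.com/spiffe/go-spiffe/v2/workloadapi"
)

// newTLSConfig builds a client-side mTLS config backed by the Workload API.
// The socket address is read from SPIFFE_ENDPOINT_SOCKET.
func newTLSConfig(ctx context.Context) (*tls.Config, error) {
    // The source keeps streaming rotated SVIDs and bundles for the process lifetime.
    source, err := workloadapi.NewX509Source(ctx)
    if err != nil {
        return nil, fmt.Errorf("create x509 source: %w", err)
    }

    // Pin the expected server identity. AuthorizeAny() would accept every
    // workload in the trust domain, which defeats least privilege.
    // The ID below is an example; use the ID registered for your server.
    serverID := spiffeid.RequireFromString("spiffe://prod.example.com/ns/payments/sa/payment-processor")

    // TLS config that automatically rotates certificates
    return tlsconfig.MTLSClientConfig(source, source, tlsconfig.AuthorizeID(serverID)), nil
}
```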

---

## Istio Service Mesh

### Installation

```bash
# Install Istio with ambient mode (no sidecars, ztunnel at node level)
istioctl install --set profile=ambient

# Or with sidecar mode (traditional)
istioctl install --set profile=default

# Enable sidecar injection for a namespace
kubectl label namespace payments istio-injection=enabled
```

### Enforce Strict mTLS Namespace-Wide

```yaml
# peer-authentication.yaml — enforce mTLS for entire namespace
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: payments
spec:
  mtls:
    mode: STRICT   # Reject all non-mTLS traffic. Never leave on PERMISSIVE in production.
```

### Authorization Policies (Default-Deny + Allowlist)

```yaml
# Step 1: Default deny ALL ingress in namespace
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: deny-all
  namespace: payments
spec: {}   # Empty spec = deny all

---
# Step 2: Explicitly allow checkout → payment-processor
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: allow-checkout-to-processor
  namespace: payments
spec:
  selector:
    matchLabels:
      app: payment-processor
  rules:
    - from:
        - source:
            principals:
              - "cluster.local/ns/payments/sa/checkout-service"  # SPIFFE-based identity
      to:
        - operation:
            methods: ["POST"]
            paths: ["/v1/payments"]
```
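The default-deny plus allowlist behaviour can be modelled in a few lines. This is a simplified sketch, not Istio's implementation: the `Rule` and `Request` types are hypothetical, and it uses exact string matching only, whereas Istio also supports prefix, suffix, and wildcard matches.

```go
package main

import "fmt"

// Rule mirrors one ALLOW rule: who may call, with which methods, on which paths.
type Rule struct {
	Principals []string
	Methods    []string
	Paths      []string
}

// Request is the subset of request attributes this sketch evaluates.
type Request struct {
	Principal string
	Method    string
	Path      string
}

func contains(list []string, v string) bool {
	for _, item := range list {
		if item == v {
			return true
		}
	}
	return false
}

func (r Rule) matches(req Request) bool {
	return contains(r.Principals, req.Principal) &&
		contains(r.Methods, req.Method) &&
		contains(r.Paths, req.Path)
}

// Allowed returns true only when some rule matches; with no rules
// (the deny-all case) every request is rejected.
func Allowed(rules []Rule, req Request) bool {
	for _, r := range rules {
		if r.matches(req) {
			return true
		}
	}
	return false
}

func main() {
	rules := []Rule{{
		Principals: []string{"cluster.local/ns/payments/sa/checkout-service"},
		Methods:    []string{"POST"},
		Paths:      []string{"/v1/payments"},
	}}
	ok := Allowed(rules, Request{
		Principal: "cluster.local/ns/payments/sa/checkout-service",
		Method:    "POST",
		Path:      "/v1/payments",
	})
	fmt.Println("checkout POST /v1/payments allowed:", ok)
}
```

The property worth noticing is that there is no explicit deny branch: anything not matched by an allow rule falls through to `false`.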

### Traffic Management

```yaml
# VirtualService: retry + timeout
# Retrying a POST assumes the endpoint is idempotent (e.g. via an idempotency key).
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: payment-processor
  namespace: payments
spec:
  hosts:
    - payment-processor
  http:
    - retries:
        attempts: 3
        perTryTimeout: 2s
        retryOn: gateway-error,connect-failure,retriable-4xx
      timeout: 10s
      route:
        - destination:
            host: payment-processor
```

### Observability (Zero-Code)

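```bash
# Prometheus metrics emitted by the proxies: istio_requests_total,
# istio_request_duration_milliseconds, etc.
# Tracing: the proxies generate spans, but applications must forward the trace
# headers (e.g. traceparent or x-b3-*) for the spans to join into one trace.

# Raise Envoy log verbosity for one pod (format: <pod>.<namespace>)
istioctl proxy-config log payment-processor-pod.payments --level debug

# Kiali topology graph
kubectl port-forward svc/kiali -n istio-system 20001:20001
```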

---

## Linkerd (Lightweight Alternative)

Consider Linkerd over Istio for smaller teams that want a simpler operational model, a lighter-weight proxy, and fewer CRDs to manage.

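```bash
# Install Linkerd (CRDs first, then the control plane)
linkerd install --crds | kubectl apply -f -
linkerd install | kubectl apply -f -

# Install the viz extension (needed for the `linkerd viz` commands below)
linkerd viz install | kubectl apply -f -

# Inject the proxy into new pods in the namespace, then restart existing workloads
kubectl annotate namespace payments linkerd.io/inject=enabled
kubectl rollout restart deploy -n payments

# Verify mTLS: edges lists client/server identities and whether each connection is secured
linkerd viz edges deployment -n payments

# Watch live requests; tls=true marks traffic that is mTLS-protected
linkerd viz tap deploy/checkout-service -n payments
```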

```yaml
# Linkerd policy: Server selects the port; MeshTLSAuthentication + AuthorizationPolicy
# together play the role of an Istio AuthorizationPolicy
apiVersion: policy.linkerd.io/v1beta3
kind: Server
metadata:
  name: payment-processor
  namespace: payments
spec:
  podSelector:
    matchLabels:
      app: payment-processor
  port: 8080
  proxyProtocol: HTTP/2
---
apiVersion: policy.linkerd.io/v1alpha1
kind: MeshTLSAuthentication
metadata:
  name: checkout-service-authn
  namespace: payments
spec:
  identities:
    - "checkout-service.payments.serviceaccount.identity.linkerd.cluster.local"
---
apiVersion: policy.linkerd.io/v1alpha1
kind: AuthorizationPolicy
metadata:
  name: checkout-to-processor
  namespace: payments
spec:
  targetRef:
    group: policy.linkerd.io
    kind: Server
    name: payment-processor
  requiredAuthenticationRefs:
    - name: checkout-service-authn
      kind: MeshTLSAuthentication
      group: policy.linkerd.io
```
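The identity string in `MeshTLSAuthentication` follows a fixed pattern built from the ServiceAccount, its namespace, the Linkerd control-plane namespace, and the identity trust domain. The following is a small sketch of that pattern, with `linkerdIdentity` as a hypothetical helper; it is handy when generating policies for many services.

```go
package main

import "fmt"

// linkerdIdentity builds the mesh identity Linkerd assigns to a ServiceAccount.
// controlPlaneNS is usually "linkerd" and trustDomain usually "cluster.local".
func linkerdIdentity(serviceAccount, namespace, controlPlaneNS, trustDomain string) string {
	return fmt.Sprintf("%s.%s.serviceaccount.identity.%s.%s",
		serviceAccount, namespace, controlPlaneNS, trustDomain)
}

func main() {
	fmt.Println(linkerdIdentity("checkout-service", "payments", "linkerd", "cluster.local"))
}
```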

---

## Kubernetes NetworkPolicies

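NetworkPolicies are the Layer-3/4 firewall. Use them even with a service mesh, since they still apply to traffic that bypasses the proxy. They only take effect when the cluster's CNI plugin enforces them (e.g. Calico or Cilium).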

### Pattern: Default Deny All, then Allowlist

```yaml
# 1. Default deny all ingress AND egress in namespace
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: payments
spec:
  podSelector: {}   # applies to ALL pods in namespace
  policyTypes:
    - Ingress
    - Egress
```

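```yaml
# 2. Allow checkout → payment-processor on port 8080: ingress on the receiver,
#    plus egress on the sender (default-deny-all blocks egress as well)
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-checkout-to-processor
  namespace: payments
spec:
  podSelector:
    matchLabels:
      app: payment-processor
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: checkout-service
      ports:
        - protocol: TCP
          port: 8080
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-checkout-egress-to-processor
  namespace: payments
spec:
  podSelector:
    matchLabels:
      app: checkout-service
  policyTypes:
    - Egress
  egress:
    - to:
        - podSelector:
            matchLabels:
              app: payment-processor
      ports:
        - protocol: TCP
          port: 8080
```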

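```yaml
# 3. Allow DNS resolution (required for all pods), limited to the cluster DNS pods.
#    The k8s-app=kube-dns label is the common default; adjust it for your DNS setup.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress
  namespace: payments
spec:
  podSelector: {}
  policyTypes:
    - Egress
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
          podSelector:
            matchLabels:
              k8s-app: kube-dns
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
```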
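NetworkPolicies are additive: a pod selected by at least one policy becomes isolated, and traffic is then allowed only if some rule in a selecting policy matches. The sketch below models that evaluation for ingress within a single namespace. All types are hypothetical simplifications (equality-only label selectors, one port per rule), not the Kubernetes API types.

```go
package main

import "fmt"

// Labels doubles as a pod's label set and as an equality-only selector.
type Labels map[string]string

// selects reports whether every selector entry is present in labels.
// An empty selector matches all pods, like `podSelector: {}`.
func selects(selector, labels Labels) bool {
	for k, v := range selector {
		if labels[k] != v {
			return false
		}
	}
	return true
}

// IngressRule allows traffic from pods matching From on one TCP port.
type IngressRule struct {
	From Labels
	Port int
}

// Policy is a reduced NetworkPolicy with the Ingress policy type.
type Policy struct {
	PodSelector Labels
	Ingress     []IngressRule
}

// IngressAllowed applies the additive semantics: if no policy selects dst,
// traffic is allowed; otherwise at least one rule must match.
func IngressAllowed(policies []Policy, src, dst Labels, port int) bool {
	isolated := false
	for _, p := range policies {
		if !selects(p.PodSelector, dst) {
			continue
		}
		isolated = true
		for _, r := range p.Ingress {
			if selects(r.From, src) && r.Port == port {
				return true
			}
		}
	}
	return !isolated
}

func main() {
	policies := []Policy{
		{PodSelector: Labels{}}, // default-deny-all: selects every pod, no rules
		{
			PodSelector: Labels{"app": "payment-processor"},
			Ingress:     []IngressRule{{From: Labels{"app": "checkout-service"}, Port: 8080}},
		},
	}
	checkout := Labels{"app": "checkout-service"}
	processor := Labels{"app": "payment-processor"}
	fmt.Println("checkout -> processor:8080:", IngressAllowed(policies, checkout, processor, 8080))
	fmt.Println("processor -> checkout:8080:", IngressAllowed(policies, processor, checkout, 8080))
}
```

Because policies only add permissions, the default-deny entry never needs to be edited when a new allow policy is introduced.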

### Cilium — Layer-7 NetworkPolicies (HTTP-aware)

```yaml
# Block all HTTP except POST /v1/payments
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: l7-payment-processor
  namespace: payments
spec:
  endpointSelector:
    matchLabels:
      app: payment-processor
  ingress:
    - fromEndpoints:
        - matchLabels:
            app: checkout-service
      toPorts:
        - ports:
            - port: "8080"
              protocol: TCP
          rules:
            http:
              - method: POST
                path: /v1/payments
```

---

## East-West Traffic Security

### Secrets: Never in Environment Variables

```yaml
# WRONG: Secret as environment variable
env:
  - name: DB_PASSWORD
    value: "supersecret"   # visible in the pod spec, `kubectl describe`, and /proc/<pid>/environ

# WRONG: Even from k8s secret (still exposed as env var)
env:
  - name: DB_PASSWORD
    valueFrom:
      secretKeyRef:
        name: db-secret
        key: password
```

```yaml
# CORRECT: Vault Agent Sidecar (secrets never touch etcd)
# KV v2 paths include "data/" after the mount, which is why the template reads .Data.data
annotations:
  vault.hashicorp.com/agent-inject: "true"
  vault.hashicorp.com/agent-inject-secret-db: "secret/data/payments/db"
  vault.hashicorp.com/role: "checkout-service"
  vault.hashicorp.com/agent-inject-template-db: |
    {{- with secret "secret/data/payments/db" -}}
    DB_PASSWORD={{ .Data.data.password }}
    {{- end }}
```

```yaml
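# CORRECT: External Secrets Operator (ESO) syncs from AWS Secrets Manager / GCP Secret Manager.
# The result is a regular Kubernetes Secret, so mount it as a file and enable etcd encryption at rest.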
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: db-secret
  namespace: payments
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: aws-secrets-manager
    kind: ClusterSecretStore
  target:
    name: db-credentials
    creationPolicy: Owner
  data:
    - secretKey: DB_PASSWORD
      remoteRef:
        key: prod/payments/db
        property: password
```

### Kubernetes Token Volume Projection (Audience-Bound, Short-Lived)

```yaml
volumes:
  - name: vault-token
    projected:
      sources:
        - serviceAccountToken:
            audience: vault           # audience-bound, not usable for anything else
            expirationSeconds: 3600   # short-lived
            path: token
```

---

## Zero-Trust for APIs — BeyondCorp Pattern

```
User Request
     │
     ▼
Identity-Aware Proxy (IAP / Verified Access)
     │  checks:
     │  - Identity (Google/OIDC token)
     │  - Device trust (MDM-enrolled, cert-based)
     │  - Context (IP reputation, time of day)
     │
     ▼  [Allow or Deny]
Backend Service (no direct internet exposure)
```
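The proxy's decision in the diagram can be thought of as a conjunction of independent checks, any of which can deny. This is a minimal sketch under that assumption: the `AccessRequest` fields and the `Decide` function are hypothetical, and real products (IAP, Verified Access) express this in their own policy languages.

```go
package main

import (
	"fmt"
	"strings"
)

// AccessRequest carries the signals the proxy has already collected.
type AccessRequest struct {
	Email         string // from the validated OIDC token
	TokenValid    bool   // identity check result
	DeviceManaged bool   // device trust, e.g. MDM enrollment
	IPFlagged     bool   // context signal, e.g. poor IP reputation
}

// Decide denies on the first failed check and reports the reason.
func Decide(r AccessRequest, allowedDomain string) (bool, string) {
	switch {
	case !r.TokenValid:
		return false, "identity not verified"
	case !strings.HasSuffix(r.Email, "@"+allowedDomain):
		return false, "identity outside allowed domain"
	case !r.DeviceManaged:
		return false, "device not trusted"
	case r.IPFlagged:
		return false, "context check failed"
	}
	return true, "all checks passed"
}

func main() {
	req := AccessRequest{
		Email:         "dev@example.com",
		TokenValid:    true,
		DeviceManaged: false,
	}
	allowed, reason := Decide(req, "example.com")
	fmt.Println(allowed, reason)
}
```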

### AWS Verified Access Example

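```hcl
# Terraform: AWS Verified Access endpoint
# The var.* inputs and the endpoint prefix are placeholders for your own values.
resource "aws_verifiedaccess_endpoint" "internal_api" {
  verified_access_group_id = aws_verifiedaccess_group.main.id
  endpoint_type            = "load-balancer"
  attachment_type          = "vpc"
  application_domain       = var.application_domain     # public DNS name users will visit
  domain_certificate_arn   = var.domain_certificate_arn # ACM certificate for that domain
  endpoint_domain_prefix   = "internal-api"

  load_balancer_options {
    load_balancer_arn = aws_lb.internal.arn
    port              = 443
    protocol          = "https"
    subnet_ids        = var.private_subnet_ids
  }

  security_group_ids = [aws_security_group.verified_access.id]
}

resource "aws_verifiedaccess_trust_provider" "oidc" {
  policy_reference_name    = "google" # name used to reference this provider in access policies
  trust_provider_type      = "user"
  user_trust_provider_type = "oidc"

  oidc_options {
    issuer                 = "https://accounts.google.com"
    authorization_endpoint = "https://accounts.google.com/o/oauth2/v2/auth"
    token_endpoint         = "https://oauth2.googleapis.com/token"
    user_info_endpoint     = "https://openidconnect.googleapis.com/v1/userinfo"
    client_id              = var.oauth_client_id
    client_secret          = var.oauth_client_secret # keep out of VCS; mark the variable sensitive
    scope                  = "openid email"
  }
}
```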

---

## Zero-Trust Checklist

Before deploying any new service:

- [ ] Service identity: SPIFFE SVID or Kubernetes Service Account Token with audience binding?
- [ ] mTLS: `PeerAuthentication` set to `STRICT` (not `PERMISSIVE`) in production?
- [ ] Authorization: `deny-all` policy exists, explicit allowlist for each route?
- [ ] NetworkPolicy: `default-deny-all` in namespace, per-service allowlist?
- [ ] Secrets: stored in Vault/ESO — not in env vars or k8s Secret mounted as env?
- [ ] Egress: only required external endpoints allowlisted?
- [ ] East-west: no plain HTTP between services in the mesh?
- [ ] Observability: mTLS telemetry visible in Prometheus/Jaeger/Kiali?

---

## Tool Reference

| Tool | Purpose |
|------|---------|
| `SPIRE` | SPIFFE reference implementation — workload identity |
| `Istio` | Full-featured service mesh — mTLS, traffic management, observability |
| `Linkerd` | Lightweight service mesh — simpler to operate, lighter-weight proxy |
| `Cilium` | eBPF-based CNI — Layer-7 NetworkPolicies, identity-aware routing |
| `Vault` | Secret management — dynamic secrets, lease-based rotation |
| `External Secrets Operator` | Sync secrets from AWS/GCP/Azure into k8s |
| `OPA/Gatekeeper` | Policy enforcement — enforce Zero-Trust rules as admission webhooks |
| `cert-manager` | Automate certificate rotation for mTLS |

---

## Related Skills

- `kubernetes-patterns` — base k8s patterns (Deployments, Services, Ingress)
- `devsecops-patterns` — SAST/DAST/OPA policy automation
- `auth-patterns` — user authentication (JWT, OAuth2) — different from service identity
- `supply-chain-security` — SBOM, SLSA, Sigstore for artifact trust
