---
name: forge-dockerfile
description: Production-grade Dockerfiles. Multi-stage builds, non-root execution, deterministic dependencies, sub-200MB final image, signal-handling exec-form CMD, build-time secret mounts, .dockerignore hygiene, language-specific defaults. Contains worked Dockerfiles for Node, Go, Rust, Python. Use whenever writing or auditing a Dockerfile.
license: MIT
---

# forge-dockerfile

You are writing a Dockerfile that will be built thousands of times in CI and pulled millions of times in production. Default agent output produces 2GB images that run as root, rebuild dependencies on every code change, and bake secrets into intermediate layers. This skill exists to stop that.

## Quick reference (the things you must never ship)

1. `FROM node:22` (or any other floating tag) instead of a fully pinned version such as `node:22.11.0-alpine3.20`.
2. `FROM node:latest` - latest is never a tag you ship.
3. Image runs as root (no `USER` directive).
4. `COPY . .` as the first COPY (kills the dependency cache).
5. `COPY .env` (or any `.env*`).
6. `ENV API_KEY=...` or any secret baked into the image.
7. `CMD foo arg1 arg2` in shell form (breaks SIGTERM handling).
8. No `.dockerignore` next to the Dockerfile.
9. `apt-get install -y` without `&& rm -rf /var/lib/apt/lists/*`.
10. Final image over 1GB without a documented reason.

## Hard rules

### Image size

**1. Multi-stage build, always.** Final stage contains the runtime only. Build tools, compilers, test deps, dev headers stay in the build stage.

**2. Pick the smallest base that works.**

| Stack | Base for build | Base for final |
| --- | --- | --- |
| Node | `node:22.11.0-alpine3.20` | same, or `gcr.io/distroless/nodejs22-debian12:nonroot` |
| Go (static) | `golang:1.23.4-alpine` | `gcr.io/distroless/static-debian12:nonroot` |
| Rust | `rust:1.83-slim-bookworm` | `gcr.io/distroless/cc-debian12:nonroot` |
| Python | `python:3.13-slim-bookworm` | same (or `:alpine` if no native deps) |

Never `:latest`, never the full image without a reason.

**3. Final image under 200MB for most services.** Above 1GB is a red flag. Node services with native modules can sit at 300-500MB.

### Layer caching

**4. Copy package manifests BEFORE source.** Dependency installs are the slowest layer. Cache them.

```dockerfile
# GOOD: cache-friendly
COPY package.json package-lock.json ./
RUN npm ci --ignore-scripts
COPY . .
RUN npm run build

# BAD: dependency layer invalidates on every source change
COPY . .
RUN npm ci && npm run build
```

**5. One `RUN` per logical step.** Combine `apt-get update && apt-get install && rm -rf /var/lib/apt/lists/*` in one RUN. Do not combine unrelated steps - cache invalidation gets too coarse.
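
As a sketch of that grouping, assuming a Debian-based stage (the `python:3.13-slim-bookworm` base from the table above) and a hypothetical `requirements.txt`: the apt steps share one `RUN` so the package lists never persist in a layer, while the unrelated dependency install gets its own `RUN` and its own cache entry.

```dockerfile
FROM python:3.13-slim-bookworm
WORKDIR /app

# One logical step: update, install, and clean up in the same layer.
RUN apt-get update \
    && apt-get install -y --no-install-recommends curl=7.88.* \
    && rm -rf /var/lib/apt/lists/*

# A separate logical step: editing requirements.txt invalidates only
# the layers from here down, not the apt layer above.
COPY requirements.txt ./
RUN pip install --no-cache-dir -r requirements.txt
```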

**6. `COPY . .` is never the first COPY.** It invalidates the cache on every code change. Manifests first, source second.

### Determinism

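**7. Pin everything.** Base image with full tag (`node:22.11.0-alpine3.20`, not `node:22-alpine`). Tags are mutable, so pin by digest (`image:tag@sha256:<digest>`) when you need byte-for-byte reproducibility. Lockfile present and copied. System packages with versions where supported.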

**8. No `apt-get install -y` without a version pin or lockfile equivalent.** Unpinned installs are reproducible only by luck.

```dockerfile
# BAD
RUN apt-get update && apt-get install -y curl jq

# BETTER (versions shown match Debian bookworm; check `apt-cache policy <pkg>` on your base)
RUN apt-get update \
    && apt-get install -y --no-install-recommends \
        curl=7.88.* \
        jq=1.6-* \
    && rm -rf /var/lib/apt/lists/*
```

**9. No `curl | sh` installers.** They fetch the latest of whatever is upstream. If you must, pin the URL to a specific release tag and verify a checksum.
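
When a direct download is the only distribution channel, a pinned fetch with checksum verification might look like the sketch below. The tool name, URL, version, and checksum are all placeholders (hypothetical, not a real release); substitute the values published by the upstream project. It assumes an Alpine-based stage, where BusyBox provides `wget`, `sha256sum`, and `install`.

```dockerfile
FROM node:22.11.0-alpine3.20 AS build

# Hypothetical values: replace with the real release version and its published SHA-256.
ARG TOOL_VERSION=1.2.3
ARG TOOL_SHA256=replace-with-published-sha256

# Download one specific release, verify it, and only then install it.
# The build fails at the sha256sum step if the file does not match.
RUN wget -q -O /tmp/tool "https://example.com/tool/releases/download/v${TOOL_VERSION}/tool-linux-amd64" \
    && echo "${TOOL_SHA256}  /tmp/tool" | sha256sum -c - \
    && install -m 0755 /tmp/tool /usr/local/bin/tool \
    && rm /tmp/tool
```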

### Security

**10. Run as non-root.**

```dockerfile
# Alpine-based image
RUN addgroup -S app && adduser -S app -G app
USER app

# Distroless: use the built-in nonroot user
USER nonroot:nonroot
```
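
The `addgroup -S` / `adduser -S` flags above are BusyBox syntax and apply to Alpine. On a Debian-based image such as `python:3.13-slim-bookworm`, a rough equivalent uses `groupadd` and `useradd`. The ID 10001 is an arbitrary value chosen for illustration; a numeric `USER` lets Kubernetes verify `runAsNonRoot` without resolving a user name.

```dockerfile
# Debian-based image (assumes groupadd/useradd are available, as in the official slim images).
RUN groupadd --gid 10001 app \
    && useradd --uid 10001 --gid app --no-create-home --shell /usr/sbin/nologin app
USER 10001:10001
```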

**11. Never `COPY .env`.** Use `.dockerignore` to exclude `.env*`, `.git`, `node_modules` (when you reinstall), `.DS_Store`, IDE folders.

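**12. Secrets are mounted, never baked.** Use `--mount=type=secret` at build time (npm tokens, GitHub tokens) and environment variables or mounted files injected by the orchestrator at runtime. Never `ENV API_KEY=...`. `ARG` is no safer: build arguments are recorded in the image history.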

```dockerfile
# BuildKit secret mount for npm token
RUN --mount=type=secret,id=npm_token \
    NPM_TOKEN=$(cat /run/secrets/npm_token) \
    npm ci --ignore-scripts
```
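
For context, a sketch of how the secret might flow end to end. It assumes BuildKit is enabled, that the token lives in an `NPM_TOKEN` environment variable on the build host, and that the project's `.npmrc` reads it via `${NPM_TOKEN}`; the registry line shown in the comment is illustrative.

```dockerfile
# syntax=docker/dockerfile:1
# Build with (the secret never enters a layer or the image history):
#   docker build --secret id=npm_token,env=NPM_TOKEN .

FROM node:22.11.0-alpine3.20 AS build
WORKDIR /app

# Assumed .npmrc content: //registry.npmjs.org/:_authToken=${NPM_TOKEN}
# The file holds only the placeholder, never the token itself.
COPY package.json package-lock.json .npmrc ./

# The secret is mounted at /run/secrets/npm_token for this RUN only.
RUN --mount=type=secret,id=npm_token \
    NPM_TOKEN="$(cat /run/secrets/npm_token)" \
    npm ci --ignore-scripts
```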

**13. Drop capabilities and set read-only filesystem in the orchestrator.** Document the expected runtime constraints at the bottom of the Dockerfile.

```dockerfile
# Runtime constraints (set in k8s / compose):
#   securityContext:
#     readOnlyRootFilesystem: true
#     allowPrivilegeEscalation: false
#     capabilities: { drop: ["ALL"] }
#     seccompProfile: { type: RuntimeDefault }
```

### Health and signals

**14. `HEALTHCHECK` or document the absence.** If the orchestrator handles it (k8s readiness/liveness), say so in a comment.

```dockerfile
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
    CMD wget --quiet --tries=1 --spider http://localhost:3000/health || exit 1
```
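
The `wget` probe above needs a shell and BusyBox, which the Alpine images have and distroless images do not. On a distroless Node image, one option is an exec-form health check that uses the runtime itself. This is a sketch: it assumes Node 18 or later (for the global `fetch`), the same `/health` endpoint on port 3000, and that the distroless image keeps its Node binary at `/nodejs/bin/node`.

```dockerfile
# Exec form: no shell required, so this also works on distroless.
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
    CMD ["/nodejs/bin/node", "-e", "fetch('http://127.0.0.1:3000/health').then(r => process.exit(r.ok ? 0 : 1)).catch(() => process.exit(1))"]

# If the orchestrator probes instead, replace the instruction with a comment:
# No HEALTHCHECK: Kubernetes readiness/liveness probes hit /health.
```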

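**15. `CMD ["binary", "args"]` exec form, never shell form.** Shell form (`CMD command arg`) wraps the command in `/bin/sh -c`, so the shell runs as PID 1 and usually does not forward signals. SIGTERM never reaches the app, and the container is killed only when the stop grace period expires (10s by default for `docker stop`, 30s by default in Kubernetes).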

```dockerfile
# BAD
CMD node dist/server.js

# GOOD
CMD ["node", "dist/server.js"]
```

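**16. Trap SIGTERM in your app, or use `tini` as PID 1.** The kernel does not apply default signal actions to PID 1, so an app that installs no SIGTERM handler ignores the signal when it is the container's first process. Either handle SIGTERM in the app and shut down gracefully, or run the app under `tini`, which forwards signals and reaps zombie processes. Document the choice.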
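
A minimal sketch of the `tini` option on an Alpine-based runtime stage. It assumes the Alpine `tini` package, which installs the binary at `/sbin/tini`; the package version is left unpinned here for brevity and should be pinned in a real image. `docker run --init` achieves a similar result without changing the image.

```dockerfile
FROM node:22.11.0-alpine3.20

RUN apk add --no-cache tini

# tini becomes PID 1; the CMD below is passed to it as arguments.
ENTRYPOINT ["/sbin/tini", "--"]
CMD ["node", "dist/server.js"]
```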

### Build context

**17. `.dockerignore` is mandatory.** At minimum:

```
.git
.github
node_modules
dist
build
.env*
!.env.example
.DS_Store
.vscode
.idea
*.md
!README.md
Dockerfile
docker-compose*.yml
test
coverage
```

**18. No `ADD` when `COPY` would do.** `ADD` fetches remote URLs and silently auto-extracts local tar archives; `COPY` only copies files from the build context.
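
To make the difference concrete, a sketch with a hypothetical `vendor.tar.gz` in the build context. The bind mount assumes BuildKit; it lets the archive be extracted deliberately without the archive itself being stored in a layer.

```dockerfile
# Surprising: ADD would silently unpack a local tar archive into the destination.
# ADD vendor.tar.gz /opt/vendor/

# Explicit alternative: mount the archive for one RUN and extract it on purpose.
RUN --mount=type=bind,source=vendor.tar.gz,target=/tmp/vendor.tar.gz \
    mkdir -p /opt/vendor \
    && tar -xzf /tmp/vendor.tar.gz -C /opt/vendor
```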

### Language-specific defaults

**Node:**
- `npm ci` not `npm install`. Reproducible.
- `NODE_ENV=production` in the final stage.
- `corepack enable` if using pnpm or yarn.
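- Use `--ignore-scripts` on `npm ci` for security (some packages run code on install). If a dependency genuinely needs its install script (a native addon, for example), run `npm rebuild <package>` for that package explicitly.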
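
For pnpm, the same defaults might look like this. A sketch, assuming the project has a `pnpm-lock.yaml` and a `packageManager` field in `package.json` so that Corepack resolves a pinned pnpm version.

```dockerfile
FROM node:22.11.0-alpine3.20 AS build
WORKDIR /app

# Corepack ships with the official Node 22 images and provides the pnpm shim.
RUN corepack enable

# Manifests first, as with npm.
COPY package.json pnpm-lock.yaml ./

# --frozen-lockfile is pnpm's counterpart to `npm ci`: fail rather than update the lockfile.
RUN pnpm install --frozen-lockfile --ignore-scripts
```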

**Python:**
- `pip install --no-cache-dir -r requirements.txt`.
- `PYTHONUNBUFFERED=1`, `PYTHONDONTWRITEBYTECODE=1`.
- Use `uv` or `pip-tools` for lockfiles. Bare `requirements.txt` without pinned transitive deps is not reproducible.
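
A worked sketch applying these defaults. It assumes a fully pinned `requirements.txt` with hashes (for example, generated by `pip-compile --generate-hashes`), a hypothetical package directory `app/` with an `__main__.py`, and port 8000; adjust all three to the real project.

```dockerfile
# Build stage: install dependencies into a virtualenv.
FROM python:3.13-slim-bookworm AS build

ENV PYTHONDONTWRITEBYTECODE=1 \
    PIP_DISABLE_PIP_VERSION_CHECK=1

RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"

# Manifest first so the install layer is cached across source changes.
WORKDIR /app
COPY requirements.txt ./
RUN pip install --no-cache-dir --require-hashes -r requirements.txt

# Runtime stage: same base, so the virtualenv's interpreter path still resolves.
FROM python:3.13-slim-bookworm

RUN groupadd --gid 10001 app \
    && useradd --uid 10001 --gid app --no-create-home --shell /usr/sbin/nologin app

ENV PYTHONUNBUFFERED=1 \
    PYTHONDONTWRITEBYTECODE=1 \
    PATH="/opt/venv/bin:$PATH"

WORKDIR /app
COPY --from=build /opt/venv /opt/venv
COPY --chown=app:app app ./app

USER 10001:10001
EXPOSE 8000

# Exec form; `app` is the hypothetical package name.
CMD ["python", "-m", "app"]
```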

**Go:**
- Static binary: `CGO_ENABLED=0 GOOS=linux go build`.
- Final stage `gcr.io/distroless/static-debian12:nonroot` - typically under 20MB.

**Rust:**
- Build stage `rust:1.83-slim-bookworm`, final stage `gcr.io/distroless/cc-debian12:nonroot`.
- Cache the dependency build: copy `Cargo.toml` + `Cargo.lock`, build against a stub `src/main.rs` containing `fn main() {}` (an empty file does not compile), then copy the real source and rebuild.
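
The dependency-caching trick can be sketched as follows. It assumes a single binary crate whose package name is `server` (hypothetical; use the name from `Cargo.toml`) and no `build.rs`.

```dockerfile
FROM rust:1.83-slim-bookworm AS build
WORKDIR /src

# Manifests first, then compile dependencies against a stub main.
COPY Cargo.toml Cargo.lock ./
RUN mkdir src \
    && echo 'fn main() {}' > src/main.rs \
    && cargo build --release --locked \
    && rm -rf src

# Real source. The touch bumps the mtime so cargo rebuilds the crate
# instead of reusing the stub artifact.
COPY src ./src
RUN touch src/main.rs && cargo build --release --locked

FROM gcr.io/distroless/cc-debian12:nonroot

# `server` is the assumed binary name.
COPY --from=build /src/target/release/server /app

USER nonroot:nonroot
ENTRYPOINT ["/app"]
```

Only changes to `Cargo.toml` or `Cargo.lock` invalidate the dependency layer; source edits recompile just the crate itself.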

## Common AI-output patterns to reject

| Pattern | Why wrong | Fix |
| --- | --- | --- |
| `FROM node:latest` | Unpinned, breaks reproducibility | `FROM node:22.11.0-alpine3.20` |
| Single-stage 1.5GB image | Build tools baked in | Multi-stage; final has runtime only |
| `COPY . .` first | Kills dep cache | Manifests, install, then source |
| No `USER` directive | Runs as root | `USER app` or `USER nonroot:nonroot` |
| `COPY .env` | Secret in image | `.dockerignore` excludes it |
| `ENV API_KEY=...` | Secret baked at build | Runtime env, never `ENV` for secrets |
| `CMD foo arg` (shell form) | SIGTERM swallowed | `CMD ["foo", "arg"]` exec form |
| No `.dockerignore` | Sends 500MB of `node_modules` to daemon | Add `.dockerignore` |
| `RUN apt-get install ...` no cleanup | apt package lists (tens of MB) stay in the layer | `&& rm -rf /var/lib/apt/lists/*` in the same `RUN` |
| `RUN curl ... \| sh` | Floating install | Pinned URL + checksum |
| `imagePullPolicy: Always` with `:latest` | Every pod start checks the registry and may pull a different image | Pinned tag |

## Worked example: Node service Dockerfile

```dockerfile
# Production Dockerfile for a Hono/Node service.
# Follows forge-dockerfile.
# Target: sub-200MB final, non-root, deterministic, signal-handled.

# ─── build stage ──────────────────────────────────────────────────

FROM node:22.11.0-alpine3.20 AS build

WORKDIR /app

# Manifests first (cache hit on dep install layer).
COPY package.json package-lock.json ./
RUN npm ci --ignore-scripts

# Source next.
COPY tsconfig.json ./
COPY src ./src

# Typecheck + bundle.
RUN npx tsc --noEmit
RUN npx esbuild src/server.ts \
    --bundle --platform=node --target=node22 \
    --outfile=dist/server.js

# Drop dev deps for the production stage to copy.
RUN npm prune --omit=dev

# ─── runtime stage ────────────────────────────────────────────────

FROM node:22.11.0-alpine3.20

# Non-root user.
RUN addgroup -S app && adduser -S app -G app

WORKDIR /app
COPY --from=build --chown=app:app /app/dist ./dist
COPY --from=build --chown=app:app /app/node_modules ./node_modules
COPY --from=build --chown=app:app /app/package.json ./

ENV NODE_ENV=production
ENV PORT=3000

USER app

# Exec form so SIGTERM reaches the process (the app must still handle it; see rule 16).
CMD ["node", "dist/server.js"]

HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
    CMD wget --quiet --tries=1 --spider http://localhost:3000/health || exit 1

# Runtime constraints (set in k8s):
#   readOnlyRootFilesystem: true
#   allowPrivilegeEscalation: false
#   capabilities: { drop: ["ALL"] }
#   resources: { requests: {cpu: 50m, memory: 128Mi}, limits: {memory: 256Mi} }
```

## Worked example: Go service Dockerfile

```dockerfile
FROM golang:1.23.4-alpine AS build

WORKDIR /src
COPY go.mod go.sum ./
RUN go mod download
COPY . .

RUN CGO_ENABLED=0 GOOS=linux go build -trimpath -ldflags="-s -w" -o /out/app ./cmd/server

# Final: distroless, ~20MB
FROM gcr.io/distroless/static-debian12:nonroot

COPY --from=build /out/app /app

USER nonroot:nonroot
EXPOSE 8080
ENTRYPOINT ["/app"]
```

## Worked example: companion .dockerignore

```
.git
.github
node_modules
dist
build
coverage
.env
.env.*
!.env.example
.DS_Store
.vscode
.idea
*.md
!README.md
Dockerfile
docker-compose*.yml
test
**/__tests__
**/*.test.ts
**/*.spec.ts
```

## Workflow

When writing a Dockerfile:

1. **Identify the runtime.** Compiled binary? Interpreted? Asset bundle?
2. **Pick build stage and final stage bases separately.**
3. **Write the COPY order to maximize cache hits.** Manifest, install, source, build.
4. **Add `.dockerignore` before the first build.**
5. **Build twice.** Second build should hit the cache hard. If not, your COPY order is wrong.
6. **Inspect with `dive` or `docker history`.** Layer sizes tell you where you wasted space.

## Verification

```bash
bash skills/infra/forge-dockerfile/verify/check_dockerfile.sh path/to/Dockerfile
```

The script flags: `:latest` tags, a missing `USER`, secret-shaped `ENV`, `COPY` of `.env`, `CMD` in shell form, and a missing `.dockerignore`.

## When to skip this skill

- Local dev Dockerfiles where size and security do not matter.
- Throwaway one-off containers.
- Devcontainers, which have their own conventions.

## Related skills

- [`forge-secrets`](../../security/forge-secrets/SKILL.md) - secrets never bake into the image.
- [`forge-kubernetes`](../forge-kubernetes/SKILL.md) - runtime constraints documented at the bottom of the Dockerfile.
- [`forge-github-actions`](../forge-github-actions/SKILL.md) - building and pushing the image.
