---
name: devops_and_infrastructure
type: skill
description: 'DevOps practices, infrastructure as code, CI/CD pipelines, containerisation,
  orchestration, cloud platforms, monitoring, logging, and site reliability engineering.
  Covers AWS, Azure, GCP, Docker, Kubernetes, Terraform, Ansible, and GitHub Actions.

  '
triggers:
- docker
- dockerfile
- container
- kubernetes
- k8s
- helm
- terraform
- ansible
- ci/cd
- pipeline
- github actions
- gitlab ci
- jenkins
- deploy
- deployment
- infrastructure
- cloud
- aws
- azure
- gcp
- monitoring
- prometheus
- grafana
- logging
- elk
- load balancer
- nginx
- reverse proxy
- secrets management
- vault
- service mesh
- istio
- devops
- sre
- reliability
- scalability
compatible_tools:
- code_executor
- file_reader
- file_writer
- web_search
- docx_generator
- claude_gate
escalation_hints:
- multi-region active-active architecture
- zero-downtime migration of production database
- regulatory compliance for cloud infrastructure
scope:
  allowed_paths:
  - workspaces/
  - runtime/outputs/
  blocked_verbs: []
---

# DevOps and Infrastructure

Docker best practices:
- Use official base images, pin versions explicitly
- Multi-stage builds to minimise image size
- Non-root user in production containers
- COPY over ADD unless extracting archives
- .dockerignore to exclude unnecessary files
- One process per container
- Environment variables for configuration
- Health checks defined in Dockerfile

Kubernetes essentials:
- Deployments for stateless workloads
- StatefulSets for stateful workloads (databases)
- Services for stable networking
- ConfigMaps for non-sensitive config
- Secrets for sensitive config (base64-encoded, not encrypted
  by default; consider external secret management such as
  Vault or AWS Secrets Manager)
- Resource requests and limits always set
- Liveness and readiness probes defined
- PodDisruptionBudgets for availability
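
The "always set" items above lend themselves to a pre-deploy check. This is a minimal sketch that assumes the manifest has already been parsed into a Python dict (for example from YAML); `check_deployment` is a hypothetical helper name, and only the standard Deployment fields `spec.template.spec.containers[*].resources`, `livenessProbe`, and `readinessProbe` are inspected.

```python
def check_deployment(manifest: dict) -> list[str]:
    """Report containers missing resource requests/limits or probes.

    Hypothetical helper for illustration; expects a parsed Deployment manifest.
    """
    problems: list[str] = []
    pod_spec = manifest.get("spec", {}).get("template", {}).get("spec", {})

    for container in pod_spec.get("containers", []):
        name = container.get("name", "<unnamed>")
        resources = container.get("resources") or {}

        # Both requests and limits should be present and non-empty.
        for kind in ("requests", "limits"):
            if not resources.get(kind):
                problems.append(f"{name}: missing resources.{kind}")

        # Liveness and readiness probes are configured per container.
        for probe in ("livenessProbe", "readinessProbe"):
            if probe not in container:
                problems.append(f"{name}: missing {probe}")

    return problems
```

In a cluster, the same rules are usually enforced by an admission policy; a script like this is mainly useful as a fast check in CI before the manifest is applied.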

Terraform patterns:
- Remote state storage with locking (e.g. S3, with DynamoDB
  or S3-native lockfiles for locking)
- Separate state per environment
- Modules for reusable infrastructure
- Variable validation blocks
- Output values for cross-module references
- Plan before apply — always review changes
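
One way to make "plan before apply" enforceable in automation is to save the plan to a file and branch on the exit code of `terraform plan -detailed-exitcode` (0: no changes, 1: error, 2: changes present). The sketch below assumes the `terraform` binary is on `PATH` and that the working directory is already initialised; the function names and the `tfplan` file name are illustrative choices, not part of Terraform.

```python
import subprocess

# Exit codes documented for `terraform plan -detailed-exitcode`.
PLAN_OUTCOMES = {0: "no-changes", 1: "error", 2: "changes-present"}


def classify_plan_exit(code: int) -> str:
    """Map a plan exit code to an outcome; unknown codes count as errors."""
    return PLAN_OUTCOMES.get(code, "error")


def run_plan(workdir: str, plan_file: str = "tfplan") -> str:
    """Run `terraform plan`, saving the plan so the reviewed plan is what gets applied."""
    result = subprocess.run(
        ["terraform", "plan", "-input=false", "-detailed-exitcode", f"-out={plan_file}"],
        cwd=workdir,
    )
    return classify_plan_exit(result.returncode)


def apply_reviewed_plan(workdir: str, approved: bool, plan_file: str = "tfplan") -> bool:
    """Apply the saved plan only after an explicit approval; returns True on success."""
    if not approved:
        return False
    result = subprocess.run(
        ["terraform", "apply", "-input=false", plan_file],
        cwd=workdir,
    )
    return result.returncode == 0
```

Applying the saved plan file, rather than re-running `apply` against the configuration, keeps the applied changes identical to the ones that were reviewed.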

CI/CD principles:
- Fast feedback — tests run in parallel
- Build once, deploy many times
- Environment parity — dev matches prod
- Automated rollback on failure
- Secrets never in pipeline YAML; inject them at runtime
  from a secrets manager (e.g. Vault)
- Branch protection with required status checks
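
"Build once, deploy many times" and "automated rollback on failure" fit together: a rollback is simply a redeploy of the previous, already-built artifact. The following is a minimal sketch under that assumption; `deploy_with_rollback` and its parameters are hypothetical, and the `deploy` and `health_check` callables stand in for whatever the pipeline actually uses.

```python
import time
from typing import Callable


def deploy_with_rollback(
    deploy: Callable[[str], None],
    health_check: Callable[[], bool],
    new_version: str,
    previous_version: str,
    attempts: int = 3,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Deploy new_version; redeploy previous_version if it never becomes healthy.

    Returns the version that is live when the function exits.
    """
    try:
        deploy(new_version)
        for attempt in range(attempts):
            if health_check():
                return new_version
            if attempt < attempts - 1:
                sleep(interval_s)  # give the release time to warm up
    except Exception:
        # A failed deploy or probe is treated like an unhealthy release.
        pass

    # Rollback: redeploy the previously built artifact, with no rebuild.
    deploy(previous_version)
    return previous_version
```

Passing `sleep` as a parameter keeps the function testable without real delays, for example `deploy_with_rollback(deploy, check, "v2", "v1", sleep=lambda s: None)`.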

Monitoring and observability:
- The three pillars: metrics, logs, traces
- RED method: Rate, Errors, Duration
- USE method: Utilisation, Saturation, Errors
- Alert on symptoms, not causes
- SLOs and error budgets
- Runbooks linked from every alert
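
The arithmetic behind SLOs and error budgets is short enough to show directly. This sketch assumes an availability-style SLO expressed as a fraction (0.999 for 99.9%) over a rolling window; the function names are illustrative.

```python
def error_budget_minutes(slo: float, window_days: int = 30) -> float:
    """Minutes of allowed unavailability in the window for a given SLO."""
    return (1.0 - slo) * window_days * 24 * 60


def burn_rate(observed_error_ratio: float, slo: float) -> float:
    """How fast the budget is being spent relative to plan.

    1.0 means the budget lasts exactly the window; 2.0 means it is
    exhausted in half the window.
    """
    return observed_error_ratio / (1.0 - slo)
```

For a 99.9% SLO over 30 days, the budget is 0.001 × 43,200 minutes, or about 43.2 minutes. An observed error ratio of 0.2% against that SLO gives a burn rate of about 2, which is the kind of symptom-level signal worth alerting on.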

Cloud platform key services:
- AWS: EC2, ECS, EKS, Lambda, RDS, S3, CloudFront,
  Route 53, IAM, VPC, SQS, SNS, CloudWatch
- Azure: AKS, App Service, Functions, SQL Database,
  Blob Storage, Key Vault, Microsoft Entra ID
  (formerly Azure Active Directory)
- GCP: GKE, Cloud Run, Cloud Functions, BigQuery,
  Cloud Storage, IAM, Cloud Monitoring

Security in DevOps:
- Shift-left security: scan early and often
- Container image scanning (Trivy, Snyk)
- SAST in CI pipeline
- Dependency vulnerability scanning
- Least privilege IAM policies
- Network policies in Kubernetes
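
Least privilege can be partly checked in CI by scanning policy documents for wildcards. The sketch below assumes an AWS-style IAM policy already loaded as a dict, with the standard `Statement`, `Effect`, `Action`, and `Resource` keys; `find_wildcards` is a hypothetical helper and deliberately ignores `NotAction`, conditions, and other subtleties.

```python
def find_wildcards(policy: dict) -> list[str]:
    """List Allow statements that grant wildcard actions or resources.

    Hypothetical helper for illustration; expects a parsed IAM policy document.
    """
    findings: list[str] = []
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may not be wrapped in a list
        statements = [statements]

    for index, statement in enumerate(statements):
        if statement.get("Effect") != "Allow":
            continue
        for field in ("Action", "Resource"):
            values = statement.get(field, [])
            if isinstance(values, str):  # string and list forms are both valid
                values = [values]
            for value in values:
                # "*" matches everything; "service:*" grants a whole service.
                if value == "*" or (field == "Action" and value.endswith(":*")):
                    findings.append(f"Statement[{index}].{field}: {value}")
    return findings
```

A non-empty result is a prompt for review rather than proof of a problem: some wildcards are legitimate, but each one should be a deliberate choice.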
