## 🚨 CRITICAL GUIDELINES

### Windows File Path Requirements

**MANDATORY: Always Use Backslashes on Windows for File Paths**

When using Edit or Write tools on Windows, you MUST use backslashes (`\`) in file paths, NOT forward slashes (`/`).

**Examples:**
- ❌ WRONG: `D:/repos/project/file.tsx`
- ✅ CORRECT: `D:\repos\project\file.tsx`

This applies to:
- Edit tool file_path parameter
- Write tool file_path parameter
- All file operations on Windows systems

### Documentation Guidelines

**NEVER create new documentation files unless explicitly requested by the user.**

- **Priority**: Update existing README.md files rather than creating new documentation
- **Repository cleanliness**: Keep repository root clean - only README.md unless user requests otherwise
- **Style**: Documentation should be concise, direct, and professional - avoid AI-generated tone
- **User preference**: Only create additional .md files when user specifically asks for documentation

---

# Docker Compose Patterns for Production (2025)

## Overview

This skill documents production-ready Docker Compose patterns and best practices for 2025, based on official Docker documentation and industry standards.

## File Format Changes (2025)

**IMPORTANT:** The top-level `version` field is **obsolete**. Docker Compose v2 ignores it, and recent releases print a warning when it is present.
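To locate Compose files that still declare the obsolete field, a search for a top-level `version:` key is usually enough. The following is a minimal sketch, not part of Docker's tooling: the function name `find_versioned_compose_files` and the file-name patterns it searches are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: list Compose files under a directory (default: the
# current one) that still declare a top-level `version:` key.
# The file-name patterns below are assumptions; adjust them to your layout.
find_versioned_compose_files() {
  dir="${1:-.}"
  find "$dir" -type f \( \
      -name 'compose*.yaml' -o -name 'compose*.yml' \
      -o -name 'docker-compose*.yaml' -o -name 'docker-compose*.yml' \
    \) -exec grep -l -E '^version:' {} +
}
```

Calling `find_versioned_compose_files ./deploy` prints one path per matching file. The exit status may be non-zero when some files do not match, so rely on the printed paths rather than the status.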
**Correct (2025):**
```yaml
services:
  app:
    image: myapp:latest
```

**Incorrect (deprecated):**
```yaml
version: '3.8'  # DO NOT USE
services:
  app:
    image: myapp:latest
```

## Multiple Environment Strategy

### Pattern: Base + Environment Overrides

**compose.yaml (base):**
```yaml
services:
  app:
    build:
      context: ./app
      dockerfile: Dockerfile
    environment:
      - NODE_ENV=production
    restart: unless-stopped
```

**compose.override.yaml (development - auto-loaded):**
```yaml
services:
  app:
    build:
      target: development
    volumes:
      - ./app/src:/app/src:cached
    environment:
      - NODE_ENV=development
      - DEBUG=*
    ports:
      - "9229:9229"  # Debugger
```

**compose.prod.yaml (production - explicit):**
```yaml
services:
  app:
    build:
      target: production
    deploy:
      replicas: 3
      resources:
        limits:
          cpus: '1'
          memory: 512M
      restart_policy:
        condition: on-failure
        max_attempts: 3
```

**Usage:**
```bash
# Development (auto-loads compose.override.yaml)
docker compose up

# Production
docker compose -f compose.yaml -f compose.prod.yaml up -d

# CI/CD
docker compose -f compose.yaml -f compose.ci.yaml up --abort-on-container-exit
```

## Environment Variable Management

### Pattern: .env Files per Environment

**.env.template (committed to git):**
```bash
# Database
DB_HOST=sqlserver
DB_PORT=1433
DB_NAME=myapp
DB_USER=sa
# DB_PASSWORD= (set in actual .env)

# Redis
REDIS_HOST=redis
REDIS_PORT=6379
# REDIS_PASSWORD= (set in actual .env)

# Application
NODE_ENV=production
LOG_LEVEL=info
```

**.env.dev:**
```bash
DB_PASSWORD=Dev!Pass123
REDIS_PASSWORD=redis-dev-123
NODE_ENV=development
LOG_LEVEL=debug
```

**.env.prod:**
```bash
DB_PASSWORD=${PROD_DB_PASSWORD}  # From CI/CD
REDIS_PASSWORD=${PROD_REDIS_PASSWORD}
NODE_ENV=production
LOG_LEVEL=info
```

**Load specific environment:**
```bash
docker compose --env-file .env.dev up
```

## Security Patterns

### Pattern: Run as Non-Root User

```yaml
services:
  app:
    image: node:20-alpine
    user: "1000:1000"  # UID:GID
    read_only: true
    tmpfs:
      - /tmp
      - /app/.cache
    cap_drop:
      - ALL
    cap_add:
      - NET_BIND_SERVICE  # Only if binding to ports < 1024
    security_opt:
      - no-new-privileges:true
```

**Use a non-root user in the Dockerfile:**
```dockerfile
FROM node:20-alpine

# The official node images already ship a non-root `node` user (UID/GID 1000),
# so creating a second user with ID 1000 would fail. Reuse it instead.
WORKDIR /app
COPY --chown=node:node . .

USER node
```

### Pattern: Secrets Management

**Docker Swarm secrets (production):**
```yaml
services:
  app:
    secrets:
      - db_password
      - api_key

secrets:
  db_password:
    file: ./secrets/db_password.txt
  api_key:
    external: true  # Managed by Swarm
```

**Access secrets in application:**
```javascript
// Read from /run/secrets/
const fs = require('fs');
const dbPassword = fs.readFileSync('/run/secrets/db_password', 'utf8').trim();
```

**Development alternative (environment):**
```yaml
services:
  app:
    environment:
      - DB_PASSWORD_FILE=/run/secrets/db_password
```

## Health Check Patterns

### Pattern: Comprehensive Health Checks

**HTTP endpoint:**
```yaml
services:
  web:
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
      interval: 30s
      timeout: 3s
      retries: 3
      start_period: 40s
```

**Database ping:**
```yaml
services:
  postgres:
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U $$POSTGRES_USER"]
      interval: 10s
      timeout: 3s
      retries: 3
```

**Custom script:**
```yaml
services:
  app:
    healthcheck:
      test: ["CMD", "node", "/app/scripts/healthcheck.js"]
      interval: 30s
      timeout: 3s
      retries: 3
      start_period: 40s
```

**healthcheck.js:**
```javascript
const http = require('http');

const options = {
  hostname: 'localhost',
  port: 8080,
  path: '/health',
  timeout: 2000
};

// Exit 0 (healthy) only on HTTP 200; any other status is unhealthy
const req = http.request(options, (res) => {
  process.exit(res.statusCode === 200 ? 0 : 1);
});

req.on('error', () => process.exit(1));
req.on('timeout', () => {
  req.destroy();
  process.exit(1);
});

req.end();
```

## Dependency Management

### Pattern: Ordered Startup with Conditions

```yaml
services:
  web:
    depends_on:
      database:
        condition: service_healthy
      redis:
        condition: service_started
      migration:
        condition: service_completed_successfully

  database:
    healthcheck:
      test: ["CMD-SHELL", "pg_isready"]
      interval: 10s

  redis:
    image: redis:7.4-alpine
    # No health check needed, just wait for start

  migration:
    image: myapp:latest
    command: npm run migrate
    restart: "no"  # Run once
    depends_on:
      database:
        condition: service_healthy
```

## Network Isolation Patterns

### Pattern: Three-Tier Network Architecture

```yaml
services:
  nginx:
    image: nginx:alpine
    networks:
      - frontend
    ports:
      - "80:80"

  api:
    build: ./api
    networks:
      - frontend
      - backend

  database:
    image: postgres:16-alpine
    networks:
      - backend  # No frontend access

networks:
  frontend:
    driver: bridge
  backend:
    driver: bridge
    internal: true  # No external access
```

### Pattern: Service-Specific Networks

```yaml
services:
  web-app:
    networks:
      - public
      - app-network

  api:
    networks:
      - app-network
      - data-network

  postgres:
    networks:
      - data-network

  redis:
    networks:
      - data-network

networks:
  public:
    driver: bridge
  app-network:
    driver: bridge
    internal: true
  data-network:
    driver: bridge
    internal: true
```

## Volume Patterns

### Pattern: Named Volumes for Persistence

```yaml
services:
  database:
    volumes:
      - db-data:/var/lib/postgresql/data  # Persistent data
      - ./init:/docker-entrypoint-initdb.d:ro  # Init scripts (read-only)
      - db-logs:/var/log/postgresql  # Logs

volumes:
  db-data:
    driver: local
    driver_opts:
      type: none
      o: bind
      device: /mnt/data/postgres  # Host path
  db-logs:
    driver: local
```

### Pattern: Development Bind Mounts

```yaml
services:
  app:
    volumes:
      - ./src:/app/src:cached  # Legacy macOS consistency hint
      - /app/node_modules  # Don't overwrite installed modules
      - app-cache:/app/.cache  # Named volume for cache
```

**Volume mount options:**
- `:ro` - Read-only
- `:rw` - Read-write (default)
- `:cached` - Legacy macOS consistency hint (host authoritative); accepted but ignored by current Docker Desktop releases
- `:delegated` - Legacy macOS consistency hint (container authoritative); accepted but ignored by current Docker Desktop releases
- `:z` - SELinux shared label (content can be shared between multiple containers)
- `:Z` - SELinux private label (content is restricted to a single container)

## Resource Management Patterns

### Pattern: CPU and Memory Limits

```yaml
services:
  app:
    deploy:
      resources:
        limits:
          cpus: '1.0'
          memory: 512M
        reservations:
          cpus: '0.5'
          memory: 256M
```

**Calculate total resources:**
```yaml
# 3 app replicas + database + redis
services:
  app:
    deploy:
      replicas: 3
      resources:
        limits:
          cpus: '0.5'   # 3 x 0.5 = 1.5 CPUs
          memory: 512M  # 3 x 512M = 1.5GB

  database:
    deploy:
      resources:
        limits:
          cpus: '2'   # 2 CPUs
          memory: 4G  # 4GB

  redis:
    deploy:
      resources:
        limits:
          cpus: '0.5'   # 0.5 CPUs
          memory: 512M  # 512MB

# Total: 4 CPUs, 6GB RAM minimum
```

## Logging Patterns

### Pattern: Centralized Logging

```yaml
services:
  app:
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"
        compress: "true"
        labels: "app,environment"
```

**Alternative: Log to stdout/stderr (12-factor):**
```yaml
services:
  app:
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"
```

**View logs:**
```bash
docker compose logs -f app
docker compose logs --since 30m app
docker compose logs --tail 100 app
```

## Init Container Pattern

### Pattern: Database Migration

```yaml
services:
  migration:
    image: myapp:latest
    command: npm run migrate
    depends_on:
      database:
        condition: service_healthy
    restart: "no"  # Run once
    networks:
      - backend

  app:
    image: myapp:latest
    depends_on:
      migration:
        condition: service_completed_successfully
    networks:
      - backend
```

## YAML Anchors and Aliases

### Pattern: Reusable Configuration

```yaml
x-common-app-config: &common-app
  restart: unless-stopped
  logging:
    driver: "json-file"
    options:
      max-size: "10m"
      max-file: "3"
  security_opt:
    - no-new-privileges:true
  cap_drop:
    - ALL
  cap_add:
    - NET_BIND_SERVICE

services:
  app1:
    <<: *common-app
    build: ./app1
    ports:
      - "8001:8080"

  app2:
    <<: *common-app
    build: ./app2
    ports:
      - "8002:8080"

  app3:
    <<: *common-app
    build: ./app3
    ports:
      - "8003:8080"
```

### Pattern: Environment-Specific Overrides

```yaml
x-logging: &default-logging
  driver: "json-file"
  options:
    max-size: "10m"
    max-file: "3"

x-resources: &default-resources
  limits:
    cpus: '1'
    memory: 512M
  reservations:
    cpus: '0.5'
    memory: 256M

services:
  app:
    logging: *default-logging
    deploy:
      resources: *default-resources
```

## Port Binding Patterns

### Pattern: Security-First Port Binding

```yaml
services:
  # Public services
  web:
    ports:
      - "80:8080"
      - "443:8443"

  # Development only (localhost binding)
  debug:
    ports:
      - "127.0.0.1:9229:9229"  # Debugger only accessible from host

  # Environment-based binding
  app:
    ports:
      - "${DOCKER_WEB_PORT_FORWARD:-127.0.0.1:8000}:8000"
```

**Environment control:**
```bash
# Development (.env.dev)
DOCKER_WEB_PORT_FORWARD=127.0.0.1:8000  # Localhost only

# Production (.env.prod)
DOCKER_WEB_PORT_FORWARD=8000  # All interfaces
```

## Restart Policy Patterns

```yaml
services:
  # Always restart (production services)
  app:
    restart: always

  # Restart unless manually stopped (most common)
  database:
    restart: unless-stopped

  # Never restart (one-time tasks)
  migration:
    restart: "no"

  # Restart on failure only (deploy syntax, as used with Swarm;
  # `restart: on-failure` is the service-level equivalent)
  worker:
    deploy:
      restart_policy:
        condition: on-failure
        delay: 5s
        max_attempts: 3
        window: 120s
```

## Validation and Testing

### Pattern: Pre-Deployment Validation

```bash
#!/bin/bash
set -euo pipefail

echo "Validating Compose syntax..."
docker compose config > /dev/null

echo "Building images..."
docker compose build

echo "Running security scan..."
# `config --images` prints one image name per line
for image in $(docker compose config --images); do
  docker scout cves "$image" || true
done

echo "Starting services..."
docker compose up -d

echo "Checking health..."
sleep 10
docker compose ps

echo "Running smoke tests..."
curl -f http://localhost:8080/health || exit 1

echo "✓ All checks passed"
```

## Complete Production Example

```yaml
# Modern Compose format (no version field)

x-common-service: &common-service
  restart: unless-stopped
  logging:
    driver: "json-file"
    options:
      max-size: "10m"
      max-file: "3"
  security_opt:
    - no-new-privileges:true

services:
  nginx:
    <<: *common-service
    image: nginxinc/nginx-unprivileged:alpine
    ports:
      - "80:8080"
    volumes:
      - ./nginx/conf.d:/etc/nginx/conf.d:ro
    networks:
      - frontend
    depends_on:
      api:
        condition: service_healthy
    healthcheck:
      test: ["CMD", "wget", "--quiet", "--tries=1", "--spider", "http://localhost:8080/health"]
      interval: 30s

  api:
    <<: *common-service
    build:
      context: ./api
      dockerfile: Dockerfile
      target: production
    user: "1000:1000"
    read_only: true
    tmpfs:
      - /tmp
    cap_drop:
      - ALL
    cap_add:
      - NET_BIND_SERVICE
    networks:
      - frontend
      - backend
    depends_on:
      migration:
        condition: service_completed_successfully
      redis:
        condition: service_started
    env_file:
      - .env
    healthcheck:
      test: ["CMD", "node", "healthcheck.js"]
      interval: 30s
      start_period: 40s
    deploy:
      resources:
        limits:
          cpus: '1'
          memory: 512M

  migration:
    image: myapp:latest
    command: npm run migrate
    restart: "no"
    networks:
      - backend
    depends_on:
      postgres:
        condition: service_healthy

  postgres:
    <<: *common-service
    image: postgres:16-alpine
    environment:
      - POSTGRES_PASSWORD_FILE=/run/secrets/postgres_password
    secrets:
      - postgres_password
    volumes:
      - postgres-data:/var/lib/postgresql/data
    networks:
      - backend
    healthcheck:
      test: ["CMD-SHELL", "pg_isready"]
      interval: 10s
    deploy:
      resources:
        limits:
          cpus: '1'
          memory: 2G

  redis:
    <<: *common-service
    image: redis:7.4-alpine
    command: redis-server --requirepass ${REDIS_PASSWORD}
    volumes:
      - redis-data:/data
    networks:
      - backend
    healthcheck:
      # Authenticate, since requirepass is set; an unauthenticated ping gets NOAUTH
      test: ["CMD-SHELL", "redis-cli --no-auth-warning -a \"${REDIS_PASSWORD}\" ping | grep -q PONG"]
      interval: 10s

networks:
  frontend:
    driver: bridge
  backend:
    driver: bridge
    internal: true

volumes:
  postgres-data:
    driver: local
  redis-data:
    driver: local

secrets:
  postgres_password:
    file: ./secrets/postgres_password.txt
```

## Common Mistakes to Avoid

1. **Using `version` field** - Obsolete in 2025
2. **No health checks** - Leads to race conditions
3. **Running as root** - Security risk
4. **No resource limits** - Can exhaust host resources
5. **Hardcoded secrets** - Use secrets or environment variables
6. **No logging limits** - Disk space issues
7. **Bind mounts in production** - Use named volumes
8. **Missing restart policies** - Services don't recover
9. **No network isolation** - All services can talk to each other
10. **Not using .dockerignore** - Larger build contexts

## Troubleshooting Commands

```bash
# Validate syntax and view the merged configuration
docker compose config

# List service names
docker compose config --services

# Check which config files running projects were started from
docker compose ls

# View configuration without variable interpolation
docker compose config --no-interpolate

# Check service dependencies
docker compose config | yq '.services.*.depends_on'

# View resource usage
docker stats $(docker compose ps -q)

# Debug startup issues (start one service without its dependencies)
docker compose up --no-deps service-name

# Force recreate
docker compose up --force-recreate service-name
```

## References

- [Docker Compose Documentation](https://docs.docker.com/compose/)
- [Compose Release Notes](https://github.com/docker/compose/releases)
- [Best Practices](https://docs.docker.com/compose/how-tos/production/)