---
name: docker-best-practices
description: Comprehensive Docker best practices for images, containers, and production deployments
---

## 🚨 CRITICAL GUIDELINES

### Windows File Path Requirements

**MANDATORY: Always Use Backslashes on Windows for File Paths**

When using Edit or Write tools on Windows, you MUST use backslashes (`\`) in file paths, NOT forward slashes (`/`).

**Examples:**
- ❌ WRONG: `D:/repos/project/file.tsx`
- ✅ CORRECT: `D:\repos\project\file.tsx`

This applies to:
- Edit tool file_path parameter
- Write tool file_path parameter
- All file operations on Windows systems

### Documentation Guidelines

**NEVER create new documentation files unless explicitly requested by the user.**

- **Priority**: Update existing README.md files rather than creating new documentation
- **Repository cleanliness**: Keep repository root clean - only README.md unless user requests otherwise
- **Style**: Documentation should be concise, direct, and professional - avoid AI-generated tone
- **User preference**: Only create additional .md files when user specifically asks for documentation

---

# Docker Best Practices

This skill provides current Docker best practices across all aspects of container development, deployment, and operation.

## Image Best Practices

### Base Image Selection

**2025 Recommended Hierarchy:**
1. **Wolfi/Chainguard** (`cgr.dev/chainguard/*`) - Zero-CVE goal, SBOM included
2. **Alpine** (`alpine:3.19`) - ~7MB, minimal attack surface
3. **Distroless** (`gcr.io/distroless/*`) - ~2MB, no shell
4. **Slim variants** (`node:20-slim`) - ~70MB, balanced

**Key rules:**
- Always specify exact version tags: `node:20.11.0-alpine3.19`
- Never use `latest` (unpredictable, breaks reproducibility)
- Use official images from trusted registries
- Match base image to actual needs

### Dockerfile Structure

**Optimal layer ordering** (least to most frequently changing):

```
1. Base image and system dependencies
2. Application dependencies (package.json, requirements.txt, etc.)
3. Application code
4. Configuration and metadata
```

**Rationale:** Docker caches layers. If code changes but dependencies don't, cached dependency layers are reused, speeding up builds.

**Example:**

```dockerfile
FROM python:3.12-slim

# Set the working directory first so later COPY/RUN steps are relative to it
WORKDIR /app

# 1. System packages (rarely change)
RUN apt-get update && apt-get install -y --no-install-recommends \
        gcc \
    && rm -rf /var/lib/apt/lists/*

# 2. Dependencies (change occasionally)
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# 3. Application code (changes frequently)
COPY . .

CMD ["python", "app.py"]
```

### Multi-Stage Builds

Use multi-stage builds to separate build dependencies from runtime:

```dockerfile
# Build stage
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

# Production stage
FROM node:20-alpine AS runtime
WORKDIR /app

# Only copy what's needed for runtime
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules

USER node
CMD ["node", "dist/server.js"]
```

**Benefits:**
- Smaller final images (no build tools)
- Better security (fewer attack vectors)
- Faster deployment (smaller upload/download)

### Layer Optimization

**Combine commands** to reduce layers and image size:

```dockerfile
# Bad - 3 layers, cleanup doesn't reduce size
RUN apt-get update
RUN apt-get install -y curl
RUN rm -rf /var/lib/apt/lists/*

# Good - 1 layer, cleanup effective
RUN apt-get update && \
    apt-get install -y --no-install-recommends curl && \
    rm -rf /var/lib/apt/lists/*
```

### .dockerignore

Always create `.dockerignore` to exclude unnecessary files:

```
# Version control
.git
.gitignore

# Dependencies
node_modules
__pycache__
*.pyc

# IDE
.vscode
.idea

# OS
.DS_Store
Thumbs.db

# Logs
*.log
logs/

# Testing
coverage/
.nyc_output
*.test.js

# Documentation
README.md
docs/

# Environment
.env
.env.local
*.local
```

## Container Runtime Best Practices

### Security

```bash
# --user                  run as non-root
# --cap-drop / --cap-add  drop all capabilities, add back only the needed ones
# --read-only             read-only root filesystem
# --tmpfs                 temporary writable filesystem
# --security-opt          no new privileges
# --memory / --cpus       resource limits
# (Comments cannot sit between the continued lines: a comment line would end the command.)
docker run \
  --user 1000:1000 \
  --cap-drop=ALL \
  --cap-add=NET_BIND_SERVICE \
  --read-only \
  --tmpfs /tmp:noexec,nosuid \
  --security-opt="no-new-privileges:true" \
  --memory="512m" \
  --cpus="1.0" \
  my-image
```

### Resource Management

Always set resource limits in production:

```yaml
# docker-compose.yml
services:
  app:
    deploy:
      resources:
        limits:
          cpus: '2.0'
          memory: 1G
        reservations:
          cpus: '1.0'
          memory: 512M
```

### Health Checks

Implement health checks for all long-running containers:

```dockerfile
HEALTHCHECK --interval=30s --timeout=3s --retries=3 --start-period=40s \
  CMD curl -f http://localhost:3000/health || exit 1
```

Or in compose:

```yaml
services:
  app:
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost/health"]
      interval: 30s
      timeout: 3s
      retries: 3
      start_period: 40s
```

Both examples assume `curl` is installed in the image.

### Logging

Configure proper logging to prevent disk fill-up:

```yaml
services:
  app:
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"
```

Or system-wide in `/etc/docker/daemon.json`:

```json
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  }
}
```

### Restart Policies

```yaml
services:
  app:
    # Choose one restart value per service.
    # For development:
    #   restart: "no"
    # For production:
    restart: unless-stopped

    # Or with fine-grained control (Swarm mode)
    deploy:
      restart_policy:
        condition: on-failure
        delay: 5s
        max_attempts: 3
        window: 120s
```

## Docker Compose Best Practices

### File Structure

```yaml
# No top-level version field needed (obsolete in Docker Compose v2)

services:
  # Service definitions
  web:
    # ...
  api:
    # ...
  database:
    # ...

networks:
  # Custom networks (preferred)
  frontend:
  backend:
    internal: true

volumes:
  # Named volumes (preferred for persistence)
  db-data:
  app-data:

configs:
  # Configuration files (Swarm mode)
  app-config:
    file: ./config/app.conf

secrets:
  # Secrets (Swarm mode)
  db-password:
    file: ./secrets/db_pass.txt
```

### Network Isolation

```yaml
networks:
  frontend:
    driver: bridge
  backend:
    driver: bridge
    internal: true  # No external access

services:
  web:
    networks:
      - frontend

  api:
    networks:
      - frontend
      - backend

  database:
    networks:
      - backend  # Not accessible from frontend
```

### Environment Variables

```yaml
services:
  app:
    # Load from file (preferred for non-secrets)
    env_file:
      - .env

    # Inline for service-specific vars
    environment:
      - NODE_ENV=production
      - LOG_LEVEL=info

    # For Swarm mode secrets
    secrets:
      - db_password
```

**Important:**
- Add `.env` to `.gitignore`
- Provide `.env.example` as template
- Never commit secrets to version control

### Dependency Management

```yaml
services:
  api:
    depends_on:
      database:
        condition: service_healthy  # Wait for health check
      redis:
        condition: service_started  # Just wait for start
```

## Production Best Practices

### Image Tagging Strategy

```
# Use semantic versioning
my-app:1.2.3
my-app:1.2
my-app:1
my-app:latest

# Include git commit for traceability
my-app:1.2.3-abc123f

# Environment tags
my-app:1.2.3-production
my-app:1.2.3-staging
```

Publishing a `latest` tag alongside versioned tags is fine; the rule against `latest` applies to what you reference in `FROM` lines and deployments.

### Secrets Management

**Never do this:**

```dockerfile
# BAD - secret in layer history
ENV API_KEY=secret123
RUN echo "password" > /app/config
```

**Do this:**

```bash
# Use Docker secrets (Swarm) or external secret management
docker secret create db_password ./password.txt

# Or mount secrets at runtime
docker run -v /secure/secrets:/run/secrets:ro my-app

# Or use environment files (not in image)
docker run --env-file /secure/.env my-app
```

### Monitoring & Observability

```yaml
services:
  app:
    # Health checks
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost/health"]
      interval: 30s

    # Labels for monitoring tools
    labels:
      - "prometheus.io/scrape=true"
      - "prometheus.io/port=9090"
      - "com.company.team=backend"
      - "com.company.version=1.2.3"

    # Logging
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"
```

### Backup Strategy

```bash
# Backup named volume
docker run --rm \
  -v VOLUME_NAME:/data \
  -v "$(pwd)":/backup \
  alpine tar czf "/backup/backup-$(date +%Y%m%d).tar.gz" -C /data .

# Restore volume (replace YYYYMMDD with the date of the archive to restore)
docker run --rm \
  -v VOLUME_NAME:/data \
  -v "$(pwd)":/backup \
  alpine tar xzf /backup/backup-YYYYMMDD.tar.gz -C /data
```

### Update Strategy

```yaml
services:
  app:
    # For Swarm mode - rolling updates
    deploy:
      replicas: 3
      update_config:
        parallelism: 1        # Update 1 at a time
        delay: 10s            # Wait 10s between updates
        failure_action: rollback
        monitor: 60s
      rollback_config:
        parallelism: 1
        delay: 5s
```

## Platform-Specific Best Practices

### Linux

- Use user namespace remapping for added security
- Leverage native performance advantages
- Use Alpine for smallest images
- Configure SELinux/AppArmor profiles
- Use systemd for Docker daemon management

Example `/etc/docker/daemon.json` (JSON does not allow comments, so the file name is given here rather than inside the file):

```json
{
  "userns-remap": "default",
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  },
  "storage-driver": "overlay2",
  "live-restore": true
}
```

### macOS

- Allocate sufficient resources in Docker Desktop
- Use `:delegated` or `:cached` for bind mounts on older Docker Desktop releases; recent releases generally accept these flags but ignore them
- Consider multi-platform builds for ARM (M1/M2)
- Limit file sharing to necessary directories

```yaml
# Better volume performance on older Docker Desktop for Mac releases
volumes:
  - ./src:/app/src:delegated    # Container's view is authoritative; the host may see container writes late
  - ./build:/app/build:cached   # Host's view is authoritative; the container may see host writes late
```

### Windows

- Choose container type: Windows or Linux
- Use forward slashes in volume mount paths
- Ensure drives are shared in Docker Desktop
- Be aware of line ending differences (CRLF vs LF)
- Consider WSL2 backend for better performance

```yaml
# Windows-compatible paths
volumes:
  - C:/Users/name/app:/app  # Forward slashes work
  # or
  - C:\Users\name\app:/app  # Backslashes are literal when unquoted; escape them (\\) inside double quotes
```

## Performance Best Practices

### Build Performance

```dockerfile
# Use BuildKit (faster, better caching). It is the default builder since
# Docker Engine 23.0; on older versions enable it in the shell first:
#   export DOCKER_BUILDKIT=1

# Use cache mounts
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements.txt

# Use bind mounts for dependencies
RUN --mount=type=bind,source=package.json,target=package.json \
    --mount=type=bind,source=package-lock.json,target=package-lock.json \
    --mount=type=cache,target=/root/.npm \
    npm ci
```

### Image Size

- Use multi-stage builds
- Choose minimal base images
- Clean up in the same layer
- Use .dockerignore
- Remove build dependencies

```dockerfile
# Install and cleanup in one layer
RUN apt-get update && \
    apt-get install -y --no-install-recommends \
        package1 \
        package2 && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*
```

### Runtime Performance

```dockerfile
# Good - exec form: no shell wrapper, the process receives signals directly
CMD ["node", "server.js"]

# Bad - shell form runs the command under /bin/sh -c, which may not forward signals
# CMD node server.js
# (Dockerfile comments must be on their own line; a trailing "# ..." becomes part of the instruction.)

# Optimize signals (SIGTERM is already the default stop signal)
STOPSIGNAL SIGTERM

# Run as non-root (much more secure)
USER appuser
```

## Security Best Practices Summary

**Image Security:**
- Use official, minimal base images
- Scan for vulnerabilities (Docker Scout, Trivy)
- Don't include secrets in layers
- Run as non-root user
- Keep images updated

**Runtime Security:**
- Drop capabilities
- Use read-only filesystem
- Set resource limits
- Enable security options
- Isolate networks
- Use secrets management

**Compliance:**
- Follow CIS Docker Benchmark
- Implement container scanning in CI/CD
- Use signed images (Docker Content Trust)
- Maintain audit logs
- Regular security reviews

## Common Anti-Patterns to Avoid

❌ **Don't:**
- Run as root
- Use `--privileged`
- Mount Docker socket
- Use `latest` tag
- Hardcode secrets
- Skip health checks
- Ignore resource limits
- Use huge base images
- Skip vulnerability scanning
- Expose unnecessary ports
- Use inefficient layer caching
- Commit secrets to Git

✅ **Do:**
- Run as non-root
- Use minimal capabilities
- Isolate containers
- Tag with versions
- Use secrets management
- Implement health checks
- Set resource limits
- Use minimal images
- Scan regularly
- Apply least privilege
- Optimize build cache
- Use .env.example templates

## Checklist for Production-Ready Images

- [ ] Based on official, versioned, minimal image
- [ ] Multi-stage build (if applicable)
- [ ] Runs as non-root user
- [ ] No secrets in layers
- [ ] .dockerignore configured
- [ ] Vulnerability scan passed
- [ ] Health check implemented
- [ ] Proper labeling (version, description, etc.)
- [ ] Efficient layer caching
- [ ] Resource limits defined
- [ ] Logging configured
- [ ] Signals handled correctly
- [ ] Security options set
- [ ] Documentation complete
- [ ] Tested on target platform(s)

This skill represents current Docker best practices. Always verify against official documentation for the latest recommendations, as Docker evolves continuously.
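
A few items from the production checklist above (versioned base image, non-root user, health check) can be spot-checked before a build. The following is a minimal sketch, not a substitute for a real linter or for scanners such as Trivy or Docker Scout. The function name `lint_dockerfile` and its warning messages are hypothetical, and it relies on plain text matching rather than parsing Dockerfile syntax, so it can miss cases (for example, a `USER` set by the base image).

```shell
#!/bin/sh
# Hypothetical helper (not part of Docker): flags three checklist violations
# in a Dockerfile using text matching only. Returns 0 if nothing was flagged.
#
# Usage: lint_dockerfile ./Dockerfile || echo "review the warnings above"
lint_dockerfile() {
    file="$1"
    rc=0

    # A FROM line that references the latest tag explicitly.
    if grep -Eiq '^[[:space:]]*FROM[[:space:]].*:latest([[:space:]]|$)' "$file"; then
        echo "WARN: base image uses the latest tag"
        rc=1
    fi

    # No USER instruction: the container runs as root unless the base image sets a user.
    if ! grep -Eiq '^[[:space:]]*USER[[:space:]]+' "$file"; then
        echo "WARN: no USER instruction"
        rc=1
    fi

    # No HEALTHCHECK instruction in the Dockerfile itself.
    if ! grep -Eiq '^[[:space:]]*HEALTHCHECK[[:space:]]+' "$file"; then
        echo "WARN: no HEALTHCHECK instruction"
        rc=1
    fi

    return "$rc"
}
```

Because the function returns non-zero when it prints a warning, it can serve as a cheap early gate in a CI step, ahead of a full vulnerability scan.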