gh-josiahsiegel-claude-code…/skills/docker-best-practices.md
2025-11-30 08:29:02 +08:00

name: docker-best-practices
description: Comprehensive Docker best practices for images, containers, and production deployments

🚨 CRITICAL GUIDELINES

Windows File Path Requirements

MANDATORY: Always Use Backslashes on Windows for File Paths

When using Edit or Write tools on Windows, you MUST use backslashes (\) in file paths, NOT forward slashes (/).

Examples:

  • WRONG: D:/repos/project/file.tsx
  • CORRECT: D:\repos\project\file.tsx

This applies to:

  • Edit tool file_path parameter
  • Write tool file_path parameter
  • All file operations on Windows systems

Documentation Guidelines

NEVER create new documentation files unless explicitly requested by the user.

  • Priority: Update existing README.md files rather than creating new documentation
  • Repository cleanliness: Keep repository root clean - only README.md unless user requests otherwise
  • Style: Documentation should be concise, direct, and professional - avoid AI-generated tone
  • User preference: Only create additional .md files when user specifically asks for documentation

Docker Best Practices

This skill provides current Docker best practices across all aspects of container development, deployment, and operation.

Image Best Practices

Base Image Selection

2025 Recommended Hierarchy:

  1. Wolfi/Chainguard (cgr.dev/chainguard/*) - Zero-CVE goal, SBOM included
  2. Alpine (alpine:3.19) - ~7MB, minimal attack surface
  3. Distroless (gcr.io/distroless/*) - ~2MB, no shell
  4. Slim variants (node:20-slim) - ~70MB, balanced

Key rules:

  • Always specify exact version tags: node:20.11.0-alpine3.19
  • Never use latest (unpredictable, breaks reproducibility)
  • Use official images from trusted registries
  • Match base image to actual needs
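The tag rules above can be checked mechanically, for example in a CI step. The following is a minimal sketch, not an official Docker tool: `is_pinned_image` is a hypothetical helper name, and it only rejects missing tags and `latest`. It does not verify that a tag such as `node:20` is an exact version.

```shell
# Hypothetical helper: succeed only if an image reference is pinned.
# "Pinned" here means: it carries a digest, or a tag other than "latest".
is_pinned_image() {
  ref=$1
  case $ref in
    *@sha256:*) return 0 ;;   # digest-pinned reference
  esac
  # Drop the registry/repository path so a registry port (host:5000/app)
  # is not mistaken for a tag.
  name=${ref##*/}
  case $name in
    *:latest) return 1 ;;     # explicit latest
    *:*)      return 0 ;;     # any other tag
    *)        return 1 ;;     # no tag means an implicit latest
  esac
}
```

Usage: `is_pinned_image node:20.11.0-alpine3.19 && echo pinned`.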

Dockerfile Structure

Optimal layer ordering (least to most frequently changing):

1. Base image and system dependencies
2. Application dependencies (package.json, requirements.txt, etc.)
3. Application code
4. Configuration and metadata

Rationale: Docker caches layers. If code changes but dependencies don't, cached dependency layers are reused, speeding up builds.

Example:

FROM python:3.12-slim

# 1. System packages (rarely change)
RUN apt-get update && apt-get install -y --no-install-recommends \
    gcc \
    && rm -rf /var/lib/apt/lists/*

# Set the working directory before copying so files land in /app, not /
WORKDIR /app

# 2. Dependencies (change occasionally)
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# 3. Application code (changes frequently)
COPY . .

CMD ["python", "app.py"]

Multi-Stage Builds

Use multi-stage builds to separate build dependencies from runtime:

# Build stage
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
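RUN npm run build
# Drop devDependencies so only runtime packages are copied to the next stage
RUN npm prune --omit=dev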

# Production stage
FROM node:20-alpine AS runtime
WORKDIR /app
# Only copy what's needed for runtime
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
USER node
CMD ["node", "dist/server.js"]

Benefits:

  • Smaller final images (no build tools)
  • Better security (fewer attack vectors)
  • Faster deployment (smaller upload/download)

Layer Optimization

Combine commands to reduce layers and image size:

# Bad - 3 layers, cleanup doesn't reduce size
RUN apt-get update
RUN apt-get install -y curl
RUN rm -rf /var/lib/apt/lists/*

# Good - 1 layer, cleanup effective
RUN apt-get update && \
    apt-get install -y --no-install-recommends curl && \
    rm -rf /var/lib/apt/lists/*

.dockerignore

Always create .dockerignore to exclude unnecessary files:

# Version control
.git
.gitignore

# Dependencies
node_modules
__pycache__
*.pyc

# IDE
.vscode
.idea

# OS
.DS_Store
Thumbs.db

# Logs
*.log
logs/

# Testing
coverage/
.nyc_output
*.test.js

# Documentation
README.md
docs/

# Environment
.env
.env.local
*.local

Container Runtime Best Practices

Security

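# Comments cannot sit between backslash-continued lines, so the flags are explained here:
#   --user                 run as non-root
#   --cap-drop, --cap-add  drop all capabilities, add back only those needed
#   --read-only            read-only root filesystem
#   --tmpfs                temporary writable filesystem
#   --security-opt         no new privileges
#   --memory, --cpus       resource limits
docker run \
  --user 1000:1000 \
  --cap-drop=ALL \
  --cap-add=NET_BIND_SERVICE \
  --read-only \
  --tmpfs /tmp:noexec,nosuid \
  --security-opt="no-new-privileges:true" \
  --memory="512m" \
  --cpus="1.0" \
  my-image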
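To confirm that a container is actually configured with a non-root user, the configured user can be read back with `docker inspect`. This is a minimal sketch: `is_non_root` is a hypothetical helper name, and it inspects only the configured user (from the image's `USER` or from `--user`), not what the process does after startup.

```shell
# Hypothetical helper: succeed only if the container's configured user is non-root.
# An empty user means the image default applies, which is root unless USER was set.
is_non_root() {
  user=$(docker inspect --format '{{.Config.User}}' "$1" 2>/dev/null) || return 1
  case $user in
    ''|root|root:*|0|0:*) return 1 ;;   # root by name, by UID 0, or by default
    *)                    return 0 ;;
  esac
}
```

Usage: `is_non_root my-container || echo "my-container runs as root"`, where `my-container` is a placeholder name.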

Resource Management

Always set resource limits in production:

# docker-compose.yml
services:
  app:
    deploy:
      resources:
        limits:
          cpus: '2.0'
          memory: 1G
        reservations:
          cpus: '1.0'
          memory: 512M

Health Checks

Implement health checks for all long-running containers:

HEALTHCHECK --interval=30s --timeout=3s --retries=3 --start-period=40s \
  CMD curl -f http://localhost:3000/health || exit 1

Or in compose:

services:
  app:
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost/health"]
      interval: 30s
      timeout: 3s
      retries: 3
      start_period: 40s
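Deployment scripts often need to block until a health check passes. The sketch below polls the status that Docker records for a container with a health check. `wait_healthy` and its default retry values are assumptions made for illustration. A container without a health check never reports `healthy`, so the function times out for it.

```shell
# Hypothetical helper: poll a container until its health status is "healthy".
# Usage: wait_healthy CONTAINER [MAX_TRIES] [SLEEP_SECONDS]
wait_healthy() {
  container=$1
  tries=${2:-30}
  pause=${3:-2}
  i=0
  while [ "$i" -lt "$tries" ]; do
    status=$(docker inspect --format '{{.State.Health.Status}}' "$container" 2>/dev/null)
    case $status in
      healthy)   return 0 ;;   # health check passed
      unhealthy) return 1 ;;   # retries exhausted inside the container
    esac
    i=$((i + 1))
    sleep "$pause"
  done
  return 1                     # still starting (or no health check) after all tries
}
```

Usage: `wait_healthy my-container 30 2 || exit 1`, where `my-container` is a placeholder name.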

Logging

Configure proper logging to prevent disk fill-up:

services:
  app:
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"

Or system-wide in /etc/docker/daemon.json:

{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  }
}
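With these settings, each container keeps at most `max-file` log files of at most `max-size` each, so the worst case is roughly their product per container. A small sketch of that arithmetic follows; `log_budget_mb` is a hypothetical helper, and it assumes sizes are given in whole megabytes.

```shell
# Hypothetical helper: approximate worst-case json-file log usage in megabytes.
# Usage: log_budget_mb MAX_SIZE_MB MAX_FILE CONTAINERS
log_budget_mb() {
  echo $(( $1 * $2 * $3 ))
}
```

For example, `log_budget_mb 10 3 20` prints `600`: twenty containers at 10 MB times 3 files each.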

Restart Policies

services:
  app:
    # Use one restart value per service; duplicate keys are invalid YAML
    # For development:
    # restart: "no"

    # For production:
    restart: unless-stopped

    # Or with fine-grained control (Swarm mode)
    deploy:
      restart_policy:
        condition: on-failure
        delay: 5s
        max_attempts: 3
        window: 120s

Docker Compose Best Practices

File Structure

# No version field needed (obsolete in Compose v2; ignored with a warning if present)

services:
  # Service definitions
  web:
    # ...
  api:
    # ...
  database:
    # ...

networks:
  # Custom networks (preferred)
  frontend:
  backend:
    internal: true

volumes:
  # Named volumes (preferred for persistence)
  db-data:
  app-data:

configs:
  # Configuration files (Swarm mode)
  app-config:
    file: ./config/app.conf

secrets:
  # Secrets (Swarm mode; file-based secrets also work with plain Compose)
  db_password:
    file: ./secrets/db_pass.txt

Network Isolation

networks:
  frontend:
    driver: bridge
  backend:
    driver: bridge
    internal: true  # No external access

services:
  web:
    networks:
      - frontend

  api:
    networks:
      - frontend
      - backend

  database:
    networks:
      - backend  # Not accessible from frontend

Environment Variables

services:
  app:
    # Load from file (preferred for non-secrets)
    env_file:
      - .env

    # Inline for service-specific vars
    environment:
      - NODE_ENV=production
      - LOG_LEVEL=info

    # For Swarm mode secrets
    secrets:
      - db_password

Important:

  • Add .env to .gitignore
  • Provide .env.example as template
  • Never commit secrets to version control
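One way to keep `.env.example` in sync is to derive it from the real file with the values stripped. This is a minimal sketch, assuming simple `KEY=value` lines; `env_template` is a hypothetical helper name, and multi-line or quoted values spanning lines are not handled.

```shell
# Hypothetical helper: print an env file with every value removed.
# Comments and blank lines pass through; KEY=value becomes KEY=
env_template() {
  sed -E 's/^([A-Za-z_][A-Za-z0-9_]*)=.*/\1=/' "$1"
}
```

Usage: `env_template .env > .env.example`, then review the result before committing it.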

Dependency Management

services:
  api:
    depends_on:
      database:
        condition: service_healthy  # Wait for health check (database must define one)
      redis:
        condition: service_started   # Just wait for start

Production Best Practices

Image Tagging Strategy

# Use semantic versioning
my-app:1.2.3
my-app:1.2
my-app:1
# Optional moving alias for convenience; never deploy or build FROM it
my-app:latest

# Include git commit for traceability
my-app:1.2.3-abc123f

# Environment tags
my-app:1.2.3-production
my-app:1.2.3-staging
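The version and commit tags above can be generated rather than typed by hand. The following sketch prints the tag set for one release. `release_tags` is a hypothetical helper, and it assumes a plain `MAJOR.MINOR.PATCH` version with no pre-release suffix.

```shell
# Hypothetical helper: print the tags for a release, one per line.
# Usage: release_tags IMAGE MAJOR.MINOR.PATCH [GIT_SHA]
release_tags() {
  image=$1
  version=$2
  sha=${3:-}
  major=${version%%.*}          # 1.2.3 -> 1
  minor_patch=${version#*.}     # 1.2.3 -> 2.3
  minor=${minor_patch%%.*}      # 2.3   -> 2
  echo "$image:$version"
  echo "$image:$major.$minor"
  echo "$image:$major"
  if [ -n "$sha" ]; then
    echo "$image:$version-$sha" # commit tag for traceability
  fi
}
```

Usage: `release_tags my-app 1.2.3 abc123f | while read -r tag; do docker tag my-app:build "$tag"; done`, where `my-app:build` is a placeholder for the freshly built image.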

Secrets Management

Never do this:

# BAD - secret in layer history
ENV API_KEY=secret123
RUN echo "password" > /app/config

Do this:

# Use Docker secrets (Swarm) or external secret management
docker secret create db_password ./password.txt

# Or mount secrets at runtime
docker run -v /secure/secrets:/run/secrets:ro my-app

# Or use environment files (not in image)
docker run --env-file /secure/.env my-app

Monitoring & Observability

services:
  app:
    # Health checks
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost/health"]
      interval: 30s

    # Labels for monitoring tools
    labels:
      - "prometheus.io/scrape=true"
      - "prometheus.io/port=9090"
      - "com.company.team=backend"
      - "com.company.version=1.2.3"

    # Logging
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"

Backup Strategy

# Backup named volume (quote $(pwd) so paths with spaces work)
docker run --rm \
  -v VOLUME_NAME:/data \
  -v "$(pwd)":/backup \
  alpine tar czf /backup/backup-$(date +%Y%m%d).tar.gz -C /data .

# Restore volume (replace YYYYMMDD with the date of the backup to restore)
docker run --rm \
  -v VOLUME_NAME:/data \
  -v "$(pwd)":/backup \
  alpine tar xzf /backup/backup-YYYYMMDD.tar.gz -C /data

Update Strategy

services:
  app:
    # For Swarm mode - rolling updates
    deploy:
      replicas: 3
      update_config:
        parallelism: 1        # Update 1 at a time
        delay: 10s            # Wait 10s between updates
        failure_action: rollback
        monitor: 60s
      rollback_config:
        parallelism: 1
        delay: 5s

Platform-Specific Best Practices

Linux

  • Use user namespace remapping for added security
  • Leverage native performance advantages
  • Use Alpine for smallest images
  • Configure SELinux/AppArmor profiles
  • Use systemd for Docker daemon management

Example /etc/docker/daemon.json (JSON does not allow comments, so keep the file name out of the file):

{
  "userns-remap": "default",
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  },
  "storage-driver": "overlay2",
  "live-restore": true
}

macOS

  • Allocate sufficient resources in Docker Desktop
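  • Prefer VirtioFS file sharing; :delegated and :cached are still accepted on bind mounts, but recent Docker Desktop releases ignore them
  • Consider multi-platform builds for ARM (M1/M2)
  • Limit file sharing to necessary directories

# Legacy consistency flags (no-ops on recent Docker Desktop releases)
volumes:
  - ./src:/app/src:delegated  # Container's view is authoritative; the host may see writes late
  - ./build:/app/build:cached  # Host's view is authoritative; the container may see writes late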

Windows

  • Choose container type: Windows or Linux
  • Use forward slashes in Docker and Compose volume paths
  • Ensure drives are shared in Docker Desktop
  • Be aware of line ending differences (CRLF vs LF)
  • Consider WSL2 backend for better performance

# Windows-compatible paths
volumes:
  - C:/Users/name/app:/app  # Forward slashes work
  # or
  - C:\Users\name\app:/app  # Backslashes are literal when unquoted; escape them (\\) only inside double quotes

Performance Best Practices

Build Performance

# Shell: BuildKit is the default builder since Docker Engine 23.0; enable it explicitly on older versions
export DOCKER_BUILDKIT=1

# Dockerfile: use cache mounts
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements.txt

# Use bind mounts for dependencies
RUN --mount=type=bind,source=package.json,target=package.json \
    --mount=type=bind,source=package-lock.json,target=package-lock.json \
    --mount=type=cache,target=/root/.npm \
    npm ci

Image Size

  • Use multi-stage builds
  • Choose minimal base images
  • Clean up in the same layer
  • Use .dockerignore
  • Remove build dependencies

# Install and cleanup in one layer
RUN apt-get update && \
    apt-get install -y --no-install-recommends \
    package1 \
    package2 && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

Runtime Performance

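# Good: exec form runs the process directly as PID 1, so it receives signals
CMD ["node", "server.js"]
# Bad: shell form wraps the command in /bin/sh -c, which may not forward signals
# CMD node server.js
# (Dockerfile comments must start the line; a trailing # is passed to the command)

# Set the stop signal explicitly (SIGTERM is already the default)
STOPSIGNAL SIGTERM

# Run as non-root (much more secure)
USER appuser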

Security Best Practices Summary

Image Security:

  • Use official, minimal base images
  • Scan for vulnerabilities (Docker Scout, Trivy)
  • Don't include secrets in layers
  • Run as non-root user
  • Keep images updated

Runtime Security:

  • Drop capabilities
  • Use read-only filesystem
  • Set resource limits
  • Enable security options
  • Isolate networks
  • Use secrets management

Compliance:

  • Follow CIS Docker Benchmark
  • Implement container scanning in CI/CD
  • Use signed images (Docker Content Trust)
  • Maintain audit logs
  • Regular security reviews

Common Anti-Patterns to Avoid

Don't:

  • Run as root
  • Use --privileged
  • Mount Docker socket
  • Use latest tag
  • Hardcode secrets
  • Skip health checks
  • Ignore resource limits
  • Use huge base images
  • Skip vulnerability scanning
  • Expose unnecessary ports
  • Order layers so the build cache is invalidated on every change
  • Commit secrets to Git

Do:

  • Run as non-root
  • Use minimal capabilities
  • Isolate containers
  • Tag with versions
  • Use secrets management
  • Implement health checks
  • Set resource limits
  • Use minimal images
  • Scan regularly
  • Apply least privilege
  • Optimize build cache
  • Use .env.example templates

Checklist for Production-Ready Images

  • Based on official, versioned, minimal image
  • Multi-stage build (if applicable)
  • Runs as non-root user
  • No secrets in layers
  • .dockerignore configured
  • Vulnerability scan passed
  • Health check implemented
  • Proper labeling (version, description, etc.)
  • Efficient layer caching
  • Resource limits defined
  • Logging configured
  • Signals handled correctly
  • Security options set
  • Documentation complete
  • Tested on target platform(s)

This skill represents current Docker best practices. Always verify against official documentation for the latest recommendations, as Docker evolves continuously.