Operations Runbook: {{PROJECT_NAME}}

Document Version: 1.0
Date: {{DATE}}
Status: {{STATUS}}


1. Overview

1.1 Purpose

This runbook provides step-by-step operational procedures for {{PROJECT_NAME}} across all environments: local development, testing, and production.

1.2 Related Documents

  • Architecture: {{ARCHITECTURE_LINK}}
  • Tech Stack: {{TECH_STACK_LINK}}
  • API Spec: {{API_SPEC_LINK}}
  • Database Schema: {{DATABASE_SCHEMA_LINK}}

1.3 Key Contacts

{{KEY_CONTACTS}}


2. Prerequisites

2.1 Required Tools

{{REQUIRED_TOOLS}}

2.2 Access Requirements

{{ACCESS_REQUIREMENTS}}

2.3 Environment Variables

See Appendix A: Environment Variables for complete reference.


3. Local Development

3.1 Initial Setup

# Clone repository
git clone https://github.com/org/{{PROJECT_NAME}}.git
cd {{PROJECT_NAME}}

# Copy environment template
cp .env.example .env

# Edit .env with your credentials
# See Appendix A for required variables

# Build and start services
docker compose up -d

# Wait for services to be ready (check logs)
docker compose logs -f app

Expected output:

app-1  | Server started on port 3000
db-1   | database system is ready to accept connections
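
Tailing the logs means watching for the ready message by hand. As an alternative, here is a minimal sketch of a polling helper, assuming the application exposes the /health endpoint on port 3000 described in section 6.2. The function name wait_for_http is hypothetical and not part of the project.

```shell
#!/bin/sh
# wait_for_http URL [TIMEOUT_SECONDS]  (hypothetical helper)
# Polls URL once per second until curl succeeds (HTTP status below 400)
# or the timeout expires.
wait_for_http() (
  url=$1
  timeout=${2:-60}
  elapsed=0
  while [ "$elapsed" -lt "$timeout" ]; do
    # -f: treat HTTP errors (>= 400) as failure; -sS: quiet, but keep error text
    if curl -fsS -o /dev/null "$url" 2>/dev/null; then
      echo "ready after ${elapsed}s"
      exit 0
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
  echo "not ready after ${timeout}s" >&2
  exit 1
)
```

Typical use after starting the stack: docker compose up -d && wait_for_http http://localhost:3000/health 120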

3.2 Docker Commands

Start all services:

docker compose up -d

Stop all services:

docker compose down

Rebuild after code changes:

docker compose down
docker compose build --no-cache app
docker compose up -d

View logs:

# All services
docker compose logs -f

# Specific service
docker compose logs -f app

# Last 100 lines
docker compose logs --tail 100 app

Exec into running container:

docker compose exec app sh
# or
docker compose exec app bash

Restart specific service:

docker compose restart app

3.3 Database Operations (Local)

Run migrations:

docker compose exec app npm run migrate

# Or using Prisma
docker compose exec app npx prisma migrate dev

Seed database:

docker compose exec app npm run seed

Reset database (⚠️ DESTRUCTIVE):

docker compose down
docker volume rm {{PROJECT_NAME}}_postgres_data
docker compose up -d
docker compose exec app npm run migrate
docker compose exec app npm run seed
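
Because the reset deletes the database volume, it may be worth guarding it behind an explicit confirmation. The sketch below shows one way to do that. The helper name confirm_destructive is an assumption for illustration, not an existing project script.

```shell
#!/bin/sh
# confirm_destructive PROMPT  (hypothetical helper)
# Reads one line from stdin and succeeds only if the operator types exactly "yes".
confirm_destructive() (
  printf '%s [type "yes" to continue]: ' "$1" >&2
  read -r answer || exit 1   # EOF or closed stdin counts as "no"
  [ "$answer" = "yes" ]
)

# Example wiring (commented out so that sourcing this file runs nothing):
# confirm_destructive "Delete the local database volume?" && {
#   docker compose down
#   docker volume rm {{PROJECT_NAME}}_postgres_data
#   docker compose up -d
# }
```

Requiring the full word "yes" rather than a single keystroke makes an accidental confirmation less likely.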

Database shell:

# PostgreSQL
docker compose exec db psql -U {{DB_USER}} -d {{DB_NAME}}

# MySQL (-p prompts for the password, keeping it out of shell history and process lists)
docker compose exec db mysql -u {{DB_USER}} -p {{DB_NAME}}

3.4 Common Development Tasks

Install dependencies (after package.json changes):

docker compose down
docker compose build app
docker compose up -d

Run linter:

docker compose exec app npm run lint

# Fix automatically
docker compose exec app npm run lint:fix

Format code:

docker compose exec app npm run format

Type-check (TypeScript):

docker compose exec app npm run type-check

4. Testing

4.1 Run All Tests

# Using docker-compose.test.yml
docker compose -f docker-compose.test.yml up --abort-on-container-exit

4.2 Run Specific Test Types

Unit tests:

docker compose exec app npm run test:unit

# Watch mode
docker compose exec app npm run test:unit:watch

Integration tests:

docker compose exec app npm run test:integration

E2E tests:

# Start app first
docker compose up -d

# Run E2E
docker compose exec app npm run test:e2e

4.3 Test Coverage

docker compose exec app npm run test:coverage

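# Open coverage report on the host (macOS; use xdg-open on Linux).
# Requires the coverage/ directory to be bind-mounted from the container.
open coverage/index.html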

4.4 Debug Tests

# Run single test file
docker compose exec app npm test -- path/to/test.spec.ts

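# Run with debugging (--runInBand keeps all tests in one process so the
# debugger can attach; port 9229 must be published for the app service)
docker compose exec app node --inspect-brk=0.0.0.0:9229 node_modules/.bin/jest --runInBand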

5. Build & Deployment

5.1 Build for Production

# Build production image
docker build -t {{PROJECT_NAME}}:{{VERSION}} .

# Test production build locally
docker run -p 3000:3000 --env-file .env.production {{PROJECT_NAME}}:{{VERSION}}

5.2 Deployment to Production

{{DEPLOYMENT_PROCEDURE}}


6. Production Operations

6.1 SSH Access

SSH to production server:

ssh {{PRODUCTION_USER}}@{{PRODUCTION_HOST}}

# Or with SSH key
ssh -i ~/.ssh/{{PROJECT_NAME}}_prod.pem {{PRODUCTION_USER}}@{{PRODUCTION_HOST}}

SSH via jump host (if behind VPN):

ssh -J {{JUMP_HOST}} {{PRODUCTION_USER}}@{{PRODUCTION_HOST}}

6.2 Health Checks

Check application status:

# Health endpoint
curl http://localhost:3000/health

# Expected response:
# {"status": "ok", "uptime": 123456, "timestamp": "2024-01-01T00:00:00Z"}
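
For scripted checks, the response body can be tested rather than read by eye. This is a rough sketch that assumes the response shape shown above; health_ok is a hypothetical name, and a JSON parser such as jq would be more robust than a regular expression if it is available on the host.

```shell
#!/bin/sh
# health_ok  (hypothetical helper)
# Reads a health-check JSON body on stdin and succeeds if "status" is "ok".
health_ok() {
  grep -Eq '"status"[[:space:]]*:[[:space:]]*"ok"'
}
```

Example: curl -fsS http://localhost:3000/health | health_ok && echo healthy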

Check service status:

docker compose ps

# Expected output:
# NAME                STATUS              PORTS
# app-1               Up 5 minutes        0.0.0.0:3000->3000/tcp
# db-1                Up 5 minutes        5432/tcp
# cache-1             Up 5 minutes        6379/tcp

Check resource usage:

docker stats

# Or specific container
docker stats app-1

6.3 Monitoring & Logs

View logs:

# Real-time logs (all services)
docker compose logs -f

# Last 500 lines from app
docker compose logs --tail 500 app

# Logs from specific time
docker compose logs --since 2024-01-01T00:00:00 app

# Save logs to file
docker compose logs --no-color app > app-logs-$(date +%Y%m%d).log

Search logs:

# Find errors
docker compose logs app | grep ERROR

# Find specific request
docker compose logs app | grep "request_id=123"
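
When the goal is a quick signal rather than the matching lines themselves, counting matches can be enough. A small sketch, assuming error lines contain the literal string ERROR as in the grep above; count_errors is a hypothetical helper name.

```shell
#!/bin/sh
# count_errors  (hypothetical helper)
# Reads log lines on stdin and prints how many contain "ERROR".
count_errors() {
  # grep -c prints 0 but exits non-zero when nothing matches, so mask that status.
  grep -c 'ERROR' || true
}
```

Example: docker compose logs --since 10m --no-color app | count_errors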

Log rotation: {{LOG_ROTATION}}

6.4 Common Maintenance Tasks

Restart application (recreates only the app container; expect a brief interruption):

docker compose up -d --no-deps --force-recreate app

Clear cache (⚠️ DESTRUCTIVE: FLUSHALL deletes every key in every Redis database):

docker compose exec cache redis-cli FLUSHALL

Database backup:

# -T disables TTY allocation so the dump reaches the file unmodified

# PostgreSQL
docker compose exec -T db pg_dump -U {{DB_USER}} {{DB_NAME}} > backup-$(date +%Y%m%d-%H%M%S).sql

# MySQL
docker compose exec -T db mysqldump -u {{DB_USER}} -p{{DB_PASSWORD}} {{DB_NAME}} > backup-$(date +%Y%m%d-%H%M%S).sql

Database restore:

# PostgreSQL
cat backup-20240101-120000.sql | docker compose exec -T db psql -U {{DB_USER}} {{DB_NAME}}

# MySQL
cat backup-20240101-120000.sql | docker compose exec -T db mysql -u {{DB_USER}} -p{{DB_PASSWORD}} {{DB_NAME}}
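
Timestamped backups accumulate over time. The sketch below shows one possible retention helper, assuming backups follow the backup-YYYYMMDD-HHMMSS.sql naming used above and live in a single directory. The name prune_backups and the retention count are illustrative assumptions.

```shell
#!/bin/sh
# prune_backups DIR KEEP  (hypothetical helper)
# Deletes all but the KEEP newest backup-*.sql files in DIR.
prune_backups() (
  dir=$1
  keep=$2
  cd "$dir" || exit 1
  # The timestamp in the file name sorts chronologically, so a reverse
  # name sort lists the newest backup first.
  ls backup-*.sql 2>/dev/null | sort -r | tail -n +"$((keep + 1))" |
    while IFS= read -r f; do
      rm -- "$f"
      echo "removed $f"
    done
)
```

Example: prune_backups /path/to/backups 7 keeps the seven most recent dumps. Sorting by name rather than modification time means a restored or copied file does not change which backups count as newest.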

Update dependencies:

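# Update packages within the semver ranges declared in package.json
# (changes persist only if the project directory is bind-mounted into the container)
docker compose exec app npm update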

# Rebuild image
docker compose down
docker compose build --no-cache app
docker compose up -d

7. Troubleshooting

7.1 Common Issues

Issue 1: Application won't start

Symptoms:

app-1  | Error: Cannot connect to database
app-1  | Error: ECONNREFUSED

Diagnosis:

# Check if database is running
docker compose ps db

# Check database logs
docker compose logs db

# Test database connection
docker compose exec app nc -zv db 5432

Resolution:

# Restart database
docker compose restart db

# Wait for database to be ready
docker compose logs -f db
# Look for: "database system is ready to accept connections"

# Restart app
docker compose restart app
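
Waiting for the database can also be automated instead of watching the logs. A minimal sketch of a generic retry helper follows. The name retry is hypothetical, and the usage line assumes the db image ships pg_isready, as the official postgres image does.

```shell
#!/bin/sh
# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]  (hypothetical helper)
# Runs COMMAND until it succeeds, at most ATTEMPTS times, sleeping
# DELAY_SECONDS between tries.
retry() (
  attempts=$1
  delay=$2
  shift 2
  n=1
  while :; do
    if "$@"; then exit 0; fi
    if [ "$n" -ge "$attempts" ]; then
      echo "giving up after $n attempts: $*" >&2
      exit 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
)
```

Example: retry 30 2 docker compose exec -T db pg_isready -U {{DB_USER}} && docker compose restart app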

Issue 2: Out of disk space

Symptoms:

Error: no space left on device

Diagnosis:

df -h
docker system df

Resolution:

# Remove unused Docker resources
# (⚠️ -a deletes every image not used by a container, not only dangling ones)
docker system prune -a

# Remove specific volumes (⚠️ DESTRUCTIVE)
docker volume rm {{PROJECT_NAME}}_postgres_data

# Remove old log files
find /var/log -name "*.log" -mtime +30 -delete
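
To catch this condition before it causes an outage, disk usage can be checked against a threshold, for example from cron. A small sketch, assuming the POSIX df -P output format and a filesystem name without spaces; disk_usage_pct is a hypothetical helper name.

```shell
#!/bin/sh
# disk_usage_pct PATH  (hypothetical helper)
# Prints the used-space percentage of the filesystem holding PATH as a bare integer.
disk_usage_pct() {
  # -P forces one line per filesystem; column 5 is "Capacity", e.g. "87%".
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}
```

Example: [ "$(disk_usage_pct /)" -lt 90 ] || echo "disk usage above 90%" >&2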

Issue 3: {{ISSUE_3_NAME}}

{{ISSUE_3_TROUBLESHOOTING}}


7.2 Emergency Procedures

Production outage:

# 1. Check health status
curl http://localhost:3000/health

# 2. Check logs for errors
docker compose logs --tail 200 app | grep ERROR

# 3. Restart services
docker compose restart

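# 4. If restart fails, roll back to the previous commit and rebuild
#    (⚠️ git reset --hard discards uncommitted changes on the server)
git reset --hard HEAD~1
docker compose down && docker compose up -d --build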

# 5. Notify team (Slack/PagerDuty)

Database corruption:

# 1. Stop application
docker compose stop app

# 2. Restore from latest backup
./scripts/restore-db.sh {{LATEST_BACKUP}}

# 3. Verify data integrity
docker compose exec db psql -U {{DB_USER}} -d {{DB_NAME}} -c "SELECT COUNT(*) FROM users;"

# 4. Restart application
docker compose start app
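
The integrity check in step 3 prints a count that still has to be judged by eye. The sketch below turns it into a pass/fail check, assuming PostgreSQL and the users table from the example above. The helper names and the DB_USER and DB_NAME shell variables are hypothetical stand-ins for the {{DB_USER}} and {{DB_NAME}} placeholders.

```shell
#!/bin/sh
# table_row_count TABLE  (hypothetical helper)
# Prints the number of rows in TABLE. Expects DB_USER and DB_NAME to be set
# in the calling shell. TABLE must be a trusted identifier; it is not escaped.
table_row_count() {
  # -T: no TTY, so the output is clean; -tA: tuples only, unaligned (just the number)
  docker compose exec -T db psql -U "$DB_USER" -d "$DB_NAME" -tA -c "SELECT COUNT(*) FROM $1;"
}

# require_rows TABLE MIN  (hypothetical helper)
# Succeeds only if TABLE holds at least MIN rows.
require_rows() (
  count=$(table_row_count "$1") || exit 1
  [ "$count" -ge "$2" ]
)
```

Example: require_rows users 1 && docker compose start app. A row count is only a coarse signal; it confirms the restore produced data, not that the data is correct.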

8. Appendices

Appendix A: Environment Variables Reference

Required variables:

Variable      Example                               Description
DATABASE_URL  postgresql://user:pass@db:5432/myapp  Database connection string
REDIS_URL     redis://cache:6379                    Cache connection string
API_KEY       sk_live_abc123...                     External API key (e.g., Stripe)
JWT_SECRET    random_secret_key                     JWT signing secret
NODE_ENV      development or production             Environment mode

Optional variables:

Variable    Default  Description
PORT        3000     Application port
LOG_LEVEL   info     Logging verbosity (debug/info/warn/error)
RATE_LIMIT  100      API rate limit (requests/minute)
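
A missing required variable usually surfaces only as a startup failure. As a sketch, the .env file can be checked up front. This assumes plain NAME=value lines without an export prefix, and missing_env_vars is a hypothetical helper name.

```shell
#!/bin/sh
# missing_env_vars FILE VAR...  (hypothetical helper)
# Prints each VAR that has no non-empty NAME=value assignment in FILE.
missing_env_vars() (
  file=$1
  shift
  for name in "$@"; do
    # Match "NAME=" followed by at least one character at the start of a line.
    if ! grep -Eq "^${name}=.+" "$file"; then
      echo "$name"
    fi
  done
)
```

Example: missing_env_vars .env DATABASE_URL REDIS_URL API_KEY JWT_SECRET NODE_ENV prints nothing when every required variable is set.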

Appendix B: Service Dependencies

{{SERVICE_DEPENDENCIES}}


Appendix C: Port Mapping

{{PORT_MAPPING}}


9. Maintenance

Last Updated: {{DATE}}

Update Triggers:

  • New deployment procedures
  • Infrastructure changes (new services, ports)
  • New operational commands
  • Troubleshooting scenarios discovered
  • Environment variable changes
  • SSH access changes

Verification:

  • All commands tested in staging
  • SSH access verified
  • Health check procedures validated
  • Backup/restore procedures tested
  • Emergency procedures reviewed
  • Contact information current

Version: 1.0.0
Template Last Updated: 2025-11-16