# CI/CD Best Practices

Comprehensive guide to CI/CD pipeline design, testing strategies, and deployment patterns.

## Table of Contents

- [Pipeline Design Principles](#pipeline-design-principles)
- [Testing in CI/CD](#testing-in-cicd)
- [Deployment Strategies](#deployment-strategies)
- [Dependency Management](#dependency-management)
- [Artifact & Release Management](#artifact--release-management)
- [Platform Patterns](#platform-patterns)
- [Continuous Improvement](#continuous-improvement)

---

## Pipeline Design Principles

### Fast Feedback Loops

Design pipelines so the cheapest checks run first and failures surface quickly:

**Priority ordering:**
1. Linting and code formatting (seconds)
2. Unit tests (1-5 minutes)
3. Integration tests (5-15 minutes)
4. E2E tests (15-30 minutes)
5. Deployment (varies)

**Fail fast pattern:**
```yaml
# GitHub Actions
jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm run lint

  test:
    needs: lint  # Only run if lint passes
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm test

  e2e:
    needs: [lint, test]  # Run after basic checks
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm run test:e2e
```

### Job Parallelization

Run independent jobs concurrently:

**GitHub Actions:**
```yaml
jobs:
  lint:
    runs-on: ubuntu-latest

  test:
    runs-on: ubuntu-latest
    # No 'needs' - runs in parallel with lint

  build:
    needs: [lint, test]  # Wait for both
    runs-on: ubuntu-latest
```

**GitLab CI:**
```yaml
stages:
  - validate
  - test
  - build

# Jobs in same stage run in parallel
unit-test:
  stage: test

integration-test:
  stage: test

e2e-test:
  stage: test
```

### Monorepo Strategies

**Path-based triggers (GitHub):**
```yaml
on:
  push:
    paths:
      - 'services/api/**'
      - 'shared/**'

jobs:
  api-test:
    # No job-level `if` is needed: the `paths` filter above already limits
    # this workflow to changes under services/api/ or shared/.
    # Note that contains() on an array matches whole elements only,
    # so it cannot be used to test for a path prefix.
    runs-on: ubuntu-latest
```

**GitLab rules:**
```yaml
api-test:
  rules:
    - changes:
        - services/api/**/*
        - shared/**/*

frontend-test:
  rules:
    - changes:
        - services/frontend/**/*
        - shared/**/*
```

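When one pipeline covers several services, path filters like the ones above amount to mapping changed files to the services they affect. A minimal sketch of that mapping, assuming a layout where each service lives under `services/<name>/` and any change under `shared/` affects every service. The `affected_services` helper and the `ALL_SERVICES` list are hypothetical:

```bash
# Hypothetical list of services; adjust to the repository layout.
ALL_SERVICES="api frontend"

# Reads changed file paths on stdin (one per line) and prints the
# affected service names, sorted and de-duplicated.
affected_services() {
  local path rest
  while IFS= read -r path; do
    case "$path" in
      shared/*)
        # Shared code affects every service (word splitting is intended).
        printf '%s\n' $ALL_SERVICES
        ;;
      services/*/*)
        # Keep only the directory name directly under services/.
        rest=${path#services/}
        printf '%s\n' "${rest%%/*}"
        ;;
    esac
  done | sort -u
}
```

It could be fed from Git, for example `git diff --name-only origin/main...HEAD | affected_services`, with the output deciding which test jobs to run.
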
### Matrix Builds

Test across multiple versions/platforms:

**GitHub Actions:**
```yaml
strategy:
  matrix:
    os: [ubuntu-latest, macos-latest, windows-latest]
    node: [18, 20, 22]
    include:
      - os: ubuntu-latest
        node: 22
        coverage: true
    exclude:
      - os: windows-latest
        node: 18
  fail-fast: false  # See all results
```

**GitLab parallel:**
```yaml
test:
  parallel:
    matrix:
      - NODE_VERSION: ['18', '20', '22']
        OS: ['ubuntu', 'alpine']
```

---

## Testing in CI/CD

### Test Pyramid Strategy

Maintain proper test distribution:

```
        /\
       /E2E\          10% - Slow, expensive, flaky
      /-----\
     /  Int  \        20% - Medium speed
    /---------\
   /   Unit    \      70% - Fast, reliable
  /-------------\
```

**Implementation:**
```yaml
jobs:
  unit-test:
    runs-on: ubuntu-latest
    steps:
      - run: npm run test:unit  # Fast, runs on every commit

  integration-test:
    runs-on: ubuntu-latest
    needs: unit-test
    steps:
      - run: npm run test:integration  # Medium, after unit tests

  e2e-test:
    runs-on: ubuntu-latest
    needs: [unit-test, integration-test]
    if: github.ref == 'refs/heads/main'  # Only on main branch
    steps:
      - run: npm run test:e2e  # Slow, only on main
```

### Test Splitting & Parallelization

Split large test suites:

**GitHub Actions:**
```yaml
strategy:
  matrix:
    shard: [1, 2, 3, 4]
steps:
  - run: npm test -- --shard=${{ matrix.shard }}/4
```

**Playwright example:**
```yaml
strategy:
  matrix:
    shardIndex: [1, 2, 3, 4]
    shardTotal: [4]
steps:
  - run: npx playwright test --shard=${{ matrix.shardIndex }}/${{ matrix.shardTotal }}
```

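Sharding works by partitioning the list of test files so that each runner gets a disjoint subset. Test runners use their own assignment strategies; the sketch below shows one simple approach (round-robin over a sorted file list) for suites whose runner has no built-in `--shard` flag. The `shard_files` helper is hypothetical:

```bash
# Usage: shard_files INDEX TOTAL
# Reads test file paths on stdin (one per line) and prints the subset
# that belongs to shard INDEX out of TOTAL. INDEX is 1-based, matching
# the --shard=INDEX/TOTAL convention used above.
shard_files() {
  local index=$1 total=$2
  # Sort first so every runner sees the same order, then keep every
  # TOTAL-th line starting at line INDEX.
  sort | awk -v i="$index" -v n="$total" '(NR - 1) % n == i - 1'
}
```

Because the input is sorted before it is split, every runner computes the same partition independently, and each file lands in exactly one shard.
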
### Code Coverage

**Track coverage trends:**
```yaml
- name: Run tests with coverage
  run: npm test -- --coverage

- name: Upload coverage
  uses: codecov/codecov-action@v4
  with:
    files: ./coverage/lcov.info
    fail_ci_if_error: true  # Fail if upload fails
    token: ${{ secrets.CODECOV_TOKEN }}  # v4 uploads generally need a token

- name: Coverage check
  run: |
    # Requires the json-summary coverage reporter to produce this file
    COVERAGE=$(jq -r '.total.lines.pct' coverage/coverage-summary.json)
    if (( $(echo "$COVERAGE < 80" | bc -l) )); then
      echo "Coverage $COVERAGE% is below 80%"
      exit 1
    fi
```

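The threshold comparison above depends on `bc`, which is not installed on every runner image. A minimal alternative sketch using `awk`, which also handles decimal values such as `79.5`; the `coverage_ok` helper is hypothetical:

```bash
# Usage: coverage_ok COVERAGE THRESHOLD
# Succeeds (exit 0) when COVERAGE >= THRESHOLD, fails (exit 1) otherwise.
# Both arguments may be decimals, e.g. coverage_ok 79.5 80.
coverage_ok() {
  local coverage=$1 threshold=$2
  # Adding 0 forces a numeric comparison rather than a string comparison.
  awk -v c="$coverage" -v t="$threshold" 'BEGIN { exit !(c + 0 >= t + 0) }'
}
```

In a CI step this could be used as `coverage_ok "$COVERAGE" 80 || exit 1`.
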
### Test Environment Management

**Docker Compose for services:**
```yaml
jobs:
  integration-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      # Compose v2 syntax (`docker compose`); the legacy docker-compose v1 binary is end-of-life
      - name: Start services
        run: docker compose up -d postgres redis

      - name: Wait for services
        run: |
          timeout 30 bash -c 'until docker compose exec -T postgres pg_isready; do sleep 1; done'

      - name: Run tests
        run: npm run test:integration

      - name: Cleanup
        if: always()
        run: docker compose down
```

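The wait step above is specific to Postgres. The same idea generalizes to any readiness probe: retry a command a bounded number of times, pausing between attempts. A small sketch with a hypothetical `wait_for` helper (the name and argument order are assumptions, not part of any tool):

```bash
# Usage: wait_for ATTEMPTS INTERVAL_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND until it exits 0. Returns 1 once ATTEMPTS tries have failed.
wait_for() {
  local attempts=$1 interval=$2 attempt=1
  shift 2
  until "$@"; do
    if [ "$attempt" -ge "$attempts" ]; then
      echo "Gave up after $attempts attempts: $*" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$interval"
  done
}
```

For example, `wait_for 30 1 docker compose exec -T postgres pg_isready` would poll once per second for up to 30 attempts.
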
**GitLab services:**
```yaml
integration-test:
  services:
    - postgres:15
    - redis:7-alpine
  variables:
    POSTGRES_DB: testdb
    POSTGRES_PASSWORD: password
  script:
    - npm run test:integration
```

---

## Deployment Strategies

### Deployment Patterns

**1. Direct Deployment (Simple)**
```yaml
deploy:
  if: github.ref == 'refs/heads/main'
  steps:
    - run: |
        aws s3 sync dist/ s3://${{ secrets.S3_BUCKET }}
        aws cloudfront create-invalidation --distribution-id ${{ secrets.CF_DIST }} --paths "/*"
```

**2. Blue-Green Deployment**
```yaml
deploy:
  steps:
    # Assumes the new build is already running in the staging slot
    - name: Swap staging slot into production
      run: az webapp deployment slot swap --slot staging --resource-group $RG --name $APP

    - name: Health check
      run: |
        for i in {1..10}; do
          if curl -f https://$APP.azurewebsites.net/health; then
            echo "Health check passed"
            exit 0
          fi
          sleep 10
        done
        exit 1

    # Swapping again puts the previous build back into production
    - name: Rollback on failure
      if: failure()
      run: az webapp deployment slot swap --slot staging --resource-group $RG --name $APP
```

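**3. Canary Deployment**
```yaml
deploy-canary:
  steps:
    # Assumes a separate `app-canary` Deployment behind the same Service as `app`,
    # so the canary pod receives only a small share of traffic
    - run: kubectl set image deployment/app-canary app=myapp:${{ github.sha }}
    - run: kubectl scale deployment app-canary --replicas=1  # 1 pod
    - run: sleep 300  # Monitor for 5 minutes
    - run: kubectl set image deployment/app app=myapp:${{ github.sha }}  # Promote to all pods
    - run: kubectl rollout status deployment/app --timeout=5m
    - run: kubectl scale deployment app-canary --replicas=0
```
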
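A fixed `sleep 300` only waits; something still has to decide whether the canary is healthy. A minimal sketch of that decision, assuming request and error counts for the canary have already been fetched from a monitoring system. The `canary_verdict` helper, its arguments, and the percentage threshold are all hypothetical:

```bash
# Usage: canary_verdict ERRORS TOTAL MAX_ERROR_PCT
# Prints "promote" when the error rate is at or below MAX_ERROR_PCT,
# "rollback" otherwise. With no traffic there is no evidence that the
# canary works, so that case also yields "rollback".
canary_verdict() {
  local errors=$1 total=$2 max_pct=$3
  if [ "$total" -eq 0 ]; then
    echo rollback
    return
  fi
  # Integer comparison without division: errors/total <= max_pct/100
  if [ $((errors * 100)) -le $((max_pct * total)) ]; then
    echo promote
  else
    echo rollback
  fi
}
```

A pipeline step could branch on the result, for example `[ "$(canary_verdict "$ERRORS" "$TOTAL" 1)" = promote ] || exit 1`, so that a failing verdict stops the rollout before the full deployment is updated.
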
### Environment Management

**GitHub Environments:**
```yaml
jobs:
  deploy-staging:
    environment:
      name: staging
      url: https://staging.example.com
    steps:
      - run: ./deploy.sh staging

  deploy-production:
    needs: deploy-staging
    environment:
      name: production
      url: https://example.com
    steps:
      - run: ./deploy.sh production
```

**Protection rules:**
- Require approval for production
- Restrict deployments to specific branches
- Add a wait timer before deployment

**GitLab environments:**
```yaml
deploy:staging:
  stage: deploy
  environment:
    name: staging
    url: https://staging.example.com
    on_stop: stop:staging  # Teardown job, must be defined elsewhere in the pipeline
  only:
    - develop

deploy:production:
  stage: deploy
  environment:
    name: production
    url: https://example.com
  when: manual  # Require manual trigger
  only:
    - main
```

### Deployment Gates

**Pre-deployment checks:**
```yaml
pre-deploy-checks:
  steps:
    - name: Check migration status
      run: ./scripts/check-migrations.sh

    - name: Verify dependencies
      run: npm audit --audit-level=high

    - name: Check service health
      run: curl -f https://api.example.com/health
```

**Post-deployment validation:**
```yaml
post-deploy-validation:
  needs: deploy
  steps:
    - name: Smoke tests
      run: npm run test:smoke

    - name: Monitor errors
      run: |
        # `datadog-api` is a placeholder for whatever command queries your monitoring system
        ERROR_COUNT=$(datadog-api errors --since 5m)
        if [ "$ERROR_COUNT" -gt 10 ]; then
          echo "Error spike detected!"
          exit 1
        fi
```

---

## Dependency Management

### Lock Files

**Always commit lock files:**
- `package-lock.json` (npm)
- `yarn.lock` (Yarn)
- `pnpm-lock.yaml` (pnpm)
- `Cargo.lock` (Rust)
- `Gemfile.lock` (Ruby)
- `poetry.lock` (Python)

**Use deterministic install commands:**
```bash
# Good - installs exactly what the lock file specifies
npm ci  # Not npm install
yarn install --frozen-lockfile  # Yarn 1; use --immutable on Yarn 2+
pnpm install --frozen-lockfile
pip install -r requirements.txt  # Deterministic only if versions are pinned

# Bad - may update the lock file
npm install
```

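A CI script shared across repositories can choose the deterministic install command from whichever lock file is present. A sketch under that assumption; the `install_command` helper is hypothetical and only prints the command, leaving the caller to decide how to run it:

```bash
# Usage: install_command [PROJECT_DIR]
# Prints the lock-file-respecting install command for the project,
# or returns 1 if no known lock file is found.
install_command() {
  local dir=${1:-.}
  if [ -f "$dir/package-lock.json" ]; then
    echo "npm ci"
  elif [ -f "$dir/yarn.lock" ]; then
    echo "yarn install --frozen-lockfile"
  elif [ -f "$dir/pnpm-lock.yaml" ]; then
    echo "pnpm install --frozen-lockfile"
  else
    echo "No lock file found in $dir; commit one before running CI" >&2
    return 1
  fi
}
```

Failing when no lock file exists is a deliberate choice here: falling back to `npm install` would quietly reintroduce non-deterministic builds.
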
### Dependency Caching

**See `optimization.md` for detailed caching strategies.**

Quick reference:
- Hash lock files for cache keys
- Include OS/platform in cache key
- Use restore-keys for partial matches
- Separate cache for build artifacts vs dependencies

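The first two points combine into a single key: the platform name plus a hash of the lock file, so the cache is invalidated exactly when dependencies change. Platforms with a built-in hashing expression cover this already; the sketch below shows the idea for setups without one. The `cache_key` helper and the `deps-` prefix are assumptions:

```bash
# Usage: cache_key LOCK_FILE
# Prints a key of the form deps-<os>-<sha256 of the lock file>.
cache_key() {
  local lock_file=$1 os hash
  os=$(uname -s | tr '[:upper:]' '[:lower:]')
  if command -v sha256sum >/dev/null 2>&1; then
    hash=$(sha256sum "$lock_file" | cut -d' ' -f1)
  else
    # macOS ships shasum instead of sha256sum
    hash=$(shasum -a 256 "$lock_file" | cut -d' ' -f1)
  fi
  echo "deps-$os-$hash"
}
```

A prefix such as `deps-<os>-` can then serve as the partial-match fallback, restoring the most recent cache for the same platform when the exact key misses.
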
### Security Scanning

**Automated vulnerability checks:**
```yaml
security-scan:
  steps:
    - name: Dependency audit
      run: |
        npm audit --audit-level=high
        # Or: pip-audit, cargo audit, bundle audit

    # CodeQL must be initialized before the analyze step
    - name: Initialize CodeQL
      uses: github/codeql-action/init@v3
      with:
        languages: javascript

    - name: SAST scanning
      uses: github/codeql-action/analyze@v3

    # --exit-code 1 makes the job fail when matching vulnerabilities are found
    - name: Container scanning
      run: trivy image --exit-code 1 --severity HIGH,CRITICAL myapp:${{ github.sha }}
```

### Dependency Updates

**Automated dependency updates:**
- Dependabot (GitHub)
- Renovate
- GitLab Dependency Scanning (reports vulnerable dependencies rather than opening update requests; pair it with Renovate)

**Configuration example (Dependabot):**
```yaml
# .github/dependabot.yml
version: 2
updates:
  - package-ecosystem: npm
    directory: "/"
    schedule:
      interval: weekly
    open-pull-requests-limit: 5
    groups:
      dev-dependencies:
        dependency-type: development
```

---

## Artifact & Release Management

### Artifact Strategy

**Build once, deploy many:**
```yaml
build:
  steps:
    - run: npm run build
    - uses: actions/upload-artifact@v4
      with:
        name: dist-${{ github.sha }}
        path: dist/
        retention-days: 7

deploy-staging:
  needs: build
  steps:
    - uses: actions/download-artifact@v4
      with:
        name: dist-${{ github.sha }}
    - run: ./deploy.sh staging

deploy-production:
  needs: [build, deploy-staging]
  steps:
    - uses: actions/download-artifact@v4
      with:
        name: dist-${{ github.sha }}
    - run: ./deploy.sh production
```

### Container Image Management

**Multi-stage builds:**
```dockerfile
# Build stage
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
# Install all dependencies: the build step needs devDependencies
RUN npm ci
COPY . .
RUN npm run build
# Drop devDependencies so only runtime packages reach the final image
RUN npm prune --omit=dev

# Production stage
FROM node:20-alpine
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
USER node
CMD ["node", "dist/server.js"]
```

**Image tagging strategy:**
```yaml
- name: Build and tag images
  run: |
    docker build -t myapp:${{ github.sha }} .
    docker tag myapp:${{ github.sha }} myapp:latest
    docker tag myapp:${{ github.sha }} myapp:v1.2.3  # Example version; derive it from the Git tag
```

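Rather than hard-coding a version such as `v1.2.3`, the tag list can be derived from the Git ref that triggered the build. A sketch of one possible convention: an immutable SHA tag on every build, a version tag on `v*` Git tags, and `latest` only on `main`. The `image_tags` helper and the convention itself are assumptions:

```bash
# Usage: image_tags IMAGE GIT_REF GIT_SHA
# Prints one image:tag per line.
image_tags() {
  local image=$1 ref=$2 sha=$3
  # Every build gets an immutable tag that identifies the exact commit.
  echo "$image:$sha"
  case "$ref" in
    refs/tags/v*)
      # Release tags such as refs/tags/v1.2.3 become IMAGE:v1.2.3.
      echo "$image:${ref#refs/tags/}"
      ;;
    refs/heads/main)
      echo "$image:latest"
      ;;
  esac
}
```

On GitHub Actions the inputs could come from the `GITHUB_REF` and `GITHUB_SHA` environment variables, with each printed line passed to `docker tag`.
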
### Release Automation

**Semantic versioning:**
```yaml
release:
  if: startsWith(github.ref, 'refs/tags/v')
  runs-on: ubuntu-latest
  permissions:
    contents: write
  steps:
    # Uses the GitHub CLI, since actions/create-release is archived
    - run: |
        gh release create "$GITHUB_REF_NAME" \
          --title "Release $GITHUB_REF_NAME" \
          --generate-notes
      env:
        GH_TOKEN: ${{ github.token }}
        GH_REPO: ${{ github.repository }}
```

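The `v*` tags that trigger a release job like this follow semantic versioning (`MAJOR.MINOR.PATCH`). A sketch of computing the next version from the current one; the `bump_version` helper is hypothetical and ignores pre-release and build-metadata suffixes:

```bash
# Usage: bump_version VERSION major|minor|patch
# VERSION may carry a leading "v", which is preserved in the output.
bump_version() {
  local version=$1 part=$2 prefix="" major minor patch rest
  case "$version" in
    v*) prefix=v; version=${version#v} ;;
  esac
  # Split MAJOR.MINOR.PATCH on the dots.
  major=${version%%.*}
  rest=${version#*.}
  minor=${rest%%.*}
  patch=${rest#*.}
  case "$part" in
    major) major=$((major + 1)); minor=0; patch=0 ;;
    minor) minor=$((minor + 1)); patch=0 ;;
    patch) patch=$((patch + 1)) ;;
    *) echo "Unknown part: $part" >&2; return 1 ;;
  esac
  echo "$prefix$major.$minor.$patch"
}
```

Bumping a component resets every component to its right, which is what keeps versions ordered: a `minor` bump of `v1.2.3` gives `v1.3.0`, not `v1.3.3`.
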
**Changelog generation:**
```yaml
- name: Generate changelog
  run: |
    # Needs full history and tags (e.g. actions/checkout with fetch-depth: 0)
    git log "$(git describe --tags --abbrev=0)"..HEAD \
      --pretty=format:"- %s (%h)" > CHANGELOG.md
```

---

## Platform Patterns

### GitHub Actions

**Reusable workflows:**
```yaml
# .github/workflows/reusable-test.yml
on:
  workflow_call:
    inputs:
      node-version:
        required: true
        type: string

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: ${{ inputs.node-version }}
      - run: npm ci
      - run: npm test
```

**Composite actions:**
```yaml
# .github/actions/setup-app/action.yml
name: Setup Application
description: Install Node.js and project dependencies
runs:
  using: composite
  steps:
    - uses: actions/setup-node@v4
      with:
        node-version: 20
    - run: npm ci
      shell: bash
```

### GitLab CI

**Templates & extends:**
```yaml
.test_template:
  image: node:20
  before_script:
    - npm ci
  script:
    - npm test

unit-test:
  extends: .test_template
  script:
    - npm run test:unit

integration-test:
  extends: .test_template
  script:
    - npm run test:integration
```

**Dynamic child pipelines:**
```yaml
generate-pipeline:
  script:
    - ./generate-config.sh > pipeline.yml
  artifacts:
    paths:
      - pipeline.yml

trigger-pipeline:
  needs: [generate-pipeline]  # Wait until the generated config exists
  trigger:
    include:
      - artifact: pipeline.yml
        job: generate-pipeline
```

---

## Continuous Improvement

### Metrics to Track

- **Build duration:** Target < 10 minutes
- **Failure rate:** Target < 5%
- **Time to recovery:** Target < 1 hour
- **Deployment frequency:** Aim for multiple deployments per day
- **Lead time:** Target < 1 day from commit to production

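The first two metrics are straightforward to compute from a pipeline history export. A sketch assuming a hypothetical input format of one run per line, `STATUS DURATION_SECONDS`, where `STATUS` is `success` or `failed` (real CI APIs expose this data in their own formats). The `pipeline_stats` helper is likewise hypothetical:

```bash
# Reads "STATUS DURATION_SECONDS" lines on stdin and prints the run count,
# the failure rate as a percentage, and the mean duration in minutes.
pipeline_stats() {
  awk '
    { runs++; total += $2 }
    $1 == "failed" { failed++ }
    END {
      if (runs == 0) { print "runs=0"; exit }
      printf "runs=%d failure_rate=%.1f%% avg_minutes=%.1f\n",
             runs, 100 * failed / runs, total / runs / 60
    }
  '
}
```

Comparing the printed values against the targets above (for example, a failure rate under 5%) turns the list into something a scheduled job can check automatically.
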
### Pipeline Optimization Checklist

- [ ] Jobs run in parallel where possible
- [ ] Dependencies are cached
- [ ] Test suite is properly split
- [ ] Linting fails fast
- [ ] Only necessary tests run on PRs
- [ ] Artifacts are reused across jobs
- [ ] Pipeline has appropriate timeouts
- [ ] Flaky tests are identified and fixed
- [ ] Security scanning is automated
- [ ] Deployment requires approval

### Regular Reviews

**Monthly:**
- Review build duration trends
- Analyze failure patterns
- Update dependencies
- Review security scan results

**Quarterly:**
- Audit pipeline efficiency
- Review deployment frequency
- Update CI/CD tools and actions
- Team retrospective on CI/CD pain points