DevOps Practices
CI/CD, Infrastructure, Deployment, and Monitoring
Consolidated from:
- devops-engineer skills
- cloud-architect skills
- site-reliability-engineer skills
- release-manager skills
CI/CD Patterns Skill
Expert-level CI/CD pipeline design patterns and best practices
Core Principles
- Pipeline as Code: All pipeline configuration in version control
- Fast Feedback: Fail fast, provide clear error messages
- Build Once: Build artifacts once, deploy everywhere
- Idempotent: Running twice produces same result
- Secure by Default: Security scanning integrated, not optional
Multi-Stage Pipeline Pattern
┌─────────┐   ┌──────┐   ┌──────────┐   ┌────────┐   ┌────────┐
│  Build  │──>│ Test │──>│ Security │──>│ Deploy │──>│ Verify │
└─────────┘   └──────┘   └──────────┘   └────────┘   └────────┘
   Fast        Medium        Slow         Manual       Quick
 (<2 min)     (<5 min)    (<10 min)     (Approval)    (<2 min)
Stage Ordering
- Build: Compile code, create artifacts (fast fail)
- Test: Unit → Integration → E2E (fastest first)
- Security: SAST → Dependency scan → Container scan
- Deploy: Dev → Staging → Prod (progressive)
- Verify: Smoke tests, health checks
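The ordering above only pays off if a failure in an early stage stops the later ones. The sketch below illustrates that fail-fast behavior; `runPipeline` and the stage objects are hypothetical names invented for illustration, not part of any CI product.

```javascript
// Hypothetical runner: execute stages in order and stop at the first failure,
// so slow stages never run after a cheap one has already failed.
function runPipeline(stages) {
  const executed = [];
  for (const { name, run } of stages) {
    executed.push(name);
    if (!run()) {
      return { ok: false, failedAt: name, executed };
    }
  }
  return { ok: true, failedAt: null, executed };
}

const result = runPipeline([
  { name: "build", run: () => true },
  { name: "test", run: () => false }, // simulated test failure
  { name: "security", run: () => true },
  { name: "deploy", run: () => true },
]);

// ok is false, failedAt is "test", and only build and test were executed
console.log(result);
```

Real CI systems provide this behavior through stage dependencies; the point of the sketch is only that cheap checks should guard expensive ones.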
Optimization Strategies
1. Caching
# Cache dependencies
cache:
  key: ${CI_COMMIT_REF_SLUG}
  paths:
    - node_modules/
    - .pip/
    - .m2/
    - .gradle/
Impact: typically 50-80% faster builds when the cache is warm
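A cache keyed on the branch name is reused even after dependencies change. A common refinement is to key the cache on a hash of the lockfile (GitLab offers this through `cache:key:files`). The sketch below shows the idea only; `cacheKey` and the `npm` prefix are hypothetical names.

```javascript
const crypto = require("node:crypto");

// Hypothetical helper: derive a cache key from the lockfile contents so the
// cache is invalidated only when dependencies actually change.
function cacheKey(prefix, lockfileContents) {
  const digest = crypto
    .createHash("sha256")
    .update(lockfileContents)
    .digest("hex")
    .slice(0, 16); // a short digest keeps the key readable
  return `${prefix}-${digest}`;
}

// Sample lockfile contents, invented for illustration.
const lockA = '{"lockfileVersion":3,"packages":{"left-pad":"1.3.0"}}';
const lockB = '{"lockfileVersion":3,"packages":{"left-pad":"1.3.1"}}';

console.log(cacheKey("npm", lockA) === cacheKey("npm", lockA)); // true: same lockfile, same key
console.log(cacheKey("npm", lockA) === cacheKey("npm", lockB)); // false: changed lockfile, new key
```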
2. Parallelization
# Run tests in parallel
test:
  parallel: 4
  script:
    - npm test -- --shard=${CI_NODE_INDEX}/${CI_NODE_TOTAL}
Impact: up to 4x faster test execution with 4 runners, if tests split evenly
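Sharding works because every runner computes the same deterministic split and takes only its own slice. The sketch below shows one simple way to do that; `shardFiles` is a hypothetical helper and is not the algorithm any particular test runner uses.

```javascript
// Hypothetical helper: pick the files for one shard. `index` is 1-based,
// matching the CI_NODE_INDEX convention used in the job above.
function shardFiles(files, index, total) {
  if (index < 1 || index > total) {
    throw new RangeError(`index must be between 1 and ${total}`);
  }
  // Sort first so every runner sees the same order and shards never overlap.
  return [...files].sort().filter((_, i) => i % total === index - 1);
}

const files = ["a.test.js", "b.test.js", "c.test.js", "d.test.js", "e.test.js"];
console.log(shardFiles(files, 1, 2)); // a, c, e
console.log(shardFiles(files, 2, 2)); // b, d
```

Round-robin assignment balances file counts, not durations; splitting by recorded test timings would give a more even wall-clock result.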
3. Conditional Execution
# Skip unnecessary steps
deploy:
  only:
    # refs and changes must both be nested under only:
    refs:
      - main
      - /^release-.*$/
    changes:
      - src/**/*
      - Dockerfile
Impact: up to about 70% fewer unnecessary runs, depending on how often unrelated files change
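The conditional rule above combines two independent tests: is the branch allowed, and did a relevant path change? A minimal sketch of that decision follows; `shouldDeploy` is a hypothetical function, and the prefix check is a simplification of real glob matching.

```javascript
// Hypothetical helper mirroring the rule above: deploy only from `main` or a
// `release-*` branch, and only when `src/` or the Dockerfile changed.
function shouldDeploy(branch, changedFiles) {
  const branchAllowed = branch === "main" || /^release-.*$/.test(branch);
  const relevantChange = changedFiles.some(
    (file) => file.startsWith("src/") || file === "Dockerfile"
  );
  return branchAllowed && relevantChange;
}

console.log(shouldDeploy("main", ["src/index.js"]));      // true
console.log(shouldDeploy("main", ["README.md"]));         // false: nothing relevant changed
console.log(shouldDeploy("feature/x", ["src/index.js"])); // false: branch not allowed
console.log(shouldDeploy("release-1.2", ["Dockerfile"])); // true
```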
4. Docker Layer Caching
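# Multi-stage build
FROM node:18 AS builder
WORKDIR /app
# Copy manifests first so the dependency layer stays cached until they change
COPY package*.json ./
# Install all dependencies; the build step usually needs devDependencies
RUN npm ci
COPY . .
RUN npm run build
# Drop devDependencies before node_modules is copied to the runtime image
RUN npm prune --omit=dev
FROM node:18-alpine
WORKDIR /app
COPY --from=builder /app/dist /app/dist
COPY --from=builder /app/node_modules /app/node_modules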
Impact: up to 10x faster Docker rebuilds when only application code changes
Security Scanning Integration
SAST (Static Application Security Testing)
sast:
  stage: security
  image: returntocorp/semgrep
  script:
    # --gitlab-sast writes the report format that the sast report type expects
    - semgrep --config=auto --gitlab-sast --output=sast-report.json .
  artifacts:
    reports:
      sast: sast-report.json
Tools:
- Semgrep: Fast, customizable (free)
- SonarQube: Comprehensive code quality
- CodeQL: GitHub's semantic analysis
Dependency Scanning
dependency_scan:
  stage: security
  script:
    - npm audit --audit-level=high
    - snyk test --severity-threshold=high
  allow_failure: false  # Fail on high or critical vulnerabilities
Tools:
- Snyk: Comprehensive, auto-fix (free tier)
- Dependabot: GitHub native
- npm audit: Built into npm
- safety: Python packages
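Failing on a severity threshold comes down to counting findings at or above that level. The sketch below assumes a summary object shaped like the `metadata.vulnerabilities` field of `npm audit --json`; `findingsAtOrAbove` is a hypothetical helper and the counts are invented.

```javascript
// Severity levels in ascending order, as used by `npm audit`.
const LEVELS = ["info", "low", "moderate", "high", "critical"];

// Hypothetical gate: count findings at or above `minLevel`.
function findingsAtOrAbove(counts, minLevel) {
  const start = LEVELS.indexOf(minLevel);
  if (start === -1) throw new Error(`unknown level: ${minLevel}`);
  return LEVELS.slice(start).reduce((sum, level) => sum + (counts[level] ?? 0), 0);
}

// Sample data, invented for illustration.
const counts = { info: 0, low: 4, moderate: 2, high: 1, critical: 0 };
console.log(findingsAtOrAbove(counts, "high"));     // 1, so the job should fail
console.log(findingsAtOrAbove(counts, "critical")); // 0, so the job may pass
```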
Container Scanning
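container_scan:
  stage: security
  image:
    name: aquasec/trivy
    entrypoint: [""]  # the image's entrypoint is the trivy binary; clear it so the script can run
  script:
    # --exit-code 1 makes the job fail when matching vulnerabilities are found
    - trivy image --exit-code 1 --severity HIGH,CRITICAL myapp:${CI_COMMIT_SHA}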
Tools:
- Trivy: Fast, accurate (free)
- Clair: Originally a CoreOS project, now maintained under Project Quay
- Anchore: Policy-based
Secret Detection
secrets_scan:
  stage: security
  image: zricethezav/gitleaks
  script:
    - gitleaks detect --source . --verbose
Tools:
- Gitleaks: Fast, configurable
- TruffleHog: High accuracy
- git-secrets: AWS focus
Testing Strategies
Test Pyramid
        /\
       /  \            E2E Tests (5%)
      /____\           Slow, brittle
     /      \
    /        \         Integration (15%)
   /__________\
  /            \
 /              \      Unit Tests (80%)
/________________\     Fast, reliable
Test Execution Order
- Linting: Fastest, catches syntax errors
- Unit tests: Fast, isolated
- Integration tests: Medium, database/API
- E2E tests: Slow, full system
Coverage Requirements
test:
  script:
    - npm test -- --coverage --coverageThreshold='{"global":{"branches":80,"functions":80,"lines":80}}'
Thresholds:
- Unit: ≥80% coverage (enforce)
- Integration: ≥60% coverage (goal)
- E2E: Critical paths only
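Enforcing a threshold means comparing each measured metric against its minimum and failing the job if any falls short. The sketch below shows that comparison in isolation; `coverageFailures` is a hypothetical helper and the measured values are invented.

```javascript
// Hypothetical check: compare measured coverage against minimum thresholds
// and list every metric that falls short.
function coverageFailures(measured, thresholds) {
  return Object.entries(thresholds)
    .filter(([metric, minimum]) => (measured[metric] ?? 0) < minimum)
    .map(([metric, minimum]) => `${metric}: ${measured[metric] ?? 0}% < ${minimum}%`);
}

const thresholds = { branches: 80, functions: 80, lines: 80 };
const measured = { branches: 72.5, functions: 91, lines: 80 }; // sample values

const failures = coverageFailures(measured, thresholds);
console.log(failures); // only the branches metric is reported
// In CI, a non-empty list would translate into a non-zero exit code.
```

A value exactly at the threshold passes, which matches the "≥80%" wording above.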
Artifact Management
Build Artifacts
build:
  script:
    - npm run build
  artifacts:
    name: "build-${CI_COMMIT_SHA}"
    paths:
      - dist/
    expire_in: 1 week
Docker Images
build_image:
  script:
    - docker build -t ${REGISTRY}/${IMAGE}:${CI_COMMIT_SHA} .
    - docker tag ${REGISTRY}/${IMAGE}:${CI_COMMIT_SHA} ${REGISTRY}/${IMAGE}:latest
    - docker push ${REGISTRY}/${IMAGE}:${CI_COMMIT_SHA}
    - docker push ${REGISTRY}/${IMAGE}:latest
Tagging Strategy:
- Commit SHA: Immutable, traceable
- Semantic version: v1.2.3 (releases)
- Branch name: develop, staging
- latest: Most recent (use with caution)
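One way to apply the tagging strategy is to compute the tag list per build from the commit, branch, and release version. The sketch below is one possible policy, not a standard: `imageTags` is a hypothetical helper, and moving `latest` only on semantic-version releases is an assumption made here for illustration.

```javascript
// Hypothetical helper: compute the tags to push for one build.
// The commit SHA is always included; `latest` is added only for releases.
function imageTags({ sha, branch, version }) {
  const tags = [sha]; // immutable and traceable
  if (version && /^v\d+\.\d+\.\d+$/.test(version)) {
    tags.push(version, "latest");
  } else if (branch) {
    // Branch names may contain characters that are invalid in tags, such as "/"
    tags.push(branch.replace(/[^A-Za-z0-9_.-]/g, "-"));
  }
  return tags;
}

console.log(imageTags({ sha: "3f2a9c1", branch: "develop" }));
// 3f2a9c1, develop
console.log(imageTags({ sha: "3f2a9c1", branch: "main", version: "v1.2.3" }));
// 3f2a9c1, v1.2.3, latest
console.log(imageTags({ sha: "3f2a9c1", branch: "feature/login" }));
// 3f2a9c1, feature-login
```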
Deployment Patterns
Environment Progression
Commit → Dev (auto) → Staging (auto) → Prod (manual)
Deployment with Approval
deploy_prod:
  stage: deploy
  environment:
    name: production
    url: https://app.example.com
  when: manual  # Require manual trigger
  only:
    - main
  script:
    - ./deploy.sh production
Deployment with Verification
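deploy:
  script:
    - ./deploy.sh
    - |
      # Poll the health endpoint for up to 5 minutes (30 attempts, 10 s apart).
      # seq is used instead of {1..30}, which only expands in bash.
      for i in $(seq 1 30); do
        if curl -fsS https://app.example.com/health > /dev/null; then
          echo "Deployment successful!"
          exit 0
        fi
        sleep 10
      done
      echo "Deployment failed!" >&2
      exit 1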
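The same poll-until-healthy logic can be written as a small function when verification runs from a script rather than inline shell. This is a sketch under assumptions: `waitForHealthy` is a hypothetical name, the check is simulated, and `sleep` is injectable so the loop can be exercised without real delays.

```javascript
// Hypothetical poller: call `check` until it returns true or attempts run out.
async function waitForHealthy(check, { attempts = 30, delayMs = 10000, sleep } = {}) {
  const wait = sleep ?? ((ms) => new Promise((resolve) => setTimeout(resolve, ms)));
  for (let attempt = 1; attempt <= attempts; attempt++) {
    if (await check()) return { healthy: true, attempts: attempt };
    if (attempt < attempts) await wait(delayMs); // no delay after the final attempt
  }
  return { healthy: false, attempts };
}

// Simulated endpoint that becomes healthy on the third probe.
let probes = 0;
const fakeCheck = async () => ++probes >= 3;

waitForHealthy(fakeCheck, { attempts: 5, sleep: async () => {} }).then((result) => {
  console.log(result); // healthy after 3 attempts
});
```

In a real pipeline, `check` would wrap an HTTP request to the health endpoint and return whether the response was successful.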
Rollback on Failure
deploy:
  script:
    - ./deploy.sh || (./rollback.sh && exit 1)
Notification Patterns
Slack Notifications
notify_slack:
  stage: .post
  when: on_failure
  script:
    - |
      curl -X POST -H 'Content-type: application/json' \
        --data "{
          \"text\": \"Pipeline failed for ${CI_PROJECT_NAME} on ${CI_COMMIT_BRANCH}\",
          \"attachments\": [{
            \"color\": \"danger\",
            \"fields\": [{
              \"title\": \"Commit\",
              \"value\": \"${CI_COMMIT_SHORT_SHA}: ${CI_COMMIT_MESSAGE}\"
            }, {
              \"title\": \"Author\",
              \"value\": \"${CI_COMMIT_AUTHOR}\"
            }, {
              \"title\": \"Pipeline\",
              \"value\": \"${CI_PIPELINE_URL}\"
            }]
          }]
        }" \
        ${SLACK_WEBHOOK_URL}
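The curl command above escapes JSON by hand, which breaks as soon as a commit message contains a double quote or a newline. A safer approach is to let a JSON serializer do the escaping. The sketch below builds the same payload; `slackPayload` is a hypothetical helper and the sample values are invented.

```javascript
// Hypothetical helper: build the Slack payload with JSON.stringify so quotes
// and newlines in the commit message are escaped correctly.
function slackPayload(env) {
  return JSON.stringify({
    text: `Pipeline failed for ${env.CI_PROJECT_NAME} on ${env.CI_COMMIT_BRANCH}`,
    attachments: [
      {
        color: "danger",
        fields: [
          { title: "Commit", value: `${env.CI_COMMIT_SHORT_SHA}: ${env.CI_COMMIT_MESSAGE}` },
          { title: "Author", value: env.CI_COMMIT_AUTHOR },
          { title: "Pipeline", value: env.CI_PIPELINE_URL },
        ],
      },
    ],
  });
}

// Sample values; in CI they would come from process.env.
const body = slackPayload({
  CI_PROJECT_NAME: "myapp",
  CI_COMMIT_BRANCH: "main",
  CI_COMMIT_SHORT_SHA: "3f2a9c1",
  CI_COMMIT_MESSAGE: 'fix: handle "quoted" input\n',
  CI_COMMIT_AUTHOR: "Dev <dev@example.com>",
  CI_PIPELINE_URL: "https://gitlab.example.com/myorg/myapp/-/pipelines/1",
});

// The payload round-trips through JSON.parse with the message intact.
console.log(JSON.parse(body).attachments[0].fields[0].value);
```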
Email on Production Deploy
notify_email:
  stage: .post
  only:
    - main
  script:
    - |
      echo "Deployed ${CI_COMMIT_SHORT_SHA} to production" | \
        mail -s "Production Deployment" team@example.com
Branch Protection
Required Checks
# .github/workflows/required-checks.yml
name: Required Checks
on: [pull_request]
jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci  # install dependencies; each job starts from a clean runner
      - run: npm run lint
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      - run: npm test
  security:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm audit
GitHub Branch Protection Rules
- Require pull request reviews (1-2 reviewers)
- Require status checks to pass
- Require branches to be up to date
- Include administrators
- Restrict force pushes
Common Patterns by Platform
GitHub Actions
name: CI/CD
on:
  push:
    branches: [main, develop]
  pull_request:
    branches: [main]
env:
  NODE_VERSION: 18
  REGISTRY: ghcr.io
jobs:
  build-and-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: ${{ env.NODE_VERSION }}
          cache: 'npm'
      - run: npm ci
      - run: npm run build
      - run: npm test -- --coverage
      - uses: codecov/codecov-action@v3
GitLab CI
stages:
  - build
  - test
  - security
  - deploy
variables:
  DOCKER_DRIVER: overlay2
  SECURE_ANALYZERS_PREFIX: "registry.gitlab.com/security-products"
build:
  stage: build
  script:
    - npm ci
    - npm run build
  artifacts:
    paths:
      - dist/
  cache:
    key: ${CI_COMMIT_REF_SLUG}
    paths:
      - node_modules/
test:
  stage: test
  script:
    - npm test -- --coverage
  coverage: '/All files[^|]*\|[^|]*\s+([\d\.]+)/'
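The `coverage:` pattern extracts a percentage from the job log. To see what it captures, the same expression can be run against a sample summary row. The row below is invented, laid out like the summary table Jest's text reporter prints.

```javascript
// The pattern from the `coverage:` key above, written as a JavaScript regex.
const coveragePattern = /All files[^|]*\|[^|]*\s+([\d\.]+)/;

// Sample summary row (values invented for illustration).
const line = "All files      |   85.71 |    66.67 |     100 |   85.71 |";

const match = coveragePattern.exec(line);
console.log(match[1]); // 85.71, the first percentage column (statements)
```

Because the pattern stops at the first numeric column, the reported figure is statement coverage, not line or branch coverage.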
Jenkins
pipeline {
agent any
environment {
NODE_VERSION = '18'
REGISTRY = 'registry.example.com'
}
stages {
stage('Build') {
steps {
sh 'npm ci'
sh 'npm run build'
}
}
stage('Test') {
parallel {
stage('Unit') {
steps {
sh 'npm test'
}
}
stage('Lint') {
steps {
sh 'npm run lint'
}
}
}
}
stage('Security') {
steps {
sh 'npm audit'
sh 'snyk test'
}
}
stage('Deploy') {
when {
branch 'main'
}
steps {
sh './deploy.sh'
}
}
}
post {
always {
junit 'reports/**/*.xml'
publishHTML([
reportDir: 'coverage',
reportFiles: 'index.html',
reportName: 'Coverage'
])
}
failure {
emailext(
subject: "Build Failed: ${env.JOB_NAME}",
body: "Check ${env.BUILD_URL}",
to: "${env.CHANGE_AUTHOR_EMAIL}"
)
}
}
}
Cost Optimization
GitHub Actions
- Use caching (50% faster, free)
- Use matrix builds sparingly
- Self-hosted runners for private repos
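- Cost: $0.008/minute (Linux) for private repositories once included minutes are used; standard runners are free for public repositories (verify against current pricing)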
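A per-minute rate is easier to reason about as a monthly figure. The sketch below estimates it from run frequency and duration; `monthlyCost` is a hypothetical helper, the workload numbers are invented, and the default rate is simply the Linux figure quoted above.

```javascript
// Hypothetical estimate: monthly cost of hosted runner minutes.
// Pass the current rate explicitly if it differs from the quoted default.
function monthlyCost({ runsPerDay, minutesPerRun, workDays = 22, freeMinutes = 0, ratePerMinute = 0.008 }) {
  const minutes = runsPerDay * minutesPerRun * workDays;
  const billable = Math.max(0, minutes - freeMinutes);
  return { minutes, billable, cost: Number((billable * ratePerMinute).toFixed(2)) };
}

console.log(monthlyCost({ runsPerDay: 40, minutesPerRun: 18 })); // 15840 minutes, cost 126.72
console.log(monthlyCost({ runsPerDay: 40, minutesPerRun: 5 }));  // 4400 minutes, cost 35.2
```

The comparison only holds when the shorter run also uses fewer total runner minutes; splitting a job across parallel runners shortens wall-clock time without necessarily lowering the bill.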
GitLab CI
- Use shared runners (free tier: 400 minutes/month)
- Cache dependencies
- Limit parallel jobs
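- Cost: Free tier available; paid tiers are billed per user per month (the $19/user/month figure reflects older Premium pricing, so check the current rate)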
Jenkins
- Use spot instances for agents
- Shut down idle agents
- Containerized agents
- Cost: Infrastructure only
Troubleshooting
Slow Builds
- Profile pipeline (which stage is slow?)
- Add caching for dependencies
- Parallelize independent jobs
- Optimize Docker layers
- Use smaller base images
Flaky Tests
- Identify flaky tests (run 100x)
- Add explicit waits (not sleep)
- Mock external dependencies
- Isolate test data
- Retry failed tests (max 3x)
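A bounded retry keeps a flaky test from blocking the pipeline while still surfacing how often it needed a second chance. The sketch below illustrates the "max 3x" rule; `retry` is a hypothetical wrapper and the flaky check is simulated.

```javascript
// Hypothetical wrapper: rerun a flaky check up to `maxAttempts` times and
// report how many attempts it took, so retries stay visible.
function retry(fn, maxAttempts = 3) {
  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return { value: fn(), attempts: attempt };
    } catch (error) {
      lastError = error;
    }
  }
  throw lastError; // every attempt failed
}

// Simulated flaky check: fails twice, then succeeds.
let calls = 0;
const flaky = () => {
  calls += 1;
  if (calls < 3) throw new Error("transient failure");
  return "ok";
};

console.log(retry(flaky)); // value "ok" after 3 attempts
```

Logging the attempt count matters: a test that routinely passes on the third try is still a defect to fix, not a success to ignore.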
Failed Deployments
- Check deployment logs
- Verify health checks
- Check resource constraints
- Validate configuration
- Rollback if needed
Best Practices Summary
✅ DO:
- Keep pipelines fast (<10 min total)
- Fail fast (lint first, slow tests last)
- Cache dependencies
- Use semantic versioning
- Scan for vulnerabilities
- Require manual approval for prod
- Send notifications on failure
- Monitor pipeline performance
❌ DON'T:
- Hardcode secrets (use secrets management)
- Skip tests in CI
- Deploy without verification
- Use latest tag in prod
- Ignore security warnings
- Run unnecessary jobs
- Leave old artifacts
Quick Reference
| Task | GitHub Actions | GitLab CI | Jenkins |
|---|---|---|---|
| Syntax | YAML | YAML | Groovy |
| Caching | actions/cache (or the cache: input of setup-* actions) | cache: section | Pipeline plugin |
| Artifacts | actions/upload-artifact | artifacts: section | archiveArtifacts |
| Secrets | Repository secrets | CI/CD variables | Credentials plugin |
| Matrix | strategy: matrix: | parallel: | matrix {} |
| Conditions | if: | rules: (or legacy only: / except:) | when {} |
🚀 MCP Integration: GitHub + Context7 for CI/CD Automation
Runtime Detection & Usage
The skill automatically detects available MCPs for CI/CD workflow enhancement:
const hasGitHub = typeof mcp__github__create_or_update_file !== 'undefined';
const hasContext7 = typeof mcp__context7__get_library_docs !== 'undefined';
if (hasGitHub && hasContext7) {
// Get latest CI/CD framework documentation
const githubActionsDocs = await mcp__context7__get_library_docs({
context7CompatibleLibraryID: "/actions/toolkit",
topic: "GitHub Actions workflow syntax caching artifacts",
tokens: 3000
});
// Create optimized workflow directly in repository
await mcp__github__create_or_update_file({
owner: "myorg",
repo: "myapp",
path: ".github/workflows/ci.yml",
content: generatedWorkflow,
message: "Add optimized CI/CD pipeline with caching"
});
} else {
console.log("ℹ️ GitHub/Context7 MCP not available");
console.log(" GitHub: npx -y @modelcontextprotocol/server-github");
console.log(" Context7: npx -y @upstash/context7-mcp");
}
Real-World Workflow Examples
Example 1: Multi-Stage Pipeline Generation with Best Practices
// Without MCP: Manual workflow writing (3 hours)
// 1. Read GitHub Actions docs
// 2. Research caching strategies
// 3. Write YAML from scratch
// 4. Test and debug
// 5. Optimize
// With GitHub + Context7 MCP: AI-assisted generation (15 minutes)
const actionsDocs = await mcp__context7__get_library_docs({
context7CompatibleLibraryID: "/actions/toolkit",
topic: "caching dependencies matrix builds artifacts security scanning",
tokens: 4000
});
const securityDocs = await mcp__context7__get_library_docs({
context7CompatibleLibraryID: "/returntocorp/semgrep",
topic: "CI integration security scanning",
tokens: 2500
});
// Generate optimized workflow
const workflow = generateGitHubActionsWorkflow({
language: "node",
stages: ["build", "test", "security", "deploy"],
patterns: actionsDocs,
securityScan: securityDocs
});
// Deploy directly to repository
await mcp__github__create_or_update_file({
owner: "myorg",
repo: "myapp",
path: ".github/workflows/ci.yml",
content: workflow,
message: `feat: add optimized CI/CD pipeline

- Multi-stage build with caching
- Parallel test execution
- Security scanning (SAST + dependency)
- Conditional deployment`
});
// ✅ 12x faster pipeline creation
// ✅ Latest best practices applied
// ✅ Automatic repository integration
Example 2: GitLab CI to GitHub Actions Migration
// Analyze existing GitLab CI configuration
const gitlabConfig = await mcp__github__get_file_contents({
owner: "myorg",
repo: "legacy-app",
path: ".gitlab-ci.yml"
});
// Get GitHub Actions patterns
const migrationDocs = await mcp__context7__get_library_docs({
context7CompatibleLibraryID: "/actions/toolkit",
topic: "GitLab CI migration GitHub Actions equivalents",
tokens: 3500
});
// Convert GitLab CI → GitHub Actions
const convertedWorkflow = convertGitLabToGitHubActions({
gitlabConfig: gitlabConfig.content,
patterns: migrationDocs
});
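// Push the converted workflow to a feature branch first; create_pull_request
// opens a PR between existing branches and does not accept a file list.
// Tool names follow the reference GitHub MCP server; adjust if your server differs.
await mcp__github__create_branch({
owner: "myorg",
repo: "legacy-app",
branch: "feat/github-actions-migration",
from_branch: "main"
});
await mcp__github__push_files({
owner: "myorg",
repo: "legacy-app",
branch: "feat/github-actions-migration",
files: [
{ path: ".github/workflows/ci.yml", content: convertedWorkflow }
],
message: "ci: migrate from GitLab CI to GitHub Actions"
});
// Create PR with converted workflow
await mcp__github__create_pull_request({
owner: "myorg",
repo: "legacy-app",
title: "Migrate from GitLab CI to GitHub Actions",
body: `## Migration Summary
- Converted all stages to GitHub Actions jobs
- Preserved caching strategy
- Maintained deployment logic
- Added security scanning
## Changes
- \`.gitlab-ci.yml\` → \`.github/workflows/ci.yml\`
- Updated cache paths for GitHub Actions
- Converted variables to GitHub secrets
`,
head: "feat/github-actions-migration",
base: "main"
});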
// ✅ Migration (1 hour vs 1 day)
// ✅ Automatic PR creation
// ✅ Best practices applied
Example 3: CI/CD Performance Optimization
// Analyze current pipeline performance
const workflows = await mcp__github__list_workflows({
owner: "myorg",
repo: "myapp"
});
const runs = await mcp__github__list_workflow_runs({
owner: "myorg",
repo: "myapp",
workflow_id: workflows[0].id,
per_page: 100
});
// Get optimization patterns
const optimizationDocs = await mcp__context7__get_library_docs({
context7CompatibleLibraryID: "/actions/toolkit",
topic: "workflow optimization caching parallelization",
tokens: 3500
});
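// Analyze bottlenecks
const analysis = analyzeWorkflowPerformance(runs);
// Results: Test stage takes 15 min (about 83% of the 18-minute total)
// Fetch the current workflow file so it can be rewritten
const currentWorkflow = await mcp__github__get_file_contents({
owner: "myorg",
repo: "myapp",
path: ".github/workflows/ci.yml"
});
// Optimize with parallelization
const optimizedWorkflow = await optimizeWorkflow({
currentWorkflow: currentWorkflow.content,
bottlenecks: analysis.bottlenecks,
patterns: optimizationDocs,
strategies: ["parallelize-tests", "cache-dependencies", "matrix-builds"]
});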
// Deploy optimized workflow
await mcp__github__create_or_update_file({
owner: "myorg",
repo: "myapp",
path: ".github/workflows/ci.yml",
content: optimizedWorkflow,
message: `perf: optimize CI pipeline

- Parallelize tests across 4 runners
- Add dependency caching
- Use matrix strategy for multi-version testing

Reduces pipeline time: 18 min → 5 min (72% faster)`
});
// ✅ Pipeline time: 18 min → 5 min (72% faster)
// ✅ Wall-clock time: about 3.6x shorter (billed runner minutes may not drop, since each parallel job consumes minutes)
// ✅ Faster feedback for developers
Example 4: Automated Security Scanning Integration
// Get security tool documentation
const securityTools = [
"/returntocorp/semgrep",
"/aquasecurity/trivy",
"/zricethezav/gitleaks"
];
const securityDocs = await Promise.all(
securityTools.map(tool =>
mcp__context7__get_library_docs({
context7CompatibleLibraryID: tool,
topic: "CI integration security scanning",
tokens: 2500
})
)
);
// Generate comprehensive security workflow
const securityWorkflow = generateSecurityWorkflow({
sast: securityDocs[0], // Semgrep
container: securityDocs[1], // Trivy
secrets: securityDocs[2] // Gitleaks
});
// Add to repository
await mcp__github__create_or_update_file({
owner: "myorg",
repo: "myapp",
path: ".github/workflows/security.yml",
content: securityWorkflow,
message: `security: add comprehensive security scanning

- SAST: Semgrep for code analysis
- Container: Trivy for image scanning
- Secrets: Gitleaks for credential detection

Fails pipeline on HIGH/CRITICAL vulnerabilities`
});
// ✅ Comprehensive security (30 min vs 4 hours)
// ✅ Multiple scan types integrated
// ✅ Production-ready thresholds
Available Library IDs for CI/CD
GitHub Actions:
- /actions/toolkit - GitHub Actions core
- /actions/cache - Caching action
- /actions/upload-artifact - Artifact uploads
- /actions/download-artifact - Artifact downloads
GitLab CI:
- /gitlab-org/gitlab - GitLab CI/CD
- /gitlab-org/gitlab-runner - GitLab Runner
Jenkins:
- /jenkinsci/jenkins - Jenkins core
- /jenkinsci/pipeline-plugin - Pipeline as Code
CI/CD Tools:
- /circleci/circleci-docs - CircleCI
- /travis-ci/travis-ci - Travis CI
- /drone/drone - Drone CI
Security Scanning:
- /returntocorp/semgrep - SAST
- /aquasecurity/trivy - Container scanning
- /zricethezav/gitleaks - Secret detection
- /snyk/cli - Dependency scanning
Build Tools:
- /docker/build-push-action - Docker builds
- /docker/metadata-action - Docker metadata
Benefits Comparison
| Task | Without MCP | With GitHub + Context7 MCP | Time Saved |
|---|---|---|---|
| New pipeline creation | 3 hours (manual docs + trial/error) | 15 min (AI-assisted) | 92% |
| GitLab → GitHub migration | 1 day (conversion + testing) | 1 hour (automated) | 88% |
| Pipeline optimization | 4 hours (profiling + research) | 30 min (analysis + apply) | 88% |
| Security integration | 4 hours (tool research + setup) | 30 min (multi-tool setup) | 88% |
| Branch protection setup | 30 min (manual clicking) | 2 min (API automation) | 93% |
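The last column follows directly from the before and after durations. The sketch below reproduces the arithmetic so the figures can be rechecked; `percentSaved` is a hypothetical helper, and treating "1 day" as an 8-hour working day is an assumption, since the table does not define it.

```javascript
// Time saved as a percentage of the original duration (both in minutes).
function percentSaved(beforeMinutes, afterMinutes) {
  return Math.round(((beforeMinutes - afterMinutes) / beforeMinutes) * 100);
}

console.log(percentSaved(180, 15)); // 92 (3 hours to 15 min)
console.log(percentSaved(480, 60)); // 88 (8-hour day to 1 hour, 87.5 before rounding)
console.log(percentSaved(240, 30)); // 88 (4 hours to 30 min, 87.5 before rounding)
console.log(percentSaved(30, 2));   // 93 (30 min to 2 min)
```

The durations themselves are the document's estimates; the function only checks that the percentages are consistent with them.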
When to Use GitHub + Context7 MCP
Ideal for:
- ✅ Creating new CI/CD pipelines from scratch
- ✅ Migrating between CI/CD platforms
- ✅ Optimizing existing pipeline performance
- ✅ Integrating security scanning tools
- ✅ Setting up branch protection rules
- ✅ Automating PR workflows
- ✅ Multi-repo pipeline standardization
Not needed for:
- ❌ Simple one-stage builds
- ❌ Basic linting workflows
- ❌ Trivial pipeline modifications
Installation
# Run the GitHub MCP server (requires a GitHub personal access token in its environment)
npx -y @modelcontextprotocol/server-github
# Run the Context7 MCP server
npx -y @upstash/context7-mcp
# Configure in Claude Code MCP settings
# Both servers will auto-detect and enable integration
Security Best Practices
When using GitHub + Context7 MCP for CI/CD:
- Never commit secrets to workflow files (use GitHub Secrets)
- Validate all security scanner configurations
- Use pinned versions for actions (not @main)
- Review generated workflows before merging
- Enable branch protection on main branches
- Require status checks before merging
- Use least privilege for CI/CD service accounts
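The pinning rule in particular is easy to check mechanically before a generated workflow is merged. The sketch below scans workflow text for `uses:` references that point at a branch or carry no ref at all; `findUnpinnedActions` is a hypothetical lint, the line-based regex is a simplification of real YAML parsing, and the action names in the sample are invented.

```javascript
// Hypothetical lint: list `uses:` references that are not pinned to a
// version tag (v1, v4.1.0, ...) or a full 40-character commit SHA.
function findUnpinnedActions(workflowText) {
  const unpinned = [];
  for (const line of workflowText.split("\n")) {
    const match = /^\s*-?\s*uses:\s*([^\s#]+)/.exec(line);
    if (!match) continue;
    const [name, ref] = match[1].split("@");
    if (name.startsWith("./")) continue; // local actions have no ref
    const pinned = ref !== undefined && (/^v\d+/.test(ref) || /^[0-9a-f]{40}$/.test(ref));
    if (!pinned) unpinned.push(match[1]);
  }
  return unpinned;
}

const workflow = [
  "steps:",
  "  - uses: actions/checkout@v4",
  "  - uses: actions/setup-node@main",
  "  - uses: some-org/some-action@8f4b7f84864484a7bf31766abe9204da3cbe65b3",
  "  - uses: another-org/deploy-action",
].join("\n");

console.log(findUnpinnedActions(workflow));
// actions/setup-node@main and another-org/deploy-action are reported
```

Pinning to a full commit SHA is the stricter option, since version tags can be moved by the action's maintainer.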
Version: 1.0.0 | Last Updated: 2025-01-20 | Patterns: 20+ | Best Practices: Production-tested