Initial commit

This commit is contained in:
Zhongwei Li
2025-11-29 17:51:12 +08:00
commit 1878d01517
21 changed files with 8728 additions and 0 deletions

@@ -0,0 +1,675 @@
# CI/CD Best Practices
Comprehensive guide to CI/CD pipeline design, testing strategies, and deployment patterns.
## Table of Contents
- [Pipeline Design Principles](#pipeline-design-principles)
- [Testing in CI/CD](#testing-in-cicd)
- [Deployment Strategies](#deployment-strategies)
- [Dependency Management](#dependency-management)
- [Artifact & Release Management](#artifact--release-management)
- [Platform Patterns](#platform-patterns)
---
## Pipeline Design Principles
### Fast Feedback Loops
Design pipelines to provide feedback quickly:
**Priority ordering:**
1. Linting and code formatting (seconds)
2. Unit tests (1-5 minutes)
3. Integration tests (5-15 minutes)
4. E2E tests (15-30 minutes)
5. Deployment (varies)
**Fail fast pattern:**
```yaml
# GitHub Actions
jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - run: npm run lint
  test:
    needs: lint # Only run if lint passes
    runs-on: ubuntu-latest
    steps:
      - run: npm test
  e2e:
    needs: [lint, test] # Run after basic checks
    runs-on: ubuntu-latest
    steps:
      - run: npm run test:e2e
```
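The same ordering can be reproduced outside a CI platform, for example in a local pre-push script. This is a minimal sketch, assuming each stage is passed as a shell command string; `run_stages` is a hypothetical helper name, not part of any CI tool.

```bash
# run_stages CMD...: run each command string in order and stop at the
# first one that fails, so cheap checks gate the expensive ones.
run_stages() {
  for stage in "$@"; do
    echo "==> $stage"
    sh -c "$stage" || { echo "FAILED: $stage"; return 1; }
  done
}

# Example (not executed here):
#   run_stages "npm run lint" "npm test" "npm run test:e2e"
```

Because the function returns at the first failure, a lint error is reported without waiting for the slower test stages.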
### Job Parallelization
Run independent jobs concurrently:
**GitHub Actions:**
```yaml
jobs:
  lint:
    runs-on: ubuntu-latest
  test:
    runs-on: ubuntu-latest
    # No 'needs' - runs in parallel with lint
  build:
    needs: [lint, test] # Wait for both
    runs-on: ubuntu-latest
```
**GitLab CI:**
```yaml
stages:
  - validate
  - test
  - build
# Jobs in same stage run in parallel
unit-test:
  stage: test
integration-test:
  stage: test
e2e-test:
  stage: test
```
### Monorepo Strategies
**Path-based triggers (GitHub):**
```yaml
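on:
  push:
    paths:
      - 'services/api/**'
      - 'shared/**'
jobs:
  api-test:
    # The workflow-level `paths` filter above already limits runs to matching
    # changes, so no per-job condition is needed. contains() on an array tests
    # for an exact element, not a path prefix, so it cannot match file paths.
    runs-on: ubuntu-latest
    steps:
      - run: npm test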
```
**GitLab rules:**
```yaml
api-test:
  rules:
    - changes:
        - services/api/**/*
        - shared/**/*
frontend-test:
  rules:
    - changes:
        - services/frontend/**/*
        - shared/**/*
```
### Matrix Builds
Test across multiple versions/platforms:
**GitHub Actions:**
```yaml
strategy:
  matrix:
    os: [ubuntu-latest, macos-latest, windows-latest]
    node: [18, 20, 22]
    include:
      - os: ubuntu-latest
        node: 22
        coverage: true
    exclude:
      - os: windows-latest
        node: 18
  fail-fast: false # See all results
```
**GitLab parallel:**
```yaml
test:
  parallel:
    matrix:
      - NODE_VERSION: ['18', '20', '22']
        OS: ['ubuntu', 'alpine']
```
---
## Testing in CI/CD
### Test Pyramid Strategy
Maintain proper test distribution:
```
      /\
     /E2E\        10% - Slow, expensive, flaky
    /-----\
   /  Int  \      20% - Medium speed
  /---------\
 /   Unit    \    70% - Fast, reliable
/-------------\
```
**Implementation:**
```yaml
jobs:
  unit-test:
    runs-on: ubuntu-latest
    steps:
      - run: npm run test:unit # Fast, runs on every commit
  integration-test:
    runs-on: ubuntu-latest
    needs: unit-test
    steps:
      - run: npm run test:integration # Medium, after unit tests
  e2e-test:
    runs-on: ubuntu-latest
    needs: [unit-test, integration-test]
    if: github.ref == 'refs/heads/main' # Only on main branch
    steps:
      - run: npm run test:e2e # Slow, only on main
```
### Test Splitting & Parallelization
Split large test suites:
**GitHub Actions:**
```yaml
strategy:
  matrix:
    shard: [1, 2, 3, 4]
steps:
  - run: npm test -- --shard=${{ matrix.shard }}/4
```
**Playwright example:**
```yaml
strategy:
  matrix:
    shardIndex: [1, 2, 3, 4]
    shardTotal: [4]
steps:
  - run: npx playwright test --shard=${{ matrix.shardIndex }}/${{ matrix.shardTotal }}
```
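Test runners that support `--shard` do the splitting internally. For a runner without that flag, the idea can be approximated by hand. The sketch below assumes round-robin assignment over a list of test file paths on stdin; `shard_files` is a hypothetical helper, and the algorithms used by Jest or Playwright may differ.

```bash
# shard_files INDEX TOTAL: read one test path per line on stdin and print
# the paths that belong to shard INDEX (1-based) out of TOTAL shards.
shard_files() {
  awk -v idx="$1" -v total="$2" '(NR - 1) % total == idx - 1'
}

# Example (not executed here):
#   find tests -name '*.test.js' | sort | shard_files 2 4
```

Sorting the file list before sharding keeps the assignment stable between runs, so every file lands in exactly one shard.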
### Code Coverage
**Track coverage trends:**
```yaml
- name: Run tests with coverage
  run: npm test -- --coverage
- name: Upload coverage
  uses: codecov/codecov-action@v4
  with:
    files: ./coverage/lcov.info
    fail_ci_if_error: true # Fail if upload fails
- name: Coverage check
  run: |
    COVERAGE=$(jq -r '.total.lines.pct' coverage/coverage-summary.json)
    if (( $(echo "$COVERAGE < 80" | bc -l) )); then
      echo "Coverage $COVERAGE% is below 80%"
      exit 1
    fi
```
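The threshold comparison above depends on `bc`, which is not installed in every CI image. A minimal alternative sketch using `awk` for the floating-point comparison; `coverage_ok` is a hypothetical helper name.

```bash
# coverage_ok PCT THRESHOLD: succeed (exit 0) when PCT >= THRESHOLD.
# Adding 0 forces awk to compare the values numerically.
coverage_ok() {
  awk -v pct="$1" -v min="$2" 'BEGIN { exit !(pct + 0 >= min + 0) }'
}

# Example (not executed here):
#   COVERAGE=$(jq -r '.total.lines.pct' coverage/coverage-summary.json)
#   coverage_ok "$COVERAGE" 80 || { echo "Coverage $COVERAGE% is below 80%"; exit 1; }
```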
### Test Environment Management
**Docker Compose for services:**
```yaml
jobs:
  integration-test:
    runs-on: ubuntu-latest
    steps:
      - name: Start services
        run: docker compose up -d postgres redis # Compose v2 plugin syntax
      - name: Wait for services
        run: |
          timeout 30 bash -c 'until docker compose exec -T postgres pg_isready; do sleep 1; done'
      - name: Run tests
        run: npm run test:integration
      - name: Cleanup
        if: always()
        run: docker compose down
```
**GitLab services:**
```yaml
integration-test:
  services:
    - postgres:15
    - redis:7-alpine
  variables:
    POSTGRES_DB: testdb
    POSTGRES_PASSWORD: password
  script:
    - npm run test:integration
```
---
## Deployment Strategies
### Deployment Patterns
**1. Direct Deployment (Simple)**
```yaml
deploy:
  if: github.ref == 'refs/heads/main'
  steps:
    - run: |
        aws s3 sync dist/ s3://${{ secrets.S3_BUCKET }}
        # create-invalidation requires --paths; "/*" invalidates every object
        aws cloudfront create-invalidation --distribution-id ${{ secrets.CF_DIST }} --paths "/*"
```
**2. Blue-Green Deployment**
```yaml
deploy:
  steps:
    - name: Swap staging slot into production
      run: az webapp deployment slot swap --slot staging --resource-group $RG --name $APP
    - name: Health check
      run: |
        for i in {1..10}; do
          if curl -f https://$APP.azurewebsites.net/health; then
            echo "Health check passed"
            exit 0
          fi
          sleep 10
        done
        exit 1
    - name: Rollback on failure
      if: failure()
      run: az webapp deployment slot swap --slot staging --resource-group $RG --name $APP # Swap back
```
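The health-check loop in the blue-green example is a retry-with-delay pattern that is useful for other deployment checks as well. A minimal reusable sketch; `retry` is a hypothetical helper, and the attempt count and delay are passed in rather than hard-coded.

```bash
# retry ATTEMPTS DELAY CMD...: run CMD until it succeeds, at most ATTEMPTS
# times, sleeping DELAY seconds between tries. Returns 0 on the first
# success and 1 if every attempt fails.
retry() {
  attempts=$1
  delay=$2
  shift 2
  n=1
  while [ "$n" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    if [ "$n" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    n=$((n + 1))
  done
  return 1
}

# Example (not executed here):
#   retry 10 10 curl -f "https://$APP.azurewebsites.net/health"
```

A non-zero return from `retry` fails the step, which is what triggers a rollback step guarded by `if: failure()`.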
**3. Canary Deployment**
```yaml
deploy-canary:
  steps:
    # Assumes a separate single-replica Deployment named app-canary behind the same Service
    - run: kubectl set image deployment/app-canary app=myapp:${{ github.sha }}
    - run: kubectl rollout status deployment/app-canary # Wait for the canary pod
    - run: sleep 300 # Monitor for 5 minutes
    - run: kubectl set image deployment/app app=myapp:${{ github.sha }} # Promote to the full fleet
```
### Environment Management
**GitHub Environments:**
```yaml
jobs:
  deploy-staging:
    environment:
      name: staging
      url: https://staging.example.com
    steps:
      - run: ./deploy.sh staging
  deploy-production:
    needs: deploy-staging
    environment:
      name: production
      url: https://example.com
    steps:
      - run: ./deploy.sh production
```
**Protection rules:**
- Require approval for production
- Restrict to specific branches
- Add deployment delay
**GitLab environments:**
```yaml
deploy:staging:
  stage: deploy
  environment:
    name: staging
    url: https://staging.example.com
    on_stop: stop:staging
  only:
    - develop
deploy:production:
  stage: deploy
  environment:
    name: production
    url: https://example.com
  when: manual # Require manual trigger
  only:
    - main
```
### Deployment Gates
**Pre-deployment checks:**
```yaml
pre-deploy-checks:
  steps:
    - name: Check migration status
      run: ./scripts/check-migrations.sh
    - name: Verify dependencies
      run: npm audit --audit-level=high
    - name: Check service health
      run: curl -f https://api.example.com/health
```
**Post-deployment validation:**
```yaml
post-deploy-validation:
  needs: deploy
  steps:
    - name: Smoke tests
      run: npm run test:smoke
    - name: Monitor errors
      run: |
        # datadog-api is a placeholder; substitute your monitoring tool's CLI or API call
        ERROR_COUNT=$(datadog-api errors --since 5m)
        if [ "$ERROR_COUNT" -gt 10 ]; then
          echo "Error spike detected!"
          exit 1
        fi
```
---
## Dependency Management
### Lock Files
**Always commit lock files:**
- `package-lock.json` (npm)
- `yarn.lock` (Yarn)
- `pnpm-lock.yaml` (pnpm)
- `Cargo.lock` (Rust)
- `Gemfile.lock` (Ruby)
- `poetry.lock` (Python)
**Use deterministic install commands:**
```bash
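# Good - installs exactly what the lock file specifies
npm ci # Not npm install
yarn install --frozen-lockfile # Yarn 1; use --immutable on Yarn 2+
pnpm install --frozen-lockfile
pip install -r requirements.txt # Deterministic only if every version is pinned
# Bad - may rewrite the lock file
npm install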
```
### Dependency Caching
**See optimization.md for detailed caching strategies**
Quick reference:
- Hash lock files for cache keys
- Include OS/platform in cache key
- Use restore-keys for partial matches
- Separate cache for build artifacts vs dependencies
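The first two bullets (hash the lock file, include the OS) can be illustrated with a small key builder. This is a sketch under stated assumptions: `cache_key` is a hypothetical helper, and it uses the POSIX `cksum` CRC for portability, whereas CI platforms typically use a cryptographic hash such as SHA-256.

```bash
# cache_key PREFIX LOCKFILE: print "<prefix>-<os>-<checksum of lockfile>".
# The key changes whenever the lock file content or the OS changes.
cache_key() {
  os=$(uname -s | tr '[:upper:]' '[:lower:]')
  sum=$(cksum < "$2" | cut -d ' ' -f 1)
  printf '%s-%s-%s\n' "$1" "$os" "$sum"
}

# Example (not executed here):
#   cache_key npm package-lock.json
```

A restore key is then the same string with the checksum dropped (prefix plus OS), which lets a job fall back to the most recent cache for that platform.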
### Security Scanning
**Automated vulnerability checks:**
```yaml
security-scan:
  steps:
    - name: Dependency audit
      run: |
        npm audit --audit-level=high
        # Or: pip-audit, cargo audit, bundle audit
    - name: SAST scanning
      # analyze must follow a github/codeql-action/init@v3 step earlier in the job
      uses: github/codeql-action/analyze@v3
    - name: Container scanning
      run: trivy image myapp:${{ github.sha }}
```
### Dependency Updates
**Automated dependency updates:**
- Dependabot (GitHub)
- Renovate
- GitLab Dependency Scanning
**Configuration example (Dependabot):**
```yaml
# .github/dependabot.yml
version: 2
updates:
  - package-ecosystem: npm
    directory: "/"
    schedule:
      interval: weekly
    open-pull-requests-limit: 5
    groups:
      dev-dependencies:
        dependency-type: development
```
---
## Artifact & Release Management
### Artifact Strategy
**Build once, deploy many:**
```yaml
build:
  steps:
    - run: npm run build
    - uses: actions/upload-artifact@v4
      with:
        name: dist-${{ github.sha }}
        path: dist/
        retention-days: 7
deploy-staging:
  needs: build
  steps:
    - uses: actions/download-artifact@v4
      with:
        name: dist-${{ github.sha }}
    - run: ./deploy.sh staging
deploy-production:
  needs: [build, deploy-staging]
  steps:
    - uses: actions/download-artifact@v4
      with:
        name: dist-${{ github.sha }}
    - run: ./deploy.sh production
```
### Container Image Management
**Multi-stage builds:**
```dockerfile
# Build stage
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
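# Install all dependencies; the build step usually needs devDependencies
RUN npm ci
COPY . .
RUN npm run build
# Drop devDependencies before node_modules is copied into the final image
RUN npm prune --omit=dev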
# Production stage
FROM node:20-alpine
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
USER node
CMD ["node", "dist/server.js"]
```
**Image tagging strategy:**
```yaml
- name: Build and tag images
  run: |
    docker build -t myapp:${{ github.sha }} .
    docker tag myapp:${{ github.sha }} myapp:latest
    docker tag myapp:${{ github.sha }} myapp:v1.2.3
```
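The version tag in the example is hard-coded. A common convention, assumed here rather than taken from the source, is to publish floating major and minor tags alongside the full version. `image_tags` is a hypothetical helper that derives them from a tag such as `v1.2.3`.

```bash
# image_tags VERSION: print the full, minor, and major tags for a release,
# one per line (v1.2.3 -> v1.2.3, v1.2, v1).
image_tags() {
  version=$1
  major=${version%%.*}   # text before the first dot
  rest=${version#*.}     # text after the first dot
  minor=${rest%%.*}      # second component
  printf '%s\n%s.%s\n%s\n' "$version" "$major" "$minor" "$major"
}

# Example (not executed here):
#   for tag in $(image_tags v1.2.3); do docker tag "myapp:$SHA" "myapp:$tag"; done
```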
### Release Automation
**Semantic versioning:**
```yaml
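release:
  if: startsWith(github.ref, 'refs/tags/v')
  runs-on: ubuntu-latest
  permissions:
    contents: write # Needed to create the release
  steps:
    - uses: actions/checkout@v4
    # actions/create-release is archived; the gh CLI on hosted runners is used instead
    - run: gh release create "${{ github.ref_name }}" --title "Release ${{ github.ref_name }}" --generate-notes
      env:
        GH_TOKEN: ${{ github.token }}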
```
**Changelog generation:**
```yaml
- name: Generate changelog
  run: |
    # Needs full history and tags (actions/checkout with fetch-depth: 0)
    git log $(git describe --tags --abbrev=0)..HEAD \
      --pretty=format:"- %s (%h)" > CHANGELOG.md
```
---
## Platform Patterns
### GitHub Actions
**Reusable workflows:**
```yaml
# .github/workflows/reusable-test.yml
on:
  workflow_call:
    inputs:
      node-version:
        required: true
        type: string
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/setup-node@v4
        with:
          node-version: ${{ inputs.node-version }}
      - run: npm test
```
**Composite actions:**
```yaml
# .github/actions/setup-app/action.yml
name: Setup Application
runs:
  using: composite
  steps:
    - uses: actions/setup-node@v4
      with:
        node-version: 20
    - run: npm ci
      shell: bash
```
### GitLab CI
**Templates & extends:**
```yaml
.test_template:
  image: node:20
  before_script:
    - npm ci
  script:
    - npm test
unit-test:
  extends: .test_template
  script:
    - npm run test:unit
integration-test:
  extends: .test_template
  script:
    - npm run test:integration
```
**Dynamic child pipelines:**
```yaml
generate-pipeline:
  script:
    - ./generate-config.sh > pipeline.yml
  artifacts:
    paths:
      - pipeline.yml
trigger-pipeline:
  trigger:
    include:
      - artifact: pipeline.yml
        job: generate-pipeline
```
---
## Continuous Improvement
### Metrics to Track
- **Build duration:** Target < 10 minutes
- **Failure rate:** Target < 5%
- **Time to recovery:** Target < 1 hour
- **Deployment frequency:** Aim for multiple/day
- **Lead time:** Commit to production < 1 day
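As a worked example for the failure-rate target, the percentage can be computed from two counters exported by the CI system. A minimal sketch; `failure_rate` is a hypothetical helper and the counts in the usage line are illustrative, not measured data.

```bash
# failure_rate FAILED TOTAL: print the share of failed runs as a
# percentage with one decimal place (0.0 when there were no runs).
failure_rate() {
  awk -v f="$1" -v t="$2" 'BEGIN {
    if (t == 0) { print "0.0" } else { printf "%.1f\n", 100 * f / t }
  }'
}

# Example: 3 failed runs out of 120 is 2.5%, inside the < 5% target.
#   failure_rate 3 120
```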
### Pipeline Optimization Checklist
- [ ] Jobs run in parallel where possible
- [ ] Dependencies are cached
- [ ] Test suite is properly split
- [ ] Linting fails fast
- [ ] Only necessary tests run on PRs
- [ ] Artifacts are reused across jobs
- [ ] Pipeline has appropriate timeouts
- [ ] Flaky tests are identified and fixed
- [ ] Security scanning is automated
- [ ] Deployment requires approval
### Regular Reviews
**Monthly:**
- Review build duration trends
- Analyze failure patterns
- Update dependencies
- Review security scan results
**Quarterly:**
- Audit pipeline efficiency
- Review deployment frequency
- Update CI/CD tools and actions
- Team retrospective on CI/CD pain points

references/devsecops.md

@@ -0,0 +1,862 @@
# DevSecOps in CI/CD
Comprehensive guide to integrating security into CI/CD pipelines with SAST, DAST, SCA, and security gates.
## Table of Contents
- [Shift-Left Security](#shift-left-security)
- [SAST (Static Application Security Testing)](#sast-static-application-security-testing)
- [DAST (Dynamic Application Security Testing)](#dast-dynamic-application-security-testing)
- [SCA (Software Composition Analysis)](#sca-software-composition-analysis)
- [Container Security](#container-security)
- [Secret Scanning](#secret-scanning)
- [Security Gates & Quality Gates](#security-gates--quality-gates)
- [Compliance & License Scanning](#compliance--license-scanning)
---
## Shift-Left Security
**Core principle:** Integrate security testing early in the development lifecycle, not just before production.
**Security testing stages in CI/CD:**
```
Commit → SAST → Unit Tests → SCA → Build → Container Scan → Deploy to Test → DAST → Production
  ↓       ↓                   ↓                  ↓                            ↓
Secret   Code                Dependency    Docker                     Dynamic       Security
Scan     Analysis            Vuln Check    Image Scan                 App Testing   Gates
```
**Benefits:**
- Find vulnerabilities early (cheaper to fix)
- Faster feedback to developers
- Reduce security debt
- Prevent vulnerable code from reaching production
---
## SAST (Static Application Security Testing)
Analyzes source code, bytecode, or binaries for security vulnerabilities without executing the application.
### Tools by Language
| Language | Tools | GitHub Actions | GitLab CI |
|----------|-------|----------------|-----------|
| **Multi-language** | CodeQL, Semgrep, SonarQube | ✅ | ✅ |
| **JavaScript/TypeScript** | ESLint (security plugins), NodeJsScan | ✅ | ✅ |
| **Python** | Bandit, Pylint, Safety | ✅ | ✅ |
| **Go** | Gosec | ✅ | ✅ |
| **Java** | SpotBugs, FindSecBugs, PMD | ✅ | ✅ |
| **C#/.NET** | Security Code Scan, Roslyn Analyzers | ✅ | ✅ |
### CodeQL (GitHub)
**GitHub Actions:**
```yaml
name: CodeQL Analysis
on:
  push:
    branches: [main, develop]
  pull_request:
    branches: [main]
  schedule:
    - cron: '0 2 * * 1' # Weekly scan
jobs:
  analyze:
    name: Analyze Code
    runs-on: ubuntu-latest
    timeout-minutes: 30
    permissions:
      actions: read
      contents: read
      security-events: write
    strategy:
      fail-fast: false
      matrix:
        language: ['javascript', 'python']
    steps:
      - uses: actions/checkout@v4
      - name: Initialize CodeQL
        uses: github/codeql-action/init@v3
        with:
          languages: ${{ matrix.language }}
          queries: security-extended
      - name: Autobuild
        uses: github/codeql-action/autobuild@v3
      - name: Perform CodeQL Analysis
        uses: github/codeql-action/analyze@v3
        with:
          category: "/language:${{matrix.language}}"
```
**Key features:**
- Supports 10+ languages
- Deep semantic analysis
- Low false positive rate
- Integrates with GitHub Security tab
- Custom query support
### Semgrep
**GitHub Actions:**
```yaml
- name: Run Semgrep
  uses: returntocorp/semgrep-action@v1
  with:
    config: >-
      p/security-audit
      p/owasp-top-ten
      p/cwe-top-25
```
**GitLab CI:**
```yaml
semgrep:
  stage: test
  image: returntocorp/semgrep
  script:
    # reports:sast expects GitLab's SAST JSON schema rather than SARIF
    - semgrep --config=auto --gitlab-sast --output=gl-sast-report.json .
  artifacts:
    reports:
      sast: gl-sast-report.json
```
**Benefits:**
- Fast (runs in seconds)
- Highly customizable rules
- Multi-language support
- CI-native design
### Language-Specific SAST
**Python - Bandit:**
```yaml
# GitHub Actions
- name: Run Bandit
  run: |
    pip install bandit
    bandit -r src/ -f json -o bandit-report.json --exit-zero # Write the full report without failing
    bandit -r src/ -lll # Fail the build on high-severity findings only
# GitLab CI
bandit:
  stage: test
  image: python:3.11
  script:
    - pip install bandit
    # Bandit has no GitLab report formatter, so the JSON is kept as a plain artifact
    - bandit -r src/ -ll -f json -o bandit-report.json
  artifacts:
    when: always
    paths:
      - bandit-report.json
```
**JavaScript - ESLint Security Plugin:**
```yaml
# GitHub Actions
- name: Run ESLint Security
  run: |
    npm install eslint-plugin-security
    npx eslint . --plugin=security --format=json --output-file=eslint-security.json
```
**Go - Gosec:**
```yaml
# GitHub Actions
- name: Run Gosec
  uses: securego/gosec@master
  with:
    args: '-fmt sarif -out gosec.sarif ./...'
# GitLab CI
gosec:
  stage: test
  image: securego/gosec:latest
  script:
    - gosec -fmt json -out gosec-report.json ./...
  artifacts:
    # Gosec's JSON is not GitLab's SAST report schema, so keep it as a plain artifact
    paths:
      - gosec-report.json
```
### SonarQube/SonarCloud
**GitHub Actions:**
```yaml
- name: SonarCloud Scan
  uses: SonarSource/sonarcloud-github-action@master
  env:
    GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
    SONAR_TOKEN: ${{ secrets.SONAR_TOKEN }}
  with:
    args: >
      -Dsonar.projectKey=my-project
      -Dsonar.organization=my-org
      -Dsonar.sources=src
      -Dsonar.tests=tests
      -Dsonar.python.coverage.reportPaths=coverage.xml
```
**GitLab CI:**
```yaml
sonarqube:
  stage: test
  image: sonarsource/sonar-scanner-cli:latest
  script:
    - sonar-scanner
      -Dsonar.projectKey=$CI_PROJECT_NAME
      -Dsonar.sources=src
      -Dsonar.host.url=$SONAR_HOST_URL
      -Dsonar.login=$SONAR_TOKEN
```
---
## DAST (Dynamic Application Security Testing)
Tests running applications for vulnerabilities by simulating attacks.
### OWASP ZAP
**Baseline scan workflow (GitHub Actions):**
```yaml
name: DAST Scan
on:
  schedule:
    - cron: '0 3 * * 1' # Weekly scan
  workflow_dispatch:
jobs:
  dast:
    runs-on: ubuntu-latest
    services:
      app:
        image: myapp:latest
        ports:
          - 8080:8080
    steps:
      - name: Wait for app to start
        run: |
          timeout 60 bash -c 'until curl -f http://localhost:8080/health; do sleep 2; done'
      - name: ZAP Baseline Scan
        uses: zaproxy/action-baseline@v0.10.0
        with:
          target: 'http://localhost:8080'
          rules_file_name: '.zap/rules.tsv'
          fail_action: true
      - name: Upload ZAP report
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: zap-report
          path: report_html.html
```
**GitLab CI:**
```yaml
dast:
  stage: test
  image: ghcr.io/zaproxy/zaproxy:stable # Replaces the retired owasp/zap2docker-stable image
  services:
    - name: $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
      alias: testapp
  script:
    # Baseline scan
    - zap-baseline.py -t http://testapp:8080 -r zap-report.html -J zap-report.json
  artifacts:
    when: always
    paths:
      - zap-report.html
      - zap-report.json
    reports:
      dast: zap-report.json
  only:
    - schedules
    - main
```
**ZAP scan types:**
1. **Baseline Scan** (Fast, ~1-2 min)
   ```bash
   zap-baseline.py -t https://staging.example.com -r report.html
   ```
   - Passive scanning only
   - No active attacks
   - Good for PR checks
2. **Full Scan** (Comprehensive, 10-60 min)
   ```bash
   zap-full-scan.py -t https://staging.example.com -r report.html
   ```
   - Active + Passive scanning
   - Attempts exploits
   - Use on staging only
3. **API Scan**
   ```bash
   zap-api-scan.py -t https://api.example.com/openapi.json -f openapi -r report.html
   ```
   - For REST APIs
   - OpenAPI/Swagger support
### Other DAST Tools
**Nuclei:**
```yaml
- name: Run Nuclei
  uses: projectdiscovery/nuclei-action@main
  with:
    target: https://staging.example.com
    templates: cves,vulnerabilities,exposures
```
**Nikto (Web server scanner):**
```yaml
nikto:
  stage: dast
  image: sullo/nikto
  script:
    - nikto -h http://testapp:8080 -Format json -output nikto-report.json
```
---
## SCA (Software Composition Analysis)
Identifies vulnerabilities in third-party dependencies and libraries.
### Dependency Scanning
**GitHub Dependabot (Built-in):**
```yaml
# .github/dependabot.yml
version: 2
updates:
  - package-ecosystem: "npm"
    directory: "/"
    schedule:
      interval: "weekly"
    open-pull-requests-limit: 10
  - package-ecosystem: "pip"
    directory: "/"
    schedule:
      interval: "weekly"
```
**GitHub Actions - Dependency Review:**
```yaml
- name: Dependency Review
  uses: actions/dependency-review-action@v4
  with:
    fail-on-severity: high
```
**npm audit:**
```yaml
- name: npm audit
  run: |
    npm audit --audit-level=high
    # Or with audit-ci for better control
    npx audit-ci --high
```
**pip-audit (Python):**
```yaml
- name: Python Security Check
  run: |
    pip install pip-audit
    pip-audit --requirement requirements.txt --format json --output pip-audit.json
```
**Snyk:**
```yaml
# GitHub Actions
- name: Run Snyk
  uses: snyk/actions/node@master
  env:
    SNYK_TOKEN: ${{ secrets.SNYK_TOKEN }}
  with:
    args: --severity-threshold=high --fail-on=all
# GitLab CI
snyk:
  stage: test
  image: snyk/snyk:node
  script:
    - snyk test --severity-threshold=high --json-file-output=snyk-report.json
  artifacts:
    reports:
      dependency_scanning: snyk-report.json
```
**OWASP Dependency-Check:**
```yaml
- name: OWASP Dependency Check
  run: |
    wget https://github.com/jeremylong/DependencyCheck/releases/download/v8.4.0/dependency-check-8.4.0-release.zip
    unzip dependency-check-8.4.0-release.zip
    ./dependency-check/bin/dependency-check.sh \
      --scan . \
      --format JSON \
      --out dependency-check-report.json \
      --failOnCVSS 7
```
### GitLab Dependency Scanning (Built-in)
```yaml
include:
  - template: Security/Dependency-Scanning.gitlab-ci.yml
dependency_scanning:
  variables:
    DS_EXCLUDED_PATHS: "test/,tests/,spec/,vendor/"
```
---
## Container Security
### Image Scanning
**Trivy (Comprehensive):**
```yaml
# GitHub Actions
- name: Run Trivy
  uses: aquasecurity/trivy-action@master
  with:
    image-ref: myapp:${{ github.sha }}
    format: 'sarif'
    output: 'trivy-results.sarif'
    severity: 'CRITICAL,HIGH'
    exit-code: '1'
- name: Upload to Security tab
  uses: github/codeql-action/upload-sarif@v3
  if: always()
  with:
    sarif_file: 'trivy-results.sarif'
# GitLab CI
trivy:
  stage: test
  image: aquasec/trivy:latest
  script:
    - trivy image --severity HIGH,CRITICAL --format json --output trivy-report.json $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
    - trivy image --severity HIGH,CRITICAL --exit-code 1 $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
  artifacts:
    reports:
      container_scanning: trivy-report.json
```
**Grype:**
```yaml
- name: Scan with Grype
  id: scan # Referenced by steps.scan.outputs.sarif below
  uses: anchore/scan-action@v3
  with:
    image: myapp:latest
    fail-build: true
    severity-cutoff: high
    output-format: sarif
- name: Upload Grype results
  if: always() # Upload even when the scan step fails the build
  uses: github/codeql-action/upload-sarif@v3
  with:
    sarif_file: ${{ steps.scan.outputs.sarif }}
```
**Clair:**
```yaml
clair:
  stage: scan
  image: arminc/clair-scanner:latest
  script:
    - clair-scanner --ip $(hostname -i) myapp:latest
```
### SBOM (Software Bill of Materials)
**Syft:**
```yaml
- name: Generate SBOM
  uses: anchore/sbom-action@v0
  with:
    image: myapp:${{ github.sha }}
    format: spdx-json
    output-file: sbom.spdx.json
- name: Upload SBOM
  uses: actions/upload-artifact@v4
  with:
    name: sbom
    path: sbom.spdx.json
```
**CycloneDX:**
```yaml
- name: Generate CycloneDX SBOM
  run: |
    npm install -g @cyclonedx/cyclonedx-npm
    cyclonedx-npm --output-file sbom.json
```
---
## Secret Scanning
### Pre-commit Prevention
**TruffleHog:**
```yaml
# GitHub Actions
- name: TruffleHog Scan
  uses: trufflesecurity/trufflehog@main
  with:
    path: ./
    base: ${{ github.event.repository.default_branch }}
    head: HEAD
# GitLab CI
trufflehog:
  stage: test
  image: trufflesecurity/trufflehog:latest
  script:
    - trufflehog filesystem . --json --fail > trufflehog-report.json
```
**Gitleaks:**
```yaml
# GitHub Actions
- name: Gitleaks
  uses: gitleaks/gitleaks-action@v2
  env:
    GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
# GitLab CI
gitleaks:
  stage: test
  image: zricethezav/gitleaks:latest
  script:
    - gitleaks detect --source . --report-format json --report-path gitleaks-report.json
```
**GitGuardian:**
```yaml
- name: GitGuardian scan
  uses: GitGuardian/ggshield-action@master
  env:
    GITGUARDIAN_API_KEY: ${{ secrets.GITGUARDIAN_API_KEY }}
```
### GitHub Secret Scanning (Native)
Enable in: **Settings → Code security and analysis → Secret scanning**
- Automatic detection
- Partner patterns (AWS, Azure, GCP, etc.)
- Push protection (blocks pushes that contain detected secrets)
---
## Security Gates & Quality Gates
### Fail Pipeline on Security Issues
**Threshold-based gates:**
```yaml
security-gate:
  stage: gate
  script:
    # Check vulnerability count (Trivy JSON nests findings under Results[].Vulnerabilities[])
    - |
      CRITICAL=$(jq '[.Results[]?.Vulnerabilities[]? | select(.Severity=="CRITICAL")] | length' trivy-report.json)
      HIGH=$(jq '[.Results[]?.Vulnerabilities[]? | select(.Severity=="HIGH")] | length' trivy-report.json)
      echo "Critical: $CRITICAL, High: $HIGH"
      if [ "$CRITICAL" -gt 0 ]; then
        echo "❌ CRITICAL vulnerabilities found!"
        exit 1
      fi
      if [ "$HIGH" -gt 5 ]; then
        echo "❌ Too many HIGH vulnerabilities: $HIGH"
        exit 1
      fi
```
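The threshold logic can be factored out of the job so that it is testable without a scanner report. A minimal sketch, assuming the counts have already been extracted; `security_gate` is a hypothetical helper, and its default limit of 5 mirrors the HIGH threshold used in the job above.

```bash
# security_gate CRITICAL HIGH [MAX_HIGH]: fail when any critical finding
# exists or when high findings exceed MAX_HIGH (default 5).
security_gate() {
  critical=$1
  high=$2
  max_high=${3:-5}
  if [ "$critical" -gt 0 ]; then
    echo "CRITICAL vulnerabilities found: $critical"
    return 1
  fi
  if [ "$high" -gt "$max_high" ]; then
    echo "Too many HIGH vulnerabilities: $high (limit $max_high)"
    return 1
  fi
  echo "Security gate passed"
}

# Example (not executed here):
#   security_gate "$CRITICAL" "$HIGH" || exit 1
```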
**SonarQube Quality Gate:**
```yaml
- name: Check Quality Gate
  run: |
    STATUS=$(curl -u $SONAR_TOKEN: "$SONAR_HOST/api/qualitygates/project_status?projectKey=$PROJECT_KEY" | jq -r '.projectStatus.status')
    if [ "$STATUS" != "OK" ]; then
      echo "Quality gate failed: $STATUS"
      exit 1
    fi
```
### Manual Approval for Production
**GitHub Actions:**
```yaml
deploy-production:
  runs-on: ubuntu-latest
  needs: [sast, dast, container-scan]
  environment:
    name: production
    # Requires manual approval in Settings → Environments
  steps:
    - run: echo "Deploying to production"
```
**GitLab CI:**
```yaml
deploy:production:
  stage: deploy
  needs: [sast, dast, container_scanning]
  script:
    - ./deploy.sh production
  when: manual
  only:
    - main
```
---
## Compliance & License Scanning
### License Compliance
**FOSSology:**
```yaml
license-scan:
  stage: compliance
  image: fossology/fossology:latest
  script:
    # Placeholder command: FOSSology is normally driven through its web UI or REST API
    - fossology --scan ./src
```
**License Finder:**
```yaml
- name: Check Licenses
  run: |
    gem install license_finder
    license_finder --decisions-file .license_finder.yml
```
**npm license checker:**
```yaml
- name: License Check
  run: |
    npx license-checker --production --onlyAllow "MIT;Apache-2.0;BSD-3-Clause;ISC"
```
### Policy as Code
**Open Policy Agent (OPA):**
```yaml
policy-check:
  stage: gate
  image: openpolicyagent/opa:latest
  script:
    - opa test policies/
    - opa eval --data policies/ --input violations.json "data.security.allow"
```
---
## Complete DevSecOps Pipeline
**Comprehensive example (GitHub Actions):**
```yaml
name: DevSecOps Pipeline
on: [push, pull_request]
jobs:
  # Stage 1: Secret Scanning
  secret-scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: trufflesecurity/trufflehog@main
  # Stage 2: SAST
  sast:
    runs-on: ubuntu-latest
    needs: secret-scan
    steps:
      - uses: actions/checkout@v4
      - uses: github/codeql-action/init@v3
      - uses: github/codeql-action/autobuild@v3
      - uses: github/codeql-action/analyze@v3
  # Stage 3: SCA
  sca:
    runs-on: ubuntu-latest
    needs: secret-scan
    steps:
      - uses: actions/checkout@v4
      - run: npm audit --audit-level=high
      - uses: snyk/actions/node@master
        env:
          SNYK_TOKEN: ${{ secrets.SNYK_TOKEN }}
  # Stage 4: Build & Container Scan
  build-scan:
    runs-on: ubuntu-latest
    needs: [sast, sca]
    steps:
      - uses: actions/checkout@v4
      - run: docker build -t myapp:${{ github.sha }} .
      - uses: aquasecurity/trivy-action@master
        with:
          image-ref: myapp:${{ github.sha }}
          exit-code: '1'
  # Stage 5: DAST
  dast:
    runs-on: ubuntu-latest
    needs: build-scan
    if: github.ref == 'refs/heads/main'
    steps:
      - uses: zaproxy/action-baseline@v0.10.0
        with:
          target: 'https://staging.example.com'
  # Stage 6: Security Gate
  security-gate:
    runs-on: ubuntu-latest
    needs: [sast, sca, build-scan, dast]
    steps:
      - run: echo "All security checks passed!"
      - run: echo "Ready for deployment"
  # Stage 7: Deploy
  deploy:
    runs-on: ubuntu-latest
    needs: security-gate
    environment: production
    steps:
      - run: echo "Deploying to production"
```
---
## Best Practices
### 1. Fail Fast
- Run secret scanning first
- Run SAST early in pipeline
- Block PRs with critical vulnerabilities
### 2. Balance Speed vs Security
- SAST/SCA on every PR (fast)
- Container scanning after build
- DAST on schedules or staging only (slow)
### 3. Prioritize Findings
**Focus on:**
- Critical/High severity
- Exploitable vulnerabilities
- Direct dependencies (not transitive)
- Public-facing components
### 4. Developer Experience
- Clear error messages
- Link to remediation guidance
- Don't overwhelm with noise
- Use quality gates, not just fail/pass
### 5. Continuous Improvement
- Track security debt over time
- Set SLAs for vulnerability remediation
- Regular tool evaluation
- Security training for developers
### 6. Reporting & Metrics
**Track:**
- Mean Time to Remediate (MTTR)
- Vulnerability backlog
- False positive rate
- Coverage (% of code scanned)
```yaml
- name: Generate Security Report
  run: |
    echo "## Security Scan Summary" >> $GITHUB_STEP_SUMMARY
    echo "- SAST: ✅ Passed" >> $GITHUB_STEP_SUMMARY
    echo "- SCA: ⚠️ 3 vulnerabilities" >> $GITHUB_STEP_SUMMARY
    echo "- Container: ✅ Passed" >> $GITHUB_STEP_SUMMARY
    echo "- DAST: 🔄 Scheduled" >> $GITHUB_STEP_SUMMARY
```
---
## Tool Comparison
| Category | Tool | Speed | Accuracy | Cost | Best For |
|----------|------|-------|----------|------|----------|
| **SAST** | CodeQL | Medium | High | Free (GH) | Deep analysis |
| | Semgrep | Fast | Medium | Free/Paid | Custom rules |
| | SonarQube | Medium | High | Free/Paid | Quality + Security |
| **DAST** | OWASP ZAP | Medium | High | Free | Web apps |
| | Burp Suite | Slow | High | Paid | Professional |
| **SCA** | Snyk | Fast | High | Free/Paid | Easy integration |
| | Dependabot | Fast | Medium | Free (GH) | Auto PRs |
| **Container** | Trivy | Fast | High | Free | Fast scans |
| | Grype | Fast | High | Free | SBOM support |
| **Secrets** | TruffleHog | Fast | High | Free/Paid | Git history |
| | GitGuardian | Fast | High | Paid | Real-time |
---
## Security Scanning Schedule
**Recommended frequency:**
| Scan Type | PR | Main Branch | Schedule | Notes |
|-----------|----|-------------|----------|-------|
| Secret Scanning | ✅ Every | ✅ Every | - | Fast, critical |
| SAST | ✅ Every | ✅ Every | - | Fast, essential |
| SCA | ✅ Every | ✅ Every | Weekly | Check dependencies |
| Linting | ✅ Every | ✅ Every | - | Very fast |
| Container Scan | ❌ No | ✅ Every | - | After build |
| DAST Baseline | ❌ No | ✅ Every | - | Medium speed |
| DAST Full | ❌ No | ❌ No | Weekly | Very slow |
| Penetration Test | ❌ No | ❌ No | Quarterly | Manual |
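The table can be encoded as a lookup that a pipeline script uses to decide which scans to run for a given trigger. A minimal sketch; `scans_for` and the short scan names are hypothetical labels chosen for this example, and the manual quarterly penetration test is left out because it is not automated.

```bash
# scans_for EVENT: print the scans the table above schedules for a
# pull request ("pr"), a push to main ("main"), or a weekly "schedule".
scans_for() {
  case "$1" in
    pr)       echo "secrets sast sca lint" ;;
    main)     echo "secrets sast sca lint container dast-baseline" ;;
    schedule) echo "sca dast-full" ;;
    *)        echo "unknown event: $1" >&2; return 1 ;;
  esac
}

# Example (not executed here):
#   for scan in $(scans_for pr); do ./scripts/run-scan.sh "$scan"; done
```

The `./scripts/run-scan.sh` wrapper in the usage line is likewise an assumption, standing in for whatever invokes each tool.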
---
## Security Checklist
- [ ] Secret scanning enabled and running
- [ ] SAST configured for all languages used
- [ ] Dependency scanning (SCA) enabled
- [ ] Container images scanned before deployment
- [ ] DAST running on staging environment
- [ ] Security findings triaged in issue tracker
- [ ] Quality gates prevent vulnerable deployments
- [ ] SBOM generated for releases
- [ ] Security scan results tracked over time
- [ ] Vulnerability remediation SLAs defined
- [ ] Security training for developers
- [ ] Incident response plan documented

references/optimization.md

@@ -0,0 +1,651 @@
# CI/CD Pipeline Optimization
Comprehensive guide to improving pipeline performance through caching, parallelization, and smart resource usage.
## Table of Contents
- [Caching Strategies](#caching-strategies)
- [Parallelization Techniques](#parallelization-techniques)
- [Build Optimization](#build-optimization)
- [Test Optimization](#test-optimization)
- [Resource Management](#resource-management)
- [Monitoring & Metrics](#monitoring--metrics)
---
## Caching Strategies
### Dependency Caching
**Impact:** Can reduce build times by 50-90% when dependency installation dominates the build
#### GitHub Actions
**Node.js/npm:**
```yaml
- uses: actions/cache@v4
  with:
    path: ~/.npm
    key: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}
    restore-keys: |
      ${{ runner.os }}-node-
- run: npm ci
```
**Python/pip:**
```yaml
- uses: actions/cache@v4
  with:
    path: ~/.cache/pip
    key: ${{ runner.os }}-pip-${{ hashFiles('**/requirements.txt') }}
    restore-keys: |
      ${{ runner.os }}-pip-
- run: pip install -r requirements.txt
```
**Go modules:**
```yaml
- uses: actions/cache@v4
  with:
    path: |
      ~/.cache/go-build
      ~/go/pkg/mod
    key: ${{ runner.os }}-go-${{ hashFiles('**/go.sum') }}
    restore-keys: |
      ${{ runner.os }}-go-
- run: go build
```
**Rust/Cargo:**
```yaml
- uses: actions/cache@v4
  with:
    path: |
      ~/.cargo/bin/
      ~/.cargo/registry/index/
      ~/.cargo/registry/cache/
      ~/.cargo/git/db/
      target/
    key: ${{ runner.os }}-cargo-${{ hashFiles('**/Cargo.lock') }}
    restore-keys: |
      ${{ runner.os }}-cargo-
- run: cargo build --release
```
**Maven:**
```yaml
- uses: actions/cache@v4
  with:
    path: ~/.m2/repository
    key: ${{ runner.os }}-maven-${{ hashFiles('**/pom.xml') }}
    restore-keys: |
      ${{ runner.os }}-maven-
- run: mvn clean install
```
#### GitLab CI
**Global cache:**
```yaml
cache:
  key: ${CI_COMMIT_REF_SLUG}
  paths:
    - node_modules/
    - .npm/
    - vendor/
```
**Job-specific cache:**
```yaml
build:
  cache:
    key: build-${CI_COMMIT_REF_SLUG}
    paths:
      - target/
    policy: push # Upload only
test:
  cache:
    key: build-${CI_COMMIT_REF_SLUG}
    paths:
      - target/
    policy: pull # Download only
```
**Cache with files checksum:**
```yaml
cache:
  key:
    files:
      - package-lock.json
      - yarn.lock
  paths:
    - node_modules/
```
### Build Artifact Caching
**Docker layer caching (GitHub):**
```yaml
- uses: docker/setup-buildx-action@v3
- uses: docker/build-push-action@v5
  with:
    context: .
    cache-from: type=gha
    cache-to: type=gha,mode=max
    push: false
    tags: myapp:latest
```
**Docker layer caching (GitLab):**
```yaml
build:
image: docker:latest
services:
- docker:dind
variables:
DOCKER_DRIVER: overlay2
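script:
- echo "$CI_REGISTRY_PASSWORD" | docker login "$CI_REGISTRY" -u "$CI_REGISTRY_USER" --password-stdin
- docker pull $CI_REGISTRY_IMAGE:latest || true
# BUILDKIT_INLINE_CACHE=1 stores cache metadata in the image so --cache-from can reuse its layers
- docker build --cache-from $CI_REGISTRY_IMAGE:latest --build-arg BUILDKIT_INLINE_CACHE=1 -t $CI_REGISTRY_IMAGE:latest .
- docker push $CI_REGISTRY_IMAGE:latest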
```
**Gradle build cache:**
```yaml
- uses: actions/cache@v4
with:
path: |
~/.gradle/caches
~/.gradle/wrapper
key: ${{ runner.os }}-gradle-${{ hashFiles('**/*.gradle*', '**/gradle-wrapper.properties') }}
- run: ./gradlew build --build-cache
```
### Cache Best Practices
**Key strategies:**
- Include OS/platform: `${{ runner.os }}-` or `${CI_RUNNER_EXECUTABLE_ARCH}`
- Hash lock files: `hashFiles('**/package-lock.json')`
- Use restore-keys for fallback matches
- Separate caches for different purposes
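As a rough illustration of these key strategies, the Python sketch below builds a key from the OS name plus a digest of the lock files. The helper names `cache_key` and `restore_prefix` are hypothetical, and the digest only mirrors the idea behind `hashFiles`, not GitHub's exact algorithm.

```python
import hashlib
import platform
from pathlib import Path
from typing import List, Optional


def cache_key(prefix: str, lock_files: List[Path], os_name: Optional[str] = None) -> str:
    """Return '<os>-<prefix>-<digest>' where the digest covers the lock file contents."""
    digest = hashlib.sha256()
    for path in sorted(lock_files):  # stable order so the key is reproducible
        digest.update(path.read_bytes())
    os_part = os_name or platform.system()
    return f"{os_part}-{prefix}-{digest.hexdigest()}"


def restore_prefix(key: str) -> str:
    """Strip the digest so an older cache for the same OS and tool can still match."""
    return key.rsplit("-", 1)[0] + "-"
```

An unchanged lock file yields the same key on every run (a cache hit). Any edit to the lock file produces a new key, and the prefix plays the role of a `restore-keys` fallback.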
**Cache invalidation:**
```yaml
# Version prefix in the cache key; bump v2 to v3 to invalidate every existing cache
cache:
key: v2-${CI_COMMIT_REF_SLUG}
```
**Cache size management:**
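- GitHub: 10 GB per repository by default; entries not accessed for 7 days are removed, and the least recently used entries are evicted once the limit is exceeded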
- GitLab: Configurable per runner
---
## Parallelization Techniques
### Job Parallelization
**Remove unnecessary dependencies:**
```yaml
# Before - Sequential
jobs:
lint:
test:
needs: lint
build:
needs: test
# After - Parallel
jobs:
lint:
test:
build:
needs: [lint, test] # Only wait for what's needed
```
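To see why trimming `needs` helps, the sketch below computes the wall-clock time of a job graph as its longest dependency chain. It assumes unlimited runners and uses made-up durations; `pipeline_duration` is a hypothetical helper, not a platform API.

```python
from typing import Dict, List


def pipeline_duration(durations: Dict[str, float], needs: Dict[str, List[str]]) -> float:
    """Return the wall-clock time of a job graph, assuming unlimited runners."""
    finish: Dict[str, float] = {}

    def finish_time(job: str) -> float:
        if job not in finish:
            # A job starts once the slowest of its dependencies has finished.
            start = max((finish_time(dep) for dep in needs.get(job, [])), default=0.0)
            finish[job] = start + durations[job]
        return finish[job]

    return max(finish_time(job) for job in durations)


# Hypothetical durations in minutes
durations = {"lint": 1, "test": 5, "build": 3}
sequential = {"test": ["lint"], "build": ["test"]}
parallel = {"build": ["lint", "test"]}
print(pipeline_duration(durations, sequential))  # 9.0
print(pipeline_duration(durations, parallel))    # 8.0
```

The saving equals the duration of whichever job no longer sits on the critical path, so the gain is largest when the removed dependency is slow.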
### Matrix Builds
**GitHub Actions:**
```yaml
strategy:
matrix:
os: [ubuntu-latest, macos-latest, windows-latest]
node: [18, 20, 22]
include:
- os: ubuntu-latest
node: 22
coverage: true
exclude:
- os: macos-latest
node: 18
fail-fast: false
max-parallel: 10 # Limit concurrent jobs
```
**GitLab parallel:**
```yaml
test:
parallel:
matrix:
- NODE_VERSION: ['18', '20', '22']
TEST_SUITE: ['unit', 'integration']
script:
- nvm use $NODE_VERSION
- npm run test:$TEST_SUITE
```
### Test Splitting
**Jest sharding:**
```yaml
strategy:
matrix:
shard: [1, 2, 3, 4]
steps:
- run: npm test -- --shard=${{ matrix.shard }}/4
```
**Playwright sharding:**
```yaml
strategy:
matrix:
shardIndex: [1, 2, 3, 4]
shardTotal: [4]
steps:
- run: npx playwright test --shard=${{ matrix.shardIndex }}/${{ matrix.shardTotal }}
```
**Pytest splitting:**
```yaml
strategy:
matrix:
group: [1, 2, 3, 4]
steps:
- run: pytest --splits 4 --group ${{ matrix.group }}
```
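The flags above divide the suite into a fixed number of shards. When per-test timings are available, a duration-aware split can balance the shards better. A minimal sketch of that idea, assuming timings have already been collected; `split_by_duration` is a hypothetical helper and not part of Jest, Playwright, or pytest:

```python
from typing import Dict, List


def split_by_duration(durations: Dict[str, float], shards: int) -> List[List[str]]:
    """Greedily assign tests to shards, longest first, always filling the lightest shard."""
    buckets: List[List[str]] = [[] for _ in range(shards)]
    totals = [0.0] * shards
    # Sort by descending duration; the name breaks ties so the split is deterministic.
    for name, seconds in sorted(durations.items(), key=lambda item: (-item[1], item[0])):
        lightest = totals.index(min(totals))
        buckets[lightest].append(name)
        totals[lightest] += seconds
    return buckets
```

Because the assignment is deterministic, every shard can compute the same split independently and run only its own bucket.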
### Conditional Execution
**Path-based:**
```yaml
# frontend.yml (backend.yml mirrors it with 'backend/**'); contains() on an array cannot match a path prefix
on:
  push:
    paths:
      - 'frontend/**'
jobs:
  frontend-test:
    runs-on: ubuntu-latest
    steps:
      - run: npm test
```
**GitLab rules:**
```yaml
frontend-test:
rules:
- changes:
- frontend/**/*
backend-test:
rules:
- changes:
- backend/**/*
```
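Both platforms decide per job whether any changed file falls under a watched path. A minimal sketch of that decision, assuming the list of changed files is already known (for example from `git diff --name-only`); `jobs_for_changes` and the rule table are hypothetical:

```python
from typing import Dict, List, Tuple


def jobs_for_changes(changed_files: List[str], rules: Dict[str, Tuple[str, ...]]) -> List[str]:
    """Return the jobs whose directory prefixes match at least one changed file."""
    return [
        job
        for job, prefixes in rules.items()
        if any(path.startswith(prefixes) for path in changed_files)
    ]


# Hypothetical mapping of jobs to the directories they care about
RULES = {
    "frontend-test": ("frontend/",),
    "backend-test": ("backend/",),
}
```

Prefix matching is used here for simplicity; real path filters support glob patterns with their own matching rules.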
---
## Build Optimization
### Incremental Builds
**Turborepo (monorepo):**
```yaml
- run: npx turbo run build test lint --filter=[HEAD^1]
```
**Nx (monorepo):**
```yaml
- run: npx nx affected --target=build --base=origin/main
```
### Compiler Optimizations
**TypeScript incremental:**
```json
{
"compilerOptions": {
"incremental": true,
"tsBuildInfoFile": ".tsbuildinfo"
}
}
```
**Cache tsbuildinfo:**
```yaml
- uses: actions/cache@v4
with:
path: .tsbuildinfo
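key: ts-build-${{ hashFiles('**/*.ts') }}
# Fall back to the most recent build info when sources changed, so the build stays incremental
restore-keys: |
ts-build-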
```
### Multi-stage Docker Builds
```dockerfile
# Build stage
FROM node:20 AS builder
WORKDIR /app
COPY package*.json ./
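# Install all dependencies; the build usually needs devDependencies
RUN npm ci
COPY . .
RUN npm run build
# Drop devDependencies so only runtime packages are copied to the final image
RUN npm prune --omit=dev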
# Production stage
FROM node:20-alpine
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
CMD ["node", "dist/server.js"]
```
### Build Tool Configuration
**Webpack production mode:**
```javascript
module.exports = {
mode: 'production',
optimization: {
minimize: true,
splitChunks: {
chunks: 'all'
}
}
}
```
**Vite optimization:**
```javascript
export default {
build: {
minify: 'terser',
rollupOptions: {
output: {
manualChunks(id) {
if (id.includes('node_modules')) {
return 'vendor';
}
}
}
}
}
}
```
---
## Test Optimization
### Test Categorization
**Run fast tests first:**
```yaml
jobs:
unit-test:
runs-on: ubuntu-latest
steps:
- run: npm run test:unit # Fast (1-5 min)
integration-test:
needs: unit-test
runs-on: ubuntu-latest
steps:
- run: npm run test:integration # Medium (5-15 min)
e2e-test:
needs: [unit-test, integration-test]
if: github.ref == 'refs/heads/main'
runs-on: ubuntu-latest
steps:
- run: npm run test:e2e # Slow (15-30 min)
```
### Selective Test Execution
**Run only changed:**
```yaml
- name: Get changed files
id: changed
run: |
if [ "${{ github.event_name }}" == "pull_request" ]; then
echo "files=$(git diff --name-only origin/${{ github.base_ref }}...HEAD | tr '\n' ' ')" >> $GITHUB_OUTPUT
fi
- name: Run affected tests
if: steps.changed.outputs.files
run: npm test -- --findRelatedTests ${{ steps.changed.outputs.files }}
```
### Test Fixtures & Data
**Reuse test databases:**
```yaml
services:
postgres:
image: postgres:15
env:
POSTGRES_DB: testdb
POSTGRES_PASSWORD: testpass
options: >-
--health-cmd pg_isready
--health-interval 10s
--health-timeout 5s
--health-retries 5
steps:
- run: npm test # All tests share same DB
```
**Snapshot testing:**
```javascript
// Faster than full rendering tests
expect(component).toMatchSnapshot();
```
### Mock External Services
```javascript
// Instead of hitting real APIs
jest.mock('./api', () => ({
fetchData: jest.fn(() => Promise.resolve(mockData))
}));
```
---
## Resource Management
### Job Timeouts
**Prevent hung jobs:**
```yaml
jobs:
test:
timeout-minutes: 30 # Default: 360 (6 hours)
build:
timeout-minutes: 15
```
**GitLab:**
```yaml
test:
timeout: 30m # Default: 1h
```
### Concurrency Control
**GitHub Actions:**
```yaml
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true # Cancel old runs
```
**GitLab:**
```yaml
workflow:
auto_cancel:
on_new_commit: interruptible
job:
interruptible: true
```
### Resource Allocation
**GitLab runner tags:**
```yaml
build:
tags:
- high-memory
- ssd
```
**Kubernetes resource limits:**
```toml
# GitLab Runner config.toml
[[runners]]
[runners.kubernetes]
cpu_request = "1"
cpu_limit = "2"
memory_request = "2Gi"
memory_limit = "4Gi"
```
---
## Monitoring & Metrics
### Track Key Metrics
**Build duration:**
```yaml
- name: Track duration
run: |
START=$SECONDS
npm run build
DURATION=$((SECONDS - START))
echo "Build took ${DURATION}s"
```
**Cache hit rate:**
```yaml
- uses: actions/cache@v4
id: cache
with:
path: node_modules
key: ${{ hashFiles('package-lock.json') }}
- name: Cache stats
run: |
if [ "${{ steps.cache.outputs.cache-hit }}" == "true" ]; then
echo "Cache hit!"
else
echo "Cache miss"
fi
```
### Performance Regression Detection
**Compare against baseline:**
```yaml
- name: Benchmark
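run: npm run --silent benchmark > results.json # --silent keeps npm's banner out of the file
- name: Compare
run: |
CURRENT=$(jq '.duration | floor' results.json) # integer seconds for the shell test
BASELINE=120
if [ "$CURRENT" -gt $((BASELINE * 120 / 100)) ]; then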
echo "Performance regression: ${CURRENT}s vs ${BASELINE}s baseline"
exit 1
fi
```
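The same comparison can be sketched in Python, which handles fractional durations without extra work. This assumes the benchmark writes a JSON object with a `duration` field, as in the step above; `is_regression` and `check_results` are hypothetical helpers and the 20% tolerance is only an example.

```python
import json


def is_regression(current: float, baseline: float, tolerance: float = 0.20) -> bool:
    """True when `current` exceeds `baseline` by more than `tolerance` (0.20 = 20%)."""
    return current > baseline * (1 + tolerance)


def check_results(results_json: str, baseline: float) -> str:
    """Parse benchmark output and describe the outcome."""
    current = float(json.loads(results_json)["duration"])
    if is_regression(current, baseline):
        return f"Performance regression: {current}s vs {baseline}s baseline"
    return f"OK: {current}s vs {baseline}s baseline"
```

A hard-coded baseline drifts over time; storing the baseline from the last successful run on the default branch is one way to keep the check meaningful.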
### External Monitoring
**DataDog CI Visibility:**
```yaml
- run: datadog-ci junit upload --service myapp junit-results.xml
```
**BuildPulse (flaky test detection):**
```yaml
- uses: buildpulse/buildpulse-action@v0.11.0
with:
account: myaccount
repository: myrepo
path: test-results/*.xml
```
---
## Optimization Checklist
### Quick Wins
- [ ] Enable dependency caching
- [ ] Remove unnecessary job dependencies
- [ ] Add job timeouts
- [ ] Enable concurrency cancellation
- [ ] Use `npm ci` instead of `npm install`
### Medium Impact
- [ ] Implement test sharding
- [ ] Use Docker layer caching
- [ ] Add path-based triggers
- [ ] Split slow test suites
- [ ] Use matrix builds for parallel execution
### Advanced
- [ ] Implement incremental builds (Nx, Turborepo)
- [ ] Use remote caching
- [ ] Optimize Docker images (multi-stage, distroless)
- [ ] Implement test impact analysis
- [ ] Set up distributed test execution
### Monitoring
- [ ] Track build duration trends
- [ ] Monitor cache hit rates
- [ ] Identify flaky tests
- [ ] Measure test execution time
- [ ] Set up performance regression alerts
---
## Performance Targets
**Build times:**
- Lint: < 1 minute
- Unit tests: < 5 minutes
- Integration tests: < 15 minutes
- E2E tests: < 30 minutes
- Full pipeline on pull requests (E2E deferred to `main`): < 20 minutes
**Resource usage:**
- Cache hit rate: > 80%
- Job success rate: > 95%
- Concurrent jobs: Balanced across available runners
- Queue time: < 2 minutes
**Cost optimization:**
- Build minutes used: Monitor monthly trends
- Storage: Keep artifacts < 7 days unless needed
- Self-hosted runners: Monitor utilization (target 60-80%)

611
references/security.md Normal file
View File

@@ -0,0 +1,611 @@
# CI/CD Security
Comprehensive guide to securing CI/CD pipelines, secrets management, and supply chain security.
## Table of Contents
- [Secrets Management](#secrets-management)
- [OIDC Authentication](#oidc-authentication)
- [Supply Chain Security](#supply-chain-security)
- [Access Control](#access-control)
- [Secure Pipeline Patterns](#secure-pipeline-patterns)
- [Vulnerability Scanning](#vulnerability-scanning)
---
## Secrets Management
### Never Commit Secrets
**Prevention methods:**
- Use `.gitignore` for sensitive files
- Enable pre-commit hooks (git-secrets, gitleaks)
- Use secret scanning (GitHub, GitLab)
**If secrets are exposed:**
1. Rotate compromised credentials immediately
2. Remove from git history: `git filter-repo` or BFG Repo-Cleaner
3. Audit access logs for unauthorized usage
### Platform Secret Stores
**GitHub Secrets:**
```yaml
# Repository, Environment, or Organization secrets
steps:
- name: Deploy
env:
API_KEY: ${{ secrets.API_KEY }}
DB_PASSWORD: ${{ secrets.DB_PASSWORD }}
run: ./deploy.sh
```
**Secret hierarchy:**
1. Environment secrets (highest priority)
2. Repository secrets
3. Organization secrets (lowest priority)
**GitLab CI/CD Variables:**
```yaml
# Project > Settings > CI/CD > Variables
deploy:
script:
- test -n "$API_KEY" # Check the variable is set; never echo secret values
- deploy --token $DEPLOY_TOKEN
variables:
ENVIRONMENT: "production" # Non-secret variable
```
**Variable types:**
- **Protected:** Only available to pipelines on protected branches and tags
- **Masked:** Hidden in job logs
- **Environment scope:** Limit to specific environments
### External Secret Management
**HashiCorp Vault:**
```yaml
# GitHub Actions
- uses: hashicorp/vault-action@v3
with:
url: https://vault.example.com
method: jwt
role: cicd-role
secrets: |
secret/data/app api_key | API_KEY ;
secret/data/db password | DB_PASSWORD
```
**AWS Secrets Manager:**
```yaml
- name: Get secrets
run: |
SECRET=$(aws secretsmanager get-secret-value \
--secret-id prod/api/key \
--query SecretString --output text)
echo "::add-mask::$SECRET"
echo "API_KEY=$SECRET" >> $GITHUB_ENV
```
**Azure Key Vault:**
```yaml
- uses: Azure/get-keyvault-secrets@v1
with:
keyvault: "my-keyvault"
secrets: 'api-key, db-password'
```
### Secret Rotation
**Implement rotation policies:**
```yaml
check-secret-age:
steps:
- name: Check secret age
run: |
# LastChangedDate moves on every update; CreatedDate never changes after creation
CHANGED=$(aws secretsmanager describe-secret \
--secret-id myapp/api-key \
--query 'LastChangedDate' --output text)
AGE=$(( ($(date +%s) - $(date -d "$CHANGED" +%s)) / 86400 ))
if [ "$AGE" -gt 90 ]; then
echo "Secret is $AGE days old, rotation required"
exit 1
fi
```
**Best practices:**
- Rotate secrets every 90 days
- Use short-lived credentials when possible
- Audit secret access logs
- Automate rotation where possible
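The 90-day rule can also be checked with a small script. A minimal sketch, assuming the last-changed timestamp is available as an ISO 8601 string with an explicit UTC offset; `secret_age_days` and `needs_rotation` are hypothetical helpers:

```python
from datetime import datetime, timezone


def secret_age_days(last_changed: str, now: datetime) -> int:
    """Whole days between an ISO 8601 timestamp (with UTC offset) and `now`."""
    changed = datetime.fromisoformat(last_changed)
    return (now - changed).days


def needs_rotation(last_changed: str, now: datetime, max_age_days: int = 90) -> bool:
    """True when the secret is older than the rotation policy allows."""
    return secret_age_days(last_changed, now) > max_age_days


if __name__ == "__main__":
    # Example timestamp only; a real check would read it from the secret store.
    print(needs_rotation("2024-01-01T00:00:00+00:00", datetime.now(timezone.utc)))
```

Passing `now` explicitly keeps the functions easy to test and avoids mixing naive and timezone-aware datetimes.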
---
## OIDC Authentication
### Why OIDC?
**Benefits over static credentials:**
- No long-lived secrets in CI/CD
- Automatic token expiration
- Fine-grained permissions
- Audit trail of authentication
### GitHub Actions OIDC
**AWS example:**
```yaml
permissions:
id-token: write # Required for OIDC
contents: read
jobs:
deploy:
steps:
- uses: aws-actions/configure-aws-credentials@v4
with:
role-to-assume: arn:aws:iam::123456789012:role/GitHubActionsRole
aws-region: us-east-1
- run: aws s3 sync dist/ s3://my-bucket
```
**AWS IAM Trust Policy:**
```json
{
"Version": "2012-10-17",
"Statement": [{
"Effect": "Allow",
"Principal": {
"Federated": "arn:aws:iam::123456789012:oidc-provider/token.actions.githubusercontent.com"
},
"Action": "sts:AssumeRoleWithWebIdentity",
"Condition": {
"StringEquals": {
"token.actions.githubusercontent.com:aud": "sts.amazonaws.com",
"token.actions.githubusercontent.com:sub": "repo:owner/repo:ref:refs/heads/main"
}
}
}]
}
```
**GCP example:**
```yaml
- uses: google-github-actions/auth@v2
with:
workload_identity_provider: 'projects/123/locations/global/workloadIdentityPools/github/providers/github-provider'
service_account: 'github-actions@project.iam.gserviceaccount.com'
- run: gcloud storage cp dist/* gs://my-bucket
```
**Azure example:**
```yaml
- uses: azure/login@v2
with:
client-id: ${{ secrets.AZURE_CLIENT_ID }}
tenant-id: ${{ secrets.AZURE_TENANT_ID }}
subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
- run: az storage blob upload-batch -d mycontainer -s dist/
```
### GitLab OIDC
**Configure ID token:**
```yaml
deploy:
id_tokens:
GITLAB_OIDC_TOKEN:
aud: https://aws.amazonaws.com
script:
- |
CREDENTIALS=$(aws sts assume-role-with-web-identity \
--role-arn $AWS_ROLE_ARN \
--role-session-name gitlab-ci \
--web-identity-token $GITLAB_OIDC_TOKEN \
--duration-seconds 3600)
```
**Vault integration:**
```yaml
deploy:
id_tokens:
VAULT_ID_TOKEN:
aud: https://vault.example.com
before_script:
- export VAULT_TOKEN=$(vault write -field=token auth/jwt/login role=cicd-role jwt=$VAULT_ID_TOKEN)
```
---
## Supply Chain Security
### Dependency Verification
**Lock files:**
- Always commit lock files
- Use `npm ci`, not `npm install`
- Use `--frozen-lockfile` (Yarn 1.x, pnpm) or `--immutable` (Yarn 2+)
**Checksum verification:**
```yaml
- name: Verify dependencies
run: |
npm ci --audit=true
npx lockfile-lint --path package-lock.json --validate-https
```
**SBOM generation:**
```yaml
- name: Generate SBOM
run: |
syft dir:. -o spdx-json > sbom.json
- uses: actions/upload-artifact@v4
with:
name: sbom
path: sbom.json
```
### Action/Workflow Security
**Pin to commit SHA (GitHub):**
```yaml
# Bad - mutable major tag
- uses: actions/checkout@v4
# Better - full version tag (still mutable)
- uses: actions/checkout@v4.1.1
# Best - pinned to SHA
- uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1
```
**Verify action sources:**
- Only use actions from trusted sources
- Review action code before first use
- Monitor Dependabot alerts for actions
- Use verified creators when possible
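Pinning is easy to audit automatically. The sketch below scans workflow text for `uses:` references that are not pinned to a full commit SHA. It is a simple line-based check under stated assumptions (unquoted references, one per line), and `unpinned_actions` is a hypothetical helper rather than an existing tool.

```python
import re
from typing import List

USES_RE = re.compile(r"^\s*-?\s*uses:\s*([^\s#]+)", re.MULTILINE)
FULL_SHA_RE = re.compile(r"[0-9a-f]{40}")


def unpinned_actions(workflow_text: str) -> List[str]:
    """Return every `uses:` reference that is not pinned to a 40-character commit SHA."""
    unpinned = []
    for ref in USES_RE.findall(workflow_text):
        if ref.startswith(("./", "docker://")):
            continue  # local actions and container images are not git refs
        _, _, version = ref.partition("@")
        if not FULL_SHA_RE.fullmatch(version):
            unpinned.append(ref)
    return unpinned
```

Running a check like this in a lint job turns the pinning guideline into something the pipeline enforces rather than a convention reviewers must remember.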
**GitLab include verification:**
```yaml
include:
- project: 'security/ci-templates'
ref: 'v2.1.0' # Pin to specific version
file: '/security-scan.yml'
```
### Container Image Security
**Use specific tags:**
```yaml
# Bad
image: node:latest
# Good
image: node:20.11.0-alpine
# Best
image: node:20.11.0-alpine@sha256:abc123...
```
**Minimal base images:**
```dockerfile
# Prefer distroless or alpine
FROM gcr.io/distroless/nodejs20-debian12
# Or alpine
FROM node:20-alpine
```
**Image scanning:**
```yaml
- name: Build image
run: docker build -t myapp:${{ github.sha }} .
- name: Scan image
run: |
trivy image --exit-code 1 --severity HIGH,CRITICAL myapp:${{ github.sha }} # non-zero exit fails the job
grype myapp:${{ github.sha }}
```
### Code Signing
**Sign commits:**
```bash
git config --global user.signingkey <key-id>
git config --global commit.gpgsign true
```
**Verify signed commits (GitHub):**
```yaml
- name: Verify signatures
run: |
git verify-commit HEAD || exit 1
```
**Sign artifacts:**
```yaml
- name: Sign release
run: |
cosign sign myregistry/myapp:${{ github.sha }}
```
---
## Access Control
### Principle of Least Privilege
**GitHub permissions:**
```yaml
# Minimal permissions
permissions:
contents: read # Only read code
pull-requests: write # Comment on PRs
jobs:
deploy:
permissions:
contents: read
id-token: write # For OIDC
```
**GitLab protected branches:**
- Configure in Settings > Repository > Protected branches
- Restrict who can push and merge
- Require approval before merge
### Branch Protection
**GitHub branch protection rules:**
- Require pull request reviews
- Require status checks to pass
- Require signed commits
- Require linear history
- Include administrators
- Restrict who can push
**GitLab merge request approval rules:**
```yaml
# .gitlab/CODEOWNERS
* @senior-devs
/infra/ @devops-team
/security/ @security-team
```
### Environment Protection
**GitHub environment rules:**
- Required reviewers (up to 6)
- Wait timer before deployment
- Deployment branches (limit to specific branches)
- Custom deployment protection rules
**GitLab deployment protection:**
```yaml
production:
  environment:
    name: production
  rules:
    # `rules` and `only` cannot be combined in one job, so both conditions go in one rule
    - if: '$CI_COMMIT_BRANCH == "main" && $APPROVED == "true"'
      when: manual # Require manual trigger
```
### Audit Logging
**Enable audit logs:**
- GitHub: Enterprise > Settings > Audit log
- GitLab: Admin Area > Monitoring > Audit Events
**Monitor for:**
- Secret access
- Permission changes
- Workflow modifications
- Deployment approvals
---
## Secure Pipeline Patterns
### Isolate Untrusted Code
**Separate test from deploy:**
```yaml
test:
# Runs on PRs from forks
permissions:
contents: read
pull-requests: write
deploy:
if: github.event_name == 'push' # Not on PR
permissions:
contents: read
id-token: write
```
**GitLab fork protection:**
```yaml
deploy:
rules:
# Both conditions in one rule; separate `if` entries would be ORed
- if: '$CI_PROJECT_PATH == "myorg/myrepo" && $CI_COMMIT_BRANCH == "main"'
```
### Sanitize Inputs
**Avoid command injection:**
```yaml
# Bad - command injection risk
- run: echo "Title: ${{ github.event.issue.title }}"
# Good - use environment variable
- env:
TITLE: ${{ github.event.issue.title }}
run: echo "Title: $TITLE"
```
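The difference between the two forms can be reproduced outside CI. The sketch below assumes a POSIX `sh` is available and uses a harmless payload; `run_inline` imitates inline expression expansion by pasting the value into the script text, while `run_with_env` passes it as data. Both function names are hypothetical.

```python
import os
import subprocess


def run_inline(title: str) -> str:
    """Unsafe: the title becomes part of the script text, like inline expression expansion."""
    script = f'echo "Title: {title}"'
    return subprocess.run(["sh", "-c", script], capture_output=True, text=True).stdout


def run_with_env(title: str) -> str:
    """Safer: the script text is fixed and the title only arrives as data in $TITLE."""
    env = dict(os.environ, TITLE=title)
    return subprocess.run(
        ["sh", "-c", 'echo "Title: $TITLE"'], env=env, capture_output=True, text=True
    ).stdout


malicious = 'x"; echo INJECTED; echo "'
print(run_inline(malicious))    # the shell runs the injected echo
print(run_with_env(malicious))  # the title is printed verbatim
```

In the first call the closing quote in the title ends the string early, so the shell executes whatever follows. In the second, the shell never parses the title as code.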
**Validate inputs:**
```yaml
- name: Validate version
env:
VERSION: ${{ inputs.version }} # Pass via env, not inline interpolation
run: |
if [[ ! "$VERSION" =~ ^[0-9]+\.[0-9]+\.[0-9]+$ ]]; then
echo "Invalid version format"
exit 1
fi
```
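The same allow-list check can be written in Python for scripts that receive a version as input. A minimal sketch; `is_valid_version` is a hypothetical helper, and it accepts plain `MAJOR.MINOR.PATCH` only (no pre-release or build suffixes):

```python
import re

SEMVER = re.compile(r"[0-9]+\.[0-9]+\.[0-9]+")


def is_valid_version(value: str) -> bool:
    """Accept only MAJOR.MINOR.PATCH made of digits; fullmatch rejects any extra text."""
    return SEMVER.fullmatch(value) is not None
```

`fullmatch` is used instead of `^...$` because `$` in Python also matches just before a trailing newline, which would let `"1.2.3\n"` through.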
### Network Restrictions
**Limit egress:**
```yaml
# GitHub Actions with StepSecurity
- uses: step-security/harden-runner@v2
with:
egress-policy: block
allowed-endpoints: |
api.github.com:443
registry.npmjs.org:443
```
**GitLab network policy:**
```yaml
# Kubernetes NetworkPolicy for GitLab Runner pods
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: gitlab-runner-policy
spec:
podSelector:
matchLabels:
app: gitlab-runner
policyTypes:
- Egress
egress:
- to:
- namespaceSelector: {}
ports:
- protocol: TCP
port: 443
```
---
## Vulnerability Scanning
### Dependency Scanning
**npm audit:**
```yaml
- run: npm audit --audit-level=high
```
**Snyk:**
```yaml
- uses: snyk/actions/node@master
env:
SNYK_TOKEN: ${{ secrets.SNYK_TOKEN }}
with:
args: --severity-threshold=high
```
**GitLab Dependency Scanning:**
```yaml
include:
- template: Security/Dependency-Scanning.gitlab-ci.yml
```
### Static Application Security Testing (SAST)
**CodeQL (GitHub):**
```yaml
- uses: github/codeql-action/init@v3
with:
languages: javascript, python
- uses: github/codeql-action/autobuild@v3
- uses: github/codeql-action/analyze@v3
```
**SonarQube:**
```yaml
- uses: sonarsource/sonarqube-scan-action@master
env:
SONAR_TOKEN: ${{ secrets.SONAR_TOKEN }}
```
### Container Scanning
**Trivy:**
```yaml
- run: |
docker build -t myapp .
trivy image --severity HIGH,CRITICAL --exit-code 1 myapp
```
**Grype:**
```yaml
- uses: anchore/scan-action@v3
with:
image: myapp:latest
fail-build: true
severity-cutoff: high
```
### Dynamic Application Security Testing (DAST)
**OWASP ZAP:**
```yaml
dast:
stage: test
image: ghcr.io/zaproxy/zaproxy:stable # successor to the retired owasp/zap2docker-stable image
script:
- zap-baseline.py -t https://staging.example.com -r report.html
artifacts:
paths:
- report.html
```
---
## Security Checklist
### Repository Level
- [ ] Enable branch protection
- [ ] Require code review
- [ ] Enable secret scanning
- [ ] Configure CODEOWNERS
- [ ] Enable signed commits
- [ ] Audit third-party integrations
### Pipeline Level
- [ ] Use OIDC instead of static credentials
- [ ] Pin actions/includes to specific versions
- [ ] Minimize permissions
- [ ] Sanitize user inputs
- [ ] Enable vulnerability scanning
- [ ] Separate test from deploy workflows
- [ ] Add security gates
### Secrets Management
- [ ] Use platform secret stores
- [ ] Enable secret masking
- [ ] Rotate secrets regularly
- [ ] Use short-lived credentials
- [ ] Audit secret access
- [ ] Never log secrets
### Monitoring & Response
- [ ] Enable audit logging
- [ ] Monitor for security alerts
- [ ] Set up incident response plan
- [ ] Regular security reviews
- [ ] Dependency update automation
- [ ] Security training for team

View File

@@ -0,0 +1,656 @@
# CI/CD Troubleshooting
Comprehensive guide to diagnosing and resolving common CI/CD pipeline issues.
## Table of Contents
- [Pipeline Failures](#pipeline-failures)
- [Dependency Issues](#dependency-issues)
- [Docker & Container Problems](#docker--container-problems)
- [Authentication & Permissions](#authentication--permissions)
- [Performance Issues](#performance-issues)
- [Platform-Specific Issues](#platform-specific-issues)
---
## Pipeline Failures
### Workflow Not Triggering
**GitHub Actions:**
**Symptoms:** Workflow doesn't run on push/PR
**Common causes:**
1. Workflow file in wrong location (must be `.github/workflows/`)
2. Invalid YAML syntax
3. Branch/path filters excluding the changes
4. Workflow disabled in repository settings
**Diagnostics:**
```bash
# Validate YAML
yamllint .github/workflows/ci.yml
# Check if workflow is disabled
gh workflow list --repo owner/repo
```
**Solutions:**
```yaml
# Check trigger configuration
on:
push:
branches: [main] # Ensure your branch matches
paths-ignore:
- 'docs/**' # May be excluding your changes
# Re-enable a disabled workflow (shell command, not workflow YAML):
# gh workflow enable ci.yml --repo owner/repo
```
**GitLab CI:**
**Symptoms:** Pipeline doesn't start
**Diagnostics:**
```bash
# Validate .gitlab-ci.yml
glab ci lint
# Check CI/CD settings
# Project > Settings > CI/CD > General pipelines
```
**Solutions:**
- Check if CI/CD is enabled for the project
- Verify `.gitlab-ci.yml` is in repository root
- Check that the **Pipelines must succeed** merge setting isn't what is blocking the merge
- Review `only`/`except` or `rules` configuration
### Jobs Failing Intermittently
**Symptoms:** Same job passes sometimes, fails others
**Common causes:**
1. Flaky tests
2. Race conditions
3. Network timeouts
4. Resource constraints
5. Time-dependent tests
**Identify flaky tests:**
```yaml
# GitHub Actions - Run multiple times
strategy:
matrix:
attempt: [1, 2, 3, 4, 5]
steps:
- run: npm test
```
**Solutions:**
```javascript
// Add retries to flaky tests
jest.retryTimes(3);
// Increase timeouts
jest.setTimeout(30000);
// Fix race conditions
await waitFor(() => expect(element).toBeInTheDocument(), {
timeout: 5000
});
```
**Network retry pattern:**
```yaml
- name: Install with retry
uses: nick-fields/retry@v2 # formerly nick-invision/retry
with:
timeout_minutes: 10
max_attempts: 3
command: npm ci
```
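Where a script rather than a workflow step makes the network call, the same pattern can be applied in code. A minimal sketch of retry with exponential backoff; `retry` is a hypothetical helper, and the injectable `sleep` parameter exists only so the delays can be observed or skipped in tests:

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry(action: Callable[[], T], attempts: int = 3, base_delay: float = 1.0,
          sleep: Callable[[float], None] = time.sleep) -> T:
    """Call `action` until it succeeds, doubling the wait after each failure."""
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("attempts must be at least 1")
```

Retries hide transient network failures but also hide real ones for longer, so keeping `attempts` small and logging each failure is usually a reasonable trade-off.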
### Timeout Errors
**Symptoms:** "Job exceeded maximum time" or similar
**Solutions:**
```yaml
# GitHub Actions - Increase timeout
jobs:
build:
timeout-minutes: 60 # Raise this if a lower limit was set; the default is 360
# GitLab CI
test:
timeout: 2h # Default: 1h
```
**Optimize long-running jobs:**
- Add caching for dependencies
- Split tests into parallel jobs
- Use faster runners
- Identify and optimize slow tests
### Exit Code Errors
**Symptoms:** "Process completed with exit code 1"
**Diagnostics:**
```yaml
# Add verbose logging
- run: npm test -- --verbose
# Check specific exit codes
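- run: |
EXIT_CODE=0
# The default shell runs with `bash -e`, so capture the code without aborting the step
npm test || EXIT_CODE=$?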
echo "Exit code: $EXIT_CODE"
if [ $EXIT_CODE -eq 127 ]; then
echo "Command not found"
elif [ $EXIT_CODE -eq 1 ]; then
echo "General error"
fi
exit $EXIT_CODE
```
**Common exit codes:**
- `1`: General error
- `2`: Misuse of shell command
- `126`: Command cannot execute
- `127`: Command not found
- `130`: Interrupted by Ctrl+C (SIGINT, 128 + 2)
- `137`: Killed by SIGKILL (128 + 9), commonly the out-of-memory killer
- `143`: Terminated (SIGTERM)
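Exit codes above 128 conventionally mean the process was ended by signal number `code - 128`. A minimal sketch of decoding them, assuming a POSIX system; `describe_exit_code` is a hypothetical helper, not part of any CI platform:

```python
import signal

SHELL_CODES = {
    1: "general error",
    2: "misuse of shell builtin",
    126: "command found but not executable",
    127: "command not found",
}


def describe_exit_code(code: int) -> str:
    """Translate a process exit code into a short human-readable description."""
    if code == 0:
        return "success"
    if code in SHELL_CODES:
        return SHELL_CODES[code]
    if code > 128:
        number = code - 128
        try:
            return f"terminated by {signal.Signals(number).name}"
        except ValueError:  # no signal with that number on this platform
            return f"terminated by signal {number}"
    return "application-specific error"
```

For example, `describe_exit_code(137)` reports SIGKILL on Linux, which in CI most often points at the runner's memory limit.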
---
## Dependency Issues
### "Module not found" or "Cannot find package"
**Symptoms:** Build fails with missing dependency error
**Causes:**
1. Missing dependency in `package.json`
2. Cache corruption
3. Lock file out of sync
4. Private package access issues
**Solutions:**
```yaml
# Clear cache and reinstall
- run: rm -rf node_modules package-lock.json
- run: npm install
# Use npm ci for clean install
- run: npm ci
# Clear GitHub Actions cache
# Repository > Actions > Caches > delete the entry (or use `gh cache delete`)
# GitLab - clear cache
cache:
key: $CI_COMMIT_REF_SLUG
policy: push # Force new cache
```
### Version Conflicts
**Symptoms:** Dependency resolution errors, peer dependency warnings
**Diagnostics:**
```bash
# Check for conflicts
npm ls
npm outdated
# View dependency tree
npm list --depth=1
```
**Solutions:**
```json
// Use overrides (package.json)
{
"overrides": {
"problematic-package": "2.0.0"
}
}
// Or resolutions (Yarn)
{
"resolutions": {
"problematic-package": "2.0.0"
}
}
```
### Private Package Access
**Symptoms:** "401 Unauthorized" or "404 Not Found" for private packages
**GitHub Packages:**
```yaml
- run: |
echo "@myorg:registry=https://npm.pkg.github.com" >> .npmrc
echo "//npm.pkg.github.com/:_authToken=${{ secrets.GITHUB_TOKEN }}" >> .npmrc
- run: npm ci
```
**npm Registry:**
```yaml
- run: echo "//registry.npmjs.org/:_authToken=${{ secrets.NPM_TOKEN }}" >> .npmrc
- run: npm ci
```
**GitLab Package Registry:**
```yaml
before_script:
- echo "@mygroup:registry=${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/packages/npm/" >> .npmrc
- echo "${CI_API_V4_URL#https?}/projects/${CI_PROJECT_ID}/packages/npm/:_authToken=${CI_JOB_TOKEN}" >> .npmrc
```
---
## Docker & Container Problems
### "Cannot connect to Docker daemon"
**Symptoms:** Docker commands fail with connection error
**GitHub Actions:**
```yaml
# Ensure Docker is available
runs-on: ubuntu-latest # Has Docker pre-installed
steps:
- run: docker ps # Test Docker access
```
**GitLab CI:**
```yaml
# Use Docker-in-Docker
image: docker:latest
services:
- docker:dind
variables:
DOCKER_HOST: tcp://docker:2376
DOCKER_TLS_CERTDIR: "/certs"
DOCKER_TLS_VERIFY: 1
DOCKER_CERT_PATH: "$DOCKER_TLS_CERTDIR/client"
```
### Image Pull Errors
**Symptoms:** "Error response from daemon: pull access denied" or timeout
**Solutions:**
```yaml
# GitHub Actions - Login to registry
- uses: docker/login-action@v3
with:
registry: ghcr.io
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
# Or for Docker Hub
- uses: docker/login-action@v3
with:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
# Add retry logic
- run: |
for i in {1..3}; do
docker pull myimage:latest && break
sleep 5
done
```
### "No space left on device"
**Symptoms:** Docker build fails with disk space error
**Solutions:**
```yaml
# GitHub Actions - Clean up space
- run: docker system prune -af --volumes
# Or use a community action
- uses: jlumbroso/free-disk-space@main
with:
tool-cache: true
android: true
dotnet: true
# GitLab - configure runner (config.toml, TOML syntax)
[[runners]]
[runners.docker]
volumes = ["/var/run/docker.sock:/var/run/docker.sock", "/cache"]
[runners.docker.tmpfs]
"/tmp" = "rw,noexec"
```
### Multi-platform Build Issues
**Symptoms:** Build fails for ARM/different architecture
**Solution:**
```yaml
- uses: docker/setup-qemu-action@v3
- uses: docker/setup-buildx-action@v3
- uses: docker/build-push-action@v5
with:
platforms: linux/amd64,linux/arm64
context: .
push: false
```
---
## Authentication & Permissions
### "Permission denied" or "403 Forbidden"
**GitHub Actions:**
**Symptoms:** Cannot push, create release, or access API
**Solutions:**
```yaml
# Add necessary permissions
permissions:
contents: write # For pushing tags/releases
pull-requests: write # For commenting on PRs
packages: write # For pushing packages
id-token: write # For OIDC
# Check GITHUB_TOKEN permissions
- run: |
curl -H "Authorization: token ${{ secrets.GITHUB_TOKEN }}" \
https://api.github.com/repos/${{ github.repository }}
```
**GitLab CI:**
**Symptoms:** Cannot push to repository or access API
**Solutions:**
```yaml
# Use CI_JOB_TOKEN for API access
script:
- 'curl --header "JOB-TOKEN: $CI_JOB_TOKEN" "${CI_API_V4_URL}/projects"'
# Or use personal/project access token
variables:
GIT_STRATEGY: clone
before_script:
- git config --global user.email "ci@example.com"
- git config --global user.name "CI Bot"
```
### Git Push Failures
**Symptoms:** "failed to push some refs" or "protected branch"
**Solutions:**
```yaml
# GitHub Actions - Check branch protection
# Settings > Branches > Branch protection rules
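# Give GITHUB_TOKEN write access (this alone does not bypass branch protection)
permissions:
contents: write
# To push to a protected branch, use a PAT whose owner is allowed to bypass the rule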
- uses: actions/checkout@v4
with:
token: ${{ secrets.ADMIN_PAT }}
# GitLab - Grant permissions
# Settings > Repository > Protected Branches
# Add CI/CD role with push permission
```
### AWS Credentials Issues
**Symptoms:** "Unable to locate credentials"
**Solutions:**
```yaml
# Using OIDC (recommended)
- uses: aws-actions/configure-aws-credentials@v4
with:
role-to-assume: arn:aws:iam::123456789012:role/GitHubActionsRole
aws-region: us-east-1
# Using secrets (legacy)
- uses: aws-actions/configure-aws-credentials@v4
with:
aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
aws-region: us-east-1
# Test credentials
- run: aws sts get-caller-identity
```
---
## Performance Issues
### Slow Pipeline Execution
**Diagnostics:**
```bash
# GitHub - View timing
gh run view <run-id> --log
# Identify slow steps
# Each step shows duration in UI
```
**Solutions:**
- See [optimization.md](optimization.md) for comprehensive guide
- Add dependency caching
- Parallelize independent jobs
- Use faster runners
- Reduce test scope on PRs
### Cache Not Working
**Symptoms:** Cache always misses, builds still slow
**Diagnostics:**
```yaml
- uses: actions/cache@v4
id: cache
with:
path: node_modules
key: ${{ hashFiles('**/package-lock.json') }}
- run: echo "Cache hit: ${{ steps.cache.outputs.cache-hit }}"
```
**Common issues:**
1. Key changes every time
2. Path doesn't exist
3. Cache size exceeds limit
4. Cache evicted (LRU after 7 days on GitHub)
**Solutions:**
```yaml
# Use consistent key
key: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}
# Add restore-keys for partial match
restore-keys: |
${{ runner.os }}-node-
# Check cache size
- run: du -sh node_modules
```
---
## Platform-Specific Issues
### GitHub Actions
**"Resource not accessible by integration":**
```yaml
# Add required permission
permissions:
issues: write # Or whatever resource you're accessing
```
**"Workflow is not shared":**
- Reusable workflows must be in `.github/workflows/`
- The repository that hosts the workflow must be public, or belong to the same organization or enterprise with access granted to the caller
- Check workflow access settings
**"No runner available":**
- Self-hosted: Check runner is online and has matching labels
- GitHub-hosted: May hit concurrent job limit (check usage)
### GitLab CI
**"This job is stuck":**
- No runner available with matching tags
- All runners are busy
- Runner not configured for this project
**Solutions:**
```yaml
# Remove tags to use any available runner
job:
tags: []
# Or check runner configuration
# Settings > CI/CD > Runners
```
**"Job failed (system failure)":**
- Runner disconnected
- Resource limits exceeded
- Infrastructure issue
**Check runner logs:**
```bash
# On runner host
journalctl -u gitlab-runner -f
```
---
## Debugging Techniques
### Enable Debug Logging
**GitHub Actions:**
```yaml
# Repository > Settings > Secrets and variables > Actions > add as secrets or variables:
# ACTIONS_RUNNER_DEBUG = true
# ACTIONS_STEP_DEBUG = true
```
**GitLab CI:**
```yaml
variables:
CI_DEBUG_TRACE: "true" # Caution: May expose secrets!
```
### Interactive Debugging
**GitHub Actions:**
```yaml
# Add tmate for SSH access
- uses: mxschmitt/action-tmate@v3
if: failure()
```
**Local reproduction:**
```bash
# Use act to run GitHub Actions locally
act -j build
# List the jobs act detects before running one
act -l
```
### Reproduce Locally
```bash
# GitHub Actions - approximate the runner in an Ubuntu container (hosted runners are VMs with extra tooling preinstalled)
docker run -it ubuntu:latest bash
# Install dependencies and test
apt-get update && apt-get install -y nodejs npm
npm ci
npm test
```
---
## Prevention Strategies
### Pre-commit Checks
```yaml
# .pre-commit-config.yaml
repos:
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v4.5.0
hooks:
- id: trailing-whitespace
- id: check-yaml
- id: check-added-large-files
- repo: local
hooks:
- id: tests
name: Run tests
entry: npm test
language: system
pass_filenames: false
```
### CI/CD Health Monitoring
Use the `scripts/ci_health.py` script:
```bash
python3 scripts/ci_health.py --platform github --repo owner/repo
```
### Regular Maintenance
- [ ] Monthly: Review failed job patterns
- [ ] Monthly: Update actions/dependencies
- [ ] Quarterly: Audit pipeline efficiency
- [ ] Quarterly: Review and clean old caches
- [ ] Yearly: Major version updates
---
## Getting Help
**GitHub Actions:**
- Community Forum: https://github.community
- Documentation: https://docs.github.com/actions
- Status: https://www.githubstatus.com
**GitLab CI:**
- Forum: https://forum.gitlab.com
- Documentation: https://docs.gitlab.com/ee/ci
- Status: https://status.gitlab.com
**General CI/CD:**
- Stack Overflow: Tag [github-actions] or [gitlab-ci]
- Reddit: r/devops, r/cicd