Initial commit

Zhongwei Li
2025-11-29 17:51:12 +08:00
commit 1878d01517
21 changed files with 8728 additions and 0 deletions


@@ -0,0 +1,11 @@
{
  "name": "ci-cd",
  "description": "CI/CD pipeline design, optimization, DevSecOps security scanning, and troubleshooting. Use for creating workflows, debugging pipeline failures, implementing SAST/DAST/SCA, optimizing build performance, and securing pipelines across GitHub Actions, GitLab CI, and other platforms.",
  "version": "1.0.0",
  "author": {
    "name": "DevOps Skills"
  },
  "skills": [
    "./"
  ]
}

README.md

@@ -0,0 +1,3 @@
# ci-cd
CI/CD pipeline design, optimization, DevSecOps security scanning, and troubleshooting. Use for creating workflows, debugging pipeline failures, implementing SAST/DAST/SCA, optimizing build performance, and securing pipelines across GitHub Actions, GitLab CI, and other platforms.

SKILL.md

@@ -0,0 +1,573 @@
---
name: ci-cd
description: CI/CD pipeline design, optimization, DevSecOps security scanning, and troubleshooting. Use for creating workflows, debugging pipeline failures, implementing SAST/DAST/SCA, optimizing build performance, implementing caching strategies, setting up deployments, securing pipelines with OIDC/secrets management, and troubleshooting common issues across GitHub Actions, GitLab CI, and other platforms.
---
# CI/CD Pipelines
Comprehensive guide for CI/CD pipeline design, optimization, security, and troubleshooting across GitHub Actions, GitLab CI, and other platforms.
## When to Use This Skill
Use this skill when:
- Creating new CI/CD workflows or pipelines
- Debugging pipeline failures or flaky tests
- Optimizing slow builds or test suites
- Implementing caching strategies
- Setting up deployment workflows
- Securing pipelines (secrets, OIDC, supply chain)
- Implementing DevSecOps security scanning (SAST, DAST, SCA)
- Troubleshooting platform-specific issues
- Analyzing pipeline performance
- Implementing matrix builds or test sharding
- Configuring multi-environment deployments
## Core Workflows
### 1. Creating a New Pipeline
**Decision tree:**
```
What are you building?
├── Node.js/Frontend → GitHub: assets/templates/github-actions/node-ci.yml | GitLab: assets/templates/gitlab-ci/node-ci.yml
├── Python → GitHub: assets/templates/github-actions/python-ci.yml | GitLab: assets/templates/gitlab-ci/python-ci.yml
├── Go → GitHub: assets/templates/github-actions/go-ci.yml | GitLab: assets/templates/gitlab-ci/go-ci.yml
├── Docker Image → GitHub: assets/templates/github-actions/docker-build.yml | GitLab: assets/templates/gitlab-ci/docker-build.yml
└── Other → Follow the pipeline design pattern below
```
**Basic pipeline structure:**
```yaml
# 1. Fast feedback (lint, format) - <1 min
# 2. Unit tests - 1-5 min
# 3. Integration tests - 5-15 min
# 4. Build artifacts
# 5. E2E tests (optional, main branch only) - 15-30 min
# 6. Deploy (with approval gates)
```
**Key principles:**
- Fail fast: Run cheap validation first
- Parallelize: Remove unnecessary job dependencies
- Cache dependencies: Use `actions/cache` or GitLab cache
- Use artifacts: Build once, deploy many times
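The staged structure above can be sketched as a minimal GitHub Actions workflow; job names and the npm scripts are illustrative placeholders:

```yaml
name: CI
on: [push, pull_request]
jobs:
  lint:                 # fast feedback first (<1 min)
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm run lint
  unit-test:
    needs: lint         # only spend test minutes after cheap checks pass
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm test
  build:
    needs: unit-test    # build once; later stages reuse the artifact
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm run build
```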
See [best_practices.md](references/best_practices.md) for comprehensive pipeline design patterns.
### 2. Optimizing Pipeline Performance
**Quick wins checklist:**
- [ ] Add dependency caching (50-90% faster builds)
- [ ] Remove unnecessary `needs` dependencies
- [ ] Add path filters to skip unnecessary runs
- [ ] Use `npm ci` instead of `npm install`
- [ ] Add job timeouts to prevent hung builds
- [ ] Enable concurrency cancellation for duplicate runs
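Several of these quick wins are one-block changes at the top of a GitHub Actions workflow. A sketch with illustrative paths and timeout values:

```yaml
on:
  push:
    paths-ignore:        # skip runs for docs-only changes
      - '**.md'
      - 'docs/**'

concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true   # cancel superseded runs on the same ref

jobs:
  build:
    runs-on: ubuntu-latest
    timeout-minutes: 15      # prevent hung builds from burning minutes
```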
**Analyze existing pipeline:**
```bash
# Use the pipeline analyzer script
python3 scripts/pipeline_analyzer.py --platform github --workflow .github/workflows/ci.yml
```
**Common optimizations:**
- **Slow tests:** Shard tests with matrix builds
- **Repeated dependency installs:** Add caching
- **Sequential jobs:** Parallelize with proper `needs`
- **Full test suite on every PR:** Use path filters or test impact analysis
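For example, a slow suite can be split across parallel jobs with a matrix shard variable. A sketch assuming Jest (which supports `--shard` since v28); the shard count is illustrative:

```yaml
test:
  runs-on: ubuntu-latest
  strategy:
    fail-fast: false
    matrix:
      shard: [1, 2, 3, 4]
  steps:
    - uses: actions/checkout@v4
    - run: npm ci
    - run: npx jest --shard=${{ matrix.shard }}/4
```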
See [optimization.md](references/optimization.md) for detailed caching strategies, parallelization techniques, and performance tuning.
### 3. Securing Your Pipeline
**Essential security checklist:**
- [ ] Use OIDC instead of static credentials
- [ ] Pin actions/includes to commit SHAs
- [ ] Use minimal permissions
- [ ] Enable secret scanning
- [ ] Add vulnerability scanning (dependencies, containers)
- [ ] Implement branch protection
- [ ] Separate test from deploy workflows
**Quick setup - OIDC authentication:**
**GitHub Actions → AWS:**
```yaml
permissions:
  id-token: write
  contents: read
steps:
  - uses: aws-actions/configure-aws-credentials@v4
    with:
      role-to-assume: arn:aws:iam::123456789:role/GitHubActionsRole
      aws-region: us-east-1
```
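Pinning actions to commit SHAs (another checklist item) looks like this; the SHA below is a placeholder, not a real release commit:

```yaml
# Mutable tag - can be repointed by the action author
- uses: actions/checkout@v4
# Immutable pin - resolve the release tag to its full 40-character commit SHA
- uses: actions/checkout@<full-commit-sha> # v4.1.1
```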
**Secrets management:**
- Store in platform secret stores (GitHub Secrets, GitLab CI/CD Variables)
- Mark as "masked" in GitLab
- Use environment-specific secrets
- Rotate regularly (every 90 days)
- Never log secrets
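Environment-specific secrets in GitHub Actions resolve automatically once a job declares its environment. A minimal sketch (the secret name and deploy script are illustrative):

```yaml
jobs:
  deploy:
    runs-on: ubuntu-latest
    environment: production   # secrets below resolve from this environment's scope
    steps:
      - run: ./deploy.sh
        env:
          API_TOKEN: ${{ secrets.API_TOKEN }}
```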
See [security.md](references/security.md) for comprehensive security patterns, supply chain security, and secrets management.
### 4. Troubleshooting Pipeline Failures
**Systematic approach:**
**Step 1: Check pipeline health**
```bash
python3 scripts/ci_health.py --platform github --repo owner/repo
```
**Step 2: Identify the failure type**
| Error Pattern | Common Cause | Quick Fix |
|---------------|--------------|-----------|
| "Module not found" | Missing dependency or cache issue | Clear cache, run `npm ci` |
| "Timeout" | Job taking too long | Add caching, increase timeout |
| "Permission denied" | Missing permissions | Add to `permissions:` block |
| "Cannot connect to Docker daemon" | Docker not available | Use correct runner or DinD |
| Intermittent failures | Flaky tests or race conditions | Add retries, fix timing issues |
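For intermittent failures, a small retry wrapper with backoff can buy time while the underlying flakiness is fixed. A minimal bash sketch; the wrapped command and attempt counts are illustrative:

```shell
#!/usr/bin/env bash
# Retry a command up to N times with exponential backoff.
retry() {
  local attempts=$1; shift
  local delay=1
  local n=1
  until "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      echo "Command failed after $n attempts: $*" >&2
      return 1
    fi
    echo "Attempt $n failed; retrying in ${delay}s..." >&2
    sleep "$delay"
    n=$((n + 1))
    delay=$((delay * 2))   # exponential backoff
  done
}

# Usage (illustrative): retry 3 curl -fsS https://staging.example.com/health
```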
**Step 3: Enable debug logging**
GitHub Actions:
```yaml
# Add repository secrets:
# ACTIONS_RUNNER_DEBUG = true
# ACTIONS_STEP_DEBUG = true
```
GitLab CI:
```yaml
variables:
  # Warning: debug tracing prints all variables, including masked ones, to the job log
  CI_DEBUG_TRACE: "true"
```
**Step 4: Reproduce locally**
```bash
# GitHub Actions - use act
act -j build
# Or Docker
docker run -it ubuntu:latest bash
# Then manually run the failing steps
```
See [troubleshooting.md](references/troubleshooting.md) for comprehensive issue diagnosis, platform-specific problems, and solutions.
### 5. Implementing Deployment Workflows
**Deployment pattern selection:**
| Pattern | Use Case | Complexity | Risk |
|---------|----------|------------|------|
| Direct | Simple apps, low traffic | Low | Medium |
| Blue-Green | Zero downtime required | Medium | Low |
| Canary | Gradual rollout, monitoring | High | Very Low |
| Rolling | Kubernetes, containers | Medium | Low |
**Basic deployment structure:**
```yaml
deploy:
  needs: [build, test]
  if: github.ref == 'refs/heads/main'
  environment:
    name: production
    url: https://example.com
  steps:
    - name: Download artifacts
    - name: Deploy
    - name: Health check
    - name: Rollback on failure
```
**Multi-environment setup:**
- **Development:** Auto-deploy on develop branch
- **Staging:** Auto-deploy on main, requires passing tests
- **Production:** Manual approval required, smoke tests mandatory
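In GitLab CI, this split can be expressed with `rules` plus `when: manual` as an approval gate. A sketch with illustrative job names and deploy scripts:

```yaml
deploy:staging:
  stage: deploy
  environment: staging
  script:
    - ./deploy.sh staging     # illustrative deploy script
  rules:
    - if: '$CI_COMMIT_BRANCH == "main"'

deploy:production:
  stage: deploy
  environment: production
  script:
    - ./deploy.sh production
  rules:
    - if: '$CI_COMMIT_BRANCH == "main"'
      when: manual            # manual approval gate
```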
See [best_practices.md](references/best_practices.md#deployment-strategies) for detailed deployment patterns and environment management.
### 6. Implementing DevSecOps Security Scanning
**Security scanning types:**
| Scan Type | Purpose | When to Run | Speed | Tools |
|-----------|---------|-------------|-------|-------|
| Secret Scanning | Find exposed credentials | Every commit | Fast (<1 min) | TruffleHog, Gitleaks |
| SAST | Find code vulnerabilities | Every commit | Medium (5-15 min) | CodeQL, Semgrep, Bandit, Gosec |
| SCA | Find dependency vulnerabilities | Every commit | Fast (1-5 min) | npm audit, pip-audit, Snyk |
| Container Scanning | Find image vulnerabilities | After build | Medium (5-10 min) | Trivy, Grype |
| DAST | Find runtime vulnerabilities | Scheduled/main only | Slow (15-60 min) | OWASP ZAP |
**Quick setup - Add security to existing pipeline:**
**GitHub Actions:**
```yaml
jobs:
  # Add before the build job
  secret-scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - uses: trufflesecurity/trufflehog@main
      - uses: gitleaks/gitleaks-action@v2

  sast:
    runs-on: ubuntu-latest
    permissions:
      contents: read          # declaring permissions drops defaults; checkout needs this
      security-events: write
    steps:
      - uses: actions/checkout@v4
      - uses: github/codeql-action/init@v3
        with:
          languages: javascript # or python, go
      - uses: github/codeql-action/analyze@v3

  build:
    needs: [secret-scan, sast] # Add dependencies
```
**GitLab CI:**
```yaml
stages:
  - security # Add before other stages
  - build
  - test

# Secret scanning
secret-scan:
  stage: security
  image: trufflesecurity/trufflehog:latest
  script:
    - trufflehog filesystem . --json --fail

# SAST
sast:semgrep:
  stage: security
  image: returntocorp/semgrep
  script:
    - semgrep scan --config=auto .

# Or use GitLab's built-in templates
include:
  - template: Security/SAST.gitlab-ci.yml
  - template: Security/Dependency-Scanning.gitlab-ci.yml
```
**Comprehensive security pipeline templates:**
- **GitHub Actions:** `assets/templates/github-actions/security-scan.yml` - Complete DevSecOps pipeline with all scanning stages
- **GitLab CI:** `assets/templates/gitlab-ci/security-scan.yml` - Complete DevSecOps pipeline with GitLab security templates
**Security gate pattern:**
Add a security gate job that evaluates all security scan results and fails the pipeline if critical issues are found:
```yaml
security-gate:
  needs: [secret-scan, sast, sca, container-scan]
  script:
    # Check for critical vulnerabilities:
    # parse the JSON reports, evaluate thresholds,
    # and fail if critical issues are found
```
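A concrete gate might parse a Trivy JSON report with `jq`. A hedged sketch for GitHub Actions, assuming the container-scan job uploaded an artifact named `container-scan-report` containing `trivy-report.json` (both names and the report layout are assumptions):

```yaml
security-gate:
  runs-on: ubuntu-latest
  needs: [secret-scan, sast, sca, container-scan]
  steps:
    - uses: actions/download-artifact@v4
      with:
        name: container-scan-report
    - name: Fail on critical vulnerabilities
      run: |
        CRITICAL=$(jq '[.Results[]?.Vulnerabilities[]? | select(.Severity == "CRITICAL")] | length' trivy-report.json)
        echo "Critical vulnerabilities: $CRITICAL"
        test "$CRITICAL" -eq 0
```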
**Language-specific security tools:**
- **Node.js:** CodeQL, Semgrep, npm audit, eslint-plugin-security
- **Python:** CodeQL, Semgrep, Bandit, pip-audit, Safety
- **Go:** CodeQL, Semgrep, Gosec, govulncheck
All language-specific templates now include security scanning stages. See:
- `templates/github-actions/node-ci.yml`
- `templates/github-actions/python-ci.yml`
- `templates/github-actions/go-ci.yml`
- `templates/gitlab-ci/node-ci.yml`
- `templates/gitlab-ci/python-ci.yml`
- `templates/gitlab-ci/go-ci.yml`
See [devsecops.md](references/devsecops.md) for comprehensive DevSecOps guide covering all security scanning types, tool comparisons, and implementation patterns.
## Quick Reference Commands
### GitHub Actions
```bash
# List workflows
gh workflow list
# View recent runs
gh run list --limit 20
# View specific run
gh run view <run-id>
# Re-run failed jobs
gh run rerun <run-id> --failed
# Download logs
gh run view <run-id> --log > logs.txt
# Trigger workflow manually
gh workflow run ci.yml
# Check workflow status
gh run watch
```
### GitLab CI
```bash
# Using the glab CLI (subcommand availability varies by glab version)
# List recent pipelines
glab ci list
# View pipeline status for the current branch
glab ci view
# Retry failed jobs
glab ci retry
# Follow a job's log
glab ci trace
# Download artifacts
glab ci artifact <ref> <job-name>
```
## Platform-Specific Patterns
### GitHub Actions
**Reusable workflows:**
```yaml
# .github/workflows/reusable-test.yml
on:
  workflow_call:
    inputs:
      node-version:
        required: true
        type: string
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/setup-node@v4
        with:
          node-version: ${{ inputs.node-version }}
```
**Call from another workflow:**
```yaml
jobs:
  test:
    uses: ./.github/workflows/reusable-test.yml
    with:
      node-version: '20'
```
### GitLab CI
**Templates with extends:**
```yaml
.test_template:
  image: node:20
  before_script:
    - npm ci

unit-test:
  extends: .test_template
  script:
    - npm run test:unit

integration-test:
  extends: .test_template
  script:
    - npm run test:integration
```
**DAG pipelines with needs:**
```yaml
build:
  stage: build

test:unit:
  stage: test
  needs: [build]

test:integration:
  stage: test
  needs: [build]

deploy:
  stage: deploy
  needs: [test:unit, test:integration]
```
## Diagnostic Scripts
### Pipeline Analyzer
Analyzes workflow configuration for optimization opportunities:
```bash
# GitHub Actions
python3 scripts/pipeline_analyzer.py --platform github --workflow .github/workflows/ci.yml
# GitLab CI
python3 scripts/pipeline_analyzer.py --platform gitlab --config .gitlab-ci.yml
```
**Identifies:**
- Missing caching opportunities
- Unnecessary sequential execution
- Outdated action versions
- Unused artifacts
- Overly broad triggers
### CI Health Checker
Checks pipeline status and identifies issues:
```bash
# GitHub Actions
python3 scripts/ci_health.py --platform github --repo owner/repo --limit 20
# GitLab CI
python3 scripts/ci_health.py --platform gitlab --project-id 12345 --token $GITLAB_TOKEN
```
**Provides:**
- Success/failure rates
- Recent failure patterns
- Workflow-specific insights
- Actionable recommendations
## Reference Documentation
For deep-dive information on specific topics:
- **[best_practices.md](references/best_practices.md)** - Pipeline design, testing strategies, deployment patterns, dependency management, artifact handling, platform-specific patterns
- **[security.md](references/security.md)** - Secrets management, OIDC authentication, supply chain security, access control, vulnerability scanning, secure pipeline patterns
- **[devsecops.md](references/devsecops.md)** - Comprehensive DevSecOps guide: SAST (CodeQL, Semgrep, Bandit, Gosec), DAST (OWASP ZAP), SCA (npm audit, pip-audit, Snyk), container security (Trivy, Grype, SBOM), secret scanning (TruffleHog, Gitleaks), security gates, license compliance
- **[optimization.md](references/optimization.md)** - Caching strategies (dependencies, Docker layers, build artifacts), parallelization techniques, test splitting, build optimization, resource management
- **[troubleshooting.md](references/troubleshooting.md)** - Common issues (workflow not triggering, flaky tests, timeouts, dependency errors), Docker problems, authentication issues, platform-specific debugging
## Templates
Starter templates for common use cases:
### GitHub Actions
- **`assets/templates/github-actions/node-ci.yml`** - Complete Node.js CI/CD with security scanning, caching, matrix testing, and multi-environment deployment
- **`assets/templates/github-actions/python-ci.yml`** - Python pipeline with security scanning, pytest, coverage, PyPI deployment
- **`assets/templates/github-actions/go-ci.yml`** - Go pipeline with security scanning, multi-platform builds, benchmarks, integration tests
- **`assets/templates/github-actions/docker-build.yml`** - Docker build with multi-platform support, security scanning, SBOM generation, and signing
- **`assets/templates/github-actions/security-scan.yml`** - Comprehensive DevSecOps pipeline with SAST, DAST, SCA, container scanning, and security gates
### GitLab CI
- **`assets/templates/gitlab-ci/node-ci.yml`** - GitLab CI pipeline with security scanning, parallel execution, services, and deployment stages
- **`assets/templates/gitlab-ci/python-ci.yml`** - Python pipeline with security scanning, parallel testing, Docker builds, PyPI and Cloud Run deployment
- **`assets/templates/gitlab-ci/go-ci.yml`** - Go pipeline with security scanning, multi-platform builds, benchmarks, Kubernetes deployment
- **`assets/templates/gitlab-ci/docker-build.yml`** - Docker build with DinD, multi-arch, Container Registry, security scanning
- **`assets/templates/gitlab-ci/security-scan.yml`** - Comprehensive DevSecOps pipeline with SAST, DAST, SCA, container scanning, GitLab security templates, and security gates
## Common Patterns
### Caching Dependencies
**GitHub Actions:**
```yaml
- uses: actions/cache@v4
  with:
    path: ~/.npm
    key: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}
    restore-keys: |
      ${{ runner.os }}-node-
- run: npm ci
```
**GitLab CI:**
```yaml
cache:
  key:
    files:
      - package-lock.json
  paths:
    - node_modules/
```
### Matrix Builds
**GitHub Actions:**
```yaml
strategy:
  fail-fast: false
  matrix:
    os: [ubuntu-latest, macos-latest]
    node: [18, 20, 22]
```
**GitLab CI:**
```yaml
test:
  parallel:
    matrix:
      - NODE_VERSION: ['18', '20', '22']
```
### Conditional Execution
**GitHub Actions:**
```yaml
- name: Deploy
  if: github.ref == 'refs/heads/main' && github.event_name == 'push'
```
**GitLab CI:**
```yaml
deploy:
  rules:
    - if: '$CI_COMMIT_BRANCH == "main"'
      when: manual
```
## Best Practices Summary
**Performance:**
- Enable dependency caching
- Parallelize independent jobs
- Add path filters to reduce unnecessary runs
- Use matrix builds for cross-platform testing
**Security:**
- Use OIDC for cloud authentication
- Pin actions to commit SHAs
- Enable secret scanning and vulnerability checks
- Apply principle of least privilege
**Reliability:**
- Add timeouts to prevent hung jobs
- Implement retry logic for flaky operations
- Use health checks after deployments
- Enable concurrency cancellation
**Maintainability:**
- Use reusable workflows/templates
- Document non-obvious decisions
- Keep workflows DRY with extends/includes
- Regular dependency updates
## Getting Started
1. **New pipeline:** Start with a template from `assets/templates/`
2. **Add security scanning:** Use DevSecOps templates or add security stages to existing pipelines (see workflow 6 above)
3. **Optimize existing:** Run `scripts/pipeline_analyzer.py`
4. **Debug issues:** Check `references/troubleshooting.md`
5. **Improve security:** Review `references/security.md` and `references/devsecops.md` checklists
6. **Speed up builds:** See `references/optimization.md`


@@ -0,0 +1,164 @@
# Docker Build & Push Pipeline
# Multi-platform build with caching and security scanning
name: Docker Build

on:
  push:
    branches: [main]
    tags: ['v*']
  pull_request:
    branches: [main]

env:
  REGISTRY: ghcr.io
  IMAGE_NAME: ${{ github.repository }}

jobs:
  build:
    name: Build & Push Docker Image
    runs-on: ubuntu-latest
    timeout-minutes: 30
    outputs:
      digest: ${{ steps.build.outputs.digest }} # consumed by the sign job
    permissions:
      contents: read
      packages: write
      security-events: write # For uploading SARIF
    steps:
      - uses: actions/checkout@v4
      - name: Set up QEMU
        uses: docker/setup-qemu-action@v3
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3
      - name: Log in to Container Registry
        if: github.event_name != 'pull_request'
        uses: docker/login-action@v3
        with:
          registry: ${{ env.REGISTRY }}
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - name: Extract metadata
        id: meta
        uses: docker/metadata-action@v5
        with:
          images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
          tags: |
            type=ref,event=branch
            type=ref,event=pr
            type=semver,pattern={{version}}
            type=semver,pattern={{major}}.{{minor}}
            type=sha,prefix={{branch}}-
      - name: Build and push Docker image
        id: build
        uses: docker/build-push-action@v5
        with:
          context: .
          platforms: linux/amd64,linux/arm64
          push: ${{ github.event_name != 'pull_request' }}
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          cache-from: type=gha
          cache-to: type=gha,mode=max
      - name: Run Trivy vulnerability scanner
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ steps.meta.outputs.version }}
          format: 'sarif'
          output: 'trivy-results.sarif'
          severity: 'CRITICAL,HIGH'
      - name: Upload Trivy results to GitHub Security
        if: always()
        uses: github/codeql-action/upload-sarif@v3
        with:
          sarif_file: 'trivy-results.sarif'
      - name: Run Grype vulnerability scanner
        uses: anchore/scan-action@v3
        with:
          image: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ steps.meta.outputs.version }}
          fail-build: true
          severity-cutoff: high
      - name: Generate SBOM
        uses: anchore/sbom-action@v0
        with:
          image: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ steps.meta.outputs.version }}
          format: spdx-json
          output-file: sbom.spdx.json
      - name: Upload SBOM
        uses: actions/upload-artifact@v4
        with:
          name: sbom
          path: sbom.spdx.json

  sign:
    name: Sign Container Image
    runs-on: ubuntu-latest
    needs: build
    if: github.event_name != 'pull_request'
    permissions:
      contents: read
      packages: write
      id-token: write
    steps:
      - uses: actions/checkout@v4
      - name: Install Cosign
        uses: sigstore/cosign-installer@v3
      - name: Log in to Container Registry
        uses: docker/login-action@v3
        with:
          registry: ${{ env.REGISTRY }}
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - name: Sign the images
        env:
          DIGEST: ${{ needs.build.outputs.digest }}
        run: |
          cosign sign --yes ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}@${DIGEST}

  deploy:
    name: Deploy to Kubernetes
    runs-on: ubuntu-latest
    needs: [build, sign]
    if: github.ref == 'refs/heads/main'
    environment:
      name: production
      url: https://example.com
    steps:
      - uses: actions/checkout@v4
      - name: Configure kubectl
        uses: azure/k8s-set-context@v3
        with:
          method: kubeconfig
          kubeconfig: ${{ secrets.KUBE_CONFIG }}
      - name: Deploy to Kubernetes
        run: |
          kubectl set image deployment/myapp \
            myapp=${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.sha }}
      - name: Verify deployment
        run: |
          kubectl rollout status deployment/myapp --timeout=5m
      - name: Run smoke tests
        run: |
          POD=$(kubectl get pod -l app=myapp -o jsonpath="{.items[0].metadata.name}")
          kubectl exec $POD -- curl -f http://localhost:8080/health


@@ -0,0 +1,420 @@
# Go CI/CD Pipeline
# Optimized with caching, matrix testing, and deployment
name: Go CI

on:
  push:
    branches: [main, develop]
    paths-ignore:
      - '**.md'
      - 'docs/**'
  pull_request:
    branches: [main]

concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

env:
  GO_VERSION: '1.22'

jobs:
  # Security: Secret Scanning
  secret-scan:
    name: Secret Scanning
    runs-on: ubuntu-latest
    timeout-minutes: 5
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - name: TruffleHog Secret Scan
        uses: trufflesecurity/trufflehog@main
        with:
          path: ./
          base: ${{ github.event.repository.default_branch }}
          head: HEAD
      - name: Gitleaks
        uses: gitleaks/gitleaks-action@v2
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

  # Security: SAST
  sast:
    name: Static Analysis (CodeQL)
    runs-on: ubuntu-latest
    timeout-minutes: 15
    permissions:
      contents: read
      security-events: write
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-go@v5
        with:
          go-version: ${{ env.GO_VERSION }}
      - name: Initialize CodeQL
        uses: github/codeql-action/init@v3
        with:
          languages: go
          queries: security-and-quality
      - name: Perform CodeQL Analysis
        uses: github/codeql-action/analyze@v3

  lint:
    name: Lint
    runs-on: ubuntu-latest
    needs: [secret-scan]
    timeout-minutes: 10
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-go@v5
        with:
          go-version: ${{ env.GO_VERSION }}
          cache: true
      - name: Run golangci-lint
        uses: golangci/golangci-lint-action@v4
        with:
          version: latest
          args: --timeout=5m
      - name: Check formatting
        run: |
          if [ "$(gofmt -s -l . | wc -l)" -gt 0 ]; then
            echo "Please run: gofmt -s -w ."
            gofmt -s -l .
            exit 1
          fi
      - name: Check go mod tidy
        run: |
          go mod tidy
          git diff --exit-code go.mod go.sum

  test:
    name: Test (Go ${{ matrix.go-version }}, ${{ matrix.os }})
    runs-on: ${{ matrix.os }}
    timeout-minutes: 20
    strategy:
      matrix:
        go-version: ['1.21', '1.22']
        os: [ubuntu-latest, macos-latest]
      fail-fast: false
    services:
      postgres:
        image: postgres:15
        env:
          POSTGRES_PASSWORD: postgres
          POSTGRES_DB: testdb
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
        ports:
          - 5432:5432
      redis:
        image: redis:7-alpine
        options: >-
          --health-cmd "redis-cli ping"
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
        ports:
          - 6379:6379
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-go@v5
        with:
          go-version: ${{ matrix.go-version }}
          cache: true
      - name: Download dependencies
        run: go mod download
      - name: Run unit tests
        env:
          DATABASE_URL: postgresql://postgres:postgres@localhost:5432/testdb?sslmode=disable
          REDIS_URL: redis://localhost:6379
        run: |
          go test -v -race -coverprofile=coverage.out -covermode=atomic ./...
      - name: Upload coverage to Codecov
        if: matrix.go-version == '1.22' && matrix.os == 'ubuntu-latest'
        uses: codecov/codecov-action@v4
        with:
          files: ./coverage.out
          fail_ci_if_error: false
      - name: Run benchmarks
        if: matrix.go-version == '1.22' && matrix.os == 'ubuntu-latest'
        run: go test -bench=. -benchmem ./... | tee benchmark.txt
      - name: Upload benchmark results
        if: matrix.go-version == '1.22' && matrix.os == 'ubuntu-latest'
        uses: actions/upload-artifact@v4
        with:
          name: benchmark-results
          path: benchmark.txt

  security:
    name: Security Scanning
    runs-on: ubuntu-latest
    timeout-minutes: 10
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-go@v5
        with:
          go-version: ${{ env.GO_VERSION }}
          cache: true
      - name: Run Gosec
        uses: securego/gosec@master
        with:
          args: '-fmt json -out gosec-report.json ./...'
        continue-on-error: true
      - name: Run govulncheck
        run: |
          go install golang.org/x/vuln/cmd/govulncheck@latest
          govulncheck ./...
      - name: Upload security reports
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: security-reports
          path: gosec-report.json

  build:
    name: Build
    runs-on: ubuntu-latest
    needs: [lint, test, sast, security]
    timeout-minutes: 15
    strategy:
      matrix:
        goos: [linux, darwin, windows]
        goarch: [amd64, arm64]
        exclude:
          - goos: windows
            goarch: arm64
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-go@v5
        with:
          go-version: ${{ env.GO_VERSION }}
          cache: true
      - name: Build binary
        env:
          GOOS: ${{ matrix.goos }}
          GOARCH: ${{ matrix.goarch }}
          CGO_ENABLED: 0
        run: |
          OUTPUT="myapp-${{ matrix.goos }}-${{ matrix.goarch }}"
          if [ "${{ matrix.goos }}" = "windows" ]; then
            OUTPUT="${OUTPUT}.exe"
          fi
          go build \
            -ldflags="-s -w -X main.version=${{ github.sha }} -X main.buildTime=$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
            -o $OUTPUT \
            ./cmd/myapp
          ls -lh $OUTPUT
      - name: Upload binary
        uses: actions/upload-artifact@v4
        with:
          name: myapp-${{ matrix.goos }}-${{ matrix.goarch }}
          path: myapp-*
          retention-days: 7

  integration-test:
    name: Integration Tests
    runs-on: ubuntu-latest
    needs: build
    if: github.ref == 'refs/heads/main' || github.event_name == 'pull_request'
    timeout-minutes: 30
    services:
      postgres:
        image: postgres:15
        env:
          POSTGRES_PASSWORD: postgres
          POSTGRES_DB: testdb
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
        ports:
          - 5432:5432
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-go@v5
        with:
          go-version: ${{ env.GO_VERSION }}
          cache: true
      - name: Download Linux binary
        uses: actions/download-artifact@v4
        with:
          name: myapp-linux-amd64
      - name: Make binary executable
        run: chmod +x myapp-linux-amd64
      - name: Run integration tests
        env:
          DATABASE_URL: postgresql://postgres:postgres@localhost:5432/testdb?sslmode=disable
          BINARY_PATH: ./myapp-linux-amd64
        run: go test -v -tags=integration ./tests/integration/...

  docker:
    name: Build Docker Image
    runs-on: ubuntu-latest
    needs: [build, test]
    if: github.ref == 'refs/heads/main' || startsWith(github.ref, 'refs/tags/v')
    permissions:
      contents: read
      packages: write
    steps:
      - uses: actions/checkout@v4
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3
      - name: Log in to GitHub Container Registry
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - name: Extract metadata
        id: meta
        uses: docker/metadata-action@v5
        with:
          images: ghcr.io/${{ github.repository }}
          tags: |
            type=ref,event=branch
            type=semver,pattern={{version}}
            type=semver,pattern={{major}}.{{minor}}
            type=sha
      - name: Build and push
        uses: docker/build-push-action@v5
        with:
          context: .
          platforms: linux/amd64,linux/arm64
          push: true
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          cache-from: type=gha
          cache-to: type=gha,mode=max
          build-args: |
            VERSION=${{ github.sha }}
            BUILD_TIME=${{ github.event.head_commit.timestamp }}

  deploy:
    name: Deploy to Production
    runs-on: ubuntu-latest
    needs: [docker, integration-test]
    if: github.ref == 'refs/heads/main'
    environment:
      name: production
      url: https://api.example.com
    permissions:
      contents: read
      id-token: write
    steps:
      - uses: actions/checkout@v4
      - uses: google-github-actions/auth@v2
        with:
          workload_identity_provider: ${{ secrets.GCP_WORKLOAD_IDENTITY_PROVIDER }}
          service_account: ${{ secrets.GCP_SERVICE_ACCOUNT }}
      - name: Deploy to Cloud Run
        run: |
          gcloud run deploy myapp \
            --image ghcr.io/${{ github.repository }}:${{ github.sha }} \
            --region us-central1 \
            --platform managed \
            --allow-unauthenticated \
            --memory 512Mi \
            --cpu 1 \
            --max-instances 10
      - name: Health check
        run: |
          URL=$(gcloud run services describe myapp --region us-central1 --format 'value(status.url)')
          for i in {1..10}; do
            if curl -f $URL/health; then
              echo "Health check passed"
              exit 0
            fi
            echo "Attempt $i failed, retrying..."
            sleep 10
          done
          exit 1

  release:
    name: Create Release
    runs-on: ubuntu-latest
    needs: [build]
    if: startsWith(github.ref, 'refs/tags/v')
    permissions:
      contents: write
    steps:
      - uses: actions/checkout@v4
      - name: Download all artifacts
        uses: actions/download-artifact@v4
        with:
          path: artifacts/
      - name: Create checksums
        run: |
          cd artifacts
          for dir in myapp-*; do
            cd $dir
            sha256sum * > checksums.txt
            cd ..
          done
      - name: Create GitHub Release
        uses: softprops/action-gh-release@v1
        with:
          files: artifacts/**/*
          generate_release_notes: true
          draft: false
          prerelease: false


@@ -0,0 +1,313 @@
# Node.js CI/CD Pipeline
# Optimized workflow with caching, matrix testing, and deployment
name: Node.js CI

on:
  push:
    branches: [main, develop]
    paths-ignore:
      - '**.md'
      - 'docs/**'
  pull_request:
    branches: [main]

# Cancel in-progress runs for the same workflow and ref
concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

jobs:
  # Security: Secret Scanning
  secret-scan:
    name: Secret Scanning
    runs-on: ubuntu-latest
    timeout-minutes: 5
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - name: TruffleHog Secret Scan
        uses: trufflesecurity/trufflehog@main
        with:
          path: ./
          base: ${{ github.event.repository.default_branch }}
          head: HEAD
      - name: Gitleaks
        uses: gitleaks/gitleaks-action@v2
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

  # Security: SAST
  sast:
    name: Static Analysis
    runs-on: ubuntu-latest
    timeout-minutes: 15
    permissions:
      contents: read
      security-events: write
    steps:
      - uses: actions/checkout@v4
      - name: Initialize CodeQL
        uses: github/codeql-action/init@v3
        with:
          languages: javascript
          queries: security-and-quality
      - name: Perform CodeQL Analysis
        uses: github/codeql-action/analyze@v3
      - name: Run Semgrep
        uses: returntocorp/semgrep-action@v1
        with:
          config: >-
            p/security-audit
            p/owasp-top-ten

  # Security: Dependency Scanning
  dependency-scan:
    name: Dependency Security
    runs-on: ubuntu-latest
    timeout-minutes: 10
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: 'npm'
      - name: Install dependencies
        run: npm ci
      - name: npm audit
        run: |
          npm audit --audit-level=moderate --json > npm-audit.json || true
          npm audit --audit-level=high
        continue-on-error: false
      - name: Upload audit results
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: npm-audit-report
          path: npm-audit.json

  lint:
    name: Lint
    runs-on: ubuntu-latest
    needs: [secret-scan]
    timeout-minutes: 10
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: 'npm'
      - name: Install dependencies
        run: npm ci
      - name: Run linter
        run: npm run lint
      - name: Check formatting
        run: npm run format:check

  test:
    name: Test (Node ${{ matrix.node }})
    runs-on: ubuntu-latest
    timeout-minutes: 20
    strategy:
      matrix:
        node: [18, 20, 22]
      fail-fast: false
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: ${{ matrix.node }}
          cache: 'npm'
      - name: Install dependencies
        run: npm ci
      - name: Run unit tests
        run: npm run test:unit
      - name: Run integration tests
        run: npm run test:integration
        if: matrix.node == 20 # Only run on one version
      - name: Upload coverage
        uses: codecov/codecov-action@v4
        if: matrix.node == 20
        with:
          files: ./coverage/lcov.info
          fail_ci_if_error: false

  build:
    name: Build
    runs-on: ubuntu-latest
    needs: [lint, test, sast, dependency-scan]
    timeout-minutes: 15
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: 'npm'
      - name: Install dependencies
        run: npm ci
      - name: Build application
        run: npm run build
      - name: Upload build artifacts
        uses: actions/upload-artifact@v4
        with:
          name: dist-${{ github.sha }}
          path: dist/
          retention-days: 7

  e2e:
    name: E2E Tests
    runs-on: ubuntu-latest
    needs: build
    if: github.ref == 'refs/heads/main'
    timeout-minutes: 30
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: 'npm'
      - name: Install dependencies
        run: npm ci
      - name: Download build artifacts
        uses: actions/download-artifact@v4
        with:
          name: dist-${{ github.sha }}
          path: dist/
      - name: Run E2E tests
        run: npm run test:e2e
      - name: Upload test results
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: e2e-results
          path: test-results/

  deploy-staging:
    name: Deploy to Staging
    runs-on: ubuntu-latest
    needs: [build, test]
    if: github.ref == 'refs/heads/develop'
    environment:
      name: staging
      url: https://staging.example.com
    permissions:
      contents: read
      id-token: write # For OIDC
    steps:
      - uses: actions/checkout@v4
      - name: Download build artifacts
        uses: actions/download-artifact@v4
        with:
          name: dist-${{ github.sha }}
          path: dist/
      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ secrets.AWS_ROLE_ARN }}
          aws-region: us-east-1
      - name: Deploy to S3
        run: |
          aws s3 sync dist/ s3://${{ secrets.STAGING_BUCKET }}
          aws cloudfront create-invalidation --distribution-id ${{ secrets.STAGING_CF_DIST }} --paths "/*"
      - name: Smoke tests
        run: |
          sleep 10
          curl -f https://staging.example.com/health || exit 1

  deploy-production:
    name: Deploy to Production
    runs-on: ubuntu-latest
    needs: [e2e]
    if: github.ref == 'refs/heads/main'
    environment:
      name: production
      url: https://example.com
    permissions:
      contents: read
      id-token: write
    steps:
      - uses: actions/checkout@v4
      - name: Download build artifacts
        uses: actions/download-artifact@v4
        with:
          name: dist-${{ github.sha }}
          path: dist/
      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ secrets.AWS_ROLE_ARN }}
          aws-region: us-east-1
      - name: Deploy to S3
        run: |
          aws s3 sync dist/ s3://${{ secrets.PRODUCTION_BUCKET }}
          aws cloudfront create-invalidation --distribution-id ${{ secrets.PRODUCTION_CF_DIST }} --paths "/*"
      - name: Health check
        run: |
          for i in {1..10}; do
if curl -f https://example.com/health; then
echo "Health check passed"
exit 0
fi
echo "Attempt $i failed, retrying..."
sleep 10
done
echo "Health check failed"
exit 1
- name: Create deployment record
run: |
echo "Deployed version: ${{ github.sha }}"
echo "Deployment time: $(date -u +%Y-%m-%dT%H:%M:%SZ)"
# Optionally create release with gh CLI:
# gh release create v${{ github.run_number }} \
# --title "Release v${{ github.run_number }}" \
# --notes "Deployed commit ${{ github.sha }}"
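The deploy jobs above poll the health endpoint with an inline retry loop. The same pattern, pulled out as a reusable shell function for local testing (a sketch — the function name, attempt counts, and the example URL are illustrative, not part of the workflow):

```shell
#!/usr/bin/env bash
# retry <max_attempts> <delay_seconds> <command...>
# Runs the command until it succeeds or attempts are exhausted,
# mirroring the health-check loops in deploy-staging / deploy-production.
retry() {
  local max=$1 delay=$2 attempt
  shift 2
  for attempt in $(seq 1 "$max"); do
    if "$@"; then
      echo "succeeded on attempt $attempt"
      return 0
    fi
    echo "attempt $attempt failed, retrying in ${delay}s..." >&2
    sleep "$delay"
  done
  echo "all $max attempts failed" >&2
  return 1
}

# Example (hypothetical endpoint):
# retry 10 10 curl -fsS https://example.com/health
```

Dropping this into a `run:` block keeps the retry policy in one place instead of repeating the loop per job.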

# Python CI/CD Pipeline
# Optimized with caching, matrix testing, and deployment
name: Python CI
on:
  push:
    branches: [main, develop]
    # Tag pushes are required for the refs/tags/v* conditions in
    # deploy-pypi and deploy-docker; without this those jobs never run
    tags: ['v*']
    paths-ignore:
      - '**.md'
      - 'docs/**'
  pull_request:
    branches: [main]
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true
jobs:
# Security: Secret Scanning
secret-scan:
name: Secret Scanning
runs-on: ubuntu-latest
timeout-minutes: 5
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
- name: TruffleHog Secret Scan
uses: trufflesecurity/trufflehog@main
with:
path: ./
base: ${{ github.event.repository.default_branch }}
head: HEAD
- name: Gitleaks
uses: gitleaks/gitleaks-action@v2
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
# Security: SAST
sast:
name: Static Analysis (CodeQL)
runs-on: ubuntu-latest
timeout-minutes: 15
permissions:
contents: read
security-events: write
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
with:
python-version: '3.11'
- name: Initialize CodeQL
uses: github/codeql-action/init@v3
with:
languages: python
queries: security-and-quality
- name: Perform CodeQL Analysis
uses: github/codeql-action/analyze@v3
lint:
name: Lint & Format Check
runs-on: ubuntu-latest
needs: [secret-scan]
timeout-minutes: 10
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
with:
python-version: '3.11'
cache: 'pip'
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install ruff black mypy isort
- name: Run ruff
run: ruff check .
- name: Check formatting with black
run: black --check .
- name: Check import sorting
run: isort --check-only .
- name: Type check with mypy
run: mypy .
continue-on-error: true # Don't fail on type errors initially
test:
name: Test (Python ${{ matrix.python-version }})
runs-on: ubuntu-latest
timeout-minutes: 20
strategy:
matrix:
python-version: ['3.9', '3.10', '3.11', '3.12']
fail-fast: false
services:
postgres:
image: postgres:15
env:
POSTGRES_PASSWORD: postgres
POSTGRES_DB: testdb
options: >-
--health-cmd pg_isready
--health-interval 10s
--health-timeout 5s
--health-retries 5
ports:
- 5432:5432
redis:
image: redis:7-alpine
options: >-
--health-cmd "redis-cli ping"
--health-interval 10s
--health-timeout 5s
--health-retries 5
ports:
- 6379:6379
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
with:
python-version: ${{ matrix.python-version }}
cache: 'pip'
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install -r requirements.txt
pip install -r requirements-dev.txt
- name: Run unit tests
env:
DATABASE_URL: postgresql://postgres:postgres@localhost:5432/testdb
REDIS_URL: redis://localhost:6379
run: |
pytest tests/unit \
--cov=src \
--cov-report=xml \
--cov-report=term \
--junitxml=junit.xml \
-v
- name: Run integration tests
if: matrix.python-version == '3.11'
env:
DATABASE_URL: postgresql://postgres:postgres@localhost:5432/testdb
REDIS_URL: redis://localhost:6379
run: |
pytest tests/integration -v
- name: Upload coverage to Codecov
if: matrix.python-version == '3.11'
uses: codecov/codecov-action@v4
with:
files: ./coverage.xml
fail_ci_if_error: false
- name: Upload test results
if: always()
uses: actions/upload-artifact@v4
with:
name: test-results-${{ matrix.python-version }}
path: junit.xml
security:
name: Security Scanning
runs-on: ubuntu-latest
timeout-minutes: 10
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
with:
python-version: '3.11'
cache: 'pip'
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install -r requirements.txt
- name: Run bandit security scan
run: |
pip install bandit
bandit -r src/ -f json -o bandit-report.json -ll || true
bandit -r src/ -ll
continue-on-error: false
      - name: Run safety check
        run: |
          pip install safety
          safety check --json > safety-report.json || true
          safety check
        continue-on-error: true
- name: pip-audit dependency scan
run: |
pip install pip-audit
pip-audit --requirement requirements.txt --format json --output pip-audit.json || true
pip-audit --requirement requirements.txt
continue-on-error: false
- name: Upload security reports
if: always()
uses: actions/upload-artifact@v4
with:
name: security-reports
path: |
bandit-report.json
safety-report.json
pip-audit.json
build:
name: Build Package
runs-on: ubuntu-latest
needs: [lint, test, sast, security]
timeout-minutes: 10
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
with:
python-version: '3.11'
cache: 'pip'
- name: Install build tools
run: |
python -m pip install --upgrade pip
pip install build wheel setuptools
- name: Build package
run: python -m build
- name: Upload distribution
uses: actions/upload-artifact@v4
with:
name: dist-${{ github.sha }}
path: dist/
retention-days: 7
e2e:
name: E2E Tests
runs-on: ubuntu-latest
needs: build
if: github.ref == 'refs/heads/main'
timeout-minutes: 30
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
with:
python-version: '3.11'
cache: 'pip'
- name: Download build artifacts
uses: actions/download-artifact@v4
with:
name: dist-${{ github.sha }}
path: dist/
- name: Install package
run: |
pip install dist/*.whl
pip install -r requirements-dev.txt
- name: Run E2E tests
run: pytest tests/e2e -v
deploy-pypi:
name: Deploy to PyPI
runs-on: ubuntu-latest
needs: [build, test]
if: startsWith(github.ref, 'refs/tags/v')
environment:
name: pypi
url: https://pypi.org/project/your-package
permissions:
id-token: write # For trusted publishing
steps:
- uses: actions/download-artifact@v4
with:
name: dist-${{ github.sha }}
path: dist/
- name: Publish to PyPI
uses: pypa/gh-action-pypi-publish@release/v1
# Uses OIDC trusted publishing - no token needed!
deploy-docker:
name: Build & Push Docker Image
runs-on: ubuntu-latest
needs: [build, test]
if: github.ref == 'refs/heads/main' || startsWith(github.ref, 'refs/tags/v')
permissions:
contents: read
packages: write
steps:
- uses: actions/checkout@v4
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Log in to GitHub Container Registry
uses: docker/login-action@v3
with:
registry: ghcr.io
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Extract metadata
id: meta
uses: docker/metadata-action@v5
with:
images: ghcr.io/${{ github.repository }}
        # prefix-less long SHA so the tag matches the ${{ github.sha }}
        # reference used by the deploy-cloud job (the default type=sha tag
        # is short and prefixed with "sha-")
        tags: |
          type=ref,event=branch
          type=semver,pattern={{version}}
          type=semver,pattern={{major}}.{{minor}}
          type=sha,format=long,prefix=
- name: Build and push
uses: docker/build-push-action@v5
with:
context: .
push: true
tags: ${{ steps.meta.outputs.tags }}
labels: ${{ steps.meta.outputs.labels }}
cache-from: type=gha
cache-to: type=gha,mode=max
deploy-cloud:
name: Deploy to Cloud Run
runs-on: ubuntu-latest
needs: deploy-docker
if: github.ref == 'refs/heads/main'
environment:
name: production
url: https://your-app.run.app
permissions:
contents: read
id-token: write
steps:
- uses: actions/checkout@v4
- uses: google-github-actions/auth@v2
with:
workload_identity_provider: ${{ secrets.GCP_WORKLOAD_IDENTITY_PROVIDER }}
service_account: ${{ secrets.GCP_SERVICE_ACCOUNT }}
- name: Deploy to Cloud Run
run: |
gcloud run deploy your-app \
--image ghcr.io/${{ github.repository }}:${{ github.sha }} \
--region us-central1 \
--platform managed \
--allow-unauthenticated
- name: Health check
run: |
URL=$(gcloud run services describe your-app --region us-central1 --format 'value(status.url)')
curl -f $URL/health || exit 1

# Complete DevSecOps Security Scanning Pipeline
# SAST, DAST, SCA, Container Scanning, Secret Scanning
name: Security Scanning
on:
push:
branches: [main, develop]
pull_request:
branches: [main]
schedule:
- cron: '0 2 * * 1' # Weekly full scan on Monday 2 AM
concurrency:
group: security-${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true
jobs:
# Stage 1: Secret Scanning
secret-scan:
name: Secret Scanning
runs-on: ubuntu-latest
timeout-minutes: 10
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0 # Full history for secret scanning
      - name: TruffleHog Secret Scan
        # On direct pushes to the default branch, base and HEAD resolve to
        # the same commit and TruffleHog errors out; restrict base/head to
        # pull_request events (or drop them for a full scan) if that bites
        uses: trufflesecurity/trufflehog@main
        with:
          path: ./
          base: ${{ github.event.repository.default_branch }}
          head: HEAD
- name: Gitleaks
uses: gitleaks/gitleaks-action@v2
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
# Stage 2: SAST (Static Application Security Testing)
sast-codeql:
name: CodeQL Analysis
runs-on: ubuntu-latest
needs: secret-scan
timeout-minutes: 30
permissions:
actions: read
contents: read
security-events: write
strategy:
fail-fast: false
matrix:
language: ['javascript', 'python']
steps:
- uses: actions/checkout@v4
- name: Initialize CodeQL
uses: github/codeql-action/init@v3
with:
languages: ${{ matrix.language }}
queries: security-and-quality
- name: Autobuild
uses: github/codeql-action/autobuild@v3
- name: Perform CodeQL Analysis
uses: github/codeql-action/analyze@v3
with:
category: "/language:${{matrix.language}}"
sast-semgrep:
name: Semgrep SAST
runs-on: ubuntu-latest
needs: secret-scan
timeout-minutes: 15
permissions:
contents: read
security-events: write
steps:
- uses: actions/checkout@v4
- name: Run Semgrep
uses: returntocorp/semgrep-action@v1
with:
config: >-
p/security-audit
p/owasp-top-ten
p/cwe-top-25
publishToken: ${{ secrets.SEMGREP_APP_TOKEN }}
# Stage 3: SCA (Software Composition Analysis)
sca-dependencies:
name: Dependency Scanning
runs-on: ubuntu-latest
needs: secret-scan
timeout-minutes: 15
permissions:
contents: read
pull-requests: write
steps:
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
with:
node-version: 20
- name: Install dependencies
run: npm ci
- name: npm audit
run: |
npm audit --audit-level=moderate --json > npm-audit.json
npm audit --audit-level=high
continue-on-error: true
- name: Dependency Review (PR only)
if: github.event_name == 'pull_request'
uses: actions/dependency-review-action@v4
with:
fail-on-severity: high
- name: Snyk Security Scan
uses: snyk/actions/node@master
env:
SNYK_TOKEN: ${{ secrets.SNYK_TOKEN }}
with:
args: --severity-threshold=high --json-file-output=snyk-report.json
continue-on-error: true
- name: Upload scan results
if: always()
uses: actions/upload-artifact@v4
with:
name: sca-reports
path: |
npm-audit.json
snyk-report.json
# Stage 4: Build Application
build:
name: Build Application
runs-on: ubuntu-latest
needs: [sast-codeql, sast-semgrep, sca-dependencies]
timeout-minutes: 20
steps:
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
with:
node-version: 20
cache: 'npm'
- run: npm ci
- run: npm run build
- uses: actions/upload-artifact@v4
with:
name: build-output
path: dist/
# Stage 5: Container Security Scanning
container-scan:
name: Container Security
runs-on: ubuntu-latest
needs: build
timeout-minutes: 20
permissions:
contents: read
security-events: write
packages: write
steps:
- uses: actions/checkout@v4
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Build Docker image
uses: docker/build-push-action@v5
with:
context: .
load: true
tags: myapp:${{ github.sha }}
cache-from: type=gha
cache-to: type=gha,mode=max
- name: Run Trivy vulnerability scanner
uses: aquasecurity/trivy-action@master
with:
image-ref: myapp:${{ github.sha }}
format: 'sarif'
output: 'trivy-results.sarif'
severity: 'CRITICAL,HIGH'
exit-code: '1'
- name: Upload Trivy results to GitHub Security
if: always()
uses: github/codeql-action/upload-sarif@v3
with:
sarif_file: 'trivy-results.sarif'
- name: Run Grype vulnerability scanner
uses: anchore/scan-action@v3
id: grype
with:
image: myapp:${{ github.sha }}
fail-build: true
severity-cutoff: high
output-format: sarif
- name: Upload Grype results
if: always()
uses: github/codeql-action/upload-sarif@v3
with:
sarif_file: ${{ steps.grype.outputs.sarif }}
- name: Generate SBOM with Syft
uses: anchore/sbom-action@v0
with:
image: myapp:${{ github.sha }}
format: spdx-json
output-file: sbom.spdx.json
- name: Upload SBOM
uses: actions/upload-artifact@v4
with:
name: sbom
path: sbom.spdx.json
# Stage 6: DAST (Dynamic Application Security Testing)
dast-baseline:
name: DAST Baseline Scan
runs-on: ubuntu-latest
needs: container-scan
if: github.ref == 'refs/heads/main' || github.event_name == 'schedule'
timeout-minutes: 30
permissions:
contents: read
issues: write
    services:
      app:
        # NOTE: service images are pulled from a registry. The image built
        # with `load: true` in container-scan exists only on that job's
        # runner, so push it (e.g. to ghcr.io) and reference that tag here.
        image: myapp:latest
        ports:
          - 8080:8080
        options: --health-cmd "curl -f http://localhost:8080/health" --health-interval 10s
steps:
- uses: actions/checkout@v4
- name: Wait for application
run: |
timeout 60 bash -c 'until curl -f http://localhost:8080/health; do sleep 2; done'
- name: OWASP ZAP Baseline Scan
uses: zaproxy/action-baseline@v0.10.0
with:
target: 'http://localhost:8080'
rules_file_name: '.zap/rules.tsv'
cmd_options: '-a'
fail_action: true
- name: Upload ZAP report
if: always()
uses: actions/upload-artifact@v4
with:
name: zap-baseline-report
path: report_html.html
dast-full-scan:
name: DAST Full Scan
runs-on: ubuntu-latest
needs: container-scan
if: github.event_name == 'schedule'
timeout-minutes: 60
steps:
- uses: actions/checkout@v4
- name: OWASP ZAP Full Scan
uses: zaproxy/action-full-scan@v0.10.0
with:
target: 'https://staging.example.com'
rules_file_name: '.zap/rules.tsv'
allow_issue_writing: false
- name: Upload ZAP report
if: always()
uses: actions/upload-artifact@v4
with:
name: zap-full-scan-report
path: report_html.html
# Stage 7: License Compliance
license-check:
name: License Compliance
runs-on: ubuntu-latest
needs: build
timeout-minutes: 10
steps:
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
with:
node-version: 20
- run: npm ci
- name: Check licenses
run: |
npx license-checker --production \
--onlyAllow "MIT;Apache-2.0;BSD-2-Clause;BSD-3-Clause;ISC;0BSD" \
--json --out license-report.json
- name: Upload license report
uses: actions/upload-artifact@v4
with:
name: license-report
path: license-report.json
# Stage 8: Security Gate
security-gate:
name: Security Quality Gate
runs-on: ubuntu-latest
needs: [sast-codeql, sast-semgrep, sca-dependencies, container-scan, license-check]
if: always()
timeout-minutes: 10
steps:
- uses: actions/checkout@v4
- name: Download all artifacts
uses: actions/download-artifact@v4
- name: Evaluate Security Posture
run: |
echo "## 🔒 Security Scan Summary" >> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY
# Check job statuses
echo "### Scan Results" >> $GITHUB_STEP_SUMMARY
echo "- ✅ Secret Scanning: Complete" >> $GITHUB_STEP_SUMMARY
echo "- ✅ SAST (CodeQL): Complete" >> $GITHUB_STEP_SUMMARY
echo "- ✅ SAST (Semgrep): Complete" >> $GITHUB_STEP_SUMMARY
echo "- ✅ SCA (Dependencies): Complete" >> $GITHUB_STEP_SUMMARY
echo "- ✅ Container Scanning: Complete" >> $GITHUB_STEP_SUMMARY
echo "- ✅ License Compliance: Complete" >> $GITHUB_STEP_SUMMARY
# Parse results and determine if we can proceed
echo "" >> $GITHUB_STEP_SUMMARY
echo "### Security Gate Status" >> $GITHUB_STEP_SUMMARY
if [ "${{ needs.sast-codeql.result }}" == "failure" ] || \
[ "${{ needs.container-scan.result }}" == "failure" ]; then
echo "❌ **Security gate FAILED** - Critical vulnerabilities found" >> $GITHUB_STEP_SUMMARY
exit 1
else
echo "✅ **Security gate PASSED** - No critical issues detected" >> $GITHUB_STEP_SUMMARY
fi
# Stage 9: Security Report
security-report:
name: Generate Security Report
runs-on: ubuntu-latest
needs: security-gate
if: always() && github.ref == 'refs/heads/main'
steps:
- uses: actions/checkout@v4
- name: Download all artifacts
uses: actions/download-artifact@v4
- name: Create unified security report
run: |
cat << EOF > security-report.md
# Security Scan Report - $(date +%Y-%m-%d)
## Summary
- **Repository:** ${{ github.repository }}
- **Branch:** ${{ github.ref_name }}
- **Commit:** ${{ github.sha }}
- **Scan Date:** $(date -u +"%Y-%m-%d %H:%M:%S UTC")
## Scans Performed
1. Secret Scanning (TruffleHog, Gitleaks)
2. SAST (CodeQL, Semgrep)
3. SCA (npm audit, Snyk)
4. Container Scanning (Trivy, Grype)
5. License Compliance
## Status
All security scans completed. See artifacts for detailed reports.
EOF
cat security-report.md >> $GITHUB_STEP_SUMMARY
- name: Upload security report
uses: actions/upload-artifact@v4
with:
name: security-report
path: security-report.md
retention-days: 90
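The security-gate job above decides pass/fail from `needs.*.result` strings inside an inline `if`. The same decision logic as a standalone function (a sketch; result strings follow the GitHub Actions `success`/`failure`/`skipped` convention, and which scans count as "required" is a policy choice):

```shell
# gate <result>... : fail if any required scan reported "failure";
# "skipped" and "success" both pass, matching the workflow's if-check.
gate() {
  local r
  for r in "$@"; do
    if [ "$r" = "failure" ]; then
      echo "security gate FAILED"
      return 1
    fi
  done
  echo "security gate PASSED"
  return 0
}

# e.g. gate "$SAST_RESULT" "$CONTAINER_SCAN_RESULT"
```

Keeping the gate in a function makes it easy to extend (e.g. also failing on `cancelled`) without rewriting the workflow conditional.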

# GitLab CI/CD Docker Build Pipeline
# Multi-platform build with DinD, security scanning, and Container Registry
stages:
- build
- scan
- sign
- deploy
variables:
DOCKER_DRIVER: overlay2
DOCKER_TLS_CERTDIR: "/certs"
# Use project's container registry
IMAGE_NAME: $CI_REGISTRY_IMAGE
IMAGE_TAG: $CI_COMMIT_SHORT_SHA
# BuildKit for better caching
DOCKER_BUILDKIT: 1
# Build multi-platform Docker image
build:
stage: build
image: docker:24-cli
services:
- docker:24-dind
before_script:
# Login to GitLab Container Registry
- echo $CI_REGISTRY_PASSWORD | docker login -u $CI_REGISTRY_USER --password-stdin $CI_REGISTRY
# Set up buildx for multi-platform builds
- docker buildx create --use --name builder || docker buildx use builder
- docker buildx inspect --bootstrap
script:
# Build and push multi-platform image
- |
docker buildx build \
--platform linux/amd64,linux/arm64 \
--cache-from type=registry,ref=$IMAGE_NAME:buildcache \
--cache-to type=registry,ref=$IMAGE_NAME:buildcache,mode=max \
--tag $IMAGE_NAME:$IMAGE_TAG \
--tag $IMAGE_NAME:latest \
--push \
--build-arg CI_COMMIT_SHA=$CI_COMMIT_SHA \
--build-arg CI_COMMIT_REF_NAME=$CI_COMMIT_REF_NAME \
.
- echo "IMAGE_FULL_NAME=$IMAGE_NAME:$IMAGE_TAG" >> build.env
artifacts:
reports:
dotenv: build.env
only:
- branches
- tags
tags:
- docker
# Alternative: Simple build without multi-arch
build:simple:
stage: build
image: docker:24-cli
services:
- docker:24-dind
before_script:
- echo $CI_REGISTRY_PASSWORD | docker login -u $CI_REGISTRY_USER --password-stdin $CI_REGISTRY
script:
# Pull previous image for layer caching
- docker pull $IMAGE_NAME:latest || true
# Build with cache
- |
docker build \
--cache-from $IMAGE_NAME:latest \
--tag $IMAGE_NAME:$IMAGE_TAG \
--tag $IMAGE_NAME:latest \
--build-arg BUILDKIT_INLINE_CACHE=1 \
.
# Push images
- docker push $IMAGE_NAME:$IMAGE_TAG
- docker push $IMAGE_NAME:latest
- echo "IMAGE_FULL_NAME=$IMAGE_NAME:$IMAGE_TAG" >> build.env
artifacts:
reports:
dotenv: build.env
only:
- branches
- tags
when: manual # Use this OR the multi-arch build above
tags:
- docker
# Trivy vulnerability scanning
trivy:scan:
stage: scan
image: aquasec/trivy:latest
needs: [build]
  variables:
    GIT_STRATEGY: none
    # Credentials so Trivy can pull the image from the GitLab registry
    TRIVY_USERNAME: $CI_REGISTRY_USER
    TRIVY_PASSWORD: $CI_REGISTRY_PASSWORD
script:
# Scan for HIGH and CRITICAL vulnerabilities
- trivy image --severity HIGH,CRITICAL --exit-code 0 --format json --output trivy-report.json $IMAGE_FULL_NAME
- trivy image --severity HIGH,CRITICAL --exit-code 1 $IMAGE_FULL_NAME
artifacts:
when: always
paths:
- trivy-report.json
expire_in: 30 days
allow_failure: false
only:
- branches
- tags
# Grype scanning (alternative/additional)
grype:scan:
stage: scan
image: anchore/grype:latest
needs: [build]
  variables:
    GIT_STRATEGY: none
    # Grype has no `registry login` subcommand; it reads credentials
    # from environment variables instead
    GRYPE_REGISTRY_AUTH_AUTHORITY: $CI_REGISTRY
    GRYPE_REGISTRY_AUTH_USERNAME: $CI_REGISTRY_USER
    GRYPE_REGISTRY_AUTH_PASSWORD: $CI_REGISTRY_PASSWORD
script:
- grype $IMAGE_FULL_NAME --fail-on high --output json --file grype-report.json
artifacts:
when: always
paths:
- grype-report.json
expire_in: 30 days
allow_failure: true
only:
- branches
- tags
# GitLab Container Scanning (uses Trivy)
container_scanning:
stage: scan
needs: [build]
variables:
CS_IMAGE: $IMAGE_FULL_NAME
GIT_STRATEGY: none
allow_failure: true
include:
- template: Security/Container-Scanning.gitlab-ci.yml
# Generate SBOM (Software Bill of Materials)
sbom:
stage: scan
image: anchore/syft:latest
needs: [build]
  variables:
    GIT_STRATEGY: none
    # Syft has no `registry login` subcommand; it reads credentials
    # from environment variables instead
    SYFT_REGISTRY_AUTH_AUTHORITY: $CI_REGISTRY
    SYFT_REGISTRY_AUTH_USERNAME: $CI_REGISTRY_USER
    SYFT_REGISTRY_AUTH_PASSWORD: $CI_REGISTRY_PASSWORD
script:
- syft $IMAGE_FULL_NAME -o spdx-json > sbom.spdx.json
- syft $IMAGE_FULL_NAME -o cyclonedx-json > sbom.cyclonedx.json
artifacts:
paths:
- sbom.spdx.json
- sbom.cyclonedx.json
expire_in: 90 days
only:
- main
- tags
# Sign container image (requires cosign setup)
sign:image:
stage: sign
image: gcr.io/projectsigstore/cosign:latest
needs: [build, trivy:scan]
variables:
GIT_STRATEGY: none
before_script:
- echo $CI_REGISTRY_PASSWORD | cosign login -u $CI_REGISTRY_USER --password-stdin $CI_REGISTRY
script:
# Sign using keyless mode with OIDC
- cosign sign --yes $IMAGE_FULL_NAME
only:
- main
- tags
when: manual # Require manual approval for signing
# Deploy to Kubernetes
deploy:staging:
stage: deploy
image: bitnami/kubectl:latest
needs: [build, trivy:scan]
environment:
name: staging
url: https://staging.example.com
on_stop: stop:staging
before_script:
- kubectl config use-context staging-cluster
script:
    # --record was removed from recent kubectl releases
    - kubectl set image deployment/myapp myapp=$IMAGE_FULL_NAME --namespace=staging
- kubectl rollout status deployment/myapp --namespace=staging --timeout=5m
# Verify deployment
- |
POD=$(kubectl get pod -n staging -l app=myapp -o jsonpath="{.items[0].metadata.name}")
kubectl exec -n staging $POD -- curl -f http://localhost:8080/health || exit 1
only:
- develop
when: manual
stop:staging:
stage: deploy
image: bitnami/kubectl:latest
environment:
name: staging
action: stop
script:
- kubectl scale deployment/myapp --replicas=0 --namespace=staging
when: manual
only:
- develop
deploy:production:
stage: deploy
image: bitnami/kubectl:latest
needs: [build, trivy:scan, sign:image]
environment:
name: production
url: https://example.com
before_script:
- kubectl config use-context production-cluster
script:
- |
echo "Deploying version $IMAGE_TAG to production"
    # --record was removed from recent kubectl releases
    - kubectl set image deployment/myapp myapp=$IMAGE_FULL_NAME --namespace=production
- kubectl rollout status deployment/myapp --namespace=production --timeout=5m
# Health check
- sleep 10
- |
for i in {1..10}; do
POD=$(kubectl get pod -n production -l app=myapp -o jsonpath="{.items[0].metadata.name}")
if kubectl exec -n production $POD -- curl -f http://localhost:8080/health; then
echo "Health check passed"
exit 0
fi
echo "Attempt $i failed, retrying..."
sleep 10
done
echo "Health check failed"
exit 1
only:
- main
when: manual # Require manual approval for production
# Create GitLab release
release:
  stage: deploy
  image: registry.gitlab.com/gitlab-org/release-cli:latest
  # deploy:production exists only in main-branch pipelines; an optional
  # need keeps tag pipelines (where this job runs) valid
  needs:
    - job: deploy:production
      optional: true
script:
- echo "Creating release for $CI_COMMIT_TAG"
release:
tag_name: $CI_COMMIT_TAG
description: |
Docker Image: $IMAGE_FULL_NAME
Changes in this release:
$CI_COMMIT_MESSAGE
only:
- tags
# Cleanup old images from Container Registry
cleanup:registry:
stage: deploy
image: alpine:latest
before_script:
- apk add --no-cache curl jq
script:
- |
# Keep last 10 images, delete older ones
echo "Cleaning up old container images..."
# Use GitLab API to manage container registry
# This is a placeholder - implement based on your retention policy
only:
- schedules
when: manual
# Workflow rules
workflow:
rules:
- if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
- if: '$CI_COMMIT_BRANCH == "main"'
- if: '$CI_COMMIT_BRANCH == "develop"'
- if: '$CI_COMMIT_TAG'
- if: '$CI_PIPELINE_SOURCE == "schedule"'
# Additional optimization: allow newer pipelines to cancel running jobs.
# Note: re-declaring a job (e.g. `build:`) with only `extends:` would
# REPLACE its earlier definition entirely, so set this via `default:`.
default:
  interruptible: true
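The cleanup:registry job above is a placeholder. Its core retention rule — keep the newest N tags, delete the rest — can be sketched as a small shell filter; the actual deletion would loop over its output and call the GitLab container registry API (tag list/delete endpoints), which is omitted here:

```shell
# tags_to_delete <keep_n>
# Reads tags (oldest first, one per line) on stdin and prints the ones
# to delete, keeping the newest <keep_n>. GNU head's negative -n prints
# all but the last N lines.
tags_to_delete() {
  local keep=$1
  head -n -"$keep"
}

# Example (list_tags_oldest_first is a hypothetical helper):
# list_tags_oldest_first | tags_to_delete 10 | while read -r tag; do
#   echo "would delete: $tag"   # replace with a registry API DELETE call
# done
```

GitLab also offers built-in cleanup policies per container repository, which are usually preferable to a hand-rolled script when they fit the retention rules.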

# GitLab CI/CD Pipeline for Go
# Optimized with caching, parallel execution, and deployment
stages:
- security
- validate
- test
- build
- deploy
variables:
GO_VERSION: "1.22"
GOPATH: "$CI_PROJECT_DIR/.go"
GOCACHE: "$CI_PROJECT_DIR/.cache/go-build"
# Global cache configuration
cache:
key:
files:
- go.mod
- go.sum
paths:
- .go/pkg/mod/
- .cache/go-build/
policy: pull
# Reusable configuration
.go_template:
image: golang:${GO_VERSION}
before_script:
- go version
- go env
- mkdir -p .go
- export PATH=$GOPATH/bin:$PATH
# Validation stage
lint:
extends: .go_template
stage: validate
cache:
key:
files:
- go.mod
- go.sum
paths:
- .go/pkg/mod/
- .cache/go-build/
policy: pull-push
script:
# Install golangci-lint
- curl -sSfL https://raw.githubusercontent.com/golangci/golangci-lint/master/install.sh | sh -s -- -b $(go env GOPATH)/bin
- golangci-lint run --timeout=5m
format-check:
extends: .go_template
stage: validate
script:
- test -z "$(gofmt -s -l .)"
- go mod tidy
- git diff --exit-code go.mod go.sum
only:
- merge_requests
- main
- develop
# Security: Secret Scanning
secret-scan:trufflehog:
stage: security
image: trufflesecurity/trufflehog:latest
script:
- trufflehog filesystem . --json --fail > trufflehog-report.json || true
- |
if [ -s trufflehog-report.json ]; then
echo "❌ Secrets detected!"
cat trufflehog-report.json
exit 1
fi
artifacts:
when: always
paths:
- trufflehog-report.json
expire_in: 30 days
allow_failure: false
only:
- merge_requests
- main
- develop
secret-scan:gitleaks:
stage: security
image: zricethezav/gitleaks:latest
script:
- gitleaks detect --source . --report-format json --report-path gitleaks-report.json
artifacts:
when: always
paths:
- gitleaks-report.json
expire_in: 30 days
allow_failure: true
only:
- merge_requests
- main
- develop
# Security: SAST
sast:semgrep:
stage: security
image: returntocorp/semgrep
script:
- semgrep scan --config=auto --sarif --output=semgrep.sarif .
- semgrep scan --config=p/owasp-top-ten --json --output=semgrep-owasp.json .
artifacts:
reports:
sast: semgrep.sarif
paths:
- semgrep.sarif
- semgrep-owasp.json
expire_in: 30 days
allow_failure: false
only:
- merge_requests
- main
- develop
security:gosec:
extends: .go_template
stage: security
script:
- go install github.com/securego/gosec/v2/cmd/gosec@latest
- gosec -fmt json -out gosec-report.json ./... || true
- gosec ./... # Fail on findings
artifacts:
when: always
paths:
- gosec-report.json
expire_in: 30 days
allow_failure: false
only:
- merge_requests
- main
- develop
security:govulncheck:
extends: .go_template
stage: security
script:
- go install golang.org/x/vuln/cmd/govulncheck@latest
- govulncheck ./...
allow_failure: false
only:
- merge_requests
- main
- develop
# Test stage with matrix
test:
extends: .go_template
stage: test
parallel:
matrix:
- GO_VERSION: ["1.21", "1.22"]
services:
- postgres:15
- redis:7-alpine
variables:
POSTGRES_DB: testdb
POSTGRES_USER: testuser
POSTGRES_PASSWORD: testpass
DATABASE_URL: "postgresql://testuser:testpass@postgres:5432/testdb?sslmode=disable"
REDIS_URL: "redis://redis:6379"
  script:
    - go mod download
    - go test -v -race -coverprofile=coverage.out -covermode=atomic ./...
    - go tool cover -func=coverage.out
    # GitLab's coverage_report expects Cobertura XML, not Go's native profile
    - go install github.com/boumenot/gocover-cobertura@latest
    - gocover-cobertura < coverage.out > coverage.xml
  coverage: '/total:.*?(\d+\.\d+)%/'
  artifacts:
    when: always
    reports:
      coverage_report:
        coverage_format: cobertura
        path: coverage.xml
    paths:
      - coverage.out
      - coverage.xml
expire_in: 30 days
benchmark:
extends: .go_template
stage: test
script:
- go test -bench=. -benchmem ./... | tee benchmark.txt
artifacts:
paths:
- benchmark.txt
expire_in: 7 days
only:
- main
- merge_requests
# Build stage - multi-platform
build:linux:amd64:
extends: .go_template
stage: build
variables:
GOOS: linux
GOARCH: amd64
CGO_ENABLED: "0"
script:
- |
go build \
-ldflags="-s -w -X main.version=$CI_COMMIT_SHORT_SHA -X main.buildTime=$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
-o myapp-linux-amd64 \
./cmd/myapp
- ls -lh myapp-linux-amd64
artifacts:
paths:
- myapp-linux-amd64
expire_in: 7 days
only:
- main
- develop
- tags
build:linux:arm64:
extends: .go_template
stage: build
variables:
GOOS: linux
GOARCH: arm64
CGO_ENABLED: "0"
script:
- |
go build \
-ldflags="-s -w -X main.version=$CI_COMMIT_SHORT_SHA -X main.buildTime=$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
-o myapp-linux-arm64 \
./cmd/myapp
- ls -lh myapp-linux-arm64
artifacts:
paths:
- myapp-linux-arm64
expire_in: 7 days
only:
- main
- develop
- tags
build:darwin:amd64:
extends: .go_template
stage: build
variables:
GOOS: darwin
GOARCH: amd64
CGO_ENABLED: "0"
script:
- |
go build \
-ldflags="-s -w -X main.version=$CI_COMMIT_SHORT_SHA -X main.buildTime=$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
-o myapp-darwin-amd64 \
./cmd/myapp
- ls -lh myapp-darwin-amd64
artifacts:
paths:
- myapp-darwin-amd64
expire_in: 7 days
only:
- tags
build:darwin:arm64:
extends: .go_template
stage: build
variables:
GOOS: darwin
GOARCH: arm64
CGO_ENABLED: "0"
script:
- |
go build \
-ldflags="-s -w -X main.version=$CI_COMMIT_SHORT_SHA -X main.buildTime=$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
-o myapp-darwin-arm64 \
./cmd/myapp
- ls -lh myapp-darwin-arm64
artifacts:
paths:
- myapp-darwin-arm64
expire_in: 7 days
only:
- tags
build:windows:amd64:
extends: .go_template
stage: build
variables:
GOOS: windows
GOARCH: amd64
CGO_ENABLED: "0"
script:
- |
go build \
-ldflags="-s -w -X main.version=$CI_COMMIT_SHORT_SHA -X main.buildTime=$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
-o myapp-windows-amd64.exe \
./cmd/myapp
- ls -lh myapp-windows-amd64.exe
artifacts:
paths:
- myapp-windows-amd64.exe
expire_in: 7 days
only:
- tags
# Build Docker image
build:docker:
stage: build
image: docker:24-cli
services:
- docker:24-dind
variables:
IMAGE_TAG: $CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA
before_script:
- echo $CI_REGISTRY_PASSWORD | docker login -u $CI_REGISTRY_USER --password-stdin $CI_REGISTRY
script:
# Multi-stage build with Go
- docker pull $CI_REGISTRY_IMAGE:latest || true
- |
docker build \
--cache-from $CI_REGISTRY_IMAGE:latest \
--tag $IMAGE_TAG \
--tag $CI_REGISTRY_IMAGE:latest \
--build-arg GO_VERSION=$GO_VERSION \
--build-arg VERSION=$CI_COMMIT_SHORT_SHA \
--build-arg BUILD_TIME=$(date -u +%Y-%m-%dT%H:%M:%SZ) \
.
- docker push $IMAGE_TAG
- docker push $CI_REGISTRY_IMAGE:latest
- echo "IMAGE_FULL_NAME=$IMAGE_TAG" >> build.env
artifacts:
reports:
dotenv: build.env
only:
- main
- develop
- tags
tags:
- docker
# Integration tests
integration-test:
extends: .go_template
stage: test
needs: [build:linux:amd64]
dependencies:
- build:linux:amd64
services:
- postgres:15
variables:
POSTGRES_DB: testdb
POSTGRES_USER: testuser
POSTGRES_PASSWORD: testpass
DATABASE_URL: "postgresql://testuser:testpass@postgres:5432/testdb?sslmode=disable"
BINARY_PATH: "./myapp-linux-amd64"
script:
- chmod +x myapp-linux-amd64
- go test -v -tags=integration ./tests/integration/...
only:
- main
- merge_requests
# Deploy to staging
deploy:staging:
stage: deploy
image: bitnami/kubectl:latest
needs: [build:docker]
environment:
name: staging
url: https://staging.example.com
on_stop: stop:staging
before_script:
- kubectl config use-context staging-cluster
script:
    - kubectl set image deployment/myapp myapp=$IMAGE_FULL_NAME --namespace=staging
- kubectl rollout status deployment/myapp --namespace=staging --timeout=5m
# Health check
- |
POD=$(kubectl get pod -n staging -l app=myapp -o jsonpath="{.items[0].metadata.name}")
kubectl exec -n staging $POD -- /myapp version
kubectl exec -n staging $POD -- curl -f http://localhost:8080/health || exit 1
only:
- develop
when: manual
stop:staging:
stage: deploy
image: bitnami/kubectl:latest
environment:
name: staging
action: stop
script:
- kubectl scale deployment/myapp --replicas=0 --namespace=staging
when: manual
only:
- develop
# Deploy to production
deploy:production:
stage: deploy
image: bitnami/kubectl:latest
needs: [build:docker, integration-test]
environment:
name: production
url: https://api.example.com
before_script:
- kubectl config use-context production-cluster
script:
- echo "Deploying to production..."
    - kubectl set image deployment/myapp myapp=$IMAGE_FULL_NAME --namespace=production
- kubectl rollout status deployment/myapp --namespace=production --timeout=5m
# Health check
- sleep 10
- |
for i in {1..10}; do
POD=$(kubectl get pod -n production -l app=myapp -o jsonpath="{.items[0].metadata.name}")
if kubectl exec -n production $POD -- curl -f http://localhost:8080/health; then
echo "Health check passed"
exit 0
fi
echo "Attempt $i failed, retrying..."
sleep 10
done
echo "Health check failed"
exit 1
only:
- main
when: manual
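# Hypothetical hardening (not in the original template): roll back automatically
# when the post-deploy health check fails, instead of leaving the bad revision
# live. $CI_JOB_STATUS is a predefined variable available in after_script:
#
#   after_script:
#     - |
#       if [ "$CI_JOB_STATUS" = "failed" ]; then
#         kubectl rollout undo deployment/myapp --namespace=production
#       fi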
# Deploy to Cloud Run
deploy:cloudrun:
stage: deploy
image: google/cloud-sdk:alpine
needs: [build:docker]
environment:
name: production
url: https://myapp.run.app
before_script:
- echo $GCP_SERVICE_KEY | base64 -d > ${HOME}/gcp-key.json
- gcloud auth activate-service-account --key-file ${HOME}/gcp-key.json
- gcloud config set project $GCP_PROJECT_ID
script:
- |
gcloud run deploy myapp \
--image $IMAGE_FULL_NAME \
--region us-central1 \
--platform managed \
--allow-unauthenticated \
--memory 512Mi \
--cpu 1 \
--max-instances 10
# Health check
- |
URL=$(gcloud run services describe myapp --region us-central1 --format 'value(status.url)')
curl -f $URL/health || exit 1
only:
- main
when: manual
# Create release
release:
stage: deploy
image: registry.gitlab.com/gitlab-org/release-cli:latest
needs:
- build:linux:amd64
- build:linux:arm64
- build:darwin:amd64
- build:darwin:arm64
- build:windows:amd64
dependencies:
- build:linux:amd64
- build:linux:arm64
- build:darwin:amd64
- build:darwin:arm64
- build:windows:amd64
before_script:
- apk add --no-cache coreutils
script:
# Create checksums
- sha256sum myapp-* > checksums.txt
- cat checksums.txt
  # The checksums asset link below points at this job's artifacts,
  # so checksums.txt must be uploaded
  artifacts:
    paths:
      - checksums.txt
    expire_in: 90 days
release:
tag_name: $CI_COMMIT_TAG
description: |
Go Binary Release
Binaries for multiple platforms:
- Linux (amd64, arm64)
- macOS (amd64, arm64/Apple Silicon)
- Windows (amd64)
Docker Image: $CI_REGISTRY_IMAGE:$CI_COMMIT_TAG
Checksums: See attached checksums.txt
assets:
links:
- name: 'Linux AMD64'
url: '${CI_PROJECT_URL}/-/jobs/artifacts/${CI_COMMIT_TAG}/raw/myapp-linux-amd64?job=build:linux:amd64'
- name: 'Linux ARM64'
url: '${CI_PROJECT_URL}/-/jobs/artifacts/${CI_COMMIT_TAG}/raw/myapp-linux-arm64?job=build:linux:arm64'
- name: 'macOS AMD64'
url: '${CI_PROJECT_URL}/-/jobs/artifacts/${CI_COMMIT_TAG}/raw/myapp-darwin-amd64?job=build:darwin:amd64'
- name: 'macOS ARM64'
url: '${CI_PROJECT_URL}/-/jobs/artifacts/${CI_COMMIT_TAG}/raw/myapp-darwin-arm64?job=build:darwin:arm64'
- name: 'Windows AMD64'
url: '${CI_PROJECT_URL}/-/jobs/artifacts/${CI_COMMIT_TAG}/raw/myapp-windows-amd64.exe?job=build:windows:amd64'
- name: 'Checksums'
url: '${CI_PROJECT_URL}/-/jobs/artifacts/${CI_COMMIT_TAG}/raw/checksums.txt?job=release'
only:
- tags
# GitLab built-in security templates
include:
- template: Security/Dependency-Scanning.gitlab-ci.yml
- template: Security/SAST.gitlab-ci.yml
# Override GitLab template stages
dependency_scanning:
stage: security
sast:
stage: security
# Workflow rules
workflow:
rules:
- if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
- if: '$CI_COMMIT_BRANCH == "main"'
- if: '$CI_COMMIT_BRANCH == "develop"'
- if: '$CI_COMMIT_TAG'
# Interruptible jobs
# NOTE: re-declaring `lint:`, `test:`, and `integration-test:` here would
# duplicate top-level YAML keys: the last definition wins, so the jobs would
# lose their stage and script. Set `interruptible: true` inside .go_template
# (or under `default:`) instead.


@@ -0,0 +1,334 @@
# GitLab CI/CD Pipeline for Node.js
# Optimized with caching, parallel execution, and deployment
stages:
- security
- validate
- test
- build
- deploy
# Global variables
variables:
NODE_VERSION: "20"
npm_config_cache: "$CI_PROJECT_DIR/.npm"
CYPRESS_CACHE_FOLDER: "$CI_PROJECT_DIR/.cache/Cypress"
# Global cache configuration
cache:
key:
files:
- package-lock.json
paths:
- node_modules/
- .npm/
- .cache/Cypress/
policy: pull
# Reusable configuration
.node_template:
  image: node:${NODE_VERSION}
  # Allow redundant pipelines to cancel these jobs
  interruptible: true
  before_script:
    - node --version
    - npm --version
    - npm ci
# Validation stage
lint:
extends: .node_template
stage: validate
cache:
key:
files:
- package-lock.json
paths:
- node_modules/
- .npm/
policy: pull-push
script:
- npm run lint
- npm run format:check
only:
- merge_requests
- main
- develop
# Test stage
unit-test:
extends: .node_template
stage: test
parallel:
matrix:
- NODE_VERSION: ["18", "20", "22"]
script:
- npm run test:unit -- --coverage
coverage: '/All files[^|]*\|[^|]*\s+([\d\.]+)/'
artifacts:
reports:
coverage_report:
coverage_format: cobertura
path: coverage/cobertura-coverage.xml
paths:
- coverage/
expire_in: 30 days
integration-test:
extends: .node_template
stage: test
services:
- postgres:15
- redis:7-alpine
variables:
POSTGRES_DB: testdb
POSTGRES_USER: testuser
POSTGRES_PASSWORD: testpass
DATABASE_URL: "postgresql://testuser:testpass@postgres:5432/testdb"
REDIS_URL: "redis://redis:6379"
script:
- npm run test:integration
artifacts:
when: always
reports:
junit: test-results/junit.xml
paths:
- test-results/
expire_in: 7 days
# Build stage
build:
extends: .node_template
stage: build
cache:
key:
files:
- package-lock.json
paths:
- node_modules/
- .npm/
policy: pull
script:
- npm run build
- echo "BUILD_VERSION=$(node -p "require('./package.json').version")" >> build.env
artifacts:
paths:
- dist/
reports:
dotenv: build.env
expire_in: 7 days
only:
- main
- develop
- tags
# Security: Secret Scanning
secret-scan:trufflehog:
stage: security
image: trufflesecurity/trufflehog:latest
script:
- trufflehog filesystem . --json --fail > trufflehog-report.json || true
- |
if [ -s trufflehog-report.json ]; then
echo "❌ Secrets detected!"
cat trufflehog-report.json
exit 1
fi
artifacts:
when: always
paths:
- trufflehog-report.json
expire_in: 30 days
allow_failure: false
only:
- merge_requests
- main
- develop
secret-scan:gitleaks:
stage: security
image: zricethezav/gitleaks:latest
script:
- gitleaks detect --source . --report-format json --report-path gitleaks-report.json
artifacts:
when: always
paths:
- gitleaks-report.json
expire_in: 30 days
allow_failure: true
only:
- merge_requests
- main
- develop
# Security: SAST
sast:semgrep:
stage: security
image: returntocorp/semgrep
script:
- semgrep scan --config=auto --sarif --output=semgrep.sarif .
- semgrep scan --config=p/owasp-top-ten --json --output=semgrep-owasp.json .
artifacts:
reports:
sast: semgrep.sarif
paths:
- semgrep.sarif
- semgrep-owasp.json
expire_in: 30 days
allow_failure: false
only:
- merge_requests
- main
- develop
sast:nodejs:
stage: security
image: node:20-alpine
script:
- npm install -g eslint eslint-plugin-security
- eslint . --plugin=security --format=json --output-file=eslint-security.json || true
artifacts:
paths:
- eslint-security.json
expire_in: 30 days
  rules:
    - exists:
        - package.json
allow_failure: true
# Security: Dependency Scanning
dependency-scan:npm:
extends: .node_template
stage: security
cache:
key:
files:
- package-lock.json
paths:
- node_modules/
- .npm/
policy: pull-push
script:
- npm audit --audit-level=moderate --json > npm-audit.json || true
- npm audit --audit-level=high # Fail on high severity
artifacts:
paths:
- npm-audit.json
expire_in: 30 days
allow_failure: false
only:
- merge_requests
- main
- develop
# GitLab built-in security templates
include:
- template: Security/SAST.gitlab-ci.yml
- template: Security/Dependency-Scanning.gitlab-ci.yml
# E2E tests (only on main)
e2e-test:
extends: .node_template
stage: test
needs: [build]
dependencies:
- build
script:
- npm run test:e2e
artifacts:
when: always
paths:
- cypress/videos/
- cypress/screenshots/
expire_in: 7 days
only:
- main
# Deploy to staging
deploy:staging:
stage: deploy
image: node:${NODE_VERSION}
needs: [build]
dependencies:
- build
environment:
name: staging
url: https://staging.example.com
on_stop: stop:staging
script:
- echo "Deploying to staging..."
    # The AWS CLI is not distributed via npm; use the official installer
    - curl -sSL "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o awscliv2.zip
    - unzip -q awscliv2.zip && ./aws/install
- aws s3 sync dist/ s3://${STAGING_BUCKET}
- aws cloudfront create-invalidation --distribution-id ${STAGING_CF_DIST} --paths "/*"
only:
- develop
when: manual
stop:staging:
stage: deploy
image: node:${NODE_VERSION}
environment:
name: staging
action: stop
script:
- echo "Stopping staging environment..."
when: manual
only:
- develop
# Deploy to production
deploy:production:
stage: deploy
image: node:${NODE_VERSION}
needs: [build, e2e-test]
dependencies:
- build
environment:
name: production
url: https://example.com
before_script:
- echo "Deploying version ${BUILD_VERSION} to production"
script:
    # The AWS CLI is not distributed via npm; use the official installer
    - curl -sSL "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o awscliv2.zip
    - unzip -q awscliv2.zip && ./aws/install
- aws s3 sync dist/ s3://${PRODUCTION_BUCKET}
- aws cloudfront create-invalidation --distribution-id ${PRODUCTION_CF_DIST} --paths "/*"
# Health check
- sleep 10
- curl -f https://example.com/health || exit 1
after_script:
- echo "Deployed successfully"
only:
- main
when: manual
# Create release
release:
stage: deploy
image: registry.gitlab.com/gitlab-org/release-cli:latest
  needs: [deploy:production, build]  # build supplies BUILD_VERSION via its dotenv report
script:
- echo "Creating release for version ${BUILD_VERSION}"
release:
tag_name: 'v${BUILD_VERSION}'
description: 'Release v${BUILD_VERSION}'
only:
- main
# Workflow rules
workflow:
rules:
- if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
- if: '$CI_COMMIT_BRANCH == "main"'
- if: '$CI_COMMIT_BRANCH == "develop"'
- if: '$CI_COMMIT_TAG'
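# Optional hardening (assumption, not in the original rules): avoid running
# duplicate branch + MR pipelines by skipping branch pipelines while an MR is
# open for that branch:
#
#   workflow:
#     rules:
#       - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
#       - if: '$CI_COMMIT_BRANCH && $CI_OPEN_MERGE_REQUESTS'
#         when: never
#       - if: '$CI_COMMIT_BRANCH == "main"'
#       - if: '$CI_COMMIT_BRANCH == "develop"'
#       - if: '$CI_COMMIT_TAG'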
# Additional optimizations
# NOTE: re-declaring `lint:`, `unit-test:`, and `integration-test:` here would
# duplicate top-level YAML keys (the last definition wins and the jobs would
# lose their stage and script). Set `interruptible: true` in .node_template
# (or under `default:`) instead.


@@ -0,0 +1,472 @@
# GitLab CI/CD Pipeline for Python
# Optimized with caching, parallel execution, and deployment
stages:
- security
- validate
- test
- build
- deploy
variables:
PIP_CACHE_DIR: "$CI_PROJECT_DIR/.cache/pip"
PYTHON_VERSION: "3.11"
# Global cache configuration
cache:
key:
files:
- requirements.txt
- requirements-dev.txt
paths:
- .cache/pip
- .venv/
policy: pull
# Reusable configuration
.python_template:
  image: python:${PYTHON_VERSION}
  # Allow redundant pipelines to cancel these jobs
  interruptible: true
  before_script:
    - python --version
    - pip install --upgrade pip
    - python -m venv .venv
    - source .venv/bin/activate
    - pip install -r requirements.txt
    - pip install -r requirements-dev.txt
# Validation stage
lint:
extends: .python_template
stage: validate
cache:
key:
files:
- requirements.txt
- requirements-dev.txt
paths:
- .cache/pip
- .venv/
policy: pull-push
script:
- ruff check .
- black --check .
- isort --check-only .
- mypy . || true # Don't fail on type errors initially
only:
- merge_requests
- main
- develop
# Security: Secret Scanning
secret-scan:trufflehog:
stage: security
image: trufflesecurity/trufflehog:latest
script:
- trufflehog filesystem . --json --fail > trufflehog-report.json || true
- |
if [ -s trufflehog-report.json ]; then
echo "❌ Secrets detected!"
cat trufflehog-report.json
exit 1
fi
artifacts:
when: always
paths:
- trufflehog-report.json
expire_in: 30 days
allow_failure: false
only:
- merge_requests
- main
- develop
secret-scan:gitleaks:
stage: security
image: zricethezav/gitleaks:latest
script:
- gitleaks detect --source . --report-format json --report-path gitleaks-report.json
artifacts:
when: always
paths:
- gitleaks-report.json
expire_in: 30 days
allow_failure: true
only:
- merge_requests
- main
- develop
# Security: SAST
sast:semgrep:
stage: security
image: returntocorp/semgrep
script:
- semgrep scan --config=auto --sarif --output=semgrep.sarif .
- semgrep scan --config=p/owasp-top-ten --json --output=semgrep-owasp.json .
artifacts:
reports:
sast: semgrep.sarif
paths:
- semgrep.sarif
- semgrep-owasp.json
expire_in: 30 days
allow_failure: false
only:
- merge_requests
- main
- develop
security:bandit:
extends: .python_template
stage: security
script:
- pip install bandit
- bandit -r src/ -f json -o bandit-report.json -ll || true
    - bandit -r src/ -ll # Fail on medium and high severity
artifacts:
when: always
paths:
- bandit-report.json
expire_in: 30 days
allow_failure: false
only:
- merge_requests
- main
- develop
security:safety:
extends: .python_template
stage: security
script:
- pip install safety
- safety check --json --output safety-report.json || true
- safety check
artifacts:
when: always
paths:
- safety-report.json
expire_in: 30 days
allow_failure: true
only:
- merge_requests
- main
- develop
# Security: Dependency Scanning
security:pip-audit:
extends: .python_template
stage: security
script:
- pip install pip-audit
- pip-audit --requirement requirements.txt --format json --output pip-audit.json || true
- pip-audit --requirement requirements.txt # Fail on vulnerabilities
artifacts:
when: always
paths:
- pip-audit.json
expire_in: 30 days
allow_failure: false
only:
- merge_requests
- main
- develop
# Test stage with matrix
test:
extends: .python_template
stage: test
parallel:
matrix:
- PYTHON_VERSION: ["3.9", "3.10", "3.11", "3.12"]
services:
- postgres:15
- redis:7-alpine
variables:
POSTGRES_DB: testdb
POSTGRES_USER: testuser
POSTGRES_PASSWORD: testpass
DATABASE_URL: "postgresql://testuser:testpass@postgres:5432/testdb"
REDIS_URL: "redis://redis:6379"
script:
- |
pytest tests/unit \
--cov=src \
--cov-report=xml \
--cov-report=term \
--cov-report=html \
--junitxml=junit.xml \
-v
coverage: '/(?i)total.*? (100(?:\.0+)?\%|[1-9]?\d(?:\.\d+)?\%)$/'
artifacts:
when: always
reports:
junit: junit.xml
coverage_report:
coverage_format: cobertura
path: coverage.xml
paths:
- coverage.xml
- htmlcov/
expire_in: 30 days
integration-test:
extends: .python_template
stage: test
services:
- postgres:15
- redis:7-alpine
variables:
POSTGRES_DB: testdb
POSTGRES_USER: testuser
POSTGRES_PASSWORD: testpass
DATABASE_URL: "postgresql://testuser:testpass@postgres:5432/testdb"
REDIS_URL: "redis://redis:6379"
script:
- pytest tests/integration -v --junitxml=junit-integration.xml
artifacts:
when: always
reports:
junit: junit-integration.xml
expire_in: 7 days
only:
- main
- develop
- merge_requests
# Build stage
build:package:
extends: .python_template
stage: build
script:
- pip install build wheel setuptools
- python -m build
- ls -lh dist/
artifacts:
paths:
- dist/
expire_in: 7 days
only:
- main
- develop
- tags
build:docker:
stage: build
image: docker:24-cli
services:
- docker:24-dind
variables:
IMAGE_TAG: $CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA
before_script:
- echo $CI_REGISTRY_PASSWORD | docker login -u $CI_REGISTRY_USER --password-stdin $CI_REGISTRY
script:
# Pull previous image for caching
- docker pull $CI_REGISTRY_IMAGE:latest || true
# Build with cache
- |
docker build \
--cache-from $CI_REGISTRY_IMAGE:latest \
--tag $IMAGE_TAG \
--tag $CI_REGISTRY_IMAGE:latest \
--build-arg BUILDKIT_INLINE_CACHE=1 \
.
# Push images
- docker push $IMAGE_TAG
- docker push $CI_REGISTRY_IMAGE:latest
- echo "IMAGE_FULL_NAME=$IMAGE_TAG" >> build.env
artifacts:
reports:
dotenv: build.env
only:
- main
- develop
- tags
tags:
- docker
# E2E tests (only on main)
e2e-test:
extends: .python_template
stage: test
needs: [build:package]
dependencies:
- build:package
script:
- pip install dist/*.whl
- pytest tests/e2e -v
artifacts:
when: always
paths:
- test-results/
expire_in: 7 days
only:
- main
# Deploy to PyPI
deploy:pypi:
stage: deploy
image: python:3.11
needs: [build:package, test]
dependencies:
- build:package
environment:
name: pypi
url: https://pypi.org/project/your-package
before_script:
- pip install twine
script:
- twine check dist/*
- twine upload dist/* --username __token__ --password $PYPI_TOKEN
only:
- tags
when: manual
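# Alternative sketch (assumption): twine also reads TWINE_USERNAME and
# TWINE_PASSWORD from the environment, which keeps credentials off the echoed
# command line:
#
#   variables:
#     TWINE_USERNAME: __token__
#   script:
#     - twine check dist/*
#     - TWINE_PASSWORD=$PYPI_TOKEN twine upload dist/*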
# Deploy Docker to staging
deploy:staging:
stage: deploy
image: bitnami/kubectl:latest
needs: [build:docker]
environment:
name: staging
url: https://staging.example.com
on_stop: stop:staging
before_script:
- kubectl config use-context staging-cluster
script:
    - kubectl set image deployment/myapp myapp=$IMAGE_FULL_NAME --namespace=staging
- kubectl rollout status deployment/myapp --namespace=staging --timeout=5m
# Smoke test
- |
POD=$(kubectl get pod -n staging -l app=myapp -o jsonpath="{.items[0].metadata.name}")
kubectl exec -n staging $POD -- python -c "import sys; print(sys.version)"
kubectl exec -n staging $POD -- curl -f http://localhost:8000/health || exit 1
only:
- develop
when: manual
stop:staging:
stage: deploy
image: bitnami/kubectl:latest
environment:
name: staging
action: stop
script:
- kubectl scale deployment/myapp --replicas=0 --namespace=staging
when: manual
only:
- develop
# Deploy to production
deploy:production:
stage: deploy
image: bitnami/kubectl:latest
needs: [build:docker, e2e-test]
environment:
name: production
url: https://example.com
before_script:
- kubectl config use-context production-cluster
script:
- echo "Deploying to production..."
    - kubectl set image deployment/myapp myapp=$IMAGE_FULL_NAME --namespace=production
- kubectl rollout status deployment/myapp --namespace=production --timeout=5m
# Health check
- sleep 10
- |
for i in {1..10}; do
POD=$(kubectl get pod -n production -l app=myapp -o jsonpath="{.items[0].metadata.name}")
if kubectl exec -n production $POD -- curl -f http://localhost:8000/health; then
echo "Health check passed"
exit 0
fi
echo "Attempt $i failed, retrying..."
sleep 10
done
echo "Health check failed"
exit 1
only:
- main
when: manual
# Deploy to Cloud Run (Google Cloud)
deploy:cloudrun:
stage: deploy
image: google/cloud-sdk:alpine
needs: [build:docker]
environment:
name: production
url: https://your-app.run.app
before_script:
- echo $GCP_SERVICE_KEY | base64 -d > ${HOME}/gcp-key.json
- gcloud auth activate-service-account --key-file ${HOME}/gcp-key.json
- gcloud config set project $GCP_PROJECT_ID
script:
- |
gcloud run deploy your-app \
--image $IMAGE_FULL_NAME \
--region us-central1 \
--platform managed \
--allow-unauthenticated
# Health check
- |
URL=$(gcloud run services describe your-app --region us-central1 --format 'value(status.url)')
curl -f $URL/health || exit 1
only:
- main
when: manual
# Create release
release:
stage: deploy
image: registry.gitlab.com/gitlab-org/release-cli:latest
  needs:
    - job: build:docker  # supplies IMAGE_FULL_NAME via its dotenv report
    - job: deploy:production
      optional: true  # deploy:production runs only on main, not on tag pipelines
script:
- echo "Creating release"
release:
tag_name: $CI_COMMIT_TAG
description: |
Python Package: https://pypi.org/project/your-package/$CI_COMMIT_TAG
Docker Image: $IMAGE_FULL_NAME
Changes in this release:
$CI_COMMIT_MESSAGE
only:
- tags
# GitLab built-in security templates
include:
- template: Security/Dependency-Scanning.gitlab-ci.yml
- template: Security/SAST.gitlab-ci.yml
# Override GitLab template stages
dependency_scanning:
stage: security
sast:
stage: security
# Workflow rules
workflow:
rules:
- if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
- if: '$CI_COMMIT_BRANCH == "main"'
- if: '$CI_COMMIT_BRANCH == "develop"'
- if: '$CI_COMMIT_TAG'
# Interruptible jobs
# NOTE: re-declaring `lint:`, `test:`, and `integration-test:` here would
# duplicate top-level YAML keys (the last definition wins and the jobs would
# lose their stage and script). `interruptible: true` is set in .python_template
# above instead.


@@ -0,0 +1,479 @@
# Complete DevSecOps Security Scanning Pipeline for GitLab CI
# SAST, DAST, SCA, Container Scanning, Secret Scanning
stages:
- secret-scan
- sast
- sca
- build
- container-scan
- dast
- compliance
- report
variables:
SECURE_LOG_LEVEL: "info"
# Enable Auto DevOps security scanners
SAST_EXCLUDED_PATHS: "spec, test, tests, tmp"
SCAN_KUBERNETES_MANIFESTS: "false"
# Stage 1: Secret Scanning
secret-scan:trufflehog:
stage: secret-scan
image: trufflesecurity/trufflehog:latest
script:
- trufflehog filesystem . --json --fail > trufflehog-report.json || true
- |
if [ -s trufflehog-report.json ]; then
echo "❌ Secrets detected!"
cat trufflehog-report.json
exit 1
fi
artifacts:
when: always
paths:
- trufflehog-report.json
expire_in: 30 days
allow_failure: false
secret-scan:gitleaks:
stage: secret-scan
image: zricethezav/gitleaks:latest
script:
- gitleaks detect --source . --report-format json --report-path gitleaks-report.json
artifacts:
when: always
paths:
- gitleaks-report.json
expire_in: 30 days
allow_failure: true
# Stage 2: SAST (Static Application Security Testing)
sast:semgrep:
stage: sast
image: returntocorp/semgrep
script:
- semgrep scan --config=auto --sarif --output=semgrep.sarif .
- semgrep scan --config=p/owasp-top-ten --json --output=semgrep-owasp.json .
artifacts:
reports:
sast: semgrep.sarif
paths:
- semgrep.sarif
- semgrep-owasp.json
expire_in: 30 days
allow_failure: false
sast:nodejs:
stage: sast
image: node:20-alpine
script:
- npm install -g eslint eslint-plugin-security
- eslint . --plugin=security --format=json --output-file=eslint-security.json || true
artifacts:
paths:
- eslint-security.json
expire_in: 30 days
  rules:
    - exists:
        - package.json
allow_failure: true
sast:python:
stage: sast
image: python:3.11-alpine
script:
- pip install bandit
- bandit -r . -f json -o bandit-report.json -ll || true
    - bandit -r . -ll # Fail on medium and high severity
artifacts:
reports:
sast: bandit-report.json
paths:
- bandit-report.json
expire_in: 30 days
  rules:
    - exists:
        - requirements.txt
allow_failure: false
sast:go:
stage: sast
image: securego/gosec:latest
script:
- gosec -fmt json -out gosec-report.json ./... || true
- gosec ./... # Fail on findings
artifacts:
reports:
sast: gosec-report.json
paths:
- gosec-report.json
expire_in: 30 days
  rules:
    - exists:
        - go.mod
allow_failure: false
# GitLab built-in security templates. YAML allows only one top-level `include:`
# key per file, so every template must be listed in this single block.
include:
  - template: Security/SAST.gitlab-ci.yml
  - template: Security/Dependency-Scanning.gitlab-ci.yml
  - template: Security/Container-Scanning.gitlab-ci.yml
  - template: DAST.gitlab-ci.yml
  - template: Security/License-Scanning.gitlab-ci.yml
sast:
variables:
SAST_EXCLUDED_ANALYZERS: ""
# Stage 3: SCA (Software Composition Analysis)
sca:npm-audit:
stage: sca
image: node:20-alpine
before_script:
- npm ci
script:
- npm audit --audit-level=moderate --json > npm-audit.json || true
- npm audit --audit-level=high # Fail on high severity
artifacts:
paths:
- npm-audit.json
expire_in: 30 days
  rules:
    - exists:
        - package.json
allow_failure: false
sca:python:
stage: sca
image: python:3.11-alpine
script:
- pip install pip-audit
- pip-audit --requirement requirements.txt --format json --output pip-audit.json || true
- pip-audit --requirement requirements.txt --vulnerability-service osv # Fail on vulns
artifacts:
paths:
- pip-audit.json
expire_in: 30 days
  rules:
    - exists:
        - requirements.txt
allow_failure: false
# GitLab built-in Dependency Scanning: Security/Dependency-Scanning.gitlab-ci.yml
# must live in the file's single top-level `include:` block (duplicate top-level
# YAML keys silently override each other).
dependency_scanning:
variables:
DS_EXCLUDED_PATHS: "test/,tests/,spec/,vendor/"
# Stage 4: Build
build:app:
stage: build
image: node:20-alpine
script:
- npm ci
- npm run build
artifacts:
paths:
- dist/
expire_in: 7 days
only:
- branches
- merge_requests
build:docker:
stage: build
image: docker:24-cli
services:
- docker:24-dind
variables:
IMAGE_TAG: $CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA
before_script:
- echo $CI_REGISTRY_PASSWORD | docker login -u $CI_REGISTRY_USER --password-stdin $CI_REGISTRY
script:
- docker build --tag $IMAGE_TAG .
- docker push $IMAGE_TAG
- echo "IMAGE_TAG=$IMAGE_TAG" > build.env
artifacts:
reports:
dotenv: build.env
only:
- branches
- merge_requests
# Stage 5: Container Security Scanning
container:trivy:
stage: container-scan
image: aquasec/trivy:latest
needs: [build:docker]
dependencies:
- build:docker
variables:
GIT_STRATEGY: none
script:
# Scan for vulnerabilities
- trivy image --severity HIGH,CRITICAL --format json --output trivy-report.json $IMAGE_TAG
- trivy image --severity HIGH,CRITICAL --exit-code 1 $IMAGE_TAG
  artifacts:
    # Native Trivy JSON (consumed by the security gate job); it does not match
    # GitLab's container-scanning report schema, so it is kept as a plain artifact
    paths:
      - trivy-report.json
    expire_in: 30 days
allow_failure: false
container:grype:
stage: container-scan
image: anchore/grype:latest
needs: [build:docker]
dependencies:
- build:docker
  variables:
    GIT_STRATEGY: none
    # grype has no `registry login` subcommand; it reads registry credentials
    # from environment variables instead
    GRYPE_REGISTRY_AUTH_AUTHORITY: $CI_REGISTRY
    GRYPE_REGISTRY_AUTH_USERNAME: $CI_REGISTRY_USER
    GRYPE_REGISTRY_AUTH_PASSWORD: $CI_REGISTRY_PASSWORD
script:
- grype $IMAGE_TAG --fail-on high --output json --file grype-report.json
artifacts:
paths:
- grype-report.json
expire_in: 30 days
allow_failure: true
container:sbom:
stage: container-scan
image: anchore/syft:latest
needs: [build:docker]
dependencies:
- build:docker
  variables:
    GIT_STRATEGY: none
    # syft has no `registry login` subcommand; it reads registry credentials
    # from environment variables instead
    SYFT_REGISTRY_AUTH_AUTHORITY: $CI_REGISTRY
    SYFT_REGISTRY_AUTH_USERNAME: $CI_REGISTRY_USER
    SYFT_REGISTRY_AUTH_PASSWORD: $CI_REGISTRY_PASSWORD
script:
- syft $IMAGE_TAG -o spdx-json > sbom.spdx.json
- syft $IMAGE_TAG -o cyclonedx-json > sbom.cyclonedx.json
artifacts:
paths:
- sbom.spdx.json
- sbom.cyclonedx.json
expire_in: 90 days
only:
- main
- tags
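# Hypothetical follow-up (not in the original pipeline): attach the SPDX SBOM
# to the image as a signed attestation with cosign, so downstream consumers can
# verify it. $COSIGN_KEY is an assumed CI variable holding the signing key:
#
#   container:sbom-attest:
#     stage: container-scan
#     image: bitnami/cosign:latest
#     needs: [container:sbom, build:docker]
#     script:
#       - cosign attest --yes --key $COSIGN_KEY --type spdxjson \
#           --predicate sbom.spdx.json $IMAGE_TAG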
# GitLab built-in Container Scanning: Security/Container-Scanning.gitlab-ci.yml
# must live in the file's single top-level `include:` block.
container_scanning:
needs: [build:docker]
dependencies:
- build:docker
variables:
CS_IMAGE: $IMAGE_TAG
GIT_STRATEGY: none
# Stage 6: DAST (Dynamic Application Security Testing)
dast:zap-baseline:
stage: dast
  image: ghcr.io/zaproxy/zaproxy:stable  # owasp/zap2docker-* images are deprecated
needs: [build:docker]
services:
- name: $IMAGE_TAG
alias: testapp
script:
# Wait for app to be ready
- sleep 10
# Run baseline scan
- zap-baseline.py -t http://testapp:8080 -r zap-baseline-report.html -J zap-baseline.json
artifacts:
when: always
paths:
- zap-baseline-report.html
- zap-baseline.json
expire_in: 30 days
only:
- main
- schedules
allow_failure: true
dast:zap-full:
stage: dast
  image: ghcr.io/zaproxy/zaproxy:stable  # owasp/zap2docker-* images are deprecated
script:
# Full scan on staging environment
- zap-full-scan.py -t https://staging.example.com -r zap-full-report.html -J zap-full.json
artifacts:
when: always
paths:
- zap-full-report.html
- zap-full.json
reports:
dast: zap-full.json
expire_in: 30 days
only:
- schedules # Run on schedule only (slow)
allow_failure: true
# GitLab built-in DAST: DAST.gitlab-ci.yml must live in the file's single
# top-level `include:` block.
dast:
  variables:
    DAST_WEBSITE: https://staging.example.com
    DAST_FULL_SCAN_ENABLED: "false"
  # The DAST template uses rules:, and a job cannot mix rules: with only:
  rules:
    - if: '$CI_PIPELINE_SOURCE == "schedule"'
    - if: '$CI_COMMIT_BRANCH == "main"'
# Stage 7: License Compliance
license:check:
stage: compliance
image: node:20-alpine
needs: [build:app]
script:
- npm ci
- npm install -g license-checker
- license-checker --production --onlyAllow "MIT;Apache-2.0;BSD-2-Clause;BSD-3-Clause;ISC;0BSD" --json --out license-report.json
artifacts:
paths:
- license-report.json
expire_in: 30 days
  rules:
    - exists:
        - package.json
allow_failure: false
# GitLab built-in License Scanning: Security/License-Scanning.gitlab-ci.yml
# must live in the file's single top-level `include:` block.
license_scanning:
  # The template uses rules:, and a job cannot mix rules: with only:
  rules:
    - if: '$CI_COMMIT_BRANCH == "main"'
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
# Stage 8: Security Report & Gate
security:gate:
stage: report
image: alpine:latest
needs:
- secret-scan:trufflehog
- sast:semgrep
- sca:npm-audit
- container:trivy
- license:check
before_script:
- apk add --no-cache jq curl
script:
- |
echo "==================================="
echo "🔒 Security Gate Evaluation"
echo "==================================="
GATE_PASSED=true
# Check Trivy results
if [ -f trivy-report.json ]; then
CRITICAL=$(jq '[.Results[]?.Vulnerabilities[]? | select(.Severity=="CRITICAL")] | length' trivy-report.json)
HIGH=$(jq '[.Results[]?.Vulnerabilities[]? | select(.Severity=="HIGH")] | length' trivy-report.json)
echo "Container Vulnerabilities:"
echo " - Critical: $CRITICAL"
echo " - High: $HIGH"
if [ "$CRITICAL" -gt 0 ]; then
echo "❌ CRITICAL vulnerabilities found in container"
GATE_PASSED=false
fi
if [ "$HIGH" -gt 10 ]; then
echo "⚠️ Too many HIGH vulnerabilities: $HIGH"
GATE_PASSED=false
fi
fi
# Final gate decision
if [ "$GATE_PASSED" = true ]; then
echo ""
echo "✅ Security gate PASSED"
echo "All security checks completed successfully"
exit 0
else
echo ""
echo "❌ Security gate FAILED"
echo "Critical security issues detected"
exit 1
fi
allow_failure: false
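# Hypothetical extension (not in the original gate): also gate on SAST findings.
# SARIF stores results under runs[].results, so a jq count check could look like:
#
#   if [ -f semgrep.sarif ]; then
#     SAST_FINDINGS=$(jq '[.runs[]?.results[]?] | length' semgrep.sarif)
#     echo "SAST findings: $SAST_FINDINGS"
#     if [ "$SAST_FINDINGS" -gt 0 ]; then GATE_PASSED=false; fi
#   fi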
security:report:
stage: report
image: alpine:latest
needs:
- security:gate
when: always
script:
- |
cat << EOF > security-report.md
# Security Scan Report - $(date +%Y-%m-%d)
## Summary
- **Project:** $CI_PROJECT_NAME
- **Branch:** $CI_COMMIT_BRANCH
- **Commit:** $CI_COMMIT_SHORT_SHA
- **Pipeline:** $CI_PIPELINE_URL
- **Scan Date:** $(date -u +"%Y-%m-%d %H:%M:%S UTC")
## Scans Performed
1. ✅ Secret Scanning (TruffleHog, Gitleaks)
2. ✅ SAST (Semgrep, Language-specific)
3. ✅ SCA (npm audit, pip-audit)
4. ✅ Container Scanning (Trivy, Grype)
5. ✅ SBOM Generation (Syft)
6. ✅ License Compliance
## Security Gate
Status: See job logs for details
## Artifacts
- Trivy container scan report
- Semgrep SAST report
- SBOM (SPDX & CycloneDX)
- License compliance report
- ZAP DAST report (if applicable)
## Next Steps
1. Review all security findings in artifacts
2. Address critical and high severity vulnerabilities
3. Update dependencies with known vulnerabilities
4. Re-run pipeline to verify fixes
EOF
cat security-report.md
artifacts:
paths:
- security-report.md
expire_in: 90 days
# Workflow rules
workflow:
rules:
- if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
- if: '$CI_COMMIT_BRANCH == "main"'
- if: '$CI_COMMIT_BRANCH == "develop"'
- if: '$CI_PIPELINE_SOURCE == "schedule"'
- if: '$CI_COMMIT_TAG'
# Interruptible jobs for MR pipelines
# NOTE: re-declaring `secret-scan:trufflehog`, `sast:semgrep`, and
# `sca:npm-audit` here would duplicate top-level YAML keys (the last definition
# wins and the jobs would lose their scripts). Add `interruptible: true` to the
# job definitions themselves, or set it once under `default:`.

plugin.lock.json Normal file

@@ -0,0 +1,113 @@
{
"$schema": "internal://schemas/plugin.lock.v1.json",
"pluginId": "gh:ahmedasmar/devops-claude-skills:ci-cd",
"normalized": {
"repo": null,
"ref": "refs/tags/v20251128.0",
"commit": "485763da67098d2a58550d432c4013f7f15170d7",
"treeHash": "5572614b78ad24295d1e0628722629675af7e7727e7ae77100af602df044f961",
"generatedAt": "2025-11-28T10:13:03.171664Z",
"toolVersion": "publish_plugins.py@0.2.0"
},
"origin": {
"remote": "git@github.com:zhongweili/42plugin-data.git",
"branch": "master",
"commit": "aa1497ed0949fd50e99e70d6324a29c5b34f9390",
"repoRoot": "/Users/zhongweili/projects/openmind/42plugin-data"
},
"manifest": {
"name": "ci-cd",
"description": "CI/CD pipeline design, optimization, DevSecOps security scanning, and troubleshooting. Use for creating workflows, debugging pipeline failures, implementing SAST/DAST/SCA, optimizing build performance, and securing pipelines across GitHub Actions, GitLab CI, and other platforms.",
"version": "1.0.0"
},
"content": {
"files": [
{
"path": "README.md",
"sha256": "194c29560f34dc610c02549b1b1e549fa996318cb88ea7903038720609d3508d"
},
{
"path": "SKILL.md",
"sha256": "9ad65a063989ad8e5d2d135a8149e547b084c099deb2a878ded47a939ca07af0"
},
{
"path": "references/troubleshooting.md",
"sha256": "45441d045e7dcb48fb1aecfdff7ed5677eee33041a3b0945d2ab1f5e4fb09376"
},
{
"path": "references/devsecops.md",
"sha256": "b283642a41126f32aaea646d05f823995d64597f9c4c0a586ef41189c92a7443"
},
{
"path": "references/optimization.md",
"sha256": "80161a0d711b3e71d5b709b39935662a50eb99573d4f7e009623ece3e0a3a86f"
},
{
"path": "references/security.md",
"sha256": "ace30ca2801fb4bfb61345316645657d08111d9755b04d105888cc05745efdbf"
},
{
"path": "references/best_practices.md",
"sha256": "362764b3568f59c47193767b80126329f0c2fae81463efd0ae3eac31d5b7b774"
},
{
"path": "scripts/pipeline_analyzer.py",
"sha256": "514dad0080c1beef5cbefb006b0dd708462f83e73791e7057058005d411a0019"
},
{
"path": "scripts/ci_health.py",
"sha256": "dfb64ca1cd644307d6f9d7170de24d1f92b016f29420723e180bd17ea681e722"
},
{
"path": ".claude-plugin/plugin.json",
"sha256": "7a415103cf467e31256eb05e10492d53ad9bfd343217c21b974a1a57ea8d1019"
},
{
"path": "assets/templates/gitlab-ci/node-ci.yml",
"sha256": "9b4de74eb0e68055da2b48ccfe07d2523afa7a8df9315674da783ad43621d8ad"
},
{
"path": "assets/templates/gitlab-ci/docker-build.yml",
"sha256": "73430e6bda44e23f069569c530722c045d70e8e3dbc67fc62c69c81585442ed2"
},
{
"path": "assets/templates/gitlab-ci/go-ci.yml",
"sha256": "3312591e10272fddc46b4f2e0a7772bf35df3b5cadbe6d6e153b5865e4ab5e1d"
},
{
"path": "assets/templates/gitlab-ci/python-ci.yml",
"sha256": "9e0097963a4a5ffc67071b2c1e045ccc860f61cbcc65db034f302e835cd7d1a3"
},
{
"path": "assets/templates/gitlab-ci/security-scan.yml",
"sha256": "2ec23a658dda7aa0bb4ec439e02ca2618fb80ab76bdb565c85655b9c9b85493f"
},
{
"path": "assets/templates/github-actions/node-ci.yml",
"sha256": "1d5e9e56dc6bd52a25df657c6cf3aa5941688091f6f8579543a1108e369fd68c"
},
{
"path": "assets/templates/github-actions/docker-build.yml",
"sha256": "539d7336f9a6eb3a6a6e7c36e41e1cbab6209fd7923b22d442849a670c81d4f8"
},
{
"path": "assets/templates/github-actions/go-ci.yml",
"sha256": "22475fa2c778b772c38f184253a66f875bb1b790ac1d8dc531174878019e7ddc"
},
{
"path": "assets/templates/github-actions/python-ci.yml",
"sha256": "121e490eb2db53fb6d5b30720f891094b65c17c511b507a1ca9c8b4105d16c2a"
},
{
"path": "assets/templates/github-actions/security-scan.yml",
"sha256": "3ecb2bd844a4cb91691c62e867730140edad2bc7aa0be6258a4a895dc1153e56"
}
],
"dirSha256": "5572614b78ad24295d1e0628722629675af7e7727e7ae77100af602df044f961"
},
"security": {
"scannedAt": null,
"scannerVersion": null,
"flags": []
}
}

View File

@@ -0,0 +1,675 @@
# CI/CD Best Practices
Comprehensive guide to CI/CD pipeline design, testing strategies, and deployment patterns.
## Table of Contents
- [Pipeline Design Principles](#pipeline-design-principles)
- [Testing in CI/CD](#testing-in-cicd)
- [Deployment Strategies](#deployment-strategies)
- [Dependency Management](#dependency-management)
- [Artifact & Release Management](#artifact--release-management)
- [Platform Patterns](#platform-patterns)
---
## Pipeline Design Principles
### Fast Feedback Loops
Design pipelines to provide feedback quickly:
**Priority ordering:**
1. Linting and code formatting (seconds)
2. Unit tests (1-5 minutes)
3. Integration tests (5-15 minutes)
4. E2E tests (15-30 minutes)
5. Deployment (varies)
**Fail fast pattern:**
```yaml
# GitHub Actions
jobs:
lint:
runs-on: ubuntu-latest
steps:
- run: npm run lint
test:
needs: lint # Only run if lint passes
runs-on: ubuntu-latest
steps:
- run: npm test
e2e:
needs: [lint, test] # Run after basic checks
```
### Job Parallelization
Run independent jobs concurrently:
**GitHub Actions:**
```yaml
jobs:
lint:
runs-on: ubuntu-latest
test:
runs-on: ubuntu-latest
# No 'needs' - runs in parallel with lint
build:
needs: [lint, test] # Wait for both
runs-on: ubuntu-latest
```
**GitLab CI:**
```yaml
stages:
- validate
- test
- build
# Jobs in same stage run in parallel
unit-test:
stage: test
integration-test:
stage: test
e2e-test:
stage: test
```
### Monorepo Strategies
**Path-based triggers (GitHub):**
```yaml
on:
  push:
    paths:
      - 'services/api/**'
      - 'shared/**'
```
The `paths` filter gates the whole workflow. Avoid job-level conditions like `contains(github.event.head_commit.modified, 'services/api/')`: that expression only inspects the head commit's file list, and `contains` does exact matching, not path-prefix matching. For per-job filtering inside one workflow, use a change-detection action such as `dorny/paths-filter`.
**GitLab rules:**
```yaml
api-test:
rules:
- changes:
- services/api/**/*
- shared/**/*
frontend-test:
rules:
- changes:
- services/frontend/**/*
- shared/**/*
```
### Matrix Builds
Test across multiple versions/platforms:
**GitHub Actions:**
```yaml
strategy:
matrix:
os: [ubuntu-latest, macos-latest, windows-latest]
node: [18, 20, 22]
include:
- os: ubuntu-latest
node: 22
coverage: true
exclude:
- os: windows-latest
node: 18
fail-fast: false # See all results
```
**GitLab parallel:**
```yaml
test:
parallel:
matrix:
- NODE_VERSION: ['18', '20', '22']
OS: ['ubuntu', 'alpine']
```
---
## Testing in CI/CD
### Test Pyramid Strategy
Maintain proper test distribution:
```
        /\
       /E2E\          10% - Slow, expensive, flaky
      /------\
     /  Int   \       20% - Medium speed
    /----------\
   /    Unit    \     70% - Fast, reliable
  /--------------\
```
**Implementation:**
```yaml
jobs:
unit-test:
runs-on: ubuntu-latest
steps:
- run: npm run test:unit # Fast, runs on every commit
integration-test:
runs-on: ubuntu-latest
needs: unit-test
steps:
- run: npm run test:integration # Medium, after unit tests
e2e-test:
runs-on: ubuntu-latest
needs: [unit-test, integration-test]
if: github.ref == 'refs/heads/main' # Only on main branch
steps:
- run: npm run test:e2e # Slow, only on main
```
### Test Splitting & Parallelization
Split large test suites:
**GitHub Actions:**
```yaml
strategy:
matrix:
shard: [1, 2, 3, 4]
steps:
- run: npm test -- --shard=${{ matrix.shard }}/4
```
**Playwright example:**
```yaml
strategy:
matrix:
shardIndex: [1, 2, 3, 4]
shardTotal: [4]
steps:
- run: npx playwright test --shard=${{ matrix.shardIndex }}/${{ matrix.shardTotal }}
```
### Code Coverage
**Track coverage trends:**
```yaml
- name: Run tests with coverage
run: npm test -- --coverage
- name: Upload coverage
uses: codecov/codecov-action@v4
with:
files: ./coverage/lcov.info
fail_ci_if_error: true # Fail if upload fails
- name: Coverage check
run: |
COVERAGE=$(jq -r '.total.lines.pct' coverage/coverage-summary.json)
if (( $(echo "$COVERAGE < 80" | bc -l) )); then
echo "Coverage $COVERAGE% is below 80%"
exit 1
fi
```
### Test Environment Management
**Docker Compose for services:**
```yaml
jobs:
integration-test:
runs-on: ubuntu-latest
steps:
- name: Start services
run: docker-compose up -d postgres redis
- name: Wait for services
run: |
timeout 30 bash -c 'until docker-compose exec -T postgres pg_isready; do sleep 1; done'
- name: Run tests
run: npm run test:integration
- name: Cleanup
if: always()
run: docker-compose down
```
**GitLab services:**
```yaml
integration-test:
services:
- postgres:15
- redis:7-alpine
variables:
POSTGRES_DB: testdb
POSTGRES_PASSWORD: password
script:
- npm run test:integration
```
---
## Deployment Strategies
### Deployment Patterns
**1. Direct Deployment (Simple)**
```yaml
deploy:
if: github.ref == 'refs/heads/main'
steps:
- run: |
aws s3 sync dist/ s3://${{ secrets.S3_BUCKET }}
aws cloudfront create-invalidation --distribution-id ${{ secrets.CF_DIST }}
```
**2. Blue-Green Deployment**
```yaml
deploy:
  steps:
    - name: Swap staging slot into production
      run: az webapp deployment slot swap --slot staging --resource-group $RG --name $APP
    - name: Health check
      run: |
        for i in {1..10}; do
          if curl -f https://$APP.azurewebsites.net/health; then
            echo "Health check passed"
            exit 0
          fi
          sleep 10
        done
        exit 1
    - name: Roll back by swapping the slots back
      if: failure()
      run: az webapp deployment slot swap --slot staging --resource-group $RG --name $APP
```
**3. Canary Deployment**
```yaml
deploy-canary:
steps:
- run: kubectl set image deployment/app app=myapp:${{ github.sha }}
- run: kubectl patch deployment app -p '{"spec":{"replicas":1}}' # 1 pod
- run: sleep 300 # Monitor for 5 minutes
- run: kubectl scale deployment app --replicas=10 # Scale to full
```
### Environment Management
**GitHub Environments:**
```yaml
jobs:
deploy-staging:
environment:
name: staging
url: https://staging.example.com
steps:
- run: ./deploy.sh staging
deploy-production:
needs: deploy-staging
environment:
name: production
url: https://example.com
steps:
- run: ./deploy.sh production
```
**Protection rules:**
- Require approval for production
- Restrict to specific branches
- Add deployment delay
**GitLab environments:**
```yaml
deploy:staging:
stage: deploy
environment:
name: staging
url: https://staging.example.com
on_stop: stop:staging
only:
- develop
deploy:production:
stage: deploy
environment:
name: production
url: https://example.com
when: manual # Require manual trigger
only:
- main
```
### Deployment Gates
**Pre-deployment checks:**
```yaml
pre-deploy-checks:
steps:
- name: Check migration status
run: ./scripts/check-migrations.sh
- name: Verify dependencies
run: npm audit --audit-level=high
- name: Check service health
run: curl -f https://api.example.com/health
```
**Post-deployment validation:**
```yaml
post-deploy-validation:
  needs: deploy
  steps:
    - name: Smoke tests
      run: npm run test:smoke
    - name: Monitor errors
      run: |
        # Query your monitoring system (illustrative CLI shown)
        ERROR_COUNT=$(datadog-api errors --since 5m)
        if [ "$ERROR_COUNT" -gt 10 ]; then
          echo "Error spike detected!"
          exit 1
        fi
```
---
## Dependency Management
### Lock Files
**Always commit lock files:**
- `package-lock.json` (npm)
- `yarn.lock` (Yarn)
- `pnpm-lock.yaml` (pnpm)
- `Cargo.lock` (Rust)
- `Gemfile.lock` (Ruby)
- `poetry.lock` (Python)
**Use deterministic install commands:**
```bash
# Good - installs exactly what the lock file specifies
npm ci                            # not npm install
yarn install --frozen-lockfile
pnpm install --frozen-lockfile
pip install -r requirements.txt   # deterministic only if versions are pinned
# Bad - may update the lock file
npm install
```
### Dependency Caching
**See optimization.md for detailed caching strategies**
Quick reference:
- Hash lock files for cache keys
- Include OS/platform in cache key
- Use restore-keys for partial matches
- Separate cache for build artifacts vs dependencies
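A minimal sketch of those points combined, assuming npm and GitHub Actions (see optimization.md for per-ecosystem variants):

```yaml
- uses: actions/cache@v4
  with:
    path: ~/.npm
    # Lock-file hash keys the cache; the OS prefix avoids cross-platform reuse
    key: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}
    # On a miss, fall back to the newest cache for this OS (partial match)
    restore-keys: |
      ${{ runner.os }}-node-
```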
### Security Scanning
**Automated vulnerability checks:**
```yaml
security-scan:
steps:
- name: Dependency audit
run: |
npm audit --audit-level=high
# Or: pip-audit, cargo audit, bundle audit
- name: SAST scanning
uses: github/codeql-action/analyze@v3
- name: Container scanning
run: trivy image myapp:${{ github.sha }}
```
### Dependency Updates
**Automated dependency updates:**
- Dependabot (GitHub)
- Renovate
- GitLab Dependency Scanning
**Configuration example (Dependabot):**
```yaml
# .github/dependabot.yml
version: 2
updates:
- package-ecosystem: npm
directory: "/"
schedule:
interval: weekly
open-pull-requests-limit: 5
groups:
dev-dependencies:
dependency-type: development
```
---
## Artifact & Release Management
### Artifact Strategy
**Build once, deploy many:**
```yaml
build:
steps:
- run: npm run build
- uses: actions/upload-artifact@v4
with:
name: dist-${{ github.sha }}
path: dist/
retention-days: 7
deploy-staging:
needs: build
steps:
- uses: actions/download-artifact@v4
with:
name: dist-${{ github.sha }}
- run: ./deploy.sh staging
deploy-production:
needs: [build, deploy-staging]
steps:
- uses: actions/download-artifact@v4
with:
name: dist-${{ github.sha }}
- run: ./deploy.sh production
```
### Container Image Management
**Multi-stage builds:**
```dockerfile
# Build stage: install all dependencies (the build needs devDependencies)
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build
# Production stage: install only runtime dependencies
FROM node:20-alpine
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
COPY --from=builder /app/dist ./dist
USER node
CMD ["node", "dist/server.js"]
```
**Image tagging strategy:**
```yaml
- name: Build and tag images
run: |
docker build -t myapp:${{ github.sha }} .
docker tag myapp:${{ github.sha }} myapp:latest
docker tag myapp:${{ github.sha }} myapp:v1.2.3
```
### Release Automation
**Semantic versioning:**
```yaml
release:
  if: startsWith(github.ref, 'refs/tags/v')
  steps:
    # actions/create-release is archived; use a maintained release action
    - uses: softprops/action-gh-release@v2
      with:
        tag_name: ${{ github.ref_name }}
        name: Release ${{ github.ref_name }}
        body: |
          Changes in this release:
          ${{ github.event.head_commit.message }}
```
**Changelog generation:**
```yaml
- name: Generate changelog
run: |
git log $(git describe --tags --abbrev=0)..HEAD \
--pretty=format:"- %s (%h)" > CHANGELOG.md
```
---
## Platform Patterns
### GitHub Actions
**Reusable workflows:**
```yaml
# .github/workflows/reusable-test.yml
on:
workflow_call:
inputs:
node-version:
required: true
type: string
jobs:
test:
runs-on: ubuntu-latest
steps:
- uses: actions/setup-node@v4
with:
node-version: ${{ inputs.node-version }}
- run: npm test
```
**Composite actions:**
```yaml
# .github/actions/setup-app/action.yml
name: Setup Application
runs:
using: composite
steps:
- uses: actions/setup-node@v4
with:
node-version: 20
- run: npm ci
shell: bash
```
### GitLab CI
**Templates & extends:**
```yaml
.test_template:
image: node:20
before_script:
- npm ci
script:
- npm test
unit-test:
extends: .test_template
script:
- npm run test:unit
integration-test:
extends: .test_template
script:
- npm run test:integration
```
**Dynamic child pipelines:**
```yaml
generate-pipeline:
script:
- ./generate-config.sh > pipeline.yml
artifacts:
paths:
- pipeline.yml
trigger-pipeline:
trigger:
include:
- artifact: pipeline.yml
job: generate-pipeline
```
---
## Continuous Improvement
### Metrics to Track
- **Build duration:** Target < 10 minutes
- **Failure rate:** Target < 5%
- **Time to recovery:** Target < 1 hour
- **Deployment frequency:** Aim for multiple/day
- **Lead time:** Commit to production < 1 day
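Two of these metrics (failure rate and mean duration) can be computed directly from run records; a minimal Python sketch, assuming records with `createdAt`, `updatedAt`, and `conclusion` fields (the shape `gh run list --json` can emit; field names here are assumptions):

```python
from datetime import datetime

def pipeline_stats(runs):
    """Compute failure rate (%) and mean duration (seconds) from run records."""
    durations = []
    failures = 0
    for run in runs:
        start = datetime.fromisoformat(run["createdAt"])
        end = datetime.fromisoformat(run["updatedAt"])
        durations.append((end - start).total_seconds())
        if run["conclusion"] == "failure":
            failures += 1
    return {
        "failure_rate_pct": 100 * failures / len(runs),
        "mean_duration_s": sum(durations) / len(durations),
    }

runs = [
    {"createdAt": "2024-01-01T10:00:00", "updatedAt": "2024-01-01T10:08:00", "conclusion": "success"},
    {"createdAt": "2024-01-01T11:00:00", "updatedAt": "2024-01-01T11:12:00", "conclusion": "failure"},
]
print(pipeline_stats(runs))
```

Feed it real data and plot the trend over weeks, not individual runs.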
### Pipeline Optimization Checklist
- [ ] Jobs run in parallel where possible
- [ ] Dependencies are cached
- [ ] Test suite is properly split
- [ ] Linting fails fast
- [ ] Only necessary tests run on PRs
- [ ] Artifacts are reused across jobs
- [ ] Pipeline has appropriate timeouts
- [ ] Flaky tests are identified and fixed
- [ ] Security scanning is automated
- [ ] Deployment requires approval
### Regular Reviews
**Monthly:**
- Review build duration trends
- Analyze failure patterns
- Update dependencies
- Review security scan results
**Quarterly:**
- Audit pipeline efficiency
- Review deployment frequency
- Update CI/CD tools and actions
- Team retrospective on CI/CD pain points

862
references/devsecops.md Normal file
View File

@@ -0,0 +1,862 @@
# DevSecOps in CI/CD
Comprehensive guide to integrating security into CI/CD pipelines with SAST, DAST, SCA, and security gates.
## Table of Contents
- [Shift-Left Security](#shift-left-security)
- [SAST (Static Application Security Testing)](#sast-static-application-security-testing)
- [DAST (Dynamic Application Security Testing)](#dast-dynamic-application-security-testing)
- [SCA (Software Composition Analysis)](#sca-software-composition-analysis)
- [Container Security](#container-security)
- [Secret Scanning](#secret-scanning)
- [Security Gates & Quality Gates](#security-gates--quality-gates)
- [Compliance & License Scanning](#compliance--license-scanning)
---
## Shift-Left Security
**Core principle:** Integrate security testing early in the development lifecycle, not just before production.
**Security testing stages in CI/CD:**
```
Commit → SAST → Unit Tests → SCA → Build → Container Scan → Deploy to Test → DAST → Production
  │       │                   │                  │                             │         │
Secret   Code            Dependency           Docker                       Dynamic    Security
 Scan   Analysis         Vuln Check         Image Scan                   App Testing    Gates
```
**Benefits:**
- Find vulnerabilities early (cheaper to fix)
- Faster feedback to developers
- Reduce security debt
- Prevent vulnerable code from reaching production
---
## SAST (Static Application Security Testing)
Analyzes source code, bytecode, or binaries for security vulnerabilities without executing the application.
### Tools by Language
| Language | Tools | GitHub Actions | GitLab CI |
|----------|-------|----------------|-----------|
| **Multi-language** | CodeQL, Semgrep, SonarQube | ✅ | ✅ |
| **JavaScript/TypeScript** | ESLint (security plugins), NodeJsScan | ✅ | ✅ |
| **Python** | Bandit, Pylint, Safety | ✅ | ✅ |
| **Go** | Gosec, GoSec Scanner | ✅ | ✅ |
| **Java** | SpotBugs, FindSecBugs, PMD | ✅ | ✅ |
| **C#/.NET** | Security Code Scan, Roslyn Analyzers | ✅ | ✅ |
### CodeQL (GitHub)
**GitHub Actions:**
```yaml
name: CodeQL Analysis
on:
push:
branches: [main, develop]
pull_request:
branches: [main]
schedule:
- cron: '0 2 * * 1' # Weekly scan
jobs:
analyze:
name: Analyze Code
runs-on: ubuntu-latest
timeout-minutes: 30
permissions:
actions: read
contents: read
security-events: write
strategy:
fail-fast: false
matrix:
language: ['javascript', 'python']
steps:
- uses: actions/checkout@v4
- name: Initialize CodeQL
uses: github/codeql-action/init@v3
with:
languages: ${{ matrix.language }}
queries: security-extended
- name: Autobuild
uses: github/codeql-action/autobuild@v3
- name: Perform CodeQL Analysis
uses: github/codeql-action/analyze@v3
with:
category: "/language:${{matrix.language}}"
```
**Key features:**
- Supports 10+ languages
- Deep semantic analysis
- Low false positive rate
- Integrates with GitHub Security tab
- Custom query support
### Semgrep
**GitHub Actions:**
```yaml
- name: Run Semgrep
uses: returntocorp/semgrep-action@v1
with:
config: >-
p/security-audit
p/owasp-top-ten
p/cwe-top-25
```
**GitLab CI:**
```yaml
semgrep:
stage: test
image: returntocorp/semgrep
script:
- semgrep --config=auto --sarif --output=semgrep.sarif .
artifacts:
reports:
sast: semgrep.sarif
```
**Benefits:**
- Fast (runs in seconds)
- Highly customizable rules
- Multi-language support
- CI-native design
### Language-Specific SAST
**Python - Bandit:**
```yaml
# GitHub Actions
- name: Run Bandit
  run: |
    pip install bandit
    bandit -r src/ -f json -o bandit-report.json --exit-zero
    bandit -r src/ -lll  # Fail the build only on high-severity findings
# GitLab CI
bandit:
stage: test
image: python:3.11
script:
- pip install bandit
- bandit -r src/ -ll -f gitlab > bandit-report.json
artifacts:
reports:
sast: bandit-report.json
```
**JavaScript - ESLint Security Plugin:**
```yaml
# GitHub Actions
- name: Run ESLint Security
run: |
npm install eslint-plugin-security
npx eslint . --plugin=security --format=json --output-file=eslint-security.json
```
**Go - Gosec:**
```yaml
# GitHub Actions
- name: Run Gosec
uses: securego/gosec@master
with:
args: '-fmt sarif -out gosec.sarif ./...'
# GitLab CI
gosec:
stage: test
image: securego/gosec:latest
script:
- gosec -fmt json -out gosec-report.json ./...
artifacts:
reports:
sast: gosec-report.json
```
### SonarQube/SonarCloud
**GitHub Actions:**
```yaml
- name: SonarCloud Scan
uses: SonarSource/sonarcloud-github-action@master
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
SONAR_TOKEN: ${{ secrets.SONAR_TOKEN }}
with:
args: >
-Dsonar.projectKey=my-project
-Dsonar.organization=my-org
-Dsonar.sources=src
-Dsonar.tests=tests
-Dsonar.python.coverage.reportPaths=coverage.xml
```
**GitLab CI:**
```yaml
sonarqube:
stage: test
image: sonarsource/sonar-scanner-cli:latest
script:
- sonar-scanner
-Dsonar.projectKey=$CI_PROJECT_NAME
-Dsonar.sources=src
-Dsonar.host.url=$SONAR_HOST_URL
-Dsonar.login=$SONAR_TOKEN
```
---
## DAST (Dynamic Application Security Testing)
Tests running applications for vulnerabilities by simulating attacks.
### OWASP ZAP
**Full scan workflow (GitHub Actions):**
```yaml
name: DAST Scan
on:
schedule:
- cron: '0 3 * * 1' # Weekly scan
workflow_dispatch:
jobs:
dast:
runs-on: ubuntu-latest
services:
app:
image: myapp:latest
ports:
- 8080:8080
steps:
- name: Wait for app to start
run: |
timeout 60 bash -c 'until curl -f http://localhost:8080/health; do sleep 2; done'
- name: ZAP Baseline Scan
uses: zaproxy/action-baseline@v0.10.0
with:
target: 'http://localhost:8080'
rules_file_name: '.zap/rules.tsv'
fail_action: true
- name: Upload ZAP report
if: always()
uses: actions/upload-artifact@v4
with:
name: zap-report
path: report_html.html
```
**GitLab CI:**
```yaml
dast:
stage: test
image: owasp/zap2docker-stable
services:
- name: $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
alias: testapp
script:
# Baseline scan
- zap-baseline.py -t http://testapp:8080 -r zap-report.html -J zap-report.json
artifacts:
when: always
paths:
- zap-report.html
- zap-report.json
reports:
dast: zap-report.json
only:
- schedules
- main
```
**ZAP scan types:**
1. **Baseline Scan** (Fast, ~1-2 min)
```bash
zap-baseline.py -t https://staging.example.com -r report.html
```
- Passive scanning only
- No active attacks
- Good for PR checks
2. **Full Scan** (Comprehensive, 10-60 min)
```bash
zap-full-scan.py -t https://staging.example.com -r report.html
```
- Active + Passive scanning
- Attempts exploits
- Use on staging only
3. **API Scan**
```bash
zap-api-scan.py -t https://api.example.com/openapi.json -f openapi -r report.html
```
- For REST APIs
- OpenAPI/Swagger support
### Other DAST Tools
**Nuclei:**
```yaml
- name: Run Nuclei
uses: projectdiscovery/nuclei-action@main
with:
target: https://staging.example.com
templates: cves,vulnerabilities,exposures
```
**Nikto (Web server scanner):**
```yaml
nikto:
stage: dast
image: sullo/nikto
script:
- nikto -h http://testapp:8080 -Format json -output nikto-report.json
```
---
## SCA (Software Composition Analysis)
Identifies vulnerabilities in third-party dependencies and libraries.
### Dependency Scanning
**GitHub Dependabot (Built-in):**
```yaml
# .github/dependabot.yml
version: 2
updates:
- package-ecosystem: "npm"
directory: "/"
schedule:
interval: "weekly"
open-pull-requests-limit: 10
- package-ecosystem: "pip"
directory: "/"
schedule:
interval: "weekly"
```
**GitHub Actions - Dependency Review:**
```yaml
- name: Dependency Review
uses: actions/dependency-review-action@v4
with:
fail-on-severity: high
```
**npm audit:**
```yaml
- name: npm audit
run: |
npm audit --audit-level=high
# Or with audit-ci for better control
npx audit-ci --high
```
**pip-audit (Python):**
```yaml
- name: Python Security Check
run: |
pip install pip-audit
pip-audit --requirement requirements.txt --format json --output pip-audit.json
```
**Snyk:**
```yaml
# GitHub Actions
- name: Run Snyk
uses: snyk/actions/node@master
env:
SNYK_TOKEN: ${{ secrets.SNYK_TOKEN }}
with:
args: --severity-threshold=high --fail-on=all
# GitLab CI
snyk:
stage: test
image: snyk/snyk:node
script:
- snyk test --severity-threshold=high --json-file-output=snyk-report.json
artifacts:
reports:
dependency_scanning: snyk-report.json
```
**OWASP Dependency-Check:**
```yaml
- name: OWASP Dependency Check
run: |
wget https://github.com/jeremylong/DependencyCheck/releases/download/v8.4.0/dependency-check-8.4.0-release.zip
unzip dependency-check-8.4.0-release.zip
./dependency-check/bin/dependency-check.sh \
--scan . \
--format JSON \
--out dependency-check-report.json \
--failOnCVSS 7
```
### GitLab Dependency Scanning (Built-in)
```yaml
include:
- template: Security/Dependency-Scanning.gitlab-ci.yml
dependency_scanning:
variables:
DS_EXCLUDED_PATHS: "test/,tests/,spec/,vendor/"
```
---
## Container Security
### Image Scanning
**Trivy (Comprehensive):**
```yaml
# GitHub Actions
- name: Run Trivy
uses: aquasecurity/trivy-action@master
with:
image-ref: myapp:${{ github.sha }}
format: 'sarif'
output: 'trivy-results.sarif'
severity: 'CRITICAL,HIGH'
exit-code: '1'
- name: Upload to Security tab
uses: github/codeql-action/upload-sarif@v3
if: always()
with:
sarif_file: 'trivy-results.sarif'
# GitLab CI
trivy:
stage: test
image: aquasec/trivy:latest
script:
- trivy image --severity HIGH,CRITICAL --format json --output trivy-report.json $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
- trivy image --severity HIGH,CRITICAL --exit-code 1 $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
artifacts:
reports:
container_scanning: trivy-report.json
```
**Grype:**
```yaml
- name: Scan with Grype
uses: anchore/scan-action@v3
with:
image: myapp:latest
fail-build: true
severity-cutoff: high
output-format: sarif
- name: Upload Grype results
uses: github/codeql-action/upload-sarif@v3
with:
sarif_file: ${{ steps.scan.outputs.sarif }}
```
**Clair:**
```yaml
clair:
stage: scan
image: arminc/clair-scanner:latest
script:
- clair-scanner --ip $(hostname -i) myapp:latest
```
### SBOM (Software Bill of Materials)
**Syft:**
```yaml
- name: Generate SBOM
uses: anchore/sbom-action@v0
with:
image: myapp:${{ github.sha }}
format: spdx-json
output-file: sbom.spdx.json
- name: Upload SBOM
uses: actions/upload-artifact@v4
with:
name: sbom
path: sbom.spdx.json
```
**CycloneDX:**
```yaml
- name: Generate CycloneDX SBOM
run: |
npm install -g @cyclonedx/cyclonedx-npm
cyclonedx-npm --output-file sbom.json
```
---
## Secret Scanning
### Pre-commit Prevention
**TruffleHog:**
```yaml
# GitHub Actions
- name: TruffleHog Scan
uses: trufflesecurity/trufflehog@main
with:
path: ./
base: ${{ github.event.repository.default_branch }}
head: HEAD
# GitLab CI
trufflehog:
stage: test
image: trufflesecurity/trufflehog:latest
script:
- trufflehog filesystem . --json --fail > trufflehog-report.json
```
**Gitleaks:**
```yaml
# GitHub Actions
- name: Gitleaks
uses: gitleaks/gitleaks-action@v2
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
# GitLab CI
gitleaks:
stage: test
image: zricethezav/gitleaks:latest
script:
- gitleaks detect --source . --report-format json --report-path gitleaks-report.json
```
**GitGuardian:**
```yaml
- name: GitGuardian scan
uses: GitGuardian/ggshield-action@master
env:
GITGUARDIAN_API_KEY: ${{ secrets.GITGUARDIAN_API_KEY }}
```
### GitHub Secret Scanning (Native)
Enable in: **Settings → Code security and analysis → Secret scanning**
- Automatic detection
- Partner patterns (AWS, Azure, GCP, etc.)
- Push protection (prevents commits with secrets)
---
## Security Gates & Quality Gates
### Fail Pipeline on Security Issues
**Threshold-based gates:**
```yaml
security-gate:
  stage: gate
  script:
    # Count vulnerabilities in the Trivy JSON report
    - |
      CRITICAL=$(jq '[.Results[].Vulnerabilities[]? | select(.Severity=="CRITICAL")] | length' trivy-report.json)
      HIGH=$(jq '[.Results[].Vulnerabilities[]? | select(.Severity=="HIGH")] | length' trivy-report.json)
      echo "Critical: $CRITICAL, High: $HIGH"
      if [ "$CRITICAL" -gt 0 ]; then
        echo "❌ CRITICAL vulnerabilities found!"
        exit 1
      fi
      if [ "$HIGH" -gt 5 ]; then
        echo "❌ Too many HIGH vulnerabilities: $HIGH"
        exit 1
      fi
```
**SonarQube Quality Gate:**
```yaml
- name: Check Quality Gate
run: |
STATUS=$(curl -u $SONAR_TOKEN: "$SONAR_HOST/api/qualitygates/project_status?projectKey=$PROJECT_KEY" | jq -r '.projectStatus.status')
if [ "$STATUS" != "OK" ]; then
echo "Quality gate failed: $STATUS"
exit 1
fi
```
### Manual Approval for Production
**GitHub Actions:**
```yaml
deploy-production:
runs-on: ubuntu-latest
needs: [sast, dast, container-scan]
environment:
name: production
# Requires manual approval in Settings → Environments
steps:
- run: echo "Deploying to production"
```
**GitLab CI:**
```yaml
deploy:production:
stage: deploy
needs: [sast, dast, container_scanning]
script:
- ./deploy.sh production
when: manual
only:
- main
```
---
## Compliance & License Scanning
### License Compliance
**FOSSology:**
```yaml
license-scan:
stage: compliance
image: fossology/fossology:latest
script:
- fossology --scan ./src
```
**License Finder:**
```yaml
- name: Check Licenses
run: |
gem install license_finder
license_finder --decisions-file .license_finder.yml
```
**npm license checker:**
```yaml
- name: License Check
run: |
npx license-checker --production --onlyAllow "MIT;Apache-2.0;BSD-3-Clause;ISC"
```
### Policy as Code
**Open Policy Agent (OPA):**
```yaml
policy-check:
stage: gate
image: openpolicyagent/opa:latest
script:
- opa test policies/
- opa eval --data policies/ --input violations.json "data.security.allow"
```
---
## Complete DevSecOps Pipeline
**Comprehensive example (GitHub Actions):**
```yaml
name: DevSecOps Pipeline
on: [push, pull_request]
jobs:
# Stage 1: Secret Scanning
secret-scan:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: trufflesecurity/trufflehog@main
# Stage 2: SAST
sast:
runs-on: ubuntu-latest
needs: secret-scan
steps:
- uses: actions/checkout@v4
- uses: github/codeql-action/init@v3
- uses: github/codeql-action/autobuild@v3
- uses: github/codeql-action/analyze@v3
# Stage 3: SCA
sca:
runs-on: ubuntu-latest
needs: secret-scan
steps:
- uses: actions/checkout@v4
- run: npm audit --audit-level=high
- uses: snyk/actions/node@master
env:
SNYK_TOKEN: ${{ secrets.SNYK_TOKEN }}
# Stage 4: Build & Container Scan
build-scan:
runs-on: ubuntu-latest
needs: [sast, sca]
steps:
- uses: actions/checkout@v4
- run: docker build -t myapp:${{ github.sha }} .
- uses: aquasecurity/trivy-action@master
with:
image-ref: myapp:${{ github.sha }}
exit-code: '1'
# Stage 5: DAST
dast:
runs-on: ubuntu-latest
needs: build-scan
if: github.ref == 'refs/heads/main'
steps:
- uses: zaproxy/action-baseline@v0.10.0
with:
target: 'https://staging.example.com'
# Stage 6: Security Gate
security-gate:
runs-on: ubuntu-latest
needs: [sast, sca, build-scan, dast]
steps:
- run: echo "All security checks passed!"
- run: echo "Ready for deployment"
# Stage 7: Deploy
deploy:
runs-on: ubuntu-latest
needs: security-gate
environment: production
steps:
- run: echo "Deploying to production"
```
---
## Best Practices
### 1. Fail Fast
- Run secret scanning first
- Run SAST early in pipeline
- Block PRs with critical vulnerabilities
### 2. Balance Speed vs Security
- SAST/SCA on every PR (fast)
- Container scanning after build
- DAST on schedules or staging only (slow)
### 3. Prioritize Findings
**Focus on:**
- Critical/High severity
- Exploitable vulnerabilities
- Direct dependencies (not transitive)
- Public-facing components
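Those priorities can be encoded as a simple triage sort; a hedged sketch over a generic findings list (the field names are assumptions, not any scanner's real schema):

```python
SEVERITY_RANK = {"CRITICAL": 0, "HIGH": 1, "MEDIUM": 2, "LOW": 3}

def triage(findings):
    """Order findings: severity first, then direct dependencies before transitive."""
    return sorted(
        findings,
        key=lambda f: (
            SEVERITY_RANK.get(f["severity"], 99),  # unknown severities sort last
            0 if f.get("direct", False) else 1,
        ),
    )

findings = [
    {"id": "CVE-1", "severity": "HIGH", "direct": False},
    {"id": "CVE-2", "severity": "CRITICAL", "direct": False},
    {"id": "CVE-3", "severity": "HIGH", "direct": True},
]
print([f["id"] for f in triage(findings)])  # critical first, then the direct HIGH
```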
### 4. Developer Experience
- Clear error messages
- Link to remediation guidance
- Don't overwhelm with noise
- Use quality gates, not just fail/pass
### 5. Continuous Improvement
- Track security debt over time
- Set SLAs for vulnerability remediation
- Regular tool evaluation
- Security training for developers
### 6. Reporting & Metrics
**Track:**
- Mean Time to Remediate (MTTR)
- Vulnerability backlog
- False positive rate
- Coverage (% of code scanned)
```yaml
- name: Generate Security Report
run: |
echo "## Security Scan Summary" >> $GITHUB_STEP_SUMMARY
echo "- SAST: ✅ Passed" >> $GITHUB_STEP_SUMMARY
echo "- SCA: ⚠️ 3 vulnerabilities" >> $GITHUB_STEP_SUMMARY
echo "- Container: ✅ Passed" >> $GITHUB_STEP_SUMMARY
echo "- DAST: 🔄 Scheduled" >> $GITHUB_STEP_SUMMARY
```
---
## Tool Comparison
| Category | Tool | Speed | Accuracy | Cost | Best For |
|----------|------|-------|----------|------|----------|
| **SAST** | CodeQL | Medium | High | Free (GH) | Deep analysis |
| | Semgrep | Fast | Medium | Free/Paid | Custom rules |
| | SonarQube | Medium | High | Free/Paid | Quality + Security |
| **DAST** | OWASP ZAP | Medium | High | Free | Web apps |
| | Burp Suite | Slow | High | Paid | Professional |
| **SCA** | Snyk | Fast | High | Free/Paid | Easy integration |
| | Dependabot | Fast | Medium | Free (GH) | Auto PRs |
| **Container** | Trivy | Fast | High | Free | Fast scans |
| | Grype | Fast | High | Free | SBOM support |
| **Secrets** | TruffleHog | Fast | High | Free/Paid | Git history |
| | GitGuardian | Fast | High | Paid | Real-time |
---
## Security Scanning Schedule
**Recommended frequency:**
| Scan Type | PR | Main Branch | Schedule | Notes |
|-----------|----|-----------|-----------| ------|
| Secret Scanning | ✅ Every | ✅ Every | - | Fast, critical |
| SAST | ✅ Every | ✅ Every | - | Fast, essential |
| SCA | ✅ Every | ✅ Every | Weekly | Check dependencies |
| Linting | ✅ Every | ✅ Every | - | Very fast |
| Container Scan | ❌ No | ✅ Every | - | After build |
| DAST Baseline | ❌ No | ✅ Every | - | Medium speed |
| DAST Full | ❌ No | ❌ No | Weekly | Very slow |
| Penetration Test | ❌ No | ❌ No | Quarterly | Manual |
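The per-trigger rows above can be wired into a single GitHub Actions workflow with event-gated jobs; a sketch (action versions and the staging URL are illustrative):

```yaml
on:
  pull_request:
  push:
    branches: [main]
  schedule:
    - cron: '0 3 * * 1'   # weekly full DAST, Monday 03:00 UTC

jobs:
  sast:                    # fast scan: every PR and push
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: github/codeql-action/init@v3
      - uses: github/codeql-action/autobuild@v3
      - uses: github/codeql-action/analyze@v3

  dast-full:               # slow scan: weekly schedule only, never on PRs
    if: github.event_name == 'schedule'
    runs-on: ubuntu-latest
    steps:
      - uses: zaproxy/action-full-scan@v0.10.0
        with:
          target: 'https://staging.example.com'
```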
---
## Security Checklist
- [ ] Secret scanning enabled and running
- [ ] SAST configured for all languages used
- [ ] Dependency scanning (SCA) enabled
- [ ] Container images scanned before deployment
- [ ] DAST running on staging environment
- [ ] Security findings triaged in issue tracker
- [ ] Quality gates prevent vulnerable deployments
- [ ] SBOM generated for releases
- [ ] Security scan results tracked over time
- [ ] Vulnerability remediation SLAs defined
- [ ] Security training for developers
- [ ] Incident response plan documented

651
references/optimization.md Normal file
View File

@@ -0,0 +1,651 @@
# CI/CD Pipeline Optimization
Comprehensive guide to improving pipeline performance through caching, parallelization, and smart resource usage.
## Table of Contents
- [Caching Strategies](#caching-strategies)
- [Parallelization Techniques](#parallelization-techniques)
- [Build Optimization](#build-optimization)
- [Test Optimization](#test-optimization)
- [Resource Management](#resource-management)
- [Monitoring & Metrics](#monitoring--metrics)
---
## Caching Strategies
### Dependency Caching
**Impact:** Can reduce build times by 50-90%
#### GitHub Actions
**Node.js/npm:**
```yaml
- uses: actions/cache@v4
with:
path: ~/.npm
key: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}
restore-keys: |
${{ runner.os }}-node-
- run: npm ci
```
**Python/pip:**
```yaml
- uses: actions/cache@v4
with:
path: ~/.cache/pip
key: ${{ runner.os }}-pip-${{ hashFiles('**/requirements.txt') }}
restore-keys: |
${{ runner.os }}-pip-
- run: pip install -r requirements.txt
```
**Go modules:**
```yaml
- uses: actions/cache@v4
with:
path: |
~/.cache/go-build
~/go/pkg/mod
key: ${{ runner.os }}-go-${{ hashFiles('**/go.sum') }}
restore-keys: |
${{ runner.os }}-go-
- run: go build
```
**Rust/Cargo:**
```yaml
- uses: actions/cache@v4
with:
path: |
~/.cargo/bin/
~/.cargo/registry/index/
~/.cargo/registry/cache/
~/.cargo/git/db/
target/
key: ${{ runner.os }}-cargo-${{ hashFiles('**/Cargo.lock') }}
restore-keys: |
${{ runner.os }}-cargo-
- run: cargo build --release
```
**Maven:**
```yaml
- uses: actions/cache@v4
with:
path: ~/.m2/repository
key: ${{ runner.os }}-maven-${{ hashFiles('**/pom.xml') }}
restore-keys: |
${{ runner.os }}-maven-
- run: mvn clean install
```
#### GitLab CI
**Global cache:**
```yaml
cache:
key: ${CI_COMMIT_REF_SLUG}
paths:
- node_modules/
- .npm/
- vendor/
```
**Job-specific cache:**
```yaml
build:
cache:
key: build-${CI_COMMIT_REF_SLUG}
paths:
- target/
policy: push # Upload only
test:
cache:
key: build-${CI_COMMIT_REF_SLUG}
paths:
- target/
policy: pull # Download only
```
**Cache with files checksum:**
```yaml
cache:
key:
files:
- package-lock.json
- yarn.lock
paths:
- node_modules/
```
### Build Artifact Caching
**Docker layer caching (GitHub):**
```yaml
- uses: docker/setup-buildx-action@v3
- uses: docker/build-push-action@v5
with:
context: .
cache-from: type=gha
cache-to: type=gha,mode=max
push: false
tags: myapp:latest
```
**Docker layer caching (GitLab):**
```yaml
build:
image: docker:latest
services:
- docker:dind
variables:
DOCKER_DRIVER: overlay2
script:
- docker pull $CI_REGISTRY_IMAGE:latest || true
- docker build --cache-from $CI_REGISTRY_IMAGE:latest -t $CI_REGISTRY_IMAGE:latest .
- docker push $CI_REGISTRY_IMAGE:latest
```
**Gradle build cache:**
```yaml
- uses: actions/cache@v4
with:
path: |
~/.gradle/caches
~/.gradle/wrapper
key: ${{ runner.os }}-gradle-${{ hashFiles('**/*.gradle*', '**/gradle-wrapper.properties') }}
- run: ./gradlew build --build-cache
```
### Cache Best Practices
**Key strategies:**
- Include OS/platform: `${{ runner.os }}-` or `${CI_RUNNER_OS}`
- Hash lock files: `hashFiles('**/package-lock.json')`
- Use restore-keys for fallback matches
- Separate caches for different purposes
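The strategies above reduce to: key = platform + toolchain + hash(lock file). A sketch of computing such a key by hand, e.g. for a custom caching step on a runner without a cache action (`cache_key` and the file names are illustrative):

```shell
# Build a cache key from OS, a tool label, and the lock-file hash --
# the same shape as "${{ runner.os }}-node-${{ hashFiles('lockfile') }}".
cache_key() {
  os="$1"; tool="$2"; lockfile="$3"
  hash=$(sha256sum "$lockfile" | cut -c1-16)   # short, stable digest
  echo "${os}-${tool}-${hash}"
}
```

Identical lock files always yield the same key, so the cache is invalidated exactly when dependencies change — the property the `hashFiles(...)` strategy relies on.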
**Cache invalidation:**
```yaml
# Version in cache key
cache:
key: v2-${CI_COMMIT_REF_SLUG}-${CI_PIPELINE_ID}
```
**Cache size management:**
- GitHub: 10GB per repository (LRU eviction; caches unused for 7 days are removed)
- GitLab: Configurable per runner
---
## Parallelization Techniques
### Job Parallelization
**Remove unnecessary dependencies:**
```yaml
# Before - Sequential
jobs:
lint:
test:
needs: lint
build:
needs: test
# After - Parallel
jobs:
lint:
test:
build:
needs: [lint, test] # Only wait for what's needed
```
### Matrix Builds
**GitHub Actions:**
```yaml
strategy:
matrix:
os: [ubuntu-latest, macos-latest, windows-latest]
node: [18, 20, 22]
include:
- os: ubuntu-latest
node: 22
coverage: true
exclude:
- os: macos-latest
node: 18
fail-fast: false
max-parallel: 10 # Limit concurrent jobs
```
**GitLab parallel:**
```yaml
test:
parallel:
matrix:
- NODE_VERSION: ['18', '20', '22']
TEST_SUITE: ['unit', 'integration']
script:
- nvm use $NODE_VERSION
- npm run test:$TEST_SUITE
```
### Test Splitting
**Jest sharding:**
```yaml
strategy:
matrix:
shard: [1, 2, 3, 4]
steps:
- run: npm test -- --shard=${{ matrix.shard }}/4
```
**Playwright sharding:**
```yaml
strategy:
matrix:
shardIndex: [1, 2, 3, 4]
shardTotal: [4]
steps:
- run: npx playwright test --shard=${{ matrix.shardIndex }}/${{ matrix.shardTotal }}
```
**Pytest splitting:**
```yaml
strategy:
matrix:
group: [1, 2, 3, 4]
steps:
- run: pytest --splits 4 --group ${{ matrix.group }}
```
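All three sharding flags do essentially the same thing: deterministically partition the sorted test list into N groups and run only group k. A minimal sketch of that round-robin partitioning (`shard_tests` is illustrative, not any runner's exact algorithm):

```shell
# Print the tests belonging to shard $1 of $2 (1-based), round-robin over
# the input list. With a sorted input, the split is stable across runs,
# so every shard sees a disjoint, repeatable subset.
shard_tests() {
  shard="$1"; total="$2"; shift 2
  i=0
  for t in "$@"; do
    i=$((i + 1))
    if [ $(( (i - 1) % total )) -eq $((shard - 1)) ]; then
      echo "$t"
    fi
  done
}
```

Usage: `shard_tests 2 4 $(ls tests/*.py | sort)` — sort first, or shards will overlap between runs when file enumeration order changes.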
### Conditional Execution
**Path-based (GitHub):**
```yaml
# `contains(github.event.head_commit.modified, 'frontend/')` is unreliable:
# it inspects only the head commit and matches exact array items, not path
# prefixes. Use trigger-level `paths` filters (or the dorny/paths-filter
# action for per-job conditions) instead.
on:
  push:
    paths:
      - 'frontend/**'
      - 'backend/**'
```
**GitLab rules:**
```yaml
frontend-test:
rules:
- changes:
- frontend/**/*
backend-test:
rules:
- changes:
- backend/**/*
```
---
## Build Optimization
### Incremental Builds
**Turborepo (monorepo):**
```yaml
- run: npx turbo run build test lint --filter=[HEAD^1]
```
**Nx (monorepo):**
```yaml
- run: npx nx affected --target=build --base=origin/main
```
### Compiler Optimizations
**TypeScript incremental:**
```json
{
"compilerOptions": {
"incremental": true,
"tsBuildInfoFile": ".tsbuildinfo"
}
}
```
**Cache tsbuildinfo:**
```yaml
- uses: actions/cache@v4
with:
path: .tsbuildinfo
key: ts-build-${{ hashFiles('**/*.ts') }}
```
### Multi-stage Docker Builds
```dockerfile
# Build stage
FROM node:20 AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci                 # full install: the build needs devDependencies
COPY . .
RUN npm run build
RUN npm prune --omit=dev   # keep only runtime deps for the copy below
# Production stage
FROM node:20-alpine
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
CMD ["node", "dist/server.js"]
```
### Build Tool Configuration
**Webpack production mode:**
```javascript
module.exports = {
mode: 'production',
optimization: {
minimize: true,
splitChunks: {
chunks: 'all'
}
}
}
```
**Vite optimization:**
```javascript
export default {
build: {
minify: 'terser',
rollupOptions: {
output: {
manualChunks(id) {
if (id.includes('node_modules')) {
return 'vendor';
}
}
}
}
}
}
```
---
## Test Optimization
### Test Categorization
**Run fast tests first:**
```yaml
jobs:
unit-test:
runs-on: ubuntu-latest
steps:
- run: npm run test:unit # Fast (1-5 min)
integration-test:
needs: unit-test
runs-on: ubuntu-latest
steps:
- run: npm run test:integration # Medium (5-15 min)
e2e-test:
needs: [unit-test, integration-test]
if: github.ref == 'refs/heads/main'
runs-on: ubuntu-latest
steps:
- run: npm run test:e2e # Slow (15-30 min)
```
### Selective Test Execution
**Run only changed:**
```yaml
- name: Get changed files
id: changed
run: |
if [ "${{ github.event_name }}" == "pull_request" ]; then
echo "files=$(git diff --name-only origin/${{ github.base_ref }}...HEAD | tr '\n' ' ')" >> $GITHUB_OUTPUT
fi
- name: Run affected tests
if: steps.changed.outputs.files
run: npm test -- --findRelatedTests ${{ steps.changed.outputs.files }}
```
### Test Fixtures & Data
**Reuse test databases:**
```yaml
services:
postgres:
image: postgres:15
env:
POSTGRES_DB: testdb
POSTGRES_PASSWORD: testpass
options: >-
--health-cmd pg_isready
--health-interval 10s
--health-timeout 5s
--health-retries 5
steps:
- run: npm test # All tests share same DB
```
**Snapshot testing:**
```javascript
// Faster than full rendering tests
expect(component).toMatchSnapshot();
```
### Mock External Services
```javascript
// Instead of hitting real APIs
jest.mock('./api', () => ({
fetchData: jest.fn(() => Promise.resolve(mockData))
}));
```
---
## Resource Management
### Job Timeouts
**Prevent hung jobs:**
```yaml
jobs:
test:
timeout-minutes: 30 # Default: 360 (6 hours)
build:
timeout-minutes: 15
```
**GitLab:**
```yaml
test:
timeout: 30m # Default: 1h
```
### Concurrency Control
**GitHub Actions:**
```yaml
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true # Cancel old runs
```
**GitLab:**
```yaml
workflow:
auto_cancel:
on_new_commit: interruptible
job:
interruptible: true
```
### Resource Allocation
**GitLab runner tags:**
```yaml
build:
tags:
- high-memory
- ssd
```
**Kubernetes resource limits:**
```yaml
# GitLab Runner config
[[runners]]
[runners.kubernetes]
cpu_request = "1"
cpu_limit = "2"
memory_request = "2Gi"
memory_limit = "4Gi"
```
---
## Monitoring & Metrics
### Track Key Metrics
**Build duration:**
```yaml
- name: Track duration
run: |
START=$SECONDS
npm run build
DURATION=$((SECONDS - START))
echo "Build took ${DURATION}s"
```
**Cache hit rate:**
```yaml
- uses: actions/cache@v4
id: cache
with:
path: node_modules
key: ${{ hashFiles('package-lock.json') }}
- name: Cache stats
run: |
if [ "${{ steps.cache.outputs.cache-hit }}" == "true" ]; then
echo "Cache hit!"
else
echo "Cache miss"
fi
```
### Performance Regression Detection
**Compare against baseline:**
```yaml
- name: Benchmark
run: npm run benchmark > results.json
- name: Compare
run: |
    CURRENT=$(jq '.duration | floor' results.json)  # floor: [ -gt ] compares integers only
BASELINE=120
if [ $CURRENT -gt $((BASELINE * 120 / 100)) ]; then
echo "Performance regression: ${CURRENT}s vs ${BASELINE}s baseline"
exit 1
fi
```
### External Monitoring
**DataDog CI Visibility:**
```yaml
- run: datadog-ci junit upload --service myapp junit-results.xml
```
**BuildPulse (flaky test detection):**
```yaml
- uses: buildpulse/buildpulse-action@v0.11.0
with:
account: myaccount
repository: myrepo
path: test-results/*.xml
```
---
## Optimization Checklist
### Quick Wins
- [ ] Enable dependency caching
- [ ] Remove unnecessary job dependencies
- [ ] Add job timeouts
- [ ] Enable concurrency cancellation
- [ ] Use `npm ci` instead of `npm install`
### Medium Impact
- [ ] Implement test sharding
- [ ] Use Docker layer caching
- [ ] Add path-based triggers
- [ ] Split slow test suites
- [ ] Use matrix builds for parallel execution
### Advanced
- [ ] Implement incremental builds (Nx, Turborepo)
- [ ] Use remote caching
- [ ] Optimize Docker images (multi-stage, distroless)
- [ ] Implement test impact analysis
- [ ] Set up distributed test execution
### Monitoring
- [ ] Track build duration trends
- [ ] Monitor cache hit rates
- [ ] Identify flaky tests
- [ ] Measure test execution time
- [ ] Set up performance regression alerts
---
## Performance Targets
**Build times:**
- Lint: < 1 minute
- Unit tests: < 5 minutes
- Integration tests: < 15 minutes
- E2E tests: < 30 minutes
- Full pipeline: < 20 minutes
**Resource usage:**
- Cache hit rate: > 80%
- Job success rate: > 95%
- Concurrent jobs: Balanced across available runners
- Queue time: < 2 minutes
**Cost optimization:**
- Build minutes used: Monitor monthly trends
- Storage: Keep artifacts < 7 days unless needed
- Self-hosted runners: Monitor utilization (target 60-80%)

---

**File: `references/security.md`**
# CI/CD Security
Comprehensive guide to securing CI/CD pipelines, secrets management, and supply chain security.
## Table of Contents
- [Secrets Management](#secrets-management)
- [OIDC Authentication](#oidc-authentication)
- [Supply Chain Security](#supply-chain-security)
- [Access Control](#access-control)
- [Secure Pipeline Patterns](#secure-pipeline-patterns)
- [Vulnerability Scanning](#vulnerability-scanning)
---
## Secrets Management
### Never Commit Secrets
**Prevention methods:**
- Use `.gitignore` for sensitive files
- Enable pre-commit hooks (git-secrets, gitleaks)
- Use secret scanning (GitHub, GitLab)
**If secrets are exposed:**
1. Rotate compromised credentials immediately
2. Remove from git history: `git filter-repo` or BFG Repo-Cleaner
3. Audit access logs for unauthorized usage
### Platform Secret Stores
**GitHub Secrets:**
```yaml
# Repository, Environment, or Organization secrets
steps:
- name: Deploy
env:
API_KEY: ${{ secrets.API_KEY }}
DB_PASSWORD: ${{ secrets.DB_PASSWORD }}
run: ./deploy.sh
```
**Secret hierarchy:**
1. Environment secrets (highest priority)
2. Repository secrets
3. Organization secrets (lowest priority)
**GitLab CI/CD Variables:**
```yaml
# Project > Settings > CI/CD > Variables
deploy:
script:
- echo $API_KEY
- deploy --token $DEPLOY_TOKEN
variables:
ENVIRONMENT: "production" # Non-secret variable
```
**Variable types:**
- **Protected:** Only available on protected branches
- **Masked:** Hidden in job logs
- **Environment scope:** Limit to specific environments
### External Secret Management
**HashiCorp Vault:**
```yaml
# GitHub Actions
- uses: hashicorp/vault-action@v3
with:
url: https://vault.example.com
method: jwt
role: cicd-role
secrets: |
secret/data/app api_key | API_KEY ;
secret/data/db password | DB_PASSWORD
```
**AWS Secrets Manager:**
```yaml
- name: Get secrets
run: |
SECRET=$(aws secretsmanager get-secret-value \
--secret-id prod/api/key \
--query SecretString --output text)
echo "::add-mask::$SECRET"
echo "API_KEY=$SECRET" >> $GITHUB_ENV
```
**Azure Key Vault:**
```yaml
- uses: Azure/get-keyvault-secrets@v1
with:
keyvault: "my-keyvault"
secrets: 'api-key, db-password'
```
### Secret Rotation
**Implement rotation policies:**
```yaml
check-secret-age:
steps:
- name: Check secret age
run: |
CREATED=$(aws secretsmanager describe-secret \
--secret-id myapp/api-key \
--query 'CreatedDate' --output text)
AGE=$(( ($(date +%s) - $(date -d "$CREATED" +%s)) / 86400 ))
if [ $AGE -gt 90 ]; then
echo "Secret is $AGE days old, rotation required"
exit 1
fi
```
**Best practices:**
- Rotate secrets every 90 days
- Use short-lived credentials when possible
- Audit secret access logs
- Automate rotation where possible
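The 90-day check above is ultimately just epoch arithmetic, so it can be made cloud-agnostic. A hedged sketch (GNU `date` assumed, as on Linux runners; `secret_age_days` is an illustrative name):

```shell
# Days elapsed since a secret's creation timestamp (ISO 8601).
# Requires GNU date for the -d option; the second argument defaults
# to "now" and exists mainly to make the function testable.
secret_age_days() {
  created="$1"; now="${2:-$(date -u +%Y-%m-%dT%H:%M:%SZ)}"
  c=$(date -u -d "$created" +%s)
  n=$(date -u -d "$now" +%s)
  echo $(( (n - c) / 86400 ))
}
```

Usage in a rotation check: `[ "$(secret_age_days "$CREATED")" -gt 90 ] && echo "rotation required" && exit 1`.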
---
## OIDC Authentication
### Why OIDC?
**Benefits over static credentials:**
- No long-lived secrets in CI/CD
- Automatic token expiration
- Fine-grained permissions
- Audit trail of authentication
### GitHub Actions OIDC
**AWS example:**
```yaml
permissions:
id-token: write # Required for OIDC
contents: read
jobs:
deploy:
steps:
- uses: aws-actions/configure-aws-credentials@v4
with:
role-to-assume: arn:aws:iam::123456789:role/GitHubActionsRole
aws-region: us-east-1
- run: aws s3 sync dist/ s3://my-bucket
```
**AWS IAM Trust Policy:**
```json
{
"Version": "2012-10-17",
"Statement": [{
"Effect": "Allow",
"Principal": {
"Federated": "arn:aws:iam::123456789:oidc-provider/token.actions.githubusercontent.com"
},
"Action": "sts:AssumeRoleWithWebIdentity",
"Condition": {
"StringEquals": {
"token.actions.githubusercontent.com:aud": "sts.amazonaws.com",
"token.actions.githubusercontent.com:sub": "repo:owner/repo:ref:refs/heads/main"
}
}
}]
}
```
**GCP example:**
```yaml
- uses: google-github-actions/auth@v2
with:
workload_identity_provider: 'projects/123/locations/global/workloadIdentityPools/github/providers/github-provider'
service_account: 'github-actions@project.iam.gserviceaccount.com'
- run: gcloud storage cp dist/* gs://my-bucket
```
**Azure example:**
```yaml
- uses: azure/login@v2
with:
client-id: ${{ secrets.AZURE_CLIENT_ID }}
tenant-id: ${{ secrets.AZURE_TENANT_ID }}
subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
- run: az storage blob upload-batch -d mycontainer -s dist/
```
### GitLab OIDC
**Configure ID token:**
```yaml
deploy:
id_tokens:
GITLAB_OIDC_TOKEN:
aud: https://aws.amazonaws.com
script:
- |
CREDENTIALS=$(aws sts assume-role-with-web-identity \
--role-arn $AWS_ROLE_ARN \
--role-session-name gitlab-ci \
--web-identity-token $GITLAB_OIDC_TOKEN \
--duration-seconds 3600)
```
**Vault integration:**
```yaml
deploy:
id_tokens:
VAULT_ID_TOKEN:
aud: https://vault.example.com
before_script:
- export VAULT_TOKEN=$(vault write -field=token auth/jwt/login role=cicd-role jwt=$VAULT_ID_TOKEN)
```
---
## Supply Chain Security
### Dependency Verification
**Lock files:**
- Always commit lock files
- Use `npm ci`, not `npm install`
- Enable `--frozen-lockfile` (Yarn classic, pnpm) or `--immutable` (Yarn 2+)
**Checksum verification:**
```yaml
- name: Verify dependencies
run: |
npm ci --audit=true
npx lockfile-lint --path package-lock.json --validate-https
```
**SBOM generation:**
```yaml
- name: Generate SBOM
run: |
syft dir:. -o spdx-json > sbom.json
- uses: actions/upload-artifact@v4
with:
name: sbom
path: sbom.json
```
### Action/Workflow Security
**Pin to commit SHA (GitHub):**
```yaml
# Bad - mutable tag
- uses: actions/checkout@v4
# Better - specific version
- uses: actions/checkout@v4.1.0
# Best - pinned to SHA
- uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.0
```
**Verify action sources:**
- Only use actions from trusted sources
- Review action code before first use
- Monitor Dependabot alerts for actions
- Use verified creators when possible
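A quick way to audit the pinning advice above is to grep workflows for `uses:` lines whose ref is not a full 40-character commit SHA — a rough sketch (`unpinned_uses` is an illustrative name, and this is a regex heuristic, not a YAML parser):

```shell
# List `uses:` references in workflow files that are NOT pinned to a
# full 40-hex-char commit SHA. Heuristic grep, not a YAML parser.
unpinned_uses() {
  grep -h 'uses:' "$@" \
    | grep -v 'uses:.*@[0-9a-f]\{40\}' \
    | sed 's/^ *//'
}
```

Usage: `unpinned_uses .github/workflows/*.yml` — any output means something is still floating on a mutable tag.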
**GitLab include verification:**
```yaml
include:
- project: 'security/ci-templates'
ref: 'v2.1.0' # Pin to specific version
file: '/security-scan.yml'
```
### Container Image Security
**Use specific tags:**
```yaml
# Bad
image: node:latest
# Good
image: node:20.11.0-alpine
# Best
image: node:20.11.0-alpine@sha256:abc123...
```
**Minimal base images:**
```dockerfile
# Prefer distroless or alpine
FROM gcr.io/distroless/node20-debian12
# Or alpine
FROM node:20-alpine
```
**Image scanning:**
```yaml
- name: Build image
run: docker build -t myapp:${{ github.sha }} .
- name: Scan image
run: |
trivy image --severity HIGH,CRITICAL myapp:${{ github.sha }}
grype myapp:${{ github.sha }}
```
### Code Signing
**Sign commits:**
```bash
git config --global user.signingkey <key-id>
git config --global commit.gpgsign true
```
**Verify signed commits (GitHub):**
```yaml
- name: Verify signatures
run: |
git verify-commit HEAD || exit 1
```
**Sign artifacts:**
```yaml
- name: Sign release
run: |
cosign sign myregistry/myapp:${{ github.sha }}
```
---
## Access Control
### Principle of Least Privilege
**GitHub permissions:**
```yaml
# Minimal permissions
permissions:
contents: read # Only read code
pull-requests: write # Comment on PRs
jobs:
deploy:
permissions:
contents: read
id-token: write # For OIDC
```
**GitLab protected branches:**
- Configure in Settings > Repository > Protected branches
- Restrict who can push and merge
- Require approval before merge
### Branch Protection
**GitHub branch protection rules:**
- Require pull request reviews
- Require status checks to pass
- Require signed commits
- Require linear history
- Include administrators
- Restrict who can push
**GitLab merge request approval rules:**
```yaml
# .gitlab/CODEOWNERS
* @senior-devs
/infra/ @devops-team
/security/ @security-team
```
### Environment Protection
**GitHub environment rules:**
- Required reviewers (up to 6)
- Wait timer before deployment
- Deployment branches (limit to specific branches)
- Custom deployment protection rules
**GitLab deployment protection:**
```yaml
production:
  environment:
    name: production
  rules:                       # `rules` cannot be combined with `only`/`except`
    - if: '$CI_COMMIT_BRANCH == "main" && $APPROVED == "true"'
      when: manual             # Require manual trigger
```
### Audit Logging
**Enable audit logs:**
- GitHub: Enterprise > Settings > Audit log
- GitLab: Admin Area > Monitoring > Audit Events
**Monitor for:**
- Secret access
- Permission changes
- Workflow modifications
- Deployment approvals
---
## Secure Pipeline Patterns
### Isolate Untrusted Code
**Separate test from deploy:**
```yaml
test:
# Runs on PRs from forks
permissions:
contents: read
pull-requests: write
deploy:
if: github.event_name == 'push' # Not on PR
permissions:
contents: read
id-token: write
```
**GitLab fork protection:**
```yaml
deploy:
rules:
- if: '$CI_PROJECT_PATH == "myorg/myrepo"' # Only from main repo
- if: '$CI_COMMIT_BRANCH == "main"'
```
### Sanitize Inputs
**Avoid command injection:**
```yaml
# Bad - command injection risk
- run: echo "Title: ${{ github.event.issue.title }}"
# Good - use environment variable
- env:
TITLE: ${{ github.event.issue.title }}
run: echo "Title: $TITLE"
```
**Validate inputs:**
```yaml
- name: Validate version
  env:
    VERSION: ${{ inputs.version }}  # pass through env to avoid injection
  run: |
    if [[ ! "$VERSION" =~ ^[0-9]+\.[0-9]+\.[0-9]+$ ]]; then
      echo "Invalid version format"
      exit 1
    fi
```
### Network Restrictions
**Limit egress:**
```yaml
# GitHub Actions with StepSecurity
- uses: step-security/harden-runner@v2
with:
egress-policy: block
allowed-endpoints: |
api.github.com:443
npmjs.org:443
```
**GitLab network policy:**
```yaml
# Kubernetes NetworkPolicy for GitLab Runner pods
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: gitlab-runner-policy
spec:
podSelector:
matchLabels:
app: gitlab-runner
policyTypes:
- Egress
egress:
- to:
- namespaceSelector: {}
ports:
- protocol: TCP
port: 443
```
---
## Vulnerability Scanning
### Dependency Scanning
**npm audit:**
```yaml
- run: npm audit --audit-level=high
```
**Snyk:**
```yaml
- uses: snyk/actions/node@master
env:
SNYK_TOKEN: ${{ secrets.SNYK_TOKEN }}
with:
args: --severity-threshold=high
```
**GitLab Dependency Scanning:**
```yaml
include:
- template: Security/Dependency-Scanning.gitlab-ci.yml
```
### Static Application Security Testing (SAST)
**CodeQL (GitHub):**
```yaml
- uses: github/codeql-action/init@v3
with:
languages: javascript, python
- uses: github/codeql-action/autobuild@v3
- uses: github/codeql-action/analyze@v3
```
**SonarQube:**
```yaml
- uses: sonarsource/sonarqube-scan-action@master
env:
SONAR_TOKEN: ${{ secrets.SONAR_TOKEN }}
```
### Container Scanning
**Trivy:**
```yaml
- run: |
docker build -t myapp .
trivy image --severity HIGH,CRITICAL --exit-code 1 myapp
```
**Grype:**
```yaml
- uses: anchore/scan-action@v3
with:
image: myapp:latest
fail-build: true
severity-cutoff: high
```
### Dynamic Application Security Testing (DAST)
**OWASP ZAP:**
```yaml
dast:
stage: test
  image: ghcr.io/zaproxy/zaproxy:stable  # the owasp/zap2docker-* images are deprecated
script:
- zap-baseline.py -t https://staging.example.com -r report.html
artifacts:
paths:
- report.html
```
---
## Security Checklist
### Repository Level
- [ ] Enable branch protection
- [ ] Require code review
- [ ] Enable secret scanning
- [ ] Configure CODEOWNERS
- [ ] Enable signed commits
- [ ] Audit third-party integrations
### Pipeline Level
- [ ] Use OIDC instead of static credentials
- [ ] Pin actions/includes to specific versions
- [ ] Minimize permissions
- [ ] Sanitize user inputs
- [ ] Enable vulnerability scanning
- [ ] Separate test from deploy workflows
- [ ] Add security gates
### Secrets Management
- [ ] Use platform secret stores
- [ ] Enable secret masking
- [ ] Rotate secrets regularly
- [ ] Use short-lived credentials
- [ ] Audit secret access
- [ ] Never log secrets
### Monitoring & Response
- [ ] Enable audit logging
- [ ] Monitor for security alerts
- [ ] Set up incident response plan
- [ ] Regular security reviews
- [ ] Dependency update automation
- [ ] Security training for team

---
# CI/CD Troubleshooting
Comprehensive guide to diagnosing and resolving common CI/CD pipeline issues.
## Table of Contents
- [Pipeline Failures](#pipeline-failures)
- [Dependency Issues](#dependency-issues)
- [Docker & Container Problems](#docker--container-problems)
- [Authentication & Permissions](#authentication--permissions)
- [Performance Issues](#performance-issues)
- [Platform-Specific Issues](#platform-specific-issues)
---
## Pipeline Failures
### Workflow Not Triggering
**GitHub Actions:**
**Symptoms:** Workflow doesn't run on push/PR
**Common causes:**
1. Workflow file in wrong location (must be `.github/workflows/`)
2. Invalid YAML syntax
3. Branch/path filters excluding the changes
4. Workflow disabled in repository settings
**Diagnostics:**
```bash
# Validate YAML
yamllint .github/workflows/ci.yml
# Check if workflow is disabled
gh workflow list --repo owner/repo
```
**Solutions:**
```yaml
# Check trigger configuration
on:
push:
branches: [main] # Ensure your branch matches
paths-ignore:
- 'docs/**' # May be excluding your changes
```

```bash
# Re-enable a disabled workflow
gh workflow enable ci.yml --repo owner/repo
```
**GitLab CI:**
**Symptoms:** Pipeline doesn't start
**Diagnostics:**
```bash
# Validate .gitlab-ci.yml
glab ci lint  # GitLab CLI; or use the project's CI Lint page in the UI
# Check CI/CD settings
# Project > Settings > CI/CD > General pipelines
```
**Solutions:**
- Check if CI/CD is enabled for the project
- Verify `.gitlab-ci.yml` is in repository root
- Check pipeline must succeed setting isn't blocking
- Review `only`/`except` or `rules` configuration
### Jobs Failing Intermittently
**Symptoms:** Same job passes sometimes, fails others
**Common causes:**
1. Flaky tests
2. Race conditions
3. Network timeouts
4. Resource constraints
5. Time-dependent tests
**Identify flaky tests:**
```yaml
# GitHub Actions - Run multiple times
strategy:
matrix:
attempt: [1, 2, 3, 4, 5]
steps:
- run: npm test
```
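Once the matrix above has produced several attempts, flakiness is just the failure rate over identical runs. A small sketch for summarizing such results (`pass_rate` and the 1/0 outcome format are illustrative assumptions):

```shell
# Given a series of 1/0 outcomes for the same test (1 = pass),
# print the pass percentage; anything strictly between 0 and 100
# over identical runs indicates a flaky test.
pass_rate() {
  total=0; passed=0
  for r in "$@"; do
    total=$((total + 1))
    [ "$r" = "1" ] && passed=$((passed + 1))
  done
  echo $(( passed * 100 / total ))
}
```

A test that passes 4 of 5 identical runs has an 80% pass rate — deterministic tests should score exactly 0 or 100.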
**Solutions:**
```javascript
// Add retries to flaky tests
jest.retryTimes(3);
// Increase timeouts
jest.setTimeout(30000);
// Fix race conditions
await waitFor(() => expect(element).toBeInDocument(), {
timeout: 5000
});
```
**Network retry pattern:**
```yaml
- name: Install with retry
uses: nick-invision/retry@v2
with:
timeout_minutes: 10
max_attempts: 3
command: npm ci
```
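The retry action above can also be approximated in plain shell when adding another third-party action isn't desirable — a sketch with exponential backoff (`retry` is an illustrative name):

```shell
# Retry a command up to $1 times with exponential backoff (1s, 2s, 4s, ...).
retry() {
  max="$1"; shift
  attempt=1; delay=1
  while true; do
    "$@" && return 0           # success
    if [ "$attempt" -ge "$max" ]; then
      echo "retry: giving up after $attempt attempts" >&2
      return 1
    fi
    sleep "$delay"
    attempt=$((attempt + 1)); delay=$((delay * 2))
  done
}
```

Usage: `retry 3 npm ci`. Backoff matters for registry/network flakes — immediate retries tend to hit the same transient failure.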
### Timeout Errors
**Symptoms:** "Job exceeded maximum time" or similar
**Solutions:**
```yaml
# GitHub Actions - Increase timeout
jobs:
build:
timeout-minutes: 60 # Default: 360
# GitLab CI
test:
timeout: 2h # Default: 1h
```
**Optimize long-running jobs:**
- Add caching for dependencies
- Split tests into parallel jobs
- Use faster runners
- Identify and optimize slow tests
### Exit Code Errors
**Symptoms:** "Process completed with exit code 1"
**Diagnostics:**
```yaml
# Add verbose logging
- run: npm test -- --verbose
# Check specific exit codes
- run: |
    EXIT_CODE=0
    npm test || EXIT_CODE=$?   # capture on the same line; run steps use `bash -e`
echo "Exit code: $EXIT_CODE"
if [ $EXIT_CODE -eq 127 ]; then
echo "Command not found"
elif [ $EXIT_CODE -eq 1 ]; then
echo "General error"
fi
exit $EXIT_CODE
```
**Common exit codes:**
- `1`: General error
- `2`: Misuse of shell command
- `126`: Command cannot execute
- `127`: Command not found
- `130`: Terminated by Ctrl+C
- `137`: Killed (OOM)
- `143`: Terminated (SIGTERM)
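The table above can be wrapped into a tiny helper for clearer log output — a sketch (`explain_exit` is an illustrative name):

```shell
# Translate common CI exit codes into a human-readable hint for job logs.
explain_exit() {
  case "$1" in
    0)   echo "Success" ;;
    1)   echo "General error" ;;
    2)   echo "Misuse of shell command" ;;
    126) echo "Command cannot execute" ;;
    127) echo "Command not found" ;;
    130) echo "Terminated by Ctrl+C" ;;
    137) echo "Killed (OOM)" ;;
    143) echo "Terminated (SIGTERM)" ;;
    *)   echo "Unknown exit code $1" ;;
  esac
}
```

Usage after a failing step: `explain_exit $EXIT_CODE` — 137 in particular is worth surfacing, since OOM kills often masquerade as generic test failures.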
---
## Dependency Issues
### "Module not found" or "Cannot find package"
**Symptoms:** Build fails with missing dependency error
**Causes:**
1. Missing dependency in `package.json`
2. Cache corruption
3. Lock file out of sync
4. Private package access issues
**Solutions:**
```yaml
# Clear cache and reinstall
- run: rm -rf node_modules package-lock.json
- run: npm install
# Use npm ci for clean install
- run: npm ci
# Clear GitHub Actions cache
# Settings > Actions > Caches > Delete specific cache
# GitLab - clear cache
cache:
key: $CI_COMMIT_REF_SLUG
policy: push # Force new cache
```
### Version Conflicts
**Symptoms:** Dependency resolution errors, peer dependency warnings
**Diagnostics:**
```bash
# Check for conflicts
npm ls
npm outdated
# View dependency tree
npm list --depth=1
```
**Solutions:**
```json
// Use overrides (package.json)
{
"overrides": {
"problematic-package": "2.0.0"
}
}
// Or resolutions (Yarn)
{
"resolutions": {
"problematic-package": "2.0.0"
}
}
```
### Private Package Access
**Symptoms:** "401 Unauthorized" or "404 Not Found" for private packages
**GitHub Packages:**
```yaml
- run: |
echo "@myorg:registry=https://npm.pkg.github.com" >> .npmrc
echo "//npm.pkg.github.com/:_authToken=${{ secrets.GITHUB_TOKEN }}" >> .npmrc
- run: npm ci
```
**npm Registry:**
```yaml
- run: echo "//registry.npmjs.org/:_authToken=${{ secrets.NPM_TOKEN }}" >> .npmrc
- run: npm ci
```
**GitLab Package Registry:**
```yaml
before_script:
- echo "@mygroup:registry=${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/packages/npm/" >> .npmrc
- echo "${CI_API_V4_URL#https?}/projects/${CI_PROJECT_ID}/packages/npm/:_authToken=${CI_JOB_TOKEN}" >> .npmrc
```
---
## Docker & Container Problems
### "Cannot connect to Docker daemon"
**Symptoms:** Docker commands fail with connection error
**GitHub Actions:**
```yaml
# Ensure Docker is available
runs-on: ubuntu-latest # Has Docker pre-installed
steps:
- run: docker ps # Test Docker access
```
**GitLab CI:**
```yaml
# Use Docker-in-Docker
image: docker:latest
services:
- docker:dind
variables:
DOCKER_HOST: tcp://docker:2376
DOCKER_TLS_CERTDIR: "/certs"
DOCKER_TLS_VERIFY: 1
DOCKER_CERT_PATH: "$DOCKER_TLS_CERTDIR/client"
```
### Image Pull Errors
**Symptoms:** "Error response from daemon: pull access denied" or timeout
**Solutions:**
```yaml
# GitHub Actions - Login to registry
- uses: docker/login-action@v3
with:
registry: ghcr.io
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
# Or for Docker Hub
- uses: docker/login-action@v3
with:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
# Add retry logic
- run: |
for i in {1..3}; do
docker pull myimage:latest && break
sleep 5
done
```
### "No space left on device"
**Symptoms:** Docker build fails with disk space error
**Solutions:**
```yaml
# GitHub Actions - Clean up space
- run: docker system prune -af --volumes
# Or use built-in action
- uses: jlumbroso/free-disk-space@main
with:
tool-cache: true
android: true
dotnet: true
```

```toml
# GitLab - configure the runner (config.toml)
[[runners]]
  [runners.docker]
    volumes = ["/var/run/docker.sock:/var/run/docker.sock", "/cache"]
  [runners.docker.tmpfs]
    "/tmp" = "rw,noexec"
```
### Multi-platform Build Issues
**Symptoms:** Build fails for ARM/different architecture
**Solution:**
```yaml
- uses: docker/setup-qemu-action@v3
- uses: docker/setup-buildx-action@v3
- uses: docker/build-push-action@v5
with:
platforms: linux/amd64,linux/arm64
context: .
push: false
```
---
## Authentication & Permissions
### "Permission denied" or "403 Forbidden"
**GitHub Actions:**
**Symptoms:** Cannot push, create release, or access API
**Solutions:**
```yaml
# Add necessary permissions
permissions:
contents: write # For pushing tags/releases
pull-requests: write # For commenting on PRs
packages: write # For pushing packages
id-token: write # For OIDC
# Check GITHUB_TOKEN permissions
- run: |
curl -H "Authorization: token ${{ secrets.GITHUB_TOKEN }}" \
https://api.github.com/repos/${{ github.repository }}
```
**GitLab CI:**
**Symptoms:** Cannot push to repository or access API
**Solutions:**
```yaml
# Use CI_JOB_TOKEN for API access
script:
- 'curl --header "JOB-TOKEN: $CI_JOB_TOKEN" "${CI_API_V4_URL}/projects"'
# Or use personal/project access token
variables:
GIT_STRATEGY: clone
before_script:
- git config --global user.email "ci@example.com"
- git config --global user.name "CI Bot"
```
### Git Push Failures
**Symptoms:** "failed to push some refs" or "protected branch"
**Solutions:**
```yaml
# GitHub Actions - Check branch protection
# Settings > Branches > Branch protection rules
# Allow bypass
permissions:
contents: write
# Or use PAT with admin access
- uses: actions/checkout@v4
with:
token: ${{ secrets.ADMIN_PAT }}
# GitLab - Grant permissions
# Settings > Repository > Protected Branches
# Add CI/CD role with push permission
```
### AWS Credentials Issues
**Symptoms:** "Unable to locate credentials"
**Solutions:**
```yaml
# Using OIDC (recommended)
- uses: aws-actions/configure-aws-credentials@v4
with:
role-to-assume: arn:aws:iam::123456789:role/GitHubActionsRole
aws-region: us-east-1
# Using secrets (legacy)
- uses: aws-actions/configure-aws-credentials@v4
with:
aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
aws-region: us-east-1
# Test credentials
- run: aws sts get-caller-identity
```
---
## Performance Issues
### Slow Pipeline Execution
**Diagnostics:**
```bash
# GitHub - View timing
gh run view <run-id> --log
# Identify slow steps
# Each step shows duration in UI
```
**Solutions:**
- See [optimization.md](optimization.md) for comprehensive guide
- Add dependency caching
- Parallelize independent jobs
- Use faster runners
- Reduce test scope on PRs
### Cache Not Working
**Symptoms:** Cache always misses, builds still slow
**Diagnostics:**
```yaml
- uses: actions/cache@v4
id: cache
with:
path: node_modules
key: ${{ hashFiles('**/package-lock.json') }}
- run: echo "Cache hit: ${{ steps.cache.outputs.cache-hit }}"
```
**Common issues:**
1. Key changes every time
2. Path doesn't exist
3. Cache size exceeds limit
4. Cache evicted (caches unused for 7 days are removed on GitHub)
**Solutions:**
```yaml
# Use consistent key
key: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}
# Add restore-keys for partial match
restore-keys: |
${{ runner.os }}-node-
# Check cache size
- run: du -sh node_modules
```
---
## Platform-Specific Issues
### GitHub Actions
**"Resource not accessible by integration":**
```yaml
# Add required permission
permissions:
issues: write # Or whatever resource you're accessing
```
**"Workflow is not shared":**
- Reusable workflows must be in `.github/workflows/`
- Repository must be public or org member
- Check workflow access settings
**"No runner available":**
- Self-hosted: Check runner is online and has matching labels
- GitHub-hosted: May hit concurrent job limit (check usage)
### GitLab CI
**"This job is stuck":**
- No runner available with matching tags
- All runners are busy
- Runner not configured for this project
**Solutions:**
```yaml
# Remove tags to use any available runner
job:
tags: []
# Or check runner configuration
# Settings > CI/CD > Runners
```
**"Job failed (system failure)":**
- Runner disconnected
- Resource limits exceeded
- Infrastructure issue
**Check runner logs:**
```bash
# On runner host
journalctl -u gitlab-runner -f
```
---
## Debugging Techniques
### Enable Debug Logging
**GitHub Actions:**
```yaml
# Repository > Settings > Secrets > Add:
#   ACTIONS_RUNNER_DEBUG = true   # runner diagnostic logs
#   ACTIONS_STEP_DEBUG = true     # step-level debug logs
# Or re-run a job with "Enable debug logging" checked in the UI
```
**GitLab CI:**
```yaml
variables:
CI_DEBUG_TRACE: "true" # Caution: May expose secrets!
```
### Interactive Debugging
**GitHub Actions:**
```yaml
# Add tmate for SSH access
- uses: mxschmitt/action-tmate@v3
if: failure()
```
**Local reproduction:**
```bash
# Use nektos/act to run GitHub Actions jobs locally (requires Docker)
act -j build

# List the jobs act detects in your workflow files
act -l
```
### Reproduce Locally
```bash
# GitHub Actions - Approximate the hosted runner with a comparable Ubuntu image
docker run -it ubuntu:latest bash
# Install dependencies and test
apt-get update && apt-get install -y nodejs npm
npm ci
npm test
```
---
## Prevention Strategies
### Pre-commit Checks
```yaml
# .pre-commit-config.yaml
repos:
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v4.5.0
hooks:
- id: trailing-whitespace
- id: check-yaml
- id: check-added-large-files
- repo: local
hooks:
- id: tests
name: Run tests
entry: npm test
language: system
pass_filenames: false
```
### CI/CD Health Monitoring
Use the `scripts/ci_health.py` script:
```bash
python3 scripts/ci_health.py --platform github --repo owner/repo
```
### Regular Maintenance
- [ ] Monthly: Review failed job patterns
- [ ] Monthly: Update actions/dependencies
- [ ] Quarterly: Audit pipeline efficiency
- [ ] Quarterly: Review and clean old caches
- [ ] Yearly: Major version updates
---
## Getting Help
**GitHub Actions:**
- Community Forum: https://github.community
- Documentation: https://docs.github.com/actions
- Status: https://www.githubstatus.com
**GitLab CI:**
- Forum: https://forum.gitlab.com
- Documentation: https://docs.gitlab.com/ee/ci
- Status: https://status.gitlab.com
**General CI/CD:**
- Stack Overflow: Tag [github-actions] or [gitlab-ci]
- Reddit: r/devops, r/cicd

---

# scripts/ci_health.py
#!/usr/bin/env python3
"""
CI/CD Pipeline Health Checker
Checks pipeline status, recent failures, and provides insights for GitHub Actions,
GitLab CI, and other platforms. Identifies failing workflows, slow pipelines,
and provides actionable recommendations.
Usage:
# GitHub Actions
python3 ci_health.py --platform github --repo owner/repo
# GitLab CI
python3 ci_health.py --platform gitlab --project-id 12345 --token <token>
# Check specific workflow/pipeline
python3 ci_health.py --platform github --repo owner/repo --workflow ci.yml
"""
import argparse
import json
import subprocess
import sys
import urllib.request
import urllib.error
from datetime import datetime, timedelta
from typing import Dict, List, Optional
class CIHealthChecker:
def __init__(self, platform: str, **kwargs):
self.platform = platform.lower()
self.config = kwargs
self.issues = []
self.warnings = []
self.insights = []
self.metrics = {}
def check_github_workflows(self) -> Dict:
"""Check GitHub Actions workflow health"""
print(f"🔍 Checking GitHub Actions workflows...")
if not self._check_command("gh"):
self.issues.append("GitHub CLI (gh) is not installed")
self.insights.append("Install gh CLI: https://cli.github.com/")
return self._generate_report()
repo = self.config.get('repo')
if not repo:
self.issues.append("Repository not specified")
self.insights.append("Use --repo owner/repo")
return self._generate_report()
try:
# Get recent workflow runs
limit = self.config.get('limit', 20)
cmd = ['gh', 'run', 'list', '--repo', repo, '--limit', str(limit), '--json',
'status,conclusion,name,workflowName,createdAt,displayTitle']
result = subprocess.run(cmd, capture_output=True, text=True, timeout=30)
if result.returncode != 0:
self.issues.append(f"Failed to fetch workflows: {result.stderr}")
self.insights.append("Verify gh CLI authentication: gh auth status")
return self._generate_report()
runs = json.loads(result.stdout)
if not runs:
self.warnings.append("No recent workflow runs found")
return self._generate_report()
# Analyze runs
total_runs = len(runs)
failed_runs = [r for r in runs if r.get('conclusion') == 'failure']
cancelled_runs = [r for r in runs if r.get('conclusion') == 'cancelled']
success_runs = [r for r in runs if r.get('conclusion') == 'success']
self.metrics['total_runs'] = total_runs
self.metrics['failed_runs'] = len(failed_runs)
self.metrics['cancelled_runs'] = len(cancelled_runs)
self.metrics['success_runs'] = len(success_runs)
self.metrics['failure_rate'] = (len(failed_runs) / total_runs * 100) if total_runs > 0 else 0
# Group failures by workflow
failure_by_workflow = {}
for run in failed_runs:
workflow = run.get('workflowName', 'unknown')
failure_by_workflow[workflow] = failure_by_workflow.get(workflow, 0) + 1
print(f"✅ Analyzed {total_runs} recent runs:")
print(f" - Success: {len(success_runs)} ({len(success_runs)/total_runs*100:.1f}%)")
print(f" - Failed: {len(failed_runs)} ({len(failed_runs)/total_runs*100:.1f}%)")
print(f" - Cancelled: {len(cancelled_runs)} ({len(cancelled_runs)/total_runs*100:.1f}%)")
# Identify issues
if self.metrics['failure_rate'] > 20:
self.issues.append(f"High failure rate: {self.metrics['failure_rate']:.1f}%")
self.insights.append("Investigate failing workflows and address root causes")
if failure_by_workflow:
self.warnings.append("Workflows with recent failures:")
for workflow, count in sorted(failure_by_workflow.items(), key=lambda x: x[1], reverse=True):
self.warnings.append(f" - {workflow}: {count} failure(s)")
self.insights.append(f"Review logs for '{workflow}': gh run view --repo {repo}")
if len(cancelled_runs) > total_runs * 0.3:
self.warnings.append(f"High cancellation rate: {len(cancelled_runs)/total_runs*100:.1f}%")
self.insights.append("Excessive cancellations may indicate workflow timeout issues or manual interventions")
except subprocess.TimeoutExpired:
self.issues.append("Request timed out - check network connectivity")
except json.JSONDecodeError as e:
self.issues.append(f"Failed to parse workflow data: {e}")
except Exception as e:
self.issues.append(f"Unexpected error: {e}")
return self._generate_report()
def check_gitlab_pipelines(self) -> Dict:
"""Check GitLab CI pipeline health"""
print(f"🔍 Checking GitLab CI pipelines...")
url = self.config.get('url', 'https://gitlab.com')
        import os  # local import: fall back to GITLAB_TOKEN, as the --token help text promises
        token = self.config.get('token') or os.environ.get('GITLAB_TOKEN')
project_id = self.config.get('project_id')
if not token:
self.issues.append("GitLab token not provided")
self.insights.append("Provide token with --token or GITLAB_TOKEN env var")
return self._generate_report()
if not project_id:
self.issues.append("Project ID not specified")
self.insights.append("Use --project-id <id>")
return self._generate_report()
try:
# Get recent pipelines
per_page = self.config.get('limit', 20)
api_url = f"{url}/api/v4/projects/{project_id}/pipelines?per_page={per_page}"
req = urllib.request.Request(api_url, headers={'PRIVATE-TOKEN': token})
with urllib.request.urlopen(req, timeout=30) as response:
pipelines = json.loads(response.read())
if not pipelines:
self.warnings.append("No recent pipelines found")
return self._generate_report()
# Analyze pipelines
total_pipelines = len(pipelines)
failed = [p for p in pipelines if p.get('status') == 'failed']
success = [p for p in pipelines if p.get('status') == 'success']
running = [p for p in pipelines if p.get('status') == 'running']
cancelled = [p for p in pipelines if p.get('status') == 'canceled']
self.metrics['total_pipelines'] = total_pipelines
self.metrics['failed'] = len(failed)
self.metrics['success'] = len(success)
self.metrics['running'] = len(running)
self.metrics['failure_rate'] = (len(failed) / total_pipelines * 100) if total_pipelines > 0 else 0
print(f"✅ Analyzed {total_pipelines} recent pipelines:")
print(f" - Success: {len(success)} ({len(success)/total_pipelines*100:.1f}%)")
print(f" - Failed: {len(failed)} ({len(failed)/total_pipelines*100:.1f}%)")
print(f" - Running: {len(running)}")
print(f" - Cancelled: {len(cancelled)}")
# Identify issues
if self.metrics['failure_rate'] > 20:
self.issues.append(f"High failure rate: {self.metrics['failure_rate']:.1f}%")
self.insights.append("Review failing pipelines and fix recurring issues")
# Get details of recent failures
if failed:
self.warnings.append(f"Recent pipeline failures:")
for pipeline in failed[:5]: # Show up to 5 recent failures
ref = pipeline.get('ref', 'unknown')
pipeline_id = pipeline.get('id')
self.warnings.append(f" - Pipeline #{pipeline_id} on {ref}")
self.insights.append(f"View pipeline details: {url}/{project_id}/-/pipelines")
except urllib.error.HTTPError as e:
self.issues.append(f"API error: {e.code} - {e.reason}")
if e.code == 401:
self.insights.append("Check GitLab token permissions")
except urllib.error.URLError as e:
self.issues.append(f"Network error: {e.reason}")
self.insights.append("Check GitLab URL and network connectivity")
except Exception as e:
self.issues.append(f"Unexpected error: {e}")
return self._generate_report()
def _check_command(self, command: str) -> bool:
"""Check if command is available"""
try:
subprocess.run([command, '--version'], capture_output=True, timeout=5)
return True
except (FileNotFoundError, subprocess.TimeoutExpired):
return False
def _generate_report(self) -> Dict:
"""Generate health check report"""
# Determine overall health status
if self.issues:
status = 'unhealthy'
elif self.warnings:
status = 'degraded'
else:
status = 'healthy'
return {
'platform': self.platform,
'status': status,
'issues': self.issues,
'warnings': self.warnings,
'insights': self.insights,
'metrics': self.metrics
}
def print_report(report: Dict):
"""Print formatted health check report"""
print("\n" + "="*60)
print(f"🏥 CI/CD Health Report - {report['platform'].upper()}")
print("="*60)
    status_emoji = {"healthy": "✅", "degraded": "⚠️", "unhealthy": "❌"}.get(report['status'], "")
print(f"\nStatus: {status_emoji} {report['status'].upper()}")
if report['metrics']:
print(f"\n📊 Metrics:")
for key, value in report['metrics'].items():
formatted_key = key.replace('_', ' ').title()
if 'rate' in key:
print(f" - {formatted_key}: {value:.1f}%")
else:
print(f" - {formatted_key}: {value}")
if report['issues']:
print(f"\n🚨 Issues ({len(report['issues'])}):")
for i, issue in enumerate(report['issues'], 1):
print(f" {i}. {issue}")
if report['warnings']:
print(f"\n⚠️ Warnings:")
for warning in report['warnings']:
if warning.startswith(' -'):
print(f" {warning}")
else:
print(f"{warning}")
if report['insights']:
print(f"\n💡 Insights & Recommendations:")
for i, insight in enumerate(report['insights'], 1):
print(f" {i}. {insight}")
print("\n" + "="*60 + "\n")
def main():
parser = argparse.ArgumentParser(
description='CI/CD Pipeline Health Checker',
formatter_class=argparse.RawDescriptionHelpFormatter
)
parser.add_argument('--platform', required=True, choices=['github', 'gitlab'],
help='CI/CD platform')
parser.add_argument('--repo', help='GitHub repository (owner/repo)')
parser.add_argument('--workflow', help='Specific workflow name to check')
parser.add_argument('--project-id', help='GitLab project ID')
parser.add_argument('--url', default='https://gitlab.com', help='GitLab URL')
parser.add_argument('--token', help='GitLab token (or use GITLAB_TOKEN env var)')
parser.add_argument('--limit', type=int, default=20, help='Number of recent runs/pipelines to analyze')
args = parser.parse_args()
# Create checker
checker = CIHealthChecker(
platform=args.platform,
repo=args.repo,
workflow=args.workflow,
project_id=args.project_id,
url=args.url,
token=args.token,
limit=args.limit
)
# Run checks
if args.platform == 'github':
report = checker.check_github_workflows()
elif args.platform == 'gitlab':
report = checker.check_gitlab_pipelines()
# Print report
print_report(report)
# Exit with error code if unhealthy
sys.exit(0 if report['status'] == 'healthy' else 1)
if __name__ == '__main__':
main()

---

# scripts/pipeline_analyzer.py
#!/usr/bin/env python3
"""
CI/CD Pipeline Performance Analyzer
Analyzes CI/CD pipeline configuration and execution to identify performance
bottlenecks, caching opportunities, and optimization recommendations.
Usage:
# Analyze GitHub Actions workflow
python3 pipeline_analyzer.py --platform github --workflow .github/workflows/ci.yml
# Analyze GitLab CI pipeline
python3 pipeline_analyzer.py --platform gitlab --config .gitlab-ci.yml
# Analyze recent workflow runs
python3 pipeline_analyzer.py --platform github --repo owner/repo --analyze-runs 10
"""
import argparse
import json
import os
import re
import subprocess
import sys
from pathlib import Path
from typing import Dict, List, Optional, Tuple
import yaml
class PipelineAnalyzer:
def __init__(self, platform: str, **kwargs):
self.platform = platform.lower()
self.config = kwargs
self.findings = []
self.optimizations = []
self.metrics = {}
def analyze_github_workflow(self, workflow_file: str) -> Dict:
"""Analyze GitHub Actions workflow file"""
print(f"🔍 Analyzing GitHub Actions workflow: {workflow_file}")
if not os.path.exists(workflow_file):
return self._error(f"Workflow file not found: {workflow_file}")
try:
with open(workflow_file, 'r') as f:
workflow = yaml.safe_load(f)
# Analyze workflow structure
self._check_workflow_triggers(workflow)
self._check_caching_strategy(workflow, 'github')
self._check_job_parallelization(workflow, 'github')
self._check_dependency_management(workflow, 'github')
self._check_matrix_strategy(workflow)
self._check_artifact_usage(workflow)
self._analyze_action_versions(workflow)
return self._generate_report()
except yaml.YAMLError as e:
return self._error(f"Invalid YAML: {e}")
except Exception as e:
return self._error(f"Analysis failed: {e}")
def analyze_gitlab_pipeline(self, config_file: str) -> Dict:
"""Analyze GitLab CI pipeline configuration"""
print(f"🔍 Analyzing GitLab CI pipeline: {config_file}")
if not os.path.exists(config_file):
return self._error(f"Config file not found: {config_file}")
try:
with open(config_file, 'r') as f:
config = yaml.safe_load(f)
# Analyze pipeline structure
self._check_caching_strategy(config, 'gitlab')
self._check_job_parallelization(config, 'gitlab')
self._check_dependency_management(config, 'gitlab')
self._check_gitlab_specific_features(config)
return self._generate_report()
except yaml.YAMLError as e:
return self._error(f"Invalid YAML: {e}")
except Exception as e:
return self._error(f"Analysis failed: {e}")
def _check_workflow_triggers(self, workflow: Dict):
"""Check workflow trigger configuration"""
        # PyYAML parses an unquoted `on:` key as boolean True, so check both keys
        triggers = workflow.get('on', workflow.get(True, {}))
if isinstance(triggers, list):
trigger_types = triggers
elif isinstance(triggers, dict):
trigger_types = list(triggers.keys())
else:
trigger_types = [triggers] if triggers else []
# Check for overly broad triggers
if 'push' in trigger_types:
push_config = triggers.get('push', {}) if isinstance(triggers, dict) else {}
if not push_config or not push_config.get('branches'):
self.findings.append("Workflow triggers on all push events (no branch filter)")
self.optimizations.append(
"Add branch filters to 'push' trigger to reduce unnecessary runs:\n"
" on:\n"
" push:\n"
" branches: [main, develop]"
)
# Check for path filters
if 'pull_request' in trigger_types:
pr_config = triggers.get('pull_request', {}) if isinstance(triggers, dict) else {}
if not pr_config.get('paths') and not pr_config.get('paths-ignore'):
self.optimizations.append(
"Consider adding path filters to skip unnecessary PR runs:\n"
" pull_request:\n"
" paths-ignore:\n"
" - 'docs/**'\n"
" - '**.md'"
)
def _check_caching_strategy(self, config: Dict, platform: str):
"""Check for dependency caching"""
has_cache = False
if platform == 'github':
jobs = config.get('jobs', {})
for job_name, job in jobs.items():
steps = job.get('steps', [])
for step in steps:
if isinstance(step, dict) and step.get('uses', '').startswith('actions/cache'):
has_cache = True
break
if not has_cache:
self.findings.append("No dependency caching detected")
self.optimizations.append(
"Add dependency caching to speed up builds:\n"
" - uses: actions/cache@v4\n"
" with:\n"
" path: |\n"
" ~/.cargo\n"
" ~/.npm\n"
" ~/.cache/pip\n"
" key: ${{ runner.os }}-deps-${{ hashFiles('**/package-lock.json') }}"
)
elif platform == 'gitlab':
cache_config = config.get('cache', {})
job_has_cache = False
# Check global cache
if cache_config:
has_cache = True
# Check job-level cache
for key, value in config.items():
if isinstance(value, dict) and 'script' in value:
if value.get('cache'):
job_has_cache = True
if not has_cache and not job_has_cache:
self.findings.append("No caching configuration detected")
self.optimizations.append(
"Add caching to speed up builds:\n"
"cache:\n"
" key: ${CI_COMMIT_REF_SLUG}\n"
" paths:\n"
" - node_modules/\n"
" - .npm/\n"
" - vendor/"
)
def _check_job_parallelization(self, config: Dict, platform: str):
"""Check for job parallelization opportunities"""
if platform == 'github':
jobs = config.get('jobs', {})
# Count jobs with dependencies
jobs_with_needs = sum(1 for job in jobs.values()
if isinstance(job, dict) and 'needs' in job)
if len(jobs) > 1 and jobs_with_needs == 0:
self.optimizations.append(
f"Found {len(jobs)} jobs with no dependencies - they will run in parallel (good!)"
)
elif len(jobs) > 3 and jobs_with_needs == len(jobs):
self.findings.append("All jobs have 'needs' dependencies - may be unnecessarily sequential")
self.optimizations.append(
"Review job dependencies - remove 'needs' where jobs can run in parallel"
)
elif platform == 'gitlab':
stages = config.get('stages', [])
if len(stages) > 5:
self.findings.append(f"Pipeline has {len(stages)} stages - may be overly sequential")
self.optimizations.append(
"Consider reducing stages to allow more parallel execution"
)
def _check_dependency_management(self, config: Dict, platform: str):
"""Check dependency installation patterns"""
if platform == 'github':
jobs = config.get('jobs', {})
for job_name, job in jobs.items():
steps = job.get('steps', [])
for step in steps:
if isinstance(step, dict):
run_cmd = step.get('run', '')
# Check for npm ci vs npm install
if 'npm install' in run_cmd and 'npm ci' not in run_cmd:
self.findings.append(f"Job '{job_name}' uses 'npm install' instead of 'npm ci'")
self.optimizations.append(
f"Use 'npm ci' instead of 'npm install' for faster, reproducible installs"
)
# Check for pip install without cache
if 'pip install' in run_cmd:
has_pip_cache = any(
s.get('uses', '').startswith('actions/cache') and
'pip' in str(s.get('with', {}).get('path', ''))
for s in steps if isinstance(s, dict)
)
if not has_pip_cache:
self.optimizations.append(
f"Add pip cache for job '{job_name}' to speed up Python dependency installation"
)
def _check_matrix_strategy(self, workflow: Dict):
"""Check for matrix strategy usage"""
jobs = workflow.get('jobs', {})
for job_name, job in jobs.items():
if isinstance(job, dict):
strategy = job.get('strategy', {})
matrix = strategy.get('matrix', {})
if matrix:
# Check fail-fast
fail_fast = strategy.get('fail-fast', True)
if fail_fast:
self.optimizations.append(
f"Job '{job_name}' has fail-fast=true (default). "
f"Consider fail-fast=false to see all matrix results"
)
# Check for large matrices
matrix_size = 1
for key, values in matrix.items():
if isinstance(values, list):
matrix_size *= len(values)
if matrix_size > 20:
self.findings.append(
f"Job '{job_name}' has large matrix ({matrix_size} combinations)"
)
self.optimizations.append(
f"Consider reducing matrix size or using 'exclude' to skip unnecessary combinations"
)
def _check_artifact_usage(self, workflow: Dict):
"""Check artifact upload/download patterns"""
jobs = workflow.get('jobs', {})
uploads = {}
downloads = {}
for job_name, job in jobs.items():
if not isinstance(job, dict):
continue
steps = job.get('steps', [])
for step in steps:
if isinstance(step, dict):
uses = step.get('uses', '')
if 'actions/upload-artifact' in uses:
artifact_name = step.get('with', {}).get('name', 'unknown')
uploads[artifact_name] = job_name
if 'actions/download-artifact' in uses:
artifact_name = step.get('with', {}).get('name', 'unknown')
downloads.setdefault(artifact_name, []).append(job_name)
# Check for unused artifacts
for artifact, uploader in uploads.items():
if artifact not in downloads:
self.findings.append(f"Artifact '{artifact}' uploaded but never downloaded")
                self.optimizations.append(f"Remove the unused upload of artifact '{artifact}' or add a matching download step")
def _analyze_action_versions(self, workflow: Dict):
"""Check for outdated action versions"""
jobs = workflow.get('jobs', {})
outdated_actions = []
for job_name, job in jobs.items():
if not isinstance(job, dict):
continue
steps = job.get('steps', [])
for step in steps:
if isinstance(step, dict):
uses = step.get('uses', '')
                    # Check for pinned @v1/@v2 versions (likely outdated); anchor the match so @v10+ isn't flagged
                    if re.search(r'@v[12](\.|$)', uses):
outdated_actions.append(uses)
if outdated_actions:
self.findings.append(f"Found {len(outdated_actions)} potentially outdated actions")
self.optimizations.append(
f"Update to latest action versions:\n" +
"\n".join(f" - {action}" for action in set(outdated_actions))
)
def _check_gitlab_specific_features(self, config: Dict):
"""Check GitLab-specific optimization opportunities"""
# Check for interruptible jobs
has_interruptible = any(
isinstance(v, dict) and v.get('interruptible')
for v in config.values()
)
if not has_interruptible:
self.optimizations.append(
"Consider marking jobs as 'interruptible: true' to cancel redundant pipeline runs:\n"
"job_name:\n"
" interruptible: true"
)
# Check for DAG usage (needs keyword)
has_needs = any(
isinstance(v, dict) and 'needs' in v
for v in config.values()
)
if not has_needs and config.get('stages') and len(config.get('stages', [])) > 2:
self.optimizations.append(
"Consider using 'needs' keyword for DAG pipelines to improve parallelization:\n"
"test:\n"
" needs: [build]"
)
def _error(self, message: str) -> Dict:
"""Return error report"""
return {
'status': 'error',
'error': message,
'findings': [],
'optimizations': []
}
def _generate_report(self) -> Dict:
"""Generate analysis report"""
return {
'status': 'success',
'platform': self.platform,
'findings': self.findings,
'optimizations': self.optimizations,
'metrics': self.metrics
}
def print_report(report: Dict):
"""Print formatted analysis report"""
if report['status'] == 'error':
print(f"\n❌ Error: {report['error']}\n")
return
print("\n" + "="*60)
print(f"📊 Pipeline Analysis Report - {report['platform'].upper()}")
print("="*60)
if report['findings']:
print(f"\n🔍 Findings ({len(report['findings'])}):")
for i, finding in enumerate(report['findings'], 1):
print(f"\n {i}. {finding}")
if report['optimizations']:
print(f"\n💡 Optimization Recommendations ({len(report['optimizations'])}):")
for i, opt in enumerate(report['optimizations'], 1):
print(f"\n {i}. {opt}")
if not report['findings'] and not report['optimizations']:
print("\n✅ No issues found - pipeline looks well optimized!")
print("\n" + "="*60 + "\n")
def main():
parser = argparse.ArgumentParser(
description='CI/CD Pipeline Performance Analyzer',
formatter_class=argparse.RawDescriptionHelpFormatter
)
parser.add_argument('--platform', required=True, choices=['github', 'gitlab'],
help='CI/CD platform')
parser.add_argument('--workflow', help='Path to GitHub Actions workflow file')
parser.add_argument('--config', help='Path to GitLab CI config file')
parser.add_argument('--repo', help='Repository (owner/repo) for run analysis')
parser.add_argument('--analyze-runs', type=int, help='Number of recent runs to analyze')
args = parser.parse_args()
# Create analyzer
analyzer = PipelineAnalyzer(
platform=args.platform,
repo=args.repo
)
# Run analysis
if args.platform == 'github':
if args.workflow:
report = analyzer.analyze_github_workflow(args.workflow)
else:
# Try to find workflow files
workflow_dir = Path('.github/workflows')
if workflow_dir.exists():
workflows = list(workflow_dir.glob('*.yml')) + list(workflow_dir.glob('*.yaml'))
if workflows:
print(f"Found {len(workflows)} workflow(s), analyzing first one...")
report = analyzer.analyze_github_workflow(str(workflows[0]))
else:
print("❌ No workflow files found in .github/workflows/")
sys.exit(1)
else:
print("❌ No .github/workflows/ directory found")
sys.exit(1)
elif args.platform == 'gitlab':
config_file = args.config or '.gitlab-ci.yml'
report = analyzer.analyze_gitlab_pipeline(config_file)
# Print report
print_report(report)
# Exit with appropriate code
sys.exit(0 if report['status'] == 'success' else 1)
if __name__ == '__main__':
main()