---
name: ci-cd
description: CI/CD pipeline design, optimization, DevSecOps security scanning, and troubleshooting. Use for creating workflows, debugging pipeline failures, implementing SAST/DAST/SCA, optimizing build performance, implementing caching strategies, setting up deployments, securing pipelines with OIDC/secrets management, and troubleshooting common issues across GitHub Actions, GitLab CI, and other platforms.
---

# CI/CD Pipelines

Comprehensive guide for CI/CD pipeline design, optimization, security, and troubleshooting across GitHub Actions, GitLab CI, and other platforms.

## When to Use This Skill

Use this skill when:
- Creating new CI/CD workflows or pipelines
- Debugging pipeline failures or flaky tests
- Optimizing slow builds or test suites
- Implementing caching strategies
- Setting up deployment workflows
- Securing pipelines (secrets, OIDC, supply chain)
- Implementing DevSecOps security scanning (SAST, DAST, SCA)
- Troubleshooting platform-specific issues
- Analyzing pipeline performance
- Implementing matrix builds or test sharding
- Configuring multi-environment deployments

## Core Workflows

### 1. Creating a New Pipeline

**Decision tree:**
```
What are you building?
├── Node.js/Frontend → GitHub: assets/templates/github-actions/node-ci.yml | GitLab: assets/templates/gitlab-ci/node-ci.yml
├── Python → GitHub: assets/templates/github-actions/python-ci.yml | GitLab: assets/templates/gitlab-ci/python-ci.yml
├── Go → GitHub: assets/templates/github-actions/go-ci.yml | GitLab: assets/templates/gitlab-ci/go-ci.yml
├── Docker Image → GitHub: assets/templates/github-actions/docker-build.yml | GitLab: assets/templates/gitlab-ci/docker-build.yml
└── Other → Follow the pipeline design pattern below
```

**Basic pipeline structure:**
```yaml
# 1. Fast feedback (lint, format) - <1 min
# 2. Unit tests - 1-5 min
# 3. Integration tests - 5-15 min
# 4. Build artifacts
# 5. E2E tests (optional, main branch only) - 15-30 min
# 6. Deploy (with approval gates)
```

**Key principles:**
- Fail fast: Run cheap validation first
- Parallelize: Remove unnecessary job dependencies
- Cache dependencies: Use `actions/cache` or GitLab cache
- Use artifacts: Build once, deploy many times

See [best_practices.md](references/best_practices.md) for comprehensive pipeline design patterns.
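
As a rough illustration of the stage ordering and principles above, a GitHub Actions workflow might be laid out like the sketch below. The job names, the `dist/` output directory, and the `npm run lint`, `npm test`, and `npm run build` scripts are assumptions about a hypothetical Node.js project, not contents of the bundled templates.

```yaml
name: ci
on:
  push:
    branches: [main]
  pull_request:

jobs:
  lint:                        # 1. fast feedback; no dependency on other jobs
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
      - run: npm ci
      - run: npm run lint      # assumed script in package.json

  unit-test:                   # 2. runs in parallel with lint (no `needs`)
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
      - run: npm ci
      - run: npm test          # assumed script in package.json

  build:                       # 4. build once, only after the cheap checks pass
    runs-on: ubuntu-latest
    needs: [lint, unit-test]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
      - run: npm ci
      - run: npm run build     # assumed script in package.json
      - uses: actions/upload-artifact@v4
        with:
          name: dist           # later deploy jobs download this artifact
          path: dist/          # assumed build output directory
```

Because `lint` and `unit-test` declare no `needs`, they start together; only `build` waits, which keeps the critical path short.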

### 2. Optimizing Pipeline Performance

**Quick wins checklist:**
- [ ] Add dependency caching (builds can be 50-90% faster when dependency installation dominates)
- [ ] Remove unnecessary `needs` dependencies
- [ ] Add path filters to skip unnecessary runs
- [ ] Use `npm ci` instead of `npm install`
- [ ] Add job timeouts to prevent hung builds
- [ ] Enable concurrency cancellation for duplicate runs

**Analyze existing pipeline:**
```bash
# Use the pipeline analyzer script
python3 scripts/pipeline_analyzer.py --platform github --workflow .github/workflows/ci.yml
```

**Common optimizations:**
- **Slow tests:** Shard tests with matrix builds
- **Repeated dependency installs:** Add caching
- **Sequential jobs:** Parallelize by declaring only the `needs` each job really requires
- **Full test suite on every PR:** Use path filters or test impact analysis

See [optimization.md](references/optimization.md) for detailed caching strategies, parallelization techniques, and performance tuning.
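
Three of the quick wins above (path filters, job timeouts, and concurrency cancellation) can be combined in a few lines of GitHub Actions configuration. This is a minimal sketch; the path patterns and the 15-minute limit are assumptions to adapt to your repository.

```yaml
on:
  pull_request:
    paths:                      # run only when relevant files change (assumed layout)
      - 'src/**'
      - 'package*.json'
      - '.github/workflows/ci.yml'

concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true      # cancel superseded runs on the same ref

jobs:
  test:
    runs-on: ubuntu-latest
    timeout-minutes: 15         # fail early instead of waiting for the 360-minute default
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      - run: npm test
```

Keying the concurrency group on the workflow name and ref means a new push to a branch cancels only the older run for that same branch.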

### 3. Securing Your Pipeline

**Essential security checklist:**
- [ ] Use OIDC instead of static credentials
- [ ] Pin actions/includes to commit SHAs
- [ ] Use minimal permissions
- [ ] Enable secret scanning
- [ ] Add vulnerability scanning (dependencies, containers)
- [ ] Implement branch protection
- [ ] Separate test from deploy workflows

**Quick setup - OIDC authentication:**

**GitHub Actions → AWS:**
```yaml
jobs:
  deploy:
    runs-on: ubuntu-latest
    permissions:
      id-token: write   # required to request the OIDC token
      contents: read
    steps:
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/GitHubActionsRole  # example account ID
          aws-region: us-east-1
```

**Secrets management:**
- Store secrets in platform secret stores (GitHub Secrets, GitLab CI/CD Variables)
- Mark variables as "masked" in GitLab
- Use environment-specific secrets
- Rotate regularly (every 90 days)
- Never log secrets

See [security.md](references/security.md) for comprehensive security patterns, supply chain security, and secrets management.
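
The "minimal permissions" and "pin to commit SHAs" checklist items could look roughly like this in GitHub Actions. `<full-commit-sha>` is a placeholder that you replace with the 40-character commit SHA of the release you have reviewed; the job names are hypothetical.

```yaml
permissions:
  contents: read                # workflow-wide default: read-only token

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      # Pin actions to a full commit SHA and keep the tag as a comment.
      - uses: actions/checkout@<full-commit-sha>  # v4 (placeholder SHA)
      - run: npm ci && npm test

  release:
    runs-on: ubuntu-latest
    needs: [test]
    permissions:
      contents: write           # elevated only for the job that needs it
    steps:
      - uses: actions/checkout@<full-commit-sha>  # v4 (placeholder SHA)
      - run: ./scripts/release.sh   # hypothetical release script
```

Setting a read-only default at the workflow level and widening it per job keeps a compromised step in `test` from pushing to the repository.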

### 4. Troubleshooting Pipeline Failures

**Systematic approach:**

**Step 1: Check pipeline health**
```bash
python3 scripts/ci_health.py --platform github --repo owner/repo
```

**Step 2: Identify the failure type**

| Error Pattern | Common Cause | Quick Fix |
|---------------|--------------|-----------|
| "Module not found" | Missing dependency or cache issue | Clear cache, run `npm ci` |
| "Timeout" | Job taking too long | Add caching, increase timeout |
| "Permission denied" | Missing permissions | Add to `permissions:` block |
| "Cannot connect to Docker daemon" | Docker not available | Use a runner with Docker, or Docker-in-Docker (DinD) |
| Intermittent failures | Flaky tests or race conditions | Add retries, fix timing issues |

**Step 3: Enable debug logging**

GitHub Actions:
```yaml
# Add repository secrets:
# ACTIONS_RUNNER_DEBUG = true
# ACTIONS_STEP_DEBUG = true
```

GitLab CI:
```yaml
variables:
  CI_DEBUG_TRACE: "true"
```

**Step 4: Reproduce locally**
```bash
# GitHub Actions - use act
act -j build

# Or Docker
docker run -it ubuntu:latest bash
# Then manually run the failing steps
```

See [troubleshooting.md](references/troubleshooting.md) for comprehensive issue diagnosis, platform-specific problems, and solutions.
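
For the "Timeout" and "Intermittent failures" rows above, a GitLab CI job can carry its own timeout and a narrowly scoped retry policy. This is a sketch only; the job name, the 20-minute limit, and the `npm run test:integration` script are assumptions.

```yaml
integration-test:
  stage: test
  timeout: 20 minutes            # job-level limit, overrides the project default
  retry:
    max: 2                       # GitLab allows at most 2 retries
    when:                        # retry infrastructure problems only
      - runner_system_failure
      - stuck_or_timeout_failure
  script:
    - npm run test:integration   # assumed script in package.json
```

Restricting `retry:when` to infrastructure failures means a genuinely failing test still fails on the first attempt instead of being masked by retries.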

### 5. Implementing Deployment Workflows

**Deployment pattern selection:**

| Pattern | Use Case | Complexity | Risk |
|---------|----------|------------|------|
| Direct | Simple apps, low traffic | Low | Medium |
| Blue-Green | Zero downtime required | Medium | Low |
| Canary | Gradual rollout, monitoring | High | Very Low |
| Rolling | Kubernetes, containers | Medium | Low |

**Basic deployment structure:**
```yaml
deploy:
  runs-on: ubuntu-latest
  needs: [build, test]
  if: github.ref == 'refs/heads/main'
  environment:
    name: production
    url: https://example.com
  steps:
    - name: Download artifacts
      uses: actions/download-artifact@v4
    - name: Deploy
      run: ./scripts/deploy.sh                     # placeholder: your deploy command
    - name: Health check
      run: curl --fail https://example.com/health  # placeholder endpoint
    - name: Rollback on failure
      if: failure()
      run: ./scripts/rollback.sh                   # placeholder: your rollback command
```

**Multi-environment setup:**
- **Development:** Auto-deploy on develop branch
- **Staging:** Auto-deploy on main, requires passing tests
- **Production:** Manual approval required, smoke tests mandatory

See [best_practices.md](references/best_practices.md#deployment-strategies) for detailed deployment patterns and environment management.
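
The three-environment setup described above could be sketched in GitLab CI as follows. The job names and `./scripts/deploy.sh` are hypothetical; only the branch rules and the manual production gate come from the list above.

```yaml
stages: [build, test, deploy]

deploy:development:
  stage: deploy
  environment: development
  rules:
    - if: '$CI_COMMIT_BRANCH == "develop"'   # auto-deploy on develop
  script:
    - ./scripts/deploy.sh development        # hypothetical deploy script

deploy:staging:
  stage: deploy
  environment: staging
  rules:
    - if: '$CI_COMMIT_BRANCH == "main"'      # auto-deploy on main once tests pass
  script:
    - ./scripts/deploy.sh staging

deploy:production:
  stage: deploy
  environment:
    name: production
    url: https://example.com
  rules:
    - if: '$CI_COMMIT_BRANCH == "main"'
      when: manual                           # requires a person to start the job
  script:
    - ./scripts/deploy.sh production
```

Because the deploy stage runs after `test`, staging only deploys when earlier stages succeed, and production stays idle until someone triggers it.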

### 6. Implementing DevSecOps Security Scanning

**Security scanning types:**

| Scan Type | Purpose | When to Run | Speed | Tools |
|-----------|---------|-------------|-------|-------|
| Secret Scanning | Find exposed credentials | Every commit | Fast (<1 min) | TruffleHog, Gitleaks |
| SAST | Find code vulnerabilities | Every commit | Medium (5-15 min) | CodeQL, Semgrep, Bandit, Gosec |
| SCA | Find dependency vulnerabilities | Every commit | Fast (1-5 min) | npm audit, pip-audit, Snyk |
| Container Scanning | Find image vulnerabilities | After build | Medium (5-10 min) | Trivy, Grype |
| DAST | Find runtime vulnerabilities | Scheduled/main only | Slow (15-60 min) | OWASP ZAP |

**Quick setup - Add security to existing pipeline:**

**GitHub Actions:**
```yaml
jobs:
  # Add before build job
  secret-scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0  # full history so past commits are scanned too
      - uses: trufflesecurity/trufflehog@main
      - uses: gitleaks/gitleaks-action@v2
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

  sast:
    runs-on: ubuntu-latest
    permissions:
      security-events: write
      contents: read  # needed by checkout once permissions are restricted
    steps:
      - uses: actions/checkout@v4
      - uses: github/codeql-action/init@v3
        with:
          languages: javascript  # or python, go
      - uses: github/codeql-action/analyze@v3

  build:
    needs: [secret-scan, sast]  # Add dependencies
```

**GitLab CI:**
```yaml
stages:
  - security  # Add before other stages
  - build
  - test

# Secret scanning
secret-scan:
  stage: security
  image:
    name: trufflesecurity/trufflehog:latest
    entrypoint: [""]  # override the image entrypoint so GitLab can run the script
  script:
    - trufflehog filesystem . --json --fail

# SAST
sast:semgrep:
  stage: security
  image: returntocorp/semgrep
  script:
    - semgrep scan --config=auto .

# Use GitLab templates
include:
  - template: Security/SAST.gitlab-ci.yml
  - template: Security/Dependency-Scanning.gitlab-ci.yml
```

**Comprehensive security pipeline templates:**
- **GitHub Actions:** `assets/templates/github-actions/security-scan.yml` - Complete DevSecOps pipeline with all scanning stages
- **GitLab CI:** `assets/templates/gitlab-ci/security-scan.yml` - Complete DevSecOps pipeline with GitLab security templates

**Security gate pattern:**

Add a security gate job that evaluates all security scan results and fails the pipeline if critical issues are found:

```yaml
security-gate:
  needs: [secret-scan, sast, sca, container-scan]
  script:
    # Check for critical vulnerabilities
    # Parse JSON reports and evaluate thresholds
    # Fail if critical issues found
```
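
One possible concrete form of the gate, written for GitLab CI, is sketched below. It assumes a `container-scan` job that ran `trivy image --format json --output trivy-report.json ...` and published `trivy-report.json` as an artifact, and a `security-gate` stage declared after the scan stages; both are hypothetical names. Only the Trivy report is checked here; the other scanners would need their own parsing.

```yaml
security-gate:
  stage: security-gate          # hypothetical stage, listed after the scan stages
  image: alpine:3.20
  needs:
    - job: container-scan       # assumed job that produces trivy-report.json
      artifacts: true
  before_script:
    - apk add --no-cache jq
  script:
    - |
      # Count CRITICAL findings across all targets in the Trivy JSON report.
      critical=$(jq '[.Results[]?.Vulnerabilities[]? | select(.Severity == "CRITICAL")] | length' trivy-report.json)
      echo "Critical vulnerabilities found: ${critical}"
      if [ "$critical" -gt 0 ]; then
        exit 1                  # fail the pipeline when the threshold is exceeded
      fi
```

The threshold (zero critical findings) is an assumption; teams often start with a looser limit and tighten it over time.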

**Language-specific security tools:**

- **Node.js:** CodeQL, Semgrep, npm audit, eslint-plugin-security
- **Python:** CodeQL, Semgrep, Bandit, pip-audit, Safety
- **Go:** CodeQL, Semgrep, Gosec, govulncheck

All language-specific templates include security scanning stages. See:
- `assets/templates/github-actions/node-ci.yml`
- `assets/templates/github-actions/python-ci.yml`
- `assets/templates/github-actions/go-ci.yml`
- `assets/templates/gitlab-ci/node-ci.yml`
- `assets/templates/gitlab-ci/python-ci.yml`
- `assets/templates/gitlab-ci/go-ci.yml`

See [devsecops.md](references/devsecops.md) for a comprehensive DevSecOps guide covering all security scanning types, tool comparisons, and implementation patterns.

## Quick Reference Commands

### GitHub Actions

```bash
# List workflows
gh workflow list

# View recent runs
gh run list --limit 20

# View specific run
gh run view <run-id>

# Re-run failed jobs
gh run rerun <run-id> --failed

# Download logs
gh run view <run-id> --log > logs.txt

# Trigger workflow manually
gh workflow run ci.yml

# Watch a run until it completes
gh run watch
```

### GitLab CI

```bash
# Note: `gl` is assumed to be a local alias for a GitLab CLI (the subcommands
# resemble python-gitlab's `gitlab` command); adjust names and add
# project/ID flags for your setup.

# View pipelines
gl project-pipeline list

# Pipeline status
gl project-pipeline get <pipeline-id>

# Retry failed jobs
gl project-pipeline retry <pipeline-id>

# Cancel pipeline
gl project-pipeline cancel <pipeline-id>

# Download artifacts
gl project-job artifacts <job-id>
```

## Platform-Specific Patterns

### GitHub Actions

**Reusable workflows:**
```yaml
# .github/workflows/reusable-test.yml
on:
  workflow_call:
    inputs:
      node-version:
        required: true
        type: string

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/setup-node@v4
        with:
          node-version: ${{ inputs.node-version }}
```

**Call from another workflow:**
```yaml
jobs:
  test:
    uses: ./.github/workflows/reusable-test.yml
    with:
      node-version: '20'
```

### GitLab CI

**Templates with extends:**
```yaml
.test_template:
  image: node:20
  before_script:
    - npm ci

unit-test:
  extends: .test_template
  script:
    - npm run test:unit

integration-test:
  extends: .test_template
  script:
    - npm run test:integration
```

**DAG pipelines with needs:**
```yaml
build:
  stage: build
  script:
    - npm run build          # placeholder build command

test:unit:
  stage: test
  needs: [build]             # starts as soon as build finishes
  script:
    - npm run test:unit

test:integration:
  stage: test
  needs: [build]
  script:
    - npm run test:integration

deploy:
  stage: deploy
  needs: ["test:unit", "test:integration"]
  script:
    - ./scripts/deploy.sh    # placeholder deploy command
```

## Diagnostic Scripts

### Pipeline Analyzer

Analyzes workflow configuration for optimization opportunities:

```bash
# GitHub Actions
python3 scripts/pipeline_analyzer.py --platform github --workflow .github/workflows/ci.yml

# GitLab CI
python3 scripts/pipeline_analyzer.py --platform gitlab --config .gitlab-ci.yml
```

**Identifies:**
- Missing caching opportunities
- Unnecessary sequential execution
- Outdated action versions
- Unused artifacts
- Overly broad triggers

### CI Health Checker

Checks pipeline status and identifies issues:

```bash
# GitHub Actions
python3 scripts/ci_health.py --platform github --repo owner/repo --limit 20

# GitLab CI
python3 scripts/ci_health.py --platform gitlab --project-id 12345 --token "$GITLAB_TOKEN"
```

**Provides:**
- Success/failure rates
- Recent failure patterns
- Workflow-specific insights
- Actionable recommendations

## Reference Documentation

For deep-dive information on specific topics:

- **[best_practices.md](references/best_practices.md)** - Pipeline design, testing strategies, deployment patterns, dependency management, artifact handling, platform-specific patterns
- **[security.md](references/security.md)** - Secrets management, OIDC authentication, supply chain security, access control, vulnerability scanning, secure pipeline patterns
- **[devsecops.md](references/devsecops.md)** - Comprehensive DevSecOps guide: SAST (CodeQL, Semgrep, Bandit, Gosec), DAST (OWASP ZAP), SCA (npm audit, pip-audit, Snyk), container security (Trivy, Grype, SBOM), secret scanning (TruffleHog, Gitleaks), security gates, license compliance
- **[optimization.md](references/optimization.md)** - Caching strategies (dependencies, Docker layers, build artifacts), parallelization techniques, test splitting, build optimization, resource management
- **[troubleshooting.md](references/troubleshooting.md)** - Common issues (workflow not triggering, flaky tests, timeouts, dependency errors), Docker problems, authentication issues, platform-specific debugging

## Templates

Starter templates for common use cases:

### GitHub Actions
- **`assets/templates/github-actions/node-ci.yml`** - Complete Node.js CI/CD with security scanning, caching, matrix testing, and multi-environment deployment
- **`assets/templates/github-actions/python-ci.yml`** - Python pipeline with security scanning, pytest, coverage, PyPI deployment
- **`assets/templates/github-actions/go-ci.yml`** - Go pipeline with security scanning, multi-platform builds, benchmarks, integration tests
- **`assets/templates/github-actions/docker-build.yml`** - Docker build with multi-platform support, security scanning, SBOM generation, and signing
- **`assets/templates/github-actions/security-scan.yml`** - Comprehensive DevSecOps pipeline with SAST, DAST, SCA, container scanning, and security gates

### GitLab CI
- **`assets/templates/gitlab-ci/node-ci.yml`** - GitLab CI pipeline with security scanning, parallel execution, services, and deployment stages
- **`assets/templates/gitlab-ci/python-ci.yml`** - Python pipeline with security scanning, parallel testing, Docker builds, PyPI and Cloud Run deployment
- **`assets/templates/gitlab-ci/go-ci.yml`** - Go pipeline with security scanning, multi-platform builds, benchmarks, Kubernetes deployment
- **`assets/templates/gitlab-ci/docker-build.yml`** - Docker build with DinD, multi-arch, Container Registry, security scanning
- **`assets/templates/gitlab-ci/security-scan.yml`** - Comprehensive DevSecOps pipeline with SAST, DAST, SCA, container scanning, GitLab security templates, and security gates

## Common Patterns

### Caching Dependencies

**GitHub Actions:**
```yaml
- uses: actions/cache@v4
  with:
    path: ~/.npm
    key: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}
    restore-keys: |
      ${{ runner.os }}-node-
- run: npm ci
```

**GitLab CI:**
```yaml
cache:
  key:
    files:
      - package-lock.json
  paths:
    - .npm/  # cache npm's download cache; `npm ci` deletes node_modules/ on every run

before_script:
  - npm ci --cache .npm --prefer-offline
```
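
For Node.js projects on GitHub Actions, `actions/setup-node` also offers a built-in `cache` input that can stand in for a hand-written `actions/cache` step. A minimal sketch, assuming a `package-lock.json` at the repository root:

```yaml
steps:
  - uses: actions/checkout@v4
  - uses: actions/setup-node@v4
    with:
      node-version: '20'
      cache: npm        # restores and saves the npm download cache, keyed on the lockfile
  - run: npm ci
```

The explicit `actions/cache` form remains useful when you need custom paths or restore keys.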

### Matrix Builds

**GitHub Actions:**
```yaml
strategy:
  matrix:
    os: [ubuntu-latest, macos-latest]
    node: [18, 20, 22]
  fail-fast: false
```

**GitLab CI:**
```yaml
test:
  parallel:
    matrix:
      - NODE_VERSION: ['18', '20', '22']
```
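
To show how matrix values are consumed, and how the same mechanism can shard a slow test suite, here is a hedged GitHub Actions sketch. It assumes the project's tests run with Jest 28 or later (which provides the `--shard` flag); the shard count of 2 is arbitrary.

```yaml
jobs:
  test:
    runs-on: ${{ matrix.os }}
    strategy:
      fail-fast: false
      matrix:
        os: [ubuntu-latest, macos-latest]
        node: [18, 20, 22]
        shard: [1, 2]            # each job runs one half of the suite
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: ${{ matrix.node }}
      - run: npm ci
      - run: npx jest --shard=${{ matrix.shard }}/2   # assumes Jest >= 28
```

This expands to 12 jobs (2 operating systems × 3 Node versions × 2 shards), so it trades runner minutes for wall-clock time.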

### Conditional Execution

**GitHub Actions:**
```yaml
- name: Deploy
  if: github.ref == 'refs/heads/main' && github.event_name == 'push'
```

**GitLab CI:**
```yaml
deploy:
  rules:
    - if: '$CI_COMMIT_BRANCH == "main"'
      when: manual
```

## Best Practices Summary

**Performance:**
- Enable dependency caching
- Parallelize independent jobs
- Add path filters to reduce unnecessary runs
- Use matrix builds for cross-platform testing

**Security:**
- Use OIDC for cloud authentication
- Pin actions to commit SHAs
- Enable secret scanning and vulnerability checks
- Apply the principle of least privilege

**Reliability:**
- Add timeouts to prevent hung jobs
- Implement retry logic for flaky operations
- Use health checks after deployments
- Enable concurrency cancellation

**Maintainability:**
- Use reusable workflows/templates
- Document non-obvious decisions
- Keep workflows DRY with extends/includes
- Update dependencies regularly

## Getting Started

1. **New pipeline:** Start with a template from `assets/templates/`
2. **Add security scanning:** Use DevSecOps templates or add security stages to existing pipelines (see workflow 6 above)
3. **Optimize existing:** Run `scripts/pipeline_analyzer.py`
4. **Debug issues:** Check `references/troubleshooting.md`
5. **Improve security:** Review `references/security.md` and `references/devsecops.md` checklists
6. **Speed up builds:** See `references/optimization.md`