Initial commit

Zhongwei Li
2025-11-29 18:24:03 +08:00
commit d5d2962808
12 changed files with 919 additions and 0 deletions

247
commands/automate.md Normal file

@@ -0,0 +1,247 @@
---
model: claude-sonnet-4-0
allowed-tools: Task, Bash, Write, Read
argument-hint: <task> [approach]
description: Workflow automation and scripting for DevOps operations
---
# Automate Command
You are the automation specialist for DevOps workflow automation and scripting.
## Task
Create comprehensive automation for the following DevOps task:
**Task**: $1
**Scripting Approach**: ${2:-bash}
## Automation Guidelines
### Scripting Approach Selection
**Bash Scripts**:
- System administration tasks
- CI/CD pipeline hooks
- Quick automation utilities
- Shell integration requirements
**Python Scripts**:
- Complex logic and data processing
- API integration and orchestration
- Cross-platform compatibility
- Rich library ecosystem needs
**Makefile**:
- Build automation
- Dependency management
- Project task coordination
- Language-agnostic workflows
**Justfile**:
- Modern task runner alternative
- Simpler recipe syntax than Make, with no `.PHONY` declarations needed
- Cross-platform command execution
- Development workflow automation
### Automation Best Practices
1. **Idempotency**: Scripts should be safe to run multiple times
2. **Error Handling**: Proper exit codes and error messages
3. **Logging**: Comprehensive logging for troubleshooting
4. **Configuration**: Environment variables and config files
5. **Validation**: Input validation and pre-flight checks
6. **Documentation**: Clear usage instructions and examples
7. **Testing**: Test scripts in safe environments first
8. **Security**: Avoid hardcoded secrets, use secure practices
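As a minimal sketch of the idempotency practice above: the helper below appends a line to a file only when that exact line is not already present, so running it repeatedly leaves the file unchanged. The name `ensure_line` is hypothetical and chosen for illustration only.

```bash
#!/usr/bin/env bash
set -euo pipefail

# ensure_line FILE LINE
# Hypothetical helper: append LINE to FILE only if an identical line is absent.
ensure_line() {
  local file="$1" line="$2"
  touch -- "$file"
  # -F: fixed string, -x: match the whole line, -q: quiet
  if ! grep -Fxq -- "$line" "$file"; then
    printf '%s\n' "$line" >> "$file"
  fi
}
```

Calling `ensure_line ./hosts.local "10.0.0.5 db.internal"` twice should leave a single matching line in the file.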
## SECURITY WARNING
**CRITICAL: This command has access to the Bash tool, which can execute system commands.**
Before creating ANY automation script, YOU MUST complete this security checklist:
### Pre-Script Security Checklist
- [ ] **Threat Model**: What's the worst that could happen if this script is compromised?
- [ ] **Input Sources**: What external inputs does this script accept (CLI args, env vars, files)?
- [ ] **Input Validation**: How will EVERY input be validated and sanitized?
- [ ] **Privilege Level**: What's the MINIMUM privilege needed? Can we avoid root/sudo?
- [ ] **Secrets**: Are there ANY credentials, keys, or tokens? How will they be secured?
- [ ] **Blast Radius**: If this script fails or is exploited, what's the maximum damage?
- [ ] **Rollback**: Can we undo changes if something goes wrong?
- [ ] **Audit Trail**: Will security-relevant operations be logged?
### Security Requirements for ALL Scripts
**MANDATORY for Every Script:**
1. **Input Validation** - NEVER trust external input
```bash
# Validate before use
if [[ ! "$filename" =~ ^[a-zA-Z0-9._-]+$ ]]; then
  echo "ERROR: Invalid filename" >&2
  exit 1
fi
```
2. **No Hardcoded Secrets** - NEVER put credentials in code
```bash
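# GOOD: From environment or vault (-field prints only the named field;
# the field name "value" is an assumption, adjust it to your secret)
DB_PASSWORD="${DB_PASSWORD:-$(vault read -field=value secret/db/password)}"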
# BAD: Hardcoded (DO NOT DO THIS)
DB_PASSWORD="secretpass123"
```
3. **Quote Variables** - Prevent word splitting, globbing, and argument injection
```bash
# GOOD: Quoted, with -- so a value starting with "-" is not parsed as an option
rm -f -- "$filename"
# BAD: Unquoted (word splitting and globbing can inject extra arguments)
rm -f $filename
```
4. **Strict Error Handling** - Fail safely
```bash
#!/bin/bash
set -euo pipefail # Exit on error, undefined vars, pipe failures
IFS=$'\n\t' # Safer word splitting
```
5. **Secure Temp Files** - No predictable names
```bash
# GOOD: Secure temp file
temp_file=$(mktemp) || exit 1
trap 'rm -f "$temp_file"' EXIT
# BAD: Predictable name (SECURITY RISK)
temp_file="/tmp/myfile.$$"
```
6. **Principle of Least Privilege** - Minimum permissions
```bash
# Set restrictive permissions
chmod 600 "$config_file" # Only owner can read/write
umask 077 # New files private by default
```
### Dangerous Operations - Extra Caution Required
**STOP and Double-Check Before Using:**
- `rm -rf` - Can delete entire systems if paths are wrong
- `chmod 777` - Grants read, write, and execute to every user, NEVER acceptable
- `sudo` / `su` - Privilege escalation, minimize use
- `eval` - Can execute arbitrary code
- `source` - Can execute arbitrary scripts
- Direct SQL execution - Use parameterized queries
- Unquoted variables in commands - Word splitting and globbing can inject arguments
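One way to add the extra caution that `rm -rf` calls for is to route every recursive delete through a guard. The sketch below is an assumption about how such a guard could look, not a prescribed implementation: `safe_rm_rf` is a hypothetical name, and it only deletes an existing directory that resolves to a path strictly inside an allowed base directory.

```bash
#!/usr/bin/env bash
set -euo pipefail

# safe_rm_rf BASE TARGET
# Hypothetical guard: delete the directory TARGET recursively only if it
# resolves to a path strictly inside BASE. Refuses empty values, "/" as a
# base, BASE itself, and anything that escapes BASE via ".." or symlinks.
safe_rm_rf() {
  local base="${1:-}" target="${2:-}"
  if [[ -z "$base" || -z "$target" ]]; then
    echo "ERROR: base and target are required" >&2
    return 1
  fi
  local real_base real_target
  # cd + pwd -P resolves symlinks and ".." using only shell builtins
  real_base=$(cd -- "$base" && pwd -P) || return 1
  real_target=$(cd -- "$target" && pwd -P) || return 1
  if [[ "$real_base" == "/" || "$real_target" != "$real_base"/* ]]; then
    echo "ERROR: refusing to delete '$target' (not inside '$base')" >&2
    return 1
  fi
  rm -rf -- "$real_target"
}
```

For example, `safe_rm_rf /var/tmp/myapp /var/tmp/myapp/cache` would proceed, while `safe_rm_rf /var/tmp/myapp /var/tmp/myapp/../other` would be refused (both paths are placeholders).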
### Command Injection Prevention
**CRITICAL VULNERABILITIES TO AVOID:**
```bash
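# VULNERABLE: User input in unquoted variable
user_input="file.txt; rm -rf /"
# DANGER: no second command runs, but word splitting hands rm the arguments
# "file.txt;" "rm" "-rf" "/", so rm may treat -rf as an option and / as a target
rm -f $user_input
# SAFE: Validated, quoted, and preceded by -- (the pattern still allows a leading "-")
if [[ "$user_input" =~ ^[a-zA-Z0-9._-]+$ ]]; then
  rm -f -- "$user_input"
fi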
```
```python
import subprocess

# VULNERABLE: shell=True with user input
subprocess.run(f"ls {user_input}", shell=True) # DANGER
# SAFE: List form, no shell; "--" stops option parsing, check=True raises on failure
subprocess.run(["ls", "--", user_input], check=True)
```
### Script Structure
**Initialization**:
- Shebang and interpreter settings
- Strict error handling (set -euo pipefail for bash)
- Environment variable validation
- Dependency checks
**Core Functionality**:
- Modular functions or classes
- Clear separation of concerns
- Reusable components
- Configuration management
**Output and Logging**:
- Structured logging (timestamp, level, message)
- Progress indicators for long-running tasks
- Summary output with results
- Error reporting with context
**Cleanup and Exit**:
- Trap handlers for cleanup
- Proper resource cleanup
- Exit code conventions
- Rollback on failure when appropriate
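The four parts above can be sketched as a single bash skeleton. This is an illustrative layout under stated assumptions rather than a required template: the function names (`log`, `require_cmd`, `require_env`, `cleanup`, `main`) and the `WORK_DIR` variable are hypothetical.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Structured log line: UTC timestamp, level, message (written to stderr).
log() {
  local level="$1"; shift
  printf '%s [%s] %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$level" "$*" >&2
}

# Dependency check: fail early if a required command is missing.
require_cmd() {
  command -v "$1" >/dev/null 2>&1 || { log ERROR "missing dependency: $1"; return 1; }
}

# Environment validation: fail early if a variable is unset or empty.
require_env() {
  [[ -n "${!1:-}" ]] || { log ERROR "missing environment variable: $1"; return 1; }
}

WORK_DIR=""
cleanup() {
  # Runs on every exit path; removes the scratch directory if one was created.
  if [[ -n "$WORK_DIR" ]]; then
    rm -rf -- "$WORK_DIR"
  fi
}

main() {
  trap cleanup EXIT
  require_cmd date
  WORK_DIR=$(mktemp -d)
  log INFO "starting in $WORK_DIR"
  # ... core functionality goes here ...
  log INFO "done"
}

# Run main only when executed directly, not when sourced.
if [[ "${BASH_SOURCE[0]:-}" == "$0" ]]; then
  main "$@"
fi
```

Keeping the entry point behind the `BASH_SOURCE` check lets the helper functions be sourced and tested without running the script body.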
## Deliverables
1. Complete automation script with:
- Proper shebang and permissions
- Comprehensive error handling
- Detailed logging and output
- Configuration options
- Usage documentation
2. Supporting documentation:
- Installation/setup instructions
- Configuration options
- Usage examples
- Troubleshooting guide
3. Integration guidance:
- CI/CD pipeline integration
- Scheduled execution (cron, systemd timers)
- Monitoring and alerting
- Maintenance procedures
## Example Scenarios
### Database Backup Automation
- Automated backup execution
- Rotation and retention policies
- Compression and encryption
- Remote storage upload
- Verification and testing
- Notification on failure
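The rotation and retention item could be sketched as follows. The function name `prune_backups` and the `backup-<timestamp>.sql.gz` naming scheme are assumptions made for this example; the approach relies on file names embedding a sortable timestamp so that name order matches age order.

```bash
#!/usr/bin/env bash
set -euo pipefail

# prune_backups DIR KEEP
# Hypothetical retention helper: keeps the KEEP newest files matching
# backup-*.sql.gz in DIR and deletes the rest. Assumes names such as
# backup-20251129-182403.sql.gz, so sorting by name sorts by age.
prune_backups() {
  local dir="$1" keep="$2"
  if [[ ! "$keep" =~ ^[1-9][0-9]*$ ]]; then
    echo "ERROR: KEEP must be a positive integer" >&2
    return 1
  fi
  local files=() f
  for f in "$dir"/backup-*.sql.gz; do
    # Skip the literal pattern when nothing matches.
    if [[ -e "$f" ]]; then
      files+=("$f")
    fi
  done
  local total=${#files[@]} i
  # Globs expand in sorted order, so the oldest backups come first.
  for (( i = 0; i < total - keep; i++ )); do
    rm -f -- "${files[i]}"
  done
}
```

For instance, `prune_backups /var/backups/db 7` (placeholder path) would keep the seven most recent backups.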
### Environment Setup
- Dependency installation
- Configuration file generation
- Service initialization
- Health checks and validation
- Rollback on failure
- Documentation of setup state
### Deployment Automation
- Pre-deployment validation
- Service deployment steps
- Health check verification
- Rollback capabilities
- Post-deployment tasks
- Notification and reporting
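A possible shape for deployment steps with rollback capability is an undo stack: each step registers how to reverse itself, and a failure replays those actions in reverse order. The names `on_rollback`, `rollback`, `deploy`, and `ROLLBACK_STACK` are hypothetical, and steps and undo actions are assumed to be shell functions defined by the caller.

```bash
#!/usr/bin/env bash
set -euo pipefail

ROLLBACK_STACK=()

# on_rollback FUNC
# Register a function that undoes the step that just completed.
on_rollback() { ROLLBACK_STACK+=("$1"); }

# Run registered undo functions in reverse order (last applied, first undone).
rollback() {
  local i
  for (( i = ${#ROLLBACK_STACK[@]} - 1; i >= 0; i-- )); do
    "${ROLLBACK_STACK[i]}" || echo "WARN: rollback step failed: ${ROLLBACK_STACK[i]}" >&2
  done
  ROLLBACK_STACK=()
}

# deploy STEP...
# Runs each STEP (a function name) in order; on the first failure it rolls
# back everything registered so far and returns 1.
deploy() {
  local step
  for step in "$@"; do
    if ! "$step"; then
      echo "ERROR: step '$step' failed, rolling back" >&2
      rollback
      return 1
    fi
  done
}
```

A step such as `start_service() { ...; on_rollback stop_service; }` would then be passed by name, for example `deploy validate start_service verify_health`.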
### Infrastructure Provisioning
- Resource creation orchestration
- State management
- Error recovery
- Idempotent execution
- Cost tracking
- Cleanup procedures
Invoke the automation-specialist agent with: $ARGUMENTS

202
commands/deploy.md Normal file

@@ -0,0 +1,202 @@
---
model: claude-sonnet-4-0
allowed-tools: Task, Bash, Read, Write
argument-hint: <environment> [strategy]
description: Orchestrate deployments with rollback strategies and health checks
---
# Infrastructure Deployment Command
Orchestrate sophisticated infrastructure deployments with CI/CD best practices, automated rollback strategies, and comprehensive health checks. Manage blue-green, canary, rolling, and recreate deployment patterns across environments.
## SECURITY WARNING
**CRITICAL: Deployment operations have high-privilege access and can modify production systems.**
Deployment scripts will:
- Execute commands on production infrastructure
- Modify load balancer configurations
- Update database schemas or trigger migrations
- Handle sensitive environment variables and secrets
- Potentially cause service disruptions if misconfigured
### Pre-Deployment Security Checklist
BEFORE executing any deployment, verify:
- [ ] **Credentials Management**: No secrets in code, use vaults or secret managers
- [ ] **Access Control**: Deployment executed by authorized personnel only
- [ ] **Change Approval**: Required approvals obtained for production deployments
- [ ] **Rollback Plan**: Tested rollback procedure available
- [ ] **Secrets Rotation**: Rotation schedule in place; secrets injected at deploy time, never hardcoded
- [ ] **Audit Logging**: All deployment actions logged with who/what/when
- [ ] **Validation Gates**: Pre-deployment health checks passed
- [ ] **Backup Verification**: Recent backup available before destructive operations
- [ ] **Blast Radius**: Impact scope understood and limited
- [ ] **Communication**: Relevant teams notified of deployment window
### Deployment Security Best Practices
1. **Secret Management**
- Use AWS Secrets Manager, HashiCorp Vault, or similar
- Rotate secrets regularly, especially after deployments
- Never log or echo secrets in deployment scripts
- Use scoped, short-lived credentials when possible
2. **Least Privilege Deployment**
- Use service accounts with minimum required permissions
- Avoid deploying as root or with admin privileges
- Scope IAM roles to specific resources and actions
- Review and audit deployment permissions quarterly
3. **Secure Communication**
- All deployment traffic over TLS/HTTPS
- Verify TLS certificates, don't disable validation
- Use VPN or bastion hosts for production access
- Implement network segmentation and firewalls
4. **Validation & Safety**
- Dry-run mode to preview changes before execution
- Health checks at every stage of deployment
- Automated rollback triggers on error thresholds
- Database migrations in transactions when possible
- Blue-green keeps previous version running until validated
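The dry-run mode mentioned above could be sketched as a small wrapper that every state-changing command goes through. The `DRY_RUN` variable and the `run` function are assumed names for this example, not an existing convention of this command.

```bash
#!/usr/bin/env bash
set -euo pipefail

# DRY_RUN=1 previews commands instead of executing them (assumed variable name).
DRY_RUN="${DRY_RUN:-0}"

# run CMD [ARGS...]
# Prints the command in dry-run mode; otherwise executes it with arguments
# passed through unchanged (no eval, no re-splitting).
run() {
  if [[ "$DRY_RUN" == "1" ]]; then
    printf 'DRY-RUN:'
    printf ' %q' "$@"
    printf '\n'
  else
    "$@"
  fi
}
```

With `DRY_RUN=1`, a call like `run systemctl restart myapp` (placeholder service name) only prints the command, which makes the change set reviewable before a real run.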
### Red Flags to Never Ignore
**STOP IMMEDIATELY if you see:**
- Secrets hardcoded in deployment scripts
- Deployment scripts with 777 permissions
- Root credentials used for deployment
- Deployment bypassing change approval process
- No rollback strategy for production changes
- Untested deployment to production
- Missing health checks or monitoring
## How It Works
This command invokes the deployment-coordinator agent to:
1. Validate deployment readiness and environment configuration
2. Execute pre-deployment checks and dependency verification
3. Orchestrate the selected deployment strategy
4. Monitor health checks and performance metrics
5. Manage automated rollback if issues are detected
6. Generate deployment reports and audit trails
## Arguments
**$1 (Required)**: Deployment target or environment
- `staging`: Staging environment deployment
- `production`: Production environment deployment
- `dev`: Development environment deployment
- Custom environment names as configured
**$2 (Optional)**: Deployment strategy
- `blue-green`: Zero-downtime deployment with instant rollback capability (default for production)
- `canary`: Progressive rollout with traffic shifting
- `rolling`: Gradual instance replacement with configurable batch size
- `recreate`: Simple stop-start deployment (default for staging/dev)
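The defaulting rules above can be sketched in bash as follows. `resolve_strategy` is a hypothetical helper, and treating every non-production environment (including custom names) as defaulting to `recreate` is an assumption, since the defaults are only stated for production, staging, and dev.

```bash
#!/usr/bin/env bash
set -euo pipefail

# resolve_strategy ENVIRONMENT [STRATEGY]
# Prints the strategy to use: the explicit one if it is valid, otherwise
# blue-green for production and recreate for any other environment.
resolve_strategy() {
  local environment="${1:-}" strategy="${2:-}"
  if [[ -z "$environment" ]]; then
    echo "ERROR: environment is required" >&2
    return 1
  fi
  if [[ -z "$strategy" ]]; then
    if [[ "$environment" == "production" ]]; then
      strategy="blue-green"
    else
      strategy="recreate"
    fi
  fi
  case "$strategy" in
    blue-green|canary|rolling|recreate) printf '%s\n' "$strategy" ;;
    *) echo "ERROR: unknown strategy '$strategy'" >&2; return 1 ;;
  esac
}
```

Rejecting unknown strategy names up front keeps a typo from silently falling back to a default on a production deployment.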
## Examples
### Blue-Green Production Deployment
```bash
/deploy production blue-green
```
Orchestrates zero-downtime deployment with automated traffic switching and instant rollback capability
### Canary Staging Deployment
```bash
/deploy staging canary
```
Progressive deployment with 10% -> 50% -> 100% traffic shifting and automated rollback on error thresholds
### Rolling Update
```bash
/deploy production rolling
```
Gradual instance replacement with health checks between batches and automatic pause on failures
### Simple Staging Deployment
```bash
/deploy staging
```
Uses recreate strategy for straightforward staging deployment with basic health validation
## Use Cases
**Production Deployments**
- Zero-downtime releases
- Traffic management
- Automated rollback
- Performance validation
**Staging Validation**
- Pre-production testing
- Integration validation
- Performance benchmarking
- Security scanning
**Emergency Rollbacks**
- Instant blue-green switch
- Canary traffic reversal
- Rolling update pause/rollback
- State preservation
**Multi-Region Deployments**
- Region-by-region rollout
- Cross-region validation
- Failover management
- Global load balancing
## What You Get
1. **Deployment Orchestration**: Automated execution of complex deployment patterns
2. **Health Monitoring**: Continuous validation with configurable thresholds and metrics
3. **Rollback Automation**: Instant rollback triggers based on health, performance, or error rates
4. **Audit Trail**: Complete deployment history with decision points and metrics
5. **Integration Points**: CI/CD pipeline integration with GitHub Actions, GitLab CI, Jenkins
## Deployment Strategy Details
**Blue-Green**:
- Maintains two identical production environments
- Instant traffic switching via load balancer
- Zero downtime with immediate rollback
- Best for critical production systems
**Canary**:
- Progressive traffic shifting (10% -> 50% -> 100%)
- Automated metrics comparison between versions
- Gradual rollback with minimal user impact
- Ideal for validating new features
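A rough sketch of the canary progression, assuming integer percentages: traffic moves through the 10, 50, and 100 percent stages, and the rollout reverts to 0 percent as soon as the observed error rate exceeds a threshold. `set_traffic_weight` and `canary_error_percent` are hypothetical hooks that the caller would implement for their own load balancer and metrics system.

```bash
#!/usr/bin/env bash
set -euo pipefail

# canary_rollout MAX_ERROR_PERCENT
# Shifts traffic through the stages 10 -> 50 -> 100. After each shift it
# reads the canary error rate (an integer percentage) and rolls back to 0%
# if the threshold is exceeded. Both hooks below are caller-provided:
#   set_traffic_weight PERCENT   routes PERCENT of traffic to the canary
#   canary_error_percent         prints the current error rate as an integer
canary_rollout() {
  local max_error="$1" weight errors
  for weight in 10 50 100; do
    set_traffic_weight "$weight"
    errors=$(canary_error_percent)
    if (( errors > max_error )); then
      echo "ERROR: ${errors}% errors at ${weight}% traffic, rolling back" >&2
      set_traffic_weight 0
      return 1
    fi
  done
}
```

A real rollout would normally wait for a soak period between stages before sampling metrics; that delay is omitted here to keep the sketch short.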
**Rolling**:
- Configurable batch size and wait times
- Health checks between batches
- Automatic pause on threshold breaches
- Good for large-scale deployments
**Recreate**:
- Simple stop-start deployment
- Brief downtime acceptable
- Minimal complexity
- Suitable for dev/staging environments
## Tips for Best Results
1. **Test in Staging**: Always validate deployment strategies in staging first
2. **Define Health Checks**: Configure comprehensive health endpoints and thresholds
3. **Set Rollback Criteria**: Define clear metrics for automated rollback triggers
4. **Monitor Actively**: Use observability tools during deployment windows
5. **Document Changes**: Include deployment notes and change logs
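For the health check and rollback criteria tips, one possible building block is a bounded retry around any probe command. `wait_healthy` is a hypothetical name, and the probe is whatever command the caller supplies.

```bash
#!/usr/bin/env bash
set -euo pipefail

# wait_healthy ATTEMPTS DELAY_SECONDS CMD [ARGS...]
# Retries CMD until it succeeds or ATTEMPTS is exhausted. CMD is any probe,
# for example: curl -fsS https://example.invalid/healthz (placeholder URL).
wait_healthy() {
  local attempts="$1" delay="$2"
  shift 2
  local n
  for (( n = 1; n <= attempts; n++ )); do
    if "$@"; then
      return 0
    fi
    # Sleep between attempts, but not after the final one.
    if (( n < attempts )); then
      sleep "$delay"
    fi
  done
  echo "ERROR: health check failed after ${attempts} attempts" >&2
  return 1
}
```

A non-zero return from `wait_healthy` gives a deployment script a clear, single signal on which to trigger its rollback path.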
## Example Session
```bash
/deploy production blue-green
```
**Result**: Deployment coordinator validates infrastructure, runs pre-flight checks, provisions blue environment, validates health, switches traffic at load balancer, monitors for 5 minutes, and either commits or rolls back based on metrics. Full audit trail and performance report generated.
Invoke the deployment-coordinator agent with: $ARGUMENTS

25
commands/infra.md Normal file

@@ -0,0 +1,25 @@
---
model: claude-sonnet-4-0
allowed-tools: Task, Bash, Read, Write
argument-hint: <requirement> [tool]
description: Infrastructure as Code design with Terraform, CDK, and CloudFormation
---
# Infra Command
Infrastructure as Code design with Terraform, CDK, and CloudFormation
## Arguments
**$1 (Required)**: The infrastructure requirement to design, as a quoted string
**$2 (Optional)**: The IaC tool to target, e.g. `terraform` or `cdk`
## Examples
```bash
/infra "Setup EKS cluster" terraform
/infra "Configure S3 with CloudFront CDN" cdk
```
Invoke the infra-architect agent with: $ARGUMENTS

25
commands/pipeline.md Normal file

@@ -0,0 +1,25 @@
---
model: claude-sonnet-4-0
allowed-tools: Task, Bash, Read, Write
argument-hint: <requirement> [platform]
description: CI/CD pipeline design and optimization for modern platforms
---
# Pipeline Command
CI/CD pipeline design and optimization for modern platforms
## Arguments
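**$1 (Required)**: The pipeline requirement to implement, as a quoted string
**$2 (Optional)**: The CI/CD platform to target, e.g. `github-actions` or `gitlab-ci`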
## Examples
```bash
/pipeline "Add automated testing and deployment" github-actions
/pipeline "Optimize build speed" gitlab-ci
```
Invoke the cicd-engineer agent with: $ARGUMENTS