Initial commit

Zhongwei Li
2025-11-29 18:24:10 +08:00
commit 3b0a1ed0dd
14 changed files with 517 additions and 0 deletions

commands/audit.md Normal file

@@ -0,0 +1,25 @@
---
model: claude-sonnet-4-0
allowed-tools: Task, Bash, Read, Write
argument-hint: <target> [framework]
description: Audit logging and compliance tracking for enterprise requirements
---
# Audit Command
Audit logging and compliance tracking for enterprise requirements
## Arguments
- **$1 (Required)**: target
- **$2 (Optional)**: framework
## Examples
```bash
/audit "User access logs" soc2
/audit "Data retention policies" gdpr
```
Invoke the compliance-auditor agent with: $ARGUMENTS

commands/incident.md Normal file

@@ -0,0 +1,25 @@
---
model: claude-sonnet-4-0
allowed-tools: Task, Bash, Read, Write
argument-hint: <incident> [phase]
description: Incident response orchestration and SRE best practices
---
# Incident Command
Incident response orchestration and SRE best practices
## Arguments
- **$1 (Required)**: incident
- **$2 (Optional)**: phase
## Examples
```bash
/incident "Database connection pool exhausted" triage
/incident "Yesterday's outage analysis" postmortem
```
Invoke the sre-engineer agent with: $ARGUMENTS

commands/monitor.md Normal file

@@ -0,0 +1,104 @@
---
model: claude-sonnet-4-0
allowed-tools: Task, Bash, Read, Write
argument-hint: <target> [platform]
description: Set up monitoring and alerting for applications and infrastructure
---
# Monitor Command
You are an observability specialist focused on implementing comprehensive monitoring and alerting solutions across multiple platforms.
## Your Mission
Configure monitoring dashboards, metrics collection, and alerting rules for the specified target using the requested platform (defaulting to Datadog if not specified).
## Arguments
You will receive positional arguments:
- `$1` (Required): Target to monitor - service name, metric type, application component, or infrastructure resource
- `$2` (Optional): Monitoring platform - datadog, cloudwatch, prometheus, grafana (defaults to datadog)
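The argument handling described above can be sketched as follows. This is only an illustration, assuming the raw `$ARGUMENTS` string is split shell-style; `parse_monitor_args` and `SUPPORTED_PLATFORMS` are hypothetical names and not part of the command itself.

```python
import shlex
from typing import Tuple

# Platforms named in the Arguments section; the constant name is hypothetical.
SUPPORTED_PLATFORMS = ("datadog", "cloudwatch", "prometheus", "grafana")


def parse_monitor_args(raw: str) -> Tuple[str, str]:
    """Split a raw argument string into (target, platform)."""
    # shlex honors quotes, so "API response times" stays a single token.
    parts = shlex.split(raw)
    if not parts:
        raise ValueError("target ($1) is required")
    target = parts[0]
    # Fall back to datadog when $2 is omitted.
    platform = parts[1].lower() if len(parts) > 1 else "datadog"
    if platform not in SUPPORTED_PLATFORMS:
        raise ValueError(f"unsupported platform: {platform}")
    return target, platform
```

Rejecting unknown platforms early is a design choice of this sketch; the command text itself does not say what should happen in that case.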
## Platform-Specific Approaches
### Datadog
- Configure APM traces and service monitoring
- Set up custom metrics and dashboards
- Create alert rules with appropriate thresholds
- Implement anomaly detection where applicable
- Configure notification channels (PagerDuty, Slack, email)
### CloudWatch
- Set up CloudWatch metrics and custom metrics
- Configure CloudWatch Alarms with appropriate evaluation periods
- Create CloudWatch Dashboards for visualization
- Set up CloudWatch Logs Insights queries
- Configure SNS topics for notifications
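To make "appropriate evaluation periods" concrete, here is a simplified model of an "M out of N datapoints" alarm. It is a sketch only: it assumes a greater-than comparison and ignores missing-data handling, and `alarm_state` is a hypothetical helper, not an AWS API.

```python
from typing import Sequence


def alarm_state(datapoints: Sequence[float], threshold: float,
                evaluation_periods: int, datapoints_to_alarm: int) -> str:
    """Return the alarm state for the most recent evaluation window."""
    if len(datapoints) < evaluation_periods:
        return "INSUFFICIENT_DATA"
    # Only the last `evaluation_periods` datapoints are considered.
    recent = datapoints[-evaluation_periods:]
    breaching = sum(1 for value in recent if value > threshold)
    return "ALARM" if breaching >= datapoints_to_alarm else "OK"
```

Raising `evaluation_periods` while keeping `datapoints_to_alarm` below it tolerates short spikes, which is one way to reduce false positives.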
### Prometheus
- Define metric scrape configurations
- Create recording and alerting rules
- Set up Alertmanager for notification routing
- Configure service discovery mechanisms
### Grafana
- Design comprehensive dashboards
- Configure data sources (Prometheus, CloudWatch, etc.)
- Set up alert rules and notification channels
- Implement template variables for flexibility
## Implementation Guidelines
1. **Assess Requirements**
- Identify key metrics and KPIs for the target
- Determine appropriate alert thresholds
- Define SLIs/SLOs if applicable
2. **Configure Metrics Collection**
- Set up metric exporters or agents
- Configure custom metrics if needed
- Validate metric ingestion
3. **Create Dashboards**
- Design clear, actionable visualizations
- Include relevant time ranges and aggregations
- Add annotations for deployment events
4. **Set Up Alerting**
- Define alert conditions and thresholds
- Configure escalation policies
- Set up notification channels
- Implement alert suppression for maintenance windows
5. **Document Configuration**
- Provide dashboard URLs
- Document alert thresholds and rationale
- Include runbook references for alerts
6. **Validate Setup**
- Test metric collection
- Verify alert triggering
- Confirm notification delivery
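The alert suppression step in the guidelines above could be sketched like this. It is a minimal illustration, assuming maintenance windows are given as half-open `(start, end)` pairs of timezone-aware datetimes; `should_notify` and `Window` are hypothetical names rather than any platform's API.

```python
from datetime import datetime, timezone
from typing import Iterable, Tuple

Window = Tuple[datetime, datetime]  # (start, end), end exclusive


def should_notify(fired_at: datetime, windows: Iterable[Window]) -> bool:
    """Return False when the alert fired inside any maintenance window."""
    return not any(start <= fired_at < end for start, end in windows)


# Example: a two-hour window starting at 02:00 UTC (illustrative values).
maintenance = [(datetime(2025, 1, 1, 2, 0, tzinfo=timezone.utc),
                datetime(2025, 1, 1, 4, 0, tzinfo=timezone.utc))]
suppressed = not should_notify(
    datetime(2025, 1, 1, 3, 0, tzinfo=timezone.utc), maintenance)
```

In practice each platform has its own mechanism for this (for example scheduled downtimes or silences), so the sketch only shows the underlying check.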
## Examples
```bash
/monitor "API response times" datadog
/monitor "Lambda function errors" cloudwatch
/monitor "PostgreSQL database metrics" prometheus
/monitor "Kubernetes cluster health" grafana
/monitor "payment-service" datadog
```
## Success Criteria
- Metrics are collecting successfully
- Dashboards provide clear visibility
- Alerts fire appropriately with minimal false positives
- Notification channels are configured and tested
- Documentation is complete and accessible
---
Invoke the datadog-specialist agent with: $ARGUMENTS

commands/slo.md Normal file

@@ -0,0 +1,25 @@
---
model: claude-sonnet-4-0
allowed-tools: Task, Bash, Read, Write
argument-hint: <service> [type]
description: SLO/SLI definition and reliability tracking
---
# SLO Command
SLO/SLI definition and reliability tracking
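As a rough illustration of reliability tracking, the sketch below computes how much of an error budget remains for an event-based SLI. The function name and the good/total event model are assumptions made for this example; the command does not prescribe them.

```python
def error_budget_remaining(slo_target: float, good_events: int,
                           total_events: int) -> float:
    """Fraction of the error budget still unspent (negative when overspent)."""
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    # The budget is the number of bad events the SLO tolerates.
    allowed_bad = (1 - slo_target) * total_events
    actual_bad = total_events - good_events
    return 1 - actual_bad / allowed_bad
```

For a 99.9% target over 1,000,000 requests with 500 failures, about half of the budget is left.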
## Arguments
- **$1 (Required)**: service
- **$2 (Optional)**: type
## Examples
```bash
/slo "payment-api" availability
/slo "search-service" latency
```
Invoke the sre-engineer agent with: $ARGUMENTS

commands/trace.md Normal file

@@ -0,0 +1,25 @@
---
model: claude-sonnet-4-0
allowed-tools: Task, Bash, Read, Write
argument-hint: <service> [focus]
description: Distributed tracing and performance bottleneck analysis
---
# Trace Command
Distributed tracing and performance bottleneck analysis
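One common way to look for a bottleneck in a trace is to rank spans by self time, meaning a span's duration minus the time spent in its direct children. The sketch below assumes a hypothetical `(span_id, parent_id, duration_ms)` tuple format and children that do not overlap in time; real tracing backends use richer span models.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

Span = Tuple[str, Optional[str], float]  # (span_id, parent_id, duration_ms)


def slowest_by_self_time(spans: List[Span]) -> Tuple[str, float]:
    """Return (span_id, self_time_ms) for the span with the largest self time."""
    # Sum the durations of each span's direct children.
    child_time: Dict[str, float] = defaultdict(float)
    for _span_id, parent_id, duration in spans:
        if parent_id is not None:
            child_time[parent_id] += duration
    self_times = {span_id: duration - child_time[span_id]
                  for span_id, _parent_id, duration in spans}
    worst = max(self_times, key=lambda span_id: self_times[span_id])
    return worst, self_times[worst]
```

Ranking by self time rather than total duration keeps a root span from always appearing as the slowest simply because it contains everything else.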
## Arguments
- **$1 (Required)**: service
- **$2 (Optional)**: focus
## Examples
```bash
/trace "checkout-service" latency
/trace "payment-api" bottlenecks
```
Invoke the performance-analyst agent with: $ARGUMENTS