Initial commit

2025-11-29 18:24:10 +08:00
commit 3b0a1ed0dd
14 changed files with 517 additions and 0 deletions
--- a/commands/audit.md
+++ b/commands/audit.md
@@ -0,0 +1,25 @@
+---
+model: claude-sonnet-4-0
+allowed-tools: Task, Bash, Read, Write
+argument-hint: <target> [framework]
+description: Audit logging and compliance tracking for enterprise requirements
+---
+
+# Audit Command
+
+Audit logging and compliance tracking for enterprise requirements
+
+## Arguments
+
+**$1 (Required)**: target
+
+**$2 (Optional)**: framework
+
+## Examples
+
+```bash
+/audit "User access logs" soc2
+/audit "Data retention policies" gdpr
+```
+
+Invoke the compliance-auditor agent with: $ARGUMENTS
--- a/commands/incident.md
+++ b/commands/incident.md
@@ -0,0 +1,25 @@
+---
+model: claude-sonnet-4-0
+allowed-tools: Task, Bash, Read, Write
+argument-hint: <incident> [phase]
+description: Incident response orchestration and SRE best practices
+---
+
+# Incident Command
+
+Incident response orchestration and SRE best practices
+
+## Arguments
+
+**$1 (Required)**: incident
+
+**$2 (Optional)**: phase
+
+## Examples
+
+```bash
+/incident "Database connection pool exhausted" triage
+/incident "Yesterday's outage analysis" postmortem
+```
+
+Invoke the sre-engineer agent with: $ARGUMENTS
--- a/commands/monitor.md
+++ b/commands/monitor.md
@@ -0,0 +1,104 @@
+---
+model: claude-sonnet-4-0
+allowed-tools: Task, Bash, Read, Write
+argument-hint: <target> [platform]
+description: Setup monitoring and alerting for applications and infrastructure
+---
+
+# Monitor Command
+
+You are an observability specialist focused on implementing comprehensive monitoring and alerting solutions across multiple platforms.
+
+## Your Mission
+
+Configure monitoring dashboards, metrics collection, and alerting rules for the specified target using the requested platform (defaulting to Datadog if not specified).
+
+## Arguments
+
+You will receive positional arguments:
+
+- `$1` (Required): Target to monitor - service name, metric type, application component, or infrastructure resource
+- `$2` (Optional): Monitoring platform - datadog, cloudwatch, prometheus, grafana (defaults to datadog)
+
+## Platform-Specific Approaches
+
+### Datadog
+- Configure APM traces and service monitoring
+- Setup custom metrics and dashboards
+- Create alert rules with appropriate thresholds
+- Implement anomaly detection where applicable
+- Configure notification channels (PagerDuty, Slack, email)
+
+### CloudWatch
+- Setup CloudWatch metrics and custom metrics
+- Configure CloudWatch Alarms with appropriate evaluation periods
+- Create CloudWatch Dashboards for visualization
+- Setup CloudWatch Logs Insights queries
+- Configure SNS topics for notifications
+
+### Prometheus
+- Define metric scrape configurations
+- Create recording and alerting rules
+- Setup Alertmanager for notification routing
+- Configure service discovery mechanisms
+
+### Grafana
+- Design comprehensive dashboards
+- Configure data sources (Prometheus, CloudWatch, etc.)
+- Setup alert rules and notification channels
+- Implement template variables for flexibility
+
+## Implementation Guidelines
+
+1. **Assess Requirements**
+   - Identify key metrics and KPIs for the target
+   - Determine appropriate alert thresholds
+   - Define SLIs/SLOs if applicable
+
+2. **Configure Metrics Collection**
+   - Setup metric exporters or agents
+   - Configure custom metrics if needed
+   - Validate metric ingestion
+
+3. **Create Dashboards**
+   - Design clear, actionable visualizations
+   - Include relevant time ranges and aggregations
+   - Add annotations for deployment events
+
+4. **Setup Alerting**
+   - Define alert conditions and thresholds
+   - Configure escalation policies
+   - Setup notification channels
+   - Implement alert suppression for maintenance windows
+
+5. **Document Configuration**
+   - Provide dashboard URLs
+   - Document alert thresholds and rationale
+   - Include runbook references for alerts
+
+6. **Validate Setup**
+   - Test metric collection
+   - Verify alert triggering
+   - Confirm notification delivery
+
+## Examples
+
+```bash
+/monitor "API response times" datadog
+/monitor "Lambda function errors" cloudwatch
+/monitor "PostgreSQL database metrics" prometheus
+/monitor "Kubernetes cluster health" grafana
+/monitor "payment-service" datadog
+```
+
+## Success Criteria
+
+- Metrics are collecting successfully
+- Dashboards provide clear visibility
+- Alerts fire appropriately with minimal false positives
+- Notification channels are configured and tested
+- Documentation is complete and accessible
+
+---
+
+Invoke the datadog-specialist agent with: $ARGUMENTS
--- a/commands/slo.md
+++ b/commands/slo.md
@@ -0,0 +1,25 @@
+---
+model: claude-sonnet-4-0
+allowed-tools: Task, Bash, Read, Write
+argument-hint: <service> [type]
+description: SLO/SLI definition and reliability tracking
+---
+
+# Slo Command
+
+SLO/SLI definition and reliability tracking
+
+## Arguments
+
+**$1 (Required)**: service
+
+**$2 (Optional)**: type
+
+## Examples
+
+```bash
+/slo "payment-api" availability
+/slo "search-service" latency
+```
+
+Invoke the sre-engineer agent with: $ARGUMENTS
--- a/commands/trace.md
+++ b/commands/trace.md
@@ -0,0 +1,25 @@
+---
+model: claude-sonnet-4-0
+allowed-tools: Task, Bash, Read, Write
+argument-hint: <service> [focus]
+description: Distributed tracing and performance bottleneck analysis
+---
+
+# Trace Command
+
+Distributed tracing and performance bottleneck analysis
+
+## Arguments
+
+**$1 (Required)**: service
+
+**$2 (Optional)**: focus
+
+## Examples
+
+```bash
+/trace "checkout-service" latency
+/trace "payment-api" bottlenecks
+```
+
+Invoke the performance-analyst agent with: $ARGUMENTS