Files
2025-11-30 08:52:48 +08:00

1.8 KiB

description: Monitor system health and performance metrics argument-hint: [monitor-type] [time-range]

System Monitoring Command

Monitor system health, performance metrics, and operational status with comprehensive analysis.

Context

  • Monitor type: $1 (health|performance|errors|all - default: all)
  • Time range: $2 (1h|6h|24h|7d - default: 24h)
  • System status: !ps aux | head -10
  • Disk usage: !df -h
  • Memory usage: !free -h

Monitoring Analysis

1. System Health Check

  • Service status and availability
  • Resource utilization (CPU, memory, disk)
  • Network connectivity and latency
  • Database connection and performance

2. Performance Metrics

  • Response times and throughput
  • Error rates and success rates
  • Queue depths and processing times
  • Cache hit rates and efficiency

3. Error Analysis

  • Error frequency and patterns
  • Critical error identification
  • Performance degradation detection
  • Anomaly detection and alerting

4. Operational Status

  • Deployment status and version
  • Configuration validation
  • Security posture assessment
  • Compliance status check

Monitoring Thresholds

  • CPU Usage: < 80%
  • Memory Usage: < 85%
  • Disk Usage: < 90%
  • Response Time: < 500ms (p95)
  • Error Rate: < 1%

Expected Outcome

  • Comprehensive system health report
  • Performance metrics and trends
  • Error analysis and recommendations
  • Operational status summary

Alert Conditions

Immediate attention required for:

  • Critical service failures
  • Performance degradation > 50%
  • Security incidents or breaches
  • Resource exhaustion warnings

Recommendations

Based on monitoring data:

  • Performance optimization opportunities
  • Capacity planning suggestions
  • Infrastructure scaling recommendations
  • Preventive maintenance actions