description: Monitor system health and performance metrics
argument-hint: [monitor-type] [time-range]
System Monitoring Command
Monitor system health, performance metrics, and operational status with comprehensive analysis.
Context
- Monitor type: $1 (health|performance|errors|all - default: all)
- Time range: $2 (1h|6h|24h|7d - default: 24h)
- System status: !`ps aux | head -10`
- Disk usage: !`df -h`
- Memory usage: !`free -h`
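The argument handling described above can be sketched in shell. This is only a minimal illustration, assuming a hypothetical helper named `resolve_args` (not part of the command itself) that applies the documented defaults and rejects values outside the listed options.

```shell
# Hypothetical helper: apply the documented defaults (all, 24h)
# and reject any value that is not in the documented option lists.
resolve_args() {
  local monitor_type
  local time_range
  monitor_type="${1:-all}"
  time_range="${2:-24h}"

  case "${monitor_type}" in
    health|performance|errors|all) ;;
    *) echo "invalid monitor type: ${monitor_type}" >&2; return 1 ;;
  esac

  case "${time_range}" in
    1h|6h|24h|7d) ;;
    *) echo "invalid time range: ${time_range}" >&2; return 1 ;;
  esac

  # Print the resolved pair so callers can capture it.
  echo "${monitor_type} ${time_range}"
}
```

Calling `resolve_args` with no arguments prints `all 24h`, and `resolve_args errors 1h` prints `errors 1h`.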
Monitoring Analysis
1. System Health Check
- Service status and availability
- Resource utilization (CPU, memory, disk)
- Network connectivity and latency
- Database connection and performance
2. Performance Metrics
- Response times and throughput
- Error rates and success rates
- Queue depths and processing times
- Cache hit rates and efficiency
3. Error Analysis
- Error frequency and patterns
- Critical error identification
- Performance degradation detection
- Anomaly detection and alerting
4. Operational Status
- Deployment status and version
- Configuration validation
- Security posture assessment
- Compliance status check
Monitoring Thresholds
- CPU Usage: < 80%
- Memory Usage: < 85%
- Disk Usage: < 90%
- Response Time: < 500ms (p95)
- Error Rate: < 1%
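One way to apply these thresholds is a small comparison helper. The sketch below assumes a hypothetical function named `check_threshold` and whole-number inputs (percentages or milliseconds); fractional values such as a 0.4% error rate would need scaling or a tool like `awk` or `bc` first.

```shell
# Hypothetical helper: a metric is healthy only while it stays strictly
# below its limit, matching the "< limit" form of the thresholds above.
check_threshold() {
  local name
  local value
  local limit
  name="${1}"
  value="${2}"
  limit="${3}"

  if [ "${value}" -lt "${limit}" ]; then
    echo "OK: ${name}=${value} (limit ${limit})"
    return 0
  fi
  echo "BREACH: ${name}=${value} (limit ${limit})"
  return 1
}
```

For example, `check_threshold disk "$(df -P / | tail -1 | tr -s ' ' | cut -d' ' -f5 | tr -d '%')" 90` would feed the root filesystem's usage into the check, assuming the POSIX `df -P` column layout and a filesystem name without spaces.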
Expected Outcome
- Comprehensive system health report
- Performance metrics and trends
- Error analysis and recommendations
- Operational status summary
Alert Conditions
Immediate attention required for:
- Critical service failures
- Performance degradation > 50%
- Security incidents or breaches
- Resource exhaustion warnings
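The "performance degradation > 50%" condition does not say what it is measured against. As a rough sketch, the helpers below assume it means the percentage increase of a latency-style metric (for example p95 response time in milliseconds) over a known baseline. Both function names are hypothetical, and the arithmetic is integer-only.

```shell
# Assumption: degradation = percentage increase of current over baseline.
degradation_pct() {
  local baseline
  local current
  baseline="${1}"
  current="${2}"

  # A zero or negative baseline makes the ratio meaningless.
  if [ "${baseline}" -le 0 ]; then
    echo "baseline must be positive" >&2
    return 1
  fi
  echo $(( (current - baseline) * 100 / baseline ))
}

# Succeeds only when degradation is strictly above 50%.
needs_attention() {
  local pct
  pct="$(degradation_pct "${1}" "${2}")" || return 2
  [ "${pct}" -gt 50 ]
}
```

With a baseline of 200 ms, a current value of 320 ms gives 60 and triggers the alert, while 300 ms gives exactly 50 and does not.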
Recommendations
Based on the monitoring data, provide:
- Performance optimization opportunities
- Capacity planning suggestions
- Infrastructure scaling recommendations
- Preventive maintenance actions