Files
gh-geored-sre-skill-debuggi…/README.md
2025-11-29 18:28:20 +08:00

537 B

debugging-kubernetes-incidents

Comprehensive skill for systematic investigation and root cause analysis of Kubernetes pod failures, service degradation, and resource issues. Provides structured 5-phase methodology for incident triage, multi-source correlation of logs/events/metrics, and actionable remediation recommendations. Covers common patterns: CrashLoopBackOff, OOMKilled, ImagePullBackOff, resource constraints, TLS errors, and network issues. Emphasizes read-only investigation and distinguishing root causes from symptoms.