# Kubernetes Performance Troubleshooting
Systematic approach to diagnosing and resolving Kubernetes performance issues.
## Table of Contents
1. [High Latency Issues](#high-latency-issues)
2. [CPU Performance](#cpu-performance)
3. [Memory Performance](#memory-performance)
4. [Network Performance](#network-performance)
5. [Storage I/O Performance](#storage-io-performance)
6. [Application-Level Metrics](#application-level-metrics)
7. [Cluster-Wide Performance](#cluster-wide-performance)
---
## High Latency Issues
### Symptoms
- Slow API response times
- Increased request latency
- Timeouts
- Degraded user experience
### Investigation Workflow
**1. Identify the layer with latency:**
```bash
# Check pod resource usage (and service mesh dashboards, if using Istio/Linkerd)
kubectl top pods -n <namespace>
# Check ingress controller metrics
kubectl logs -n ingress-nginx <ingress-controller-pod> | grep "request_time"
# Check application logs for slow requests
kubectl logs <pod-name> -n <namespace> | grep -i "slow\|timeout\|latency"
```
**2. Profile application performance:**
```bash
# Get pod metrics
kubectl top pod <pod-name> -n <namespace>
# Check CPU requests/limits (throttling occurs when usage hits the limit)
kubectl get pod <pod-name> -n <namespace> -o json | \
jq '.spec.containers[].resources'
# Exec into pod and check application-specific metrics
kubectl exec -it <pod-name> -n <namespace> -- /bin/sh
# Then: curl localhost:8080/metrics (if Prometheus metrics available)
```
**3. Check dependencies:**
```bash
# Test connectivity to downstream services
kubectl exec -it <pod-name> -n <namespace> -- \
curl -w "@curl-format.txt" -o /dev/null -s http://backend-service
# curl-format.txt content:
# time_namelookup: %{time_namelookup}\n
# time_connect: %{time_connect}\n
# time_appconnect: %{time_appconnect}\n
# time_pretransfer: %{time_pretransfer}\n
# time_redirect: %{time_redirect}\n
# time_starttransfer: %{time_starttransfer}\n
# time_total: %{time_total}\n
```
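If creating a format file inside the pod is inconvenient, the same timings can be captured inline with `-w` variables. A minimal sketch, assuming `curl` is available in the container image and that `backend-service` (the placeholder used above) resolves from the pod:
```bash
# One-off timing breakdown without a format file (placeholders as above)
kubectl exec -it <pod-name> -n <namespace> -- \
  curl -s -o /dev/null \
  -w "dns=%{time_namelookup}s connect=%{time_connect}s ttfb=%{time_starttransfer}s total=%{time_total}s\n" \
  http://backend-service
```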
### Common Causes and Solutions
**CPU Throttling:**
```yaml
# Increase CPU limits or remove limits for bursty workloads
resources:
requests:
cpu: "500m" # What pod needs typically
limits:
cpu: "2000m" # Burst capacity (or remove for unlimited)
```
**Insufficient Replicas:**
```bash
# Scale up deployment
kubectl scale deployment <deployment-name> -n <namespace> --replicas=5
# Or enable HPA
kubectl autoscale deployment <deployment-name> \
--cpu-percent=70 \
--min=2 \
--max=10
```
**Slow Dependencies:**
```yaml
# Implement circuit breakers and timeouts in application
# Or use service mesh policies (Istio example):
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
name: backend-circuit-breaker
spec:
host: backend-service
trafficPolicy:
connectionPool:
tcp:
maxConnections: 100
http:
http1MaxPendingRequests: 50
http2MaxRequests: 100
outlierDetection:
consecutiveErrors: 5
interval: 30s
baseEjectionTime: 30s
```
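To verify the circuit breaker is actually engaging, the sidecar's Envoy statistics can be inspected. A hedged sketch, assuming Istio sidecar injection is enabled; exact stat names vary by Istio/Envoy version:
```bash
# Look for outlier ejections and circuit-breaker overflows in the sidecar stats
kubectl exec <pod-name> -n <namespace> -c istio-proxy -- \
  curl -s localhost:15000/stats | grep -E "outlier_detection|circuit_breakers"
```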
---
## CPU Performance
### Symptoms
- High CPU usage
- Throttling
- Slow processing
- Queue buildup
### Investigation Commands
```bash
# Check CPU usage
kubectl top nodes
kubectl top pods -n <namespace>
# Check CPU requests/limits (usage at the limit means throttling; see the cgroup check below)
kubectl get pod <pod-name> -n <namespace> -o json | \
jq '.spec.containers[].resources'
# Get detailed CPU metrics (requires metrics-server)
kubectl get --raw "/apis/metrics.k8s.io/v1beta1/namespaces/<namespace>/pods/<pod-name>" | jq
# Check container-level CPU from the node (SSH to node)
ssh <node> "docker stats --no-stream"   # or "crictl stats" on containerd-based nodes
```
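Resource limits only show the ceiling; actual throttling is visible in the container's cgroup counters. A minimal sketch, assuming cgroup v2 (on cgroup v1 the file is `/sys/fs/cgroup/cpu/cpu.stat`):
```bash
# nr_throttled / throttled_usec growing over time means the container is hitting its CPU limit
kubectl exec <pod-name> -n <namespace> -- cat /sys/fs/cgroup/cpu.stat
```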
### Advanced CPU Profiling
**Enable CPU profiling in application:**
```bash
# For Go applications with pprof
kubectl port-forward <pod-name> 6060:6060 -n <namespace>
# Capture CPU profile
curl http://localhost:6060/debug/pprof/profile?seconds=30 > cpu.prof
# Analyze with pprof
go tool pprof -http=:8080 cpu.prof
```
**For Java applications:**
```bash
# Use async-profiler
kubectl exec -it <pod-name> -n <namespace> -- \
/profiler.sh -d 30 -f /tmp/flamegraph.html 1
# Copy flamegraph
kubectl cp <namespace>/<pod-name>:/tmp/flamegraph.html ./flamegraph.html
```
### Solutions
**Vertical Scaling:**
```yaml
resources:
requests:
cpu: "1000m" # Increased from 500m
limits:
cpu: "2000m" # Increased from 1000m
```
**Horizontal Scaling:**
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: app-hpa
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: app
minReplicas: 3
maxReplicas: 20
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 70
```
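To roll this out and confirm it reacts to load, something like the following works; `app-hpa.yaml` is a hypothetical filename for the manifest above:
```bash
# Apply the HPA and watch current vs. target utilization and replica count
kubectl apply -f app-hpa.yaml -n <namespace>
kubectl get hpa app-hpa -n <namespace> --watch
```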
**Remove CPU Limits for Bursty Workloads:**
```yaml
# Allow bursting to available CPU
resources:
requests:
cpu: "500m"
# No limits - can use all available CPU
```
---
## Memory Performance
### Symptoms
- OOMKilled pods
- Memory leaks
- Slow garbage collection
- Swap usage (if enabled)
### Investigation Commands
```bash
# Check memory usage
kubectl top nodes
kubectl top pods -n <namespace>
# Check memory limits and requests
kubectl describe pod <pod-name> -n <namespace> | grep -A 5 "Limits\|Requests"
# Check OOM kills
kubectl get pods -n <namespace> -o json | \
jq '.items[] | select(.status.containerStatuses[]?.lastState.terminated.reason == "OOMKilled") | .metadata.name'
# Detailed memory breakdown (requires metrics-server)
kubectl get --raw "/apis/metrics.k8s.io/v1beta1/namespaces/<namespace>/pods/<pod-name>" | \
jq '.containers[] | {name, usage: .usage.memory}'
```
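OOM kills also surface as events and node conditions, which is a quick cross-check before diving into profiling; event wording varies slightly across Kubernetes versions:
```bash
# Recent OOM-related events across the cluster
kubectl get events -A --sort-by=.lastTimestamp | grep -i oom
# Node-level memory pressure condition
kubectl describe node <node-name> | grep -i "MemoryPressure"
```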
### Memory Profiling
**Heap dump for Java:**
```bash
# Capture heap dump
kubectl exec <pod-name> -n <namespace> -- \
jmap -dump:format=b,file=/tmp/heapdump.hprof 1
# Copy heap dump
kubectl cp <namespace>/<pod-name>:/tmp/heapdump.hprof ./heapdump.hprof
# Analyze with Eclipse MAT or VisualVM
```
**Memory profiling for Go:**
```bash
# Capture heap profile
kubectl port-forward <pod-name> 6060:6060 -n <namespace>
curl http://localhost:6060/debug/pprof/heap > heap.prof
# Analyze
go tool pprof -http=:8080 heap.prof
```
### Solutions
**Increase Memory Limits:**
```yaml
resources:
requests:
memory: "512Mi"
limits:
memory: "2Gi" # Increased from 1Gi
```
**Optimize Application:**
- Fix memory leaks
- Implement connection pooling
- Optimize caching strategies
- Tune garbage collection (see the JVM sketch below)
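For JVM workloads, one common tuning step is sizing the heap relative to the container memory limit. A sketch, assuming a reasonably recent JVM that honors `JAVA_TOOL_OPTIONS` and container-aware flags; the deployment name is a placeholder:
```bash
# Cap the heap at ~75% of the container memory limit and use G1 GC
kubectl set env deployment/<deployment-name> -n <namespace> \
  JAVA_TOOL_OPTIONS="-XX:MaxRAMPercentage=75.0 -XX:+UseG1GC"
```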
**Use Memory-Optimized Node Pools:**
```yaml
# Node affinity for memory-intensive workloads
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: workload-type
operator: In
values:
- memory-optimized
```
---
## Network Performance
### Symptoms
- High network latency
- Packet loss
- Connection timeouts
- Bandwidth saturation
### Investigation Commands
```bash
# Check pod network statistics
kubectl exec <pod-name> -n <namespace> -- netstat -s
# Test network performance between pods
# Deploy netperf
kubectl run netperf-client --image=networkstatic/netperf --rm -it -- /bin/bash
# From client, run:
netperf -H <target-pod-ip> -t TCP_STREAM
netperf -H <target-pod-ip> -t TCP_RR # Request-response latency
# Check DNS resolution time (run `time` inside a shell so the builtin is used)
kubectl exec <pod-name> -n <namespace> -- \
  sh -c 'time nslookup service-name.namespace.svc.cluster.local'
# Check service mesh overhead (if using Istio)
kubectl exec <pod-name> -n <namespace> -c istio-proxy -- \
curl -s localhost:15000/stats | grep "http.inbound\|http.outbound"
```
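For a quick latency-only check between pods, a throwaway debug pod is often enough. A sketch, assuming the cluster permits ad-hoc pods and ICMP between pods; the image tag and target IP are placeholders:
```bash
# Measure round-trip latency to another pod
kubectl run netcheck --rm -it --restart=Never --image=busybox:1.36 -- \
  ping -c 10 <target-pod-ip>
```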
### Check Network Policies
```bash
# List network policies
kubectl get networkpolicies -n <namespace>
# Check if policy is blocking traffic
kubectl describe networkpolicy <policy-name> -n <namespace>
# Temporarily remove policies to test (in non-production)
kubectl delete networkpolicy <policy-name> -n <namespace>
```
### Solutions
**DNS Optimization:**
```bash
# Use CoreDNS caching
# Increase CoreDNS replicas
kubectl scale deployment coredns -n kube-system --replicas=5
# Or use NodeLocal DNSCache
# https://kubernetes.io/docs/tasks/administer-cluster/nodelocaldns/
```
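Before scaling CoreDNS, it is worth confirming it is actually struggling. A sketch, assuming the standard `k8s-app=kube-dns` label (some distributions label CoreDNS differently):
```bash
# Look for errors, timeouts, and upstream failures in CoreDNS
kubectl logs -n kube-system -l k8s-app=kube-dns --tail=200 | grep -i "error\|timeout"
# Check CoreDNS resource usage
kubectl top pods -n kube-system -l k8s-app=kube-dns
```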
**Optimize Service Mesh:**
```yaml
# Pod-template annotations to reduce Istio sidecar resources if over-provisioned
metadata:
  annotations:
    sidecar.istio.io/proxyCPU: "100m"
    sidecar.istio.io/proxyMemory: "128Mi"
    # Or disable sidecar injection for internal, trusted services
    sidecar.istio.io/inject: "false"
```
**Use HostNetwork for Network-Intensive Pods:**
```yaml
# Use with caution - bypasses pod networking
spec:
hostNetwork: true
dnsPolicy: ClusterFirstWithHostNet
```
**Enable Bandwidth Limits (QoS):**
```yaml
# Requires a CNI that supports the bandwidth plugin
metadata:
annotations:
kubernetes.io/ingress-bandwidth: "10M"
kubernetes.io/egress-bandwidth: "10M"
```
---
## Storage I/O Performance
### Symptoms
- Slow read/write operations
- High I/O wait
- Application timeouts during disk operations
- Database performance issues
### Investigation Commands
```bash
# Check I/O metrics on node
ssh <node> "iostat -x 1 10"
# Check disk usage
kubectl exec <pod-name> -n <namespace> -- df -h
# Check I/O wait from pod
kubectl exec <pod-name> -n <namespace> -- top
# Test storage performance
kubectl exec <pod-name> -n <namespace> -- \
dd if=/dev/zero of=/data/test bs=1M count=1024 conv=fdatasync
# Check PV performance class
kubectl get pv <pv-name> -o yaml | grep storageClassName
kubectl describe storageclass <storage-class-name>
```
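Slow volume attach/mount operations also show up as events, which helps distinguish storage-backend latency from in-pod I/O problems:
```bash
# Volume-related warnings for the workload's PVC and namespace
kubectl describe pvc <pvc-name> -n <namespace> | grep -A 10 "Events"
kubectl get events -n <namespace> --field-selector type=Warning | grep -i "volume\|mount"
```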
### Storage Benchmarking
**Deploy fio for benchmarking:**
```yaml
apiVersion: v1
kind: Pod
metadata:
name: fio-benchmark
spec:
containers:
- name: fio
image: ljishen/fio
command: ["/bin/sh", "-c"]
args:
- |
fio --name=seqread --rw=read --bs=1M --size=1G --runtime=60 --filename=/data/test
fio --name=seqwrite --rw=write --bs=1M --size=1G --runtime=60 --filename=/data/test
fio --name=randread --rw=randread --bs=4k --size=1G --runtime=60 --filename=/data/test
fio --name=randwrite --rw=randwrite --bs=4k --size=1G --runtime=60 --filename=/data/test
volumeMounts:
- name: data
mountPath: /data
volumes:
- name: data
persistentVolumeClaim:
claimName: test-pvc
```
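A minimal way to run the benchmark above and collect results; assumes the manifest is saved as `fio-benchmark.yaml` (hypothetical filename) and that `test-pvc` already exists:
```bash
# Run the fio pod, stream its output, then clean up
kubectl apply -f fio-benchmark.yaml -n <namespace>
kubectl logs -f fio-benchmark -n <namespace>
kubectl delete pod fio-benchmark -n <namespace>
```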
### Solutions
**Use Higher Performance Storage Class:**
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: high-performance-pvc
spec:
accessModes:
- ReadWriteOnce
storageClassName: gp3 # or io2, premium-rwo (GKE), etc.
resources:
requests:
storage: 100Gi
```
**Provision IOPS (AWS EBS io2):**
```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: io2-high-iops
provisioner: ebs.csi.aws.com
parameters:
type: io2
iops: "10000"
fsType: ext4
volumeBindingMode: WaitForFirstConsumer
```
**Use Local NVMe for Ultra-Low Latency:**
```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: local-nvme
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
---
apiVersion: v1
kind: PersistentVolume
metadata:
name: local-pv
spec:
capacity:
storage: 100Gi
accessModes:
- ReadWriteOnce
persistentVolumeReclaimPolicy: Retain
storageClassName: local-nvme
local:
path: /mnt/disks/nvme0n1
nodeAffinity:
required:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/hostname
operator: In
values:
- node-with-nvme
```
---
## Application-Level Metrics
### Expose Prometheus Metrics
**Add metrics endpoint to application:**
```yaml
apiVersion: v1
kind: Service
metadata:
name: app-metrics
annotations:
prometheus.io/scrape: "true"
prometheus.io/port: "8080"
prometheus.io/path: "/metrics"
spec:
selector:
app: myapp
ports:
- name: metrics
port: 8080
targetPort: 8080
```
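Before relying on scrape annotations, confirm the endpoint actually serves metrics; this assumes the application listens on port 8080 as configured above:
```bash
# Port-forward the metrics service and sample the output
kubectl port-forward svc/app-metrics 8080:8080 -n <namespace> &
sleep 2  # give the tunnel a moment to establish
curl -s localhost:8080/metrics | head -n 20
```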
### Key Metrics to Monitor
**Application metrics:**
- Request rate
- Request latency (p50, p95, p99)
- Error rate
- Active connections
- Queue depth
- Cache hit rate
**Example Prometheus queries:**
```promql
# P95 latency
histogram_quantile(0.95, sum(rate(http_request_duration_seconds_bucket[5m])) by (le))
# Error rate
sum(rate(http_requests_total{status=~"5.."}[5m])) / sum(rate(http_requests_total[5m]))
# Request rate
sum(rate(http_requests_total[5m]))
```
### Distributed Tracing
**Implement OpenTelemetry:**
```yaml
# Deploy Jaeger (all-in-one; suitable for development and testing)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: jaeger
spec:
  selector:
    matchLabels:
      app: jaeger
  template:
    metadata:
      labels:
        app: jaeger
    spec:
      containers:
      - name: jaeger
        image: jaegertracing/all-in-one:latest
        ports:
        - containerPort: 16686  # UI
        - containerPort: 14268  # Collector (HTTP)
```
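To reach the Jaeger UI locally for ad-hoc analysis (a Service would normally front it in a real deployment):
```bash
# Forward the Jaeger UI port and open http://localhost:16686
kubectl port-forward deploy/jaeger 16686:16686
```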
**Instrument application:**
- Add OpenTelemetry SDK to application
- Configure trace export to Jaeger
- Analyze end-to-end request traces to identify bottlenecks (a minimal export-wiring sketch follows)
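Most OpenTelemetry SDKs honor the standard exporter environment variables, so export can often be wired up without code changes. A sketch; the collector endpoint and deployment name are placeholders, and 4317 is the default OTLP gRPC port:
```bash
# Point the SDK at an OTLP-capable collector or backend
kubectl set env deployment/<deployment-name> -n <namespace> \
  OTEL_SERVICE_NAME=myapp \
  OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector.observability:4317
```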
---
## Cluster-Wide Performance
### Cluster Resource Utilization
```bash
# Overall cluster capacity
kubectl top nodes
# Allocated requests/limits per node
kubectl describe nodes | grep -A 5 "Allocated resources"
# Resource requests vs limits
kubectl get pods --all-namespaces -o json | \
jq -r '.items[] | "\(.metadata.namespace)/\(.metadata.name) \(.spec.containers[].resources)"'
```
### Control Plane Performance
```bash
# Check API server latency
kubectl get --raw /metrics | grep apiserver_request_duration_seconds
# Check etcd performance
kubectl exec -it -n kube-system etcd-<node> -- \
etcdctl --endpoints=https://127.0.0.1:2379 \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key \
check perf
# Workqueue depth (these are the API server's own workqueues; kube-controller-manager
# serves its metrics separately on its secure port, 10257 by default)
kubectl get --raw /metrics | grep workqueue_depth
```
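etcd database size and leader status are also worth checking when control plane latency climbs; this assumes a kubeadm-style static-pod etcd, as in the command above:
```bash
# Endpoint status includes DB size, leader, and raft term
kubectl exec -it -n kube-system etcd-<node> -- \
  etcdctl --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  endpoint status --write-out=table
```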
### Scheduler Performance
```bash
# Check scheduling latency (kube-scheduler serves its metrics on its own secure
# port, 10259 by default, so these are usually scraped via Prometheus rather than
# the API server's /metrics; metric names vary by Kubernetes version)
kubectl get --raw /metrics | grep scheduler_scheduling_duration_seconds
# Check pending pods
kubectl get pods --all-namespaces --field-selector status.phase=Pending
# Scheduler logs
kubectl logs -n kube-system kube-scheduler-<node>
```
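For individual stuck pods, the scheduler records its reasoning as events on the pod, which is usually faster than reading scheduler logs:
```bash
# Why is this pod Pending? (insufficient resources, taints, affinity, etc.)
kubectl describe pod <pending-pod-name> -n <namespace> | grep -A 10 "Events"
```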
### Solutions for Cluster-Wide Issues
**Scale Control Plane:**
- Add more control plane nodes
- Increase API server replicas
- Tune etcd (increase memory, use SSD)
**Optimize Scheduling:**
- Use pod priority and preemption
- Implement pod topology spread constraints
- Use node affinity/anti-affinity appropriately
**Resource Management:**
- Set appropriate resource requests and limits
- Use LimitRanges and ResourceQuotas (see the quota sketch below)
- Implement VerticalPodAutoscaler for right-sizing
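As a starting point, namespace guardrails can be created imperatively; the quota name and values below are illustrative only:
```bash
# Create and inspect a namespace-level compute quota
kubectl create quota compute-quota -n <namespace> \
  --hard=requests.cpu=20,requests.memory=64Gi,limits.cpu=40,limits.memory=128Gi
kubectl describe resourcequota compute-quota -n <namespace>
```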
---
## Performance Optimization Checklist
### Application Level
- [ ] Implement connection pooling
- [ ] Enable response caching
- [ ] Optimize database queries
- [ ] Use async/non-blocking I/O
- [ ] Implement circuit breakers
- [ ] Profile and optimize hot paths
### Kubernetes Level
- [ ] Set appropriate resource requests/limits
- [ ] Use HPA for auto-scaling
- [ ] Implement readiness/liveness probes correctly
- [ ] Use anti-affinity for high-availability
- [ ] Optimize container image size
- [ ] Use multi-stage builds
### Infrastructure Level
- [ ] Use appropriate instance/node types
- [ ] Enable cluster autoscaling
- [ ] Use high-performance storage classes
- [ ] Optimize network topology
- [ ] Implement monitoring and alerting
- [ ] Run regular performance tests
---
## Monitoring Tools
**Essential tools:**
- **Prometheus + Grafana**: Metrics and dashboards
- **Jaeger/Zipkin**: Distributed tracing
- **kube-state-metrics**: Kubernetes object metrics
- **node-exporter**: Node-level metrics
- **cAdvisor**: Container metrics
- **kubectl-flamegraph**: CPU profiling
**Commercial options:**
- Datadog
- New Relic
- Dynatrace
- Elastic APM