| name | description |
|---|---|
| specweave-kafka:monitor-setup | Set up comprehensive Kafka monitoring with Prometheus + Grafana. Configures JMX exporter, dashboards, and alerting rules. |
Set Up Kafka Monitoring
Configure comprehensive monitoring for your Kafka cluster using Prometheus and Grafana.
What This Command Does
- JMX Exporter Setup: Configure Prometheus JMX exporter for Kafka brokers
- Prometheus Configuration: Add Kafka scrape targets
- Grafana Dashboards: Install 5 pre-built dashboards
- Alerting Rules: Configure 14 critical/high/warning alerts
- Verification: Test metrics collection and dashboard access
Interactive Workflow
I'll detect your environment and guide you through setup:
Environment Detection
- Kubernetes (Strimzi/Confluent Operator) → Use PodMonitor
- Docker Compose → Add Prometheus + Grafana services
- VM/Bare Metal → Configure JMX exporter JAR
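For the Kubernetes path, a PodMonitor for a Strimzi-managed cluster might look like the minimal sketch below. It assumes the Prometheus Operator CRDs are installed, that Kafka runs in a namespace called `kafka` (hypothetical), and that the metrics port is named `tcp-prometheus`, as in Strimzi's example manifests. Adjust the selector, namespace, and port to match your deployment.

```yaml
# Sketch only: names and namespace are assumptions, not values shipped by this command.
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: kafka-metrics            # hypothetical name
  namespace: kafka               # assumed namespace
spec:
  selector:
    matchLabels:
      strimzi.io/kind: Kafka     # label Strimzi applies to Kafka pods
  namespaceSelector:
    matchNames:
      - kafka
  podMetricsEndpoints:
    - port: tcp-prometheus       # assumed metrics port name
      path: /metrics
```

Your Prometheus resource must also select this PodMonitor (for example through its `podMonitorSelector`), otherwise the target will not be scraped.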
Question 1: Where is Kafka running?
- Kubernetes (Strimzi)
- Docker Compose
- VMs/EC2 instances
Question 2: Is Prometheus already installed?
- Yes → Just add Kafka scrape config
- No → Install Prometheus + Grafana stack
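For the Docker Compose path with no existing Prometheus, the added services could resemble this sketch. The image tags, the `./prometheus.yml` path, and the admin password are placeholders chosen for illustration, not values defined by this command.

```yaml
# docker-compose.yml fragment (sketch): merge into your existing services block.
services:
  prometheus:
    image: prom/prometheus:latest
    volumes:
      # Assumed location of the scrape config on the host
      - ./prometheus.yml:/etc/prometheus/prometheus.yml:ro
    ports:
      - "9090:9090"
  grafana:
    image: grafana/grafana:latest
    environment:
      GF_SECURITY_ADMIN_PASSWORD: change-me   # placeholder, replace before use
    ports:
      - "3000:3000"
    depends_on:
      - prometheus
```

Pinning explicit image versions instead of `latest` is usually safer for anything beyond a local test.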
Example Usage
# Start monitoring setup wizard
/specweave-kafka:monitor-setup
# I'll activate kafka-observability skill and:
# 1. Detect your environment
# 2. Configure JMX exporter (port 7071)
# 3. Set up Prometheus scraping
# 4. Install 5 Grafana dashboards
# 5. Configure 14 alerting rules
# 6. Verify metrics collection
What Gets Configured
JMX Exporter (Kafka brokers):
- Metrics endpoint on port 7071
- 50+ critical Kafka metrics exported
- Broker, topic, consumer lag, JVM metrics
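As an illustration of what a JMX exporter configuration can contain, here is a minimal sketch with two rules. It is not the rule set this command installs; the file paths in the comment are hypothetical, and the resulting metric names depend entirely on the rules you choose.

```yaml
# JMX exporter config (sketch). Typically loaded by the Java agent, for example:
#   KAFKA_OPTS="-javaagent:/opt/jmx_prometheus_javaagent.jar=7071:/opt/kafka-jmx.yml"
# Both paths above are assumptions.
lowercaseOutputName: true
rules:
  # Under-replicated partitions reported by each broker
  - pattern: kafka.server<type=ReplicaManager, name=UnderReplicatedPartitions><>Value
    name: kafka_server_replicamanager_underreplicatedpartitions
    type: GAUGE
  # Offline partition count and active controller count
  - pattern: kafka.controller<type=KafkaController, name=(OfflinePartitionsCount|ActiveControllerCount)><>Value
    name: kafka_controller_kafkacontroller_$1
    type: GAUGE
```

With `lowercaseOutputName: true`, the second rule yields names such as `kafka_controller_kafkacontroller_offlinepartitionscount`.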
Prometheus Scraping:
scrape_configs:
  - job_name: 'kafka'
    static_configs:
      - targets: ['kafka-0:7071', 'kafka-1:7071', 'kafka-2:7071']
5 Grafana Dashboards:
- Cluster Overview - Health, throughput, ISR changes
- Broker Metrics - CPU, memory, network, request handlers
- Consumer Lag - Lag per group/topic, offset tracking
- Topic Metrics - Partition count, replication, log size
- JVM Metrics - Heap, GC, threads, file descriptors
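If you prefer to load the dashboard JSON files through Grafana's file-based provisioning, a provider definition along these lines is one option. This is a sketch: the provider name, folder, and mount path are assumptions, and the command may install dashboards by a different mechanism.

```yaml
# Grafana dashboard provider (sketch), e.g. placed under
# /etc/grafana/provisioning/dashboards/ in the Grafana container.
apiVersion: 1
providers:
  - name: kafka                 # hypothetical provider name
    folder: Kafka               # folder shown in the Grafana UI
    type: file
    options:
      # Assumed mount point for the dashboard JSON files
      path: /var/lib/grafana/dashboards/kafka
```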
14 Alerting Rules:
- CRITICAL: Under-replicated partitions, offline partitions, no controller
- HIGH: Consumer lag, ISR shrinks, leader elections
- WARNING: CPU, memory, GC time, disk usage
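To show the shape of such rules, here is a sketch of a Prometheus rule file covering the three CRITICAL conditions. The metric names are assumptions that depend on your JMX exporter rules, and the `for` durations are illustrative rather than the values this command configures.

```yaml
# Prometheus alerting rules (sketch). Metric names and thresholds are assumptions.
groups:
  - name: kafka-critical
    rules:
      - alert: KafkaUnderReplicatedPartitions
        expr: sum(kafka_server_replicamanager_underreplicatedpartitions) > 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Kafka has under-replicated partitions"
      - alert: KafkaOfflinePartitions
        expr: sum(kafka_controller_kafkacontroller_offlinepartitionscount) > 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Kafka has offline partitions"
      - alert: KafkaNoActiveController
        # Exactly one broker should report itself as the active controller
        expr: sum(kafka_controller_kafkacontroller_activecontrollercount) < 1
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "No active Kafka controller"
```

A file like this is referenced from `rule_files` in the Prometheus configuration and can be validated with `promtool check rules` before reloading.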
Prerequisites
- Kafka cluster running (self-hosted or K8s)
- Prometheus installed (or it will be installed during setup)
- Grafana installed (or it will be installed during setup)
Post-Setup
After setup completes, I'll:
- ✅ Provide Grafana URL and credentials
- ✅ Show how to access dashboards
- ✅ Explain critical alerts
- ✅ Suggest testing alerts by stopping a broker
Skills Activated: kafka-observability
Related Commands: /specweave-kafka:deploy
Dashboard Locations: plugins/specweave-kafka/monitoring/grafana/dashboards/