Files
gh-greyhaven-ai-claude-code…/skills/observability-engineering/templates/INDEX.md
2025-11-29 18:29:23 +08:00

2.1 KiB

Observability Templates

Copy-paste ready configuration templates for Prometheus, Grafana, and OpenTelemetry.

Templates Overview

Grafana Dashboard Template

File: grafana-dashboard.json

Production-ready Golden Signals dashboard:

  • Request Rate: Total RPS with 5-minute averages
  • Error Rate: Percentage of 5xx errors with alert thresholds
  • Latency: p50/p95/p99 percentiles in milliseconds
  • Saturation: CPU and memory usage percentages

Use when: Creating new service dashboards, standardizing monitoring


SLO Definition Template

File: slo-definition.yaml

Service Level Objective configuration:

  • SLO tiers: Critical (99.95%), Essential (99.9%), Standard (99.5%)
  • SLI definitions: Availability, latency, error rate
  • Error budget policy: Feature freeze thresholds
  • Multi-window burn rate alerts: 1h, 6h, 24h windows

Use when: Implementing SLO framework for new services


Prometheus Recording Rules

File: prometheus-recording-rules.yaml

Pre-aggregated metrics for fast dashboards:

  • Request rates: Per-service, per-endpoint RPS
  • Error rates: Percentage calculations (5xx / total)
  • Latency percentiles: p50/p95/p99 pre-computed
  • Error budget: Remaining budget and burn rate

Use when: Optimizing slow dashboard queries, implementing SLOs


Quick Usage

# Copy template to your monitoring directory
cp templates/grafana-dashboard.json ../monitoring/dashboards/

# Edit service name and thresholds
vim ../monitoring/dashboards/grafana-dashboard.json

# Import to Grafana
curl -X POST http://admin:password@localhost:3000/api/dashboards/db \
  -H "Content-Type: application/json" \
  -d @../monitoring/dashboards/grafana-dashboard.json

Return to main agent