allowed-tools: Bash(cat:), Bash(redis-cli:), Bash(pg_isready:), Bash(nvidia-smi:), Bash(docker:), Bash(kubectl:), Read
description: Generalized infrastructure health check for MCP servers, databases, and system dependencies. Configurable via .infra-check.json in project root.
argument-hint: [--verbose] [--config ]
Infrastructure Health Check
Run comprehensive health checks on all configured infrastructure components.
Context
Command arguments: $ARGS
First, check for configuration files and load the appropriate one:
- Project config: .infra-check.json
- User config: ~/.infra-check.json
- If neither exists, use default checks
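The lookup order above (project file, then user file, then defaults) can be sketched as follows. This is a minimal illustration, not the command's actual implementation: `load_config` and `DEFAULT_CONFIG` are hypothetical names, and the default values are placeholders, since the source only asks for "sensible defaults".

```python
import copy
import json
from pathlib import Path

CONFIG_NAME = ".infra-check.json"

# Placeholder defaults (an assumption; the source does not list them).
DEFAULT_CONFIG = {
    "checks": {"redis": {"enabled": True, "url": "redis://localhost:6379"}},
    "output": {"verbose": False, "format": "standard"},
}


def load_config(project_dir, home_dir):
    """Return (config, source): project file first, then user file, then defaults."""
    candidates = (Path(project_dir) / CONFIG_NAME, Path(home_dir) / CONFIG_NAME)
    for candidate in candidates:
        if candidate.is_file():
            with candidate.open(encoding="utf-8") as handle:
                return json.load(handle), str(candidate)
    # Neither file exists: hand back a copy so callers cannot mutate the defaults.
    return copy.deepcopy(DEFAULT_CONFIG), "defaults"
```

A typical call would be `config, source = load_config(Path.cwd(), Path.home())`; returning the source alongside the config makes it easy to report which file was used.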
Your Task
- Load Configuration:
  - Check for .infra-check.json in the current directory
  - Fall back to ~/.infra-check.json if not found
  - Use sensible defaults if no config exists
- Parse Configuration Schema:
      {
        "checks": {
          "redis": { "enabled": true, "url": "redis://localhost:6379", "timeout_seconds": 5 },
          "temporal": { "enabled": true, "host": "localhost:7233", "namespace": "default" },
          "taskqueue": { "enabled": true, "check_npm": true },
          "postgresql": { "enabled": false, "connection_string": "postgresql://localhost:5432/mydb" },
          "mongodb": { "enabled": false, "url": "mongodb://localhost:27017" },
          "gpu": { "enabled": false, "required_model": "RTX 4090", "max_temperature": 85 },
          "custom": [
            { "name": "Custom Service", "check_command": "curl -f http://localhost:8080/health" }
          ]
        },
        "output": { "verbose": false, "format": "standard" }
      }
- Execute Health Checks:
  Redis/Valkey Check:
      redis-cli -u "$REDIS_URL" ping
  Temporal Check:
      temporal workflow list --namespace "$NAMESPACE" --limit 1
  TaskQueue Check:
      npx --version && npm list taskqueue-mcp --depth=0
  PostgreSQL Check:
      psql "$CONNECTION_STRING" -c "SELECT 1;" || pg_isready -d "$CONNECTION_STRING"
  MongoDB Check:
      mongosh "$MONGODB_URL" --eval "db.runCommand({ ping: 1 })" --quiet
  GPU Check:
      nvidia-smi --query-gpu=name,temperature.gpu,memory.used,memory.total --format=csv,noheader
  Custom Checks: Run each custom check command and capture its exit code.
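The GPU check only produces raw CSV, so something still has to compare it against `required_model` and `max_temperature` from the config. A possible sketch is below; `evaluate_gpu` is a hypothetical helper, and it assumes each output row follows the column order of the query above (name, temperature, memory used, memory total).

```python
def evaluate_gpu(csv_output, required_model=None, max_temperature=85):
    """Return a list of problems found in the CSV rows (an empty list means healthy)."""
    rows = [line.strip() for line in csv_output.splitlines() if line.strip()]
    if not rows:
        return ["no GPU reported by nvidia-smi"]

    problems = []
    for index, row in enumerate(rows):
        fields = [field.strip() for field in row.split(",")]
        if len(fields) != 4:
            problems.append(f"GPU {index}: unexpected row format: {row!r}")
            continue
        name, temperature = fields[0], fields[1]
        try:
            degrees = int(temperature)
        except ValueError:
            # Some devices may not report a numeric temperature.
            problems.append(f"GPU {index} ({name}): temperature unavailable")
            continue
        if degrees > max_temperature:
            problems.append(f"GPU {index} ({name}) at {degrees}C exceeds {max_temperature}C")

    # Substring match, so "RTX 4090" matches a longer marketing name.
    if required_model and not any(required_model in row for row in rows):
        problems.append(f"required model '{required_model}' not found")
    return problems
```

The substring match on the model name is a design choice of this sketch: it lets the short value from the config match whatever longer name the driver reports.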
- Output Format:
  Standard (non-verbose):
      === Infrastructure Health Check ===
      ✅ Redis HEALTHY (redis://localhost:6379)
      ✅ Temporal HEALTHY (localhost:7233)
      ✅ TaskQueue HEALTHY (npx available)
      ⚠️ PostgreSQL WARNING (slow response: 2.3s)
      ❌ MongoDB FAILED (connection refused)
      ===================================
      Overall Status: DEGRADED ⚠️
      Issues Detected:
      1. PostgreSQL responding slowly (2.3s > 1.0s threshold)
         └─ Action: Check database load
      2. MongoDB connection failed
         └─ Error: Connection refused at localhost:27017
         └─ Fix: Start MongoDB with 'mongod' or 'docker run -d -p 27017:27017 mongo:latest'
  Verbose: Include detailed metrics for each service (connection time, memory usage, version, uptime, etc.)
- Exit Codes:
  - 0: All enabled checks passed (HEALTHY)
  - 1: One or more checks failed (UNHEALTHY or DEGRADED)
- Integration Notes:
  - This command can be used in pre-test hooks
  - It can be called from CI/CD pipelines
  - Results can be published to coordination channels if MCP servers are available
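The mapping from per-check results to an overall status and exit code can be sketched as below. The document does not say where DEGRADED ends and UNHEALTHY begins, so this sketch assumes UNHEALTHY means every enabled check failed and DEGRADED covers any other mix that includes a warning or failure, which is consistent with the sample report. The function names and the lowercase state strings are illustrative.

```python
def overall_status(results):
    """Summarise enabled checks; results maps name -> 'healthy' | 'warning' | 'failed'."""
    states = list(results.values())
    if all(state == "healthy" for state in states):
        return "HEALTHY"
    # Assumption: UNHEALTHY is reserved for the case where nothing works at all.
    if all(state == "failed" for state in states):
        return "UNHEALTHY"
    return "DEGRADED"


def exit_code(status):
    """0 only when everything passed; DEGRADED and UNHEALTHY both map to 1."""
    return 0 if status == "HEALTHY" else 1


# The states shown in the sample report.
sample = {
    "redis": "healthy",
    "temporal": "healthy",
    "taskqueue": "healthy",
    "postgresql": "warning",
    "mongodb": "failed",
}
status = overall_status(sample)
print(status, exit_code(status))
```

Treating a warning as non-zero is also an assumption; a pipeline that should tolerate slow but reachable services would need a different rule.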
Example Configuration Files
Create .infra-check.json.example in project root:
    {
      "checks": {
        "redis": {"enabled": true, "url": "redis://localhost:6379"},
        "temporal": {"enabled": true, "host": "localhost:7233"},
        "taskqueue": {"enabled": true}
      }
    }
Error Handling
- Gracefully handle missing dependencies (e.g., redis-cli not installed)
- Provide helpful error messages with installation instructions
- Continue checking other services even if one fails
- Aggregate all results before reporting overall status
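One way to satisfy these four rules together is to make each check return a result instead of raising, and to collect the results before reporting. The sketch below is an assumption about how that could look: `run_check` and `run_all` are hypothetical helpers, and the result dictionary shape is illustrative.

```python
import shutil
import subprocess


def run_check(name, argv, timeout_seconds=5):
    """Run one check command; never raise, always return a result dict."""
    if shutil.which(argv[0]) is None:
        # Missing dependency: report it with a hint instead of crashing.
        detail = f"{argv[0]} not found on PATH; install it or disable this check"
        return {"name": name, "state": "failed", "exit_code": None, "detail": detail}
    try:
        completed = subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout_seconds
        )
    except subprocess.TimeoutExpired:
        detail = f"timed out after {timeout_seconds}s"
        return {"name": name, "state": "failed", "exit_code": None, "detail": detail}
    except OSError as error:
        return {"name": name, "state": "failed", "exit_code": None, "detail": str(error)}
    state = "healthy" if completed.returncode == 0 else "failed"
    detail = completed.stderr.strip() or completed.stdout.strip()
    return {"name": name, "state": state, "exit_code": completed.returncode, "detail": detail}


def run_all(checks):
    """checks is a list of (name, argv) pairs; one failure never stops the rest."""
    return [run_check(name, argv) for name, argv in checks]
```

For example, `run_all([("redis", ["redis-cli", "ping"]), ("postgresql", ["pg_isready"])])` would return two results even on a machine where `redis-cli` is not installed.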
Coordination Integration (Optional)
If Redis MCP is available, publish health metrics:
// Check if mcp__RedisMCPServer tools are available
// If yes, publish metrics:
await mcp__RedisMCPServer__hset({
name: "health:components",
key: "redis",
value: "healthy"
});
await mcp__RedisMCPServer__hset({
name: "health:last_check",
key: "timestamp",
value: new Date().toISOString()
});
If not available, simply output to console.