Initial commit
194
commands/infra-check.md
Normal file
@@ -0,0 +1,194 @@
|
||||
---
|
||||
allowed-tools: Bash(cat:*), Bash(redis-cli:*), Bash(pg_isready:*), Bash(nvidia-smi:*), Bash(docker:*), Bash(kubectl:*), Read
|
||||
description: Generalized infrastructure health check for MCP servers, databases, and system dependencies. Configurable via .infra-check.json in project root.
|
||||
argument-hint: [--verbose] [--config <path>]
|
||||
---
|
||||
|
||||
# Infrastructure Health Check
|
||||
|
||||
Run comprehensive health checks on all configured infrastructure components.
|
||||
|
||||
## Context
|
||||
|
||||
**Command arguments**: $ARGUMENTS
|
||||
|
||||
First, check for configuration files and load the appropriate one:
|
||||
|
||||
- Project config: `.infra-check.json`
|
||||
- User config: `~/.infra-check.json`
|
||||
- If neither exists, use default checks
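The lookup order above could be sketched in Python as follows. This is a minimal illustration, not the command's actual implementation: `load_config`, its parameters, and the contents of `DEFAULT_CONFIG` are hypothetical, and giving `--config <path>` priority over both files is an assumption.

```python
import json
from pathlib import Path

# Hypothetical defaults; the text only says "use default checks".
DEFAULT_CONFIG = {
    "checks": {"redis": {"enabled": True, "url": "redis://localhost:6379"}},
    "output": {"verbose": False, "format": "standard"},
}


def load_config(project_dir=".", home_dir=None, explicit_path=None):
    """Return (config, source) from the first config file that exists."""
    home = Path(home_dir) if home_dir else Path.home()
    candidates = []
    if explicit_path:  # value passed via --config <path>, if any
        candidates.append(Path(explicit_path))
    candidates.append(Path(project_dir) / ".infra-check.json")  # project config
    candidates.append(home / ".infra-check.json")  # user config

    for path in candidates:
        if path.is_file():
            with path.open(encoding="utf-8") as handle:
                return json.load(handle), str(path)
    return DEFAULT_CONFIG, "defaults"
```

Returning the source alongside the config makes it easy to print which file was used when `--verbose` is set.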
|
||||
|
||||
## Your Task
|
||||
|
||||
1. **Load Configuration**:
|
||||
|
||||
- Check for `.infra-check.json` in current directory
|
||||
- Fallback to `~/.infra-check.json` if not found
|
||||
- Use sensible defaults if no config exists
|
||||
|
||||
1. **Parse Configuration Schema**:
|
||||
|
||||
```json
{
  "checks": {
    "redis": {
      "enabled": true,
      "url": "redis://localhost:6379",
      "timeout_seconds": 5
    },
    "temporal": {
      "enabled": true,
      "host": "localhost:7233",
      "namespace": "default"
    },
    "taskqueue": {
      "enabled": true,
      "check_npm": true
    },
    "postgresql": {
      "enabled": false,
      "connection_string": "postgresql://localhost:5432/mydb"
    },
    "mongodb": {
      "enabled": false,
      "url": "mongodb://localhost:27017"
    },
    "gpu": {
      "enabled": false,
      "required_model": "RTX 4090",
      "max_temperature": 85
    },
    "custom": [
      {
        "name": "Custom Service",
        "check_command": "curl -f http://localhost:8080/health"
      }
    ]
  },
  "output": {
    "verbose": false,
    "format": "standard"
  }
}
```
|
||||
|
||||
1. **Execute Health Checks**:
|
||||
|
||||
**Redis/Valkey Check**:
|
||||
|
||||
```bash
|
||||
redis-cli -u "$REDIS_URL" ping
|
||||
```
|
||||
|
||||
**Temporal Check**:
|
||||
|
||||
```bash
|
||||
temporal workflow list --namespace $NAMESPACE --limit 1
|
||||
```
|
||||
|
||||
**TaskQueue Check**:
|
||||
|
||||
```bash
|
||||
npx --version && npm list taskqueue-mcp --depth=0
|
||||
```
|
||||
|
||||
**PostgreSQL Check**:
|
||||
|
||||
```bash
|
||||
psql "$CONNECTION_STRING" -c "SELECT 1;" || pg_isready -d "$CONNECTION_STRING"
|
||||
```
|
||||
|
||||
**MongoDB Check**:
|
||||
|
||||
```bash
|
||||
mongosh "$MONGODB_URL" --eval "db.runCommand({ ping: 1 })" --quiet
|
||||
```
|
||||
|
||||
**GPU Check**:
|
||||
|
||||
```bash
|
||||
nvidia-smi --query-gpu=name,temperature.gpu,memory.used,memory.total --format=csv,noheader
|
||||
```
|
||||
|
||||
**Custom Checks**: Run each custom check command and capture exit code.
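One way to run a custom check and capture its exit code is sketched below. The function name `run_custom_check`, the result keys, and the default timeout are hypothetical. Mapping a missing binary to 127 and a timeout to 124 borrows shell conventions and is an assumption, not something the configuration schema specifies.

```python
import shlex
import subprocess


def run_custom_check(name, check_command, timeout_seconds=5):
    """Run one custom check and report its exit code without raising."""
    try:
        completed = subprocess.run(
            shlex.split(check_command),  # no shell, so no pipes or redirects
            capture_output=True,
            text=True,
            timeout=timeout_seconds,
        )
        exit_code = completed.returncode
        detail = completed.stderr.strip()
    except FileNotFoundError:
        # Missing dependency: report it instead of aborting the other checks
        exit_code, detail = 127, "command not found"
    except subprocess.TimeoutExpired:
        exit_code, detail = 124, f"timed out after {timeout_seconds}s"
    return {
        "name": name,
        "healthy": exit_code == 0,
        "exit_code": exit_code,
        "detail": detail,
    }
```

Because the command is split with `shlex` rather than passed to a shell, entries such as `curl -f http://localhost:8080/health` work, but commands that rely on shell features would need a different approach.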
|
||||
|
||||
1. **Output Format**:
|
||||
|
||||
**Standard (non-verbose)**:
|
||||
|
||||
```
|
||||
=== Infrastructure Health Check ===
|
||||
|
||||
✅ Redis HEALTHY (redis://localhost:6379)
|
||||
✅ Temporal HEALTHY (localhost:7233)
|
||||
✅ TaskQueue HEALTHY (npx available)
|
||||
⚠️ PostgreSQL WARNING (slow response: 2.3s)
|
||||
❌ MongoDB FAILED (connection refused)
|
||||
|
||||
===================================
|
||||
Overall Status: DEGRADED ⚠️
|
||||
|
||||
Issues Detected:
|
||||
1. PostgreSQL responding slowly (2.3s > 1.0s threshold)
|
||||
└─ Action: Check database load
|
||||
|
||||
2. MongoDB connection failed
|
||||
└─ Error: Connection refused at localhost:27017
|
||||
└─ Fix: Start MongoDB with 'mongod' or 'docker run -d -p 27017:27017 mongo:latest'
|
||||
```
|
||||
|
||||
**Verbose**: Include detailed metrics for each service (connection time, memory usage, version, uptime, etc.)
|
||||
|
||||
1. **Exit Codes**:
|
||||
|
||||
- 0: All enabled checks passed (HEALTHY)
|
||||
- 1: One or more checks failed (UNHEALTHY or DEGRADED)
|
||||
|
||||
1. **Integration Notes**:
|
||||
|
||||
- This command can be used in pre-test hooks
|
||||
- Can be called from CI/CD pipelines
|
||||
- Results can be published to coordination channels if MCP servers available
|
||||
|
||||
## Example Configuration Files
|
||||
|
||||
Create `.infra-check.json.example` in project root:
|
||||
|
||||
```json
|
||||
{
|
||||
"checks": {
|
||||
"redis": {"enabled": true, "url": "redis://localhost:6379"},
|
||||
"temporal": {"enabled": true, "host": "localhost:7233"},
|
||||
"taskqueue": {"enabled": true}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Error Handling
|
||||
|
||||
- Gracefully handle missing dependencies (e.g., redis-cli not installed)
|
||||
- Provide helpful error messages with installation instructions
|
||||
- Continue checking other services even if one fails
|
||||
- Aggregate all results before reporting overall status
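Aggregating results into an overall status and exit code might look like the sketch below. The state names and the `overall_status` helper are hypothetical. The document does not say when to report UNHEALTHY rather than DEGRADED, so the rule used here (UNHEALTHY when something failed and nothing is fully healthy) is an assumption, as is treating a warning as a non-zero exit.

```python
def overall_status(results):
    """Reduce per-check states to (label, exit_code).

    `results` maps a check name to "healthy", "warning", or "failed".
    """
    states = set(results.values())
    if "failed" in states:
        # Assumption: DEGRADED if at least one check is still healthy,
        # UNHEALTHY otherwise. Either way the exit code is 1.
        label = "DEGRADED" if "healthy" in states else "UNHEALTHY"
        return label, 1
    if "warning" in states:
        return "DEGRADED", 1
    return "HEALTHY", 0
```

Applied to the sample output above (three healthy checks, one warning, one failure), this yields `("DEGRADED", 1)`.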
|
||||
|
||||
## Coordination Integration (Optional)
|
||||
|
||||
If Redis MCP is available, publish health metrics:
|
||||
|
||||
```javascript
|
||||
// Check if mcp__RedisMCPServer tools are available
|
||||
// If yes, publish metrics:
|
||||
await mcp__RedisMCPServer__hset({
|
||||
name: "health:components",
|
||||
key: "redis",
|
||||
value: "healthy"
|
||||
});
|
||||
|
||||
await mcp__RedisMCPServer__hset({
|
||||
name: "health:last_check",
|
||||
key: "timestamp",
|
||||
value: new Date().toISOString()
|
||||
});
|
||||
```
|
||||
|
||||
If not available, simply output to console.
|
||||
447
commands/pipeline-status.md
Normal file
@@ -0,0 +1,447 @@
|
||||
---
|
||||
allowed-tools: Bash(gh:*), Bash(glab:*), Bash(git:*), Bash(ls:*), Bash(cat:*), Read, Glob, mcp__github__*, mcp__temporal-mcp__*
|
||||
description: CI/CD pipeline status checker. Shows build, test, and deployment status. Configurable for GitHub Actions, GitLab CI, Jenkins, or custom pipelines via .pipeline-status.sh.
|
||||
argument-hint: [--detailed] [--watch]
|
||||
---
|
||||
|
||||
# CI/CD Pipeline Status
|
||||
|
||||
Check the status of CI/CD pipelines for the current project.
|
||||
|
||||
## Context
|
||||
|
||||
**Command arguments**: $ARGUMENTS
|
||||
|
||||
**Supported Pipeline Systems** (auto-detect):
|
||||
|
||||
1. GitHub Actions - `.github/workflows/`
|
||||
1. GitLab CI - `.gitlab-ci.yml`
|
||||
1. Jenkins - `Jenkinsfile`
|
||||
1. Custom - `.pipeline-status.sh`
|
||||
1. Temporal - Workflow executions
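Auto-detection from the marker files listed above could be sketched as follows. The helper name `detect_pipeline_systems` is hypothetical. Temporal is left out because the list names no marker file for it; detecting it would need a live query, which this sketch does not attempt.

```python
from pathlib import Path


def detect_pipeline_systems(repo_root="."):
    """Return the CI/CD systems whose marker files exist under repo_root."""
    root = Path(repo_root)
    markers = [
        ("GitHub Actions", root / ".github" / "workflows"),
        ("GitLab CI", root / ".gitlab-ci.yml"),
        ("Jenkins", root / "Jenkinsfile"),
        ("Custom", root / ".pipeline-status.sh"),
    ]
    # Keep list order so the first match can serve as the default system
    return [name for name, path in markers if path.exists()]
```

An empty result corresponds to the "No CI/CD Detected" case.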
|
||||
|
||||
**Repository Info** - Detect git branch, commit, and remote URL to contextualize pipeline runs.
|
||||
|
||||
## Your Task
|
||||
|
||||
Provide pipeline status using the appropriate method for the detected CI/CD system:
|
||||
|
||||
### Method 1: GitHub Actions
|
||||
|
||||
Use GitHub CLI to query workflow runs:
|
||||
|
||||
```bash
|
||||
# Check if gh CLI is installed and authenticated
|
||||
if ! command -v gh &> /dev/null; then
|
||||
echo "❌ GitHub CLI (gh) not installed"
|
||||
echo "Install: https://cli.github.com/"
|
||||
exit 1
|
||||
fi
|
||||
|
||||
if ! gh auth status &> /dev/null; then
|
||||
echo "❌ Not authenticated with GitHub"
|
||||
echo "Run: gh auth login"
|
||||
exit 1
|
||||
fi
|
||||
|
||||
# Get latest workflow runs
|
||||
gh run list --limit 10 --json status,name,conclusion,createdAt,headBranch,databaseId
|
||||
|
||||
# For detailed view, show specific workflow
|
||||
if [ -n "$1" ]; then
|
||||
gh run view "$1"
|
||||
fi
|
||||
```
|
||||
|
||||
**Output format**:
|
||||
|
||||
```
|
||||
=== CI/CD Pipeline Status ===
|
||||
System: GitHub Actions
|
||||
Branch: main
|
||||
Commit: abc1234
|
||||
|
||||
Recent Workflow Runs:
|
||||
|
||||
✅ CI (main) Success 2m ago #1234
|
||||
✅ Tests (main) Success 2m ago #1235
|
||||
🟡 Deploy to Staging (main) Running 1m ago #1236
|
||||
❌ Lint (feature/new) Failed 15m ago #1237
|
||||
|
||||
Latest on current branch (main):
|
||||
Status: ✅ All checks passed
|
||||
Duration: 3m 45s
|
||||
Started: 2025-10-12 14:25:00
|
||||
|
||||
Failed Jobs:
|
||||
Lint (feature/new):
|
||||
- Step: Run ruff check
|
||||
- Error: Linting errors found in module.py:45
|
||||
- View: gh run view 1237
|
||||
```
|
||||
|
||||
### Method 2: GitLab CI
|
||||
|
||||
Use GitLab API or CLI:
|
||||
|
||||
```bash
|
||||
# Check for GitLab CLI (glab)
|
||||
if command -v glab &> /dev/null; then
|
||||
# Get pipeline status for current branch
|
||||
glab ci list --status=running,success,failed
|
||||
glab ci status
|
||||
else
|
||||
echo "⚠️ GitLab CLI not installed, using API..."
|
||||
|
||||
# Extract project info from git remote
|
||||
GITLAB_REMOTE=$(git config --get remote.origin.url)
|
||||
# Parse and call API
|
||||
fi
|
||||
```
|
||||
|
||||
**Output format**:
|
||||
|
||||
```
|
||||
=== CI/CD Pipeline Status ===
|
||||
System: GitLab CI
|
||||
Branch: main
|
||||
Commit: abc1234
|
||||
|
||||
Pipeline #5678 (running):
|
||||
✅ build Success (2m 15s)
|
||||
✅ test Success (1m 45s)
|
||||
🟡 deploy Running (45s elapsed)
|
||||
|
||||
Recent Pipelines:
|
||||
#5678 (main) Running 2m ago
|
||||
#5677 (main) Success 1h ago
|
||||
#5676 (dev) Failed 3h ago
|
||||
|
||||
View details: glab ci view 5678
|
||||
```
|
||||
|
||||
### Method 3: Jenkins
|
||||
|
||||
Query Jenkins API:
|
||||
|
||||
```bash
|
||||
# Check for Jenkins CLI or use curl
|
||||
JENKINS_URL="${JENKINS_URL:-http://localhost:8080}"
|
||||
JOB_NAME="${JENKINS_JOB_NAME:-$(basename $(pwd))}"
|
||||
|
||||
if [ -n "$JENKINS_API_TOKEN" ] && [ -n "$JENKINS_USER" ]; then
|
||||
# Query job status via API
|
||||
curl -s -u "$JENKINS_USER:$JENKINS_API_TOKEN" \
|
||||
"$JENKINS_URL/job/$JOB_NAME/lastBuild/api/json" | jq .
|
||||
else
|
||||
echo "⚠️ Jenkins credentials not configured"
|
||||
echo "Set: JENKINS_URL, JENKINS_USER, JENKINS_API_TOKEN"
|
||||
fi
|
||||
```
|
||||
|
||||
**Output format**:
|
||||
|
||||
```
|
||||
=== CI/CD Pipeline Status ===
|
||||
System: Jenkins
|
||||
Job: podcast-pipeline
|
||||
Build: #42
|
||||
|
||||
Current Build (#42):
|
||||
Status: 🟡 Running
|
||||
Started: 3m 45s ago
|
||||
ETA: 2m remaining
|
||||
|
||||
Stages:
|
||||
✅ Checkout Complete (15s)
|
||||
✅ Dependencies Complete (45s)
|
||||
✅ Build Complete (1m 30s)
|
||||
🟡 Test Running (1m 15s)
|
||||
⏸️ Deploy Pending
|
||||
|
||||
Recent Builds:
|
||||
#42 (main) Running 3m ago
|
||||
#41 (main) Success 2h ago
|
||||
#40 (dev) Failed 1d ago
|
||||
|
||||
View: $JENKINS_URL/job/$JOB_NAME/42/
|
||||
```
|
||||
|
||||
### Method 4: Custom Pipeline Script
|
||||
|
||||
If `.pipeline-status.sh` exists, execute it:
|
||||
|
||||
```bash
|
||||
if [ -f ".pipeline-status.sh" ]; then
|
||||
echo "=== CI/CD Pipeline Status ==="
|
||||
echo "System: Custom"
|
||||
echo ""
|
||||
|
||||
# Source and execute custom script
|
||||
source .pipeline-status.sh
|
||||
|
||||
if declare -f get_pipeline_status > /dev/null; then
|
||||
get_pipeline_status "$@"
|
||||
else
|
||||
echo "❌ .pipeline-status.sh must define get_pipeline_status() function"
|
||||
exit 1
|
||||
fi
|
||||
fi
|
||||
```
|
||||
|
||||
**Custom script format** (`.pipeline-status.sh`):
|
||||
|
||||
```bash
|
||||
#!/bin/bash
|
||||
|
||||
# Custom pipeline status function
|
||||
get_pipeline_status() {
|
||||
local detailed="${1:-false}"
|
||||
|
||||
echo "Branch: $(git rev-parse --abbrev-ref HEAD)"
|
||||
echo "Commit: $(git rev-parse --short HEAD)"
|
||||
echo ""
|
||||
|
||||
# Check local test status
|
||||
echo "Local Tests:"
|
||||
if [ -f ".test-results/last-run.json" ]; then
|
||||
jq -r '.summary' .test-results/last-run.json
|
||||
else
|
||||
echo " No recent test results"
|
||||
fi
|
||||
|
||||
# Check build artifacts
|
||||
echo ""
|
||||
echo "Build Artifacts:"
|
||||
if [ -d "dist/" ]; then
|
||||
echo " ✅ Found: dist/ ($(du -sh dist/ | cut -f1))"
|
||||
else
|
||||
echo " ❌ Missing: dist/"
|
||||
fi
|
||||
|
||||
# Check deployment status (custom logic)
|
||||
echo ""
|
||||
echo "Deployment:"
|
||||
if curl -sf http://localhost:8000/health > /dev/null; then
|
||||
echo " ✅ Local server running"
|
||||
else
|
||||
echo " ❌ Local server not running"
|
||||
fi
|
||||
}
|
||||
```
|
||||
|
||||
### Method 5: No CI/CD Detected
|
||||
|
||||
Provide guidance for setup:
|
||||
|
||||
```
|
||||
=== CI/CD Pipeline Status ===
|
||||
|
||||
⚠️ No CI/CD system detected
|
||||
|
||||
To enable pipeline status tracking:
|
||||
|
||||
Option 1: GitHub Actions
|
||||
Create: .github/workflows/ci.yml
|
||||
Install: gh (GitHub CLI)
|
||||
Docs: https://docs.github.com/actions
|
||||
|
||||
Option 2: GitLab CI
|
||||
Create: .gitlab-ci.yml
|
||||
Install: glab (GitLab CLI)
|
||||
Docs: https://docs.gitlab.com/ee/ci/
|
||||
|
||||
Option 3: Custom Script
|
||||
Create: .pipeline-status.sh
|
||||
Define: get_pipeline_status() function
|
||||
|
||||
Option 4: Local Checks Only
|
||||
Run: pytest, ruff, mypy locally
|
||||
No CI/CD required for development
|
||||
```
|
||||
|
||||
## Watch Mode (Optional)
|
||||
|
||||
If `--watch` flag is provided, poll for updates:
|
||||
|
||||
```bash
|
||||
if [[ "$*" == *"--watch"* ]]; then
|
||||
echo "👀 Watching pipeline status (Ctrl+C to stop)..."
|
||||
echo ""
|
||||
|
||||
while true; do
|
||||
clear
|
||||
# Run status check
|
||||
get_pipeline_status
|
||||
|
||||
echo ""
|
||||
echo "Refreshing in 30 seconds..."
|
||||
sleep 30
|
||||
done
|
||||
fi
|
||||
```
|
||||
|
||||
## Detailed Mode
|
||||
|
||||
If `--detailed` flag is provided, include:
|
||||
|
||||
- Full job logs (last 50 lines)
|
||||
- Artifact download URLs
|
||||
- Test coverage reports
|
||||
- Performance metrics
|
||||
- Deployment URLs
|
||||
|
||||
```bash
|
||||
if [[ "$*" == *"--detailed"* ]]; then
|
||||
echo ""
|
||||
echo "=== Detailed Pipeline Information ==="
|
||||
echo ""
|
||||
|
||||
# Show recent logs
|
||||
echo "Recent Logs (last 50 lines):"
|
||||
if command -v gh &> /dev/null; then
|
||||
gh run view --log | tail -50
|
||||
fi
|
||||
|
||||
# Show test coverage
|
||||
if [ -f "coverage.xml" ]; then
|
||||
echo ""
|
||||
echo "Test Coverage:"
|
||||
# Parse coverage report
|
||||
fi
|
||||
|
||||
# Show artifacts
|
||||
echo ""
|
||||
echo "Build Artifacts:"
|
||||
if command -v gh &> /dev/null; then
|
||||
gh run view --json artifacts
|
||||
fi
|
||||
fi
|
||||
```
|
||||
|
||||
## Integration with Coordination Systems
|
||||
|
||||
If Redis MCP is available, cache pipeline status:
|
||||
|
||||
```javascript
|
||||
// Store pipeline status for monitoring
|
||||
await mcp__RedisMCPServer__json_set({
|
||||
name: "pipeline:status:current",
|
||||
path: "$",
|
||||
value: {
|
||||
system: "github",
|
||||
branch: "main",
|
||||
commit: "abc1234",
|
||||
status: "running",
|
||||
started_at: "2025-10-12T14:25:00Z",
|
||||
workflows: [
|
||||
{name: "CI", status: "success"},
|
||||
{name: "Tests", status: "success"},
|
||||
{name: "Deploy", status: "running"}
|
||||
]
|
||||
}
|
||||
});
|
||||
|
||||
// Set TTL (expire after 1 hour)
|
||||
await mcp__RedisMCPServer__expire({
|
||||
name: "pipeline:status:current",
|
||||
expire_seconds: 3600
|
||||
});
|
||||
```
|
||||
|
||||
If TaskQueue MCP is available, create tasks for failed pipelines:
|
||||
|
||||
```javascript
|
||||
// On pipeline failure, create recovery task
|
||||
if (pipelineStatus === "failed") {
|
||||
await mcp__taskqueue__create_task({
|
||||
projectId: "ci-cd",
|
||||
title: "Investigate pipeline failure",
|
||||
description: `Pipeline #${buildNumber} failed on ${branch}
|
||||
|
||||
Failed Jobs:
|
||||
- ${failedJobs.join('\n- ')}
|
||||
|
||||
View logs: ${logsUrl}`
|
||||
});
|
||||
}
|
||||
```
|
||||
|
||||
## Configuration File
|
||||
|
||||
Create `.pipeline-status.json` for custom settings:
|
||||
|
||||
```json
|
||||
{
|
||||
"system": "github",
|
||||
"watch_interval_seconds": 30,
|
||||
"notification": {
|
||||
"enabled": true,
|
||||
"on_failure": true,
|
||||
"on_success": false
|
||||
},
|
||||
"filters": {
|
||||
"branches": ["main", "develop"],
|
||||
"workflows": ["CI", "Tests"]
|
||||
},
|
||||
"output": {
|
||||
"format": "standard",
|
||||
"show_logs": false
|
||||
}
|
||||
}
|
||||
```
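How the `filters` section might be applied to workflow runs is sketched below. The `filter_runs` helper is hypothetical; the `headBranch` and `name` keys match the fields requested from `gh run list --json` earlier. Treating an empty or missing filter list as "no filtering" is an assumption.

```python
def filter_runs(runs, filters):
    """Keep runs whose branch and workflow name pass the configured filters."""
    branches = set(filters.get("branches", []))
    workflows = set(filters.get("workflows", []))
    return [
        run
        for run in runs
        if (not branches or run.get("headBranch") in branches)
        and (not workflows or run.get("name") in workflows)
    ]
```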
|
||||
|
||||
## Error Handling
|
||||
|
||||
- Gracefully handle API authentication failures
|
||||
- Provide helpful error messages with setup instructions
|
||||
- Fall back to git log if CI/CD unavailable
|
||||
- Suggest local test commands if no CI/CD configured
|
||||
|
||||
## Common Use Cases
|
||||
|
||||
**Before pushing code**:
|
||||
|
||||
```bash
|
||||
/pipeline-status
|
||||
# Check current status before pushing changes
|
||||
```
|
||||
|
||||
**Monitoring long-running builds**:
|
||||
|
||||
```bash
|
||||
/pipeline-status --watch
|
||||
# Monitor build progress in real-time
|
||||
```
|
||||
|
||||
**Debugging failures**:
|
||||
|
||||
```bash
|
||||
/pipeline-status --detailed
|
||||
# Get full logs and error details
|
||||
```
|
||||
|
||||
**Integration with pre-push hook**:
|
||||
|
||||
```bash
#!/bin/bash
# .git/hooks/pre-push
if ! /pipeline-status | grep -q "All checks passed"; then
    echo "⚠️ Warning: Previous pipeline failed"
    # Git passes ref data to this hook on stdin, so read the answer from the terminal
    read -p "Continue push? (y/N) " -n 1 -r < /dev/tty
    echo
    if [[ ! $REPLY =~ ^[Yy]$ ]]; then
        exit 1
    fi
fi
```
|
||||
|
||||
## Best Practices
|
||||
|
||||
1. **Cache API responses** to avoid rate limits
|
||||
1. **Use CLI tools** (gh, glab) over direct API calls when available
|
||||
1. **Implement retries** for transient API failures
|
||||
1. **Store credentials securely** (use environment variables, not config files)
|
||||
1. **Filter noise** - focus on current branch and recent runs
|
||||
1. **Link to web UI** for detailed investigation
|
||||
1. **Integrate with notifications** for critical failures
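The retry advice above can be illustrated with a small exponential-backoff wrapper. This is a sketch under assumptions: `with_retries`, its parameters, and the doubling delay are hypothetical choices, and a real implementation would likely catch only errors known to be transient.

```python
import time


def with_retries(operation, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call operation(); on an exception wait, then retry with a doubled delay."""
    for attempt in range(attempts):
        try:
            return operation()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * (2 ** attempt))
```

The `sleep` parameter exists so that callers (and tests) can substitute a fake and avoid real waiting.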
|
||||
431
commands/team-status.md
Normal file
@@ -0,0 +1,431 @@
|
||||
---
|
||||
allowed-tools: Bash(redis-cli:*), Read, Glob, mcp__RedisMCPServer__*
|
||||
description: Multi-agent coordination status check. Shows agent workload, active tasks, and team health with formatted progress bars and statistics.
|
||||
argument-hint: [agent-type]
|
||||
---
|
||||
|
||||
# Multi-Agent Team Status
|
||||
|
||||
Check current workload and coordination status of Claude Code subagents with formatted visualization.
|
||||
|
||||
## Context
|
||||
|
||||
**Command arguments**: $ARGUMENTS
|
||||
|
||||
**Coordination Method**: Redis MCP with JSON serialization and datetime handling
|
||||
|
||||
## Your Task
|
||||
|
||||
Display formatted team status using Redis coordination data with proper JSON parsing:
|
||||
|
||||
### Implementation
|
||||
|
||||
1. **Parse command arguments** to determine display mode:
|
||||
|
||||
- No arguments: Show overview of all agents
|
||||
- Agent type specified: Show detailed view of that agent
|
||||
|
||||
1. **Query Redis with JSON parsing**:
|
||||
|
||||
- Get all agent statuses from `agents:status` hash
|
||||
- Get heartbeat data from `agents:heartbeat` hash
|
||||
- Parse JSON values using the pattern from RedisCoordinationHelper
|
||||
- Restore datetime fields (fields ending in `_at` or containing `timestamp`)
|
||||
|
||||
1. **Display formatted output** matching these patterns:
|
||||
|
||||
### Overview Mode (No Arguments)
|
||||
|
||||
```
|
||||
=== Multi-Agent Team Status ===
|
||||
Coordination Method: Redis MCP
|
||||
|
||||
Agent Workload:
|
||||
project-manager ██████████ 100% | 1 active task
|
||||
platform-engineer ██████████ 100% | 1 active task
|
||||
ai-engineer ████████░░ 80% | 2 active tasks
|
||||
data-engineer ████░░░░░░ 40% | 1 active task
|
||||
python-pro ░░░░░░░░░░ 0% | available
|
||||
|
||||
Task Distribution:
|
||||
Total Active: 5 tasks
|
||||
Average Load: 64%
|
||||
Load Variance: 39% (⚠️ unbalanced)
|
||||
|
||||
Heartbeat Status:
|
||||
✅ All agents reporting healthy
|
||||
|
||||
Last Updated: 2025-11-07 20:45:30
|
||||
```
|
||||
|
||||
**Progress Bar Format**:
|
||||
|
||||
- 10 blocks: `█` for used capacity, `░` for available
|
||||
- Calculate blocks: `Math.round(workload / 10)`
|
||||
- Show percentage and task count
|
||||
- Add "(at capacity)" warning if workload >= 100
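A Python rendering of the bar calculation might look like this. The helper name `render_bar` and the clamping to 0-100 are assumptions. Note that Python's built-in `round()` rounds halves to the nearest even number, so the sketch uses `int(x + 0.5)` to match the half-up behavior of JavaScript's `Math.round` for non-negative values.

```python
def render_bar(workload, width=10):
    """Render a workload percentage as a bar of filled and empty blocks."""
    clamped = max(0, min(100, workload))  # assumption: out-of-range values are clamped
    # int(x + 0.5) rounds halves up for x >= 0, like Math.round;
    # round(4.5) in Python would give 4 instead of 5.
    filled = int(clamped * width / 100 + 0.5)
    return "█" * filled + "░" * (width - filled)
```

For example, `render_bar(80)` produces eight filled blocks followed by two empty ones, as in the `ai-engineer` row above.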
|
||||
|
||||
**Workload Indicators**:
|
||||
|
||||
- 0%: "available"
|
||||
- 1-99%: "N active task(s)"
|
||||
- ≥100%: "N active tasks (at capacity)"
|
||||
|
||||
**Balance Status**:
|
||||
|
||||
- Variance < 20%: "✅ balanced"
|
||||
- Variance ≥ 20%: "⚠️ unbalanced"
|
||||
|
||||
**Heartbeat Freshness**:
|
||||
|
||||
- Age ≤ 60 minutes: Healthy
|
||||
- Age > 60 minutes: Warn with specific age (Xh Ym or Ym format)
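The freshness rule could be sketched as below. `STALE_AFTER`, `format_age`, and `heartbeat_report` are hypothetical names, and the sketch assumes heartbeat timestamps are timezone-aware (for example, stored in UTC).

```python
from datetime import datetime, timedelta, timezone

STALE_AFTER = timedelta(minutes=60)


def format_age(age):
    """Format a timedelta as "Xh Ym", or "Ym" when under one hour."""
    total_minutes = int(age.total_seconds()) // 60
    hours, minutes = divmod(total_minutes, 60)
    return f"{hours}h {minutes}m" if hours else f"{minutes}m"


def heartbeat_report(last_seen, now=None):
    """Return (is_stale, age_text) for a timezone-aware heartbeat timestamp."""
    now = now or datetime.now(timezone.utc)
    age = now - last_seen
    # Exactly 60 minutes still counts as healthy (age <= 60 minutes)
    return age > STALE_AFTER, format_age(age)
```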
|
||||
|
||||
### Detailed Mode (Agent Type Argument)
|
||||
|
||||
When user specifies agent type (e.g., `/team-status project-manager`):
|
||||
|
||||
```
|
||||
=== Agent Status: project-manager ===
|
||||
|
||||
Current Status: BUSY (100% capacity)
|
||||
|
||||
Active Tasks (1):
|
||||
1. smart-onboarding-coordination
|
||||
- Progress: 45%
|
||||
- Duration: 2h 15m
|
||||
- Description: Coordinating Smart Onboarding implementation
|
||||
|
||||
Recent History (from status data):
|
||||
- Current workload: 100%
|
||||
- Task count: 1
|
||||
- Last updated: 5m ago
|
||||
|
||||
Circuit Breaker: ✅ CLOSED (healthy)
|
||||
Last Heartbeat: 5m ago
|
||||
```
|
||||
|
||||
**Task Display**:
|
||||
|
||||
- Show task ID or name
|
||||
- Display progress if available
|
||||
- Calculate duration from `started_at` if present
|
||||
- Show description if available
|
||||
|
||||
**Heartbeat Age Calculation**:
|
||||
|
||||
- Compare timestamp to current time
|
||||
- Format as "Xh Ym" (hours + minutes) or "Ym" (minutes only)
|
||||
- Warn if age > 60 minutes
|
||||
|
||||
### JSON Parsing Logic
|
||||
|
||||
Implement this parsing pattern (from RedisCoordinationHelper):
|
||||
|
||||
```python
import json
from datetime import datetime


def parse_redis_value(raw_value: str) -> dict:
    """Parse Redis value with JSON deserialization and datetime restoration."""
    try:
        parsed = json.loads(raw_value)

        # Handle non-dict values (wrap in dict)
        if not isinstance(parsed, dict):
            return {"value": parsed}

        # Restore datetime fields
        for key, value in parsed.items():
            if isinstance(value, str) and (key.endswith("_at") or "timestamp" in key.lower()):
                try:
                    # Convert ISO format string to datetime
                    # (before Python 3.11, fromisoformat() rejects a trailing "Z")
                    parsed[key] = datetime.fromisoformat(value)
                except ValueError:
                    # Not a valid datetime, keep as string
                    pass

        return parsed

    except json.JSONDecodeError:
        # Not JSON, return as plain value
        return {"value": raw_value}
```
|
||||
|
||||
### Statistics Calculations
|
||||
|
||||
**Load Variance** (standard deviation):
|
||||
|
||||
```python
def calculate_variance(workload_values: list[int]) -> float:
    """Calculate standard deviation of workload values."""
    if not workload_values:
        return 0.0

    mean = sum(workload_values) / len(workload_values)
    squared_diffs = [(val - mean) ** 2 for val in workload_values]
    variance = sum(squared_diffs) / len(workload_values)

    return variance ** 0.5  # Standard deviation
```
|
||||
|
||||
**Duration Formatting**:
|
||||
|
||||
```python
from datetime import datetime


def format_duration(start_time: datetime) -> str:
    """Format duration from start time to now."""
    # Match the timezone awareness of start_time; mixing a naive
    # datetime.now() with an aware start_time raises TypeError
    delta = datetime.now(start_time.tzinfo) - start_time

    total_seconds = int(delta.total_seconds())
    hours = total_seconds // 3600
    minutes = (total_seconds % 3600) // 60

    if hours > 0:
        return f"{hours}h {minutes}m"
    else:
        return f"{minutes}m"
```
|
||||
|
||||
### Error Handling
|
||||
|
||||
**No Redis Connection**:
|
||||
|
||||
```
|
||||
⚠️ No coordination infrastructure detected
|
||||
|
||||
To enable real-time agent coordination:
|
||||
|
||||
Option 1 (Recommended): Deploy Redis Stack
|
||||
mycelium deploy start --yes
|
||||
|
||||
Option 2: Create coordination directory
|
||||
mkdir -p .claude/coordination/
|
||||
# Agents will create status files here
|
||||
```
|
||||
|
||||
**Empty Redis Data**:
|
||||
|
||||
```
|
||||
⚠️ No agents currently coordinating
|
||||
|
||||
To enable real-time agent coordination:
|
||||
Option 1: Deploy Redis Stack (recommended)
|
||||
mycelium deploy start --yes
|
||||
|
||||
Option 2: Create coordination directory
|
||||
mkdir -p .claude/coordination/
|
||||
```
|
||||
|
||||
**Agent Not Found** (in detailed mode):
|
||||
|
||||
```
|
||||
❌ Agent 'unknown-agent' not found
|
||||
|
||||
Available agents:
|
||||
- project-manager
|
||||
- platform-engineer
|
||||
- ai-engineer
|
||||
- data-engineer
|
||||
- python-pro
|
||||
```
|
||||
|
||||
**Stale Heartbeats** (age > 60 minutes):
|
||||
|
||||
```
|
||||
Heartbeat Status:
|
||||
⚠️ Stale heartbeats detected:
|
||||
- ai-engineer: last seen 2h 15m ago
|
||||
- ml-engineer: last seen 3h 42m ago
|
||||
- python-pro: last seen 1h 5m ago
|
||||
```
|
||||
|
||||
### MCP Tool Usage
|
||||
|
||||
Use these MCP tools to query Redis:
|
||||
|
||||
```javascript
|
||||
// Get all agent statuses
|
||||
const agentStatuses = await mcp__RedisMCPServer__hgetall({
|
||||
name: "agents:status"
|
||||
});
|
||||
|
||||
// Get specific agent status
|
||||
const status = await mcp__RedisMCPServer__hget({
|
||||
name: "agents:status",
|
||||
key: "ai-engineer"
|
||||
});
|
||||
|
||||
// Get all heartbeats
|
||||
const heartbeats = await mcp__RedisMCPServer__hgetall({
|
||||
name: "agents:heartbeat"
|
||||
});
|
||||
|
||||
// Get specific heartbeat
|
||||
const heartbeat = await mcp__RedisMCPServer__hget({
|
||||
name: "agents:heartbeat",
|
||||
key: "ai-engineer"
|
||||
});
|
||||
```
|
||||
|
||||
### Display Requirements
|
||||
|
||||
1. **Sort agents by workload** (highest to lowest) in overview mode
|
||||
1. **Calculate total statistics** (total tasks, average load, variance)
|
||||
1. **Format progress bars** with 10 blocks (█ and ░ characters)
|
||||
1. **Check heartbeat freshness** and warn if stale (>60 min)
|
||||
1. **Handle missing data gracefully** (show 0% workload, "available" status)
|
||||
1. **Parse JSON correctly** (handle both JSON strings and plain values)
|
||||
1. **Restore datetime fields** (fields ending in `_at` or containing `timestamp`)
|
||||
1. **Format timestamps** as human-readable durations (e.g., "2h 15m")
|
||||
|
||||
### Integration Notes
|
||||
|
||||
This command integrates with:
|
||||
|
||||
- `RedisCoordinationHelper` library for JSON serialization patterns
|
||||
- `mycelium deploy` command for Redis Stack deployment
|
||||
- Agent heartbeat mechanisms for health monitoring
|
||||
- Workload management for task distribution
|
||||
|
||||
### Expected Behavior
|
||||
|
||||
**Query Redis MCP Server**:
|
||||
|
||||
1. Check if MCP server is available
|
||||
1. Query `agents:status` hash for all agent data
|
||||
1. Query `agents:heartbeat` hash for heartbeat timestamps
|
||||
1. Parse JSON data with datetime restoration
|
||||
1. Display formatted output with statistics
|
||||
|
||||
**Handle Parsing Errors**:
|
||||
|
||||
- If JSON parsing fails, wrap value in `{"value": raw_value}`
|
||||
- If datetime parsing fails, keep as string
|
||||
- If Redis query fails, show helpful error with deployment instructions
|
||||
|
||||
**Calculate Statistics**:
|
||||
|
||||
- Sort agents by workload (descending)
|
||||
- Sum total active tasks
|
||||
- Calculate average workload percentage
|
||||
- Calculate load variance (standard deviation)
|
||||
- Determine balance status (< 20% = balanced)
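The statistics steps can be combined into one pass, sketched below. The `summarize` helper and the `workload`/`tasks` field names are hypothetical, since the document does not define the layout of the `agents:status` values. `statistics.pstdev` computes the population standard deviation, the same quantity `calculate_variance` returns.

```python
from statistics import fmean, pstdev


def summarize(agents):
    """Sort agents by workload and compute the overview statistics.

    `agents` maps an agent name to {"workload": percent, "tasks": count}.
    """
    ranked = sorted(agents.items(), key=lambda item: item[1]["workload"], reverse=True)
    loads = [info["workload"] for _, info in ranked]
    spread = pstdev(loads) if loads else 0.0  # population standard deviation
    return {
        "order": [name for name, _ in ranked],
        "total_tasks": sum(info["tasks"] for _, info in ranked),
        "average_load": fmean(loads) if loads else 0.0,
        "spread": spread,
        "balanced": spread < 20,
    }
```

With workloads of 100, 80, 40, and 0 (as in Scenario 1), the average is 55 and the spread rounds to 38, so the team is reported as unbalanced.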
|
||||
|
||||
**Check Heartbeat Health**:
|
||||
|
||||
- Parse heartbeat timestamps
|
||||
- Calculate age in minutes
|
||||
- Warn if any heartbeat > 60 minutes old
|
||||
- Format age as "Xh Ym" or "Ym"
|
||||
|
||||
### Example Outputs
|
||||
|
||||
**Scenario 1: Active Team**
|
||||
|
||||
```
|
||||
=== Multi-Agent Team Status ===
|
||||
Coordination Method: Redis MCP
|
||||
|
||||
Agent Workload:
|
||||
ai-engineer ██████████ 100% | 3 active tasks (at capacity)
|
||||
data-engineer ████████░░ 80% | 2 active tasks
|
||||
python-pro ████░░░░░░ 40% | 1 active task
|
||||
performance-eng ░░░░░░░░░░ 0% | available
|
||||
|
||||
Task Distribution:
|
||||
Total Active: 6 tasks
|
||||
Average Load: 55%
|
||||
Load Variance: 38% (⚠️ unbalanced)
|
||||
|
||||
Heartbeat Status:
|
||||
✅ All agents reporting healthy
|
||||
|
||||
Last Updated: 2025-11-07 20:45:30
|
||||
```
|
||||
|
||||
**Scenario 2: Stale Heartbeats**
|
||||
|
||||
```
|
||||
=== Multi-Agent Team Status ===
|
||||
Coordination Method: Redis MCP
|
||||
|
||||
Agent Workload:
|
||||
project-manager ██████████ 100% | 1 active task
|
||||
|
||||
Task Distribution:
|
||||
Total Active: 1 task
|
||||
Average Load: 100%
|
||||
Load Variance: 0% (✅ balanced)
|
||||
|
||||
Heartbeat Status:
|
||||
⚠️ Stale heartbeats detected:
|
||||
- project-manager: last seen 25 days ago
|
||||
|
||||
Last Updated: 2025-11-07 20:45:30
|
||||
```
|
||||
|
||||
**Scenario 3: Detailed Agent View**
|
||||
|
||||
```
|
||||
=== Agent Status: ai-engineer ===
|
||||
|
||||
Current Status: BUSY (85% capacity)
|
||||
|
||||
Active Tasks (2):
|
||||
1. train-voice-model
|
||||
- Progress: 35%
|
||||
- Duration: 4h 23m
|
||||
- Description: Training custom voice model
|
||||
|
||||
2. evaluate-checkpoint
|
||||
- Progress: 70%
|
||||
- Duration: 45m
|
||||
- Description: Evaluating model checkpoint
|
||||
|
||||
Recent History (from status data):
|
||||
- Current workload: 85%
|
||||
- Task count: 2
|
||||
- Last updated: 2m ago
|
||||
|
||||
Circuit Breaker: ✅ CLOSED (healthy)
|
||||
Last Heartbeat: 2m ago
|
||||
```
|
||||
|
||||
**Scenario 4: No Agents**
|
||||
|
||||
```
|
||||
=== Multi-Agent Team Status ===
|
||||
Coordination Method: Redis MCP
|
||||
|
||||
⚠️ No agents currently coordinating
|
||||
|
||||
To enable real-time agent coordination:
|
||||
Option 1: Deploy Redis Stack (recommended)
|
||||
mycelium deploy start --yes
|
||||
|
||||
Option 2: Create coordination directory
|
||||
mkdir -p .claude/coordination/
|
||||
```
|
||||
|
||||
## Implementation Steps
|
||||
|
||||
When user runs `/team-status`:
|
||||
|
||||
1. **Parse arguments**: Extract agent type from `$ARGUMENTS` if provided
|
||||
1. **Query Redis MCP**:
|
||||
- Call `mcp__RedisMCPServer__hgetall` for `agents:status`
|
||||
- Call `mcp__RedisMCPServer__hgetall` for `agents:heartbeat`
|
||||
1. **Parse JSON data**:
|
||||
- Use `parse_redis_value()` logic for each value
|
||||
- Restore datetime fields
|
||||
1. **Determine display mode**:
|
||||
- If agent type specified: Show detailed view
|
||||
- Otherwise: Show overview with all agents
|
||||
1. **Format output**:
|
||||
- Create progress bars (10 blocks)
|
||||
- Calculate statistics (total, average, variance)
|
||||
- Check heartbeat freshness (warn if >60 min)
|
||||
- Display timestamp
|
||||
1. **Handle errors**:
|
||||
- Show helpful message if Redis unavailable
|
||||
- Show empty state if no agents
|
||||
- Show agent not found if invalid agent type
|
||||
|
||||
Now implement this formatted team status display with proper JSON parsing and error handling.
|
||||