---
description: Generate system health dashboard with compound engineering metrics
model: claude-opus-4-1
extended-thinking: true
allowed-tools: Bash, Read, Write
argument-hint: [--publish] [--send-summary-email] [--output dashboard.md]
---

# Meta Health Command

You are an elite systems analyst specializing in measuring compound engineering effectiveness. Your role is to aggregate data from all meta-learning systems, calculate health metrics, track trends, and generate comprehensive dashboards that demonstrate the system's self-improvement progress.

Arguments: $ARGUMENTS

## Overview

This command generates a comprehensive health dashboard by analyzing:

  • Telemetry data (meta/telemetry.json)
  • Compound history (meta/compound_history.json)
  • Experiments tracking (meta/experiments.json)
  • Agent variants (meta/agent_variants.json)
  • Workflow graphs (meta/workflow_graph.json)

**Key Metrics Tracked:**

  1. Compound Engineering Metrics: Auto-improvements, success rates, bugs prevented
  2. Developer Velocity: Current vs baseline, time saved, projections
  3. System Intelligence: Agent evolution, workflow optimizations, patterns documented
  4. Code Quality: Test coverage, technical debt, documentation accuracy
  5. Active Experiments: Running trials, completed deployments
  6. Predictions Status: High-confidence alerts, validated predictions
  7. ROI Summary: Investment vs returns, compound multiplier

## Workflow

### Phase 1: Parse Arguments and Locate Data Files

```bash
# Locate the meta-learning plugin in its standard marketplace install path
META_PLUGIN_DIR="$HOME/.claude/plugins/marketplaces/psd-claude-coding-system/plugins/psd-claude-meta-learning-system"
META_DIR="$META_PLUGIN_DIR/meta"

# Data files
TELEMETRY_FILE="$META_DIR/telemetry.json"
HISTORY_FILE="$META_DIR/compound_history.json"
EXPERIMENTS_FILE="$META_DIR/experiments.json"
VARIANTS_FILE="$META_DIR/agent_variants.json"
WORKFLOW_FILE="$META_DIR/workflow_graph.json"

# Parse arguments
PUBLISH=false
SEND_EMAIL=false
OUTPUT_FILE=""

# Use a while/shift loop (not `for arg in ...`) so --output can consume
# the value that follows it; `shift` has no effect on a for-loop's list.
set -- $ARGUMENTS
while [ $# -gt 0 ]; do
  case $1 in
    --publish)
      PUBLISH=true
      ;;
    --send-summary-email)
      SEND_EMAIL=true
      ;;
    --output)
      shift
      OUTPUT_FILE="$1"
      ;;
  esac
  shift
done

echo "=== PSD Meta-Learning: System Health Dashboard ==="
echo "Data sources:"
echo "  • Telemetry: $TELEMETRY_FILE"
echo "  • History: $HISTORY_FILE"
echo "  • Experiments: $EXPERIMENTS_FILE"
echo "  • Agent variants: $VARIANTS_FILE"
echo "  • Workflows: $WORKFLOW_FILE"
echo ""
echo "Options:"
echo "  • Publish: $PUBLISH"
echo "  • Send email: $SEND_EMAIL"
echo ""

# Verify required files exist
MISSING=0
for file in "$TELEMETRY_FILE" "$HISTORY_FILE"; do
  if [ ! -f "$file" ]; then
    echo "⚠️  Warning: $file not found"
    MISSING=$((MISSING + 1))
  fi
done

if [ $MISSING -gt 0 ]; then
  echo ""
  echo "⚠️  Some data files are missing. Dashboard will be limited."
  echo ""
fi
```

### Phase 2: Read All Data Sources

Use the Read tool to load all meta-learning data:

```bash
echo "Loading telemetry data..."
if [ -f "$TELEMETRY_FILE" ]; then
  cat "$TELEMETRY_FILE"
else
  echo '{"version": "1.0.0", "executions": [], "patterns": {}}'
fi

echo ""
echo "Loading compound history..."
if [ -f "$HISTORY_FILE" ]; then
  cat "$HISTORY_FILE"
else
  echo '{"version": "1.0.0", "suggestions": [], "implemented": []}'
fi

echo ""
echo "Loading experiments..."
if [ -f "$EXPERIMENTS_FILE" ]; then
  cat "$EXPERIMENTS_FILE"
else
  echo '{"experiments": []}'
fi

echo ""
echo "Loading agent variants..."
if [ -f "$VARIANTS_FILE" ]; then
  cat "$VARIANTS_FILE"
else
  echo '{"agents": []}'
fi

echo ""
echo "Loading workflow graph..."
if [ -f "$WORKFLOW_FILE" ]; then
  cat "$WORKFLOW_FILE"
else
  echo '{"learned_patterns": {}}'
fi
```
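
Before computing metrics, it helps to confirm that each existing file actually parses as JSON. A minimal guard (assumes `jq` is installed):

```bash
# Sanity check: warn about files that exist but fail to parse as JSON,
# so Phase 3 does not compute metrics from corrupt data (requires jq).
for file in "$TELEMETRY_FILE" "$HISTORY_FILE" "$EXPERIMENTS_FILE" \
            "$VARIANTS_FILE" "$WORKFLOW_FILE"; do
  if [ -f "$file" ] && ! jq empty "$file" >/dev/null 2>&1; then
    echo "⚠️  Warning: $file is not valid JSON; skipping its metrics"
  fi
done
```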

### Phase 3: Calculate Health Metrics

Using extended thinking, aggregate and analyze all data:

#### Metrics to Calculate

1. Compound Engineering Metrics:

  • Auto-Improvements Implemented: Count from compound_history where status="implemented"
  • Manual Reviews Required: Count where status="pending" and needs_review=true
  • Improvement Success Rate: implemented / (implemented + rejected)
  • Bugs Prevented: Sum of prevented incidents from telemetry/history
  • Trend: Compare this month vs last month (if historical data available)

2. Developer Velocity:

  • Baseline Velocity: 1.0x (pre-meta-learning reference)
  • Current Velocity: Calculate from time_saved vs baseline_time
    • Formula: 1 + (total_time_saved / total_baseline_time)
  • Time Saved This Month: Sum duration improvements from implemented suggestions
  • Projected Annual Savings: time_saved_this_month × 12

3. System Intelligence:

  • Agent Evolution Generations: Max generation number from agent_variants
  • Best Agent Improvement: Compare v1 vs current version success rates
    • Example: security-analyst v4 at 0.94 vs v1 at 0.82 ≈ +15% improvement (0.94 / 0.82 ≈ 1.15)
  • Workflow Optimizations Learned: Count patterns in workflow_graph
  • Patterns Auto-Documented: Count unique patterns from all sources

4. Code Quality:

  • Test Coverage: Extract from telemetry (if tracked)
  • Technical Debt: Calculate trend from code metrics
  • Documentation Accuracy: From validation checks (if available)
  • Security Issues Caught Pre-Prod: From security-analyst invocations

5. Active Experiments:

  • Running: experiments where status="running"
    • Show: trial count, metrics, improvement percentage
  • Completed & Deployed: experiments where status="deployed"
    • Show: outcome, ROI achieved

6. Predictions Status:

  • High Confidence Alerts: From meta_predict or patterns
  • Predictions Validated: Past predictions that came true
    • Track accuracy over time

7. ROI Summary:

  • Investment:
    • Initial setup: estimate from first commit/start date
    • Ongoing maintenance: hours per month
  • Returns:
    • Time saved: aggregate from all sources
    • Bugs prevented: value estimate
    • Knowledge captured: pattern count
  • Compound ROI: Returns / Investment ratio
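
The schemas of these files aren't pinned down in this document, but as an illustration, the headline ratios could be derived with `jq` along these lines. Treat it as a sketch: field names such as `.suggestions[].status` and `.implemented[].time_saved_hours` are assumptions about the data layout, not a documented schema.

```bash
# Sketch only: headline ratios via jq. Field names are assumed, not a
# documented schema -- adjust to whatever compound_history.json actually holds.
IMPLEMENTED=$(jq '.implemented | length' "$HISTORY_FILE")
REJECTED=$(jq '[.suggestions[] | select(.status == "rejected")] | length' "$HISTORY_FILE")
if [ $((IMPLEMENTED + REJECTED)) -gt 0 ]; then
  SUCCESS_RATE=$(echo "scale=2; $IMPLEMENTED / ($IMPLEMENTED + $REJECTED)" | bc)
else
  SUCCESS_RATE="n/a"
fi

# Current velocity = 1 + (total_time_saved / total_baseline_time)
SAVED=$(jq '[.implemented[] | (.time_saved_hours // 0)] | add // 0' "$HISTORY_FILE")
BASELINE=$(jq '[.implemented[] | (.baseline_hours // 0)] | add // 0' "$HISTORY_FILE")
if [ "$(echo "$BASELINE > 0" | bc)" = "1" ]; then
  VELOCITY=$(echo "scale=2; 1 + $SAVED / $BASELINE" | bc)
else
  VELOCITY="1.00"
fi

echo "Improvement success rate: $SUCCESS_RATE"
echo "Current velocity: ${VELOCITY}x"
```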

### Phase 4: Generate Health Dashboard

Create a comprehensive dashboard report:

```markdown
# PSD Claude System Health - [Current Date]

**System Status**: [🟢 Healthy | 🟡 Needs Attention | 🔴 Issues Detected]
**Data Collection**: [N] days active
**Last Updated**: [timestamp]

---

## 📊 Compound Engineering Metrics

### Self-Improvement Stats
- **Auto-Improvements Implemented**: [N] ([trend] this month)
  - Quick wins: [N]
  - Medium-term: [N]
  - Experimental: [N]
- **Manual Reviews Required**: [N] ([trend] vs last month)
- **Improvement Success Rate**: [percentage]% ([trend] from baseline)
- **Bugs Prevented**: [N] estimated (predictive catches)
- **Pattern Detection Accuracy**: [percentage]%

**Trend Analysis** (30-day rolling):

Improvements:  [▁▂▃▄▅▆▇█] ↑ [percentage]%
Success Rate:  [▁▂▃▄▅▆▇█] ↑ [percentage]%


---

## 🚀 Developer Velocity

### Productivity Metrics
- **Baseline Velocity**: 1.0x (pre-meta-learning)
- **Current Velocity**: [X]x (↑[percentage]%)
- **Time Saved This Month**: [X] hours
- **Time Saved This Week**: [X] hours
- **Projected Annual Savings**: [X] hours ([X] work-weeks)

### Velocity Breakdown
- **Automation**: [X] hours saved ([percentage]% of total)
- **Agent Orchestration**: [X] hours saved ([percentage]% of total)
- **Predictive Prevention**: [X] hours saved ([percentage]% of total)
- **Documentation**: [X] hours saved ([percentage]% of total)

**Velocity Trend** (12-week rolling):

Week 1:  1.0x ████████
Week 6:  1.5x ████████████
Week 12: 2.3x ██████████████████


**Top Time Savers** (this month):
1. [Suggestion/Feature]: [X] hours saved
2. [Suggestion/Feature]: [X] hours saved
3. [Suggestion/Feature]: [X] hours saved

---

## 🧠 System Intelligence

### Agent Evolution
- **Total Agents Tracked**: [N]
- **Agents Under Evolution**: [N]
- **Evolution Generations Completed**: [N]
- **Average Performance Improvement**: +[percentage]% vs baseline

**Agent Performance**:
| Agent | Current Version | Baseline | Improvement | Status |
|-------|----------------|----------|-------------|--------|
| security-analyst | v[N] ([success_rate]%) | v1 ([baseline]%) | +[percentage]% | [🟢/🟡/🔴] |
| test-specialist | v[N] ([success_rate]%) | v1 ([baseline]%) | +[percentage]% | [🟢/🟡/🔴] |
| [agent-name] | v[N] ([success_rate]%) | v1 ([baseline]%) | +[percentage]% | [🟢/🟡/🔴] |

**Best Agent Evolution**: [agent-name] v[N] (+[percentage]% vs v1)
- Success rate: [baseline]% → [current]%
- Avg findings: [baseline] → [current]
- False positives: [baseline] → [current]

### Workflow Optimizations
- **Patterns Learned**: [N]
- **Auto-Orchestrations Active**: [N]
- **Average Workflow Time Reduction**: [percentage]%

**Most Effective Patterns**:
1. [Pattern name]: [success_rate]% success, [N] uses
2. [Pattern name]: [success_rate]% success, [N] uses
3. [Pattern name]: [success_rate]% success, [N] uses

### Knowledge Base
- **Patterns Auto-Documented**: [N]
- **Commands Enhanced**: [N]
- **Agents Created**: [N]
- **Templates Generated**: [N]

---

## ✅ Code Quality

### Quality Metrics
- **Test Coverage**: [percentage]% ([trend] from 6 months ago)
- **Technical Debt**: [Decreasing/Stable/Increasing] [percentage]%/month
- **Documentation Accuracy**: [percentage]% (auto-validated)
- **Security Issues Caught Pre-Prod**: [percentage]% (last 3 months)

**Quality Trends** (6-month view):

Test Coverage:     [▁▂▃▄▅▆▇█] [start]% → [end]%
Tech Debt:         [█▇▆▅▄▃▂▁] [start] → [end] (↓ is good)
Doc Accuracy:      [▁▂▃▄▅▆▇█] [start]% → [end]%
Security Coverage: [▁▂▃▄▅▆▇█] [start]% → [end]%


**Code Health Indicators**:
- ✅ Technical debt: [Decreasing/Stable/Increasing] [percentage]%/month
- ✅ Test coverage: [direction] to [percentage]%
- ✅ Bug count: [direction] [percentage]% vs 6 months ago
- [✅/⚠️/🔴] Documentation drift: [description]

---

## 🧪 Active Experiments

### Running Experiments ([N])

**Experiment #1**: [Name]
- **Status**: Trial [X]/[N] ([percentage]% complete)
- **Hypothesis**: [Description]
- **Current Results**: [X]min saved avg (↑[percentage]% vs control)
- **Confidence**: [percentage]% (needs [N] more trials for significance)
- **Action**: [Continue/Stop/Deploy]

**Experiment #2**: [Name]
- [Same format]

### Recently Completed ([N])

**✅ [Experiment Name]** - Deployed [date]
- **Outcome**: [Success/Mixed/Failed]
- **ROI Achieved**: [X] hours/month saved
- **Status**: [In production]

**✅ [Experiment Name]** - Deployed [date]
- [Same format]

### Experiments Queue ([N] pending)
1. [Experiment name] - [confidence]% confidence, [ROI estimate]
2. [Experiment name] - [confidence]% confidence, [ROI estimate]

---

## 🎯 Predictions & Alerts

### High Confidence Predictions ([N])

⚠️  **[Issue Type] risk within [timeframe]**
- **Confidence**: [percentage]% (based on [N] similar past patterns)
- **Preventive Actions**: [X]/[N] complete ([percentage]%)
- **Estimated Impact if Not Prevented**: [X] hours debugging
- **Status**: [On Track/Behind/Blocked]

⚠️  **[Issue Type] risk within [timeframe]**
- [Same format]

### Medium Confidence Predictions ([N])

🔍 **[Issue Type] - Monitoring**
- **Confidence**: [percentage]%
- **Action**: [Investigation scheduled/Monitoring]

### Predictions Validated (Last 30 Days)

✅ **[Prediction Name]** ([date])
- **Outcome**: [Caught pre-production/Prevented]
- **Value**: Saved ~[X]hr debugging
- **Accuracy**: Prediction confidence was [percentage]%

✅ **[Prediction Name]** ([date])
- [Same format]

**Prediction Accuracy**: [percentage]% ([N] correct / [N] total)
**Trend**: [Improving/Stable/Declining]

---

## 📈 ROI Summary

### Investment

**Initial Setup**:
- Time spent: [X] hours
- Date started: [date]
- Age: [N] days

**Ongoing Maintenance**:
- Weekly: ~[X] hours
- Monthly: ~[X] hours
- Automation level: [percentage]% (↑ over time)

### Returns (Monthly Average)

**Time Savings**:
- Direct automation: [X] hours/month
- Improved velocity: [X] hours/month
- Prevented debugging: [X] hours/month
- **Total**: [X] hours/month

**Quality Improvements**:
- Bugs prevented: [N] ([~$X] value)
- Security issues caught: [N]
- Documentation drift prevented: [percentage]%

**Knowledge Captured**:
- Patterns documented: [N]
- Templates created: [N]
- Workflow optimizations: [N]

### ROI Calculation

**Monthly ROI**: [X] hours saved / [X] hours invested = **[X]x**

**Compound ROI** (Lifetime):

Total time invested:  [X] hours
Total time saved:     [X] hours
Bugs prevented value: ~$[X]
Knowledge value:      [N] reusable patterns

Compound Multiplier: [X]x (and growing)


**ROI Trend**:

Month 1: 0.5x (investment phase)
Month 2: 1.8x (early returns)
Month 3: 4.2x (compound effects)
Month 6: 9.4x (current)


**Break-Even**: Achieved in Month [N]
**Payback Period**: [N] weeks

---

## 📋 System Summary

### Quick Stats
- **Commands Executed**: [N] (last 30 days)
- **Most Used Command**: [command] ([percentage]%)
- **Most Effective Agent**: [agent] ([percentage]% success)
- **Patterns Detected**: [N]
- **Auto-Improvements**: [N] implemented
- **System Age**: [N] days

### Health Score: [N]/100

**Score Breakdown**:
- Velocity: [N]/20 ([description])
- Quality: [N]/20 ([description])
- Intelligence: [N]/20 ([description])
- ROI: [N]/20 ([description])
- Trend: [N]/20 ([description])

**Overall Status**: [🟢 Excellent | 🟡 Good | 🔴 Needs Improvement]

### Recommendations

**IMMEDIATE ACTION REQUIRED**:
[If any critical issues, list here]

**OPPORTUNITIES THIS WEEK**:
1. [Action item based on data]
2. [Action item based on data]

**LONG-TERM FOCUS**:
1. [Strategic recommendation]
2. [Strategic recommendation]

---

## 📊 Appendix: Detailed Metrics

### Telemetry Summary
- Total executions: [N]
- Success rate: [percentage]%
- Average duration: [X] seconds
- Files changed: [N] total
- Tests added: [N] total

### Historical Data Points
- Suggestions generated: [N]
- Suggestions implemented: [N] ([percentage]%)
- Suggestions rejected: [N] ([percentage]%)
- Average ROI accuracy: [percentage]% (estimated vs actual)

### System Configuration
- Meta-learning version: [version]
- Telemetry started: [date]
- Plugins installed: [list]
- Update frequency: [frequency]

---

**Dashboard Generated**: [timestamp]
**Next Update**: [scheduled time]
**Data Confidence**: [High/Medium/Low] (based on [N] data points)

**Actions**:
- Use `/meta_analyze` to deep dive into patterns
- Use `/meta_learn` to generate new improvement suggestions
- Use `/meta_implement` to deploy high-confidence improvements
- Use `/meta_predict` to see future risk predictions
```

### Phase 5: Publish Dashboard (if --publish flag set)

If `--publish` is set, save the dashboard to a public location:

```bash
if [ "$PUBLISH" = true ]; then
  echo ""
  echo "📊 Publishing dashboard..."

  # Create docs directory if it doesn't exist
  DOCS_DIR="$META_PLUGIN_DIR/../../docs"
  mkdir -p "$DOCS_DIR"

  # Save dashboard
  DASHBOARD_FILE="$DOCS_DIR/system-health-$(date +%Y%m%d).md"
  # Dashboard content written by Write tool above

  # Also create/update latest symlink
  ln -sf "system-health-$(date +%Y%m%d).md" "$DOCS_DIR/system-health-latest.md"

  echo "✅ Dashboard published to: $DASHBOARD_FILE"
  echo "📄 Latest: $DOCS_DIR/system-health-latest.md"

  # If GitHub Pages configured, could push to gh-pages branch
  # git checkout gh-pages
  # cp $DASHBOARD_FILE index.md
  # git add index.md && git commit -m "Update health dashboard" && git push
fi
```

### Phase 6: Send Email Summary (if --send-summary-email flag set)

If `--send-summary-email` is set, generate and send an email summary:

```bash
if [ "$SEND_EMAIL" = true ]; then
  echo ""
  echo "📧 Generating email summary..."

  # Create condensed email version
  EMAIL_SUBJECT="PSD Meta-Learning Health: [Status] - [Date]"
  EMAIL_BODY="
  System Health Summary - $(date +%Y-%m-%d)

  🚀 VELOCITY: [X]x (↑[percentage]% vs baseline)
  💰 ROI: [X]x compound multiplier
  ✅ QUALITY: [metrics summary]
  🧠 INTELLIGENCE: [agent performance summary]

  📊 THIS MONTH:
  • [X] hours saved
  • [N] auto-improvements implemented
  • [N] bugs prevented

  ⚠️  ALERTS:
  [List high-confidence predictions if any]

  📈 TRENDS:
  [Key positive trends]

  🎯 RECOMMENDED ACTIONS:
  [Top 3 action items]

  Full dashboard: [link]
  "

  # Send via mail command or API
  # echo "$EMAIL_BODY" | mail -s "$EMAIL_SUBJECT" hagelk@psd401.net

  echo "✅ Email summary prepared"
  echo "   (Email sending requires mail configuration)"
fi
```

### Phase 7: Output Results

```bash
echo ""
echo "✅ Health dashboard generated!"
echo ""

if [ -n "$OUTPUT_FILE" ]; then
  echo "📝 Saved to: $OUTPUT_FILE"
fi

if [ "$PUBLISH" = true ]; then
  echo "📊 Published to docs/"
fi

if [ "$SEND_EMAIL" = true ]; then
  echo "📧 Email summary prepared"
fi

echo ""
echo "Next steps:"
echo "  • Review alerts and recommendations"
echo "  • Act on immediate action items"
echo "  • Track trends over time"
echo "  • Share dashboard with stakeholders"
```

## Dashboard Generation Guidelines

### Data Aggregation Best Practices

**DO:**

  • Calculate actual metrics from real data (don't estimate)
  • Show trends with visual indicators (▁▂▃▄▅▆▇█, ↑↓, 🟢🟡🔴)
  • Compare current vs baseline vs target
  • Include confidence levels for predictions
  • Provide actionable recommendations
  • Track ROI with concrete numbers

**DON'T:**

  • Show vanity metrics without context
  • Include data without trends
  • Make claims without evidence
  • Overwhelm with too many metrics
  • Ignore negative trends
  • Present data without interpretation

### Handling Missing or Insufficient Data

If data is limited, clearly indicate:

```markdown
## 📊 LIMITED DATA AVAILABLE

**Current Status**:
- System age: [N] days (minimum 30 days recommended for trends)
- Executions: [N] (minimum 50+ for statistics)
- Data completeness: [percentage]%

**Available Metrics** (limited confidence):
[Show what metrics can be calculated]

**Unavailable Metrics** (insufficient data):
- Agent evolution (needs 3+ generations)
- Trend analysis (needs 30+ days)
- ROI accuracy (needs completed suggestions)

**Recommendation**:
Continue using the system for [N] more days to enable full dashboard.

**Preliminary Health**: [Basic metrics available]
```
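
A small guard can decide when to fall back to this limited-data banner. A sketch, where the `.executions` array matches the telemetry fallback shape shown in Phase 2:

```bash
# Gate full trend analysis on data volume (thresholds mirror the
# guidance above: 30+ days active and 50+ recorded executions).
EXEC_COUNT=$(jq '.executions | length' "$TELEMETRY_FILE" 2>/dev/null || echo 0)
if [ "$EXEC_COUNT" -lt 50 ]; then
  echo "📊 LIMITED DATA AVAILABLE: only $EXEC_COUNT executions recorded"
  echo "   (50+ recommended before trend statistics are meaningful)"
fi
```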

### Trend Visualization

Use ASCII charts for quick visual trends:

```
Velocity over 12 weeks:
1.0x ████████
1.2x ██████████
1.5x ████████████
1.8x ██████████████
2.3x ██████████████████

ROI Compound Growth:
Month 1: ▁ 0.5x
Month 2: ▃ 1.8x
Month 3: ▅ 4.2x
Month 6: █ 9.4x
```
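
If the sparklines should be generated from real series rather than typed by hand, a bash helper along these lines works (a sketch; assumes `bc` and a UTF-8 locale so substring expansion is per character):

```bash
# Map a series of numbers onto the eight block characters ▁▂▃▄▅▆▇█.
sparkline() {
  local blocks="▁▂▃▄▅▆▇█" out="" v idx min max range
  local values=("$@")
  min=${values[0]}; max=${values[0]}
  for v in "${values[@]}"; do
    [ "$(echo "$v < $min" | bc)" = "1" ] && min=$v
    [ "$(echo "$v > $max" | bc)" = "1" ] && max=$v
  done
  range=$(echo "$max - $min" | bc)
  for v in "${values[@]}"; do
    if [ "$(echo "$range == 0" | bc)" = "1" ]; then
      idx=0   # flat series: render the lowest block for every point
    else
      idx=$(echo "($v - $min) * 7 / $range" | bc)  # scales to 0..7
    fi
    out="$out${blocks:$idx:1}"
  done
  echo "$out"
}

sparkline 1.0 1.2 1.5 1.8 2.3   # → ▁▂▃▅█
```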

### Health Score Calculation

Formula: Sum of weighted sub-scores

  • Velocity (20 points): Based on time_saved and productivity increase
    • 1.0-1.5x = 10 pts
    • 1.5-2.0x = 15 pts
    • 2.0x+ = 20 pts
  • Quality (20 points): Based on test coverage, tech debt, security
    • Each metric contributes 5-7 pts
  • Intelligence (20 points): Based on agent evolution and patterns learned
    • Agent improvement avg >20% = 15+ pts
    • Patterns documented >50 = 15+ pts
  • ROI (20 points): Based on compound multiplier
    • 2-5x = 10 pts
    • 5-10x = 15 pts
    • 10x+ = 20 pts
  • Trend (20 points): Based on direction of key metrics
    • All improving = 20 pts
    • Mixed = 10-15 pts
    • Declining = 0-10 pts

Total: 0-100 points

  • 80-100: 🟢 Excellent
  • 60-79: 🟡 Good
  • 40-59: 🟡 Needs Improvement
  • <40: 🔴 Critical
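
Assembled in bash, the final score might look like the sketch below; the five sub-score values are placeholders standing in for the threshold logic above, not computed here:

```bash
# Placeholder sub-scores (each 0-20); in practice derive them from the
# threshold bands listed above.
VELOCITY_PTS=15       # e.g. velocity in the 1.5-2.0x band
QUALITY_PTS=14
INTELLIGENCE_PTS=16
ROI_PTS=15            # e.g. 5-10x compound multiplier
TREND_PTS=20          # all key metrics improving

HEALTH_SCORE=$((VELOCITY_PTS + QUALITY_PTS + INTELLIGENCE_PTS + ROI_PTS + TREND_PTS))

if   [ "$HEALTH_SCORE" -ge 80 ]; then BAND="🟢 Excellent"
elif [ "$HEALTH_SCORE" -ge 60 ]; then BAND="🟡 Good"
elif [ "$HEALTH_SCORE" -ge 40 ]; then BAND="🟡 Needs Improvement"
else                                  BAND="🔴 Critical"
fi

echo "Health Score: $HEALTH_SCORE/100 ($BAND)"
```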

## Important Notes

  1. Accuracy: All metrics must be based on actual data, never invented
  2. Trends: Show direction and magnitude of change
  3. Context: Always provide baseline and target for comparison
  4. Actionable: Include specific recommendations based on data
  5. Honest: Don't hide negative trends or problems
  6. Visual: Use symbols and charts for quick scanning
  7. Regular: Dashboard should be generated weekly or daily for trends

## Example Usage Scenarios

### Scenario 1: Daily Health Check

```
/meta_health
```

Quick health overview in terminal.

### Scenario 2: Weekly Dashboard Publication

```
/meta_health --publish --output meta/health-$(date +%Y%m%d).md
```

Save dashboard and publish to docs.

### Scenario 3: Monthly Stakeholder Report

```
/meta_health --publish --send-summary-email
```

Full dashboard with email summary to stakeholders.


**Remember**: The health dashboard demonstrates compound engineering value. Show concrete ROI, track trends over time, and provide actionable insights that drive continuous improvement.