---
description: Generate system health dashboard with compound engineering metrics
model: claude-opus-4-1
extended-thinking: true
allowed-tools: Bash, Read, Write
argument-hint: [--publish] [--send-summary-email] [--output dashboard.md]
---
# Meta Health Command
You are an elite systems analyst specializing in measuring compound engineering effectiveness. Your role is to aggregate data from all meta-learning systems, calculate health metrics, track trends, and generate comprehensive dashboards that demonstrate the system's self-improvement progress.
**Arguments**: $ARGUMENTS
## Overview
This command generates a comprehensive health dashboard by analyzing:
- Telemetry data (`meta/telemetry.json`)
- Compound history (`meta/compound_history.json`)
- Experiments tracking (`meta/experiments.json`)
- Agent variants (`meta/agent_variants.json`)
- Workflow graphs (`meta/workflow_graph.json`)
**Key Metrics Tracked**:
1. **Compound Engineering Metrics**: Auto-improvements, success rates, bugs prevented
2. **Developer Velocity**: Current vs baseline, time saved, projections
3. **System Intelligence**: Agent evolution, workflow optimizations, patterns documented
4. **Code Quality**: Test coverage, technical debt, documentation accuracy
5. **Active Experiments**: Running trials, completed deployments
6. **Predictions Status**: High-confidence alerts, validated predictions
7. **ROI Summary**: Investment vs returns, compound multiplier
## Workflow
### Phase 1: Parse Arguments and Locate Data Files
```bash
# Locate the meta-learning plugin directory (adjust if the marketplace is installed elsewhere)
META_PLUGIN_DIR="$HOME/.claude/plugins/marketplaces/psd-claude-coding-system/plugins/psd-claude-meta-learning-system"
META_DIR="$META_PLUGIN_DIR/meta"
# Data files
TELEMETRY_FILE="$META_DIR/telemetry.json"
HISTORY_FILE="$META_DIR/compound_history.json"
EXPERIMENTS_FILE="$META_DIR/experiments.json"
VARIANTS_FILE="$META_DIR/agent_variants.json"
WORKFLOW_FILE="$META_DIR/workflow_graph.json"
# Parse arguments
PUBLISH=false
SEND_EMAIL=false
OUTPUT_FILE=""
set -- $ARGUMENTS
while [ $# -gt 0 ]; do
  case $1 in
    --publish)
      PUBLISH=true
      ;;
    --send-summary-email)
      SEND_EMAIL=true
      ;;
    --output)
      OUTPUT_FILE="${2:-}"
      [ $# -gt 1 ] && shift
      ;;
  esac
  shift
done
echo "=== PSD Meta-Learning: System Health Dashboard ==="
echo "Data sources:"
echo " • Telemetry: $TELEMETRY_FILE"
echo " • History: $HISTORY_FILE"
echo " • Experiments: $EXPERIMENTS_FILE"
echo " • Agent variants: $VARIANTS_FILE"
echo " • Workflows: $WORKFLOW_FILE"
echo ""
echo "Options:"
echo " • Publish: $PUBLISH"
echo " • Send email: $SEND_EMAIL"
echo ""
# Verify required files exist
MISSING=0
for file in "$TELEMETRY_FILE" "$HISTORY_FILE"; do
if [ ! -f "$file" ]; then
echo "⚠️ Warning: $file not found"
MISSING=$((MISSING + 1))
fi
done
if [ $MISSING -gt 0 ]; then
echo ""
echo "⚠️ Some data files are missing. Dashboard will be limited."
echo ""
fi
```
### Phase 2: Read All Data Sources
Load all meta-learning data, falling back to an empty default whenever a file is missing:
```bash
echo "Loading telemetry data..."
if [ -f "$TELEMETRY_FILE" ]; then
cat "$TELEMETRY_FILE"
else
echo '{"version": "1.0.0", "executions": [], "patterns": {}}'
fi
echo ""
echo "Loading compound history..."
if [ -f "$HISTORY_FILE" ]; then
cat "$HISTORY_FILE"
else
echo '{"version": "1.0.0", "suggestions": [], "implemented": []}'
fi
echo ""
echo "Loading experiments..."
if [ -f "$EXPERIMENTS_FILE" ]; then
cat "$EXPERIMENTS_FILE"
else
echo '{"experiments": []}'
fi
echo ""
echo "Loading agent variants..."
if [ -f "$VARIANTS_FILE" ]; then
cat "$VARIANTS_FILE"
else
echo '{"agents": []}'
fi
echo ""
echo "Loading workflow graph..."
if [ -f "$WORKFLOW_FILE" ]; then
cat "$WORKFLOW_FILE"
else
echo '{"learned_patterns": {}}'
fi
```
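Before aggregating, it can help to fail fast on corrupt data files. A minimal sketch, assuming `python3` is available (substitute `jq empty "$f"` if you prefer jq):

```shell
# Sketch: warn about data files that exist but are not valid JSON
is_valid_json() {
  python3 -m json.tool "$1" >/dev/null 2>&1
}

for f in "$TELEMETRY_FILE" "$HISTORY_FILE" "$EXPERIMENTS_FILE"; do
  if [ -f "$f" ] && ! is_valid_json "$f"; then
    echo "⚠️ $f exists but is not valid JSON — metrics from it will be skipped"
  fi
done
```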
### Phase 3: Calculate Health Metrics
Using extended thinking, aggregate and analyze all data:
#### Metrics to Calculate
**1. Compound Engineering Metrics**:
- **Auto-Improvements Implemented**: Count from compound_history where status="implemented"
- **Manual Reviews Required**: Count where status="pending" and needs_review=true
- **Improvement Success Rate**: implemented / (implemented + rejected)
- **Bugs Prevented**: Sum of prevented incidents from telemetry/history
- **Trend**: Compare this month vs last month (if historical data available)
**2. Developer Velocity**:
- **Baseline Velocity**: 1.0x (pre-meta-learning reference)
- **Current Velocity**: Calculate from time_saved vs baseline_time
- Formula: 1 + (total_time_saved / total_baseline_time)
- **Time Saved This Month**: Sum duration improvements from implemented suggestions
- **Projected Annual Savings**: time_saved_this_month × 12
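The velocity formula above can be sketched with `awk`; the numbers here are illustrative placeholders, not real telemetry:

```shell
# Sketch: current velocity = 1 + (total_time_saved / total_baseline_time)
TOTAL_TIME_SAVED=45       # e.g. minutes saved this month (illustrative)
TOTAL_BASELINE_TIME=100   # minutes the same work took pre-meta-learning (illustrative)

VELOCITY=$(awk -v s="$TOTAL_TIME_SAVED" -v b="$TOTAL_BASELINE_TIME" \
  'BEGIN { printf "%.2f", ((b > 0) ? 1 + s / b : 1) }')
echo "Current velocity: ${VELOCITY}x"   # → Current velocity: 1.45x
```

Guarding against a zero baseline keeps the dashboard sane on day one, when no baseline data exists yet.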
**3. System Intelligence**:
- **Agent Evolution Generations**: Max generation number from agent_variants
- **Best Agent Improvement**: Compare v1 vs current version success rates
  - Example: security-analyst v4 at 0.94 vs v1 at 0.82 = +15% relative improvement
- **Workflow Optimizations Learned**: Count patterns in workflow_graph
- **Patterns Auto-Documented**: Count unique patterns from all sources
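The agent-improvement percentage is a relative gain against the v1 baseline; a one-line sketch with the example rates:

```shell
# Sketch: relative improvement of an evolved agent vs its v1 baseline
# success rate 0.82 (v1) → 0.94 (current) is roughly a +15% relative gain
awk -v cur=0.94 -v base=0.82 \
  'BEGIN { printf "+%.0f%%\n", (cur - base) / base * 100 }'
```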
**4. Code Quality**:
- **Test Coverage**: Extract from telemetry (if tracked)
- **Technical Debt**: Calculate trend from code metrics
- **Documentation Accuracy**: From validation checks (if available)
- **Security Issues Caught Pre-Prod**: From security-analyst invocations
**5. Active Experiments**:
- **Running**: experiments where status="running"
- Show: trial count, metrics, improvement percentage
- **Completed & Deployed**: experiments where status="deployed"
- Show: outcome, ROI achieved
**6. Predictions Status**:
- **High Confidence Alerts**: From meta_predict or patterns
- **Predictions Validated**: Past predictions that came true
- Track accuracy over time
**7. ROI Summary**:
- **Investment**:
- Initial setup: estimate from first commit/start date
- Ongoing maintenance: hours per month
- **Returns**:
- Time saved: aggregate from all sources
- Bugs prevented: value estimate
- Knowledge captured: pattern count
- **Compound ROI**: Returns / Investment ratio
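The ROI ratio is a straightforward division; a sketch with illustrative numbers (real values come from the aggregation above):

```shell
# Sketch: monthly ROI = hours returned / hours invested (illustrative numbers)
HOURS_RETURNED=38   # time saved + prevented debugging, this month
HOURS_INVESTED=4    # ongoing maintenance, this month

awk -v r="$HOURS_RETURNED" -v i="$HOURS_INVESTED" \
  'BEGIN { printf "Monthly ROI: %.1fx\n", ((i > 0) ? r / i : 0) }'
# → Monthly ROI: 9.5x
```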
### Phase 4: Generate Health Dashboard
Create a comprehensive dashboard report:
```markdown
# PSD Claude System Health - [Current Date]
**System Status**: [🟢 Healthy | 🟡 Needs Attention | 🔴 Issues Detected]
**Data Collection**: [N] days active
**Last Updated**: [timestamp]
---
## 📊 Compound Engineering Metrics
### Self-Improvement Stats
- **Auto-Improvements Implemented**: [N] ([trend] this month)
- Quick wins: [N]
- Medium-term: [N]
- Experimental: [N]
- **Manual Reviews Required**: [N] ([trend] vs last month)
- **Improvement Success Rate**: [percentage]% ([trend] from baseline)
- **Bugs Prevented**: [N] estimated (predictive catches)
- **Pattern Detection Accuracy**: [percentage]%
**Trend Analysis** (30-day rolling):
    Improvements:  [▁▂▃▄▅▆▇█] ↑ [percentage]%
    Success Rate:  [▁▂▃▄▅▆▇█] ↑ [percentage]%
---
## 🚀 Developer Velocity
### Productivity Metrics
- **Baseline Velocity**: 1.0x (pre-meta-learning)
- **Current Velocity**: [X]x (↑[percentage]%)
- **Time Saved This Month**: [X] hours
- **Time Saved This Week**: [X] hours
- **Projected Annual Savings**: [X] hours ([X] work-weeks)
### Velocity Breakdown
- **Automation**: [X] hours saved ([percentage]% of total)
- **Agent Orchestration**: [X] hours saved ([percentage]% of total)
- **Predictive Prevention**: [X] hours saved ([percentage]% of total)
- **Documentation**: [X] hours saved ([percentage]% of total)
**Velocity Trend** (12-week rolling):
    Week 1:  1.0x ████████
    Week 6:  1.5x ████████████
    Week 12: 2.3x ██████████████████
**Top Time Savers** (this month):
1. [Suggestion/Feature]: [X] hours saved
2. [Suggestion/Feature]: [X] hours saved
3. [Suggestion/Feature]: [X] hours saved
---
## 🧠 System Intelligence
### Agent Evolution
- **Total Agents Tracked**: [N]
- **Agents Under Evolution**: [N]
- **Evolution Generations Completed**: [N]
- **Average Performance Improvement**: +[percentage]% vs baseline
**Agent Performance**:
| Agent | Current Version | Baseline | Improvement | Status |
|-------|----------------|----------|-------------|--------|
| security-analyst | v[N] ([success_rate]%) | v1 ([baseline]%) | +[percentage]% | [🟢/🟡/🔴] |
| test-specialist | v[N] ([success_rate]%) | v1 ([baseline]%) | +[percentage]% | [🟢/🟡/🔴] |
| [agent-name] | v[N] ([success_rate]%) | v1 ([baseline]%) | +[percentage]% | [🟢/🟡/🔴] |
**Best Agent Evolution**: [agent-name] v[N] (+[percentage]% vs v1)
- Success rate: [baseline]% → [current]%
- Avg findings: [baseline] → [current]
- False positives: [baseline] → [current]
### Workflow Optimizations
- **Patterns Learned**: [N]
- **Auto-Orchestrations Active**: [N]
- **Average Workflow Time Reduction**: [percentage]%
**Most Effective Patterns**:
1. [Pattern name]: [success_rate]% success, [N] uses
2. [Pattern name]: [success_rate]% success, [N] uses
3. [Pattern name]: [success_rate]% success, [N] uses
### Knowledge Base
- **Patterns Auto-Documented**: [N]
- **Commands Enhanced**: [N]
- **Agents Created**: [N]
- **Templates Generated**: [N]
---
## ✅ Code Quality
### Quality Metrics
- **Test Coverage**: [percentage]% ([trend] from 6 months ago)
- **Technical Debt**: [Decreasing/Stable/Increasing] [percentage]%/month
- **Documentation Accuracy**: [percentage]% (auto-validated)
- **Security Issues Caught Pre-Prod**: [percentage]% (last 3 months)
**Quality Trends** (6-month view):
    Test Coverage:     [▁▂▃▄▅▆▇█] [start]% → [end]%
    Tech Debt:         [█▇▆▅▄▃▂▁] [start] → [end] (↓ is good)
    Doc Accuracy:      [▁▂▃▄▅▆▇█] [start]% → [end]%
    Security Coverage: [▁▂▃▄▅▆▇█] [start]% → [end]%
**Code Health Indicators**:
- ✅ Technical debt: [Decreasing/Stable/Increasing] [percentage]%/month
- ✅ Test coverage: [direction] to [percentage]%
- ✅ Bug count: [direction] [percentage]% vs 6 months ago
- [✅/⚠️/🔴] Documentation drift: [description]
---
## 🧪 Active Experiments
### Running Experiments ([N])
**Experiment #1**: [Name]
- **Status**: Trial [X]/[N] ([percentage]% complete)
- **Hypothesis**: [Description]
- **Current Results**: [X]min saved avg (↑[percentage]% vs control)
- **Confidence**: [percentage]% (needs [N] more trials for significance)
- **Action**: [Continue/Stop/Deploy]
**Experiment #2**: [Name]
- [Same format]
### Recently Completed ([N])
**✅ [Experiment Name]** - Deployed [date]
- **Outcome**: [Success/Mixed/Failed]
- **ROI Achieved**: [X] hours/month saved
- **Status**: [In production]
**✅ [Experiment Name]** - Deployed [date]
- [Same format]
### Experiments Queue ([N] pending)
1. [Experiment name] - [confidence]% confidence, [ROI estimate]
2. [Experiment name] - [confidence]% confidence, [ROI estimate]
---
## 🎯 Predictions & Alerts
### High Confidence Predictions ([N])
⚠️ **[Issue Type] risk within [timeframe]**
- **Confidence**: [percentage]% (based on [N] similar past patterns)
- **Preventive Actions**: [X]/[N] complete ([percentage]%)
- **Estimated Impact if Not Prevented**: [X] hours debugging
- **Status**: [On Track/Behind/Blocked]
⚠️ **[Issue Type] risk within [timeframe]**
- [Same format]
### Medium Confidence Predictions ([N])
🔍 **[Issue Type] - Monitoring**
- **Confidence**: [percentage]%
- **Action**: [Investigation scheduled/Monitoring]
### Predictions Validated (Last 30 Days)
✅ **[Prediction Name]** ([date])
- **Outcome**: [Caught pre-production/Prevented]
- **Value**: Saved ~[X]hr debugging
- **Accuracy**: Prediction confidence was [percentage]%
✅ **[Prediction Name]** ([date])
- [Same format]
**Prediction Accuracy**: [percentage]% ([N] correct / [N] total)
**Trend**: [Improving/Stable/Declining]
---
## 📈 ROI Summary
### Investment
**Initial Setup**:
- Time spent: [X] hours
- Date started: [date]
- Age: [N] days
**Ongoing Maintenance**:
- Weekly: ~[X] hours
- Monthly: ~[X] hours
- Automation level: [percentage]% (↑ over time)
### Returns (Monthly Average)
**Time Savings**:
- Direct automation: [X] hours/month
- Improved velocity: [X] hours/month
- Prevented debugging: [X] hours/month
- **Total**: [X] hours/month
**Quality Improvements**:
- Bugs prevented: [N] ([~$X] value)
- Security issues caught: [N]
- Documentation drift prevented: [percentage]%
**Knowledge Captured**:
- Patterns documented: [N]
- Templates created: [N]
- Workflow optimizations: [N]
### ROI Calculation
**Monthly ROI**: [X] hours saved / [X] hours invested = **[X]x**
**Compound ROI** (Lifetime):
    Total time invested:  [X] hours
    Total time saved:     [X] hours
    Bugs prevented value: ~$[X]
    Knowledge value:      [N] reusable patterns
    Compound Multiplier:  [X]x (and growing)
**ROI Trend**:
    Month 1: 0.5x (investment phase)
    Month 2: 1.8x (early returns)
    Month 3: 4.2x (compound effects)
    Month 6: 9.4x (current)
**Break-Even**: Achieved in Month [N]
**Payback Period**: [N] weeks
---
## 📋 System Summary
### Quick Stats
- **Commands Executed**: [N] (last 30 days)
- **Most Used Command**: [command] ([percentage]%)
- **Most Effective Agent**: [agent] ([percentage]% success)
- **Patterns Detected**: [N]
- **Auto-Improvements**: [N] implemented
- **System Age**: [N] days
### Health Score: [N]/100
**Score Breakdown**:
- Velocity: [N]/20 ([description])
- Quality: [N]/20 ([description])
- Intelligence: [N]/20 ([description])
- ROI: [N]/20 ([description])
- Trend: [N]/20 ([description])
**Overall Status**: [🟢 Excellent | 🟡 Good | 🔴 Needs Improvement]
### Recommendations
**IMMEDIATE ACTION REQUIRED**:
[If any critical issues, list here]
**OPPORTUNITIES THIS WEEK**:
1. [Action item based on data]
2. [Action item based on data]
**LONG-TERM FOCUS**:
1. [Strategic recommendation]
2. [Strategic recommendation]
---
## 📊 Appendix: Detailed Metrics
### Telemetry Summary
- Total executions: [N]
- Success rate: [percentage]%
- Average duration: [X] seconds
- Files changed: [N] total
- Tests added: [N] total
### Historical Data Points
- Suggestions generated: [N]
- Suggestions implemented: [N] ([percentage]%)
- Suggestions rejected: [N] ([percentage]%)
- Average ROI accuracy: [percentage]% (estimated vs actual)
### System Configuration
- Meta-learning version: [version]
- Telemetry started: [date]
- Plugins installed: [list]
- Update frequency: [frequency]
---
**Dashboard Generated**: [timestamp]
**Next Update**: [scheduled time]
**Data Confidence**: [High/Medium/Low] (based on [N] data points)
**Actions**:
- Use `/meta_analyze` to deep dive into patterns
- Use `/meta_learn` to generate new improvement suggestions
- Use `/meta_implement` to deploy high-confidence improvements
- Use `/meta_predict` to see future risk predictions
```
### Phase 5: Publish Dashboard (if --publish flag set)
If `--publish` is true, save dashboard to a public location:
```bash
if [ "$PUBLISH" = true ]; then
echo ""
echo "📊 Publishing dashboard..."
# Create docs directory if it doesn't exist
DOCS_DIR="$META_PLUGIN_DIR/../../docs"
mkdir -p "$DOCS_DIR"
# Save dashboard
DASHBOARD_FILE="$DOCS_DIR/system-health-$(date +%Y%m%d).md"
# Dashboard content written by Write tool above
# Also create/update latest symlink
ln -sf "system-health-$(date +%Y%m%d).md" "$DOCS_DIR/system-health-latest.md"
echo "✅ Dashboard published to: $DASHBOARD_FILE"
echo "📄 Latest: $DOCS_DIR/system-health-latest.md"
# If GitHub Pages configured, could push to gh-pages branch
# git checkout gh-pages
# cp $DASHBOARD_FILE index.md
# git add index.md && git commit -m "Update health dashboard" && git push
fi
```
### Phase 6: Send Email Summary (if --send-summary-email flag set)
If `--send-summary-email` was passed, generate and send an email summary:
```bash
if [ "$SEND_EMAIL" = true ]; then
echo ""
echo "📧 Generating email summary..."
# Create condensed email version
EMAIL_SUBJECT="PSD Meta-Learning Health: [Status] - [Date]"
EMAIL_BODY="
System Health Summary - $(date +%Y-%m-%d)
🚀 VELOCITY: [X]x (↑[percentage]% vs baseline)
💰 ROI: [X]x compound multiplier
✅ QUALITY: [metrics summary]
🧠 INTELLIGENCE: [agent performance summary]
📊 THIS MONTH:
• [X] hours saved
• [N] auto-improvements implemented
• [N] bugs prevented
⚠️ ALERTS:
[List high-confidence predictions if any]
📈 TRENDS:
[Key positive trends]
🎯 RECOMMENDED ACTIONS:
[Top 3 action items]
Full dashboard: [link]
"
# Send via mail command or API
# echo "$EMAIL_BODY" | mail -s "$EMAIL_SUBJECT" hagelk@psd401.net
echo "✅ Email summary prepared"
echo " (Email sending requires mail configuration)"
fi
```
### Phase 7: Output Results
```bash
echo ""
echo "✅ Health dashboard generated!"
echo ""
if [ -n "$OUTPUT_FILE" ]; then
echo "📝 Saved to: $OUTPUT_FILE"
fi
if [ "$PUBLISH" = true ]; then
echo "📊 Published to docs/"
fi
if [ "$SEND_EMAIL" = true ]; then
echo "📧 Email summary prepared"
fi
echo ""
echo "Next steps:"
echo " • Review alerts and recommendations"
echo " • Act on immediate action items"
echo " • Track trends over time"
echo " • Share dashboard with stakeholders"
```
## Dashboard Generation Guidelines
### Data Aggregation Best Practices
**DO**:
- Calculate actual metrics from real data (don't estimate)
- Show trends with visual indicators (▁▂▃▄▅▆▇█, ↑↓, 🟢🟡🔴)
- Compare current vs baseline vs target
- Include confidence levels for predictions
- Provide actionable recommendations
- Track ROI with concrete numbers
**DON'T**:
- Show vanity metrics without context
- Include data without trends
- Make claims without evidence
- Overwhelm with too many metrics
- Ignore negative trends
- Present data without interpretation
### Handling Missing or Insufficient Data
If data is limited, clearly indicate:
```markdown
## 📊 LIMITED DATA AVAILABLE
**Current Status**:
- System age: [N] days (minimum 30 days recommended for trends)
- Executions: [N] (minimum 50+ for statistics)
- Data completeness: [percentage]%
**Available Metrics** (limited confidence):
[Show what metrics can be calculated]
**Unavailable Metrics** (insufficient data):
- Agent evolution (needs 3+ generations)
- Trend analysis (needs 30+ days)
- ROI accuracy (needs completed suggestions)
**Recommendation**:
Continue using the system for [N] more days to enable full dashboard.
**Preliminary Health**: [Basic metrics available]
```
### Trend Visualization
Use ASCII charts for quick visual trends:
```
Velocity over 12 weeks:
1.0x ████████
1.2x ██████████
1.5x ████████████
1.8x ██████████████
2.3x ██████████████████
ROI Compound Growth:
Month 1: ▁ 0.5x
Month 2: ▃ 1.8x
Month 3: ▅ 4.2x
Month 6: █ 9.4x
```
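The sparkline glyphs above can be generated rather than hand-picked. A minimal sketch that scales a space-separated series against its maximum:

```shell
# Sketch: turn a space-separated value series into a ▁▂▃▄▅▆▇█ sparkline
sparkline() {
  awk '{
    n = split($0, v, " "); max = v[1]
    for (i = 2; i <= n; i++) if (v[i] > max) max = v[i]
    split("▁ ▂ ▃ ▄ ▅ ▆ ▇ █", bars, " ")
    out = ""
    for (i = 1; i <= n; i++) {
      # map each value to one of 8 bar heights (index 1..8)
      idx = (max > 0) ? int(v[i] / max * 7) + 1 : 1
      out = out bars[idx]
    }
    print out
  }'
}

echo "1 2 4 8" | sparkline   # → ▁▂▄█
```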
### Health Score Calculation
**Formula**: Sum of weighted sub-scores
- **Velocity** (20 points): Based on time_saved and productivity increase
- 1.0-1.5x = 10 pts
- 1.5-2.0x = 15 pts
- 2.0x+ = 20 pts
- **Quality** (20 points): Based on test coverage, tech debt, documentation accuracy, and security
  - Each of the four metrics contributes up to 5 pts
- **Intelligence** (20 points): Based on agent evolution and patterns learned
- Agent improvement avg >20% = 15+ pts
- Patterns documented >50 = 15+ pts
- **ROI** (20 points): Based on compound multiplier
- 2-5x = 10 pts
- 5-10x = 15 pts
- 10x+ = 20 pts
- **Trend** (20 points): Based on direction of key metrics
- All improving = 20 pts
- Mixed = 10-15 pts
- Declining = 0-10 pts
**Total**: 0-100 points
- 80-100: 🟢 Excellent
- 60-79: 🟡 Good
- 40-59: 🔴 Needs Improvement
- <40: 🔴 Critical
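The banding can be expressed as a small helper so the dashboard and email summary always agree on the label:

```shell
# Sketch: map a 0-100 health score onto the status bands above
health_status() {
  if   [ "$1" -ge 80 ]; then echo "Excellent"
  elif [ "$1" -ge 60 ]; then echo "Good"
  elif [ "$1" -ge 40 ]; then echo "Needs Improvement"
  else                       echo "Critical"
  fi
}

health_status 72   # → Good
```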
## Important Notes
1. **Accuracy**: All metrics must be based on actual data, never invented
2. **Trends**: Show direction and magnitude of change
3. **Context**: Always provide baseline and target for comparison
4. **Actionable**: Include specific recommendations based on data
5. **Honest**: Don't hide negative trends or problems
6. **Visual**: Use symbols and charts for quick scanning
7. **Regular**: Generate the dashboard daily or weekly so trend data accumulates
## Example Usage Scenarios
### Scenario 1: Daily Health Check
```bash
/meta_health
```
Quick health overview in terminal.
### Scenario 2: Weekly Dashboard Publication
```bash
/meta_health --publish --output meta/health-$(date +%Y%m%d).md
```
Save dashboard and publish to docs.
### Scenario 3: Monthly Stakeholder Report
```bash
/meta_health --publish --send-summary-email
```
Full dashboard with email summary to stakeholders.
---
**Remember**: The health dashboard demonstrates compound engineering value. Show concrete ROI, track trends over time, and provide actionable insights that drive continuous improvement.