---
name: Orchestration QA
description: Quality assurance for orchestration workflows - validates Skills and Subagents follow documented patterns, tracks deviations, suggests improvements
---
# Orchestration QA Skill
## Overview
This skill provides quality assurance for Task Orchestrator workflows by validating that Skills and Subagents follow their documented patterns, detecting deviations, and suggesting continuous improvements.
**Key Capabilities:**
- **Interactive configuration** - User chooses which analyses to enable (token efficiency)
- **Pre-execution validation** - Context capture, checkpoint setting
- **Post-execution review** - Workflow adherence, output validation
- **Specialized quality analysis** - Execution graphs, tag coverage, information density
- **Efficiency analysis** - Token optimization, tool selection, parallelization
- **Deviation reporting** - Structured findings with severity (ALERT/WARN/INFO)
- **Pattern tracking** - Continuous improvement suggestions
**Philosophy:**
- **User-driven configuration** - Pay token costs only for analyses you want
- **Observe and validate** - Never blocks execution
- **Report transparently** - Clear severity levels (ALERT/WARN/INFO)
- **Learn from patterns** - Track issues, suggest improvements
- **Progressive loading** - Load only the analysis needed for the context
- **Not a blocker** - Warns about issues, doesn't stop workflows
- **Not auto-fix** - Asks user for decisions on deviations
## When to Use This Skill
### Interactive Configuration (FIRST TIME)
**Trigger**: First time using orchestration-qa in a session, or when user wants to change settings
**Action**: Ask user which analysis categories to enable (multiselect interface)
**Output**: Configuration stored in session, used for all subsequent reviews
**User Value**: Only pay token costs for analyses you actually want
### Session Initialization
**Trigger**: After configuration, at start of orchestration session
**Action**: Load knowledge bases (Skills, Subagents, routing config) based on enabled categories
**Output**: Initialization status with active configuration, ready signal
### Pre-Execution Validation
**Triggers**:
- "Create feature for X" (before Feature Orchestration Skill or Feature Architect)
- "Execute tasks" (before Task Orchestration Skill)
- "Mark complete" (before Status Progression Skill)
- Before launching any Skill or Subagent
**Action**: Capture context, set validation checkpoints
**Output**: Stored context for post-execution comparison
### Post-Execution Review
**Triggers**:
- After any Skill completes
- After any Subagent returns
- User asks: "Review quality", "Show QA results", "Any issues?"
**Action**: Validate workflow adherence, analyze quality, detect deviations
**Output**: Structured quality report with findings and recommendations
## Parameters
```typescript
{
  phase: "init" | "pre" | "post" | "configure",

  // For pre/post phases
  entityType?: "feature-orchestration" | "task-orchestration" |
               "status-progression" | "dependency-analysis" |
               "feature-architect" | "planning-specialist" |
               "backend-engineer" | "frontend-developer" |
               "database-engineer" | "test-engineer" |
               "technical-writer" | "bug-triage-specialist",

  // For pre phase
  userInput?: string,        // Original user request

  // For post phase
  entityOutput?: string,     // Output from Skill/Subagent
  entityId?: string,         // Feature/Task/Project ID (if applicable)

  // Optional
  verboseReporting?: boolean // Default: false (brief reports)
}
```
## Workflow
### Phase: configure (Interactive Configuration) - **ALWAYS RUN FIRST**
**Purpose**: Let user choose which analysis categories to enable for the session
**When**: Before init phase, or when user wants to change settings mid-session
**Interactive Prompts**:
Use AskUserQuestion to present configuration options:
```javascript
AskUserQuestion({
  questions: [
    {
      question: "Which quality analysis categories would you like to enable for this session?",
      header: "QA Categories",
      multiSelect: true,
      options: [
        {
          label: "Information Density",
          description: "Analyze task content quality, detect wasteful patterns, measure information-to-token ratio (Specialists only)"
        },
        {
          label: "Execution Graphs",
          description: "Validate dependency graphs and parallel execution opportunities (Planning Specialist only)"
        },
        {
          label: "Tag Coverage",
          description: "Check tag consistency and agent-mapping coverage (Planning Specialist & Feature Architect)"
        },
        {
          label: "Token Optimization",
          description: "Identify token waste patterns (verbose output, unnecessary loading, redundant operations)"
        },
        {
          label: "Tool Selection",
          description: "Verify optimal tool usage (overview vs get, search vs filtered query, bulk operations)"
        },
        {
          label: "Routing Validation",
          description: "Detect Skills bypass violations (CRITICAL - status changes, feature creation, task execution)"
        },
        {
          label: "Parallel Detection",
          description: "Find missed parallelization opportunities (independent tasks, batch operations)"
        }
      ]
    },
    {
      question: "How detailed should QA reports be?",
      header: "Report Style",
      multiSelect: false,
      options: [
        {
          label: "Brief",
          description: "Only show critical issues (ALERT level) - minimal token usage"
        },
        {
          label: "Standard",
          description: "Show ALERT and WARN level issues with brief explanations"
        },
        {
          label: "Detailed",
          description: "Show all issues (ALERT/WARN/INFO) with full analysis and recommendations"
        }
      ]
    }
  ]
})
```
**Default Configuration** (if user skips configuration):
- ✅ Routing Validation (CRITICAL - always enabled)
- ✅ Information Density (for specialists)
- ❌ All other categories disabled
- Report style: Standard
**Configuration Storage**:
Store user preferences in session state:
```javascript
session.qaConfig = {
  enabled: {
    informationDensity: true/false,
    executionGraphs: true/false,
    tagCoverage: true/false,
    tokenOptimization: true/false,
    toolSelection: true/false,
    routingValidation: true,      // Always true (CRITICAL)
    parallelDetection: true/false
  },
  reportStyle: "brief" | "standard" | "detailed"
}
```
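For readers who want an explicit type, a minimal TypeScript sketch of this session state follows; the `QAConfig` and `defaultQAConfig` names are illustrative, not part of the skill contract, and the defaults mirror the Default Configuration list above.
```typescript
type ReportStyle = "brief" | "standard" | "detailed";

// Hypothetical type for the stored configuration; field names mirror session.qaConfig above.
interface QAConfig {
  enabled: {
    informationDensity: boolean;
    executionGraphs: boolean;
    tagCoverage: boolean;
    tokenOptimization: boolean;
    toolSelection: boolean;
    routingValidation: true;   // always enabled (CRITICAL)
    parallelDetection: boolean;
  };
  reportStyle: ReportStyle;
}

// Configuration used when the user skips the configure phase (see Default Configuration above).
const defaultQAConfig: QAConfig = {
  enabled: {
    informationDensity: true,
    executionGraphs: false,
    tagCoverage: false,
    tokenOptimization: false,
    toolSelection: false,
    routingValidation: true,
    parallelDetection: false,
  },
  reportStyle: "standard",
};
```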
**Token Cost**: ~200-300 tokens (one-time configuration)
### Phase: init (Session Initialization)
**Purpose**: Load knowledge bases for validation throughout session
**Steps**:
1. **If not configured**: Run configure phase first (interactive)
2. Read `initialization.md` for setup workflow
3. Glob `.claude/skills/*/SKILL.md` → extract Skills knowledge
- Parse skill name, triggers, workflows, tools, token ranges
4. Glob `.claude/agents/task-orchestrator/*.md` → extract Subagents knowledge
- Parse agent name, steps, critical patterns, output validation
5. Read `agent-mapping.yaml` → extract routing configuration
6. Initialize tracking state (deviations, patterns, improvements)
7. Report initialization status with active configuration
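The skill performs steps 3-5 with the agent's Glob and Read tools; purely as an illustration of what "extract Skills knowledge" means, here is a hedged TypeScript sketch that parses the frontmatter of each SKILL.md (the `glob` dependency and helper names are assumptions, not the actual implementation).
```typescript
import { readFileSync } from "node:fs";
import { globSync } from "glob"; // assumption: the `glob` npm package is available

interface SkillEntry { name: string; description: string; path: string; }

// Pull simple `key: value` fields out of a SKILL.md YAML frontmatter block.
function parseFrontmatter(markdown: string): Record<string, string> {
  const match = markdown.match(/^---\n([\s\S]*?)\n---/);
  const fields: Record<string, string> = {};
  if (!match) return fields;
  for (const line of match[1].split("\n")) {
    const idx = line.indexOf(":");
    if (idx > 0) fields[line.slice(0, idx).trim()] = line.slice(idx + 1).trim();
  }
  return fields;
}

// Step 3: one entry per Skill, keyed by the frontmatter name.
function loadSkillKnowledge(): SkillEntry[] {
  return globSync(".claude/skills/*/SKILL.md").map((path) => {
    const fm = parseFrontmatter(readFileSync(path, "utf8"));
    return { name: fm["name"] ?? path, description: fm["description"] ?? "", path };
  });
}
```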
**Output**:
```javascript
{
  initialized: true,
  knowledgeBase: {
    skillsCount: 5,
    subagentsCount: 8,
    routingLoaded: true
  },
  configuration: {
    enabled: ["Information Density", "Routing Validation"],
    disabled: ["Execution Graphs", "Tag Coverage", "Token Optimization", "Tool Selection", "Parallel Detection"],
    reportStyle: "standard"
  },
  tracking: {
    sessionStart: timestamp,
    deviations: [],
    patterns: []
  }
}
```
**Token Cost**: ~800-1000 tokens (loads once per session)
### Phase: pre (Pre-Execution Validation)
**Purpose**: Capture context and set validation checkpoints before launching
**Steps**:
1. Read `pre-execution.md` for validation checklist
2. Identify entity type (Skill vs Subagent)
3. Capture original user input context
4. Set entity-specific validation checkpoints based on type:
- **Skills**: Expected workflow steps, tool usage, token range
- **Subagents**: Expected lifecycle steps (8-9), critical patterns, output format
5. Store context for post-execution comparison
6. Return ready signal
**Context Captured**:
- User's original request (full text)
- Expected mode (PRD/Interactive/Quick for Feature Architect)
- Entity type and anticipated complexity
- Validation checkpoints to verify after execution
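As a sketch only, the captured context might be stored as a record like the one below; the `PreExecutionContext` and `CHECKPOINTS_BY_ENTITY` names are hypothetical, and real checkpoints come from the knowledge base loaded at init.
```typescript
// Hypothetical shape for context captured during the pre phase.
interface PreExecutionContext {
  entityType: string;                              // e.g. "feature-architect"
  userInput: string;                               // user's original request, full text
  expectedMode?: "PRD" | "Interactive" | "Quick";  // Feature Architect only
  checkpoints: string[];                           // verified during the post phase
  capturedAt: number;                              // timestamp for the session log
}

// Illustrative checkpoint table (entries abridged).
const CHECKPOINTS_BY_ENTITY: Record<string, string[]> = {
  "feature-architect": [
    "Verify Skill assessed complexity correctly",
    "Verify templates discovered and applied",
  ],
  "backend-engineer": [
    "Verify Status Progression Skill used for status changes",
    "Verify summary is 300-500 chars with Files Changed section",
  ],
};
```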
**Output**:
```javascript
{
  ready: true,
  contextCaptured: true,
  checkpoints: [
    "Verify Skill assessed complexity correctly",
    "Verify templates discovered and applied",
    // ... entity-specific checkpoints
  ]
}
```
**Token Cost**: ~400-600 tokens
### Phase: post (Post-Execution Review)
**Purpose**: Validate workflow adherence, analyze quality, detect deviations
**Steps**:
#### 1. Load Post-Execution Workflow
Read `post-execution.md` for review process
#### 2. Determine Required Analyses
Based on entity type AND user configuration:
**Planning Specialist**:
- Always: `post-execution.md` → core workflow validation
- If `routingValidation` enabled: `routing-validation.md` → Skills usage check
- If `executionGraphs` enabled: `graph-quality.md` → execution graph validation
- If `tagCoverage` enabled: `tag-quality.md` → tag coverage analysis
**Feature Architect**:
- Always: `post-execution.md` → PRD extraction validation
- Always: Compare output vs original user input
- If `routingValidation` enabled: `routing-validation.md` → agent-mapping check
- If `tagCoverage` enabled: `tag-quality.md` → tag consistency check
**Implementation Specialists** (Backend, Frontend, Database, Test, Technical Writer):
- Always: `post-execution.md` → lifecycle steps verification
- If `routingValidation` enabled: `routing-validation.md` → Status Progression Skill usage
- If `informationDensity` enabled: `task-content-quality.md` → content quality analysis
- Always: Verify summary (300-500 chars), Files Changed section, test results
**All Skills**:
- Always: Read skill definition from knowledge base
- Always: Verify expected workflow steps followed
- Always: Check tool usage matches expected patterns
- Always: Validate token range
#### 3. Conditional Efficiency Analysis
Based on user configuration:
- If `tokenOptimization` enabled: Read `token-optimization.md` → identify token waste
- If `toolSelection` enabled: Read `tool-selection.md` → verify optimal tool usage
- If `parallelDetection` enabled: Read `parallel-detection.md` → find missed parallelization
#### 4. Deviation Detection
Compare actual execution against expected patterns:
- **ALERT**: Critical violations (status bypass, cross-domain tasks, missing requirements)
- **WARN**: Process issues (verbose output, skipped steps, suboptimal dependencies)
- **INFO**: Observations (efficiency opportunities, quality patterns)
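A single finding can be represented as a small typed record; the sketch below simply mirrors the severity levels above and the report fields used in the output example that follows.
```typescript
type Severity = "ALERT" | "WARN" | "INFO";

// Sketch of one QA finding; field names mirror the post-phase output example.
interface Deviation {
  severity: Severity;
  issue: string;            // e.g. "Cross-domain task detected"
  details: string;          // what was observed
  expected?: string;        // what the documented pattern says should happen
  recommendation: string;   // how to fix or prevent it
}
```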
#### 5. Reporting
If deviations found:
- Read `deviation-templates.md` → format report
- Add to TodoWrite with appropriate severity
- If ALERT: Report immediately to user with decision prompt
- If WARN: Log for end-of-session summary
- If INFO: Track for pattern analysis
#### 6. Pattern Tracking
Read `pattern-tracking.md` → continuous improvement:
- Check for recurring issues (count >= 2 in session)
- Suggest definition improvements if patterns detected
- Track for session summary
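A hedged sketch of the recurrence check; the threshold of two comes from the step above, and grouping by the issue text is an assumption about how findings are keyed.
```typescript
// Flag any issue observed at least `threshold` times in the current session.
// Accepts any finding object that carries an `issue` string.
function recurringIssues(findings: Array<{ issue: string }>, threshold = 2): string[] {
  const counts = new Map<string, number>();
  for (const f of findings) {
    counts.set(f.issue, (counts.get(f.issue) ?? 0) + 1);
  }
  return [...counts.entries()]
    .filter(([, count]) => count >= threshold)
    .map(([issue]) => issue);
}
```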
**Output**:
```javascript
{
  workflowAdherence: "8/8 steps followed (100%)",
  expectedOutputs: "7/7 present",
  deviations: [
    {
      severity: "ALERT",
      issue: "Cross-domain task detected",
      details: "Task mixes backend + frontend",
      recommendation: "Split into domain-isolated tasks"
    }
  ],
  analyses: {
    graphQuality: "95%",
    tagCoverage: "100%",
    tokenEfficiency: "85%"
  },
  recommendations: [
    "Update planning-specialist.md to enforce domain isolation",
    "Add validation checklist for cross-domain detection"
  ]
}
```
**Token Cost**:
- Basic validation: ~600-800 tokens
- With specialized analysis (Planning Specialist): ~1500-2000 tokens
- With efficiency analysis: +800-1200 tokens
## Progressive Loading Strategy
**Optimization**: Load only the analysis docs needed based on entity type AND user configuration
### Configuration-Driven Loading
**Core Loading** (always loaded regardless of config):
- `post-execution.md` → base workflow validation
- Skill/Subagent definition from knowledge base
- Entity-specific mandatory checks (summary, files changed, etc.)
**Conditional Loading** (based on user configuration):
```javascript
// Planning Specialist
if (config.routingValidation) Read routing-validation.md
if (config.executionGraphs) Read graph-quality.md
if (config.tagCoverage) Read tag-quality.md
// Feature Architect
if (config.routingValidation) Read routing-validation.md
if (config.tagCoverage) Read tag-quality.md
// Implementation Specialists (Backend, Frontend, Database, Test, Technical Writer)
if (config.routingValidation) Read routing-validation.md
if (config.informationDensity) Read task-content-quality.md
// All Entities
if (config.tokenOptimization) Read token-optimization.md
if (config.toolSelection) Read tool-selection.md
if (config.parallelDetection) Read parallel-detection.md
// Reporting
if (deviations.length > 0) Read deviation-templates.md
if (session.deviations.count >= 2) Read pattern-tracking.md
```
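Read as a selection function, the conditional loading above looks roughly like the sketch below (`selectAnalysisDocs` is a hypothetical helper; the doc names are the ones this skill already defines, and the `enabled` map matches the configuration sketch shown earlier).
```typescript
const IMPLEMENTATION_SPECIALISTS = new Set([
  "backend-engineer", "frontend-developer", "database-engineer",
  "test-engineer", "technical-writer",
]);

// Returns the supporting docs to read for one post-execution review.
function selectAnalysisDocs(entityType: string, enabled: Record<string, boolean>): string[] {
  const docs = ["post-execution.md"]; // core workflow validation, always loaded

  if (enabled.routingValidation) docs.push("routing-validation.md");

  if (entityType === "planning-specialist") {
    if (enabled.executionGraphs) docs.push("graph-quality.md");
    if (enabled.tagCoverage) docs.push("tag-quality.md");
  } else if (entityType === "feature-architect") {
    if (enabled.tagCoverage) docs.push("tag-quality.md");
  } else if (IMPLEMENTATION_SPECIALISTS.has(entityType) && enabled.informationDensity) {
    docs.push("task-content-quality.md");
  }

  if (enabled.tokenOptimization) docs.push("token-optimization.md");
  if (enabled.toolSelection) docs.push("tool-selection.md");
  if (enabled.parallelDetection) docs.push("parallel-detection.md");
  return docs;
}
```
`deviation-templates.md` and `pattern-tracking.md` remain on-demand reads, as in the last two lines of the loading block above.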
### Token Savings Examples
**Example 1: User only wants Information Density feedback**
- Configuration: Only "Information Density" enabled
- Loaded for Backend Engineer: `post-execution.md` + `task-content-quality.md` = ~1,200 tokens
- Skipped: `routing-validation.md`, `token-optimization.md`, `tool-selection.md`, `parallel-detection.md` = ~2,400 tokens saved
- **Savings: 67% reduction**
**Example 2: User wants minimal CRITICAL validation only**
- Configuration: Only "Routing Validation" enabled
- Loaded: `post-execution.md` + `routing-validation.md` = ~1,000 tokens
- Skipped: All other analysis docs = ~3,500 tokens saved
- **Savings: 78% reduction**
**Example 3: User wants comprehensive Planning Specialist review**
- Configuration: All categories enabled
- Loaded: `post-execution.md` + `graph-quality.md` + `tag-quality.md` + `routing-validation.md` + efficiency docs = ~3,500 tokens
- Skipped: None (comprehensive mode)
- **Savings: 0% (full analysis)**
### Special Cases
**Task Orchestration Skill**:
- `parallel-detection.md` always loaded if enabled in config (core to this skill's purpose)
**Status Progression Skill**:
- `routing-validation.md` always loaded if enabled in config (CRITICAL - status bypass detection)
## Output Format
### Success (No Deviations)
```markdown
**QA Review**: [Entity Name]
Workflow adherence: 100%
All quality checks passed.
[If efficiency analysis enabled:]
Token efficiency: 85% (identified 2 optimization opportunities)
```
### Issues Found
```markdown
## QA Review: [Entity Name]
**Workflow Adherence:** X/Y steps (Z%)
### ✅ Successes
- [Success 1]
- [Success 2]
### ⚠️ Issues Detected
**🚨 ALERT**: [Critical issue]
- Impact: [What this affects]
- Found: [What was observed]
- Expected: [What should have happened]
- Recommendation: [How to fix]
**⚠️ WARN**: [Process issue]
- Found: [What was observed]
- Expected: [What should have happened]
### 📋 Added to TodoWrite
- Review [Entity]: [Issue description]
- Improvement: [Suggestion]
### 🎯 Recommendations
1. [Most critical action]
2. [Secondary action]
### 💭 Decision Required
[If user decision needed, present options]
```
## Integration with Orchestrator
**Recommended Pattern**:
```javascript
// 1. FIRST TIME: Interactive configuration
Use orchestration-qa skill (phase="configure")
// Agent asks user which analysis categories to enable
// User selects: "Information Density" + "Routing Validation"
// Configuration stored in session
// 2. Session initialization
Use orchestration-qa skill (phase="init")
// Returns: Initialized with [2] analysis categories enabled
// 3. Before launching Feature Architect
Use orchestration-qa skill (
phase="pre",
entityType="feature-architect",
userInput="[user's original request]"
)
// 4. Launch Feature Architect
Task(subagent_type="Feature Architect", prompt="...")
// 5. After Feature Architect returns
Use orchestration-qa skill (
phase="post",
entityType="feature-architect",
entityOutput="[subagent's response]",
entityId="feature-uuid"
)
// Only loads: post-execution.md + routing-validation.md (user config)
// Skips: graph-quality.md, tag-quality.md, token-optimization.md (not enabled)
// 6. Review QA findings, take action if needed
```
**Mid-Session Reconfiguration**:
```javascript
// User: "I want to also track token optimization now"
Use orchestration-qa skill (phase="configure")
// Agent asks again, pre-selects current config
// User adds "Token Optimization" to enabled categories
// New config stored, affects all subsequent post-execution reviews
```
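In state terms, reconfiguration is only a merge into the stored config; a minimal sketch, assuming the session-state shape shown in the configuration section:
```typescript
// `session` stands in for the session state shown earlier in this document.
declare const session: { qaConfig: { enabled: Record<string, boolean>; reportStyle: string } };

// Enable one additional category without touching the rest of the configuration.
session.qaConfig = {
  ...session.qaConfig,
  enabled: { ...session.qaConfig.enabled, tokenOptimization: true },
};
```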
## Supporting Documentation
This skill uses progressive loading to minimize token usage. Supporting docs are read as needed:
- **initialization.md** - Session setup workflow
- **pre-execution.md** - Context capture and checkpoint setting
- **post-execution.md** - Core review workflow for all entities
- **graph-quality.md** - Planning Specialist: execution graph analysis
- **tag-quality.md** - Planning Specialist: tag coverage validation
- **task-content-quality.md** - Implementation Specialists: information density and wasteful pattern detection
- **token-optimization.md** - Efficiency: identify token waste patterns
- **tool-selection.md** - Efficiency: verify optimal tool usage
- **parallel-detection.md** - Efficiency: find missed parallelization
- **routing-validation.md** - Critical: Skills vs Direct tool violations
- **deviation-templates.md** - User report formatting by severity
- **pattern-tracking.md** - Continuous improvement tracking
## Token Efficiency
**Current Trainer** (monolithic): ~20k-30k tokens always loaded
**Orchestration QA Skill** (configuration-driven progressive loading):
- Configure phase: ~200-300 tokens (one-time, interactive)
- Init phase: ~1000 tokens (one-time per session)
- Pre-execution: ~600 tokens (per entity)
- Post-execution (varies by configuration):
- **Minimal** (routing only): ~800-1000 tokens
- **Standard** (info density + routing): ~1200-1500 tokens
- **Planning Specialist** (graphs + tags + routing): ~2000-2500 tokens
- **Comprehensive** (all categories): ~3500-4000 tokens
**Configuration Impact Examples**:
| User Configuration | Token Cost | vs Monolithic | vs Default |
|-------------------|------------|---------------|------------|
| Information Density only | ~1,200 tokens | 94% savings | 67% savings |
| Routing Validation only | ~1,000 tokens | 95% savings | 78% savings |
| Default (Info + Routing) | ~1,500 tokens | 93% savings | baseline |
| Comprehensive (all enabled) | ~4,000 tokens | 80% savings | +167% cost |
**Smart Defaults**: Most users only need Information Density + Routing Validation, achieving a 93% token reduction versus the monolithic approach while still catching critical routing issues and wasteful content.
## Quality Metrics
Track these metrics across sessions:
- Workflow adherence percentage
- Deviation count by severity (ALERT/WARN/INFO)
- Pattern recurrence (same issue multiple times)
- Definition improvement suggestions generated
- Token efficiency of analyzed workflows
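A sketch of how these metrics could be accumulated as session state; the interface name and field layout are illustrative.
```typescript
// Illustrative accumulator for the metrics listed above.
interface SessionQualityMetrics {
  workflowAdherencePct: number[];                   // one entry per reviewed entity
  deviationCounts: { ALERT: number; WARN: number; INFO: number };
  recurringIssues: Record<string, number>;          // issue text -> occurrences this session
  improvementSuggestions: string[];                 // definition updates proposed
  tokenEfficiencyPct: number[];                     // per analyzed workflow
}
```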
## Examples
See `examples.md` for detailed usage scenarios including:
- **Interactive configuration** - Choosing analysis categories
- **Session initialization** - Loading knowledge bases with config
- **Feature Architect validation** - PRD mode with selective analysis
- **Planning Specialist review** - Graph + tag analysis (when enabled)
- **Implementation Specialist review** - Information density tracking
- **Status Progression enforcement** - Critical routing violations
- **Mid-session reconfiguration** - Changing enabled categories
- **Token efficiency comparisons** - Different configuration impacts