Initial commit
573
skills/orchestration-qa/SKILL.md
Normal file
@@ -0,0 +1,573 @@
---
name: Orchestration QA
description: Quality assurance for orchestration workflows - validates Skills and Subagents follow documented patterns, tracks deviations, suggests improvements
---
|
||||
# Orchestration QA Skill
|
||||
|
||||
## Overview
|
||||
|
||||
This skill provides quality assurance for Task Orchestrator workflows by validating that Skills and Subagents follow their documented patterns, detecting deviations, and suggesting continuous improvements.
|
||||
|
||||
**Key Capabilities:**
|
||||
- **Interactive configuration** - User chooses which analyses to enable (token efficiency)
|
||||
- **Pre-execution validation** - Context capture, checkpoint setting
|
||||
- **Post-execution review** - Workflow adherence, output validation
|
||||
- **Specialized quality analysis** - Execution graphs, tag coverage, information density
|
||||
- **Efficiency analysis** - Token optimization, tool selection, parallelization
|
||||
- **Deviation reporting** - Structured findings with severity (ALERT/WARN/INFO)
|
||||
- **Pattern tracking** - Continuous improvement suggestions
|
||||
|
||||
**Philosophy:**
|
||||
- ✅ **User-driven configuration** - Pay token costs only for analyses you want
|
||||
- ✅ **Observe and validate** - Never blocks execution
|
||||
- ✅ **Report transparently** - Clear severity levels (ALERT/WARN/INFO)
|
||||
- ✅ **Learn from patterns** - Track issues, suggest improvements
|
||||
- ✅ **Progressive loading** - Load only the analyses needed for the current context
|
||||
- ❌ **Not a blocker** - Warns about issues, doesn't stop workflows
|
||||
- ❌ **Not auto-fix** - Asks user for decisions on deviations
|
||||
|
||||
## When to Use This Skill
|
||||
|
||||
### Interactive Configuration (FIRST TIME)
|
||||
**Trigger**: First time using orchestration-qa in a session, or when user wants to change settings
|
||||
**Action**: Ask user which analysis categories to enable (multiselect interface)
|
||||
**Output**: Configuration stored in session, used for all subsequent reviews
|
||||
**User Value**: Only pay token costs for analyses you actually want
|
||||
|
||||
### Session Initialization
|
||||
**Trigger**: After configuration, at start of orchestration session
|
||||
**Action**: Load knowledge bases (Skills, Subagents, routing config) based on enabled categories
|
||||
**Output**: Initialization status with active configuration, ready signal
|
||||
|
||||
### Pre-Execution Validation
|
||||
**Triggers**:
|
||||
- "Create feature for X" (before Feature Orchestration Skill or Feature Architect)
|
||||
- "Execute tasks" (before Task Orchestration Skill)
|
||||
- "Mark complete" (before Status Progression Skill)
|
||||
- Before launching any Skill or Subagent
|
||||
|
||||
**Action**: Capture context, set validation checkpoints
|
||||
**Output**: Stored context for post-execution comparison
|
||||
|
||||
### Post-Execution Review
|
||||
**Triggers**:
|
||||
- After any Skill completes
|
||||
- After any Subagent returns
|
||||
- User asks: "Review quality", "Show QA results", "Any issues?"
|
||||
|
||||
**Action**: Validate workflow adherence, analyze quality, detect deviations
|
||||
**Output**: Structured quality report with findings and recommendations
|
||||
|
||||
## Parameters
|
||||
|
||||
```typescript
|
||||
{
  phase: "init" | "pre" | "post" | "configure",

  // For pre/post phases
  entityType?: "feature-orchestration" | "task-orchestration" |
               "status-progression" | "dependency-analysis" |
               "feature-architect" | "planning-specialist" |
               "backend-engineer" | "frontend-developer" |
               "database-engineer" | "test-engineer" |
               "technical-writer" | "bug-triage-specialist",

  // For pre phase
  userInput?: string,           // Original user request

  // For post phase
  entityOutput?: string,        // Output from Skill/Subagent
  entityId?: string,            // Feature/Task/Project ID (if applicable)

  // Optional
  verboseReporting?: boolean    // Default: false (brief reports)
}
|
||||
```
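
The parameter shape above can be checked before the skill is invoked. The following is a minimal sketch, not part of the skill itself: `validateQaParams` is a hypothetical helper, and the rule that `pre` needs `entityType` plus `userInput` while `post` needs `entityType` plus `entityOutput` is an assumption inferred from the comments in the type definition.

```javascript
// Hypothetical helper: returns a list of problems found in a parameter object.
const PHASES = ["configure", "init", "pre", "post"];

function validateQaParams(params) {
  const problems = [];
  if (!PHASES.includes(params.phase)) {
    problems.push(`unknown phase: ${params.phase}`);
    return problems;
  }
  // Assumed requirements, inferred from the "For pre/post phase" comments.
  const required = {
    pre: ["entityType", "userInput"],
    post: ["entityType", "entityOutput"],
  }[params.phase] || [];
  for (const key of required) {
    if (typeof params[key] !== "string" || params[key].length === 0) {
      problems.push(`${params.phase} phase requires ${key}`);
    }
  }
  return problems;
}
```

An empty array means the parameters look usable for the requested phase; `configure` and `init` need nothing beyond `phase` in this sketch.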
|
||||
|
||||
## Workflow
|
||||
|
||||
### Phase: configure (Interactive Configuration) - **ALWAYS RUN FIRST**
|
||||
|
||||
**Purpose**: Let user choose which analysis categories to enable for the session
|
||||
|
||||
**When**: Before init phase, or when user wants to change settings mid-session
|
||||
|
||||
**Interactive Prompts**:
|
||||
|
||||
Use AskUserQuestion to present configuration options:
|
||||
|
||||
```javascript
|
||||
AskUserQuestion({
|
||||
questions: [
|
||||
{
|
||||
question: "Which quality analysis categories would you like to enable for this session?",
|
||||
header: "QA Categories",
|
||||
multiSelect: true,
|
||||
options: [
|
||||
{
|
||||
label: "Information Density",
|
||||
description: "Analyze task content quality, detect wasteful patterns, measure information-to-token ratio (Specialists only)"
|
||||
},
|
||||
{
|
||||
label: "Execution Graphs",
|
||||
description: "Validate dependency graphs and parallel execution opportunities (Planning Specialist only)"
|
||||
},
|
||||
{
|
||||
label: "Tag Coverage",
|
||||
description: "Check tag consistency and agent-mapping coverage (Planning Specialist & Feature Architect)"
|
||||
},
|
||||
{
|
||||
label: "Token Optimization",
|
||||
description: "Identify token waste patterns (verbose output, unnecessary loading, redundant operations)"
|
||||
},
|
||||
{
|
||||
label: "Tool Selection",
|
||||
description: "Verify optimal tool usage (overview vs get, search vs filtered query, bulk operations)"
|
||||
},
|
||||
{
|
||||
label: "Routing Validation",
|
||||
description: "Detect Skills bypass violations (CRITICAL - status changes, feature creation, task execution)"
|
||||
},
|
||||
{
|
||||
label: "Parallel Detection",
|
||||
description: "Find missed parallelization opportunities (independent tasks, batch operations)"
|
||||
}
|
||||
]
|
||||
},
|
||||
{
|
||||
question: "How detailed should QA reports be?",
|
||||
header: "Report Style",
|
||||
multiSelect: false,
|
||||
options: [
|
||||
{
|
||||
label: "Brief",
|
||||
description: "Only show critical issues (ALERT level) - minimal token usage"
|
||||
},
|
||||
{
|
||||
label: "Standard",
|
||||
description: "Show ALERT and WARN level issues with brief explanations"
|
||||
},
|
||||
{
|
||||
label: "Detailed",
|
||||
description: "Show all issues (ALERT/WARN/INFO) with full analysis and recommendations"
|
||||
}
|
||||
]
|
||||
}
|
||||
]
|
||||
})
|
||||
```
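
The three report styles differ only in which severities they surface. A small sketch of that filter follows; `filterByReportStyle` and the shape of a deviation object (`{ severity, issue }`) are illustrative assumptions, and falling back to the standard style for an unknown value is likewise an assumption.

```javascript
// Severities shown for each report style, per the option descriptions above.
const VISIBLE_SEVERITIES = {
  brief: ["ALERT"],
  standard: ["ALERT", "WARN"],
  detailed: ["ALERT", "WARN", "INFO"],
};

// Hypothetical helper: keep only the deviations the chosen style should display.
function filterByReportStyle(deviations, reportStyle) {
  // Assumption: an unrecognized style falls back to "standard".
  const visible = VISIBLE_SEVERITIES[reportStyle] || VISIBLE_SEVERITIES.standard;
  return deviations.filter((d) => visible.includes(d.severity));
}
```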
|
||||
|
||||
**Default Configuration** (if user skips configuration):
|
||||
- ✅ Routing Validation (CRITICAL - always enabled)
|
||||
- ✅ Information Density (for specialists)
|
||||
- ❌ All other categories disabled
|
||||
- Report style: Standard
|
||||
|
||||
**Configuration Storage**:
|
||||
Store user preferences in session state:
|
||||
```javascript
|
||||
// Each flag is a boolean set from the user's selections; the values shown are the defaults.
session.qaConfig = {
  enabled: {
    informationDensity: true,
    executionGraphs: false,
    tagCoverage: false,
    tokenOptimization: false,
    toolSelection: false,
    routingValidation: true,   // Always true (CRITICAL)
    parallelDetection: false
  },
  reportStyle: "standard"      // One of "brief" | "standard" | "detailed"
};
|
||||
```
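
As a rough illustration of how the multiselect answers could become the stored configuration, the sketch below maps option labels to config keys and applies the documented defaults when the user skips configuration. `buildQaConfig` and `LABEL_TO_KEY` are hypothetical names, and the shape of the answer (an array of selected labels) is an assumption.

```javascript
// Maps the option labels shown to the user onto the config keys used in session state.
const LABEL_TO_KEY = {
  "Information Density": "informationDensity",
  "Execution Graphs": "executionGraphs",
  "Tag Coverage": "tagCoverage",
  "Token Optimization": "tokenOptimization",
  "Tool Selection": "toolSelection",
  "Routing Validation": "routingValidation",
  "Parallel Detection": "parallelDetection",
};

// Hypothetical helper: selectedLabels is null/undefined when the user skips configuration.
function buildQaConfig(selectedLabels, reportStyle) {
  const labels = selectedLabels || ["Routing Validation", "Information Density"];
  const enabled = {};
  for (const [label, key] of Object.entries(LABEL_TO_KEY)) {
    enabled[key] = labels.includes(label);
  }
  enabled.routingValidation = true; // Always on (CRITICAL), even if not selected
  return { enabled, reportStyle: (reportStyle || "standard").toLowerCase() };
}
```

Calling `buildQaConfig(null)` yields the default configuration: Routing Validation and Information Density enabled, everything else disabled, report style `"standard"`.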
|
||||
|
||||
**Token Cost**: ~200-300 tokens (one-time configuration)
|
||||
|
||||
### Phase: init (Session Initialization)
|
||||
|
||||
**Purpose**: Load knowledge bases for validation throughout session
|
||||
|
||||
**Steps**:
|
||||
1. **If not configured**: Run configure phase first (interactive)
|
||||
2. Read `initialization.md` for setup workflow
|
||||
3. Glob `.claude/skills/*/SKILL.md` → extract Skills knowledge
|
||||
- Parse skill name, triggers, workflows, tools, token ranges
|
||||
4. Glob `.claude/agents/task-orchestrator/*.md` → extract Subagents knowledge
|
||||
- Parse agent name, steps, critical patterns, output validation
|
||||
5. Read `agent-mapping.yaml` → extract routing configuration
|
||||
6. Initialize tracking state (deviations, patterns, improvements)
|
||||
7. Report initialization status with active configuration
|
||||
|
||||
**Output**:
|
||||
```javascript
|
||||
{
|
||||
initialized: true,
|
||||
knowledgeBase: {
|
||||
skillsCount: 5,
|
||||
subagentsCount: 8,
|
||||
routingLoaded: true
|
||||
},
|
||||
configuration: {
|
||||
enabled: ["Information Density", "Routing Validation"],
|
||||
disabled: ["Execution Graphs", "Tag Coverage", "Token Optimization", "Tool Selection", "Parallel Detection"],
|
||||
reportStyle: "standard"
|
||||
},
|
||||
tracking: {
|
||||
sessionStart: timestamp,
|
||||
deviations: [],
|
||||
patterns: []
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**Token Cost**: ~800-1000 tokens (loads once per session)
|
||||
|
||||
### Phase: pre (Pre-Execution Validation)
|
||||
|
||||
**Purpose**: Capture context and set validation checkpoints before launching
|
||||
|
||||
**Steps**:
|
||||
1. Read `pre-execution.md` for validation checklist
|
||||
2. Identify entity type (Skill vs Subagent)
|
||||
3. Capture original user input context
|
||||
4. Set entity-specific validation checkpoints based on type:
|
||||
- **Skills**: Expected workflow steps, tool usage, token range
|
||||
- **Subagents**: Expected steps (8-9 steps), critical patterns, output format
|
||||
5. Store context for post-execution comparison
|
||||
6. Return ready signal
|
||||
|
||||
**Context Captured**:
|
||||
- User's original request (full text)
|
||||
- Expected mode (PRD/Interactive/Quick for Feature Architect)
|
||||
- Entity type and anticipated complexity
|
||||
- Validation checkpoints to verify after execution
|
||||
|
||||
**Output**:
|
||||
```javascript
|
||||
{
|
||||
ready: true,
|
||||
contextCaptured: true,
|
||||
checkpoints: [
|
||||
"Verify Skill assessed complexity correctly",
|
||||
"Verify templates discovered and applied",
|
||||
// ... entity-specific checkpoints
|
||||
]
|
||||
}
|
||||
```
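
The pre phase steps can be sketched as a single function that classifies the entity and attaches the matching checkpoints. This is an illustration under assumptions: `buildPreContext` is a hypothetical name, the split of entity types into Skills and Subagents is inferred from their names, and the checkpoint strings simply restate step 4.

```javascript
// Assumed grouping: these entity types are Skills; all others are treated as Subagents.
const SKILL_TYPES = [
  "feature-orchestration",
  "task-orchestration",
  "status-progression",
  "dependency-analysis",
];

// Hypothetical helper: captures context and sets checkpoints for later comparison.
function buildPreContext(entityType, userInput) {
  const isSkill = SKILL_TYPES.includes(entityType);
  const checkpoints = isSkill
    ? ["Verify expected workflow steps", "Verify tool usage", "Verify token range"]
    : ["Verify expected steps (8-9 steps)", "Verify critical patterns", "Verify output format"];
  return {
    ready: true,
    contextCaptured: true,
    entityKind: isSkill ? "skill" : "subagent",
    userInput, // stored in full for post-execution comparison
    checkpoints,
  };
}
```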
|
||||
|
||||
**Token Cost**: ~400-600 tokens
|
||||
|
||||
### Phase: post (Post-Execution Review)
|
||||
|
||||
**Purpose**: Validate workflow adherence, analyze quality, detect deviations
|
||||
|
||||
**Steps**:
|
||||
|
||||
#### 1. Load Post-Execution Workflow
|
||||
Read `post-execution.md` for review process
|
||||
|
||||
#### 2. Determine Required Analyses
|
||||
Based on entity type AND user configuration:
|
||||
|
||||
**Planning Specialist**:
|
||||
- Always: `post-execution.md` → core workflow validation
|
||||
- If `routingValidation` enabled: `routing-validation.md` → Skills usage check
|
||||
- If `executionGraphs` enabled: `graph-quality.md` → execution graph validation
|
||||
- If `tagCoverage` enabled: `tag-quality.md` → tag coverage analysis
|
||||
|
||||
**Feature Architect**:
|
||||
- Always: `post-execution.md` → PRD extraction validation
|
||||
- Always: Compare output vs original user input
|
||||
- If `routingValidation` enabled: `routing-validation.md` → agent-mapping check
|
||||
- If `tagCoverage` enabled: `tag-quality.md` → tag consistency check
|
||||
|
||||
**Implementation Specialists** (Backend, Frontend, Database, Test, Technical Writer):
|
||||
- Always: `post-execution.md` → lifecycle steps verification
|
||||
- If `routingValidation` enabled: `routing-validation.md` → Status Progression Skill usage
|
||||
- If `informationDensity` enabled: `task-content-quality.md` → content quality analysis
|
||||
- Always: Verify summary (300-500 chars), Files Changed section, test results
|
||||
|
||||
**All Skills**:
|
||||
- Always: Read skill definition from knowledge base
|
||||
- Always: Verify expected workflow steps followed
|
||||
- Always: Check tool usage matches expected patterns
|
||||
- Always: Validate token range
|
||||
|
||||
#### 3. Conditional Efficiency Analysis
|
||||
Based on user configuration:
|
||||
- If `tokenOptimization` enabled: Read `token-optimization.md` → identify token waste
|
||||
- If `toolSelection` enabled: Read `tool-selection.md` → verify optimal tool usage
|
||||
- If `parallelDetection` enabled: Read `parallel-detection.md` → find missed parallelization
|
||||
|
||||
#### 4. Deviation Detection
|
||||
Compare actual execution against expected patterns:
|
||||
- **ALERT**: Critical violations (status bypass, cross-domain tasks, missing requirements)
|
||||
- **WARN**: Process issues (verbose output, skipped steps, suboptimal dependencies)
|
||||
- **INFO**: Observations (efficiency opportunities, quality patterns)
|
||||
|
||||
#### 5. Reporting
|
||||
If deviations found:
|
||||
- Read `deviation-templates.md` → format report
|
||||
- Add to TodoWrite with appropriate severity
|
||||
- If ALERT: Report immediately to user with decision prompt
|
||||
- If WARN: Log for end-of-session summary
|
||||
- If INFO: Track for pattern analysis
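
The severity routing in this step amounts to sorting deviations into three channels. A minimal sketch, assuming each deviation carries a `severity` field as in the output example; `routeDeviations` and the channel names are hypothetical.

```javascript
// Hypothetical handler: sorts deviations into the three reporting channels.
function routeDeviations(deviations) {
  const routed = { reportNow: [], sessionSummary: [], patternLog: [] };
  for (const d of deviations) {
    if (d.severity === "ALERT") {
      routed.reportNow.push(d); // reported immediately with a decision prompt
    } else if (d.severity === "WARN") {
      routed.sessionSummary.push(d); // logged for the end-of-session summary
    } else {
      routed.patternLog.push(d); // INFO (and, in this sketch, anything unrecognized)
    }
  }
  return routed;
}
```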
|
||||
|
||||
#### 6. Pattern Tracking
|
||||
Read `pattern-tracking.md` → continuous improvement:
|
||||
- Check for recurring issues (count >= 2 in session)
|
||||
- Suggest definition improvements if patterns detected
|
||||
- Track for session summary
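
The recurrence check (count >= 2 in a session) can be sketched as a simple tally. `findRecurringIssues` is a hypothetical helper, and keying on the deviation's `issue` string is an assumption about how "same issue" is determined.

```javascript
// Hypothetical helper: returns issues seen at least `threshold` times in the session.
function findRecurringIssues(deviations, threshold = 2) {
  const counts = new Map();
  for (const d of deviations) {
    counts.set(d.issue, (counts.get(d.issue) || 0) + 1);
  }
  return [...counts.entries()]
    .filter(([, count]) => count >= threshold)
    .map(([issue, count]) => ({ issue, count }));
}
```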
|
||||
|
||||
**Output**:
|
||||
```javascript
|
||||
{
|
||||
workflowAdherence: "8/8 steps followed (100%)",
|
||||
expectedOutputs: "7/7 present",
|
||||
deviations: [
|
||||
{
|
||||
severity: "ALERT",
|
||||
issue: "Cross-domain task detected",
|
||||
details: "Task mixes backend + frontend",
|
||||
recommendation: "Split into domain-isolated tasks"
|
||||
}
|
||||
],
|
||||
analyses: {
|
||||
graphQuality: "95%",
|
||||
tagCoverage: "100%",
|
||||
tokenEfficiency: "85%"
|
||||
},
|
||||
recommendations: [
|
||||
"Update planning-specialist.md to enforce domain isolation",
|
||||
"Add validation checklist for cross-domain detection"
|
||||
]
|
||||
}
|
||||
```
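
The `workflowAdherence` string in the output above is a ratio plus a rounded percentage. A small sketch of that formatting follows; `formatAdherence` is a hypothetical name, and treating a workflow with zero expected steps as 100% is an assumption.

```javascript
// Hypothetical helper: formats adherence as "followed/total steps followed (pct%)".
function formatAdherence(followed, total) {
  // Assumption: nothing expected counts as fully adherent.
  const pct = total === 0 ? 100 : Math.round((followed * 100) / total);
  return `${followed}/${total} steps followed (${pct}%)`;
}
```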
|
||||
|
||||
**Token Cost**:
|
||||
- Basic validation: ~600-800 tokens
|
||||
- With specialized analysis (Planning Specialist): ~1500-2000 tokens
|
||||
- With efficiency analysis: +800-1200 tokens
|
||||
|
||||
## Progressive Loading Strategy
|
||||
|
||||
**Optimization**: Load only the analysis docs needed based on entity type AND user configuration
|
||||
|
||||
### Configuration-Driven Loading
|
||||
|
||||
**Core Loading** (always loaded regardless of config):
|
||||
- `post-execution.md` → base workflow validation
|
||||
- Skill/Subagent definition from knowledge base
|
||||
- Entity-specific mandatory checks (summary, files changed, etc.)
|
||||
|
||||
**Conditional Loading** (based on user configuration):
|
||||
|
||||
```javascript
|
||||
// Planning Specialist
|
||||
if (config.routingValidation) → Read routing-validation.md
|
||||
if (config.executionGraphs) → Read graph-quality.md
|
||||
if (config.tagCoverage) → Read tag-quality.md
|
||||
|
||||
// Feature Architect
|
||||
if (config.routingValidation) → Read routing-validation.md
|
||||
if (config.tagCoverage) → Read tag-quality.md
|
||||
|
||||
// Implementation Specialists (Backend, Frontend, Database, Test, Technical Writer)
|
||||
if (config.routingValidation) → Read routing-validation.md
|
||||
if (config.informationDensity) → Read task-content-quality.md
|
||||
|
||||
// All Entities
|
||||
if (config.tokenOptimization) → Read token-optimization.md
|
||||
if (config.toolSelection) → Read tool-selection.md
|
||||
if (config.parallelDetection) → Read parallel-detection.md
|
||||
|
||||
// Reporting
|
||||
if (deviations.length > 0) → Read deviation-templates.md
|
||||
if (session.deviations.count >= 2) → Read pattern-tracking.md
|
||||
```
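
The loading rules above are written as pseudocode. A runnable sketch of the same decisions is shown here; `docsToLoad` and its argument shapes are assumptions made for illustration, with `enabled` matching the `session.qaConfig.enabled` flags and `counts` carrying the deviation totals.

```javascript
const IMPLEMENTATION_SPECIALISTS = [
  "backend-engineer",
  "frontend-developer",
  "database-engineer",
  "test-engineer",
  "technical-writer",
];

// Hypothetical helper: returns the supporting docs to read for one post-execution review.
function docsToLoad(entityType, enabled, counts = { current: 0, session: 0 }) {
  const docs = ["post-execution.md"]; // core, always loaded
  const add = (flag, doc) => {
    if (enabled[flag]) docs.push(doc);
  };

  if (entityType === "planning-specialist") {
    add("routingValidation", "routing-validation.md");
    add("executionGraphs", "graph-quality.md");
    add("tagCoverage", "tag-quality.md");
  } else if (entityType === "feature-architect") {
    add("routingValidation", "routing-validation.md");
    add("tagCoverage", "tag-quality.md");
  } else if (IMPLEMENTATION_SPECIALISTS.includes(entityType)) {
    add("routingValidation", "routing-validation.md");
    add("informationDensity", "task-content-quality.md");
  }

  // Efficiency analyses apply to all entities.
  add("tokenOptimization", "token-optimization.md");
  add("toolSelection", "tool-selection.md");
  add("parallelDetection", "parallel-detection.md");

  // Reporting docs depend on what was found, not on configuration.
  if (counts.current > 0) docs.push("deviation-templates.md");
  if (counts.session >= 2) docs.push("pattern-tracking.md");
  return docs;
}
```

With Information Density and Routing Validation enabled, a Feature Architect review loads only `post-execution.md` and `routing-validation.md`, since the information density analysis applies to Implementation Specialists.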
|
||||
|
||||
### Token Savings Examples
|
||||
|
||||
**Example 1: User only wants Information Density feedback**
|
||||
- Configuration: Only "Information Density" enabled
|
||||
- Loaded for Backend Engineer: `post-execution.md` + `task-content-quality.md` = ~1,200 tokens
|
||||
- Skipped: `routing-validation.md`, `token-optimization.md`, `tool-selection.md`, `parallel-detection.md` = ~2,400 tokens saved
|
||||
- **Savings: 67% reduction**
|
||||
|
||||
**Example 2: User wants minimal CRITICAL validation only**
|
||||
- Configuration: Only "Routing Validation" enabled
|
||||
- Loaded: `post-execution.md` + `routing-validation.md` = ~1,000 tokens
|
||||
- Skipped: All other analysis docs = ~3,500 tokens saved
|
||||
- **Savings: 78% reduction**
|
||||
|
||||
**Example 3: User wants comprehensive Planning Specialist review**
|
||||
- Configuration: All categories enabled
|
||||
- Loaded: `post-execution.md` + `graph-quality.md` + `tag-quality.md` + `routing-validation.md` + efficiency docs = ~3,500 tokens
|
||||
- Skipped: None (comprehensive mode)
|
||||
- **Savings: 0% (full analysis)**
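
The savings figures in these examples follow from skipped tokens divided by the total that would otherwise have been loaded. A minimal sketch of that arithmetic, with `savingsPercent` as a hypothetical helper name:

```javascript
// Savings = skipped / (loaded + skipped), as a whole-number percentage.
function savingsPercent(loadedTokens, skippedTokens) {
  const total = loadedTokens + skippedTokens;
  return total === 0 ? 0 : Math.round((skippedTokens * 100) / total);
}

savingsPercent(1200, 2400); // Example 1 → 67
savingsPercent(1000, 3500); // Example 2 → 78
savingsPercent(3500, 0);    // Example 3 → 0
```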
|
||||
|
||||
### Special Cases
|
||||
|
||||
**Task Orchestration Skill**:
|
||||
- `parallel-detection.md` always loaded if enabled in config (core to this skill's purpose)
|
||||
|
||||
**Status Progression Skill**:
|
||||
- `routing-validation.md` always loaded if enabled in config (CRITICAL - status bypass detection)
|
||||
|
||||
## Output Format
|
||||
|
||||
### Success (No Deviations)
|
||||
```markdown
|
||||
✅ **QA Review**: [Entity Name]
|
||||
|
||||
Workflow adherence: 100%
|
||||
All quality checks passed.
|
||||
|
||||
[If efficiency analysis enabled:]
|
||||
Token efficiency: 85% (identified 2 optimization opportunities)
|
||||
```
|
||||
|
||||
### Issues Found
|
||||
```markdown
|
||||
## QA Review: [Entity Name]
|
||||
|
||||
**Workflow Adherence:** X/Y steps (Z%)
|
||||
|
||||
### ✅ Successes
|
||||
- [Success 1]
|
||||
- [Success 2]
|
||||
|
||||
### ⚠️ Issues Detected
|
||||
|
||||
**🚨 ALERT**: [Critical issue]
|
||||
- Impact: [What this affects]
|
||||
- Found: [What was observed]
|
||||
- Expected: [What should have happened]
|
||||
- Recommendation: [How to fix]
|
||||
|
||||
**⚠️ WARN**: [Process issue]
|
||||
- Found: [What was observed]
|
||||
- Expected: [What should have happened]
|
||||
|
||||
### 📋 Added to TodoWrite
|
||||
- Review [Entity]: [Issue description]
|
||||
- Improvement: [Suggestion]
|
||||
|
||||
### 🎯 Recommendations
|
||||
1. [Most critical action]
|
||||
2. [Secondary action]
|
||||
|
||||
### 💭 Decision Required
|
||||
[If user decision needed, present options]
|
||||
```
|
||||
|
||||
## Integration with Orchestrator
|
||||
|
||||
**Recommended Pattern**:
|
||||
|
||||
```javascript
|
||||
// 1. FIRST TIME: Interactive configuration
|
||||
Use orchestration-qa skill (phase="configure")
|
||||
// Agent asks user which analysis categories to enable
|
||||
// User selects: "Information Density" + "Routing Validation"
|
||||
// Configuration stored in session
|
||||
|
||||
// 2. Session initialization
|
||||
Use orchestration-qa skill (phase="init")
|
||||
// Returns: Initialized with [2] analysis categories enabled
|
||||
|
||||
// 3. Before launching Feature Architect
|
||||
Use orchestration-qa skill (
|
||||
phase="pre",
|
||||
entityType="feature-architect",
|
||||
userInput="[user's original request]"
|
||||
)
|
||||
|
||||
// 4. Launch Feature Architect
|
||||
Task(subagent_type="Feature Architect", prompt="...")
|
||||
|
||||
// 5. After Feature Architect returns
|
||||
Use orchestration-qa skill (
|
||||
phase="post",
|
||||
entityType="feature-architect",
|
||||
entityOutput="[subagent's response]",
|
||||
entityId="feature-uuid"
|
||||
)
|
||||
// Only loads: post-execution.md + routing-validation.md (user config)
|
||||
// Skips: graph-quality.md, tag-quality.md, token-optimization.md (not enabled)
|
||||
|
||||
// 6. Review QA findings, take action if needed
|
||||
```
|
||||
|
||||
**Mid-Session Reconfiguration**:
|
||||
|
||||
```javascript
|
||||
// User: "I want to also track token optimization now"
|
||||
Use orchestration-qa skill (phase="configure")
|
||||
// Agent asks again, pre-selects current config
|
||||
// User adds "Token Optimization" to enabled categories
|
||||
// New config stored, affects all subsequent post-execution reviews
|
||||
```
|
||||
|
||||
## Supporting Documentation
|
||||
|
||||
This skill uses progressive loading to minimize token usage. Supporting docs are read as needed:
|
||||
|
||||
- **initialization.md** - Session setup workflow
|
||||
- **pre-execution.md** - Context capture and checkpoint setting
|
||||
- **post-execution.md** - Core review workflow for all entities
|
||||
- **graph-quality.md** - Planning Specialist: execution graph analysis
|
||||
- **tag-quality.md** - Planning Specialist & Feature Architect: tag coverage validation
|
||||
- **task-content-quality.md** - Implementation Specialists: information density and wasteful pattern detection
|
||||
- **token-optimization.md** - Efficiency: identify token waste patterns
|
||||
- **tool-selection.md** - Efficiency: verify optimal tool usage
|
||||
- **parallel-detection.md** - Efficiency: find missed parallelization
|
||||
- **routing-validation.md** - Critical: Skills vs Direct tool violations
|
||||
- **deviation-templates.md** - User report formatting by severity
|
||||
- **pattern-tracking.md** - Continuous improvement tracking
|
||||
|
||||
## Token Efficiency
|
||||
|
||||
**Current Trainer** (monolithic): ~20k-30k tokens always loaded
|
||||
|
||||
**Orchestration QA Skill** (configuration-driven progressive loading):
|
||||
- Configure phase: ~200-300 tokens (one-time, interactive)
|
||||
- Init phase: ~800-1000 tokens (one-time per session)
|
||||
- Pre-execution: ~400-600 tokens (per entity)
|
||||
- Post-execution (varies by configuration):
|
||||
- **Minimal** (routing only): ~800-1000 tokens
|
||||
- **Standard** (info density + routing): ~1200-1500 tokens
|
||||
- **Planning Specialist** (graphs + tags + routing): ~2000-2500 tokens
|
||||
- **Comprehensive** (all categories): ~3500-4000 tokens
|
||||
|
||||
**Configuration Impact Examples**:
|
||||
|
||||
| User Configuration | Token Cost | vs Monolithic | vs Default |
|--------------------|------------|---------------|------------|
| Information Density only | ~1,200 tokens | 94% savings | 20% savings |
| Routing Validation only | ~1,000 tokens | 95% savings | 33% savings |
| Default (Info + Routing) | ~1,500 tokens | 93% savings | baseline |
| Comprehensive (all enabled) | ~4,000 tokens | 80% savings | 167% more |
|
||||
|
||||
**Smart Defaults**: Most users only need Information Density + Routing Validation, achieving 93% token reduction while catching critical issues and wasteful content.
|
||||
|
||||
## Quality Metrics
|
||||
|
||||
Track these metrics across sessions:
|
||||
- Workflow adherence percentage
|
||||
- Deviation count by severity (ALERT/WARN/INFO)
|
||||
- Pattern recurrence (same issue multiple times)
|
||||
- Definition improvement suggestions generated
|
||||
- Token efficiency of analyzed workflows
|
||||
|
||||
## Examples
|
||||
|
||||
See `examples.md` for detailed usage scenarios including:
|
||||
- **Interactive configuration** - Choosing analysis categories
|
||||
- **Session initialization** - Loading knowledge bases with config
|
||||
- **Feature Architect validation** - PRD mode with selective analysis
|
||||
- **Planning Specialist review** - Graph + tag analysis (when enabled)
|
||||
- **Implementation Specialist review** - Information density tracking
|
||||
- **Status Progression enforcement** - Critical routing violations
|
||||
- **Mid-session reconfiguration** - Changing enabled categories
|
||||
- **Token efficiency comparisons** - Different configuration impacts
|
||||
345
skills/orchestration-qa/deviation-templates.md
Normal file
@@ -0,0 +1,345 @@
|
||||
# Deviation Report Templates
|
||||
|
||||
**Purpose**: Format QA findings for user presentation based on severity.
|
||||
|
||||
**When**: After deviations are detected in the post-execution review
|
||||
|
||||
**Token Cost**: ~200-400 tokens
|
||||
|
||||
## Severity Levels
|
||||
|
||||
### 🚨 ALERT (Critical)
|
||||
**Impact**: Affects functionality, correctness, or mandatory patterns
|
||||
**Action**: Report immediately, add to TodoWrite, request user decision
|
||||
**Examples**:
|
||||
- Status change bypassed Status Progression Skill
|
||||
- Cross-domain task detected (violates domain isolation)
|
||||
- PRD sections not extracted (requirements lost)
|
||||
- Incorrect dependencies in execution graph
|
||||
- Task has no specialist mapping (routing will fail)
|
||||
|
||||
### ⚠️ WARN (Process Issue)
|
||||
**Impact**: Process not followed optimally, should be addressed
|
||||
**Action**: Include in post-execution report, add to TodoWrite
|
||||
**Examples**:
|
||||
- Workflow step skipped (non-critical)
|
||||
- Output too verbose (token waste)
|
||||
- Templates not applied when available
|
||||
- Missed parallel opportunities
|
||||
- Tags don't follow project conventions
|
||||
|
||||
### ℹ️ INFO (Observation)
|
||||
**Impact**: Optimization opportunity or quality pattern
|
||||
**Action**: Log for pattern tracking, mention if noteworthy
|
||||
**Examples**:
|
||||
- Token usage outside expected range (but reasonable)
|
||||
- Could use more efficient tool (overview vs get)
|
||||
- Format improvement suggestions
|
||||
- Efficiency opportunities identified
|
||||
|
||||
## Report Templates
|
||||
|
||||
### ALERT Template (Critical Violation)
|
||||
|
||||
```markdown
|
||||
## 🚨 QA Review: [Entity Name] - CRITICAL ISSUES DETECTED
|
||||
|
||||
**Workflow Adherence:** [X]/[Y] steps ([Z]%)
|
||||
|
||||
### Critical Issues ([count])
|
||||
|
||||
**❌ ALERT: [Issue Title]**
|
||||
|
||||
**What Happened:**
|
||||
[Clear description of what was observed]
|
||||
|
||||
**Expected Behavior:**
|
||||
[What should have happened according to documentation]
|
||||
|
||||
**Impact:**
|
||||
[What this affects - functionality, correctness, workflow]
|
||||
|
||||
**Evidence:**
|
||||
- [Specific evidence from output/database]
|
||||
- [Tool calls made or not made]
|
||||
- [Data discrepancies]
|
||||
|
||||
**Recommendation:**
|
||||
[Specific action to fix the issue]
|
||||
|
||||
**Definition Update Needed:**
|
||||
[If this is a pattern, what definition needs updating]
|
||||
|
||||
---
|
||||
|
||||
### ✅ Successes ([count])
|
||||
- [What went well]
|
||||
- [Patterns followed correctly]
|
||||
|
||||
### 📋 Added to TodoWrite
|
||||
- [ ] Review [Entity]: [Issue description]
|
||||
- [ ] Fix [specific issue]
|
||||
- [ ] Update [definition file] with [improvement]
|
||||
|
||||
### 💭 Decision Required
|
||||
|
||||
**Question:** [What user needs to decide]
|
||||
|
||||
**Options:**
|
||||
1. [Option A with pros/cons]
|
||||
2. [Option B with pros/cons]
|
||||
3. [Option C with pros/cons]
|
||||
|
||||
**Recommendation:** [Your suggestion with reasoning]
|
||||
```
|
||||
|
||||
### WARN Template (Process Issue)
|
||||
|
||||
```markdown
|
||||
## ⚠️ QA Review: [Entity Name] - Issues Found
|
||||
|
||||
**Workflow Adherence:** [X]/[Y] steps ([Z]%)
|
||||
|
||||
### Issues Detected ([count])
|
||||
|
||||
**⚠️ WARN: [Issue Title]**
|
||||
- **Found:** [What was observed]
|
||||
- **Expected:** [What should have happened]
|
||||
- **Impact:** [How this affects quality/efficiency]
|
||||
- **Fix:** [How to correct]
|
||||
|
||||
**⚠️ WARN: [Issue Title 2]**
|
||||
- **Found:** [What was observed]
|
||||
- **Expected:** [What should have happened]
|
||||
|
||||
### ✅ Successes
|
||||
- [Workflow adherence: X/Y steps]
|
||||
- [Quality metrics: X% graph quality, Y% tag coverage]
|
||||
|
||||
### 📋 Added to TodoWrite
|
||||
- [ ] [Issue 1 to address]
|
||||
- [ ] [Issue 2 to address]
|
||||
|
||||
### 🎯 Recommendations
|
||||
1. [Most important fix]
|
||||
2. [Process improvement]
|
||||
3. [Optional optimization]
|
||||
```
|
||||
|
||||
### INFO Template (Observations)
|
||||
|
||||
```markdown
|
||||
## ℹ️ QA Review: [Entity Name] - Observations
|
||||
|
||||
**Workflow Adherence:** [X]/[Y] steps ([Z]%)
|
||||
|
||||
### Quality Metrics
|
||||
- Dependency Accuracy: [X]%
|
||||
- Parallel Completeness: [Y]%
|
||||
- Tag Coverage: [Z]%
|
||||
- Token Efficiency: [W]%
|
||||
|
||||
### Observations ([count])
|
||||
|
||||
**ℹ️ Efficiency Opportunity: [Title]**
|
||||
- Current approach: [What was done]
|
||||
- Optimal approach: [Better way]
|
||||
- Potential savings: [Benefit]
|
||||
|
||||
**ℹ️ Format Suggestion: [Title]**
|
||||
- Current: [What was done]
|
||||
- Suggested: [Improvement]
|
||||
|
||||
### ✅ Overall Assessment
|
||||
Workflow completed successfully with minor optimization opportunities.
|
||||
|
||||
[Optional: Include observations in session summary]
|
||||
```
|
||||
|
||||
### Success Template (No Issues)
|
||||
|
||||
```markdown
|
||||
## ✅ QA Review: [Entity Name]
|
||||
|
||||
**Workflow Adherence:** 100% ([Y]/[Y] steps completed)
|
||||
|
||||
**Quality Metrics:**
|
||||
- All checkpoints passed ✅
|
||||
- All expected outputs present ✅
|
||||
- Token usage within range ✅
|
||||
- Workflow patterns followed ✅
|
||||
|
||||
[If efficiency analysis enabled:]
|
||||
**Efficiency:**
|
||||
- Token efficiency: [X]%
|
||||
- Optimal tool selection ✅
|
||||
- Parallel opportunities identified ✅
|
||||
|
||||
**Result:** No issues detected - excellent execution!
|
||||
```
|
||||
|
||||
## TodoWrite Integration
|
||||
|
||||
### ALERT Issues
|
||||
|
||||
```javascript
|
||||
TodoWrite([
|
||||
{
|
||||
content: `ALERT: [Entity] - [Critical issue summary]`,
|
||||
activeForm: `Reviewing [Entity] critical issue`,
|
||||
status: "pending"
|
||||
},
|
||||
{
|
||||
content: `Fix: [Specific corrective action]`,
|
||||
activeForm: `Fixing [issue]`,
|
||||
status: "pending"
|
||||
},
|
||||
{
|
||||
content: `Update [definition]: [Improvement needed]`,
|
||||
activeForm: `Updating definition`,
|
||||
status: "pending"
|
||||
}
|
||||
])
|
||||
```
|
||||
|
||||
### WARN Issues
|
||||
|
||||
```javascript
|
||||
TodoWrite([
|
||||
{
|
||||
content: `Review [Entity]: [Issue summary] ([count] issues)`,
|
||||
activeForm: `Reviewing [Entity] quality issues`,
|
||||
status: "pending"
|
||||
}
|
||||
])
|
||||
```
|
||||
|
||||
### INFO Observations
|
||||
|
||||
```javascript
|
||||
// Generally don't add INFO to TodoWrite unless noteworthy
|
||||
// Track for pattern analysis instead
|
||||
```
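
The three cases above can be folded into one function that builds the todo items for a deviation. This is a sketch under assumptions: `todosForDeviation` is a hypothetical helper that only constructs the item objects (it does not call TodoWrite), the deviation fields `issue` and `recommendation` follow the post-phase output example, and the "Update [definition]" item from the ALERT case is omitted because it needs a definition file name.

```javascript
// Hypothetical helper: builds todo items for one deviation, by severity.
function todosForDeviation(entity, deviation) {
  if (deviation.severity === "INFO") {
    return []; // tracked for pattern analysis instead
  }
  if (deviation.severity === "ALERT") {
    return [
      {
        content: `ALERT: ${entity} - ${deviation.issue}`,
        activeForm: `Reviewing ${entity} critical issue`,
        status: "pending",
      },
      {
        content: `Fix: ${deviation.recommendation}`,
        activeForm: `Fixing ${deviation.issue}`,
        status: "pending",
      },
    ];
  }
  // WARN: a single review item
  return [
    {
      content: `Review ${entity}: ${deviation.issue}`,
      activeForm: `Reviewing ${entity} quality issues`,
      status: "pending",
    },
  ];
}
```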
|
||||
|
||||
## Multi-Issue Aggregation
|
||||
|
||||
When multiple issues of the same type are detected:
|
||||
|
||||
```markdown
|
||||
### Cross-Domain Tasks Detected ([count])
|
||||
|
||||
**Pattern:** Tasks mixing specialist domains
|
||||
|
||||
**Violations:**
|
||||
1. **[Task A]:** Combines [domain1] + [domain2]
|
||||
- Evidence: [description mentions both]
|
||||
- Fix: Split into 2 tasks
|
||||
|
||||
2. **[Task B]:** Combines [domain2] + [domain3]
|
||||
- Evidence: [tags include both]
|
||||
- Fix: Split into 2 tasks
|
||||
|
||||
**Root Cause:** [Why this happened - e.g., feature requirements not decomposed properly]
|
||||
|
||||
**Systemic Fix:** Update [planning-specialist.md] to enforce domain isolation check before task creation
|
||||
|
||||
**Added to TodoWrite:**
|
||||
- [ ] Split Task A into domain-isolated tasks
|
||||
- [ ] Split Task B into domain-isolated tasks
|
||||
- [ ] Update planning-specialist.md validation checklist
|
||||
```
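
Grouping deviations before rendering the aggregated report might look like the sketch below. `aggregateByIssue` is a hypothetical helper, and the `task` and `details` fields on each deviation are assumptions chosen to match the template's numbered violations.

```javascript
// Hypothetical helper: groups deviations by issue and prepares heading + violation lines.
function aggregateByIssue(deviations) {
  const groups = new Map();
  for (const d of deviations) {
    if (!groups.has(d.issue)) groups.set(d.issue, []);
    groups.get(d.issue).push(d);
  }
  return [...groups.entries()].map(([issue, items]) => ({
    heading: `### ${issue} (${items.length})`,
    violations: items.map((d, i) => `${i + 1}. **${d.task}:** ${d.details}`),
  }));
}
```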
|
||||
|
||||
## User Decision Prompts
|
||||
|
||||
### Template 1: Retry with Correct Approach
|
||||
|
||||
```markdown
|
||||
### 💭 Decision Required
|
||||
|
||||
**Issue:** [Entity] bypassed mandatory [Skill Name] Skill
|
||||
|
||||
**Impact:** [What validation was skipped]
|
||||
|
||||
**Options:**
|
||||
|
||||
1. **Retry with [Skill Name] Skill** ✅ Recommended
|
||||
- Pros: Ensures validation runs, follows documented workflow
|
||||
- Cons: Requires re-execution
|
||||
|
||||
2. **Accept as-is and manually verify**
|
||||
- Pros: Faster (no re-execution)
|
||||
- Cons: May miss validation issues, sets bad precedent
|
||||
|
||||
3. **Update [Entity] to bypass Skill** ⚠️ Not Recommended
|
||||
- Pros: Allows direct approach
|
||||
- Cons: Removes safety checks, violates workflow
|
||||
|
||||
**Recommendation:** Retry with [Skill Name] Skill to ensure [prerequisites] are validated.
|
||||
|
||||
**Your choice?**
|
||||
```
|
||||
|
||||
### Template 2: Definition Update
|
||||
|
||||
```markdown
|
||||
### 💭 Decision Required
|
||||
|
||||
**Pattern Detected:** [Issue] occurred [N] times in session
|
||||
|
||||
**Systemic Issue:** [Root cause analysis]
|
||||
|
||||
**Proposed Definition Update:**
|
||||
|
||||
~~~diff
|
||||
// File: [definition-file.md]
|
||||
|
||||
+ Add validation checklist:
|
||||
+ - [ ] Verify all independent tasks in Batch 1
|
||||
+ - [ ] Check for cross-domain tasks before creation
|
||||
+ - [ ] Validate tag → specialist mapping coverage
|
||||
~~~
|
||||
|
||||
**Options:**
|
||||
|
||||
1. **Update definition now** ✅ Recommended
|
||||
- Prevents recurrence
|
||||
- Improves workflow quality
|
||||
|
||||
2. **Track for later review**
|
||||
- Allows more data collection
|
||||
- May recur in meantime
|
||||
|
||||
**Your preference?**
|
||||
```
|
||||
|
||||
## Formatting Guidelines
|
||||
|
||||
### Clarity
|
||||
- Start with severity emoji (🚨/⚠️/ℹ️)
|
||||
- Use clear section headers
|
||||
- Separate concerns (issues, successes, recommendations)
|
||||
|
||||
### Actionability
|
||||
- Specific evidence, not vague observations
|
||||
- Clear "Expected" vs "Found" comparisons
|
||||
- Concrete recommendations with steps
|
||||
|
||||
### Brevity
|
||||
- ALERT: Full details (this is critical)
|
||||
- WARN: Moderate details (important but not urgent)
|
||||
- INFO: Brief summary (observations only)
|
||||
|
||||
### Consistency
|
||||
- Always include workflow adherence percentage
|
||||
- Always show count of issues by severity
|
||||
- Always provide TodoWrite summary
|
||||
- Always offer recommendations
|
||||
|
||||
## Output Size Targets
|
||||
|
||||
- **ALERT report**: 300-600 tokens (comprehensive)
|
||||
- **WARN report**: 200-400 tokens (focused)
|
||||
- **INFO report**: 100-200 tokens (brief)
|
||||
- **Success report**: 50-100 tokens (minimal)
|
||||
|
||||
**Total QA report** (including analysis): 800-2000 tokens depending on issues found
|
||||
746
skills/orchestration-qa/examples.md
Normal file
@@ -0,0 +1,746 @@
|
||||
# Orchestration QA Skill - Usage Examples
|
||||
|
||||
This document provides practical examples of using the orchestration-qa skill in different scenarios.
|
||||
|
||||
## Example 1: Interactive Configuration (FIRST TIME)
|
||||
|
||||
### Scenario
|
||||
First time using orchestration-qa in a session - choose which analyses to enable
|
||||
|
||||
### Usage
|
||||
```javascript
|
||||
orchestration-qa(phase="configure")
|
||||
```
|
||||
|
||||
### Interactive Prompts
|
||||
Agent presents multiselect options:
|
||||
|
||||
```markdown
|
||||
## Quality Analysis Configuration
|
||||
|
||||
**Which quality analysis categories would you like to enable for this session?**
|
||||
(Select all that apply)
|
||||
|
||||
☐ Information Density
|
||||
Analyze task content quality, detect wasteful patterns, measure information-to-token ratio (Specialists only)
|
||||
|
||||
☐ Execution Graphs
|
||||
Validate dependency graphs and parallel execution opportunities (Planning Specialist only)
|
||||
|
||||
☐ Tag Coverage
|
||||
Check tag consistency and agent-mapping coverage (Planning Specialist & Feature Architect)
|
||||
|
||||
☐ Token Optimization
|
||||
Identify token waste patterns (verbose output, unnecessary loading, redundant operations)
|
||||
|
||||
☐ Tool Selection
|
||||
Verify optimal tool usage (overview vs get, search vs filtered query, bulk operations)
|
||||
|
||||
☑ Routing Validation
|
||||
Detect Skills bypass violations (CRITICAL - status changes, feature creation, task execution)
|
||||
|
||||
☐ Parallel Detection
|
||||
Find missed parallelization opportunities (independent tasks, batch operations)
|
||||
|
||||
---
|
||||
|
||||
**How detailed should QA reports be?**
|
||||
|
||||
○ Brief - Only show critical issues (ALERT level) - minimal token usage
|
||||
● Standard - Show ALERT and WARN level issues with brief explanations
|
||||
○ Detailed - Show all issues (ALERT/WARN/INFO) with full analysis and recommendations
|
||||
```
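
How the chosen report style narrows a report can be sketched like this. The severity sets follow the option descriptions above; the `STYLE_LEVELS` table, the `filterByReportStyle` helper, and the finding shape are hypothetical names used only for illustration.

```javascript
// Severity levels shown for each report style, per the option text above.
const STYLE_LEVELS = {
  Brief: ["ALERT"],
  Standard: ["ALERT", "WARN"],
  Detailed: ["ALERT", "WARN", "INFO"],
};

// Hypothetical finding record: { severity, message }
function filterByReportStyle(findings, reportStyle) {
  const allowed = STYLE_LEVELS[reportStyle];
  if (!allowed) throw new Error(`Unknown report style: ${reportStyle}`);
  return findings.filter((finding) => allowed.includes(finding.severity));
}

const sampleFindings = [
  { severity: "ALERT", message: "Status change bypassed Skill" },
  { severity: "WARN", message: "Missed parallel opportunity" },
  { severity: "INFO", message: "Handoff slightly verbose" },
];
const standardView = filterByReportStyle(sampleFindings, "Standard");
```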
|
||||
|
||||
### User Selections
|
||||
```javascript
|
||||
// User selects:
|
||||
{
|
||||
categories: ["Information Density", "Routing Validation"],
|
||||
reportStyle: "Standard"
|
||||
}
|
||||
```
|
||||
|
||||
### Output
|
||||
```markdown
|
||||
✅ **QA Configuration Saved**
|
||||
|
||||
**Enabled Categories (2):**
|
||||
- ✅ Information Density - Track task content quality and wasteful patterns
|
||||
- ✅ Routing Validation - Detect critical Skills bypass violations
|
||||
|
||||
**Disabled Categories (5):**
|
||||
- ⊘ Execution Graphs
|
||||
- ⊘ Tag Coverage
|
||||
- ⊘ Token Optimization
|
||||
- ⊘ Tool Selection
|
||||
- ⊘ Parallel Detection
|
||||
|
||||
**Report Style:** Standard (ALERT + WARN)
|
||||
|
||||
**Token Efficiency:**
|
||||
- Per analysis: ~1,200 tokens (vs ~3,500 comprehensive)
|
||||
- Savings: ~66% reduction from full analysis
|
||||
- Trade-off: Focus on content quality + critical routing only
|
||||
|
||||
Configuration will be used for all QA reviews in this session.
|
||||
You can reconfigure anytime with `orchestration-qa(phase="configure")`
|
||||
```
|
||||
|
||||
## Example 2: Session Initialization (With Configuration)
|
||||
|
||||
### Scenario
|
||||
After configuration, initialize the session with knowledge bases
|
||||
|
||||
### Prerequisites
|
||||
Configuration must be set first (via `phase="configure"`)
|
||||
|
||||
### Usage
|
||||
```javascript
|
||||
orchestration-qa(phase="init")
|
||||
```
|
||||
|
||||
### Output
|
||||
```markdown
|
||||
✅ **Orchestration QA Initialized**
|
||||
|
||||
**Knowledge Base Loaded:**
|
||||
- Skills: 5 (feature-orchestration, task-orchestration, status-progression, dependency-analysis, dependency-orchestration)
|
||||
- Subagents: 8 (feature-architect, planning-specialist, backend-engineer, frontend-developer, database-engineer, test-engineer, technical-writer, bug-triage-specialist)
|
||||
- Routing: agent-mapping.yaml loaded (12 tag mappings)
|
||||
|
||||
**Active Configuration:**
|
||||
- ✅ Information Density (track task content quality)
|
||||
- ✅ Routing Validation (critical bypass detection)
|
||||
- ⊘ Execution Graphs (disabled)
|
||||
- ⊘ Tag Coverage (disabled)
|
||||
- ⊘ Token Optimization (disabled)
|
||||
- ⊘ Tool Selection (disabled)
|
||||
- ⊘ Parallel Detection (disabled)
|
||||
- Report Style: Standard (ALERT + WARN)
|
||||
|
||||
**Quality Assurance Active:**
|
||||
- ✅ Pre-execution validation
|
||||
- ✅ Post-execution review
|
||||
- ✅ Configuration-driven analysis (2 categories enabled)
|
||||
- ✅ Pattern tracking (continuous improvement)
|
||||
|
||||
**Session Tracking:**
|
||||
- Deviations: 0 ALERT, 0 WARN, 0 INFO
|
||||
- Patterns: 0 recurring issues
|
||||
- Improvements: 0 suggestions
|
||||
|
||||
Ready to monitor orchestration quality.
|
||||
**Token Efficiency:** ~1,200 tokens per analysis (~66% savings from comprehensive)
|
||||
```
|
||||
|
||||
## Example 3: Feature Architect Validation (PRD Mode)
|
||||
|
||||
### Scenario
|
||||
User provides formal PRD document for feature creation
|
||||
|
||||
### User Request
|
||||
```markdown
|
||||
Create a user authentication feature:
|
||||
|
||||
# User Authentication
|
||||
|
||||
## Business Context
|
||||
We need secure user authentication for our platform.
|
||||
|
||||
## User Stories
|
||||
- As a user, I want to sign up with email/password
|
||||
- As a user, I want to log in securely
|
||||
- As a user, I want to reset my password
|
||||
|
||||
## Technical Requirements
|
||||
- OAuth2 support
|
||||
- JWT tokens
|
||||
- Role-based access control
|
||||
```
|
||||
|
||||
### Pre-Execution
|
||||
```javascript
|
||||
orchestration-qa(
|
||||
phase="pre",
|
||||
entityType="feature-architect",
|
||||
userInput="[full PRD above]"
|
||||
)
|
||||
```
|
||||
|
||||
### Pre-Execution Output
|
||||
```javascript
|
||||
{
|
||||
ready: true,
|
||||
contextCaptured: true,
|
||||
entityType: "feature-architect",
|
||||
category: "SUBAGENT",
|
||||
checkpoints: 8,
|
||||
expected: {
|
||||
mode: "PRD",
|
||||
prdSections: ["Business Context", "User Stories", "Technical Requirements"],
|
||||
descriptionLength: [500, 1000],
|
||||
sectionsExpected: 3,
|
||||
handoffTokens: [50, 100]
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Post-Execution
|
||||
```javascript
|
||||
orchestration-qa(
|
||||
phase="post",
|
||||
entityType="feature-architect",
|
||||
entityOutput="[Feature Architect's response]",
|
||||
entityId="feature-uuid-123"
|
||||
)
|
||||
```
|
||||
|
||||
### Post-Execution Output (Success)
|
||||
```markdown
|
||||
## ✅ QA Review: Feature Architect (PRD Mode)
|
||||
|
||||
**Workflow Adherence:** 8/8 steps (100%)
|
||||
**Expected Outputs:** 7/7 present
|
||||
|
||||
### Validation Results
|
||||
- ✅ PRD mode detected correctly
|
||||
- ✅ All 3 PRD sections extracted to feature sections
|
||||
- ✅ Description forward-looking (623 chars)
|
||||
- ✅ Templates applied (Technical Approach, Requirements)
|
||||
- ✅ Tags follow project conventions (reused "authentication", "security")
|
||||
- ✅ agent-mapping.yaml checked (tags map to specialists)
|
||||
- ✅ Handoff minimal (87 tokens)
|
||||
- ✅ Core concepts preserved from PRD
|
||||
|
||||
**Quality Metrics:**
|
||||
- PRD extraction: 100% (3/3 sections)
|
||||
- Token efficiency: 95% (handoff 87 tokens, expected < 100)
|
||||
- Tag conventions: 100% (reused existing tags)
|
||||
|
||||
**Result:** Excellent execution - all patterns followed correctly!
|
||||
```
|
||||
|
||||
### Post-Execution Output (Issues Detected)
|
||||
```markdown
|
||||
## 🚨 QA Review: Feature Architect (PRD Mode) - ISSUES DETECTED
|
||||
|
||||
**Workflow Adherence:** 7/8 steps (88%)
|
||||
|
||||
### Critical Issues (1)
|
||||
|
||||
**❌ ALERT: PRD Sections Incomplete**
|
||||
|
||||
**What Happened:**
|
||||
Feature has 2 sections, but PRD contained 3 sections.
|
||||
|
||||
**Expected Behavior:**
|
||||
In PRD mode, Feature Architect must extract ALL sections from user's document.
|
||||
|
||||
**Impact:**
|
||||
"Technical Requirements" section from PRD was not transferred to feature.
|
||||
Requirements may be lost or incomplete.
|
||||
|
||||
**Evidence:**
|
||||
- PRD sections: ["Business Context", "User Stories", "Technical Requirements"]
|
||||
- Feature sections: ["Business Context", "User Stories"]
|
||||
- Missing: "Technical Requirements"
|
||||
|
||||
**Recommendation:**
|
||||
Add missing "Technical Requirements" section to feature.
|
||||
|
||||
**Definition Update Needed:**
|
||||
Update feature-architect.md Step 7 to include validation:
|
||||
- [ ] Verify all PRD sections extracted
|
||||
- [ ] Compare PRD section count vs feature section count
|
||||
- [ ] If mismatch, add missing sections before returning
|
||||
|
||||
---
|
||||
|
||||
### ✅ Successes (6)
|
||||
- PRD mode detected correctly
|
||||
- Description forward-looking
|
||||
- Templates applied
|
||||
- Tags follow conventions
|
||||
- agent-mapping.yaml checked
|
||||
- Handoff minimal (92 tokens)
|
||||
|
||||
### 📋 Added to TodoWrite
|
||||
- [ ] Add "Technical Requirements" section to Feature [ID]
|
||||
- [ ] Update feature-architect.md Step 7 validation checklist
|
||||
|
||||
### 💭 Decision Required
|
||||
|
||||
**Question:** Should we add the missing "Technical Requirements" section now?
|
||||
|
||||
**Options:**
|
||||
1. **Add section now** ✅ Recommended
|
||||
- Pros: Ensures all PRD content captured
|
||||
- Cons: Requires one additional tool call
|
||||
|
||||
2. **Accept as-is**
|
||||
- Pros: Faster (no additional work)
|
||||
- Cons: Requirements may be incomplete
|
||||
|
||||
**Recommendation:** Add section now to ensure complete PRD capture.
|
||||
|
||||
**Your choice?**
|
||||
```
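
The section-count check behind this ALERT can be sketched as a simple comparison. This is an assumption about how such a check could work, not the documented implementation: `findMissingSections` is a hypothetical helper, and matching section titles case-insensitively is a choice made for this example.

```javascript
// Returns the PRD section titles that have no matching feature section.
// Titles are compared case-insensitively after trimming (an assumption).
function findMissingSections(prdSections, featureSections) {
  const normalize = (title) => title.trim().toLowerCase();
  const present = new Set(featureSections.map(normalize));
  return prdSections.filter((title) => !present.has(normalize(title)));
}

const missing = findMissingSections(
  ["Business Context", "User Stories", "Technical Requirements"],
  ["Business Context", "User Stories"]
);
// missing is ["Technical Requirements"]
```

Returning the missing titles, rather than only comparing counts, gives the report the concrete evidence it needs for the "Missing:" line.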
|
||||
|
||||
## Example 4: Planning Specialist Review (Graph Analysis)
|
||||
|
||||
### Scenario
|
||||
Planning Specialist breaks down feature into tasks
|
||||
|
||||
### Post-Execution
|
||||
```javascript
|
||||
orchestration-qa(
|
||||
phase="post",
|
||||
entityType="planning-specialist",
|
||||
entityOutput="[Planning Specialist's response]",
|
||||
entityId="feature-uuid-123"
|
||||
)
|
||||
```
|
||||
|
||||
### Output (High Quality)
|
||||
```markdown
|
||||
## ✅ QA Review: Planning Specialist
|
||||
|
||||
**Workflow Adherence:** 8/8 steps (100%)
|
||||
**Expected Outputs:** 7/7 present
|
||||
|
||||
### Specialized Analysis
|
||||
|
||||
**📊 Execution Graph Quality: 98%**
|
||||
- Dependency Accuracy: 100% (all dependencies correct)
|
||||
- Parallel Completeness: 100% (all opportunities identified)
|
||||
- Format Clarity: 95% (clear batch numbers, explicit dependencies)
|
||||
|
||||
**🏷️ Tag Quality: 100%**
|
||||
- Tag Coverage: 100% (all tasks have tags)
|
||||
- Agent Mapping Coverage: 100% (all tags map to specialists)
|
||||
- Convention Adherence: 100% (reused existing tags)
|
||||
|
||||
### Quality Metrics
|
||||
- Domain isolation: ✅ (one task = one specialist)
|
||||
- Dependencies mapped: ✅ (Database → Backend → Frontend pattern)
|
||||
- Documentation task: ✅ (user-facing feature)
|
||||
- Testing task: ✅ (created)
|
||||
- No circular dependencies: ✅
|
||||
- Templates applied: ✅
|
||||
|
||||
**Result:** Excellent execution - target quality (95%+) achieved!
|
||||
```
|
||||
|
||||
### Output (Issues Detected)
|
||||
```markdown
|
||||
## ⚠️ QA Review: Planning Specialist - Issues Found
|
||||
|
||||
**Workflow Adherence:** 8/8 steps (100%)
|
||||
|
||||
### Specialized Analysis
|
||||
|
||||
**📊 Execution Graph Quality: 73%**
|
||||
- Dependency Accuracy: 67% (1 of 3 dependencies incorrect)
|
||||
- Parallel Completeness: 67% (1 opportunity missed)
|
||||
- Format Clarity: 85% (some ambiguous notation)
|
||||
|
||||
**🏷️ Tag Quality: 92%**
|
||||
- Tag Coverage: 100% (all tasks have tags)
|
||||
- Agent Mapping Coverage: 100% (all tags map to specialists)
|
||||
- Convention Adherence: 75% (1 new tag without agent-mapping check)
|
||||
|
||||
### Issues Detected (4)
|
||||
|
||||
**🚨 ALERT: Incorrect Dependency**
|
||||
- Task: "Implement backend API"
|
||||
- Expected blocked by: ["Create database schema", "Design API endpoints"]
|
||||
- Found in graph: ["Design API endpoints"]
|
||||
- **Missing:** "Create database schema"
|
||||
- **Impact:** Task might start before database ready
|
||||
|
||||
**🚨 ALERT: Cross-Domain Task**
|
||||
- Task: "Build authentication UI"
|
||||
- Domains detected: frontend + backend
|
||||
- Evidence: Description mentions "UI components AND API integration"
|
||||
- **Fix:** Split into "Build authentication UI" (frontend) + "Integrate auth API" (backend)
|
||||
|
||||
**⚠️ WARN: Missed Parallel Opportunity**
|
||||
- Task: "Create database schema"
|
||||
- Status: Independent (no blockers)
|
||||
- **Issue:** Not in Batch 1, placed in Batch 2
|
||||
- **Fix:** Move to Batch 1 (can start immediately)
|
||||
|
||||
**⚠️ WARN: Tag Convention Violation**
|
||||
- Task: "Build authentication UI"
|
||||
- Tag: "auth-ui" (new tag)
|
||||
- **Issue:** Tag not in project conventions, not checked in agent-mapping.yaml
|
||||
- **Suggestion:** Use existing tag "frontend" or add "auth-ui" to agent-mapping.yaml
|
||||
|
||||
### ✅ Successes
|
||||
- 8/8 workflow steps completed
|
||||
- Documentation task included
|
||||
- Testing task included
|
||||
- Task descriptions populated (200-600 chars)
|
||||
|
||||
### 📋 Added to TodoWrite
|
||||
- [ ] Fix dependency: "Implement backend API" should depend on "Create database schema"
|
||||
- [ ] Split cross-domain task: "Build authentication UI"
|
||||
- [ ] Move "Create database schema" to Batch 1
|
||||
- [ ] Review tag "auth-ui" - add to agent-mapping.yaml or use "frontend"
|
||||
|
||||
### 🎯 Recommendations
|
||||
1. Update dependencies (CRITICAL - affects execution order)
|
||||
2. Split cross-domain task (CRITICAL - routing will fail)
|
||||
3. Optimize parallel execution (move task to Batch 1)
|
||||
4. Update planning-specialist.md Step 3 to enforce domain isolation check
|
||||
```
|
||||
|
||||
## Example 5: Status Progression Skill Validation
|
||||
|
||||
### Scenario
|
||||
Orchestrator attempts to mark task complete
|
||||
|
||||
### Pre-Execution (Routing Violation Detected)
|
||||
```javascript
|
||||
orchestration-qa(
|
||||
phase="pre",
|
||||
entityType="manage_container", // Direct tool call detected
|
||||
userInput="Mark task T1 complete"
|
||||
)
|
||||
```
|
||||
|
||||
### Output (Immediate Alert)
|
||||
```markdown
|
||||
## 🚨 CRITICAL ROUTING VIOLATION
|
||||
|
||||
**Severity:** CRITICAL
|
||||
|
||||
**Workflow Type:** status-change
|
||||
|
||||
**Violation:** Status change bypassed mandatory Status Progression Skill
|
||||
|
||||
**Expected Approach:** Use Status Progression Skill
|
||||
|
||||
**Actual Approach:** Direct manage_container(operation="setStatus") call
|
||||
|
||||
**Impact:** Prerequisite validation may be skipped:
|
||||
- Summary length validation (300-500 chars required)
|
||||
- Dependency completion check
|
||||
- Files Changed section verification (for code tasks)
|
||||
- Test validation
|
||||
|
||||
**Recommendation:** Use Status Progression Skill for ALL status changes
|
||||
|
||||
---
|
||||
|
||||
**Action:** STOP - Do not proceed with direct status change
|
||||
|
||||
**Correct Approach:**
|
||||
~~~javascript
|
||||
Use status-progression skill with task ID
|
||||
~~~
|
||||
|
||||
**Why This Matters:**
|
||||
Status Progression Skill validates prerequisites before allowing status change.
|
||||
Bypassing this validation can result in incomplete tasks being marked complete.
|
||||
```
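
A detection rule for this violation might look like the sketch below. The tool name and operation come from the example above; the call-record shape (`tool`, `params`, `viaSkill`) and the `detectStatusBypass` helper are hypothetical and only illustrate the idea.

```javascript
// Hypothetical tool-call record: { tool, params, viaSkill }
// Flags a direct status change that did not go through the
// Status Progression Skill. Returns null when no violation is found.
function detectStatusBypass(call) {
  const isDirectStatusChange =
    call.tool === "manage_container" &&
    call.params?.operation === "setStatus";

  if (isDirectStatusChange && call.viaSkill !== "status-progression") {
    return {
      severity: "CRITICAL",
      workflowType: "status-change",
      violation: "Status change bypassed mandatory Status Progression Skill",
    };
  }
  return null;
}

const violation = detectStatusBypass({
  tool: "manage_container",
  params: { operation: "setStatus" },
});
```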
|
||||
|
||||
## Example 6: Implementation Specialist Review
|
||||
|
||||
### Scenario
|
||||
Backend Engineer completes task
|
||||
|
||||
### Post-Execution
|
||||
```javascript
|
||||
orchestration-qa(
|
||||
phase="post",
|
||||
entityType="backend-engineer",
|
||||
entityOutput="[Backend Engineer's response]",
|
||||
entityId="task-uuid-456"
|
||||
)
|
||||
```
|
||||
|
||||
### Output (Success)
|
||||
```markdown
|
||||
## ✅ QA Review: Backend Engineer
|
||||
|
||||
**Workflow Adherence:** 9/9 steps (100%)
|
||||
|
||||
### Lifecycle Validation
|
||||
- ✅ Read task with sections
|
||||
- ✅ Read dependencies
|
||||
- ✅ Completed implementation work
|
||||
- ✅ Updated task sections with results
|
||||
- ✅ Tests run and passing
|
||||
- ✅ Summary populated (387 chars)
|
||||
- ✅ Files Changed section created (ordinal 999)
|
||||
- ✅ Used Status Progression Skill to mark complete
|
||||
- ✅ Output minimal (73 tokens)
|
||||
|
||||
### Quality Checks
|
||||
- Summary length: 387 chars (expected 300-500) ✅
|
||||
- Files Changed: Present ✅
|
||||
- Tests mentioned: Yes ("All 12 tests passing") ✅
|
||||
- Status change method: Status Progression Skill ✅
|
||||
- Output brevity: 73 tokens (expected 50-100) ✅
|
||||
|
||||
**Result:** Perfect lifecycle execution!
|
||||
```
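
The quality checks listed above can be expressed as a small validation function. The numeric ranges (300-500 characters, 50-100 tokens) come from the example; the `checkCompletion` helper and the field names on its input are hypothetical, and the output token count is assumed to be measured elsewhere.

```javascript
// Hypothetical result record:
// { summary, hasFilesChanged, usedStatusProgressionSkill, outputTokens }
// Returns a list of failed checks; an empty list means all checks passed.
function checkCompletion(result) {
  const failures = [];
  const summaryLength = result.summary.length;

  if (summaryLength < 300 || summaryLength > 500) {
    failures.push(`Summary length ${summaryLength} chars (expected 300-500)`);
  }
  if (!result.hasFilesChanged) {
    failures.push("Files Changed section missing");
  }
  if (!result.usedStatusProgressionSkill) {
    failures.push("Status changed without Status Progression Skill");
  }
  if (result.outputTokens < 50 || result.outputTokens > 100) {
    failures.push(`Output ${result.outputTokens} tokens (expected 50-100)`);
  }
  return failures;
}

const cleanRun = checkCompletion({
  summary: "x".repeat(387),
  hasFilesChanged: true,
  usedStatusProgressionSkill: true,
  outputTokens: 73,
});
```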
|
||||
|
||||
### Output (Issues)
|
||||
```markdown
|
||||
## 🚨 QA Review: Backend Engineer - CRITICAL ISSUE
|
||||
|
||||
**Workflow Adherence:** 8/9 steps (89%)
|
||||
|
||||
### Critical Issues (1)
|
||||
|
||||
**❌ ALERT: Marked Complete Without Status Progression Skill**
|
||||
|
||||
**What Happened:**
|
||||
Backend Engineer called manage_container(operation="setStatus") directly.
|
||||
|
||||
**Expected Behavior:**
|
||||
Step 8 of specialist lifecycle requires using Status Progression Skill.
|
||||
|
||||
**Impact:**
|
||||
- Summary validation may have been skipped (no length check)
|
||||
- Files Changed section may not have been verified
|
||||
- Test validation may have been incomplete
|
||||
|
||||
**Evidence:**
|
||||
- Task status changed to "completed"
|
||||
- No mention of "Status Progression" in output
|
||||
- Direct tool call detected
|
||||
|
||||
**Recommendation:**
|
||||
All implementation specialists MUST use Status Progression Skill in Step 8.
|
||||
|
||||
**Definition Update Needed:**
|
||||
Update backend-engineer.md to emphasize CRITICAL pattern:
|
||||
~~~diff
|
||||
### Step 8: Use Status Progression Skill to Mark Complete
|
||||
|
||||
+ **CRITICAL:** NEVER call manage_container directly for status changes
|
||||
+ **ALWAYS:** Use Status Progression Skill for prerequisite validation
|
||||
~~~
|
||||
|
||||
---
|
||||
|
||||
### ⚠️ Issues (1)
|
||||
|
||||
**⚠️ WARN: Files Changed Section Missing**
|
||||
- Expected: Section with ordinal 999, title "Files Changed"
|
||||
- Found: No Files Changed section
|
||||
- **Impact:** Difficult to track what files were modified
|
||||
|
||||
### 📋 Added to TodoWrite
|
||||
- [ ] ALERT: Backend Engineer bypassed Status Progression Skill
|
||||
- [ ] Add Files Changed section to task
|
||||
- [ ] Update backend-engineer.md Step 8 critical pattern
|
||||
|
||||
### 💭 Decision Required
|
||||
|
||||
**Issue:** Critical workflow pattern violated (Status Progression bypass)
|
||||
|
||||
**Options:**
|
||||
1. **Validate task manually**
|
||||
- Check summary length (300-500 chars)
|
||||
- Verify Files Changed section exists or create it
|
||||
- Confirm tests passing
|
||||
|
||||
2. **Revert and retry with Status Progression Skill**
|
||||
- Revert task to "in-progress"
|
||||
- Use Status Progression Skill for completion
|
||||
- Ensures all prerequisites validated
|
||||
|
||||
**Recommendation:** Option 1 for this instance, but update backend-engineer.md
|
||||
to prevent recurrence.
|
||||
```
|
||||
|
||||
## Example 7: Session Summary with Patterns
|
||||
|
||||
### Scenario
|
||||
End of session after multiple workflows
|
||||
|
||||
### Usage
|
||||
```javascript
|
||||
orchestration-qa(phase="summary", sessionId="session-123")
|
||||
```
|
||||
|
||||
### Output
|
||||
```markdown
|
||||
## 📊 Session QA Summary
|
||||
|
||||
**Workflows Analyzed:** 6
|
||||
- Skills: 2 (Feature Orchestration, Status Progression)
|
||||
- Subagents: 4 (Feature Architect, Planning Specialist, 2x Backend Engineer)
|
||||
|
||||
**Quality Overview:**
|
||||
- ✅ Successful: 4 (no issues)
|
||||
- ⚠️ Issues: 1 (Planning Specialist - graph quality 73%)
|
||||
- 🚨 Critical: 1 (Backend Engineer - status bypass)
|
||||
|
||||
### Deviation Breakdown
|
||||
- Routing violations: 1 (status change bypass)
|
||||
- Workflow deviations: 0
|
||||
- Output quality: 0
|
||||
- Dependency errors: 2 (in Planning Specialist)
|
||||
- Tag issues: 1 (convention violation)
|
||||
- Token waste: 0
|
||||
|
||||
### Recurring Patterns (1)
|
||||
|
||||
**🔁 Pattern: Status Change Bypasses**
|
||||
- Occurrences: 2 (Backend Engineer x2)
|
||||
- Root cause: Step 8 critical pattern not emphasized enough
|
||||
- Impact: Prerequisite validation skipped
|
||||
- **Suggestion**: Update backend-engineer.md Step 8 with CRITICAL emphasis
|
||||
|
||||
### Improvement Recommendations (2)
|
||||
|
||||
**Priority 1: Backend Engineer Definition Update**
|
||||
- File: backend-engineer.md
|
||||
- Section: Step 8
|
||||
- Type: Critical Pattern Emphasis
|
||||
- Change: Add CRITICAL warning against direct status changes
|
||||
- Impact: Prevents status bypass in future executions
|
||||
- Effort: Low (text addition)
|
||||
|
||||
**Priority 2: Planning Specialist Validation Checklist**
|
||||
- File: planning-specialist.md
|
||||
- Section: Step 5 (Map Dependencies)
|
||||
- Type: Validation Checklist
|
||||
- Change: Add graph quality validation before returning
|
||||
- Impact: Ensures execution graph accuracy >= 95%
|
||||
- Effort: Medium (add quality gate step)
|
||||
|
||||
### Quality Trends
|
||||
- Graph quality: 87% average (baseline 70%, target 95%+)
|
||||
- Tag coverage: 98% average (baseline 90%, target 100%)
|
||||
- Token efficiency: 91% average
|
||||
- Workflow adherence: 94% average
|
||||
|
||||
### Next Steps
|
||||
1. Update backend-engineer.md Step 8 (CRITICAL pattern emphasis)
|
||||
2. Update planning-specialist.md Step 5 (graph validation checklist)
|
||||
3. Monitor for recurrence in next session
|
||||
```
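
Recurring-pattern detection across a session can be sketched as a count with a threshold. The threshold of two occurrences is an assumption based on the example above, and the deviation shape (`pattern`, `entity`) and the `findRecurringPatterns` helper are hypothetical.

```javascript
// Hypothetical deviation record: { pattern, entity }
// A pattern counts as recurring once it appears `minOccurrences` times
// in the session (default of 2 is an assumption for this sketch).
function findRecurringPatterns(deviations, minOccurrences = 2) {
  const counts = new Map();
  for (const deviation of deviations) {
    counts.set(deviation.pattern, (counts.get(deviation.pattern) ?? 0) + 1);
  }
  return [...counts.entries()]
    .filter(([, occurrences]) => occurrences >= minOccurrences)
    .map(([pattern, occurrences]) => ({ pattern, occurrences }));
}

const recurring = findRecurringPatterns([
  { pattern: "status-bypass", entity: "backend-engineer" },
  { pattern: "tag-convention", entity: "planning-specialist" },
  { pattern: "status-bypass", entity: "backend-engineer" },
]);
```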
|
||||
|
||||
## Example 8: Mid-Session Reconfiguration
|
||||
|
||||
### Scenario
|
||||
User wants to enable additional analysis categories after session has started
|
||||
|
||||
### User Request
|
||||
"I want to also track token optimization now to see if we're wasting tokens"
|
||||
|
||||
### Usage
|
||||
```javascript
|
||||
orchestration-qa(phase="configure")
|
||||
```
|
||||
|
||||
### Interactive Prompts
|
||||
Agent presents options with current configuration pre-selected:
|
||||
|
||||
```markdown
|
||||
## Quality Analysis Configuration
|
||||
|
||||
**Current Configuration:**
|
||||
- ✅ Information Density
|
||||
- ✅ Routing Validation
|
||||
|
||||
**Which quality analysis categories would you like to enable?**
|
||||
(Your current selections are pre-checked)
|
||||
|
||||
☑ Information Density ← Currently enabled
|
||||
☐ Execution Graphs
|
||||
☐ Tag Coverage
|
||||
☑ Token Optimization ← User adds this
|
||||
☐ Tool Selection
|
||||
☑ Routing Validation ← Currently enabled
|
||||
☐ Parallel Detection
|
||||
```
|
||||
|
||||
### User Updates
|
||||
```javascript
|
||||
// User selects:
|
||||
{
|
||||
categories: ["Information Density", "Routing Validation", "Token Optimization"],
|
||||
reportStyle: "Standard" // Keep existing
|
||||
}
|
||||
```
|
||||
|
||||
### Output
|
||||
```markdown
|
||||
✅ **QA Configuration Updated**
|
||||
|
||||
**Changes:**
|
||||
- ➕ Added: Token Optimization
|
||||
|
||||
**New Configuration (3 categories):**
|
||||
- ✅ Information Density
|
||||
- ✅ Routing Validation
|
||||
- ✅ Token Optimization (NEW)
|
||||
|
||||
**Report Style:** Standard (unchanged)
|
||||
|
||||
**Token Impact:**
|
||||
- Previous: ~1,200 tokens per analysis
|
||||
- Updated: ~1,800 tokens per analysis (+600 tokens)
|
||||
- Trade-off: Now detecting token waste patterns
|
||||
|
||||
All future QA reviews will include token optimization analysis.
|
||||
```
|
||||
|
||||
### Next Post-Execution
|
||||
```javascript
|
||||
// After Backend Engineer completes task
|
||||
orchestration-qa(
|
||||
phase="post",
|
||||
entityType="backend-engineer",
|
||||
entityOutput="...",
|
||||
entityId="task-uuid"
|
||||
)
|
||||
|
||||
// Now loads: post-execution.md + routing-validation.md +
|
||||
// task-content-quality.md + token-optimization.md
|
||||
// Analysis includes: content quality + routing + token waste detection
|
||||
```
|
||||
|
||||
## Usage Patterns Summary
|
||||
|
||||
### Session Start (First Time)
|
||||
1. `phase="configure"` - Interactive category selection (~200-300 tokens)
|
||||
2. `phase="init"` - Load knowledge bases (~1000 tokens)
|
||||
|
||||
### Per Entity
|
||||
- `phase="pre"` - Before launching any Skill or Subagent (~600 tokens)
|
||||
- `phase="post"` - After any Skill or Subagent completes (varies by config)
|
||||
|
||||
### Optional
|
||||
- `phase="configure"` - Reconfigure mid-session
|
||||
- `phase="summary"` - End-of-session pattern tracking (~800 tokens)
|
||||
|
||||
### Configuration-Driven Token Costs
|
||||
|
||||
**Post-Execution Costs by Configuration:**
|
||||
|
||||
| Configuration | Token Cost | Use Case |
|
||||
|--------------|------------|----------|
|
||||
| **Minimal** (Routing only) | ~1,000 tokens | Critical bypass detection only |
|
||||
| **Default** (Info Density + Routing) | ~1,200 tokens | Most users - content + critical checks |
|
||||
| **Planning Focus** (Graphs + Tags + Routing) | ~2,000 tokens | Planning Specialist reviews |
|
||||
| **Comprehensive** (All enabled) | ~3,500 tokens | Full quality analysis |
|
||||
|
||||
**Session Cost Examples:**
|
||||
|
||||
| Workflow | Config | Total Cost | vs Monolithic |
|
||||
|----------|--------|------------|---------------|
|
||||
| 1 Feature + 3 Tasks | Default | ~6k tokens | 70% savings |
|
||||
| 1 Feature + 3 Tasks | Minimal | ~4.5k tokens | 78% savings |
|
||||
| 1 Feature + 3 Tasks | Comprehensive | ~15k tokens | 25% savings |
|
||||
|
||||
**Monolithic Trainer**: 20k-30k tokens always loaded (no configuration)
|
||||
|
||||
**Smart Defaults**: Information Density + Routing Validation achieves 93% token reduction while catching critical issues
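
The "vs Monolithic" column in the session cost table appears to be computed against the lower bound of the monolithic range (20k tokens). That baseline is an inference from the table values, not something the document states, and `savingsPercent` is a hypothetical helper used to show the arithmetic.

```javascript
// Assumption: savings are measured against the 20k lower bound
// of the "20k-30k tokens" monolithic figure.
const MONOLITHIC_BASELINE_TOKENS = 20000;

// Percentage saved relative to the baseline, rounded to one decimal place.
function savingsPercent(sessionTokens, baseline = MONOLITHIC_BASELINE_TOKENS) {
  return Math.round(((baseline - sessionTokens) * 1000) / baseline) / 10;
}

const defaultSavings = savingsPercent(6000);        // 70
const minimalSavings = savingsPercent(4500);        // 77.5, shown as 78% in the table
const comprehensiveSavings = savingsPercent(15000); // 25
```

Against the 30k upper bound the same sessions would show larger savings, so the table's figures read as conservative estimates.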
|
||||
140
skills/orchestration-qa/graph-quality.md
Normal file
@@ -0,0 +1,140 @@
|
||||
# Execution Graph Quality Analysis
|
||||
|
||||
**Purpose**: Validate that the Planning Specialist's execution graph matches the actual database dependencies and identifies all parallel opportunities.
|
||||
|
||||
**When**: After Planning Specialist completes task breakdown
|
||||
|
||||
**Entity**: Planning Specialist only
|
||||
|
||||
**Token Cost**: ~600-900 tokens
|
||||
|
||||
## Quality Metrics
|
||||
|
||||
This analysis measures three aspects of execution graph quality:
|
||||
|
||||
1. **Dependency Accuracy** (70% baseline): Do claimed dependencies match database?
|
||||
2. **Parallel Completeness** (70% baseline): Are all parallel opportunities identified?
|
||||
3. **Format Clarity** (95% baseline): Is graph notation clear and unambiguous?
|
||||
|
||||
**Target**: 95%+ overall quality score
|
||||
|
||||
## Analysis Workflow
|
||||
|
||||
### Step 1: Query Actual Dependencies
|
||||
|
||||
```javascript
|
||||
// Get all tasks for feature
|
||||
tasks = query_container(operation="overview", containerType="feature", id=featureId).tasks
|
||||
|
||||
// Query dependencies for each task
|
||||
actualDependencies = {}
|
||||
for (const task of tasks) {
|
||||
deps = query_dependencies(taskId=task.id, includeTaskInfo=true)
|
||||
actualDependencies[task.id] = {
|
||||
title: task.title,
|
||||
blockedBy: deps.incoming,
|
||||
blocks: deps.outgoing
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Step 2: Extract Planning Specialist's Graph
|
||||
|
||||
Parse the output to extract claimed execution structure:
|
||||
|
||||
```javascript
|
||||
planningGraph = extractExecutionGraph(planningOutput)
|
||||
// Should contain: batches, dependencies, parallel claims
|
||||
```
|
||||
|
||||
### Step 3: Verify Dependency Accuracy
|
||||
|
||||
Compare claimed vs actual dependencies:
|
||||
|
||||
```javascript
|
||||
for (const task of tasks) {
|
||||
graphBlockers = planningGraph.dependencies[task.title] || []
|
||||
actualBlockers = actualDependencies[task.id].blockedBy.map(t => t.title)
|
||||
|
||||
if (!arraysEqual(graphBlockers, actualBlockers)) {
|
||||
issues.push({
|
||||
task: task.title,
|
||||
expected: actualBlockers,
|
||||
found: graphBlockers,
|
||||
severity: "ALERT"
|
||||
})
|
||||
}
|
||||
}
|
||||
```
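
The comparison in Step 3 relies on an `arraysEqual` helper that is not defined in this document. One possible reading, sketched below, is an order-insensitive comparison of blocker titles, since the order in which blockers are listed should not matter. `sameBlockers` and `diffBlockers` are hypothetical names introduced for this example.

```javascript
// Order-insensitive equality of two lists of blocker titles.
function sameBlockers(expected, found) {
  const expectedSet = new Set(expected);
  const foundSet = new Set(found);
  if (expectedSet.size !== foundSet.size) return false;
  for (const title of expectedSet) {
    if (!foundSet.has(title)) return false;
  }
  return true;
}

// Reports which blockers the graph omitted and which it invented.
function diffBlockers(expected, found) {
  return {
    missing: expected.filter((title) => !found.includes(title)),
    extra: found.filter((title) => !expected.includes(title)),
  };
}

const actualBlockers = ["Create database schema", "Design API endpoints"];
const graphBlockers = ["Design API endpoints"];
const blockerDiff = diffBlockers(actualBlockers, graphBlockers);
// blockerDiff.missing is ["Create database schema"], blockerDiff.extra is []
```

Reporting the `missing` and `extra` lists separately makes the ALERT actionable: a missing blocker risks starting a task too early, while an extra one only delays it.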
|
||||
|
||||
### Step 4: Verify Parallel Completeness
|
||||
|
||||
Check that all parallel opportunities were identified:
|
||||
|
||||
```javascript
|
||||
// Independent tasks (no blockers) should all be in Batch 1
|
||||
independentTasks = tasks.filter(t => actualDependencies[t.id].blockedBy.length == 0)
|
||||
|
||||
for (const task of independentTasks) {
|
||||
if (!isInBatch(task, 1, planningGraph)) {
|
||||
issues.push({
|
||||
task: task.title,
|
||||
issue: "Independent task not in Batch 1",
|
||||
severity: "WARN"
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
// Tasks in same batch should have no dependencies between them
|
||||
for (const batch of planningGraph.batches) {
|
||||
for (const [taskA, taskB] of batch.pairs()) {
|
||||
if (actualDependencies[taskA.id].blocks.includes(taskB.id)) {
|
||||
issues.push({
|
||||
issue: `${taskA.title} blocks ${taskB.title} but both in same batch`,
|
||||
severity: "ALERT"
|
||||
})
|
||||
}
|
||||
}
|
||||
}
|
||||
```
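
Both Step 4 checks can be run against plain in-memory data, as in the sketch below. The data shapes are assumptions made for illustration: `deps` maps a task id to its `blockedBy` and `blocks` id lists, and `batches` is an array of id arrays where index 0 is Batch 1. The pair check looks in both directions because an unordered pair could list the blocked task first.

```javascript
// deps:    { [taskId]: { blockedBy: string[], blocks: string[] } }
// batches: string[][] where batches[0] is Batch 1
function validateBatches(deps, batches) {
  const issues = [];
  const batchOf = new Map();
  batches.forEach((ids, index) => ids.forEach((id) => batchOf.set(id, index + 1)));

  // Check 1: every task without blockers should be in Batch 1.
  for (const [id, dep] of Object.entries(deps)) {
    if (dep.blockedBy.length === 0 && batchOf.get(id) !== 1) {
      issues.push({ task: id, issue: "Independent task not in Batch 1", severity: "WARN" });
    }
  }

  // Check 2: no task may block another task in the same batch.
  for (const ids of batches) {
    for (let i = 0; i < ids.length; i++) {
      for (let j = i + 1; j < ids.length; j++) {
        const [a, b] = [ids[i], ids[j]];
        if (deps[a].blocks.includes(b) || deps[b].blocks.includes(a)) {
          issues.push({ issue: `${a} and ${b} share a batch but one blocks the other`, severity: "ALERT" });
        }
      }
    }
  }
  return issues;
}

const batchIssues = validateBatches(
  {
    schema: { blockedBy: [], blocks: ["api"] },
    api: { blockedBy: ["schema"], blocks: ["ui"] },
    ui: { blockedBy: ["api"], blocks: [] },
    docs: { blockedBy: [], blocks: [] },
  },
  [["docs"], ["schema", "api"], ["ui"]]
);
```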
|
||||
|
||||
### Step 5: Calculate Quality Score
|
||||
|
||||
```javascript
|
||||
score = {
|
||||
dependencyAccuracy: (correct / total) * 100,
|
||||
parallelCompleteness: (identified / opportunities) * 100,
|
||||
formatClarity: hasGoodFormat ? 100 : 50,
|
||||
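}
|
||||
// Sibling keys are not in scope inside an object literal, so compute the mean afterwards
|
||||
score.overall = (score.dependencyAccuracy + score.parallelCompleteness + score.formatClarity) / 3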
|
||||
```
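
A runnable sketch of the scoring step follows. It assumes `overall` is the unweighted mean of the three metrics, and it treats an empty denominator (no dependencies, or no parallel opportunities) as a 100% score, since there is nothing to get wrong in that case. That zero-denominator convention, the `percent` helper, and the input object shape are assumptions rather than documented behavior.

```javascript
// Hypothetical helper: percentage with an assumed convention of 100 when there is nothing to measure.
function percent(part, whole) {
  return whole === 0 ? 100 : (part * 100) / whole;
}

function calculateQualityScore({ correct, total, identified, opportunities, hasGoodFormat }) {
  const dependencyAccuracy = percent(correct, total);
  const parallelCompleteness = percent(identified, opportunities);
  const formatClarity = hasGoodFormat ? 100 : 50;
  const overall = (dependencyAccuracy + parallelCompleteness + formatClarity) / 3;
  return { dependencyAccuracy, parallelCompleteness, formatClarity, overall };
}

// Example: 9 of 10 dependencies correct, 3 of 4 parallel opportunities identified.
const graphScore = calculateQualityScore({
  correct: 9,
  total: 10,
  identified: 3,
  opportunities: 4,
  hasGoodFormat: true,
});
console.log(graphScore);
```

With these inputs the overall score falls below the 95% target, so the full report would be shown rather than the brief summary.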
|
||||
|
||||
## Report Template
|
||||
|
||||
```markdown
|
||||
## 📊 Execution Graph Quality
|
||||
|
||||
**Overall Score**: [X]% (Baseline: 70% / Target: 95%+)
|
||||
|
||||
### Metrics
|
||||
- Dependency Accuracy: [X]%
|
||||
- Parallel Completeness: [Y]%
|
||||
- Format Clarity: [Z]%
|
||||
|
||||
### Issues ([count] total)
|
||||
🚨 **ALERT** ([count]): Critical dependency errors
|
||||
- [Task A]: Expected blocked by [B], found [C]
|
||||
|
||||
⚠️ **WARN** ([count]): Missed parallel opportunities
|
||||
- [Task D]: Independent but not in Batch 1
|
||||
|
||||
### Recommendations
|
||||
1. [Most critical fix]
|
||||
2. [Process improvement]
|
||||
```
|
||||
|
||||
## When to Report
|
||||
|
||||
- **ALWAYS** after Planning Specialist
|
||||
- **Full details** if score < 95%
|
||||
- **Brief summary** if score >= 95%
|
||||
477
skills/orchestration-qa/initialization.md
Normal file
@@ -0,0 +1,477 @@
|
||||
# Session Initialization
|
||||
|
||||
**Purpose**: Load knowledge bases for Skills, Subagents, and routing configuration to enable validation throughout the session.
|
||||
|
||||
**When**: First interaction in new session (phase="init")
|
||||
|
||||
**Token Cost**: ~800-1000 tokens (one-time per session)
|
||||
|
||||
## Initialization Workflow
|
||||
|
||||
### Step 1: Load Skills Knowledge Base
|
||||
|
||||
**Action**: Discover and parse all Skill definitions
|
||||
|
||||
```javascript
|
||||
// Glob all skill files
|
||||
skillFiles = Glob(pattern=".claude/skills/*/SKILL.md")
|
||||
|
||||
// For each skill file found:
|
||||
for (const skillFile of skillFiles) {
  // Read and parse YAML frontmatter + content
  content = Read(skillFile)

  // Extract from YAML frontmatter
  name = content.frontmatter.name
  description = content.frontmatter.description

  // Extract from content sections
  mandatoryTriggers = extractSection(content, "When to Use This Skill")
  workflows = extractSection(content, "Workflow")
  expectedOutputs = extractSection(content, "Output Format")
  toolUsage = extractSection(content, "Tools Used")
  tokenRange = extractSection(content, "Token Cost")

  // Store in knowledge base
  skills[name] = {
    file: skillFile,
    description: description,
    mandatoryTriggers: mandatoryTriggers,
    workflows: workflows,
    expectedOutputs: expectedOutputs,
    tools: toolUsage,
    tokenRange: tokenRange
  }
}
|
||||
```
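
The loading step above assumes that a skill file can be split into frontmatter and named sections. The sketch below shows one way that parsing might work for simple `key: value` frontmatter and Markdown headings. It is not a full YAML parser, it does not skip `#` lines inside fenced code blocks, and the function signatures are assumptions made for illustration.

```javascript
// Splits a Markdown document into simple "key: value" frontmatter and the remaining body.
function parseFrontmatter(text) {
  const match = text.match(/^---\r?\n([\s\S]*?)\r?\n---\r?\n?/);
  if (!match) return { frontmatter: {}, body: text };
  const frontmatter = {};
  for (const line of match[1].split(/\r?\n/)) {
    const sep = line.indexOf(":");
    if (sep === -1) continue;
    frontmatter[line.slice(0, sep).trim()] = line.slice(sep + 1).trim();
  }
  return { frontmatter, body: text.slice(match[0].length) };
}

// Returns the text under a heading, up to the next heading of the same or a higher level.
function extractSection(body, heading) {
  const lines = body.split(/\r?\n/);
  const start = lines.findIndex(
    (l) => /^#{1,6}\s/.test(l) && l.replace(/^#+\s*/, "").trim() === heading
  );
  if (start === -1) return null;
  const level = lines[start].match(/^#+/)[0].length;
  const rest = lines.slice(start + 1);
  const end = rest.findIndex((l) => {
    const m = l.match(/^(#+)\s/);
    return m !== null && m[1].length <= level;
  });
  return (end === -1 ? rest : rest.slice(0, end)).join("\n").trim();
}

// Sample skill file content (abbreviated for illustration).
const sampleSkill = [
  "---",
  "name: Orchestration QA",
  "description: Quality assurance for orchestration workflows",
  "---",
  "",
  "# Orchestration QA Skill",
  "",
  "## Overview",
  "",
  "Validates workflows.",
  "",
  "## When to Use This Skill",
  "",
  "First use in a session.",
].join("\n");

const parsedSkill = parseFrontmatter(sampleSkill);
const overviewText = extractSection(parsedSkill.body, "Overview");
console.log(parsedSkill.frontmatter.name, "/", overviewText);
```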
|
||||
|
||||
**Example Skills Loaded**:
|
||||
|
||||
```javascript
|
||||
skills = {
|
||||
"Feature Orchestration": {
|
||||
file: ".claude/skills/feature-orchestration/SKILL.md",
|
||||
mandatoryTriggers: [
|
||||
"Create a feature",
|
||||
"Complete feature",
|
||||
"Feature progress"
|
||||
],
|
||||
workflows: [
|
||||
"Smart Feature Creation",
|
||||
"Task Breakdown Coordination",
|
||||
"Feature Completion"
|
||||
],
|
||||
expectedOutputs: ["Feature ID", "Task count", "Next action"],
|
||||
tools: ["query_container", "manage_container", "query_templates", "recommend_agent"],
|
||||
tokenRange: [300, 800]
|
||||
},
|
||||
|
||||
"Task Orchestration": {
|
||||
file: ".claude/skills/task-orchestration/SKILL.md",
|
||||
mandatoryTriggers: [
|
||||
"Execute tasks",
|
||||
"What's next",
|
||||
"Launch batch",
|
||||
"What tasks are ready"
|
||||
],
|
||||
workflows: [
|
||||
"Dependency-Aware Batching",
|
||||
"Parallel Specialist Launch",
|
||||
"Progress Monitoring"
|
||||
],
|
||||
expectedOutputs: ["Batch structure", "Parallel opportunities", "Specialist recommendations"],
|
||||
tools: ["query_container", "manage_container", "query_dependencies", "recommend_agent"],
|
||||
tokenRange: [500, 900]
|
||||
},
|
||||
|
||||
"Status Progression": {
|
||||
file: ".claude/skills/status-progression/SKILL.md",
|
||||
mandatoryTriggers: [
|
||||
"Mark complete",
|
||||
"Update status",
|
||||
"Status change",
|
||||
"Move to testing"
|
||||
],
|
||||
workflows: [
|
||||
"Read Config",
|
||||
"Validate Prerequisites",
|
||||
"Interpret Errors"
|
||||
],
|
||||
expectedOutputs: ["Status updated", "Validation error with details"],
|
||||
tools: ["Read", "query_container", "query_dependencies"],
|
||||
tokenRange: [200, 400],
|
||||
critical: "MANDATORY for ALL status changes - never bypass"
|
||||
},
|
||||
|
||||
"Dependency Analysis": {
|
||||
file: ".claude/skills/dependency-analysis/SKILL.md",
|
||||
mandatoryTriggers: [
|
||||
"What's blocking",
|
||||
"Show dependencies",
|
||||
"Check blockers"
|
||||
],
|
||||
workflows: [
|
||||
"Query Dependencies",
|
||||
"Analyze Chains",
|
||||
"Report Findings"
|
||||
],
|
||||
expectedOutputs: ["Blocker list", "Dependency chains", "Unblock suggestions"],
|
||||
tools: ["query_dependencies", "query_container"],
|
||||
tokenRange: [300, 600]
|
||||
},
|
||||
|
||||
"Dependency Orchestration": {
|
||||
file: ".claude/skills/dependency-orchestration/SKILL.md",
|
||||
mandatoryTriggers: [
|
||||
"Resolve circular dependencies",
|
||||
"Optimize dependencies"
|
||||
],
|
||||
workflows: [
|
||||
"Advanced Dependency Analysis",
|
||||
"Critical Path",
|
||||
"Bottleneck Detection"
|
||||
],
|
||||
expectedOutputs: ["Dependency graph", "Critical path", "Optimization suggestions"],
|
||||
tokenRange: [400, 700]
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Step 2: Load Subagents Knowledge Base
|
||||
|
||||
**Action**: Discover and parse all Subagent definitions
|
||||
|
||||
```javascript
|
||||
// Glob all subagent files
|
||||
subagentFiles = Glob(pattern=".claude/agents/task-orchestrator/*.md")
|
||||
|
||||
// For each subagent file found:
|
||||
for (const subagentFile of subagentFiles) {
  // Read and parse content
  content = Read(subagentFile)

  // Extract from YAML frontmatter
  name = content.frontmatter.name
  description = content.frontmatter.description

  // Extract workflow steps (numbered steps in document)
  expectedSteps = extractNumberedSteps(content)

  // Extract critical patterns (CRITICAL, IMPORTANT sections)
  criticalPatterns = extractPatterns(content, markers=["CRITICAL", "IMPORTANT"])

  // Extract output expectations (fall back to "Return" if "Output Format" is absent)
  outputValidation = extractSection(content, "Output Format") || extractSection(content, "Return")

  // Store in knowledge base
  subagents[name] = {
    file: subagentFile,
    description: description,
    triggeredBy: extractTriggeredBy(content),
    expectedSteps: expectedSteps,
    criticalPatterns: criticalPatterns,
    outputValidation: outputValidation,
    tokenRange: extractTokenRange(content)
  }
}
|
||||
```
|
||||
|
||||
**Example Subagents Loaded**:
|
||||
|
||||
```javascript
|
||||
subagents = {
|
||||
"Feature Architect": {
|
||||
file: ".claude/agents/task-orchestrator/feature-architect.md",
|
||||
triggeredBy: [
|
||||
"Complex feature creation",
|
||||
"PRD provided",
|
||||
"Formal planning"
|
||||
],
|
||||
expectedSteps: [
|
||||
"Step 1: Understand Context (get_overview, list_tags)",
|
||||
"Step 2: Detect Input Type (PRD/Interactive/Quick)",
|
||||
"Step 3a/3b/3c: Process based on mode",
|
||||
"Step 4: Discover Templates",
|
||||
"Step 5: Design Tag Strategy",
|
||||
"Step 5.5: Verify Agent Mapping Coverage",
|
||||
"Step 6: Create Feature",
|
||||
"Step 7: Add Custom Sections (mode-dependent)",
|
||||
"Step 8: Return Handoff (minimal)"
|
||||
],
|
||||
criticalPatterns: [
|
||||
"description = forward-looking (what needs to be built)",
|
||||
"Do NOT populate summary field during creation",
|
||||
"Return minimal handoff (50-100 tokens)",
|
||||
"PRD mode: Extract ALL sections from document",
|
||||
"Tag strategy: Reuse existing tags (list_tags first)",
|
||||
"Check agent-mapping.yaml for new tags"
|
||||
],
|
||||
outputValidation: [
|
||||
"Feature created with description?",
|
||||
"Templates applied?",
|
||||
"Tags follow project conventions?",
|
||||
"PRD sections represented (if PRD mode)?",
|
||||
"Handoff minimal (not verbose)?"
|
||||
],
|
||||
tokenRange: [1800, 2200]
|
||||
},
|
||||
|
||||
"Planning Specialist": {
|
||||
file: ".claude/agents/task-orchestrator/planning-specialist.md",
|
||||
triggeredBy: [
|
||||
"Feature needs task breakdown",
|
||||
"Complex feature created"
|
||||
],
|
||||
expectedSteps: [
|
||||
"Step 1: Read Feature Context (includeSections=true)",
|
||||
"Step 2: Discover Task Templates",
|
||||
"Step 3: Break Down into Domain-Isolated Tasks",
|
||||
"Step 4: Create Tasks with Descriptions",
|
||||
"Step 5: Map Dependencies",
|
||||
"Step 7: Inherit and Refine Tags",
|
||||
"Step 8: Return Brief Summary"
|
||||
],
|
||||
criticalPatterns: [
|
||||
"One task = one specialist domain",
|
||||
"Task description populated (200-600 chars)",
|
||||
"Do NOT populate summary field",
|
||||
"ALWAYS create documentation task for user-facing features",
|
||||
"Create separate test task for comprehensive testing",
|
||||
"Database → Backend → Frontend dependency pattern"
|
||||
],
|
||||
outputValidation: [
|
||||
"Tasks created with descriptions?",
|
||||
"Domain isolation preserved?",
|
||||
"Dependencies mapped correctly?",
|
||||
"Documentation task included (if user-facing)?",
|
||||
"Testing task included (if needed)?",
|
||||
"No circular dependencies?",
|
||||
"Templates applied to tasks?"
|
||||
],
|
||||
tokenRange: [1800, 2200]
|
||||
},
|
||||
|
||||
"Backend Engineer": {
|
||||
file: ".claude/agents/task-orchestrator/backend-engineer.md",
|
||||
triggeredBy: ["Backend implementation task"],
|
||||
expectedSteps: [
|
||||
"Step 1: Read task (includeSections=true)",
|
||||
"Step 2: Read dependencies (if any)",
|
||||
"Step 3: Do work (code, tests)",
|
||||
"Step 4: Update task sections",
|
||||
"Step 5: Run tests and validate",
|
||||
"Step 6: Populate summary (300-500 chars)",
|
||||
"Step 7: Create Files Changed section",
|
||||
"Step 8: Use Status Progression Skill to mark complete",
|
||||
"Step 9: Return minimal output"
|
||||
],
|
||||
criticalPatterns: [
|
||||
"ALL tests must pass before completion",
|
||||
"Summary REQUIRED (300-500 chars)",
|
||||
"Files Changed section REQUIRED (ordinal 999)",
|
||||
"Use Status Progression Skill to mark complete",
|
||||
"Return minimal output (50-100 tokens)",
|
||||
"If BLOCKED: Report with details, don't mark complete"
|
||||
],
|
||||
outputValidation: [
|
||||
"Task marked complete?",
|
||||
"Summary populated (300-500 chars)?",
|
||||
"Files Changed section created?",
|
||||
"Tests mentioned in summary?",
|
||||
"Used Status Progression Skill for completion?",
|
||||
"Output minimal (not verbose)?",
|
||||
"If blocked: Clear reason + attempted fixes?"
|
||||
],
|
||||
tokenRange: [1800, 2200]
|
||||
}
|
||||
|
||||
// Similar structures for:
|
||||
// - Frontend Developer
|
||||
// - Database Engineer
|
||||
// - Test Engineer
|
||||
// - Technical Writer
|
||||
// - Bug Triage Specialist
|
||||
}
|
||||
```
|
||||
|
||||
### Step 3: Load Routing Configuration
|
||||
|
||||
**Action**: Read agent-mapping.yaml for tag-based routing
|
||||
|
||||
```javascript
|
||||
// Read routing config
|
||||
configPath = getProjectRoot().resolve(".taskorchestrator/agent-mapping.yaml")
|
||||
configContent = Read(configPath)
|
||||
|
||||
// Parse YAML
|
||||
agentMapping = parseYAML(configContent)
|
||||
|
||||
// Store tag mappings
|
||||
routing = {
|
||||
tagMappings: agentMapping.tagMappings,
|
||||
// Example:
|
||||
// "backend" → ["Backend Engineer"]
|
||||
// "frontend" → ["Frontend Developer"]
|
||||
// "database" → ["Database Engineer"]
|
||||
// "testing" → ["Test Engineer"]
|
||||
// "documentation" → ["Technical Writer"]
|
||||
}
|
||||
```
|
||||
|
||||
**Example Routing Configuration**:
|
||||
|
||||
```javascript
|
||||
routing = {
|
||||
tagMappings: {
|
||||
"backend": ["Backend Engineer"],
|
||||
"frontend": ["Frontend Developer"],
|
||||
"database": ["Database Engineer"],
|
||||
"testing": ["Test Engineer"],
|
||||
"documentation": ["Technical Writer"],
|
||||
"bug": ["Bug Triage Specialist"],
|
||||
"architecture": ["Feature Architect"],
|
||||
"planning": ["Planning Specialist"],
|
||||
"api": ["Backend Engineer"],
|
||||
"ui": ["Frontend Developer"],
|
||||
"schema": ["Database Engineer"],
|
||||
"migration": ["Database Engineer"]
|
||||
}
|
||||
}
|
||||
```
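
To illustrate how the tag mappings might be consulted during routing validation, the sketch below resolves a task's tags to a de-duplicated list of agents and collects tags that have no mapping. The `resolveAgents` function and its return shape are hypothetical, and the mapping shown is a subset of the example configuration above.

```javascript
// Hypothetical helper: maps task tags to agents and reports tags without a mapping.
function resolveAgents(tags, tagMappings) {
  const agents = new Set();
  const unmapped = [];
  for (const tag of tags) {
    // Own-property check avoids matching inherited keys such as "constructor".
    if (!Object.prototype.hasOwnProperty.call(tagMappings, tag)) {
      unmapped.push(tag);
      continue;
    }
    for (const agent of tagMappings[tag]) agents.add(agent);
  }
  return { agents: [...agents], unmapped };
}

// Subset of the example routing configuration.
const sampleTagMappings = {
  backend: ["Backend Engineer"],
  api: ["Backend Engineer"],
  database: ["Database Engineer"],
  schema: ["Database Engineer"],
};

// "billing" has no mapping, so it would surface as a tag-coverage finding.
const resolved = resolveAgents(["api", "backend", "schema", "billing"], sampleTagMappings);
console.log(resolved);
```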
|
||||
|
||||
### Step 4: Initialize Tracking State
|
||||
|
||||
**Action**: Set up session-level tracking for deviations and patterns
|
||||
|
||||
```javascript
|
||||
trainingState = {
|
||||
session: {
|
||||
startTime: now(),
|
||||
knowledgeBaseLoaded: true,
|
||||
skillsCount: Object.keys(skills).length,
|
||||
subagentsCount: Object.keys(subagents).length
|
||||
},
|
||||
|
||||
tracking: {
|
||||
// Store original user inputs by workflow ID
|
||||
originalInputs: {},
|
||||
|
||||
// Validation checkpoints by workflow ID
|
||||
checkpoints: [],
|
||||
|
||||
// Categorized deviations
|
||||
deviations: {
|
||||
orchestrator: [], // Routing violations (Skills bypassed)
|
||||
skills: [], // Skill workflow issues
|
||||
subagents: [] // Subagent workflow issues
|
||||
},
|
||||
|
||||
// Improvement suggestions
|
||||
improvements: []
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Step 5: Report Initialization Status
|
||||
|
||||
**Output**: Confirmation that QA system is ready
|
||||
|
||||
```markdown
|
||||
✅ **Orchestration QA Initialized**
|
||||
|
||||
**Knowledge Base Loaded:**
|
||||
- Skills: 5 (feature-orchestration, task-orchestration, status-progression, dependency-analysis, dependency-orchestration)
|
||||
- Subagents: 8 (feature-architect, planning-specialist, backend-engineer, frontend-developer, database-engineer, test-engineer, technical-writer, bug-triage-specialist)
|
||||
- Routing: agent-mapping.yaml loaded (12 tag mappings)
|
||||
|
||||
**Quality Assurance Active:**
|
||||
- ✅ Pre-execution validation
|
||||
- ✅ Post-execution review
|
||||
- ✅ Routing validation (Skills vs Direct)
|
||||
- ✅ Pattern tracking (continuous improvement)
|
||||
|
||||
**Session Tracking:**
|
||||
- Deviations: 0 ALERT, 0 WARN, 0 INFO
|
||||
- Patterns: 0 recurring issues
|
||||
- Improvements: 0 suggestions
|
||||
|
||||
Ready to monitor orchestration quality.
|
||||
```
|
||||
|
||||
## Error Handling
|
||||
|
||||
### Skills Directory Not Found
|
||||
|
||||
```javascript
|
||||
if (!exists(".claude/skills/")) {
|
||||
return {
|
||||
error: "Skills directory not found",
|
||||
suggestion: "Run setup_claude_orchestration to install Skills and Subagents",
|
||||
fallback: "QA will operate with limited validation (no Skills knowledge)"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Subagents Directory Not Found
|
||||
|
||||
```javascript
|
||||
if (!exists(".claude/agents/task-orchestrator/")) {
|
||||
return {
|
||||
error: "Subagents directory not found",
|
||||
suggestion: "Run setup_claude_orchestration to install Subagents",
|
||||
fallback: "QA will operate with limited validation (no Subagents knowledge)"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Agent Mapping Not Found
|
||||
|
||||
```javascript
|
||||
if (!exists(".taskorchestrator/agent-mapping.yaml")) {
|
||||
return {
|
||||
warning: "agent-mapping.yaml not found",
|
||||
suggestion: "Routing validation will use default patterns",
|
||||
fallback: "QA will operate without tag-based routing validation"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Caching Strategy
|
||||
|
||||
**Knowledge bases are expensive to load** (~800-1000 tokens). Cache them for the session:
|
||||
|
||||
```javascript
|
||||
// Load once per session
|
||||
if (!session.knowledgeBaseLoaded) {
|
||||
loadSkillsKnowledgeBase()
|
||||
loadSubagentsKnowledgeBase()
|
||||
loadRoutingConfiguration()
|
||||
session.knowledgeBaseLoaded = true
|
||||
}
|
||||
|
||||
// Reuse throughout session
|
||||
skill = skills["Feature Orchestration"]
|
||||
subagent = subagents["Planning Specialist"]
|
||||
backendAgents = routing.tagMappings["backend"]
|
||||
```
|
||||
|
||||
**When to reload**:
|
||||
- New session starts
|
||||
- User explicitly requests: "Reload QA knowledge base"
|
||||
- Skills/Subagents modified during session (rare)
|
||||
|
||||
## Usage Example
|
||||
|
||||
```javascript
|
||||
// At session start
|
||||
orchestration-qa(phase="init")
|
||||
|
||||
// Returns:
|
||||
{
|
||||
initialized: true,
|
||||
skillsCount: 5,
|
||||
subagentsCount: 8,
|
||||
routingLoaded: true,
|
||||
message: "✅ Orchestration QA Initialized - Ready to monitor quality"
|
||||
}
|
||||
|
||||
// Knowledge base now available for all subsequent validations
|
||||
```
|
||||
184
skills/orchestration-qa/parallel-detection.md
Normal file
@@ -0,0 +1,184 @@
|
||||
# Parallel Opportunity Detection
|
||||
|
||||
**Purpose**: Identify missed parallelization opportunities in task execution.
|
||||
|
||||
**When**: Optional post-execution (controlled by enableEfficiencyAnalysis parameter)
|
||||
|
||||
**Applies To**: Task Orchestration Skill, Planning Specialist
|
||||
|
||||
**Token Cost**: ~400-600 tokens
|
||||
|
||||
## Parallel Opportunity Types
|
||||
|
||||
### Type 1: Independent Tasks Not Batched
|
||||
|
||||
**Opportunity**: Tasks with no dependencies can run simultaneously
|
||||
|
||||
**Detection**:
|
||||
```javascript
|
||||
independentTasks = tasks.filter(t =>
|
||||
query_dependencies(taskId=t.id).incoming.length == 0 &&
|
||||
t.status == "pending"
|
||||
)
|
||||
|
||||
if (independentTasks.length >= 2 && !launchedInParallel) {
|
||||
return {
|
||||
type: "Independent tasks not batched",
|
||||
tasks: independentTasks.map(t => t.title),
|
||||
opportunity: `${independentTasks.length} tasks can run simultaneously`,
|
||||
impact: "Sequential execution when parallel possible",
|
||||
recommendation: "Use Task Orchestration Skill to batch parallel tasks"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Type 2: Tasks with Same Dependencies Not Grouped
|
||||
|
||||
**Opportunity**: Tasks blocked by the same tasks can run in parallel after blockers complete
|
||||
|
||||
**Detection**:
|
||||
```javascript
|
||||
// Group tasks by their blockers
|
||||
tasksByBlockers = groupByBlockers(tasks)
|
||||
|
||||
for (const [blockerKey, taskGroup] of Object.entries(tasksByBlockers)) {
|
||||
if (taskGroup.length >= 2 && !inSameBatch(taskGroup)) {
|
||||
return {
|
||||
type: "Tasks with same dependencies not grouped",
|
||||
tasks: taskGroup.map(t => t.title),
|
||||
sharedBlockers: parseBlockers(blockerKey),
|
||||
opportunity: `${taskGroup.length} tasks can run in parallel after blockers complete`,
|
||||
recommendation: "Batch these tasks together"
|
||||
}
|
||||
}
|
||||
}
|
||||
```
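
The detection above depends on `groupByBlockers` and `parseBlockers`, which are not defined in this document. One possible sketch builds a canonical key from the sorted blocker IDs so that tasks sharing the same blocker set land in the same group. The `blockedBy` field holding an array of IDs, and the assumption that IDs never contain the `|` separator, are illustrative choices.

```javascript
// Hypothetical helper: groups tasks by their exact set of blocker IDs.
function groupByBlockers(tasks) {
  const groups = {};
  for (const task of tasks) {
    if (task.blockedBy.length === 0) continue; // independent tasks are covered by Type 1
    const key = [...task.blockedBy].sort().join("|");
    if (!groups[key]) groups[key] = [];
    groups[key].push(task);
  }
  return groups;
}

// Inverse of the key construction above.
function parseBlockers(blockerKey) {
  return blockerKey.split("|");
}

// Sample data: two tasks share the same blockers, listed in a different order.
const sampleGroupTasks = [
  { id: "t1", title: "Create schema", blockedBy: [] },
  { id: "t2", title: "Seed data", blockedBy: [] },
  { id: "t3", title: "Build API", blockedBy: ["t1", "t2"] },
  { id: "t4", title: "Build reports", blockedBy: ["t2", "t1"] },
  { id: "t5", title: "Build UI", blockedBy: ["t3"] },
];
const tasksByBlockers = groupByBlockers(sampleGroupTasks);
console.log(tasksByBlockers);
```

Here "Build API" and "Build reports" fall into one group, which makes them candidates for the same batch once both blockers complete.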
|
||||
|
||||
### Type 3: Sequential Specialist Launches When Parallel Possible
|
||||
|
||||
**Opportunity**: Multiple specialists launched one-by-one instead of in parallel
|
||||
|
||||
**Detection**:
|
||||
```javascript
|
||||
if (launchedSpecialists.length >= 2 && !launchedInParallel) {
|
||||
// Check if they have no dependencies between them
|
||||
noDependencies = !hasBlockingRelationships(launchedSpecialists)
|
||||
|
||||
if (noDependencies) {
|
||||
return {
|
||||
type: "Sequential specialist launches",
|
||||
specialists: launchedSpecialists,
|
||||
opportunity: "Launch specialists in parallel",
|
||||
impact: "Sequential execution increases total time",
|
||||
recommendation: "Use Task tool multiple times in single message"
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Type 4: Domain-Isolated Tasks Not Parallelized
|
||||
|
||||
**Opportunity**: Backend + Frontend + Database tasks can often run in parallel
|
||||
|
||||
**Detection**:
|
||||
```javascript
|
||||
domains = {
|
||||
database: tasks.filter(t => t.tags.includes("database")),
|
||||
backend: tasks.filter(t => t.tags.includes("backend")),
|
||||
frontend: tasks.filter(t => t.tags.includes("frontend"))
|
||||
}
|
||||
|
||||
// Check typical dependency pattern: database → backend → frontend
|
||||
// BUT: If each domain has multiple tasks, those CAN run in parallel
|
||||
|
||||
for (const [domain, domainTasks] of Object.entries(domains)) {
|
||||
if (domainTasks.length >= 2 && !parallelizedWithinDomain(domainTasks)) {
|
||||
return {
|
||||
type: "Domain tasks not parallelized",
|
||||
domain: domain,
|
||||
tasks: domainTasks.map(t => t.title),
|
||||
opportunity: `${domainTasks.length} ${domain} tasks can run in parallel`,
|
||||
recommendation: "Launch domain specialists in parallel"
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Analysis Workflow
|
||||
|
||||
```javascript
|
||||
parallelOpportunities = []
|
||||
|
||||
// Check each opportunity type
|
||||
checkIndependentTasks()
|
||||
checkSameDependencyGroups()
|
||||
checkSequentialLaunches()
|
||||
checkDomainParallelization()
|
||||
|
||||
// Calculate potential time savings
|
||||
if (parallelOpportunities.length > 0) {
|
||||
estimatedTimeSavings = calculateTimeSavings(parallelOpportunities)
|
||||
|
||||
return {
|
||||
opportunitiesFound: parallelOpportunities.length,
|
||||
opportunities: parallelOpportunities,
|
||||
estimatedSavings: estimatedTimeSavings,
|
||||
recommendation: "Use Task Orchestration Skill for parallel batching"
|
||||
}
|
||||
}
|
||||
```
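
The workflow calls `calculateTimeSavings` without defining it. As a rough sketch only, the estimate below assumes every task takes about the same time, so sequential execution costs one unit per task and parallel execution costs one unit per batch. Both the equal-duration assumption and the `estimateTimeSavings` name are hypothetical, and real savings depend on actual task durations.

```javascript
// Hypothetical estimate: percentage of time saved by running each batch in parallel,
// assuming all tasks have roughly equal duration.
function estimateTimeSavings(batches) {
  const taskCount = batches.reduce((sum, batch) => sum + batch.length, 0);
  if (taskCount === 0) return 0;
  const batchCount = batches.filter((batch) => batch.length > 0).length;
  return (1 - batchCount / taskCount) * 100;
}

// Four tasks in two batches: two time units instead of four.
const savingsPercent = estimateTimeSavings([["Task A", "Task B", "Task C"], ["Task D"]]);
console.log(`${savingsPercent}% estimated savings`);
```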
|
||||
|
||||
## Report Template
|
||||
|
||||
```markdown
|
||||
## ⚡ Parallel Opportunity Detection
|
||||
|
||||
**Opportunities Found**: [count]
|
||||
**Estimated Time Savings**: [X]% (parallel vs sequential)
|
||||
|
||||
### Opportunities
|
||||
|
||||
**ℹ️ INFO**: Independent tasks not batched
|
||||
- Tasks: [Task A, Task B, Task C]
|
||||
- Opportunity: 3 tasks can run simultaneously (no dependencies)
|
||||
- Impact: Sequential execution takes up to ~3x longer than necessary (assuming similar task durations)
|
||||
|
||||
**ℹ️ INFO**: Domain tasks not parallelized
|
||||
- Domain: backend
|
||||
- Tasks: [Task D, Task E]
|
||||
- Opportunity: 2 backend tasks can run in parallel
|
||||
|
||||
### Recommendations
|
||||
1. Use Task Orchestration Skill for dependency-aware batching
|
||||
2. Launch specialists in parallel: `Task(Backend Engineer, task1)` + `Task(Backend Engineer, task2)` in single message
|
||||
```
|
||||
|
||||
## When to Report
|
||||
|
||||
- **Only if** enableEfficiencyAnalysis=true
|
||||
- **INFO** level (optimizations, not violations)
|
||||
- Most valuable after task execution workflows
|
||||
|
||||
## Integration with Task Orchestration Skill
|
||||
|
||||
This analysis helps validate that Task Orchestration Skill is identifying all parallel opportunities:
|
||||
|
||||
```javascript
|
||||
// If Task Orchestration Skill was used
|
||||
if (usedTaskOrchestrationSkill) {
|
||||
// Check if it identified all opportunities
|
||||
identifiedOpportunities = extractBatchStructure(output)
|
||||
missedOpportunities = parallelOpportunities.filter(o =>
|
||||
!identifiedOpportunities.includes(o)
|
||||
)
|
||||
|
||||
if (missedOpportunities.length > 0) {
|
||||
return {
|
||||
severity: "WARN",
|
||||
issue: "Task Orchestration Skill missed parallel opportunities",
|
||||
missed: missedOpportunities,
|
||||
recommendation: "Update task-orchestration skill workflow"
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
370
skills/orchestration-qa/pattern-tracking.md
Normal file
@@ -0,0 +1,370 @@
|
||||
# Pattern Tracking & Continuous Improvement
|
||||
|
||||
**Purpose**: Track recurring issues and suggest systemic improvements to definitions.
|
||||
|
||||
**When**: After deviations detected, end of session
|
||||
|
||||
**Token Cost**: ~300-500 tokens
|
||||
|
||||
## Pattern Detection
|
||||
|
||||
### Recurrence Threshold
|
||||
|
||||
**Definition**: Issue is "recurring" if it happens 2+ times in session
|
||||
|
||||
**Why**: One-off issues may be anomalies; recurring issues indicate systemic problems
|
||||
|
||||
### Pattern Categories
|
||||
|
||||
1. **Routing Violations** - Skills bypassed
|
||||
2. **Workflow Deviations** - Steps skipped
|
||||
3. **Output Quality** - Verbose output, missing sections
|
||||
4. **Dependency Errors** - Incorrect graph, circular dependencies
|
||||
5. **Tag Issues** - Missing mappings, convention violations
|
||||
6. **Token Waste** - Repeated inefficiency patterns
|
||||
|
||||
## Tracking Workflow
|
||||
|
||||
### Step 1: Detect Recurrence
|
||||
|
||||
```javascript
|
||||
// Track issues across session
|
||||
session.deviations = {
|
||||
routing: [],
|
||||
workflow: [],
|
||||
output: [],
|
||||
dependency: [],
|
||||
tag: [],
|
||||
token: []
|
||||
}
|
||||
|
||||
// After each workflow, categorize deviations
|
||||
for (const deviation of currentDeviations) {
|
||||
category = categorize(deviation)
|
||||
session.deviations[category].push({
|
||||
timestamp: now(),
|
||||
entity: entityType,
|
||||
issue: deviation.issue,
|
||||
severity: deviation.severity
|
||||
})
|
||||
}
|
||||
|
||||
// Detect patterns
|
||||
patterns = []
|
||||
for (const [category, issues] of Object.entries(session.deviations)) {
|
||||
grouped = groupByIssue(issues)
|
||||
for (const [issueType, occurrences] of Object.entries(grouped)) {
|
||||
if (occurrences.length >= 2) {
|
||||
patterns.push({
|
||||
category: category,
|
||||
issue: issueType,
|
||||
count: occurrences.length,
|
||||
entities: occurrences.map(o => o.entity),
|
||||
severity: determineSeverity(occurrences)
|
||||
})
|
||||
}
|
||||
}
|
||||
}
|
||||
```
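
The recurrence check can be sketched end to end as follows. This version assumes deviations are grouped by their exact `issue` string and that a pattern's severity is the highest severity among its occurrences (ALERT over WARN over INFO). The source leaves `groupByIssue` and `determineSeverity` undefined, so both behaviors are assumptions.

```javascript
const RECURRENCE_THRESHOLD = 2; // "recurring" means 2+ occurrences in a session
const SEVERITY_RANK = { INFO: 0, WARN: 1, ALERT: 2 };

// Assumed rule: a pattern inherits the highest severity among its occurrences.
function highestSeverity(occurrences) {
  return occurrences.reduce(
    (worst, o) => (SEVERITY_RANK[o.severity] > SEVERITY_RANK[worst] ? o.severity : worst),
    "INFO"
  );
}

function detectPatterns(deviationsByCategory) {
  const patterns = [];
  for (const [category, issues] of Object.entries(deviationsByCategory)) {
    // Group occurrences by issue text.
    const grouped = new Map();
    for (const deviation of issues) {
      if (!grouped.has(deviation.issue)) grouped.set(deviation.issue, []);
      grouped.get(deviation.issue).push(deviation);
    }
    for (const [issue, occurrences] of grouped) {
      if (occurrences.length >= RECURRENCE_THRESHOLD) {
        patterns.push({
          category,
          issue,
          count: occurrences.length,
          entities: occurrences.map((o) => o.entity),
          severity: highestSeverity(occurrences),
        });
      }
    }
  }
  return patterns;
}

// Sample session: one routing issue recurs, one output issue happens once.
const sampleDeviations = {
  routing: [
    { entity: "Orchestrator", issue: "Status change bypassed Status Progression Skill", severity: "ALERT" },
    { entity: "Orchestrator", issue: "Status change bypassed Status Progression Skill", severity: "WARN" },
  ],
  output: [{ entity: "Feature Architect", issue: "Verbose handoff", severity: "WARN" }],
  workflow: [],
};
const detectedPatterns = detectPatterns(sampleDeviations);
console.log(detectedPatterns);
```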
|
||||
|
||||
### Step 2: Analyze Root Cause
|
||||
|
||||
```javascript
|
||||
for (const pattern of patterns) {
|
||||
rootCause = analyzeRootCause(pattern)
|
||||
// Returns: "Definition unclear", "Validation missing", "Template incomplete", etc.
|
||||
|
||||
pattern.rootCause = rootCause
|
||||
pattern.systemic = isSystemic(rootCause) // vs one-off orchestrator error
|
||||
}
|
||||
```
|
||||
|
||||
**Root Cause Types**:
|
||||
|
||||
- **Definition Unclear**: Instructions ambiguous or missing
|
||||
- **Validation Missing**: No checkpoint to catch issue
|
||||
- **Template Incomplete**: Template doesn't guide properly
|
||||
- **Knowledge Gap**: Orchestrator unaware of pattern
|
||||
- **Tool Limitation**: Current tools can't prevent issue
|
||||
|
||||
### Step 3: Generate Improvement Suggestions
|
||||
|
||||
```javascript
|
||||
improvements = []
|
||||
|
||||
for (const pattern of patterns.filter(p => p.systemic)) {
|
||||
suggestion = generateImprovement(pattern)
|
||||
improvements.push(suggestion)
|
||||
}
|
||||
```
|
||||
|
||||
**Improvement Types**:
|
||||
|
||||
#### Type 1: Definition Update
|
||||
|
||||
```javascript
|
||||
{
|
||||
type: "Definition Update",
|
||||
file: "planning-specialist.md",
|
||||
section: "Step 3: Break Down into Domain-Isolated Tasks",
|
||||
issue: "Cross-domain tasks created (3 occurrences)",
|
||||
rootCause: "No validation step for domain isolation",
|
||||
suggestion: {
|
||||
add: `
|
||||
### Validation Checkpoint: Domain Isolation
|
||||
|
||||
Before creating tasks, verify:
|
||||
- [ ] Each task maps to ONE specialist domain
|
||||
- [ ] No task mixes backend + frontend
|
||||
- [ ] No task mixes database + API logic
|
||||
|
||||
If domain mixing detected, split into separate tasks.
|
||||
`,
|
||||
location: "After Step 3, before Step 4"
|
||||
},
|
||||
impact: "Prevents cross-domain tasks in future Planning Specialist executions"
|
||||
}
|
||||
```
|
||||
|
||||
#### Type 2: Validation Checklist
|
||||
|
||||
```javascript
|
||||
{
|
||||
type: "Validation Checklist",
|
||||
file: "feature-architect.md",
|
||||
section: "Step 8: Return Handoff",
|
||||
issue: "Verbose handoff (2 occurrences, avg 400 tokens)",
|
||||
rootCause: "No token limit specified in definition",
|
||||
suggestion: {
|
||||
add: `
|
||||
### Handoff Validation Checklist
|
||||
|
||||
Before returning:
|
||||
- [ ] Token count < 100 (brief summary only)
|
||||
- [ ] No code/detailed content in response
|
||||
- [ ] Feature ID mentioned
|
||||
- [ ] Next action clear
|
||||
|
||||
If output > 100 tokens, move details to feature sections.
|
||||
`,
|
||||
location: "End of Step 8"
|
||||
},
|
||||
impact: "Reduces Feature Architect output from 400 → 80 tokens (80% reduction)"
|
||||
}
|
||||
```
|
||||
|
||||
#### Type 3: Quality Gate
|
||||
|
||||
```javascript
|
||||
{
|
||||
type: "Quality Gate",
|
||||
file: "planning-specialist.md",
|
||||
section: "Step 6: Create Tasks",
|
||||
issue: "Execution graph accuracy < 95% (2 occurrences)",
|
||||
rootCause: "No validation before returning graph",
|
||||
suggestion: {
|
||||
add: `
|
||||
### Quality Gate: Graph Validation
|
||||
|
||||
Before returning execution graph:
|
||||
1. Query actual dependencies via query_dependencies
|
||||
2. Compare graph claims vs database reality
|
||||
3. Verify accuracy >= 95%
|
||||
4. If < 95%, correct graph before returning
|
||||
|
||||
This ensures graph quality baseline is met.
|
||||
`,
|
||||
location: "After Step 6, before Step 7"
|
||||
},
|
||||
impact: "Ensures execution graph accuracy >= 95% in all cases"
|
||||
}
|
||||
```
|
||||
|
||||
#### Type 4: Orchestrator Guidance
|
||||
|
||||
```javascript
|
||||
{
|
||||
type: "Orchestrator Guidance",
|
||||
file: "CLAUDE.md",
|
||||
section: "Decision Gates",
|
||||
issue: "Status changes bypassed Status Progression Skill (2 occurrences)",
|
||||
rootCause: "Orchestrator unaware of mandatory pattern",
|
||||
suggestion: {
|
||||
add: `
|
||||
### CRITICAL: Status Changes
|
||||
|
||||
**ALWAYS use Status Progression Skill for status changes**
|
||||
|
||||
❌ NEVER: manage_container(operation="setStatus", ...)
|
||||
✅ ALWAYS: Use status-progression skill
|
||||
|
||||
**Why Critical**: Prerequisite validation required (summary length, dependencies, task counts)
|
||||
|
||||
**Triggers**:
|
||||
- "mark complete"
|
||||
- "update status"
|
||||
- "move to [status]"
|
||||
- "change status"
|
||||
`,
|
||||
location: "Decision Gates section, top priority"
|
||||
},
|
||||
impact: "Prevents status bypasses in future sessions"
|
||||
}
|
||||
```
|
||||
|
||||
### Step 4: Prioritize Improvements
|
||||
|
||||
```javascript
|
||||
// Priority order:
// 1. CRITICAL patterns (routing violations)
// 2. Frequent patterns (count >= 3)
// 3. High-impact patterns (affects multiple workflows)
// 4. Easy fixes (checklist additions)
priorityScore = (item) =>
  (item.severity == "CRITICAL" ? 100 : 0) +
  item.count * 10 +
  estimateImpact(item) * 5 +
  estimateEase(item) * 2

// Score both items; the higher score sorts first (descending)
prioritized = improvements.sort((a, b) => priorityScore(b) - priorityScore(a))
|
||||
```
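
A runnable sketch of the prioritization follows, using the weights listed above (100 for critical, 10 per occurrence, 5 per impact point, 2 per ease point). It assumes each improvement carries numeric `impact` and `ease` fields in place of the undefined `estimateImpact` and `estimateEase` helpers, and it sorts a copy so the original array is left untouched. Those field names and the sample values are hypothetical.

```javascript
// Weighted priority score; `impact` and `ease` are assumed numeric fields.
function priorityScore(item) {
  return (
    (item.severity === "CRITICAL" ? 100 : 0) +
    item.count * 10 +
    item.impact * 5 +
    item.ease * 2
  );
}

// Returns a new array, highest score first.
function prioritize(improvements) {
  return [...improvements].sort((a, b) => priorityScore(b) - priorityScore(a));
}

const sampleImprovements = [
  { title: "Handoff checklist", severity: "WARN", count: 2, impact: 2, ease: 5 },
  { title: "Status gate", severity: "CRITICAL", count: 2, impact: 4, ease: 3 },
  { title: "Domain isolation check", severity: "WARN", count: 3, impact: 3, ease: 4 },
];
const prioritizedTitles = prioritize(sampleImprovements).map((i) => i.title);
console.log(prioritizedTitles);
```

Because the critical weight dominates, a critical pattern outranks a more frequent non-critical one unless the frequency gap is very large.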

## Session Summary

### End-of-Session Report

```markdown
## 📊 Session QA Summary

**Workflows Analyzed:** [count]
- Skills: [count]
- Subagents: [count]

**Quality Overview:**
- ✅ Successful: [count] (no issues)
- ⚠️ Issues: [count] (addressed)
- 🚨 Critical: [count] (require attention)

### Deviation Breakdown
- Routing violations: [count]
- Workflow deviations: [count]
- Output quality: [count]
- Dependency errors: [count]
- Tag issues: [count]
- Token waste: [count]

### Recurring Patterns ([count])

**🔁 Pattern: Cross-domain tasks**
- Occurrences: [count] (Planning Specialist)
- Root cause: No domain isolation validation
- Impact: Tasks can't be routed to single specialist
- **Suggestion**: Update planning-specialist.md Step 3 with validation checklist

**🔁 Pattern: Status change bypasses**
- Occurrences: [count] (Orchestrator)
- Root cause: Decision gates not prominent enough
- Impact: Prerequisites not validated
- **Suggestion**: Update CLAUDE.md Decision Gates section

### Improvement Recommendations ([count])

**Priority 1: [Improvement Title]**
- File: [definition-file.md]
- Type: [Definition Update / Validation Checklist / Quality Gate]
- Impact: [What this prevents/improves]
- Effort: [Low / Medium / High]

**Priority 2: [Improvement Title]**
- File: [definition-file.md]
- Type: [...]
- Impact: [...]

### Quality Trends
- Graph quality: [X]% average (baseline 70%, target 95%+)
- Tag coverage: [Y]% average (baseline 90%, target 100%)
- Token efficiency: [Z]% average
- Workflow adherence: [W]% average

### Next Steps
1. [Most critical improvement]
2. [Secondary improvement]
3. [Optional enhancement]
```

## Continuous Improvement Cycle

### Cycle 1: Detection (This Session)
- Track deviations as they occur
- Detect recurring patterns (2+ occurrences)
- Analyze root causes

### Cycle 2: Analysis (End of Session)
- Generate improvement suggestions
- Prioritize by impact and ease
- Present to user with recommendations

### Cycle 3: Implementation (User Decision)
- User approves definition updates
- Apply changes to source files
- Document changes in version control

### Cycle 4: Validation (Next Session)
- Verify improvements are effective
- Track whether recurring patterns have decreased
- Measure quality metric improvements
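
The detection phase treats a deviation as a recurring pattern once it has been seen two or more times. A minimal sketch of that counting step follows; the deviation shape (`entity`, `type`) and the grouping key are assumptions made for illustration and are not defined in the source.

```javascript
// Groups deviations by an assumed "entity:type" key and keeps the keys
// that occur at least `threshold` times (2+ per the detection phase).
function detectRecurringPatterns(deviations, threshold = 2) {
  const counts = new Map();
  for (const deviation of deviations) {
    const key = `${deviation.entity}:${deviation.type}`; // hypothetical key
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return [...counts.entries()]
    .filter(([, count]) => count >= threshold)
    .map(([pattern, count]) => ({ pattern, count }));
}

const recurring = detectRecurringPatterns([
  { entity: "planning-specialist", type: "cross-domain-task" },
  { entity: "orchestrator", type: "status-bypass" },
  { entity: "planning-specialist", type: "cross-domain-task" },
]);
```

Comparing the length of this result from one session to the next gives the "recurring pattern count" trend used in the validation phase.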

## Metrics to Track

### Quality Metrics (Per Session)
- Workflow adherence: [X]%
- Graph quality: [X]%
- Tag coverage: [X]%
- Token efficiency: [X]%

### Pattern Metrics (Across Sessions)
- Recurring pattern count: [decreasing trend = good]
- Definition update count: [applied improvements]
- Quality improvement: [metrics increasing over time]

## Integration with QA Skill

```javascript
// At end of session
orchestration-qa(
  phase="summary",
  sessionId=currentSession
)

// Returns:
{
  workflowsAnalyzed: 8,
  deviationsSummary: { ALERT: 2, WARN: 5, INFO: 3 },
  recurringPatterns: 2,
  improvements: [
    { priority: 1, file: "planning-specialist.md", impact: "high" },
    { priority: 2, file: "CLAUDE.md", impact: "high" }
  ],
  qualityTrends: {
    graphQuality: "92%",
    tagCoverage: "98%",
    tokenEfficiency: "87%"
  }
}
```

## When to Report

- **After deviations**: Track pattern occurrence
- **End of session**: Generate summary if patterns detected
- **User request**: "Show QA summary", "Any improvements?"

## Output Size

- Pattern tracking: ~100-200 tokens per pattern
- Session summary: ~400-800 tokens total
- Improvement suggestions: ~200-400 tokens per suggestion
623
skills/orchestration-qa/post-execution.md
Normal file
@@ -0,0 +1,623 @@

# Post-Execution Review

**Purpose**: Validate that Skills and Subagents followed their documented workflows and produced expected outputs.

**When**: After any Skill or Subagent completes (phase="post")

**Token Cost**: ~600-800 tokens (basic), ~1500-2000 tokens (with specialized analysis)

## Core Review Workflow

### Step 1: Load Entity Definition

**Action**: Read the definition from the knowledge base loaded during initialization.

```javascript
// For Skills
if (category == "SKILL") {
  definition = skills[entityType]
  // Contains: mandatoryTriggers, workflows, expectedOutputs, tools, tokenRange
}

// For Subagents
if (category == "SUBAGENT") {
  definition = subagents[entityType]
  // Contains: expectedSteps, criticalPatterns, outputValidation, tokenRange
}
```

### Step 2: Retrieve Pre-Execution Context

**Action**: Load the context stored during the pre-execution phase.

```javascript
context = session.contexts[workflowId]
// Contains: userInput, checkpoints, expected, featureRequirements, etc.
```

### Step 3: Verify Workflow Adherence

**Check that the entity followed its documented workflow steps.**

#### For Skills

```javascript
workflowCheck = {
  expectedWorkflows: definition.workflows,
  actualExecution: analyzeOutput(entityOutput),
  stepsFollowed: 0,
  stepsExpected: definition.workflows.length,
  deviations: []
}

// Example for Feature Orchestration Skill
expectedWorkflows = [
  "Assess complexity (Simple vs Complex)",
  "Discover templates via query_templates",
  "Create feature directly (Simple) OR Launch Feature Architect (Complex)",
  "Return feature ID and next action"
]

// Verify each workflow step
for (const step of expectedWorkflows) {
  if (evidenceOfStep(entityOutput, step)) {
    workflowCheck.stepsFollowed++
  } else {
    workflowCheck.deviations.push({
      step: step,
      issue: "No evidence of this step in output",
      severity: "WARN"
    })
  }
}
```

#### For Subagents

```javascript
stepCheck = {
  expectedSteps: definition.expectedSteps, // e.g., 8 steps for Feature Architect
  actualSteps: extractSteps(entityOutput),
  stepsFollowed: 0,
  stepsExpected: definition.expectedSteps.length,
  deviations: []
}

// Example for Feature Architect (8 expected steps)
expectedSteps = [
  "Step 1: get_overview + list_tags",
  "Step 2: Detect input type (PRD/Interactive/Quick)",
  "Step 4: query_templates",
  "Step 5: Tag strategy (reuse existing tags)",
  "Step 5.5: Check agent-mapping.yaml",
  "Step 6: Create feature",
  "Step 7: Add custom sections (if Detailed/PRD)",
  "Step 8: Return minimal handoff"
]

// Verify tool usage as evidence of steps
for (const step of expectedSteps) {
  if (evidenceOfStep(entityOutput, step)) {
    stepCheck.stepsFollowed++
  } else {
    stepCheck.deviations.push({
      step: step,
      issue: "Step not completed or no evidence",
      severity: determineStepSeverity(step)
    })
  }
}
```

**Evidence Detection**:
```javascript
function evidenceOfStep(output, step) {
  // Check for tool calls mentioned
  if (step.includes("query_templates") && mentions(output, "template")) return true
  if (step.includes("list_tags") && mentions(output, "tags")) return true

  // Check for workflow markers
  if (step.includes("Create feature") && mentions(output, "feature created")) return true

  // Check for explicit mentions
  if (contains(output, step.toLowerCase())) return true

  return false
}
```
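
The evidence check relies on `mentions` and `contains`, which the source uses but does not define. A minimal runnable sketch, assuming both are case-insensitive substring checks and that `mentions` accepts either one term or an array of terms:

```javascript
// Assumed helper: case-insensitive substring test.
function contains(text, needle) {
  return text.toLowerCase().includes(needle.toLowerCase());
}

// Assumed helper: true if any term (a string or an array of strings) appears.
function mentions(text, terms) {
  const list = Array.isArray(terms) ? terms : [terms];
  return list.some((term) => contains(text, term));
}

function evidenceOfStep(output, step) {
  // Tool calls mentioned in the output
  if (step.includes("query_templates") && mentions(output, "template")) return true;
  if (step.includes("list_tags") && mentions(output, "tags")) return true;
  // Workflow markers
  if (step.includes("Create feature") && mentions(output, "feature created")) return true;
  // Explicit mention of the step text itself
  return contains(output, step);
}

const found = evidenceOfStep(
  "Discovered 3 templates; feature created with ID abc",
  "Step 4: query_templates"
);
const missing = evidenceOfStep("Nothing relevant here", "Step 6: Create feature");
```

Because matching is by substring, any output that happens to contain the word "template" counts as evidence, so a passing check is a hint rather than proof that the tool was called.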

### Step 4: Validate Critical Patterns

**Check that the entity followed the critical patterns from its definition.**

```javascript
patternCheck = {
  criticalPatterns: definition.criticalPatterns,
  violations: []
}

// Example for Feature Architect
criticalPatterns = [
  "description = forward-looking (what needs to be built)",
  "Do NOT populate summary field during creation",
  "Return minimal handoff (50-100 tokens)",
  "PRD mode: Extract ALL sections from document",
  "Tag strategy: Reuse existing tags",
  "Check agent-mapping.yaml for new tags"
]

for (const pattern of criticalPatterns) {
  violation = checkPattern(pattern, entityOutput, context)
  if (violation) {
    patternCheck.violations.push(violation)
  }
}
```

**Pattern Checking Examples**:

```javascript
// Pattern: "Do NOT populate summary field"
function checkSummaryField(output, entityId) {
  entity = query_container(operation="get", containerType="feature", id=entityId)

  if (entity.summary && entity.summary.length > 0) {
    return {
      pattern: "Do NOT populate summary field during creation",
      violation: "Summary field populated",
      severity: "WARN",
      found: `Summary: "${entity.summary}" (${entity.summary.length} chars)`,
      expected: "Summary should be empty until completion"
    }
  }
  return null
}

// Pattern: "Return minimal handoff (50-100 tokens)"
function checkHandoffSize(output) {
  tokenCount = estimateTokens(output)

  // Flag only handoffs well above the 50-100 token target
  if (tokenCount > 200) {
    return {
      pattern: "Return minimal handoff (50-100 tokens)",
      violation: "Verbose handoff",
      severity: "WARN",
      found: `${tokenCount} tokens`,
      expected: "50-100 tokens (brief summary)",
      suggestion: "Detailed work should go in feature sections, not response"
    }
  }
  return null
}

// Pattern: "PRD mode: Extract ALL sections"
function checkPRDExtraction(output, context, entityId) {
  if (context.userInput.inputType != "PRD") return null

  // Compare PRD sections vs feature sections
  prdSections = context.prdSections // Captured in pre-execution
  feature = query_container(operation="get", containerType="feature", id=entityId, includeSections=true)
  featureSections = feature.sections

  missingSections = []
  for (const prdSection of prdSections) {
    if (!hasMatchingSection(featureSections, prdSection)) {
      missingSections.push(prdSection)
    }
  }

  if (missingSections.length > 0) {
    return {
      pattern: "PRD mode: Extract ALL sections from document",
      violation: "PRD sections incomplete",
      severity: "ALERT",
      found: `Feature has ${featureSections.length} sections`,
      expected: `PRD has ${prdSections.length} sections`,
      missing: missingSections,
      suggestion: "Add missing sections to feature"
    }
  }
  return null
}
```
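
The handoff check depends on `estimateTokens`, which the source does not define. The sketch below assumes a rough heuristic of about four characters per token; this is an approximation chosen for illustration, not a real tokenizer, and the `maxTokens` parameter is likewise hypothetical.

```javascript
// Assumption: roughly 4 characters per token. Not a real tokenizer.
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

// Returns a violation object when the handoff exceeds maxTokens, else null.
function checkHandoffSize(output, maxTokens = 200) {
  const tokenCount = estimateTokens(output);
  if (tokenCount <= maxTokens) return null;
  return {
    pattern: "Return minimal handoff (50-100 tokens)",
    violation: "Verbose handoff",
    severity: "WARN",
    found: `${tokenCount} tokens`,
    expected: "50-100 tokens (brief summary)",
  };
}

const briefResult = checkHandoffSize("a".repeat(400));    // about 100 tokens
const verboseResult = checkHandoffSize("a".repeat(1000)); // about 250 tokens
```

The default threshold of 200 mirrors the check above, which tolerates handoffs up to twice the 50-100 token target before warning.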

### Step 5: Verify Expected Outputs

**Check that the entity produced the expected outputs from its definition.**

```javascript
outputCheck = {
  expectedOutputs: definition.outputValidation || definition.expectedOutputs,
  actualOutputs: analyzeOutputs(entityOutput, entityId),
  present: [],
  missing: []
}

// Example for Planning Specialist
expectedOutputs = [
  "Tasks created with descriptions?",
  "Domain isolation preserved?",
  "Dependencies mapped correctly?",
  "Documentation task included (if user-facing)?",
  "Testing task included (if needed)?",
  "No circular dependencies?",
  "Templates applied to tasks?"
]

for (const expectedOutput of expectedOutputs) {
  if (verifyOutput(expectedOutput, entityId, context)) {
    outputCheck.present.push(expectedOutput)
  } else {
    outputCheck.missing.push({
      output: expectedOutput,
      severity: determineSeverity(expectedOutput),
      impact: describeImpact(expectedOutput)
    })
  }
}
```

### Step 6: Validate Against Checkpoints

**Compare execution against the checkpoints set in pre-execution.**

```javascript
checkpointResults = {
  total: context.checkpoints.length,
  passed: 0,
  failed: []
}

for (const checkpoint of context.checkpoints) {
  result = verifyCheckpoint(checkpoint, entityOutput, entityId, context)

  if (result.passed) {
    checkpointResults.passed++
  } else {
    checkpointResults.failed.push({
      checkpoint: checkpoint,
      reason: result.reason,
      severity: result.severity
    })
  }
}
```

**Checkpoint Verification Examples**:

```javascript
// Checkpoint: "Verify templates discovered via query_templates"
function verifyTemplatesDiscovered(output) {
  if (mentions(output, "template") || mentions(output, "query_templates")) {
    return { passed: true }
  }
  return {
    passed: false,
    reason: "No evidence of template discovery (query_templates not called)",
    severity: "WARN"
  }
}

// Checkpoint: "Verify domain isolation (one task = one specialist)"
function verifyDomainIsolation(featureId) {
  tasks = query_container(operation="overview", containerType="feature", id=featureId).tasks

  violations = []
  for (const task of tasks) {
    domains = detectDomains(task.title + " " + task.description)
    if (domains.length > 1) {
      violations.push({
        task: task.title,
        domains: domains,
        issue: "Task spans multiple specialist domains"
      })
    }
  }

  if (violations.length > 0) {
    return {
      passed: false,
      reason: `${violations.length} cross-domain tasks detected`,
      severity: "ALERT",
      details: violations
    }
  }

  return { passed: true }
}
```

### Step 7: Check Token Range

**Verify that the entity stayed within its expected token range.**

```javascript
tokenCheck = {
  actual: estimateTokens(entityOutput),
  expected: definition.tokenRange,
  withinRange: false,
  deviation: 0
}

tokenCheck.withinRange = (
  tokenCheck.actual >= tokenCheck.expected[0] &&
  tokenCheck.actual <= tokenCheck.expected[1]
)

if (!tokenCheck.withinRange) {
  // Distance from the nearest bound, whether over or under the range
  tokenCheck.deviation = tokenCheck.actual > tokenCheck.expected[1]
    ? tokenCheck.actual - tokenCheck.expected[1]
    : tokenCheck.expected[0] - tokenCheck.actual

  if (tokenCheck.deviation > tokenCheck.expected[1] * 0.5) {
    // Deviation exceeds 50% of the upper bound
    severity = "WARN"
  } else {
    severity = "INFO"
  }
}
```
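
The same range check can be packaged as a pure function, which makes the thresholds easy to test. This is a sketch under the assumption that the range is a `[min, max]` pair; the function name `checkTokenRange` is hypothetical.

```javascript
// Returns the range verdict plus a severity (null when within range).
function checkTokenRange(actual, [min, max]) {
  const withinRange = actual >= min && actual <= max;
  if (withinRange) {
    return { actual, expected: [min, max], withinRange, deviation: 0, severity: null };
  }
  // Distance from the nearest bound, whether over or under the range.
  const deviation = actual > max ? actual - max : min - actual;
  // WARN only when the deviation exceeds 50% of the upper bound.
  const severity = deviation > max * 0.5 ? "WARN" : "INFO";
  return { actual, expected: [min, max], withinRange, deviation, severity };
}

const inRange = checkTokenRange(1950, [1800, 2200]);
const slightlyOver = checkTokenRange(2500, [1800, 2200]); // 300 over the upper bound
const farOver = checkTokenRange(3400, [1800, 2200]);      // 1200 over the upper bound
```

For a range of 1800-2200 tokens the WARN threshold sits at a deviation of 1100, so moderate overruns are reported as INFO only.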

### Step 8: Compare Against Original User Input

**For Subagents: Verify that the original user requirements were preserved.**

```javascript
if (category == "SUBAGENT") {
  requirementsCheck = compareToOriginal(
    context.userInput, // userInput
    entityOutput,      // output
    entityId           // entityId
  )
}
```

**Comparison Logic**:

```javascript
// entityType, context, and featureId are read from the enclosing review scope
function compareToOriginal(userInput, output, entityId) {
  // For Feature Architect: Check core concepts preserved
  if (entityType == "feature-architect") {
    feature = query_container(operation="get", containerType="feature", id=entityId, includeSections=true)

    originalConcepts = extractConcepts(userInput.fullText)
    featureConcepts = extractConcepts(feature.description + " " + sectionsToText(feature.sections))

    missingConcepts = originalConcepts.filter(c => !featureConcepts.includes(c))

    if (missingConcepts.length > 0) {
      return {
        preserved: false,
        severity: "ALERT",
        missing: missingConcepts,
        suggestion: "Add missing concepts to feature description or sections"
      }
    }
  }

  // For Planning Specialist: Check all feature requirements covered
  if (entityType == "planning-specialist") {
    requirements = extractRequirements(context.featureRequirements.description)
    tasks = query_container(operation="overview", containerType="feature", id=featureId).tasks

    uncoveredRequirements = []
    for (const req of requirements) {
      if (!anyTaskCovers(tasks, req)) {
        uncoveredRequirements.push(req)
      }
    }

    if (uncoveredRequirements.length > 0) {
      return {
        preserved: false,
        severity: "WARN",
        uncovered: uncoveredRequirements,
        suggestion: "Create additional tasks to cover all requirements"
      }
    }
  }

  return { preserved: true }
}
```

### Step 9: Determine Specialized Analysis Needed

**Based on entity type, decide which specialized analyses to run.**

```javascript
specializedAnalyses = []

// Planning Specialist → Graph + Tag analysis
if (entityType == "planning-specialist") {
  specializedAnalyses.push("graph-quality")
  specializedAnalyses.push("tag-quality")
}

// All entities → Routing validation
specializedAnalyses.push("routing-validation")

// If efficiency analysis enabled
if (params.enableEfficiencyAnalysis) {
  specializedAnalyses.push("token-optimization")
  specializedAnalyses.push("tool-selection")
  specializedAnalyses.push("parallel-detection")
}

// Load and run each specialized analysis
for (const analysis of specializedAnalyses) {
  Read `.claude/skills/orchestration-qa/${analysis}.md`
  runAnalysis(analysis, entityType, entityOutput, entityId, context)
}
```

### Step 10: Aggregate Results

**Combine all validation results.**

```javascript
// Share of expected workflow steps that were followed
percentage = Math.round((workflowCheck.stepsFollowed / workflowCheck.stepsExpected) * 100)

results = {
  entity: entityType,
  category: category,

  workflowAdherence: `${workflowCheck.stepsFollowed}/${workflowCheck.stepsExpected} steps (${percentage}%)`,
  expectedOutputs: `${outputCheck.present.length}/${outputCheck.expectedOutputs.length} present`,
  checkpoints: `${checkpointResults.passed}/${checkpointResults.total} passed`,

  criticalPatternViolations: patternCheck.violations.filter(v => v.severity == "ALERT"),
  processIssues: patternCheck.violations.filter(v => v.severity == "WARN"),

  tokenUsage: {
    actual: tokenCheck.actual,
    expected: tokenCheck.expected,
    withinRange: tokenCheck.withinRange,
    deviation: tokenCheck.deviation
  },

  requirementsPreserved: requirementsCheck?.preserved ?? true,

  deviations: aggregateDeviations(
    workflowCheck.deviations,
    patternCheck.violations,
    outputCheck.missing,
    checkpointResults.failed
  ),

  specializedAnalyses: specializedAnalysisResults
}
```

### Step 11: Categorize Deviations by Severity

```javascript
deviationsSummary = {
  ALERT: results.deviations.filter(d => d.severity == "ALERT"),
  WARN: results.deviations.filter(d => d.severity == "WARN"),
  INFO: results.deviations.filter(d => d.severity == "INFO")
}
```
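
The three filters can also be written as a single pass over the deviations. The sketch below is one possible variant, not the skill's own code: the function name and the `OTHER` bucket are hypothetical, added so that severities outside the three levels (other blocks in this document use "CRITICAL") are kept visible rather than silently dropped.

```javascript
// Single-pass grouping of deviations by severity.
function categorizeBySeverity(deviations) {
  const summary = { ALERT: [], WARN: [], INFO: [], OTHER: [] };
  for (const deviation of deviations) {
    const known =
      deviation.severity !== "OTHER" &&
      Object.prototype.hasOwnProperty.call(summary, deviation.severity);
    summary[known ? deviation.severity : "OTHER"].push(deviation);
  }
  return summary;
}

const grouped = categorizeBySeverity([
  { severity: "ALERT", issue: "Cross-domain task" },
  { severity: "WARN", issue: "Verbose handoff" },
  { severity: "WARN", issue: "Templates not applied" },
  { severity: "CRITICAL", issue: "Status change bypassed Skill" },
]);
const counts = Object.fromEntries(
  Object.entries(grouped).map(([level, items]) => [level, items.length])
);
```

The `counts` object has the same shape as the `deviationsSummary` counts shown in the session summary, plus the extra `OTHER` key.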

**Severity Determination**:

- **ALERT**: Critical violations that affect functionality or correctness
  - Missing requirements from user input
  - Cross-domain tasks (violates domain isolation)
  - Status change without Status Progression Skill
  - Circular dependencies
  - PRD sections not extracted

- **WARN**: Process issues that should be addressed
  - Workflow steps skipped (non-critical)
  - Output too verbose
  - Templates not applied when available
  - Tags don't follow conventions
  - No Files Changed section

- **INFO**: Observations and opportunities
  - Token usage outside expected range (but reasonable)
  - Efficiency opportunities identified
  - Quality patterns observed

### Step 12: Return Results

If deviations found, prepare for reporting:

```javascript
if (deviationsSummary.ALERT.length > 0 || deviationsSummary.WARN.length > 0) {
  // Read deviation-templates.md for formatting
  Read `.claude/skills/orchestration-qa/deviation-templates.md`

  // Format report based on severity
  report = formatDeviationReport(results, deviationsSummary)

  // Add to TodoWrite
  addToTodoWrite(deviationsSummary)

  // Return report
  return report
}
```

If no issues:

```javascript
return {
  success: true,
  message: `✅ QA Review: ${entityType} - All checks passed`,
  workflowAdherence: results.workflowAdherence,
  summary: "No deviations detected"
}
```

## Entity-Specific Notes

### Skills Review

- Focus on workflow steps and tool usage
- Verify token efficiency (Skills should be lightweight)
- Check for proper error handling

### Subagents Review

- Focus on step-by-step process adherence
- Verify critical patterns followed
- Compare output vs original user input (requirement preservation)
- Check output brevity (specialists should return minimal summaries)

### Status Progression Skill (Critical)

**Special validation** - this is the most critical Skill to validate:

```javascript
if (entityType == "status-progression") {
  // CRITICAL: Was it actually used?
  if (statusChangedWithoutSkill) {
    return {
      severity: "CRITICAL",
      violation: "Status change bypassed mandatory Status Progression Skill",
      impact: "Prerequisite validation may have been skipped",
      action: "IMMEDIATE ALERT to user"
    }
  }

  // Verify it read config.yaml
  if (!mentions(output, "config")) {
    deviations.push({
      severity: "WARN",
      issue: "Status Progression Skill didn't mention config",
      expected: "Should read config.yaml for workflow validation"
    })
  }

  // Verify it validated prerequisites
  if (validationFailed && !mentions(output, ["prerequisite", "blocker"])) {
    deviations.push({
      severity: "WARN",
      issue: "Validation failure without detailed prerequisites",
      expected: "Should explain what prerequisites are blocking"
    })
  }
}
```

## Output Structure

```javascript
{
  entity: "planning-specialist",
  category: "SUBAGENT",
  workflowAdherence: "8/8 steps (100%)",
  expectedOutputs: "7/7 present",
  checkpoints: "10/10 passed",
  tokenUsage: {
    actual: 1950,
    expected: [1800, 2200],
    withinRange: true
  },
  deviations: [],
  specializedAnalyses: {
    graphQuality: { score: 95, issues: [] },
    tagQuality: { score: 100, issues: [] }
  },
  success: true,
  message: "✅ All quality checks passed"
}
```
490
skills/orchestration-qa/pre-execution.md
Normal file
@@ -0,0 +1,490 @@

# Pre-Execution Validation

**Purpose**: Capture context and set validation checkpoints before launching any Skill or Subagent.

**When**: Before any Skill or Subagent is launched (phase="pre")

**Token Cost**: ~400-600 tokens

## Validation Workflow

### Step 1: Capture Original User Input

**Critical**: Store the user's complete original request for post-execution comparison.

```javascript
context = {
  userInput: {
    fullText: userMessage,
    timestamp: now(),
    inputType: detectInputType(userMessage) // PRD / Detailed / Quick / Command
  }
}
```

**Input Type Detection**:

```javascript
function detectInputType(message) {
  // PRD: Formal document with multiple sections
  if (message.includes("# ") && message.length > 500 && hasSections(message)) {
    return "PRD"
  }

  // Detailed: Rich context, multiple paragraphs, requirements
  if (message.length > 200 && paragraphCount(message) >= 3) {
    return "Detailed"
  }

  // Quick: Short request, minimal context
  if (message.length < 100) {
    return "Quick"
  }

  // Command: Direct instruction
  return "Command"
}
```
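
`detectInputType` calls `hasSections` and `paragraphCount`, neither of which is defined in the source. The sketch below fills them in under stated assumptions: a message "has sections" when it contains at least two Markdown headings, and paragraphs are blocks separated by blank lines. Both definitions are illustrative guesses.

```javascript
// Assumption: two or more Markdown headings count as "multiple sections".
function hasSections(message) {
  return (message.match(/^#{1,6} /gm) ?? []).length >= 2;
}

// Assumption: paragraphs are non-empty blocks separated by blank lines.
function paragraphCount(message) {
  return message.split(/\n\s*\n/).filter((p) => p.trim().length > 0).length;
}

function detectInputType(message) {
  if (message.includes("# ") && message.length > 500 && hasSections(message)) return "PRD";
  if (message.length > 200 && paragraphCount(message) >= 3) return "Detailed";
  if (message.length < 100) return "Quick";
  return "Command";
}

const prd = "# Title\n\n" + "x".repeat(300) + "\n\n## Requirements\n\n" + "y".repeat(300);
const detailed = ["a", "b", "c"].map((ch) => ch.repeat(100)).join("\n\n");
const types = [
  detectInputType(prd),
  detectInputType(detailed),
  detectInputType("Add login"),
  detectInputType("x".repeat(150)),
];
```

Note that a single-paragraph message of 100 characters or more falls through to "Command", since it is neither short enough for "Quick" nor structured enough for the other two types.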

### Step 2: Identify Entity Type

**Determine what's being launched** (Skill vs Subagent):

```javascript
entityType = identifyEntity(userMessage, context)

// Skills: the message matches one of a Skill's mandatory triggers
matchedSkill = Object.keys(skills).find(name => matches(userMessage, skills[name].mandatoryTriggers))
if (matchedSkill) {
  entityType = matchedSkill // "feature-orchestration", "task-orchestration", etc.
  category = "SKILL"
}

// Subagents: the orchestrator has decided to launch a subagent
if (subagentName) {
  entityType = subagentName // "feature-architect", "planning-specialist", etc.
  category = "SUBAGENT"
}
```

### Step 3: Set Entity-Specific Validation Checkpoints

**Create a checklist to verify after execution completes.**

#### Feature Orchestration Skill

```javascript
checkpoints = [
  "Verify Skill assessed complexity correctly",
  "Verify Skill created feature OR launched Feature Architect",
  "Verify templates discovered via query_templates",
  "Verify output in token range (300-800 tokens)"
]

context.expected = {
  complexity: detectExpectedComplexity(context.userInput),
  mode: "simple", // or "complex"
  tools: ["query_templates", "manage_container"], // or "Task(Feature Architect)" instead of "manage_container"
  tokenRange: [300, 800]
}
```

**Complexity Detection**:
```javascript
// input is context.userInput: { fullText, timestamp, inputType }
function detectExpectedComplexity(input) {
  // Simple indicators
  if (input.fullText.length < 150 && paragraphCount(input.fullText) < 2) return "simple"

  // Complex indicators
  if (input.inputType == "PRD") return "complex"
  if (input.fullText.length > 200) return "complex"
  if (mentions(input.fullText, ["multiple", "integration", "system"])) return "complex"

  return "simple"
}
```

#### Task Orchestration Skill

```javascript
checkpoints = [
  "Verify Skill analyzed dependencies via query_dependencies",
  "Verify Skill identified parallel opportunities",
  "Verify Skill used recommend_agent for routing",
  "Verify Skill returned batch structure",
  "Verify output in token range (500-900 tokens)"
]

// Get current feature state for comparison
if (featureId) {
  context.featureState = {
    totalTasks: query_container(containerType="feature", id=featureId).taskCounts.total,
    pendingTasks: query_container(containerType="feature", id=featureId, status="pending").length,
    dependencies: [] // Filled by calling query_dependencies for each pending task
  }
}
```

#### Status Progression Skill

```javascript
checkpoints = [
  "Verify Skill read config.yaml",
  "Verify Skill validated prerequisites",
  "Verify Skill returned clear result or error",
  "Verify output in token range (200-400 tokens)"
]

// CRITICAL CHECK: Was Status Progression Skill actually used?
context.criticalValidation = {
  mustUseSkill: true,
  violationSeverity: "CRITICAL",
  reason: "Status changes MUST use Status Progression Skill for prerequisite validation"
}

// Get current entity state for prerequisite checking
context.entityState = {
  currentStatus: entity.status,
  summary: entity.summary,
  dependencies: taskId ? query_dependencies(taskId) : null, // Tasks only
  tasks: featureId ? query_container(featureId).tasks : null // Features only
}
```

#### Feature Architect Subagent

```javascript
checkpoints = [
  "Compare Feature Architect output vs original user input",
  "Verify mode detection (PRD/Interactive/Quick)",
  "Verify all PRD sections extracted (if PRD mode)",
  "Verify core concepts preserved",
  "Verify templates applied",
  "Verify tags follow project conventions",
  "Verify agent-mapping.yaml checked (for new tags)",
  "Verify handoff minimal (50-100 tokens)"
]

// PRD Mode: Extract sections from user input
if (context.userInput.inputType == "PRD") {
  context.prdSections = extractSections(userInput)
  // Example: ["Business Context", "User Stories", "Technical Specs", "Requirements"]

  checkpoints.push(
    "Verify all PRD sections have corresponding feature sections"
  )
}

context.expected = {
  mode: context.userInput.inputType,
  descriptionLength: context.userInput.inputType == "PRD" ? [500, 1000] : [200, 500],
  sectionsExpected: context.prdSections?.length || 0,
  handoffTokens: [50, 100],
  tokenRange: [1800, 2200]
}
```

#### Planning Specialist Subagent

```javascript
// First, read the created feature
feature = query_container(operation="get", containerType="feature", id=featureId, includeSections=true)

// Store feature requirements for comparison
context.featureRequirements = {
  description: feature.description,
  sections: feature.sections,
  isUserFacing: detectUserFacing(feature),
  requiresMultipleDomains: detectDomains(feature).length > 1
}

checkpoints = [
  "Verify domain isolation (one task = one specialist)",
  "Verify dependencies mapped (Database → Backend → Frontend)",
  "Verify documentation task created (if user-facing)",
  "Verify testing task created (if needed)",
  "Verify all feature requirements covered by tasks",
  "Verify no cross-domain tasks",
  "Verify no circular dependencies",
  "Verify task descriptions populated (200-600 chars)",
  "Verify templates applied to tasks",
  "Verify output in token range (1800-2200 tokens)"
]

context.expected = {
  needsDocumentation: context.featureRequirements.isUserFacing,
  needsTesting: detectTestingNeeded(feature),
  domainCount: context.featureRequirements.requiresMultipleDomains ? 3 : 1,
  tokenRange: [1800, 2200]
}
```

**Domain Detection**:
```javascript
// Accepts a feature (uses its description) or plain text; returns the matched domain names
function detectDomains(source) {
  text = typeof source == "string" ? source : source.description
  domains = []

  if (mentions(text, ["database", "schema", "migration"])) {
    domains.push("database")
  }
  if (mentions(text, ["api", "service", "endpoint", "backend"])) {
    domains.push("backend")
  }
  if (mentions(text, ["ui", "component", "page", "frontend"])) {
    domains.push("frontend")
  }

  return domains
}
```
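
Domain detection by keyword is easy to get subtly wrong: a plain substring test for "ui" also matches "build", and "api" matches "capital". The sketch below is one possible variant that matches whole words only and returns the list of matched domain names, so callers can use both the names and the count. The `DOMAIN_KEYWORDS` table and the word-boundary matching are assumptions layered on top of the keyword lists from the source.

```javascript
// Keyword lists taken from the domain detection block; the table name is hypothetical.
const DOMAIN_KEYWORDS = {
  database: ["database", "schema", "migration"],
  backend: ["api", "service", "endpoint", "backend"],
  frontend: ["ui", "component", "page", "frontend"],
};

// Returns the names of all domains whose keywords appear as whole words.
function detectDomains(text) {
  return Object.entries(DOMAIN_KEYWORDS)
    .filter(([, keywords]) =>
      keywords.some((keyword) => new RegExp(`\\b${keyword}\\b`, "i").test(text))
    )
    .map(([domain]) => domain);
}

const crossDomain = detectDomains("Add a migration and a REST endpoint");
const singleDomain = detectDomains("Build a settings page");
```

A task whose result has more than one entry spans multiple specialist domains and would fail the domain isolation checkpoint.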

#### Implementation Specialist Subagents

**Applies to**: Backend Engineer, Frontend Developer, Database Engineer, Test Engineer, Technical Writer

```javascript
// Read task context
task = query_container(operation="get", containerType="task", id=taskId, includeSections=true)

context.taskRequirements = {
  description: task.description,
  sections: task.sections,
  hasDependencies: query_dependencies(taskId).incoming.length > 0,
  complexity: task.complexity
}

checkpoints = [
  "Verify specialist completed task lifecycle",
  "Verify tests run and passing (if code task)",
  "Verify summary populated (300-500 chars)",
  "Verify Files Changed section created (ordinal 999)",
  "Verify used Status Progression Skill to mark complete",
  "Verify output minimal (50-100 tokens)",
  "If blocked: Verify clear reason + attempted fixes"
]

context.expected = {
  summaryLength: [300, 500],
  hasFilesChanged: true,
  statusChanged: true,
  tokenRange: [1800, 2200],
  outputTokens: [50, 100]
}

// Verify recommend_agent was used
context.routingValidation = {
  shouldUseRecommendAgent: true,
  matchesTags: checkTagMatch(task.tags, specialistName)
}
```
|
||||
|
||||
### Step 4: Verify Routing Decision
|
||||
|
||||
**Check orchestrator made correct routing choice.**
|
||||
|
||||
```javascript
|
||||
routingCheck = {
|
||||
userRequest: userMessage,
|
||||
detectedIntent: detectIntent(userMessage),
|
||||
orchestratorChoice: entityType,
|
||||
...validateRouting(detectIntent(userMessage), entityType) // spreads valid, severity, violation, expected into routingCheck
|
||||
}
|
||||
|
||||
// Intent detection
|
||||
function detectIntent(message) {
|
||||
// Coordination triggers → MUST use Skills
|
||||
coordinationTriggers = [
|
||||
"mark complete", "update status", "create feature",
|
||||
"execute tasks", "what's next", "check blockers", "complete feature"
|
||||
]
|
||||
|
||||
// Implementation triggers → Should ask user (Direct vs Specialist)
|
||||
implementationTriggers = [
|
||||
"implement", "write code", "create API", "build",
|
||||
"add tests", "fix bug", "database schema", "frontend component"
|
||||
]
|
||||
|
||||
if (matches(message, coordinationTriggers)) return "COORDINATION"
|
||||
if (matches(message, implementationTriggers)) return "IMPLEMENTATION"
|
||||
|
||||
return "UNKNOWN"
|
||||
}
|
||||
|
||||
// Routing validation
|
||||
function validateRouting(intent, choice) {
|
||||
if (intent == "COORDINATION" && !isSkill(choice)) {
|
||||
return {
|
||||
valid: false,
|
||||
severity: "CRITICAL",
|
||||
violation: "Coordination request must use Skill, not direct tools or subagent",
|
||||
expected: "Use appropriate Skill (Feature Orchestration, Task Orchestration, Status Progression)"
|
||||
}
|
||||
}
|
||||
|
||||
if (intent == "IMPLEMENTATION" && !askedUser) {
|
||||
return {
|
||||
valid: false,
|
||||
severity: "WARN",
|
||||
violation: "Implementation request should ask user (Direct vs Specialist)",
|
||||
expected: "Ask user preference before proceeding"
|
||||
}
|
||||
}
|
||||
|
||||
return { valid: true }
|
||||
}
|
||||
```
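The `matches` helper in the intent detection above is left undefined. One possible reading, sketched here as an assumption, is that a trigger matches when all of its words appear in the message in order, so that "Mark task T1 complete" still hits the "mark complete" trigger:

```javascript
// Trigger lists copied from the snippet above (lower-cased for comparison).
const COORDINATION_TRIGGERS = [
  "mark complete", "update status", "create feature",
  "execute tasks", "what's next", "check blockers", "complete feature",
];
const IMPLEMENTATION_TRIGGERS = [
  "implement", "write code", "create api", "build",
  "add tests", "fix bug", "database schema", "frontend component",
];

// Hypothetical matcher: every word of a trigger must occur in the message, in order.
function matches(message, triggers) {
  const words = message
    .toLowerCase()
    .replace(/[^a-z0-9'\s]/g, " ") // drop punctuation, keep apostrophes
    .split(/\s+/)
    .filter(Boolean);
  return triggers.some((trigger) => {
    let from = 0;
    return trigger.toLowerCase().split(" ").every((part) => {
      const idx = words.indexOf(part, from);
      if (idx === -1) return false;
      from = idx + 1;
      return true;
    });
  });
}

function detectIntent(message) {
  if (matches(message, COORDINATION_TRIGGERS)) return "COORDINATION";
  if (matches(message, IMPLEMENTATION_TRIGGERS)) return "IMPLEMENTATION";
  return "UNKNOWN";
}
```

A strict substring test would classify "Mark task T1 complete" as "UNKNOWN", because the words of the trigger are not adjacent; the in-order word match is one way around that, at the cost of a few more false positives.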
|
||||
|
||||
**Special Case: Status Changes**
|
||||
|
||||
```javascript
|
||||
// Status changes are ALWAYS coordination → MUST use Status Progression Skill
|
||||
if (userMessage.includes("complete") || userMessage.includes("status")) {
|
||||
if (choice != "status-progression") {
|
||||
return {
|
||||
valid: false,
|
||||
severity: "CRITICAL",
|
||||
violation: "Status change MUST use Status Progression Skill",
|
||||
reason: "Prerequisite validation required (summary length, dependencies, task counts)",
|
||||
expected: "Use Status Progression Skill for ALL status changes"
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Step 5: Store Context for Post-Execution
|
||||
|
||||
**Save all captured information for comparison after execution.**
|
||||
|
||||
```javascript
|
||||
session.contexts[workflowId] = {
|
||||
timestamp: now(),
|
||||
userInput: context.userInput,
|
||||
entityType: entityType,
|
||||
category: category, // "SKILL" or "SUBAGENT"
|
||||
checkpoints: checkpoints,
|
||||
expected: context.expected,
|
||||
featureRequirements: context.featureRequirements, // if Planning Specialist
|
||||
taskRequirements: context.taskRequirements, // if Implementation Specialist
|
||||
routingValidation: routingCheck,
|
||||
criticalValidation: context.criticalValidation // if Status Progression
|
||||
}
|
||||
```
|
||||
|
||||
### Step 6: Return Ready Signal
|
||||
|
||||
```javascript
|
||||
return {
|
||||
ready: true,
|
||||
contextCaptured: true,
|
||||
entityType: entityType,
|
||||
category: category,
|
||||
checkpoints: checkpoints.length,
|
||||
routingValid: routingCheck.valid,
|
||||
warnings: routingCheck.valid ? [] : [routingCheck.violation]
|
||||
}
|
||||
```
|
||||
|
||||
**If routing violation detected**, alert immediately:
|
||||
|
||||
```javascript
|
||||
if (!routingCheck.valid && routingCheck.severity == "CRITICAL") {
|
||||
return {
|
||||
ready: false,
|
||||
violation: {
|
||||
severity: "CRITICAL",
|
||||
type: "Routing Violation",
|
||||
message: routingCheck.violation,
|
||||
expected: routingCheck.expected,
|
||||
action: "STOP - Do not proceed until corrected"
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Routing Violation Examples
|
||||
|
||||
### CRITICAL: Status Change Without Status Progression Skill
|
||||
|
||||
```javascript
|
||||
User: "Mark task T1 complete"
|
||||
Orchestrator: [Calls manage_container directly]
|
||||
|
||||
// Pre-execution validation detects:
|
||||
{
|
||||
severity: "CRITICAL",
|
||||
type: "Status change bypassed mandatory Status Progression Skill",
|
||||
expected: "Use Status Progression Skill for status changes",
|
||||
reason: "Prerequisite validation required (summary 300-500 chars, dependencies completed)",
|
||||
action: "STOP - Use Status Progression Skill instead"
|
||||
}
|
||||
|
||||
// Alert user immediately, do NOT proceed
|
||||
```
|
||||
|
||||
### CRITICAL: Feature Creation Without Feature Orchestration Skill
|
||||
|
||||
```javascript
|
||||
User: "Create a user authentication feature"
|
||||
Orchestrator: [Calls manage_container directly]
|
||||
|
||||
// Pre-execution validation detects:
|
||||
{
|
||||
severity: "CRITICAL",
|
||||
type: "Feature creation bypassed mandatory Feature Orchestration Skill",
|
||||
expected: "Use Feature Orchestration Skill for feature creation",
|
||||
reason: "Complexity assessment and template discovery required",
|
||||
action: "STOP - Use Feature Orchestration Skill instead"
|
||||
}
|
||||
```
|
||||
|
||||
### WARN: Implementation Without Asking User
|
||||
|
||||
```javascript
|
||||
User: "Implement login API"
|
||||
Orchestrator: [Works directly without asking preference]
|
||||
|
||||
// Pre-execution validation detects:
|
||||
{
|
||||
severity: "WARN",
|
||||
type: "Implementation without user preference",
|
||||
expected: "Ask user: Direct vs Specialist?",
|
||||
reason: "User should choose approach",
|
||||
action: "Log to TodoWrite, suggest asking user"
|
||||
}
|
||||
|
||||
// Log but don't block
|
||||
```
|
||||
|
||||
## Output Example
|
||||
|
||||
```javascript
|
||||
// Successful pre-execution
|
||||
{
|
||||
ready: true,
|
||||
contextCaptured: true,
|
||||
entityType: "planning-specialist",
|
||||
category: "SUBAGENT",
|
||||
checkpoints: 10,
|
||||
routingValid: true,
|
||||
expected: {
|
||||
mode: "Detailed",
|
||||
needsDocumentation: true,
|
||||
domainCount: 3,
|
||||
tokenRange: [1800, 2200]
|
||||
},
|
||||
message: "✅ Ready to launch Planning Specialist - 10 checkpoints set"
|
||||
}
|
||||
```
|
||||
|
||||
## Integration Example
|
||||
|
||||
```javascript
|
||||
// Before launching Planning Specialist
|
||||
orchestration-qa(
|
||||
phase="pre",
|
||||
entityType="planning-specialist",
|
||||
userInput="Create user authentication feature with OAuth2, JWT tokens, role-based access"
|
||||
)
|
||||
|
||||
// Returns context captured, checkpoints set
|
||||
// Orchestrator proceeds with launch
|
||||
```
|
||||
282
skills/orchestration-qa/routing-validation.md
Normal file
@@ -0,0 +1,282 @@
|
||||
# Routing Validation
|
||||
|
||||
**Purpose**: Detect violations of mandatory Skill usage patterns (Skills vs Direct tools vs Subagents).
|
||||
|
||||
**When**: After ANY workflow completes
|
||||
|
||||
**Applies To**: All Skills and Subagents
|
||||
|
||||
**Token Cost**: ~300-500 tokens
|
||||
|
||||
## Critical Routing Rules
|
||||
|
||||
### Rule 1: Status Changes MUST Use Status Progression Skill
|
||||
|
||||
**Violation**: Calling `manage_container(operation="setStatus")` directly
|
||||
|
||||
**Expected**: Use Status Progression Skill for ALL status changes
|
||||
|
||||
**Why Critical**: Prerequisite validation (summary length, dependencies, task completion) required
|
||||
|
||||
**Detection**:
|
||||
```javascript
|
||||
if (statusChanged && !usedStatusProgressionSkill) {
|
||||
return {
|
||||
severity: "CRITICAL",
|
||||
violation: "Status change bypassed mandatory Status Progression Skill",
|
||||
impact: "Prerequisites may not have been validated",
|
||||
expected: "Use Status Progression Skill for status changes"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Rule 2: Feature Creation MUST Use Feature Orchestration Skill
|
||||
|
||||
**Violation**: Calling `manage_container(operation="create", containerType="feature")` directly
|
||||
|
||||
**Expected**: Use Feature Orchestration Skill for feature creation
|
||||
|
||||
**Why Critical**: Complexity assessment and template discovery required
|
||||
|
||||
**Detection**:
|
||||
```javascript
|
||||
if (featureCreated && !usedFeatureOrchestrationSkill) {
|
||||
return {
|
||||
severity: "CRITICAL",
|
||||
violation: "Feature creation bypassed mandatory Feature Orchestration Skill",
|
||||
impact: "Complexity not assessed, templates may be missed",
|
||||
expected: "Use Feature Orchestration Skill for feature creation"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Rule 3: Task Execution SHOULD Use Task Orchestration Skill
|
||||
|
||||
**Violation**: Launching specialists directly without checking dependencies/parallel opportunities
|
||||
|
||||
**Expected**: Use Task Orchestration Skill for batch execution
|
||||
|
||||
**Why Important**: Dependency analysis and parallelization optimization
|
||||
|
||||
**Detection**:
|
||||
```javascript
|
||||
if (multipleTasksLaunched && !usedTaskOrchestrationSkill) {
|
||||
return {
|
||||
severity: "WARN",
|
||||
violation: "Multiple tasks launched without Task Orchestration Skill",
|
||||
impact: "May miss parallel opportunities or dependency conflicts",
|
||||
expected: "Use Task Orchestration Skill for batch execution"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Rule 4: Implementation Specialists MUST Use Status Progression for Completion
|
||||
|
||||
**Violation**: Specialist calls `manage_container(operation="setStatus")` directly
|
||||
|
||||
**Expected**: Specialist uses Status Progression Skill to mark complete
|
||||
|
||||
**Why Critical**: Prerequisite validation (summary, Files Changed section, tests)
|
||||
|
||||
**Detection**:
|
||||
```javascript
|
||||
if (implementationSpecialist && taskCompleted && !usedStatusProgressionSkill) {
|
||||
return {
|
||||
severity: "CRITICAL",
|
||||
violation: `${specialistName} marked task complete without Status Progression Skill`,
|
||||
impact: "Summary/Files Changed/test validation may have been skipped",
|
||||
expected: "Use Status Progression Skill in Step 8 of specialist lifecycle"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Validation Workflow
|
||||
|
||||
### Step 1: Identify Workflow Type
|
||||
|
||||
```javascript
|
||||
workflowType = identifyWorkflow(entityType, userInput, output)
|
||||
// Returns: "status-change", "feature-creation", "feature-completion", "task-execution", "implementation"
|
||||
```
|
||||
|
||||
### Step 2: Check Mandatory Skill Usage
|
||||
|
||||
```javascript
|
||||
mandatorySkills = {
|
||||
"status-change": "status-progression",
|
||||
"feature-creation": "feature-orchestration",
|
||||
"task-execution": "task-orchestration", // WARN level
|
||||
"feature-completion": "feature-orchestration"
|
||||
}
|
||||
|
||||
requiredSkill = mandatorySkills[workflowType]
|
||||
```
|
||||
|
||||
### Step 3: Detect Skill Bypass
|
||||
|
||||
```javascript
|
||||
// Check if required Skill was used
|
||||
skillUsed = checkSkillUsage(output, requiredSkill)
|
||||
|
||||
if (!skillUsed && requiredSkill) {
|
||||
severity = (workflowType == "task-execution") ? "WARN" : "CRITICAL"
|
||||
|
||||
violation = {
|
||||
workflowType: workflowType,
|
||||
requiredSkill: requiredSkill,
|
||||
actualApproach: detectActualApproach(output),
|
||||
severity: severity,
|
||||
impact: describeImpact(requiredSkill)
|
||||
}
|
||||
}
|
||||
```
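Steps 2 and 3 can be combined into one small function. The sketch below is illustrative only: `detectSkillBypass` and its `skillsUsed` parameter (the names of Skills observed during the workflow) are hypothetical, and how Skill usage is actually observed is not specified in this document.

```javascript
// Mapping copied from Step 2.
const MANDATORY_SKILLS = {
  "status-change": "status-progression",
  "feature-creation": "feature-orchestration",
  "task-execution": "task-orchestration", // WARN level
  "feature-completion": "feature-orchestration",
};

// Returns null when no Skill is required or the required Skill was used;
// otherwise returns a violation record with the severity rule from Step 3.
function detectSkillBypass(workflowType, skillsUsed) {
  const requiredSkill = MANDATORY_SKILLS[workflowType];
  if (!requiredSkill || skillsUsed.includes(requiredSkill)) return null;
  return {
    workflowType,
    requiredSkill,
    severity: workflowType === "task-execution" ? "WARN" : "CRITICAL",
  };
}
```

Workflow types without an entry in the mapping, such as "implementation", return `null` here and are left to the lifecycle check in Step 4.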
|
||||
|
||||
### Step 4: Verify Specialist Lifecycle Adherence
|
||||
|
||||
For Implementation Specialists (Backend, Frontend, Database, Test, Technical Writer):
|
||||
|
||||
```javascript
|
||||
if (category == "SUBAGENT" && isImplementationSpecialist(entityType)) {
|
||||
lifecycle = {
|
||||
step8Expected: "Use Status Progression Skill to mark complete",
|
||||
step8Actual: detectStep8Approach(output),
|
||||
compliant: false
|
||||
}
|
||||
|
||||
// Check if Status Progression Skill was mentioned
|
||||
if (mentions(output, "Status Progression") || mentions(output, "status-progression")) {
|
||||
lifecycle.compliant = true
|
||||
} else if (taskStatusChanged) {
|
||||
violation = {
|
||||
severity: "CRITICAL",
|
||||
specialist: entityType,
|
||||
step: "Step 8",
|
||||
issue: "Marked task complete without Status Progression Skill",
|
||||
impact: "Prerequisite validation (summary, Files Changed, tests) may be incomplete",
|
||||
expected: "Use Status Progression Skill for completion"
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Violation Severity Levels
|
||||
|
||||
### CRITICAL (Immediate Alert)
|
||||
- Status change without Status Progression Skill
|
||||
- Feature creation without Feature Orchestration Skill
|
||||
- Implementation specialist completion without Status Progression Skill
|
||||
- **Action**: Report immediately, add to TodoWrite, suggest correction
|
||||
|
||||
### WARN (Log for Review)
|
||||
- Task execution without Task Orchestration Skill (multiple tasks)
|
||||
- Efficiency opportunities missed (parallelization)
|
||||
- **Action**: Log to TodoWrite, mention in end-of-session summary
|
||||
|
||||
### INFO (Observation)
|
||||
- Workflow variations that are acceptable
|
||||
- Optimization suggestions
|
||||
- **Action**: Track for pattern analysis only
|
||||
|
||||
## Report Template
|
||||
|
||||
```markdown
|
||||
## 🚨 Routing Violation Detected
|
||||
|
||||
**Severity**: CRITICAL
|
||||
|
||||
**Workflow Type**: [status-change / feature-creation / etc.]
|
||||
|
||||
**Violation**: [Description]
|
||||
|
||||
**Impact**: [What this affects]
|
||||
|
||||
**Expected Approach**: Use [Skill Name] Skill
|
||||
|
||||
**Actual Approach**: Direct tool call / Subagent / etc.
|
||||
|
||||
**Recommendation**: [How to correct]
|
||||
|
||||
---
|
||||
|
||||
**Added to TodoWrite**:
|
||||
- Review [Workflow]: [Issue description]
|
||||
|
||||
**Decision Required**: Should orchestrator retry using correct Skill?
|
||||
```
|
||||
|
||||
## Common Violations
|
||||
|
||||
### Violation 1: Direct Status Change
|
||||
|
||||
```javascript
|
||||
User: "Mark task T1 complete"
|
||||
Orchestrator: manage_container(operation="setStatus", status="completed") // ❌
|
||||
|
||||
Expected: Use Status Progression Skill
|
||||
Reason: Summary validation, dependency checks required
|
||||
```
|
||||
|
||||
### Violation 2: Direct Feature Creation
|
||||
|
||||
```javascript
|
||||
User: "Create user authentication feature"
|
||||
Orchestrator: manage_container(operation="create", containerType="feature") // ❌
|
||||
|
||||
Expected: Use Feature Orchestration Skill
|
||||
Reason: Complexity assessment, template discovery required
|
||||
```
|
||||
|
||||
### Violation 3: Specialist Bypass
|
||||
|
||||
```javascript
|
||||
Backend Engineer: manage_container(operation="setStatus", status="completed") // ❌
|
||||
|
||||
Expected: Use Status Progression Skill in Step 8
|
||||
Reason: Summary, Files Changed, test validation required
|
||||
```
|
||||
|
||||
## Integration with Post-Execution Review
|
||||
|
||||
```javascript
|
||||
// ALWAYS run routing validation in post-execution
|
||||
Read "routing-validation.md"
|
||||
|
||||
violations = detectRoutingViolations(
|
||||
workflowType,
|
||||
entityType,
|
||||
entityOutput,
|
||||
context
|
||||
)
|
||||
|
||||
// An empty violations list simply skips the loop
for (const violation of violations) {
|
||||
if (violation.severity == "CRITICAL") {
|
||||
// Report immediately
|
||||
alertUser(violation)
|
||||
addToTodoWrite(violation)
|
||||
} else {
|
||||
// Log for summary
|
||||
logViolation(violation)
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Continuous Improvement
|
||||
|
||||
### Pattern Tracking
|
||||
If same violation occurs 2+ times in session:
|
||||
- Update orchestrator instructions
|
||||
- Add validation checkpoint in pre-execution
|
||||
- Suggest systemic improvement
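
The "2+ times in session" rule can be sketched as a simple counter. The function name, the `threshold` parameter, and the choice of grouping key (workflow type plus required Skill) are assumptions made for illustration:

```javascript
// Hypothetical session-level tracker: groups violations and keeps those seen
// at least `threshold` times.
function findRecurringViolations(violations, threshold = 2) {
  const counts = new Map();
  for (const v of violations) {
    const key = `${v.workflowType}:${v.requiredSkill}`;
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return [...counts.entries()]
    .filter(([, count]) => count >= threshold)
    .map(([key, count]) => ({ key, count }));
}
```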
|
||||
|
||||
### Definition Updates
|
||||
Recurring violations indicate documentation gaps:
|
||||
- Update Skill definitions with clearer trigger patterns
|
||||
- Add examples of correct vs incorrect usage
|
||||
- Update CLAUDE.md Decision Gates section
|
||||
|
||||
## When to Report
|
||||
|
||||
- **CRITICAL violations**: Report immediately (don't wait for post-execution)
|
||||
- **WARN violations**: Include in post-execution summary
|
||||
- **INFO observations**: Track for pattern analysis only
|
||||
155
skills/orchestration-qa/tag-quality.md
Normal file
@@ -0,0 +1,155 @@
|
||||
# Tag Quality Analysis
|
||||
|
||||
**Purpose**: Validate Planning Specialist's tag strategy ensures complete specialist coverage.
|
||||
|
||||
**When**: After Planning Specialist completes task breakdown
|
||||
|
||||
**Entity**: Planning Specialist only
|
||||
|
||||
**Token Cost**: ~400-600 tokens
|
||||
|
||||
## Quality Metrics
|
||||
|
||||
1. **Tag Coverage** (100% baseline): Every task has tags that map to a specialist
|
||||
2. **Tag Conventions** (90% baseline): Tags follow project conventions (reuse existing)
|
||||
3. **Agent Mapping Coverage** (100% baseline): All tags map to specialists in agent-mapping.yaml
|
||||
|
||||
**Target**: 100% coverage, 90%+ conventions adherence
|
||||
|
||||
## Analysis Workflow
|
||||
|
||||
### Step 1: Load Project Tag Conventions
|
||||
|
||||
```javascript
|
||||
// Get existing tags from project
|
||||
projectTags = list_tags(entityTypes=["TASK", "FEATURE"])
|
||||
|
||||
// Load agent-mapping.yaml
|
||||
agentMapping = Read(".taskorchestrator/agent-mapping.yaml").tagMappings
|
||||
```
|
||||
|
||||
### Step 2: Analyze Task Tags
|
||||
|
||||
```javascript
|
||||
tasks = query_container(operation="overview", containerType="feature", id=featureId).tasks
|
||||
|
||||
tagAnalysis = {
|
||||
totalTasks: tasks.length,
|
||||
tasksWithTags: 0,
|
||||
tasksWithoutTags: [],
|
||||
tagCoverage: [],
|
||||
conventionViolations: [],
|
||||
unmappedTags: []
|
||||
}
|
||||
|
||||
for (const task of tasks) {
|
||||
if (!task.tags || task.tags.length == 0) {
|
||||
tagAnalysis.tasksWithoutTags.push(task.title)
|
||||
continue
|
||||
}
|
||||
|
||||
tagAnalysis.tasksWithTags++
|
||||
|
||||
// Check each tag
|
||||
for (const tag of task.tags) {
|
||||
// Does tag map to a specialist?
|
||||
if (!agentMapping[tag]) {
|
||||
tagAnalysis.unmappedTags.push({
|
||||
task: task.title,
|
||||
tag: tag,
|
||||
severity: "ALERT"
|
||||
})
|
||||
}
|
||||
|
||||
// Is tag following conventions (existing tag)?
|
||||
if (!projectTags.includes(tag)) {
|
||||
tagAnalysis.conventionViolations.push({
|
||||
task: task.title,
|
||||
tag: tag,
|
||||
severity: "WARN",
|
||||
suggestion: "Use existing project tags or add to agent-mapping.yaml"
|
||||
})
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Step 3: Verify Specialist Coverage
|
||||
|
||||
```javascript
|
||||
coverageCheck = {
|
||||
covered: 0,
|
||||
uncovered: []
|
||||
}
|
||||
|
||||
for (const task of tasks) {
|
||||
specialists = getSpecialistsForTask(task.tags, agentMapping)
|
||||
|
||||
if (specialists.length == 0) {
|
||||
coverageCheck.uncovered.push({
|
||||
task: task.title,
|
||||
tags: task.tags,
|
||||
issue: "No specialist mapping found",
|
||||
severity: "ALERT"
|
||||
})
|
||||
} else {
|
||||
coverageCheck.covered++
|
||||
}
|
||||
}
|
||||
```
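`getSpecialistsForTask` is not defined in this document. A minimal sketch, assuming `agentMapping` is a plain object that maps each tag to a single specialist name (the real structure of `agent-mapping.yaml` may differ):

```javascript
// Hypothetical helper: collects the distinct specialists that a task's tags map to.
function getSpecialistsForTask(tags, agentMapping) {
  const specialists = (tags || [])
    .map((tag) => agentMapping[tag])
    .filter(Boolean); // drop tags with no mapping
  return [...new Set(specialists)];
}
```

An empty result is what Step 3 treats as an uncovered task.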
|
||||
|
||||
### Step 4: Calculate Quality Score
|
||||
|
||||
```javascript
|
||||
score = {
|
||||
tagCoverage: (tagAnalysis.tasksWithTags / tagAnalysis.totalTasks) * 100,
|
||||
agentMappingCoverage: (coverageCheck.covered / tagAnalysis.totalTasks) * 100,
|
||||
conventionAdherence: (
|
||||
(tagAnalysis.tasksWithTags - tagAnalysis.conventionViolations.length) /
|
||||
tagAnalysis.tasksWithTags
|
||||
) * 100,
|
||||
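}

// Computed after the literal, because an object literal cannot reference its own fields
score.overall = average(score.tagCoverage, score.agentMappingCoverage, score.conventionAdherence)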
|
||||
```
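A runnable sketch of the scoring step, under two assumptions that go beyond the text: divisions by zero yield 0 rather than `NaN`, and convention violations are counted once per task (Step 2 records them per tag, which could otherwise push adherence below zero). The names `percent` and `scoreTagQuality` are illustrative.

```javascript
// Percentage that tolerates an empty denominator.
function percent(part, whole) {
  return whole === 0 ? 0 : (part / whole) * 100;
}

function scoreTagQuality(tagAnalysis, coverageCheck) {
  // Count each task once, even if several of its tags violate conventions.
  const violatingTasks = new Set(
    tagAnalysis.conventionViolations.map((v) => v.task)
  ).size;

  const tagCoverage = percent(tagAnalysis.tasksWithTags, tagAnalysis.totalTasks);
  const agentMappingCoverage = percent(coverageCheck.covered, tagAnalysis.totalTasks);
  const conventionAdherence = percent(
    tagAnalysis.tasksWithTags - violatingTasks,
    tagAnalysis.tasksWithTags
  );
  const overall = (tagCoverage + agentMappingCoverage + conventionAdherence) / 3;

  return { tagCoverage, agentMappingCoverage, conventionAdherence, overall };
}
```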
|
||||
|
||||
## Report Template
|
||||
|
||||
```markdown
|
||||
## 🏷️ Tag Quality Analysis
|
||||
|
||||
**Overall Score**: [X]% (Baseline: 90% / Target: 100%)
|
||||
|
||||
### Metrics
|
||||
- Tag Coverage: [X]% ([Y]/[Z] tasks have tags)
|
||||
- Agent Mapping Coverage: [X]% ([Y]/[Z] tasks map to specialists)
|
||||
- Convention Adherence: [X]%
|
||||
|
||||
### Issues ([count] total)
|
||||
🚨 **ALERT** ([count]): No specialist mapping
|
||||
- [Task A]: Tags [tag1, tag2] don't map to any specialist
|
||||
|
||||
⚠️ **WARN** ([count]): Convention violations
|
||||
- [Task B]: Tag "new-tag" not in project conventions
|
||||
|
||||
### Recommendations
|
||||
1. Add tags to tasks: [list]
|
||||
2. Update agent-mapping.yaml for: [tags]
|
||||
3. Use existing tags instead of: [new tags]
|
||||
```
|
||||
|
||||
## Critical Checks
|
||||
|
||||
### Check 1: Every Task Has Tags
|
||||
Tasks without tags cannot be routed to specialists.
|
||||
|
||||
### Check 2: Every Tag Maps to Specialist
|
||||
Tags that don't map to agent-mapping.yaml will fail routing.
|
||||
|
||||
### Check 3: Tags Follow Project Conventions
|
||||
New tags should be rare; reuse existing tags when possible.
|
||||
|
||||
## When to Report
|
||||
|
||||
- **ALWAYS** after Planning Specialist
|
||||
- **Full details** if score < 100%
|
||||
- **Brief summary** if score == 100%
|
||||
744
skills/orchestration-qa/task-content-quality.md
Normal file
@@ -0,0 +1,744 @@
|
||||
# Task Content Quality Analysis
|
||||
|
||||
**Purpose**: Analyze information added to tasks by specialists to detect wasteful content, measure information density, and suggest improvements.
|
||||
|
||||
**When**: After Implementation Specialists complete tasks (Backend, Frontend, Database, Test, Technical Writer)
|
||||
|
||||
**Applies To**: Implementation Specialist Subagents only
|
||||
|
||||
**Token Cost**: ~500-700 tokens
|
||||
|
||||
## Overview
|
||||
|
||||
Implementation specialists add content to tasks through:
|
||||
1. **Summary field** (300-500 chars) - What was accomplished
|
||||
2. **Task sections** - Detailed results, approach, decisions
|
||||
3. **Files Changed section** (ordinal 999) - List of modified files
|
||||
|
||||
This analysis ensures specialists add **high-density, non-redundant information** while avoiding token waste.
|
||||
|
||||
## Quality Metrics
|
||||
|
||||
### 1. Information Density
|
||||
**Definition**: Ratio of useful information to total tokens added
|
||||
|
||||
**Formula**: `density = (unique_concepts + actionable_details) / total_tokens`
|
||||
|
||||
**Target**: ≥ 70% (7 concepts per 10 tokens)
|
||||
|
||||
**Good Example** (High Density):
|
||||
```
|
||||
Summary (87 tokens):
|
||||
"Implemented OAuth2 authentication with JWT tokens. Added UserService with
|
||||
login/logout endpoints. All 12 tests passing. Files: AuthController.kt,
|
||||
UserService.kt, SecurityConfig.kt, AuthControllerTest.kt"
|
||||
|
||||
Density: 85% (7 concepts: OAuth2, JWT, UserService, login, logout, tests passing, files)
|
||||
```
|
||||
|
||||
**Bad Example** (Low Density):
|
||||
```
|
||||
Summary (143 tokens):
|
||||
"I have successfully completed the implementation of the authentication feature
|
||||
as requested. The work involved creating the necessary components and ensuring
|
||||
everything works correctly. Testing was performed and all tests are now passing
|
||||
successfully."
|
||||
|
||||
Density: 35% (3 concepts: authentication, components created, tests passing)
|
||||
Waste: 60 tokens of filler words
|
||||
```
|
||||
|
||||
### 2. Redundancy Score
|
||||
**Definition**: Percentage of information duplicated across summary + sections
|
||||
|
||||
**Formula**: `redundancy = duplicate_tokens / (summary_tokens + section_tokens)`
|
||||
|
||||
**Target**: ≤ 20% (minimal overlap between summary and sections)
|
||||
|
||||
**Detection**:
|
||||
```javascript
|
||||
// Extract key phrases from summary
|
||||
summaryPhrases = extractPhrases(task.summary)
|
||||
// e.g., ["OAuth2 authentication", "JWT tokens", "UserService", "12 tests passing"]
|
||||
|
||||
// Check sections for duplicate phrases
|
||||
sectionContent = task.sections.map(s => s.content).join(" ")
|
||||
duplicates = summaryPhrases.filter(phrase => sectionContent.includes(phrase))
|
||||
|
||||
redundancy = (duplicates.length / summaryPhrases.length) * 100
|
||||
```
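Note that the detection snippet measures the share of summary phrases repeated in sections, which is a phrase-level proxy for the token-based formula rather than the formula itself. A runnable version of that proxy, assuming the key phrases are supplied by the caller (`extractPhrases` is not defined in this document) and that comparison is case-insensitive:

```javascript
// Share of summary phrases (0-100) that also appear somewhere in the sections.
function phraseRedundancy(summaryPhrases, sections) {
  if (summaryPhrases.length === 0) return 0;
  const sectionContent = sections.map((s) => s.content).join(" ").toLowerCase();
  const duplicates = summaryPhrases.filter((phrase) =>
    sectionContent.includes(phrase.toLowerCase())
  );
  return (duplicates.length / summaryPhrases.length) * 100;
}
```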
|
||||
|
||||
**High Redundancy Example** (Bad):
|
||||
```
|
||||
Summary:
|
||||
"Implemented OAuth2 authentication with JWT tokens. Added UserService."
|
||||
|
||||
Technical Approach Section:
|
||||
"For this task, I implemented OAuth2 authentication using JWT tokens.
|
||||
I created a UserService to handle authentication logic..."
|
||||
|
||||
Redundancy: 70% (both mention OAuth2, JWT, UserService)
|
||||
```
|
||||
|
||||
**Low Redundancy Example** (Good):
|
||||
```
|
||||
Summary:
|
||||
"Implemented OAuth2 authentication. 12 tests passing."
|
||||
|
||||
Technical Approach Section:
|
||||
"Used Spring Security OAuth2 library. Token validation in JwtFilter.
|
||||
Refresh token rotation every 24h. Rate limiting: 5 attempts/min."
|
||||
|
||||
Redundancy: 15% (summary is high-level, section adds technical details)
|
||||
```
|
||||
|
||||
### 3. Code Snippet Ratio
|
||||
**Definition**: Percentage of section content that is code vs explanation
|
||||
|
||||
**Formula**: `code_ratio = code_block_tokens / section_tokens`
|
||||
|
||||
**Target**: ≤ 30% (sections explain, files contain code)
|
||||
|
||||
**Detection**:
|
||||
```javascript
|
||||
// Count tokens in code blocks
|
||||
codeBlocks = extractCodeBlocks(section.content) // ```language ... ```
|
||||
codeTokens = sum(codeBlocks.map(b => estimateTokens(b)))
|
||||
|
||||
// Total section tokens
|
||||
sectionTokens = estimateTokens(section.content)
|
||||
|
||||
ratio = (codeTokens / sectionTokens) * 100
|
||||
```
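`extractCodeBlocks` and `estimateTokens` are left undefined above. The sketch below fills them in under stated assumptions: tokens are estimated at roughly 4 characters each (a common rule of thumb, not a tokenizer), and code blocks are detected as triple-backtick fences.

```javascript
// Rough heuristic (assumption): about 4 characters per token.
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

// Built with repeat() so this example does not contain a literal fence.
const FENCE = "`".repeat(3);

// Returns the body of every fenced code block in the content.
function extractCodeBlocks(content) {
  const pattern = new RegExp(`${FENCE}[^\\n]*\\n([\\s\\S]*?)${FENCE}`, "g");
  return [...content.matchAll(pattern)].map((m) => m[1]);
}

// Percentage (0-100) of estimated section tokens that sit inside code blocks.
function codeRatio(content) {
  const sectionTokens = estimateTokens(content);
  if (sectionTokens === 0) return 0;
  const codeTokens = extractCodeBlocks(content)
    .map(estimateTokens)
    .reduce((a, b) => a + b, 0);
  return (codeTokens / sectionTokens) * 100;
}
```

A section scoring above 30 with this function would be a candidate for the "Full Code in Sections" pattern.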
|
||||
|
||||
**Bad Example** (High Code Ratio):
|
||||
```markdown
|
||||
## Implementation Details
|
||||
|
||||
Here's the UserService implementation:
|
||||
|
||||
```kotlin
|
||||
@Service
|
||||
class UserService(
|
||||
private val userRepository: UserRepository,
|
||||
private val passwordEncoder: PasswordEncoder
|
||||
) {
|
||||
fun login(email: String, password: String): User? {
|
||||
val user = userRepository.findByEmail(email)
|
||||
return if (user != null && passwordEncoder.matches(password, user.password)) {
|
||||
user
|
||||
} else null
|
||||
}
|
||||
// ... 50 more lines
|
||||
}
|
||||
```
|
||||
|
||||
And here's the test:
|
||||
|
||||
```kotlin
|
||||
@Test
|
||||
fun `login with valid credentials returns user`() {
|
||||
// ... 30 lines of test code
|
||||
}
|
||||
```
|
||||
|
||||
Code Ratio: 85% (300 code tokens / 350 total tokens)
|
||||
Issue: Full code belongs in files, not task sections
|
||||
```
|
||||
|
||||
**Good Example** (Low Code Ratio):
|
||||
```markdown
|
||||
## Implementation Details
|
||||
|
||||
Created UserService with login/logout methods. Key decisions:
|
||||
- Password hashing: BCrypt (cost factor 12)
|
||||
- Session management: JWT with 1h expiration
|
||||
- Rate limiting: 5 failed attempts → 15min lockout
|
||||
|
||||
Example usage:
|
||||
```kotlin
|
||||
userService.login(email, password) // Returns User or null
|
||||
```
|
||||
|
||||
Code Ratio: 12% (20 code tokens / 165 total tokens)
|
||||
Quality: Explains approach, minimal code snippet for clarity
|
||||
```
|
||||
|
||||
### 4. Summary Quality
|
||||
**Definition**: Summary is concise, informative, and follows best practices
|
||||
|
||||
**Checks**:
|
||||
- ✅ Length: 300-500 characters (enforced by Status Progression Skill)
|
||||
- ✅ Mentions what was done (not how or why - that's in sections)
|
||||
- ✅ Includes test status
|
||||
- ✅ Lists key files changed
|
||||
- ✅ No filler words ("I have...", "successfully...", "as requested...")
|
||||
|
||||
**Scoring**:
|
||||
```javascript
|
||||
quality = {
|
||||
length: inRange(summary.length, 300, 500) ? 25 : 0,
|
||||
mentions_what: containsActionVerbs(summary) ? 25 : 0, // "Implemented", "Added", "Fixed"
|
||||
test_status: mentionsTests(summary) ? 25 : 0, // "12 tests passing"
|
||||
no_filler: !containsFiller(summary) ? 25 : 0 // No "successfully", "I have"
|
||||
}
|
||||
|
||||
score = sum(quality.values) // 0-100
|
||||
```
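The helper predicates in the scoring snippet are not defined here. A runnable sketch, assuming simple keyword heuristics built from the examples given in the comments (the verb and filler lists and the test-status pattern are illustrative, not exhaustive):

```javascript
const ACTION_VERBS = ["implemented", "added", "fixed"];
const FILLER = ["i have", "successfully", "as requested"];

// Applies the four 25-point checks and returns both the checks and the total.
function scoreSummary(summary) {
  const lower = summary.toLowerCase();
  const checks = {
    length: summary.length >= 300 && summary.length <= 500,
    mentions_what: ACTION_VERBS.some((verb) => lower.includes(verb)),
    test_status: /\btests?\b.*\bpass(ing|ed)?\b/.test(lower),
    no_filler: !FILLER.some((phrase) => lower.includes(phrase)),
  };
  const score = Object.values(checks).filter(Boolean).length * 25;
  return { checks, score };
}
```

Because each check is worth 25 points, the total is always one of 0, 25, 50, 75, or 100.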
|
||||
|
||||
**Example Scores**:
|
||||
|
||||
100/100 (Excellent):
|
||||
```
|
||||
"Implemented OAuth2 authentication with JWT tokens. Added UserService for
|
||||
user management. All 12 tests passing. Files: AuthController.kt, UserService.kt,
|
||||
SecurityConfig.kt"
|
||||
|
||||
✓ Length: 387 chars
|
||||
✓ Mentions what: "Implemented", "Added"
|
||||
✓ Test status: "12 tests passing"
|
||||
✓ No filler: Clean, direct
|
||||
```
|
||||
|
||||
50/100 (Poor):
|
||||
```
|
||||
"I have successfully completed the authentication feature as requested. The
|
||||
implementation involved creating the necessary components and ensuring that
|
||||
everything works correctly. All tests are passing."
|
||||
|
||||
✓ Length: 349 chars
|
||||
✗ Mentions what: Vague "components"
|
||||
✓ Test status: "tests are passing"
|
||||
✗ No filler: "successfully", "as requested", "I have"
|
||||
```
|
||||
|
||||
### 5. Section Usefulness
|
||||
**Definition**: Sections add value beyond what's in summary and files
|
||||
|
||||
**Checks per section**:
|
||||
- ✅ Explains decisions/trade-offs
|
||||
- ✅ Documents non-obvious approach
|
||||
- ✅ Provides context for future developers
|
||||
- ✅ References files instead of duplicating code
|
||||
- ✅ Concise (bullet points > paragraphs)
|
||||
|
||||
**Scoring**:
|
||||
```javascript
|
||||
usefulness = {
|
||||
explains_why: containsRationale(section) ? 20 : 0, // "Chose X because..."
|
||||
approach: describesApproach(section) ? 20 : 0, // "Used pattern Y"
|
||||
future_context: providesContext(section) ? 20 : 0, // "Note: Z limitation"
|
||||
references_files: hasFileReferences(section) ? 20 : 0, // "See AuthController.kt:45"
|
||||
concise: isConcise(section) ? 20 : 0 // Bullet points, not prose
|
||||
}
|
||||
|
||||
score = sum(usefulness.values) // 0-100
|
||||
```
|
||||
|
||||
## Wasteful Patterns to Detect
|
||||
|
||||
### Pattern 1: Full Code in Sections
|
||||
|
||||
**Issue**: Code belongs in files, not task documentation
|
||||
|
||||
**Detection**:
|
||||
```javascript
|
||||
if (section.codeBlockCount > 2 || section.codeRatio > 30) {
|
||||
return {
|
||||
pattern: "Full code in sections",
|
||||
severity: "WARN",
|
||||
found: `${section.codeBlockCount} code blocks, ${section.codeRatio}% of content`,
|
||||
expected: "≤ 2 brief code snippets, ≤ 30% code ratio",
|
||||
recommendation: "Move code to files, reference with: 'See FileName.kt:lineNumber'",
|
||||
savings: estimateSavings(section) // e.g., "~500 tokens"
|
||||
}
|
||||
}
|
||||
```

### Pattern 2: Full Test Output

**Issue**: Test results should be summarized, not pasted verbatim

**Detection**:
```javascript
if (section.title.includes("Test") && section.content.includes("PASSED") && section.content.length > 500) {
  return {
    pattern: "Full test output in section",
    severity: "WARN",
    found: `${section.content.length} chars of test output`,
    expected: "Test summary: X/Y passed, failure details if any",
    recommendation: "Summarize: '12/12 tests passing' or '11/12 passing (1 flaky test)'",
    savings: `~${estimateTokens(section.content)} tokens`
  }
}
```

### Pattern 3: Summary Redundancy

**Issue**: Summary repeats information already in sections

**Detection**:
```javascript
overlap = calculateOverlap(task.summary, task.sections)

if (overlap > 40) {
  return {
    pattern: "High summary-section redundancy",
    severity: "INFO",
    found: `${overlap}% overlap between summary and sections`,
    expected: "≤ 20% overlap (summary = high-level, sections = details)",
    recommendation: "Make summary more concise, or add new details to sections",
    savings: `~${estimateRedundantTokens(task)} tokens`
  }
}
```
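`calculateOverlap` is left undefined here. As an illustration only, it could be approximated as the share of distinct summary words that also appear in the sections; the word-length cutoff and the definition itself are assumptions, not the skill's actual metric:

```javascript
// Hypothetical definition: percentage of distinct summary words (4+ characters)
// that also occur somewhere in the section contents.
function calculateOverlap(summary, sections) {
  const words = (text) => new Set(text.toLowerCase().match(/[a-z0-9]{4,}/g) || []);
  const summaryWords = words(summary);
  if (summaryWords.size === 0) return 0;

  const sectionWords = words(sections.map((section) => section.content).join(" "));
  let shared = 0;
  for (const word of summaryWords) {
    if (sectionWords.has(word)) shared += 1;
  }
  return Math.round((shared / summaryWords.size) * 100);
}

console.log(
  calculateOverlap("Added login endpoint with token refresh", [
    { content: "Token refresh uses rotating keys" },
  ])
);
```

A word-set comparison ignores phrasing and order, so paraphrased repetition would go undetected by this sketch.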

### Pattern 4: Filler Language

**Issue**: Verbose, unnecessary words that don't add information

**Detection**:
```javascript
fillerPhrases = [
  "I have successfully",
  "as requested",
  "in order to",
  "it should be noted that",
  "for the purpose of",
  "with regards to",
  "in conclusion"
]

// Compare case-insensitively so a phrase at the start of a sentence ("In order to") is also caught
summaryLower = task.summary.toLowerCase()
found = fillerPhrases.filter(phrase => summaryLower.includes(phrase.toLowerCase()))

if (found.length > 0) {
  return {
    pattern: "Filler language in summary",
    severity: "INFO",
    found: found.join(", "),
    expected: "Direct, concise language",
    recommendation: "Remove filler: 'Implemented X' not 'I have successfully implemented X as requested'",
    savings: `~${found.length * 3} tokens`
  }
}
```

### Pattern 5: Over-Explaining Obvious

**Issue**: Explaining what's clear from file/function names

**Detection**:
```javascript
if (section.title == "Implementation" && containsObvious(section.content)) {
  return {
    pattern: "Over-explaining obvious implementation",
    severity: "INFO",
    example: "Explaining 'UserService manages users' when class is named UserService",
    recommendation: "Focus on non-obvious: design decisions, trade-offs, gotchas",
    savings: "~100-200 tokens"
  }
}
```

### Pattern 6: Uncustomized Template Sections

**Issue**: Generic template sections with placeholder text that provide zero value

**Detection**:
```javascript
placeholderPatterns = [
  /\[Component\s*\d*\]/i,
  /\[Library\s*Name\]/i,
  /\[Phase\s*Name\]/i,
  /\[Library\]/i,
  /\[Version\]/i,
  /\[What it does\]/i,
  /\[Why chosen\]/i,
  /\[Goal\]:/i,
  /\[Deliverables\]:/i
]

// for...of iterates the section objects (for...in would yield array indices)
for (const section of task.sections) {
  // Check for placeholder patterns
  hasPlaceholder = placeholderPatterns.some(pattern => pattern.test(section.content))

  // Check for generic template titles with minimal content
  genericTitles = ["Architecture Overview", "Key Dependencies", "Implementation Strategy"]
  isGenericTitle = genericTitles.includes(section.title)
  hasMinimalCustomization = section.content.length < 300 || section.content.includes('[')

  if (hasPlaceholder || (isGenericTitle && hasMinimalCustomization)) {
    return {
      pattern: "Uncustomized template section",
      severity: "WARN",  // High priority - significant token waste
      found: `Section "${section.title}" contains placeholder text or generic template`,
      expected: "Task-specific content ≥200 chars, OR delete section entirely",
      // Template literal so that ${section.id} is actually interpolated
      recommendation: `DELETE section using manage_sections(operation='delete', id='${section.id}') - Templates provide sufficient structure`,
      savings: `~${estimateTokens(section.content)} tokens`,
      sectionId: section.id,
      action: "DELETE"  // Explicit action to take
    }
  }
}
```

**Common Placeholder Patterns**:
- `[Component 1]`, `[Component 2]` - Generic component names
- `[Library Name]`, `[Version]` - Dependency table placeholders
- `[Phase Name]`, `[Goal]:`, `[Deliverables]:` - Implementation strategy placeholders
- `[What it does]`, `[Why chosen]` - Generic explanations

**Examples of Violations**:

**Bad Example 1 - Architecture Overview with placeholders**:
```markdown
Title: Architecture Overview
Content:
This task involves the following components:
- [Component 1]: [What it does]
- [Component 2]: [What it does]

Technical approach:
- [Library Name] for [functionality]
- [Library Name] for [functionality]

(72 tokens of waste - DELETE this section)
```

**Bad Example 2 - Key Dependencies with placeholders**:
```markdown
Title: Key Dependencies
Content:
| Library | Version | Purpose |
|---------|---------|---------|
| [Library Name] | [Version] | [What it does] |
| [Library Name] | [Version] | [What it does] |

Rationale:
- [Library]: [Why chosen]

(85 tokens of waste - DELETE this section)
```

**Bad Example 3 - Implementation Strategy with placeholders**:
```markdown
Title: Implementation Strategy
Content:
Phase 1: [Phase Name]
- Goal: [Goal]
- Deliverables: [Deliverables]

Phase 2: [Phase Name]
- Goal: [Goal]
- Deliverables: [Deliverables]

(98 tokens of waste - DELETE this section)
```

**Proper Response When Detected**:
```markdown
⚠️ WARN - Uncustomized Template Sections (Pattern 6)

**Found**: 3 task sections contain placeholder text, wasting ~255 tokens

**Violations**:
1. Task [ID] - Section "Architecture Overview" (72 tokens)
   - Placeholder patterns: `[Component 1]`, `[What it does]`
   - **Action**: DELETE section (ID: xxx)
   - **Reason**: Templates provide sufficient structure

2. Task [ID] - Section "Key Dependencies" (85 tokens)
   - Placeholder patterns: `[Library Name]`, `[Version]`, `[Why chosen]`
   - **Action**: DELETE section (ID: yyy)
   - **Reason**: Generic table with no actual dependencies

3. Task [ID] - Section "Implementation Strategy" (98 tokens)
   - Placeholder patterns: `[Phase Name]`, `[Goal]:`, `[Deliverables]:`
   - **Action**: DELETE section (ID: zzz)
   - **Reason**: Uncustomized phases with no specific strategy

**Expected**: Task-specific content ≥200 chars with NO placeholder text, OR delete section entirely

**Recommendation**:
- Planning Specialist must customize ALL sections before returning to orchestrator (Step 7.5 validation)
- Implementation Specialists must DELETE any placeholder sections during Step 4
- Templates provide sufficient structure for 95% of tasks (complexity ≤7)

**Root Cause**: Planning Specialist's bulkCreate operation included generic template sections without customization

**Prevention**:
1. Planning Specialist Step 7.5 (Validate Task Quality) must detect and delete placeholder sections
2. Implementation Specialists Step 4 must check for and delete placeholder sections
3. Orchestration QA Skill now detects this pattern automatically

**Token Savings**: ~255 tokens (current waste) → 0 tokens (after deletion)
```

## Analysis Workflow

### Step 1: Capture Baseline

**Before specialist executes**:
```javascript
baseline = {
  taskId: task.id,
  summaryLength: task.summary?.length || 0,
  sectionCount: task.sections.length,
  totalTokens: estimateTaskTokens(task)
}
```

### Step 2: Measure Addition

**After specialist completes**:
```javascript
delta = {
  // Guard against a missing summary, matching the baseline capture in Step 1
  summaryAdded: (task.summary?.length || 0) - baseline.summaryLength,
  sectionsAdded: task.sections.length - baseline.sectionCount,
  tokensAdded: estimateTaskTokens(task) - baseline.totalTokens
}
```
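Steps 1 and 2 can be sketched end to end as follows. `estimateTaskTokens` is not defined in this document, so the version below assumes a rough rule of thumb of about 4 characters per token; the helper names `captureBaseline` and `measureDelta` are also hypothetical:

```javascript
// Assumption: ~4 characters per token, a rough rule of thumb for English text.
const estimateTokens = (text) => Math.ceil((text || "").length / 4);

// Hypothetical estimator: summary tokens plus the tokens of every section body.
function estimateTaskTokens(task) {
  const sections = task.sections || [];
  return (
    estimateTokens(task.summary) +
    sections.reduce((total, section) => total + estimateTokens(section.content), 0)
  );
}

// Step 1: snapshot the task before the specialist runs.
function captureBaseline(task) {
  return {
    taskId: task.id,
    summaryLength: task.summary?.length || 0,
    sectionCount: task.sections.length,
    totalTokens: estimateTaskTokens(task),
  };
}

// Step 2: compare the finished task against that snapshot.
function measureDelta(task, baseline) {
  return {
    summaryAdded: (task.summary?.length || 0) - baseline.summaryLength,
    sectionsAdded: task.sections.length - baseline.sectionCount,
    tokensAdded: estimateTaskTokens(task) - baseline.totalTokens,
  };
}

const before = { id: "t1", summary: "", sections: [] };
const snapshot = captureBaseline(before);
const after = { id: "t1", summary: "x".repeat(40), sections: [{ content: "y".repeat(80) }] };
console.log(measureDelta(after, snapshot));
```

A real tokenizer would give more accurate counts; the character heuristic is only meant to keep the analysis itself cheap.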

### Step 3: Analyze Quality

**Run quality checks**:
```javascript
analysis = {
  informationDensity: calculateDensity(task, delta),
  redundancyScore: calculateRedundancy(task),
  codeRatio: calculateCodeRatio(task),
  summaryQuality: scoreSummary(task.summary),
  sectionUsefulness: task.sections.map(s => scoreSection(s)),
  wastefulPatterns: detectWaste(task)
}
```

### Step 4: Generate Report

**Format findings**:
```javascript
report = {
  specialist: entityType,
  taskId: task.id,
  tokensAdded: delta.tokensAdded,
  quality: {
    informationDensity: `${analysis.informationDensity}%`,
    redundancy: `${analysis.redundancyScore}%`,
    codeRatio: `${analysis.codeRatio}%`,
    summaryScore: `${analysis.summaryQuality}/100`,
    avgSectionScore: average(analysis.sectionUsefulness)
  },
  wastefulPatterns: analysis.wastefulPatterns,
  potentialSavings: calculateSavings(analysis.wastefulPatterns)
}
```

### Step 5: Track Trends

**Aggregate across tasks**:
```javascript
session.contentQuality.push(report)

// After N tasks (e.g., 5), analyze trends
if (session.contentQuality.length >= 5) {
  trends = analyzeTrends(session.contentQuality)
  // e.g., "Backend Engineer consistently has high code ratio (avg 65%)"
}
```
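`analyzeTrends` is not spelled out in this document. A minimal sketch, assuming each report carries a `specialist` name and a `wastefulPatterns` array of `{ pattern }` objects, and treating a pattern as recurring once it has been seen at least twice (the threshold is an assumption of this sketch), could be:

```javascript
// Hypothetical report shape: { specialist, wastefulPatterns: [{ pattern }] }.
function analyzeTrends(reports) {
  const counts = new Map();
  for (const report of reports) {
    for (const { pattern } of report.wastefulPatterns) {
      const entry = counts.get(pattern) || { name: pattern, occurrences: 0, specialists: new Set() };
      entry.occurrences += 1;
      entry.specialists.add(report.specialist);
      counts.set(pattern, entry);
    }
  }

  // Keep patterns seen 2+ times, most frequent first
  const recurringPatterns = [...counts.values()]
    .filter((entry) => entry.occurrences >= 2)
    .sort((a, b) => b.occurrences - a.occurrences)
    .map((entry) => ({ ...entry, specialists: [...entry.specialists] }));

  return { tasksAnalyzed: reports.length, recurringPatterns };
}

const reports = [
  { specialist: "Backend Engineer", wastefulPatterns: [{ pattern: "Full code in sections" }] },
  {
    specialist: "Backend Engineer",
    wastefulPatterns: [{ pattern: "Full code in sections" }, { pattern: "Filler language in summary" }],
  },
  { specialist: "Frontend Developer", wastefulPatterns: [] },
];
console.log(JSON.stringify(analyzeTrends(reports), null, 2));
```

Averages such as code ratio per specialist could be accumulated in the same loop if the reports carry numeric metrics.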

## Report Template

```markdown
## 📊 Task Content Quality Analysis

**Specialist**: [Backend Engineer / Frontend Developer / etc.]
**Task**: [Task Title] ([ID])

### Tokens Added
- Summary: [X] chars ([Y] tokens)
- Sections: [N] sections added ([Z] tokens)
- **Total Added**: [Y+Z] tokens

### Quality Metrics
- **Information Density**: [X]% ([Target: ≥70%])
- **Redundancy Score**: [Y]% ([Target: ≤20%])
- **Code Ratio**: [Z]% ([Target: ≤30%])
- **Summary Quality**: [Score]/100

### ✅ Strengths
- [What was done well]
- [Good practice observed]

### ⚠️ Wasteful Patterns Detected ([count])

**Pattern 1: [Name]**
- Found: [What was observed]
- Expected: [Best practice]
- Recommendation: [How to improve]
- Potential Savings: ~[X] tokens

**Pattern 2: [Name]**
- Found: [What was observed]
- Expected: [Best practice]
- Recommendation: [How to improve]
- Potential Savings: ~[Y] tokens

### 💰 Total Potential Savings
- Current: [N] tokens added
- Optimized: [N-X-Y] tokens
- **Savings**: ~[X+Y] tokens ([Z]% reduction)

### 🎯 Specific Recommendations
1. [Most impactful improvement]
2. [Secondary improvement]
3. [Optional enhancement]
```

## Trend Analysis (After 5+ Tasks)

```markdown
## 📈 Content Quality Trends

**Session**: [N] tasks analyzed
**Specialists**: [List of specialists used]

### Average Metrics
- Information Density: [X]% (Target: ≥70%)
- Redundancy: [Y]% (Target: ≤20%)
- Code Ratio: [Z]% (Target: ≤30%)
- Summary Quality: [Score]/100

### Recurring Patterns
**Most Common Issue**: [Pattern name] ([N] occurrences)
- **Specialists Affected**: [Backend Engineer (3x), Frontend (2x)]
- **Total Waste**: ~[X] tokens across tasks
- **Recommendation**: Update [specialist].md to emphasize [practice]

**Second Most Common**: [Pattern name] ([M] occurrences)
- **Specialists Affected**: [...]
- **Recommendation**: [...]

### Specialist Performance

**Backend Engineer** ([N] tasks):
- Avg Density: [X]%
- Avg Redundancy: [Y]%
- Common Issue: High code ratio (avg [Z]%)
- **Recommendation**: Reference files instead of embedding code

**Frontend Developer** ([M] tasks):
- Avg Density: [X]%
- Avg Redundancy: [Y]%
- Strengths: Excellent summary quality (avg 85/100)

### System-Wide Opportunities
1. **Update Specialist Templates**
   - Add "Code in Files, Not Sections" guideline to all implementation specialists
   - Estimated Impact: [X]% token reduction

2. **Enhance Summary Guidelines**
   - Add anti-pattern examples (filler language)
   - Estimated Impact: [Y]% improvement in quality scores

3. **Section Template Improvements**
   - Provide better examples of useful vs wasteful sections
   - Estimated Impact: [Z]% reduction in redundancy
```

## Integration with Post-Execution Review

```javascript
// In post-execution.md, after Step 4 (Validate completion quality):

if (isImplementationSpecialist(entityType)) {
  // Read task-content-quality.md
  Read ".claude/skills/orchestration-qa/task-content-quality.md"

  // Run content quality analysis
  contentAnalysis = analyzeTaskContent(task, baseline)

  // Add to report
  report.contentQuality = contentAnalysis

  // Track for trends
  session.contentQuality.push(contentAnalysis)

  // If patterns found, add to deviations
  if (contentAnalysis.wastefulPatterns.length > 0) {
    deviations.push({
      severity: "INFO",  // Usually INFO, can be WARN if severe
      type: "Content Quality",
      patterns: contentAnalysis.wastefulPatterns,
      savings: contentAnalysis.potentialSavings
    })
  }
}
```

## When to Report

**Individual Task**:
- Report if wasteful patterns detected
- Report if quality scores below targets

**Session Trends**:
- After 5+ tasks analyzed
- When recurring patterns detected (same issue 2+ times)
- At session end (via `phase="summary"`)

## Add to TodoWrite (If Issues Found)

```javascript
if (contentAnalysis.potentialSavings > 100) {
  TodoWrite([{
    content: `Review ${specialist} content quality: ${contentAnalysis.potentialSavings} tokens wasted`,
    activeForm: `Reviewing ${specialist} content patterns`,
    status: "pending"
  }])
}

// If recurring pattern (trends is only set once 5+ tasks have been analyzed)
if (trends?.recurringPatterns?.length > 0) {
  TodoWrite([{
    content: `Update ${specialist}.md: ${trends.recurringPatterns[0].name} pattern recurring`,
    activeForm: `Improving ${specialist} guidelines`,
    status: "pending"
  }])
}
```

## Target Benchmarks

**Excellent** (95%+ of metrics in target):
- Information Density: ≥ 80%
- Redundancy: ≤ 15%
- Code Ratio: ≤ 20%
- Summary Quality: ≥ 85/100
- No wasteful patterns

**Good** (80%+ of metrics in target):
- Information Density: 70-79%
- Redundancy: 16-20%
- Code Ratio: 21-30%
- Summary Quality: 70-84/100
- Minor wasteful patterns (< 100 tokens waste)

**Needs Improvement** (< 80% in target):
- Information Density: < 70%
- Redundancy: > 20%
- Code Ratio: > 30%
- Summary Quality: < 70/100
- Significant waste (> 100 tokens)
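
One way to apply these tiers in code is sketched below. It simplifies the "95%+ / 80%+ of metrics in target" wording by requiring every threshold of a tier to be met, which is an assumption of this sketch; the function name and the shape of the `metrics` object are hypothetical as well:

```javascript
// Simplified reading of the benchmark tiers: a tier applies only if ALL of its thresholds hold.
function classifyQuality(metrics) {
  const { informationDensity, redundancy, codeRatio, summaryQuality, wasteTokens } = metrics;

  const excellent =
    informationDensity >= 80 &&
    redundancy <= 15 &&
    codeRatio <= 20 &&
    summaryQuality >= 85 &&
    wasteTokens === 0;
  if (excellent) return "Excellent";

  // Exactly 100 wasted tokens is not covered by the tiers above;
  // this sketch treats it as "Needs Improvement".
  const good =
    informationDensity >= 70 &&
    redundancy <= 20 &&
    codeRatio <= 30 &&
    summaryQuality >= 70 &&
    wasteTokens < 100;
  return good ? "Good" : "Needs Improvement";
}

console.log(
  classifyQuality({ informationDensity: 75, redundancy: 18, codeRatio: 25, summaryQuality: 72, wasteTokens: 50 })
);
```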

## Continuous Improvement

**Track over time**:
- Are quality scores improving?
- Are wasteful patterns decreasing?
- Which specialists need guideline updates?
- What best practices emerge from high-quality tasks?

**Update specialist definitions when**:
- Same pattern occurs 3+ times
- Potential savings > 500 tokens across multiple tasks
- Quality scores consistently below targets
230
skills/orchestration-qa/token-optimization.md
Normal file
@@ -0,0 +1,230 @@
# Token Optimization Analysis

**Purpose**: Identify token waste patterns and optimization opportunities.

**When**: Optional post-execution (controlled by enableEfficiencyAnalysis parameter)

**Applies To**: All Skills and Subagents

**Token Cost**: ~400-600 tokens

## Common Token Waste Patterns

### Pattern 1: Verbose Specialist Output

**Issue**: Specialist returns full code/documentation in response instead of brief summary

**Expected**: Specialists return 50-100 token summary, detailed work goes in sections/files

**Detection**:
```javascript
if (isImplementationSpecialist(entityType) && estimateTokens(output) > 200) {
  return {
    severity: "WARN",
    pattern: "Verbose specialist output",
    actual: estimateTokens(output),
    expected: "50-100 tokens",
    savings: estimateTokens(output) - 100,
    recommendation: "Return brief summary, put details in task sections"
  }
}
```

### Pattern 2: Reading with includeSections When Not Needed

**Issue**: Loading all sections when only metadata needed

**Expected**: Use scoped `overview` for hierarchical views without sections

**Detection**:
```javascript
if (mentions(output, "includeSections=true") && !needsSections(workflowType)) {
  return {
    severity: "INFO",
    pattern: "Unnecessary section loading",
    recommendation: "Use operation='overview' for metadata + task list",
    savings: "85-93% tokens (e.g., 18.5k → 1.2k for typical feature)"
  }
}
```

### Pattern 3: Multiple Get Operations Instead of Overview

**Issue**: Calling `query_container(operation="get")` multiple times instead of one overview

**Expected**: Single scoped overview provides hierarchical view efficiently

**Detection**:
```javascript
getCallCount = countToolCalls(output, "query_container", "get")
if (getCallCount > 1 && !usedOverview) {
  return {
    severity: "INFO",
    pattern: "Multiple get calls instead of overview",
    actual: `${getCallCount} get calls`,
    expected: "1 scoped overview call",
    savings: estimateSavings(getCallCount)
  }
}
```

### Pattern 4: Listing All Entities When Filtering Would Work

**Issue**: Querying all tasks then filtering in code

**Expected**: Use query parameters (status, tags, priority) for filtering

**Detection**:
```javascript
if (mentions(output, "filter") && !usedQueryFilters) {
  return {
    severity: "INFO",
    pattern: "Client-side filtering instead of query filters",
    recommendation: "Use status/tags/priority parameters in query_container",
    savings: "~50-70% tokens"
  }
}
```

### Pattern 5: PRD Content in Description Instead of Sections

**Issue**: Feature Architect puts all PRD content in description field

**Expected**: Description is forward-looking summary; PRD sections go in feature sections

**Detection**:
```javascript
if (entityType == "feature-architect" && feature.description.length > 800) {
  return {
    severity: "WARN",
    pattern: "PRD content in description field",
    actual: `${feature.description.length} chars`,
    expected: "200-500 chars description + sections for detailed content",
    recommendation: "Move detailed content to feature sections"
  }
}
```

### Pattern 6: Verbose Feature Architect Handoff

**Issue**: Feature Architect returns detailed feature explanation

**Expected**: Minimal handoff (50-100 tokens): "Feature created, ID: X, Y tasks ready"

**Detection**:
```javascript
if (entityType == "feature-architect" && estimateTokens(output) > 200) {
  return {
    severity: "WARN",
    pattern: "Verbose Feature Architect handoff",
    actual: estimateTokens(output),
    expected: "50-100 tokens",
    savings: estimateTokens(output) - 100,
    recommendation: "Brief handoff: Feature ID, next action. Details in feature sections."
  }
}
```

## Analysis Workflow

### Step 1: Estimate Token Usage

```javascript
tokenUsage = {
  input: estimateTokens(context.userInput.fullText),
  output: estimateTokens(entityOutput),
  total: estimateTokens(context.userInput.fullText) + estimateTokens(entityOutput)
}
```

### Step 2: Compare Against Expected Range

```javascript
expectedRange = definition.tokenRange  // e.g., [1800, 2200] for Feature Architect
deviation = tokenUsage.output - expectedRange[1]

if (deviation > expectedRange[1] * 0.5) {  // More than 50% over
  severity = "WARN"
} else {
  severity = "INFO"
}
```
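As written, Step 2 assigns `INFO` even when usage is inside the expected range. A small sketch of the same comparison with an explicit in-range outcome follows; the `"OK"` result and the function name `classifyDeviation` are assumptions added for illustration, not part of the documented workflow:

```javascript
// Sketch only: "OK" for in-range usage is an assumed extension of Step 2.
function classifyDeviation(outputTokens, expectedRange) {
  const upperBound = expectedRange[1];
  const deviation = outputTokens - upperBound;

  if (deviation <= 0) {
    return { deviation: 0, severity: "OK" }; // at or below the upper bound
  }
  // More than 50% over the upper bound is a WARN, anything else over is INFO
  return { deviation, severity: deviation > upperBound * 0.5 ? "WARN" : "INFO" };
}

for (const tokens of [2000, 2500, 3400]) {
  console.log(tokens, classifyDeviation(tokens, [1800, 2200]));
}
```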

### Step 3: Detect Waste Patterns

```javascript
wastePatterns = []

// Check each pattern
if (verboseSpecialistOutput()) wastePatterns.push(pattern1)
if (unnecessarySectionLoading()) wastePatterns.push(pattern2)
if (multipleGetsInsteadOfOverview()) wastePatterns.push(pattern3)
if (clientSideFiltering()) wastePatterns.push(pattern4)
if (prdContentInDescription()) wastePatterns.push(pattern5)
if (verboseHandoff()) wastePatterns.push(pattern6)
```

### Step 4: Calculate Potential Savings

```javascript
// Only numeric savings can be summed; patterns that report a percentage string
// (Patterns 2 and 4) or no savings field (Pattern 5) count as 0
totalSavings = wastePatterns.reduce(
  (sum, pattern) => sum + (typeof pattern.savings === "number" ? pattern.savings : 0),
  0
)
optimizedTokens = tokenUsage.total - totalSavings
efficiencyGain = (totalSavings / tokenUsage.total) * 100
```

### Step 5: Generate Report

```markdown
## 💡 Token Optimization Opportunities

**Current Usage**: [X] tokens
**Potential Savings**: [Y] tokens ([Z]% reduction)
**Optimized Usage**: [X - Y] tokens

### Patterns Detected ([count])

**⚠️ WARN** ([count]): Significant waste
- Verbose specialist output: [X] tokens (expected 50-100)
- PRD content in description: [Y] chars (expected 200-500)

**ℹ️ INFO** ([count]): Optimization opportunities
- Use overview instead of get: [savings] tokens
- Use query filters: [savings] tokens

### Recommendations
1. [Most impactful optimization]
2. [Secondary optimization]
```

## Recommended Baselines

- **Skills**: 200-900 tokens (lightweight coordination)
- **Feature Architect**: 1800-2200 tokens (complexity assessment + creation)
- **Planning Specialist**: 1800-2200 tokens (analysis + task creation)
- **Implementation Specialists**: 1800-2200 tokens (work done, not described)
  - **Output**: 50-100 tokens (brief summary)
  - **Sections**: Detailed work (not counted against specialist)

## When to Report

- **Only if** enableEfficiencyAnalysis=true
- **WARN**: Include in post-execution report
- **INFO**: Log for pattern tracking only

## Integration Example

```javascript
if (params.enableEfficiencyAnalysis) {
  Read "token-optimization.md"
  opportunities = analyzeTokenOptimization(entityType, entityOutput, context)

  if (opportunities.length > 0) {
    report.efficiencyAnalysis = {
      currentUsage: tokenUsage.total,
      savings: totalSavings,
      gain: efficiencyGain,
      opportunities: opportunities
    }
  }
}
```
153
skills/orchestration-qa/tool-selection.md
Normal file
@@ -0,0 +1,153 @@
# Tool Selection Efficiency

**Purpose**: Verify optimal tool selection for the task at hand.

**When**: Optional post-execution (controlled by enableEfficiencyAnalysis parameter)

**Token Cost**: ~300-500 tokens

## Optimal Tool Selection Patterns

### Pattern 1: query_container Overview vs Get

**Optimal**: Use `operation="overview"` for hierarchical views without section content

**Suboptimal**: Use `operation="get"` with `includeSections=true` when only metadata + child list is needed

**Detection**:
```javascript
if (usedGet && includeSections && !needsFullSections) {
  return {
    pattern: "Used get with sections when overview would suffice",
    current: "query_container(operation='get', includeSections=true)",
    optimal: "query_container(operation='overview', id='...')",
    savings: "85-93% tokens",
    when: "Need: feature metadata + task list (no section content)"
  }
}
```

### Pattern 2: Search vs Filtered Query

**Optimal**: Use `query_container` with filter parameters (e.g., `status`) for known criteria

**Suboptimal**: Use `operation="search"` with a free-text `query` when exact filters would work

**Detection**:
```javascript
if (usedSearch && hasExactCriteria) {
  return {
    pattern: "Used search when filtered query more efficient",
    current: "query_container(operation='search', query='pending tasks')",
    optimal: "query_container(operation='search', status='pending')",
    savings: "Query filters are faster and more precise"
  }
}
```

### Pattern 3: Bulk Operations vs Multiple Singles

**Optimal**: Use `operation="bulkUpdate"` for multiple updates

**Suboptimal**: Loop calling `update` multiple times

**Detection**:
```javascript
updateCount = countToolCalls(output, "manage_container", "update")
if (updateCount >= 3) {
  return {
    pattern: "Multiple update calls instead of bulkUpdate",
    current: `${updateCount} separate update calls`,
    optimal: "1 bulkUpdate call",
    savings: `${updateCount - 1} round trips eliminated`
  }
}
```
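`countToolCalls` is used in several detections but never defined. A minimal sketch, assuming the specialist output contains tool calls as text of the form `tool(operation="...")` (an assumption about the transcript format, not a documented one), might be:

```javascript
// Hypothetical helper: counts textual tool calls such as manage_container(operation="update", ...).
// Assumes tool and operation names are plain identifiers (no regex metacharacters).
function countToolCalls(output, tool, operation) {
  const pattern = new RegExp(`${tool}\\([^)]*operation=["']${operation}["']`, "g");
  return (output.match(pattern) || []).length;
}

const log = [
  'manage_container(operation="update", id="a")',
  "manage_container(operation='update', id=\"b\")",
  'manage_container(operation="bulkUpdate", ids=["c", "d"])',
  'query_container(operation="get", id="a")',
].join("\n");

console.log(countToolCalls(log, "manage_container", "update"));
console.log(countToolCalls(log, "query_container", "get"));
```

If structured tool-call records are available, counting over those would be more reliable than matching text.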

### Pattern 4: Scoped Overview vs Multiple Gets

**Optimal**: Single scoped overview for hierarchical view

**Suboptimal**: Multiple get calls for related entities

**Detection**:
```javascript
if (getCallCount >= 2 && queriedRelatedEntities) {
  return {
    pattern: "Multiple gets for related entities",
    current: `${getCallCount} get calls`,
    optimal: "1 scoped overview (returns entity + children)",
    savings: `${getCallCount - 1} tool calls eliminated`
  }
}
```

### Pattern 5: recommend_agent vs Manual Routing

**Optimal**: Use `recommend_agent` for specialist routing

**Suboptimal**: Manual tag analysis and routing logic

**Detection**:
```javascript
if (taskOrchestration && !usedRecommendAgent && launchedSpecialists) {
  return {
    pattern: "Manual specialist routing instead of recommend_agent",
    current: "Manual tag → specialist mapping",
    optimal: "recommend_agent(taskId) → automatic routing",
    benefit: "Centralized routing logic, consistent with agent-mapping.yaml"
  }
}
```

## Analysis Workflow

```javascript
toolSelectionIssues = []

// Check each pattern
checkOverviewVsGet()
checkSearchVsFiltered()
checkBulkOpsVsMultiple()
checkScopedOverviewVsGets()
checkRecommendAgentUsage()

// Generate report if issues found
if (toolSelectionIssues.length > 0) {
  return {
    issuesFound: toolSelectionIssues.length,
    issues: toolSelectionIssues,
    recommendations: prioritizeRecommendations(toolSelectionIssues)
  }
}
```
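The workflow above does not show how each check feeds `toolSelectionIssues`. One possible arrangement, assuming every check is a function that takes observed facts and returns either an issue object or `null` (the `facts` shape and the function names are hypothetical), is:

```javascript
// Hypothetical: each check inspects observed facts and returns an issue or null.
const toolChecks = [
  (facts) =>
    facts.usedGet && facts.includeSections && !facts.needsFullSections
      ? { pattern: "Used get with sections when overview would suffice", savings: "85-93% tokens" }
      : null,
  (facts) =>
    facts.updateCount >= 3
      ? {
          pattern: "Multiple update calls instead of bulkUpdate",
          savings: `${facts.updateCount - 1} round trips eliminated`,
        }
      : null,
];

// Run every check and keep only the ones that reported an issue.
function analyzeToolSelection(facts) {
  const issues = toolChecks.map((check) => check(facts)).filter(Boolean);
  return issues.length > 0 ? { issuesFound: issues.length, issues } : null;
}

console.log(
  analyzeToolSelection({ usedGet: true, includeSections: true, needsFullSections: false, updateCount: 4 })
);
```

Returning issues instead of pushing to a shared array keeps each check independent and easy to test in isolation.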

## Report Template

```markdown
## 🔧 Tool Selection Efficiency

**Suboptimal Patterns**: [count]

### Issues Detected

**ℹ️ INFO**: Use overview instead of get
- Current: `query_container(operation='get', includeSections=true)`
- Optimal: `query_container(operation='overview', id='...')`
- Savings: 85-93% tokens

**ℹ️ INFO**: Use bulkUpdate instead of multiple updates
- Current: [X] separate update calls
- Optimal: 1 bulkUpdate call
- Savings: [X-1] round trips

### Recommendations
1. [Most impactful change]
2. [Secondary optimization]
```

## When to Report

- **Only if** enableEfficiencyAnalysis=true
- **INFO** level (observations, not violations)
- Include in efficiency analysis section