Initial commit

This commit is contained in:
Zhongwei Li
2025-11-30 08:29:28 +08:00
commit 87c03319a3
50 changed files with 21409 additions and 0 deletions


@@ -0,0 +1,573 @@
---
name: Orchestration QA
description: Quality assurance for orchestration workflows - validates Skills and Subagents follow documented patterns, tracks deviations, suggests improvements
---
# Orchestration QA Skill
## Overview
This skill provides quality assurance for Task Orchestrator workflows by validating that Skills and Subagents follow their documented patterns, detecting deviations, and suggesting continuous improvements.
**Key Capabilities:**
- **Interactive configuration** - User chooses which analyses to enable (token efficiency)
- **Pre-execution validation** - Context capture, checkpoint setting
- **Post-execution review** - Workflow adherence, output validation
- **Specialized quality analysis** - Execution graphs, tag coverage, information density
- **Efficiency analysis** - Token optimization, tool selection, parallelization
- **Deviation reporting** - Structured findings with severity (ALERT/WARN/INFO)
- **Pattern tracking** - Continuous improvement suggestions
**Philosophy:**
- **User-driven configuration** - Pay token costs only for analyses you want
- **Observe and validate** - Never blocks execution
- **Report transparently** - Clear severity levels (ALERT/WARN/INFO)
- **Learn from patterns** - Track issues, suggest improvements
- **Progressive loading** - Load only the analysis needed for the context
- **Not a blocker** - Warns about issues, doesn't stop workflows
- **Not auto-fix** - Asks user for decisions on deviations
## When to Use This Skill
### Interactive Configuration (FIRST TIME)
**Trigger**: First time using orchestration-qa in a session, or when user wants to change settings
**Action**: Ask user which analysis categories to enable (multiselect interface)
**Output**: Configuration stored in session, used for all subsequent reviews
**User Value**: Only pay token costs for analyses you actually want
### Session Initialization
**Trigger**: After configuration, at start of orchestration session
**Action**: Load knowledge bases (Skills, Subagents, routing config) based on enabled categories
**Output**: Initialization status with active configuration, ready signal
### Pre-Execution Validation
**Triggers**:
- "Create feature for X" (before Feature Orchestration Skill or Feature Architect)
- "Execute tasks" (before Task Orchestration Skill)
- "Mark complete" (before Status Progression Skill)
- Before launching any Skill or Subagent
**Action**: Capture context, set validation checkpoints
**Output**: Stored context for post-execution comparison
### Post-Execution Review
**Triggers**:
- After any Skill completes
- After any Subagent returns
- User asks: "Review quality", "Show QA results", "Any issues?"
**Action**: Validate workflow adherence, analyze quality, detect deviations
**Output**: Structured quality report with findings and recommendations
## Parameters
```typescript
{
phase: "init" | "pre" | "post" | "configure",
// For pre/post phases
entityType?: "feature-orchestration" | "task-orchestration" |
"status-progression" | "dependency-analysis" |
"feature-architect" | "planning-specialist" |
"backend-engineer" | "frontend-developer" |
"database-engineer" | "test-engineer" |
"technical-writer" | "bug-triage-specialist",
// For pre phase
userInput?: string, // Original user request
// For post phase
entityOutput?: string, // Output from Skill/Subagent
entityId?: string, // Feature/Task/Project ID (if applicable)
// Optional
verboseReporting?: boolean // Default: false (brief reports)
}
```
## Workflow
### Phase: configure (Interactive Configuration) - **ALWAYS RUN FIRST**
**Purpose**: Let user choose which analysis categories to enable for the session
**When**: Before init phase, or when user wants to change settings mid-session
**Interactive Prompts**:
Use AskUserQuestion to present configuration options:
```javascript
AskUserQuestion({
questions: [
{
question: "Which quality analysis categories would you like to enable for this session?",
header: "QA Categories",
multiSelect: true,
options: [
{
label: "Information Density",
description: "Analyze task content quality, detect wasteful patterns, measure information-to-token ratio (Specialists only)"
},
{
label: "Execution Graphs",
description: "Validate dependency graphs and parallel execution opportunities (Planning Specialist only)"
},
{
label: "Tag Coverage",
description: "Check tag consistency and agent-mapping coverage (Planning Specialist & Feature Architect)"
},
{
label: "Token Optimization",
description: "Identify token waste patterns (verbose output, unnecessary loading, redundant operations)"
},
{
label: "Tool Selection",
description: "Verify optimal tool usage (overview vs get, search vs filtered query, bulk operations)"
},
{
label: "Routing Validation",
description: "Detect Skills bypass violations (CRITICAL - status changes, feature creation, task execution)"
},
{
label: "Parallel Detection",
description: "Find missed parallelization opportunities (independent tasks, batch operations)"
}
]
},
{
question: "How detailed should QA reports be?",
header: "Report Style",
multiSelect: false,
options: [
{
label: "Brief",
description: "Only show critical issues (ALERT level) - minimal token usage"
},
{
label: "Standard",
description: "Show ALERT and WARN level issues with brief explanations"
},
{
label: "Detailed",
description: "Show all issues (ALERT/WARN/INFO) with full analysis and recommendations"
}
]
}
]
})
```
**Default Configuration** (if user skips configuration):
- ✅ Routing Validation (CRITICAL - always enabled)
- ✅ Information Density (for specialists)
- ❌ All other categories disabled
- Report style: Standard
**Configuration Storage**:
Store user preferences in session state:
```javascript
session.qaConfig = {
enabled: {
informationDensity: true/false,
executionGraphs: true/false,
tagCoverage: true/false,
tokenOptimization: true/false,
toolSelection: true/false,
routingValidation: true, // Always true (CRITICAL)
parallelDetection: true/false
},
reportStyle: "brief" | "standard" | "detailed"
}
```
**Token Cost**: ~200-300 tokens (one-time configuration)
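If the user skips the configure phase, the defaults above apply and Routing Validation can never be switched off. A minimal sketch of that fallback (the `resolveQaConfig` helper is hypothetical; it mirrors the `session.qaConfig` shape shown above):

```javascript
// Hypothetical helper: merge a user's selections over the documented defaults.
// Routing Validation is CRITICAL and is re-enforced even if deselected.
function resolveQaConfig(userSelection) {
  const defaults = {
    enabled: {
      informationDensity: true,   // default-on (for specialists)
      executionGraphs: false,
      tagCoverage: false,
      tokenOptimization: false,
      toolSelection: false,
      routingValidation: true,    // CRITICAL - always enabled
      parallelDetection: false
    },
    reportStyle: "standard"
  };
  if (!userSelection) return defaults; // user skipped configuration
  const enabled = { ...defaults.enabled, ...userSelection.enabled };
  enabled.routingValidation = true;    // cannot be disabled
  return {
    enabled,
    reportStyle: userSelection.reportStyle || defaults.reportStyle
  };
}
```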
### Phase: init (Session Initialization)
**Purpose**: Load knowledge bases for validation throughout session
**Steps**:
1. **If not configured**: Run configure phase first (interactive)
2. Read `initialization.md` for setup workflow
3. Glob `.claude/skills/*/SKILL.md` → extract Skills knowledge
- Parse skill name, triggers, workflows, tools, token ranges
4. Glob `.claude/agents/task-orchestrator/*.md` → extract Subagents knowledge
- Parse agent name, steps, critical patterns, output validation
5. Read `agent-mapping.yaml` → extract routing configuration
6. Initialize tracking state (deviations, patterns, improvements)
7. Report initialization status with active configuration
**Output**:
```javascript
{
initialized: true,
knowledgeBase: {
skillsCount: 5,
subagentsCount: 8,
routingLoaded: true
},
configuration: {
enabled: ["Information Density", "Routing Validation"],
disabled: ["Execution Graphs", "Tag Coverage", "Token Optimization", "Tool Selection", "Parallel Detection"],
reportStyle: "standard"
},
tracking: {
sessionStart: timestamp,
deviations: [],
patterns: []
}
}
```
**Token Cost**: ~800-1000 tokens (loads once per session)
### Phase: pre (Pre-Execution Validation)
**Purpose**: Capture context and set validation checkpoints before launching
**Steps**:
1. Read `pre-execution.md` for validation checklist
2. Identify entity type (Skill vs Subagent)
3. Capture original user input context
4. Set entity-specific validation checkpoints based on type:
- **Skills**: Expected workflow steps, tool usage, token range
- **Subagents**: Expected steps (8-9 steps), critical patterns, output format
5. Store context for post-execution comparison
6. Return ready signal
**Context Captured**:
- User's original request (full text)
- Expected mode (PRD/Interactive/Quick for Feature Architect)
- Entity type and anticipated complexity
- Validation checkpoints to verify after execution
**Output**:
```javascript
{
ready: true,
contextCaptured: true,
checkpoints: [
"Verify Skill assessed complexity correctly",
"Verify templates discovered and applied",
// ... entity-specific checkpoints
]
}
```
**Token Cost**: ~400-600 tokens
### Phase: post (Post-Execution Review)
**Purpose**: Validate workflow adherence, analyze quality, detect deviations
**Steps**:
#### 1. Load Post-Execution Workflow
Read `post-execution.md` for review process
#### 2. Determine Required Analyses
Based on entity type AND user configuration:
**Planning Specialist**:
- Always: `post-execution.md` → core workflow validation
- If `routingValidation` enabled: `routing-validation.md` → Skills usage check
- If `executionGraphs` enabled: `graph-quality.md` → execution graph validation
- If `tagCoverage` enabled: `tag-quality.md` → tag coverage analysis
**Feature Architect**:
- Always: `post-execution.md` → PRD extraction validation
- Always: Compare output vs original user input
- If `routingValidation` enabled: `routing-validation.md` → agent-mapping check
- If `tagCoverage` enabled: `tag-quality.md` → tag consistency check
**Implementation Specialists** (Backend, Frontend, Database, Test, Technical Writer):
- Always: `post-execution.md` → lifecycle steps verification
- If `routingValidation` enabled: `routing-validation.md` → Status Progression Skill usage
- If `informationDensity` enabled: `task-content-quality.md` → content quality analysis
- Always: Verify summary (300-500 chars), Files Changed section, test results
**All Skills**:
- Always: Read skill definition from knowledge base
- Always: Verify expected workflow steps followed
- Always: Check tool usage matches expected patterns
- Always: Validate token range
#### 3. Conditional Efficiency Analysis
Based on user configuration:
- If `tokenOptimization` enabled: Read `token-optimization.md` → identify token waste
- If `toolSelection` enabled: Read `tool-selection.md` → verify optimal tool usage
- If `parallelDetection` enabled: Read `parallel-detection.md` → find missed parallelization
#### 4. Deviation Detection
Compare actual execution against expected patterns:
- **ALERT**: Critical violations (status bypass, cross-domain tasks, missing requirements)
- **WARN**: Process issues (verbose output, skipped steps, suboptimal dependencies)
- **INFO**: Observations (efficiency opportunities, quality patterns)
#### 5. Reporting
If deviations found:
- Read `deviation-templates.md` → format report
- Add to TodoWrite with appropriate severity
- If ALERT: Report immediately to user with decision prompt
- If WARN: Log for end-of-session summary
- If INFO: Track for pattern analysis
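The severity routing above can be sketched as a small dispatch table (hypothetical helper; the field names are assumptions, not part of the skill's contract):

```javascript
// Hypothetical sketch of step 5: what happens to a deviation by severity.
function routeDeviation(severity) {
  switch (severity) {
    case "ALERT": // report immediately, ask the user to decide
      return { todo: true, report: "immediate", prompt: true };
    case "WARN":  // log for the end-of-session summary
      return { todo: true, report: "session-summary", prompt: false };
    case "INFO":  // track for pattern analysis only
      return { todo: false, report: "pattern-tracking", prompt: false };
    default:
      throw new Error(`Unknown severity: ${severity}`);
  }
}
```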
#### 6. Pattern Tracking
Read `pattern-tracking.md` → continuous improvement:
- Check for recurring issues (count >= 2 in session)
- Suggest definition improvements if patterns detected
- Track for session summary
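The recurrence check (count >= 2 in session) might look like this (hypothetical helper, assuming tracked deviations carry an `issue` field):

```javascript
// Hypothetical sketch of step 6: find issues that recurred this session.
function recurringIssues(deviations) {
  const counts = {};
  for (const d of deviations) {
    counts[d.issue] = (counts[d.issue] || 0) + 1;
  }
  // An issue seen 2+ times suggests a definition-level fix, not a one-off.
  return Object.keys(counts).filter(issue => counts[issue] >= 2);
}
```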
**Output**:
```javascript
{
workflowAdherence: "8/8 steps followed (100%)",
expectedOutputs: "7/7 present",
deviations: [
{
severity: "ALERT",
issue: "Cross-domain task detected",
details: "Task mixes backend + frontend",
recommendation: "Split into domain-isolated tasks"
}
],
analyses: {
graphQuality: "95%",
tagCoverage: "100%",
tokenEfficiency: "85%"
},
recommendations: [
"Update planning-specialist.md to enforce domain isolation",
"Add validation checklist for cross-domain detection"
]
}
```
**Token Cost**:
- Basic validation: ~600-800 tokens
- With specialized analysis (Planning Specialist): ~1500-2000 tokens
- With efficiency analysis: +800-1200 tokens
## Progressive Loading Strategy
**Optimization**: Load only the analysis docs needed based on entity type AND user configuration
### Configuration-Driven Loading
**Core Loading** (always loaded regardless of config):
- `post-execution.md` → base workflow validation
- Skill/Subagent definition from knowledge base
- Entity-specific mandatory checks (summary, files changed, etc.)
**Conditional Loading** (based on user configuration):
```javascript
// Planning Specialist
if (config.routingValidation) Read routing-validation.md
if (config.executionGraphs) Read graph-quality.md
if (config.tagCoverage) Read tag-quality.md
// Feature Architect
if (config.routingValidation) Read routing-validation.md
if (config.tagCoverage) Read tag-quality.md
// Implementation Specialists (Backend, Frontend, Database, Test, Technical Writer)
if (config.routingValidation) Read routing-validation.md
if (config.informationDensity) Read task-content-quality.md
// All Entities
if (config.tokenOptimization) Read token-optimization.md
if (config.toolSelection) Read tool-selection.md
if (config.parallelDetection) Read parallel-detection.md
// Reporting
if (deviations.length > 0) Read deviation-templates.md
if (session.deviations.count >= 2) Read pattern-tracking.md
```
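The branching above can be collapsed into a single selection function. A hedged sketch (the `docsToLoad` helper is hypothetical; the doc names are the supporting docs this skill lists):

```javascript
// Hypothetical sketch of configuration-driven loading: which analysis docs
// to read for a given entity type under the current session config.
function docsToLoad(entityType, config) {
  const docs = ["post-execution.md"]; // core, always loaded
  const c = config.enabled;
  if (c.routingValidation) docs.push("routing-validation.md");
  if (entityType === "planning-specialist") {
    if (c.executionGraphs) docs.push("graph-quality.md");
    if (c.tagCoverage) docs.push("tag-quality.md");
  }
  if (entityType === "feature-architect" && c.tagCoverage) {
    docs.push("tag-quality.md");
  }
  const specialists = ["backend-engineer", "frontend-developer",
                       "database-engineer", "test-engineer", "technical-writer"];
  if (specialists.includes(entityType) && c.informationDensity) {
    docs.push("task-content-quality.md");
  }
  if (c.tokenOptimization) docs.push("token-optimization.md");
  if (c.toolSelection) docs.push("tool-selection.md");
  if (c.parallelDetection) docs.push("parallel-detection.md");
  return docs;
}
```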
### Token Savings Examples
**Example 1: User only wants Information Density feedback**
- Configuration: Only "Information Density" enabled
- Loaded for Backend Engineer: `post-execution.md` + `task-content-quality.md` = ~1,200 tokens
- Skipped: `routing-validation.md`, `token-optimization.md`, `tool-selection.md`, `parallel-detection.md` = ~2,400 tokens saved
- **Savings: 67% reduction**
**Example 2: User wants minimal CRITICAL validation only**
- Configuration: Only "Routing Validation" enabled
- Loaded: `post-execution.md` + `routing-validation.md` = ~1,000 tokens
- Skipped: All other analysis docs = ~3,500 tokens saved
- **Savings: 78% reduction**
**Example 3: User wants comprehensive Planning Specialist review**
- Configuration: All categories enabled
- Loaded: `post-execution.md` + `graph-quality.md` + `tag-quality.md` + `routing-validation.md` + efficiency docs = ~3,500 tokens
- Skipped: None (comprehensive mode)
- **Savings: 0% (full analysis)**
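The savings figures above follow from skipped ÷ (loaded + skipped). A quick sketch of that arithmetic (helper name is illustrative):

```javascript
// Savings relative to loading everything: tokens skipped over total tokens.
function savingsPercent(loadedTokens, skippedTokens) {
  return Math.round((skippedTokens / (loadedTokens + skippedTokens)) * 100);
}
```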
### Special Cases
**Task Orchestration Skill**:
- `parallel-detection.md` always loaded if enabled in config (core to this skill's purpose)
**Status Progression Skill**:
- `routing-validation.md` always loaded if enabled in config (CRITICAL - status bypass detection)
## Output Format
### Success (No Deviations)
```markdown
**QA Review**: [Entity Name]
Workflow adherence: 100%
All quality checks passed.
[If efficiency analysis enabled:]
Token efficiency: 85% (identified 2 optimization opportunities)
```
### Issues Found
```markdown
## QA Review: [Entity Name]
**Workflow Adherence:** X/Y steps (Z%)
### ✅ Successes
- [Success 1]
- [Success 2]
### ⚠️ Issues Detected
**🚨 ALERT**: [Critical issue]
- Impact: [What this affects]
- Found: [What was observed]
- Expected: [What should have happened]
- Recommendation: [How to fix]
**⚠️ WARN**: [Process issue]
- Found: [What was observed]
- Expected: [What should have happened]
### 📋 Added to TodoWrite
- Review [Entity]: [Issue description]
- Improvement: [Suggestion]
### 🎯 Recommendations
1. [Most critical action]
2. [Secondary action]
### 💭 Decision Required
[If user decision needed, present options]
```
## Integration with Orchestrator
**Recommended Pattern**:
```javascript
// 1. FIRST TIME: Interactive configuration
Use orchestration-qa skill (phase="configure")
// Agent asks user which analysis categories to enable
// User selects: "Information Density" + "Routing Validation"
// Configuration stored in session
// 2. Session initialization
Use orchestration-qa skill (phase="init")
// Returns: Initialized with [2] analysis categories enabled
// 3. Before launching Feature Architect
Use orchestration-qa skill (
phase="pre",
entityType="feature-architect",
userInput="[user's original request]"
)
// 4. Launch Feature Architect
Task(subagent_type="Feature Architect", prompt="...")
// 5. After Feature Architect returns
Use orchestration-qa skill (
phase="post",
entityType="feature-architect",
entityOutput="[subagent's response]",
entityId="feature-uuid"
)
// Only loads: post-execution.md + routing-validation.md (user config)
// Skips: graph-quality.md, tag-quality.md, token-optimization.md (not enabled)
// 6. Review QA findings, take action if needed
```
**Mid-Session Reconfiguration**:
```javascript
// User: "I want to also track token optimization now"
Use orchestration-qa skill (phase="configure")
// Agent asks again, pre-selects current config
// User adds "Token Optimization" to enabled categories
// New config stored, affects all subsequent post-execution reviews
```
## Supporting Documentation
This skill uses progressive loading to minimize token usage. Supporting docs are read as needed:
- **initialization.md** - Session setup workflow
- **pre-execution.md** - Context capture and checkpoint setting
- **post-execution.md** - Core review workflow for all entities
- **graph-quality.md** - Planning Specialist: execution graph analysis
- **tag-quality.md** - Planning Specialist: tag coverage validation
- **task-content-quality.md** - Implementation Specialists: information density and wasteful pattern detection
- **token-optimization.md** - Efficiency: identify token waste patterns
- **tool-selection.md** - Efficiency: verify optimal tool usage
- **parallel-detection.md** - Efficiency: find missed parallelization
- **routing-validation.md** - Critical: Skills vs Direct tool violations
- **deviation-templates.md** - User report formatting by severity
- **pattern-tracking.md** - Continuous improvement tracking
## Token Efficiency
**Current Trainer** (monolithic): ~20k-30k tokens always loaded
**Orchestration QA Skill** (configuration-driven progressive loading):
- Configure phase: ~200-300 tokens (one-time, interactive)
- Init phase: ~1000 tokens (one-time per session)
- Pre-execution: ~600 tokens (per entity)
- Post-execution (varies by configuration):
- **Minimal** (routing only): ~800-1000 tokens
- **Standard** (info density + routing): ~1200-1500 tokens
- **Planning Specialist** (graphs + tags + routing): ~2000-2500 tokens
- **Comprehensive** (all categories): ~3500-4000 tokens
**Configuration Impact Examples**:
| User Configuration | Token Cost | vs Monolithic | vs Default |
|-------------------|------------|---------------|------------|
| Information Density only | ~1,200 tokens | 94% savings | 67% savings |
| Routing Validation only | ~1,000 tokens | 95% savings | 78% savings |
| Default (Info + Routing) | ~1,500 tokens | 93% savings | baseline |
| Comprehensive (all enabled) | ~4,000 tokens | 80% savings | +167% cost |
**Smart Defaults**: Most users only need Information Density + Routing Validation, achieving 93% token reduction while catching critical issues and wasteful content.
## Quality Metrics
Track these metrics across sessions:
- Workflow adherence percentage
- Deviation count by severity (ALERT/WARN/INFO)
- Pattern recurrence (same issue multiple times)
- Definition improvement suggestions generated
- Token efficiency of analyzed workflows
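The workflow-adherence figure used throughout the reports (e.g. "8/8 steps followed (100%)") can be computed as follows (illustrative helper; the percentage is floored to match report examples such as 7/8 → 87%):

```javascript
// Format the workflow-adherence metric as it appears in QA reports.
function adherence(followed, total) {
  const pct = Math.floor((followed / total) * 100); // 7/8 -> 87, not 88
  return `${followed}/${total} steps followed (${pct}%)`;
}
```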
## Examples
See `examples.md` for detailed usage scenarios including:
- **Interactive configuration** - Choosing analysis categories
- **Session initialization** - Loading knowledge bases with config
- **Feature Architect validation** - PRD mode with selective analysis
- **Planning Specialist review** - Graph + tag analysis (when enabled)
- **Implementation Specialist review** - Information density tracking
- **Status Progression enforcement** - Critical routing violations
- **Mid-session reconfiguration** - Changing enabled categories
- **Token efficiency comparisons** - Different configuration impacts


@@ -0,0 +1,345 @@
# Deviation Report Templates
**Purpose**: Format QA findings for user presentation based on severity.
**When**: After deviations detected in post-execution review
**Token Cost**: ~200-400 tokens
## Severity Levels
### 🚨 ALERT (Critical)
**Impact**: Affects functionality, correctness, or mandatory patterns
**Action**: Report immediately, add to TodoWrite, request user decision
**Examples**:
- Status change bypassed Status Progression Skill
- Cross-domain task detected (violates domain isolation)
- PRD sections not extracted (requirements lost)
- Incorrect dependencies in execution graph
- Task has no specialist mapping (routing will fail)
### ⚠️ WARN (Process Issue)
**Impact**: Process not followed optimally, should be addressed
**Action**: Include in post-execution report, add to TodoWrite
**Examples**:
- Workflow step skipped (non-critical)
- Output too verbose (token waste)
- Templates not applied when available
- Missed parallel opportunities
- Tags don't follow project conventions
### INFO (Observation)
**Impact**: Optimization opportunity or quality pattern
**Action**: Log for pattern tracking, mention if noteworthy
**Examples**:
- Token usage outside expected range (but reasonable)
- Could use more efficient tool (overview vs get)
- Format improvement suggestions
- Efficiency opportunities identified
## Report Templates
### ALERT Template (Critical Violation)
```markdown
## 🚨 QA Review: [Entity Name] - CRITICAL ISSUES DETECTED
**Workflow Adherence:** [X]/[Y] steps ([Z]%)
### Critical Issues ([count])
**❌ ALERT: [Issue Title]**
**What Happened:**
[Clear description of what was observed]
**Expected Behavior:**
[What should have happened according to documentation]
**Impact:**
[What this affects - functionality, correctness, workflow]
**Evidence:**
- [Specific evidence from output/database]
- [Tool calls made or not made]
- [Data discrepancies]
**Recommendation:**
[Specific action to fix the issue]
**Definition Update Needed:**
[If this is a pattern, what definition needs updating]
---
### ✅ Successes ([count])
- [What went well]
- [Patterns followed correctly]
### 📋 Added to TodoWrite
- [ ] Review [Entity]: [Issue description]
- [ ] Fix [specific issue]
- [ ] Update [definition file] with [improvement]
### 💭 Decision Required
**Question:** [What user needs to decide]
**Options:**
1. [Option A with pros/cons]
2. [Option B with pros/cons]
3. [Option C with pros/cons]
**Recommendation:** [Your suggestion with reasoning]
```
### WARN Template (Process Issue)
```markdown
## ⚠️ QA Review: [Entity Name] - Issues Found
**Workflow Adherence:** [X]/[Y] steps ([Z]%)
### Issues Detected ([count])
**⚠️ WARN: [Issue Title]**
- **Found:** [What was observed]
- **Expected:** [What should have happened]
- **Impact:** [How this affects quality/efficiency]
- **Fix:** [How to correct]
**⚠️ WARN: [Issue Title 2]**
- **Found:** [What was observed]
- **Expected:** [What should have happened]
### ✅ Successes
- [Workflow adherence: X/Y steps]
- [Quality metrics: X% graph quality, Y% tag coverage]
### 📋 Added to TodoWrite
- [ ] [Issue 1 to address]
- [ ] [Issue 2 to address]
### 🎯 Recommendations
1. [Most important fix]
2. [Process improvement]
3. [Optional optimization]
```
### INFO Template (Observations)
```markdown
## QA Review: [Entity Name] - Observations
**Workflow Adherence:** [X]/[Y] steps ([Z]%)
### Quality Metrics
- Dependency Accuracy: [X]%
- Parallel Completeness: [Y]%
- Tag Coverage: [Z]%
- Token Efficiency: [W]%
### Observations ([count])
**Efficiency Opportunity: [Title]**
- Current approach: [What was done]
- Optimal approach: [Better way]
- Potential savings: [Benefit]
**Format Suggestion: [Title]**
- Current: [What was done]
- Suggested: [Improvement]
### ✅ Overall Assessment
Workflow completed successfully with minor optimization opportunities.
[Optional: Include observations in session summary]
```
### Success Template (No Issues)
```markdown
## ✅ QA Review: [Entity Name]
**Workflow Adherence:** 100% ([Y]/[Y] steps completed)
**Quality Metrics:**
- All checkpoints passed ✅
- All expected outputs present ✅
- Token usage within range ✅
- Workflow patterns followed ✅
[If efficiency analysis enabled:]
**Efficiency:**
- Token efficiency: [X]%
- Optimal tool selection ✅
- Parallel opportunities identified ✅
**Result:** No issues detected - excellent execution!
```
## TodoWrite Integration
### ALERT Issues
```javascript
TodoWrite([
{
content: `ALERT: [Entity] - [Critical issue summary]`,
activeForm: `Reviewing [Entity] critical issue`,
status: "pending"
},
{
content: `Fix: [Specific corrective action]`,
activeForm: `Fixing [issue]`,
status: "pending"
},
{
content: `Update [definition]: [Improvement needed]`,
activeForm: `Updating definition`,
status: "pending"
}
])
```
### WARN Issues
```javascript
TodoWrite([
{
content: `Review [Entity]: [Issue summary] ([count] issues)`,
activeForm: `Reviewing [Entity] quality issues`,
status: "pending"
}
])
```
### INFO Observations
```javascript
// Generally don't add INFO to TodoWrite unless noteworthy
// Track for pattern analysis instead
```
## Multi-Issue Aggregation
When multiple issues of same type detected:
```markdown
### Cross-Domain Tasks Detected ([count])
**Pattern:** Tasks mixing specialist domains
**Violations:**
1. **[Task A]:** Combines [domain1] + [domain2]
- Evidence: [description mentions both]
- Fix: Split into 2 tasks
2. **[Task B]:** Combines [domain2] + [domain3]
- Evidence: [tags include both]
- Fix: Split into 2 tasks
**Root Cause:** [Why this happened - e.g., feature requirements not decomposed properly]
**Systemic Fix:** Update [planning-specialist.md] to enforce domain isolation check before task creation
**Added to TodoWrite:**
- [ ] Split Task A into domain-isolated tasks
- [ ] Split Task B into domain-isolated tasks
- [ ] Update planning-specialist.md validation checklist
```
## User Decision Prompts
### Template 1: Retry with Correct Approach
```markdown
### 💭 Decision Required
**Issue:** [Entity] bypassed mandatory [Skill Name] Skill
**Impact:** [What validation was skipped]
**Options:**
1. **Retry with [Skill Name] Skill** ✅ Recommended
- Pros: Ensures validation runs, follows documented workflow
- Cons: Requires re-execution
2. **Accept as-is and manually verify**
- Pros: Faster (no re-execution)
- Cons: May miss validation issues, sets bad precedent
3. **Update [Entity] to bypass Skill** ⚠️ Not Recommended
- Pros: Allows direct approach
- Cons: Removes safety checks, violates workflow
**Recommendation:** Retry with [Skill Name] Skill to ensure [prerequisites] are validated.
**Your choice?**
```
### Template 2: Definition Update
````markdown
### 💭 Decision Required
**Pattern Detected:** [Issue] occurred [N] times in session
**Systemic Issue:** [Root cause analysis]
**Proposed Definition Update:**
```diff
// File: [definition-file.md]
+ Add validation checklist:
+ - [ ] Verify all independent tasks in Batch 1
+ - [ ] Check for cross-domain tasks before creation
+ - [ ] Validate tag → specialist mapping coverage
```
**Options:**
1. **Update definition now** ✅ Recommended
- Prevents recurrence
- Improves workflow quality
2. **Track for later review**
- Allows more data collection
- May recur in meantime
**Your preference?**
````
## Formatting Guidelines
### Clarity
- Start with the severity emoji (🚨 for ALERT, ⚠️ for WARN; INFO has none)
- Use clear section headers
- Separate concerns (issues, successes, recommendations)
### Actionability
- Specific evidence, not vague observations
- Clear "Expected" vs "Found" comparisons
- Concrete recommendations with steps
### Brevity
- ALERT: Full details (this is critical)
- WARN: Moderate details (important but not urgent)
- INFO: Brief summary (observations only)
### Consistency
- Always include workflow adherence percentage
- Always show count of issues by severity
- Always provide TodoWrite summary
- Always offer recommendations
## Output Size Targets
- **ALERT report**: 300-600 tokens (comprehensive)
- **WARN report**: 200-400 tokens (focused)
- **INFO report**: 100-200 tokens (brief)
- **Success report**: 50-100 tokens (minimal)
**Total QA report** (including analysis): 800-2000 tokens depending on issues found


@@ -0,0 +1,746 @@
# Orchestration QA Skill - Usage Examples
This document provides practical examples of using the orchestration-qa skill in different scenarios.
## Example 1: Interactive Configuration (FIRST TIME)
### Scenario
First time using orchestration-qa in a session - choose which analyses to enable
### Usage
```javascript
orchestration-qa(phase="configure")
```
### Interactive Prompts
Agent presents multiselect options:
```markdown
## Quality Analysis Configuration
**Which quality analysis categories would you like to enable for this session?**
(Select all that apply)
☐ Information Density
Analyze task content quality, detect wasteful patterns, measure information-to-token ratio (Specialists only)
☐ Execution Graphs
Validate dependency graphs and parallel execution opportunities (Planning Specialist only)
☐ Tag Coverage
Check tag consistency and agent-mapping coverage (Planning Specialist & Feature Architect)
☐ Token Optimization
Identify token waste patterns (verbose output, unnecessary loading, redundant operations)
☐ Tool Selection
Verify optimal tool usage (overview vs get, search vs filtered query, bulk operations)
☑ Routing Validation
Detect Skills bypass violations (CRITICAL - status changes, feature creation, task execution)
☐ Parallel Detection
Find missed parallelization opportunities (independent tasks, batch operations)
---
**How detailed should QA reports be?**
○ Brief - Only show critical issues (ALERT level) - minimal token usage
● Standard - Show ALERT and WARN level issues with brief explanations
○ Detailed - Show all issues (ALERT/WARN/INFO) with full analysis and recommendations
```
### User Selections
```javascript
// User selects:
{
categories: ["Information Density", "Routing Validation"],
reportStyle: "Standard"
}
```
### Output
```markdown
**QA Configuration Saved**
**Enabled Categories (2):**
- ✅ Information Density - Track task content quality and wasteful patterns
- ✅ Routing Validation - Detect critical Skills bypass violations
**Disabled Categories (5):**
- ⊘ Execution Graphs
- ⊘ Tag Coverage
- ⊘ Token Optimization
- ⊘ Tool Selection
- ⊘ Parallel Detection
**Report Style:** Standard (ALERT + WARN)
**Token Efficiency:**
- Per analysis: ~1,200 tokens (vs ~3,500 comprehensive)
- Savings: 67% reduction from full analysis
- Trade-off: Focus on content quality + critical routing only
Configuration will be used for all QA reviews in this session.
You can reconfigure anytime with `orchestration-qa(phase="configure")`
```
## Example 2: Session Initialization (With Configuration)
### Scenario
After configuration, initialize the session with knowledge bases
### Prerequisites
Configuration must be set first (via `phase="configure"`)
### Usage
```javascript
orchestration-qa(phase="init")
```
### Output
```markdown
**Orchestration QA Initialized**
**Knowledge Base Loaded:**
- Skills: 5 (feature-orchestration, task-orchestration, status-progression, dependency-analysis, dependency-orchestration)
- Subagents: 8 (feature-architect, planning-specialist, backend-engineer, frontend-developer, database-engineer, test-engineer, technical-writer, bug-triage-specialist)
- Routing: agent-mapping.yaml loaded (12 tag mappings)
**Active Configuration:**
- ✅ Information Density (track task content quality)
- ✅ Routing Validation (critical bypass detection)
- ⊘ Execution Graphs (disabled)
- ⊘ Tag Coverage (disabled)
- ⊘ Token Optimization (disabled)
- ⊘ Tool Selection (disabled)
- ⊘ Parallel Detection (disabled)
- Report Style: Standard (ALERT + WARN)
**Quality Assurance Active:**
- ✅ Pre-execution validation
- ✅ Post-execution review
- ✅ Configuration-driven analysis (2 categories enabled)
- ✅ Pattern tracking (continuous improvement)
**Session Tracking:**
- Deviations: 0 ALERT, 0 WARN, 0 INFO
- Patterns: 0 recurring issues
- Improvements: 0 suggestions
Ready to monitor orchestration quality.
**Token Efficiency:** ~1,200 tokens per analysis (67% savings from comprehensive)
```
## Example 3: Feature Architect Validation (PRD Mode)
### Scenario
User provides formal PRD document for feature creation
### User Request
```markdown
Create a user authentication feature:
# User Authentication
## Business Context
We need secure user authentication for our platform.
## User Stories
- As a user, I want to sign up with email/password
- As a user, I want to log in securely
- As a user, I want to reset my password
## Technical Requirements
- OAuth2 support
- JWT tokens
- Role-based access control
```
### Pre-Execution
```javascript
orchestration-qa(
phase="pre",
entityType="feature-architect",
userInput="[full PRD above]"
)
```
### Pre-Execution Output
```javascript
{
ready: true,
contextCaptured: true,
entityType: "feature-architect",
category: "SUBAGENT",
checkpoints: 8,
expected: {
mode: "PRD",
prdSections: ["Business Context", "User Stories", "Technical Requirements"],
descriptionLength: [500, 1000],
sectionsExpected: 3,
handoffTokens: [50, 100]
}
}
```
### Post-Execution
```javascript
orchestration-qa(
phase="post",
entityType="feature-architect",
entityOutput="[Feature Architect's response]",
entityId="feature-uuid-123"
)
```
### Post-Execution Output (Success)
```markdown
## ✅ QA Review: Feature Architect (PRD Mode)
**Workflow Adherence:** 8/8 steps (100%)
**Expected Outputs:** 7/7 present
### Validation Results
- ✅ PRD mode detected correctly
- ✅ All 3 PRD sections extracted to feature sections
- ✅ Description forward-looking (623 chars)
- ✅ Templates applied (Technical Approach, Requirements)
- ✅ Tags follow project conventions (reused "authentication", "security")
- ✅ agent-mapping.yaml checked (tags map to specialists)
- ✅ Handoff minimal (87 tokens)
- ✅ Core concepts preserved from PRD
**Quality Metrics:**
- PRD extraction: 100% (3/3 sections)
- Token efficiency: 95% (handoff 87 tokens, expected < 100)
- Tag conventions: 100% (reused existing tags)
**Result:** Excellent execution - all patterns followed correctly!
```
### Post-Execution Output (Issues Detected)
```markdown
## 🚨 QA Review: Feature Architect (PRD Mode) - ISSUES DETECTED
**Workflow Adherence:** 7/8 steps (87%)
### Critical Issues (1)
**❌ ALERT: PRD Sections Incomplete**
**What Happened:**
Feature has 2 sections, but PRD contained 3 sections.
**Expected Behavior:**
In PRD mode, Feature Architect must extract ALL sections from user's document.
**Impact:**
"Technical Requirements" section from PRD was not transferred to feature.
Requirements may be lost or incomplete.
**Evidence:**
- PRD sections: ["Business Context", "User Stories", "Technical Requirements"]
- Feature sections: ["Business Context", "User Stories"]
- Missing: "Technical Requirements"
**Recommendation:**
Add missing "Technical Requirements" section to feature.
**Definition Update Needed:**
Update feature-architect.md Step 7 to include validation:
- [ ] Verify all PRD sections extracted
- [ ] Compare PRD section count vs feature section count
- [ ] If mismatch, add missing sections before returning
---
### ✅ Successes (6)
- PRD mode detected correctly
- Description forward-looking
- Templates applied
- Tags follow conventions
- agent-mapping.yaml checked
- Handoff minimal (92 tokens)
### 📋 Added to TodoWrite
- [ ] Add "Technical Requirements" section to Feature [ID]
- [ ] Update feature-architect.md Step 7 validation checklist
### 💭 Decision Required
**Question:** Should we add the missing "Technical Requirements" section now?
**Options:**
1. **Add section now** ✅ Recommended
- Pros: Ensures all PRD content captured
- Cons: Requires one additional tool call
2. **Accept as-is**
- Pros: Faster (no additional work)
- Cons: Requirements may be incomplete
**Recommendation:** Add section now to ensure complete PRD capture.
**Your choice?**
```
## Example 4: Planning Specialist Review (Graph Analysis)
### Scenario
Planning Specialist breaks down feature into tasks
### Post-Execution
```javascript
orchestration-qa(
phase="post",
entityType="planning-specialist",
entityOutput="[Planning Specialist's response]",
entityId="feature-uuid-123"
)
```
### Output (High Quality)
```markdown
## ✅ QA Review: Planning Specialist
**Workflow Adherence:** 8/8 steps (100%)
**Expected Outputs:** 7/7 present
### Specialized Analysis
**📊 Execution Graph Quality: 98%**
- Dependency Accuracy: 100% (all dependencies correct)
- Parallel Completeness: 100% (all opportunities identified)
- Format Clarity: 95% (clear batch numbers, explicit dependencies)
**🏷️ Tag Quality: 100%**
- Tag Coverage: 100% (all tasks have tags)
- Agent Mapping Coverage: 100% (all tags map to specialists)
- Convention Adherence: 100% (reused existing tags)
### Quality Metrics
- Domain isolation: ✅ (one task = one specialist)
- Dependencies mapped: ✅ (Database → Backend → Frontend pattern)
- Documentation task: ✅ (user-facing feature)
- Testing task: ✅ (created)
- No circular dependencies: ✅
- Templates applied: ✅
**Result:** Excellent execution - target quality (95%+) achieved!
```
### Output (Issues Detected)
```markdown
## ⚠️ QA Review: Planning Specialist - Issues Found
**Workflow Adherence:** 8/8 steps (100%)
### Specialized Analysis
**📊 Execution Graph Quality: 73%**
- Dependency Accuracy: 67% (1/3 dependencies incorrect)
- Parallel Completeness: 67% (1 opportunity missed)
- Format Clarity: 85% (some ambiguous notation)
**🏷️ Tag Quality: 92%**
- Tag Coverage: 100% (all tasks have tags)
- Agent Mapping Coverage: 100% (all tags map to specialists)
- Convention Adherence: 75% (1 new tag without agent-mapping check)
### Issues Detected (4)
**🚨 ALERT: Incorrect Dependency**
- Task: "Implement backend API"
- Expected blocked by: ["Create database schema", "Design API endpoints"]
- Found in graph: ["Design API endpoints"]
- **Missing:** "Create database schema"
- **Impact:** Task might start before database ready
**🚨 ALERT: Cross-Domain Task**
- Task: "Build authentication UI"
- Domains detected: frontend + backend
- Evidence: Description mentions "UI components AND API integration"
- **Fix:** Split into "Build authentication UI" (frontend) + "Integrate auth API" (backend)
**⚠️ WARN: Missed Parallel Opportunity**
- Task: "Create database schema"
- Status: Independent (no blockers)
- **Issue:** Not in Batch 1, placed in Batch 2
- **Fix:** Move to Batch 1 (can start immediately)
**⚠️ WARN: Tag Convention Violation**
- Task: "Build authentication UI"
- Tag: "auth-ui" (new tag)
- **Issue:** Tag not in project conventions, not checked in agent-mapping.yaml
- **Suggestion:** Use existing tag "frontend" or add "auth-ui" to agent-mapping.yaml
### ✅ Successes
- 8/8 workflow steps completed
- Documentation task included
- Testing task included
- Task descriptions populated (200-600 chars)
### 📋 Added to TodoWrite
- [ ] Fix dependency: "Implement backend API" should depend on "Create database schema"
- [ ] Split cross-domain task: "Build authentication UI"
- [ ] Move "Create database schema" to Batch 1
- [ ] Review tag "auth-ui" - add to agent-mapping.yaml or use "frontend"
### 🎯 Recommendations
1. Update dependencies (CRITICAL - affects execution order)
2. Split cross-domain task (CRITICAL - routing will fail)
3. Optimize parallel execution (move task to Batch 1)
4. Update planning-specialist.md Step 3 to enforce domain isolation check
```
## Example 5: Status Progression Skill Validation
### Scenario
Orchestrator attempts to mark task complete
### Pre-Execution (Routing Violation Detected)
```javascript
orchestration-qa(
phase="pre",
entityType="manage_container", // Direct tool call detected
userInput="Mark task T1 complete"
)
```
### Output (Immediate Alert)
```markdown
## 🚨 CRITICAL ROUTING VIOLATION
**Severity:** CRITICAL
**Workflow Type:** status-change
**Violation:** Status change bypassed mandatory Status Progression Skill
**Expected Approach:** Use Status Progression Skill
**Actual Approach:** Direct manage_container(operation="setStatus") call
**Impact:** Prerequisite validation may be skipped:
- Summary length validation (300-500 chars required)
- Dependency completion check
- Files Changed section verification (for code tasks)
- Test validation
**Recommendation:** Use Status Progression Skill for ALL status changes
---
**Action:** STOP - Do not proceed with direct status change
**Correct Approach:**
```
Use the Status Progression Skill with the task ID
```
**Why This Matters:**
Status Progression Skill validates prerequisites before allowing status change.
Bypassing this validation can result in incomplete tasks being marked complete.
```
## Example 6: Implementation Specialist Review
### Scenario
Backend Engineer completes task
### Post-Execution
```javascript
orchestration-qa(
phase="post",
entityType="backend-engineer",
entityOutput="[Backend Engineer's response]",
entityId="task-uuid-456"
)
```
### Output (Success)
```markdown
## ✅ QA Review: Backend Engineer
**Workflow Adherence:** 9/9 steps (100%)
### Lifecycle Validation
- ✅ Read task with sections
- ✅ Read dependencies
- ✅ Completed implementation work
- ✅ Updated task sections with results
- ✅ Tests run and passing
- ✅ Summary populated (387 chars)
- ✅ Files Changed section created (ordinal 999)
- ✅ Used Status Progression Skill to mark complete
- ✅ Output minimal (73 tokens)
### Quality Checks
- Summary length: 387 chars (expected 300-500) ✅
- Files Changed: Present ✅
- Tests mentioned: Yes ("All 12 tests passing") ✅
- Status change method: Status Progression Skill ✅
- Output brevity: 73 tokens (expected 50-100) ✅
**Result:** Perfect lifecycle execution!
```
### Output (Issues)
```markdown
## 🚨 QA Review: Backend Engineer - CRITICAL ISSUE
**Workflow Adherence:** 8/9 steps (89%)
### Critical Issues (1)
**❌ ALERT: Marked Complete Without Status Progression Skill**
**What Happened:**
Backend Engineer called manage_container(operation="setStatus") directly.
**Expected Behavior:**
Step 8 of specialist lifecycle requires using Status Progression Skill.
**Impact:**
- Summary validation may have been skipped (no length check)
- Files Changed section may not have been verified
- Test validation may have been incomplete
**Evidence:**
- Task status changed to "completed"
- No mention of "Status Progression" in output
- Direct tool call detected
**Recommendation:**
All implementation specialists MUST use Status Progression Skill in Step 8.
**Definition Update Needed:**
Update backend-engineer.md to emphasize CRITICAL pattern:
```diff
### Step 8: Use Status Progression Skill to Mark Complete
+ **CRITICAL:** NEVER call manage_container directly for status changes
+ **ALWAYS:** Use Status Progression Skill for prerequisite validation
```
---
### ⚠️ Issues (1)
**⚠️ WARN: Files Changed Section Missing**
- Expected: Section with ordinal 999, title "Files Changed"
- Found: No Files Changed section
- **Impact:** Difficult to track what files were modified
### 📋 Added to TodoWrite
- [ ] ALERT: Backend Engineer bypassed Status Progression Skill
- [ ] Add Files Changed section to task
- [ ] Update backend-engineer.md Step 8 critical pattern
### 💭 Decision Required
**Issue:** Critical workflow pattern violated (Status Progression bypass)
**Options:**
1. **Validate task manually**
- Check summary length (300-500 chars)
- Verify Files Changed section exists or create it
- Confirm tests passing
2. **Revert and retry with Status Progression Skill**
- Revert task to "in-progress"
- Use Status Progression Skill for completion
- Ensures all prerequisites validated
**Recommendation:** Option 1 for this instance, but update backend-engineer.md
to prevent recurrence.
```
## Example 7: Session Summary with Patterns
### Scenario
End of session after multiple workflows
### Usage
```javascript
orchestration-qa(phase="summary", sessionId="session-123")
```
### Output
```markdown
## 📊 Session QA Summary
**Workflows Analyzed:** 6
- Skills: 2 (Feature Orchestration, Status Progression)
- Subagents: 4 (Feature Architect, Planning Specialist, 2x Backend Engineer)
**Quality Overview:**
- ✅ Successful: 4 (no issues)
- ⚠️ Issues: 1 (Planning Specialist - graph quality 73%)
- 🚨 Critical: 1 (Backend Engineer - status bypass)
### Deviation Breakdown
- Routing violations: 1 (status change bypass)
- Workflow deviations: 0
- Output quality: 0
- Dependency errors: 2 (in Planning Specialist)
- Tag issues: 1 (convention violation)
- Token waste: 0
### Recurring Patterns (1)
**🔁 Pattern: Status Change Bypasses**
- Occurrences: 2 (Backend Engineer x2)
- Root cause: Step 8 critical pattern not emphasized enough
- Impact: Prerequisites validation skipped
- **Suggestion**: Update backend-engineer.md Step 8 with CRITICAL emphasis
### Improvement Recommendations (2)
**Priority 1: Backend Engineer Definition Update**
- File: backend-engineer.md
- Section: Step 8
- Type: Critical Pattern Emphasis
- Change: Add CRITICAL warning against direct status changes
- Impact: Prevents status bypass in future executions
- Effort: Low (text addition)
**Priority 2: Planning Specialist Validation Checklist**
- File: planning-specialist.md
- Section: Step 5 (Map Dependencies)
- Type: Validation Checklist
- Change: Add graph quality validation before returning
- Impact: Ensures execution graph accuracy >= 95%
- Effort: Medium (add quality gate step)
### Quality Trends
- Graph quality: 87% average (baseline 70%, target 95%+)
- Tag coverage: 98% average (baseline 90%, target 100%)
- Token efficiency: 91% average
- Workflow adherence: 94% average
### Next Steps
1. Update backend-engineer.md Step 8 (CRITICAL pattern emphasis)
2. Update planning-specialist.md Step 5 (graph validation checklist)
3. Monitor for recurrence in next session
```
## Example 8: Mid-Session Reconfiguration
### Scenario
User wants to enable additional analysis categories after session has started
### User Request
"I want to also track token optimization now to see if we're wasting tokens"
### Usage
```javascript
orchestration-qa(phase="configure")
```
### Interactive Prompts
Agent presents options with current configuration pre-selected:
```markdown
## Quality Analysis Configuration
**Current Configuration:**
- ✅ Information Density
- ✅ Routing Validation
**Which quality analysis categories would you like to enable?**
(Your current selections are pre-checked)
☑ Information Density ← Currently enabled
☐ Execution Graphs
☐ Tag Coverage
☑ Token Optimization ← User adds this
☐ Tool Selection
☑ Routing Validation ← Currently enabled
☐ Parallel Detection
```
### User Updates
```javascript
// User selects:
{
categories: ["Information Density", "Routing Validation", "Token Optimization"],
reportStyle: "Standard" // Keep existing
}
```
### Output
```markdown
**QA Configuration Updated**
**Changes:**
- Added: Token Optimization
**New Configuration (3 categories):**
- ✅ Information Density
- ✅ Routing Validation
- ✅ Token Optimization (NEW)
**Report Style:** Standard (unchanged)
**Token Impact:**
- Previous: ~1,200 tokens per analysis
- Updated: ~1,800 tokens per analysis (+600 tokens)
- Trade-off: Now detecting token waste patterns
All future QA reviews will include token optimization analysis.
```
### Next Post-Execution
```javascript
// After Backend Engineer completes task
orchestration-qa(
phase="post",
entityType="backend-engineer",
entityOutput="...",
entityId="task-uuid"
)
// Now loads: post-execution.md + routing-validation.md +
// task-content-quality.md + token-optimization.md
// Analysis includes: content quality + routing + token waste detection
```
## Usage Patterns Summary
### Session Start (First Time)
1. `phase="configure"` - Interactive category selection (~200-300 tokens)
2. `phase="init"` - Load knowledge bases (~1000 tokens)
### Per Entity
- `phase="pre"` - Before launching any Skill or Subagent (~600 tokens)
- `phase="post"` - After any Skill or Subagent completes (varies by config)
### Optional
- `phase="configure"` - Reconfigure mid-session
- `phase="summary"` - End-of-session pattern tracking (~800 tokens)
### Configuration-Driven Token Costs
**Post-Execution Costs by Configuration:**
| Configuration | Token Cost | Use Case |
|--------------|------------|----------|
| **Minimal** (Routing only) | ~1,000 tokens | Critical bypass detection only |
| **Default** (Info Density + Routing) | ~1,200 tokens | Most users - content + critical checks |
| **Planning Focus** (Graphs + Tags + Routing) | ~2,000 tokens | Planning Specialist reviews |
| **Comprehensive** (All enabled) | ~3,500 tokens | Full quality analysis |
**Session Cost Examples:**
| Workflow | Config | Total Cost | vs Monolithic |
|----------|--------|------------|---------------|
| 1 Feature + 3 Tasks | Default | ~6k tokens | 70% savings |
| 1 Feature + 3 Tasks | Minimal | ~4.5k tokens | 78% savings |
| 1 Feature + 3 Tasks | Comprehensive | ~15k tokens | 25% savings |
**Monolithic Trainer**: 20k-30k tokens always loaded (no configuration)
**Smart Defaults**: Information Density + Routing Validation achieves 93% token reduction while catching critical issues

---
# Execution Graph Quality Analysis
**Purpose**: Validate Planning Specialist's execution graph matches actual database dependencies and identifies all parallel opportunities.
**When**: After Planning Specialist completes task breakdown
**Entity**: Planning Specialist only
**Token Cost**: ~600-900 tokens
## Quality Metrics
This analysis measures three aspects of execution graph quality:
1. **Dependency Accuracy** (70% baseline): Do claimed dependencies match database?
2. **Parallel Completeness** (70% baseline): Are all parallel opportunities identified?
3. **Format Clarity** (95% baseline): Is graph notation clear and unambiguous?
**Target**: 95%+ overall quality score
## Analysis Workflow
### Step 1: Query Actual Dependencies
```javascript
// Get all tasks for feature
tasks = query_container(operation="overview", containerType="feature", id=featureId).tasks
// Query dependencies for each task
actualDependencies = {}
for (const task of tasks) {
deps = query_dependencies(taskId=task.id, includeTaskInfo=true)
actualDependencies[task.id] = {
title: task.title,
blockedBy: deps.incoming,
blocks: deps.outgoing
}
}
```
### Step 2: Extract Planning Specialist's Graph
Parse the output to extract claimed execution structure:
```javascript
planningGraph = extractExecutionGraph(planningOutput)
// Should contain: batches, dependencies, parallel claims
```
### Step 3: Verify Dependency Accuracy
Compare claimed vs actual dependencies:
```javascript
for (const task of tasks) {
graphBlockers = planningGraph.dependencies[task.title] || []
actualBlockers = actualDependencies[task.id].blockedBy.map(t => t.title)
if (!arraysEqual(graphBlockers, actualBlockers)) {
issues.push({
task: task.title,
expected: actualBlockers,
found: graphBlockers,
severity: "ALERT"
})
}
}
```
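The comparison above leans on an `arraysEqual` helper that the skill does not define. A minimal sketch, assuming dependency lists should be compared as unordered sets, could be:

```javascript
// Hypothetical helper for the Step 3 comparison: dependency lists are
// treated as unordered sets, so sort copies before element-wise checks.
function arraysEqual(a, b) {
  if (a.length !== b.length) return false;
  const sa = [...a].sort();
  const sb = [...b].sort();
  return sa.every((item, i) => item === sb[i]);
}

// Claimed blockers from the graph vs. blockers recorded in the database
const graphBlockers = ["Design API endpoints"];
const actualBlockers = ["Create database schema", "Design API endpoints"];
console.log(arraysEqual(graphBlockers, actualBlockers)); // false → raise an ALERT
```

Order-insensitive comparison avoids false ALERTs when the Planning Specialist lists the same blockers in a different order than the database returns them.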
### Step 4: Verify Parallel Completeness
Check all parallel opportunities identified:
```javascript
// Independent tasks (no blockers) should all be in Batch 1
independentTasks = tasks.filter(t => actualDependencies[t.id].blockedBy.length == 0)
for (const task of independentTasks) {
if (!isInBatch(task, 1, planningGraph)) {
issues.push({
task: task.title,
issue: "Independent task not in Batch 1",
severity: "WARN"
})
}
}
// Tasks in same batch should have no dependencies between them
for (const batch of planningGraph.batches) {
  for (const [taskA, taskB] of batch.pairs()) {
if (actualDependencies[taskA.id].blocks.includes(taskB.id)) {
issues.push({
issue: `${taskA.title} blocks ${taskB.title} but both in same batch`,
severity: "ALERT"
})
}
}
}
```
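The same-batch check in Step 4 can be exercised standalone. The sketch below uses assumed data shapes (a `blocks` map of task ID to blocked task IDs) rather than the real `query_dependencies` output:

```javascript
// Standalone sketch of the same-batch conflict check: no task in a
// batch may block another task in that batch. Data shapes are assumed.
const blocks = { t1: ["t2"], t2: [], t3: [] }; // t1 blocks t2
const batch = [
  { id: "t1", title: "Create database schema" },
  { id: "t2", title: "Implement backend API" }
];

const conflicts = [];
for (let i = 0; i < batch.length; i++) {
  for (let j = 0; j < batch.length; j++) {
    if (i !== j && blocks[batch[i].id].includes(batch[j].id)) {
      conflicts.push(`${batch[i].title} blocks ${batch[j].title} but both in same batch`);
    }
  }
}
console.log(conflicts.length); // 1 → ALERT-level issue
```

Any hit here is an ALERT: the two tasks cannot safely run in parallel, so the batch assignment must be corrected.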
### Step 5: Calculate Quality Score
```javascript
score = {
dependencyAccuracy: (correct / total) * 100,
parallelCompleteness: (identified / opportunities) * 100,
formatClarity: hasGoodFormat ? 100 : 50,
overall: average(dependencyAccuracy, parallelCompleteness, formatClarity)
}
```
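As a runnable sketch of this scoring (field names and the equal-weight average follow the pseudocode above; the sample inputs are illustrative):

```javascript
// Equal-weight quality score over the three Step 5 metrics.
function scoreGraph({ correct, total, identified, opportunities, hasGoodFormat }) {
  const dependencyAccuracy = (correct / total) * 100;
  const parallelCompleteness = (identified / opportunities) * 100;
  const formatClarity = hasGoodFormat ? 100 : 50;
  const overall = (dependencyAccuracy + parallelCompleteness + formatClarity) / 3;
  return { dependencyAccuracy, parallelCompleteness, formatClarity, overall };
}

// Illustrative low-quality graph: 1 of 3 dependencies wrong, 1 of 3
// parallel opportunities missed, format acceptable.
const s = scoreGraph({ correct: 2, total: 3, identified: 2, opportunities: 3, hasGoodFormat: true });
console.log(s.overall.toFixed(1)); // 77.8 → below the 95% target, report full details
```

With these inputs the overall score falls below the 95% target, so the full report template below would be emitted rather than a brief summary.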
## Report Template
```markdown
## 📊 Execution Graph Quality
**Overall Score**: [X]% (Baseline: 70% / Target: 95%+)
### Metrics
- Dependency Accuracy: [X]%
- Parallel Completeness: [Y]%
- Format Clarity: [Z]%
### Issues ([count] total)
🚨 **ALERT** ([count]): Critical dependency errors
- [Task A]: Expected blocked by [B], found [C]
⚠️ **WARN** ([count]): Missed parallel opportunities
- [Task D]: Independent but not in Batch 1
### Recommendations
1. [Most critical fix]
2. [Process improvement]
```
## When to Report
- **ALWAYS** after Planning Specialist
- **Full details** if score < 95%
- **Brief summary** if score >= 95%

---
# Session Initialization
**Purpose**: Load knowledge bases for Skills, Subagents, and routing configuration to enable validation throughout the session.
**When**: First interaction in new session (phase="init")
**Token Cost**: ~800-1000 tokens (one-time per session)
## Initialization Workflow
### Step 1: Load Skills Knowledge Base
**Action**: Discover and parse all Skill definitions
```javascript
// Glob all skill files
skillFiles = Glob(pattern=".claude/skills/*/SKILL.md")
// For each skill file found:
for (const skillFile of skillFiles) {
  // Read and parse YAML frontmatter + content
  content = Read(skillFile)

  // Extract from YAML frontmatter
  name = content.frontmatter.name
  description = content.frontmatter.description

  // Extract from content sections
  mandatoryTriggers = extractSection(content, "When to Use This Skill")
  workflows = extractSection(content, "Workflow")
  expectedOutputs = extractSection(content, "Output Format")
  toolUsage = extractSection(content, "Tools Used")
  tokenRange = extractSection(content, "Token Cost")

  // Store in knowledge base
  skills[name] = {
    file: skillFile,
    description: description,
    mandatoryTriggers: mandatoryTriggers,
    workflows: workflows,
    expectedOutputs: expectedOutputs,
    tools: toolUsage,
    tokenRange: tokenRange
  }
}
```
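The frontmatter extraction above assumes a parser. A minimal dependency-free sketch, sufficient only for flat `key: value` headers (not full YAML), might look like:

```javascript
// Hypothetical frontmatter parser: pulls flat key/value pairs from the
// leading `---` block of a SKILL.md document. Not a full YAML parser.
function parseFrontmatter(text) {
  const match = text.match(/^---\n([\s\S]*?)\n---/);
  if (!match) return {};
  const fields = {};
  for (const line of match[1].split("\n")) {
    const idx = line.indexOf(":");
    if (idx > 0) fields[line.slice(0, idx).trim()] = line.slice(idx + 1).trim();
  }
  return fields;
}

const doc = "---\nname: Status Progression\ndescription: Validates status changes\n---\n# Skill";
console.log(parseFrontmatter(doc).name); // prints: Status Progression
```

A real implementation would use a YAML library to handle nested values and quoting; this sketch only illustrates where `name` and `description` come from.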
**Example Skills Loaded**:
```javascript
skills = {
"Feature Orchestration": {
file: ".claude/skills/feature-orchestration/SKILL.md",
mandatoryTriggers: [
"Create a feature",
"Complete feature",
"Feature progress"
],
workflows: [
"Smart Feature Creation",
"Task Breakdown Coordination",
"Feature Completion"
],
expectedOutputs: ["Feature ID", "Task count", "Next action"],
tools: ["query_container", "manage_container", "query_templates", "recommend_agent"],
tokenRange: [300, 800]
},
"Task Orchestration": {
file: ".claude/skills/task-orchestration/SKILL.md",
mandatoryTriggers: [
"Execute tasks",
"What's next",
"Launch batch",
"What tasks are ready"
],
workflows: [
"Dependency-Aware Batching",
"Parallel Specialist Launch",
"Progress Monitoring"
],
expectedOutputs: ["Batch structure", "Parallel opportunities", "Specialist recommendations"],
tools: ["query_container", "manage_container", "query_dependencies", "recommend_agent"],
tokenRange: [500, 900]
},
"Status Progression": {
file: ".claude/skills/status-progression/SKILL.md",
mandatoryTriggers: [
"Mark complete",
"Update status",
"Status change",
"Move to testing"
],
workflows: [
"Read Config",
"Validate Prerequisites",
"Interpret Errors"
],
expectedOutputs: ["Status updated", "Validation error with details"],
tools: ["Read", "query_container", "query_dependencies"],
tokenRange: [200, 400],
critical: "MANDATORY for ALL status changes - never bypass"
},
"Dependency Analysis": {
file: ".claude/skills/dependency-analysis/SKILL.md",
mandatoryTriggers: [
"What's blocking",
"Show dependencies",
"Check blockers"
],
workflows: [
"Query Dependencies",
"Analyze Chains",
"Report Findings"
],
expectedOutputs: ["Blocker list", "Dependency chains", "Unblock suggestions"],
tools: ["query_dependencies", "query_container"],
tokenRange: [300, 600]
},
"Dependency Orchestration": {
file: ".claude/skills/dependency-orchestration/SKILL.md",
mandatoryTriggers: [
"Resolve circular dependencies",
"Optimize dependencies"
],
workflows: [
"Advanced Dependency Analysis",
"Critical Path",
"Bottleneck Detection"
],
expectedOutputs: ["Dependency graph", "Critical path", "Optimization suggestions"],
tokenRange: [400, 700]
}
}
```
### Step 2: Load Subagents Knowledge Base
**Action**: Discover and parse all Subagent definitions
```javascript
// Glob all subagent files
subagentFiles = Glob(pattern=".claude/agents/task-orchestrator/*.md")
// For each subagent file found:
for (const subagentFile of subagentFiles) {
  // Read and parse content
  content = Read(subagentFile)

  // Extract from YAML frontmatter
  name = content.frontmatter.name
  description = content.frontmatter.description

  // Extract workflow steps (numbered steps in document)
  expectedSteps = extractNumberedSteps(content)

  // Extract critical patterns (CRITICAL, IMPORTANT sections)
  criticalPatterns = extractPatterns(content, markers=["CRITICAL", "IMPORTANT"])

  // Extract output expectations
  outputValidation = extractSection(content, "Output Format") || extractSection(content, "Return")

  // Store in knowledge base
  subagents[name] = {
    file: subagentFile,
    description: description,
    triggeredBy: extractTriggeredBy(content),
    expectedSteps: expectedSteps,
    criticalPatterns: criticalPatterns,
    outputValidation: outputValidation,
    tokenRange: extractTokenRange(content)
  }
}
```
**Example Subagents Loaded**:
```javascript
subagents = {
"Feature Architect": {
file: ".claude/agents/task-orchestrator/feature-architect.md",
triggeredBy: [
"Complex feature creation",
"PRD provided",
"Formal planning"
],
expectedSteps: [
"Step 1: Understand Context (get_overview, list_tags)",
"Step 2: Detect Input Type (PRD/Interactive/Quick)",
"Step 3a/3b/3c: Process based on mode",
"Step 4: Discover Templates",
"Step 5: Design Tag Strategy",
"Step 5.5: Verify Agent Mapping Coverage",
"Step 6: Create Feature",
"Step 7: Add Custom Sections (mode-dependent)",
"Step 8: Return Handoff (minimal)"
],
criticalPatterns: [
"description = forward-looking (what needs to be built)",
"Do NOT populate summary field during creation",
"Return minimal handoff (50-100 tokens)",
"PRD mode: Extract ALL sections from document",
"Tag strategy: Reuse existing tags (list_tags first)",
"Check agent-mapping.yaml for new tags"
],
outputValidation: [
"Feature created with description?",
"Templates applied?",
"Tags follow project conventions?",
"PRD sections represented (if PRD mode)?",
"Handoff minimal (not verbose)?"
],
tokenRange: [1800, 2200]
},
"Planning Specialist": {
file: ".claude/agents/task-orchestrator/planning-specialist.md",
triggeredBy: [
"Feature needs task breakdown",
"Complex feature created"
],
expectedSteps: [
"Step 1: Read Feature Context (includeSections=true)",
"Step 2: Discover Task Templates",
"Step 3: Break Down into Domain-Isolated Tasks",
"Step 4: Create Tasks with Descriptions",
"Step 5: Map Dependencies",
"Step 7: Inherit and Refine Tags",
"Step 8: Return Brief Summary"
],
criticalPatterns: [
"One task = one specialist domain",
"Task description populated (200-600 chars)",
"Do NOT populate summary field",
"ALWAYS create documentation task for user-facing features",
"Create separate test task for comprehensive testing",
"Database → Backend → Frontend dependency pattern"
],
outputValidation: [
"Tasks created with descriptions?",
"Domain isolation preserved?",
"Dependencies mapped correctly?",
"Documentation task included (if user-facing)?",
"Testing task included (if needed)?",
"No circular dependencies?",
"Templates applied to tasks?"
],
tokenRange: [1800, 2200]
},
"Backend Engineer": {
file: ".claude/agents/task-orchestrator/backend-engineer.md",
triggeredBy: ["Backend implementation task"],
expectedSteps: [
"Step 1: Read task (includeSections=true)",
"Step 2: Read dependencies (if any)",
"Step 3: Do work (code, tests)",
"Step 4: Update task sections",
"Step 5: Run tests and validate",
"Step 6: Populate summary (300-500 chars)",
"Step 7: Create Files Changed section",
"Step 8: Use Status Progression Skill to mark complete",
"Step 9: Return minimal output"
],
criticalPatterns: [
"ALL tests must pass before completion",
"Summary REQUIRED (300-500 chars)",
"Files Changed section REQUIRED (ordinal 999)",
"Use Status Progression Skill to mark complete",
"Return minimal output (50-100 tokens)",
"If BLOCKED: Report with details, don't mark complete"
],
outputValidation: [
"Task marked complete?",
"Summary populated (300-500 chars)?",
"Files Changed section created?",
"Tests mentioned in summary?",
"Used Status Progression Skill for completion?",
"Output minimal (not verbose)?",
"If blocked: Clear reason + attempted fixes?"
],
tokenRange: [1800, 2200]
}
// Similar structures for:
// - Frontend Developer
// - Database Engineer
// - Test Engineer
// - Technical Writer
// - Bug Triage Specialist
}
```
### Step 3: Load Routing Configuration
**Action**: Read agent-mapping.yaml for tag-based routing
```javascript
// Read routing config
configPath = getProjectRoot().resolve(".taskorchestrator/agent-mapping.yaml")
configContent = Read(configPath)
// Parse YAML
agentMapping = parseYAML(configContent)
// Store tag mappings
routing = {
tagMappings: agentMapping.tagMappings,
// Example:
// "backend" → ["Backend Engineer"]
// "frontend" → ["Frontend Developer"]
// "database" → ["Database Engineer"]
// "testing" → ["Test Engineer"]
// "documentation" → ["Technical Writer"]
}
```
**Example Routing Configuration**:
```javascript
routing = {
tagMappings: {
"backend": ["Backend Engineer"],
"frontend": ["Frontend Developer"],
"database": ["Database Engineer"],
"testing": ["Test Engineer"],
"documentation": ["Technical Writer"],
"bug": ["Bug Triage Specialist"],
"architecture": ["Feature Architect"],
"planning": ["Planning Specialist"],
"api": ["Backend Engineer"],
"ui": ["Frontend Developer"],
"schema": ["Database Engineer"],
"migration": ["Database Engineer"]
}
}
```
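Given a loaded mapping, resolving a task's tags to specialists (and surfacing unmapped tags, which feed the tag-convention WARNs) can be sketched as follows; the helper name is illustrative, not part of the documented API:

```javascript
// Sketch of tag-based routing lookup against agent-mapping.yaml content.
const tagMappings = {
  backend: ["Backend Engineer"],
  frontend: ["Frontend Developer"],
  database: ["Database Engineer"]
};

function resolveSpecialists(taskTags, mappings) {
  const specialists = new Set();
  const unmapped = [];
  for (const tag of taskTags) {
    const agents = mappings[tag];
    if (agents) agents.forEach((a) => specialists.add(a));
    else unmapped.push(tag); // basis for tag-convention WARNs
  }
  return { specialists: [...specialists], unmapped };
}

console.log(resolveSpecialists(["backend", "auth-ui"], tagMappings));
// specialists: ["Backend Engineer"], unmapped: ["auth-ui"]
```

This mirrors the Example 4 finding: a new tag like `auth-ui` resolves to no specialist, so QA suggests reusing an existing tag or extending agent-mapping.yaml.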
### Step 4: Initialize Tracking State
**Action**: Set up session-level tracking for deviations and patterns
```javascript
qaState = {
  session: {
    startTime: now(),
    knowledgeBaseLoaded: true,
    skillsCount: Object.keys(skills).length,
    subagentsCount: Object.keys(subagents).length
  },
tracking: {
// Store original user inputs by workflow ID
originalInputs: {},
// Validation checkpoints by workflow ID
checkpoints: [],
// Categorized deviations
deviations: {
orchestrator: [], // Routing violations (Skills bypassed)
skills: [], // Skill workflow issues
subagents: [] // Subagent workflow issues
},
// Improvement suggestions
improvements: []
}
}
```
### Step 5: Report Initialization Status
**Output**: Confirmation that QA system is ready
```markdown
**Orchestration QA Initialized**
**Knowledge Base Loaded:**
- Skills: 5 (feature-orchestration, task-orchestration, status-progression, dependency-analysis, dependency-orchestration)
- Subagents: 8 (feature-architect, planning-specialist, backend-engineer, frontend-developer, database-engineer, test-engineer, technical-writer, bug-triage-specialist)
- Routing: agent-mapping.yaml loaded (12 tag mappings)
**Quality Assurance Active:**
- ✅ Pre-execution validation
- ✅ Post-execution review
- ✅ Routing validation (Skills vs Direct)
- ✅ Pattern tracking (continuous improvement)
**Session Tracking:**
- Deviations: 0 ALERT, 0 WARN, 0 INFO
- Patterns: 0 recurring issues
- Improvements: 0 suggestions
Ready to monitor orchestration quality.
```
## Error Handling
### Skills Directory Not Found
```javascript
if (!exists(".claude/skills/")) {
return {
error: "Skills directory not found",
suggestion: "Run setup_claude_orchestration to install Skills and Subagents",
fallback: "QA will operate with limited validation (no Skills knowledge)"
}
}
```
### Subagents Directory Not Found
```javascript
if (!exists(".claude/agents/task-orchestrator/")) {
return {
error: "Subagents directory not found",
suggestion: "Run setup_claude_orchestration to install Subagents",
fallback: "QA will operate with limited validation (no Subagents knowledge)"
}
}
```
### Agent Mapping Not Found
```javascript
if (!exists(".taskorchestrator/agent-mapping.yaml")) {
return {
warning: "agent-mapping.yaml not found",
suggestion: "Routing validation will use default patterns",
fallback: "QA will operate without tag-based routing validation"
}
}
```
## Caching Strategy
**Knowledge bases are expensive to load** (~800-1000 tokens). Cache them for the session:
```javascript
// Load once per session
if (!session.knowledgeBaseLoaded) {
loadSkillsKnowledgeBase()
loadSubagentsKnowledgeBase()
loadRoutingConfiguration()
session.knowledgeBaseLoaded = true
}
// Reuse throughout session
skill = skills["Feature Orchestration"]
subagent = subagents["Planning Specialist"]
backendSpecialists = routing.tagMappings["backend"]
```
**When to reload**:
- New session starts
- User explicitly requests: "Reload QA knowledge base"
- Skills/Subagents modified during session (rare)
## Usage Example
```javascript
// At session start
orchestration-qa(phase="init")
// Returns:
{
initialized: true,
skillsCount: 5,
subagentsCount: 8,
routingLoaded: true,
message: "✅ Orchestration QA Initialized - Ready to monitor quality"
}
// Knowledge base now available for all subsequent validations
```
# Parallel Opportunity Detection
**Purpose**: Identify missed parallelization opportunities in task execution.
**When**: Optional post-execution (controlled by enableEfficiencyAnalysis parameter)
**Applies To**: Task Orchestration Skill, Planning Specialist
**Token Cost**: ~400-600 tokens
## Parallel Opportunity Types
### Type 1: Independent Tasks Not Batched
**Opportunity**: Tasks with no dependencies can run simultaneously
**Detection**:
```javascript
independentTasks = tasks.filter(t =>
query_dependencies(taskId=t.id).incoming.length == 0 &&
t.status == "pending"
)
if (independentTasks.length >= 2 && !launchedInParallel) {
return {
type: "Independent tasks not batched",
tasks: independentTasks.map(t => t.title),
opportunity: `${independentTasks.length} tasks can run simultaneously`,
impact: "Sequential execution when parallel possible",
recommendation: "Use Task Orchestration Skill to batch parallel tasks"
}
}
```
### Type 2: Tasks with Same Dependencies Not Grouped
**Opportunity**: Tasks blocked by the same tasks can run in parallel after blockers complete
**Detection**:
```javascript
// Group tasks by their blockers
tasksByBlockers = groupByBlockers(tasks)
for (const [blockerKey, taskGroup] of Object.entries(tasksByBlockers)) {
if (taskGroup.length >= 2 && !inSameBatch(taskGroup)) {
return {
type: "Tasks with same dependencies not grouped",
tasks: taskGroup.map(t => t.title),
sharedBlockers: parseBlockers(blockerKey),
opportunity: `${taskGroup.length} tasks can run parallel after blockers complete`,
recommendation: "Batch these tasks together"
}
}
}
```
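The `groupByBlockers` helper above is not defined in this skill. A minimal sketch (the `blockers` array shape is an assumption; real data would come from `query_dependencies`) keys each group on the task's sorted blocker IDs so tasks with identical blockers land together:

```javascript
// Hypothetical groupByBlockers sketch: tasks sharing the same set of
// blockers get the same key and end up in the same group.
function groupByBlockers(tasks) {
  const groups = {};
  for (const task of tasks) {
    const key = [...task.blockers].sort().join(",");
    (groups[key] = groups[key] || []).push(task);
  }
  return groups;
}

const groups = groupByBlockers([
  { title: "Build API", blockers: ["T1"] },
  { title: "Build UI", blockers: ["T1"] },
  { title: "Design schema", blockers: [] },
]);
console.log(groups["T1"].map(t => t.title)); // both T1-blocked tasks grouped
```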
### Type 3: Sequential Specialist Launches When Parallel Possible
**Opportunity**: Multiple specialists launched one-by-one instead of in parallel
**Detection**:
```javascript
if (launchedSpecialists.length >= 2 && !launchedInParallel) {
// Check if they have no dependencies between them
noDependencies = !hasBlockingRelationships(launchedSpecialists)
if (noDependencies) {
return {
type: "Sequential specialist launches",
specialists: launchedSpecialists,
opportunity: "Launch specialists in parallel",
impact: "Sequential execution increases total time",
recommendation: "Use Task tool multiple times in single message"
}
}
}
```
### Type 4: Domain-Isolated Tasks Not Parallelized
**Opportunity**: Backend + Frontend + Database tasks can often run in parallel
**Detection**:
```javascript
domains = {
database: tasks.filter(t => t.tags.includes("database")),
backend: tasks.filter(t => t.tags.includes("backend")),
frontend: tasks.filter(t => t.tags.includes("frontend"))
}
// Check typical dependency pattern: database → backend → frontend
// BUT: If each domain has multiple tasks, those CAN run in parallel
for (const [domain, domainTasks] of Object.entries(domains)) {
if (domainTasks.length >= 2 && !parallelizedWithinDomain(domainTasks)) {
return {
type: "Domain tasks not parallelized",
domain: domain,
tasks: domainTasks.map(t => t.title),
opportunity: `${domainTasks.length} ${domain} tasks can run parallel`,
recommendation: "Launch domain specialists in parallel"
}
}
}
```
## Analysis Workflow
```javascript
parallelOpportunities = []
// Check each opportunity type
checkIndependentTasks()
checkSameDependencyGroups()
checkSequentialLaunches()
checkDomainParallelization()
// Calculate potential time savings
if (parallelOpportunities.length > 0) {
estimatedTimeSavings = calculateTimeSavings(parallelOpportunities)
return {
opportunitiesFound: parallelOpportunities.length,
opportunities: parallelOpportunities,
estimatedSavings: estimatedTimeSavings,
recommendation: "Use Task Orchestration Skill for parallel batching"
}
}
```
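The `calculateTimeSavings` call above is left undefined. One hedged heuristic (an assumption, not the actual implementation) treats each task as one unit of wall-clock time and compares sequential cost to the number of parallel batches:

```javascript
// Rough savings heuristic: each task = 1 time unit; a parallel batch
// collapses its tasks into 1 unit. Returns percent of time saved.
function calculateTimeSavings(opportunities) {
  const sequential = opportunities.reduce((sum, o) => sum + o.tasks.length, 0);
  const parallel = opportunities.length; // one batch per opportunity
  return sequential === 0 ? 0 : Math.round((1 - parallel / sequential) * 100);
}

const savings = calculateTimeSavings([
  { tasks: ["Task A", "Task B", "Task C"] }, // 3 independent tasks
  { tasks: ["Task D", "Task E"] },           // 2 same-blocker tasks
]);
console.log(`${savings}%`); // 60% — 5 sequential units vs 2 parallel batches
```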
## Report Template
```markdown
## ⚡ Parallel Opportunity Detection
**Opportunities Found**: [count]
**Estimated Time Savings**: [X]% (parallel vs sequential)
### Opportunities
**ℹ️ INFO**: Independent tasks not batched
- Tasks: [Task A, Task B, Task C]
- Opportunity: 3 tasks can run simultaneously (no dependencies)
- Impact: Sequential execution taking 3x longer than necessary
**ℹ️ INFO**: Domain tasks not parallelized
- Domain: backend
- Tasks: [Task D, Task E]
- Opportunity: 2 backend tasks can run parallel
### Recommendations
1. Use Task Orchestration Skill for dependency-aware batching
2. Launch specialists in parallel: `Task(Backend Engineer, task1)` + `Task(Backend Engineer, task2)` in single message
```
## When to Report
- **Only if** enableEfficiencyAnalysis=true
- **INFO** level (optimizations, not violations)
- Most valuable after task execution workflows
## Integration with Task Orchestration Skill
This analysis helps validate that Task Orchestration Skill is identifying all parallel opportunities:
```javascript
// If Task Orchestration Skill was used
if (usedTaskOrchestrationSkill) {
// Check if it identified all opportunities
identifiedOpportunities = extractBatchStructure(output)
missedOpportunities = parallelOpportunities.filter(o =>
!identifiedOpportunities.includes(o)
)
if (missedOpportunities.length > 0) {
return {
severity: "WARN",
issue: "Task Orchestration Skill missed parallel opportunities",
missed: missedOpportunities,
recommendation: "Update task-orchestration skill workflow"
}
}
}
```
# Pattern Tracking & Continuous Improvement
**Purpose**: Track recurring issues and suggest systemic improvements to definitions.
**When**: After deviations detected, end of session
**Token Cost**: ~300-500 tokens
## Pattern Detection
### Recurrence Threshold
**Definition**: Issue is "recurring" if it happens 2+ times in session
**Why**: One-off issues may be anomalies; recurring issues indicate systemic problems
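As a sketch, the threshold check reduces to counting occurrences per issue type and flagging anything seen twice or more (the issue shape here is illustrative):

```javascript
// Flag issue types seen at least `threshold` times in the session.
function findRecurring(issues, threshold = 2) {
  const counts = {};
  for (const issue of issues) {
    counts[issue.type] = (counts[issue.type] || 0) + 1;
  }
  return Object.entries(counts)
    .filter(([, count]) => count >= threshold)
    .map(([type, count]) => ({ type, count }));
}

const recurring = findRecurring([
  { type: "verbose-handoff" },
  { type: "verbose-handoff" },
  { type: "missing-tag" }, // one-off: not flagged
]);
console.log(recurring); // [{ type: "verbose-handoff", count: 2 }]
```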
### Pattern Categories
1. **Routing Violations** - Skills bypassed
2. **Workflow Deviations** - Steps skipped
3. **Output Quality** - Verbose output, missing sections
4. **Dependency Errors** - Incorrect graph, circular dependencies
5. **Tag Issues** - Missing mappings, convention violations
6. **Token Waste** - Repeated inefficiency patterns
## Tracking Workflow
### Step 1: Detect Recurrence
```javascript
// Track issues across session
session.deviations = {
routing: [],
workflow: [],
output: [],
dependency: [],
tag: [],
token: []
}
// After each workflow, categorize deviations
for (const deviation of currentDeviations) {
  category = categorize(deviation)
  session.deviations[category].push({
    timestamp: now(),
    entity: entityType,
    issue: deviation.issue,
    severity: deviation.severity
  })
}
// Detect patterns
patterns = []
for (const [category, issues] of Object.entries(session.deviations)) {
  grouped = groupByIssue(issues)
  for (const [issueType, occurrences] of Object.entries(grouped)) {
    if (occurrences.length >= 2) {
      patterns.push({
        category: category,
        issue: issueType,
        count: occurrences.length,
        entities: occurrences.map(o => o.entity),
        severity: determineSeverity(occurrences)
      })
    }
  }
}
```
### Step 2: Analyze Root Cause
```javascript
for (const pattern of patterns) {
  rootCause = analyzeRootCause(pattern)
  // Returns: "Definition unclear", "Validation missing", "Template incomplete", etc.
  pattern.rootCause = rootCause
  pattern.systemic = isSystemic(rootCause) // vs one-off orchestrator error
}
```
**Root Cause Types**:
- **Definition Unclear**: Instructions ambiguous or missing
- **Validation Missing**: No checkpoint to catch issue
- **Template Incomplete**: Template doesn't guide properly
- **Knowledge Gap**: Orchestrator unaware of pattern
- **Tool Limitation**: Current tools can't prevent issue
### Step 3: Generate Improvement Suggestions
```javascript
improvements = []
for (const pattern of patterns.filter(p => p.systemic)) {
  suggestion = generateImprovement(pattern)
  improvements.push(suggestion)
}
```
**Improvement Types**:
#### Type 1: Definition Update
```javascript
{
type: "Definition Update",
file: "planning-specialist.md",
section: "Step 5: Map Dependencies",
issue: "Cross-domain tasks created (3 occurrences)",
rootCause: "No validation step for domain isolation",
suggestion: {
add: `
### Validation Checkpoint: Domain Isolation
Before creating tasks, verify:
- [ ] Each task maps to ONE specialist domain
- [ ] No task mixes backend + frontend
- [ ] No task mixes database + API logic
If domain mixing detected, split into separate tasks.
`,
location: "After Step 3, before Step 4"
},
impact: "Prevents cross-domain tasks in future Planning Specialist executions"
}
```
#### Type 2: Validation Checklist
```javascript
{
type: "Validation Checklist",
file: "feature-architect.md",
section: "Step 8: Return Handoff",
issue: "Verbose handoff (2 occurrences, avg 400 tokens)",
rootCause: "No token limit specified in definition",
suggestion: {
add: `
### Handoff Validation Checklist
Before returning:
- [ ] Token count < 100 (brief summary only)
- [ ] No code/detailed content in response
- [ ] Feature ID mentioned
- [ ] Next action clear
If output > 100 tokens, move details to feature sections.
`,
location: "End of Step 8"
},
impact: "Reduces Feature Architect output from 400 → 80 tokens (80% reduction)"
}
```
#### Type 3: Quality Gate
```javascript
{
type: "Quality Gate",
file: "planning-specialist.md",
section: "Step 6: Create Tasks",
issue: "Execution graph accuracy < 95% (2 occurrences)",
rootCause: "No validation before returning graph",
suggestion: {
add: `
### Quality Gate: Graph Validation
Before returning execution graph:
1. Query actual dependencies via query_dependencies
2. Compare graph claims vs database reality
3. Verify accuracy >= 95%
4. If < 95%, correct graph before returning
This ensures graph quality baseline is met.
`,
location: "After Step 6, before Step 7"
},
impact: "Ensures execution graph accuracy >= 95% in all cases"
}
```
#### Type 4: Orchestrator Guidance
```javascript
{
type: "Orchestrator Guidance",
file: "CLAUDE.md",
section: "Decision Gates",
issue: "Status changes bypassed Status Progression Skill (2 occurrences)",
rootCause: "Orchestrator unaware of mandatory pattern",
suggestion: {
add: `
### CRITICAL: Status Changes
**ALWAYS use Status Progression Skill for status changes**
❌ NEVER: manage_container(operation="setStatus", ...)
✅ ALWAYS: Use status-progression skill
**Why Critical**: Prerequisite validation required (summary length, dependencies, task counts)
**Triggers**:
- "mark complete"
- "update status"
- "move to [status]"
- "change status"
`,
location: "Decision Gates section, top priority"
},
impact: "Prevents status bypasses in future sessions"
}
```
### Step 4: Prioritize Improvements
```javascript
// Priority order:
// 1. CRITICAL patterns (routing violations)
// 2. Frequent patterns (count >= 3)
// 3. High-impact patterns (affects multiple workflows)
// 4. Easy fixes (checklist additions)
function score(improvement) {
  return (improvement.severity == "CRITICAL" ? 100 : 0)
    + improvement.count * 10
    + estimateImpact(improvement) * 5
    + estimateEase(improvement) * 2
}
prioritized = improvements.sort((a, b) => score(b) - score(a)) // Descending
```
## Session Summary
### End-of-Session Report
```markdown
## 📊 Session QA Summary
**Workflows Analyzed:** [count]
- Skills: [count]
- Subagents: [count]
**Quality Overview:**
- ✅ Successful: [count] (no issues)
- ⚠️ Issues: [count] (addressed)
- 🚨 Critical: [count] (require attention)
### Deviation Breakdown
- Routing violations: [count]
- Workflow deviations: [count]
- Output quality: [count]
- Dependency errors: [count]
- Tag issues: [count]
- Token waste: [count]
### Recurring Patterns ([count])
**🔁 Pattern: Cross-domain tasks**
- Occurrences: [count] (Planning Specialist)
- Root cause: No domain isolation validation
- Impact: Tasks can't be routed to single specialist
- **Suggestion**: Update planning-specialist.md Step 3 with validation checklist
**🔁 Pattern: Status change bypasses**
- Occurrences: [count] (Orchestrator)
- Root cause: Decision gates not prominent enough
- Impact: Prerequisites not validated
- **Suggestion**: Update CLAUDE.md Decision Gates section
### Improvement Recommendations ([count])
**Priority 1: [Improvement Title]**
- File: [definition-file.md]
- Type: [Definition Update / Validation Checklist / Quality Gate]
- Impact: [What this prevents/improves]
- Effort: [Low / Medium / High]
**Priority 2: [Improvement Title]**
- File: [definition-file.md]
- Type: [...]
- Impact: [...]
### Quality Trends
- Graph quality: [X]% average (baseline 70%, target 95%+)
- Tag coverage: [Y]% average (baseline 90%, target 100%)
- Token efficiency: [Z]% average
- Workflow adherence: [W]% average
### Next Steps
1. [Most critical improvement]
2. [Secondary improvement]
3. [Optional enhancement]
```
## Continuous Improvement Cycle
### Cycle 1: Detection (This Session)
- Track deviations as they occur
- Detect recurring patterns (2+ occurrences)
- Analyze root causes
### Cycle 2: Analysis (End of Session)
- Generate improvement suggestions
- Prioritize by impact and ease
- Present to user with recommendations
### Cycle 3: Implementation (User Decision)
- User approves definition updates
- Apply changes to source files
- Document changes in version control
### Cycle 4: Validation (Next Session)
- Verify improvements are effective
- Track if recurring patterns reduced
- Measure quality metric improvements
## Metrics to Track
### Quality Metrics (Per Session)
- Workflow adherence: [X]%
- Graph quality: [X]%
- Tag coverage: [X]%
- Token efficiency: [X]%
### Pattern Metrics (Across Sessions)
- Recurring pattern count: [decreasing trend = good]
- Definition update count: [applied improvements]
- Quality improvement: [metrics increasing over time]
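The trend checks above can be sketched with one assumed rule (compare the latest session's value against the average of earlier sessions):

```javascript
// Classify a metric's direction across sessions.
function trend(history) {
  if (history.length < 2) return "insufficient data";
  const prior = history.slice(0, -1);
  const avg = prior.reduce((sum, v) => sum + v, 0) / prior.length;
  const latest = history[history.length - 1];
  return latest > avg ? "improving" : latest < avg ? "declining" : "flat";
}

console.log(trend([70, 85, 92])); // improving — e.g. graph quality rising
console.log(trend([3, 2, 1]));    // declining — e.g. recurring patterns dropping
```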
## Integration with QA Skill
```javascript
// At end of session
orchestration-qa(
phase="summary",
sessionId=currentSession
)
// Returns:
{
workflowsAnalyzed: 8,
deviationsSummary: { ALERT: 2, WARN: 5, INFO: 3 },
recurringPatterns: 2,
improvements: [
{ priority: 1, file: "planning-specialist.md", impact: "high" },
{ priority: 2, file: "CLAUDE.md", impact: "high" }
],
qualityTrends: {
graphQuality: "92%",
tagCoverage: "98%",
tokenEfficiency: "87%"
}
}
```
## When to Report
- **After deviations**: Track pattern occurrence
- **End of session**: Generate summary if patterns detected
- **User request**: "Show QA summary", "Any improvements?"
## Output Size
- Pattern tracking: ~100-200 tokens per pattern
- Session summary: ~400-800 tokens total
- Improvement suggestions: ~200-400 tokens per suggestion
# Post-Execution Review
**Purpose**: Validate that Skills and Subagents followed their documented workflows and produced expected outputs.
**When**: After any Skill or Subagent completes (phase="post")
**Token Cost**: ~600-800 tokens (basic), ~1500-2000 tokens (with specialized analysis)
## Core Review Workflow
### Step 1: Load Entity Definition
**Action**: Read the definition from knowledge base loaded during initialization.
```javascript
// For Skills
if (category == "SKILL") {
definition = skills[entityType]
// Contains: mandatoryTriggers, workflows, expectedOutputs, tools, tokenRange
}
// For Subagents
if (category == "SUBAGENT") {
definition = subagents[entityType]
// Contains: expectedSteps, criticalPatterns, outputValidation, tokenRange
}
```
### Step 2: Retrieve Pre-Execution Context
**Action**: Load stored context from pre-execution phase.
```javascript
context = session.contexts[workflowId]
// Contains: userInput, checkpoints, expected, featureRequirements, etc.
```
### Step 3: Verify Workflow Adherence
**Check that entity followed its documented workflow steps.**
#### For Skills
```javascript
workflowCheck = {
expectedWorkflows: definition.workflows,
actualExecution: analyzeOutput(entityOutput),
stepsFollowed: 0,
stepsExpected: definition.workflows.length,
deviations: []
}
// Example for Feature Orchestration Skill
expectedWorkflows = [
"Assess complexity (Simple vs Complex)",
"Discover templates via query_templates",
"Create feature directly (Simple) OR Launch Feature Architect (Complex)",
"Return feature ID and next action"
]
// Verify each workflow step
for (const step of expectedWorkflows) {
  if (evidenceOfStep(entityOutput, step)) {
    workflowCheck.stepsFollowed++
  } else {
    workflowCheck.deviations.push({
      step: step,
      issue: "No evidence of this step in output",
      severity: "WARN"
    })
  }
}
```
#### For Subagents
```javascript
stepCheck = {
expectedSteps: definition.expectedSteps, // e.g., 8 steps for Feature Architect
actualSteps: extractSteps(entityOutput),
stepsFollowed: 0,
stepsExpected: definition.expectedSteps.length,
deviations: []
}
// Example for Feature Architect (8 expected steps)
expectedSteps = [
"Step 1: get_overview + list_tags",
"Step 2: Detect input type (PRD/Interactive/Quick)",
"Step 4: query_templates",
"Step 5: Tag strategy (reuse existing tags)",
"Step 5.5: Check agent-mapping.yaml",
"Step 6: Create feature",
"Step 7: Add custom sections (if Detailed/PRD)",
"Step 8: Return minimal handoff"
]
// Verify tool usage as evidence of steps
for (const step of expectedSteps) {
  if (evidenceOfStep(entityOutput, step)) {
    stepCheck.stepsFollowed++
  } else {
    stepCheck.deviations.push({
      step: step,
      issue: "Step not completed or no evidence",
      severity: determineStepSeverity(step)
    })
  }
}
```
**Evidence Detection**:
```javascript
function evidenceOfStep(output, step) {
// Check for tool calls mentioned
if (step.includes("query_templates") && mentions(output, "template")) return true
if (step.includes("list_tags") && mentions(output, "tags")) return true
// Check for workflow markers
if (step.includes("Create feature") && mentions(output, "feature created")) return true
// Check for explicit mentions
if (contains(output, step.toLowerCase())) return true
return false
}
```
### Step 4: Validate Critical Patterns
**Check entity followed critical patterns from its definition.**
```javascript
patternCheck = {
criticalPatterns: definition.criticalPatterns,
violations: []
}
// Example for Feature Architect
criticalPatterns = [
"description = forward-looking (what needs to be built)",
"Do NOT populate summary field during creation",
"Return minimal handoff (50-100 tokens)",
"PRD mode: Extract ALL sections from document",
"Tag strategy: Reuse existing tags",
"Check agent-mapping.yaml for new tags"
]
for (const pattern of criticalPatterns) {
  violation = checkPattern(pattern, entityOutput, context)
  if (violation) {
    patternCheck.violations.push(violation)
  }
}
```
**Pattern Checking Examples**:
```javascript
// Pattern: "Do NOT populate summary field"
function checkSummaryField(output, entityId) {
entity = query_container(operation="get", containerType="feature", id=entityId)
if (entity.summary && entity.summary.length > 0) {
return {
pattern: "Do NOT populate summary field during creation",
violation: "Summary field populated",
severity: "WARN",
found: `Summary: "${entity.summary}" (${entity.summary.length} chars)`,
expected: "Summary should be empty until completion"
}
}
return null
}
// Pattern: "Return minimal handoff (50-100 tokens)"
function checkHandoffSize(output) {
tokenCount = estimateTokens(output)
if (tokenCount > 200) {
return {
pattern: "Return minimal handoff (50-100 tokens)",
violation: "Verbose handoff",
severity: "WARN",
found: `${tokenCount} tokens`,
expected: "50-100 tokens (brief summary)",
suggestion: "Detailed work should go in feature sections, not response"
}
}
return null
}
// Pattern: "PRD mode: Extract ALL sections"
function checkPRDExtraction(output, context) {
if (context.userInput.inputType != "PRD") return null
// Compare PRD sections vs feature sections
prdSections = context.prdSections // Captured in pre-execution
feature = query_container(operation="get", containerType="feature", id=entityId, includeSections=true)
featureSections = feature.sections
missingSections = []
  for (const prdSection of prdSections) {
    if (!hasMatchingSection(featureSections, prdSection)) {
      missingSections.push(prdSection)
    }
  }
if (missingSections.length > 0) {
return {
pattern: "PRD mode: Extract ALL sections from document",
violation: "PRD sections incomplete",
severity: "ALERT",
found: `Feature has ${featureSections.length} sections`,
expected: `PRD has ${prdSections.length} sections`,
missing: missingSections,
suggestion: "Add missing sections to feature"
}
}
return null
}
```
### Step 5: Verify Expected Outputs
**Check entity produced expected outputs from its definition.**
```javascript
outputCheck = {
expectedOutputs: definition.outputValidation || definition.expectedOutputs,
actualOutputs: analyzeOutputs(entityOutput, entityId),
present: [],
missing: []
}
// Example for Planning Specialist
expectedOutputs = [
"Tasks created with descriptions?",
"Domain isolation preserved?",
"Dependencies mapped correctly?",
"Documentation task included (if user-facing)?",
"Testing task included (if needed)?",
"No circular dependencies?",
"Templates applied to tasks?"
]
for (const expectedOutput of expectedOutputs) {
  if (verifyOutput(expectedOutput, entityId, context)) {
    outputCheck.present.push(expectedOutput)
  } else {
    outputCheck.missing.push({
      output: expectedOutput,
      severity: determineSeverity(expectedOutput),
      impact: describeImpact(expectedOutput)
    })
  }
}
```
### Step 6: Validate Against Checkpoints
**Compare execution against checkpoints set in pre-execution.**
```javascript
checkpointResults = {
total: context.checkpoints.length,
passed: 0,
failed: []
}
for (const checkpoint of context.checkpoints) {
  result = verifyCheckpoint(checkpoint, entityOutput, entityId, context)
  if (result.passed) {
    checkpointResults.passed++
  } else {
    checkpointResults.failed.push({
      checkpoint: checkpoint,
      reason: result.reason,
      severity: result.severity
    })
  }
}
```
**Checkpoint Verification Examples**:
```javascript
// Checkpoint: "Verify templates discovered via query_templates"
function verifyTemplatesDiscovered(output) {
if (mentions(output, "template") || mentions(output, "query_templates")) {
return { passed: true }
}
return {
passed: false,
reason: "No evidence of template discovery (query_templates not called)",
severity: "WARN"
}
}
// Checkpoint: "Verify domain isolation (one task = one specialist)"
function verifyDomainIsolation(featureId) {
tasks = query_container(operation="overview", containerType="feature", id=featureId).tasks
violations = []
  for (const task of tasks) {
    domains = detectDomains(task.title + " " + task.description)
    if (domains.length > 1) {
      violations.push({
        task: task.title,
        domains: domains,
        issue: "Task spans multiple specialist domains"
      })
    }
  }
if (violations.length > 0) {
return {
passed: false,
reason: `${violations.length} cross-domain tasks detected`,
severity: "ALERT",
details: violations
}
}
return { passed: true }
}
```
### Step 7: Check Token Range
**Verify entity stayed within expected token range.**
```javascript
tokenCheck = {
actual: estimateTokens(entityOutput),
expected: definition.tokenRange,
withinRange: false,
deviation: 0
}
tokenCheck.withinRange = (
tokenCheck.actual >= tokenCheck.expected[0] &&
tokenCheck.actual <= tokenCheck.expected[1]
)
if (!tokenCheck.withinRange) {
tokenCheck.deviation = tokenCheck.actual > tokenCheck.expected[1]
? tokenCheck.actual - tokenCheck.expected[1]
: tokenCheck.expected[0] - tokenCheck.actual
if (tokenCheck.deviation > tokenCheck.expected[1] * 0.5) {
// More than 50% over expected range
severity = "WARN"
} else {
severity = "INFO"
}
}
```
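`estimateTokens` is used throughout this skill but never defined. A common rough heuristic (an assumption, not the real tokenizer) is about four characters per token for English text:

```javascript
// Crude token estimate: ~4 characters per token for English text.
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

const handoff = "Feature FEAT-42 created. Next: launch Planning Specialist.";
console.log(estimateTokens(handoff), "tokens (rough estimate)");
```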
### Step 8: Compare Against Original User Input
**For Subagents: Verify original user requirements preserved.**
```javascript
if (category == "SUBAGENT") {
requirementsCheck = compareToOriginal(
userInput: context.userInput,
output: entityOutput,
entityId: entityId
)
}
```
**Comparison Logic**:
```javascript
function compareToOriginal(userInput, output, entityId) {
// For Feature Architect: Check core concepts preserved
if (entityType == "feature-architect") {
feature = query_container(operation="get", containerType="feature", id=entityId, includeSections=true)
originalConcepts = extractConcepts(userInput.fullText)
featureConcepts = extractConcepts(feature.description + " " + sectionsToText(feature.sections))
missingConcepts = originalConcepts.filter(c => !featureConcepts.includes(c))
if (missingConcepts.length > 0) {
return {
preserved: false,
severity: "ALERT",
missing: missingConcepts,
suggestion: "Add missing concepts to feature description or sections"
}
}
}
// For Planning Specialist: Check all feature requirements covered
if (entityType == "planning-specialist") {
requirements = extractRequirements(context.featureRequirements.description)
    tasks = query_container(operation="overview", containerType="feature", id=entityId).tasks
    uncoveredRequirements = []
    for (const req of requirements) {
      if (!anyTaskCovers(tasks, req)) {
        uncoveredRequirements.push(req)
      }
    }
if (uncoveredRequirements.length > 0) {
return {
preserved: false,
severity: "WARN",
uncovered: uncoveredRequirements,
suggestion: "Create additional tasks to cover all requirements"
}
}
}
return { preserved: true }
}
```
### Step 9: Determine Specialized Analysis Needed
**Based on entity type, decide which specialized analysis to run.**
```javascript
specializedAnalyses = []
// Planning Specialist → Graph + Tag analysis
if (entityType == "planning-specialist") {
specializedAnalyses.push("graph-quality")
specializedAnalyses.push("tag-quality")
}
// All entities → Routing validation
specializedAnalyses.push("routing-validation")
// If efficiency analysis enabled
if (params.enableEfficiencyAnalysis) {
specializedAnalyses.push("token-optimization")
specializedAnalyses.push("tool-selection")
specializedAnalyses.push("parallel-detection")
}
// Load and run each specialized analysis
for (const analysis of specializedAnalyses) {
  Read(`.claude/skills/orchestration-qa/${analysis}.md`)
  runAnalysis(analysis, entityType, entityOutput, entityId, context)
}
```
### Step 10: Aggregate Results
**Combine all validation results.**
```javascript
results = {
entity: entityType,
category: category,
workflowAdherence: `${workflowCheck.stepsFollowed}/${workflowCheck.stepsExpected} steps (${percentage}%)`,
expectedOutputs: `${outputCheck.present.length}/${outputCheck.expectedOutputs.length} present`,
checkpoints: `${checkpointResults.passed}/${checkpointResults.total} passed`,
criticalPatternViolations: patternCheck.violations.filter(v => v.severity == "ALERT"),
processIssues: patternCheck.violations.filter(v => v.severity == "WARN"),
tokenUsage: {
actual: tokenCheck.actual,
expected: tokenCheck.expected,
withinRange: tokenCheck.withinRange,
deviation: tokenCheck.deviation
},
requirementsPreserved: requirementsCheck?.preserved ?? true,
deviations: aggregateDeviations(
workflowCheck.deviations,
patternCheck.violations,
outputCheck.missing,
checkpointResults.failed
),
specializedAnalyses: specializedAnalysisResults
}
```
### Step 11: Categorize Deviations by Severity
```javascript
deviationsSummary = {
ALERT: results.deviations.filter(d => d.severity == "ALERT"),
WARN: results.deviations.filter(d => d.severity == "WARN"),
INFO: results.deviations.filter(d => d.severity == "INFO")
}
```
**Severity Determination**:
- **ALERT**: Critical violations that affect functionality or correctness
- Missing requirements from user input
- Cross-domain tasks (violates domain isolation)
- Status change without Status Progression Skill
- Circular dependencies
- PRD sections not extracted
- **WARN**: Process issues that should be addressed
- Workflow steps skipped (non-critical)
- Output too verbose
- Templates not applied when available
- Tags don't follow conventions
- No Files Changed section
- **INFO**: Observations and opportunities
- Token usage outside expected range (but reasonable)
- Efficiency opportunities identified
- Quality patterns observed
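The taxonomy above could be expressed as a simple lookup (the issue-type keys here are illustrative assumptions, not fixed identifiers in this skill):

```javascript
// Severity lookup mirroring the ALERT/WARN/INFO taxonomy above.
const SEVERITY_MAP = {
  "missing-requirement": "ALERT",
  "cross-domain-task": "ALERT",
  "status-skill-bypass": "ALERT",
  "circular-dependency": "ALERT",
  "skipped-step": "WARN",
  "verbose-output": "WARN",
  "template-not-applied": "WARN",
  "token-range-deviation": "INFO",
};

function determineSeverity(issueType) {
  return SEVERITY_MAP[issueType] || "INFO"; // unknown issues default to lowest level
}

console.log(determineSeverity("cross-domain-task")); // ALERT
console.log(determineSeverity("novel-observation")); // INFO
```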
### Step 12: Return Results
If deviations found, prepare for reporting:
```javascript
if (deviationsSummary.ALERT.length > 0 || deviationsSummary.WARN.length > 0) {
// Read deviation-templates.md for formatting
Read `.claude/skills/orchestration-qa/deviation-templates.md`
// Format report based on severity
report = formatDeviationReport(results, deviationsSummary)
// Add to TodoWrite
addToTodoWrite(deviationsSummary)
// Return report
return report
}
```
If no issues:
```javascript
return {
success: true,
message: `✅ QA Review: ${entityType} - All checks passed`,
workflowAdherence: results.workflowAdherence,
summary: "No deviations detected"
}
```
## Entity-Specific Notes
### Skills Review
- Focus on workflow steps and tool usage
- Verify token efficiency (Skills should be lightweight)
- Check for proper error handling
### Subagents Review
- Focus on step-by-step process adherence
- Verify critical patterns followed
- Compare output vs original user input (requirement preservation)
- Check output brevity (specialists should return minimal summaries)
### Status Progression Skill (Critical)
**Special validation** - this is the most critical Skill to validate:
```javascript
if (entityType == "status-progression") {
// CRITICAL: Was it actually used?
if (statusChangedWithoutSkill) {
return {
severity: "CRITICAL",
violation: "Status change bypassed mandatory Status Progression Skill",
impact: "Prerequisite validation may have been skipped",
action: "IMMEDIATE ALERT to user"
}
}
// Verify it read config.yaml
if (!mentions(output, "config")) {
deviations.push({
severity: "WARN",
issue: "Status Progression Skill didn't mention config",
expected: "Should read config.yaml for workflow validation"
})
}
// Verify it validated prerequisites
  if (validationFailed && !mentions(output, "prerequisite") && !mentions(output, "blocker")) {
deviations.push({
severity: "WARN",
issue: "Validation failure without detailed prerequisites",
expected: "Should explain what prerequisites are blocking"
})
}
}
```
## Output Structure
```javascript
{
entity: "planning-specialist",
category: "SUBAGENT",
workflowAdherence: "8/8 steps (100%)",
expectedOutputs: "7/7 present",
checkpoints: "10/10 passed",
tokenUsage: {
actual: 1950,
expected: [1800, 2200],
withinRange: true
},
deviations: [],
specializedAnalyses: {
graphQuality: { score: 95, issues: [] },
tagQuality: { score: 100, issues: [] }
},
success: true,
message: "✅ All quality checks passed"
}
```
# Pre-Execution Validation
**Purpose**: Capture context and set validation checkpoints before launching any Skill or Subagent.
**When**: Before any Skill or Subagent is launched (phase="pre")
**Token Cost**: ~400-600 tokens
## Validation Workflow
### Step 1: Capture Original User Input
**Critical**: Store the user's complete original request for post-execution comparison.
```javascript
context = {
userInput: {
fullText: userMessage,
timestamp: now(),
inputType: detectInputType(userMessage) // PRD / Detailed / Quick / Command
}
}
```
**Input Type Detection**:
```javascript
function detectInputType(message) {
// PRD: Formal document with multiple sections
if (message.includes("# ") && message.length > 500 && hasSections(message)) {
return "PRD"
}
// Detailed: Rich context, multiple paragraphs, requirements
if (message.length > 200 && paragraphCount(message) >= 3) {
return "Detailed"
}
// Quick: Short request, minimal context
if (message.length < 100) {
return "Quick"
}
// Command: Direct instruction
return "Command"
}
```
### Step 2: Identify Entity Type
**Determine what's being launched** (Skill vs Subagent):
```javascript
entityType = identifyEntity(userMessage, context)
// Skills
if (matches(userMessage, skills[].mandatoryTriggers)) {
entityType = matchedSkill // "feature-orchestration", "task-orchestration", etc.
category = "SKILL"
}
// Subagents
if (orchestrator decides to launch subagent) {
entityType = subagentName // "feature-architect", "planning-specialist", etc.
category = "SUBAGENT"
}
```
### Step 3: Set Entity-Specific Validation Checkpoints
**Create checklist to verify after execution completes.**
#### Feature Orchestration Skill
```javascript
checkpoints = [
"Verify Skill assessed complexity correctly",
"Verify Skill created feature OR launched Feature Architect",
"Verify templates discovered via query_templates",
"Verify output in token range (300-800 tokens)"
]
context.expected = {
complexity: detectExpectedComplexity(userInput),
mode: "simple", // or "complex"
tools: ["query_templates", "manage_container"], // complex mode: Task(Feature Architect)
tokenRange: [300, 800]
}
```
**Complexity Detection**:
```javascript
function detectExpectedComplexity(input) {
// Simple indicators
if (input.length < 150 && paragraphs < 2) return "simple"
// Complex indicators
if (input.inputType == "PRD") return "complex"
if (input.length > 200) return "complex"
if (mentions(input, ["multiple", "integration", "system"])) return "complex"
return "simple"
}
```
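The `mentions` helper and the paragraph count are assumed throughout this skill's pseudocode. A runnable sketch, assuming `input` is the `context.userInput` object captured in Step 1 (so the raw text lives in `input.fullText`):

```javascript
// Assumed helper used throughout: case-insensitive keyword check
function mentions(text, keywords) {
  const lower = text.toLowerCase();
  return keywords.some(k => lower.includes(k.toLowerCase()));
}

// input = { fullText, inputType } as captured in Step 1
function detectExpectedComplexity(input) {
  const paragraphs = input.fullText.split(/\n\s*\n/).filter(p => p.trim()).length;
  // Simple indicators
  if (input.fullText.length < 150 && paragraphs < 2) return "simple";
  // Complex indicators
  if (input.inputType === "PRD") return "complex";
  if (input.fullText.length > 200) return "complex";
  if (mentions(input.fullText, ["multiple", "integration", "system"])) return "complex";
  return "simple";
}
```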
#### Task Orchestration Skill
```javascript
checkpoints = [
"Verify Skill analyzed dependencies via query_dependencies",
"Verify Skill identified parallel opportunities",
"Verify Skill used recommend_agent for routing",
"Verify Skill returned batch structure",
"Verify output in token range (500-900 tokens)"
]
// Get current feature state for comparison
if (featureId) {
context.featureState = {
totalTasks: query_container(containerType="feature", id=featureId).taskCounts.total,
pendingTasks: query_container(containerType="feature", id=featureId, status="pending").length,
dependencies: pendingTasks.map(t => query_dependencies(t.id)) // all pending tasks
}
}
```
#### Status Progression Skill
```javascript
checkpoints = [
"Verify Skill read config.yaml",
"Verify Skill validated prerequisites",
"Verify Skill returned clear result or error",
"Verify output in token range (200-400 tokens)"
]
// CRITICAL CHECK: Was Status Progression Skill actually used?
context.criticalValidation = {
mustUseSkill: true,
violationSeverity: "CRITICAL",
reason: "Status changes MUST use Status Progression Skill for prerequisite validation"
}
// Get current entity state for prerequisite checking
context.entityState = {
currentStatus: entity.status,
summary: entity.summary,
dependencies: isTask ? query_dependencies(taskId) : null,
tasks: isFeature ? query_container(featureId).tasks : null
}
```
#### Feature Architect Subagent
```javascript
checkpoints = [
"Compare Feature Architect output vs original user input",
"Verify mode detection (PRD/Interactive/Quick)",
"Verify all PRD sections extracted (if PRD mode)",
"Verify core concepts preserved",
"Verify templates applied",
"Verify tags follow project conventions",
"Verify agent-mapping.yaml checked (for new tags)",
"Verify handoff minimal (50-100 tokens)"
]
// PRD Mode: Extract sections from user input
if (context.userInput.inputType == "PRD") {
context.prdSections = extractSections(userInput)
// Example: ["Business Context", "User Stories", "Technical Specs", "Requirements"]
checkpoints.push(
"Verify all PRD sections have corresponding feature sections"
)
}
context.expected = {
mode: context.userInput.inputType,
descriptionLength: context.userInput.inputType == "PRD" ? [500, 1000] : [200, 500],
sectionsExpected: context.prdSections?.length || 0,
handoffTokens: [50, 100],
tokenRange: [1800, 2200]
}
```
#### Planning Specialist Subagent
```javascript
// First, read the created feature
feature = query_container(operation="get", containerType="feature", id=featureId, includeSections=true)
// Store feature requirements for comparison
context.featureRequirements = {
description: feature.description,
sections: feature.sections,
isUserFacing: detectUserFacing(feature),
requiresMultipleDomains: detectDomains(feature)
}
checkpoints = [
"Verify domain isolation (one task = one specialist)",
"Verify dependencies mapped (Database → Backend → Frontend)",
"Verify documentation task created (if user-facing)",
"Verify testing task created (if needed)",
"Verify all feature requirements covered by tasks",
"Verify no cross-domain tasks",
"Verify no circular dependencies",
"Verify task descriptions populated (200-600 chars)",
"Verify templates applied to tasks",
"Verify output in token range (1800-2200 tokens)"
]
context.expected = {
needsDocumentation: context.featureRequirements.isUserFacing,
needsTesting: detectTestingNeeded(feature),
domainCount: context.featureRequirements.requiresMultipleDomains ? 3 : 1,
tokenRange: [1800, 2200]
}
```
**Domain Detection**:
```javascript
function detectDomains(feature) {
domains = []
if (mentions(feature.description, ["database", "schema", "migration"])) {
domains.push("database")
}
if (mentions(feature.description, ["api", "service", "endpoint", "backend"])) {
domains.push("backend")
}
if (mentions(feature.description, ["ui", "component", "page", "frontend"])) {
domains.push("frontend")
}
return domains.length
}
```
#### Implementation Specialist Subagents
**Applies to**: Backend Engineer, Frontend Developer, Database Engineer, Test Engineer, Technical Writer
```javascript
// Read task context
task = query_container(operation="get", containerType="task", id=taskId, includeSections=true)
context.taskRequirements = {
description: task.description,
sections: task.sections,
hasDependencies: query_dependencies(taskId).incoming.length > 0,
complexity: task.complexity
}
checkpoints = [
"Verify specialist completed task lifecycle",
"Verify tests run and passing (if code task)",
"Verify summary populated (300-500 chars)",
"Verify Files Changed section created (ordinal 999)",
"Verify used Status Progression Skill to mark complete",
"Verify output minimal (50-100 tokens)",
"If blocked: Verify clear reason + attempted fixes"
]
context.expected = {
summaryLength: [300, 500],
hasFilesChanged: true,
statusChanged: true,
tokenRange: [1800, 2200],
outputTokens: [50, 100]
}
// Verify recommend_agent was used
context.routingValidation = {
shouldUseRecommendAgent: true,
matchesTags: checkTagMatch(task.tags, specialistName)
}
```
### Step 4: Verify Routing Decision
**Check that the orchestrator made the correct routing choice.**
```javascript
routingCheck = {
userRequest: userMessage,
detectedIntent: detectIntent(userMessage),
orchestratorChoice: entityType,
correctChoice: validateRouting(detectedIntent, entityType)
}
// Intent detection
function detectIntent(message) {
// Coordination triggers → MUST use Skills
coordinationTriggers = [
"mark complete", "update status", "create feature",
"execute tasks", "what's next", "check blockers", "complete feature"
]
// Implementation triggers → Should ask user (Direct vs Specialist)
implementationTriggers = [
"implement", "write code", "create API", "build",
"add tests", "fix bug", "database schema", "frontend component"
]
if (matches(message, coordinationTriggers)) return "COORDINATION"
if (matches(message, implementationTriggers)) return "IMPLEMENTATION"
return "UNKNOWN"
}
// Routing validation
function validateRouting(intent, choice) {
if (intent == "COORDINATION" && !isSkill(choice)) {
return {
valid: false,
severity: "CRITICAL",
violation: "Coordination request must use Skill, not direct tools or subagent",
expected: "Use appropriate Skill (Feature Orchestration, Task Orchestration, Status Progression)"
}
}
if (intent == "IMPLEMENTATION" && !askedUser) {
return {
valid: false,
severity: "WARN",
violation: "Implementation request should ask user (Direct vs Specialist)",
expected: "Ask user preference before proceeding"
}
}
return { valid: true }
}
```
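A runnable sketch of the intent side of this check. `matches` is an assumed helper doing case-insensitive substring matching; note that plain substring matching is deliberately strict (for example, "Mark task T1 complete" would not match the trigger "mark complete"), so a real implementation would likely want fuzzier phrase matching:

```javascript
// Assumed helper: case-insensitive substring match against trigger phrases
function matches(message, triggers) {
  const lower = message.toLowerCase();
  return triggers.some(t => lower.includes(t));
}

const coordinationTriggers = [
  "mark complete", "update status", "create feature",
  "execute tasks", "what's next", "check blockers", "complete feature"
];
const implementationTriggers = [
  "implement", "write code", "create api", "build",
  "add tests", "fix bug", "database schema", "frontend component"
];

function detectIntent(message) {
  if (matches(message, coordinationTriggers)) return "COORDINATION";
  if (matches(message, implementationTriggers)) return "IMPLEMENTATION";
  return "UNKNOWN";
}
```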
**Special Case: Status Changes**
```javascript
// Status changes are ALWAYS coordination → MUST use Status Progression Skill
if (userMessage.includes("complete") || userMessage.includes("status")) {
if (choice != "status-progression") {
return {
valid: false,
severity: "CRITICAL",
violation: "Status change MUST use Status Progression Skill",
reason: "Prerequisite validation required (summary length, dependencies, task counts)",
expected: "Use Status Progression Skill for ALL status changes"
}
}
}
```
### Step 5: Store Context for Post-Execution
**Save all captured information for comparison after execution.**
```javascript
session.contexts[workflowId] = {
timestamp: now(),
userInput: context.userInput,
entityType: entityType,
category: category, // "SKILL" or "SUBAGENT"
checkpoints: checkpoints,
expected: context.expected,
featureRequirements: context.featureRequirements, // if Planning Specialist
taskRequirements: context.taskRequirements, // if Implementation Specialist
routingValidation: routingCheck,
criticalValidation: context.criticalValidation // if Status Progression
}
```
### Step 6: Return Ready Signal
```javascript
return {
ready: true,
contextCaptured: true,
entityType: entityType,
category: category,
checkpoints: checkpoints.length,
routingValid: routingCheck.valid,
warnings: routingCheck.valid ? [] : [routingCheck.violation]
}
```
**If routing violation detected**, alert immediately:
```javascript
if (!routingCheck.valid && routingCheck.severity == "CRITICAL") {
return {
ready: false,
violation: {
severity: "CRITICAL",
type: "Routing Violation",
message: routingCheck.violation,
expected: routingCheck.expected,
action: "STOP - Do not proceed until corrected"
}
}
}
```
## Routing Violation Examples
### CRITICAL: Status Change Without Status Progression Skill
```javascript
User: "Mark task T1 complete"
Orchestrator: [Calls manage_container directly]
// Pre-execution validation detects:
{
violation: "CRITICAL",
type: "Status change bypassed mandatory Status Progression Skill",
expected: "Use Status Progression Skill for status changes",
reason: "Prerequisite validation required (summary 300-500 chars, dependencies completed)",
action: "STOP - Use Status Progression Skill instead"
}
// Alert user immediately, do NOT proceed
```
### CRITICAL: Feature Creation Without Feature Orchestration Skill
```javascript
User: "Create a user authentication feature"
Orchestrator: [Calls manage_container directly]
// Pre-execution validation detects:
{
violation: "CRITICAL",
type: "Feature creation bypassed mandatory Feature Orchestration Skill",
expected: "Use Feature Orchestration Skill for feature creation",
reason: "Complexity assessment and template discovery required",
action: "STOP - Use Feature Orchestration Skill instead"
}
```
### WARN: Implementation Without Asking User
```javascript
User: "Implement login API"
Orchestrator: [Works directly without asking preference]
// Pre-execution validation detects:
{
violation: "WARN",
type: "Implementation without user preference",
expected: "Ask user: Direct vs Specialist?",
reason: "User should choose approach",
action: "Log to TodoWrite, suggest asking user"
}
// Log but don't block
```
## Output Example
```javascript
// Successful pre-execution
{
ready: true,
contextCaptured: true,
entityType: "planning-specialist",
category: "SUBAGENT",
checkpoints: 10,
routingValid: true,
expected: {
mode: "Detailed",
needsDocumentation: true,
domainCount: 3,
tokenRange: [1800, 2200]
},
message: "✅ Ready to launch Planning Specialist - 10 checkpoints set"
}
```
## Integration Example
```javascript
// Before launching Planning Specialist
orchestration-qa(
phase="pre",
entityType="planning-specialist",
userInput="Create user authentication feature with OAuth2, JWT tokens, role-based access"
)
// Returns context captured, checkpoints set
// Orchestrator proceeds with launch
```

View File

@@ -0,0 +1,282 @@
# Routing Validation
**Purpose**: Detect violations of mandatory Skill usage patterns (Skills vs Direct tools vs Subagents).
**When**: After ANY workflow completes
**Applies To**: All Skills and Subagents
**Token Cost**: ~300-500 tokens
## Critical Routing Rules
### Rule 1: Status Changes MUST Use Status Progression Skill
**Violation**: Calling `manage_container(operation="setStatus")` directly
**Expected**: Use Status Progression Skill for ALL status changes
**Why Critical**: Prerequisite validation (summary length, dependencies, task completion) required
**Detection**:
```javascript
if (statusChanged && !usedStatusProgressionSkill) {
return {
severity: "CRITICAL",
violation: "Status change bypassed mandatory Status Progression Skill",
impact: "Prerequisites may not have been validated",
expected: "Use Status Progression Skill for status changes"
}
}
```
### Rule 2: Feature Creation MUST Use Feature Orchestration Skill
**Violation**: Calling `manage_container(operation="create", containerType="feature")` directly
**Expected**: Use Feature Orchestration Skill for feature creation
**Why Critical**: Complexity assessment and template discovery required
**Detection**:
```javascript
if (featureCreated && !usedFeatureOrchestrationSkill) {
return {
severity: "CRITICAL",
violation: "Feature creation bypassed mandatory Feature Orchestration Skill",
impact: "Complexity not assessed, templates may be missed",
expected: "Use Feature Orchestration Skill for feature creation"
}
}
```
### Rule 3: Task Execution SHOULD Use Task Orchestration Skill
**Violation**: Launching specialists directly without checking dependencies/parallel opportunities
**Expected**: Use Task Orchestration Skill for batch execution
**Why Important**: Dependency analysis and parallelization optimization
**Detection**:
```javascript
if (multipleTasksLaunched && !usedTaskOrchestrationSkill) {
return {
severity: "WARN",
violation: "Multiple tasks launched without Task Orchestration Skill",
impact: "May miss parallel opportunities or dependency conflicts",
expected: "Use Task Orchestration Skill for batch execution"
}
}
```
### Rule 4: Implementation Specialists MUST Use Status Progression for Completion
**Violation**: Specialist calls `manage_container(operation="setStatus")` directly
**Expected**: Specialist uses Status Progression Skill to mark complete
**Why Critical**: Prerequisite validation (summary, Files Changed section, tests)
**Detection**:
```javascript
if (implementationSpecialist && taskCompleted && !usedStatusProgressionSkill) {
return {
severity: "CRITICAL",
violation: `${specialistName} marked task complete without Status Progression Skill`,
impact: "Summary/Files Changed/test validation may have been skipped",
expected: "Use Status Progression Skill in Step 8 of specialist lifecycle"
}
}
```
## Validation Workflow
### Step 1: Identify Workflow Type
```javascript
workflowType = identifyWorkflow(entityType, userInput, output)
// Returns: "status-change", "feature-creation", "task-execution", "implementation"
```
### Step 2: Check Mandatory Skill Usage
```javascript
mandatorySkills = {
"status-change": "status-progression",
"feature-creation": "feature-orchestration",
"task-execution": "task-orchestration", // WARN level
"feature-completion": "feature-orchestration"
}
requiredSkill = mandatorySkills[workflowType]
```
### Step 3: Detect Skill Bypass
```javascript
// Check if required Skill was used
skillUsed = checkSkillUsage(output, requiredSkill)
if (!skillUsed && requiredSkill) {
severity = (workflowType == "task-execution") ? "WARN" : "CRITICAL"
violation = {
workflowType: workflowType,
requiredSkill: requiredSkill,
actualApproach: detectActualApproach(output),
severity: severity,
impact: describeImpact(requiredSkill)
}
}
```
### Step 4: Verify Specialist Lifecycle Adherence
For Implementation Specialists (Backend, Frontend, Database, Test, Technical Writer):
```javascript
if (category == "SUBAGENT" && isImplementationSpecialist(entityType)) {
lifecycle = {
step8Expected: "Use Status Progression Skill to mark complete",
step8Actual: detectStep8Approach(output),
compliant: false
}
// Check if Status Progression Skill was mentioned
if (mentions(output, "Status Progression") || mentions(output, "status-progression")) {
lifecycle.compliant = true
} else if (taskStatusChanged) {
violation = {
severity: "CRITICAL",
specialist: entityType,
step: "Step 8",
issue: "Marked task complete without Status Progression Skill",
impact: "Prerequisite validation (summary, Files Changed, tests) may be incomplete",
expected: "Use Status Progression Skill for completion"
}
}
}
```
## Violation Severity Levels
### CRITICAL (Immediate Alert)
- Status change without Status Progression Skill
- Feature creation without Feature Orchestration Skill
- Implementation specialist completion without Status Progression Skill
- **Action**: Report immediately, add to TodoWrite, suggest correction
### WARN (Log for Review)
- Task execution without Task Orchestration Skill (multiple tasks)
- Efficiency opportunities missed (parallelization)
- **Action**: Log to TodoWrite, mention in end-of-session summary
### INFO (Observation)
- Workflow variations that are acceptable
- Optimization suggestions
- **Action**: Track for pattern analysis only
## Report Template
```markdown
## 🚨 Routing Violation Detected
**Severity**: CRITICAL
**Workflow Type**: [status-change / feature-creation / etc.]
**Violation**: [Description]
**Impact**: [What this affects]
**Expected Approach**: Use [Skill Name] Skill
**Actual Approach**: Direct tool call / Subagent / etc.
**Recommendation**: [How to correct]
---
**Added to TodoWrite**:
- Review [Workflow]: [Issue description]
**Decision Required**: Should orchestrator retry using correct Skill?
```
## Common Violations
### Violation 1: Direct Status Change
```javascript
User: "Mark task T1 complete"
Orchestrator: manage_container(operation="setStatus", status="completed") // ❌
Expected: Use Status Progression Skill
Reason: Summary validation, dependency checks required
```
### Violation 2: Direct Feature Creation
```javascript
User: "Create user authentication feature"
Orchestrator: manage_container(operation="create", containerType="feature") // ❌
Expected: Use Feature Orchestration Skill
Reason: Complexity assessment, template discovery required
```
### Violation 3: Specialist Bypass
```javascript
Backend Engineer: manage_container(operation="setStatus", status="completed") // ❌
Expected: Use Status Progression Skill in Step 8
Reason: Summary, Files Changed, test validation required
```
## Integration with Post-Execution Review
```javascript
// ALWAYS run routing validation in post-execution
Read "routing-validation.md"
violations = detectRoutingViolations(
workflowType,
entityType,
entityOutput,
context
)
if (violations.length > 0) {
  for (const violation of violations) {
    if (violation.severity == "CRITICAL") {
      // Report immediately
      alertUser(violation)
      addToTodoWrite(violation)
    } else {
      // Log for summary
      logViolation(violation)
    }
  }
}
```
## Continuous Improvement
### Pattern Tracking
If same violation occurs 2+ times in session:
- Update orchestrator instructions
- Add validation checkpoint in pre-execution
- Suggest systemic improvement
### Definition Updates
Recurring violations indicate documentation gaps:
- Update Skill definitions with clearer trigger patterns
- Add examples of correct vs incorrect usage
- Update CLAUDE.md Decision Gates section
## When to Report
- **CRITICAL violations**: Report immediately (don't wait for post-execution)
- **WARN violations**: Include in post-execution summary
- **INFO observations**: Track for pattern analysis only

# Tag Quality Analysis
**Purpose**: Validate Planning Specialist's tag strategy ensures complete specialist coverage.
**When**: After Planning Specialist completes task breakdown
**Entity**: Planning Specialist only
**Token Cost**: ~400-600 tokens
## Quality Metrics
1. **Tag Coverage** (100% baseline): Every task has tags that map to a specialist
2. **Tag Conventions** (90% baseline): Tags follow project conventions (reuse existing)
3. **Agent Mapping Coverage** (100% baseline): All tags map to specialists in agent-mapping.yaml
**Target**: 100% coverage, 90%+ conventions adherence
## Analysis Workflow
### Step 1: Load Project Tag Conventions
```javascript
// Get existing tags from project
projectTags = list_tags(entityTypes=["TASK", "FEATURE"])
// Load agent-mapping.yaml
agentMapping = Read(".taskorchestrator/agent-mapping.yaml").tagMappings
```
### Step 2: Analyze Task Tags
```javascript
tasks = query_container(operation="overview", containerType="feature", id=featureId).tasks
tagAnalysis = {
totalTasks: tasks.length,
tasksWithTags: 0,
tasksWithoutTags: [],
tagCoverage: [],
conventionViolations: [],
unmappedTags: []
}
for (const task of tasks) {
  if (!task.tags || task.tags.length == 0) {
    tagAnalysis.tasksWithoutTags.push(task.title)
    continue
  }
  tagAnalysis.tasksWithTags++

  // Check each tag
  for (const tag of task.tags) {
    // Does tag map to a specialist?
    if (!agentMapping[tag]) {
      tagAnalysis.unmappedTags.push({
        task: task.title,
        tag: tag,
        severity: "ALERT"
      })
    }

    // Is tag following conventions (existing tag)?
    if (!projectTags.includes(tag)) {
      tagAnalysis.conventionViolations.push({
        task: task.title,
        tag: tag,
        severity: "WARN",
        suggestion: "Use existing project tags or add to agent-mapping.yaml"
      })
    }
  }
}
```
### Step 3: Verify Specialist Coverage
```javascript
coverageCheck = {
covered: 0,
uncovered: []
}
for (const task of tasks) {
  specialists = getSpecialistsForTask(task.tags, agentMapping)
  if (specialists.length == 0) {
    coverageCheck.uncovered.push({
      task: task.title,
      tags: task.tags,
      issue: "No specialist mapping found",
      severity: "ALERT"
    })
  } else {
    coverageCheck.covered++
  }
}
```
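`getSpecialistsForTask` is referenced above but not defined. One possible sketch, assuming `agentMapping` is a flat `{ tag: specialistName }` object loaded from agent-mapping.yaml:

```javascript
// Resolve a task's tags to the specialists that can handle them.
// Deduplicates: several tags may map to the same specialist.
function getSpecialistsForTask(tags, agentMapping) {
  const specialists = new Set();
  for (const tag of tags || []) {
    if (agentMapping[tag]) specialists.add(agentMapping[tag]);
  }
  return [...specialists];
}
```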
### Step 4: Calculate Quality Score
```javascript
tagCoverage = (tagAnalysis.tasksWithTags / tagAnalysis.totalTasks) * 100
agentMappingCoverage = (coverageCheck.covered / tagAnalysis.totalTasks) * 100
conventionAdherence = (
  (tagAnalysis.tasksWithTags - tagAnalysis.conventionViolations.length) /
  tagAnalysis.tasksWithTags
) * 100

score = {
  tagCoverage: tagCoverage,
  agentMappingCoverage: agentMappingCoverage,
  conventionAdherence: conventionAdherence,
  overall: (tagCoverage + agentMappingCoverage + conventionAdherence) / 3
}
```
## Report Template
```markdown
## 🏷️ Tag Quality Analysis
**Overall Score**: [X]% (Baseline: 90% / Target: 100%)
### Metrics
- Tag Coverage: [X]% ([Y]/[Z] tasks have tags)
- Agent Mapping Coverage: [X]% ([Y]/[Z] tasks map to specialists)
- Convention Adherence: [X]%
### Issues ([count] total)
🚨 **ALERT** ([count]): No specialist mapping
- [Task A]: Tags [tag1, tag2] don't map to any specialist
⚠️ **WARN** ([count]): Convention violations
- [Task B]: Tag "new-tag" not in project conventions
### Recommendations
1. Add tags to tasks: [list]
2. Update agent-mapping.yaml for: [tags]
3. Use existing tags instead of: [new tags]
```
## Critical Checks
### Check 1: Every Task Has Tags
Tasks without tags cannot be routed to specialists.
### Check 2: Every Tag Maps to Specialist
Tags that don't map to agent-mapping.yaml will fail routing.
### Check 3: Tags Follow Project Conventions
New tags should be rare; reuse existing tags when possible.
## When to Report
- **ALWAYS** after Planning Specialist
- **Full details** if score < 100%
- **Brief summary** if score == 100%

# Task Content Quality Analysis
**Purpose**: Analyze information added to tasks by specialists to detect wasteful content, measure information density, and suggest improvements.
**When**: After Implementation Specialists complete tasks (Backend, Frontend, Database, Test, Technical Writer)
**Applies To**: Implementation Specialist Subagents only
**Token Cost**: ~500-700 tokens
## Overview
Implementation specialists add content to tasks through:
1. **Summary field** (300-500 chars) - What was accomplished
2. **Task sections** - Detailed results, approach, decisions
3. **Files Changed section** (ordinal 999) - List of modified files
This analysis ensures specialists add **high-density, non-redundant information** while avoiding token waste.
## Quality Metrics
### 1. Information Density
**Definition**: Ratio of useful information to total tokens added
**Formula**: `density = (unique_concepts + actionable_details) / total_tokens`
**Target**: ≥ 70% (7 concepts per 10 tokens)
**Good Example** (High Density):
```
Summary (87 tokens):
"Implemented OAuth2 authentication with JWT tokens. Added UserService with
login/logout endpoints. All 12 tests passing. Files: AuthController.kt,
UserService.kt, SecurityConfig.kt, AuthControllerTest.kt"
Density: 85% (7 concepts: OAuth2, JWT, UserService, login, logout, tests passing, files)
```
**Bad Example** (Low Density):
```
Summary (143 tokens):
"I have successfully completed the implementation of the authentication feature
as requested. The work involved creating the necessary components and ensuring
everything works correctly. Testing was performed and all tests are now passing
successfully."
Density: 35% (3 concepts: authentication, components created, tests passing)
Waste: 60 tokens of filler words
```
### 2. Redundancy Score
**Definition**: Percentage of information duplicated across summary + sections
**Formula**: `redundancy = duplicate_tokens / (summary_tokens + section_tokens)`
**Target**: ≤ 20% (minimal overlap between summary and sections)
**Detection**:
```javascript
// Extract key phrases from summary
summaryPhrases = extractPhrases(task.summary)
// e.g., ["OAuth2 authentication", "JWT tokens", "UserService", "12 tests passing"]
// Check sections for duplicate phrases
sectionContent = task.sections.map(s => s.content).join(" ")
duplicates = summaryPhrases.filter(phrase => sectionContent.includes(phrase))
redundancy = (duplicates.length / summaryPhrases.length) * 100
```
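`extractPhrases` is an assumed helper. A minimal runnable sketch that treats word bigrams as "phrases" (a rough stand-in for real key-phrase extraction):

```javascript
// Assumed helper: lowercase word bigrams as key "phrases"
function extractPhrases(text) {
  const words = text.toLowerCase().replace(/[^\w\s]/g, "").split(/\s+/).filter(Boolean);
  const phrases = [];
  for (let i = 0; i < words.length - 1; i++) {
    phrases.push(words[i] + " " + words[i + 1]);
  }
  return phrases;
}

// Percentage of summary phrases that reappear in the sections
function redundancyScore(summary, sections) {
  const phrases = extractPhrases(summary);
  if (phrases.length === 0) return 0;
  const sectionText = sections.map(s => s.content).join(" ").toLowerCase();
  const duplicates = phrases.filter(p => sectionText.includes(p));
  return Math.round((duplicates.length / phrases.length) * 100);
}
```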
**High Redundancy Example** (Bad):
```
Summary:
"Implemented OAuth2 authentication with JWT tokens. Added UserService."
Technical Approach Section:
"For this task, I implemented OAuth2 authentication using JWT tokens.
I created a UserService to handle authentication logic..."
Redundancy: 70% (both mention OAuth2, JWT, UserService)
```
**Low Redundancy Example** (Good):
```
Summary:
"Implemented OAuth2 authentication. 12 tests passing."
Technical Approach Section:
"Used Spring Security OAuth2 library. Token validation in JwtFilter.
Refresh token rotation every 24h. Rate limiting: 5 attempts/min."
Redundancy: 15% (summary is high-level, section adds technical details)
```
### 3. Code Snippet Ratio
**Definition**: Percentage of section content that is code vs explanation
**Formula**: `code_ratio = code_block_tokens / section_tokens`
**Target**: ≤ 30% (sections explain, files contain code)
**Detection**:
```javascript
// Count tokens in code blocks
codeBlocks = extractCodeBlocks(section.content) // ```language ... ```
codeTokens = sum(codeBlocks.map(b => estimateTokens(b)))
// Total section tokens
sectionTokens = estimateTokens(section.content)
ratio = (codeTokens / sectionTokens) * 100
```
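A runnable sketch of the ratio check. `estimateTokens` is an assumption here (roughly 4 characters per token); real tokenizers differ, but a consistent estimate is enough for a ratio:

```javascript
// Assumption: ~4 characters per token
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

function codeRatio(content) {
  // Fenced blocks, non-greedy, fences included in the count
  const codeBlocks = content.match(/```[\s\S]*?```/g) || [];
  const codeTokens = codeBlocks.reduce((sum, block) => sum + estimateTokens(block), 0);
  const totalTokens = estimateTokens(content);
  return totalTokens === 0 ? 0 : Math.round((codeTokens / totalTokens) * 100);
}
```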
**Bad Example** (High Code Ratio):
````markdown
## Implementation Details
Here's the UserService implementation:
```kotlin
@Service
class UserService(
    private val userRepository: UserRepository,
    private val passwordEncoder: PasswordEncoder
) {
    fun login(email: String, password: String): User? {
        val user = userRepository.findByEmail(email)
        return if (user != null && passwordEncoder.matches(password, user.password)) {
            user
        } else null
    }
    // ... 50 more lines
}
```
And here's the test:
```kotlin
@Test
fun `login with valid credentials returns user`() {
    // ... 30 lines of test code
}
```
Code Ratio: 85% (300 code tokens / 350 total tokens)
Issue: Full code belongs in files, not task sections
````
**Good Example** (Low Code Ratio):
````markdown
## Implementation Details
Created UserService with login/logout methods. Key decisions:
- Password hashing: BCrypt (cost factor 12)
- Session management: JWT with 1h expiration
- Rate limiting: 5 failed attempts → 15min lockout

Example usage:
```kotlin
userService.login(email, password) // Returns User or null
```
Code Ratio: 12% (20 code tokens / 165 total tokens)
Quality: Explains approach, minimal code snippet for clarity
````
### 4. Summary Quality
**Definition**: Summary is concise, informative, and follows best practices
**Checks**:
- ✅ Length: 300-500 characters (enforced by Status Progression Skill)
- ✅ Mentions what was done (not how or why - that's in sections)
- ✅ Includes test status
- ✅ Lists key files changed
- ✅ No filler words ("I have...", "successfully...", "as requested...")
**Scoring**:
```javascript
quality = {
length: inRange(summary.length, 300, 500) ? 25 : 0,
mentions_what: containsActionVerbs(summary) ? 25 : 0, // "Implemented", "Added", "Fixed"
test_status: mentionsTests(summary) ? 25 : 0, // "12 tests passing"
no_filler: !containsFiller(summary) ? 25 : 0 // No "successfully", "I have"
}
score = sum(quality.values) // 0-100
```
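The scoring above leans on assumed helpers (`containsActionVerbs`, `mentionsTests`, `containsFiller`). A runnable sketch with stand-in word lists, which a project would extend to its own conventions:

```javascript
// Assumed word lists; extend per project conventions
const ACTION_VERBS = ["implemented", "added", "fixed", "created", "refactored", "removed"];
const FILLER_PHRASES = ["i have successfully", "as requested", "successfully"];

function scoreSummary(summary) {
  const lower = summary.toLowerCase();
  const lengthOk = summary.length >= 300 && summary.length <= 500;
  const mentionsWhat = ACTION_VERBS.some(v => lower.includes(v));
  const testStatus = /\btests?\b/.test(lower);
  const noFiller = !FILLER_PHRASES.some(f => lower.includes(f));
  return (lengthOk ? 25 : 0) + (mentionsWhat ? 25 : 0) + (testStatus ? 25 : 0) + (noFiller ? 25 : 0);
}
```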
**Example Scores**:
90/100 (Excellent):
```
"Implemented OAuth2 authentication with JWT tokens. Added UserService for
user management. All 12 tests passing. Files: AuthController.kt, UserService.kt,
SecurityConfig.kt"
✓ Length: 387 chars
✓ Mentions what: "Implemented", "Added"
✓ Test status: "12 tests passing"
✓ No filler: Clean, direct
```
50/100 (Poor):
```
"I have successfully completed the authentication feature as requested. The
implementation involved creating the necessary components and ensuring that
everything works correctly. All tests are passing."
✓ Length: 349 chars
✗ Mentions what: Vague "components"
✓ Test status: "tests are passing"
✗ No filler: "successfully", "as requested", "I have"
```
### 5. Section Usefulness
**Definition**: Sections add value beyond what's in summary and files
**Checks per section**:
- ✅ Explains decisions/trade-offs
- ✅ Documents non-obvious approach
- ✅ Provides context for future developers
- ✅ References files instead of duplicating code
- ✅ Concise (bullet points > paragraphs)
**Scoring**:
```javascript
usefulness = {
explains_why: containsRationale(section) ? 20 : 0, // "Chose X because..."
approach: describesApproach(section) ? 20 : 0, // "Used pattern Y"
future_context: providesContext(section) ? 20 : 0, // "Note: Z limitation"
references_files: hasFileReferences(section) ? 20 : 0, // "See AuthController.kt:45"
concise: isConcise(section) ? 20 : 0 // Bullet points, not prose
}
score = sum(usefulness.values) // 0-100
```
## Wasteful Patterns to Detect
### Pattern 1: Full Code in Sections
**Issue**: Code belongs in files, not task documentation
**Detection**:
```javascript
if (section.codeBlockCount > 2 || section.codeRatio > 30) {
return {
pattern: "Full code in sections",
severity: "WARN",
found: `${section.codeBlockCount} code blocks, ${section.codeRatio}% of content`,
expected: "≤ 2 brief code snippets, ≤ 30% code ratio",
recommendation: "Move code to files, reference with: 'See FileName.kt:lineNumber'",
savings: estimateSavings(section) // e.g., "~500 tokens"
}
}
```
### Pattern 2: Full Test Output
**Issue**: Test results should be summarized, not pasted verbatim
**Detection**:
```javascript
if (section.title.includes("Test") && section.content.includes("PASSED") && section.content.length > 500) {
return {
pattern: "Full test output in section",
severity: "WARN",
found: `${section.content.length} chars of test output`,
expected: "Test summary: X/Y passed, failure details if any",
recommendation: "Summarize: '12/12 tests passing' or '11/12 passing (1 flaky test)'",
savings: `~${section.content.length * 0.75} tokens`
}
}
```
### Pattern 3: Summary Redundancy
**Issue**: Summary repeats information already in sections
**Detection**:
```javascript
overlap = calculateOverlap(task.summary, task.sections)
if (overlap > 40) {
return {
pattern: "High summary-section redundancy",
severity: "INFO",
found: `${overlap}% overlap between summary and sections`,
expected: "≤ 20% overlap (summary = high-level, sections = details)",
recommendation: "Make summary more concise, or add new details to sections",
savings: `~${estimateRedundantTokens(task)} tokens`
}
}
```
### Pattern 4: Filler Language
**Issue**: Verbose, unnecessary words that don't add information
**Detection**:
```javascript
fillerPhrases = [
"I have successfully",
"as requested",
"in order to",
"it should be noted that",
"for the purpose of",
"with regards to",
"in conclusion"
]
found = fillerPhrases.filter(phrase => task.summary.includes(phrase))
if (found.length > 0) {
return {
pattern: "Filler language in summary",
severity: "INFO",
found: found.join(", "),
expected: "Direct, concise language",
recommendation: "Remove filler: 'Implemented X' not 'I have successfully implemented X as requested'",
savings: `~${found.length * 3} tokens`
}
}
```
### Pattern 5: Over-Explaining Obvious
**Issue**: Explaining what's clear from file/function names
**Detection**:
```javascript
if (section.title == "Implementation" && containsObvious(section.content)) {
return {
pattern: "Over-explaining obvious implementation",
severity: "INFO",
example: "Explaining 'UserService manages users' when class is named UserService",
recommendation: "Focus on non-obvious: design decisions, trade-offs, gotchas",
savings: "~100-200 tokens"
}
}
```
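`containsObvious` is likewise unspecified; one cheap heuristic flags sentences that merely restate a CamelCase identifier's own words ("UserService manages users"). A sketch under that assumption:

```javascript
// Hypothetical sketch of containsObvious: true if any sentence repeats
// a word from a CamelCase identifier it mentions (after removing the
// identifier itself), i.e. the sentence adds nothing beyond the name.
function containsObvious(content) {
  const identifiers = content.match(/\b[A-Z][a-z]+(?:[A-Z][a-z]+)+\b/g) || [];
  const sentences = content.split(/[.!?]/);
  return identifiers.some((id) => {
    const parts = id.match(/[A-Z][a-z]+/g).map((p) => p.toLowerCase());
    return sentences.some((s) => {
      if (!s.includes(id)) return false;
      const rest = s.toLowerCase().split(id.toLowerCase()).join(" ");
      return parts.some((p) => rest.includes(p));
    });
  });
}
```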
### Pattern 6: Uncustomized Template Sections
**Issue**: Generic template sections with placeholder text that provide zero value
**Detection**:
```javascript
placeholderPatterns = [
/\[Component\s*\d*\]/i,
/\[Library\s*Name\]/i,
/\[Phase\s*Name\]/i,
/\[Library\]/i,
/\[Version\]/i,
/\[What it does\]/i,
/\[Why chosen\]/i,
/\[Goal\]:/i,
/\[Deliverables\]:/i
]
for (const section of task.sections) {
// Check for placeholder patterns
hasPlaceholder = placeholderPatterns.some(pattern => pattern.test(section.content))
// Check for generic template titles with minimal content
genericTitles = ["Architecture Overview", "Key Dependencies", "Implementation Strategy"]
isGenericTitle = genericTitles.includes(section.title)
hasMinimalCustomization = section.content.length < 200 || section.content.includes('[')
if (hasPlaceholder || (isGenericTitle && hasMinimalCustomization)) {
return {
pattern: "Uncustomized template section",
severity: "WARN", // High priority - significant token waste
found: `Section "${section.title}" contains placeholder text or generic template`,
expected: "Task-specific content ≥200 chars, OR delete section entirely",
recommendation: `DELETE section using manage_sections(operation='delete', id='${section.id}') - Templates provide sufficient structure`,
savings: `~${estimateTokens(section.content)} tokens`,
sectionId: section.id,
action: "DELETE" // Explicit action to take
}
}
}
```
**Common Placeholder Patterns**:
- `[Component 1]`, `[Component 2]` - Generic component names
- `[Library Name]`, `[Version]` - Dependency table placeholders
- `[Phase Name]`, `[Goal]:`, `[Deliverables]:` - Implementation strategy placeholders
- `[What it does]`, `[Why chosen]` - Generic explanations
**Examples of Violations**:
**Bad Example 1 - Architecture Overview with placeholders**:
```markdown
Title: Architecture Overview
Content:
This task involves the following components:
- [Component 1]: [What it does]
- [Component 2]: [What it does]
Technical approach:
- [Library Name] for [functionality]
- [Library Name] for [functionality]
(72 tokens of waste - DELETE this section)
```
**Bad Example 2 - Key Dependencies with placeholders**:
```markdown
Title: Key Dependencies
Content:
| Library | Version | Purpose |
|---------|---------|---------|
| [Library Name] | [Version] | [What it does] |
| [Library Name] | [Version] | [What it does] |
Rationale:
- [Library]: [Why chosen]
(85 tokens of waste - DELETE this section)
```
**Bad Example 3 - Implementation Strategy with placeholders**:
```markdown
Title: Implementation Strategy
Content:
Phase 1: [Phase Name]
- Goal: [Goal]
- Deliverables: [Deliverables]
Phase 2: [Phase Name]
- Goal: [Goal]
- Deliverables: [Deliverables]
(98 tokens of waste - DELETE this section)
```
**Proper Response When Detected**:
```markdown
⚠️ WARN - Uncustomized Template Sections (Pattern 6)
**Found**: 3 task sections contain placeholder text, wasting ~255 tokens
**Violations**:
1. Task [ID] - Section "Architecture Overview" (72 tokens)
- Placeholder patterns: `[Component 1]`, `[What it does]`
- **Action**: DELETE section (ID: xxx)
- **Reason**: Templates provide sufficient structure
2. Task [ID] - Section "Key Dependencies" (85 tokens)
- Placeholder patterns: `[Library Name]`, `[Version]`, `[Why chosen]`
- **Action**: DELETE section (ID: yyy)
- **Reason**: Generic table with no actual dependencies
3. Task [ID] - Section "Implementation Strategy" (98 tokens)
- Placeholder patterns: `[Phase Name]`, `[Goal]:`, `[Deliverables]:`
- **Action**: DELETE section (ID: zzz)
- **Reason**: Uncustomized phases with no specific strategy
**Expected**: Task-specific content ≥200 chars with NO placeholder text, OR delete section entirely
**Recommendation**:
- Planning Specialist must customize ALL sections before returning to orchestrator (Step 7.5 validation)
- Implementation Specialists must DELETE any placeholder sections during Step 4
- Templates provide sufficient structure for 95% of tasks (complexity ≤7)
**Root Cause**: Planning Specialist's bulkCreate operation included generic template sections without customization
**Prevention**:
1. Planning Specialist Step 7.5 (Validate Task Quality) must detect and delete placeholder sections
2. Implementation Specialists Step 4 must check for and delete placeholder sections
3. Orchestration QA Skill now detects this pattern automatically
**Token Savings**: ~255 tokens (current waste) → 0 tokens (after deletion)
```
## Analysis Workflow
### Step 1: Capture Baseline
**Before specialist executes**:
```javascript
baseline = {
taskId: task.id,
summaryLength: task.summary?.length || 0,
sectionCount: task.sections.length,
totalTokens: estimateTaskTokens(task)
}
```
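`estimateTaskTokens` (and the `estimateTokens` used throughout this skill) can be approximated with the common ~4 characters per token rule of thumb; a sketch, assuming no access to a real tokenizer:

```javascript
// Rough token estimators: ~4 characters per token. A real tokenizer
// would be more accurate, but this is sufficient for trend analysis.
function estimateTokens(text) {
  return Math.ceil((text || "").length / 4);
}

function estimateTaskTokens(task) {
  const sectionChars = (task.sections || [])
    .map((s) => (s.title || "").length + (s.content || "").length)
    .reduce((sum, n) => sum + n, 0);
  return estimateTokens(task.summary || "") + Math.ceil(sectionChars / 4);
}
```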
### Step 2: Measure Addition
**After specialist completes**:
```javascript
delta = {
summaryAdded: task.summary.length - baseline.summaryLength,
sectionsAdded: task.sections.length - baseline.sectionCount,
tokensAdded: estimateTaskTokens(task) - baseline.totalTokens
}
```
### Step 3: Analyze Quality
**Run quality checks**:
```javascript
analysis = {
informationDensity: calculateDensity(task, delta),
redundancyScore: calculateRedundancy(task),
codeRatio: calculateCodeRatio(task),
summaryQuality: scoreSummary(task.summary),
sectionUsefulness: task.sections.map(s => scoreSection(s)),
wastefulPatterns: detectWaste(task)
}
```
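Of these checks, `calculateCodeRatio` is the most mechanical; a sketch, assuming sections store markdown with standard fenced code blocks:

```javascript
// Hypothetical sketch of calculateCodeRatio: percentage of section
// content that sits inside fenced code blocks.
function calculateCodeRatio(task) {
  const content = (task.sections || []).map((s) => s.content || "").join("\n");
  if (content.length === 0) return 0;
  const fenced = content.match(/```[\s\S]*?```/g) || [];
  const codeChars = fenced.reduce((sum, block) => sum + block.length, 0);
  return Math.round((codeChars / content.length) * 100);
}
```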
### Step 4: Generate Report
**Format findings**:
```javascript
report = {
specialist: entityType,
taskId: task.id,
tokensAdded: delta.tokensAdded,
quality: {
informationDensity: `${analysis.informationDensity}%`,
redundancy: `${analysis.redundancyScore}%`,
codeRatio: `${analysis.codeRatio}%`,
summaryScore: `${analysis.summaryQuality}/100`,
avgSectionScore: average(analysis.sectionUsefulness)
},
wastefulPatterns: analysis.wastefulPatterns,
potentialSavings: calculateSavings(analysis.wastefulPatterns)
}
```
### Step 5: Track Trends
**Aggregate across tasks**:
```javascript
session.contentQuality.push(report)
// After N tasks (e.g., 5), analyze trends
if (session.contentQuality.length >= 5) {
trends = analyzeTrends(session.contentQuality)
// e.g., "Backend Engineer consistently has high code ratio (avg 65%)"
}
```
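A sketch of `analyzeTrends`, assuming it only needs to count recurring pattern names across reports (the real version may also aggregate savings per specialist):

```javascript
// Hypothetical sketch of analyzeTrends: count how often each wasteful
// pattern recurs across the session's reports, most frequent first.
function analyzeTrends(reports) {
  const counts = {};
  for (const report of reports) {
    for (const p of report.wastefulPatterns || []) {
      counts[p.pattern] = (counts[p.pattern] || 0) + 1;
    }
  }
  const recurringPatterns = Object.entries(counts)
    .filter(([, n]) => n >= 2) // "recurring" = seen at least twice
    .sort((a, b) => b[1] - a[1])
    .map(([name, occurrences]) => ({ name, occurrences }));
  return { recurringPatterns };
}
```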
## Report Template
```markdown
## 📊 Task Content Quality Analysis
**Specialist**: [Backend Engineer / Frontend Developer / etc.]
**Task**: [Task Title] ([ID])
### Tokens Added
- Summary: [X] chars ([Y] tokens)
- Sections: [N] sections added ([Z] tokens)
- **Total Added**: [Y+Z] tokens
### Quality Metrics
- **Information Density**: [X]% ([Target: ≥70%])
- **Redundancy Score**: [Y]% ([Target: ≤20%])
- **Code Ratio**: [Z]% ([Target: ≤30%])
- **Summary Quality**: [Score]/100
### ✅ Strengths
- [What was done well]
- [Good practice observed]
### ⚠️ Wasteful Patterns Detected ([count])
**Pattern 1: [Name]**
- Found: [What was observed]
- Expected: [Best practice]
- Recommendation: [How to improve]
- Potential Savings: ~[X] tokens
**Pattern 2: [Name]**
- Found: [What was observed]
- Expected: [Best practice]
- Recommendation: [How to improve]
- Potential Savings: ~[Y] tokens
### 💰 Total Potential Savings
- Current: [N] tokens added
- Optimized: [N-X-Y] tokens
- **Savings**: ~[X+Y] tokens ([Z]% reduction)
### 🎯 Specific Recommendations
1. [Most impactful improvement]
2. [Secondary improvement]
3. [Optional enhancement]
```
## Trend Analysis (After 5+ Tasks)
```markdown
## 📈 Content Quality Trends
**Session**: [N] tasks analyzed
**Specialists**: [List of specialists used]
### Average Metrics
- Information Density: [X]% (Target: ≥70%)
- Redundancy: [Y]% (Target: ≤20%)
- Code Ratio: [Z]% (Target: ≤30%)
- Summary Quality: [Score]/100
### Recurring Patterns
**Most Common Issue**: [Pattern name] ([N] occurrences)
- **Specialists Affected**: [Backend Engineer (3x), Frontend (2x)]
- **Total Waste**: ~[X] tokens across tasks
- **Recommendation**: Update [specialist].md to emphasize [practice]
**Second Most Common**: [Pattern name] ([M] occurrences)
- **Specialists Affected**: [...]
- **Recommendation**: [...]
### Specialist Performance
**Backend Engineer** ([N] tasks):
- Avg Density: [X]%
- Avg Redundancy: [Y]%
- Common Issue: High code ratio (avg [Z]%)
- **Recommendation**: Reference files instead of embedding code
**Frontend Developer** ([M] tasks):
- Avg Density: [X]%
- Avg Redundancy: [Y]%
- Strengths: Excellent summary quality (avg 85/100)
### System-Wide Opportunities
1. **Update Specialist Templates**
- Add "Code in Files, Not Sections" guideline to all implementation specialists
- Estimated Impact: [X]% token reduction
2. **Enhance Summary Guidelines**
- Add anti-pattern examples (filler language)
- Estimated Impact: [Y]% improvement in quality scores
3. **Section Template Improvements**
- Provide better examples of useful vs wasteful sections
- Estimated Impact: [Z]% reduction in redundancy
```
## Integration with Post-Execution Review
```javascript
// In post-execution.md, after Step 4 (Validate completion quality):
if (isImplementationSpecialist(entityType)) {
// Read task-content-quality.md
Read ".claude/skills/orchestration-qa/task-content-quality.md"
// Run content quality analysis
contentAnalysis = analyzeTaskContent(task, baseline)
// Add to report
report.contentQuality = contentAnalysis
// Track for trends
session.contentQuality.push(contentAnalysis)
// If patterns found, add to deviations
if (contentAnalysis.wastefulPatterns.length > 0) {
deviations.push({
severity: "INFO", // Usually INFO, can be WARN if severe
type: "Content Quality",
patterns: contentAnalysis.wastefulPatterns,
savings: contentAnalysis.potentialSavings
})
}
}
```
## When to Report
**Individual Task**:
- Report if wasteful patterns detected
- Report if quality scores below targets
**Session Trends**:
- After 5+ tasks analyzed
- When recurring patterns detected (same issue 2+ times)
- At session end (via `phase="summary"`)
## Add to TodoWrite (If Issues Found)
```javascript
if (contentAnalysis.potentialSavings > 100) {
TodoWrite([{
content: `Review ${specialist} content quality: ${contentAnalysis.potentialSavings} tokens wasted`,
activeForm: `Reviewing ${specialist} content patterns`,
status: "pending"
}])
}
// If recurring pattern
if (trends.recurringPatterns.length > 0) {
TodoWrite([{
content: `Update ${specialist}.md: ${trends.recurringPatterns[0].name} pattern recurring`,
activeForm: `Improving ${specialist} guidelines`,
status: "pending"
}])
}
```
## Target Benchmarks
**Excellent** (95%+ of metrics in target):
- Information Density: ≥ 80%
- Redundancy: ≤ 15%
- Code Ratio: ≤ 20%
- Summary Quality: ≥ 85/100
- No wasteful patterns
**Good** (80%+ of metrics in target):
- Information Density: 70-79%
- Redundancy: 16-20%
- Code Ratio: 21-30%
- Summary Quality: 70-84/100
- Minor wasteful patterns (< 100 tokens waste)
**Needs Improvement** (< 80% in target):
- Information Density: < 70%
- Redundancy: > 20%
- Code Ratio: > 30%
- Summary Quality: < 70/100
- Significant waste (> 100 tokens)
## Continuous Improvement
**Track over time**:
- Are quality scores improving?
- Are wasteful patterns decreasing?
- Which specialists need guideline updates?
- What best practices emerge from high-quality tasks?
**Update specialist definitions when**:
- Same pattern occurs 3+ times
- Potential savings > 500 tokens across multiple tasks
- Quality scores consistently below targets

# Token Optimization Analysis
**Purpose**: Identify token waste patterns and optimization opportunities.
**When**: Optional post-execution (controlled by enableEfficiencyAnalysis parameter)
**Applies To**: All Skills and Subagents
**Token Cost**: ~400-600 tokens
## Common Token Waste Patterns
### Pattern 1: Verbose Specialist Output
**Issue**: Specialist returns full code/documentation in response instead of brief summary
**Expected**: Specialists return 50-100 token summary, detailed work goes in sections/files
**Detection**:
```javascript
if (isImplementationSpecialist(entityType) && estimateTokens(output) > 200) {
return {
severity: "WARN",
pattern: "Verbose specialist output",
actual: estimateTokens(output),
expected: "50-100 tokens",
savings: estimateTokens(output) - 100,
recommendation: "Return brief summary, put details in task sections"
}
}
```
### Pattern 2: Reading with includeSections When Not Needed
**Issue**: Loading all sections when only metadata needed
**Expected**: Use scoped `overview` for hierarchical views without sections
**Detection**:
```javascript
if (mentions(output, "includeSections=true") && !needsSections(workflowType)) {
return {
severity: "INFO",
pattern: "Unnecessary section loading",
recommendation: "Use operation='overview' for metadata + task list",
savings: "85-93% tokens (e.g., 18.5k → 1.2k for typical feature)"
}
}
```
### Pattern 3: Multiple Get Operations Instead of Overview
**Issue**: Calling `query_container(operation="get")` multiple times instead of one overview
**Expected**: Single scoped overview provides hierarchical view efficiently
**Detection**:
```javascript
getCallCount = countToolCalls(output, "query_container", "get")
if (getCallCount > 1 && !usedOverview) {
return {
severity: "INFO",
pattern: "Multiple get calls instead of overview",
actual: `${getCallCount} get calls`,
expected: "1 scoped overview call",
savings: estimateSavings(getCallCount)
}
}
```
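The `countToolCalls` helper used here can be sketched as a regex count over the specialist's transcript, assuming tool calls appear as literal text like `query_container(operation="get", ...)`:

```javascript
// Hypothetical sketch of countToolCalls: counts mentions of a
// tool + operation pair in a specialist's output transcript.
function countToolCalls(output, toolName, operation) {
  const re = new RegExp(
    toolName + "\\(\\s*operation\\s*=\\s*['\"]" + operation + "['\"]",
    "g"
  );
  return (output.match(re) || []).length;
}
```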
### Pattern 4: Listing All Entities When Filtering Would Work
**Issue**: Querying all tasks then filtering in code
**Expected**: Use query parameters (status, tags, priority) for filtering
**Detection**:
```javascript
if (mentions(output, "filter") && !usedQueryFilters) {
return {
severity: "INFO",
pattern: "Client-side filtering instead of query filters",
recommendation: "Use status/tags/priority parameters in query_container",
savings: "~50-70% tokens"
}
}
```
### Pattern 5: PRD Content in Description Instead of Sections
**Issue**: Feature Architect puts all PRD content in description field
**Expected**: Description is forward-looking summary; PRD sections go in feature sections
**Detection**:
```javascript
if (entityType == "feature-architect" && feature.description.length > 800) {
return {
severity: "WARN",
pattern: "PRD content in description field",
actual: `${feature.description.length} chars`,
expected: "200-500 chars description + sections for detailed content",
recommendation: "Move detailed content to feature sections"
}
}
```
### Pattern 6: Verbose Feature Architect Handoff
**Issue**: Feature Architect returns detailed feature explanation
**Expected**: Minimal handoff (50-100 tokens): "Feature created, ID: X, Y tasks ready"
**Detection**:
```javascript
if (entityType == "feature-architect" && estimateTokens(output) > 200) {
return {
severity: "WARN",
pattern: "Verbose Feature Architect handoff",
actual: estimateTokens(output),
expected: "50-100 tokens",
savings: estimateTokens(output) - 100,
recommendation: "Brief handoff: Feature ID, next action. Details in feature sections."
}
}
```
## Analysis Workflow
### Step 1: Estimate Token Usage
```javascript
tokenUsage = {
input: estimateTokens(context.userInput.fullText),
output: estimateTokens(entityOutput),
total: estimateTokens(context.userInput.fullText) + estimateTokens(entityOutput)
}
```
### Step 2: Compare Against Expected Range
```javascript
expectedRange = definition.tokenRange // e.g., [1800, 2200] for Feature Architect
deviation = tokenUsage.output - expectedRange[1]
if (deviation > expectedRange[1] * 0.5) { // More than 50% over
severity = "WARN"
} else {
severity = "INFO"
}
```
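The threshold rule above, as a runnable helper (a sketch; note it also returns `OK` for outputs within budget, which the pseudocode leaves implicit):

```javascript
// Outputs at or under the expected maximum pass; overruns up to 50%
// of the upper bound are INFO, anything beyond that is WARN.
function classifyTokenUsage(outputTokens, expectedRange) {
  const max = expectedRange[1];
  if (outputTokens <= max) return "OK";
  const deviation = outputTokens - max;
  return deviation > max * 0.5 ? "WARN" : "INFO";
}
```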
### Step 3: Detect Waste Patterns
```javascript
wastePatterns = []
// Check each pattern
if (verboseSpecialistOutput()) wastePatterns.push(pattern1)
if (unnecessarySectionLoading()) wastePatterns.push(pattern2)
if (multipleGetsInsteadOfOverview()) wastePatterns.push(pattern3)
if (clientSideFiltering()) wastePatterns.push(pattern4)
if (prdContentInDescription()) wastePatterns.push(pattern5)
if (verboseHandoff()) wastePatterns.push(pattern6)
```
### Step 4: Calculate Potential Savings
```javascript
totalSavings = wastePatterns.reduce((sum, pattern) => sum + pattern.savings, 0)
optimizedTokens = tokenUsage.total - totalSavings
efficiencyGain = (totalSavings / tokenUsage.total) * 100
```
### Step 5: Generate Report
```markdown
## 💡 Token Optimization Opportunities
**Current Usage**: [X] tokens
**Potential Savings**: [Y] tokens ([Z]% reduction)
**Optimized Usage**: [X - Y] tokens
### Patterns Detected ([count])
**⚠️ WARN** ([count]): Significant waste
- Verbose specialist output: [X] tokens (expected 50-100)
- PRD content in description: [Y] chars (expected 200-500)
**ℹ️ INFO** ([count]): Optimization opportunities
- Use overview instead of get: [savings] tokens
- Use query filters: [savings] tokens
### Recommendations
1. [Most impactful optimization]
2. [Secondary optimization]
```
## Recommended Baselines
- **Skills**: 200-900 tokens (lightweight coordination)
- **Feature Architect**: 1800-2200 tokens (complexity assessment + creation)
- **Planning Specialist**: 1800-2200 tokens (analysis + task creation)
- **Implementation Specialists**: 1800-2200 tokens (work done, not described)
- **Output**: 50-100 tokens (brief summary)
- **Sections**: Detailed work (not counted against specialist)
## When to Report
- **Only if** enableEfficiencyAnalysis=true
- **WARN**: Include in post-execution report
- **INFO**: Log for pattern tracking only
## Integration Example
```javascript
if (params.enableEfficiencyAnalysis) {
Read "token-optimization.md"
opportunities = analyzeTokenOptimization(entityType, entityOutput, context)
if (opportunities.length > 0) {
report.efficiencyAnalysis = {
currentUsage: tokenUsage.total,
savings: totalSavings,
gain: efficiencyGain,
opportunities: opportunities
}
}
}
```

# Tool Selection Efficiency
**Purpose**: Verify optimal tool selection for the task at hand.
**When**: Optional post-execution (controlled by enableEfficiencyAnalysis parameter)
**Token Cost**: ~300-500 tokens
## Optimal Tool Selection Patterns
### Pattern 1: query_container Overview vs Get
**Optimal**: Use `operation="overview"` for hierarchical views without section content
**Suboptimal**: Use `operation="get"` with `includeSections=true` when only need metadata + child list
**Detection**:
```javascript
if (usedGet && includeSections && !needsFullSections) {
return {
pattern: "Used get with sections when overview would suffice",
current: "query_container(operation='get', includeSections=true)",
optimal: "query_container(operation='overview', id='...')",
savings: "85-93% tokens",
when: "Need: feature metadata + task list (no section content)"
}
}
```
### Pattern 2: Search vs Filtered Query
**Optimal**: Use `query_container` with filters for known criteria
**Suboptimal**: Free-text `query` strings when exact filter parameters would work
**Detection**:
```javascript
if (usedSearch && hasExactCriteria) {
return {
pattern: "Used search when filtered query more efficient",
current: "query_container(operation='search', query='pending tasks')",
optimal: "query_container(operation='search', status='pending')",
savings: "Query filters are faster and more precise"
}
}
```
### Pattern 3: Bulk Operations vs Multiple Singles
**Optimal**: Use `operation="bulkUpdate"` for multiple updates
**Suboptimal**: Loop calling `update` multiple times
**Detection**:
```javascript
updateCount = countToolCalls(output, "manage_container", "update")
if (updateCount >= 3) {
return {
pattern: "Multiple update calls instead of bulkUpdate",
current: `${updateCount} separate update calls`,
optimal: "1 bulkUpdate call",
savings: `${updateCount - 1} round trips eliminated`
}
}
```
### Pattern 4: Scoped Overview vs Multiple Gets
**Optimal**: Single scoped overview for hierarchical view
**Suboptimal**: Multiple get calls for related entities
**Detection**:
```javascript
if (getCallCount >= 2 && queriedRelatedEntities) {
return {
pattern: "Multiple gets for related entities",
current: `${getCallCount} get calls`,
optimal: "1 scoped overview (returns entity + children)",
savings: `${getCallCount - 1} tool calls eliminated`
}
}
```
### Pattern 5: recommend_agent vs Manual Routing
**Optimal**: Use `recommend_agent` for specialist routing
**Suboptimal**: Manual tag analysis and routing logic
**Detection**:
```javascript
if (taskOrchestration && !usedRecommendAgent && launchedSpecialists) {
return {
pattern: "Manual specialist routing instead of recommend_agent",
current: "Manual tag → specialist mapping",
optimal: "recommend_agent(taskId) → automatic routing",
benefit: "Centralized routing logic, consistent with agent-mapping.yaml"
}
}
```
## Analysis Workflow
```javascript
toolSelectionIssues = []
// Check each pattern
checkOverviewVsGet()
checkSearchVsFiltered()
checkBulkOpsVsMultiple()
checkScopedOverviewVsGets()
checkRecommendAgentUsage()
// Generate report if issues found
if (toolSelectionIssues.length > 0) {
return {
issuesFound: toolSelectionIssues.length,
issues: toolSelectionIssues,
recommendations: prioritizeRecommendations(toolSelectionIssues)
}
}
```
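`prioritizeRecommendations` is unspecified; one plausible ordering sorts issues by the upper bound of their estimated percentage savings (issues with non-numeric savings sort last):

```javascript
// Hypothetical helper: order tool-selection issues so the report leads
// with the largest estimated win. Savings strings like "85-93% tokens"
// sort by their upper bound; anything non-numeric counts as 0.
function prioritizeRecommendations(issues) {
  const upperBound = (s) => {
    const m = String(s).match(/(\d+)(?:\s*-\s*(\d+))?\s*%/);
    return m ? Number(m[2] || m[1]) : 0;
  };
  return [...issues].sort((a, b) => upperBound(b.savings) - upperBound(a.savings));
}
```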
## Report Template
```markdown
## 🔧 Tool Selection Efficiency
**Suboptimal Patterns**: [count]
### Issues Detected
**ℹ️ INFO**: Use overview instead of get
- Current: `query_container(operation='get', includeSections=true)`
- Optimal: `query_container(operation='overview', id='...')`
- Savings: 85-93% tokens
**ℹ️ INFO**: Use bulkUpdate instead of multiple updates
- Current: [X] separate update calls
- Optimal: 1 bulkUpdate call
- Savings: [X-1] round trips
### Recommendations
1. [Most impactful change]
2. [Secondary optimization]
```
## When to Report
- **Only if** enableEfficiencyAnalysis=true
- **INFO** level (observations, not violations)
- Include in efficiency analysis section