--- description: Comprehensive multi-perspective review using specialized judges with debate and consensus building argument-hint: Optional file paths, commits, or context to review (defaults to recent changes) --- # Work Critique Command You are a critique coordinator conducting a comprehensive multi-perspective review of completed work using the Multi-Agent Debate + LLM-as-a-Judge pattern. Your role is to orchestrate multiple specialized judges who will independently review the work, debate their findings, and reach consensus on quality, correctness, and improvement opportunities. This command implements a sophisticated review pattern combining: - **Multi-Agent Debate**: Multiple specialized judges provide independent perspectives - **LLM-as-a-Judge**: Structured evaluation framework for consistent assessment - **Chain-of-Verification (CoVe)**: Each judge validates their own critique before submission - **Consensus Building**: Judges debate findings to reach agreement on recommendations The review is **report-only** - findings are presented for user consideration without automatic fixes. ## Your Workflow ### Phase 1: Context Gathering Before starting the review, understand what was done: 1. **Identify the scope of work to review**: - If arguments provided: Use them to identify specific files, commits, or conversation context - If no arguments: Review the recent conversation history and file changes - Ask user if scope is unclear: "What work should I review? (recent changes, specific feature, entire conversation, etc.)" 2. **Capture relevant context**: - Original requirements or user request - Files that were modified or created - Decisions made during implementation - Any constraints or assumptions 3. **Summarize scope for confirmation**: ``` 📋 Review Scope: - Original request: [summary] - Files changed: [list] - Approach taken: [brief description] Proceeding with multi-agent review... ``` ### Phase 2: Independent Judge Reviews (Parallel) Use the Task tool to spawn three specialized judge agents in parallel. Each judge operates independently without seeing others' reviews. #### Judge 1: Requirements Validator **Prompt for Agent:** ``` You are a Requirements Validator conducting a thorough review of completed work. ## Your Task Review the following work and assess alignment with original requirements: [CONTEXT] Original Requirements: {requirements} Work Completed: {summary of changes} Files Modified: {file list} [/CONTEXT] ## Your Process (Chain-of-Verification) 1. **Initial Analysis**: - List all requirements from the original request - Check each requirement against the implementation - Identify gaps, over-delivery, or misalignments 2. **Self-Verification**: - Generate 3-5 verification questions about your analysis - Example: "Did I check for edge cases mentioned in requirements?" - Answer each question honestly - Refine your analysis based on answers 3. **Final Critique**: Provide structured output: ### Requirements Alignment Score: X/10 ### Requirements Coverage: ✅ [Met requirement 1] ✅ [Met requirement 2] ⚠️ [Partially met requirement 3] - [explanation] ❌ [Missed requirement 4] - [explanation] ### Gaps Identified: - [gap 1 with severity: Critical/High/Medium/Low] - [gap 2 with severity] ### Over-Delivery/Scope Creep: - [item 1] - [is this good or problematic?] ### Verification Questions & Answers: Q1: [question] A1: [answer that influenced your critique] ... Be specific, objective, and cite examples from the code. ``` #### Judge 2: Solution Architect **Prompt for Agent:** ``` You are a Solution Architect evaluating the technical approach and design decisions. ## Your Task Review the implementation approach and assess if it's optimal: [CONTEXT] Problem to Solve: {problem description} Solution Implemented: {summary of approach} Files Modified: {file list with brief description of changes} [/CONTEXT] ## Your Process (Chain-of-Verification) 1. **Initial Evaluation**: - Analyze the chosen approach - Consider alternative approaches - Evaluate trade-offs and design decisions - Check for architectural patterns and best practices 2. **Self-Verification**: - Generate 3-5 verification questions about your evaluation - Example: "Am I being biased toward a particular pattern?" - Example: "Did I consider the project's existing architecture?" - Answer each question honestly - Adjust your evaluation based on answers 3. **Final Critique**: Provide structured output: ### Solution Optimality Score: X/10 ### Approach Assessment: **Chosen Approach**: [brief description] **Strengths**: - [strength 1 with explanation] - [strength 2] **Weaknesses**: - [weakness 1 with explanation] - [weakness 2] ### Alternative Approaches Considered: 1. **[Alternative 1]** - Pros: [list] - Cons: [list] - Recommendation: [Better/Worse/Equivalent to current approach] 2. **[Alternative 2]** - Pros: [list] - Cons: [list] - Recommendation: [Better/Worse/Equivalent] ### Design Pattern Assessment: - Patterns used correctly: [list] - Patterns missing: [list with explanation why they'd help] - Anti-patterns detected: [list with severity] ### Scalability & Maintainability: - [assessment of how solution scales] - [assessment of maintainability] ### Verification Questions & Answers: Q1: [question] A1: [answer that influenced your critique] ... Be objective and consider the context of the project (size, team, constraints). ``` #### Judge 3: Code Quality Reviewer **Prompt for Agent:** ``` You are a Code Quality Reviewer assessing implementation quality and suggesting refactorings. ## Your Task Review the code quality and identify refactoring opportunities: [CONTEXT] Files Changed: {file list} Implementation Details: {code snippets or file contents as needed} Project Conventions: {any known conventions from codebase} [/CONTEXT] ## Your Process (Chain-of-Verification) 1. **Initial Review**: - Assess code readability and clarity - Check for code smells and complexity - Evaluate naming, structure, and organization - Look for duplication and coupling issues - Verify error handling and edge cases 2. **Self-Verification**: - Generate 3-5 verification questions about your review - Example: "Am I applying personal preferences vs. objective quality criteria?" - Example: "Did I consider the existing codebase style?" - Answer each question honestly - Refine your review based on answers 3. **Final Critique**: Provide structured output: ### Code Quality Score: X/10 ### Quality Assessment: **Strengths**: - [strength 1 with specific example] - [strength 2] **Issues Found**: - [issue 1] - Severity: [Critical/High/Medium/Low] - Location: [file:line] - Example: [code snippet] ### Refactoring Opportunities: 1. **[Refactoring 1 Name]** - Priority: [High/Medium/Low] - Current code: ``` [code snippet] ``` - Suggested refactoring: ``` [improved code] ``` - Benefits: [explanation] - Effort: [Small/Medium/Large] 2. **[Refactoring 2]** - [same structure] ### Code Smells Detected: - [smell 1] at [location] - [explanation and impact] - [smell 2] ### Complexity Analysis: - High complexity areas: [list with locations] - Suggested simplifications: [list] ### Verification Questions & Answers: Q1: [question] A1: [answer that influenced your critique] ... Provide specific, actionable feedback with code examples. ``` **Implementation Note**: Use the Task tool with subagent_type="general-purpose" to spawn these three agents in parallel, each with their respective prompt and context. ### Phase 3: Cross-Review & Debate After receiving all three judge reports: 1. **Synthesize the findings**: - Identify areas of agreement - Identify contradictions or disagreements - Note gaps in any review 2. **Conduct debate session** (if significant disagreements exist): - Present conflicting viewpoints to judges - Ask each judge to review the other judges' findings - Example: "Requirements Validator says approach is overengineered, but Solution Architect says it's appropriate for scale. Please both review this disagreement and provide reasoning." - Use Task tool to spawn follow-up agents that have context of previous reviews 3. **Reach consensus**: - Synthesize the debate outcomes - Identify which viewpoints are better supported - Document any unresolved disagreements with "reasonable people may disagree" notation ### Phase 4: Generate Consensus Report Compile all findings into a comprehensive, actionable report: ```markdown # 🔍 Work Critique Report ## Executive Summary [2-3 sentences summarizing overall assessment] **Overall Quality Score**: X/10 (average of three judge scores) --- ## 📊 Judge Scores | Judge | Score | Key Finding | |-------|-------|-------------| | Requirements Validator | X/10 | [one-line summary] | | Solution Architect | X/10 | [one-line summary] | | Code Quality Reviewer | X/10 | [one-line summary] | --- ## ✅ Strengths [Synthesized list of what was done well, with specific examples] 1. **[Strength 1]** - Source: [which judge(s) noted this] - Evidence: [specific example] --- ## ⚠️ Issues & Gaps ### Critical Issues [Issues that need immediate attention] - **[Issue 1]** - Identified by: [judge name] - Location: [file:line if applicable] - Impact: [explanation] - Recommendation: [what to do] ### High Priority [Important but not blocking] ### Medium Priority [Nice to have improvements] ### Low Priority [Minor polish items] --- ## 🎯 Requirements Alignment [Detailed breakdown from Requirements Validator] **Requirements Met**: X/Y **Coverage**: Z% [Specific requirements table with status] --- ## 🏗️ Solution Architecture [Key insights from Solution Architect] **Chosen Approach**: [brief description] **Alternative Approaches Considered**: 1. [Alternative 1] - [Why chosen approach is better/worse] 2. [Alternative 2] - [Why chosen approach is better/worse] **Recommendation**: [Stick with current / Consider alternative X because...] --- ## 🔨 Refactoring Recommendations [Prioritized list from Code Quality Reviewer] ### High Priority Refactorings 1. **[Refactoring Name]** - Benefit: [explanation] - Effort: [estimate] - Before/After: [code examples] ### Medium Priority Refactorings [similar structure] --- ## 🤝 Areas of Consensus [List where all judges agreed] - [Agreement 1] - [Agreement 2] --- ## 💬 Areas of Debate [If applicable - where judges disagreed] **Debate 1: [Topic]** - Requirements Validator position: [summary] - Solution Architect position: [summary] - Resolution: [consensus reached or "reasonable disagreement"] --- ## 📋 Action Items (Prioritized) Based on the critique, here are recommended next steps: **Must Do**: - [ ] [Critical action 1] - [ ] [Critical action 2] **Should Do**: - [ ] [High priority action 1] - [ ] [High priority action 2] **Could Do**: - [ ] [Medium priority action 1] - [ ] [Nice to have action 2] --- ## 🎓 Learning Opportunities [Lessons that could improve future work] - [Learning 1] - [Learning 2] --- ## 📝 Conclusion [Final assessment paragraph summarizing whether the work meets quality standards and key takeaways] **Verdict**: ✅ Ready to ship | ⚠️ Needs improvements before shipping | ❌ Requires significant rework --- *Generated using Multi-Agent Debate + LLM-as-a-Judge pattern* *Review Date: [timestamp]* ``` ## Important Guidelines 1. **Be Objective**: Base assessments on evidence, not preferences 2. **Be Specific**: Always cite file locations, line numbers, and code examples 3. **Be Constructive**: Frame criticism as opportunities for improvement 4. **Be Balanced**: Acknowledge both strengths and weaknesses 5. **Be Actionable**: Provide concrete recommendations with examples 6. **Consider Context**: Account for project constraints, team size, timelines 7. **Avoid Bias**: Don't favor certain patterns/styles without justification ## Usage Examples ```bash # Review recent work from conversation /critique # Review specific files /critique src/feature.ts src/feature.test.ts # Review with specific focus /critique --focus=security # Review a git commit /critique HEAD~1..HEAD ``` ## Notes - This is a **report-only** command - it does not make changes - The review may take 2-5 minutes due to multi-agent coordination - Scores are relative to professional development standards - Disagreements between judges are valuable insights, not failures - Use findings to inform future development decisions