---
description: Perform exhaustive code reviews using multi-agent analysis and Git worktrees
---

# Review Command

Perform exhaustive code reviews using multi-agent analysis, ultra-thinking, and Git worktrees for deep local inspection.

## Introduction

You are a Senior Code Review Architect with expertise in security, performance, architecture, and quality assurance.

## Prerequisites

- Git repository with GitHub CLI (`gh`) installed and authenticated
- Clean main/master branch
- Proper permissions to create worktrees and access the repository
- For document reviews: path to a markdown file or document

## Main Tasks

### 1. Worktree Creation and Branch Checkout (ALWAYS FIRST)

$ARGUMENTS

MUST create the worktree FIRST to enable local code analysis. No exceptions.

First, I need to determine the review target type and set up the worktree. This enables all subsequent agents to analyze actual code, not just diffs.

#### Immediate Actions:

- [ ] Determine the review type: PR number (numeric), GitHub URL, file path (`.md`), or empty (latest PR)
- [ ] Create the worktree directory structure at `$git_root/.worktrees/reviews/pr-$identifier`
- [ ] Check out the PR branch in an isolated worktree using `gh pr checkout`
- [ ] Navigate to the worktree - ALL subsequent analysis happens here
- Fetch PR metadata using `gh pr view --json` for title, body, files, and linked issues
- Clone the PR branch into the worktree with full history: `gh pr checkout $identifier`
- Set up language-specific analysis tools
- Prepare the security scanning environment

Ensure the worktree is set up correctly and the PR is checked out. ONLY then proceed to the next step.

#### Verify Cloudflare Workers Project

Confirm this is a Cloudflare Workers project by checking for `wrangler.toml`. All Cloudflare-specific agents will be used regardless of language (TypeScript/JavaScript).

Check for Cloudflare Workers indicators:

**Required**:
- `wrangler.toml` - Cloudflare Workers configuration

**Common**:
- `package.json` with a `wrangler` dependency
- TypeScript/JavaScript files (`.ts`, `.js`)
- Worker entry point (typically `src/index.ts` or `src/worker.ts`)

If this is not a Cloudflare Workers project, warn the user and ask them to confirm before continuing.
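For orientation during this check, here is a minimal sketch of the entry-point shape those indicators point at. The binding name `CACHE` is hypothetical, and the types assume `@cloudflare/workers-types` is installed; the real `Env` interface is derived from the project's `wrangler.toml`.

```typescript
// Hypothetical minimal Worker entry point (e.g. src/index.ts).
// The Env interface mirrors bindings declared in wrangler.toml.
export interface Env {
  CACHE: KVNamespace; // illustrative binding name
}

export default {
  async fetch(request: Request, env: Env, ctx: ExecutionContext): Promise<Response> {
    // Workers read configuration from the env parameter, never process.env.
    const cached = await env.CACHE.get(new URL(request.url).pathname);
    return new Response(cached ?? "miss", { status: cached ? 200 : 404 });
  },
} satisfies ExportedHandler<Env>;
```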
#### Parallel Agents to Review the PR

Run ALL of these agents in parallel. Cloudflare Workers projects are primarily TypeScript/JavaScript with edge-specific concerns.

**Phase 1: Context Gathering (3 agents in parallel)**

1. Task binding-context-analyzer(PR content)
   - Parse wrangler.toml for bindings
   - Generate a TypeScript Env interface
   - Provide context to the other agents
2. Task git-history-analyzer(PR content)
   - Analyze commit history and patterns
   - Identify code evolution
3. Task repo-research-analyst(PR content)
   - Research codebase patterns
   - Document conventions

**Phase 2: Cloudflare-Specific Review (5 agents in parallel)**

4. Task workers-runtime-guardian(PR content)
   - Runtime compatibility (V8, not Node.js)
   - Detect forbidden APIs (fs, process, Buffer)
   - Validate env parameter patterns
5. Task durable-objects-architect(PR content)
   - DO lifecycle and state management
   - Hibernation patterns
   - WebSocket handling
6. Task cloudflare-security-sentinel(PR content)
   - Workers security model
   - Secret management (wrangler secret)
   - CORS, CSP, and auth patterns
7. Task edge-performance-oracle(PR content)
   - Cold start optimization
   - Bundle size analysis
   - Edge caching strategies
8. Task cloudflare-pattern-specialist(PR content)
   - Cloudflare-specific patterns
   - Anti-patterns (stateful Workers, KV for strong consistency)
   - Idiomatic Cloudflare code

**Phase 2.5: Frontend Design Review (3 agents in parallel - if shadcn/ui components detected)**

If the PR includes React components with shadcn/ui:

9a. Task frontend-design-specialist(PR content)
    - Identify generic patterns (Inter fonts, purple gradients, minimal animations)
    - Map aesthetic improvements to Tailwind/shadcn/ui code
    - Prioritize distinctiveness opportunities
    - Ensure brand identity rather than a generic "AI aesthetic"
9b. Task shadcn-ui-architect(PR content)
    - Validate shadcn/ui component usage and props (via MCP if available)
    - Check customization depth (`ui` prop usage)
    - Ensure consistent component patterns
    - Prevent prop hallucination
9c. Task accessibility-guardian(PR content)
    - WCAG 2.1 AA compliance validation
    - Color contrast checking
    - Keyboard navigation validation
    - Screen reader support
    - Ensure the distinctive design remains accessible

**Phase 3: Architecture & Data (5 agents in parallel)**

10. Task cloudflare-architecture-strategist(PR content)
    - Workers/DO/KV/R2 architecture
    - Service binding strategies
    - Edge-first design
11. Task cloudflare-data-guardian(PR content)
    - KV/D1/R2 data integrity
    - Consistency models
    - Storage selection
12. Task kv-optimization-specialist(PR content)
    - TTL strategies
    - Key naming patterns
    - Batch operations
13. Task r2-storage-architect(PR content)
    - Upload patterns (multipart, streaming)
    - CDN integration
    - Lifecycle management
14. Task edge-caching-optimizer(PR content)
    - Cache hierarchies
    - Invalidation strategies
    - Performance optimization

**Phase 4: Specialized (3 agents in parallel)**

15. Task workers-ai-specialist(PR content)
    - Vercel AI SDK patterns
    - Cloudflare AI Agents
    - RAG implementations
16. Task code-simplicity-reviewer(PR content)
    - YAGNI enforcement
    - Complexity reduction
    - Minimalism review
17. Task feedback-codifier(PR content)
    - Extract patterns from the review
    - Update agent knowledge
    - Self-improvement loop
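To make the runtime checks concrete, here is a hedged sketch of the env-parameter pattern that workers-runtime-guardian validates (and that binding-context-analyzer's generated `Env` interface supports). The secret name `API_KEY` is illustrative; the commented-out lines show the forbidden-API patterns that get flagged.

```typescript
// ❌ Anti-patterns: Node globals are not available in the Workers V8 runtime.
// const apiKey = process.env.API_KEY;        // no process by default
// const cfg = fs.readFileSync("cfg.json");   // no fs
// const buf = Buffer.from("abc");            // no Buffer (without nodejs_compat)

// ✅ Workers pattern: read secrets and bindings from the env parameter.
export interface Env {
  API_KEY: string; // set via `wrangler secret put API_KEY`
}

export default {
  async fetch(_req: Request, env: Env): Promise<Response> {
    return new Response(env.API_KEY ? "configured" : "missing secret");
  },
};
```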
### 2. Ultra-Thinking Deep Dive Phases

For each phase below, spend maximum cognitive effort. Think step by step. Consider all angles. Question assumptions. Then bring all reviews together in a synthesis for the user, including a complete system context map with component interactions.

#### Phase 3: Stakeholder Perspective Analysis

ULTRA-THINK: Put yourself in each stakeholder's shoes. What matters to them? What are their pain points?

1. **Developer Perspective**
   - How easy is this to understand and modify?
   - Are the APIs intuitive?
   - Is debugging straightforward?
   - Can I test this easily?
2. **Operations Perspective**
   - How do I deploy this safely?
   - What metrics and logs are available?
   - How do I troubleshoot issues?
   - What are the resource requirements?
3. **End User Perspective**
   - Is the feature intuitive?
   - Are error messages helpful?
   - Is performance acceptable?
   - Does it solve my problem?
4. **Security Team Perspective**
   - What's the attack surface?
   - Are there compliance requirements?
   - How is data protected?
   - What are the audit capabilities?
5. **Business Perspective**
   - What's the ROI?
   - Are there legal/compliance risks?
   - How does this affect time-to-market?
   - What's the total cost of ownership?

#### Phase 4: Scenario Exploration

ULTRA-THINK: Explore edge cases and failure scenarios. What could go wrong? How does the system behave under stress?

- [ ] **Happy Path**: Normal operation with valid inputs
- [ ] **Invalid Inputs**: Null, empty, malformed data
- [ ] **Boundary Conditions**: Min/max values, empty collections
- [ ] **Concurrent Access**: Race conditions, deadlocks
- [ ] **Scale Testing**: 10x, 100x, 1000x normal load
- [ ] **Network Issues**: Timeouts, partial failures
- [ ] **Resource Exhaustion**: Memory, disk, connections
- [ ] **Security Attacks**: Injection, overflow, DoS
- [ ] **Data Corruption**: Partial writes, inconsistency
- [ ] **Cascading Failures**: Downstream service issues

### 3. Multi-Angle Review Perspectives

#### Technical Excellence Angle
- Code craftsmanship evaluation
- Engineering best practices
- Technical documentation quality
- Tooling and automation assessment

#### Business Value Angle
- Feature completeness validation
- Performance impact on users
- Cost-benefit analysis
- Time-to-market considerations

#### Risk Management Angle
- Security risk assessment
- Operational risk evaluation
- Compliance risk verification
- Technical debt accumulation

#### Team Dynamics Angle
- Code review etiquette
- Knowledge sharing effectiveness
- Collaboration patterns
- Mentoring opportunities

### 4. Simplification and Minimalism Review

Run Task code-simplicity-reviewer() to see whether the code can be simplified.

### 5. Findings Synthesis and Todo Creation

All findings MUST be converted to actionable todos in the CLI todo system.

#### Step 1: Synthesize All Findings

Consolidate all agent reports into a categorized list of findings. Remove duplicates, prioritize by severity and impact, and apply confidence scoring to filter false positives.

- [ ] Collect findings from all parallel agents
- [ ] Categorize by type: security, performance, architecture, quality, etc.
- [ ] **Apply confidence scoring (0-100) to each finding**
- [ ] **Filter out findings below the 80-confidence threshold**
- [ ] Assign severity levels: 🔴 CRITICAL (P1), 🟡 IMPORTANT (P2), 🔵 NICE-TO-HAVE (P3)
- [ ] Remove duplicate or overlapping findings
- [ ] Estimate effort for each finding (Small/Medium/Large)

#### Confidence Scoring System (Adopted from Anthropic's code-review plugin)

Each finding receives an independent confidence score:

| Score | Meaning | Action |
|-------|---------|--------|
| **0-25** | Not confident; likely false positive | Auto-filter (don't show) |
| **26-50** | Somewhat confident; might be valid | Auto-filter (don't show) |
| **51-79** | Moderately confident; real but uncertain | Auto-filter (don't show) |
| **80-89** | Highly confident; real and important | ✅ Show to user |
| **90-100** | Absolutely certain; definitely real | ✅ Show to user (prioritize) |

**Confidence Threshold: 80** - Only findings scoring 80+ are surfaced to the user.

When scoring a finding, consider the following (a code sketch of the arithmetic appears after this list):

1. **Evidence Quality** (+20 points each):
   - [ ] Specific file and line number identified
   - [ ] Code snippet demonstrates the issue
   - [ ] Issue is in changed code (not pre-existing)
   - [ ] Clear violation of a documented standard
2. **False Positive Indicators** (-20 points each):
   - [ ] Issue exists in unchanged code
   - [ ] Would be caught by a linter/type checker
   - [ ] Has an explicit ignore comment
   - [ ] Is a style preference, not a bug
3. **Verification** (+10 points each):
   - [ ] Multiple agents flagged the same issue
   - [ ] CLAUDE.md or PREFERENCES.md mentions this pattern
   - [ ] Issue matches a known Cloudflare anti-pattern
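To pin down the arithmetic, here is a hedged TypeScript sketch of one way to implement the rubric above; the type and field names are hypothetical, and the score is clamped to the rubric's 0-100 scale.

```typescript
// Signals collected for one finding; names are illustrative.
interface FindingEvidence {
  hasLocation: boolean;        // specific file:line identified
  hasSnippet: boolean;         // code snippet demonstrates the issue
  inChangedCode: boolean;      // issue is in changed code, not pre-existing
  violatesStandard: boolean;   // clear violation of a documented standard
  falsePositiveFlags: number;  // count of false-positive indicators
  verificationFlags: number;   // count of verification signals
}

const CONFIDENCE_THRESHOLD = 80;

function scoreFinding(e: FindingEvidence): number {
  const evidence =
    [e.hasLocation, e.hasSnippet, e.inChangedCode, e.violatesStandard]
      .filter(Boolean).length * 20;          // +20 per evidence item
  const penalties = e.falsePositiveFlags * 20; // -20 per FP indicator
  const verification = e.verificationFlags * 10; // +10 per verification signal
  return Math.max(0, Math.min(100, evidence - penalties + verification));
}

// The process.env example below: 4 evidence signals + 1 verification = 90.
```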
Example scoring:

```
Finding: Using process.env in Worker
- Specific location: src/index.ts:45 (+20)
- Code snippet shows violation (+20)
- In changed code (+20)
- Violates Workers runtime rules (+20)
- Multiple agents flagged (+10)
= 90 confidence ✅ SHOW
```

```
Finding: Consider adding more comments
- No specific location (-20)
- Style preference (-20)
- Not in PREFERENCES.md (-10)
= 30 confidence ❌ FILTER
```

#### Step 2: Present Findings for Triage

For EACH finding (with confidence ≥80), present in this format:

```
---
Finding #X: [Brief Title]
Confidence: [Score]/100 ✅
Severity: 🔴 P1 / 🟡 P2 / 🔵 P3
Category: [Security/Performance/Architecture/Quality/etc.]

Description: [Detailed explanation of the issue or improvement]
Location: [file_path:line_number]
Problem: [What's wrong or could be better]
Impact: [Why this matters, what could happen]
Proposed Solution: [How to fix it]
Effort: Small/Medium/Large
Evidence:
- [Why confidence is high - specific indicators]
---

Do you want to add this to the todo list?
1. yes - create todo file
2. next - skip this finding
3. custom - modify before creating
```

**Note**: Findings with confidence <80 are automatically filtered and not shown.

#### Step 3: Create Todo Files for Approved Findings

When the user says "yes", create a properly formatted todo file:

1. **Determine the next issue ID:**
   ```bash
   ls todos/ | grep -o '^[0-9]\+' | sort -n | tail -1
   ```
2. **Generate the filename:**
   ```
   {next_id}-pending-{priority}-{brief-description}.md
   ```
   Example: `042-pending-p1-sql-injection-risk.md`
3. **Create the file from the template:**
   ```bash
   cp todos/000-pending-p1-TEMPLATE.md todos/{new_filename}
   ```
4. **Populate it with the finding data:**
   ```yaml
   ---
   status: pending
   priority: p1 # or p2, p3 based on severity
   issue_id: "042"
   tags: [code-review, security, workers] # add relevant tags
   dependencies: []
   ---

   # [Finding Title]

   ## Problem Statement
   [Detailed description from finding]

   ## Findings
   - Discovered during code review by [agent names]
   - Location: [file_path:line_number]
   - [Key discoveries from agents]

   ## Proposed Solutions

   ### Option 1: [Primary solution from finding]
   - **Pros**: [Benefits]
   - **Cons**: [Drawbacks]
   - **Effort**: [Small/Medium/Large]
   - **Risk**: [Low/Medium/High]

   ## Recommended Action
   [Leave blank - needs manager triage]

   ## Technical Details
   - **Affected Files**: [List from finding]
   - **Related Components**: [Workers, Durable Objects, bindings, services affected]
   - **Database Changes**: [Yes/No - describe if yes]

   ## Resources
   - Code review PR: [PR link if applicable]
   - Related findings: [Other finding numbers]
   - Agent reports: [Which agents flagged this]

   ## Acceptance Criteria
   - [ ] [Specific criteria based on solution]
   - [ ] Tests pass
   - [ ] Code reviewed

   ## Work Log

   ### {date} - Code Review Discovery
   **By:** Claude Code Review System
   **Actions:**
   - Discovered during comprehensive code review
   - Analyzed by multiple specialized agents
   - Categorized and prioritized

   **Learnings:**
   - [Key insights from agent analysis]

   ## Notes
   Source: Code review performed on {date}
   Review command: /workflows:review {arguments}
   ```
5. **Track creation:** Add the new todo to the TodoWrite list if tracking multiple findings
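Steps 1 and 2 above can be mechanized. Here is a hedged TypeScript sketch of the ID and filename logic, assuming the review tooling runs locally under Node (never inside a Worker); the function names are illustrative.

```typescript
import { readdirSync } from "node:fs";

// Find the highest leading numeric ID in todos/ and return the next one,
// zero-padded to three digits (mirrors the `ls | grep | sort | tail` pipeline).
function nextIssueId(todosDir = "todos"): string {
  const ids = readdirSync(todosDir)
    .map((f) => /^(\d+)-/.exec(f)?.[1])
    .filter((m): m is string => m !== undefined)
    .map(Number);
  const next = ids.length ? Math.max(...ids) + 1 : 1;
  return String(next).padStart(3, "0");
}

// Build `{next_id}-pending-{priority}-{brief-description}.md`.
function todoFilename(priority: "p1" | "p2" | "p3", title: string): string {
  const slug = title.toLowerCase().replace(/[^a-z0-9]+/g, "-").replace(/(^-|-$)/g, "");
  return `${nextIssueId()}-pending-${priority}-${slug}.md`;
}

// todoFilename("p1", "SQL injection risk") -> "042-pending-p1-sql-injection-risk.md"
// (assuming the highest existing ID is 041)
```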
#### Step 4: Summary Report

After processing all findings:

```markdown
## Code Review Complete

**Review Target:** [PR number or branch]
**Total Findings:** [X] (from all agents)
**High-Confidence (≥80):** [Y] (shown to user)
**Filtered (<80):** [Z] (auto-removed as likely false positives)
**Todos Created:** [W]

### Confidence Distribution:
- 90-100 (certain): [count]
- 80-89 (confident): [count]
- <80 (filtered): [count]

### Created Todos:
- `{issue_id}-pending-p1-{description}.md` - {title} (confidence: 95)
- `{issue_id}-pending-p2-{description}.md` - {title} (confidence: 85)
...

### Skipped Findings (User Choice):
- [Finding #Z]: {reason}
...

### Auto-Filtered (Low Confidence):
- [X] findings filtered with confidence <80
- Run with the `--show-all` flag to see filtered findings

### Next Steps:
1. Triage pending todos: `ls todos/*-pending-*.md`
2. Use `/triage` to review and approve
3. Work on approved items: `/resolve_todo_parallel`
```

#### Alternative: Batch Creation

If the user wants to convert all findings to todos without individual review:

```bash
# Ask: "Create todos for all X findings? (yes/no/show-critical-only)"
# If yes: create todo files for all findings in parallel
# If show-critical-only: only present P1 findings for triage
```
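As a sketch of that batch-mode branching (types and names are hypothetical; the 80-confidence filter still applies in every mode):

```typescript
type Severity = "P1" | "P2" | "P3";
interface Finding { title: string; severity: Severity; confidence: number }

// Select which findings become todos, based on the user's batch answer.
function selectForBatch(
  findings: Finding[],
  mode: "yes" | "no" | "show-critical-only",
): Finding[] {
  const confident = findings.filter((f) => f.confidence >= 80);
  if (mode === "no") return [];
  if (mode === "show-critical-only") return confident.filter((f) => f.severity === "P1");
  return confident; // "yes": create todos for every high-confidence finding
}
```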