description: Perform exhaustive code reviews using multi-agent analysis and Git worktrees

Review Command

<command_purpose> Perform exhaustive code reviews using multi-agent analysis, ultra-thinking, and Git worktrees for deep local inspection. </command_purpose>

Introduction

You are a Senior Code Review Architect with expertise in security, performance, architecture, and quality assurance.

Prerequisites

- Git repository with GitHub CLI (`gh`) installed and authenticated
- Clean main/master branch
- Proper permissions to create worktrees and access the repository
- For document reviews: path to a markdown file or document

Main Tasks

1. Worktree Creation and Branch Checkout (ALWAYS FIRST)

<review_target> #$ARGUMENTS </review_target>

<critical_requirement> MUST create worktree FIRST to enable local code analysis. No exceptions. </critical_requirement>

First, I need to determine the review target type and set up the worktree. This enables all subsequent agents to analyze actual code, not just diffs.

Immediate Actions:

<task_list>

  • Determine review type: PR number (numeric), GitHub URL, file path (.md), or empty (latest PR)

  • Create worktree directory structure at $git_root/.worktrees/reviews/pr-$identifier

  • Check out PR branch in isolated worktree using gh pr checkout

  • Navigate to worktree - ALL subsequent analysis happens here

  • Fetch PR metadata using gh pr view --json for title, body, files, linked issues

  • Clone the PR branch into the worktree with full history using gh pr checkout $identifier

  • Set up language-specific analysis tools

  • Prepare security scanning environment

Ensure that the worktree is set up correctly and that the PR is checked out. ONLY then proceed to the next step.

</task_list>
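
A minimal sketch of this setup, assuming the target resolves to a PR number; the detached-HEAD worktree and the variable names are illustrative choices, not part of the command:

```bash
# Resolve the repo root and the target PR (defaults to the most recent open PR).
git_root=$(git rev-parse --show-toplevel)
identifier=${1:-$(gh pr list --limit 1 --json number --jq '.[0].number')}

# Create an isolated worktree at the current commit (detached HEAD);
# gh pr checkout below switches it to the PR branch.
worktree="$git_root/.worktrees/reviews/pr-$identifier"
mkdir -p "$(dirname "$worktree")"
git -C "$git_root" worktree add --detach "$worktree"

# All subsequent analysis happens inside the worktree.
cd "$worktree"
gh pr checkout "$identifier"

# Fetch PR metadata: title, body, changed files, linked issues.
gh pr view "$identifier" --json title,body,files,closingIssuesReferences
```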

2. Verify Cloudflare Workers Project

Confirm this is a Cloudflare Workers project by checking for wrangler.toml. All Cloudflare-specific agents are used whether the project is written in TypeScript or JavaScript.

<project_verification>

Check for Cloudflare Workers indicators:

Required:

  • wrangler.toml - Cloudflare Workers configuration

Common:

  • package.json with wrangler dependency
  • TypeScript/JavaScript files (.ts, .js)
  • Worker entry point (typically src/index.ts or src/worker.ts)

If this is not a Cloudflare Workers project, warn the user and ask them to confirm before continuing.

</project_verification>
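
A hedged sketch of this verification step; the exact warning text and confirmation prompt are illustrative:

```bash
# Required indicator: wrangler.toml at the project root.
if [[ ! -f wrangler.toml ]]; then
  echo "Warning: wrangler.toml not found; this may not be a Cloudflare Workers project."
  read -r -p "Run Cloudflare-specific agents anyway? (y/N) " answer
  [[ "$answer" == "y" ]] || exit 1
fi

# Common (but not required) indicators.
grep -q '"wrangler"' package.json 2>/dev/null && echo "wrangler listed in package.json"
ls src/index.ts src/worker.ts 2>/dev/null && echo "Worker entry point present"
```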

3. Parallel Agents to Review the PR

<parallel_tasks>

Run ALL these agents in parallel. Cloudflare Workers projects are primarily TypeScript/JavaScript with edge-specific concerns.

Phase 1: Context Gathering (3 agents in parallel)

  1. Task binding-context-analyzer(PR content)

    • Parse wrangler.toml for bindings
    • Generate TypeScript Env interface
    • Provide context to other agents
  2. Task git-history-analyzer(PR content)

    • Analyze commit history and patterns
    • Identify code evolution
  3. Task repo-research-analyst(PR content)

    • Research codebase patterns
    • Document conventions

Phase 2: Cloudflare-Specific Review (5 agents in parallel)

  1. Task workers-runtime-guardian(PR content)

    • Runtime compatibility (V8, not Node.js)
    • Detect forbidden APIs (fs, process, Buffer); a scan sketch follows this list
    • Validate env parameter patterns
  2. Task durable-objects-architect(PR content)

    • DO lifecycle and state management
    • Hibernation patterns
    • WebSocket handling
  3. Task cloudflare-security-sentinel(PR content)

    • Workers security model
    • Secret management (wrangler secret)
    • CORS, CSP, auth patterns
  4. Task edge-performance-oracle(PR content)

    • Cold start optimization
    • Bundle size analysis
    • Edge caching strategies
  5. Task cloudflare-pattern-specialist(PR content)

    • Cloudflare-specific patterns
    • Anti-patterns (stateful Workers, KV for strong consistency)
    • Idiomatic Cloudflare code
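
To illustrate the forbidden-API check from agent 1 above, a rough scan of the PR's changed TypeScript/JavaScript files might look like this; the patterns are heuristics, not the agent's actual implementation:

```bash
# List files changed in the PR and grep for Node.js-only APIs that the
# Workers runtime does not provide (fs, process.env, Buffer).
gh pr view "$identifier" --json files --jq '.files[].path' |
  grep -E '\.(ts|js)$' |
  xargs grep -nE "from ['\"]fs['\"]|require\(['\"]fs['\"]\)|process\.env|Buffer\." \
  || echo "No forbidden API usage detected."
```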

Phase 2.5: Frontend Design Review (3 agents in parallel - if shadcn/ui components detected)

If the PR includes React components with shadcn/ui:

9a. Task frontend-design-specialist(PR content)

    • Identify generic patterns (Inter fonts, purple gradients, minimal animations)
    • Map aesthetic improvements to Tailwind/shadcn/ui code
    • Prioritize distinctiveness opportunities
    • Ensure brand identity vs generic "AI aesthetic"

9b. Task shadcn-ui-architect(PR content)

    • Validate shadcn/ui component usage and props (via MCP if available)
    • Check customization depth (ui prop usage)
    • Ensure consistent component patterns
    • Prevent prop hallucination

9c. Task accessibility-guardian(PR content)

    • WCAG 2.1 AA compliance validation
    • Color contrast checking
    • Keyboard navigation validation
    • Screen reader support
    • Ensure distinctive design remains accessible

Phase 3: Architecture & Data (5 agents in parallel)

  1. Task cloudflare-architecture-strategist(PR content)

    • Workers/DO/KV/R2 architecture
    • Service binding strategies
    • Edge-first design
  2. Task cloudflare-data-guardian(PR content)

    • KV/D1/R2 data integrity
    • Consistency models
    • Storage selection
  3. Task kv-optimization-specialist(PR content)

    • TTL strategies
    • Key naming patterns
    • Batch operations
  4. Task r2-storage-architect(PR content)

    • Upload patterns (multipart, streaming)
    • CDN integration
    • Lifecycle management
  5. Task edge-caching-optimizer(PR content)

    • Cache hierarchies
    • Invalidation strategies
    • Performance optimization

Phase 4: Specialized (3 agents in parallel)

  1. Task workers-ai-specialist(PR content)

    • Vercel AI SDK patterns
    • Cloudflare AI Agents
    • RAG implementations
  2. Task code-simplicity-reviewer(PR content)

    • YAGNI enforcement
    • Complexity reduction
    • Minimalism review
  3. Task feedback-codifier(PR content)

    • Extract patterns from review
    • Update agent knowledge
    • Self-improvement loop

</parallel_tasks>

4. Ultra-Thinking Deep Dive Phases

<ultrathink_instruction> For each phase below, spend maximum cognitive effort. Think step by step. Consider all angles. Question assumptions. Then synthesize all reviews into a single report for the user. </ultrathink_instruction>

Complete system context map with component interactions

Phase 3: Stakeholder Perspective Analysis

<thinking_prompt> ULTRA-THINK: Put yourself in each stakeholder's shoes. What matters to them? What are their pain points? </thinking_prompt>

<stakeholder_perspectives>

  1. Developer Perspective

    • How easy is this to understand and modify?
    • Are the APIs intuitive?
    • Is debugging straightforward?
    • Can I test this easily?
  2. Operations Perspective

    • How do I deploy this safely?
    • What metrics and logs are available?
    • How do I troubleshoot issues?
    • What are the resource requirements?
  3. End User Perspective

    • Is the feature intuitive?
    • Are error messages helpful?
    • Is performance acceptable?
    • Does it solve my problem?
  4. Security Team Perspective

    • What's the attack surface?
    • Are there compliance requirements?
    • How is data protected?
    • What are the audit capabilities?
  5. Business Perspective

    • What's the ROI?
    • Are there legal/compliance risks?
    • How does this affect time-to-market?
    • What's the total cost of ownership?

</stakeholder_perspectives>

Phase 4: Scenario Exploration

<thinking_prompt> ULTRA-THINK: Explore edge cases and failure scenarios. What could go wrong? How does the system behave under stress? </thinking_prompt>

<scenario_checklist>

  • Happy Path: Normal operation with valid inputs
  • Invalid Inputs: Null, empty, malformed data
  • Boundary Conditions: Min/max values, empty collections
  • Concurrent Access: Race conditions, deadlocks
  • Scale Testing: 10x, 100x, 1000x normal load
  • Network Issues: Timeouts, partial failures
  • Resource Exhaustion: Memory, disk, connections
  • Security Attacks: Injection, overflow, DoS
  • Data Corruption: Partial writes, inconsistency
  • Cascading Failures: Downstream service issues

</scenario_checklist>

5. Multi-Angle Review Perspectives

Technical Excellence Angle

  • Code craftsmanship evaluation
  • Engineering best practices
  • Technical documentation quality
  • Tooling and automation assessment

Business Value Angle

  • Feature completeness validation
  • Performance impact on users
  • Cost-benefit analysis
  • Time-to-market considerations

Risk Management Angle

  • Security risk assessment
  • Operational risk evaluation
  • Compliance risk verification
  • Technical debt accumulation

Team Dynamics Angle

  • Code review etiquette
  • Knowledge sharing effectiveness
  • Collaboration patterns
  • Mentoring opportunities

6. Simplification and Minimalism Review

Run the Task code-simplicity-reviewer() to see if we can simplify the code.

7. Findings Synthesis and Todo Creation

<critical_requirement> All findings MUST be converted to actionable todos in the CLI todo system </critical_requirement>

Step 1: Synthesize All Findings

Consolidate all agent reports into a categorized list of findings. Remove duplicates, prioritize by severity and impact. Apply confidence scoring to filter false positives.

<synthesis_tasks>

  • Collect findings from all parallel agents
  • Categorize by type: security, performance, architecture, quality, etc.
  • Apply confidence scoring (0-100) to each finding
  • Filter out findings below 80 confidence threshold
  • Assign severity levels: 🔴 CRITICAL (P1), 🟡 IMPORTANT (P2), 🔵 NICE-TO-HAVE (P3)
  • Remove duplicate or overlapping findings
  • Estimate effort for each finding (Small/Medium/Large)

</synthesis_tasks>

Confidence Scoring System (Adopted from Anthropic's code-review plugin)

Each finding receives an independent confidence score:

| Score  | Meaning                                  | Action                    |
|--------|------------------------------------------|---------------------------|
| 0-25   | Not confident; likely false positive     | Auto-filter (don't show)  |
| 26-50  | Somewhat confident; might be valid       | Auto-filter (don't show)  |
| 51-79  | Moderately confident; real but uncertain | Auto-filter (don't show)  |
| 80-89  | Highly confident; real and important     | Show to user              |
| 90-100 | Absolutely certain; definitely real      | Show to user (prioritize) |

Confidence Threshold: 80 - Only findings scoring 80+ are surfaced to the user.
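
For illustration only: if agent findings were collected into a findings.json array with a numeric confidence field (a hypothetical intermediate format, not defined by this command), the threshold could be applied with jq:

```bash
# Keep only findings at or above the 80-point threshold, highest confidence first.
jq '[.[] | select(.confidence >= 80)] | sort_by(-.confidence)' findings.json
```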

<confidence_criteria> When scoring a finding, consider:

  1. Evidence Quality (+20 points each):

    • Specific file and line number identified
    • Code snippet demonstrates the issue
    • Issue is in changed code (not pre-existing)
    • Clear violation of documented standard
  2. False Positive Indicators (-20 points each):

    • Issue exists in unchanged code
    • Would be caught by linter/type checker
    • Has explicit ignore comment
    • Is a style preference, not a bug
  3. Verification (+10 points each):

    • Multiple agents flagged same issue
    • CLAUDE.md or PREFERENCES.md mentions this pattern
    • Issue matches known Cloudflare anti-pattern

Example scoring:

Finding: Using process.env in Worker
- Specific location: src/index.ts:45 (+20)
- Code snippet shows violation (+20)
- In changed code (+20)
- Violates Workers runtime rules (+20)
- Multiple agents flagged (+10)
= 90 confidence ✅ SHOW

Finding: Consider adding more comments
- No specific location (-20)
- Style preference (-20)
- Not in PREFERENCES.md (-10)
= 30 confidence ❌ FILTER

</confidence_criteria>

Step 2: Present Findings for Triage

For EACH finding (with confidence ≥80), present in this format:

---
Finding #X: [Brief Title]

Confidence: [Score]/100 ✅
Severity: 🔴 P1 / 🟡 P2 / 🔵 P3

Category: [Security/Performance/Architecture/Quality/etc.]

Description:
[Detailed explanation of the issue or improvement]

Location: [file_path:line_number]

Problem:
[What's wrong or could be better]

Impact:
[Why this matters, what could happen]

Proposed Solution:
[How to fix it]

Effort: Small/Medium/Large

Evidence:
- [Why confidence is high - specific indicators]

---
Do you want to add this to the todo list?
1. yes - create todo file
2. next - skip this finding
3. custom - modify before creating

Note: Findings with confidence <80 are automatically filtered and not shown.

Step 3: Create Todo Files for Approved Findings

When user says "yes", create a properly formatted todo file:

<todo_creation_process>

  1. Determine next issue ID:

    ls todos/ | grep -o '^[0-9]\+' | sort -n | tail -1
    
  2. Generate filename:

    {next_id}-pending-{priority}-{brief-description}.md
    

    Example: 042-pending-p1-sql-injection-risk.md

  3. Create file from template:

    cp todos/000-pending-p1-TEMPLATE.md todos/{new_filename}
    
  4. Populate with finding data:

    ---
    status: pending
    priority: p1  # or p2, p3 based on severity
    issue_id: "042"
    tags: [code-review, security, cloudflare-workers]  # add relevant tags
    dependencies: []
    ---
    
    # [Finding Title]
    
    ## Problem Statement
    [Detailed description from finding]
    
    ## Findings
    - Discovered during code review by [agent names]
    - Location: [file_path:line_number]
    - [Key discoveries from agents]
    
    ## Proposed Solutions
    
    ### Option 1: [Primary solution from finding]
    - **Pros**: [Benefits]
    - **Cons**: [Drawbacks]
    - **Effort**: [Small/Medium/Large]
    - **Risk**: [Low/Medium/High]
    
    ## Recommended Action
    [Leave blank - needs manager triage]
    
    ## Technical Details
    - **Affected Files**: [List from finding]
    - **Related Components**: [Workers, Durable Objects, bindings, or services affected]
    - **Database Changes**: [Yes/No - describe if yes]
    
    ## Resources
    - Code review PR: [PR link if applicable]
    - Related findings: [Other finding numbers]
    - Agent reports: [Which agents flagged this]
    
    ## Acceptance Criteria
    - [ ] [Specific criteria based on solution]
    - [ ] Tests pass
    - [ ] Code reviewed
    
    ## Work Log
    
    ### {date} - Code Review Discovery
    **By:** Claude Code Review System
    **Actions:**
    - Discovered during comprehensive code review
    - Analyzed by multiple specialized agents
    - Categorized and prioritized
    
    **Learnings:**
    - [Key insights from agent analysis]
    
    ## Notes
    Source: Code review performed on {date}
    Review command: /workflows:review {arguments}
    
  5. Track creation: Add to TodoWrite list if tracking multiple findings

</todo_creation_process>
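
Taken together, steps 1-3 might be scripted roughly as follows; the priority and slug values shown are placeholders taken from the approved finding:

```bash
# Next zero-padded issue ID, derived from existing todo filenames.
last_id=$(ls todos/ | grep -o '^[0-9]\+' | sort -n | tail -1)
next_id=$(printf '%03d' $((10#${last_id:-0} + 1)))

# Build the filename from the finding's priority and a brief description.
priority="p1"                      # p1/p2/p3, mapped from severity
slug="sql-injection-risk"          # brief-description from the finding
new_file="todos/${next_id}-pending-${priority}-${slug}.md"

# Copy the template, then populate it with the finding data (step 4).
cp todos/000-pending-p1-TEMPLATE.md "$new_file"
echo "Created $new_file"
```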

Step 4: Summary Report

After processing all findings:

## Code Review Complete

**Review Target:** [PR number or branch]
**Total Findings:** [X] (from all agents)
**High-Confidence (≥80):** [Y] (shown to user)
**Filtered (<80):** [Z] (auto-removed as likely false positives)
**Todos Created:** [W]

### Confidence Distribution:
- 90-100 (certain): [count]
- 80-89 (confident): [count]
- <80 (filtered): [count]

### Created Todos:
- `{issue_id}-pending-p1-{description}.md` - {title} (confidence: 95)
- `{issue_id}-pending-p2-{description}.md` - {title} (confidence: 85)
...

### Skipped Findings (User Choice):
- [Finding #Z]: {reason}
...

### Auto-Filtered (Low Confidence):
- [X] findings filtered with confidence <80
- Run with `--show-all` flag to see filtered findings

### Next Steps:
1. Triage pending todos: `ls todos/*-pending-*.md`
2. Use `/triage` to review and approve
3. Work on approved items: `/resolve_todo_parallel`

Alternative: Batch Creation

If user wants to convert all findings to todos without review:

# Ask: "Create todos for all X findings? (yes/no/show-critical-only)"
# If yes: create todo files for all findings in parallel
# If show-critical-only: only present P1 findings for triage
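
A hedged sketch of the batch path, assuming the approved findings have been written out as tab-separated priority, slug, and title fields (a hypothetical findings.tsv produced during synthesis, not something this command emits as written):

```bash
# Create one todo file per finding, continuing the zero-padded ID sequence.
next_id=$(ls todos/ | grep -o '^[0-9]\+' | sort -n | tail -1)
while IFS=$'\t' read -r priority slug title; do
  next_id=$(printf '%03d' $((10#${next_id:-0} + 1)))
  file="todos/${next_id}-pending-${priority}-${slug}.md"
  cp todos/000-pending-p1-TEMPLATE.md "$file"
  echo "Created $file  ($title)"
done < findings.tsv
```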