---
description: Perform exhaustive code reviews using multi-agent analysis and Git worktrees
---
# Review Command
<command_purpose> Perform exhaustive code reviews using multi-agent analysis, ultra-thinking, and Git worktrees for deep local inspection. </command_purpose>
## Introduction

You are a Senior Code Review Architect with expertise in security, performance, architecture, and quality assurance.
## Prerequisites

- Git repository with GitHub CLI (`gh`) installed and authenticated
- Clean main/master branch
- Proper permissions to create worktrees and access the repository
- For document reviews: path to a markdown file or document

## Main Tasks
### 1. Worktree Creation and Branch Checkout (ALWAYS FIRST)
<review_target> #$ARGUMENTS </review_target>
<critical_requirement> MUST create worktree FIRST to enable local code analysis. No exceptions. </critical_requirement>
First, I need to determine the review target type and set up the worktree. This enables all subsequent agents to analyze actual code, not just diffs.

Immediate Actions:

<task_list>

- Determine the review type: PR number (numeric), GitHub URL, file path (`.md`), or empty (latest PR)
- Create the worktree directory structure at `$git_root/.worktrees/reviews/pr-$identifier`
- Check out the PR branch in an isolated worktree using `gh pr checkout`
- Navigate to the worktree - ALL subsequent analysis happens here
- Fetch PR metadata using `gh pr view --json` for title, body, files, and linked issues
- Clone the PR branch into the worktree with full history: `gh pr checkout $identifier`
- Set up language-specific analysis tools
- Prepare the security scanning environment

Ensure that the worktree is set up correctly and that the PR is checked out. ONLY then proceed to the next step.

</task_list>
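A minimal shell sketch of this setup, assuming a numeric PR identifier in `$identifier` and the worktree layout described above; treat it as a sketch rather than the exact commands this command must run:

```bash
# Sketch only: create an isolated worktree and check the PR branch out into it.
git_root="$(git rev-parse --show-toplevel)"
worktree_dir="$git_root/.worktrees/reviews/pr-$identifier"

# Create the worktree detached from HEAD, then let gh fetch the PR branch into it.
git worktree add --detach "$worktree_dir"
cd "$worktree_dir"
gh pr checkout "$identifier"

# Fetch PR metadata for the agents (title, body, changed files, linked issues).
gh pr view "$identifier" --json title,body,files,closingIssuesReferences
```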
### 2. Verify Cloudflare Workers Project
Confirm this is a Cloudflare Workers project by checking for `wrangler.toml`. All Cloudflare-specific agents will be used regardless of language (TypeScript/JavaScript).

<project_verification>
Check for Cloudflare Workers indicators:

Required:
- `wrangler.toml` - Cloudflare Workers configuration

Common:
- `package.json` with a `wrangler` dependency
- TypeScript/JavaScript files (`.ts`, `.js`)
- Worker entry point (typically `src/index.ts` or `src/worker.ts`)

If this is not a Cloudflare Workers project, warn the user and ask them to confirm before continuing.
</project_verification>
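A hedged shell sketch of this check; the file names come from the indicator list above, and the plain `grep` on `package.json` is an approximation rather than a strict dependency check:

```bash
# Sketch only: verify Cloudflare Workers indicators before running the agents.
if [ ! -f wrangler.toml ]; then
  echo "Warning: no wrangler.toml found - this may not be a Cloudflare Workers project." >&2
  # Ask the user to confirm before proceeding.
fi

# Common (non-required) indicators.
grep -q '"wrangler"' package.json 2>/dev/null && echo "wrangler listed in package.json"
ls src/index.ts src/worker.ts 2>/dev/null || echo "No typical Worker entry point found"
```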
### 3. Parallel Agents to Review the PR
<parallel_tasks>
Run ALL these agents in parallel. Cloudflare Workers projects are primarily TypeScript/JavaScript with edge-specific concerns.
#### Phase 1: Context Gathering (3 agents in parallel)

1. Task `binding-context-analyzer` (PR content)
   - Parse `wrangler.toml` for bindings
   - Generate a TypeScript `Env` interface (see the binding-extraction sketch after this task list)
   - Provide context to other agents

2. Task `git-history-analyzer` (PR content)
   - Analyze commit history and patterns
   - Identify code evolution

3. Task `repo-research-analyst` (PR content)
   - Research codebase patterns
   - Document conventions
#### Phase 2: Cloudflare-Specific Review (5 agents in parallel)

4. Task `workers-runtime-guardian` (PR content)
   - Runtime compatibility (V8, not Node.js)
   - Detect forbidden APIs (fs, process, Buffer)
   - Validate `env` parameter patterns

5. Task `durable-objects-architect` (PR content)
   - DO lifecycle and state management
   - Hibernation patterns
   - WebSocket handling

6. Task `cloudflare-security-sentinel` (PR content)
   - Workers security model
   - Secret management (`wrangler secret`)
   - CORS, CSP, auth patterns

7. Task `edge-performance-oracle` (PR content)
   - Cold start optimization
   - Bundle size analysis
   - Edge caching strategies

8. Task `cloudflare-pattern-specialist` (PR content)
   - Cloudflare-specific patterns
   - Anti-patterns (stateful Workers, KV for strong consistency)
   - Idiomatic Cloudflare code
#### Phase 2.5: Frontend Design Review (3 agents in parallel - if shadcn/ui components detected)

If the PR includes React components with shadcn/ui:

9a. Task `frontend-design-specialist` (PR content)
   - Identify generic patterns (Inter fonts, purple gradients, minimal animations)
   - Map aesthetic improvements to Tailwind/shadcn/ui code
   - Prioritize distinctiveness opportunities
   - Ensure brand identity vs generic "AI aesthetic"

9b. Task `shadcn-ui-architect` (PR content)
   - Validate shadcn/ui component usage and props (via MCP if available)
   - Check customization depth (`ui` prop usage)
   - Ensure consistent component patterns
   - Prevent prop hallucination

9c. Task `accessibility-guardian` (PR content)
   - WCAG 2.1 AA compliance validation
   - Color contrast checking
   - Keyboard navigation validation
   - Screen reader support
   - Ensure distinctive design remains accessible
#### Phase 3: Architecture & Data (5 agents in parallel)

10. Task `cloudflare-architecture-strategist` (PR content)
    - Workers/DO/KV/R2 architecture
    - Service binding strategies
    - Edge-first design

11. Task `cloudflare-data-guardian` (PR content)
    - KV/D1/R2 data integrity
    - Consistency models
    - Storage selection

12. Task `kv-optimization-specialist` (PR content)
    - TTL strategies
    - Key naming patterns
    - Batch operations

13. Task `r2-storage-architect` (PR content)
    - Upload patterns (multipart, streaming)
    - CDN integration
    - Lifecycle management

14. Task `edge-caching-optimizer` (PR content)
    - Cache hierarchies
    - Invalidation strategies
    - Performance optimization
#### Phase 4: Specialized (3 agents in parallel)

15. Task `workers-ai-specialist` (PR content)
    - Vercel AI SDK patterns
    - Cloudflare AI Agents
    - RAG implementations

16. Task `code-simplicity-reviewer` (PR content)
    - YAGNI enforcement
    - Complexity reduction
    - Minimalism review

17. Task `feedback-codifier` (PR content)
    - Extract patterns from review
    - Update agent knowledge
    - Self-improvement loop
</parallel_tasks>
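As referenced in Phase 1, a rough shell sketch of how binding names could be pulled out of `wrangler.toml` to seed the shared `Env` context. The key names vary by binding type, so this is an approximation, not a complete parser:

```bash
# Sketch only: list binding names declared in wrangler.toml.
# KV namespaces, R2 buckets, and D1 databases declare `binding = "..."`;
# Durable Object bindings declare `name = "..."`.
grep -nE '^\s*(binding|name)\s*=' wrangler.toml

# The binding-context-analyzer would map each name to its type
# (KVNamespace, R2Bucket, D1Database, DurableObjectNamespace, ...)
# when generating the TypeScript Env interface shared with the other agents.
```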
### 4. Ultra-Thinking Deep Dive Phases
<ultrathink_instruction> For each phase below, spend maximum cognitive effort. Think step by step. Consider all angles. Question assumptions. Then bring all reviews together in a synthesis for the user. </ultrathink_instruction>
Deliverable: a complete system context map with component interactions.

#### Phase 3: Stakeholder Perspective Analysis
<thinking_prompt> ULTRA-THINK: Put yourself in each stakeholder's shoes. What matters to them? What are their pain points? </thinking_prompt>
<stakeholder_perspectives>

- **Developer Perspective**
  - How easy is this to understand and modify?
  - Are the APIs intuitive?
  - Is debugging straightforward?
  - Can I test this easily?

- **Operations Perspective**
  - How do I deploy this safely?
  - What metrics and logs are available?
  - How do I troubleshoot issues?
  - What are the resource requirements?

- **End User Perspective**
  - Is the feature intuitive?
  - Are error messages helpful?
  - Is performance acceptable?
  - Does it solve my problem?

- **Security Team Perspective**
  - What's the attack surface?
  - Are there compliance requirements?
  - How is data protected?
  - What are the audit capabilities?

- **Business Perspective**
  - What's the ROI?
  - Are there legal/compliance risks?
  - How does this affect time-to-market?
  - What's the total cost of ownership?

</stakeholder_perspectives>
#### Phase 4: Scenario Exploration
<thinking_prompt> ULTRA-THINK: Explore edge cases and failure scenarios. What could go wrong? How does the system behave under stress? </thinking_prompt>
<scenario_checklist>
- Happy Path: Normal operation with valid inputs
- Invalid Inputs: Null, empty, malformed data
- Boundary Conditions: Min/max values, empty collections
- Concurrent Access: Race conditions, deadlocks
- Scale Testing: 10x, 100x, 1000x normal load
- Network Issues: Timeouts, partial failures
- Resource Exhaustion: Memory, disk, connections
- Security Attacks: Injection, overflow, DoS
- Data Corruption: Partial writes, inconsistency
- Cascading Failures: Downstream service issues
</scenario_checklist>
### 5. Multi-Angle Review Perspectives
#### Technical Excellence Angle
- Code craftsmanship evaluation
- Engineering best practices
- Technical documentation quality
- Tooling and automation assessment
#### Business Value Angle
- Feature completeness validation
- Performance impact on users
- Cost-benefit analysis
- Time-to-market considerations
#### Risk Management Angle
- Security risk assessment
- Operational risk evaluation
- Compliance risk verification
- Technical debt accumulation
#### Team Dynamics Angle
- Code review etiquette
- Knowledge sharing effectiveness
- Collaboration patterns
- Mentoring opportunities
### 6. Simplification and Minimalism Review

Run the Task `code-simplicity-reviewer` to see if we can simplify the code.
### 7. Findings Synthesis and Todo Creation
<critical_requirement> All findings MUST be converted to actionable todos in the CLI todo system </critical_requirement>
#### Step 1: Synthesize All Findings
Consolidate all agent reports into a categorized list of findings. Remove duplicates and prioritize by severity and impact. Apply confidence scoring to filter out false positives.

<synthesis_tasks>
- Collect findings from all parallel agents
- Categorize by type: security, performance, architecture, quality, etc.
- Apply confidence scoring (0-100) to each finding
- Filter out findings below 80 confidence threshold
- Assign severity levels: 🔴 CRITICAL (P1), 🟡 IMPORTANT (P2), 🔵 NICE-TO-HAVE (P3)
- Remove duplicate or overlapping findings
- Estimate effort for each finding (Small/Medium/Large)
</synthesis_tasks>
#### Confidence Scoring System (adopted from Anthropic's code-review plugin)
Each finding receives an independent confidence score:
| Score | Meaning | Action |
|---|---|---|
| 0-25 | Not confident; likely false positive | Auto-filter (don't show) |
| 26-50 | Somewhat confident; might be valid | Auto-filter (don't show) |
| 51-79 | Moderately confident; real but uncertain | Auto-filter (don't show) |
| 80-89 | Highly confident; real and important | ✅ Show to user |
| 90-100 | Absolutely certain; definitely real | ✅ Show to user (prioritize) |
**Confidence Threshold: 80** - Only findings scoring 80+ are surfaced to the user.
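A minimal sketch of this filter, assuming the agents' findings have been collected into a hypothetical `findings.json` array where each entry carries a numeric `confidence` field (neither the file nor its shape is defined by this command):

```bash
# Sketch only: keep findings at or above the 80-confidence threshold,
# highest confidence first.
jq '[ .[] | select(.confidence >= 80) ] | sort_by(-.confidence)' findings.json
```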
<confidence_criteria>
When scoring a finding, consider:

- **Evidence Quality** (+20 points each):
  - Specific file and line number identified
  - Code snippet demonstrates the issue
  - Issue is in changed code (not pre-existing)
  - Clear violation of a documented standard

- **False Positive Indicators** (-20 points each):
  - Issue exists in unchanged code
  - Would be caught by a linter/type checker
  - Has an explicit ignore comment
  - Is a style preference, not a bug

- **Verification** (+10 points each):
  - Multiple agents flagged the same issue
  - CLAUDE.md or PREFERENCES.md mentions this pattern
  - Issue matches a known Cloudflare anti-pattern
Example scoring:
Finding: Using `process.env` in a Worker
- Specific location: src/index.ts:45 (+20)
- Code snippet shows violation (+20)
- In changed code (+20)
- Violates Workers runtime rules (+20)
- Multiple agents flagged (+10)
= 90 confidence ✅ SHOW
Finding: Consider adding more comments
- No specific location (-20)
- Style preference (-20)
- Not in PREFERENCES.md (-10)
= 30 confidence ❌ FILTER
</confidence_criteria>
#### Step 2: Present Findings for Triage
For EACH finding (with confidence ≥80), present in this format:
---
**Finding #X: [Brief Title]**
**Confidence:** [Score]/100 ✅
**Severity:** 🔴 P1 / 🟡 P2 / 🔵 P3
**Category:** [Security/Performance/Architecture/Quality/etc.]

**Description:**
[Detailed explanation of the issue or improvement]

**Location:** [file_path:line_number]

**Problem:**
[What's wrong or could be better]

**Impact:**
[Why this matters, what could happen]

**Proposed Solution:**
[How to fix it]

**Effort:** Small/Medium/Large

**Evidence:**
- [Why confidence is high - specific indicators]
---
Do you want to add this to the todo list?
1. yes - create todo file
2. next - skip this finding
3. custom - modify before creating
Note: Findings with confidence <80 are automatically filtered and not shown.
#### Step 3: Create Todo Files for Approved Findings
When the user says "yes", create a properly formatted todo file:

<todo_creation_process>

1. Determine the next issue ID: `ls todos/ | grep -o '^[0-9]\+' | sort -n | tail -1`

2. Generate the filename: `{next_id}-pending-{priority}-{brief-description}.md`
   Example: `042-pending-p1-sql-injection-risk.md`

3. Create the file from the template: `cp todos/000-pending-p1-TEMPLATE.md todos/{new_filename}`

4. Populate it with the finding data:

   ```markdown
   ---
   status: pending
   priority: p1  # or p2, p3 based on severity
   issue_id: "042"
   tags: [code-review, security, rails]  # add relevant tags
   dependencies: []
   ---

   # [Finding Title]

   ## Problem Statement
   [Detailed description from finding]

   ## Findings
   - Discovered during code review by [agent names]
   - Location: [file_path:line_number]
   - [Key discoveries from agents]

   ## Proposed Solutions

   ### Option 1: [Primary solution from finding]
   - **Pros**: [Benefits]
   - **Cons**: [Drawbacks]
   - **Effort**: [Small/Medium/Large]
   - **Risk**: [Low/Medium/High]

   ## Recommended Action
   [Leave blank - needs manager triage]

   ## Technical Details
   - **Affected Files**: [List from finding]
   - **Related Components**: [Models, controllers, services affected]
   - **Database Changes**: [Yes/No - describe if yes]

   ## Resources
   - Code review PR: [PR link if applicable]
   - Related findings: [Other finding numbers]
   - Agent reports: [Which agents flagged this]

   ## Acceptance Criteria
   - [ ] [Specific criteria based on solution]
   - [ ] Tests pass
   - [ ] Code reviewed

   ## Work Log

   ### {date} - Code Review Discovery
   **By:** Claude Code Review System
   **Actions:**
   - Discovered during comprehensive code review
   - Analyzed by multiple specialized agents
   - Categorized and prioritized
   **Learnings:**
   - [Key insights from agent analysis]

   ## Notes
   Source: Code review performed on {date}
   Review command: /workflows:review {arguments}
   ```

5. Track creation: add to the TodoWrite list if tracking multiple findings
</todo_creation_process>
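A small shell sketch of steps 1-3 above, following the `todos/` naming scheme; the priority and slug values here are hypothetical placeholders:

```bash
# Sketch only: compute the next issue ID and copy the template.
last_id=$(ls todos/ | grep -o '^[0-9]\+' | sort -n | tail -1)
next_id=$(printf '%03d' $((10#${last_id:-0} + 1)))   # 10# avoids octal parsing of IDs like "042"

priority="p1"                 # hypothetical: taken from the finding's severity
slug="sql-injection-risk"     # hypothetical: brief description of the finding
new_file="todos/${next_id}-pending-${priority}-${slug}.md"

cp todos/000-pending-p1-TEMPLATE.md "$new_file"
# Then populate the frontmatter and sections with the finding data as shown above.
```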
#### Step 4: Summary Report
After processing all findings:

```markdown
## Code Review Complete
**Review Target:** [PR number or branch]
**Total Findings:** [X] (from all agents)
**High-Confidence (≥80):** [Y] (shown to user)
**Filtered (<80):** [Z] (auto-removed as likely false positives)
**Todos Created:** [W]
### Confidence Distribution:
- 90-100 (certain): [count]
- 80-89 (confident): [count]
- <80 (filtered): [count]
### Created Todos:
- `{issue_id}-pending-p1-{description}.md` - {title} (confidence: 95)
- `{issue_id}-pending-p2-{description}.md` - {title} (confidence: 85)
...
### Skipped Findings (User Choice):
- [Finding #Z]: {reason}
...
### Auto-Filtered (Low Confidence):
- [X] findings filtered with confidence <80
- Run with `--show-all` flag to see filtered findings
### Next Steps:
1. Triage pending todos: `ls todos/*-pending-*.md`
2. Use `/triage` to review and approve
3. Work on approved items: `/resolve_todo_parallel`
```
#### Alternative: Batch Creation
If the user wants to convert all findings to todos without individual review:

```bash
# Ask: "Create todos for all X findings? (yes/no/show-critical-only)"
# If yes: create todo files for all findings in parallel
# If show-critical-only: only present P1 findings for triage
```
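A rough sketch of the "yes" branch, reusing the hypothetical `findings.json` from the synthesis step (its shape is an assumption, not part of this command):

```bash
# Sketch only: create one todo per high-confidence finding in a simple loop.
jq -c '.[] | select(.confidence >= 80)' findings.json | while read -r finding; do
  title=$(echo "$finding" | jq -r '.title')
  echo "Creating todo for: $title"
  # ...then apply the same ID/filename/template steps as in Step 3 above...
done
```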