Initial commit
This commit is contained in:
546
commands/es-review.md
Normal file
546
commands/es-review.md
Normal file
@@ -0,0 +1,546 @@
|
||||
---
|
||||
description: Perform exhaustive code reviews using multi-agent analysis and Git worktrees
|
||||
---
|
||||
|
||||
# Review Command
|
||||
|
||||
<command_purpose> Perform exhaustive code reviews using multi-agent analysis, ultra-thinking, and Git worktrees for deep local inspection. </command_purpose>
|
||||
|
||||
## Introduction
|
||||
|
||||
<role>Senior Code Review Architect with expertise in security, performance, architecture, and quality assurance</role>
|
||||
|
||||
## Prerequisites
|
||||
|
||||
<requirements>
|
||||
- Git repository with GitHub CLI (`gh`) installed and authenticated
|
||||
- Clean main/master branch
|
||||
- Proper permissions to create worktrees and access the repository
|
||||
- For document reviews: Path to a markdown file or document
|
||||
</requirements>
|
||||
|
||||
## Main Tasks
|
||||
|
||||
### 1. Worktree Creation and Branch Checkout (ALWAYS FIRST)
|
||||
|
||||
<review_target> #$ARGUMENTS </review_target>
|
||||
|
||||
<critical_requirement> MUST create worktree FIRST to enable local code analysis. No exceptions. </critical_requirement>
|
||||
|
||||
<thinking>
|
||||
First, I need to determine the review target type and set up the worktree.
|
||||
This enables all subsequent agents to analyze actual code, not just diffs.
|
||||
</thinking>
|
||||
|
||||
#### Immediate Actions:
|
||||
|
||||
<task_list>
|
||||
|
||||
- [ ] Determine review type: PR number (numeric), GitHub URL, file path (.md), or empty (latest PR)
|
||||
- [ ] Create worktree directory structure at `$git_root/.worktrees/reviews/pr-$identifier`
|
||||
- [ ] Check out PR branch in isolated worktree using `gh pr checkout`
|
||||
- [ ] Navigate to worktree - ALL subsequent analysis happens here
|
||||
|
||||
- Fetch PR metadata using `gh pr view --json` for title, body, files, linked issues
|
||||
- Clone PR branch into worktree with full history `gh pr checkout $identifier`
|
||||
- Set up language-specific analysis tools
|
||||
- Prepare security scanning environment
|
||||
|
||||
Ensure that the worktree is set up correctly and that the PR is checked out. ONLY then proceed to the next step.
|
||||
|
||||
</task_list>
|
||||
|
||||
#### Verify Cloudflare Workers Project
|
||||
|
||||
<thinking>
|
||||
Confirm this is a Cloudflare Workers project by checking for wrangler.toml.
|
||||
All Cloudflare-specific agents will be used regardless of language (TypeScript/JavaScript).
|
||||
</thinking>
|
||||
|
||||
<project_verification>
|
||||
|
||||
Check for Cloudflare Workers indicators:
|
||||
|
||||
**Required**:
|
||||
- `wrangler.toml` - Cloudflare Workers configuration
|
||||
|
||||
**Common**:
|
||||
- `package.json` with `wrangler` dependency
|
||||
- TypeScript/JavaScript files (`.ts`, `.js`)
|
||||
- Worker entry point (typically `src/index.ts` or `src/worker.ts`)
|
||||
|
||||
If not a Cloudflare Workers project, warn user and ask to confirm.
|
||||
|
||||
</project_verification>
|
||||
|
||||
#### Parallel Agents to review the PR:
|
||||
|
||||
<parallel_tasks>
|
||||
|
||||
Run ALL these agents in parallel. Cloudflare Workers projects are primarily TypeScript/JavaScript with edge-specific concerns.
|
||||
|
||||
**Phase 1: Context Gathering (3 agents in parallel)**
|
||||
|
||||
1. Task binding-context-analyzer(PR content)
|
||||
- Parse wrangler.toml for bindings
|
||||
- Generate TypeScript Env interface
|
||||
- Provide context to other agents
|
||||
|
||||
2. Task git-history-analyzer(PR content)
|
||||
- Analyze commit history and patterns
|
||||
- Identify code evolution
|
||||
|
||||
3. Task repo-research-analyst(PR content)
|
||||
- Research codebase patterns
|
||||
- Document conventions
|
||||
|
||||
**Phase 2: Cloudflare-Specific Review (5 agents in parallel)**
|
||||
|
||||
4. Task workers-runtime-guardian(PR content)
|
||||
- Runtime compatibility (V8, not Node.js)
|
||||
- Detect forbidden APIs (fs, process, Buffer)
|
||||
- Validate env parameter patterns
|
||||
|
||||
5. Task durable-objects-architect(PR content)
|
||||
- DO lifecycle and state management
|
||||
- Hibernation patterns
|
||||
- WebSocket handling
|
||||
|
||||
6. Task cloudflare-security-sentinel(PR content)
|
||||
- Workers security model
|
||||
- Secret management (wrangler secret)
|
||||
- CORS, CSP, auth patterns
|
||||
|
||||
7. Task edge-performance-oracle(PR content)
|
||||
- Cold start optimization
|
||||
- Bundle size analysis
|
||||
- Edge caching strategies
|
||||
|
||||
8. Task cloudflare-pattern-specialist(PR content)
|
||||
- Cloudflare-specific patterns
|
||||
- Anti-patterns (stateful Workers, KV for strong consistency)
|
||||
- Idiomatic Cloudflare code
|
||||
|
||||
**Phase 2.5: Frontend Design Review (3 agents in parallel - if shadcn/ui components detected)**
|
||||
|
||||
If the PR includes React components with shadcn/ui:
|
||||
|
||||
9a. Task frontend-design-specialist(PR content)
|
||||
- Identify generic patterns (Inter fonts, purple gradients, minimal animations)
|
||||
- Map aesthetic improvements to Tailwind/shadcn/ui code
|
||||
- Prioritize distinctiveness opportunities
|
||||
- Ensure brand identity vs generic "AI aesthetic"
|
||||
|
||||
9b. Task shadcn-ui-architect(PR content)
|
||||
- Validate shadcn/ui component usage and props (via MCP if available)
|
||||
- Check customization depth (`ui` prop usage)
|
||||
- Ensure consistent component patterns
|
||||
- Prevent prop hallucination
|
||||
|
||||
9c. Task accessibility-guardian(PR content)
|
||||
- WCAG 2.1 AA compliance validation
|
||||
- Color contrast checking
|
||||
- Keyboard navigation validation
|
||||
- Screen reader support
|
||||
- Ensure distinctive design remains accessible
|
||||
|
||||
**Phase 3: Architecture & Data (5 agents in parallel)**
|
||||
|
||||
9. Task cloudflare-architecture-strategist(PR content)
|
||||
- Workers/DO/KV/R2 architecture
|
||||
- Service binding strategies
|
||||
- Edge-first design
|
||||
|
||||
10. Task cloudflare-data-guardian(PR content)
|
||||
- KV/D1/R2 data integrity
|
||||
- Consistency models
|
||||
- Storage selection
|
||||
|
||||
11. Task kv-optimization-specialist(PR content)
|
||||
- TTL strategies
|
||||
- Key naming patterns
|
||||
- Batch operations
|
||||
|
||||
12. Task r2-storage-architect(PR content)
|
||||
- Upload patterns (multipart, streaming)
|
||||
- CDN integration
|
||||
- Lifecycle management
|
||||
|
||||
13. Task edge-caching-optimizer(PR content)
|
||||
- Cache hierarchies
|
||||
- Invalidation strategies
|
||||
- Performance optimization
|
||||
|
||||
**Phase 4: Specialized (3 agents in parallel)**
|
||||
|
||||
14. Task workers-ai-specialist(PR content)
|
||||
- Vercel AI SDK patterns
|
||||
- Cloudflare AI Agents
|
||||
- RAG implementations
|
||||
|
||||
15. Task code-simplicity-reviewer(PR content)
|
||||
- YAGNI enforcement
|
||||
- Complexity reduction
|
||||
- Minimalism review
|
||||
|
||||
16. Task feedback-codifier(PR content)
|
||||
- Extract patterns from review
|
||||
- Update agent knowledge
|
||||
- Self-improvement loop
|
||||
|
||||
</parallel_tasks>
|
||||
|
||||
### 4. Ultra-Thinking Deep Dive Phases
|
||||
|
||||
<ultrathink_instruction> For each phase below, spend maximum cognitive effort. Think step by step. Consider all angles. Question assumptions. And bring all reviews in a synthesis to the user.</ultrathink_instruction>
|
||||
|
||||
<deliverable>
|
||||
Complete system context map with component interactions
|
||||
</deliverable>
|
||||
|
||||
#### Phase 3: Stakeholder Perspective Analysis
|
||||
|
||||
<thinking_prompt> ULTRA-THINK: Put yourself in each stakeholder's shoes. What matters to them? What are their pain points? </thinking_prompt>
|
||||
|
||||
<stakeholder_perspectives>
|
||||
|
||||
1. **Developer Perspective** <questions>
|
||||
|
||||
- How easy is this to understand and modify?
|
||||
- Are the APIs intuitive?
|
||||
- Is debugging straightforward?
|
||||
- Can I test this easily? </questions>
|
||||
|
||||
2. **Operations Perspective** <questions>
|
||||
|
||||
- How do I deploy this safely?
|
||||
- What metrics and logs are available?
|
||||
- How do I troubleshoot issues?
|
||||
- What are the resource requirements? </questions>
|
||||
|
||||
3. **End User Perspective** <questions>
|
||||
|
||||
- Is the feature intuitive?
|
||||
- Are error messages helpful?
|
||||
- Is performance acceptable?
|
||||
- Does it solve my problem? </questions>
|
||||
|
||||
4. **Security Team Perspective** <questions>
|
||||
|
||||
- What's the attack surface?
|
||||
- Are there compliance requirements?
|
||||
- How is data protected?
|
||||
- What are the audit capabilities? </questions>
|
||||
|
||||
5. **Business Perspective** <questions>
|
||||
- What's the ROI?
|
||||
- Are there legal/compliance risks?
|
||||
- How does this affect time-to-market?
|
||||
- What's the total cost of ownership? </questions> </stakeholder_perspectives>
|
||||
|
||||
#### Phase 4: Scenario Exploration
|
||||
|
||||
<thinking_prompt> ULTRA-THINK: Explore edge cases and failure scenarios. What could go wrong? How does the system behave under stress? </thinking_prompt>
|
||||
|
||||
<scenario_checklist>
|
||||
|
||||
- [ ] **Happy Path**: Normal operation with valid inputs
|
||||
- [ ] **Invalid Inputs**: Null, empty, malformed data
|
||||
- [ ] **Boundary Conditions**: Min/max values, empty collections
|
||||
- [ ] **Concurrent Access**: Race conditions, deadlocks
|
||||
- [ ] **Scale Testing**: 10x, 100x, 1000x normal load
|
||||
- [ ] **Network Issues**: Timeouts, partial failures
|
||||
- [ ] **Resource Exhaustion**: Memory, disk, connections
|
||||
- [ ] **Security Attacks**: Injection, overflow, DoS
|
||||
- [ ] **Data Corruption**: Partial writes, inconsistency
|
||||
- [ ] **Cascading Failures**: Downstream service issues </scenario_checklist>
|
||||
|
||||
### 6. Multi-Angle Review Perspectives
|
||||
|
||||
#### Technical Excellence Angle
|
||||
|
||||
- Code craftsmanship evaluation
|
||||
- Engineering best practices
|
||||
- Technical documentation quality
|
||||
- Tooling and automation assessment
|
||||
|
||||
#### Business Value Angle
|
||||
|
||||
- Feature completeness validation
|
||||
- Performance impact on users
|
||||
- Cost-benefit analysis
|
||||
- Time-to-market considerations
|
||||
|
||||
#### Risk Management Angle
|
||||
|
||||
- Security risk assessment
|
||||
- Operational risk evaluation
|
||||
- Compliance risk verification
|
||||
- Technical debt accumulation
|
||||
|
||||
#### Team Dynamics Angle
|
||||
|
||||
- Code review etiquette
|
||||
- Knowledge sharing effectiveness
|
||||
- Collaboration patterns
|
||||
- Mentoring opportunities
|
||||
|
||||
### 4. Simplification and Minimalism Review
|
||||
|
||||
Run the Task code-simplicity-reviewer() to see if we can simplify the code.
|
||||
|
||||
### 5. Findings Synthesis and Todo Creation
|
||||
|
||||
<critical_requirement> All findings MUST be converted to actionable todos in the CLI todo system </critical_requirement>
|
||||
|
||||
#### Step 1: Synthesize All Findings
|
||||
|
||||
<thinking>
|
||||
Consolidate all agent reports into a categorized list of findings.
|
||||
Remove duplicates, prioritize by severity and impact.
|
||||
Apply confidence scoring to filter false positives.
|
||||
</thinking>
|
||||
|
||||
<synthesis_tasks>
|
||||
- [ ] Collect findings from all parallel agents
|
||||
- [ ] Categorize by type: security, performance, architecture, quality, etc.
|
||||
- [ ] **Apply confidence scoring (0-100) to each finding**
|
||||
- [ ] **Filter out findings below 80 confidence threshold**
|
||||
- [ ] Assign severity levels: 🔴 CRITICAL (P1), 🟡 IMPORTANT (P2), 🔵 NICE-TO-HAVE (P3)
|
||||
- [ ] Remove duplicate or overlapping findings
|
||||
- [ ] Estimate effort for each finding (Small/Medium/Large)
|
||||
</synthesis_tasks>
|
||||
|
||||
#### Confidence Scoring System (Adopted from Anthropic's code-review plugin)
|
||||
|
||||
Each finding receives an independent confidence score:
|
||||
|
||||
| Score | Meaning | Action |
|
||||
|-------|---------|--------|
|
||||
| **0-25** | Not confident; likely false positive | Auto-filter (don't show) |
|
||||
| **26-50** | Somewhat confident; might be valid | Auto-filter (don't show) |
|
||||
| **51-79** | Moderately confident; real but uncertain | Auto-filter (don't show) |
|
||||
| **80-89** | Highly confident; real and important | ✅ Show to user |
|
||||
| **90-100** | Absolutely certain; definitely real | ✅ Show to user (prioritize) |
|
||||
|
||||
**Confidence Threshold: 80** - Only findings scoring 80+ are surfaced to the user.
|
||||
|
||||
<confidence_criteria>
|
||||
When scoring a finding, consider:
|
||||
|
||||
1. **Evidence Quality** (+20 points each):
|
||||
- [ ] Specific file and line number identified
|
||||
- [ ] Code snippet demonstrates the issue
|
||||
- [ ] Issue is in changed code (not pre-existing)
|
||||
- [ ] Clear violation of documented standard
|
||||
|
||||
2. **False Positive Indicators** (-20 points each):
|
||||
- [ ] Issue exists in unchanged code
|
||||
- [ ] Would be caught by linter/type checker
|
||||
- [ ] Has explicit ignore comment
|
||||
- [ ] Is a style preference, not a bug
|
||||
|
||||
3. **Verification** (+10 points each):
|
||||
- [ ] Multiple agents flagged same issue
|
||||
- [ ] CLAUDE.md or PREFERENCES.md mentions this pattern
|
||||
- [ ] Issue matches known Cloudflare anti-pattern
|
||||
|
||||
Example scoring:
|
||||
```
|
||||
Finding: Using process.env in Worker
|
||||
- Specific location: src/index.ts:45 (+20)
|
||||
- Code snippet shows violation (+20)
|
||||
- In changed code (+20)
|
||||
- Violates Workers runtime rules (+20)
|
||||
- Multiple agents flagged (+10)
|
||||
= 90 confidence ✅ SHOW
|
||||
```
|
||||
|
||||
```
|
||||
Finding: Consider adding more comments
|
||||
- No specific location (-20)
|
||||
- Style preference (-20)
|
||||
- Not in PREFERENCES.md (-10)
|
||||
= 30 confidence ❌ FILTER
|
||||
```
|
||||
</confidence_criteria>
|
||||
|
||||
#### Step 2: Present Findings for Triage
|
||||
|
||||
For EACH finding (with confidence ≥80), present in this format:
|
||||
|
||||
```
|
||||
---
|
||||
Finding #X: [Brief Title]
|
||||
|
||||
Confidence: [Score]/100 ✅
|
||||
Severity: 🔴 P1 / 🟡 P2 / 🔵 P3
|
||||
|
||||
Category: [Security/Performance/Architecture/Quality/etc.]
|
||||
|
||||
Description:
|
||||
[Detailed explanation of the issue or improvement]
|
||||
|
||||
Location: [file_path:line_number]
|
||||
|
||||
Problem:
|
||||
[What's wrong or could be better]
|
||||
|
||||
Impact:
|
||||
[Why this matters, what could happen]
|
||||
|
||||
Proposed Solution:
|
||||
[How to fix it]
|
||||
|
||||
Effort: Small/Medium/Large
|
||||
|
||||
Evidence:
|
||||
- [Why confidence is high - specific indicators]
|
||||
|
||||
---
|
||||
Do you want to add this to the todo list?
|
||||
1. yes - create todo file
|
||||
2. next - skip this finding
|
||||
3. custom - modify before creating
|
||||
```
|
||||
|
||||
**Note**: Findings with confidence <80 are automatically filtered and not shown.
|
||||
|
||||
#### Step 3: Create Todo Files for Approved Findings
|
||||
|
||||
<instructions>
|
||||
When user says "yes", create a properly formatted todo file:
|
||||
</instructions>
|
||||
|
||||
<todo_creation_process>
|
||||
|
||||
1. **Determine next issue ID:**
|
||||
```bash
|
||||
ls todos/ | grep -o '^[0-9]\+' | sort -n | tail -1
|
||||
```
|
||||
|
||||
2. **Generate filename:**
|
||||
```
|
||||
{next_id}-pending-{priority}-{brief-description}.md
|
||||
```
|
||||
Example: `042-pending-p1-sql-injection-risk.md`
|
||||
|
||||
3. **Create file from template:**
|
||||
```bash
|
||||
cp todos/000-pending-p1-TEMPLATE.md todos/{new_filename}
|
||||
```
|
||||
|
||||
4. **Populate with finding data:**
|
||||
```yaml
|
||||
---
|
||||
status: pending
|
||||
priority: p1 # or p2, p3 based on severity
|
||||
issue_id: "042"
|
||||
tags: [code-review, security, rails] # add relevant tags
|
||||
dependencies: []
|
||||
---
|
||||
|
||||
# [Finding Title]
|
||||
|
||||
## Problem Statement
|
||||
[Detailed description from finding]
|
||||
|
||||
## Findings
|
||||
- Discovered during code review by [agent names]
|
||||
- Location: [file_path:line_number]
|
||||
- [Key discoveries from agents]
|
||||
|
||||
## Proposed Solutions
|
||||
|
||||
### Option 1: [Primary solution from finding]
|
||||
- **Pros**: [Benefits]
|
||||
- **Cons**: [Drawbacks]
|
||||
- **Effort**: [Small/Medium/Large]
|
||||
- **Risk**: [Low/Medium/High]
|
||||
|
||||
## Recommended Action
|
||||
[Leave blank - needs manager triage]
|
||||
|
||||
## Technical Details
|
||||
- **Affected Files**: [List from finding]
|
||||
- **Related Components**: [Models, controllers, services affected]
|
||||
- **Database Changes**: [Yes/No - describe if yes]
|
||||
|
||||
## Resources
|
||||
- Code review PR: [PR link if applicable]
|
||||
- Related findings: [Other finding numbers]
|
||||
- Agent reports: [Which agents flagged this]
|
||||
|
||||
## Acceptance Criteria
|
||||
- [ ] [Specific criteria based on solution]
|
||||
- [ ] Tests pass
|
||||
- [ ] Code reviewed
|
||||
|
||||
## Work Log
|
||||
|
||||
### {date} - Code Review Discovery
|
||||
**By:** Claude Code Review System
|
||||
**Actions:**
|
||||
- Discovered during comprehensive code review
|
||||
- Analyzed by multiple specialized agents
|
||||
- Categorized and prioritized
|
||||
|
||||
**Learnings:**
|
||||
- [Key insights from agent analysis]
|
||||
|
||||
## Notes
|
||||
Source: Code review performed on {date}
|
||||
Review command: /workflows:review {arguments}
|
||||
```
|
||||
|
||||
5. **Track creation:**
|
||||
Add to TodoWrite list if tracking multiple findings
|
||||
|
||||
</todo_creation_process>
|
||||
|
||||
#### Step 4: Summary Report
|
||||
|
||||
After processing all findings:
|
||||
|
||||
```markdown
|
||||
## Code Review Complete
|
||||
|
||||
**Review Target:** [PR number or branch]
|
||||
**Total Findings:** [X] (from all agents)
|
||||
**High-Confidence (≥80):** [Y] (shown to user)
|
||||
**Filtered (<80):** [Z] (auto-removed as likely false positives)
|
||||
**Todos Created:** [W]
|
||||
|
||||
### Confidence Distribution:
|
||||
- 90-100 (certain): [count]
|
||||
- 80-89 (confident): [count]
|
||||
- <80 (filtered): [count]
|
||||
|
||||
### Created Todos:
|
||||
- `{issue_id}-pending-p1-{description}.md` - {title} (confidence: 95)
|
||||
- `{issue_id}-pending-p2-{description}.md` - {title} (confidence: 85)
|
||||
...
|
||||
|
||||
### Skipped Findings (User Choice):
|
||||
- [Finding #Z]: {reason}
|
||||
...
|
||||
|
||||
### Auto-Filtered (Low Confidence):
|
||||
- [X] findings filtered with confidence <80
|
||||
- Run with `--show-all` flag to see filtered findings
|
||||
|
||||
### Next Steps:
|
||||
1. Triage pending todos: `ls todos/*-pending-*.md`
|
||||
2. Use `/triage` to review and approve
|
||||
3. Work on approved items: `/resolve_todo_parallel`
|
||||
```
|
||||
|
||||
#### Alternative: Batch Creation
|
||||
|
||||
If user wants to convert all findings to todos without review:
|
||||
|
||||
```bash
|
||||
# Ask: "Create todos for all X findings? (yes/no/show-critical-only)"
|
||||
# If yes: create todo files for all findings in parallel
|
||||
# If show-critical-only: only present P1 findings for triage
|
||||
```
|
||||
Reference in New Issue
Block a user