Initial commit

This commit is contained in:
Zhongwei Li
2025-11-30 08:47:43 +08:00
commit 2e8d89fca3
41 changed files with 14051 additions and 0 deletions

328
commands/create-pr.md Normal file

@@ -0,0 +1,328 @@
---
description: Create PR as draft, wait for CI to pass, then mark ready for review (solves the "I'll check back later" problem)
---
# Creating PR with CI Verification
**IMPORTANT:** This command solves the common problem where Claude creates a PR, says "I'll wait for CI", then gives up. This command ACTUALLY waits for CI completion.
## Workflow
This command will:
1. ✅ Create PR as draft
2. ✅ Wait for CI checks to complete (actually wait, not give up)
3. ✅ Mark as ready for review if CI passes
4. ✅ Report failures with links if CI fails
## Step 1: Determine Current Branch and Changes
```bash
# Get current branch
CURRENT_BRANCH=$(git branch --show-current)
# Get base branch (usually main or master)
BASE_BRANCH=$(git remote show origin | grep 'HEAD branch' | cut -d' ' -f5)
# Verify we're not on main/master
if [[ "$CURRENT_BRANCH" == "main" || "$CURRENT_BRANCH" == "master" ]]; then
  echo "ERROR: Cannot create PR from main/master branch"
  echo "Create a feature branch first: git checkout -b feature-name"
  exit 1
fi
# Check if branch is pushed (test the command directly instead of inspecting $?)
if ! git rev-parse --verify "origin/$CURRENT_BRANCH" > /dev/null 2>&1; then
  echo "Branch not pushed yet. Pushing now..."
  git push -u origin "$CURRENT_BRANCH"
fi
```
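The `cut -d' ' -f5` extraction above depends on the exact indentation of the `git remote show origin` output. A minimal sketch of a more tolerant parser follows, assuming the output contains a line of the form `HEAD branch: <name>`. The function name `parse_head_branch` is hypothetical and not part of the original command.

```bash
# Hypothetical helper (an assumption, not part of the original command):
# read `git remote show origin` output on stdin and print the default branch.
parse_head_branch() {
  # Match the "HEAD branch:" line regardless of leading whitespace and
  # print only the text that follows it; keep the first match.
  sed -n 's/^[[:space:]]*HEAD branch:[[:space:]]*//p' | head -n 1
}

# Intended usage (requires a configured "origin" remote):
#   BASE_BRANCH=$(git remote show origin | parse_head_branch)
```

If no `HEAD branch:` line is present the function prints nothing, so the caller may want to check for an empty `BASE_BRANCH` before continuing.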
## Step 2: Gather PR Context
Run in parallel to gather information:
```bash
# 1. Git status
git status
# 2. Diff since divergence from base
git diff ${BASE_BRANCH}...HEAD
# 3. Commit history for this branch
git log ${BASE_BRANCH}..HEAD --oneline
# 4. Check if PR already exists
gh pr view --json number,state,isDraft 2>/dev/null
```
**If PR already exists:**
```bash
EXISTING_PR=$(gh pr view --json number --jq '.number')
echo "PR #$EXISTING_PR already exists for this branch"
echo "Use /review-pr $EXISTING_PR or /fix-pr $EXISTING_PR instead"
exit 0
```
## Step 3: Create PR Title and Body
Based on the changes (read from git diff and commit history):
**Title:** Concise summary of changes (50 chars max)
**Body format:**
```markdown
## Summary
- Bullet point 1
- Bullet point 2
## Changes
- Specific change 1
- Specific change 2
## Testing
- Test approach
- Verification steps
## Checklist
- [ ] Tests pass locally
- [ ] Code formatted (make format)
- [ ] Changelog entry created (if applicable)
---
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
```
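The 50-character title limit can be checked before the PR is created. The following is a small sketch under that assumption; `check_title_length` is a hypothetical helper name, and the default limit is taken from the guideline above.

```bash
# Hypothetical helper: fail when a proposed PR title exceeds the limit.
# $1 = title, $2 = optional maximum length (defaults to 50, per the guideline).
check_title_length() {
  local title="$1"
  local max_len="${2:-50}"
  if [ "${#title}" -gt "$max_len" ]; then
    # Report the problem on stderr so stdout stays clean for callers.
    echo "Title is ${#title} chars (max $max_len): $title" >&2
    return 1
  fi
  return 0
}

# Example: check_title_length "Add California EITC implementation" || exit 1
```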
## Step 4: Create PR as Draft
**CRITICAL:** Always create as draft initially.
```bash
# Create PR as draft
gh pr create \
--title "PR Title Here" \
--body "$(cat <<'EOF'
PR body here
EOF
)" \
--draft
# Get PR number
PR_NUMBER=$(gh pr view --json number --jq '.number')
echo "Created draft PR #$PR_NUMBER"
echo "URL: $(gh pr view --json url --jq '.url')"
```
## Step 5: Wait for CI Checks (DO NOT GIVE UP!)
**This is where Claude usually fails. DO NOT say "I'll check back later".**
**Instead, ACTUALLY wait using a polling loop with NO timeout:**
```bash
echo "Waiting for CI checks to complete..."
echo "Typical CI times by repository:"
echo " - policyengine-app: 3-6 minutes"
echo " - policyengine-uk: 9-12 minutes"
echo " - policyengine-api: 30 minutes"
echo " - policyengine-us: 60-75 minutes (longest)"
echo ""
echo "I will poll every 15 seconds until all checks complete. No time limit."
POLL_INTERVAL=15 # Check every 15 seconds
ELAPSED=0
while true; do
  # Get check status. `gh pr checks --json` has no status/conclusion fields;
  # `bucket` groups each check's state into pass, fail, pending, skipping, or cancel.
  CHECKS_JSON=$(gh pr checks "$PR_NUMBER" --json name,state,bucket)
  # Count checks (fall back to an empty list if gh returned no JSON)
  TOTAL=$(echo "${CHECKS_JSON:-[]}" | jq 'length')
  if [ "$TOTAL" -eq 0 ]; then
    echo "No CI checks found yet. Waiting for checks to start..."
    sleep "$POLL_INTERVAL"
    ELAPSED=$((ELAPSED + POLL_INTERVAL))
    continue
  fi
  # Count by bucket; cancelled checks count as failures so they never mark the PR ready
  COMPLETED=$(echo "$CHECKS_JSON" | jq '[.[] | select(.bucket != "pending")] | length')
  FAILED=$(echo "$CHECKS_JSON" | jq '[.[] | select(.bucket == "fail" or .bucket == "cancel")] | length')
  echo "[$ELAPSED s] CI Checks: $COMPLETED/$TOTAL completed, $FAILED failed"
  # If all completed
  if [ "$COMPLETED" -eq "$TOTAL" ]; then
    if [ "$FAILED" -eq 0 ]; then
      echo "✅ All CI checks passed after $ELAPSED seconds!"
      break
    else
      echo "❌ Some CI checks failed after $ELAPSED seconds."
      # Show failures
      gh pr checks "$PR_NUMBER"
      break
    fi
  fi
  # Continue waiting (no timeout!)
  sleep "$POLL_INTERVAL"
  ELAPSED=$((ELAPSED + POLL_INTERVAL))
done
```
**Important:** No timeout means we actually wait as long as needed. Population simulations in policyengine-us can take 60-75 minutes.
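The polling loop reports elapsed time in raw seconds, which gets hard to read on hour-long runs. One possible refinement, sketched here with a hypothetical `format_elapsed` helper, is to print minutes and seconds instead.

```bash
# Hypothetical helper: render a number of seconds as "Xm YYs".
format_elapsed() {
  local total="$1"
  # Integer division gives whole minutes; the remainder gives seconds.
  printf '%dm %02ds\n' "$((total / 60))" "$((total % 60))"
}

# Example status line inside the loop (ELAPSED is the loop's counter):
#   echo "[$(format_elapsed "$ELAPSED")] CI Checks: ..."
```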
## Step 6: Mark Ready for Review (If CI Passed)
```bash
if [ "$FAILED" -eq 0 ] && [ "$COMPLETED" -eq "$TOTAL" ]; then
echo "Marking PR as ready for review..."
gh pr ready $PR_NUMBER
echo ""
echo "✅ PR #$PR_NUMBER is ready for review!"
echo "URL: $(gh pr view $PR_NUMBER --json url --jq '.url')"
echo ""
echo "All CI checks passed:"
gh pr checks $PR_NUMBER
else
echo ""
echo "⚠️ PR remains as draft due to CI issues."
echo "URL: $(gh pr view $PR_NUMBER --json url --jq '.url')"
echo ""
echo "CI status:"
gh pr checks $PR_NUMBER
echo ""
echo "To fix and retry:"
echo "1. Fix the failing checks"
echo "2. Push changes: git push"
echo "3. Run this command again to re-check CI"
fi
```
## Key Differences from Default Behavior
**Old way (Claude gives up):**
```
Claude: "I've created the PR. CI will take a while, I'll check back later..."
[Chat ends, Claude never checks back]
User: Has to manually check CI and mark ready
```
**New way (this command actually waits):**
```
Claude: "I've created the PR as draft. Now waiting for CI..."
[Actually polls CI status every 15 seconds]
Claude: "CI passed! Marking as ready for review."
[PR is ready, user doesn't have to do anything]
```
## Error Handling
**If CI fails:**
```bash
echo "❌ CI checks failed. Here are the failures:"
gh pr checks $PR_NUMBER
echo ""
echo "Common fixes:"
echo "- Linting errors: make format && git push"
echo "- Test failures: See logs at PR URL"
echo "- Use /fix-pr $PR_NUMBER to attempt automated fixes"
```
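**If no CI configured** (run this check inside the polling loop's "No CI checks found yet" branch; the loop never exits while no checks are reported, so code placed after it is not reached):
```bash
if [ "$TOTAL" -eq 0 ] && [ "$ELAPSED" -gt 60 ]; then
  echo "⚠️ No CI checks detected after 60 seconds."
  echo "Repository may not have CI configured."
  echo "Marking PR as ready for manual review."
  gh pr ready "$PR_NUMBER"
  exit 0 # Nothing left to wait for
fi
```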
## Usage Examples
**Basic:**
```bash
# Make changes, commit
git add .
git commit -m "Add feature"
# Create PR (command waits for CI)
/create-pr
```
**With custom title:**
```bash
# Command can accept optional arguments
/create-pr "Add California EITC implementation"
```
**Checking existing PR:**
```bash
# If PR exists, command tells you and suggests alternatives
/create-pr
# Output: "PR #123 exists. Use /review-pr 123 or /fix-pr 123"
```
## Integration with Other Commands
**After /encode-policy or /fix-pr:**
```bash
# These commands might push code
# Use /create-pr to create PR and wait for CI
/create-pr
```
**After fixing CI issues:**
```bash
# Fix issues locally
make format
git add .
git commit -m "Fix linting"
git push
# Command will detect existing PR and re-check CI
/create-pr # "PR #123 exists, checking CI status..."
```
## CI Duration Expectations
**By repository (based on actual data):**
- **policyengine-app:** 3-6 minutes (fast)
- **policyengine-uk:** 9-12 minutes (medium)
- **policyengine-api:** ~30 minutes (slow)
- **policyengine-us:** 60-75 minutes (very slow - population simulations)
**The command has no timeout** - it will wait as long as needed, even for policyengine-us's 75-minute CI runs.
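The durations above could also be looked up programmatically, for example to tell the user up front roughly how long to expect. This is only a sketch: `expected_ci_minutes` is a hypothetical helper, and the figures are copied from the list above rather than measured.

```bash
# Hypothetical helper: print the typical CI duration (in minutes) for a
# repository name, using the figures listed above.
expected_ci_minutes() {
  case "$1" in
    policyengine-app) echo "3-6" ;;
    policyengine-uk)  echo "9-12" ;;
    policyengine-api) echo "30" ;;
    policyengine-us)  echo "60-75" ;;
    *)                echo "unknown" ;;  # repository not in the list
  esac
}

# Example (assumes the checkout directory is named after the repository):
#   expected_ci_minutes "$(basename "$(git rev-parse --show-toplevel)")"
```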
## Why This Works
**Problem:** Claude's internal timeout or context leads it to give up
**Solution:**
1. ✅ Explicit polling loop (Claude sees the code will wait)
2. ✅ Clear status updates (Claude knows it's still working)
3. ✅ No timeout (Claude keeps polling until every check completes)
4. ✅ Fallback handling (what to do if checks fail or no CI is configured)
**Key insight:** By having the command contain the waiting logic in bash, Claude executes it and actually waits instead of just saying "I'll wait."
## Notes
**This command is essential for:**
- Automated PR workflows
- CI-dependent merges
- Team workflows requiring CI validation
- Reducing manual PR management
**Use this instead of:**
- Manual `gh pr create` followed by manual CI checking
- Asking Claude to "wait for CI" (which it can't do reliably)
- Creating ready PRs before CI passes

303
commands/encode-policy.md Normal file

@@ -0,0 +1,303 @@
---
description: Orchestrates multi-agent workflow to implement new government benefit programs
---
# Implementing $ARGUMENTS in PolicyEngine
Coordinate the multi-agent workflow to implement $ARGUMENTS as a complete, production-ready government benefit program.
## Program Type Detection
This workflow adapts based on the type of program being implemented:
**TANF/Benefit Programs** (e.g., state TANF, SNAP, WIC):
- Phase 4: test-creator creates both unit and integration tests
- Phase 6: Uses specialized @complete:country-models:tanf-program-reviewer agent in parallel validation
- Optional phases: May be skipped for simplified implementations
**Other Government Programs** (e.g., tax credits, deductions):
- Phase 4: test-creator creates both unit and integration tests
- Phase 6: Uses general @complete:country-models:implementation-validator agent in parallel validation
- Optional phases: Include based on production requirements
## Phase 0: Implementation Approach (TANF Programs Only)
**For TANF programs, detect implementation approach from $ARGUMENTS:**
**Auto-detect from user's request:**
- If $ARGUMENTS contains "simple" or "simplified" → Use **Simplified** approach
- If $ARGUMENTS contains "full" or "complete" → Use **Full** approach
- If unclear → Default to **Simplified** approach
**Simplified Implementation:**
- Use federal baseline for gross income, demographic eligibility, immigration eligibility
- Faster implementation
- Suitable for most states that follow federal definitions
- Only creates state-specific variables for: income limits, disregards, benefit amounts, final calculation
**Full Implementation:**
- Create state-specific income definitions, eligibility criteria
- More detailed implementation
- Required when state has unique income definitions or eligibility rules
- Creates all state-specific variables
**Record the detected approach and pass it to Phase 4 (rules-engineer).**
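The auto-detection rules above can be sketched as a small function. Several details are assumptions, since the text does not specify them: matching is case-insensitive, keywords are matched as substrings, and "simple"/"simplified" wins when both kinds of keyword appear. The name `detect_approach` is hypothetical.

```bash
# Hypothetical helper: map the raw arguments string to an implementation
# approach ("simplified" or "full").
detect_approach() {
  local args
  # Lowercase the input so "Full" and "full" are treated alike (assumption).
  args=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$args" in
    *simple*|*simplified*) echo "simplified" ;;
    *full*|*complete*)     echo "full" ;;
    *)                     echo "simplified" ;;  # default when unclear
  esac
}

# Example: APPROACH=$(detect_approach "$ARGUMENTS")
```

Substring matching is deliberately loose here; a stricter version might match whole words only, so that a word such as "incomplete" does not select the full approach.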
## Phase 1: Issue and PR Setup
Invoke @complete:country-models:issue-manager agent to:
- Search for existing issue or create new one for $ARGUMENTS
- Create draft PR immediately in PolicyEngine/policyengine-us repository (NOT personal fork)
- Return issue number and PR URL for tracking
## Phase 2: Variable Naming Convention
Invoke @complete:country-models:naming-coordinator agent to:
- Analyze existing naming patterns in the codebase
- Establish variable naming convention for $ARGUMENTS
- Analyze existing folder structure patterns in the codebase
- Post naming decisions and folder structure to GitHub issue for all agents to reference
**Quality Gate**: Naming convention and folder structure must be documented before proceeding to ensure consistency across parallel development.
## Phase 3: Document Collection
**Phase 3A: Initial Document Gathering**
Invoke @complete:country-models:document-collector agent to gather official $ARGUMENTS documentation, save it as `working_references.md` in the repository, and post it to the GitHub issue.
**After agent completes:**
1. Check the agent's report for "📄 PDFs Requiring Extraction" section
2. **Decision Point:**
- **If PDFs are CRITICAL** (State Plans with benefit formulas, calculation methodology):
- Ask the user: "Please send me these PDF URLs so I can extract their content:"
- List each PDF URL on a separate line
- Wait for user to send the URLs (they will auto-extract)
- Proceed to Phase 3B
- **If PDFs are SUPPLEMENTARY** (additional reference, not essential):
- Note them for future reference
- Proceed directly to Phase 4 with current documentation
3. **If no PDFs listed:**
- Skip to Phase 4 (documentation complete)
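The decision point above reduces to a three-way branch. A minimal sketch follows, assuming the orchestrator has already classified the agent's PDF report as `critical`, `supplementary`, or `none`. Both those labels and the function name `next_phase_after_3a` are hypothetical.

```bash
# Hypothetical helper: given the PDF classification from Phase 3A, print
# the phase to run next ("3B" or "4").
next_phase_after_3a() {
  case "$1" in
    critical)           echo "3B" ;;  # wait for extracted PDF content first
    supplementary|none) echo "4" ;;   # current documentation is sufficient
    *)
      echo "unknown PDF status: $1" >&2
      return 1
      ;;
  esac
}
```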
**Phase 3B: PDF Extraction & Complete Documentation** (Only if CRITICAL PDFs found)
After receiving the extracted PDF content from the user:
1. Relaunch @complete:country-models:document-collector agent with:
- Original task description
- Extracted PDF content included in prompt
- Instruction: "You are in Phase 2 - integrate this PDF content with your HTML research"
2. Agent creates complete documentation
**Quality Gate**: Documentation must include:
- Official program guidelines or state plan
- Income limits and benefit schedules
- Eligibility criteria and priority groups
- Seasonal/temporal rules if applicable
- ✅ All critical PDFs extracted and integrated (if applicable)
## Phase 4: Development (Parallel on Same Branch)
Run both agents IN PARALLEL - they work on different folders so no conflicts:
**@complete:country-models:test-creator** → works in `tests/` folder:
- Create comprehensive INTEGRATION tests from documentation
- Create UNIT tests for each variable that will have a formula
- Both test types created in ONE invocation
- Use only existing PolicyEngine variables
- Test realistic calculations based on documentation
**@complete:country-models:rules-engineer** → works in `variables/` + `parameters/` folders:
- **Implementation Approach:** [Pass the decision from Phase 0: "simplified" or "full"]
- **If Simplified TANF:** Do NOT create state-specific gross income variables - use federal baseline (`tanf_gross_earned_income`, `tanf_gross_unearned_income`)
- **If Full TANF:** Create complete state-specific income definitions as needed
- **Step 1:** Create all parameters first using embedded parameter-architect patterns
- Complete parameter structure with all thresholds, amounts, rates
- Include proper references from documentation
- **Step 2:** Implement variables using the parameters
- Zero hard-coded values
- Complete implementations only
- Follow simplified/full approach from Phase 0
**Quality Requirements**:
- rules-engineer: ZERO hard-coded values, parameters created before variables
- test-creator: All tests (unit + integration) created together, based purely on documentation
## Phase 5: Pre-Push Validation
Invoke @complete:country-models:pr-pusher agent to:
- Ensure changelog entry exists
- Run formatters (black, isort)
- Fix any linting issues
- Run local tests for quick validation
- Push branch and report initial CI status
**Quality Gate**: Branch must be properly formatted with changelog before continuing.
## Optional Enhancement Phases
**These phases are OPTIONAL and should be considered based on implementation type:**
### For Production-Ready Implementations:
The following enhancements may be applied to ensure production quality:
1. **Cross-Program Validation**
- @complete:country-models:cross-program-validator: Check interactions with other benefits
- Prevents benefit cliffs and unintended interactions
2. **Documentation Enhancement**
- @complete:country-models:documentation-enricher: Add examples and regulatory citations
- Improves maintainability and compliance verification
3. **Performance Optimization**
- @complete:country-models:performance-optimizer: Vectorize and optimize calculations
- Ensures scalability for large-scale simulations
**Decision Criteria:**
- **Simplified/Experimental TANF**: Skip these optional phases
- **Production TANF**: Include based on specific requirements
- **Full Production Deployment**: Include all enhancements
## Phase 6: Validation
**Run validators to check implementation quality:**
**@complete:country-models:implementation-validator:**
- Check for hard-coded values in variables
- Flag placeholder or incomplete implementations
- Check federal/state parameter organization
- Assess test quality and coverage
- Identify performance and vectorization issues
**For TANF/Benefit Programs, also run @complete:country-models:tanf-program-reviewer:**
- Learn from PA TANF and OH OWF reference implementations first
- Validate code formulas against regulations
- Verify test coverage with manual calculations
- Check parameter structure and references
- Focus on: eligibility rules, income disregards, benefit formulas
**Quality Gate:** Review validator reports before proceeding
## Phase 7: Local Testing & Fixes
**CRITICAL: ALWAYS invoke @complete:country-models:ci-fixer agent - do NOT manually fix issues**
Invoke @complete:country-models:ci-fixer agent to:
- Run all tests locally: `policyengine-core test policyengine_us/tests/policy/baseline/gov/states/[STATE]/[PROGRAM] -c policyengine_us -v`
- Identify ALL failing tests
- For each failing test:
- Read the test file to understand expected values
- Read the actual test output to see what was calculated
- Determine root cause: incorrect test expectations OR bug in implementation
- Fix the issue:
- If test expectations are wrong: update the test file with correct values
- If implementation is wrong: fix the variable/parameter code
- Re-run tests to verify fix
- Iterate until ALL tests pass locally
- **Reference Verification**
- Verify all parameters have reference metadata
- Verify all variables have reference fields
- Keep `sources/` folder files for future reference
- If references missing: Create todos for adding them
- Run `make format` before committing fixes
- Push final fixes to PR branch
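The test command in the first step takes a state and a program path. The sketch below shows how the placeholders might be filled in; `build_test_command` is a hypothetical helper, and the example values (`ma`, `tanf`) are illustrative only.

```bash
# Hypothetical helper: print the local test command for a given state and
# program directory, following the path template used above.
build_test_command() {
  local state="$1" program="$2"
  echo "policyengine-core test policyengine_us/tests/policy/baseline/gov/states/${state}/${program} -c policyengine_us -v"
}

# Example (prints the command; pipe to `bash` or use `eval` to run it):
#   build_test_command ma tanf
```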
**Success Metrics**:
- All tests pass locally (green output)
- All references embedded in code metadata
- `sources/` folder kept for future reference
- Code properly formatted
- Implementation complete and working
- Clean commit history
## Phase 8: Final Summary
After all tests pass and references are embedded:
- Update PR description with final implementation status
- Add summary of what was implemented
- Report completion to user
- **Keep PR as draft** - user will mark ready when they choose
- **WORKFLOW COMPLETE**
## Anti-Patterns This Workflow Prevents
1. **Hard-coded values**: Rules-engineer enforces parameterization
2. **Incomplete implementations**: Validator catches before PR
3. **Federal/state mixing**: Proper parameter organization enforced
4. **Non-existent variables in tests**: Test creator uses only real variables
5. **Missing edge cases**: Edge-case-generator covers all boundaries
6. **Benefit cliffs**: Cross-program-validator identifies interactions
7. **Poor documentation**: Documentation-enricher adds examples
8. **Performance issues**: Performance-optimizer ensures vectorization
9. **Review delays**: Most issues caught and fixed automatically
## Execution Instructions
**YOUR ROLE**: You are an orchestrator ONLY. You must:
1. Invoke agents using the Task tool
2. Wait for their completion
3. Check quality gates
4. **PAUSE and wait for user confirmation before proceeding to next phase**
**YOU MUST NOT**:
- Write any code yourself
- Fix any issues manually
- Run tests directly
- Edit files
**Execution Flow (ONE PHASE AT A TIME)**:
Execute each phase sequentially and **STOP after each phase** to wait for user instructions:
0. **Phase 0**: Implementation Approach (TANF Programs Only)
- Auto-detect from $ARGUMENTS ("simple"/"simplified" vs. "full"/"complete")
- Default to Simplified if unclear
- Inform user of detected approach
- **STOP - Wait for user to say "continue" or provide adjustments**
1. **Phase 1**: Issue and PR Setup
- Complete the phase
- Report results
- **STOP - Wait for user to say "continue" or provide adjustments**
2. **Phase 2**: Variable Naming Convention
- Complete the phase
- Report results
- **STOP - Wait for user to say "continue" or provide adjustments**
3. **Phase 3**: Document Collection
- Complete the phase
- Report results
- **STOP - Wait for user to say "continue" or provide adjustments**
4. **Phase 4**: Development (Test + Implementation)
- Pass simplified/full decision to rules-engineer
- Run test-creator and rules-engineer in parallel (different folders)
- Report results
- **STOP - Wait for user to say "continue" or provide adjustments**
5. **Phase 5**: Pre-Push Validation
- Complete the phase
- Report results
- **STOP - Wait for user to say "continue" or provide adjustments**
6. **Phase 6**: Validation
- Run validators
- Report results
- **STOP - Wait for user to say "continue" or provide adjustments**
7. **Phase 7**: Local Testing & Fixes (Including Reference Verification)
- Complete the phase
- Report results
- **STOP - Wait for user to say "continue" or provide adjustments**
8. **Phase 8**: Final Summary
- Update PR description
- Report final results (keep PR as draft)
- **WORKFLOW COMPLETE**
**CRITICAL RULES**:
- Do NOT proceed to the next phase until user explicitly says to continue
- After each phase, summarize what was accomplished
- If user provides adjustments, incorporate them before continuing
- Phases 1-8 are all REQUIRED (Phase 0 applies to TANF programs only) - pausing doesn't mean skipping
If any agent fails, report the failure but DO NOT attempt to fix it yourself. Wait for user instructions.

333
commands/fix-pr.md Normal file

@@ -0,0 +1,333 @@
---
description: Apply fixes to an existing PR, addressing all review comments and validation issues
---
# Fixing PR: $ARGUMENTS
## Determining Which PR to Fix
First, determine which PR to fix based on the arguments:
```bash
# If no arguments provided, use current branch's PR
if [ -z "$ARGUMENTS" ]; then
CURRENT_BRANCH=$(git branch --show-current)
PR_NUMBER=$(gh pr list --head "$CURRENT_BRANCH" --json number --jq '.[0].number')
if [ -z "$PR_NUMBER" ]; then
echo "No PR found for current branch $CURRENT_BRANCH"
exit 1
fi
# If argument is a number, use it directly
elif [[ "$ARGUMENTS" =~ ^[0-9]+$ ]]; then
PR_NUMBER=$ARGUMENTS
# Otherwise, search for PR by description/title
else
PR_NUMBER=$(gh pr list --search "$ARGUMENTS" --json number,title --jq '.[0].number')
if [ -z "$PR_NUMBER" ]; then
echo "No PR found matching: $ARGUMENTS"
exit 1
fi
fi
echo "Fixing PR #$PR_NUMBER"
```
Orchestrate agents to review, validate, and fix issues in PR #$PR_NUMBER, addressing all GitHub review comments.
## ⚠️ CRITICAL: Agent Usage is MANDATORY
**You are a coordinator, NOT an implementer. You MUST:**
1. **NEVER make direct code changes** - always use agents
2. **INVOKE agents with specific, detailed instructions**
3. **WAIT for each agent to complete** before proceeding
4. **VERIFY agent outputs** before moving to next phase
**If you find yourself using Edit, Write, or MultiEdit directly, STOP and invoke the appropriate agent instead.**
## Phase 1: PR Analysis
After determining PR_NUMBER above, gather context about the PR and review comments:
```bash
gh pr view $PR_NUMBER --comments
gh pr checks $PR_NUMBER # Note: CI runs on draft PRs too!
gh pr diff $PR_NUMBER
```
Document findings:
- Current CI status (even draft PRs have CI)
- Review comments to address
- Files changed
- Type of implementation (new program, bug fix, enhancement)
## Phase 2: Enhanced Domain Validation
Run comprehensive validation using specialized agents:
### Step 1: Domain-Specific Validation
Invoke **implementation-validator** to check:
- Federal/state jurisdiction separation
- Variable naming conventions and duplicates
- Hard-coded value patterns
- Performance optimization opportunities
- Documentation placement
- PolicyEngine-specific patterns
### Step 2: Reference Validation
Invoke **reference-validator** to verify:
- All parameters have references
- References actually corroborate values
- Federal sources for federal params
- State sources for state params
- Specific citations, not generic links
### Step 3: Implementation Validation
Invoke **implementation-validator** to check:
- No hard-coded values in formulas
- Complete implementations (no TODOs)
- Proper entity usage
- Correct formula patterns
Create a comprehensive checklist:
- [ ] Domain validation issues
- [ ] Reference validation issues
- [ ] Implementation issues
- [ ] Review comments to address
- [ ] CI failures to fix
## Phase 3: Sequential Fix Application
**CRITICAL: You MUST use agents for ALL fixes. DO NOT make direct edits yourself.**
Based on issues found, invoke agents IN ORDER to avoid conflicts.
**Why Sequential**: Unlike initial implementation where we need isolation, PR fixes must be applied sequentially because:
- Each fix builds on the previous one
- Avoids merge conflicts
- Tests need to see the fixed implementation
- Documentation needs to reflect the final code
**MANDATORY AGENT USAGE - Apply fixes in this exact order:**
### Step 1: Fix Domain & Parameter Issues
```markdown
YOU MUST INVOKE THESE AGENTS - DO NOT FIX DIRECTLY:
1. First, invoke implementation-validator:
"Scan all files in this PR and create a comprehensive list of all domain violations and implementation issues"
2. Then invoke parameter-architect (REQUIRED for ANY hard-coded values):
"Design parameter structure for these hard-coded values found: [list all values]
Create the YAML parameter files with proper federal/state separation"
3. Then invoke rules-engineer (REQUIRED for code changes):
"Refactor all variables to use the new parameters created by parameter-architect.
Fix all hard-coded values in: [list files]"
4. Then invoke reference-validator:
"Add proper references to all new parameters created"
5. ONLY AFTER all agents complete: Commit changes
```
### Step 2: Add Missing Tests (MANDATORY)
```markdown
REQUIRED - Must generate tests even if none were failing:
1. Invoke edge-case-generator:
"Generate boundary tests for all parameters created in Step 1.
Test edge cases for: [list all new parameters]"
2. Invoke test-creator:
"Create integration tests for the refactored [PROGRAM] implementation (e.g., Idaho LIHEAP).
Include tests for all new parameter files created."
3. VERIFY tests pass before committing
4. Commit test additions
```
### Step 3: Enhance Documentation
1. **documentation-enricher**: Add examples and references to updated code
2. Commit documentation improvements
### Step 4: Optimize Performance (if needed)
1. **performance-optimizer**: Vectorize and optimize calculations
2. Run tests to ensure no regressions
3. Commit optimizations
### Step 5: Validate Integrations
1. **cross-program-validator**: Check benefit interactions
2. Fix any cliff effects or integration issues found
3. Commit integration fixes
## Phase 4: Apply Fixes
For each issue identified:
1. **Read current implementation**
```bash
gh pr checkout $PR_NUMBER
```
2. **Apply agent-generated fixes**
- Have the responsible agent make targeted edits (the coordinator does not use Edit/MultiEdit directly)
- Preserve existing functionality
- Add only what's needed
3. **Verify fixes locally**
```bash
make test
make format
```
## Phase 5: Address Review Comments
For each GitHub comment:
1. **Parse comment intent**
- Is it requesting a change?
- Is it asking for clarification?
- Is it pointing out an issue?
2. **Generate response**
- If change requested: Apply fix and confirm
- If clarification: Add documentation/comment
- If issue: Fix and explain approach
3. **Post response on GitHub**
```bash
gh pr comment $PR_NUMBER --body "Addressed: [explanation of fix]"
```
## Phase 6: CI Validation
Invoke ci-fixer to ensure all checks pass:
1. **Push fixes**
```bash
git add -A
git commit -m "Address review comments
- Fixed hard-coded values identified in review
- Added missing tests for edge cases
- Enhanced documentation with examples
- Optimized performance issues
Addresses comments from @reviewer"
git push
```
2. **Monitor CI**
```bash
gh pr checks $PR_NUMBER --watch
```
3. **Fix any CI failures**
- Format issues: `make format`
- Test failures: Fix with targeted agents
- Lint issues: Apply corrections
## Phase 7: Final Review & Summary
Invoke implementation-validator for final validation:
- All comments addressed?
- All tests passing?
- No regressions introduced?
Post summary comment:
```bash
gh pr comment $PR_NUMBER --body "## Summary of Changes
### Issues Addressed
✅ Fixed hard-coded values in [files]
✅ Added parameterization for [values]
✅ Enhanced test coverage (+X tests)
✅ Improved documentation
✅ All CI checks passing
### Review Comments Addressed
- @reviewer1: [Issue] → [Fix applied]
- @reviewer2: [Question] → [Clarification added]
### Ready for Re-Review
All identified issues have been addressed. The implementation now:
- Uses parameters for all configurable values
- Has comprehensive test coverage
- Includes documentation with examples
- Passes all CI checks"
```
## Command Options
### Usage Examples
- `/fix-pr` - Fix PR for current branch
- `/fix-pr 6444` - Fix PR #6444
- `/fix-pr "Idaho LIHEAP"` - Search for and fix PR by title/description
### Quick Fix Mode
`/fix-pr [PR] --quick`
- Only fix CI failures
- Skip comprehensive review
- Focus on getting checks green
### Deep Review Mode
`/fix-pr [PR] --deep`
- Run all validators
- Generate comprehensive tests
- Full documentation enhancement
- Cross-program validation
### Comment Only Mode
`/fix-pr [PR] --comments-only`
- Only address GitHub review comments
- Skip additional validation
- Faster turnaround
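The mode flags above could be separated from the PR identifier with a small parser. This is a sketch only: `parse_mode` is a hypothetical helper, the label `default` for the no-flag case is an assumption, and when several flags are given the last one wins.

```bash
# Hypothetical helper: scan the arguments for a mode flag and print the
# selected mode ("quick", "deep", "comments-only", or "default").
parse_mode() {
  local mode="default" arg
  for arg in "$@"; do
    case "$arg" in
      --quick)         mode="quick" ;;
      --deep)          mode="deep" ;;
      --comments-only) mode="comments-only" ;;
    esac
  done
  echo "$mode"
}

# Example (unquoted on purpose so the arguments string is split into words):
#   MODE=$(parse_mode $ARGUMENTS)
```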
## Success Metrics
The PR is ready when:
- ✅ All CI checks passing
- ✅ All review comments addressed
- ✅ No hard-coded values
- ✅ Comprehensive test coverage
- ✅ Documentation complete
- ✅ No performance issues
## Common Review Patterns
### "Hard-coded value" Comment
1. Identify the value
2. Create parameter with parameter-architect
3. Update implementation with rules-engineer
4. Add test with test-creator
### "Missing test" Comment
1. Identify the scenario
2. Use edge-case-generator for boundaries
3. Use test-creator for integration tests
4. Verify with implementation-validator
### "Needs documentation" Comment
1. Use documentation-enricher
2. Add calculation examples
3. Add regulatory references
4. Explain edge cases
### "Performance issue" Comment
1. Use performance-optimizer
2. Vectorize operations
3. Remove redundant calculations
4. Test with large datasets
## Error Handling
If agents produce conflicting fixes:
1. Prioritize fixes that address review comments
2. Ensure no regressions
3. Maintain backward compatibility
4. Document any tradeoffs
## Pre-Flight Checklist
Before starting, confirm:
- [ ] I will NOT make direct edits (no Edit/Write/MultiEdit by coordinator)
- [ ] I will invoke agents for ALL changes
- [ ] I will wait for each agent to complete
- [ ] I will generate tests even if current tests pass
- [ ] I will commit after each agent phase
Start with determining which PR to fix, then proceed to Phase 1: Analyze the PR and review comments.

420
commands/review-pr.md Normal file

@@ -0,0 +1,420 @@
---
description: Review an existing PR and post findings to GitHub (does not make code changes)
---
# Reviewing PR: $ARGUMENTS
**READ-ONLY MODE**: This command analyzes the PR and posts a comprehensive review to GitHub WITHOUT making any code changes.
Use `/fix-pr` to apply fixes automatically.
## Important: Avoiding Duplicate Reviews
**Before posting**, check if you already have an active review on this PR. If you do AND there are no replies from others:
- Delete your old comments
- Post a single updated review
- DO NOT create multiple review comments
Only post additional reviews if others have engaged with your previous review.
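The duplicate-review rule above can be summarized as a small decision function. The sketch assumes the caller has already determined two yes/no facts: whether an earlier review by you exists, and whether anyone else has replied to it. The function name `review_action` and the action labels are hypothetical.

```bash
# Hypothetical helper: decide how to post a review.
# $1 = "yes" if you already have a review on the PR, otherwise "no"
# $2 = "yes" if others have replied to that review, otherwise "no"
review_action() {
  local have_existing="$1" others_replied="$2"
  if [ "$have_existing" != "yes" ]; then
    echo "post"             # no earlier review: post normally
  elif [ "$others_replied" = "yes" ]; then
    echo "post-additional"  # others engaged: keep history, add a new review
  else
    echo "replace"          # delete old comments, post one updated review
  fi
}
```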
## Determining Which PR to Review
First, determine which PR to review based on the arguments:
```bash
# If no arguments provided, use current branch's PR
if [ -z "$ARGUMENTS" ]; then
CURRENT_BRANCH=$(git branch --show-current)
PR_NUMBER=$(gh pr list --head "$CURRENT_BRANCH" --json number --jq '.[0].number')
if [ -z "$PR_NUMBER" ]; then
echo "No PR found for current branch $CURRENT_BRANCH"
exit 1
fi
# If argument is a number, use it directly
elif [[ "$ARGUMENTS" =~ ^[0-9]+$ ]]; then
PR_NUMBER=$ARGUMENTS
# Otherwise, search for PR by description/title
else
PR_NUMBER=$(gh pr list --search "$ARGUMENTS" --json number,title --jq '.[0].number')
if [ -z "$PR_NUMBER" ]; then
echo "No PR found matching: $ARGUMENTS"
exit 1
fi
fi
echo "Reviewing PR #$PR_NUMBER"
```
## Phase 1: PR Analysis
Gather context about the PR:
```bash
gh pr view $PR_NUMBER --comments
gh pr checks $PR_NUMBER
gh pr diff $PR_NUMBER
```
Document findings:
- Current CI status
- Existing review comments
- Files changed
- Type of implementation (new program, bug fix, enhancement)
## Phase 2: Comprehensive Validation (Read-Only)
Run validators to COLLECT issues (do not fix):
### Step 1: Domain Validation
Invoke **implementation-validator** to identify:
- Federal/state jurisdiction issues
- Variable naming violations
- Hard-coded values
- Performance optimization opportunities
- Documentation gaps
- PolicyEngine pattern violations
**IMPORTANT**: Instruct agent to ONLY report issues, not fix them.
### Step 2: Reference Validation
Invoke **reference-validator** to check:
- Missing parameter references
- References that don't corroborate values
- Incorrect source types (federal vs state)
- Generic/vague citations
### Step 3: Implementation Validation
Invoke **implementation-validator** to identify:
- Hard-coded values in formulas
- Incomplete implementations (TODOs, stubs)
- Entity usage errors
- Formula pattern violations
### Step 4: Test Coverage Analysis
Invoke **edge-case-generator** to identify:
- Missing boundary tests
- Untested edge cases
- Parameter combinations not tested
### Step 5: Documentation Review
Invoke **documentation-enricher** to find:
- Missing docstrings
- Lack of calculation examples
- Missing regulatory references
- Unclear variable descriptions
## Phase 3: Collect and Organize Findings
Aggregate all issues into structured format:
```json
{
  "overall_summary": "Brief summary of PR quality and main concerns",
  "severity": "APPROVE|COMMENT|REQUEST_CHANGES",
  "line_comments": [
    {
      "path": "policyengine_us/variables/gov/states/ma/dta/ccfa/income.py",
      "line": 42,
      "body": "💡 **Hard-coded value detected**\n\nThis value should be parameterized:\n\n```suggestion\nmax_income = parameters(period).gov.states.ma.dta.ccfa.max_income\n```\n\n📚 See: [Parameter Guidelines](https://github.com/PolicyEngine/policyengine-us/blob/master/CLAUDE.md#parameter-structure-best-practices)"
    },
    {
      "path": "policyengine_us/tests/variables/gov/states/ma/dta/ccfa/test_eligibility.yaml",
      "line": 15,
      "body": "🧪 **Missing edge case test**\n\nConsider adding a test for income exactly at the threshold:\n\n```yaml\n- name: At threshold\n  period: 2024-01\n  input:\n    household_income: 75_000\n  output:\n    ma_ccfa_eligible: true\n```"
    }
  ],
  "general_comments": [
    "Consider adding integration tests for multi-child households",
    "Documentation could benefit from real-world calculation examples"
  ]
}
```
## Phase 4: Check for Existing Reviews
**IMPORTANT**: Before posting a new review, check if you already have an active review:
```bash
# Resolve the account that posts reviews (the authenticated gh user)
REVIEWER=$(gh api user --jq '.login')
# Check for existing reviews from that account
EXISTING_REVIEWS=$(gh api "/repos/{owner}/{repo}/pulls/$PR_NUMBER/reviews" \
  --jq "[.[] | select(.user.login == \"$REVIEWER\") | {id: .id, state: .state, submitted_at: .submitted_at}]")
# Reviews are listed in chronological order, so the last entry is the latest
LATEST_REVIEW_TIME=$(echo "$EXISTING_REVIEWS" | jq -r '.[-1].submitted_at // empty')
if [ -n "$LATEST_REVIEW_TIME" ]; then
  # Check for comments from others after your latest review
  COMMENTS_AFTER=$(gh api "/repos/{owner}/{repo}/issues/$PR_NUMBER/comments" \
    --jq "[.[] | select(.created_at > \"$LATEST_REVIEW_TIME\" and .user.login != \"$REVIEWER\")]")
  if [ -z "$COMMENTS_AFTER" ] || [ "$COMMENTS_AFTER" == "[]" ]; then
    echo "Existing review found with no replies from others"
    echo "STRATEGY: Delete old comments and post single updated review"
    # Delete your old issue comments on this PR
    gh api "/repos/{owner}/{repo}/issues/$PR_NUMBER/comments" \
      --jq ".[] | select(.user.login == \"$REVIEWER\") | .id" | \
      while read -r comment_id; do
        gh api --method DELETE "/repos/{owner}/{repo}/issues/comments/$comment_id"
      done
  else
    echo "Others have replied to your review - will add new review"
  fi
fi
```
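The decision in the script above depends on two inputs: the time of your latest review and the times of any comments written by other users. A minimal sketch of that decision as a standalone function follows. The name `review_strategy` and its return values are hypothetical, and it assumes ISO 8601 UTC timestamps of the form the GitHub API returns, which order correctly when compared as plain strings.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of gh): decide how to post a review.
# Usage: review_strategy LATEST_REVIEW_TIME [OTHER_COMMENT_TIME...]
# Prints one of: post-new, replace, add-review
review_strategy() {
  local latest="$1"
  shift
  # No earlier review from us: nothing to clean up.
  if [ -z "$latest" ]; then
    echo "post-new"
    return
  fi
  local t
  for t in "$@"; do
    # ISO 8601 UTC timestamps of equal format compare correctly as strings.
    if [[ "$t" > "$latest" ]]; then
      echo "add-review"
      return
    fi
  done
  # Nobody replied after our latest review: replace it with one updated review.
  echo "replace"
}
```

For example, `review_strategy "$LATEST_REVIEW_TIME"` with no further arguments prints `replace` when a prior review exists and `post-new` when the variable is empty.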
## Phase 5: Post GitHub Review
Post the review using GitHub CLI:
```bash
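# Create review comments JSON
cat > /tmp/review_comments.json <<'EOF'
[FORMATTED_COMMENTS_ARRAY]
EOF
# Build the request body with jq so `comments` is sent as a JSON array
# (a -F comments=@file value would be sent as a plain string), then post it
jq -n \
  --arg body "$OVERALL_SUMMARY" \
  --arg event "$SEVERITY" \
  --slurpfile comments /tmp/review_comments.json \
  '{body: $body, event: $event, comments: $comments[0]}' |
  gh api \
    --method POST \
    -H "Accept: application/vnd.github+json" \
    -H "X-GitHub-Api-Version: 2022-11-28" \
    "/repos/{owner}/{repo}/pulls/$PR_NUMBER/reviews" \
    --input -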
```
### Severity Guidelines
- **APPROVE**: Minor suggestions only, no critical issues
- **COMMENT**: Has issues but not blocking (educational feedback)
- **REQUEST_CHANGES**: Has critical issues that must be fixed:
  - Hard-coded values
  - Missing tests for core functionality
  - Incorrect implementations
  - Missing required documentation
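
As a rough illustration of these guidelines, the review event can be derived from two counts: critical issues and non-blocking issues. This is a sketch only. The function name `choose_review_event` and the idea of tallying findings into two counters are assumptions, not something the validators are known to provide.

```bash
#!/usr/bin/env bash
# Hypothetical helper: map issue counts to a GitHub review event.
# Usage: choose_review_event CRITICAL_COUNT NON_BLOCKING_COUNT
choose_review_event() {
  local critical="${1:-0}" non_blocking="${2:-0}"
  if [ "$critical" -gt 0 ]; then
    # Any critical issue blocks the PR.
    echo "REQUEST_CHANGES"
  elif [ "$non_blocking" -gt 0 ]; then
    # Issues exist, but none are blocking.
    echo "COMMENT"
  else
    # At most minor suggestions remain.
    echo "APPROVE"
  fi
}
```

The result could then be assigned with `SEVERITY=$(choose_review_event "$CRITICAL" "$NON_BLOCKING")`, where both variables are placeholders for whatever counts the findings produce.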
## Phase 6: Post Summary Comment
Add a comprehensive summary as a regular comment:
```bash
gh pr comment $PR_NUMBER --body "## 📋 Review Summary
### ✅ Strengths
- [List positive aspects]
- Well-structured variable organization
- Clear parameter names
### 🔍 Issues Found
#### Critical (Must Fix)
- [ ] 3 hard-coded values in income calculations
- [ ] Missing edge case tests for boundary conditions
- [ ] Parameter references not corroborated by sources
#### Suggestions (Optional)
- [ ] Consider vectorization in \`calculate_benefit()\`
- [ ] Add calculation walkthrough to documentation
- [ ] Extract shared logic into helper variable
### 📊 Validation Results
- **Domain Validation**: X issues
- **Reference Validation**: X issues
- **Implementation Validation**: X issues
- **Test Coverage**: X gaps identified
- **Documentation**: X improvements suggested
### 🚀 Next Steps
To apply these fixes automatically, run:
\`\`\`bash
/fix-pr $PR_NUMBER
\`\`\`
Or address manually and re-request review when ready.
---
💡 **Tip**: Use \`/fix-pr\` to automatically apply all suggested fixes."
```
## Phase 7: CI Status Check
If CI is failing, include a summary of the failed checks:
```bash
# `bucket` groups each check's state into pass, fail, pending, skipping, or cancel
gh pr checks "$PR_NUMBER" --json name,bucket \
  --jq '.[] | select(.bucket == "fail") | "❌ " + .name'
```
Add to review:
```
### ⚠️ CI Failures
- ❌ lint: Black formatting issues
- ❌ test: 3 test failures in test_income.py
Run `/fix-pr` to automatically fix these issues.
```
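A minimal sketch of how that section could be assembled from a list of failed check names, one per line on standard input. The helper name `format_ci_failures` is hypothetical. It prints only the heading and the bullet list, and the per-check descriptions shown in the example above would still need to be written by hand.

```bash
#!/usr/bin/env bash
# Hypothetical helper: turn failed check names (one per line on stdin)
# into a markdown "CI Failures" section. Prints nothing if there are none.
format_ci_failures() {
  local name printed_header=0
  # The `|| [ -n "$name" ]` keeps a final line that lacks a trailing newline.
  while IFS= read -r name || [ -n "$name" ]; do
    [ -z "$name" ] && continue
    if [ "$printed_header" -eq 0 ]; then
      printf '### ⚠️ CI Failures\n'
      printed_header=1
    fi
    printf -- '- ❌ %s\n' "$name"
  done
}
```

For instance, `printf 'lint\ntest\n' | format_ci_failures` prints the heading followed by one bullet for each of the two checks.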
## Command Options
### Usage Examples
- `/review-pr` - Review PR for current branch
- `/review-pr 6390` - Review PR #6390
- `/review-pr "Massachusetts CCFA"` - Search for and review PR by title
### Validation Modes
#### Standard Review (Default)
- All validators run
- Comprehensive analysis
- Balanced severity
#### Quick Review
`/review-pr [PR] --quick`
- Skip deep validation
- Focus on CI failures and critical issues
- Faster turnaround
#### Strict Review
`/review-pr [PR] --strict`
- Maximum scrutiny
- Flag even minor style issues
- Request changes for any violations
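
Because a mode flag shares the argument string with the PR number or title, the flag has to be separated out before the PR lookup runs. The following is one possible sketch. The function `parse_review_args` and the variables `REVIEW_MODE` and `PR_SELECTOR` are hypothetical names, and it assumes the argument string arrives with any surrounding quotes already removed.

```bash
#!/usr/bin/env bash
# Hypothetical helper: split the argument string into a review mode and a PR selector.
# Sets REVIEW_MODE (standard|quick|strict) and PR_SELECTOR (number, title, or empty).
parse_review_args() {
  REVIEW_MODE="standard"
  PR_SELECTOR=""
  local -a words=()
  local word
  # Split on whitespace without triggering filename globbing.
  read -r -a words <<< "$1"
  for word in "${words[@]}"; do
    case "$word" in
      --quick)  REVIEW_MODE="quick" ;;
      --strict) REVIEW_MODE="strict" ;;
      # Anything else is part of the PR number or title search.
      *)        PR_SELECTOR="${PR_SELECTOR:+$PR_SELECTOR }$word" ;;
    esac
  done
}
```

After `parse_review_args "$ARGUMENTS"`, the PR lookup would use `$PR_SELECTOR` in place of `$ARGUMENTS`, and `$REVIEW_MODE` would decide which validators to run.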
## Output Format
The review will include:
1. **Overall Summary**: High-level assessment
2. **Inline Comments**: Specific issues at exact lines
3. **Code Suggestions**: GitHub suggestion blocks where applicable
4. **Summary Comment**: Checklist of all issues
5. **Next Steps**: How to address findings
## Review Categories
### 🔴 Critical Issues (Always flag)
- Hard-coded values in formulas
- Missing parameter references
- Incorrect formula implementations
- Missing tests for core functionality
- CI failures
### 🟡 Suggestions (Optional improvements)
- Performance optimizations
- Documentation enhancements
- Code style improvements
- Additional test coverage
- Refactoring opportunities
### 🟢 Positive Feedback
- Well-implemented patterns
- Good test coverage
- Clear documentation
- Proper parameterization
## Success Criteria
A good review should:
- ✅ Identify all hard-coded values
- ✅ Flag missing/incorrect references
- ✅ Suggest specific improvements with examples
- ✅ Provide actionable next steps
- ✅ Be respectful and constructive
- ✅ Include code suggestion blocks for easy application
## Common Review Patterns
### Hard-coded Value
```markdown
💡 **Hard-coded value detected**
This value should be parameterized. Create a parameter:
\`\`\`yaml
# policyengine_us/parameters/gov/states/ma/program/threshold.yaml
values:
  2024-01-01: 75000
metadata:
  unit: currency-USD
  period: year
  reference: [Link to source]
\`\`\`
Then use in formula:
\`\`\`suggestion
threshold = parameters(period).gov.states.ma.program.threshold
\`\`\`
```
### Missing Test
```markdown
🧪 **Missing edge case test**
Add test for boundary condition:
\`\`\`yaml
- name: At exact threshold
  period: 2024-01
  input:
    income: 75_000
  output:
    eligible: true  # Income equals the threshold; expect false if the limit is exclusive
\`\`\`
```
### Missing Documentation
```markdown
📚 **Documentation enhancement**
Add docstring with example:
\`\`\`suggestion
def formula(person, period, parameters):
    """
    Calculate benefit amount.
    Example:
        Income $50,000 → Benefit $2,400
        Income $75,000 → Benefit $1,200
    Reference: MA DTA Regulation 106 CMR 702.310
    """
\`\`\`
```
### Performance Issue
```markdown
**Vectorization opportunity**
Replace loop with vectorized operation:
\`\`\`suggestion
return where(
    household_size == 1,
    base_amount,
    base_amount * household_size * 0.7
)
\`\`\`
```
## Error Handling
If validation agents fail:
1. Run basic validation manually
2. Document what couldn't be checked
3. Note limitations in review
4. Still post whatever findings were collected
## Pre-Flight Checklist
Before starting, confirm:
- [ ] I will NOT make any code changes
- [ ] I will ONLY analyze and post reviews
- [ ] I will collect findings from all validators
- [ ] I will format findings as GitHub review comments
- [ ] I will provide actionable suggestions
- [ ] I will be constructive and respectful
Start by determining which PR to review, then proceed to Phase 1: Analyze the PR.