--- name: policyengine-review-patterns description: PolicyEngine code review patterns - validation checklist, common issues, review standards --- # PolicyEngine Review Patterns Comprehensive patterns for reviewing PolicyEngine implementations. ## Understanding WHY, Not Just WHAT ### Pattern Analysis Before Review When reviewing implementations that reference other states: **🔴 CRITICAL: Check WHY Variables Exist** Before approving any state-specific variable, verify: 1. **Does it have state-specific logic?** - Read the formula 2. **Are state parameters used?** - Check for `parameters(period).gov.states.XX` 3. **Is there transformation beyond aggregation?** - Look for calculations 4. **Would removing it break functionality?** - Test dependencies **Example Analysis:** ```python # IL TANF has this variable: class il_tanf_assistance_unit_size(Variable): adds = ["il_tanf_payment_eligible_child", "il_tanf_payment_eligible_parent"] # ✅ VALID: IL-specific eligibility rules # But IN TANF shouldn't copy it blindly: class in_tanf_assistance_unit_size(Variable): def formula(spm_unit, period): return spm_unit("spm_unit_size", period) # ❌ INVALID: No IN-specific logic, just wrapper ``` ### Wrapper Variable Detection **Red Flags - Variables that shouldn't exist:** - Formula is just `return entity("federal_variable", period)` - Aggregates federal baseline with no transformation - No state parameters accessed - Comment says "use federal" but creates variable anyway **Action:** Request deletion of unnecessary wrapper variables --- ## Priority Review Checklist ### 🔴 CRITICAL - Automatic Failures These issues will cause crashes or incorrect results: #### 1. Vectorization Violations ```python ❌ FAILS: if household("income") > 1000: # Will crash with arrays return 500 ✅ PASSES: return where(household("income") > 1000, 500, 100) ``` #### 2. Hard-Coded Values ```python ❌ FAILS: benefit = min_(income * 0.33, 500) # Hard-coded 0.33 and 500 ✅ PASSES: benefit = min_(income * p.rate, p.maximum) ``` #### 3. Missing Parameter Sources ```yaml ❌ FAILS: reference: - title: State website href: https://state.gov ✅ PASSES: reference: - title: Idaho Admin Code 16.05.03.205(3) href: https://adminrules.idaho.gov/rules/current/16/160503.pdf#page=14 ``` --- ### 🟡 MAJOR - Must Fix These affect accuracy or maintainability: #### 4. Test Quality Issues ```yaml ❌ FAILS: income: 50000 # No separator ✅ PASSES: income: 50_000 # Proper formatting ``` #### 5. Calculation Accuracy - Order of operations matches regulations - Deductions applied in correct sequence - Edge cases handled (negatives, zeros) #### 6. Description Style ```yaml ❌ FAILS: description: The amount of SNAP benefits # Passive voice ✅ PASSES: description: SNAP benefits # Active voice ``` --- ### 🟢 MINOR - Should Fix These improve code quality: #### 7. Code Organization - One variable per file - Proper use of `defined_for` - Use of `adds` for simple sums #### 8. Documentation - Clear references to regulation sections - Changelog entry present --- ## Common Issues Reference ### Documentation Issues | Issue | Example | Fix | |-------|---------|-----| | No primary source | "See SNAP website" | Add USC/CFR citation | | Wrong value | $198 vs $200 in source | Update parameter | | Generic link | dol.gov | Link to specific regulation | | Missing subsection | "7 CFR 273" | "7 CFR 273.9(d)(3)" | ### Code Issues | Issue | Impact | Fix | |-------|--------|-----| | if-elif-else with data | Crashes microsim | Use where/select | | Hard-coded values | Inflexible | Move to parameters | | Missing defined_for | Inefficient | Add eligibility condition | | Manual summing | Wrong pattern | Use adds attribute | ### Test Issues | Issue | Example | Fix | |-------|---------|-----| | No separators | 100000 | 100_000 | | No documentation | output: 500 | Add calculation comment | | Wrong period | 2024-04 | Use 2024-01 or 2024 | | Made-up variables | heating_expense | Use existing variables | --- ## Source Verification Process ### Step 1: Check Parameter Values For each parameter file: ```python ✓ Value matches source document ✓ Source is primary (statute > regulation > website) ✓ URL links to exact section with page anchor ✓ Effective dates correct ``` ### Step 2: Validate References **Primary sources (preferred):** - USC (United States Code) - CFR (Code of Federal Regulations) - State statutes - State admin codes **Secondary sources (acceptable):** - Official policy manuals - State plan documents **Not acceptable alone:** - Websites without specific sections - Summaries or fact sheets - News articles --- ## Code Quality Checks ### Vectorization Scan Search for these patterns: ```python # Red flags that indicate scalar logic: "if household" "if person" "elif" "else:" "and " (should be &) "or " (should be |) "not " (should be ~) ``` ### Hard-Coding Scan Search for numeric literals: ```python # Check for any number except: # 0, 1, -1 (basic math) # 12 (month conversion) # Small indices (2, 3 for known structures) # Flag anything like: "0.5" "100" "0.33" "65" (unless it's a standard age) ``` --- ## Review Response Templates ### For Approval ```markdown ## PolicyEngine Review: APPROVED ✅ ### Verification Summary - ✅ All parameters trace to primary sources - ✅ Code is properly vectorized - ✅ Tests document calculations - ✅ No hard-coded values ### Strengths - Excellent USC/CFR citations - Comprehensive test coverage - Clear calculation logic ### Minor Suggestions (optional) - Consider adding edge case for zero income ``` ### For Changes Required ```markdown ## PolicyEngine Review: CHANGES REQUIRED ❌ ### Critical Issues (Must Fix) 1. **Non-vectorized code** - lines 45-50 ```python # Replace this: if income > threshold: benefit = high_amount # With this: benefit = where(income > threshold, high_amount, low_amount) ``` 2. **Parameter value mismatch** - standard_deduction.yaml - Source shows $200, parameter has $198 - Reference: 7 CFR 273.9(d)(1), page 5 ### Major Issues (Should Fix) 3. **Missing primary source** - income_limit.yaml - Add statute/regulation citation - Current website link insufficient Please address these issues and re-request review. ``` --- ## Test Validation ### Check Test Structure ```yaml # Verify proper format: - name: Case 1, description. # Numbered case with period period: 2024-01 # Valid period (2024-01 or 2024) input: people: person1: # Generic names employment_income: 50_000 # Underscores output: # Calculation documented # Income: $50,000/year = $4,167/month program_benefit: 250 ``` ### Run Test Commands ```bash # Unit tests pytest policyengine_us/tests/policy/baseline/gov/ # Integration tests policyengine-core test -c policyengine_us # Microsimulation pytest policyengine_us/tests/microsimulation/ ``` --- ## Review Priorities by Context ### New Program Implementation 1. Parameter completeness 2. All documented scenarios tested 3. Eligibility paths covered 4. No hard-coded values ### Bug Fixes 1. Root cause addressed 2. No regression potential 3. Tests prevent recurrence 4. Vectorization maintained ### Refactoring 1. Functionality preserved 2. Tests still pass 3. Performance maintained 4. Code clarity improved --- ## Quick Review Checklist **Parameters:** - [ ] Values match sources - [ ] References include subsections - [ ] All metadata fields present - [ ] Effective dates correct **Variables:** - [ ] Properly vectorized (no if-elif-else) - [ ] No hard-coded values - [ ] Uses existing variables - [ ] Includes proper metadata **Tests:** - [ ] Proper period format - [ ] Underscore separators - [ ] Calculation comments - [ ] Realistic scenarios **Overall:** - [ ] Changelog entry - [ ] Code formatted - [ ] Tests pass - [ ] Documentation complete --- ## For Agents When reviewing code: 1. **Check vectorization first** - crashes are worst 2. **Verify parameter sources** - accuracy critical 3. **Scan for hard-coding** - maintainability issue 4. **Validate test quality** - ensures correctness 5. **Run all tests** - catch integration issues 6. **Document issues clearly** - help fixes 7. **Provide fix examples** - speed resolution