Files
2025-11-30 08:47:57 +08:00

8.2 KiB

name, description
name description
policyengine-review-patterns PolicyEngine code review patterns - validation checklist, common issues, review standards

PolicyEngine Review Patterns

Comprehensive patterns for reviewing PolicyEngine implementations.

Understanding WHY, Not Just WHAT

Pattern Analysis Before Review

When reviewing implementations that reference other states:

🔴 CRITICAL: Check WHY Variables Exist

Before approving any state-specific variable, verify:

  1. Does it have state-specific logic? - Read the formula
  2. Are state parameters used? - Check for parameters(period).gov.states.XX
  3. Is there transformation beyond aggregation? - Look for calculations
  4. Would removing it break functionality? - Test dependencies

Example Analysis:

# IL TANF has this variable:
class il_tanf_assistance_unit_size(Variable):
    adds = ["il_tanf_payment_eligible_child", "il_tanf_payment_eligible_parent"]
    # ✅ VALID: IL-specific eligibility rules

# But IN TANF shouldn't copy it blindly:
class in_tanf_assistance_unit_size(Variable):
    def formula(spm_unit, period):
        return spm_unit("spm_unit_size", period)
    # ❌ INVALID: No IN-specific logic, just wrapper

Wrapper Variable Detection

Red Flags - Variables that shouldn't exist:

  • Formula is just return entity("federal_variable", period)
  • Aggregates federal baseline with no transformation
  • No state parameters accessed
  • Comment says "use federal" but creates variable anyway

Action: Request deletion of unnecessary wrapper variables


Priority Review Checklist

🔴 CRITICAL - Automatic Failures

These issues will cause crashes or incorrect results:

1. Vectorization Violations

 FAILS:
if household("income") > 1000:  # Will crash with arrays
    return 500

 PASSES:
return where(household("income") > 1000, 500, 100)

2. Hard-Coded Values

 FAILS:
benefit = min_(income * 0.33, 500)  # Hard-coded 0.33 and 500

 PASSES:
benefit = min_(income * p.rate, p.maximum)

3. Missing Parameter Sources

❌ FAILS:
reference:
  - title: State website
    href: https://state.gov

✅ PASSES:
reference:
  - title: Idaho Admin Code 16.05.03.205(3)
    href: https://adminrules.idaho.gov/rules/current/16/160503.pdf#page=14

🟡 MAJOR - Must Fix

These affect accuracy or maintainability:

4. Test Quality Issues

❌ FAILS:
income: 50000  # No separator

✅ PASSES:
income: 50_000  # Proper formatting

5. Calculation Accuracy

  • Order of operations matches regulations
  • Deductions applied in correct sequence
  • Edge cases handled (negatives, zeros)

6. Description Style

❌ FAILS:
description: The amount of SNAP benefits  # Passive voice

✅ PASSES:
description: SNAP benefits  # Active voice

🟢 MINOR - Should Fix

These improve code quality:

7. Code Organization

  • One variable per file
  • Proper use of defined_for
  • Use of adds for simple sums

8. Documentation

  • Clear references to regulation sections
  • Changelog entry present

Common Issues Reference

Documentation Issues

Issue Example Fix
No primary source "See SNAP website" Add USC/CFR citation
Wrong value $198 vs $200 in source Update parameter
Generic link dol.gov Link to specific regulation
Missing subsection "7 CFR 273" "7 CFR 273.9(d)(3)"

Code Issues

Issue Impact Fix
if-elif-else with data Crashes microsim Use where/select
Hard-coded values Inflexible Move to parameters
Missing defined_for Inefficient Add eligibility condition
Manual summing Wrong pattern Use adds attribute

Test Issues

Issue Example Fix
No separators 100000 100_000
No documentation output: 500 Add calculation comment
Wrong period 2024-04 Use 2024-01 or 2024
Made-up variables heating_expense Use existing variables

Source Verification Process

Step 1: Check Parameter Values

For each parameter file:

 Value matches source document
 Source is primary (statute > regulation > website)
 URL links to exact section with page anchor
 Effective dates correct

Step 2: Validate References

Primary sources (preferred):

  • USC (United States Code)
  • CFR (Code of Federal Regulations)
  • State statutes
  • State admin codes

Secondary sources (acceptable):

  • Official policy manuals
  • State plan documents

Not acceptable alone:

  • Websites without specific sections
  • Summaries or fact sheets
  • News articles

Code Quality Checks

Vectorization Scan

Search for these patterns:

# Red flags that indicate scalar logic:
"if household"
"if person"
"elif"
"else:"
"and " (should be &)
"or " (should be |)
"not " (should be ~)

Hard-Coding Scan

Search for numeric literals:

# Check for any number except:
# 0, 1, -1 (basic math)
# 12 (month conversion)
# Small indices (2, 3 for known structures)

# Flag anything like:
"0.5"
"100"
"0.33"
"65" (unless it's a standard age)

Review Response Templates

For Approval

## PolicyEngine Review: APPROVED ✅

### Verification Summary
- ✅ All parameters trace to primary sources
- ✅ Code is properly vectorized
- ✅ Tests document calculations
- ✅ No hard-coded values

### Strengths
- Excellent USC/CFR citations
- Comprehensive test coverage
- Clear calculation logic

### Minor Suggestions (optional)
- Consider adding edge case for zero income

For Changes Required

## PolicyEngine Review: CHANGES REQUIRED ❌

### Critical Issues (Must Fix)

1. **Non-vectorized code** - lines 45-50
   ```python
   # Replace this:
   if income > threshold:
       benefit = high_amount

   # With this:
   benefit = where(income > threshold, high_amount, low_amount)
  1. Parameter value mismatch - standard_deduction.yaml
    • Source shows $200, parameter has $198
    • Reference: 7 CFR 273.9(d)(1), page 5

Major Issues (Should Fix)

  1. Missing primary source - income_limit.yaml
    • Add statute/regulation citation
    • Current website link insufficient

Please address these issues and re-request review.


---

## Test Validation

### Check Test Structure

```yaml
# Verify proper format:
- name: Case 1, description.  # Numbered case with period
  period: 2024-01  # Valid period (2024-01 or 2024)
  input:
    people:
      person1:  # Generic names
        employment_income: 50_000  # Underscores
  output:
    # Calculation documented
    # Income: $50,000/year = $4,167/month
    program_benefit: 250

Run Test Commands

# Unit tests
pytest policyengine_us/tests/policy/baseline/gov/

# Integration tests
policyengine-core test <path> -c policyengine_us

# Microsimulation
pytest policyengine_us/tests/microsimulation/

Review Priorities by Context

New Program Implementation

  1. Parameter completeness
  2. All documented scenarios tested
  3. Eligibility paths covered
  4. No hard-coded values

Bug Fixes

  1. Root cause addressed
  2. No regression potential
  3. Tests prevent recurrence
  4. Vectorization maintained

Refactoring

  1. Functionality preserved
  2. Tests still pass
  3. Performance maintained
  4. Code clarity improved

Quick Review Checklist

Parameters:

  • Values match sources
  • References include subsections
  • All metadata fields present
  • Effective dates correct

Variables:

  • Properly vectorized (no if-elif-else)
  • No hard-coded values
  • Uses existing variables
  • Includes proper metadata

Tests:

  • Proper period format
  • Underscore separators
  • Calculation comments
  • Realistic scenarios

Overall:

  • Changelog entry
  • Code formatted
  • Tests pass
  • Documentation complete

For Agents

When reviewing code:

  1. Check vectorization first - crashes are worst
  2. Verify parameter sources - accuracy critical
  3. Scan for hard-coding - maintainability issue
  4. Validate test quality - ensures correctness
  5. Run all tests - catch integration issues
  6. Document issues clearly - help fixes
  7. Provide fix examples - speed resolution