| name | description |
|---|---|
| policyengine-implementation-patterns | PolicyEngine implementation patterns - variable creation, no hard-coding principle, federal/state separation, metadata standards |
PolicyEngine Implementation Patterns
Essential patterns for implementing government benefit program rules in PolicyEngine.
PolicyEngine Architecture Constraints
What CANNOT Be Simulated (Single-Period Limitation)
CRITICAL: PolicyEngine uses single-period simulation architecture
The following CANNOT be implemented and should be SKIPPED when found in documentation:
1. Time Limits and Lifetime Counters
Cannot simulate:
- ANY lifetime benefit limits (X months total)
- ANY time windows (X months within Y period)
- Benefit clocks and countable months
- Cumulative time tracking
Why: Requires tracking benefit history across multiple periods. PolicyEngine simulates one period at a time with no state persistence.
What to do: Document in comments but DON'T parameterize or implement:
# NOTE: [State] has [X]-month lifetime limit on [Program] benefits
# This cannot be simulated in PolicyEngine's single-period architecture
2. Work History Requirements
Cannot simulate:
- "Must have worked 6 of last 12 months"
- "Averaged 30 hours/week over past quarter"
- Prior employment verification
- Work participation rate tracking
Why: Requires historical data from previous periods.
3. Waiting Periods and Benefit Delays
Cannot simulate:
- "3-month waiting period for new residents"
- "Benefits start month after application"
- Retroactive eligibility
- Benefit recertification cycles
Why: Requires tracking application dates and eligibility history.
4. Progressive Sanctions and Penalties
Cannot simulate:
- "First violation: 1-month sanction, Second: 3-month, Third: permanent"
- Graduated penalties
- Strike systems
Why: Requires tracking violation history.
5. Asset Spend-Down Over Time
Cannot simulate:
- Medical spend-down across months
- Resource depletion tracking
- Accumulated medical expenses
Why: Requires tracking expenses and resources across periods.
What CAN Be Simulated (With Caveats)
PolicyEngine CAN simulate point-in-time eligibility and benefits:
- ✅ Current month income limits
- ✅ Current month resource limits
- ✅ Current benefit calculations
- ✅ Current household composition
- ✅ Current deductions and disregards
Time-Limited Benefits That Affect Current Calculations
Special Case: Time-limited deductions/disregards
When a deduction or disregard is only available for X months:
- DO implement the deduction (assume it applies)
- DO add a comment explaining the time limitation
- DON'T try to track or enforce the time limit
Example:
class state_tanf_countable_earned_income(Variable):
    def formula(spm_unit, period, parameters):
        p = parameters(period).gov.states.xx.tanf.income
        earned = spm_unit("tanf_gross_earned_income", period)
        # NOTE: In reality, this 75% disregard only applies for first 4 months
        # of employment. PolicyEngine cannot track employment duration, so we
        # apply the disregard assuming the household qualifies.
        # Actual rule: [State Code Citation]
        disregard_rate = p.earned_income_disregard_rate  # 0.75
        return earned * (1 - disregard_rate)
Rule: Anything that requires history or future tracking CANNOT be fully simulated. Implement the point-in-time portion and document the limitation in a comment.
Critical Principles
1. ZERO Hard-Coded Values
Every numeric value MUST be parameterized
❌ FORBIDDEN:
return where(eligible, 1000, 0) # Hard-coded 1000
age < 15 # Hard-coded 15
benefit = income * 0.33 # Hard-coded 0.33
month >= 10 or month <= 3 # Hard-coded months (an October-March season wraps the year end, so the test needs or, not and)
✅ REQUIRED:
return where(eligible, p.maximum_benefit, 0)
age < p.age_threshold.minor_child
benefit = income * p.benefit_rate
month >= p.season.start_month
Acceptable literals:
- 0, 1, -1 for basic math
- 12 for month conversion (/ 12, * 12)
- Array indices when structure is known
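Season checks like the October-to-March window in the forbidden example are a common place for literals to creep in. The following is a minimal sketch in plain Python, not PolicyEngine's API; the SeasonParams container and its end_month field are assumptions introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SeasonParams:
    """Hypothetical stand-in for a parameter node such as p.season."""
    start_month: int  # e.g. 10 for October
    end_month: int    # e.g. 3 for March (assumed field, not from the source)


def in_season(month: int, season: SeasonParams) -> bool:
    """Return True if month falls inside the season, including year-end wrap."""
    if season.start_month <= season.end_month:
        # Season sits inside one calendar year (e.g. May-September).
        return season.start_month <= month <= season.end_month
    # Season wraps past December (e.g. October-March).
    return month >= season.start_month or month <= season.end_month


heating = SeasonParams(start_month=10, end_month=3)
print(in_season(1, heating))  # January falls inside an October-March season
print(in_season(7, heating))  # July does not
```

Both month boundaries come from the parameter object, so changing the season means editing parameters only, never the formula.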
2. No Placeholder Implementations
Delete the file rather than leave placeholders
❌ NEVER:
def formula(entity, period, parameters):
    # TODO: Implement
    return 75  # Placeholder
✅ ALWAYS:
# Complete implementation or no file at all
Variable Implementation Standards
Variable Metadata Format
Follow established patterns:
class il_tanf_countable_earned_income(Variable):
    value_type = float
    entity = SPMUnit
    definition_period = MONTH
    label = "Illinois TANF countable earned income"
    unit = USD
    reference = "https://www.law.cornell.edu/regulations/illinois/..."
    defined_for = StateCode.IL
    # Use adds for simple sums
    adds = ["il_tanf_earned_income_after_disregard"]
Key rules:
- ✅ Use full URL in reference (clickable)
- ❌ Don't use documentation field
- ❌ Don't use statute citations without URLs
When to Use adds vs formula
Use adds when:
- Just summing variables
- Passing through a single variable
- No transformations needed
✅ BEST - Simple sum:
class tanf_gross_income(Variable):
    adds = ["employment_income", "self_employment_income"]
Use formula when:
- Applying transformations
- Conditional logic
- Calculations needed
✅ CORRECT - Need logic:
def formula(entity, period, parameters):
    income = add(entity, period, ["income1", "income2"])
    return max_(0, income)  # Need max_
TANF Countable Income Pattern
Critical: Verify Calculation Order from Legal Code
MOST IMPORTANT: Always check the state's legal code or policy manual for the exact calculation order. The pattern below is typical but not universal.
The Typical Pattern:
- Apply deductions/disregards to earned income only
- Use max_() to prevent negative earned income
- Add unearned income (which typically has no deductions)
This pattern is based on how MOST TANF programs work, but you MUST verify with the specific state's legal code.
❌ WRONG - Applying deductions to total income
def formula(spm_unit, period, parameters):
    gross_earned = spm_unit("tanf_gross_earned_income", period)
    unearned = spm_unit("tanf_gross_unearned_income", period)
    deductions = spm_unit("tanf_earned_income_deductions", period)
    # ❌ WRONG: Deductions applied to total income
    total_income = gross_earned + unearned
    countable = total_income - deductions
    return max_(countable, 0)
Why this is wrong:
- Deductions should ONLY reduce earned income
- Unearned income (SSI, child support, etc.) is not subject to work expense deductions
- This incorrectly reduces unearned income when earned income is low
Example error:
- Earned: $100, Unearned: $500, Deductions: $200
- Wrong result: max_($100 + $500 - $200, 0) = $400 (reduces unearned!)
- Correct result: max_($100 - $200, 0) + $500 = $500
✅ CORRECT - Apply deductions to earned only, then add unearned
def formula(spm_unit, period, parameters):
    gross_earned = spm_unit("tanf_gross_earned_income", period)
    unearned = spm_unit("tanf_gross_unearned_income", period)
    deductions = spm_unit("tanf_earned_income_deductions", period)
    # ✅ CORRECT: Deductions applied to earned only, then add unearned
    return max_(gross_earned - deductions, 0) + unearned
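The two orderings can be compared numerically. This is a plain-Python sketch using the built-in max in place of PolicyEngine's vectorized max_, with the figures from the example error; the function names are illustrative only.

```python
def countable_income_wrong(earned: float, unearned: float, deductions: float) -> float:
    """Anti-pattern: deductions applied to total income."""
    return max(earned + unearned - deductions, 0)


def countable_income_correct(earned: float, unearned: float, deductions: float) -> float:
    """Typical pattern: deductions reduce earned income only, then unearned is added."""
    return max(earned - deductions, 0) + unearned


# Earned: $100, Unearned: $500, Deductions: $200
print(countable_income_wrong(100, 500, 200))    # 400
print(countable_income_correct(100, 500, 200))  # 500
```

The two functions agree whenever deductions do not exceed earned income, which is why the bug tends to surface only for low-earning households.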
Pattern Variations
With multiple deduction steps:
def formula(spm_unit, period, parameters):
    p = parameters(period).gov.states.xx.tanf.income
    gross_earned = spm_unit("tanf_gross_earned_income", period)
    unearned = spm_unit("tanf_gross_unearned_income", period)
    # Step 1: Apply work expense deduction
    work_expense = min_(gross_earned * p.work_expense_rate, p.work_expense_max)
    after_work_expense = max_(gross_earned - work_expense, 0)
    # Step 2: Apply earnings disregard
    earnings_disregard = after_work_expense * p.disregard_rate
    countable_earned = max_(after_work_expense - earnings_disregard, 0)
    # Step 3: Add unearned (no deductions applied)
    return countable_earned + unearned
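To make the three steps concrete, here is a worked sketch in plain Python. The parameter values (20% work expense rate, $90 cap, 50% disregard) are invented for illustration and do not describe any particular state; IncomeParams is a hypothetical stand-in for the parameter node.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IncomeParams:
    """Hypothetical values standing in for the tanf.income parameter node."""
    work_expense_rate: float
    work_expense_max: float
    disregard_rate: float


def countable_income(gross_earned: float, unearned: float, p: IncomeParams) -> float:
    # Step 1: work expense deduction, capped at the maximum.
    work_expense = min(gross_earned * p.work_expense_rate, p.work_expense_max)
    after_work_expense = max(gross_earned - work_expense, 0)
    # Step 2: percentage disregard on what remains.
    countable_earned = max(after_work_expense * (1 - p.disregard_rate), 0)
    # Step 3: unearned income is added with no deductions.
    return countable_earned + unearned


params = IncomeParams(work_expense_rate=0.2, work_expense_max=90, disregard_rate=0.5)
# $600 earned: expense min(120, 90) = 90, leaving 510; half disregarded leaves 255.
print(countable_income(600, 100, params))  # 355.0
```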
With disregard percentage (simplified):
def formula(spm_unit, period, parameters):
    p = parameters(period).gov.states.xx.tanf.income
    gross_earned = spm_unit("tanf_gross_earned_income", period)
    unearned = spm_unit("tanf_gross_unearned_income", period)
    # Apply disregard to earned (keep 33% = disregard 67%)
    countable_earned = gross_earned * (1 - p.earned_disregard_rate)
    return max_(countable_earned, 0) + unearned
When Unearned Income HAS Deductions
Some states DO have unearned income deductions (rare). Handle separately:
def formula(spm_unit, period, parameters):
    gross_earned = spm_unit("tanf_gross_earned_income", period)
    gross_unearned = spm_unit("tanf_gross_unearned_income", period)
    earned_deductions = spm_unit("tanf_earned_income_deductions", period)
    unearned_deductions = spm_unit("tanf_unearned_income_deductions", period)
    # Apply each type of deduction to its respective income type
    countable_earned = max_(gross_earned - earned_deductions, 0)
    countable_unearned = max_(gross_unearned - unearned_deductions, 0)
    return countable_earned + countable_unearned
Quick Reference
Standard TANF pattern:
Countable Income = max_(Earned - Earned Deductions, 0) + Unearned
NOT:
❌ max_(Earned + Unearned - Deductions, 0)
❌ max_(Earned - Deductions + Unearned, 0) # Same mistake reordered: excess deductions still reduce unearned
Federal/State Separation
Federal Parameters
Location: /parameters/gov/{agency}/
- Base formulas and methodologies
- National standards
- Required elements
State Parameters
Location: /parameters/gov/states/{state}/
- State-specific thresholds
- Implementation choices
- Scale factors
# Federal: parameters/gov/hhs/fpg/base.yaml
first_person: 14_580
# State: parameters/gov/states/ca/scale_factor.yaml
fpg_multiplier: 2.0 # 200% of FPG
Code Reuse Patterns
Avoid Duplication - Create Intermediate Variables
❌ ANTI-PATTERN: Copy-pasting calculations
# File 1: calculates income after deduction
def formula(household, period, parameters):
    gross = add(household, period, ["income"])
    deduction = p.deduction * household.nb_persons()
    return max_(gross - deduction, 0)
# File 2: DUPLICATES same calculation
def formula(household, period, parameters):
    gross = add(household, period, ["income"])  # Copy-pasted
    deduction = p.deduction * household.nb_persons()  # Copy-pasted
    after_deduction = max_(gross - deduction, 0)  # Copy-pasted
    return after_deduction < p.threshold
✅ CORRECT: Reuse existing variables
# File 2: reuses calculation
def formula(household, period, parameters):
    countable_income = household("program_countable_income", period)
    return countable_income < p.threshold
When to create intermediate variables:
- Same calculation in 2+ places
- Logic exceeds 5 lines
- Reference implementations have a similar variable
TANF-Specific Patterns
Study Reference Implementations First
MANDATORY before implementing any TANF:
- DC TANF: /variables/gov/states/dc/dhs/tanf/
- IL TANF: /variables/gov/states/il/dhs/tanf/
- TX TANF: /variables/gov/states/tx/hhs/tanf/
Learn from them:
- Variable organization
- Naming conventions
- Code reuse patterns
- When to use adds vs formula
Standard TANF Structure
tanf/
├── eligibility/
│   ├── demographic_eligible.py
│   ├── income_eligible.py
│   └── eligible.py
├── income/
│   ├── earned/
│   ├── unearned/
│   └── countable_income.py
└── [state]_tanf.py
Simplified TANF Rules
For simplified implementations:
DON'T create state-specific versions of:
- Demographic eligibility (use federal)
- Immigration eligibility (use federal)
- Income sources (use federal baseline)
❌ DON'T CREATE:
ca_tanf_demographic_eligible_person.py
ca_tanf_gross_earned_income.py
parameters/.../income/sources/earned.yaml
✅ DO USE:
# Federal demographic eligibility
is_demographic_tanf_eligible
# Federal income aggregation
tanf_gross_earned_income
Avoiding Unnecessary Wrapper Variables (CRITICAL)
Golden Rule: Only create a state variable if you're adding state-specific logic to it!
Understand WHY Variables Exist, Not Just WHAT
When studying reference implementations:
- Note which variables they have
- READ THE CODE inside each variable
- Ask: "Does this variable have state-specific logic?"
- If it just returns federal baseline → DON'T copy it
Variable Creation Decision Tree
Before creating ANY state-specific variable, ask:
- Does federal baseline already calculate this?
- Does my state do it DIFFERENTLY than federal?
- Can I write the difference in 1+ lines of state-specific logic?
- Will this calculation be used in 2+ other variables? (Code reuse exception)
Decision:
- If YES/NO/NO/NO → DON'T create the variable, use federal directly
- If YES/YES/YES/NO → CREATE the variable with state logic
- If YES/NO/NO/YES → CREATE as intermediate variable for code reuse (see exception below)
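The decision tree can be written as a small helper for review checklists. This is a sketch in plain Python, not part of PolicyEngine. The three outcomes mirror the combinations listed above; the final fallback, for when no federal baseline exists, is an assumption added to make the function total.

```python
def should_create_state_variable(
    federal_exists: bool,
    state_differs: bool,
    has_state_logic: bool,
    reused_elsewhere: bool,
) -> str:
    """Map the four decision-tree answers to a recommendation."""
    if state_differs and has_state_logic:
        return "create: state-specific logic"
    if reused_elsewhere:
        return "create: intermediate variable for reuse"
    if federal_exists:
        return "skip: use the federal variable directly"
    # Assumption: with no federal baseline, the state variable is needed.
    return "create: no federal baseline exists"


print(should_create_state_variable(True, False, False, False))
print(should_create_state_variable(True, True, True, False))
print(should_create_state_variable(True, False, False, True))
```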
EXCEPTION: Code Reuse Justifies Intermediate Variables
Even without state-specific logic, create a variable if the SAME calculation is used in multiple places.
❌ Bad - Duplicating calculation across variables:
# Variable 1 - Income eligibility
class mo_tanf_income_eligible(Variable):
    def formula(spm_unit, period, parameters):
        # Duplicated calculation
        gross = add(spm_unit, period, ["tanf_gross_earned_income", "tanf_gross_unearned_income"])
        return gross <= p.income_limit
# Variable 2 - Countable income
class mo_tanf_countable_income(Variable):
    def formula(spm_unit, period, parameters):
        # SAME calculation repeated!
        gross = add(spm_unit, period, ["tanf_gross_earned_income", "tanf_gross_unearned_income"])
        deductions = spm_unit("mo_tanf_deductions", period)
        return max_(gross - deductions, 0)
# Variable 3 - Need standard
class mo_tanf_need_standard(Variable):
    def formula(spm_unit, period, parameters):
        # SAME calculation AGAIN!
        gross = add(spm_unit, period, ["tanf_gross_earned_income", "tanf_gross_unearned_income"])
        return where(gross < p.threshold, p.high, p.low)
✅ Good - Extract into reusable intermediate variable:
# Intermediate variable - used in multiple places
class mo_tanf_gross_income(Variable):
    adds = ["tanf_gross_earned_income", "tanf_gross_unearned_income"]
# Variable 1 - Reuses intermediate
class mo_tanf_income_eligible(Variable):
    def formula(spm_unit, period, parameters):
        gross = spm_unit("mo_tanf_gross_income", period)  # Reuse
        return gross <= p.income_limit
# Variable 2 - Reuses intermediate
class mo_tanf_countable_income(Variable):
    def formula(spm_unit, period, parameters):
        gross = spm_unit("mo_tanf_gross_income", period)  # Reuse
        deductions = spm_unit("mo_tanf_deductions", period)
        return max_(gross - deductions, 0)
# Variable 3 - Reuses intermediate
class mo_tanf_need_standard(Variable):
    def formula(spm_unit, period, parameters):
        gross = spm_unit("mo_tanf_gross_income", period)  # Reuse
        return where(gross < p.threshold, p.high, p.low)
When to create intermediate variables for reuse:
- ✅ Same calculation appears in 2+ variables
- ✅ Represents a meaningful concept (e.g., "gross income", "net resources")
- ✅ Simplifies maintenance (change once vs many places)
- ✅ Follows DRY (Don't Repeat Yourself) principle
When NOT to create (still a wrapper):
- ❌ Only used in ONE place
- ❌ Just passes through another variable unchanged
- ❌ Adds indirection without code reuse benefit
Red Flags for Unnecessary Wrapper Variables
❌ INVALID - Pure wrapper, no state logic:
class in_tanf_assistance_unit_size(Variable):
    def formula(spm_unit, period):
        return spm_unit("spm_unit_size", period)  # Just returns federal
❌ INVALID - Aggregation without transformation:
class in_tanf_countable_unearned_income(Variable):
    def formula(tax_unit, period):
        # Sums member values with no transformation
        return tax_unit.sum(tax_unit.members("tanf_gross_unearned_income", period))
❌ INVALID - Pass-through with no modification:
class in_tanf_gross_income(Variable):
    def formula(entity, period):
        return entity("tanf_gross_income", period)
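Pure pass-through formulas like the first and third red flags can be spotted mechanically. The sketch below uses only Python's standard ast module and is not an existing PolicyEngine tool; is_pure_wrapper is a hypothetical helper. It flags a formula whose entire body is a single return that calls the entity argument directly.

```python
import ast


def is_pure_wrapper(source: str) -> bool:
    """True if every formula in source is just `return entity("name", period)`."""
    tree = ast.parse(source)
    formulas = [
        node for node in ast.walk(tree)
        if isinstance(node, ast.FunctionDef) and node.name == "formula"
    ]
    if not formulas:
        return False
    for fn in formulas:
        # A wrapper has exactly one statement, and it is a return.
        if len(fn.body) != 1 or not isinstance(fn.body[0], ast.Return):
            return False
        call = fn.body[0].value
        # The returned value must be a direct call on the first argument.
        if not isinstance(call, ast.Call) or not isinstance(call.func, ast.Name):
            return False
        if not fn.args.args or call.func.id != fn.args.args[0].arg:
            return False
    return True


WRAPPER = '''
class in_tanf_gross_income(Variable):
    def formula(entity, period):
        return entity("tanf_gross_income", period)
'''

WITH_LOGIC = '''
class in_tanf_countable_earned_income(Variable):
    def formula(spm_unit, period, parameters):
        earned = spm_unit("tanf_gross_earned_income", period)
        return earned * (1 - rate)
'''

print(is_pure_wrapper(WRAPPER))     # True
print(is_pure_wrapper(WITH_LOGIC))  # False
```

The check only parses the source, so names such as Variable need not be importable. It does not catch the aggregation-without-transformation case, and a flagged variable may still be justified by the code reuse exception, so treat a hit as a prompt for review rather than a verdict.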
Examples of VALID State Variables
✅ VALID - Has state-specific disregard:
class in_tanf_countable_earned_income(Variable):
    def formula(spm_unit, period, parameters):
        # "in" is a Python keyword, so index the node instead of using attribute access
        p = parameters(period).gov.states["in"].tanf.income
        earned = spm_unit("tanf_gross_earned_income", period)
        return earned * (1 - p.earned_income_disregard_rate)  # STATE LOGIC
✅ VALID - Uses state-specific limits:
class in_tanf_income_eligible(Variable):
    def formula(spm_unit, period, parameters):
        # "in" is a Python keyword, so index the node instead of using attribute access
        p = parameters(period).gov.states["in"].tanf
        income = spm_unit("tanf_countable_income", period)
        size = spm_unit("spm_unit_size", period.this_year)
        limit = p.income_limit[min_(size, p.max_household_size)]  # STATE PARAMS
        return income <= limit
✅ VALID - IL has different counting rules:
class il_tanf_assistance_unit_size(Variable):
    adds = [
        "il_tanf_payment_eligible_child",  # STATE-SPECIFIC
        "il_tanf_payment_eligible_parent",  # STATE-SPECIFIC
    ]
State Variables to AVOID Creating
For TANF implementations:
❌ DON'T create these (use federal directly):
- state_tanf_assistance_unit_size (unless different counting rules like IL)
- state_tanf_countable_unearned_income (unless state has disregards)
- state_tanf_gross_income (just use federal baseline)
- Any variable that's just return entity("federal_variable", period)
✅ DO create these (when state has unique rules):
- state_tanf_countable_earned_income (if unique disregard %)
- state_tanf_income_eligible (state income limits)
- state_tanf_maximum_benefit (state payment standards)
- state_tanf (final benefit calculation)
Demographic Eligibility Pattern
Option 1: Use Federal (Simplified)
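class ca_tanf_eligible(Variable):
    def formula(spm_unit, period, parameters):
        # Use federal variable
        has_eligible = spm_unit.any(
            spm_unit.members("is_demographic_tanf_eligible", period)
        )
        # ca_tanf_income_eligible is a placeholder name for the state's income test
        income_eligible = spm_unit("ca_tanf_income_eligible", period)
        return has_eligible & income_eligible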
Option 2: State-Specific (Different thresholds)
class ca_tanf_demographic_eligible_person(Variable):
    def formula(person, period, parameters):
        p = parameters(period).gov.states.ca.tanf
        age = person("age", period.this_year)  # NOT monthly_age
        age_limit = where(
            person("is_full_time_student", period),
            p.age_threshold.student,
            p.age_threshold.minor_child
        )
        return age < age_limit
Common Implementation Patterns
Income Eligibility
class program_income_eligible(Variable):
    value_type = bool
    entity = SPMUnit
    definition_period = MONTH
    def formula(spm_unit, period, parameters):
        p = parameters(period).gov.states.xx.program
        income = spm_unit("program_countable_income", period)
        size = spm_unit("spm_unit_size", period.this_year)
        # Get threshold from parameters
        threshold = p.income_limit[min_(size, p.max_household_size)]
        return income <= threshold
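The min_(size, p.max_household_size) lookup caps the household size so that units larger than the biggest tabulated size reuse the last row. A plain-Python sketch of that capping, with an invented limits table (the dollar amounts are placeholders, not real program values):

```python
def income_limit_for_size(size: int, limits_by_size: dict, max_household_size: int) -> float:
    """Look up the income limit, capping size at the largest tabulated household."""
    capped_size = min(size, max_household_size)
    return limits_by_size[capped_size]


# Hypothetical schedule: limits are only defined for sizes 1-3.
limits = {1: 500, 2: 650, 3: 800}

print(income_limit_for_size(2, limits, max_household_size=3))  # 650
print(income_limit_for_size(5, limits, max_household_size=3))  # 800 (capped at size 3)
```

Without the cap, a five-person unit would raise a lookup error because the table has no entry for it.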
Benefit Calculation
class program_benefit(Variable):
    value_type = float
    entity = SPMUnit
    definition_period = MONTH
    unit = USD
    def formula(spm_unit, period, parameters):
        p = parameters(period).gov.states.xx.program
        eligible = spm_unit("program_eligible", period)
        # Calculate benefit amount
        base = p.benefit_schedule.base_amount
        adjustment = p.benefit_schedule.adjustment_rate
        size = spm_unit("spm_unit_size", period.this_year)
        amount = base + (size - 1) * adjustment
        return where(eligible, amount, 0)
Using Scale Parameters
def formula(entity, period, parameters):
    p = parameters(period).gov.states.az.program
    federal_p = parameters(period).gov.hhs.fpg
    # Federal base with state scale
    size = entity("household_size", period.this_year)
    fpg = federal_p.first_person + federal_p.additional * (size - 1)
    state_scale = p.income_limit_scale  # Often exists
    income_limit = fpg * state_scale
    return income_limit
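As a worked instance of the federal-base-times-state-scale arithmetic, the sketch below uses the first_person value (14,580) and the 2.0 multiplier from the federal/state separation example. The per-additional-person amount of 5,140 is an assumed illustrative figure; in a real formula every one of these comes from parameters.

```python
def poverty_guideline(size: int, first_person: float, additional: float) -> float:
    """Federal guideline: base amount plus a fixed increment per extra person."""
    return first_person + additional * (size - 1)


def state_income_limit(size: int, first_person: float, additional: float, scale: float) -> float:
    """State limit expressed as a multiple of the federal guideline."""
    return poverty_guideline(size, first_person, additional) * scale


# 14_580 and 2.0 appear earlier in this document; 5_140 is an assumption.
print(state_income_limit(3, first_person=14_580, additional=5_140, scale=2.0))  # 49720.0
```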
Variable Creation Checklist
Before creating any variable:
- Check if it already exists
- Use standard demographic variables (age, is_disabled)
- Reuse federal calculations where applicable
- Check for household_income before creating a new income variable
- Look for existing intermediate variables
- Study reference implementations
Quality Standards
Complete Implementation Requirements
- All values from parameters (no hard-coding)
- Complete formula logic
- Proper entity aggregation
- Correct period handling
- Meaningful variable names
- Proper metadata
Anti-Patterns to Avoid
- Copy-pasting logic between files
- Hard-coding any numeric values
- Creating duplicate income variables
- State-specific versions of federal rules
- Placeholder TODOs in production code
Parameter-to-Variable Mapping Requirements
Every Parameter Must Have a Variable
CRITICAL: Complete implementation means every parameter is used!
When you create parameters, you MUST create corresponding variables:
| Parameter Type | Required Variable(s) |
|---|---|
| resources/limit | state_program_resource_eligible |
| income/limit | state_program_income_eligible |
| payment_standard | state_program_maximum_benefit |
| income/disregard | state_program_countable_earned_income |
| categorical/requirements | state_program_categorically_eligible |
Complete Eligibility Formula
The main eligibility variable MUST combine ALL checks:
class state_program_eligible(Variable):
    def formula(spm_unit, period, parameters):
        income_eligible = spm_unit("state_program_income_eligible", period)
        resource_eligible = spm_unit("state_program_resource_eligible", period)  # DON'T FORGET!
        categorical = spm_unit("state_program_categorically_eligible", period)
        return income_eligible & resource_eligible & categorical
Common Implementation Failures:
- ❌ Created resource limit parameter but no resource_eligible variable
- ❌ Main eligible variable only checks income, ignores resources
- ❌ Parameters created but never referenced in any formula
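The last failure, parameters that no formula references, can be caught with a simple text scan. This is a rough sketch using only the standard library, not an existing PolicyEngine check; find_orphaned_parameters and the sample names are hypothetical, and a real check would read parameter names from the YAML tree and formula text from the variable files.

```python
import re


def find_orphaned_parameters(parameter_names, formula_sources):
    """Return parameter names that never appear as whole words in any formula."""
    combined = "\n".join(formula_sources)
    return sorted(
        name
        for name in parameter_names
        if not re.search(rf"\b{re.escape(name)}\b", combined)
    )


parameter_names = ["income_limit", "resource_limit", "payment_standard"]
formula_sources = [
    "return income <= p.income_limit",
    "return where(eligible, p.payment_standard, 0)",
]

print(find_orphaned_parameters(parameter_names, formula_sources))  # ['resource_limit']
```

A name-based scan can miss parameters accessed through dynamic indexing, so an empty result is a useful signal rather than proof of full coverage.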
For Agents
When implementing variables:
- Study reference implementations (DC, IL, TX TANF)
- Never hard-code values - use parameters
- Map every parameter to a variable - no orphaned parameters
- Complete ALL eligibility checks - income AND resources AND categorical
- Reuse existing variables - avoid duplication
- Use adds when possible - cleaner than formula
- Create intermediate variables for complex logic
- Follow metadata standards exactly
- Complete implementation or delete the file