gh-policyengine-policyengin…/skills/policyengine-implementation-patterns-skill/SKILL.md

---
name: policyengine-implementation-patterns
description: PolicyEngine implementation patterns - variable creation, no hard-coding principle, federal/state separation, metadata standards
---

# PolicyEngine Implementation Patterns

Essential patterns for implementing government benefit program rules in PolicyEngine.

## PolicyEngine Architecture Constraints

### What CANNOT Be Simulated (Single-Period Limitation)

**CRITICAL: PolicyEngine uses single-period simulation architecture**

The following CANNOT be implemented and should be SKIPPED when found in documentation:

#### 1. Time Limits and Lifetime Counters
**Cannot simulate:**
- ANY lifetime benefit limits (X months total)
- ANY time windows (X months within Y period)
- Benefit clocks and countable months
- Cumulative time tracking

**Why:** Requires tracking benefit history across multiple periods. PolicyEngine simulates one period at a time with no state persistence.

**What to do:** Document in comments but DON'T parameterize or implement:
```python
# NOTE: [State] has [X]-month lifetime limit on [Program] benefits
# This cannot be simulated in PolicyEngine's single-period architecture
```

#### 2. Work History Requirements
**Cannot simulate:**
- "Must have worked 6 of last 12 months"
- "Averaged 30 hours/week over past quarter"
- Prior employment verification
- Work participation rate tracking

**Why:** Requires historical data from previous periods.

#### 3. Waiting Periods and Benefit Delays
**Cannot simulate:**
- "3-month waiting period for new residents"
- "Benefits start month after application"
- Retroactive eligibility
- Benefit recertification cycles

**Why:** Requires tracking application dates and eligibility history.

#### 4. Progressive Sanctions and Penalties
**Cannot simulate:**
- "First violation: 1-month sanction, Second: 3-month, Third: permanent"
- Graduated penalties
- Strike systems

**Why:** Requires tracking violation history.

#### 5. Asset Spend-Down Over Time
**Cannot simulate:**
- Medical spend-down across months
- Resource depletion tracking
- Accumulated medical expenses

**Why:** Requires tracking expenses and resources across periods.

### What CAN Be Simulated (With Caveats)

PolicyEngine CAN simulate point-in-time eligibility and benefits:
- ✅ Current month income limits
- ✅ Current month resource limits
- ✅ Current benefit calculations
- ✅ Current household composition
- ✅ Current deductions and disregards

### Time-Limited Benefits That Affect Current Calculations

**Special Case: Time-limited deductions/disregards**

When a deduction or disregard is only available for X months:
- **DO implement the deduction** (assume it applies)
- **DO add a comment** explaining the time limitation
- **DON'T try to track or enforce the time limit**

Example:
```python
class state_tanf_countable_earned_income(Variable):
    def formula(spm_unit, period, parameters):
        p = parameters(period).gov.states.xx.tanf.income
        earned = spm_unit("tanf_gross_earned_income", period)

        # NOTE: In reality, this 75% disregard only applies for first 4 months
        # of employment. PolicyEngine cannot track employment duration, so we
        # apply the disregard assuming the household qualifies.
        # Actual rule: [State Code Citation]
        disregard_rate = p.earned_income_disregard_rate  # 0.75

        return earned * (1 - disregard_rate)
```

**Rule: If it requires history or future tracking, it CANNOT be fully simulated - but implement what we can and document limitations**

---

## Critical Principles

### 1. ZERO Hard-Coded Values
**Every numeric value MUST be parameterized**

```python
❌ FORBIDDEN:
return where(eligible, 1000, 0)     # Hard-coded 1000
age < 15                             # Hard-coded 15
benefit = income * 0.33              # Hard-coded 0.33
month >= 10 and month <= 3           # Hard-coded months

✅ REQUIRED:
return where(eligible, p.maximum_benefit, 0)
age < p.age_threshold.minor_child
benefit = income * p.benefit_rate
month >= p.season.start_month
```

**Acceptable literals:**
- `0`, `1`, `-1` for basic math
- `12` for month conversion (`/ 12`, `* 12`)
- Array indices when structure is known

### 2. No Placeholder Implementations
**Delete the file rather than leave placeholders**

```python
❌ NEVER:
def formula(entity, period, parameters):
    # TODO: Implement
    return 75  # Placeholder

✅ ALWAYS:
# Complete implementation or no file at all
```

---

## Variable Implementation Standards

### Variable Metadata Format

Follow established patterns:
```python
class il_tanf_countable_earned_income(Variable):
    value_type = float
    entity = SPMUnit
    definition_period = MONTH
    label = "Illinois TANF countable earned income"
    unit = USD
    reference = "https://www.law.cornell.edu/regulations/illinois/..."
    defined_for = StateCode.IL

    # Use adds for simple sums
    adds = ["il_tanf_earned_income_after_disregard"]
```

**Key rules:**
- ✅ Use full URL in `reference` (clickable)
- ❌ Don't use `documentation` field
- ❌ Don't use statute citations without URLs

### When to Use `adds` vs `formula`

**Use `adds` when:**
- Just summing variables
- Passing through a single variable
- No transformations needed

```python
✅ BEST - Simple sum:
class tanf_gross_income(Variable):
    adds = ["employment_income", "self_employment_income"]
```

**Use `formula` when:**
- Applying transformations
- Conditional logic
- Calculations needed

```python
✅ CORRECT - Need logic:
def formula(entity, period, parameters):
    income = add(entity, period, ["income1", "income2"])
    return max_(0, income)  # Need max_
```

---

## TANF Countable Income Pattern

### Critical: Verify Calculation Order from Legal Code

**MOST IMPORTANT:** Always check the state's legal code or policy manual for the exact calculation order. The pattern below is typical but not universal.

**The Typical Pattern:**
1. Apply deductions/disregards to **earned income only**
2. Use `max_()` to prevent negative earned income
3. Add unearned income (which typically has no deductions)

**This pattern is based on how MOST TANF programs work, but you MUST verify with the specific state's legal code.**

### ❌ WRONG - Applying deductions to total income

```python
def formula(spm_unit, period, parameters):
    gross_earned = spm_unit("tanf_gross_earned_income", period)
    unearned = spm_unit("tanf_gross_unearned_income", period)
    deductions = spm_unit("tanf_earned_income_deductions", period)

    # ❌ WRONG: Deductions applied to total income
    total_income = gross_earned + unearned
    countable = total_income - deductions

    return max_(countable, 0)
```

**Why this is wrong:**
- Deductions should ONLY reduce earned income
- Unearned income (SSI, child support, etc.) is not subject to work expense deductions
- This incorrectly reduces unearned income when earned income is low

**Example error:**
- Earned: $100, Unearned: $500, Deductions: $200
- Wrong result: `max_($100 + $500 - $200, 0) = $400` (reduces unearned!)
- Correct result: `max_($100 - $200, 0) + $500 = $500`

### ✅ CORRECT - Apply deductions to earned only, then add unearned

```python
def formula(spm_unit, period, parameters):
    gross_earned = spm_unit("tanf_gross_earned_income", period)
    unearned = spm_unit("tanf_gross_unearned_income", period)
    deductions = spm_unit("tanf_earned_income_deductions", period)

    # ✅ CORRECT: Deductions applied to earned only, then add unearned
    return max_(gross_earned - deductions, 0) + unearned
```

### Pattern Variations

**With multiple deduction steps:**
```python
def formula(spm_unit, period, parameters):
    p = parameters(period).gov.states.xx.tanf.income
    gross_earned = spm_unit("tanf_gross_earned_income", period)
    unearned = spm_unit("tanf_gross_unearned_income", period)

    # Step 1: Apply work expense deduction
    work_expense = min_(gross_earned * p.work_expense_rate, p.work_expense_max)
    after_work_expense = max_(gross_earned - work_expense, 0)

    # Step 2: Apply earnings disregard
    earnings_disregard = after_work_expense * p.disregard_rate
    countable_earned = max_(after_work_expense - earnings_disregard, 0)

    # Step 3: Add unearned (no deductions applied)
    return countable_earned + unearned
```

**With disregard percentage (simplified):**
```python
def formula(spm_unit, period, parameters):
    p = parameters(period).gov.states.xx.tanf.income
    gross_earned = spm_unit("tanf_gross_earned_income", period)
    unearned = spm_unit("tanf_gross_unearned_income", period)

    # Apply disregard to earned (keep 33% = disregard 67%)
    countable_earned = gross_earned * (1 - p.earned_disregard_rate)

    return max_(countable_earned, 0) + unearned
```

### When Unearned Income HAS Deductions

Some states DO have unearned income deductions (rare). Handle separately:

```python
def formula(spm_unit, period, parameters):
    gross_earned = spm_unit("tanf_gross_earned_income", period)
    gross_unearned = spm_unit("tanf_gross_unearned_income", period)
    earned_deductions = spm_unit("tanf_earned_income_deductions", period)
    unearned_deductions = spm_unit("tanf_unearned_income_deductions", period)

    # Apply each type of deduction to its respective income type
    countable_earned = max_(gross_earned - earned_deductions, 0)
    countable_unearned = max_(gross_unearned - unearned_deductions, 0)

    return countable_earned + countable_unearned
```

### Quick Reference

**Standard TANF pattern:**
```
Countable Income = max_(Earned - Earned Deductions, 0) + Unearned
```

**NOT:**
```
❌ max_(Earned + Unearned - Deductions, 0)
❌ max_(Earned - Deductions + Unearned, 0)  # Can go negative
```

---

## Federal/State Separation

### Federal Parameters
Location: `/parameters/gov/{agency}/`
- Base formulas and methodologies
- National standards
- Required elements

### State Parameters
Location: `/parameters/gov/states/{state}/`
- State-specific thresholds
- Implementation choices
- Scale factors

```yaml
# Federal: parameters/gov/hhs/fpg/base.yaml
first_person: 14_580

# State: parameters/gov/states/ca/scale_factor.yaml
fpg_multiplier: 2.0  # 200% of FPG
```

---

## Code Reuse Patterns

### Avoid Duplication - Create Intermediate Variables

**❌ ANTI-PATTERN: Copy-pasting calculations**
```python
# File 1: calculates income after deduction
def formula(household, period, parameters):
    gross = add(household, period, ["income"])
    deduction = p.deduction * household.nb_persons()
    return max_(gross - deduction, 0)

# File 2: DUPLICATES same calculation
def formula(household, period, parameters):
    gross = add(household, period, ["income"])  # Copy-pasted
    deduction = p.deduction * household.nb_persons()  # Copy-pasted
    after_deduction = max_(gross - deduction, 0)  # Copy-pasted
    return after_deduction < p.threshold
```

**✅ CORRECT: Reuse existing variables**
```python
# File 2: reuses calculation
def formula(household, period, parameters):
    countable_income = household("program_countable_income", period)
    return countable_income < p.threshold
```

**When to create intermediate variables:**
- Same calculation in 2+ places
- Logic exceeds 5 lines
- Reference implementations have similar variable

---

## TANF-Specific Patterns

### Study Reference Implementations First

**MANDATORY before implementing any TANF:**
- DC TANF: `/variables/gov/states/dc/dhs/tanf/`
- IL TANF: `/variables/gov/states/il/dhs/tanf/`
- TX TANF: `/variables/gov/states/tx/hhs/tanf/`

**Learn from them:**
1. Variable organization
2. Naming conventions
3. Code reuse patterns
4. When to use `adds` vs `formula`

### Standard TANF Structure
```
tanf/
├── eligibility/
│   ├── demographic_eligible.py
│   ├── income_eligible.py
│   └── eligible.py
├── income/
│   ├── earned/
│   ├── unearned/
│   └── countable_income.py
└── [state]_tanf.py
```

### Simplified TANF Rules

For simplified implementations:

**DON'T create state-specific versions of:**
- Demographic eligibility (use federal)
- Immigration eligibility (use federal)
- Income sources (use federal baseline)

```python
❌ DON'T CREATE:
ca_tanf_demographic_eligible_person.py
ca_tanf_gross_earned_income.py
parameters/.../income/sources/earned.yaml

✅ DO USE:
# Federal demographic eligibility
is_demographic_tanf_eligible
# Federal income aggregation
tanf_gross_earned_income
```

### Avoiding Unnecessary Wrapper Variables (CRITICAL)

**Golden Rule: Only create a state variable if you're adding state-specific logic to it!**

#### Understand WHY Variables Exist, Not Just WHAT

When studying reference implementations:
1. **Note which variables they have**
2. **READ THE CODE inside each variable**
3. **Ask: "Does this variable have state-specific logic?"**
4. **If it just returns federal baseline → DON'T copy it**

#### Variable Creation Decision Tree

Before creating ANY state-specific variable, ask:
1. Does federal baseline already calculate this?
2. Does my state do it DIFFERENTLY than federal?
3. Can I write the difference in 1+ lines of state-specific logic?
4. **Will this calculation be used in 2+ other variables?** (Code reuse exception)

**Decision:**
- If YES/NO/NO/NO → **DON'T create the variable**, use federal directly
- If YES/YES/YES/NO → **CREATE the variable** with state logic
- If YES/NO/NO/YES → **CREATE as intermediate variable** for code reuse (see exception below)

#### EXCEPTION: Code Reuse Justifies Intermediate Variables

**Even without state-specific logic, create a variable if the SAME calculation is used in multiple places.**

❌ **Bad - Duplicating calculation across variables:**
```python
# Variable 1 - Income eligibility
class mo_tanf_income_eligible(Variable):
    def formula(spm_unit, period, parameters):
        # Duplicated calculation
        gross = add(spm_unit, period, ["tanf_gross_earned_income", "tanf_gross_unearned_income"])
        return gross <= p.income_limit

# Variable 2 - Countable income
class mo_tanf_countable_income(Variable):
    def formula(spm_unit, period, parameters):
        # SAME calculation repeated!
        gross = add(spm_unit, period, ["tanf_gross_earned_income", "tanf_gross_unearned_income"])
        deductions = spm_unit("mo_tanf_deductions", period)
        return max_(gross - deductions, 0)

# Variable 3 - Need standard
class mo_tanf_need_standard(Variable):
    def formula(spm_unit, period, parameters):
        # SAME calculation AGAIN!
        gross = add(spm_unit, period, ["tanf_gross_earned_income", "tanf_gross_unearned_income"])
        return where(gross < p.threshold, p.high, p.low)
```

✅ **Good - Extract into reusable intermediate variable:**
```python
# Intermediate variable - used in multiple places
class mo_tanf_gross_income(Variable):
    adds = ["tanf_gross_earned_income", "tanf_gross_unearned_income"]

# Variable 1 - Reuses intermediate
class mo_tanf_income_eligible(Variable):
    def formula(spm_unit, period, parameters):
        gross = spm_unit("mo_tanf_gross_income", period)  # Reuse
        return gross <= p.income_limit

# Variable 2 - Reuses intermediate
class mo_tanf_countable_income(Variable):
    def formula(spm_unit, period, parameters):
        gross = spm_unit("mo_tanf_gross_income", period)  # Reuse
        deductions = spm_unit("mo_tanf_deductions", period)
        return max_(gross - deductions, 0)

# Variable 3 - Reuses intermediate
class mo_tanf_need_standard(Variable):
    def formula(spm_unit, period, parameters):
        gross = spm_unit("mo_tanf_gross_income", period)  # Reuse
        return where(gross < p.threshold, p.high, p.low)
```

**When to create intermediate variables for reuse:**
- ✅ Same calculation appears in 2+ variables
- ✅ Represents a meaningful concept (e.g., "gross income", "net resources")
- ✅ Simplifies maintenance (change once vs many places)
- ✅ Follows DRY (Don't Repeat Yourself) principle

**When NOT to create (still a wrapper):**
- ❌ Only used in ONE place
- ❌ Just passes through another variable unchanged
- ❌ Adds indirection without code reuse benefit

#### Red Flags for Unnecessary Wrapper Variables

```python
❌ INVALID - Pure wrapper, no state logic:
class in_tanf_assistance_unit_size(Variable):
    def formula(spm_unit, period):
        return spm_unit("spm_unit_size", period)  # Just returns federal

❌ INVALID - Aggregation without transformation:
class in_tanf_countable_unearned_income(Variable):
    def formula(tax_unit, period):
        return tax_unit.sum(person("tanf_gross_unearned_income", period))

❌ INVALID - Pass-through with no modification:
class in_tanf_gross_income(Variable):
    def formula(entity, period):
        return entity("tanf_gross_income", period)
```

#### Examples of VALID State Variables

```python
✅ VALID - Has state-specific disregard:
class in_tanf_countable_earned_income(Variable):
    def formula(spm_unit, period, parameters):
        p = parameters(period).gov.states.in.tanf.income
        earned = spm_unit("tanf_gross_earned_income", period)
        return earned * (1 - p.earned_income_disregard_rate)  # STATE LOGIC

✅ VALID - Uses state-specific limits:
class in_tanf_income_eligible(Variable):
    def formula(spm_unit, period, parameters):
        p = parameters(period).gov.states.in.tanf
        income = spm_unit("tanf_countable_income", period)
        size = spm_unit("spm_unit_size", period.this_year)
        limit = p.income_limit[min_(size, p.max_household_size)]  # STATE PARAMS
        return income <= limit

✅ VALID - IL has different counting rules:
class il_tanf_assistance_unit_size(Variable):
    adds = [
        "il_tanf_payment_eligible_child",  # STATE-SPECIFIC
        "il_tanf_payment_eligible_parent",  # STATE-SPECIFIC
    ]
```

#### State Variables to AVOID Creating

For TANF implementations:

**❌ DON'T create these (use federal directly):**
- `state_tanf_assistance_unit_size` (unless different counting rules like IL)
- `state_tanf_countable_unearned_income` (unless state has disregards)
- `state_tanf_gross_income` (just use federal baseline)
- Any variable that's just `return entity("federal_variable", period)`

**✅ DO create these (when state has unique rules):**
- `state_tanf_countable_earned_income` (if unique disregard %)
- `state_tanf_income_eligible` (state income limits)
- `state_tanf_maximum_benefit` (state payment standards)
- `state_tanf` (final benefit calculation)

### Demographic Eligibility Pattern

**Option 1: Use Federal (Simplified)**
```python
class ca_tanf_eligible(Variable):
    def formula(spm_unit, period, parameters):
        # Use federal variable
        has_eligible = spm_unit.any(
            spm_unit.members("is_demographic_tanf_eligible", period)
        )
        return has_eligible & income_eligible
```

**Option 2: State-Specific (Different thresholds)**
```python
class ca_tanf_demographic_eligible_person(Variable):
    def formula(person, period, parameters):
        p = parameters(period).gov.states.ca.tanf
        age = person("age", period.this_year)  # NOT monthly_age

        age_limit = where(
            person("is_full_time_student", period),
            p.age_threshold.student,
            p.age_threshold.minor_child
        )
        return age < age_limit
```

---

## Common Implementation Patterns

### Income Eligibility
```python
class program_income_eligible(Variable):
    value_type = bool
    entity = SPMUnit
    definition_period = MONTH

    def formula(spm_unit, period, parameters):
        p = parameters(period).gov.states.xx.program
        income = spm_unit("program_countable_income", period)
        size = spm_unit("spm_unit_size", period.this_year)

        # Get threshold from parameters
        threshold = p.income_limit[min_(size, p.max_household_size)]
        return income <= threshold
```

### Benefit Calculation
```python
class program_benefit(Variable):
    value_type = float
    entity = SPMUnit
    definition_period = MONTH
    unit = USD

    def formula(spm_unit, period, parameters):
        p = parameters(period).gov.states.xx.program
        eligible = spm_unit("program_eligible", period)

        # Calculate benefit amount
        base = p.benefit_schedule.base_amount
        adjustment = p.benefit_schedule.adjustment_rate
        size = spm_unit("spm_unit_size", period.this_year)

        amount = base + (size - 1) * adjustment
        return where(eligible, amount, 0)
```

### Using Scale Parameters
```python
def formula(entity, period, parameters):
    p = parameters(period).gov.states.az.program
    federal_p = parameters(period).gov.hhs.fpg

    # Federal base with state scale
    size = entity("household_size", period.this_year)
    fpg = federal_p.first_person + federal_p.additional * (size - 1)
    state_scale = p.income_limit_scale  # Often exists
    income_limit = fpg * state_scale
```

---

## Variable Creation Checklist

Before creating any variable:
- [ ] Check if it already exists
- [ ] Use standard demographic variables (age, is_disabled)
- [ ] Reuse federal calculations where applicable
- [ ] Check for household_income before creating new
- [ ] Look for existing intermediate variables
- [ ] Study reference implementations

---

## Quality Standards

### Complete Implementation Requirements
- All values from parameters (no hard-coding)
- Complete formula logic
- Proper entity aggregation
- Correct period handling
- Meaningful variable names
- Proper metadata

### Anti-Patterns to Avoid
- Copy-pasting logic between files
- Hard-coding any numeric values
- Creating duplicate income variables
- State-specific versions of federal rules
- Placeholder TODOs in production code

---

## Parameter-to-Variable Mapping Requirements

### Every Parameter Must Have a Variable

**CRITICAL: Complete implementation means every parameter is used!**

When you create parameters, you MUST create corresponding variables:

| Parameter Type | Required Variable(s) |
|---------------|---------------------|
| resources/limit | `state_program_resource_eligible` |
| income/limit | `state_program_income_eligible` |
| payment_standard | `state_program_maximum_benefit` |
| income/disregard | `state_program_countable_earned_income` |
| categorical/requirements | `state_program_categorically_eligible` |

### Complete Eligibility Formula

The main eligibility variable MUST combine ALL checks:

```python
class state_program_eligible(Variable):
    def formula(spm_unit, period, parameters):
        income_eligible = spm_unit("state_program_income_eligible", period)
        resource_eligible = spm_unit("state_program_resource_eligible", period)  # DON'T FORGET!
        categorical = spm_unit("state_program_categorically_eligible", period)

        return income_eligible & resource_eligible & categorical
```

**Common Implementation Failures:**
- ❌ Created resource limit parameter but no resource_eligible variable
- ❌ Main eligible variable only checks income, ignores resources
- ❌ Parameters created but never referenced in any formula

---

## For Agents

When implementing variables:
1. **Study reference implementations** (DC, IL, TX TANF)
2. **Never hard-code values** - use parameters
3. **Map every parameter to a variable** - no orphaned parameters
4. **Complete ALL eligibility checks** - income AND resources AND categorical
5. **Reuse existing variables** - avoid duplication
6. **Use `adds` when possible** - cleaner than formula
7. **Create intermediate variables** for complex logic
8. **Follow metadata standards** exactly
9. **Complete implementation** or delete the file