Initial commit
@@ -0,0 +1,329 @@
|
||||
---
|
||||
name: policyengine-aggregation
|
||||
description: PolicyEngine aggregation patterns - using adds attribute and add() function for summing variables across entities
|
||||
---
|
||||
|
||||
# PolicyEngine Aggregation Patterns
|
||||
|
||||
Essential patterns for summing variables across entities in PolicyEngine.
|
||||
|
||||
## Quick Decision Guide
|
||||
|
||||
```
Is the variable ONLY a sum of other variables?
│
├─ YES → Use `adds` attribute (NO formula needed!)
│        adds = ["var1", "var2"]
│
└─ NO → Use `add()` function in formula
         (when you need max_, where, conditions, etc.)
```
|
||||
|
||||
## Quick Reference
|
||||
|
||||
| Need | Use | Example |
|------|-----|---------|
| Simple sum | `adds` | `adds = ["var1", "var2"]` |
| Sum from parameters | `adds` | `adds = "gov.path.to.list"` |
| Sum + max_() | `add()` | `max_(0, add(...))` |
| Sum + where() | `add()` | `where(cond, add(...), 0)` |
| Sum + conditions | `add()` | `if cond: add(...)` |
| Count booleans | `adds` | `adds = ["is_eligible"]` |
|
||||
|
||||
---
|
||||
|
||||
## 1. `adds` Class Attribute (Preferred When Possible)
|
||||
|
||||
### When to Use
|
||||
Use `adds` when a variable is **ONLY** the sum of other variables with **NO additional logic**.
|
||||
|
||||
### Syntax
|
||||
```python
|
||||
class variable_name(Variable):
|
||||
value_type = float
|
||||
entity = Entity
|
||||
definition_period = PERIOD
|
||||
|
||||
# Option 1: List of variables
|
||||
adds = ["variable1", "variable2", "variable3"]
|
||||
|
||||
# Option 2: Parameter tree path
|
||||
adds = "gov.path.to.parameter.list"
|
||||
```
|
||||
|
||||
### Key Points
|
||||
- ✅ No `formula()` method needed
|
||||
- ✅ Automatically handles entity aggregation (person → household/tax_unit/spm_unit)
|
||||
- ✅ Clean and declarative
|
||||
|
||||
### Example: Simple Income Sum
|
||||
```python
class tanf_gross_earned_income(Variable):
    value_type = float
    entity = SPMUnit
    label = "TANF gross earned income"
    unit = USD
    definition_period = MONTH

    adds = ["employment_income", "self_employment_income"]
    # NO formula needed! Automatically:
    # 1. Gets each person's employment_income
    # 2. Gets each person's self_employment_income
    # 3. Sums all values across SPM unit members
```
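
As a rough mental model only (this is not PolicyEngine's implementation), the aggregation that `adds` performs can be pictured in plain Python. The `people` records and the `sum_to_unit` helper are hypothetical names invented for this sketch.

```python
from collections import defaultdict

# Hypothetical person-level records; PolicyEngine works on arrays instead.
people = [
    {"spm_unit": 0, "employment_income": 1_000, "self_employment_income": 0},
    {"spm_unit": 0, "employment_income": 500, "self_employment_income": 200},
    {"spm_unit": 1, "employment_income": 0, "self_employment_income": 300},
]


def sum_to_unit(people, variables):
    """Sum the named person-level variables within each SPM unit."""
    totals = defaultdict(int)
    for person in people:
        totals[person["spm_unit"]] += sum(person[name] for name in variables)
    return dict(totals)


unit_totals = sum_to_unit(
    people, ["employment_income", "self_employment_income"]
)
print(unit_totals)  # {0: 1700, 1: 300}
```

Each listed variable is read per person and the values are summed within the unit, which mirrors steps 1 to 3 in the comments above.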
|
||||
|
||||
### Example: Using Parameter List
|
||||
```python
|
||||
class income_tax_refundable_credits(Variable):
|
||||
value_type = float
|
||||
entity = TaxUnit
|
||||
definition_period = YEAR
|
||||
|
||||
adds = "gov.irs.credits.refundable"
|
||||
# Parameter file contains list like:
|
||||
# - earned_income_tax_credit
|
||||
# - child_tax_credit
|
||||
# - additional_child_tax_credit
|
||||
```
|
||||
|
||||
### Example: Counting Boolean Values
|
||||
```python
|
||||
class count_eligible_people(Variable):
|
||||
value_type = int
|
||||
entity = SPMUnit
|
||||
definition_period = YEAR
|
||||
|
||||
adds = ["is_eligible_person"]
|
||||
# Automatically sums True (1) and False (0) across members
|
||||
```
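
The counting behaviour relies on booleans acting as 0 and 1 when summed. A scalar illustration in plain Python, with made-up flags rather than real PolicyEngine data:

```python
# Hypothetical eligibility flags for the members of one SPM unit.
is_eligible_person = [True, False, True, True]

# bool is a subclass of int, so summing counts the True values.
count_eligible_people = sum(is_eligible_person)
print(count_eligible_people)  # 3
```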
|
||||
|
||||
---
|
||||
|
||||
## 2. `add()` Function (When Logic Needed)
|
||||
|
||||
### When to Use
|
||||
Use `add()` inside a `formula()` when you need:
|
||||
- To apply `max_()`, `where()`, or conditions
|
||||
- To combine with other operations
|
||||
- To modify values before/after summing
|
||||
|
||||
### Syntax
|
||||
```python
|
||||
from policyengine_us.model_api import *
|
||||
|
||||
def formula(entity, period, parameters):
|
||||
result = add(entity, period, variable_list)
|
||||
```
|
||||
|
||||
**Parameters:**
|
||||
- `entity`: The entity to operate on
|
||||
- `period`: The time period for calculation
|
||||
- `variable_list`: List of variable names or parameter path
|
||||
|
||||
### Example: With max_() to Prevent Negatives
|
||||
```python
class adjusted_earned_income(Variable):
    value_type = float
    entity = SPMUnit
    definition_period = MONTH

    def formula(spm_unit, period, parameters):
        # Need max_() to clip negative values
        gross = add(spm_unit, period, ["employment_income", "self_employment_income"])
        return max_(0, gross)  # Prevent negative income
```
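
To see why the clip is needed, here is a scalar sketch with invented amounts. `clip_at_zero` is a hypothetical stand-in for `max_(0, ...)`, which operates on arrays in real formulas.

```python
def clip_at_zero(value):
    """Scalar stand-in for max_(0, value)."""
    return max(0, value)


employment_income = 1_000
self_employment_income = -1_500  # a business loss

gross = employment_income + self_employment_income
adjusted = clip_at_zero(gross)
print(gross, adjusted)  # -500 0
```

A plain `adds` would report the negative gross amount, so the extra `max_()` step is what justifies writing a formula here.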
|
||||
|
||||
### Example: With Additional Logic
|
||||
```python
class household_benefits(Variable):
    value_type = float
    entity = Household
    definition_period = YEAR

    def formula(household, period, parameters):
        # Sum existing benefits
        BENEFITS = ["snap", "tanf", "ssi", "social_security"]
        existing = add(household, period, BENEFITS)

        # Add new benefit conditionally
        new_benefit = household("special_benefit", period)
        p = parameters(period).gov.special_benefit

        if p.include_in_total:
            return existing + new_benefit
        return existing
```
|
||||
|
||||
### Example: Building on Previous Variables
|
||||
```python
class total_deductions(Variable):
    value_type = float
    entity = TaxUnit
    definition_period = YEAR

    def formula(tax_unit, period, parameters):
        p = parameters(period).gov.irs.deductions

        # Get standard deductions using parameter list
        standard = add(tax_unit, period, p.standard_items)

        # Apply phase-out logic
        income = tax_unit("adjusted_gross_income", period)
        phase_out_rate = p.phase_out_rate
        phase_out_start = p.phase_out_start

        reduction = max_(0, (income - phase_out_start) * phase_out_rate)
        return max_(0, standard - reduction)
```
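
The phase-out arithmetic in this formula can be checked with scalars. The function name and all dollar amounts and rates below are assumptions chosen for illustration, not values from any real parameter file.

```python
def phased_out_deduction(standard, income, phase_out_start, phase_out_rate):
    """Scalar version of the phase-out logic shown above."""
    reduction = max(0, (income - phase_out_start) * phase_out_rate)
    return max(0, standard - reduction)


# Hypothetical values: $10,000 deduction, phase-out from $100,000 at 25%.
below_start = phased_out_deduction(10_000, 90_000, 100_000, 0.25)
partly_phased_out = phased_out_deduction(10_000, 104_000, 100_000, 0.25)
fully_phased_out = phased_out_deduction(10_000, 200_000, 100_000, 0.25)
```

Income below the start leaves the deduction untouched, income just above it reduces the deduction by a quarter of the excess, and the outer `max` keeps the result from going negative.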
|
||||
|
||||
---
|
||||
|
||||
## 3. Common Anti-Patterns to Avoid
|
||||
|
||||
### ❌ NEVER: Manual Summing
|
||||
```python
# WRONG - Never do this!
def formula(spm_unit, period, parameters):
    person = spm_unit.members
    employment = person("employment_income", period)
    self_emp = person("self_employment_income", period)
    return spm_unit.sum(employment + self_emp)  # ❌ BAD
```
|
||||
|
||||
### ✅ CORRECT: Use adds
|
||||
```python
|
||||
# RIGHT - Clean and simple
|
||||
adds = ["employment_income", "self_employment_income"] # ✅ GOOD
|
||||
```
|
||||
|
||||
### ❌ WRONG: Using add() When adds Suffices
|
||||
```python
|
||||
# WRONG - Unnecessary complexity
|
||||
def formula(spm_unit, period, parameters):
|
||||
return add(spm_unit, period, ["income1", "income2"]) # ❌ Overkill
|
||||
```
|
||||
|
||||
### ✅ CORRECT: Use adds
|
||||
```python
|
||||
# RIGHT - Simpler
|
||||
adds = ["income1", "income2"] # ✅ GOOD
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 4. Entity Aggregation Explained
|
||||
|
||||
When using `adds` or `add()`, PolicyEngine automatically handles entity aggregation:
|
||||
|
||||
```python
class household_total_income(Variable):
    entity = Household  # Higher-level entity
    definition_period = YEAR

    adds = ["employment_income", "self_employment_income"]
    # employment_income is defined for Person (lower-level)
    # PolicyEngine automatically:
    # 1. Gets employment_income for each person in household
    # 2. Gets self_employment_income for each person
    # 3. Sums all values to household level
```
|
||||
|
||||
This works across all entity hierarchies:
|
||||
- Person → Tax Unit
|
||||
- Person → SPM Unit
|
||||
- Person → Household
|
||||
- Tax Unit → Household
|
||||
- SPM Unit → Household
|
||||
|
||||
---
|
||||
|
||||
## 5. Parameter Lists
|
||||
|
||||
Parameters can define lists of variables to sum:
|
||||
|
||||
**Parameter file** (`gov/irs/credits/refundable.yaml`):
|
||||
```yaml
description: List of refundable tax credits
values:
  2024-01-01:
    - earned_income_tax_credit
    - child_tax_credit
    - additional_child_tax_credit
```
|
||||
|
||||
**Usage in variable**:
|
||||
```python
|
||||
adds = "gov.irs.credits.refundable"
|
||||
# Automatically sums all credits in the list
|
||||
```
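
One way to picture how a dated parameter list drives the sum is the sketch below. It is a plain-Python approximation under stated assumptions: `REFUNDABLE_CREDITS`, `list_in_effect`, `sum_listed`, and the credit amounts are hypothetical, and returning an empty list before the first start date is a choice made for this sketch, not documented PolicyEngine behaviour.

```python
from datetime import date

# Hypothetical in-memory stand-in for the YAML parameter file above.
REFUNDABLE_CREDITS = {
    date(2024, 1, 1): [
        "earned_income_tax_credit",
        "child_tax_credit",
        "additional_child_tax_credit",
    ],
}


def list_in_effect(values_by_start_date, on_date):
    """Return the list whose start date is the latest one not after on_date."""
    applicable = [start for start in values_by_start_date if start <= on_date]
    if not applicable:
        return []  # sketch-only choice for dates before the first entry
    return values_by_start_date[max(applicable)]


def sum_listed(variable_values, names):
    """Sum only the variables named in the list."""
    return sum(variable_values[name] for name in names)


# Made-up credit amounts for one tax unit.
credit_values = {
    "earned_income_tax_credit": 600,
    "child_tax_credit": 2_000,
    "additional_child_tax_credit": 0,
    "some_other_credit": 50,
}
names = list_in_effect(REFUNDABLE_CREDITS, date(2024, 6, 30))
total = sum_listed(credit_values, names)
print(total)  # 2600
```

Changing which credits are summed then means editing the list in the parameter file, with no change to the variable itself.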
|
||||
|
||||
---
|
||||
|
||||
## 6. Decision Matrix
|
||||
|
||||
| Scenario | Solution | Code |
|----------|----------|------|
| Sum 2-3 variables | `adds` attribute | `adds = ["var1", "var2"]` |
| Sum many variables | Parameter list | `adds = "gov.path.list"` |
| Sum + prevent negatives | `add()` with `max_()` | `max_(0, add(...))` |
| Sum + conditional | `add()` with `where()` | `where(eligible, add(...), 0)` |
| Sum + phase-out | `add()` with calculation | `add(...) - reduction` |
| Count people/entities | `adds` with boolean | `adds = ["is_child"]` |
|
||||
|
||||
---
|
||||
|
||||
## 7. Key Principles
|
||||
|
||||
1. **Default to `adds` attribute** when variable is only a sum
|
||||
2. **Use `add()` function** only when additional logic is needed
|
||||
3. **Never manually sum** with `entity.sum(person(...) + person(...))`
|
||||
4. **Let PolicyEngine handle** entity aggregation automatically
|
||||
5. **Use parameter lists** for maintainable, configurable sums
|
||||
|
||||
---
|
||||
|
||||
## Related Skills
|
||||
|
||||
- **policyengine-period-patterns-skill**: For period conversion when summing across different time periods
|
||||
- **policyengine-core-skill**: For understanding entity hierarchies and relationships
|
||||
|
||||
---
|
||||
|
||||
## For Agents
|
||||
|
||||
When implementing or reviewing code:
|
||||
|
||||
1. **Check if `adds` can be used** before writing a formula
|
||||
2. **Prefer declarative over imperative** when possible
|
||||
3. **Follow existing patterns** in the codebase
|
||||
4. **Test entity aggregation** carefully in YAML tests
|
||||
5. **Document parameter lists** clearly for `adds` references
|
||||
|
||||
---
|
||||
|
||||
## Common Use Cases
|
||||
|
||||
### Earned Income
|
||||
```python
|
||||
adds = ["employment_income", "self_employment_income"]
|
||||
```
|
||||
|
||||
### Unearned Income
|
||||
```python
|
||||
adds = ["interest_income", "dividend_income", "rental_income"]
|
||||
```
|
||||
|
||||
### Total Benefits
|
||||
```python
|
||||
adds = ["snap", "tanf", "wic", "ssi", "social_security"]
|
||||
```
|
||||
|
||||
### Tax Credits
|
||||
```python
|
||||
adds = "gov.irs.credits.refundable"
|
||||
```
|
||||
|
||||
### Counting Children
|
||||
```python
|
||||
adds = ["is_child"] # Returns count of children
|
||||
```
|
||||
382
skills/technical-patterns/policyengine-code-style-skill/SKILL.md
Normal file
@@ -0,0 +1,382 @@
|
||||
---
|
||||
name: policyengine-code-style
|
||||
description: PolicyEngine code writing style guide - formula optimization, direct returns, eliminating unnecessary variables
|
||||
---
|
||||
|
||||
# PolicyEngine Code Writing Style Guide
|
||||
|
||||
Essential patterns for writing clean, efficient PolicyEngine formulas.
|
||||
|
||||
## Core Principles
|
||||
|
||||
1. **Eliminate unnecessary intermediate variables**
|
||||
2. **Use direct parameter/variable access**
|
||||
3. **Return directly when possible**
|
||||
4. **Combine boolean logic**
|
||||
5. **Use correct period access** (period vs period.this_year)
|
||||
6. **NO hardcoded values** - use parameters or constants
|
||||
|
||||
---
|
||||
|
||||
## Pattern 1: Direct Parameter Access
|
||||
|
||||
### ❌ Bad - Unnecessary intermediate variable
|
||||
|
||||
```python
|
||||
def formula(spm_unit, period, parameters):
|
||||
countable = spm_unit("tn_tanf_countable_resources", period)
|
||||
p = parameters(period).gov.states.tn.dhs.tanf.resource_limit
|
||||
resource_limit = p.amount # ❌ Unnecessary
|
||||
return countable <= resource_limit
|
||||
```
|
||||
|
||||
### ✅ Good - Direct access
|
||||
|
||||
```python
|
||||
def formula(spm_unit, period, parameters):
|
||||
countable = spm_unit("tn_tanf_countable_resources", period)
|
||||
p = parameters(period).gov.states.tn.dhs.tanf.resource_limit
|
||||
return countable <= p.amount
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Pattern 2: Direct Return
|
||||
|
||||
### ❌ Bad - Unnecessary result variable
|
||||
|
||||
```python
|
||||
def formula(spm_unit, period, parameters):
|
||||
assets = spm_unit("spm_unit_assets", period.this_year)
|
||||
p = parameters(period).gov.states.tn.dhs.tanf.resource_limit
|
||||
vehicle_exemption = p.vehicle_exemption # ❌ Unnecessary
|
||||
countable = max_(assets - vehicle_exemption, 0) # ❌ Unnecessary
|
||||
return countable
|
||||
```
|
||||
|
||||
### ✅ Good - Direct return
|
||||
|
||||
```python
|
||||
def formula(spm_unit, period, parameters):
|
||||
assets = spm_unit("spm_unit_assets", period.this_year)
|
||||
p = parameters(period).gov.states.tn.dhs.tanf.resource_limit
|
||||
return max_(assets - p.vehicle_exemption, 0)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Pattern 3: Combined Boolean Logic
|
||||
|
||||
### ❌ Bad - Too many intermediate booleans
|
||||
|
||||
```python
|
||||
def formula(spm_unit, period, parameters):
|
||||
person = spm_unit.members
|
||||
age = person("age", period.this_year)
|
||||
is_disabled = person("is_disabled", period.this_year)
|
||||
|
||||
caretaker_is_60_or_older = spm_unit.any(age >= 60) # ❌ Unnecessary
|
||||
caretaker_is_disabled = spm_unit.any(is_disabled) # ❌ Unnecessary
|
||||
eligible = caretaker_is_60_or_older | caretaker_is_disabled # ❌ Unnecessary
|
||||
|
||||
return eligible
|
||||
```
|
||||
|
||||
### ✅ Good - Combined logic
|
||||
|
||||
```python
def formula(spm_unit, period, parameters):
    person = spm_unit.members
    age = person("age", period.this_year)
    is_disabled = person("is_disabled", period.this_year)

    return spm_unit.any((age >= 60) | is_disabled)
```
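
The combined form gives the same answer as the three-variable version because "any member meets A, or any member meets B" equals "any member meets A or B". A plain-Python check over a few invented households (lists stand in for PolicyEngine's arrays):

```python
# Hypothetical households: (ages, disability flags) per member.
households = [
    ([34, 62, 8], [False, False, False]),
    ([34, 40], [False, True]),
    ([34, 40], [False, False]),
]

results = []
for ages, is_disabled in households:
    # Separate checks, as in the "Bad" version.
    separate = any(age >= 60 for age in ages) or any(is_disabled)
    # Single combined check, as in the "Good" version.
    combined = any(
        age >= 60 or disabled for age, disabled in zip(ages, is_disabled)
    )
    results.append((separate, combined))

print(results)  # [(True, True), (True, True), (False, False)]
```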
|
||||
|
||||
---
|
||||
|
||||
## Pattern 4: Period Access - period vs period.this_year
|
||||
|
||||
### ❌ Bad - Wrong period access
|
||||
|
||||
```python
|
||||
def formula(person, period, parameters):
|
||||
# MONTH formula accessing YEAR variables
|
||||
age = person("age", period) # ❌ Gives age/12 = 2.5 "monthly age"
|
||||
assets = person("assets", period) # ❌ Gives assets/12
|
||||
monthly_income = person("employment_income", period.this_year) / MONTHS_IN_YEAR # ❌ Redundant
|
||||
|
||||
return (age >= 18) & (assets < 10000) & (monthly_income < 2000)
|
||||
```
|
||||
|
||||
### ✅ Good - Correct period access
|
||||
|
||||
```python
def formula(person, period, parameters):
    # MONTH formula accessing YEAR variables
    age = person("age", period.this_year)  # ✅ Gets actual age (30)
    assets = person("assets", period.this_year)  # ✅ Gets actual assets ($10,000)
    monthly_income = person("employment_income", period)  # ✅ Auto-converts to monthly

    p = parameters(period).gov.program.eligibility
    # Parentheses are required for a multi-line return expression
    return (
        (age >= p.age_min) & (age <= p.age_max) &
        (assets < p.asset_limit) & (monthly_income < p.income_threshold)
    )
```
|
||||
|
||||
**Rule:**
|
||||
- Income/flows → Use `period` (want monthly from annual)
|
||||
- Age/assets/counts/booleans → Use `period.this_year` (don't divide by 12)
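
The arithmetic behind this rule can be sketched in plain Python. `value_for_month` is a hypothetical helper that models the auto-conversion described above (an annual value spread evenly over twelve months); it is not a PolicyEngine function.

```python
MONTHS_IN_YEAR = 12


def value_for_month(annual_value):
    """Model of asking for a YEAR-defined variable with a monthly period."""
    return annual_value / MONTHS_IN_YEAR


age = 30
annual_income = 24_000

print(value_for_month(annual_income))  # 2000.0 -> a sensible monthly income
print(value_for_month(age))  # 2.5 -> a meaningless "monthly age"
```

Dividing a flow such as income by twelve is what a monthly formula wants; dividing a stock or attribute such as age is not, which is why those use `period.this_year`.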
|
||||
|
||||
---
|
||||
|
||||
## Pattern 5: No Hardcoded Values
|
||||
|
||||
### ❌ Bad - Hardcoded numbers
|
||||
|
||||
```python
|
||||
def formula(spm_unit, period, parameters):
|
||||
size = spm_unit.nb_persons()
|
||||
capped_size = min_(size, 10) # ❌ Hardcoded
|
||||
|
||||
age = person("age", period.this_year)
|
||||
income = person("income", period) / 12 # ❌ Use MONTHS_IN_YEAR
|
||||
|
||||
# ❌ Hardcoded thresholds
|
||||
if age >= 18 and age <= 65 and income < 2000:
|
||||
return True
|
||||
```
|
||||
|
||||
### ✅ Good - Parameterized
|
||||
|
||||
```python
def formula(spm_unit, period, parameters):
    p = parameters(period).gov.program
    capped_size = min_(spm_unit.nb_persons(), p.max_unit_size)  # ✅

    person = spm_unit.members  # define person before using it
    age = person("age", period.this_year)
    monthly_income = person("income", period)  # ✅ Auto-converts (no manual /12)

    age_eligible = (age >= p.age_min) & (age <= p.age_max)  # ✅
    income_eligible = monthly_income < p.income_threshold  # ✅

    return age_eligible & income_eligible
```
|
||||
|
||||
---
|
||||
|
||||
## Pattern 6: Streamline Variable Access
|
||||
|
||||
### ❌ Bad - Redundant steps
|
||||
|
||||
```python
|
||||
def formula(spm_unit, period, parameters):
|
||||
unit_size = spm_unit.nb_persons() # ❌ Unnecessary
|
||||
max_size = 10 # ❌ Hardcoded
|
||||
capped_size = min_(unit_size, max_size)
|
||||
|
||||
p = parameters(period).gov.states.tn.dhs.tanf.benefit
|
||||
spa = p.standard_payment_amount[capped_size] # ❌ Unnecessary
|
||||
dgpa = p.differential_grant_payment_amount[capped_size] # ❌ Unnecessary
|
||||
|
||||
eligible = spm_unit("eligible_for_dgpa", period)
|
||||
return where(eligible, dgpa, spa)
|
||||
```
|
||||
|
||||
### ✅ Good - Streamlined
|
||||
|
||||
```python
def formula(spm_unit, period, parameters):
    p = parameters(period).gov.states.tn.dhs.tanf.benefit
    capped_size = min_(spm_unit.nb_persons(), p.max_unit_size)
    eligible = spm_unit("eligible_for_dgpa", period)

    return where(
        eligible,
        p.differential_grant_payment_amount[capped_size],
        p.standard_payment_amount[capped_size],
    )
```
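
A scalar sketch of the capped lookup may help when writing test cases. The payment schedules, `MAX_UNIT_SIZE`, and the `payment` function are invented for illustration and do not reflect actual Tennessee amounts.

```python
# Hypothetical payment schedules keyed by unit size.
STANDARD_PAYMENT = {1: 100, 2: 150, 3: 200}
DIFFERENTIAL_PAYMENT = {1: 140, 2: 190, 3: 240}
MAX_UNIT_SIZE = 3


def payment(unit_size, eligible_for_dgpa):
    """Cap the unit size, then pick the schedule based on eligibility."""
    capped_size = min(unit_size, MAX_UNIT_SIZE)
    schedule = DIFFERENTIAL_PAYMENT if eligible_for_dgpa else STANDARD_PAYMENT
    return schedule[capped_size]


print(payment(5, False))  # 200 (size 5 is capped to 3)
print(payment(2, True))  # 190
```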
|
||||
|
||||
---
|
||||
|
||||
## When to Keep Intermediate Variables
|
||||
|
||||
### ✅ Keep when value is used multiple times
|
||||
|
||||
```python
|
||||
def formula(tax_unit, period, parameters):
|
||||
p = parameters(period).gov.irs.credits
|
||||
filing_status = tax_unit("filing_status", period)
|
||||
|
||||
# ✅ Used multiple times - keep as variable
|
||||
threshold = p.phase_out.start[filing_status]
|
||||
|
||||
income = tax_unit("adjusted_gross_income", period)
|
||||
excess = max_(0, income - threshold)
|
||||
reduction = (excess / p.phase_out.width) * threshold
|
||||
|
||||
return max_(0, threshold - reduction)
|
||||
```
|
||||
|
||||
### ✅ Keep when calculation is complex
|
||||
|
||||
```python
|
||||
def formula(spm_unit, period, parameters):
|
||||
p = parameters(period).gov.program
|
||||
gross_earned = spm_unit("gross_earned_income", period)
|
||||
|
||||
# ✅ Complex multi-step calculation - break it down
|
||||
work_expense_deduction = min_(gross_earned * p.work_expense_rate, p.work_expense_max)
|
||||
after_work_expense = gross_earned - work_expense_deduction
|
||||
|
||||
earned_disregard = after_work_expense * p.earned_disregard_rate
|
||||
countable_earned = after_work_expense - earned_disregard
|
||||
|
||||
dependent_care = spm_unit("dependent_care_expenses", period)
|
||||
|
||||
return max_(0, countable_earned - dependent_care)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Complete Example: Before vs After
|
||||
|
||||
### ❌ Before - Multiple Issues
|
||||
|
||||
```python
|
||||
def formula(person, period, parameters):
|
||||
# Wrong period access
|
||||
age = person("age", period) # ❌ age/12
|
||||
assets = person("assets", period) # ❌ assets/12
|
||||
annual_income = person("employment_income", period.this_year)
|
||||
monthly_income = annual_income / 12 # ❌ Use MONTHS_IN_YEAR
|
||||
|
||||
# Hardcoded values
|
||||
min_age = 18 # ❌
|
||||
max_age = 64 # ❌
|
||||
asset_limit = 10000 # ❌
|
||||
income_limit = 2000 # ❌
|
||||
|
||||
# Unnecessary intermediate variables
|
||||
age_check = (age >= min_age) & (age <= max_age)
|
||||
asset_check = assets <= asset_limit
|
||||
income_check = monthly_income <= income_limit
|
||||
eligible = age_check & asset_check & income_check
|
||||
|
||||
return eligible
|
||||
```
|
||||
|
||||
### ✅ After - Clean and Correct
|
||||
|
||||
```python
def formula(person, period, parameters):
    p = parameters(period).gov.program.eligibility

    # Correct period access
    age = person("age", period.this_year)
    assets = person("assets", period.this_year)
    monthly_income = person("employment_income", period)

    # Direct return with combined logic
    return (
        (age >= p.age_min) & (age <= p.age_max) &
        (assets <= p.asset_limit) &
        (monthly_income <= p.income_threshold)
    )
```
|
||||
|
||||
---
|
||||
|
||||
## Pattern 7: Minimal Comments
|
||||
|
||||
### Code Should Be Self-Documenting
|
||||
|
||||
**Variable names and structure should explain the code - not comments.**
|
||||
|
||||
### ❌ Bad - Verbose explanatory comments
|
||||
|
||||
```python
|
||||
def formula(spm_unit, period, parameters):
|
||||
# Wisconsin disregards all earned income of dependent children (< 18)
|
||||
# Calculate earned income for adults only
|
||||
is_adult = spm_unit.members("age", period.this_year) >= 18 # Hard-coded!
|
||||
adult_earned = spm_unit.sum(
|
||||
spm_unit.members("tanf_gross_earned_income", period) * is_adult
|
||||
)
|
||||
|
||||
# All unearned income is counted (including children's)
|
||||
gross_unearned = add(spm_unit, period, ["tanf_gross_unearned_income"])
|
||||
|
||||
# NOTE: Wisconsin disregards many additional income sources that
|
||||
# are not separately tracked in PolicyEngine (educational aid, etc.)
|
||||
return max_(total_income - disregards, 0)
|
||||
```
|
||||
|
||||
### ✅ Good - Clean self-documenting code
|
||||
|
||||
```python
def formula(spm_unit, period, parameters):
    p = parameters(period).gov.states.wi.dcf.tanf.income

    is_adult = spm_unit.members("age", period.this_year) >= p.adult_age_threshold
    adult_earned = spm_unit.sum(
        spm_unit.members("tanf_gross_earned_income", period) * is_adult
    )
    gross_unearned = add(spm_unit, period, ["tanf_gross_unearned_income"])
    child_support = add(spm_unit, period, ["child_support_received"])

    return max_(adult_earned + gross_unearned - child_support, 0)
```
|
||||
|
||||
### Comment Rules
|
||||
|
||||
1. **NO comments explaining what code does** - variable names should be clear
|
||||
2. **OK: Brief NOTE about PolicyEngine limitations** (one line):
|
||||
```python
|
||||
# NOTE: Time limit cannot be tracked in PolicyEngine
|
||||
```
|
||||
3. **NO multi-line explanations** of what the code calculates
|
||||
|
||||
---
|
||||
|
||||
## Quick Checklist
|
||||
|
||||
Before finalizing code:
|
||||
- [ ] No hardcoded numbers (use parameters or constants like MONTHS_IN_YEAR)
|
||||
- [ ] Correct period access:
|
||||
- Income/flows use `period`
|
||||
- Age/assets/counts/booleans use `period.this_year`
|
||||
- [ ] No single-use intermediate variables
|
||||
- [ ] Direct parameter access (`p.amount` not `amount = p.amount`)
|
||||
- [ ] Direct returns when possible
|
||||
- [ ] Combined boolean logic when possible
|
||||
- [ ] Minimal comments (code should be self-documenting)
|
||||
|
||||
---
|
||||
|
||||
## Key Takeaways
|
||||
|
||||
1. **Less is more** - Eliminate unnecessary variables
|
||||
2. **Direct is better** - Access parameters and return directly
|
||||
3. **Combine when logical** - Group related boolean conditions
|
||||
4. **Keep when needed** - Complex calculations and reused values deserve variables
|
||||
5. **Period matters** - Use correct period access to avoid auto-conversion bugs
|
||||
|
||||
---
|
||||
|
||||
## Related Skills
|
||||
|
||||
- **policyengine-period-patterns-skill** - Deep dive on period handling
|
||||
- **policyengine-implementation-patterns-skill** - Variable structure and patterns
|
||||
- **policyengine-vectorization-skill** - NumPy operations and vectorization
|
||||
|
||||
---
|
||||
|
||||
## For Agents
|
||||
|
||||
When writing or reviewing formulas:
|
||||
1. **Scan for single-use variables** - eliminate them
|
||||
2. **Check period access** - ensure correct for variable type
|
||||
3. **Look for hardcoded values** - parameterize them
|
||||
4. **Identify redundant steps** - streamline them
|
||||
5. **Consider readability** - keep complex calculations clear
|
||||
@@ -0,0 +1,739 @@
|
||||
---
|
||||
name: policyengine-implementation-patterns
|
||||
description: PolicyEngine implementation patterns - variable creation, no hard-coding principle, federal/state separation, metadata standards
|
||||
---
|
||||
|
||||
# PolicyEngine Implementation Patterns
|
||||
|
||||
Essential patterns for implementing government benefit program rules in PolicyEngine.
|
||||
|
||||
## PolicyEngine Architecture Constraints
|
||||
|
||||
### What CANNOT Be Simulated (Single-Period Limitation)
|
||||
|
||||
**CRITICAL: PolicyEngine uses single-period simulation architecture**
|
||||
|
||||
The following CANNOT be implemented and should be SKIPPED when found in documentation:
|
||||
|
||||
#### 1. Time Limits and Lifetime Counters
|
||||
**Cannot simulate:**
|
||||
- ANY lifetime benefit limits (X months total)
|
||||
- ANY time windows (X months within Y period)
|
||||
- Benefit clocks and countable months
|
||||
- Cumulative time tracking
|
||||
|
||||
**Why:** Requires tracking benefit history across multiple periods. PolicyEngine simulates one period at a time with no state persistence.
|
||||
|
||||
**What to do:** Document in comments but DON'T parameterize or implement:
|
||||
```python
|
||||
# NOTE: [State] has [X]-month lifetime limit on [Program] benefits
|
||||
# This cannot be simulated in PolicyEngine's single-period architecture
|
||||
```
|
||||
|
||||
#### 2. Work History Requirements
|
||||
**Cannot simulate:**
|
||||
- "Must have worked 6 of last 12 months"
|
||||
- "Averaged 30 hours/week over past quarter"
|
||||
- Prior employment verification
|
||||
- Work participation rate tracking
|
||||
|
||||
**Why:** Requires historical data from previous periods.
|
||||
|
||||
#### 3. Waiting Periods and Benefit Delays
|
||||
**Cannot simulate:**
|
||||
- "3-month waiting period for new residents"
|
||||
- "Benefits start month after application"
|
||||
- Retroactive eligibility
|
||||
- Benefit recertification cycles
|
||||
|
||||
**Why:** Requires tracking application dates and eligibility history.
|
||||
|
||||
#### 4. Progressive Sanctions and Penalties
|
||||
**Cannot simulate:**
|
||||
- "First violation: 1-month sanction, Second: 3-month, Third: permanent"
|
||||
- Graduated penalties
|
||||
- Strike systems
|
||||
|
||||
**Why:** Requires tracking violation history.
|
||||
|
||||
#### 5. Asset Spend-Down Over Time
|
||||
**Cannot simulate:**
|
||||
- Medical spend-down across months
|
||||
- Resource depletion tracking
|
||||
- Accumulated medical expenses
|
||||
|
||||
**Why:** Requires tracking expenses and resources across periods.
|
||||
|
||||
### What CAN Be Simulated (With Caveats)
|
||||
|
||||
PolicyEngine CAN simulate point-in-time eligibility and benefits:
|
||||
- ✅ Current month income limits
|
||||
- ✅ Current month resource limits
|
||||
- ✅ Current benefit calculations
|
||||
- ✅ Current household composition
|
||||
- ✅ Current deductions and disregards
|
||||
|
||||
### Time-Limited Benefits That Affect Current Calculations
|
||||
|
||||
**Special Case: Time-limited deductions/disregards**
|
||||
|
||||
When a deduction or disregard is only available for X months:
|
||||
- **DO implement the deduction** (assume it applies)
|
||||
- **DO add a comment** explaining the time limitation
|
||||
- **DON'T try to track or enforce the time limit**
|
||||
|
||||
Example:
|
||||
```python
class state_tanf_countable_earned_income(Variable):
    def formula(spm_unit, period, parameters):
        p = parameters(period).gov.states.xx.tanf.income
        earned = spm_unit("tanf_gross_earned_income", period)

        # NOTE: In reality, this 75% disregard only applies for first 4 months
        # of employment. PolicyEngine cannot track employment duration, so we
        # apply the disregard assuming the household qualifies.
        # Actual rule: [State Code Citation]
        disregard_rate = p.earned_income_disregard_rate  # 0.75

        return earned * (1 - disregard_rate)
```
|
||||
|
||||
**Rule: If it requires history or future tracking, it CANNOT be fully simulated - but implement what we can and document limitations**
|
||||
|
||||
---
|
||||
|
||||
## Critical Principles
|
||||
|
||||
### 1. ZERO Hard-Coded Values
|
||||
**Every numeric value MUST be parameterized**
|
||||
|
||||
```python
# ❌ FORBIDDEN:
return where(eligible, 1000, 0)  # Hard-coded 1000
age < 15  # Hard-coded 15
benefit = income * 0.33  # Hard-coded 0.33
month >= 10 or month <= 3  # Hard-coded months

# ✅ REQUIRED:
return where(eligible, p.maximum_benefit, 0)
age < p.age_threshold.minor_child
benefit = income * p.benefit_rate
month >= p.season.start_month
```
|
||||
|
||||
**Acceptable literals:**
|
||||
- `0`, `1`, `-1` for basic math
|
||||
- `12` for month conversion (`/ 12`, `* 12`), though the `MONTHS_IN_YEAR` constant is preferred where available
|
||||
- Array indices when structure is known
|
||||
|
||||
### 2. No Placeholder Implementations
|
||||
**Delete the file rather than leave placeholders**
|
||||
|
||||
```python
# ❌ NEVER:
def formula(entity, period, parameters):
    # TODO: Implement
    return 75  # Placeholder

# ✅ ALWAYS:
# Complete implementation or no file at all
```
|
||||
|
||||
---
|
||||
|
||||
## Variable Implementation Standards
|
||||
|
||||
### Variable Metadata Format
|
||||
|
||||
Follow established patterns:
|
||||
```python
class il_tanf_countable_earned_income(Variable):
    value_type = float
    entity = SPMUnit
    definition_period = MONTH
    label = "Illinois TANF countable earned income"
    unit = USD
    reference = "https://www.law.cornell.edu/regulations/illinois/..."
    defined_for = StateCode.IL

    # Use adds for simple sums
    adds = ["il_tanf_earned_income_after_disregard"]
```
|
||||
|
||||
**Key rules:**
|
||||
- ✅ Use full URL in `reference` (clickable)
|
||||
- ❌ Don't use `documentation` field
|
||||
- ❌ Don't use statute citations without URLs
|
||||
|
||||
### When to Use `adds` vs `formula`
|
||||
|
||||
**Use `adds` when:**
|
||||
- Just summing variables
|
||||
- Passing through a single variable
|
||||
- No transformations needed
|
||||
|
||||
```python
# ✅ BEST - Simple sum:
class tanf_gross_income(Variable):
    adds = ["employment_income", "self_employment_income"]
```
|
||||
|
||||
**Use `formula` when:**
|
||||
- Applying transformations
|
||||
- Conditional logic
|
||||
- Calculations needed
|
||||
|
||||
```python
# ✅ CORRECT - Need logic:
def formula(entity, period, parameters):
    income = add(entity, period, ["income1", "income2"])
    return max_(0, income)  # Need max_
```
|
||||
|
||||
---
|
||||
|
||||
## TANF Countable Income Pattern
|
||||
|
||||
### Critical: Verify Calculation Order from Legal Code
|
||||
|
||||
**MOST IMPORTANT:** Always check the state's legal code or policy manual for the exact calculation order. The pattern below is typical but not universal.
|
||||
|
||||
**The Typical Pattern:**
|
||||
1. Apply deductions/disregards to **earned income only**
|
||||
2. Use `max_()` to prevent negative earned income
|
||||
3. Add unearned income (which typically has no deductions)
|
||||
|
||||
**This pattern is based on how MOST TANF programs work, but you MUST verify with the specific state's legal code.**
|
||||
|
||||
### ❌ WRONG - Applying deductions to total income
|
||||
|
||||
```python
|
||||
def formula(spm_unit, period, parameters):
|
||||
gross_earned = spm_unit("tanf_gross_earned_income", period)
|
||||
unearned = spm_unit("tanf_gross_unearned_income", period)
|
||||
deductions = spm_unit("tanf_earned_income_deductions", period)
|
||||
|
||||
# ❌ WRONG: Deductions applied to total income
|
||||
total_income = gross_earned + unearned
|
||||
countable = total_income - deductions
|
||||
|
||||
return max_(countable, 0)
|
||||
```
|
||||
|
||||
**Why this is wrong:**
|
||||
- Deductions should ONLY reduce earned income
|
||||
- Unearned income (SSI, child support, etc.) is not subject to work expense deductions
|
||||
- This incorrectly reduces unearned income when earned income is low
|
||||
|
||||
**Example error:**
|
||||
- Earned: $100, Unearned: $500, Deductions: $200
|
||||
- Wrong result: `max_($100 + $500 - $200, 0) = $400` (reduces unearned!)
|
||||
- Correct result: `max_($100 - $200, 0) + $500 = $500`
|
||||
|
||||
### ✅ CORRECT - Apply deductions to earned only, then add unearned
|
||||
|
||||
```python
|
||||
def formula(spm_unit, period, parameters):
|
||||
gross_earned = spm_unit("tanf_gross_earned_income", period)
|
||||
unearned = spm_unit("tanf_gross_unearned_income", period)
|
||||
deductions = spm_unit("tanf_earned_income_deductions", period)
|
||||
|
||||
# ✅ CORRECT: Deductions applied to earned only, then add unearned
|
||||
return max_(gross_earned - deductions, 0) + unearned
|
||||
```
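The difference between the two orderings can be checked with plain numbers. This is a minimal sketch, not PolicyEngine code: it uses Python's built-in `max` in place of `max_`, and both function names are hypothetical.

```python
def countable_income_wrong(earned, unearned, deductions):
    # Anti-pattern: deductions are subtracted from total income.
    return max(earned + unearned - deductions, 0)


def countable_income_correct(earned, unearned, deductions):
    # Deductions reduce earned income only (floored at zero),
    # then unearned income is added untouched.
    return max(earned - deductions, 0) + unearned


# Figures from the example error: earned $100, unearned $500, deductions $200.
wrong = countable_income_wrong(100, 500, 200)
correct = countable_income_correct(100, 500, 200)
print(wrong, correct)  # 400 500
```

The two functions agree whenever deductions do not exceed earned income. They diverge only when deductions are larger than earnings, which is exactly the case worth covering in a test.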
|
||||
|
||||
### Pattern Variations
|
||||
|
||||
**With multiple deduction steps:**
|
||||
```python
|
||||
def formula(spm_unit, period, parameters):
|
||||
p = parameters(period).gov.states.xx.tanf.income
|
||||
gross_earned = spm_unit("tanf_gross_earned_income", period)
|
||||
unearned = spm_unit("tanf_gross_unearned_income", period)
|
||||
|
||||
# Step 1: Apply work expense deduction
|
||||
work_expense = min_(gross_earned * p.work_expense_rate, p.work_expense_max)
|
||||
after_work_expense = max_(gross_earned - work_expense, 0)
|
||||
|
||||
# Step 2: Apply earnings disregard
|
||||
earnings_disregard = after_work_expense * p.disregard_rate
|
||||
countable_earned = max_(after_work_expense - earnings_disregard, 0)
|
||||
|
||||
# Step 3: Add unearned (no deductions applied)
|
||||
return countable_earned + unearned
|
||||
```
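A worked run of the three steps, as a plain-Python sketch. The rates and cap below (20% work expense rate, $120 cap, 50% disregard) are illustrative assumptions, not values from any state's rules, and built-in `min`/`max` stand in for `min_`/`max_`.

```python
def countable_income(gross_earned, unearned, work_expense_rate,
                     work_expense_max, disregard_rate):
    # Step 1: work expense deduction, capped at the maximum.
    work_expense = min(gross_earned * work_expense_rate, work_expense_max)
    after_work_expense = max(gross_earned - work_expense, 0)

    # Step 2: earnings disregard on what remains.
    earnings_disregard = after_work_expense * disregard_rate
    countable_earned = max(after_work_expense - earnings_disregard, 0)

    # Step 3: unearned income is added with no deductions.
    return countable_earned + unearned


# $1,000 earned, $300 unearned, hypothetical parameters.
result = countable_income(1_000, 300, 0.2, 120, 0.5)
print(result)  # 740.0
```

Here the work expense is capped at $120 (20% of $1,000 would be $200), leaving $880. Half of that is disregarded, so $440 of earnings plus $300 unearned gives $740.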
|
||||
|
||||
**With disregard percentage (simplified):**
|
||||
```python
|
||||
def formula(spm_unit, period, parameters):
|
||||
p = parameters(period).gov.states.xx.tanf.income
|
||||
gross_earned = spm_unit("tanf_gross_earned_income", period)
|
||||
unearned = spm_unit("tanf_gross_unearned_income", period)
|
||||
|
||||
# Apply disregard to earned (keep 33% = disregard 67%)
|
||||
countable_earned = gross_earned * (1 - p.earned_disregard_rate)
|
||||
|
||||
return max_(countable_earned, 0) + unearned
|
||||
```
|
||||
|
||||
### When Unearned Income HAS Deductions
|
||||
|
||||
Some states DO have unearned income deductions (rare). Handle separately:
|
||||
|
||||
```python
|
||||
def formula(spm_unit, period, parameters):
|
||||
gross_earned = spm_unit("tanf_gross_earned_income", period)
|
||||
gross_unearned = spm_unit("tanf_gross_unearned_income", period)
|
||||
earned_deductions = spm_unit("tanf_earned_income_deductions", period)
|
||||
unearned_deductions = spm_unit("tanf_unearned_income_deductions", period)
|
||||
|
||||
# Apply each type of deduction to its respective income type
|
||||
countable_earned = max_(gross_earned - earned_deductions, 0)
|
||||
countable_unearned = max_(gross_unearned - unearned_deductions, 0)
|
||||
|
||||
return countable_earned + countable_unearned
|
||||
```
|
||||
|
||||
### Quick Reference
|
||||
|
||||
**Standard TANF pattern:**
|
||||
```
|
||||
Countable Income = max_(Earned - Earned Deductions, 0) + Unearned
|
||||
```
|
||||
|
||||
**NOT:**
|
||||
```
|
||||
❌ max_(Earned + Unearned - Deductions, 0)
|
||||
❌ max_(Earned - Deductions + Unearned, 0)  # Same mistake reordered: excess deductions still reduce unearned
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Federal/State Separation
|
||||
|
||||
### Federal Parameters
|
||||
Location: `/parameters/gov/{agency}/`
|
||||
- Base formulas and methodologies
|
||||
- National standards
|
||||
- Required elements
|
||||
|
||||
### State Parameters
|
||||
Location: `/parameters/gov/states/{state}/`
|
||||
- State-specific thresholds
|
||||
- Implementation choices
|
||||
- Scale factors
|
||||
|
||||
```yaml
|
||||
# Federal: parameters/gov/hhs/fpg/base.yaml
|
||||
first_person: 14_580
|
||||
|
||||
# State: parameters/gov/states/ca/scale_factor.yaml
|
||||
fpg_multiplier: 2.0 # 200% of FPG
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Code Reuse Patterns
|
||||
|
||||
### Avoid Duplication - Create Intermediate Variables
|
||||
|
||||
**❌ ANTI-PATTERN: Copy-pasting calculations**
|
||||
```python
|
||||
# File 1: calculates income after deduction
|
||||
def formula(household, period, parameters):
|
||||
gross = add(household, period, ["income"])
|
||||
deduction = p.deduction * household.nb_persons()
|
||||
return max_(gross - deduction, 0)
|
||||
|
||||
# File 2: DUPLICATES same calculation
|
||||
def formula(household, period, parameters):
|
||||
gross = add(household, period, ["income"]) # Copy-pasted
|
||||
deduction = p.deduction * household.nb_persons() # Copy-pasted
|
||||
after_deduction = max_(gross - deduction, 0) # Copy-pasted
|
||||
return after_deduction < p.threshold
|
||||
```
|
||||
|
||||
**✅ CORRECT: Reuse existing variables**
|
||||
```python
|
||||
# File 2: reuses calculation
|
||||
def formula(household, period, parameters):
|
||||
countable_income = household("program_countable_income", period)
|
||||
return countable_income < p.threshold
|
||||
```
|
||||
|
||||
**When to create intermediate variables:**
|
||||
- Same calculation in 2+ places
|
||||
- Logic exceeds 5 lines
|
||||
- Reference implementations have a similar variable
|
||||
|
||||
---
|
||||
|
||||
## TANF-Specific Patterns
|
||||
|
||||
### Study Reference Implementations First
|
||||
|
||||
**MANDATORY before implementing any TANF program:**
|
||||
- DC TANF: `/variables/gov/states/dc/dhs/tanf/`
|
||||
- IL TANF: `/variables/gov/states/il/dhs/tanf/`
|
||||
- TX TANF: `/variables/gov/states/tx/hhs/tanf/`
|
||||
|
||||
**Learn from them:**
|
||||
1. Variable organization
|
||||
2. Naming conventions
|
||||
3. Code reuse patterns
|
||||
4. When to use `adds` vs `formula`
|
||||
|
||||
### Standard TANF Structure
|
||||
```
|
||||
tanf/
|
||||
├── eligibility/
|
||||
│ ├── demographic_eligible.py
|
||||
│ ├── income_eligible.py
|
||||
│ └── eligible.py
|
||||
├── income/
|
||||
│ ├── earned/
|
||||
│ ├── unearned/
|
||||
│ └── countable_income.py
|
||||
└── [state]_tanf.py
|
||||
```
|
||||
|
||||
### Simplified TANF Rules
|
||||
|
||||
For simplified implementations:
|
||||
|
||||
**DON'T create state-specific versions of:**
|
||||
- Demographic eligibility (use federal)
|
||||
- Immigration eligibility (use federal)
|
||||
- Income sources (use federal baseline)
|
||||
|
||||
```python
|
||||
❌ DON'T CREATE:
|
||||
ca_tanf_demographic_eligible_person.py
|
||||
ca_tanf_gross_earned_income.py
|
||||
parameters/.../income/sources/earned.yaml
|
||||
|
||||
✅ DO USE:
|
||||
# Federal demographic eligibility
|
||||
is_demographic_tanf_eligible
|
||||
# Federal income aggregation
|
||||
tanf_gross_earned_income
|
||||
```
|
||||
|
||||
### Avoiding Unnecessary Wrapper Variables (CRITICAL)
|
||||
|
||||
**Golden Rule: Only create a state variable if you're adding state-specific logic to it!**
|
||||
|
||||
#### Understand WHY Variables Exist, Not Just WHAT
|
||||
|
||||
When studying reference implementations:
|
||||
1. **Note which variables they have**
|
||||
2. **READ THE CODE inside each variable**
|
||||
3. **Ask: "Does this variable have state-specific logic?"**
|
||||
4. **If it just returns federal baseline → DON'T copy it**
|
||||
|
||||
#### Variable Creation Decision Tree
|
||||
|
||||
Before creating ANY state-specific variable, ask:
|
||||
1. Does federal baseline already calculate this?
|
||||
2. Does my state do it DIFFERENTLY than federal?
|
||||
3. Can I write the difference in 1+ lines of state-specific logic?
|
||||
4. **Will this calculation be used in 2+ other variables?** (Code reuse exception)
|
||||
|
||||
**Decision:**
|
||||
- If YES/NO/NO/NO → **DON'T create the variable**, use federal directly
|
||||
- If YES/YES/YES/NO → **CREATE the variable** with state logic
|
||||
- If YES/NO/NO/YES → **CREATE as intermediate variable** for code reuse (see exception below)
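The three documented answer patterns can be written as a lookup table. This is only a sketch of the decision list above: the function and constant names are hypothetical, and any combination the list does not mention is reported as such rather than guessed.

```python
# Keys are answers to questions 1-4, in order:
# (federal calculates it, state differs, expressible as state logic, reused in 2+ variables)
DECISIONS = {
    (True, False, False, False): "use federal variable directly",
    (True, True, True, False): "create variable with state logic",
    (True, False, False, True): "create intermediate variable for reuse",
}


def variable_decision(federal_exists, state_differs, has_state_logic, reused):
    answers = (federal_exists, state_differs, has_state_logic, reused)
    # Combinations outside the documented list need a manual review.
    return DECISIONS.get(answers, "not covered by the list; review manually")


print(variable_decision(True, False, False, False))
print(variable_decision(True, False, False, True))
```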
|
||||
|
||||
#### EXCEPTION: Code Reuse Justifies Intermediate Variables
|
||||
|
||||
**Even without state-specific logic, create a variable if the SAME calculation is used in multiple places.**
|
||||
|
||||
❌ **Bad - Duplicating calculation across variables:**
|
||||
```python
|
||||
# Variable 1 - Income eligibility
|
||||
class mo_tanf_income_eligible(Variable):
|
||||
def formula(spm_unit, period, parameters):
|
||||
# Duplicated calculation
|
||||
gross = add(spm_unit, period, ["tanf_gross_earned_income", "tanf_gross_unearned_income"])
|
||||
return gross <= p.income_limit
|
||||
|
||||
# Variable 2 - Countable income
|
||||
class mo_tanf_countable_income(Variable):
|
||||
def formula(spm_unit, period, parameters):
|
||||
# SAME calculation repeated!
|
||||
gross = add(spm_unit, period, ["tanf_gross_earned_income", "tanf_gross_unearned_income"])
|
||||
deductions = spm_unit("mo_tanf_deductions", period)
|
||||
return max_(gross - deductions, 0)
|
||||
|
||||
# Variable 3 - Need standard
|
||||
class mo_tanf_need_standard(Variable):
|
||||
def formula(spm_unit, period, parameters):
|
||||
# SAME calculation AGAIN!
|
||||
gross = add(spm_unit, period, ["tanf_gross_earned_income", "tanf_gross_unearned_income"])
|
||||
return where(gross < p.threshold, p.high, p.low)
|
||||
```
|
||||
|
||||
✅ **Good - Extract into reusable intermediate variable:**
|
||||
```python
|
||||
# Intermediate variable - used in multiple places
|
||||
class mo_tanf_gross_income(Variable):
|
||||
adds = ["tanf_gross_earned_income", "tanf_gross_unearned_income"]
|
||||
|
||||
# Variable 1 - Reuses intermediate
|
||||
class mo_tanf_income_eligible(Variable):
|
||||
def formula(spm_unit, period, parameters):
|
||||
gross = spm_unit("mo_tanf_gross_income", period) # Reuse
|
||||
return gross <= p.income_limit
|
||||
|
||||
# Variable 2 - Reuses intermediate
|
||||
class mo_tanf_countable_income(Variable):
|
||||
def formula(spm_unit, period, parameters):
|
||||
gross = spm_unit("mo_tanf_gross_income", period) # Reuse
|
||||
deductions = spm_unit("mo_tanf_deductions", period)
|
||||
return max_(gross - deductions, 0)
|
||||
|
||||
# Variable 3 - Reuses intermediate
|
||||
class mo_tanf_need_standard(Variable):
|
||||
def formula(spm_unit, period, parameters):
|
||||
gross = spm_unit("mo_tanf_gross_income", period) # Reuse
|
||||
return where(gross < p.threshold, p.high, p.low)
|
||||
```
|
||||
|
||||
**When to create intermediate variables for reuse:**
|
||||
- ✅ Same calculation appears in 2+ variables
|
||||
- ✅ Represents a meaningful concept (e.g., "gross income", "net resources")
|
||||
- ✅ Simplifies maintenance (change once vs many places)
|
||||
- ✅ Follows DRY (Don't Repeat Yourself) principle
|
||||
|
||||
**When NOT to create (still a wrapper):**
|
||||
- ❌ Only used in ONE place
|
||||
- ❌ Just passes through another variable unchanged
|
||||
- ❌ Adds indirection without code reuse benefit
|
||||
|
||||
#### Red Flags for Unnecessary Wrapper Variables
|
||||
|
||||
```python
# ❌ INVALID - Pure wrapper, no state logic:
class in_tanf_assistance_unit_size(Variable):
    def formula(spm_unit, period):
        return spm_unit("spm_unit_size", period)  # Just returns federal

# ❌ INVALID - Aggregation without transformation:
class in_tanf_countable_unearned_income(Variable):
    def formula(tax_unit, period):
        return tax_unit.sum(tax_unit.members("tanf_gross_unearned_income", period))

# ❌ INVALID - Pass-through with no modification:
class in_tanf_gross_income(Variable):
    def formula(entity, period):
        return entity("tanf_gross_income", period)
```
|
||||
|
||||
#### Examples of VALID State Variables
|
||||
|
||||
```python
# ✅ VALID - Has state-specific disregard:
class in_tanf_countable_earned_income(Variable):
    def formula(spm_unit, period, parameters):
        # `in` is a Python keyword, so Indiana needs bracket access
        p = parameters(period).gov.states["in"].tanf.income
        earned = spm_unit("tanf_gross_earned_income", period)
        return earned * (1 - p.earned_income_disregard_rate)  # STATE LOGIC

# ✅ VALID - Uses state-specific limits:
class in_tanf_income_eligible(Variable):
    def formula(spm_unit, period, parameters):
        p = parameters(period).gov.states["in"].tanf
        income = spm_unit("tanf_countable_income", period)
        size = spm_unit("spm_unit_size", period.this_year)
        limit = p.income_limit[min_(size, p.max_household_size)]  # STATE PARAMS
        return income <= limit

# ✅ VALID - IL has different counting rules:
class il_tanf_assistance_unit_size(Variable):
    adds = [
        "il_tanf_payment_eligible_child",  # STATE-SPECIFIC
        "il_tanf_payment_eligible_parent",  # STATE-SPECIFIC
    ]
```
|
||||
|
||||
#### State Variables to AVOID Creating
|
||||
|
||||
For TANF implementations:
|
||||
|
||||
**❌ DON'T create these (use federal directly):**
|
||||
- `state_tanf_assistance_unit_size` (unless different counting rules like IL)
|
||||
- `state_tanf_countable_unearned_income` (unless state has disregards)
|
||||
- `state_tanf_gross_income` (just use federal baseline)
|
||||
- Any variable that's just `return entity("federal_variable", period)`
|
||||
|
||||
**✅ DO create these (when state has unique rules):**
|
||||
- `state_tanf_countable_earned_income` (if unique disregard %)
|
||||
- `state_tanf_income_eligible` (state income limits)
|
||||
- `state_tanf_maximum_benefit` (state payment standards)
|
||||
- `state_tanf` (final benefit calculation)
|
||||
|
||||
### Demographic Eligibility Pattern
|
||||
|
||||
**Option 1: Use Federal (Simplified)**
|
||||
```python
class ca_tanf_eligible(Variable):
    def formula(spm_unit, period, parameters):
        # Use federal variable
        has_eligible = spm_unit.any(
            spm_unit.members("is_demographic_tanf_eligible", period)
        )
        # Income check comes from its own variable (name assumed for illustration)
        income_eligible = spm_unit("ca_tanf_income_eligible", period)
        return has_eligible & income_eligible
```
|
||||
|
||||
**Option 2: State-Specific (Different thresholds)**
|
||||
```python
|
||||
class ca_tanf_demographic_eligible_person(Variable):
|
||||
def formula(person, period, parameters):
|
||||
p = parameters(period).gov.states.ca.tanf
|
||||
age = person("age", period.this_year) # NOT monthly_age
|
||||
|
||||
age_limit = where(
|
||||
person("is_full_time_student", period),
|
||||
p.age_threshold.student,
|
||||
p.age_threshold.minor_child
|
||||
)
|
||||
return age < age_limit
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Common Implementation Patterns
|
||||
|
||||
### Income Eligibility
|
||||
```python
|
||||
class program_income_eligible(Variable):
|
||||
value_type = bool
|
||||
entity = SPMUnit
|
||||
definition_period = MONTH
|
||||
|
||||
def formula(spm_unit, period, parameters):
|
||||
p = parameters(period).gov.states.xx.program
|
||||
income = spm_unit("program_countable_income", period)
|
||||
size = spm_unit("spm_unit_size", period.this_year)
|
||||
|
||||
# Get threshold from parameters
|
||||
threshold = p.income_limit[min_(size, p.max_household_size)]
|
||||
return income <= threshold
|
||||
```
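The `min_(size, p.max_household_size)` lookup caps the unit size at the largest size the parameter table defines. A plain-Python sketch of that idea follows. The limits table and cap are made-up numbers, and a dict stands in for the parameter node.

```python
# Hypothetical monthly income limits by unit size; sizes above 3 use the size-3 row.
INCOME_LIMIT = {1: 300, 2: 400, 3: 500}
MAX_HOUSEHOLD_SIZE = 3


def income_limit_for(size):
    # Cap the size so large units fall back to the last defined row.
    return INCOME_LIMIT[min(size, MAX_HOUSEHOLD_SIZE)]


def income_eligible(income, size):
    return income <= income_limit_for(size)


print(income_limit_for(2), income_limit_for(7))  # 400 500
print(income_eligible(450, 5))  # True
```

Without the cap, a unit of seven would raise a `KeyError` here. In a vectorized formula the same lookup would fail or return nothing useful for sizes past the end of the table.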
|
||||
|
||||
### Benefit Calculation
|
||||
```python
|
||||
class program_benefit(Variable):
|
||||
value_type = float
|
||||
entity = SPMUnit
|
||||
definition_period = MONTH
|
||||
unit = USD
|
||||
|
||||
def formula(spm_unit, period, parameters):
|
||||
p = parameters(period).gov.states.xx.program
|
||||
eligible = spm_unit("program_eligible", period)
|
||||
|
||||
# Calculate benefit amount
|
||||
base = p.benefit_schedule.base_amount
|
||||
adjustment = p.benefit_schedule.adjustment_rate
|
||||
size = spm_unit("spm_unit_size", period.this_year)
|
||||
|
||||
amount = base + (size - 1) * adjustment
|
||||
return where(eligible, amount, 0)
|
||||
```
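The benefit arithmetic above, checked with scalar numbers. The base amount and per-person adjustment are hypothetical, and a conditional expression replaces `where` because this sketch handles one unit at a time.

```python
def program_benefit(eligible, size, base_amount, adjustment_rate):
    # Base amount for the first person plus an increment for each additional member.
    amount = base_amount + (size - 1) * adjustment_rate
    # Ineligible units receive nothing.
    return amount if eligible else 0


print(program_benefit(True, 4, 300, 75))   # 525
print(program_benefit(False, 4, 300, 75))  # 0
```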
|
||||
|
||||
### Using Scale Parameters
|
||||
```python
|
||||
def formula(entity, period, parameters):
|
||||
p = parameters(period).gov.states.az.program
|
||||
federal_p = parameters(period).gov.hhs.fpg
|
||||
|
||||
# Federal base with state scale
|
||||
size = entity("household_size", period.this_year)
|
||||
fpg = federal_p.first_person + federal_p.additional * (size - 1)
|
||||
state_scale = p.income_limit_scale # Often exists
|
||||
income_limit = fpg * state_scale
|
||||
```
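A scalar check of the federal-base-times-state-scale pattern. The first-person amount (14,580) and the 2.0 multiplier come from the YAML example earlier in this document. The additional-person amount of 5,140 is an assumption added for illustration, and the function name is hypothetical.

```python
def fpg_income_limit(size, first_person, additional_person, state_scale):
    # Federal poverty guideline for the unit size.
    fpg = first_person + additional_person * (size - 1)
    # State policy expresses its limit as a multiple of the guideline.
    return fpg * state_scale


limit = fpg_income_limit(3, 14_580, 5_140, 2.0)
print(limit)  # 49720.0
```

Keeping the guideline amounts in federal parameters and only the multiplier in state parameters means a guideline update changes every state's limit without touching state files.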
|
||||
|
||||
---
|
||||
|
||||
## Variable Creation Checklist
|
||||
|
||||
Before creating any variable:
|
||||
- [ ] Check if it already exists
|
||||
- [ ] Use standard demographic variables (age, is_disabled)
|
||||
- [ ] Reuse federal calculations where applicable
|
||||
- [ ] Check for `household_income` before creating a new income variable
|
||||
- [ ] Look for existing intermediate variables
|
||||
- [ ] Study reference implementations
|
||||
|
||||
---
|
||||
|
||||
## Quality Standards
|
||||
|
||||
### Complete Implementation Requirements
|
||||
- All values from parameters (no hard-coding)
|
||||
- Complete formula logic
|
||||
- Proper entity aggregation
|
||||
- Correct period handling
|
||||
- Meaningful variable names
|
||||
- Proper metadata
|
||||
|
||||
### Anti-Patterns to Avoid
|
||||
- Copy-pasting logic between files
|
||||
- Hard-coding any numeric values
|
||||
- Creating duplicate income variables
|
||||
- State-specific versions of federal rules
|
||||
- Placeholder TODOs in production code
|
||||
|
||||
---
|
||||
|
||||
## Parameter-to-Variable Mapping Requirements
|
||||
|
||||
### Every Parameter Must Have a Variable
|
||||
|
||||
**CRITICAL: Complete implementation means every parameter is used!**
|
||||
|
||||
When you create parameters, you MUST create corresponding variables:
|
||||
|
||||
| Parameter Type | Required Variable(s) |
|
||||
|---------------|---------------------|
|
||||
| resources/limit | `state_program_resource_eligible` |
|
||||
| income/limit | `state_program_income_eligible` |
|
||||
| payment_standard | `state_program_maximum_benefit` |
|
||||
| income/disregard | `state_program_countable_earned_income` |
|
||||
| categorical/requirements | `state_program_categorically_eligible` |
|
||||
|
||||
### Complete Eligibility Formula
|
||||
|
||||
The main eligibility variable MUST combine ALL checks:
|
||||
|
||||
```python
|
||||
class state_program_eligible(Variable):
|
||||
def formula(spm_unit, period, parameters):
|
||||
income_eligible = spm_unit("state_program_income_eligible", period)
|
||||
resource_eligible = spm_unit("state_program_resource_eligible", period) # DON'T FORGET!
|
||||
categorical = spm_unit("state_program_categorically_eligible", period)
|
||||
|
||||
return income_eligible & resource_eligible & categorical
|
||||
```
|
||||
|
||||
**Common Implementation Failures:**
|
||||
- ❌ Created resource limit parameter but no resource_eligible variable
|
||||
- ❌ Main eligible variable only checks income, ignores resources
|
||||
- ❌ Parameters created but never referenced in any formula
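One rough way to catch the last failure is to search formula source text for each parameter name. The helper below is a hypothetical sketch, not part of PolicyEngine: it does a naive substring match, so treat a hit as a hint and an empty result as no guarantee.

```python
def find_orphaned_parameters(parameter_names, formula_sources):
    """Return parameter names that appear in none of the formula source strings."""
    return sorted(
        name
        for name in parameter_names
        if not any(name in source for source in formula_sources)
    )


# Hypothetical parameter leaf names and formula bodies.
parameters_created = ["income_limit", "resource_limit"]
formulas = ["return income <= p.income_limit"]

print(find_orphaned_parameters(parameters_created, formulas))  # ['resource_limit']
```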
|
||||
|
||||
---
|
||||
|
||||
## For Agents
|
||||
|
||||
When implementing variables:
|
||||
1. **Study reference implementations** (DC, IL, TX TANF)
|
||||
2. **Never hard-code values** - use parameters
|
||||
3. **Map every parameter to a variable** - no orphaned parameters
|
||||
4. **Complete ALL eligibility checks** - income AND resources AND categorical
|
||||
5. **Reuse existing variables** - avoid duplication
|
||||
6. **Use `adds` when possible** - cleaner than formula
|
||||
7. **Create intermediate variables** for complex logic
|
||||
8. **Follow metadata standards** exactly
|
||||
9. **Complete implementation** or delete the file
|
||||
@@ -0,0 +1,440 @@
|
||||
---
|
||||
name: policyengine-parameter-patterns
|
||||
description: PolicyEngine parameter patterns - YAML structure, naming conventions, metadata requirements, federal/state separation
|
||||
---
|
||||
|
||||
# PolicyEngine Parameter Patterns
|
||||
|
||||
Comprehensive patterns for creating PolicyEngine parameter files.
|
||||
|
||||
## Critical: Required Structure
|
||||
|
||||
Every parameter MUST have this exact structure:
|
||||
```yaml
description: [One sentence description].
values:
  YYYY-MM-DD: value

metadata:
  unit: [type]      # REQUIRED
  period: [period]  # REQUIRED
  label: [name]     # REQUIRED
  reference:        # REQUIRED
    - title: [source]
      href: [url]
```
|
||||
|
||||
**Missing ANY metadata field = validation error**
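A quick pre-commit check for the four required fields could look like the sketch below. It assumes the parameter file has already been loaded into a plain dict (for example by a YAML parser). The function name is hypothetical and this is not PolicyEngine's own validator.

```python
REQUIRED_METADATA = ("unit", "period", "label", "reference")


def missing_metadata(parameter):
    """Return the required metadata fields that are absent or empty."""
    metadata = parameter.get("metadata") or {}
    return [field for field in REQUIRED_METADATA if not metadata.get(field)]


# A parameter with only unit and label set.
incomplete = {
    "description": "State provides this amount as the monthly program benefit.",
    "values": {"2024-01-01": 500},
    "metadata": {"unit": "currency-USD", "label": "Benefit amount"},
}

print(missing_metadata(incomplete))  # ['period', 'reference']
```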
|
||||
|
||||
---
|
||||
|
||||
## 1. File Naming Conventions
|
||||
|
||||
### Study Reference Implementations First
|
||||
Before naming, examine:
|
||||
- DC TANF: `/parameters/gov/states/dc/dhs/tanf/`
|
||||
- IL TANF: `/parameters/gov/states/il/dhs/tanf/`
|
||||
- TX TANF: `/parameters/gov/states/tx/hhs/tanf/`
|
||||
|
||||
### Naming Patterns
|
||||
|
||||
**Dollar amounts → `/amount.yaml`**
|
||||
```
|
||||
income/deductions/work_expense/amount.yaml # $120
|
||||
resources/limit/amount.yaml # $6,000
|
||||
payment_standard/amount.yaml # $320
|
||||
```
|
||||
|
||||
**Percentages/rates → `/rate.yaml` or `/percentage.yaml`**
|
||||
```
|
||||
income_limit/rate.yaml # 1.85 (185% FPL)
|
||||
benefit_reduction/rate.yaml # 0.2 (20%)
|
||||
income/disregard/percentage.yaml # 0.67 (67%)
|
||||
```
|
||||
|
||||
**Thresholds → `/threshold.yaml`**
|
||||
```
|
||||
age_threshold/minor_child.yaml # 18
|
||||
age_threshold/elderly.yaml # 60
|
||||
income/threshold.yaml # 30_000
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 2. Description Field
|
||||
|
||||
### The ONLY Acceptable Formula
|
||||
|
||||
```yaml
|
||||
description: [State] [verb] [category] to [this X] under the [Full Program Name] program.
|
||||
```
|
||||
|
||||
**Components:**
|
||||
1. **[State]**: Full state name (Indiana, Texas, California)
|
||||
2. **[verb]**: ONLY use: limits, provides, sets, excludes, deducts, uses
|
||||
3. **[category]**: What's being limited/provided (gross income, resources, payment standard)
|
||||
4. **[this X]**: ALWAYS use generic placeholder
|
||||
- `this amount` (for currency-USD)
|
||||
- `this share` or `this percentage` (for rates/percentages)
|
||||
- `this threshold` (for age/counts)
|
||||
5. **[Full Program Name]**: ALWAYS spell out (Temporary Assistance for Needy Families, NOT TANF)
|
||||
|
||||
### Copy These Exact Templates
|
||||
|
||||
**For income limits:**
|
||||
```yaml
|
||||
description: [State] limits gross income to this amount under the Temporary Assistance for Needy Families program.
|
||||
```
|
||||
|
||||
**For resource limits:**
|
||||
```yaml
|
||||
description: [State] limits resources to this amount under the Temporary Assistance for Needy Families program.
|
||||
```
|
||||
|
||||
**For payment standards:**
|
||||
```yaml
|
||||
description: [State] provides this amount as the payment standard under the Temporary Assistance for Needy Families program.
|
||||
```
|
||||
|
||||
**For disregards:**
|
||||
```yaml
|
||||
description: [State] excludes this share of earnings from countable income under the Temporary Assistance for Needy Families program.
|
||||
```
|
||||
|
||||
### Description Validation Checklist
|
||||
|
||||
Run this check on EVERY description:
|
||||
```python
# Pseudo-code validation
def validate_description(desc):
    # Accepted generic placeholders, including `this threshold` for ages/counts
    placeholders = ("this amount", "this share", "this percentage", "this threshold")
    checks = [
        desc.count('.') == 1,  # Exactly one sentence
        'TANF' not in desc,  # No acronyms
        'SNAP' not in desc,  # No acronyms
        any(placeholder in desc for placeholder in placeholders),
        'under the' in desc and 'program' in desc,
        'by household size' not in desc,  # No explanatory text
        'based on' not in desc,  # No explanatory text
        'for eligibility' not in desc,  # Redundant
    ]
    return all(checks)
```
|
||||
|
||||
**CRITICAL: Always spell out full program names in descriptions!**
|
||||
|
||||
---
|
||||
|
||||
## 3. Values Section
|
||||
|
||||
### Format Rules
|
||||
```yaml
|
||||
values:
|
||||
2024-01-01: 3_000 # Use underscores
|
||||
# NOT: 3000
|
||||
|
||||
2024-01-01: 0.2 # Remove trailing zeros
|
||||
# NOT: 0.20 or 0.200
|
||||
|
||||
2024-01-01: 2 # No decimals for integers
|
||||
# NOT: 2.0 or 2.00
|
||||
```
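The three formatting rules can be applied mechanically. The helper below is a hypothetical sketch that turns a number into the text form these rules ask for. It relies on Python's `_` grouping option in format specifications.

```python
def format_parameter_value(value):
    """Render a number with underscore separators and no redundant decimals."""
    number = float(value)
    if number.is_integer():
        # Whole numbers: drop the decimal part, group thousands with underscores.
        return f"{int(number):_}"
    # Non-integers: Python's default float formatting already drops trailing zeros.
    return f"{number:_}"


print(format_parameter_value(3000))  # 3_000
print(format_parameter_value(0.20))  # 0.2
print(format_parameter_value(2.0))   # 2
```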
|
||||
|
||||
### Effective Dates
|
||||
|
||||
**Use exact dates from sources:**
|
||||
```yaml
|
||||
# If source says "effective July 1, 2023"
|
||||
2023-07-01: value
|
||||
|
||||
# If source says "as of October 1"
|
||||
2024-10-01: value
|
||||
|
||||
# NOT arbitrary dates:
|
||||
2000-01-01: value # Shows no research
|
||||
```
|
||||
|
||||
**Date format:** `YYYY-MM-01` (always use 01 for day)
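A small check for the date-key rule, as a sketch using only the standard library. The function name is hypothetical. It accepts a key only if it is a real calendar date written as `YYYY-MM-DD` with day `01`.

```python
import re
from datetime import date


def is_valid_effective_date(key):
    # Require the exact YYYY-MM-DD shape before parsing.
    if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", key):
        return False
    try:
        parsed = date.fromisoformat(key)
    except ValueError:
        # Shape was right but the date does not exist (e.g. month 13).
        return False
    return parsed.day == 1


print(is_valid_effective_date("2023-07-01"))  # True
print(is_valid_effective_date("2023-07-15"))  # False
```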
|
||||
|
||||
---
|
||||
|
||||
## 4. Metadata Fields (ALL REQUIRED)
|
||||
|
||||
### unit
|
||||
Common units:
|
||||
- `currency-USD` - Dollar amounts
|
||||
- `/1` - Rates, percentages (as decimals)
|
||||
- `month` - Number of months
|
||||
- `year` - Age in years
|
||||
- `bool` - True/false
|
||||
- `person` - Count of people
|
||||
|
||||
### period
|
||||
- `year` - Annual values
|
||||
- `month` - Monthly values
|
||||
- `day` - Daily values
|
||||
- `eternity` - Never changes
|
||||
|
||||
### label
|
||||
Pattern: `[State] [PROGRAM] [description]`
|
||||
```yaml
|
||||
label: Montana TANF minor child age threshold
|
||||
label: Illinois TANF earned income disregard rate
|
||||
label: California SNAP resource limit
|
||||
```
|
||||
**Rules:**
|
||||
- Spell out state name
|
||||
- Abbreviate program (TANF, SNAP)
|
||||
- No period at end
|
||||
|
||||
### reference
|
||||
**Requirements:**
|
||||
1. At least one source (prefer two)
|
||||
2. Must contain the actual value
|
||||
3. Legal codes need subsections
|
||||
4. PDFs need page anchors
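Requirement 4 is easy to check mechanically. The sketch below uses a hypothetical helper name and the standard library's `urllib.parse`, and flags PDF links that lack a `#page=` fragment. It cannot judge whether the source actually contains the value, so requirements 2 and 3 still need a human read.

```python
from urllib.parse import urlparse


def reference_problems(reference):
    """Return a list of mechanical problems found in one reference entry."""
    problems = []
    href = reference.get("href", "")
    if not href:
        problems.append("missing href")
        return problems
    parsed = urlparse(href)
    # PDFs should deep-link to the page that holds the value.
    if parsed.path.lower().endswith(".pdf") and not parsed.fragment.startswith("page="):
        problems.append("PDF link has no #page= anchor")
    return problems


# Placeholder URLs for illustration only.
print(reference_problems({"title": "Manual, page 15", "href": "https://example.gov/manual.pdf#page=15"}))  # []
print(reference_problems({"title": "Manual", "href": "https://example.gov/manual.pdf"}))
```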
|
||||
|
||||
```yaml
# ✅ GOOD:
reference:
  - title: Idaho Admin Code 16.05.03.205(3)
    href: https://adminrules.idaho.gov/rules/current/16/160503.pdf#page=14
  - title: Idaho LIHEAP Guidelines, Section 3, page 8
    href: https://healthandwelfare.idaho.gov/guidelines.pdf#page=8

# ❌ BAD:
reference:
  - title: Federal LIHEAP regulations  # Too generic
    href: https://www.acf.hhs.gov/ocs  # No specific section
```
|
||||
|
||||
---
|
||||
|
||||
## 5. Federal/State Separation
|
||||
|
||||
### Federal Parameters
|
||||
Location: `/parameters/gov/{agency}/{program}/`
|
||||
```yaml
|
||||
# parameters/gov/hhs/fpg/first_person.yaml
|
||||
description: HHS sets this amount as the federal poverty guideline for one person.
|
||||
```
|
||||
|
||||
### State Parameters
|
||||
Location: `/parameters/gov/states/{state}/{agency}/{program}/`
|
||||
```yaml
|
||||
# parameters/gov/states/ca/dss/tanf/income_limit/rate.yaml
|
||||
description: California limits income to this share of the federal poverty guideline under the Temporary Assistance for Needy Families program.
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 6. Common Parameter Patterns
|
||||
|
||||
### Income Limits (as FPL multiplier)
|
||||
```yaml
|
||||
# income_limit/rate.yaml
|
||||
description: State uses this multiplier of the federal poverty guideline for program income limits.
|
||||
values:
|
||||
2024-01-01: 1.85 # 185% FPL
|
||||
|
||||
metadata:
|
||||
unit: /1
|
||||
period: year
|
||||
label: State PROGRAM income limit multiplier
|
||||
```
|
||||
|
||||
### Benefit Amounts
|
||||
```yaml
|
||||
# payment_standard/amount.yaml
|
||||
description: State provides this amount as the monthly program benefit.
|
||||
values:
|
||||
2024-01-01: 500
|
||||
|
||||
metadata:
|
||||
unit: currency-USD
|
||||
period: month
|
||||
label: State PROGRAM payment standard amount
|
||||
```
|
||||
|
||||
### Age Thresholds
|
||||
```yaml
|
||||
# age_threshold/minor_child.yaml
|
||||
description: State defines minor children as under this age for program eligibility.
|
||||
values:
|
||||
2024-01-01: 18
|
||||
|
||||
metadata:
|
||||
unit: year
|
||||
period: eternity
|
||||
label: State PROGRAM minor child age threshold
|
||||
```
|
||||
|
||||
### Disregard Percentages
|
||||
```yaml
|
||||
# income/disregard/percentage.yaml
|
||||
description: State excludes this share of earned income from program calculations.
|
||||
values:
|
||||
2024-01-01: 0.67 # 67%
|
||||
|
||||
metadata:
|
||||
unit: /1
|
||||
period: eternity
|
||||
label: State PROGRAM earned income disregard percentage
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 7. Validation Checklist
|
||||
|
||||
Before creating parameters:
|
||||
- [ ] Studied reference implementations (DC, IL, TX)
|
||||
- [ ] All four metadata fields present
|
||||
- [ ] Description is one complete sentence
|
||||
- [ ] Values use underscore separators
|
||||
- [ ] Trailing zeros removed from decimals
|
||||
- [ ] References include subsections and page numbers
|
||||
- [ ] Label follows naming pattern
|
||||
- [ ] Effective date matches source document
|
||||
|
||||
---
|
||||
|
||||
## 8. Common Mistakes to Avoid
|
||||
|
||||
### Missing Metadata
|
||||
```yaml
|
||||
❌ WRONG - Missing required fields:
|
||||
metadata:
|
||||
unit: currency-USD
|
||||
label: Benefit amount
|
||||
# Missing: period, reference
|
||||
```
|
||||
|
||||
### Generic References
|
||||
```yaml
|
||||
❌ WRONG:
|
||||
reference:
|
||||
- title: State TANF Manual
|
||||
href: https://state.gov/tanf
|
||||
|
||||
✅ CORRECT:
|
||||
reference:
|
||||
- title: State TANF Manual Section 5.2, page 15
|
||||
href: https://state.gov/tanf-manual.pdf#page=15
|
||||
```
|
||||
|
||||
### Arbitrary Dates
|
||||
```yaml
|
||||
❌ WRONG:
|
||||
values:
|
||||
2000-01-01: 500 # Lazy default
|
||||
|
||||
✅ CORRECT:
|
||||
values:
|
||||
2023-07-01: 500 # From source: "effective July 1, 2023"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Real-World Examples from Production Code
|
||||
|
||||
**CRITICAL: Study actual parameter files, not just examples!**
|
||||
|
||||
Before writing ANY parameter:
|
||||
1. Open and READ 3+ similar parameter files from TX/IL/DC
|
||||
2. COPY their exact description pattern
|
||||
3. Replace state name and specific details only
|
||||
|
||||
### Payment Standards
|
||||
```yaml
|
||||
# Texas (actual production)
|
||||
description: Texas provides this amount as the payment standard under the Temporary Assistance for Needy Families program.
|
||||
|
||||
# Pennsylvania (actual production)
|
||||
description: Pennsylvania limits TANF benefits to households with resources at or below this amount.
|
||||
```
|
||||
|
||||
### Income Limits
|
||||
```yaml
|
||||
# Indiana (should be)
|
||||
description: Indiana limits gross income to this amount under the Temporary Assistance for Needy Families program.
|
||||
|
||||
# Texas (actual production)
|
||||
description: Texas limits countable resources to this amount under the Temporary Assistance for Needy Families program.
|
||||
```
|
||||
|
||||
### Disregards
|
||||
```yaml
|
||||
# Indiana (should be)
|
||||
description: Indiana excludes this share of earnings from countable income under the Temporary Assistance for Needy Families program.
|
||||
|
||||
# Texas (actual production)
|
||||
description: Texas deducts this standard work expense amount from gross earned income for Temporary Assistance for Needy Families program calculations.
|
||||
```
|
||||
|
||||
### Pattern Analysis
|
||||
- **ALWAYS** spell out full program name
|
||||
- Use "under the [Program] program" or "for [Program] program calculations"
|
||||
- One simple verb (limits, provides, excludes, deducts)
|
||||
- One "this X" placeholder
|
||||
- NO extra explanation ("based on X", "This is Y")
|
||||
|
||||
### Common Description Mistakes to AVOID
|
||||
|
||||
**❌ WRONG - Using acronyms:**
|
||||
```yaml
|
||||
description: Indiana sets this gross income limit for TANF eligibility by household size.
|
||||
# Problems: "TANF" not spelled out, unnecessary "by household size"
|
||||
```
|
||||
|
||||
**✅ CORRECT:**
|
||||
```yaml
|
||||
description: Indiana limits gross income to this amount under the Temporary Assistance for Needy Families program.
|
||||
```
|
||||
|
||||
**❌ WRONG - Adding explanatory text:**
|
||||
```yaml
|
||||
description: Indiana provides this payment standard amount based on household size.
|
||||
# Problem: "based on household size" is unnecessary (evident from breakdown)
|
||||
```
|
||||
|
||||
**✅ CORRECT:**
|
||||
```yaml
|
||||
description: Indiana provides this amount as the payment standard under the Temporary Assistance for Needy Families program.
|
||||
```
|
||||
|
||||
**❌ WRONG - Missing program context:**
|
||||
```yaml
|
||||
description: Indiana sets the gross income limit.
|
||||
# Problem: No program name, no "this amount"
|
||||
```
|
||||
|
||||
**✅ CORRECT:**
|
||||
```yaml
|
||||
description: Indiana limits gross income to this amount under the Temporary Assistance for Needy Families program.
|
||||
```
|
||||
|
||||
### Authoritative Source Requirements
|
||||
|
||||
**ONLY use official government sources:**
|
||||
- ✅ State codes and administrative regulations
|
||||
- ✅ Official state agency websites (.gov domains)
|
||||
- ✅ Federal regulations (CFR, USC)
|
||||
- ✅ State plans and official manuals (.gov PDFs)
|
||||
|
||||
**NEVER use:**
|
||||
- ❌ Third-party guides (singlemotherguide.com, benefits.gov descriptions)
|
||||
- ❌ Wikipedia
|
||||
- ❌ Nonprofit summaries (unless no official source exists)
|
||||
- ❌ News articles
|
||||
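As a first-pass filter for the source rules above, a reference URL can be checked for a `.gov` host. This is a rough heuristic under stated assumptions: `is_official_host` and the `blocked_hosts` default are hypothetical, and official sources on other domains (for example state `.us` sites) would need manual review.

```python
from urllib.parse import urlparse


def is_official_host(url, blocked_hosts=("benefits.gov",)):
    """Heuristic: True when the URL's host is a .gov domain that is not blocked."""
    host = (urlparse(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    # Some .gov sites only publish summaries, so allow an explicit block list.
    if any(host == b or host.endswith("." + b) for b in blocked_hosts):
        return False
    return host.endswith(".gov")


print(is_official_host("https://adminrules.idaho.gov/rules/current/16/160503.pdf#page=14"))
print(is_official_host("https://singlemotherguide.com/tanf"))
```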
|
||||
---
|
||||
|
||||
## For Agents
|
||||
|
||||
When creating parameters:
|
||||
1. **READ ACTUAL FILES** - Study TX/IL/DC parameter files, not just skill examples
|
||||
2. **Include ALL metadata fields** - missing any causes errors
|
||||
3. **Use exact effective dates** from sources
|
||||
4. **Follow naming conventions** (amount/rate/threshold)
|
||||
5. **Write simple descriptions** with "this" placeholders and full program names
|
||||
6. **Include ONLY official government references** with subsections and pages
|
||||
7. **Format values properly** (underscores, no trailing zeros)
|
||||
@@ -0,0 +1,478 @@
|
||||
---
|
||||
name: policyengine-period-patterns
|
||||
description: PolicyEngine period handling - converting between YEAR, MONTH definition periods and testing patterns
|
||||
---
|
||||
|
||||
# PolicyEngine Period Patterns
|
||||
|
||||
Essential patterns for handling different definition periods (YEAR, MONTH) in PolicyEngine.
|
||||
|
||||
## Quick Reference
|
||||
|
||||
| From | To | Method | Example |
|
||||
|------|-----|--------|---------|
|
||||
| MONTH formula | YEAR variable | `period.this_year` | `age = person("age", period.this_year)` |
|
||||
| YEAR formula | MONTH variable | `period.first_month` | `person("monthly_rent", period.first_month)` |
|
||||
| Any | Year integer | `period.start.year` | `year = period.start.year` |
|
||||
| Any | Month integer | `period.start.month` | `month = period.start.month` |
|
||||
| Annual amount | Monthly amount | `/ MONTHS_IN_YEAR` | `monthly = annual / MONTHS_IN_YEAR` |
|
||||
| Monthly amount | Annual amount | `* MONTHS_IN_YEAR` | `annual = monthly * MONTHS_IN_YEAR` |
|
||||
|
||||
---
|
||||
|
||||
## 1. Definition Periods in PolicyEngine US
|
||||
|
||||
### Available Periods
|
||||
- **YEAR**: Annual values (most common - 2,883 variables)
|
||||
- **MONTH**: Monthly values (395 variables)
|
||||
- **ETERNITY**: Never changes (1 variable - structural relationships)
|
||||
|
||||
**Note:** QUARTER is NOT used in PolicyEngine US
|
||||
|
||||
### Examples
|
||||
```python
|
||||
from policyengine_us.model_api import *
|
||||
|
||||
class annual_income(Variable):
|
||||
definition_period = YEAR # Annual amount
|
||||
|
||||
class monthly_benefit(Variable):
|
||||
definition_period = MONTH # Monthly amount
|
||||
|
||||
class is_head(Variable):
|
||||
definition_period = ETERNITY # Never changes
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 2. The Golden Rule
|
||||
|
||||
**When accessing a variable with a different definition period than your formula, you must specify the target period explicitly.**
|
||||
|
||||
```python
|
||||
# ✅ CORRECT - MONTH formula accessing YEAR variable
|
||||
def formula(person, period, parameters):
|
||||
age = person("age", period.this_year) # Gets actual age
|
||||
|
||||
# ❌ WRONG - Would get age/12
|
||||
def formula(person, period, parameters):
|
||||
age = person("age", period) # BAD: gives age divided by 12!
|
||||
```
|
||||
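To make the rule concrete, here is a toy model of the behaviour this section describes. It is not policyengine-core code: `ToyPeriod` and `toy_calculate` are hypothetical stand-ins that only mimic "a YEAR value requested for a MONTH period is divided by 12".

```python
from dataclasses import dataclass

MONTHS_IN_YEAR = 12


@dataclass(frozen=True)
class ToyPeriod:
    unit: str  # "year" or "month"
    year: int
    month: int = 1

    @property
    def this_year(self):
        # The calendar year containing this period.
        return ToyPeriod("year", self.year)

    @property
    def first_month(self):
        # January of this period's year.
        return ToyPeriod("month", self.year, 1)


def toy_calculate(annual_value, requested):
    """Value of a YEAR-defined variable for the requested period (toy rule)."""
    if requested.unit == "month":
        return annual_value / MONTHS_IN_YEAR
    return annual_value


january = ToyPeriod("month", 2024, 1)
print(toy_calculate(30, january))            # age requested with a MONTH period
print(toy_calculate(30, january.this_year))  # age requested with period.this_year
```

In this model the first call yields 2.5 and the second yields 30, which is the difference the golden rule is guarding against.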
|
||||
---
|
||||
|
||||
## 3. Common Patterns
|
||||
|
||||
### Pattern 1: MONTH Formula Accessing YEAR Variable
|
||||
|
||||
**Use Case**: Monthly benefits need annual demographic data
|
||||
|
||||
```python
|
||||
class monthly_benefit_eligible(Variable):
|
||||
value_type = bool
|
||||
entity = Person
|
||||
definition_period = MONTH # Monthly eligibility
|
||||
|
||||
def formula(person, period, parameters):
|
||||
# Age is YEAR-defined, use period.this_year
|
||||
age = person("age", period.this_year) # ✅ Gets full age
|
||||
|
||||
# is_pregnant is MONTH-defined, just use period
|
||||
is_pregnant = person("is_pregnant", period) # ✅ Same period
|
||||
|
||||
return (age < 18) | is_pregnant
|
||||
```
|
||||
|
||||
### Pattern 2: Accessing Stock Variables (Assets)
|
||||
|
||||
**Stock variables** (point-in-time values like assets) are typically YEAR-defined
|
||||
|
||||
```python
|
||||
class tanf_countable_resources(Variable):
|
||||
value_type = float
|
||||
entity = SPMUnit
|
||||
definition_period = MONTH # Monthly check
|
||||
|
||||
def formula(spm_unit, period, parameters):
|
||||
# Assets are stocks (YEAR-defined)
|
||||
cash = spm_unit("cash_assets", period.this_year) # ✅
|
||||
vehicles = spm_unit("vehicles_value", period.this_year) # ✅
|
||||
|
||||
p = parameters(period).gov.tanf.resources
|
||||
return cash + max_(0, vehicles - p.vehicle_exemption)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 4. Understanding Auto-Conversion: When to Use `period` vs `period.this_year`
|
||||
|
||||
### The Key Question
|
||||
|
||||
**When accessing a YEAR variable from a MONTH formula, should the value be divided by 12?**
|
||||
|
||||
- **If YES** → Use `period` (let auto-conversion happen)
|
||||
- **If NO** → Use `period.this_year` (prevent auto-conversion)
|
||||
|
||||
### When Auto-Conversion Makes Sense (Use `period`)
|
||||
|
||||
**Flow variables** where you want the monthly portion:
|
||||
|
||||
```python
|
||||
class monthly_benefit(Variable):
|
||||
definition_period = MONTH
|
||||
|
||||
def formula(person, period, parameters):
|
||||
# ✅ Use period - want $2,000/month from $24,000/year
|
||||
monthly_income = person("employment_income", period)
|
||||
|
||||
# Compare to monthly threshold
|
||||
p = parameters(period).gov.program
|
||||
return monthly_income < p.monthly_threshold
|
||||
```
|
||||
|
||||
Why: If annual income is $24,000, you want $2,000/month for monthly eligibility checks.
|
||||
|
||||
### When Auto-Conversion Breaks Things (Use `period.this_year`)
|
||||
|
||||
**Stock variables and counts** where division by 12 is nonsensical:
|
||||
|
||||
**1. Age**
|
||||
```python
|
||||
# ❌ WRONG - gives age/12
|
||||
age = person("age", period) # 30 years → 2.5 "monthly age" ???
|
||||
|
||||
# ✅ CORRECT - gives actual age
|
||||
age = person("age", period.this_year) # 30 years
|
||||
```
|
||||
|
||||
**2. Assets/Resources (Stocks)**
|
||||
```python
|
||||
# ❌ WRONG - gives assets/12
|
||||
assets = spm_unit("spm_unit_assets", period) # $12,000 → $1,000 ???
|
||||
|
||||
# ✅ CORRECT - gives point-in-time value
|
||||
assets = spm_unit("spm_unit_assets", period.this_year) # $12,000
|
||||
```
|
||||
|
||||
**3. Counts (Household Size, Number of Children)**
|
||||
```python
|
||||
# ❌ WRONG - gives count/12
|
||||
size = spm_unit("household_size", period) # 4 people → 0.33 people ???
|
||||
|
||||
# ✅ CORRECT - gives actual count
|
||||
size = spm_unit("household_size", period.this_year) # 4 people
|
||||
```
|
||||
|
||||
**4. Boolean/Enum Variables**
|
||||
```python
|
||||
# ❌ WRONG - weird fractional conversion
|
||||
status = person("is_disabled", period)
|
||||
|
||||
# ✅ CORRECT - actual status
|
||||
status = person("is_disabled", period.this_year)
|
||||
```
|
||||
|
||||
### Decision Tree
|
||||
|
||||
```
|
||||
Accessing YEAR variable from MONTH formula?
|
||||
│
|
||||
├─ Is it an INCOME or FLOW variable?
|
||||
│ └─ YES → Use period (auto-convert to monthly) ✅
|
||||
│ Example: employment_income, self_employment_income
|
||||
│
|
||||
└─ Is it AGE, ASSET, COUNT, or BOOLEAN?
|
||||
└─ YES → Use period.this_year (prevent conversion) ✅
|
||||
Examples: age, assets, household_size, is_disabled
|
||||
```
|
||||
|
||||
### Complete Example
|
||||
|
||||
```python
|
||||
class monthly_tanf_eligible(Variable):
|
||||
value_type = bool
|
||||
entity = Person
|
||||
definition_period = MONTH
|
||||
|
||||
def formula(person, period, parameters):
|
||||
# Age: Use period.this_year (don't want age/12)
|
||||
age = person("age", period.this_year) # ✅
|
||||
|
||||
# Assets: Use period.this_year (don't want assets/12)
|
||||
assets = person("assets", period.this_year) # ✅
|
||||
|
||||
# Income: Use period (DO want monthly income from annual)
|
||||
monthly_income = person("employment_income", period) # ✅
|
||||
|
||||
p = parameters(period).gov.tanf.eligibility
|
||||
|
||||
age_eligible = (age >= 18) & (age <= 64)
|
||||
asset_eligible = assets <= p.asset_limit
|
||||
income_eligible = monthly_income <= p.monthly_income_limit
|
||||
|
||||
return age_eligible & asset_eligible & income_eligible
|
||||
```
|
||||
|
||||
### Quick Reference for Auto-Conversion
|
||||
|
||||
| Variable Type | Use `period` | Use `period.this_year` | Why |
|
||||
|--------------|-------------|----------------------|-----|
|
||||
| Income (flow) | ✅ | ❌ | Want monthly portion |
|
||||
| Age | ❌ | ✅ | Age/12 is meaningless |
|
||||
| Assets/Resources (stock) | ❌ | ✅ | Point-in-time value |
|
||||
| Household size/counts | ❌ | ✅ | Can't divide people |
|
||||
| Boolean/status flags | ❌ | ✅ | True/12 is nonsense |
|
||||
| Demographic attributes | ❌ | ✅ | Properties don't divide |
|
||||
|
||||
**Rule of thumb:** If dividing by 12 makes the value meaningless → use `period.this_year`
|
||||
|
||||
### Pattern 3: Converting Annual to Monthly
|
||||
|
||||
```python
|
||||
class monthly_income_limit(Variable):
|
||||
definition_period = MONTH
|
||||
|
||||
def formula(household, period, parameters):
|
||||
# Get annual parameter
|
||||
annual_limit = parameters(period).gov.program.annual_limit
|
||||
|
||||
# Convert to monthly
|
||||
monthly_limit = annual_limit / MONTHS_IN_YEAR # ✅
|
||||
|
||||
return monthly_limit
|
||||
```
|
||||
|
||||
### Pattern 4: Getting Period Components
|
||||
|
||||
```python
|
||||
class federal_poverty_guideline(Variable):
|
||||
definition_period = MONTH
|
||||
|
||||
def formula(entity, period, parameters):
|
||||
# Get year and month as integers
|
||||
year = period.start.year # e.g., 2024
|
||||
month = period.start.month # e.g., 1-12
|
||||
|
||||
# FPG updates October 1st
|
||||
if month >= 10:
|
||||
instant_str = f"{year}-10-01"
|
||||
else:
|
||||
instant_str = f"{year - 1}-10-01"
|
||||
|
||||
# Access parameters at specific date
|
||||
p_fpg = parameters(instant_str).gov.hhs.fpg
|
||||
return p_fpg.first_person / MONTHS_IN_YEAR
|
||||
```
|
||||
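The date selection in Pattern 4 can be isolated and tested without a simulation. A minimal sketch, assuming the October 1 update rule stated in the example above; `fpg_instant` is a hypothetical helper name.

```python
from datetime import date


def fpg_instant(year, month, update_month=10):
    """ISO date of the most recent update on or before the given month."""
    effective_year = year if month >= update_month else year - 1
    return date(effective_year, update_month, 1).isoformat()


print(fpg_instant(2024, 11))
print(fpg_instant(2024, 3))
```

Keeping the date logic in plain integers also keeps the `if` out of array-valued code, since `period.start.month` is a scalar.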
|
||||
---
|
||||
|
||||
## 5. Parameter Access
|
||||
|
||||
### Standard Access
|
||||
```python
|
||||
def formula(entity, period, parameters):
|
||||
# Parameters use current period
|
||||
p = parameters(period).gov.program.benefit
|
||||
return p.amount
|
||||
```
|
||||
|
||||
### Specific Date Access
|
||||
```python
|
||||
def formula(entity, period, parameters):
|
||||
# Access parameters at specific instant
|
||||
p = parameters("2024-10-01").gov.hhs.fpg
|
||||
return p.amount
|
||||
```
|
||||
|
||||
**Important**: Never use `parameters(period.this_year)`. Pass the formula's own `period`, or an explicit instant string such as `"2024-10-01"` when a rule takes effect on a specific date.
|
||||
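Why the instant matters: a dated parameter resolves to the most recent value on or before the requested date. The sketch below is a toy lookup, not the PolicyEngine parameter API; `value_at` and the sample values are made up for illustration.

```python
from bisect import bisect_right


def value_at(dated_values, instant):
    """Latest value whose start date is on or before `instant`.

    `dated_values` is a list of (ISO date, value) pairs sorted by date.
    ISO date strings sort chronologically, so plain string comparison works.
    """
    dates = [start for start, _ in dated_values]
    index = bisect_right(dates, instant)
    if index == 0:
        raise ValueError(f"no value defined on or before {instant}")
    return dated_values[index - 1][1]


# Hypothetical parameter history.
history = [("2023-10-01", 100), ("2024-10-01", 110)]
print(value_at(history, "2024-01-01"))
print(value_at(history, "2024-10-01"))
```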
|
||||
---
|
||||
|
||||
## 6. Testing with Different Periods
|
||||
|
||||
### Critical Testing Rules
|
||||
|
||||
**For MONTH period tests** (`period: 2025-01`):
|
||||
- **Input** YEAR variables as **annual amounts**
|
||||
- **Output** YEAR variables show **monthly values** (÷12)
|
||||
|
||||
### Test Examples
|
||||
|
||||
**Example 1: Basic MONTH Test**
|
||||
```yaml
|
||||
- name: Monthly income test
|
||||
period: 2025-01 # MONTH period
|
||||
input:
|
||||
people:
|
||||
person1:
|
||||
employment_income: 12_000 # Input: Annual
|
||||
output:
|
||||
employment_income: 1_000 # Output: Monthly (12_000/12)
|
||||
```
|
||||
|
||||
**Example 2: Mixed Variables**
|
||||
```yaml
|
||||
- name: Eligibility with age and income
|
||||
period: 2024-01 # MONTH period
|
||||
input:
|
||||
age: 30 # Age doesn't convert
|
||||
employment_income: 24_000 # Annual input
|
||||
output:
|
||||
age: 30 # Age stays same
|
||||
employment_income: 2_000 # Monthly output
|
||||
monthly_eligible: true
|
||||
```
|
||||
|
||||
**Example 3: YEAR Period Test**
|
||||
```yaml
|
||||
- name: Annual calculation
|
||||
period: 2024 # YEAR period
|
||||
input:
|
||||
employment_income: 18_000 # Annual
|
||||
output:
|
||||
employment_income: 18_000 # Annual output
|
||||
annual_tax: 2_000
|
||||
```
|
||||
|
||||
### Testing Best Practices
|
||||
|
||||
1. **Always specify period explicitly**
|
||||
2. **Input YEAR variables as annual amounts**
|
||||
3. **Expect monthly output for YEAR variables in MONTH tests**
|
||||
4. **Use underscore separators**: `12_000` not `12000`
|
||||
5. **Add calculation comments** in integration tests
|
||||
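Expected outputs for MONTH tests can be derived rather than typed by hand. A small sketch, assuming whole-dollar results; `expected_monthly` and `yaml_int` are hypothetical helpers, and `MONTHS_IN_YEAR` is redefined locally so the snippet runs on its own.

```python
MONTHS_IN_YEAR = 12


def expected_monthly(annual_amount):
    """Monthly share of an annual flow, as a MONTH-period test would report it."""
    return annual_amount / MONTHS_IN_YEAR


def yaml_int(value):
    """Format a whole number with underscore separators, e.g. 12000 -> 12_000."""
    return f"{int(value):_}"


annual_input = 12_000
print(f"employment_income: {yaml_int(annual_input)}  # input, annual")
print(f"employment_income: {yaml_int(expected_monthly(annual_input))}  # output, monthly")
```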
|
||||
---
|
||||
|
||||
## 7. Common Mistakes and Solutions
|
||||
|
||||
### ❌ Mistake 1: Not Using period.this_year
|
||||
```python
|
||||
# WRONG - From MONTH formula
|
||||
def formula(person, period, parameters):
|
||||
age = person("age", period) # Gets age/12!
|
||||
|
||||
# CORRECT
|
||||
def formula(person, period, parameters):
|
||||
age = person("age", period.this_year) # Gets actual age
|
||||
```
|
||||
|
||||
### ❌ Mistake 2: Mixing Annual and Monthly
|
||||
```python
|
||||
# WRONG - Comparing different units
|
||||
monthly_income = person("monthly_income", period)
|
||||
annual_limit = parameters(period).gov.limit
|
||||
eligible = monthly_income < annual_limit # BAD comparison: monthly vs. annual
|
||||
|
||||
# CORRECT - Convert to same units
|
||||
monthly_income = person("monthly_income", period)
|
||||
annual_limit = parameters(period).gov.limit
|
||||
monthly_limit = annual_limit / MONTHS_IN_YEAR
|
||||
eligible = monthly_income < monthly_limit # Good comparison: both monthly
|
||||
```
|
||||
|
||||
### ❌ Mistake 3: Wrong Test Expectations
|
||||
```yaml
|
||||
# WRONG - Expecting annual in MONTH test
|
||||
period: 2024-01
|
||||
input:
|
||||
employment_income: 12_000
|
||||
output:
|
||||
employment_income: 12_000 # Wrong!
|
||||
|
||||
# CORRECT
|
||||
period: 2024-01
|
||||
input:
|
||||
employment_income: 12_000 # Annual input
|
||||
output:
|
||||
employment_income: 1_000 # Monthly output
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 8. Quick Patterns Cheat Sheet
|
||||
|
||||
### Accessing Variables
|
||||
| Your Formula | Target Variable | Use |
|
||||
|--------------|-----------------|-----|
|
||||
| MONTH | YEAR | `period.this_year` |
|
||||
| YEAR | MONTH | `period.first_month` |
|
||||
| Any | ETERNITY | `period` |
|
||||
|
||||
### Common Variables That Need period.this_year
|
||||
- `age`
|
||||
- `household_size`, `spm_unit_size`
|
||||
- `cash_assets`, `vehicles_value`
|
||||
- `state_name`, `state_code`
|
||||
- Any demographic variable
|
||||
|
||||
### Period Conversion
|
||||
```python
|
||||
# Annual to monthly
|
||||
monthly = annual / MONTHS_IN_YEAR
|
||||
|
||||
# Monthly to annual
|
||||
annual = monthly * MONTHS_IN_YEAR
|
||||
|
||||
# Get year/month numbers
|
||||
year = period.start.year # 2024
|
||||
month = period.start.month # 1-12
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 9. Real-World Example
|
||||
|
||||
```python
|
||||
class tanf_income_eligible(Variable):
|
||||
value_type = bool
|
||||
entity = SPMUnit
|
||||
definition_period = MONTH # Monthly eligibility
|
||||
|
||||
def formula(spm_unit, period, parameters):
|
||||
# YEAR variables need period.this_year
|
||||
household_size = spm_unit("spm_unit_size", period.this_year)
|
||||
state = spm_unit.household("state_code", period.this_year)
|
||||
|
||||
# MONTH variables use period
|
||||
gross_income = spm_unit("tanf_gross_income", period)
|
||||
|
||||
# Parameters use period
|
||||
p = parameters(period).gov.states[state].tanf
|
||||
|
||||
# Convert annual limit to monthly
|
||||
annual_limit = p.income_limit[household_size]
|
||||
monthly_limit = annual_limit / MONTHS_IN_YEAR
|
||||
|
||||
return gross_income <= monthly_limit
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 10. Checklist for Period Handling
|
||||
|
||||
When writing a formula:
|
||||
|
||||
- [ ] Identify your formula's `definition_period`
|
||||
- [ ] Check `definition_period` of accessed variables
|
||||
- [ ] Use `period.this_year` for YEAR variables from MONTH formulas
|
||||
- [ ] Use `period` for parameters (not `period.this_year`)
|
||||
- [ ] Convert units when comparing (annual ↔ monthly)
|
||||
- [ ] Test with appropriate period values
|
||||
|
||||
---
|
||||
|
||||
## Related Skills
|
||||
|
||||
- **policyengine-aggregation-skill**: For summing across entities with period handling
|
||||
- **policyengine-core-skill**: For understanding variable and parameter systems
|
||||
|
||||
---
|
||||
|
||||
## For Agents
|
||||
|
||||
1. **Always check definition_period** before accessing variables
|
||||
2. **Default to period.this_year** for demographic/stock variables from MONTH formulas
|
||||
3. **Test thoroughly** - period mismatches cause subtle bugs
|
||||
4. **Document period conversions** in comments
|
||||
5. **Follow existing patterns** in similar variables
|
||||
@@ -0,0 +1,376 @@
|
||||
---
|
||||
name: policyengine-review-patterns
|
||||
description: PolicyEngine code review patterns - validation checklist, common issues, review standards
|
||||
---
|
||||
|
||||
# PolicyEngine Review Patterns
|
||||
|
||||
Comprehensive patterns for reviewing PolicyEngine implementations.
|
||||
|
||||
## Understanding WHY, Not Just WHAT
|
||||
|
||||
### Pattern Analysis Before Review
|
||||
|
||||
When reviewing implementations that reference other states:
|
||||
|
||||
**🔴 CRITICAL: Check WHY Variables Exist**
|
||||
|
||||
Before approving any state-specific variable, verify:
|
||||
1. **Does it have state-specific logic?** - Read the formula
|
||||
2. **Are state parameters used?** - Check for `parameters(period).gov.states.XX`
|
||||
3. **Is there transformation beyond aggregation?** - Look for calculations
|
||||
4. **Would removing it break functionality?** - Test dependencies
|
||||
|
||||
**Example Analysis:**
|
||||
```python
|
||||
# IL TANF has this variable:
|
||||
class il_tanf_assistance_unit_size(Variable):
|
||||
adds = ["il_tanf_payment_eligible_child", "il_tanf_payment_eligible_parent"]
|
||||
# ✅ VALID: IL-specific eligibility rules
|
||||
|
||||
# But IN TANF shouldn't copy it blindly:
|
||||
class in_tanf_assistance_unit_size(Variable):
|
||||
    def formula(spm_unit, period, parameters):
|
||||
return spm_unit("spm_unit_size", period)
|
||||
# ❌ INVALID: No IN-specific logic, just wrapper
|
||||
```
|
||||
|
||||
### Wrapper Variable Detection
|
||||
|
||||
**Red Flags - Variables that shouldn't exist:**
|
||||
- Formula is just `return entity("federal_variable", period)`
|
||||
- Aggregates federal baseline with no transformation
|
||||
- No state parameters accessed
|
||||
- Comment says "use federal" but creates variable anyway
|
||||
|
||||
**Action:** Request deletion of unnecessary wrapper variables
|
||||
|
||||
---
|
||||
|
||||
## Priority Review Checklist
|
||||
|
||||
### 🔴 CRITICAL - Automatic Failures
|
||||
|
||||
These issues will cause crashes or incorrect results:
|
||||
|
||||
#### 1. Vectorization Violations
|
||||
```python
|
||||
❌ FAILS:
|
||||
if household("income", period) > 1000: # Will crash with arrays
|
||||
return 500
|
||||
|
||||
✅ PASSES:
|
||||
return where(household("income", period) > 1000, 500, 100)
|
||||
```
|
||||
|
||||
#### 2. Hard-Coded Values
|
||||
```python
|
||||
❌ FAILS:
|
||||
benefit = min_(income * 0.33, 500) # Hard-coded 0.33 and 500
|
||||
|
||||
✅ PASSES:
|
||||
benefit = min_(income * p.rate, p.maximum)
|
||||
```
|
||||
|
||||
#### 3. Missing Parameter Sources
|
||||
```yaml
|
||||
❌ FAILS:
|
||||
reference:
|
||||
- title: State website
|
||||
href: https://state.gov
|
||||
|
||||
✅ PASSES:
|
||||
reference:
|
||||
- title: Idaho Admin Code 16.05.03.205(3)
|
||||
href: https://adminrules.idaho.gov/rules/current/16/160503.pdf#page=14
|
||||
```
|
||||
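The difference between the failing and passing reference can be approximated in code. This is a heuristic sketch, not a PolicyEngine tool: `reference_problems` is a hypothetical name, and the regular expression only looks for a subsection shaped like `205(3)` or `273.9(d)`.

```python
import re
from urllib.parse import urlparse


def reference_problems(title, href):
    """Return a list of specificity problems for one reference entry."""
    problems = []
    # A subsection usually shows up as a digit followed by a parenthesised part.
    if not re.search(r"\d\s*\([0-9A-Za-z]+\)", title):
        problems.append("title has no subsection")
    parsed = urlparse(href)
    if parsed.path.lower().endswith(".pdf") and not parsed.fragment.startswith("page="):
        problems.append("PDF link has no #page= anchor")
    if parsed.path in ("", "/"):
        problems.append("link points at a site root")
    return problems


print(reference_problems("State website", "https://state.gov"))
print(reference_problems(
    "Idaho Admin Code 16.05.03.205(3)",
    "https://adminrules.idaho.gov/rules/current/16/160503.pdf#page=14",
))
```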
|
||||
---
|
||||
|
||||
### 🟡 MAJOR - Must Fix
|
||||
|
||||
These affect accuracy or maintainability:
|
||||
|
||||
#### 4. Test Quality Issues
|
||||
```yaml
|
||||
❌ FAILS:
|
||||
income: 50000 # No separator
|
||||
|
||||
✅ PASSES:
|
||||
income: 50_000 # Proper formatting
|
||||
```
|
||||
|
||||
#### 5. Calculation Accuracy
|
||||
- Order of operations matches regulations
|
||||
- Deductions applied in correct sequence
|
||||
- Edge cases handled (negatives, zeros)
|
||||
|
||||
#### 6. Description Style
|
||||
```yaml
|
||||
❌ FAILS:
|
||||
description: The amount of SNAP benefits # Wordy: "The amount of" adds nothing
|
||||
|
||||
✅ PASSES:
|
||||
description: SNAP benefits # Concise noun phrase
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### 🟢 MINOR - Should Fix
|
||||
|
||||
These improve code quality:
|
||||
|
||||
#### 7. Code Organization
|
||||
- One variable per file
|
||||
- Proper use of `defined_for`
|
||||
- Use of `adds` for simple sums
|
||||
|
||||
#### 8. Documentation
|
||||
- Clear references to regulation sections
|
||||
- Changelog entry present
|
||||
|
||||
---
|
||||
|
||||
## Common Issues Reference
|
||||
|
||||
### Documentation Issues
|
||||
|
||||
| Issue | Example | Fix |
|
||||
|-------|---------|-----|
|
||||
| No primary source | "See SNAP website" | Add USC/CFR citation |
|
||||
| Wrong value | $198 vs $200 in source | Update parameter |
|
||||
| Generic link | dol.gov | Link to specific regulation |
|
||||
| Missing subsection | "7 CFR 273" | "7 CFR 273.9(d)(3)" |
|
||||
|
||||
### Code Issues
|
||||
|
||||
| Issue | Impact | Fix |
|
||||
|-------|--------|-----|
|
||||
| if-elif-else with data | Crashes microsim | Use where/select |
|
||||
| Hard-coded values | Inflexible | Move to parameters |
|
||||
| Missing defined_for | Inefficient | Add eligibility condition |
|
||||
| Manual summing | Wrong pattern | Use adds attribute |
|
||||
|
||||
### Test Issues
|
||||
|
||||
| Issue | Example | Fix |
|
||||
|-------|---------|-----|
|
||||
| No separators | 100000 | 100_000 |
|
||||
| No documentation | output: 500 | Add calculation comment |
|
||||
| Wrong period | 2024-04 | Use 2024-01 or 2024 |
|
||||
| Made-up variables | heating_expense | Use existing variables |
|
||||
|
||||
---
|
||||
|
||||
## Source Verification Process
|
||||
|
||||
### Step 1: Check Parameter Values
|
||||
|
||||
For each parameter file:
|
||||
```python
|
||||
✓ Value matches source document
|
||||
✓ Source is primary (statute > regulation > website)
|
||||
✓ URL links to exact section with page anchor
|
||||
✓ Effective dates correct
|
||||
```
|
||||
|
||||
### Step 2: Validate References
|
||||
|
||||
**Primary sources (preferred):**
|
||||
- USC (United States Code)
|
||||
- CFR (Code of Federal Regulations)
|
||||
- State statutes
|
||||
- State admin codes
|
||||
|
||||
**Secondary sources (acceptable):**
|
||||
- Official policy manuals
|
||||
- State plan documents
|
||||
|
||||
**Not acceptable alone:**
|
||||
- Websites without specific sections
|
||||
- Summaries or fact sheets
|
||||
- News articles
|
||||
|
||||
---
|
||||
|
||||
## Code Quality Checks
|
||||
|
||||
### Vectorization Scan
|
||||
|
||||
Search for these patterns:
|
||||
```python
|
||||
# Red flags that indicate scalar logic:
|
||||
"if household"
|
||||
"if person"
|
||||
"elif"
|
||||
"else:"
|
||||
"and " (should be &)
|
||||
"or " (should be |)
|
||||
"not " (should be ~)
|
||||
```
|
||||
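A plain text search for these patterns produces false positives in comments and strings. As an alternative sketch, the standard `ast` module can flag the same constructs structurally; `find_scalar_logic` is a hypothetical helper, and every hit still needs human review because an `if` on a true scalar (such as a month number) is legitimate.

```python
import ast


def find_scalar_logic(source):
    """Return sorted (line, message) pairs for constructs that break on arrays."""
    flags = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.If, ast.IfExp)):
            flags.append((node.lineno, "if/else branch: use where() or select()"))
        elif isinstance(node, ast.BoolOp):
            op = "and" if isinstance(node.op, ast.And) else "or"
            flags.append((node.lineno, f"'{op}': use & or |"))
        elif isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.Not):
            flags.append((node.lineno, "'not': use ~"))
    return sorted(flags)


bad_source = (
    "def formula(person, period, parameters):\n"
    "    income = person('income', period)\n"
    "    if income > 1000 and not person('is_child', period):\n"
    "        return 500\n"
    "    return 100\n"
)

for line, message in find_scalar_logic(bad_source):
    print(line, message)
```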
|
||||
### Hard-Coding Scan
|
||||
|
||||
Search for numeric literals:
|
||||
```python
|
||||
# Check for any number except:
|
||||
# 0, 1, -1 (basic math)
|
||||
# 12 (month conversion)
|
||||
# Small indices (2, 3 for known structures)
|
||||
|
||||
# Flag anything like:
|
||||
"0.5"
|
||||
"100"
|
||||
"0.33"
|
||||
"65" (unless it's a standard age)
|
||||
```
|
||||
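The same structural approach works for numeric literals. A minimal sketch using the standard `ast` module; `find_hard_coded_numbers` is a hypothetical helper, and the allowed set simply mirrors the exceptions listed above (`-1` parses as a negated `1`, so it passes without a separate entry).

```python
import ast

ALLOWED = {0, 1, 2, 3, 12}


def find_hard_coded_numbers(source, allowed=ALLOWED):
    """Return sorted (line, value) pairs for numeric literals outside `allowed`."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        # type() rather than isinstance() so that True/False are not counted.
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            if node.value not in allowed:
                hits.append((node.lineno, node.value))
    return sorted(hits)


sample = "benefit = min_(income * 0.33, 500)\nfloor = max_(benefit, 0)\n"
print(find_hard_coded_numbers(sample))
```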
|
||||
---
|
||||
|
||||
## Review Response Templates
|
||||
|
||||
### For Approval
|
||||
|
||||
```markdown
|
||||
## PolicyEngine Review: APPROVED ✅
|
||||
|
||||
### Verification Summary
|
||||
- ✅ All parameters trace to primary sources
|
||||
- ✅ Code is properly vectorized
|
||||
- ✅ Tests document calculations
|
||||
- ✅ No hard-coded values
|
||||
|
||||
### Strengths
|
||||
- Excellent USC/CFR citations
|
||||
- Comprehensive test coverage
|
||||
- Clear calculation logic
|
||||
|
||||
### Minor Suggestions (optional)
|
||||
- Consider adding edge case for zero income
|
||||
```
|
||||
|
||||
### For Changes Required
|
||||
|
||||
```markdown
|
||||
## PolicyEngine Review: CHANGES REQUIRED ❌
|
||||
|
||||
### Critical Issues (Must Fix)
|
||||
|
||||
1. **Non-vectorized code** - lines 45-50
|
||||
```python
|
||||
# Replace this:
|
||||
if income > threshold:
|
||||
benefit = high_amount
|
||||
|
||||
# With this:
|
||||
benefit = where(income > threshold, high_amount, low_amount)
|
||||
```
|
||||
|
||||
2. **Parameter value mismatch** - standard_deduction.yaml
|
||||
- Source shows $200, parameter has $198
|
||||
- Reference: 7 CFR 273.9(d)(1), page 5
|
||||
|
||||
### Major Issues (Should Fix)
|
||||
|
||||
3. **Missing primary source** - income_limit.yaml
|
||||
- Add statute/regulation citation
|
||||
- Current website link insufficient
|
||||
|
||||
Please address these issues and re-request review.
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Test Validation
|
||||
|
||||
### Check Test Structure
|
||||
|
||||
```yaml
|
||||
# Verify proper format:
|
||||
- name: Case 1, description. # Numbered case with period
|
||||
period: 2024-01 # Valid period (2024-01 or 2024)
|
||||
input:
|
||||
people:
|
||||
person1: # Generic names
|
||||
employment_income: 50_000 # Underscores
|
||||
output:
|
||||
# Calculation documented
|
||||
# Income: $50,000/year = $4,167/month
|
||||
program_benefit: 250
|
||||
```
|
||||
|
||||
### Run Test Commands
|
||||
|
||||
```bash
|
||||
# Unit tests
|
||||
pytest policyengine_us/tests/policy/baseline/gov/
|
||||
|
||||
# Integration tests
|
||||
policyengine-core test <path> -c policyengine_us
|
||||
|
||||
# Microsimulation
|
||||
pytest policyengine_us/tests/microsimulation/
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Review Priorities by Context
|
||||
|
||||
### New Program Implementation
|
||||
1. Parameter completeness
|
||||
2. All documented scenarios tested
|
||||
3. Eligibility paths covered
|
||||
4. No hard-coded values
|
||||
|
||||
### Bug Fixes
|
||||
1. Root cause addressed
|
||||
2. No regression potential
|
||||
3. Tests prevent recurrence
|
||||
4. Vectorization maintained
|
||||
|
||||
### Refactoring
|
||||
1. Functionality preserved
|
||||
2. Tests still pass
|
||||
3. Performance maintained
|
||||
4. Code clarity improved
|
||||
|
||||
---
|
||||
|
||||
## Quick Review Checklist
|
||||
|
||||
**Parameters:**
|
||||
- [ ] Values match sources
|
||||
- [ ] References include subsections
|
||||
- [ ] All metadata fields present
|
||||
- [ ] Effective dates correct
|
||||
|
||||
**Variables:**
|
||||
- [ ] Properly vectorized (no if-elif-else)
|
||||
- [ ] No hard-coded values
|
||||
- [ ] Uses existing variables
|
||||
- [ ] Includes proper metadata
|
||||
|
||||
**Tests:**
|
||||
- [ ] Proper period format
|
||||
- [ ] Underscore separators
|
||||
- [ ] Calculation comments
|
||||
- [ ] Realistic scenarios
|
||||
|
||||
**Overall:**
|
||||
- [ ] Changelog entry
|
||||
- [ ] Code formatted
|
||||
- [ ] Tests pass
|
||||
- [ ] Documentation complete
|
||||
|
||||
---
|
||||
|
||||
## For Agents
|
||||
|
||||
When reviewing code:
|
||||
1. **Check vectorization first** - crashes are worst
|
||||
2. **Verify parameter sources** - accuracy critical
|
||||
3. **Scan for hard-coding** - maintainability issue
|
||||
4. **Validate test quality** - ensures correctness
|
||||
5. **Run all tests** - catch integration issues
|
||||
6. **Document issues clearly** - help fixes
|
||||
7. **Provide fix examples** - speed resolution
|
||||
@@ -0,0 +1,412 @@
|
||||
---
|
||||
name: policyengine-testing-patterns
|
||||
description: PolicyEngine testing patterns - YAML test structure, naming conventions, period handling, and quality standards
|
||||
---
|
||||
|
||||
# PolicyEngine Testing Patterns
|
||||
|
||||
Comprehensive patterns and standards for creating PolicyEngine tests.
|
||||
|
||||
## Quick Reference
|
||||
|
||||
### File Structure
|
||||
```
|
||||
policyengine_us/tests/policy/baseline/gov/states/[state]/[agency]/[program]/
|
||||
├── [variable_name].yaml # Unit test for specific variable
|
||||
├── [another_variable].yaml # Another unit test
|
||||
└── integration.yaml # Integration test (NEVER prefixed)
|
||||
```
|
||||
|
||||
### Period Restrictions
|
||||
- ✅ `2024-01` - First month only
|
||||
- ✅ `2024` - Whole year
|
||||
- ❌ `2024-04` - Other months NOT supported
|
||||
- ❌ `2024-01-01` - Full dates NOT supported
|
||||
|
||||
### Naming Convention
|
||||
- Files: `variable_name.yaml` (matches variable exactly)
|
||||
- Integration: Always `integration.yaml` (never prefixed)
|
||||
- Cases: `Case 1, description.` (numbered, comma, period)
|
||||
- People: `person1`, `person2` (never descriptive names)
|
||||
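The period and naming rules above are regular enough to lint. A sketch under the conventions stated in this skill; `check_test_case` and its three patterns are hypothetical, not part of the PolicyEngine test runner.

```python
import re

CASE_NAME = re.compile(r"^Case \d+, .+\.$")  # "Case 1, description."
PERIOD = re.compile(r"^\d{4}(-01)?$")        # "2024" or "2024-01"
PERSON = re.compile(r"^person\d+$")          # "person1", "person2", ...


def check_test_case(name, period, people):
    """Return a list of convention violations for one YAML test case."""
    problems = []
    if not CASE_NAME.match(name):
        problems.append(f"bad case name: {name!r}")
    if not PERIOD.match(str(period)):
        problems.append(f"unsupported period: {period}")
    for person in people:
        if not PERSON.match(person):
            problems.append(f"descriptive person name: {person!r}")
    return problems


print(check_test_case("Case 1, single parent with one child.", "2024-01", ["person1", "person2"]))
print(check_test_case("Case 1 - single parent", "2024-04", ["parent", "child1"]))
```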
|
||||
---
|
||||
|
||||
## 1. Test File Organization
|
||||
|
||||
### File Naming Rules
|
||||
|
||||
**Unit tests** - Named after the variable they test:
|
||||
```
|
||||
✅ CORRECT:
|
||||
az_liheap_eligible.yaml # Tests az_liheap_eligible variable
|
||||
az_liheap_benefit.yaml # Tests az_liheap_benefit variable
|
||||
|
||||
❌ WRONG:
|
||||
test_az_liheap.yaml # Wrong prefix
|
||||
liheap_tests.yaml # Wrong pattern
|
||||
```
|
||||
|
||||
**Integration tests** - Always named `integration.yaml`:
|
||||
```
|
||||
✅ CORRECT:
|
||||
integration.yaml # Standard name
|
||||
|
||||
❌ WRONG:
|
||||
az_liheap_integration.yaml # Never prefix integration
|
||||
program_integration.yaml # Never prefix integration
|
||||
```
|
||||
|
||||
### Folder Structure
|
||||
|
||||
Follow state/agency/program hierarchy:
|
||||
```
|
||||
gov/
|
||||
└── states/
|
||||
└── [state_code]/
|
||||
└── [agency]/
|
||||
└── [program]/
|
||||
├── eligibility/
|
||||
│ └── income_eligible.yaml
|
||||
├── income/
|
||||
│ └── countable_income.yaml
|
||||
└── integration.yaml
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 2. Period Format Restrictions
|
||||
|
||||
### Critical: Only Two Formats Supported
|
||||
|
||||
PolicyEngine test system ONLY supports:
|
||||
- `2024-01` - First month of year
|
||||
- `2024` - Whole year
|
||||
|
||||
**Never use:**
|
||||
- `2024-04` - April (will fail)
|
||||
- `2024-10` - October (will fail)
|
||||
- `2024-01-01` - Full date (will fail)
|
||||
|
||||
### Handling Mid-Year Policy Changes
|
||||
|
||||
If policy changes April 1, 2024:
|
||||
```yaml
|
||||
# Option 1: Test with first month
|
||||
period: 2024-01 # January values, i.e. the policy in effect before April 1
|
||||
|
||||
# Option 2: Test next year
|
||||
period: 2025-01 # When policy definitely active
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 3. Test Naming Conventions
|
||||
|
||||
### Case Names
|
||||
|
||||
Use numbered cases with descriptions:
|
||||
```yaml
|
||||
✅ CORRECT:
|
||||
- name: Case 1, single parent with one child.
|
||||
- name: Case 2, two parents with two children.
|
||||
- name: Case 3, income at threshold.
|
||||
|
||||
❌ WRONG:
|
||||
- name: Single parent test
|
||||
- name: Test case for family
|
||||
- name: Case 1 - single parent # Wrong punctuation
|
||||
```
|
||||
|
||||
### Person Names
|
||||
|
||||
Use generic sequential names:
|
||||
```yaml
|
||||
✅ CORRECT:
|
||||
people:
|
||||
person1:
|
||||
age: 30
|
||||
person2:
|
||||
age: 10
|
||||
person3:
|
||||
age: 8
|
||||
|
||||
❌ WRONG:
|
||||
people:
|
||||
parent:
|
||||
age: 30
|
||||
child1:
|
||||
age: 10
|
||||
```
|
||||
|
||||
### Output Format
|
||||
|
||||
Use the simplified format, without nesting values under an entity key:
|
||||
```yaml
|
||||
✅ CORRECT:
|
||||
output:
|
||||
tx_tanf_eligible: true
|
||||
tx_tanf_benefit: 250
|
||||
|
||||
❌ WRONG:
|
||||
output:
|
||||
tx_tanf_eligible:
|
||||
spm_unit: true # Don't nest under entity
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 4. Which Variables Need Tests
|
||||
|
||||
### Variables That DON'T Need Tests
|
||||
|
||||
Skip tests for simple composition variables using only `adds` or `subtracts`:
|
||||
```python
|
||||
# NO TEST NEEDED - just summing
|
||||
class tx_tanf_countable_income(Variable):
|
||||
adds = ["earned_income", "unearned_income"]
|
||||
|
||||
# NO TEST NEEDED - simple arithmetic
|
||||
class net_income(Variable):
|
||||
adds = ["gross_income"]
|
||||
subtracts = ["deductions"]
|
||||
```
|
||||
|
||||
### Variables That NEED Tests
|
||||
|
||||
Create tests for variables with:
|
||||
- Conditional logic (`where`, `select`, `if`)
|
||||
- Calculations/transformations
|
||||
- Business logic
|
||||
- Deductions/disregards
|
||||
- Eligibility determinations
|
||||
|
||||
```python
|
||||
# NEEDS TEST - has logic
|
||||
class tx_tanf_income_eligible(Variable):
|
||||
def formula(spm_unit, period, parameters):
|
||||
return where(enrolled, passes_test, other_test)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 5. Period Conversion in Tests
|
||||
|
||||
### Critical Rule for MONTH Tests
|
||||
|
||||
When `period: 2025-01`:
|
||||
- **Input**: YEAR variables as annual amounts
|
||||
- **Output**: YEAR variables are reported as monthly values (annual amount ÷ 12)
|
||||
|
||||
```yaml
|
||||
- name: Case 1, income conversion.
|
||||
period: 2025-01 # MONTH period
|
||||
input:
|
||||
people:
|
||||
person1:
|
||||
employment_income: 12_000 # Input: Annual
|
||||
output:
|
||||
employment_income: 1_000 # Output: Monthly (12_000/12)
|
||||
```
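
The conversion in the example above is plain division by twelve. A minimal sketch, with `expected_monthly_output` as a hypothetical helper for working out the value to put under `output`:

```python
MONTHS_IN_YEAR = 12


def expected_monthly_output(annual_input: float) -> float:
    """Value a YEAR variable should show in a MONTH-period test."""
    return annual_input / MONTHS_IN_YEAR


# Annual input from the example above.
monthly = expected_monthly_output(12_000)
```

Here `monthly` equals `1_000`, matching the `employment_income` output in the YAML case.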
|
||||
|
||||
---
|
||||
|
||||
## 6. Numeric Formatting
|
||||
|
||||
### Always Use Underscore Separators
|
||||
|
||||
```yaml
|
||||
✅ CORRECT:
|
||||
employment_income: 50_000
|
||||
cash_assets: 1_500
|
||||
|
||||
❌ WRONG:
|
||||
employment_income: 50000
|
||||
cash_assets: 1500
|
||||
```
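
Python integer literals follow the same underscore convention, which makes it easy to confirm that the separator changes readability only, not the value. A small sketch (the helper name `with_underscores` is made up for illustration):

```python
def with_underscores(value: int) -> str:
    """Format an integer with underscore thousands separators."""
    return format(value, "_")


# The underscore form and the plain form are the same number.
same_value = 50_000 == 50000
formatted = with_underscores(1500)
```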
|
||||
|
||||
---
|
||||
|
||||
## 7. Integration Test Quality Standards
|
||||
|
||||
### Inline Calculation Comments
|
||||
|
||||
Document every calculation step:
|
||||
```yaml
|
||||
- name: Case 2, earnings with deductions.
|
||||
period: 2025-01
|
||||
input:
|
||||
people:
|
||||
person1:
|
||||
employment_income: 3_000 # $250/month
|
||||
output:
|
||||
# Person-level arrays
|
||||
tx_tanf_gross_earned_income: [250, 0]
|
||||
# Person1: 3,000/12 = 250
|
||||
|
||||
tx_tanf_earned_after_disregard: [86.67, 0]
# Person1: 250 - 120 = 130
# Disregard: 130/3 = 43.33
# After: 130 - 43.33 = 86.67
|
||||
```
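
Reproducing the commented steps in code is a cheap way to confirm that the expected value and its derivation agree. The sketch below uses only the figures that appear in the comments above (a 120 flat deduction and a one-third disregard); the variable names are illustrative.

```python
annual_earnings = 3_000

monthly_gross = annual_earnings / 12          # 3,000 / 12 = 250
after_flat_deduction = monthly_gross - 120    # 250 - 120 = 130
disregard = after_flat_deduction / 3          # 130 / 3 = 43.33 (rounded)
earned_after_disregard = after_flat_deduction - disregard

# Round only at the end, when comparing with the expected test output.
expected_output = round(earned_after_disregard, 2)
```

With these figures, `expected_output` is `86.67`.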
|
||||
|
||||
### Comprehensive Scenarios
|
||||
|
||||
Include 5-7 scenarios covering:
|
||||
1. Basic eligible case
|
||||
2. Earnings with deductions
|
||||
3. Edge case at threshold
|
||||
4. Mixed enrollment status
|
||||
5. Special circumstances (SSI, immigration)
|
||||
6. Ineligible case
|
||||
|
||||
### Verify Intermediate Values
|
||||
|
||||
Check 8-10 values per test:
|
||||
```yaml
|
||||
output:
|
||||
# Income calculation chain
|
||||
program_gross_income: 250
|
||||
program_earned_after_disregard: 86.67
|
||||
program_deductions: 200
|
||||
program_countable_income: 0
|
||||
|
||||
# Eligibility chain
|
||||
program_income_eligible: true
|
||||
program_resources_eligible: true
|
||||
program_eligible: true
|
||||
|
||||
# Final benefit
|
||||
program_benefit: 320
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 8. Common Variables to Use
|
||||
|
||||
### Always Available
|
||||
```yaml
|
||||
# Demographics
|
||||
age: 30
|
||||
is_disabled: false
|
||||
is_pregnant: false
|
||||
|
||||
# Income
|
||||
employment_income: 50_000
|
||||
self_employment_income: 10_000
|
||||
social_security: 12_000
|
||||
ssi: 9_000
|
||||
|
||||
# Benefits
|
||||
snap: 200
|
||||
tanf: 150
|
||||
medicaid: true
|
||||
|
||||
# Location
|
||||
state_code: CA
|
||||
county_code: "06037" # String for FIPS
|
||||
```
|
||||
|
||||
### Variables That DON'T Exist
|
||||
|
||||
Never use these (not in PolicyEngine):
|
||||
- `heating_expense`
|
||||
- `utility_expense`
|
||||
- `utility_shut_off_notice`
|
||||
- `past_due_balance`
|
||||
- `bulk_fuel_amount`
|
||||
- `weatherization_needed`
|
||||
|
||||
---
|
||||
|
||||
## 9. Enum Verification
|
||||
|
||||
### Always Check Actual Enum Values
|
||||
|
||||
Before using enums in tests:
|
||||
```bash
|
||||
# Find enum definition
|
||||
grep -r "class ImmigrationStatus" --include="*.py"
|
||||
```
|
||||
|
||||
```python
|
||||
# Check actual values
|
||||
class ImmigrationStatus(Enum):
|
||||
CITIZEN = "Citizen"
|
||||
LEGAL_PERMANENT_RESIDENT = "Legal Permanent Resident" # NOT "PERMANENT_RESIDENT"
|
||||
REFUGEE = "Refugee"
|
||||
```
|
||||
|
||||
```yaml
|
||||
✅ CORRECT:
|
||||
immigration_status: LEGAL_PERMANENT_RESIDENT
|
||||
|
||||
❌ WRONG:
|
||||
immigration_status: PERMANENT_RESIDENT # Doesn't exist
|
||||
```
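
Once the enum definition has been located, membership can also be checked programmatically. The class below is an illustrative copy of the values shown above, not the real PolicyEngine definition, and `is_valid_member` is a hypothetical helper.

```python
from enum import Enum


class ImmigrationStatus(Enum):  # illustrative copy, not the real class
    CITIZEN = "Citizen"
    LEGAL_PERMANENT_RESIDENT = "Legal Permanent Resident"
    REFUGEE = "Refugee"


def is_valid_member(enum_cls, name: str) -> bool:
    """Return True if `name` is a member name of the enum (as used in YAML tests)."""
    return name in enum_cls.__members__
```

`__members__` maps member names to members, so the check is against the name (`LEGAL_PERMANENT_RESIDENT`), not the display value (`"Legal Permanent Resident"`).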
|
||||
|
||||
---
|
||||
|
||||
## 10. Test Quality Checklist
|
||||
|
||||
Before submitting tests:
|
||||
- [ ] All variables exist in PolicyEngine
|
||||
- [ ] Period format is `2024-01` or `2024` only
|
||||
- [ ] Numbers use underscore separators
|
||||
- [ ] Integration tests have calculation comments
|
||||
- [ ] 5-7 comprehensive scenarios in integration.yaml
|
||||
- [ ] Enum values verified against actual definitions
|
||||
- [ ] Output values realistic, not placeholders
|
||||
- [ ] File names match variable names exactly
|
||||
|
||||
---
|
||||
|
||||
## Common Test Patterns
|
||||
|
||||
### Income Eligibility
|
||||
```yaml
|
||||
- name: Case 1, income exactly at threshold.
|
||||
period: 2024-01
|
||||
input:
|
||||
people:
|
||||
person1:
|
||||
employment_income: 30_360 # Annual limit
|
||||
output:
|
||||
program_income_eligible: true # At threshold = eligible
|
||||
```
|
||||
|
||||
### Priority Groups
|
||||
```yaml
|
||||
- name: Case 2, elderly priority.
|
||||
period: 2024-01
|
||||
input:
|
||||
people:
|
||||
person1:
|
||||
age: 65
|
||||
output:
|
||||
program_priority_group: true
|
||||
```
|
||||
|
||||
### Categorical Eligibility
|
||||
```yaml
|
||||
- name: Case 3, SNAP categorical.
|
||||
period: 2024-01
|
||||
input:
|
||||
spm_units:
|
||||
spm_unit:
|
||||
snap: 200 # Receives SNAP
|
||||
output:
|
||||
program_categorical_eligible: true
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## For Agents
|
||||
|
||||
When creating tests:
|
||||
1. **Check existing variables** before using any in tests
|
||||
2. **Use only supported periods** (2024-01 or 2024)
|
||||
3. **Document calculations** in integration tests
|
||||
4. **Verify enum values** against actual code
|
||||
5. **Follow naming conventions** exactly
|
||||
6. **Include edge cases** at thresholds
|
||||
7. **Test realistic scenarios** not placeholders
|
||||
@@ -0,0 +1,303 @@
|
||||
---
|
||||
name: policyengine-vectorization
|
||||
description: PolicyEngine vectorization patterns - NumPy operations, where/select usage, avoiding scalar logic with arrays
|
||||
---
|
||||
|
||||
# PolicyEngine Vectorization Patterns
|
||||
|
||||
Critical patterns for vectorized operations in PolicyEngine. Scalar logic with arrays will crash the microsimulation.
|
||||
|
||||
## The Golden Rule
|
||||
|
||||
**PolicyEngine processes multiple households simultaneously using NumPy arrays. NEVER use if-elif-else with entity data.**
|
||||
|
||||
---
|
||||
|
||||
## 1. Critical: What Will Crash
|
||||
|
||||
### ❌ NEVER: if-elif-else with Arrays
|
||||
|
||||
```python
|
||||
# THIS WILL CRASH - household data is an array
|
||||
def formula(household, period, parameters):
|
||||
income = household("income", period)
|
||||
if income > 1000: # ❌ CRASH: "truth value of array is ambiguous"
|
||||
return 500
|
||||
else:
|
||||
return 100
|
||||
```
|
||||
|
||||
### ✅ ALWAYS: Vectorized Operations
|
||||
|
||||
```python
|
||||
# CORRECT - works with arrays
|
||||
def formula(household, period, parameters):
|
||||
income = household("income", period)
|
||||
return where(income > 1000, 500, 100) # ✅ Vectorized
|
||||
```
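
The difference between the two versions can be reproduced outside PolicyEngine with plain NumPy. This sketch assumes PolicyEngine's `where` behaves like `numpy.where`; the income values are made up.

```python
import numpy as np

income = np.array([500, 1_500, 3_000])  # three households at once

# Scalar logic: `if` needs a single True/False, but the comparison is an array.
try:
    if income > 1_000:
        pass
    crashed = False
except ValueError:
    crashed = True

# Vectorized logic: one result per household.
benefit = np.where(income > 1_000, 500, 100)
```

The `if` raises `ValueError`, while `np.where` returns an array with one benefit amount per household.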
|
||||
|
||||
---
|
||||
|
||||
## 2. Common Vectorization Patterns
|
||||
|
||||
### Pattern 1: Simple Conditions → `where()`
|
||||
|
||||
```python
|
||||
# Instead of if-else
|
||||
❌ if age >= 65:
|
||||
amount = senior_amount
|
||||
else:
|
||||
amount = regular_amount
|
||||
|
||||
✅ amount = where(age >= 65, senior_amount, regular_amount)
|
||||
```
|
||||
|
||||
### Pattern 2: Multiple Conditions → `select()`
|
||||
|
||||
```python
|
||||
# Instead of if-elif-else
|
||||
❌ if age < 18:
|
||||
benefit = child_amount
|
||||
elif age >= 65:
|
||||
benefit = senior_amount
|
||||
else:
|
||||
benefit = adult_amount
|
||||
|
||||
✅ benefit = select(
|
||||
[age < 18, age >= 65],
|
||||
[child_amount, senior_amount],
|
||||
default=adult_amount
|
||||
)
|
||||
```
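
A runnable version of the `select()` pattern, assuming PolicyEngine's `select` behaves like `numpy.select`. The ages and amounts are placeholders chosen for illustration.

```python
import numpy as np

age = np.array([10, 40, 70])
child_amount, senior_amount, adult_amount = 300, 200, 100

# Conditions are checked in order; the first match wins, else the default.
benefit = np.select(
    [age < 18, age >= 65],
    [child_amount, senior_amount],
    default=adult_amount,
)
```

Because the first matching condition wins, order the conditions from most to least specific when they can overlap.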
|
||||
|
||||
### Pattern 3: Boolean Operations
|
||||
|
||||
```python
|
||||
# Combining conditions
|
||||
eligible = (age >= 18) & (income < threshold) # Use & not 'and'
|
||||
eligible = (is_disabled | is_elderly) # Use | not 'or'
|
||||
eligible = ~is_excluded # Use ~ not 'not'
|
||||
```
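
The parentheses in the pattern above are required, because `&` and `|` bind more tightly than comparison operators in Python. A small NumPy sketch with made-up ages and incomes:

```python
import numpy as np

age = np.array([17, 30, 70])
income = np.array([500, 5_000, 800])
threshold = 1_000

# Each comparison is parenthesised before combining element-wise.
adult_low_income = (age >= 18) & (income < threshold)
elderly_or_low_income = (age >= 65) | (income < threshold)
minor = ~(age >= 18)
```

Without the parentheses, `age >= 18 & income < threshold` is parsed as a chained comparison around `18 & income` and does not produce the intended result.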
|
||||
|
||||
### Pattern 4: Clipping Values
|
||||
|
||||
```python
|
||||
# Instead of if for bounds checking
|
||||
❌ if amount < 0:
|
||||
amount = 0
|
||||
elif amount > maximum:
|
||||
amount = maximum
|
||||
|
||||
✅ amount = clip(amount, 0, maximum)
|
||||
# Or: amount = max_(0, min_(amount, maximum))
|
||||
```
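
Both clipping forms can be checked against each other in plain NumPy. This sketch uses `np.clip`, `np.maximum`, and `np.minimum` as stand-ins, assuming `max_` and `min_` are the element-wise maximum and minimum; the amounts are illustrative.

```python
import numpy as np

amount = np.array([-50, 120, 900])
maximum = 500

# Bound every element to the range [0, maximum] in one call.
clipped = np.clip(amount, 0, maximum)

# Equivalent two-step form using element-wise max and min.
two_step = np.maximum(0, np.minimum(amount, maximum))
```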
|
||||
|
||||
---
|
||||
|
||||
## 3. When if-else IS Acceptable
|
||||
|
||||
### ✅ OK: Parameter-Only Conditions
|
||||
|
||||
```python
|
||||
# OK - parameters are scalars, not arrays
|
||||
def formula(entity, period, parameters):
|
||||
p = parameters(period).gov.program
|
||||
|
||||
# This is fine - p.enabled is a scalar boolean
|
||||
if p.enabled:
|
||||
base = p.base_amount
|
||||
else:
|
||||
base = 0
|
||||
|
||||
# But must vectorize when using entity data
|
||||
income = entity("income", period)
|
||||
return where(income < p.threshold, base, 0)
|
||||
```
|
||||
|
||||
### ✅ OK: Control Flow (Not Data)
|
||||
|
||||
```python
|
||||
# OK - controlling which calculation to use
|
||||
def formula(entity, period, parameters):
|
||||
year = period.start.year
|
||||
|
||||
if year >= 2024:
|
||||
# Use new formula (still vectorized)
|
||||
return entity("new_calculation", period)
|
||||
else:
|
||||
# Use old formula (still vectorized)
|
||||
return entity("old_calculation", period)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 4. Common Vectorization Mistakes
|
||||
|
||||
### Mistake 1: Scalar Comparison with Array
|
||||
|
||||
```python
|
||||
❌ WRONG:
|
||||
if household("income", period) > 1000:
|
||||
# Error: truth value of array is ambiguous
|
||||
|
||||
✅ CORRECT:
|
||||
income = household("income", period)
|
||||
high_income = income > 1000 # Boolean array
|
||||
benefit = where(high_income, low_benefit, high_benefit)
|
||||
```
|
||||
|
||||
### Mistake 2: Using Python's and/or/not
|
||||
|
||||
```python
|
||||
❌ WRONG:
|
||||
eligible = is_elderly or is_disabled # Python's 'or'
|
||||
|
||||
✅ CORRECT:
|
||||
eligible = is_elderly | is_disabled # NumPy's '|'
|
||||
```
|
||||
|
||||
### Mistake 3: Nested if Statements
|
||||
|
||||
```python
|
||||
❌ WRONG:
|
||||
if eligible:
|
||||
if income < threshold:
|
||||
return full_benefit
|
||||
else:
|
||||
return partial_benefit
|
||||
else:
|
||||
return 0
|
||||
|
||||
✅ CORRECT:
|
||||
return where(
|
||||
eligible,
|
||||
where(income < threshold, full_benefit, partial_benefit),
|
||||
0
|
||||
)
|
||||
```
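
The nested `where` can be exercised with a three-household array that hits every branch. A sketch in plain NumPy, with placeholder amounts:

```python
import numpy as np

eligible = np.array([True, True, False])
income = np.array([500, 2_000, 500])
threshold = 1_000
full_benefit, partial_benefit = 400, 150

# Inner where picks full vs. partial; outer where zeroes out the ineligible.
result = np.where(
    eligible,
    np.where(income < threshold, full_benefit, partial_benefit),
    0,
)
```

Note that both branches of `where` are evaluated for every element; the condition only decides which value is kept.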
|
||||
|
||||
---
|
||||
|
||||
## 5. Advanced Patterns
|
||||
|
||||
### Pattern: Vectorized Lookup Tables
|
||||
|
||||
```python
|
||||
# Instead of if-elif for ranges
|
||||
❌ if size == 1:
|
||||
amount = 100
|
||||
elif size == 2:
|
||||
amount = 150
|
||||
elif size == 3:
|
||||
amount = 190
|
||||
|
||||
✅ # Using parameter brackets
|
||||
amount = p.benefit_schedule.calc(size)
|
||||
|
||||
✅ # Or using select
|
||||
amounts = [100, 150, 190, 220, 250]
|
||||
amount = select(
|
||||
[size == i for i in range(1, 6)],
|
||||
amounts[:5],
|
||||
default=amounts[-1] # 6+ people (sizes 1-5 are matched above)
|
||||
)
|
||||
```
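
A runnable version of the `select`-based lookup, using `numpy.select` as a stand-in. The index-based variant is an alternative added for comparison and does not appear in the original pattern.

```python
import numpy as np

amounts = [100, 150, 190, 220, 250]
size = np.array([1, 3, 5, 7])

# One condition per household size 1..5; larger sizes fall to the default.
by_select = np.select(
    [size == i for i in range(1, 6)],
    amounts,
    default=amounts[-1],
)

# Alternative: cap the size at 5 and index into the table directly.
by_index = np.array(amounts)[np.clip(size, 1, 5) - 1]
```

Both forms give the same result for sizes of 1 and above; the indexing form avoids building one boolean array per size.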
|
||||
|
||||
### Pattern: Accumulating Conditions
|
||||
|
||||
```python
|
||||
# Building complex eligibility
|
||||
income_eligible = income < p.income_threshold
|
||||
resource_eligible = resources < p.resource_limit
|
||||
demographic_eligible = (age < 18) | is_pregnant
|
||||
|
||||
# Combine with & (not 'and')
|
||||
eligible = income_eligible & resource_eligible & demographic_eligible
|
||||
```
|
||||
|
||||
### Pattern: Conditional Accumulation
|
||||
|
||||
```python
|
||||
# Sum only for eligible members
|
||||
person = household.members
|
||||
is_eligible = person("is_eligible", period)
|
||||
person_income = person("income", period)
|
||||
|
||||
# Only count income of eligible members
|
||||
eligible_income = where(is_eligible, person_income, 0)
|
||||
total = household.sum(eligible_income)
|
||||
```
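
The same idea can be sketched without PolicyEngine, with `np.bincount` standing in for `household.sum`. The `person_household` index array is an assumption made for this example, as are the incomes.

```python
import numpy as np

# Household index of each person: persons 0-1 in household 0, 2-4 in household 1.
person_household = np.array([0, 0, 1, 1, 1])
is_eligible = np.array([True, False, True, True, False])
person_income = np.array([1_000.0, 500.0, 200.0, 300.0, 900.0])

# Zero out income of ineligible members, then sum per household.
eligible_income = np.where(is_eligible, person_income, 0)
total = np.bincount(person_household, weights=eligible_income, minlength=2)
```

Masking with `where(..., 0)` before summing keeps the person-level array the same length, which is what an entity-level sum expects.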
|
||||
|
||||
---
|
||||
|
||||
## 6. Performance Implications
|
||||
|
||||
### Why Vectorization Matters
|
||||
|
||||
- **Scalar logic**: Processes 1 household at a time → SLOW
|
||||
- **Vectorized**: Processes 1000s of households simultaneously → FAST
|
||||
|
||||
```python
|
||||
# Performance comparison
|
||||
❌ SLOW (if it worked):
|
||||
for household in households:
|
||||
if household.income > 1000:
|
||||
household.benefit = 500
|
||||
|
||||
✅ FAST:
|
||||
benefits = where(incomes > 1000, 500, 100) # All at once!
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 7. Testing for Vectorization Issues
|
||||
|
||||
### Signs Your Code Isn't Vectorized
|
||||
|
||||
**Error messages:**
|
||||
- "The truth value of an array is ambiguous"
|
||||
- "ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()"
|
||||
|
||||
**Performance:**
|
||||
- Tests run slowly
|
||||
- Microsimulation times out
|
||||
|
||||
### How to Test
|
||||
|
||||
```python
|
||||
import numpy as np

# Your formula should work with arrays
# (`formula_with_arrays` stands for the formula under test)
|
||||
def test_vectorization():
|
||||
# Create array inputs
|
||||
incomes = np.array([500, 1500, 3000])
|
||||
|
||||
# Should return array output
|
||||
benefits = formula_with_arrays(incomes)
|
||||
assert len(benefits) == 3
|
||||
```
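
A self-contained version of this check is sketched below. `formula_with_arrays` is a hypothetical stand-in for the formula under test; here it is a simple threshold rule so the example can run on its own.

```python
import numpy as np


def formula_with_arrays(incomes):
    # Hypothetical formula under test: 500 above the threshold, else 100.
    return np.where(incomes > 1_000, 500, 100)


def test_vectorization():
    incomes = np.array([500, 1_500, 3_000])
    benefits = formula_with_arrays(incomes)

    # A vectorized formula returns one value per input element.
    assert isinstance(benefits, np.ndarray)
    assert benefits.shape == incomes.shape
    assert benefits.tolist() == [100, 500, 500]
```

Checking the output shape as well as the values catches formulas that silently collapse the array to a single scalar.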
|
||||
|
||||
---
|
||||
|
||||
## Quick Reference Card
|
||||
|
||||
| Operation | Scalar (WRONG) | Vectorized (CORRECT) |
|
||||
|-----------|---------------|---------------------|
|
||||
| Simple condition | `if x > 5:` | `where(x > 5, ...)` |
|
||||
| Multiple conditions | `if-elif-else` | `select([...], [...])` |
|
||||
| Boolean AND | `and` | `&` |
|
||||
| Boolean OR | `or` | `\|` |
|
||||
| Boolean NOT | `not` | `~` |
|
||||
| Bounds checking | `if x < 0: x = 0` | `max_(0, x)` |
|
||||
| Complex logic | Nested if | Nested where/select |
|
||||
|
||||
---
|
||||
|
||||
## For Agents
|
||||
|
||||
When implementing formulas:
|
||||
1. **Never use if-elif-else** with entity data
|
||||
2. **Always use where()** for simple conditions
|
||||
3. **Use select()** for multiple conditions
|
||||
4. **Use NumPy operators** (&, |, ~) not Python (and, or, not)
|
||||
5. **Test with arrays** to ensure vectorization
|
||||
6. **Parameter conditions** can use if-else (scalars)
|
||||
7. **Entity data** must use vectorized operations
|
||||